Metadata-Version: 2.4
Name: londonaicentre-mesa-runner
Version: 1.0.1
Summary: A stateless runner / deployment system for MESA models
Requires-Python: >=3.13
Description-Content-Type: text/markdown
Requires-Dist: boto3>=1.42.43
Requires-Dist: duckdb>=1.4.4
Requires-Dist: londonaicentre-oncoschema>=2.0.1
Requires-Dist: pandas>=2.3.3
Requires-Dist: pydantic>=2.12.5
Requires-Dist: pydantic-settings[yaml]>=2.12.0
Requires-Dist: snowflake-core>=1.11.0
Requires-Dist: snowflake-snowpark-python>=1.45.0

# MESA Runner

A stateless runner for deploying MESA-registered models onto GSTT infrastructure. It syncs models from S3, reads unprocessed documents from Snowflake, runs inference, and writes results back.
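The cycle above can be sketched in a few lines. This is illustrative only: the function names and in-memory stand-ins for S3 and Snowflake are hypothetical, not the runner's actual API.

```python
# Hypothetical sketch of the stateless sync -> read -> infer -> write cycle.
# In-memory lists stand in for S3 and Snowflake; names are illustrative.

def sync_model(model_uri: str) -> str:
    """Stand-in for the S3 sync; returns a pretend local path."""
    return f"/tmp/models/{model_uri.rstrip('/').rsplit('/', 1)[-1]}"

def read_unprocessed(documents: list[dict]) -> list[dict]:
    """Stand-in for the Snowflake read: keep docs without a result."""
    return [d for d in documents if "result" not in d]

def run_inference(doc: dict) -> dict:
    """Stand-in for model inference."""
    return {**doc, "result": f"processed:{doc['id']}"}

def run_once(model_uri: str, documents: list[dict]) -> list[dict]:
    """One stateless pass: sync, filter, infer, return rows to write."""
    sync_model(model_uri)
    return [run_inference(d) for d in read_unprocessed(documents)]
```

Because each pass re-derives its work from the source table, the runner holds no state between invocations.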

## Requirements

- Python 3.13+
- [uv](https://github.com/astral-sh/uv) package manager

## Installation

### Remote Inference (Default)

For remote inference via OpenAI-compatible endpoints:

```bash
uv sync
```

### Offline Inference (Optional)

For local GPU inference with vLLM, install the optional dependency group:

```bash
uv sync --group vllm-offline
```

## Configuration

Create a `config.yaml` file (see the examples below). Each top-level key names a source; the `"str"` values are placeholders for your own settings.

### Remote Inference Example

```yaml
my_source:
  model_s3_uri: "s3://aicentre-nlpteam-mesa-build/models/oncoqwen/oncoqwen_1/"

  inference:
    openai_endpoint: "http://localhost:5000/v1"

  storage:
    type: snowflake
    source_database: "str"
    source_schema: "str"
    source_table: "str"

    sink_database: "str"
    sink_schema: "str"
    sink_table: "str"

    connection_params:
      account: "str"
      user: "str"
      role: "str"
      password: "str"
      warehouse: "str"
      database: "str"
```

### Offline Inference Example

```yaml
my_source:
  model_s3_uri: "s3://aicentre-nlpteam-mesa-build/models/oncoqwen/oncoqwen_1/"

  inference:
    max_model_len: 18000

  storage:
    type: snowflake
    source_database: "str"
    source_schema: "str"
    source_table: "str"

    sink_database: "str"
    sink_schema: "str"
    sink_table: "str"

    connection_params:
      account: "str"
      user: "str"
      role: "str"
      password: "str"
      warehouse: "str"
      database: "str"
```
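A quick sanity check over a parsed config can catch missing fields before a run. This is a minimal sketch: the key names mirror the examples above, but the runner's actual pydantic-settings models may validate differently.

```python
# Illustrative config check. Key names follow the README examples;
# the runner's real validation (pydantic-settings) may differ.

REQUIRED_STORAGE_KEYS = {
    "type", "source_database", "source_schema", "source_table",
    "sink_database", "sink_schema", "sink_table", "connection_params",
}

def validate_source(name: str, cfg: dict) -> list[str]:
    """Return human-readable problems for one parsed source entry."""
    problems = []
    if "model_s3_uri" not in cfg:
        problems.append(f"{name}: missing model_s3_uri")
    storage = cfg.get("storage", {})
    for key in sorted(REQUIRED_STORAGE_KEYS - storage.keys()):
        problems.append(f"{name}: storage missing '{key}'")
    return problems
```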

## Usage

```bash
# Run with default config.yaml
mesa_runner

# Or specify a config file
mesa_runner --config /path/to/config.yaml

# Dry run mode (uses dummy data, does not read or write real data)
mesa_runner --dry-run
```

Dry-run mode is useful for testing the runner without touching real data sources or sinks. By default it generates 5 dummy documents and logs all write operations instead of executing them.
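The dry-run behaviour can be sketched as below. The function names are hypothetical; only the behaviour (fabricate dummy documents, log writes instead of executing them) comes from the description above.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("mesa_runner.sketch")

# Hypothetical sketch of dry-run mode: dummy inputs, logged writes.

def make_dummy_documents(n: int = 5) -> list[dict]:
    """Fabricate n placeholder documents (5 by default, as in dry-run)."""
    return [{"id": i, "text": f"dummy document {i}"} for i in range(n)]

def write_results(results: list[dict], dry_run: bool) -> int:
    """Write rows to the sink; in dry-run, log instead. Returns rows written."""
    if dry_run:
        for row in results:
            log.info("DRY RUN: would write row %s", row["id"])
        return 0
    # The real write path (Snowflake sink) would go here.
    return len(results)
```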

## Docker

```bash
# Remote inference (default)
docker build -t mesa-runner .
docker run mesa-runner

# Offline inference (includes vLLM)
docker build --target offline -t mesa-runner:offline .
docker run --gpus all mesa-runner:offline
```

## Development

```bash
# Run linting and tests
make test

# Auto-fix linting issues
make fix

# Run tests with coverage
make cov
```
