Metadata-Version: 2.4
Name: omtx
Version: 2.0.8
Summary: Official Python SDK for the Om API
Author-email: Om <hello@omtx.ai>
License-Expression: MIT
Project-URL: Homepage, https://omtx.ai
Project-URL: Documentation, https://docs.omtx.ai
Project-URL: Source, https://github.com/omtx-ai/om-public
Project-URL: Issues, https://github.com/omtx-ai/om-public/issues
Keywords: om,omtx,bioinformatics,drug-discovery,protein-design
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.31.0
Requires-Dist: polars>=1.17
Requires-Dist: rdkit>=2023.9.5
Dynamic: license-file

# Om Python SDK (`omtx`)

Official Python SDK for the Om API.

The SDK talks only to the public Om API `/v2/*` surface and covers:
- Diligence workflows and job polling
- Active public Hub workflows through `client.hub.submit(...)` and selected typed helpers
- Artifact upload for artifact-backed Hub jobs
- Entitlement-scoped dataset catalog, shard exports, and Polars-backed `OmData` loaders
- Health and Wallet Credits helpers

Public docs: `https://docs.omtx.ai`

## Installation

```bash
pip install omtx
```

## Compatibility

For best compatibility:
- Linux: modern Ubuntu on x86_64 or arm64
- macOS: Apple Silicon with a native `arm64` Python interpreter
- macOS Python distribution: Miniforge or Mambaforge recommended

The dataframe helpers in `omtx` use `polars`:
- `load_binders(...)`
- `load_nonbinders(...)`
- `load_data(...)`

If you are on Apple Silicon, use a native `arm64` shell and Python. Avoid
Rosetta / `x86_64` Python for dataframe-backed workflows.
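
To confirm which architecture your interpreter actually runs under before debugging dataframe issues, a quick stdlib check (no `omtx` or `polars` needed):

```python
import platform

# A native Apple Silicon interpreter reports "arm64"; a Rosetta-translated
# (or Intel) interpreter reports "x86_64".
machine = platform.machine()
print(f"Python {platform.python_version()} on {machine}")
```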

If you are on older `x86_64` hardware, or if the default `polars` runtime fails
with CPU-feature errors, install the compatibility runtime:

```bash
pip install "polars[rtcompat]"
```

Or install both in one step:

```bash
pip install omtx "polars[rtcompat]"
```

JSON-based SDK methods may still work without `rtcompat`, but dataframe helpers
are not guaranteed on older `x86_64` CPUs unless the compatibility runtime is
installed.

## Setup

```bash
export OMTX_API_KEY="your-api-key"
```

The SDK targets `https://api.omtx.ai`.
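
Since the client reads `OMTX_API_KEY` from the environment, a small stdlib check can fail fast with a clearer message than a later authentication error. The helper below is illustrative, not part of the SDK:

```python
import os

def omtx_key_configured(env=os.environ) -> bool:
    # Illustrative helper: True when OMTX_API_KEY is set to a non-empty value.
    return bool(env.get("OMTX_API_KEY", "").strip())

if not omtx_key_configured():
    print("OMTX_API_KEY is not set; OmClient calls will fail to authenticate.")
```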

## Quick Start

```python
from omtx import OmClient

with OmClient() as client:
    print(client.status())

    job = client.diligence.deep_diligence(
        query="CRISPR applications in cancer therapy",
        preset="quick",
    )

    result = client.jobs.wait(
        job["job_id"],
        result_endpoint="/v2/jobs/deep-diligence/{job_id}",
    )
    print(result.get("result", {}).get("total_claims"))
```

## Hub Quick Start

```python
from omtx import OmClient

with OmClient() as client:
    artifact = client.artifacts.upload("target.cif")

    job = client.hub.boltzgen(
        protocol="protein_anything",
        target_cif_artifact_id=artifact["artifact_id"],
        target_chain_id="A",
        binder_length_min=90,
        binder_length_max=110,
        idempotency_key="boltzgen-demo-20260325",
    )

    status = client.jobs.wait(job["job_id"], poll_interval=5, timeout=3600)
    print(status["status"])
```
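
`jobs.wait(...)` polls until the job reaches a terminal state or the timeout elapses. A generic sketch of that pattern in plain Python; the terminal state names here are assumptions for illustration, not the SDK's actual contract:

```python
import time

def wait_for(fetch_status, poll_interval=5.0, timeout=3600.0):
    # Poll fetch_status() until a terminal state or the deadline passes.
    # The terminal state names are assumed; the real SDK may use others.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in ("succeeded", "failed", "cancelled"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"job still running after {timeout}s")
```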

For active public Hub models without a dedicated typed helper, use
`client.hub.submit(job_type="hub.<model>", payload=...)`.

## Data Access

Primary training flow (single call):

```python
loaded = client.load_data(
    protein_uuid="550e8400-e29b-41d4-a716-446655440000",
    binders=50000,          # required
    nonbinder_multiplier=5, # optional, default 5x binders
    # nonbinders=200000,    # optional explicit override (wins over multiplier)
    sample_seed=42,         # optional: deterministic sampling
)

binders = loaded["binders"]
nonbinders = loaded["nonbinders"]
print(binders.shape, nonbinders.shape)
binders.show(top_n=24)  # defaults: smiles_col="smiles", sort_by="binding_score"
binders.show(top_n=24, sort_by="selectivity_score")
# show() renders inline in notebooks; no extra display() wrapper needed.
```

Explicit per-pool loading (advanced control):

```python
binders = client.load_binders(
    protein_uuid="550e8400-e29b-41d4-a716-446655440000",
    n=1000,          # optional: random sample size
    sample_seed=42,  # optional: deterministic sampling
)
nonbinders = client.load_nonbinders(
    protein_uuid="550e8400-e29b-41d4-a716-446655440000",
    n=10000,         # optional: random sample size
    sample_seed=42,  # optional: deterministic sampling
)
print(binders.shape, nonbinders.shape)

# Omit n (or set n=None) to load the full pool.
# binders = client.load_binders(protein_uuid="...")
# nonbinders = client.load_nonbinders(protein_uuid="...")
```

Manual shard export URLs (advanced use):

```python
urls = client.binders.urls(
    protein_uuid="550e8400-e29b-41d4-a716-446655440000",
)
print("Binder shard URLs:", len(urls["binder_urls"]))
print("Non-binder shard URLs:", len(urls["non_binder_urls"]))
print("First binder URL:", urls["binder_urls"][0] if urls["binder_urls"] else None)
```
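
If you need the shards locally, a stdlib download sketch, assuming the returned URLs are directly fetchable; the helper name and the `.parquet` filename convention are illustrative, not guaranteed by the API:

```python
import os
import urllib.request

def download_shards(urls, dest_dir):
    # Illustrative: fetch each shard URL into dest_dir, return local paths.
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for i, url in enumerate(urls):
        path = os.path.join(dest_dir, f"shard-{i:05d}.parquet")
        urllib.request.urlretrieve(url, path)
        paths.append(path)
    return paths

# local_paths = download_shards(urls["binder_urls"], "binder_shards/")
```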

Listing the generated proteins currently available to your entitlement:

```python
protein_uuids = client.datasets.generated_protein_uuids()
print("Generated protein UUIDs:", protein_uuids[:5])
```

Module-level convenience:

```python
import omtx as om

loaded = om.load_data(
    protein_uuid="550e8400-e29b-41d4-a716-446655440000",
    binders=50000,
    nonbinder_multiplier=5,
    sample_seed=42,
)
print(loaded["binders"].shape, loaded["nonbinders"].shape)

binders = om.load_binders(
    protein_uuid="550e8400-e29b-41d4-a716-446655440000",
    n=1000,
    sample_seed=42,
)
nonbinders = om.load_nonbinders(
    protein_uuid="550e8400-e29b-41d4-a716-446655440000",
    n=10000,
    sample_seed=42,
)
print(binders.shape, nonbinders.shape)
```

## Idempotency

- Every non-GET call gets an idempotency key automatically.
- Auto-generated keys are per-call convenience and are not retry-stable.
- For retry deduplication, pass and reuse your own `idempotency_key` (a stable logical-operation ID).

Example (retry-safe launch):

```python
request_key = "search-protein-x-20260303-001"

job = client.diligence.search(
    query="MKNK2 inhibitor landscape",
    idempotency_key=request_key,
)

# If you retry the same logical launch, reuse the same idempotency key.
# retried = client.diligence.search(query="MKNK2 inhibitor landscape", idempotency_key=request_key)
```
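
One way to make keys retry-stable is to derive them from the logical operation's parameters, so any retry of the same launch reproduces the same key. A stdlib sketch; the helper is not part of the SDK:

```python
import hashlib
import json

def stable_idempotency_key(prefix: str, params: dict) -> str:
    # Same logical parameters -> same key, so retried launches deduplicate.
    digest = hashlib.sha256(
        json.dumps(params, sort_keys=True).encode()
    ).hexdigest()[:16]
    return f"{prefix}-{digest}"

key = stable_idempotency_key("search", {"query": "MKNK2 inhibitor landscape"})
```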

## Helper Surface

- `diligence.deep_diligence(query, preset=None, idempotency_key=None)`
- `diligence.synthesize_report(gene_key, idempotency_key=None)`
- `diligence.search(query, idempotency_key=None)`
- `diligence.gather(query, preset=None, idempotency_key=None)`
- `diligence.crawl(url, preset=None, idempotency_key=None)`
- `diligence.list_gene_keys()`
- `artifacts.upload(file_path, content_type=None)`, `artifacts.upload_bytes(...)`, `artifacts.get(artifact_id)`
- `hub.submit(job_type, payload, idempotency_key=None)` for the full active public Hub route set
- Selected typed `hub.<model>(...)` helpers for `boltz2`, `boltzgen`, `rosettafold3`, `chai1`, `rfd3`, `bindcraft`, `alphafold`, `proteinttt`, `diffdock`, `flowdock`, `openfold3`
- `jobs.history(...)`, `jobs.status(job_id)`, `jobs.wait(job_id, ...)`
- `binders.get_shards(...)`
- `binders.urls(...)`
- `load_binders(...)`
- `load_nonbinders(...)`
- `load_data(...)` (combined binder/non-binder load)
- `datasets.catalog()`
- `datasets.generated_protein_uuids()`
- `status()`
- `users.profile()`

Visualization column contract:
- `OmData.show(...)` is strict (no column fallback aliases).
- Default columns are `smiles` and `binding_score`.
- For selectivity views, pass `sort_by="selectivity_score"`.

Route policy:
- `/v2/diligence/getTargetDiligenceReport` remains an alias route and is not a separate SDK helper.
- `/v2/rag/search` is intentionally not exposed in the SDK.
- Public Hub coverage follows the active public model set in the canonical
  gateway route inventory.
- `hub.submit(...)` covers the full active public model set.
- Typed helpers are the selected subset listed above.

## Hub and Artifacts

```python
artifact = client.artifacts.upload("target.pdb")

job = client.hub.diffdock(
    protein_artifact_id=artifact["artifact_id"],
    ligand_smiles="CCO",
    idempotency_key="diffdock-demo-20260316",
)

status = client.jobs.wait(job["job_id"], poll_interval=5, timeout=1800)
history = client.jobs.history(job_type_prefix="hub", limit=20)
```

Notes:

- Artifact-backed Hub workflows upload via `client.artifacts.*` first, then pass
  artifact IDs into canonical `client.hub.*` request fields.
- `client.hub.submit(...)` is the generic escape hatch for active Hub models
  using canonical `job_type="hub.<model>"`.
- Active public models without a dedicated helper, such as `neuralplexer`, are
  launched through `client.hub.submit(...)`.
- `jobs.history(...)` supports `job_type` and `job_type_prefix` filters for
  Hub/diligence separation.

## Migration

Breaking changes in `2.0.0`:
- `OMTXClient` removed.
- `OmClient` is now the only supported client class.
- Legacy pricing helpers removed from SDK surface.
- Legacy binder batch-cost helper removed from SDK surface.
- Shard access now resolves the latest accessible dataset by `protein_uuid`.
- `client.status()` is the primary health helper.
- `load_data(...)` is the primary combined dataframe-loading helper; `load_binders(...)` and `load_nonbinders(...)` remain available for explicit per-pool control.
- Flat shard URL aliases are available as `binder_urls` / `non_binder_urls`.
- Core SDK runtime includes `polars` + `rdkit`.

Migration mapping (`1.x` -> `2.x`):
- `from omtx import OMTXClient` -> `from omtx import OmClient`
- `OMTXClient(...)` -> `OmClient(...)`

Breaking changes in `1.0.0`:
- `binders.get(...)` removed from core SDK.
- `binders.iter(...)` removed from core SDK.
- `pandas` removed from required dependencies.

Migration mapping (`0.x` -> `1.x`):
- `binders.get(...)` -> `client.load_binders(...)` / `client.load_nonbinders(...)` or `binders.get_shards(...)`
- `binders.iter(...)` -> `binders.get_shards(...)` + application-level streaming
- `pip install omtx` (with pandas) -> `pip install omtx` (with polars + rdkit)

Full details: see [`MIGRATION.md`](MIGRATION.md).

## Requirements

- Python `>=3.9`
- OMTX API key

## License

MIT. See `LICENSE`.
