Metadata-Version: 2.4
Name: pyshifty
Version: 0.0.11
Requires-Dist: maturin>=1.9.6
Requires-Dist: ontoenv>=0.5.1
Requires-Dist: rdflib>=7.4.0
Requires-Dist: sphinx>=7.2 ; extra == 'docs'
Provides-Extra: docs
Summary: Python bindings for the shifty validation engine
Home-Page: https://github.com/gtfierro/shifty
Author-email: Gabe Fierro <gtfierro@mines.edu>
Requires-Python: >=3.11
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Repository, https://github.com/gtfierro/shifty

# shifty Python bindings

This crate packages the `shifty` validator as a CPython extension using
[`PyO3`](https://pyo3.rs). It exposes the `shifty` module for RDFLib-centric workflows
while reusing the same Rust engine that powers the CLI.

## Installation

From PyPI (package name `pyshifty`, import name `shifty`):

```bash
pip install pyshifty
```
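
Because the distribution name (`pyshifty`) and the import name (`shifty`) differ, a quick standard-library check can show which of the two is visible in the current environment. The sketch below is only an illustration: `describe_install` is a hypothetical helper and not part of the bindings.

```python
from importlib import metadata, util


def describe_install(dist_name="pyshifty", module_name="shifty"):
    """Report whether the module is importable and which distribution version is installed."""
    # find_spec looks the module up on sys.path without importing it.
    importable = util.find_spec(module_name) is not None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # no metadata for this distribution in the environment
    return {"importable": importable, "version": version}
```

Calling `describe_install()` returns a dict with an `importable` flag and a `version` string (or `None` when the distribution is not installed).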

From the repository root, you can develop locally with:

```bash
cd python
uv sync  # or `pip install -r requirements.txt` if you prefer
uvx maturin develop --extras rdflib --release
```

The GitHub Actions workflow (`.github/workflows/python.yml`) publishes the package to
PyPI as `pyshifty` and uses this README as the long description, so keep it up to date
when the bindings change.

## Usage

The extension exports one-shot helpers plus a `CompiledShapeGraph` cache that mirrors the CLI subcommands:

```python
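import shifty
from rdflib import Graph

# Example inputs; the file names are placeholders, so substitute your own graphs.
data_graph = Graph().parse("data.ttl")
shapes_graph = Graph().parse("shapes.ttl")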

# Validation. When you request diagnostics, a fourth element is returned.
conforms, results_graph, report_text, diag = shifty.validate(
    data_graph,
    shapes_graph,  # optional; omit to reuse data_graph as shapes_graph
    run_inference=True,
    inference={"min_iterations": 1, "max_iterations": 8},
    graphviz=True,
    heatmap=True,
    trace_events=True,
    return_inference_outcome=True,
)
print(diag["graphviz"])        # DOT for the shapes graph
print(diag["heatmap"])         # DOT for the execution heatmap
print(diag["trace_events"][0]) # First trace event (dict)
print(diag["inference_outcome"])

# Inference-only. Diagnostics are returned as a second element when requested.
inferred_graph, diag = shifty.infer(
    data_graph,
    shapes_graph,  # optional; omit to reuse data_graph as shapes_graph
    min_iterations=1,
    max_iterations=4,
    graphviz=True,
    union=True,
    return_inference_outcome=True,
)
print(diag["inference_outcome"]["triples_added"])

# Cache shapes once, then reuse them across datasets.
compiled = shifty.generate_ir(
    shapes_graph,
    skip_invalid_rules=True,
    warnings_are_errors=False,
    do_imports=True,
)
# No diagnostics are requested here, so unpack only the first element.
conforms, *_ = compiled.validate(data_graph, run_inference=True)
print("Compiled cache conforms?", conforms)
```

Key options (mirroring the CLI flags):

- Shapes-graph options: `skip_invalid_rules` (default: `False`), `warnings_are_errors`, `do_imports`.
- `shapes_graph` is optional for top-level `validate`/`infer`; when omitted (or `None`),
  Shifty uses the provided `data_graph` as both the data and shapes graph.
- RDFLib inputs are ingested in memory; the binding writes no temporary Turtle files.
  When `do_imports=True`, Shifty reads `owl:imports` IRIs from the in-memory root graph and
  asks OntoEnv to resolve those dependencies from their declared locations.
- Inference knobs: `min_iterations`, `max_iterations`, `run_until_converged`/`no_converge`,
  `error_on_blank_nodes`, `debug`, `union` (include the original data). The `inference={...}` dict
  is still accepted, as are aliases such as `inference_min_iterations`.
- Diagnostics: `graphviz` (DOT for shapes), `heatmap` + `heatmap_all` (execution heatmap, triggers
  a validation pass), `trace_events`, `trace_file`, `trace_jsonl`, `return_inference_outcome`
  (adds iteration/insert counts).
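
The diagnostics come back as a dict, so persisting them for later inspection takes only a few lines. The following is a minimal sketch, assuming the dict uses the `graphviz`, `heatmap`, and `trace_events` keys shown in the usage example and that keys for diagnostics you did not request are absent or empty. `write_diagnostics` and the output file names are hypothetical; the binding's own `trace_file` and `trace_jsonl` options may already cover the trace case.

```python
import json
from pathlib import Path


def write_diagnostics(diag, out_dir):
    """Write DOT and trace entries of a diagnostics dict to files; return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    # "graphviz" and "heatmap" hold DOT source as strings.
    for key in ("graphviz", "heatmap"):
        if diag.get(key):
            path = out / f"{key}.dot"
            path.write_text(diag[key], encoding="utf-8")
            written.append(path)
    # "trace_events" is a list of dicts; store one JSON object per line.
    events = diag.get("trace_events")
    if events:
        path = out / "trace_events.jsonl"
        with path.open("w", encoding="utf-8") as fh:
            for event in events:
                # default=str is a fallback for values json cannot encode directly.
                fh.write(json.dumps(event, default=str) + "\n")
        written.append(path)
    return written
```

The resulting `.dot` files can then be rendered with any Graphviz tool, for example `dot -Tsvg graphviz.dot -o graphviz.svg`.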

If you omit all diagnostics, `validate` returns `(conforms, results_graph, report_text)` and
`infer` returns only the inferred `rdflib.Graph`, so the return shapes are unchanged for existing callers.
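
Since the return shape depends on whether diagnostics were requested, calling code that toggles diagnostics at runtime may prefer a uniform shape. The helpers below are a sketch under the assumptions stated in this section (a 3- or 4-tuple from `validate`, a bare graph or a 2-tuple from `infer`); `normalize_validate` and `normalize_infer` are hypothetical names, not part of the `shifty` module.

```python
def normalize_validate(result):
    """Return (conforms, results_graph, report_text, diag); diag is None if absent."""
    if len(result) == 4:
        return tuple(result)
    conforms, results_graph, report_text = result
    return conforms, results_graph, report_text, None


def normalize_infer(result):
    """Return (inferred_graph, diag); diag is None when only the graph came back."""
    # Assumption: with diagnostics, infer returns a 2-tuple; otherwise a bare graph.
    if isinstance(result, tuple) and len(result) == 2:
        return result
    return result, None
```

With these in place, `conforms, results_graph, report_text, diag = normalize_validate(shifty.validate(data_graph))` unpacks the same way regardless of which diagnostics flags are set.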

See `example.py` and `brick.py` in this directory for full RDFLib/OntoEnv examples.

## Docs

Sphinx docs live in `python/docs`. Build them with:

```bash
cd python/docs
make html
```

The build writes `llms.txt` into the HTML output directory
(`python/docs/_build/html/llms.txt`).

## Packaging notes

- `Cargo.toml` declares `readme = "README.md"`, so this file must be checked into the repo.
- The sdist job in GitHub Actions uses the same path when stamping metadata via `maturin`.
- Any new runtime assets (fixtures, schemas, etc.) should be listed in `pyproject.toml`
  under `tool.maturin.sdist.include` so they reach PyPI.

