Metadata-Version: 2.4
Name: shadow-diff
Version: 3.2.1
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Typing :: Typed
Requires-Dist: typer>=0.15,<1.0
Requires-Dist: pydantic>=2.10,<3
Requires-Dist: httpx>=0.27,<1
Requires-Dist: rich>=13.9,<15
Requires-Dist: scikit-learn>=1.6,<2
Requires-Dist: numpy>=1.24,<3
Requires-Dist: pyyaml>=6.0,<7
Requires-Dist: jsonschema>=4.0,<5
Requires-Dist: ag2>=0.9,<2 ; extra == 'ag2'
Requires-Dist: anthropic>=0.40,<1 ; extra == 'all'
Requires-Dist: openai>=1.58,<3 ; extra == 'all'
Requires-Dist: sentence-transformers>=3.3,<6 ; extra == 'all'
Requires-Dist: opentelemetry-sdk>=1.27,<2 ; extra == 'all'
Requires-Dist: fastapi>=0.115,<1 ; extra == 'all'
Requires-Dist: uvicorn>=0.32,<1 ; extra == 'all'
Requires-Dist: websockets>=13.1,<16 ; extra == 'all'
Requires-Dist: mcp>=0.9,<2 ; extra == 'all'
Requires-Dist: pillow>=10,<13 ; extra == 'all'
Requires-Dist: imagehash>=4.3,<5 ; extra == 'all'
Requires-Dist: sigstore>=3.0,<5 ; extra == 'all'
Requires-Dist: langgraph>=1.0.2,<2 ; extra == 'all'
Requires-Dist: langchain-openai>=0.3,<2 ; extra == 'all'
Requires-Dist: crewai>=1.14,<2 ; extra == 'all'
Requires-Dist: ag2>=0.9,<2 ; extra == 'all'
Requires-Dist: anthropic>=0.40,<1 ; extra == 'anthropic'
Requires-Dist: crewai>=1.14,<2 ; extra == 'crewai'
Requires-Dist: hypothesis==6.122.1 ; extra == 'dev'
Requires-Dist: mypy==1.14.0 ; extra == 'dev'
Requires-Dist: ruff==0.8.4 ; extra == 'dev'
Requires-Dist: pytest==9.0.3 ; extra == 'dev'
Requires-Dist: pytest-asyncio==1.3.0 ; extra == 'dev'
Requires-Dist: pytest-cov==7.1.0 ; extra == 'dev'
Requires-Dist: maturin==1.13.1 ; extra == 'dev'
Requires-Dist: types-pyyaml==6.0.12.20240917 ; extra == 'dev'
Requires-Dist: sentence-transformers>=3.3,<6 ; extra == 'embeddings'
Requires-Dist: langchain-core>=0.3,<3 ; extra == 'langgraph'
Requires-Dist: langgraph>=1.0.2,<2 ; extra == 'langgraph'
Requires-Dist: langchain-openai>=0.3,<2 ; extra == 'langgraph'
Requires-Dist: mcp>=0.9,<2 ; extra == 'mcp'
Requires-Dist: pillow>=10,<13 ; extra == 'multimodal'
Requires-Dist: imagehash>=4.3,<5 ; extra == 'multimodal'
Requires-Dist: openai>=1.58,<3 ; extra == 'openai'
Requires-Dist: opentelemetry-sdk>=1.27,<2 ; extra == 'otel'
Requires-Dist: fastapi>=0.115,<1 ; extra == 'serve'
Requires-Dist: uvicorn>=0.32,<1 ; extra == 'serve'
Requires-Dist: websockets>=13.1,<16 ; extra == 'serve'
Requires-Dist: sigstore>=3.0,<5 ; extra == 'sign'
Provides-Extra: ag2
Provides-Extra: all
Provides-Extra: anthropic
Provides-Extra: crewai
Provides-Extra: dev
Provides-Extra: embeddings
Provides-Extra: langgraph
Provides-Extra: mcp
Provides-Extra: multimodal
Provides-Extra: openai
Provides-Extra: otel
Provides-Extra: serve
Provides-Extra: sign
License-File: LICENSE-APACHE
Summary: Behavior contracts for AI agents — tested in your PR, enforced at runtime.
Keywords: llm,agents,testing,observability,diff,regression-testing,causal-attribution
Home-Page: https://github.com/manav8498/Shadow
Author: manav8498
License-Expression: Apache-2.0
Requires-Python: >=3.11
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Changelog, https://github.com/manav8498/Shadow/blob/main/CHANGELOG.md
Project-URL: Discussions, https://github.com/manav8498/Shadow/discussions
Project-URL: Documentation, https://github.com/manav8498/Shadow#readme
Project-URL: Homepage, https://github.com/manav8498/Shadow
Project-URL: Issues, https://github.com/manav8498/Shadow/issues
Project-URL: Repository, https://github.com/manav8498/Shadow
Project-URL: Source Code, https://github.com/manav8498/Shadow
Project-URL: Specification, https://github.com/manav8498/Shadow/blob/main/SPEC.md

# shadow-diff

**Find the exact change that broke your AI agent.**

Shadow is a CI-native regression-forensics tool for LLM agents. One command on the PR — `shadow diagnose-pr` — answers:

1. Did agent behavior change?
2. How many traces are affected?
3. **Which exact prompt / model / tool / config change caused it?**
4. With what confidence (ATE + bootstrap CI + E-value when run with `--backend live`)?
5. What fix should `verify-fix` confirm before merge?
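
To make the confidence terms in question 4 concrete, here is a minimal sketch of an average treatment effect (ATE) with a percentile-bootstrap confidence interval (CI). It is an illustration under stated assumptions, not Shadow's implementation: the function names, the 0/1 outcome encoding, and the sample data are all hypothetical, and the ATE is estimated here as a plain difference in means. The E-value is left out because this README does not define how it is computed.

```python
import random
from statistics import mean


def ate(baseline, candidate):
    """Difference in mean outcome: candidate minus baseline."""
    return mean(candidate) - mean(baseline)


def bootstrap_ci(baseline, candidate, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the difference in means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping group sizes fixed.
        b = rng.choices(baseline, k=len(baseline))
        c = rng.choices(candidate, k=len(candidate))
        estimates.append(ate(b, c))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical per-trace outcomes: 1 = behaved as expected, 0 = regressed.
baseline = [1, 1, 1, 1, 1, 1, 1, 0, 1, 1]
candidate = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

effect = ate(baseline, candidate)
low, high = bootstrap_ci(baseline, candidate)
print(f"ATE = {effect:+.2f}, 95% CI = [{low:+.2f}, {high:+.2f}]")
```

An interval that excludes zero suggests the change in behavior is unlikely to be resampling noise; an interval that straddles zero suggests more traces are needed.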

The PyPI distribution is `shadow-diff`. The Python import path is `shadow`. The CLI is `shadow`.

## Install

```bash
pip install shadow-diff
```

Requires Python 3.11+. Pre-built wheels ship for Linux x86_64, macOS arm64, and Windows x86_64; other platforms build from source (Rust required).

Optional extras:

```bash
pip install 'shadow-diff[anthropic]'   # if your agent uses Claude
pip install 'shadow-diff[openai]'      # if your agent uses GPT
pip install 'shadow-diff[embeddings]'  # paraphrase-robust semantic diff
pip install 'shadow-diff[all]'         # everything
```
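
Because the distribution name (`shadow-diff`) differs from the import path (`shadow`), a quick post-install check can help. The sketch below uses only the standard library; the helper names are hypothetical, and the module names for the extras are taken from the dependencies each extra pulls in (`anthropic`, `openai`, and `sentence_transformers` for `embeddings`).

```python
import sys
from importlib import metadata, util


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def available_modules(module_names):
    """Map each top-level module name to whether it can be imported."""
    return {name: util.find_spec(name) is not None for name in module_names}


python_ok = sys.version_info >= (3, 11)          # README: requires Python 3.11+
shadow_version = installed_version("shadow-diff")  # look up by PyPI name, not "shadow"
optional = available_modules(["anthropic", "openai", "sentence_transformers"])

print(f"Python 3.11+: {python_ok}")
print(f"shadow-diff version: {shadow_version}")
for module_name, present in optional.items():
    print(f"{module_name}: {'available' if present else 'not installed'}")
```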

## 60-second tour

```bash
shadow demo                  # nine-axis diff on bundled fixtures, no API key
shadow quickstart            # writable copy of a runnable scenario
```

Then run `diff` against the writable scenario:

```bash
cd shadow-quickstart
shadow diff fixtures/baseline.agentlog fixtures/candidate.agentlog
```

For the full `diagnose-pr` flow against your own agent, see [`docs/features/causal-pr-diagnosis.md`](https://github.com/manav8498/Shadow/blob/main/docs/features/causal-pr-diagnosis.md) and the runnable [`refund-causal-diagnosis`](https://github.com/manav8498/Shadow/tree/main/examples/refund-causal-diagnosis) demo.

## Record your own agent

```python
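import anthropic
from shadow.sdk import Session

with Session(output_path="trace.agentlog"):
    # Your existing Anthropic / OpenAI code, unchanged.
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    client.messages.create(model="claude-sonnet-4-6", max_tokens=1024, messages=[...])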
```

Shadow auto-instruments the Anthropic and OpenAI SDKs and writes content-addressed `.agentlog` files. Secrets are redacted by default. Or skip the code change entirely:

```bash
shadow record -o trace.agentlog -- python your_agent.py
```
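
"Content-addressed" above means a record's identifier is derived from its content, so identical content always maps to the same identifier. The sketch below illustrates the general idea only. The canonicalization (sorted-key compact JSON), the SHA-256 hash, the `sha256:` prefix, and the record fields are all assumptions made for illustration; the actual `.agentlog` rules are defined in the project's SPEC.md.

```python
import hashlib
import json


def content_address(record: dict) -> str:
    """Derive an identifier from a record's canonical JSON form (illustrative)."""
    # Sorting keys and fixing separators makes the byte form independent
    # of dict ordering and whitespace.
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical records: same content, different key order.
first = {"kind": "chat_request", "payload": {"model": "example-model", "messages": []}}
second = {"payload": {"messages": [], "model": "example-model"}, "kind": "chat_request"}
changed = {"kind": "chat_request", "payload": {"model": "other-model", "messages": []}}

print(content_address(first) == content_address(second))   # same content, same id
print(content_address(first) == content_address(changed))  # different content, different id
```

One practical consequence of this design is that duplicate records are cheap to detect, and any edit to a record changes its identifier.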

## Daily workflow — Shadow as `pytest` for agent behavior

```bash
shadow inspect trace.agentlog                  # debug a single trace
shadow scan baseline_traces/                   # block secret leaks
shadow baseline create baseline_traces/        # pin the gold standard
shadow gate-pr ...                             # gate every PR
```

## Full docs

The canonical README, the `.agentlog` spec, runnable examples, and the comparison against adjacent agent-eval and runtime-governance tools all live at **https://github.com/manav8498/Shadow**.

## License

Apache-2.0. See `LICENSE-APACHE` in this distribution. The `.agentlog` spec is independently published under Apache-2.0.

