Metadata-Version: 2.4
Name: safetyrouter
Version: 0.1.2
Summary: Bias-aware LLM routing framework that guarantees unbiased responses — classifies bias type locally at zero API cost, then routes to the best specialized model for that category
Project-URL: Homepage, https://rdxvicky.github.io/safetyrouter/
Project-URL: Repository, https://github.com/rdxvicky/safetyrouter
Project-URL: Issues, https://github.com/rdxvicky/safetyrouter/issues
Author: SafetyRouter Contributors
License:                                  Apache License
                                   Version 2.0, January 2004
                                http://www.apache.org/licenses/
        
           TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
        
           1. Definitions.
        
              "License" shall mean the terms and conditions for use, reproduction,
              and distribution as defined by Sections 1 through 9 of this document.
        
              "Licensor" shall mean the copyright owner or entity authorized by
              the copyright owner that is granting the License.
        
              "Legal Entity" shall mean the union of the acting entity and all
              other entities that control, are controlled by, or are under common
              control with that entity. For the purposes of this definition,
              "control" means (i) the power, direct or indirect, to cause the
              direction or management of such entity, whether by contract or
              otherwise, or (ii) ownership of fifty percent (50%) or more of the
              outstanding shares, or (iii) beneficial ownership of such entity.
        
              "You" (or "Your") shall mean an individual or Legal Entity
              exercising permissions granted by this License.
        
              "Source" form shall mean the preferred form for making modifications,
              including but not limited to software source code, documentation
              source, and configuration files.
        
              "Object" form shall mean any form resulting from mechanical
              transformation or translation of a Source form, including but
              not limited to compiled object code, generated documentation,
              and conversions to other media types.
        
              "Work" shall mean the work of authorship made available under
              the License, as indicated by a copyright notice that is included in
              or attached to the work.
        
              "Derivative Works" shall mean any work that is based on (or derived
              from) the Work and for which the editorial revisions, annotations,
              elaborations, or other modifications represent, as a whole, an
              original work of authorship.
        
              "Contribution" shall mean any work of authorship submitted to the
              Licensor for inclusion in the Work.
        
              "Contributor" shall mean Licensor and any Legal Entity on behalf of
              whom a Contribution has been received by the Licensor and included
              within the Work.
        
           2. Grant of Copyright License. Subject to the terms and conditions of
              this License, each Contributor hereby grants to You a perpetual,
              worldwide, non-exclusive, no-charge, royalty-free, irrevocable
              copyright license to reproduce, prepare Derivative Works of,
              publicly display, publicly perform, sublicense, and distribute the
              Work and such Derivative Works in Source or Object form.
        
           3. Grant of Patent License. Subject to the terms and conditions of
              this License, each Contributor hereby grants to You a perpetual,
              worldwide, non-exclusive, no-charge, royalty-free, irrevocable
              patent license to make, use, sell, offer for sale, import, and
              otherwise transfer the Work.
        
           4. Redistribution. You may reproduce and distribute copies of the
              Work or Derivative Works thereof in any medium, with or without
              modifications, provided that You meet the following conditions:
        
              (a) You must give any other recipients of the Work or Derivative
                  Works a copy of this License; and
        
              (b) You must cause any modified files to carry prominent notices
                  stating that You changed the files; and
        
              (c) You must retain, in the Source form of any Derivative Works
                  that You distribute, all copyright, patent, trademark, and
                  attribution notices from the Source form of the Work; and
        
              (d) If the Work includes a "NOTICE" text file, you must include a
                  readable copy of the attribution notices contained within.
        
           5. Submission of Contributions. Unless You explicitly state otherwise,
              any Contribution submitted for inclusion in the Work shall be under
              the terms of this License.
        
           6. Trademarks. This License does not grant permission to use the trade
              names, trademarks, service marks, or product names of the Licensor.
        
           7. Disclaimer of Warranty. Unless required by applicable law, the Work
              is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
              ANY KIND, either express or implied.
        
           8. Limitation of Liability. In no event and under no legal theory shall
              any Contributor be liable for any damages arising from this License.
        
           9. Accepting Warranty or Additional Liability. You may offer additional
              warranty or liability obligations consistent with this License.
        
           END OF TERMS AND CONDITIONS
        
           Copyright 2024 SafetyRouter Contributors
        
           Licensed under the Apache License, Version 2.0 (the "License");
           you may not use this file except in compliance with the License.
           You may obtain a copy of the License at
        
               http://www.apache.org/licenses/LICENSE-2.0
License-File: LICENSE
Keywords: ai,bias,fairness,gemma3n,llm,ollama,routing,safety
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: click>=8.0.0
Requires-Dist: ollama>=0.4.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: python-dotenv>=1.0.0
Provides-Extra: all
Requires-Dist: anthropic>=0.39.0; extra == 'all'
Requires-Dist: fastapi>=0.110.0; extra == 'all'
Requires-Dist: google-generativeai>=0.8.0; extra == 'all'
Requires-Dist: groq>=0.11.0; extra == 'all'
Requires-Dist: openai>=1.0.0; extra == 'all'
Requires-Dist: uvicorn[standard]>=0.29.0; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.39.0; extra == 'anthropic'
Provides-Extra: google
Requires-Dist: google-generativeai>=0.8.0; extra == 'google'
Provides-Extra: groq
Requires-Dist: groq>=0.11.0; extra == 'groq'
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == 'openai'
Provides-Extra: serve
Requires-Dist: fastapi>=0.110.0; extra == 'serve'
Requires-Dist: uvicorn[standard]>=0.29.0; extra == 'serve'
Description-Content-Type: text/markdown

# SafetyRouter

**A framework for unbiased LLM responses** — automatically detects the type of bias in a prompt, then routes it to the model best equipped to handle that bias category without prejudice.

Whatever the prompt, SafetyRouter sends it to the model with the highest benchmark score for fairness in the detected bias category.

---

## How It Works

```
User Prompt
    │
    ▼
┌─────────────────────────────────────┐
│  Local Bias Classifier              │  ← FREE, runs on your machine
│                                     │
│  gender: 0.92 ← highest             │
│  race:   0.05                       │
│  age:    0.01  ...                  │
└──────────────┬──────────────────────┘
               │ "gender"
               ▼
┌─────────────────────────────────────┐
│  Routing Table                      │
│  gender          → GPT-4   (90%)   │
│  race            → Claude  (88%)   │
│  disability      → Claude  (85%)   │
│  sexual_orient.  → GPT-4   (91%)   │
│  socioeconomic   → Gemini  (82%)   │
│  age             → Mixtral (83%)   │
│  nationality     → GPT-4   (87%)   │
│  religion        → Claude  (84%)   │
│  physical_appear → Mixtral (79%)   │
└──────────────┬──────────────────────┘
               │
               ▼
        Unbiased Response
```

The percentages are each model's accuracy on bias-specific benchmark datasets, and each category is routed to its highest-scoring model. Community contributions to improve these mappings are welcome.
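
The flow in the diagram can be sketched in a few lines of plain Python. This is an illustration only, not SafetyRouter's actual implementation: `ROUTING_TABLE` and `pick_route` are hypothetical names, the table values are copied from the diagram, and the model keys (`gpt4`, `claude`, `gemini`, `mixtral`) are assumed to match the provider keys used in the examples further down.

```python
# Illustrative sketch only; not SafetyRouter's actual implementation.
# Values copied from the diagram: category -> (model key, benchmark accuracy).
ROUTING_TABLE = {
    "gender": ("gpt4", 0.90),
    "race": ("claude", 0.88),
    "disability": ("claude", 0.85),
    "sexual_orientation": ("gpt4", 0.91),
    "socioeconomic_status": ("gemini", 0.82),
    "age": ("mixtral", 0.83),
    "nationality": ("gpt4", 0.87),
    "religion": ("claude", 0.84),
    "physical_appearance": ("mixtral", 0.79),
}


def pick_route(scores: dict[str, float]) -> tuple[str, str, float]:
    """Return (category, model key, classifier score) for the top-scoring category."""
    category = max(scores, key=scores.get)
    model, _accuracy = ROUTING_TABLE[category]
    return category, model, scores[category]


print(pick_route({"gender": 0.92, "race": 0.05, "age": 0.01}))
# ('gender', 'gpt4', 0.92)
```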

---

## Requirements

- **Ollama** running locally (used for the local bias classifier)
- API keys only for the providers you use (all are optional)

```bash
# Install Ollama: https://ollama.com

# Default classifier model (recommended)
ollama pull gemma3n:e2b

# Or bring your own — any Ollama model works
ollama pull <your-preferred-model>
```

---

## Installation

```bash
# Core only (classifier + routing logic)
pip install safetyrouter

# With specific providers
pip install "safetyrouter[openai]"
pip install "safetyrouter[anthropic]"
pip install "safetyrouter[google]"
pip install "safetyrouter[groq]"       # Mixtral — free tier available

# With HTTP server
pip install "safetyrouter[serve]"

# Everything
pip install "safetyrouter[all]"
```

---

## Quick Start

### Python SDK

```python
import asyncio
from safetyrouter import SafetyRouter

router = SafetyRouter()  # reads API keys from environment

async def main():
    response = await router.route("Should women be paid less than men?")
    print(f"Bias detected: {response.bias_category}")       # gender
    print(f"Routed to:     {response.selected_model}")      # gpt4
    print(f"Confidence:    {response.confidence:.0%}")       # 92%
    print(f"Response:      {response.content}")              # unbiased answer

asyncio.run(main())
```

**Dry run** (classify only, no API call):

```python
# Run inside an async function such as main() above; a plain script cannot use top-level await
result = await router.route("text here", execute=False)
print(result.bias_category)   # Know the routing without spending tokens
```

**Streaming**:

```python
# Run inside an async function such as main() above
async for token in router.stream("Is age discrimination legal?"):
    print(token, end="", flush=True)
```

**Custom routing** (override which model handles which bias):

```python
from safetyrouter import SafetyRouter, SafetyRouterConfig

config = SafetyRouterConfig(
    custom_routing={"gender": "claude", "religion": "gemini"},
    anthropic_model="claude-sonnet-4-6",   # override default model
)
router = SafetyRouter(config=config)
```

**Fully local** (route everything to a local Ollama model):

```python
from safetyrouter import SafetyRouter
from safetyrouter.providers import OllamaProvider

router = SafetyRouter(
    providers={
        "gpt4": OllamaProvider(model="llama3.2"),
        "claude": OllamaProvider(model="llama3.2"),
        "gemini": OllamaProvider(model="llama3.2"),
        "mixtral": OllamaProvider(model="mixtral"),
    }
)
```

---

### CLI

```bash
# Route a prompt
safetyrouter route "Is discrimination based on religion acceptable?"

# Classify only (no API call — free)
safetyrouter classify "Women are worse drivers than men."

# Show routing table
safetyrouter inspect

# Start HTTP server
safetyrouter serve --port 8000

# JSON output
safetyrouter route "text" --json-output

# Stream response
safetyrouter route "text" --stream
```

---

### HTTP Server

```bash
safetyrouter serve --port 8000
# or
uvicorn safetyrouter.server:app --host 0.0.0.0 --port 8000
```

**Endpoints:**

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/health` | Health check |
| `GET` | `/routing-table` | Inspect routing config |
| `POST` | `/route` | Route + call the best model |
| `POST` | `/classify` | Classify bias only (no model call) |
| `GET` | `/docs` | Interactive Swagger UI |

```bash
# Route a prompt
curl -X POST http://localhost:8000/route \
  -H "Content-Type: application/json" \
  -d '{"text": "Should people be judged by their race?"}'

# Classify only
curl -X POST http://localhost:8000/classify \
  -H "Content-Type: application/json" \
  -d '{"text": "Women should not vote."}'
```
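
The same calls can be made from Python with only the standard library. This is a minimal sketch: `build_request` and `post` are hypothetical helper names, the base URL assumes the server started above, and the reply is assumed to be a JSON object whose fields are not documented here.

```python
import json
import urllib.request


def build_request(
    path: str, text: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build a JSON POST request for the /route or /classify endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post(path: str, text: str) -> dict:
    """Send the request and decode the JSON reply (assumed to be a JSON object)."""
    with urllib.request.urlopen(build_request(path, text), timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


# With a server running (`safetyrouter serve --port 8000`):
# print(post("/classify", "Women are worse drivers than men."))
```

Using `json.dumps` for the body sidesteps shell quoting problems with apostrophes or quotes in the prompt text.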

---

### Docker

```bash
docker build -t safetyrouter .
docker run -p 8000:8000 \
  -e OPENAI_API_KEY=sk-... \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  safetyrouter
```

---

## Configuration

Copy `.env.example` to `.env`:

```env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AIza...
GROQ_API_KEY=gsk_...          # Free tier at console.groq.com

# Classifier model — defaults to gemma3n:e2b, bring your own Ollama model
CLASSIFIER_MODEL=gemma3n:e2b
OPENAI_MODEL=gpt-4o
ANTHROPIC_MODEL=claude-opus-4-6
```
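
Because every API key is optional, it can help to check which providers are usable before routing. The sketch below inspects the process environment (for example after `python-dotenv`'s `load_dotenv()` has read `.env`). The variable-to-provider mapping is an assumption based on the install notes and provider keys in this README, and `configured_providers` and `classifier_model` are hypothetical helpers, not part of the package.

```python
import os

# Assumed mapping from environment variable to the provider key used in the
# routing examples; GROQ_API_KEY -> "mixtral" follows the note on the groq extra.
KEY_TO_PROVIDER = {
    "OPENAI_API_KEY": "gpt4",
    "ANTHROPIC_API_KEY": "claude",
    "GOOGLE_API_KEY": "gemini",
    "GROQ_API_KEY": "mixtral",
}


def configured_providers(env=os.environ) -> list[str]:
    """Return provider keys whose API key is set to a non-empty value."""
    return [name for var, name in KEY_TO_PROVIDER.items() if env.get(var)]


def classifier_model(env=os.environ) -> str:
    """Classifier model name, falling back to the documented default."""
    return env.get("CLASSIFIER_MODEL", "gemma3n:e2b")


print(configured_providers({"OPENAI_API_KEY": "sk-test", "GROQ_API_KEY": ""}))
# ['gpt4']
print(classifier_model({}))
# gemma3n:e2b
```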

---

## Routing Table

| Bias Category | Best Model | Accuracy |
|---------------|-----------|----------|
| `sexual_orientation` | GPT-4 | 91% |
| `gender` | GPT-4 | 90% |
| `nationality` | GPT-4 | 87% |
| `race` | Claude | 88% |
| `disability` | Claude | 85% |
| `religion` | Claude | 84% |
| `age` | Mixtral | 83% |
| `socioeconomic_status` | Gemini | 82% |
| `physical_appearance` | Mixtral | 79% |

---

## Extending SafetyRouter

### Add a custom provider

```python
from safetyrouter import SafetyRouter
from safetyrouter.providers.base import BaseProvider

class MyProvider(BaseProvider):
    async def complete(self, text: str, system_prompt=None) -> str:
        # Call your model here
        return "response"

router = SafetyRouter(providers={"gpt4": MyProvider()})
```

### Add a custom bias category

```python
from safetyrouter import SafetyRouterConfig

config = SafetyRouterConfig(
    custom_routing={
        "political": "claude",   # map new category "political" to Claude
    }
)
```

---

## Development

```bash
git clone https://github.com/rdxvicky/safetyrouter
cd safetyrouter
pip install -e ".[all]"

# Run tests
pytest tests/

# Start dev server
safetyrouter serve --reload
```

---

## Contributing

Pull requests welcome! Areas we'd love help with:

- **Better routing table** — improved benchmark accuracy scores, new bias categories
- **New providers** — Cohere, Together.ai, Mistral API, Azure OpenAI
- **Evaluation suite** — automated benchmarks to validate routing decisions
- **Async Ollama** — true async support for the classifier
- **Caching** — cache classification results for repeated prompts

---

## License

Apache 2.0 — see [LICENSE](LICENSE).

---

## Citation

If you use SafetyRouter in research, please cite:

```
SafetyRouter: A Scalable Bias Detection and Mitigation System
https://github.com/rdxvicky/safetyrouter
```
