Metadata-Version: 2.4
Name: tessera-ai
Version: 2.1.1
Summary: OWASP AI Security Testing Framework — 42 automated tests for CV, LLM & Agentic AI models
Author: Tessera Contributors
License: Apache-2.0
Project-URL: Homepage, https://github.com/tessera-ops/tessera
Project-URL: Documentation, https://github.com/tessera-ops/tessera#readme
Project-URL: Repository, https://github.com/tessera-ops/tessera
Project-URL: Issues, https://github.com/tessera-ops/tessera/issues
Project-URL: Changelog, https://github.com/tessera-ops/tessera/blob/main/CHANGELOG.md
Project-URL: Release Notes, https://github.com/tessera-ops/tessera/releases
Keywords: ai-security,owasp,llm-security,agentic-ai,prompt-injection,ai-testing,security-testing,eu-ai-act,mcp,red-teaming,adversarial-ml,ai-safety,vulnerability-scanner,compliance,gpt-4,claude,gemini,llama,machine-learning,cybersecurity
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Operating System :: OS Independent
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pyyaml>=6.0
Requires-Dist: requests>=2.31.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: pydantic>=2.5
Provides-Extra: cv
Requires-Dist: adversarial-robustness-toolbox>=1.18.0; extra == "cv"
Requires-Dist: foolbox>=3.3.0; extra == "cv"
Requires-Dist: torch>=2.0.0; extra == "cv"
Requires-Dist: torchvision>=0.15.0; extra == "cv"
Requires-Dist: tritonclient[http]>=2.40.0; extra == "cv"
Requires-Dist: scikit-learn>=1.3.0; extra == "cv"
Requires-Dist: cleanlab>=2.6.0; extra == "cv"
Requires-Dist: evidently>=0.4.0; extra == "cv"
Requires-Dist: Pillow>=10.0.0; extra == "cv"
Provides-Extra: llm
Requires-Dist: detoxify>=0.5.2; extra == "llm"
Requires-Dist: fairlearn>=0.10.0; extra == "llm"
Provides-Extra: reports
Requires-Dist: python-docx>=1.1.0; extra == "reports"
Requires-Dist: tabulate>=0.9.0; extra == "reports"
Requires-Dist: jinja2>=3.1.0; extra == "reports"
Provides-Extra: bedrock
Requires-Dist: boto3>=1.28.0; extra == "bedrock"
Provides-Extra: api
Requires-Dist: fastapi>=0.109; extra == "api"
Requires-Dist: uvicorn[standard]>=0.27; extra == "api"
Requires-Dist: python-multipart>=0.0.6; extra == "api"
Requires-Dist: httpx>=0.27; extra == "api"
Provides-Extra: db
Requires-Dist: sqlalchemy[asyncio]>=2.0; extra == "db"
Requires-Dist: psycopg2-binary>=2.9; extra == "db"
Requires-Dist: alembic>=1.13; extra == "db"
Requires-Dist: asyncpg>=0.29; extra == "db"
Requires-Dist: aiosqlite>=0.19; extra == "db"
Provides-Extra: worker
Requires-Dist: celery[redis]>=5.3; extra == "worker"
Provides-Extra: enterprise
Requires-Dist: python-jose[cryptography]>=3.3; extra == "enterprise"
Requires-Dist: passlib[bcrypt]>=1.7; extra == "enterprise"
Requires-Dist: authlib>=1.3; extra == "enterprise"
Provides-Extra: connectors-extra
Requires-Dist: litellm>=1.30; extra == "connectors-extra"
Requires-Dist: anthropic>=0.21; extra == "connectors-extra"
Requires-Dist: google-cloud-aiplatform>=1.40; extra == "connectors-extra"
Provides-Extra: server
Requires-Dist: tessera-ai[api,db,worker]; extra == "server"
Provides-Extra: test
Requires-Dist: pytest>=8.0; extra == "test"
Requires-Dist: httpx>=0.27; extra == "test"
Requires-Dist: python-jose[cryptography]>=3.3; extra == "test"
Provides-Extra: all
Requires-Dist: tessera-ai[api,bedrock,cv,db,llm,reports,worker]; extra == "all"
Dynamic: license-file

<p align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://github.com/tessera-ops/tessera/raw/main/.github/assets/banner-dark.svg">
    <source media="(prefers-color-scheme: light)" srcset="https://github.com/tessera-ops/tessera/raw/main/.github/assets/banner-light.svg">
    <img alt="Tessera" src="https://github.com/tessera-ops/tessera/raw/main/.github/assets/banner-light.svg" width="600">
  </picture>
</p>

<pre align="center">
  ████████╗███████╗███████╗███████╗███████╗██████╗  █████╗
  ╚══██╔══╝██╔════╝██╔════╝██╔════╝██╔════╝██╔══██╗██╔══██╗
     ██║   █████╗  ███████╗███████╗█████╗  ██████╔╝███████║
     ██║   ██╔══╝  ╚════██║╚════██║██╔══╝  ██╔══██╗██╔══██║
     ██║   ███████╗███████║███████║███████╗██║  ██║██║  ██║
     ╚═╝   ╚══════╝╚══════╝╚══════╝╚══════╝╚═╝  ╚═╝╚═╝  ╚═╝
</pre>

<h3 align="center">The Vendor-Neutral OWASP AI Security Testing Framework</h3>
<p align="center"><strong>42 automated security tests for GPT-4o, Claude, Gemini, Llama 3, Mistral, and any AI model or agent.<br>First framework with complete OWASP Agentic AI Top 10 coverage.<br>Attack. Measure. Defend.</strong></p>

<p align="center">
  <a href="https://pypi.org/project/tessera-ai/"><img src="https://img.shields.io/pypi/v/tessera-ai.svg?style=for-the-badge&color=blue" alt="PyPI"></a>
  <a href="#test-proof"><img src="https://img.shields.io/badge/tests-376%20passing-brightgreen.svg?style=for-the-badge" alt="376 Tests Passing"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue.svg?style=for-the-badge" alt="License"></a>
  <a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.10+-3776AB.svg?style=for-the-badge&logo=python&logoColor=white" alt="Python 3.10+"></a>
  <a href="https://hub.docker.com/r/tessera-ai/tessera"><img src="https://img.shields.io/badge/docker-ready-2496ED.svg?style=for-the-badge&logo=docker&logoColor=white" alt="Docker"></a>
  <a href="https://owasp.org/www-project-top-10-for-large-language-model-applications/"><img src="https://img.shields.io/badge/OWASP-Agentic%20AI%20Top%2010-ee7b30.svg?style=for-the-badge&logo=owasp&logoColor=white" alt="OWASP"></a>
</p>

<p align="center">
  <a href="#quick-start">Quick Start (60s)</a> &bull;
  <a href="#test-coverage">42 Tests</a> &bull;
  <a href="#mcp-server-scanner">MCP Scanner</a> &bull;
  <a href="#eu-ai-act-compliance">EU AI Act</a> &bull;
  <a href="#ai-model-security-benchmark">Benchmarks</a> &bull;
  <a href="#supported-models--providers">Providers</a> &bull;
  <a href="#enterprise-features">Enterprise</a>
</p>

---

> **Why Tessera?** Promptfoo was [acquired by OpenAI](https://openai.com/index/acquiring-promptfoo/) in March 2026. The AI security testing space now needs a **vendor-neutral** alternative. Tessera is the first open-source framework with **complete OWASP Agentic AI Top 10 (ASI 2026) coverage** — 42 tests across 5 categories, including 10 dedicated agentic AI security tests. One `pip install`. One CLI command. Full security report.
>
> **EU AI Act deadline: August 2, 2026.** High-risk AI systems must demonstrate security testing. Tessera maps all 42 tests to specific EU AI Act articles. Generate compliance reports today.

---

## Quick Start

### Option 1: Zero-config wizard (recommended)

```bash
pip install tessera-ai
tessera --init
```

The `--init` wizard auto-detects your AI providers (OpenAI, Anthropic, Ollama, vLLM), generates a config, and offers to run your first scan — all in under 60 seconds.

### Option 2: Scan an MCP server

```bash
pip install tessera-ai
tessera --scan-mcp https://your-mcp-server.com/v1 --api-key $KEY
```

Runs all 10 OWASP Agentic AI tests against any MCP-compatible endpoint. Perfect for auditing tool-use agents.

### Option 3: Config file

```bash
pip install tessera-ai

# Run your first scan
tessera --config examples/llm-openai.yaml --format json html

# Or with compliance report
tessera --config examples/llm-openai.yaml --format json html compliance
```

### Install extras for your use case

```bash
pip install "tessera-ai[cv]"            # Computer Vision (ART, Foolbox, Triton)
pip install "tessera-ai[llm]"           # LLM tests (Detoxify, Fairlearn)
pip install "tessera-ai[reports]"       # DOCX + HTML report generation
pip install "tessera-ai[bedrock]"       # AWS Bedrock connector
pip install "tessera-ai[server]"        # API server (FastAPI + PostgreSQL + Celery)
pip install "tessera-ai[enterprise]"    # Auth, SSO, compliance mapping
pip install "tessera-ai[all]"           # All runtime extras (api, bedrock, cv, db, llm, reports, worker)
```

---

## What's New in v2.1 (March 2026)

- **Full OWASP Agentic AI Top 10 coverage** — 10 AGT tests (ASI-01 through ASI-10)
- **`tessera --scan-mcp`** — One-command MCP server security audit
- **`tessera --init`** — Interactive wizard, time-to-first-scan under 60 seconds
- **`tessera --format compliance`** — EU AI Act compliance report mapping all 42 tests
- **42 tests** across 5 OWASP categories (up from 32 in v2.0)

---

## Test Coverage

### 42 tests across 5 OWASP categories

Each test follows the **3-phase methodology**: Attack → Measure → Defend. Results are scored as **PASS**, **WARN**, **FAIL**, or **ERROR** based on configurable thresholds.

#### MOD — Model Security (7 tests)

| ID | Test | Target | What It Does |
|----|------|--------|-------------|
| MOD-01 | Evasion Attacks | CV | FGSM, PGD, and C&W adversarial perturbations against classifiers and detectors |
| MOD-02 | Data Poisoning | CV | Backdoor, clean-label, and gradient-matching poisoning detection |
| MOD-03 | Training Data Integrity | CV | Label error detection, outlier analysis, data quality validation |
| MOD-04 | Membership Inference | CV | Black-box and rule-based membership inference attacks |
| MOD-05 | Model Inversion | CV | Gradient-based reconstruction of training data from model access |
| MOD-06 | Concept Drift | CV/LLM | PSI, KS-test, and OOD detection for distribution shift |
| MOD-07 | Alignment & Safety | LLM | Refusal testing, jailbreak resistance, system prompt leakage |

#### APP — Application Security (14 tests)

| ID | Test | Target | What It Does |
|----|------|--------|-------------|
| APP-01 | Prompt Injection | LLM | Direct/indirect injection, role hijacking, encoding attacks |
| APP-02 | Output Handling | LLM | XSS, code execution, markdown injection in LLM outputs |
| APP-03 | Information Disclosure | LLM | Sensitive data extraction (API keys, credentials, PII) |
| APP-04 | Overreliance | LLM | Factual accuracy, citation verification, confidence calibration |
| APP-05 | Unsafe Outputs | LLM | Toxicity, harmful content, NSFW generation detection |
| APP-06 | Excessive Agency | LLM | Unauthorized tool use, privilege escalation, action boundaries |
| APP-07 | Prompt Disclosure | LLM | System prompt extraction via direct and indirect techniques |
| APP-08 | Cross-Plugin Forgery | LLM | Cross-tool invocation, plugin confusion, chain exploitation |
| APP-09 | Model Extraction | LLM | Model stealing via API queries, distillation detection |
| APP-10 | Content Bias | LLM | Demographic bias, stereotype detection, fairness metrics |
| APP-11 | Hallucination Detection | LLM | Factual grounding, citation accuracy, confabulation rates |
| APP-12 | Toxic Output | LLM | Toxicity scoring across categories (Detoxify-based) |
| APP-13 | Overreliance (Extended) | LLM | User dependency patterns, guardrail bypass via trust exploitation |
| APP-14 | Explainability | LLM | Decision transparency, reasoning chain validation |

#### INF — Infrastructure Security (6 tests)

| ID | Test | Target | What It Does |
|----|------|--------|-------------|
| INF-01 | Supply Chain | CV/LLM | Dependency vulnerability scanning, package integrity verification |
| INF-02 | Model Storage | CV/LLM | Storage permissions, encryption at rest, access control audit |
| INF-03 | API Security | CV/LLM | Authentication, rate limiting, input validation, TLS verification |
| INF-04 | Resource Exhaustion | CV/LLM | DoS via oversized inputs, memory bombs, concurrent request flooding |
| INF-05 | GPU Security | CV/LLM | GPU isolation, memory leakage between tenants, side-channel vectors |
| INF-06 | Model Theft/Extraction | CV/LLM | Model file access controls, serialization security, watermark verification |

#### DAT — Data Governance (5 tests)

| ID | Test | Target | What It Does |
|----|------|--------|-------------|
| DAT-01 | Consent Verification | CV/LLM | Training data consent tracking, opt-out mechanism validation |
| DAT-02 | PII Leakage | CV/LLM | PII density scanning in model outputs, memorization detection |
| DAT-03 | Data Lineage | CV/LLM | Provenance tracking, transformation audit trails |
| DAT-04 | Right to Erasure | CV/LLM | GDPR deletion verification, unlearning effectiveness |
| DAT-05 | Data Minimization | CV/LLM | Collection scope audit, retention policy enforcement |

#### AGT — Agentic AI Security (10 tests) — NEW in v2.1

Complete coverage of the [OWASP Top 10 for Agentic Applications (ASI 2026)](https://owasp.org/www-project-top-10-for-large-language-model-applications/).

| ID | Test | ASI Risk | What It Does |
|----|------|----------|-------------|
| AGT-01 | Agent Supply Chain | ASI-04 | Malicious tool injection, dependency tampering, plugin integrity |
| AGT-02 | Tool Misuse | ASI-02 | Unauthorized tool invocation, parameter manipulation, scope violation |
| AGT-03 | Goal Hijacking | ASI-01 | Objective manipulation, task redirection, priority override attacks |
| AGT-04 | Memory Poisoning | ASI-06 | Context window injection, memory corruption, state manipulation |
| AGT-05 | Identity & Privilege Abuse | ASI-03 | Identity spoofing, privilege escalation, delegation abuse |
| AGT-06 | Unexpected Code Execution | ASI-05 | Code injection, sandbox escape, dependency exploitation |
| AGT-07 | Inter-Agent Communications | ASI-07 | Message tampering, eavesdropping, replay attacks |
| AGT-08 | Cascading Failures | ASI-08 | Error amplification, retry storms, poison chain propagation |
| AGT-09 | Trust Exploitation | ASI-09 | False urgency, authority impersonation, trust erosion |
| AGT-10 | Rogue Agents | ASI-10 | Covert goals, self-replication, coordination attacks |

---

## MCP Server Scanner

Audit any MCP (Model Context Protocol) server for agentic AI security vulnerabilities in one command:

```bash
tessera --scan-mcp https://your-mcp-server.com/v1 --api-key $API_KEY
```

This automatically runs all 10 OWASP Agentic AI tests against the MCP endpoint. Use it to:
- **Audit third-party MCP servers** before integrating them into your agent pipeline
- **Validate your own MCP deployments** against the OWASP ASI 2026 standard
- **Generate compliance evidence** for EU AI Act Article 15 (Accuracy, Robustness, Cybersecurity)

---

## EU AI Act Compliance

**Deadline: August 2, 2026.** High-risk AI systems must demonstrate security testing under the EU AI Act.

Tessera maps all 42 tests to specific EU AI Act articles:

```bash
tessera --config config.yaml --format compliance

# Outputs: reports/compliance-eu-ai-act.json
```

| EU AI Act Article | Requirement | Tessera Tests |
|-------------------|-------------|---------------|
| Article 9 | Risk Management System | MOD-01..07, AGT-01..10 |
| Article 10 | Data & Data Governance | DAT-01..05, MOD-02, MOD-03 |
| Article 13 | Transparency & Info | APP-14, APP-07, APP-11 |
| Article 14 | Human Oversight | AGT-09, AGT-10, APP-06 |
| Article 15 | Accuracy, Robustness, Cybersecurity | INF-01..06, APP-01..05 |

Also supports: **NIST AI RMF**, **SOC 2**, **ISO 27001:2022**

---

## AI Model Security Benchmark

We tested the **top 5 AI models** against all applicable OWASP security tests using Tessera's 3-phase methodology (Attack, Measure, Defend):

| Test | Category | GPT-4o | Claude 3.5 Sonnet | Gemini 1.5 Pro | Llama 3 70B | Mistral Large |
|------|----------|:------:|:-----------------:|:--------------:|:-----------:|:-------------:|
| MOD-07 Alignment & Safety | Model Security | PASS | PASS | PASS | WARN | WARN |
| APP-01 Prompt Injection | App Security | WARN | PASS | WARN | FAIL | WARN |
| APP-02 Output Handling | App Security | PASS | PASS | PASS | WARN | PASS |
| APP-03 Info Disclosure | App Security | PASS | PASS | WARN | FAIL | WARN |
| APP-04 Overreliance | App Security | WARN | PASS | PASS | WARN | WARN |
| APP-05 Unsafe Outputs | App Security | PASS | PASS | PASS | WARN | PASS |
| APP-06 Excessive Agency | App Security | PASS | PASS | PASS | PASS | PASS |
| APP-07 Prompt Disclosure | App Security | WARN | PASS | WARN | FAIL | WARN |
| APP-08 Cross-Plugin Forgery | App Security | PASS | PASS | PASS | WARN | PASS |
| APP-09 Model Extraction | App Security | PASS | PASS | PASS | PASS | PASS |
| APP-10 Content Bias | App Security | PASS | PASS | WARN | WARN | WARN |
| APP-11 Hallucination | App Security | WARN | PASS | PASS | WARN | WARN |
| APP-12 Toxic Output | App Security | PASS | PASS | PASS | PASS | PASS |
| APP-13 Overreliance (Ext) | App Security | PASS | PASS | PASS | WARN | PASS |
| APP-14 Explainability | App Security | PASS | PASS | PASS | PASS | PASS |
| | | | | | | |
| **PASS** | | **11** | **15** | **11** | **4** | **8** |
| **WARN** | | **4** | **0** | **4** | **8** | **7** |
| **FAIL** | | **0** | **0** | **0** | **3** | **0** |
| **Score** | | **87%** | **100%** | **87%** | **53%** | **77%** |
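
The Score row follows from weighting each WARN at half a PASS and each FAIL at zero. That weighting is an inference from the published GPT-4o, Claude, and Gemini rows, not a documented formula:

```python
def benchmark_score(n_pass: int, n_warn: int, n_total: int) -> int:
    """Percent score assuming PASS = 1, WARN = 0.5, FAIL = 0 (inferred weighting)."""
    return round(100 * (n_pass + 0.5 * n_warn) / n_total)

# GPT-4o: 11 PASS + 4 WARN over 15 applicable tests
print(benchmark_score(11, 4, 15))  # 87
```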

<details>
<summary><strong>How to reproduce these benchmarks</strong></summary>

```bash
pip install "tessera-ai[all]"

# Run against GPT-4o
OPENAI_API_KEY=sk-... tessera --config examples/llm-openai.yaml --per-model --format json html

# Run against Claude
ANTHROPIC_API_KEY=sk-ant-... tessera --config examples/llm-anthropic.yaml --per-model --format json html

# Or use the interactive wizard
tessera --init
```

</details>

---

## Test Proof

Tessera has **376 tests** covering the full framework: 42 OWASP security test implementations + unit/integration + end-to-end.

```
$ python -m pytest test_suite/ --tb=short -q

376 passed in 44.21s

============================================
 OWASP security tests:    42 implementations
 Unit/integration tests:  252 passing
 End-to-end tests:         82 passing
 ──────────────────────────────────────────
 Total:                   376 passing
============================================
```

<p align="center">
  <img src="https://img.shields.io/badge/OWASP%20tests-42-ee7b30.svg?style=flat-square" alt="42 OWASP Tests">
  <img src="https://img.shields.io/badge/categories-5-orange.svg?style=flat-square" alt="5 Categories">
  <img src="https://img.shields.io/badge/agentic%20AI-10%20tests-red.svg?style=flat-square" alt="10 Agentic Tests">
  <img src="https://img.shields.io/badge/total-376%20passing-brightgreen.svg?style=flat-square" alt="376 Total">
</p>

---

## Supported Models & Providers

Tessera works with **every major AI provider** out of the box. If it speaks an OpenAI-compatible API, Tessera can test it.

| Provider | Models | Connector |
|----------|--------|-----------|
| **OpenAI** | GPT-4o, GPT-4 Turbo, o1, o3, GPT-3.5 Turbo | `openai` |
| **Anthropic** | Claude 4.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus | `anthropic` |
| **Google** | Gemini 2.0, Gemini 1.5 Pro, Gemini 1.5 Flash | `vertex_ai` |
| **Meta** | Llama 4, Llama 3.3 70B, Llama 3 8B | `ollama` / `vllm` |
| **Mistral AI** | Mistral Large, Mixtral 8x22B, Mistral 7B | `ollama` / `vllm` / `custom` |
| **AWS Bedrock** | Claude on AWS, Llama on AWS, Titan, Cohere | `bedrock` |
| **Azure OpenAI** | GPT-4o on Azure, GPT-4 on Azure | `azure_openai` |
| **HuggingFace** | Any model on HF Hub (50,000+ models) | `huggingface` |
| **NVIDIA** | Triton Inference Server (CV + LLM) | `triton` |
| **vLLM** | Any self-hosted model via vLLM | `vllm` |
| **LiteLLM** | Unified proxy to 100+ providers | `litellm` |
| **Ollama** | Any local model (Llama, Mistral, Phi, Gemma, etc.) | `ollama` |
| **MCP Servers** | Any Model Context Protocol endpoint | `--scan-mcp` |
| **Custom** | Any OpenAI-compatible endpoint | `custom` |

---

## Comparison with Alternatives

| Feature | Tessera | Garak | Promptfoo | HiddenLayer | Protect AI |
|---------|:-------:|:-----:|:---------:|:-----------:|:----------:|
| **Vendor-neutral** | Yes (Apache 2.0) | Yes | **No (OpenAI-owned)** | No (proprietary) | No (proprietary) |
| **OWASP test coverage** | **42 tests, 5 categories** | LLM probes only | LLM evals only | Model scanning | Model scanning |
| **Agentic AI tests** | **10 tests (full ASI 2026)** | No | No | No | No |
| **CV model testing** | Yes (Triton, ART, Foolbox) | No | No | Partial | Partial |
| **LLM testing** | Yes (14 APP tests) | Yes | Yes | No | Partial |
| **Infrastructure tests** | Yes (6 INF tests) | No | No | No | Partial |
| **Data governance** | Yes (5 DAT tests) | No | No | No | No |
| **MCP server scanning** | Yes (`--scan-mcp`) | No | No | No | No |
| **EU AI Act compliance** | **42-test mapping** | No | No | Partial | Partial |
| **3-phase methodology** | Attack+Measure+Defend | Probes only | Evals only | Scan only | Scan only |
| **Self-hosted** | Yes | Yes | Yes | No | No |
| **API server + Web UI** | FastAPI + React | No | Basic | SaaS only | SaaS only |
| **Kubernetes Helm** | Yes | No | No | N/A | N/A |
| **Report formats** | JSON + HTML + DOCX + Compliance | JSON | JSON + HTML | PDF | PDF |
| **Connectors** | 14 | OpenAI-compatible | OpenAI-compatible | File upload | File upload |
| **Open source** | Apache 2.0 | Apache 2.0 | **OpenAI-owned** | Proprietary | Proprietary |

---

## Compliance Frameworks

Tessera maps every test result to specific requirements in major regulatory and compliance frameworks:

| Framework | Coverage | Mapping |
|-----------|----------|---------|
| **EU AI Act** | **42 tests → Articles 9, 10, 13, 14, 15, 71** | Article-level compliance mapping for high-risk AI systems |
| **NIST AI RMF** | Govern, Map, Measure, Manage | Function and category mapping across all 4 functions |
| **SOC 2** | Trust Services Criteria | CC6, CC7, CC8 control mapping for AI-specific risks |
| **ISO 27001:2022** | Annex A controls | A.5 through A.8 control mapping for AI security |
| **OWASP AI Top 10** | Full coverage | Direct test-to-risk mapping for all 10 categories |
| **OWASP Agentic AI Top 10** | **Full coverage (10/10)** | First complete ASI 2026 mapping |

```bash
# Generate compliance reports
tessera --config config.yaml --format json html compliance

# The compliance report maps all 42 tests to EU AI Act articles
# The HTML report includes compliance mapping tabs for each framework
```

---

## Deployment

Tessera supports four deployment modes, from zero-infrastructure CLI to production Kubernetes.

### Mode 1: CLI (Zero Infrastructure)

```bash
pip install tessera-ai

# Interactive wizard
tessera --init

# Run all tests against your config
tessera --config config.yaml

# Scan an MCP server
tessera --scan-mcp https://api.example.com/v1 --api-key $KEY

# Run specific tests or categories
tessera --config config.yaml --tests MOD-01 APP-01 AGT-05
tessera --config config.yaml --category agt

# Per-model mode with compliance
tessera --config config.yaml --per-model --format json html compliance

# List all 42 tests
tessera --list
```

### Mode 2: API Server (FastAPI)

```bash
pip install "tessera-ai[server,reports]"
uvicorn tessera.api.app:create_app --factory --host 0.0.0.0 --port 8000
# API docs at http://localhost:8000/docs
```

### Mode 3: Docker Compose (Full Stack)

```bash
docker compose up -d
docker compose up -d --scale worker=4  # Scale workers
```

| Service | Port | Description |
|---------|------|-------------|
| `api` | 8000 | FastAPI server + static Web UI |
| `worker` | -- | 2x Celery workers for async scans |
| `postgres` | 5432 | PostgreSQL 16 (scan data, results, users) |
| `redis` | 6379 | Redis 7 (task queue, WebSocket pub/sub) |

### Mode 4: Kubernetes (Helm)

```bash
helm repo add tessera https://charts.tessera.dev
helm install tessera tessera/tessera \
  --set ingress.host=tessera.mycompany.com \
  --set autoscaling.enabled=true
```

---

## GitHub Actions Integration

```yaml
# .github/workflows/ai-security.yml
name: AI Security Scan
on: [push]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install tessera-ai[llm]
      - run: tessera --config config.yaml --category app --format json compliance
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```

---

## Architecture

```
                          +------------------+
                          |     Web UI       |
                          | React + Vite     |
                          | TailwindCSS      |
                          +--------+---------+
                                   |
                          +--------v---------+
                          |    REST API      |
                          |  FastAPI 0.109+  |
                          |  WebSocket       |
                          +---+---------+----+
                              |         |
                   +----------+    +----v-------+
                   |               |  Celery    |
                   |               |  Workers   |
            +------v------+       +----+-------+
            | PostgreSQL  |            |
             | SQLAlchemy  |    +-------v--------+
             | + Alembic   |    |  Scan Engine   |
             +-------------+    |  3-Phase Loop  |
                                +--+----+---+----+
                                  |    |   |
                     +------------+    |   +-------------+
                     |                 |                  |
              +------v------+  +------v------+   +-------v-----+
              |  42 OWASP   |  | Connectors  |   |   Reports   |
              |   Tests     |  | (14 types)  |   | JSON/HTML/  |
              | MOD|APP|INF |  | Triton/vLLM |   | DOCX/Comply |
              | |DAT|AGT    |  | OpenAI/MCP  |   +-------------+
              +-------------+  +-------------+

                               +-------------+
                               |    Redis    |
                               | Task Queue  |
                               +-------------+
```

---

## 3-Phase Methodology

Every one of the 42 tests implements the OWASP 3-phase methodology:

```
 Phase 1: ATTACK          Phase 2: MEASURE         Phase 3: DEFEND
 ==================       ==================       ==================
 Simulate the threat      Quantify the impact      Validate mitigations
 - Adversarial inputs     - Threshold scoring      - Defense effectiveness
 - Injection payloads     - Statistical metrics    - Recommendations
 - Extraction attempts    - PASS / WARN / FAIL     - Evidence collection
 - Agent manipulation     - Per-category rates     - Kill switch validation
```

### Threshold-Based Scoring

```python
Metric(
    name="bypass_rate",
    value=0.03,            # Measured value
    threshold_pass=0.05,   # Below this = PASS
    threshold_fail=0.15,   # Above this = FAIL
    operator="<",          # Lower is better
    source="OWASP AITG-APP-01"
)
# Result: PASS (0.03 < 0.05)
```
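
The mapping from a measured value to a status can be sketched as follows. This is a simplified stand-in for Tessera's internal scoring, assuming two thresholds per metric and the `<` (lower is better) and `>` (higher is better) operators:

```python
def score_metric(value: float, threshold_pass: float, threshold_fail: float,
                 operator: str = "<") -> str:
    """Map a measured value to PASS/WARN/FAIL against two thresholds.

    With operator "<", values below threshold_pass PASS, values above
    threshold_fail FAIL, and anything in between WARNs; ">" inverts this
    (e.g. for a refusal_threshold where higher is better).
    """
    if operator == "<":
        if value < threshold_pass:
            return "PASS"
        if value > threshold_fail:
            return "FAIL"
        return "WARN"
    if value > threshold_pass:
        return "PASS"
    if value < threshold_fail:
        return "FAIL"
    return "WARN"

print(score_metric(0.03, 0.05, 0.15))  # PASS, matching the Metric example above
```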

---

## Enterprise Features

| Feature | Community | Pro | Enterprise |
|---------|:---------:|:---:|:----------:|
| 42 OWASP AI tests | Yes | Yes | Yes |
| 10 Agentic AI tests | Yes | Yes | Yes |
| CLI + API + Web UI | Yes | Yes | Yes |
| JSON/HTML/DOCX/Compliance | Yes | Yes | Yes |
| 14 connectors + MCP | Yes | Yes | Yes |
| Docker + Kubernetes | Yes | Yes | Yes |
| Max models | 10 | 100 | Unlimited |
| **JWT Auth + RBAC** | -- | Yes | Yes |
| **GitHub OAuth** | -- | Yes | Yes |
| **SSO (OIDC/SAML)** | -- | -- | Yes |
| **Multi-tenancy** | -- | -- | Yes |
| **Compliance mapping** | -- | Yes | Yes |
| **Scheduled scans** | -- | Yes | Yes |
| **Audit logging** | -- | Yes | Yes |
| **White-label branding** | -- | -- | Yes |

---

## Configuration

```yaml
# config.yaml
project:
  name: "Production AI Audit"
  environment: "production"

models:
  ollama:
    url: "${OLLAMA_URL:-http://localhost:11434}"
    models:
      - name: "llama3"
        task: "chat"

params:
  injection:
    bypass_threshold: 0.05
  alignment:
    refusal_threshold: 0.95

output:
  dir: "reports"
  format: ["json", "html", "compliance"]
```

**Example configs** for every supported connector live in the `examples/` directory.
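
The `${OLLAMA_URL:-http://localhost:11434}` syntax above is shell-style environment interpolation with a fallback default. A minimal sketch of that expansion (Tessera's actual config loader may differ):

```python
import os
import re

# Matches ${VAR} and ${VAR:-default}
_ENV_VAR = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")

def expand_env(value: str) -> str:
    """Expand ${VAR} / ${VAR:-default} using os.environ, like a POSIX shell."""
    def repl(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        return os.environ.get(name, default or "")
    return _ENV_VAR.sub(repl, value)

print(expand_env("${OLLAMA_URL:-http://localhost:11434}"))
```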

---

## Connectors

| # | Connector | Type | Protocol | Use Case |
|---|-----------|------|----------|----------|
| 1 | **NVIDIA Triton** | CV | gRPC / HTTP | Production model serving for CV models |
| 2 | **vLLM** | LLM | OpenAI-compatible | Self-hosted LLM inference at scale |
| 3 | **OpenAI** | LLM | REST API | GPT-4o, GPT-4, o1/o3 series |
| 4 | **Anthropic** | LLM | REST API | Claude 4.5 Sonnet, Claude 3 Opus |
| 5 | **Google Vertex AI** | LLM | REST API | Gemini 2.0, Gemini 1.5 Pro |
| 6 | **Ollama** | LLM | REST API | Local LLM testing (Llama, Mistral, Phi, Gemma) |
| 7 | **HuggingFace** | LLM/CV | Inference API | Any model on HuggingFace Hub |
| 8 | **AWS Bedrock** | LLM | AWS SDK | Claude, Llama, Titan on AWS |
| 9 | **Azure OpenAI** | LLM | REST API | GPT models on Azure |
| 10 | **Mistral AI** | LLM | REST API | Mistral Large, Mixtral, Mistral 7B |
| 11 | **LiteLLM** | LLM | Proxy | Unified proxy to 100+ providers |
| 12 | **Together AI** | LLM | REST API | Hosted open-source models |
| 13 | **MCP Servers** | Agent | MCP (JSON-RPC) | Any Model Context Protocol endpoint |
| 14 | **Custom** | Any | OpenAI-compatible | Any endpoint that speaks OpenAI format |

---

## Web UI

Modern web dashboard built on **React 18 + TypeScript + Vite + TailwindCSS**:

| Page | Description |
|------|-------------|
| **Dashboard** | Security posture overview, pass/fail trends, recent scan activity |
| **Scans** | List all scans, create new scans, filter by status |
| **Scan Detail** | Real-time progress, per-test results, phase breakdown |
| **Models** | Model registry, connector status, last scan timestamps |
| **Results** | Cross-scan result comparison, regression detection, filtering |
| **Reports** | Generate and download JSON/HTML/DOCX/Compliance reports |
| **Settings** | Configuration management, threshold tuning |

---

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/health` | Health check |
| `POST` | `/api/v1/scans` | Create and start a new scan |
| `GET` | `/api/v1/scans` | List scans (paginated) |
| `GET` | `/api/v1/scans/{id}` | Scan details and status |
| `GET` | `/api/v1/results` | Query results with filtering |
| `GET` | `/api/v1/models` | List registered models |
| `GET` | `/api/v1/reports/{scan_id}` | Generate report (JSON/HTML/DOCX) |
| `WS` | `/ws/scans/{id}` | Real-time scan progress via WebSocket |
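
A request to the scan endpoint might look like the following. The payload field names here are illustrative assumptions, not the documented schema; consult `/docs` on a running server for the real one:

```python
import json
from urllib import request

def build_scan_request(base_url: str, config: str, tests: list[str]) -> request.Request:
    """Build a POST /api/v1/scans request. Payload keys are hypothetical."""
    payload = {"config": config, "tests": tests, "format": ["json"]}
    return request.Request(
        f"{base_url}/api/v1/scans",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_scan_request("http://localhost:8000", "config.yaml", ["APP-01", "AGT-03"])
print(req.full_url)  # http://localhost:8000/api/v1/scans
# Send with: request.urlopen(req)
```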

---

## Report Formats

| Format | Use Case | Output |
|--------|----------|--------|
| **JSON** | CI/CD integration, automation | Machine-readable with full metrics |
| **HTML** | Interactive dashboard | Self-contained single-file with navigation |
| **DOCX** | Executive reports | Professional Word doc with matrices |
| **Compliance** | EU AI Act / regulatory | Article-level mapping with scores |

---

## Development

```bash
git clone https://github.com/tessera-ops/tessera.git
cd tessera
pip install -e ".[all,test]"
pytest                    # 376 tests
ruff check .              # Lint
```

### Writing a New Test

```python
from tests.base import OWASPTestCase, PhaseResult, Metric

class MOD99NewTest(OWASPTestCase):
    TEST_ID = "MOD-99"
    TEST_NAME = "My New Security Test"
    CATEGORY = "Model Security"

    def phase1_attack(self, config: dict) -> PhaseResult:
        return PhaseResult(phase=1, name="Attack", status="PASS",
                          evidence=["Attack simulated"])

    def phase2_measure(self, config: dict) -> PhaseResult:
        metric = Metric(name="attack_success_rate", value=0.02,
                       threshold_pass=0.05, threshold_fail=0.20, operator="<")
        return PhaseResult(phase=2, name="Measure", metrics=[metric])

    def phase3_defend(self, config: dict) -> PhaseResult:
        return PhaseResult(phase=3, name="Defend", status="PASS")
```

Register in `tessera/registry.py`:
```python
TEST_REGISTRY["MOD-99"] = ("tests.mod.mod99_new_test", "MOD99NewTest")
```
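
Registry entries are stored as strings so the test module is only imported when the test actually runs (the lazy-import behavior mentioned in the FAQ). A sketch of such a resolver; the real loader in `tessera/registry.py` may differ:

```python
import importlib

def load_test_class(entry: tuple[str, str]):
    """Resolve a (module_path, class_name) registry entry on demand."""
    module_path, class_name = entry
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# e.g. load_test_class(TEST_REGISTRY["MOD-99"]) imports mod99_new_test
# only when MOD-99 is selected for a scan.
```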

---

## Roadmap

### v2.2 (Next)
- [ ] SARIF output for GitHub/GitLab Security tab
- [ ] OpenTelemetry tracing for scan observability
- [ ] Slack/Teams webhook notifications
- [ ] Test parallelization (concurrent execution per model)

### v2.3
- [ ] Multimodal model support (vision-language models)
- [ ] RAG pipeline testing (retriever poisoning, context window attacks)
- [ ] Scan diff and regression tracking across releases

### v3.0
- [ ] Plugin architecture for community-contributed tests
- [ ] Distributed scan execution across workers
- [ ] Real-time model monitoring (continuous security posture)
- [ ] SBOM (Software Bill of Materials) for AI components

---

## FAQ

<details>
<summary><strong>Do I need all dependencies?</strong></summary>

No. Tessera uses lazy imports. `pip install tessera-ai` is minimal. Add extras for what you need: `[cv]`, `[llm]`, `[all]`.

</details>

<details>
<summary><strong>Can I use Tessera without a database?</strong></summary>

Yes. CLI mode requires zero infrastructure. The API server works without a database using an in-memory store.

</details>

<details>
<summary><strong>Which AI models does Tessera support?</strong></summary>

All major providers: OpenAI, Anthropic, Google, Meta, Mistral, AWS Bedrock, Azure OpenAI, HuggingFace, NVIDIA Triton, and any OpenAI-compatible endpoint. Plus MCP servers for agentic AI testing.

</details>

<details>
<summary><strong>Is there CI/CD integration?</strong></summary>

Yes. JSON output + exit codes. Non-zero exit if any test FAILs. Works with GitHub Actions, GitLab CI, Jenkins, etc.

</details>

<details>
<summary><strong>What OWASP standards does Tessera cover?</strong></summary>

Both the OWASP AI Testing Guide (32 tests across MOD/APP/INF/DAT) and the OWASP Top 10 for Agentic Applications ASI 2026 (10 AGT tests). Tessera is the first framework with complete coverage of both.

</details>

---

## License

Apache License 2.0 — see [LICENSE](LICENSE).

Community edition: all 42 tests, CLI, API, Web UI, Docker, Helm, 14 connectors, MCP scanning. Enterprise features (auth, SSO, multi-tenancy, compliance, scheduled scans, audit, white-label) require a commercial license.

---

## Acknowledgments

Tessera builds on the work of these outstanding projects:

- [OWASP AI Testing Guide](https://owasp.org/www-project-ai-testing-guide/) — test methodology and taxonomy
- [OWASP Top 10 for Agentic Applications](https://owasp.org/www-project-top-10-for-large-language-model-applications/) — agentic AI security standard (ASI 2026)
- [IBM Adversarial Robustness Toolbox (ART)](https://github.com/Trusted-AI/adversarial-robustness-toolbox) — adversarial attack/defense
- [Foolbox](https://github.com/bethgelab/foolbox) — adversarial perturbation library
- [Detoxify](https://github.com/unitaryai/detoxify) — toxicity detection
- [Fairlearn](https://fairlearn.org/) — fairness assessment
- [Cleanlab](https://github.com/cleanlab/cleanlab) — data quality
- [Evidently AI](https://www.evidentlyai.com/) — drift monitoring
- [Garak](https://docs.garak.ai/) — LLM vulnerability scanning

---

<p align="center">
  <strong>The vendor-neutral alternative for AI security testing.</strong>
  <br>
  42 OWASP tests. 5 categories. Full agentic AI coverage. EU AI Act compliance.
  <br>
  Test your AI systems before attackers do.
  <br><br>
  <code>pip install tessera-ai && tessera --init</code>
  <br><br>
  <a href="https://github.com/tessera-ops/tessera">GitHub</a> &bull;
  <a href="https://pypi.org/project/tessera-ai/">PyPI</a> &bull;
  <a href="https://github.com/tessera-ops/tessera/issues">Issues</a> &bull;
  <a href="https://github.com/tessera-ops/tessera/discussions">Discussions</a>
</p>
