Metadata-Version: 2.4
Name: antaris-memory
Version: 4.9.21
Summary: File-based persistent memory for AI agents. Zero dependencies.
Author-email: Antaris Analytics <dev@antarisanalytics.com>
License: Apache-2.0
Project-URL: Homepage, https://github.com/Antaris-Analytics/antaris-memory
Project-URL: Documentation, https://memory.antarisanalytics.ai
Project-URL: Repository, https://github.com/Antaris-Analytics/antaris-memory
Keywords: ai,memory,agents,llm,persistence,recall
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: embeddings
Requires-Dist: openai>=1.0; extra == "embeddings"
Provides-Extra: mcp
Requires-Dist: mcp>=1.0; extra == "mcp"
Provides-Extra: all
Requires-Dist: openai>=1.0; extra == "all"
Requires-Dist: mcp>=1.0; extra == "all"
Provides-Extra: pro
Requires-Dist: openai>=1.0; extra == "pro"
Requires-Dist: mcp>=1.0; extra == "pro"
Dynamic: license-file

# antaris-memory

File-based persistent memory for AI agents: zero dependencies (stdlib only), crash-safe writes, plain-JSON storage.

[![PyPI](https://img.shields.io/pypi/v/antaris-memory)](https://pypi.org/project/antaris-memory/)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-green.svg)](https://python.org)
[![License](https://img.shields.io/badge/license-Apache%202.0-orange.svg)](LICENSE)

## Installation

```bash
pip install antaris-memory
```

Zero dependencies. No API keys. No external services.

## Core Features

- **BM25 search with TF-IDF scoring** - Full-text search with keyword relevance ranking
- **Write-ahead log (WAL)** - Crash-safe writes with automatic recovery
- **Sharding** - Horizontal scaling for large memory stores (10,000+ entries)
- **Decay engine** - Memories fade over time unless reinforced (Ebbinghaus curves)
- **Sentiment tagging** - Automatic emotional context detection
- **Temporal awareness** - Time-based queries and chronological context
- **Confidence scoring** - Reliability metrics for stored information
- **Compression and consolidation** - Automatic deduplication and clustering
- **Forgetting engine** - Selective deletion with audit trails
- **Input gating** - P0-P3 priority classification at intake
- **Recovery presets** - Context restoration after compaction/restart
- **Thread-safe operations** - FileLock using os.mkdir() atomicity
- **MCP server integration** - Expose as MCP tools (optional)
- **Export/Import** *(v4.2.0)* - Serialize all memories to JSON; import with merge and deduplication
- **GCS Backend Stub** *(v4.2.0)* - Interface defined for Google Cloud Storage backend (full implementation in v4.3)

## Quick Start

```python
from antaris_memory import MemorySystem

# Initialize — agent_name personalizes logs and namespace labels
mem = MemorySystem("./workspace", half_life=7.0, agent_name="MyBot")
mem.load()

# Store memories
mem.ingest("PostgreSQL chosen for primary database", 
           category="technical", memory_type="episodic")
mem.ingest("API costs exceed $500/month budget",
           category="operational", confidence=0.9)

# Search with BM25 ranking (searches all memories by default)
results = mem.search("database decision")
for r in results:
    print(f"[{r.confidence:.2f}] {r.content}")

# Multi-tenant: scope search to a specific session
results = mem.search("database decision", session_id="session-abc")

# Save to disk
mem.save()
```

## API Reference

### Core Exports

```python
from antaris_memory import (
    MemorySystem,           # Main interface (aliases MemorySystemV4)
    MemoryEntry,            # Individual memory record
    SearchResult,           # Search result with metadata
    RecoveryManager,        # Post-compaction context restoration
    RecoveryConfig,         # Recovery configuration presets
    
    # Engine Components
    DecayEngine,            # Time-based memory degradation
    SentimentTagger,        # Emotional context detection
    TemporalEngine,         # Time-aware queries
    ConfidenceEngine,       # Reliability scoring
    CompressionEngine,      # Deduplication and clustering
    ForgettingEngine,       # Selective deletion
    ConsolidationEngine,    # Memory optimization
    InputGate,              # Priority classification
)
```

### MemorySystem

Primary interface for all memory operations:

```python
mem = MemorySystem(
    workspace_path="./memory",
    half_life=7.0,                    # Days until 50% decay
    enable_wal=True,                  # Write-ahead logging
    shard_threshold=1000,             # Entries per shard
    recovery_config=RecoveryConfig()  # Post-restart recovery
)
```

#### Core Methods

```python
# Loading and saving
mem.load()                # Load from disk
mem.save()                # Save to disk with WAL

# Memory ingestion
mem.ingest(content, source="", category="", memory_type="episodic",
           confidence=1.0, tags=[], metadata={})

# Typed ingestion helpers
mem.ingest_fact("PostgreSQL supports JSON columns")
mem.ingest_preference("User prefers concise responses")
mem.ingest_mistake("Connection pool not closed properly", "Use context managers or explicit close() in finally block")
mem.ingest_procedure("Deploy: git push → CI → staging → prod")

# Input gating (P0-P3 classification)
mem.ingest_with_gating("Critical security alert", source="monitoring")
mem.ingest_with_gating("Thanks for the update", source="chat")  # Dropped (P3)

# Search
results = mem.search(query, limit=10, explain=False)
results = mem.search(query, tags=["technical", "database"])
results = mem.between("2026-02-01", "2026-02-28")

# Search with instrumentation context (primary API for OpenClaw plugin / agent use)
results, ctx = mem.search_with_context(query, limit=5)
mem.mark_used([r.id for r in results], ctx)  # Boosts relevance of retrieved memories

# Temporal queries
recent = mem.on_date("2026-02-14")
this_week = mem.between("2026-02-17", "2026-02-24")
story = mem.narrative(topic="database migration")

# Maintenance
report = mem.consolidate()        # Dedup and optimize
mem.forget(entity="John Doe")     # GDPR deletion with audit
mem.compact()                     # Archive old shards
```

## BM25 Search Engine

Full-text search using the BM25 algorithm with TF-IDF scoring:

```python
# Basic search
results = mem.search("database performance issues")

# Search with explanation
results = mem.search("postgres slow", explain=True)
for r in results:
    print(f"[{r.relevance:.2f}] {r.content[:60]}")
    print(f"  Explanation: {r.explanation}")

# Advanced parameters
results = mem.search(
    query="API optimization",
    limit=20,
    min_confidence=0.7,
    memory_types=["procedural", "episodic"],
    categories=["technical"]
)
```

Search scoring combines:
- BM25 keyword relevance
- Temporal decay (recent memories score higher)
- Access frequency (frequently accessed memories boost)
- Memory type boost (procedural > episodic for how-to queries)
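The blend described above can be sketched in plain Python. This is an illustrative standalone scorer, not the library's internal implementation; the access-frequency weight (`0.1`) and the query/documents below are assumptions for the example.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Minimal BM25: score each document against the query terms."""
    tokenized = [d.lower().split() for d in docs]
    N = len(docs)
    avgdl = sum(len(t) for t in tokenized) / N
    df = Counter()  # document frequency per term
    for toks in tokenized:
        for term in set(toks):
            df[term] += 1
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(toks) / avgdl))
        scores.append(score)
    return scores

def combined_score(bm25, age_days, access_count, half_life=7.0):
    """Blend keyword relevance with decay and access frequency (weights assumed)."""
    decay = math.exp(-math.log(2) * age_days / half_life)
    return bm25 * decay * (1 + 0.1 * math.log(1 + access_count))

docs = ["database performance tuning", "weekly standup notes",
        "postgres index performance"]
print(bm25_scores("database performance", docs))
```

Documents matching more query terms score higher; the decay factor then discounts older entries, so a stale exact match can rank below a fresh partial match.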

## Write-Ahead Log (WAL)

Crash-safe writes with automatic recovery:

```python
# WAL enabled by default
mem = MemorySystem("./workspace", enable_wal=True)

# Writes go through WAL first
mem.ingest("Important data")  # Written to WAL, then committed
mem.save()                    # Flush WAL to main storage

# Automatic recovery on next load
mem.load()  # Replays uncommitted WAL entries
```

WAL format:
```
workspace/
├── wal/
│   ├── 20260224_143022_001.wal
│   ├── 20260224_143022_002.wal
│   └── current.wal
```
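A write-ahead log of this shape can be sketched with the stdlib alone. This is a toy illustration of the idea, not the library's actual WAL format: each write is appended as one JSON line and fsynced before the main store is touched, so after a crash the uncommitted lines can simply be replayed.

```python
import json
import os
import tempfile

class ToyWAL:
    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Append one JSON line and force it to disk before acknowledging.
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self):
        # Recover every record that reached disk.
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return [json.loads(line) for line in f if line.strip()]

wal = ToyWAL(os.path.join(tempfile.mkdtemp(), "current.wal"))
wal.append({"op": "ingest", "content": "Important data"})
wal.append({"op": "ingest", "content": "More data"})
print(wal.replay())
```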

## Sharding

Horizontal scaling for large memory stores:

```python
# Auto-sharding at 1000 entries per shard (default)
mem = MemorySystem("./workspace", shard_threshold=1000)

# Custom sharding strategy
mem = MemorySystem("./workspace", shard_strategy="temporal")  # By date
mem = MemorySystem("./workspace", shard_strategy="semantic")   # By topic
```

Shard structure:
```
workspace/
├── shards/
│   ├── 2026-02-technical.json    # 847 entries
│   ├── 2026-02-operational.json  # 1,203 entries
│   └── 2026-01-tactical.json     # 512 entries
├── indexes/
│   ├── global_search.json
│   └── shard_manifest.json
```
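Temporal sharding boils down to routing each entry to a month-plus-category shard file. A sketch of that routing, matching the file names above (the exact naming scheme is an assumption here):

```python
from datetime import date

def shard_name(entry_date: date, category: str) -> str:
    """Route an entry to a month+category shard file (naming scheme assumed)."""
    return f"{entry_date:%Y-%m}-{category}.json"

print(shard_name(date(2026, 2, 14), "technical"))
```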

## Decay Engine

Memories fade over time unless reinforced:

```python
# Configure decay parameters
mem = MemorySystem("./workspace", half_life=7.0)  # 7-day half-life

# Query with/without decay consideration
recent = mem.search("performance", use_decay=True)
historical = mem.search("performance", use_decay=False)

# Decay statistics
stats = mem.get_stats()
print(f"Total memories: {stats['total']}")
print(f"Average confidence: {stats['avg_confidence']:.3f}")
```

Decay follows Ebbinghaus forgetting curve:
```
strength = initial_strength * exp(-ln(2) * age_days / half_life)
```
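A standalone check of the same formula (`half_life` is in days, matching the constructor parameter above):

```python
import math

def decayed_strength(initial_strength, age_days, half_life=7.0):
    """Ebbinghaus-style exponential decay: 50% strength after one half-life."""
    return initial_strength * math.exp(-math.log(2) * age_days / half_life)

print(decayed_strength(1.0, 7.0))   # one half-life  -> ~0.5
print(decayed_strength(1.0, 14.0))  # two half-lives -> ~0.25
```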

## Sentiment Tagging

Automatic emotional context detection:

```python
# Automatic sentiment detection
mem.ingest("The deployment failed catastrophically")
# → Tagged with sentiment: negative, intensity: 0.8

mem.ingest("Successfully migrated all user data")  
# → Tagged with sentiment: positive, intensity: 0.7

# Query by sentiment via search filter
positive_memories = mem.search("", sentiment_filter="positive")
issues = mem.search("", sentiment_filter="negative", category="technical")

# Access sentiment metadata
for result in mem.search("deployment"):
    if result.metadata.get("sentiment"):
        sent = result.metadata["sentiment"]
        print(f"Sentiment: {sent['polarity']} ({sent['intensity']:.2f})")
```

Sentiment classification uses lexicon-based analysis (no model calls).
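A lexicon-based tagger of this kind fits in a few lines. The word lists and intensity heuristic below are illustrative assumptions, not the library's lexicon:

```python
POSITIVE = {"successfully", "success", "great", "fixed", "improved"}
NEGATIVE = {"failed", "catastrophically", "error", "broken", "slow"}

def tag_sentiment(text):
    """Count lexicon hits and derive polarity plus a rough intensity score."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos == neg:
        return {"polarity": "neutral", "intensity": 0.0}
    polarity = "positive" if pos > neg else "negative"
    return {"polarity": polarity,
            "intensity": round(max(pos, neg) / (pos + neg), 2)}

print(tag_sentiment("The deployment failed catastrophically"))
print(tag_sentiment("Successfully migrated all user data"))
```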

## Temporal Engine

Time-aware queries and chronological context:

```python
# Date range queries
q1_memories = mem.between("2026-01-01", "2026-03-31")

# Single-day queries
yesterday_memories = mem.on_date("2026-02-23")

# Chronological narrative (returns formatted string)
story = mem.narrative(topic="database migration")

# Time-filtered search
recent_deployments = mem.search(
    "deployment",
    date_range=("2026-02-01", "2026-02-28")
)
```

## Confidence Engine

Track reliability of stored information:

```python
# Store with confidence scores
mem.ingest("PostgreSQL handles 10K QPS", confidence=0.95)  # Measured
mem.ingest("MongoDB might be faster", confidence=0.3)     # Speculation

# Filter by confidence
reliable = mem.search("database performance", min_confidence=0.8)

# Confidence statistics
stats = mem.get_stats()
print(f"Total memories: {stats['total']}")
print(f"Average confidence: {stats['avg_confidence']:.3f}")
```

## Compression and Consolidation

Automatic deduplication and memory optimization:

```python
# Manual consolidation — deduplicates and clusters similar memories
report = mem.consolidate()
print(f"Consolidated: {report}")

# Auto-consolidation at ingest (configurable threshold)
mem = MemorySystem("./workspace", auto_consolidate_threshold=5000)

# Compact — archives old shards and frees memory
report = mem.compact()
print(f"Compacted: {report}")
```

## Forgetting Engine

Selective deletion with audit trails:

```python
# GDPR deletion by entity
result = mem.forget(entity="John Doe")
print(f"Removed {len(result['removed'])} entries")

# Time-based cleanup
mem.forget(before_date="2025-01-01")

# Topic-based deletion
mem.forget(topic="staging")

# Audit trail (returned by forget())
result = mem.forget(entity="Jane Smith")
audit = result["audit"]
```

## Input Gating

Priority classification at intake (P0-P3):

> **Note:** `ingest_with_gating()` automatically filters low-signal content (P3 ephemeral).
> Dropped items are logged at DEBUG level. Use `mem.ingest()` directly if you want to
> store everything without filtering.

```python
# Automatic classification
mem.ingest_with_gating("URGENT: API down", source="alerts")
# → P0 (critical) → stored immediately

mem.ingest_with_gating("Decided on PostgreSQL", source="meeting")  
# → P1 (operational) → stored

mem.ingest_with_gating("sounds good!", source="chat")
# → P3 (ephemeral) → dropped (logged at DEBUG)

# Gating statistics
stats = mem.get_stats()
print(f"Total stored: {stats['total']}")
```

Priority levels:
- **P0 (Critical)**: Security alerts, errors, deadlines, financial data
- **P1 (Operational)**: Decisions, technical choices, assignments
- **P2 (Tactical)**: Background info, research, general discussion
- **P3 (Ephemeral)**: Greetings, acknowledgments, social noise (dropped)
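A keyword-driven classifier along these lines can be sketched as follows. The trigger lists and the short-message heuristic are illustrative assumptions, not the library's actual rules:

```python
def classify_priority(text):
    """Return P0-P3 based on keyword triggers (illustrative rules only)."""
    words = set(text.lower().split())
    if words & {"urgent:", "urgent", "critical", "down", "security", "breach"}:
        return "P0"  # critical: stored immediately
    if words & {"decided", "decision", "chose", "assigned"}:
        return "P1"  # operational: stored
    if words & {"thanks", "hello", "ok"} or len(words) <= 3:
        return "P3"  # ephemeral: dropped
    return "P2"      # tactical: background info

print(classify_priority("URGENT: API down"))
print(classify_priority("Decided on PostgreSQL"))
print(classify_priority("sounds good!"))
```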

## Recovery Manager

Context restoration after compaction or restart:

```python
# Default smart recovery
config = RecoveryConfig()  # 50 memories, 24h window
mem = MemorySystem("./workspace", recovery_config=config)

# Minimal recovery for token efficiency
config = RecoveryConfig(recovery_mode="minimal")  # 10 memories, session only

# Custom recovery
config = RecoveryConfig(
    recovery_search_limit=100,
    recovery_time_window="48h",
    recovery_channels="current",
    recovery_inject="cache"
)

mem.load()
mem.recover_memories()  # Automatic on load

# Access recovered context
recovery_mgr = mem.recovery_manager
cached = recovery_mgr.get_cached_memories()
context_block = recovery_mgr.inject_into_context()
```

Recovery presets:

| Mode | Memories | Window | Tokens | Use Case |
|------|----------|--------|---------|----------|
| `smart` | 50 | 24h | 5-10K | Balanced recovery |
| `minimal` | 10 | session | 1-2K | Token-constrained |

## Context Packets

Package relevant memories for sub-agent injection:

```python
# Single query packet
packet = mem.build_context_packet(
    task="Debug authentication flow",
    max_memories=10,
    max_tokens=2000,
    include_mistakes=True,
    tags=["auth", "security"]
)

# Render for injection
markdown_context = packet.render("markdown")
xml_context = packet.render("xml")

# Multi-query with deduplication
packet = mem.build_context_packet_multi(
    task="Performance optimization",
    queries=["slow queries", "database bottleneck", "caching"],
    max_tokens=3000
)

# Token budget management
packet.trim(max_tokens=1500)
print(f"Final token count: {packet.token_count}")
```
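Greedy budget trimming of the kind `packet.trim()` performs can be sketched like this, using a crude whitespace token estimate (the packet's real token counting and selection order may differ):

```python
def trim_to_budget(memories, max_tokens):
    """Keep highest-relevance memories until the token budget is spent."""
    kept, used = [], 0
    for text, relevance in sorted(memories, key=lambda m: -m[1]):
        cost = len(text.split())  # crude whitespace token estimate
        if used + cost > max_tokens:
            continue
        kept.append(text)
        used += cost
    return kept, used

memories = [
    ("auth tokens expire after 15 minutes", 0.9),
    ("the standup moved to 10am", 0.2),
    ("refresh tokens are rotated on every use", 0.8),
]
kept, used = trim_to_budget(memories, max_tokens=13)
print(kept, used)
```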

## MCP Server Integration

Expose memory as MCP tools:

```python
# Requires: pip install mcp
from antaris_memory import create_mcp_server

# Create server
server = create_mcp_server(workspace="./memory")

# Run with stdio transport
server.run()  # Connect from Claude Desktop, Cursor, etc.

# Available MCP tools:
# - memory_search(query, limit)
# - memory_ingest(content, category, memory_type)
# - memory_consolidate()
# - memory_stats()
```

## Thread Safety

Multiple processes can safely access the same workspace:

```python
from antaris_memory import FileLock

# Exclusive write lock
with FileLock("/path/to/shard.json", timeout=10.0):
    data = load_shard()
    modify_data(data)
    save_shard(data)

# Optimistic concurrency for reads
from antaris_memory import VersionTracker

tracker = VersionTracker()
version = tracker.snapshot("/path/to/data.json")
data = load_data()
process_data(data)
tracker.check(version)  # Raises ConflictError if modified
```

FileLock uses `os.mkdir()` for cross-platform atomic operations.
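The `mkdir` trick works because `os.mkdir()` either creates the directory or raises `FileExistsError`, atomically, on every major platform. A minimal standalone version of the pattern (not the library's `FileLock`) looks like:

```python
import os
import tempfile
import time

class MkdirLock:
    """Toy cross-platform lock: holding the lock == owning the directory."""

    def __init__(self, target, timeout=10.0):
        self.lockdir = target + ".lock"
        self.timeout = timeout

    def __enter__(self):
        deadline = time.monotonic() + self.timeout
        while True:
            try:
                os.mkdir(self.lockdir)  # atomic create-or-fail
                return self
            except FileExistsError:
                if time.monotonic() > deadline:
                    raise TimeoutError(f"could not acquire {self.lockdir}")
                time.sleep(0.05)

    def __exit__(self, *exc):
        os.rmdir(self.lockdir)  # release

path = os.path.join(tempfile.mkdtemp(), "shard.json")
with MkdirLock(path, timeout=1.0):
    print("lock held:", os.path.isdir(path + ".lock"))
print("released:", not os.path.exists(path + ".lock"))
```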

## Export and Import *(v4.2.0)*

Serialize all memories to a portable JSON snapshot, then restore or merge them into any workspace.

```python
from antaris_memory import MemorySystem

memory = MemorySystem("./data")
memory.load()

# Export all memories
count = memory.export("./backup.json")
print(f"Exported {count} memories")

# Import with merge (deduplicates by content hash)
imported = memory.import_from("./backup.json", merge=True)
print(f"Imported {imported} new memories")
```

`export(output_path)` writes a single JSON file containing every memory entry, shard manifests, and index metadata. Returns the total number of exported memories.

`import_from(input_path, merge=True)` reads the snapshot and merges entries into the current workspace, skipping exact duplicates (by content hash). Returns the count of newly added memories. Set `merge=False` to overwrite instead.
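Content-hash deduplication of the kind `merge=True` performs can be sketched as follows. This is a standalone illustration of the idea, not the library's internals; the entry shape is assumed:

```python
import hashlib

def merge_snapshots(existing, incoming):
    """Merge entries, skipping any whose content hash is already present."""
    def key(entry):
        return hashlib.sha256(entry["content"].encode()).hexdigest()

    seen = {key(e) for e in existing}
    added = [e for e in incoming if key(e) not in seen]
    return existing + added, len(added)

existing = [{"content": "PostgreSQL chosen for primary database"}]
incoming = [
    {"content": "PostgreSQL chosen for primary database"},  # duplicate, skipped
    {"content": "API costs exceed $500/month budget"},      # new, merged
]
merged, added = merge_snapshots(existing, incoming)
print(added)
```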

## GCS Backend Stub *(v4.2.0)*

The Google Cloud Storage backend interface is now defined and importable. The full cloud-backed implementation ships in v4.3.

```python
from antaris_memory.backends.gcs import GCSMemoryBackend

backend = GCSMemoryBackend(bucket="my-agent-memories", prefix="prod/")
# Interface is complete; persistence calls are stubbed until v4.3
```

Provides the same `load()`, `save()`, `search()`, and `ingest()` API as the default file backend, making it a drop-in replacement once fully implemented.

## Benchmarks

Tested on Apple M4, Python 3.14, SSD storage.

### Search Performance

| Memories | Ingest (avg) | Search (avg) | Search (p99) | Memory (MB) |
|----------|-------------|-------------|-------------|-------------|
| 100 | 0.053ms | 0.40ms | 0.65ms | 8 |
| 1,000 | 0.033ms | 3.43ms | 5.14ms | 45 |
| 10,000 | 0.035ms | 24.7ms | 38.2ms | 180 |
| 50,000 | 0.041ms | 127ms | 195ms | 850 |

### Comparison with Other Libraries

Search performance against existing memory libraries:

| Library | 1K memories | 10K memories | Dependencies |
|---------|-------------|--------------|-------------|
| antaris-memory | 3.4ms | 24.7ms | 0 (stdlib) |
| mem0 | 610ms | 1,507,000ms | Redis + Vector DB |
| langchain-memory | 185ms | 4,460ms | Multiple |

**Result: 61,030x faster than mem0 at scale, 180x faster at small scale.**

> **Note:** These benchmarks compare antaris-memory's local file-based storage against
> mem0's default networked backend (Qdrant vector DB + Redis). antaris-memory is designed
> as a zero-infrastructure local solution; mem0's strength is cloud-scale distributed search.
> The speed advantage comes from eliminating network round-trips and serialization overhead.

### Input Gating Performance

P0-P3 classification speed:

| Metric | Value |
|--------|-------|
| Average classification | 0.177ms |
| P99 classification | 0.45ms |
| Throughput | 5,650 classifications/sec |

### Storage Efficiency

| Memories | Raw JSON | Compressed | Compression Ratio |
|----------|----------|------------|-------------------|
| 1,000 | 1.1MB | 340KB | 3.2:1 |
| 10,000 | 11.2MB | 2.8MB | 4.0:1 |
| 50,000 | 56.8MB | 12.1MB | 4.7:1 |
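Ratios in this range are what stdlib `gzip` typically achieves on repetitive JSON; the snippet below measures it on synthetic entries (your real workspace numbers will vary):

```python
import gzip
import json

entries = [{"id": i, "content": f"memory entry number {i}", "category": "technical"}
           for i in range(1000)]
raw = json.dumps(entries).encode()
compressed = gzip.compress(raw)
print(f"{len(raw)} bytes -> {len(compressed)} bytes "
      f"({len(raw) / len(compressed):.1f}:1)")
```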

## Storage Format

Plain JSON files for transparency and debuggability:

```
workspace/
├── shards/
│   ├── 2026-02-technical.json    # Technical memories
│   ├── 2026-02-operational.json  # Operational decisions
│   └── 2026-01-archive.json      # Archived memories
├── indexes/
│   ├── search_index.json         # BM25 inverted index
│   ├── tag_index.json            # Tag mappings
│   ├── date_index.json           # Temporal index
│   └── confidence_index.json     # Confidence levels
├── namespaces/                   # Isolated namespace stores
│   └── project-alpha/
│       ├── shards/
│       ├── indexes/
│       └── ...
├── wal/
│   ├── current.wal              # Active write-ahead log
│   └── 20260224_143022.wal      # Rotated WAL files
├── audit/
│   └── deletions.json           # GDPR audit trail
└── config.json                  # Workspace configuration
```

## Architecture

```
MemorySystem (v4.2.0)
├── Core Components
│   ├── ShardManager         # Horizontal scaling
│   ├── IndexManager         # Search indexes
│   └── WALManager           # Write-ahead logging
├── Search Engine
│   ├── BM25Engine           # Keyword ranking
│   ├── TFIDFScorer          # Term frequency scoring
│   └── TemporalRanker       # Time-based relevance
├── Memory Processing
│   ├── DecayEngine          # Ebbinghaus forgetting
│   ├── SentimentTagger      # Emotional context
│   ├── ConfidenceEngine     # Reliability scoring
│   ├── CompressionEngine    # Deduplication
│   ├── ConsolidationEngine  # Memory optimization
│   └── ForgettingEngine     # Selective deletion
├── Input Processing
│   ├── InputGate            # P0-P3 classification
│   └── MemoryTyper          # Episodic/semantic/procedural
├── Recovery System
│   ├── RecoveryManager      # Post-restart context
│   └── RecoveryConfig       # Smart/minimal presets
├── Concurrency
│   ├── FileLock             # Cross-platform locking
│   └── VersionTracker       # Optimistic concurrency
├── Integration
│   ├── MCPServer            # Model Context Protocol
│   └── ContextPacketBuilder # Sub-agent injection
└── Backends (v4.2.0)
    ├── FileBackend          # Default local JSON storage
    ├── ExportImport         # JSON snapshot export/import
    └── GCSMemoryBackend     # Google Cloud Storage stub (full impl v4.3)
```

## Memory Types

Store memories with type-specific optimizations:

```python
# Episodic: events, decisions, meeting notes
mem.ingest("Decided to migrate to PostgreSQL in Q2 meeting", memory_type="episodic")

# Semantic: facts, concepts, general knowledge
mem.ingest("PostgreSQL supports ACID transactions", memory_type="semantic")

# Procedural: how-to steps, runbooks, processes (shorthand helper)
mem.ingest_procedure("Deploy: git push → CI → staging → production")

# Preference: user preferences, style notes (shorthand helper)
mem.ingest_preference("User prefers Python code examples over pseudocode")

# Mistake: errors to avoid, lessons learned (shorthand helper)
mem.ingest_mistake("Forgot to close database connections in worker threads", "Use context managers or explicit close() in finally block")
```

Type-specific recall boosts:
- **Procedural**: 2.5x boost for how-to queries
- **Preference**: 2.0x boost for style/format queries  
- **Mistake**: 1.8x boost for troubleshooting queries
- **Semantic**: 1.2x boost for factual queries
- **Episodic**: Baseline (1.0x)

## Namespace Isolation

Multi-tenant workspaces with hard boundaries:

```python
from antaris_memory import NamespacedMemory, NamespaceManager

# Create isolated namespaces
manager = NamespaceManager("./workspace") 
agent_a = manager.create_namespace("agent-a")
agent_b = manager.create_namespace("agent-b")

# Each namespace is fully isolated
agent_a.ingest("Agent A decision")
agent_b.ingest("Agent B decision")

# Search within namespace only
results_a = agent_a.search("decision")  # Only sees agent A memories
results_b = agent_b.search("decision")  # Only sees agent B memories

# Cross-namespace operations (explicit)
all_decisions = manager.search_across_namespaces("decision", 
                                                 namespaces=["agent-a", "agent-b"])
```

## Testing

Run the full test suite:

```bash
git clone https://github.com/Antaris-Analytics/antaris-memory.git
cd antaris-memory
python -m pytest tests/ -v

# Run specific test categories
python -m pytest tests/test_search.py -v          # Search engine
python -m pytest tests/test_wal.py -v            # Write-ahead log
python -m pytest tests/test_decay.py -v          # Memory decay
python -m pytest tests/test_concurrency.py -v    # Thread safety
```

564 tests pass with zero external dependencies.

## Migration from v3.x

Automatic schema migration on first load:

```python
# v3.x workspaces load automatically
mem = MemorySystem("./existing_v3_workspace")
mem.load()  # Auto-detects v3 format, migrates to v4

# New v4 features available immediately
mem.ingest_with_gating("Test message", source="migration")
results = mem.search("test", explain=True)
```

## Limitations

We are honest about what this library cannot do:

1. **Storage scale**: JSON files work well up to ~50,000 memories. Beyond that, you need a database.
2. **Semantic understanding**: Core search is keyword-based. Add your own embedding function for semantic search.
3. **Graph relationships**: Flat memory store. No entity relationships or graph traversal.
4. **Real-time updates**: File-based storage has write latency. Not suitable for real-time applications.
5. **Distributed systems**: Single-machine only. No clustering or distributed consensus.

When you hit these limits, you know it's time for a more complex solution.

## License

Licensed under the Apache License 2.0. See [LICENSE](LICENSE) for details.

## Related Packages

- [**antaris-router**](https://pypi.org/project/antaris-router/) - Adaptive model routing with SLA enforcement
- [**antaris-guard**](https://pypi.org/project/antaris-guard/) - Security and prompt injection detection  
- [**antaris-context**](https://pypi.org/project/antaris-context/) - Context window optimization
- [**antaris-pipeline**](https://pypi.org/project/antaris-pipeline/) - Agent orchestration pipeline
