Metadata-Version: 2.4
Name: getgrip
Version: 0.4.0
Summary: The memory layer for AI — MCP server with 15 tools, episodic memory, learned vocabulary, and confidence scoring. No vectors, no cloud, no config.
Author: Grip Hub
License: Proprietary
Project-URL: Homepage, https://getgrip.dev
Project-URL: Documentation, https://github.com/Grip-Hub/getgrip.dev/blob/main/GUIDE.md
Project-URL: Repository, https://github.com/Grip-Hub/getgrip.dev
Project-URL: Bug Tracker, https://github.com/Grip-Hub/getgrip.dev/issues
Keywords: retrieval,search,rag,code-search,bm25,offline,no-embeddings,no-vector-db
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Text Processing :: Indexing
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: fastapi>=0.95
Requires-Dist: uvicorn[standard]>=0.20
Requires-Dist: pydantic>=2.0
Requires-Dist: numpy>=1.20
Requires-Dist: requests>=2.28
Provides-Extra: mcp
Requires-Dist: fastmcp>=2.0.0; extra == "mcp"
Provides-Extra: license
Requires-Dist: cryptography>=41.0; extra == "license"
Provides-Extra: pdf
Requires-Dist: pypdf>=3.0; extra == "pdf"
Provides-Extra: rerank
Requires-Dist: sentence-transformers>=2.2; extra == "rerank"
Provides-Extra: llm
Requires-Dist: openai>=1.0; extra == "llm"
Requires-Dist: anthropic>=0.18; extra == "llm"
Requires-Dist: groq>=0.4; extra == "llm"
Provides-Extra: docs
Requires-Dist: pypdf>=4.0; extra == "docs"
Requires-Dist: python-docx>=1.0; extra == "docs"
Requires-Dist: openpyxl>=3.1; extra == "docs"
Requires-Dist: python-pptx>=0.6; extra == "docs"
Requires-Dist: striprtf>=0.0.26; extra == "docs"
Requires-Dist: odfpy>=1.4; extra == "docs"
Requires-Dist: xlrd>=2.0; extra == "docs"
Requires-Dist: olefile>=0.47; extra == "docs"
Provides-Extra: ocr
Requires-Dist: pytesseract>=0.3.10; extra == "ocr"
Requires-Dist: Pillow>=9.0; extra == "ocr"
Requires-Dist: scikit-image>=0.20; extra == "ocr"
Provides-Extra: ocr-rapid
Requires-Dist: rapidocr>=3.0; extra == "ocr-rapid"
Provides-Extra: vision
Requires-Dist: transformers>=4.36; extra == "vision"
Requires-Dist: timm>=0.9; extra == "vision"
Requires-Dist: Pillow>=9.0; extra == "vision"
Requires-Dist: torch>=2.0; extra == "vision"
Provides-Extra: all
Requires-Dist: cryptography>=41.0; extra == "all"
Requires-Dist: fastmcp>=2.0.0; extra == "all"
Requires-Dist: pypdf>=4.0; extra == "all"
Requires-Dist: sentence-transformers>=2.2; extra == "all"
Requires-Dist: openai>=1.0; extra == "all"
Requires-Dist: anthropic>=0.18; extra == "all"
Requires-Dist: groq>=0.4; extra == "all"
Requires-Dist: python-docx>=1.0; extra == "all"
Requires-Dist: openpyxl>=3.1; extra == "all"
Requires-Dist: python-pptx>=0.6; extra == "all"
Requires-Dist: striprtf>=0.0.26; extra == "all"
Requires-Dist: odfpy>=1.4; extra == "all"
Requires-Dist: xlrd>=2.0; extra == "all"
Requires-Dist: olefile>=0.47; extra == "all"
Requires-Dist: pytesseract>=0.3.10; extra == "all"
Requires-Dist: Pillow>=9.0; extra == "all"
Requires-Dist: scikit-image>=0.20; extra == "all"

# GRIP

**The memory layer for AI.**

An MCP server that gives your AI agents persistent memory across documents, conversations, and sessions. 15 tools, one config file. No vectors, no cloud, no embedding models.

[getgrip.dev](https://getgrip.dev) | [User Guide](https://github.com/Grip-Hub/getgrip.dev/blob/main/GUIDE.md) | [GitHub](https://github.com/Grip-Hub/getgrip.dev)

---

## Install

```bash
pip install getgrip
```

### MCP (recommended)

Add to your `.mcp.json` or Claude Desktop config:

```json
{
  "mcpServers": {
    "grip": {
      "command": "grip-mcp",
      "args": []
    }
  }
}
```

Your agent now has 15 tools: search, ingest, store/recall memory, list sources, and more.

### Standalone

```bash
getgrip                          # starts web UI + API on localhost:7878
```

```bash
# Ingest
curl -X POST localhost:7878/ingest \
  -H "Content-Type: application/json" \
  -d '{"paths": ["/path/to/your/data"]}'

# Search
curl "localhost:7878/search?q=valve+specification&top_k=5"
```

Open `http://localhost:7878` for the web UI.

---

## What it does

GRIP gives AI agents memory. Not a vector database — actual memory that compounds with use.

**Episodic memory.** Agents store facts and observations during conversations. Those facts become searchable alongside your documents. Memories supersede each other, carry tags, and expire on schedule.

**Document memory.** Ingest files, directories, git repos, or URLs. GRIP reads 30+ formats including scanned documents (OCR) and technical drawings (visual captioning). Every chunk is indexed and searchable in under 5ms.

**Knowledge artifacts.** First query runs a full retrieval pass. The answer gets cached with citations. Second query on the same topic: zero LLM calls, sub-millisecond. The corpus gets smarter with use.
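
The caching behavior described above can be pictured with a short sketch (illustrative only; the class and method names are invented here, not GRIP's internal API): an answer is cached together with its citations and a corpus version, and any ingest stales every cached artifact.

```python
class ArtifactCache:
    """Sketch of the knowledge-artifact idea. Names are invented for
    illustration; this is not GRIP's internal API."""

    def __init__(self):
        self.corpus_version = 0  # bumped whenever the corpus changes
        self.artifacts = {}      # query -> (answer, citations, version)

    def ingest(self):
        # Any ingest (or delete) invalidates every cached artifact.
        self.corpus_version += 1

    def answer(self, query, retrieve_and_synthesize):
        cached = self.artifacts.get(query)
        if cached and cached[2] == self.corpus_version:
            answer, citations, _ = cached
            return answer, citations, True   # cache hit: zero LLM calls
        answer, citations = retrieve_and_synthesize(query)  # full retrieval pass
        self.artifacts[query] = (answer, citations, self.corpus_version)
        return answer, citations, False
```

The second call for the same query returns the cached answer without touching the LLM; the next ingest bumps the version, so a stale artifact triggers a fresh pass.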

| | Context stuffing | Vector RAG | GRIP |
|---|---|---|---|
| Cost per query | ~$0.14 | ~$0.09 | **~$0.07** |
| Reads all documents | Yes (expensive) | No (top-k only) | **Yes (exhaustive)** |
| Learns your vocabulary | No | No | **Yes** |
| Remembers across sessions | No | No | **Yes (episodic memory)** |
| Knows when it doesn't know | No | No | **Yes (confidence scoring)** |
| Works offline | No | Rarely | **Fully air-gapped** |
| Setup | Complex | Complex | **One config file** |

---

## MCP tools

| Tool | Description |
|------|-------------|
| `search` | Search documents — returns ranked results with scores |
| `query` | Search + LLM answer with citations |
| `ingest` | Add files, directories, git repos, URLs |
| `store_memory` | Store a fact or observation from the conversation |
| `recall_memory` | Search stored memories by similarity |
| `list_sources` | List all indexed sources with chunk counts |
| `delete_source` | Remove a source and its chunks |
| `get_stats` | Server statistics and index health |
| `explain_search` | See why each result was retrieved |
| `search_with_authority` | Search with per-source weight tuning |
| `list_memories` | List stored memories with filters |
| `delete_memory` | Remove a stored memory |
| `get_memory` | Get a specific memory by ID |
| `configure_llm` | Set LLM provider for answer synthesis |
| `health` | Readiness check |

---

## Features

- **MCP server** — 15 tools, stdio transport, one config file
- **Episodic memory** — store/recall facts across conversations, with tags and supersession
- **Query parser** — exact phrases (`"valve spec"`), field filters (`source:manual`), OR groups, negation (`-obsolete`)
- **Vocabulary learning** — learns which terms co-occur in your data, expands queries automatically
- **Confidence scoring** — HIGH / MEDIUM / LOW / NONE so the LLM knows when to say "I don't know"
- **Explain mode** — see why each result was retrieved (BM25, title match, authority, structure)
- **Source authority** — weight sources differently (trust the manual more than the wiki)
- **30+ file formats** — PDF, Word, Excel, PowerPoint, RTF, ODF, CSV, Markdown, code, email
- **OCR** — scanned documents detected and read automatically (Tesseract, RapidOCR)
- **Visual captioning** — technical drawings described by Florence-2 for searchability
- **Knowledge artifacts** — cached answers with citations, stale detection, delta updates
- **Session context** — "tell me more" carries context from the previous query
- **Academic citations** — BibTeX parsing, author-year formatting, page references
- **Exhaustive synthesis** — reads every chunk, not top-k. Nothing skipped
- **Fully offline** — no cloud, no telemetry, air-gapped operation
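
As a rough illustration of the query grammar listed above, here is a toy parser (hypothetical; not GRIP's implementation) that splits a query into exact phrases, field filters, negated terms, and plain terms:

```python
import re

def parse_query(q):
    """Toy parser for the query syntax shown in the feature list:
    "exact phrases", field:value filters, and -term negation."""
    phrases = re.findall(r'"([^"]+)"', q)   # pull out quoted phrases first
    rest = re.sub(r'"[^"]+"', " ", q)       # then tokenize what remains
    filters, negated, terms = {}, [], []
    for tok in rest.split():
        if ":" in tok:
            field, _, value = tok.partition(":")
            filters[field] = value
        elif tok.startswith("-"):
            negated.append(tok[1:])
        else:
            terms.append(tok)
    return {"phrases": phrases, "filters": filters,
            "negated": negated, "terms": terms}
```

For example, `parse_query('"valve spec" source:manual -obsolete rating')` yields the quoted phrase, a `source` filter, the negated term, and one plain keyword.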

---

## What's new in 0.4.0

- **MCP Server** — 15 tools, one `.mcp.json` config file
- **Episodic Memory** — store/recall facts across conversations
- **Query Parser** — exact phrases, field filters, OR groups, negation
- **Vocabulary Learning** — learns your terminology from ingested docs
- **Confidence Scoring** — HIGH/MEDIUM/LOW/NONE on every answer
- **Explain Mode** — see why each result was retrieved
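
Confidence banding can be sketched as simple thresholding over retrieval scores (the cutoffs below are invented for illustration; GRIP's actual scoring is not documented here):

```python
def confidence(scores, high=8.0, low=3.0, margin=1.5):
    """Map BM25-style retrieval scores (best first) to a confidence band.
    Thresholds are hypothetical, for illustration only."""
    if not scores:
        return "NONE"
    top = scores[0]
    runner_up = scores[1] if len(scores) > 1 else 0.0
    # HIGH needs both a strong top hit and clear separation from #2.
    if top >= high and top - runner_up >= margin:
        return "HIGH"
    if top >= low:
        return "MEDIUM"
    return "LOW"
```

The point of the band is downstream behavior: on LOW or NONE, the LLM is told to answer "I don't know" rather than synthesize from weak evidence.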

---

## Benchmarks

### Accuracy (3,000 queries across 3 models, no cherry-picking)

| Model | Queries | Correct |
|-------|---------|---------|
| GPT-4o-mini | 1,000 | **70/70** |
| Claude 3.5 Haiku | 1,000 | **70/70** |
| Llama 3.1 8B | 1,000 | **70/70** |

70/70 correct across every model tested. GRIP's retrieval is accurate enough that the model choice doesn't matter.

### BEIR (6 datasets, 2,771 queries)

| Dataset | Corpus | BM25 | GRIP | Delta |
|---------|--------|------|------|-------|
| FEVER | 5,416,568 | 0.509 | **0.808** | +0.299 |
| HotpotQA | 5,233,329 | 0.595 | **0.741** | +0.146 |
| SciFact | 5,183 | 0.665 | **0.682** | +0.017 |
| NQ | 2,681,468 | 0.276 | **0.542** | +0.266 |
| FiQA | 57,638 | 0.232 | **0.347** | +0.116 |
| NFCorpus | 3,633 | 0.311 | **0.344** | +0.034 |

**Average NDCG@10: 0.58** — two-stage pipeline with optional MiniLM reranker (22M params).

### Cost

| Method | Cost per 1K queries | Monthly (10K/day) |
|--------|--------------------|--------------------|
| Context stuffing | $140 | $42,000 |
| Vector RAG | $90 | $27,000 |
| **GRIP** | **$70** | **$21,000** |

GRIP saves ~$21,000/month at 10K queries/day vs context stuffing.
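
The monthly column follows directly from the per-1K costs (the dictionary keys below are just labels for the table rows):

```python
# 10,000 queries/day is 10 "1K blocks" per day, or 300 per 30-day month.
per_1k = {"context_stuffing": 140, "vector_rag": 90, "grip": 70}
monthly = {method: cost * 10 * 30 for method, cost in per_1k.items()}
savings_vs_stuffing = monthly["context_stuffing"] - monthly["grip"]
```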

---

## File format support

**30+ formats:** PDF, Word (.docx), Excel (.xlsx/.xls), PowerPoint (.pptx), RTF, OpenDocument (ODS/ODT/ODP), CSV, Markdown, plain text, and all major code file types.

**Scanned documents:** Detects pages with no selectable text and runs OCR automatically. Mixed PDFs (some pages scanned, some digital) handled transparently.

**Technical drawings:** ISO drawings, P&IDs, schematics. PaddleOCR with confidence-gated rotation plus Florence-2 visual captioning for structural descriptions.

---

## Integration

### MCP (recommended)

Works with Claude Desktop, Claude Code, Cursor, Windsurf, and any MCP-compatible client.

### HTTP API

GRIP is also a JSON API on localhost:

```python
import requests

# Search
r = requests.get("http://localhost:7878/search", params={"q": "valve specification", "top_k": 5})
results = r.json()["results"]

# Store memory
requests.post("http://localhost:7878/memory", json={"text": "Budget is $50K", "tags": ["budget"]})

# Recall memory
r = requests.get("http://localhost:7878/memory", params={"q": "budget"})
memories = r.json()
```

Works with LangChain, LlamaIndex, or any HTTP client.

### Docker

```bash
docker run -d -p 7878:7878 \
  -v grip-data:/data \
  -v /your/files:/code \
  griphub/grip:free
```

---

## Optional extras

```bash
pip install "getgrip[mcp]"      # MCP server (fastmcp)
pip install "getgrip[pdf]"      # PDF parsing
pip install "getgrip[docs]"     # All document formats (docx, xlsx, pptx, rtf, odt...)
pip install "getgrip[ocr]"      # OCR (pytesseract + Pillow, Apache-2.0)
pip install "getgrip[vision]"   # Visual pipeline (Florence-2 + OCR)
pip install "getgrip[rerank]"   # Cross-encoder reranking (MiniLM, 22M params)
pip install "getgrip[llm]"      # LLM answers (Ollama, OpenAI, Anthropic, Groq)
pip install "getgrip[all]"      # Everything
```

(The quotes keep shells like zsh from treating the brackets as glob patterns.)

All extras are optional. Core retrieval works with zero extras installed.

---

## Pricing

All tiers include all features. The free tier has a 10,000 chunk limit (~3,500 files). No credit card. No time limit.

| Tier | Chunks | Price |
|------|--------|-------|
| Free | 10,000 | $0 |
| Personal | 100,000 | $499/year |
| Team | 500,000 | $1,499/year |
| Professional | 5,000,000 | $4,999/year |

One license per deployment. No per-seat fees. No per-query fees. Unlimited users.

---

[getgrip.dev](https://getgrip.dev) | [User Guide](https://github.com/Grip-Hub/getgrip.dev/blob/main/GUIDE.md) | [GitHub](https://github.com/Grip-Hub/getgrip.dev)
