Metadata-Version: 2.4
Name: llmhosts
Version: 0.5.1
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Framework :: FastAPI
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: Other/Proprietary License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: Proxy Servers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Dist: fastapi>=0.115.0
Requires-Dist: uvicorn[standard]>=0.32.0
Requires-Dist: httpx>=0.28.0
Requires-Dist: litellm>=1.55.0
Requires-Dist: click>=8.1.0
Requires-Dist: rich>=13.9.0
Requires-Dist: textual>=1.0.0
Requires-Dist: pydantic>=2.10.0
Requires-Dist: cryptography>=44.0.0
Requires-Dist: toml>=0.10.2
Requires-Dist: aiosqlite>=0.20.0
Requires-Dist: websockets>=14.0
Requires-Dist: alembic>=1.14.0
Requires-Dist: ruff>=0.8.0 ; extra == 'dev'
Requires-Dist: mypy>=1.13.0 ; extra == 'dev'
Requires-Dist: pytest>=8.3.0 ; extra == 'dev'
Requires-Dist: pytest-cov>=6.0.0 ; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24.0 ; extra == 'dev'
Requires-Dist: httpx>=0.28.0 ; extra == 'dev'
Requires-Dist: respx>=0.22.0 ; extra == 'dev'
Requires-Dist: pre-commit>=4.0.0 ; extra == 'dev'
Requires-Dist: bandit>=1.8.0 ; extra == 'dev'
Requires-Dist: prometheus-client>=0.21.0 ; extra == 'dev'
Requires-Dist: asyncssh>=2.17.0 ; extra == 'dev'
Requires-Dist: torch>=2.5.0 ; extra == 'full'
Requires-Dist: sentence-transformers>=3.3.0 ; extra == 'full'
Requires-Dist: faiss-cpu>=1.9.0 ; extra == 'full'
Requires-Dist: transformers>=4.47.0 ; extra == 'full'
Requires-Dist: zeroconf>=0.131.0 ; extra == 'lan'
Requires-Dist: sentry-sdk[fastapi]>=2.0.0 ; extra == 'monitoring'
Requires-Dist: opentelemetry-api>=1.20.0 ; extra == 'observability'
Requires-Dist: opentelemetry-sdk>=1.20.0 ; extra == 'observability'
Requires-Dist: opentelemetry-exporter-otlp>=1.20.0 ; extra == 'observability'
Requires-Dist: onnxruntime>=1.20.0 ; extra == 'smart'
Requires-Dist: numpy>=1.26.0,<3.0 ; extra == 'smart'
Requires-Dist: faiss-cpu>=1.9.0 ; extra == 'smart'
Requires-Dist: tokenizers>=0.21.0 ; extra == 'smart'
Provides-Extra: dev
Provides-Extra: full
Provides-Extra: lan
Provides-Extra: monitoring
Provides-Extra: observability
Provides-Extra: smart
License-File: LICENSE
Summary: Your Personal AI Cloud -- intelligent proxy, router, and cache for LLMs
Keywords: llm,ai,proxy,router,cache,ollama,openai,anthropic
Author: LLMHosts Team
License: FSL-1.1-Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Changelog, https://github.com/LookNoHandsMom/LLMHosts.com/blob/main/CHANGELOG.md
Project-URL: Documentation, https://llmhosts.com/docs
Project-URL: Homepage, https://llmhosts.com
Project-URL: Issues, https://github.com/LookNoHandsMom/LLMHosts.com/issues
Project-URL: Repository, https://github.com/LookNoHandsMom/LLMHosts.com

# LLMHosts.com

[![PyPI version](https://img.shields.io/badge/pypi-v0.5.1-blue)](https://test.pypi.org/project/llmhosts/)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue)](https://pypi.org/project/llmhosts/)
[![License: FSL-1.1-Apache-2.0](https://img.shields.io/badge/license-FSL--1.1--Apache--2.0-blue)](https://fsl.software)
[![CI](https://github.com/LookNoHandsMom/LLMHosts.com/workflows/CI/badge.svg)](https://github.com/LookNoHandsMom/LLMHosts.com/actions/workflows/ci.yml)
[![Tests](https://img.shields.io/badge/tests-301%20passing-brightgreen)](https://github.com/LookNoHandsMom/LLMHosts.com)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen)](https://github.com/LookNoHandsMom/LLMHosts.com/pulls)

**Your hardware. Real AI infrastructure. From anywhere.**

LLMHosts turns your local GPU into production AI infrastructure with intelligent routing, verified caching, and global access. The CLI proxy exposes your Ollama/vLLM backends through an OpenAI-compatible API. The SaaS platform at [llmhosts.com](https://llmhosts.com) provides cost tracking, plan management, and team features.

**Two ways to use it:**
- **Self-hosted CLI** — `pip install llmhosts` and run on your own hardware (free for personal and non-competing use; source-available under FSL)
- **SaaS Platform** — Sign up at [llmhosts.com](https://llmhosts.com) for cloud cost tracking, API key management, and team features

---

## Licensing

LLMHosts uses an **open-core model** under the [Functional Source License 1.1 (FSL-1.1-Apache-2.0)](https://fsl.software). All components are FSL-licensed; the table below reflects open-core intent: which components are free for personal and non-competing use, and which are restricted because they compete with our hosted service.

| Component | Intent | Converts to Apache 2.0 |
|-----------|--------|------------------------|
| Local inference proxy & router | Open-core (non-competing use free) | 2028-02-24 |
| CLI tool (`llmhosts`) | Open-core (non-competing use free) | 2028-02-24 |
| Auto-discovery | Open-core (non-competing use free) | 2028-02-24 |
| Cloud tunnel management | Proprietary (competing use restricted) | 2028-02-24 |
| SaaS platform & billing | Proprietary (competing use restricted) | 2028-02-24 |
| Fleet orchestration (Token) | Proprietary (competing use restricted) | 2028-02-24 |

**After 2028-02-24**, all components convert to Apache 2.0 with no restrictions.

---

## SaaS Platform (llmhosts.com)

Track your AI spending, manage API keys, and get real-time savings projections.

**Live at**: [https://llmhosts.com](https://llmhosts.com)

**Features:**
- 📊 Cost tracking across OpenAI, Anthropic, Google AI, AWS Bedrock, Azure
- 🔑 API key management with plan-based limits
- 📈 12-month spending projections with confidence scoring
- 💰 Real-time savings estimates (35% with intelligent caching + routing)
- 🎯 Gamified achievements for cost milestones
- 💳 Stripe-powered billing (Pro $29/mo, Team $99/mo, Enterprise $299/mo)
- 👥 Team management (coming soon)

**Quick Start:**
1. Sign up at [llmhosts.com](https://llmhosts.com)
2. Add your first cost entry
3. Generate an API key for the CLI proxy
4. Connect your self-hosted LLMHost proxy to track usage

---

## Self-Hosted CLI

Run the intelligent proxy on your own hardware.

```bash
pip install llmhosts
llmhosts serve
```

Point any OpenAI-compatible tool at `http://localhost:4000/v1`. Your tools now use your local GPU. Cost: $0.
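
For example, a minimal sketch using the official `openai` Python client (`pip install openai`). The model name below is a placeholder; substitute whatever your Ollama instance actually serves:

```python
# Point the official OpenAI Python client at the local LLMHosts proxy.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000/v1",
    api_key="anything",  # LLMHosts accepts any key in local mode
)

# "llama3.2" is a placeholder model id; use one your backend serves.
resp = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Hello from my own GPU!"}],
)
print(resp.choices[0].message.content)
```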

---

## Why LLMHosts?

- **Cloud bills add up** — Route Cursor, Claude Code, and Aider to your local GPU instead. Same tools, zero API spend.
- **Your hardware, your control** — All inference runs on your machine. No data leaves your network unless you choose.
- **Works anywhere** — `llmhosts tunnel` uses the built-in LLMHosts Relay (zero config) or falls back to Tailscale/Cloudflare if installed. Your home GPU becomes your portable AI.

---

## Competitive Landscape

We are not entering an existing market — we are creating one. The market: *personal and small-team AI infrastructure*.

| Player | What They Do | Why They Lose |
|--------|-------------|---------------|
| OpenAI / Anthropic | Cloud API | 100x more expensive than comparable quality on hardware you already own |
| Ollama | Local model runner | No remote access, no routing, no SaaS |
| LM Studio | Local GUI | Nowhere near production-ready |
| LocalAI | Self-hosted API | Technical, no UX, no moat |
| Replicate / Together | Hosted inference | Still cloud cost, no local hardware |
| **LLMHosts** | **Infrastructure layer** | **The only production-grade local AI platform** |

## Competitive Moat

Five compounding advantages that deepen with every user:

| Layer | Name | What It Is |
|-------|------|------------|
| 1 | First-Mover Position | Building this market category before competition arrives |
| 2 | Data Flywheel | Routing telemetry trains better models → better product → more users |
| 3 | Simplicity Moat | Works for gamers, researchers, founders — not just DevOps engineers |
| 4 | Self-Healing Infrastructure | Fixes itself while you sleep. Zero tinkering required. |
| 5 | Ecosystem Lock-In | Token AI, Hardware Atlas, savings history = high switching cost |

---

## Features

| Area | Description |
|------|-------------|
| **Proxy** | OpenAI + Anthropic compatible API on port 4000. Drop-in for any client. |
| **Router** | Three-tier: rules first, then kNN similarity, then ModernBERT classifier. Routes each request to the right model. |
| **Cache** | Three-tier vCache: exact hash, entity namespace, verified semantic. Cut repeat calls to zero (illustrative sketch after this table). |
| **Tunnel** | `llmhosts tunnel` — built-in LLMHosts Relay (zero config), falls back to Tailscale or Cloudflare. Your GPU on your laptop, anywhere. |
| **Dashboard** | TUI (terminal) + web UI at `/dashboard`. Live request flow, cache stats, model health. |
| **BYOK** | Bring your own cloud keys. Fallback to OpenAI/Anthropic when local models can't handle a request. |
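
To make the cache tiers concrete, here is an illustrative lookup cascade. Everything in it is a hypothetical sketch of the behavior described above, not LLMHosts source; the tier-3 semantic step is stubbed out:

```python
# Hypothetical three-tier cache lookup mirroring the vCache description.
# Names and logic are illustrative, not LLMHosts internals.
import hashlib
import re

EXACT: dict[str, str] = {}      # tier 1: exact-hash entries
NAMESPACE: dict[str, str] = {}  # tier 2: entity-namespace entries

def normalize(prompt: str) -> str:
    """Toy normalizer: lowercase and collapse whitespace."""
    return re.sub(r"\s+", " ", prompt.strip().lower())

def lookup(prompt: str) -> str | None:
    # Tier 1: byte-identical prompts hit immediately.
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key in EXACT:
        return EXACT[key]
    # Tier 2: prompts that normalize into the same entity namespace.
    ns = normalize(prompt)
    if ns in NAMESPACE:
        return NAMESPACE[ns]
    # Tier 3: semantic search over embeddings, returning a hit only
    # after a verification step confirms the cached answer still
    # applies (omitted in this sketch).
    return None  # miss: forward the request to the routed backend
```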

---

## Quick Start

### Install

Three tiers, pick what you need:

```bash
pip install llmhosts                   # Core (~50MB) — proxy, router, dashboard
pip install "llmhosts[smart]"          # Smart (~150MB) — + ML router, semantic cache
pip install "llmhosts[full]"           # Full (~2GB) — + PyTorch, full intelligence
```

Docker:

```bash
docker run -p 4000:4000 llmhosts/llmhosts
# GPU: docker run --gpus all -p 4000:4000 llmhosts/llmhosts
```

### Start the Proxy

```bash
llmhosts serve
```

Starts the proxy on `http://localhost:4000`, auto-discovers Ollama, loads BYOK keys, and launches the TUI dashboard. Web dashboard at `http://localhost:4000/dashboard`.
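
Once it is up, a quick smoke test from Python. This assumes the proxy exposes the standard OpenAI-compatible `GET /v1/models` listing, which is an assumption here, not a documented guarantee:

```python
# Smoke test against the local proxy; httpx is already a dependency.
# Assumes a standard OpenAI-compatible GET /v1/models endpoint.
import httpx

resp = httpx.get("http://localhost:4000/v1/models", timeout=5.0)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])
```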

### Access from Anywhere

The differentiator: make your home GPU reachable from your laptop, phone, or office.

```bash
llmhosts tunnel
```

Uses the built-in LLMHosts Relay by default (zero config, Rust binary included in the pip wheel). Falls back to Tailscale or Cloudflare if installed. Prints a URL — use it from any device. No VPN config, no port forwarding.

```bash
llmhosts tunnel                                  # Auto: relay first, then Tailscale/Cloudflare
llmhosts tunnel --provider tailscale --funnel    # Force Tailscale Funnel
llmhosts tunnel status                           # Check tunnel status
llmhosts tunnel stop                             # Stop active tunnel
```
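
From a remote device, reuse any OpenAI client with the printed URL. A sketch with a made-up placeholder URL:

```python
# The base_url below is a made-up placeholder; substitute the URL that
# `llmhosts tunnel` actually prints.
from openai import OpenAI

client = OpenAI(
    base_url="https://abc123.relay.llmhosts.example/v1",
    api_key="anything",  # placeholder key
)
resp = client.chat.completions.create(
    model="llama3.2",  # placeholder model id
    messages=[{"role": "user", "content": "Hello from the road!"}],
)
print(resp.choices[0].message.content)
```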

### Works With Everything

Every tool that speaks OpenAI format works. Just set the base URL:

```bash
export OPENAI_API_BASE=http://localhost:4000/v1
# Some tools use: export OPENAI_BASE_URL=http://localhost:4000/v1
export OPENAI_API_KEY=anything   # LLMHosts accepts any key for local mode
```

| Tool | How |
|------|-----|
| Cursor | Settings > Models > Custom endpoint: `http://localhost:4000/v1` |
| Claude Code | Set `OPENAI_API_BASE` or configure base URL in settings |
| Aider | `aider --api-base http://localhost:4000/v1` |
| Continue.dev | Add OpenAI-compatible provider, base URL: `http://localhost:4000/v1` |
| Open WebUI | Set OpenAI API URL to `http://localhost:4000/v1` |
| Any OpenAI client | `base_url="http://localhost:4000/v1"` in client config |

---

## Architecture

```
Request  →  Proxy (4000)  →  Router  →  vCache  →  Backend
                │              │          │
                │              ├─ Tier 1: Rules
                │              ├─ Tier 2: kNN (FAISS + all-MiniLM)
                │              └─ Tier 3: ModernBERT → Qwen-0.5B
                │
                ├─ Cache: exact hash → namespace → semantic (vCache)
                │
                └─ Backend: Ollama | Cloud API (BYOK)
```
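
As a reading aid, here is a hypothetical sketch of the three-tier routing cascade from the diagram. Model ids, the length threshold, and the stubbed tiers are placeholders, not LLMHosts source:

```python
# Hypothetical sketch of the router's tiered fallthrough.
def rules_route(prompt: str) -> str | None:
    """Tier 1: cheap hand-written rules catch obvious cases first."""
    if len(prompt) < 200:
        return "small-local-model"  # placeholder model id
    return None

def knn_route(prompt: str) -> str | None:
    """Tier 2: nearest-neighbor vote over embeddings of past prompts
    (FAISS + all-MiniLM in the diagram). Stubbed in this sketch."""
    return None

def classifier_route(prompt: str) -> str:
    """Tier 3: a trained classifier (ModernBERT in the diagram) decides."""
    return "large-local-model"  # placeholder fallback

def route(prompt: str) -> str:
    return rules_route(prompt) or knn_route(prompt) or classifier_route(prompt)
```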

---

## Commands

| Command | Description |
|---------|-------------|
| `llmhosts serve` | Start proxy + dashboard |
| `llmhosts tunnel` | Start secure tunnel (built-in relay, Tailscale/Cloudflare fallback) |
| `llmhosts tunnel status` | Show tunnel status |
| `llmhosts tunnel stop` | Stop active tunnel |
| `llmhosts doctor` | Verify setup and dependencies |
| `llmhosts setup` | Interactive first-run wizard |
| `llmhosts keys add <provider> <key>` | Add BYOK API key |
| `llmhosts keys list` | List configured providers |
| `llmhosts keys validate` | Validate stored keys |
| `llmhosts cache stats` | Cache hit rates and size |
| `llmhosts cache clear` | Clear cache |
| `llmhosts suggest-models` | Recommend models for your hardware |

---

## Dashboard

- **TUI** — Built-in terminal UI when you run `llmhosts serve`. Live request flow, backends, cache activity.
- **Web** — Browser dashboard at `http://localhost:4000/dashboard`. Request history, cache stats, model health.

---

## Configuration

- **TOML** — `~/.config/llmhosts/config.toml` or `--config path/to/config.toml` (sample sketch below)
- **Env** — `LLMHOSTS_*` prefixed variables
- **CLI** — `--host`, `--port`, `--no-tui`, `--log-level`
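
A hypothetical sample `config.toml`. The key names simply mirror the documented CLI flags and are illustrative, not a confirmed schema:

```toml
# Hypothetical sample: key names mirror the documented CLI flags
# (--host, --port, --log-level) and are not a confirmed schema.
host = "127.0.0.1"
port = 4000
log_level = "info"
```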

---

## Development

```bash
# Containerized dev shell (alternative to the local editable install below)
docker compose run --rm dev

# Local editable install
pip install -e ".[dev]"
llmhosts --version   # verify the CLI is on PATH
pytest tests/ -v     # run the test suite
```

---

## Contributing

PRs welcome. Open an issue first for large changes. Run `pytest tests/` and `ruff check .` before submitting.

---

## License

[FSL-1.1-Apache-2.0](LICENSE) — see [Licensing](#licensing) section above for the open-core breakdown.

---

## Links

- [Docs](https://llmhosts.com/docs)
- [Issues](https://github.com/LookNoHandsMom/LLMHosts.com/issues)
- [Changelog](https://github.com/LookNoHandsMom/LLMHosts.com/blob/main/CHANGELOG.md)

