Metadata-Version: 2.4
Name: csv-token-counter-mcp
Version: 1.0.1
Summary: MCP server to count tokens in CSV and Excel files
Project-URL: Homepage, https://pypi.org/project/csv-token-counter-mcp
License: MIT License
        
        Copyright (c) 2026 Anup
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: csv,excel,llm,mcp,tiktoken,tokens
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: mcp[cli]
Requires-Dist: openpyxl
Requires-Dist: pandas
Requires-Dist: tiktoken
Description-Content-Type: text/markdown

# csv-token-counter-mcp

A lightweight MCP (Model Context Protocol) server that counts tokens in CSV 
and Excel files using [tiktoken](https://github.com/openai/tiktoken). 
Works with VS Code, Claude Desktop, and any MCP-compatible application.
No LLM or API key required — runs fully offline.

---

## Why Use This?

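Before sending CSV or Excel data to an LLM (ChatGPT, Claude, Gemini, etc.),
you need to know how many tokens the data contains so you can estimate cost,
check context window limits, or split large files. This tool reports that
count directly inside your editor or AI assistant. The example responses
below report the `cl100k_base` encoding, an OpenAI tokenizer, so counts for
other model families are estimates.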
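
To make the "split large files" case concrete, here is a minimal sketch of
packing rows into chunks that stay within a token budget. It is not part of
this package: `split_by_token_budget` and `word_count` are hypothetical names,
and the whitespace counter is only a stand-in so the example runs without
tiktoken.

```python
from typing import Callable, Iterable, List


def split_by_token_budget(
    rows: Iterable[str],
    budget: int,
    count_tokens: Callable[[str], int],
) -> List[List[str]]:
    """Greedily pack rows into chunks whose token totals stay within budget."""
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for row in rows:
        cost = count_tokens(row)
        # Close the current chunk when this row would push it over budget.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(row)
        used += cost
    if current:
        chunks.append(current)
    return chunks


# Stand-in counter (assumption): whitespace-separated words.
# With tiktoken installed, a real counter could look like:
#   enc = tiktoken.get_encoding("cl100k_base")
#   count_tokens = lambda text: len(enc.encode(text))
def word_count(text: str) -> int:
    return len(text.split())


rows = ["red apple", "green pear fruit", "kiwi", "ripe yellow banana split"]
chunks = split_by_token_budget(rows, budget=5, count_tokens=word_count)
print(chunks)
```

A row that exceeds the budget on its own still gets a chunk to itself, so a
caller that needs a hard limit would have to truncate or reject such rows.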

---

## Features

- Count total tokens across an entire CSV or Excel file
- Break down token counts column by column
- Analyze tokens in a single specific column (avg, min, max per row)
- Preview file schema — column names, types, row counts
- Supports `.csv`, `.xlsx`, and `.xls` formats
- Works fully offline — no API key or internet needed
- Compatible with VS Code, Claude Desktop, and any MCP client

---

## Installation

### Option 1 — pip (traditional)
```bash
pip install csv-token-counter-mcp
```

Add to your `.vscode/mcp.json`:
```json
{
  "servers": {
    "csv-token-counter": {
      "type": "stdio",
      "command": "csv-token-counter-mcp",
      "args": []
    }
  }
}
```

### Option 2 — uvx (recommended, no install needed)

First install `uv` if you don't have it:
```bash
pip install uv
```

Then add the server to your `.vscode/mcp.json`. `uvx` fetches and runs the
package on demand, so no `pip install csv-token-counter-mcp` step is needed:
```json
{
  "servers": {
    "csv-token-counter": {
      "type": "stdio",
      "command": "uvx",
      "args": ["csv-token-counter-mcp"]
    }
  }
}
```

### Option 3 — Claude Desktop

Add to your Claude Desktop config file:

**Mac:** `~/Library/Application Support/Claude/claude_desktop_config.json`  
**Windows:** `%APPDATA%\Claude\claude_desktop_config.json`
```json
{
  "mcpServers": {
    "csv-token-counter": {
      "command": "uvx",
      "args": ["csv-token-counter-mcp"]
    }
  }
}
```
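
If the config file already lists other servers, the entry has to be merged in
rather than pasted over them. The following is a rough sketch of doing that
from Python; `register_server` is a hypothetical helper, not something this
package ships, and the demo writes to a temporary file instead of the real
config.

```python
import json
import tempfile
from pathlib import Path


def register_server(config_path: Path) -> dict:
    """Add the csv-token-counter entry to a config file, keeping other keys."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers["csv-token-counter"] = {
        "command": "uvx",
        "args": ["csv-token-counter-mcp"],
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Demo against a throwaway file rather than the real Claude Desktop config.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text(
        '{"mcpServers": {"other": {"command": "x"}}}', encoding="utf-8"
    )
    result = register_server(demo_path)
    written = json.loads(demo_path.read_text(encoding="utf-8"))

print(sorted(result["mcpServers"]))
```

On a real machine you would pass one of the config paths listed above. Back
the file up first, since the sketch rewrites it.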

---

## Available Tools

### `count_file_tokens`
Count total tokens across an entire CSV or Excel file, with optional
column-by-column breakdown.

**Arguments:**

| Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| `file_path` | string | ✅ Yes | — | Path to `.csv`, `.xlsx`, or `.xls` file |
| `include_column_breakdown` | boolean | ❌ No | `false` | Return token count per column |

**Example request:**
```json
{
  "tool": "count_file_tokens",
  "arguments": {
    "file_path": "/data/sales.xlsx",
    "include_column_breakdown": true
  }
}
```

**Example response:**
```json
{
  "file": "/data/sales.xlsx",
  "encoding": "cl100k_base",
  "rows": 500,
  "columns": 5,
  "total_tokens": 18453,
  "column_breakdown": {
    "id": 1500,
    "product": 3200,
    "description": 9800,
    "price": 2100,
    "stock": 1853
  }
}
```
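
In the example response, `total_tokens` equals the sum of the per-column
counts. The server's implementation is not shown in this README, but a
breakdown of that shape could be computed roughly as follows. This is a sketch
under assumptions: `column_token_counts` is a hypothetical name, header cells
are not counted, and `str.split` stands in for a real tokenizer so the example
runs without tiktoken.

```python
import csv
import io
from typing import Callable, Dict, List


def column_token_counts(
    csv_text: str,
    encode: Callable[[str], List],
) -> Dict[str, int]:
    """Sum the token counts of every data cell, grouped by column header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    totals: Dict[str, int] = {name: 0 for name in (reader.fieldnames or [])}
    for row in reader:
        for name in totals:
            # Short rows yield None for missing cells; count those as empty.
            totals[name] += len(encode(row[name] or ""))
    return totals


sample = (
    "id,product,description\n"
    "1,Desk,Solid oak writing desk\n"
    "2,Lamp,Small LED lamp\n"
)

# Stand-in tokenizer (assumption). With tiktoken installed you could pass
# tiktoken.get_encoding("cl100k_base").encode instead.
breakdown = column_token_counts(sample, str.split)
print(breakdown, sum(breakdown.values()))
```

Counting cell by cell ignores delimiters and line breaks, so the result can
differ from tokenizing the raw file text in one pass.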

---

### `count_column_tokens`
Deep token analysis for a single column — total, average, min, and max
tokens per row.

**Arguments:**

| Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| `file_path` | string | ✅ Yes | — | Path to `.csv`, `.xlsx`, or `.xls` file |
| `column_name` | string | ✅ Yes | — | Exact column name to analyze |

**Example request:**
```json
{
  "tool": "count_column_tokens",
  "arguments": {
    "file_path": "/data/sales.csv",
    "column_name": "description"
  }
}
```

**Example response:**
```json
{
  "file": "/data/sales.csv",
  "column": "description",
  "rows_analyzed": 500,
  "total_tokens": 9800,
  "avg_tokens_per_row": 19.6,
  "max_tokens_in_row": 48,
  "min_tokens_in_row": 6
}
```
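
The figures in the example response are consistent with
`avg_tokens_per_row = total_tokens / rows_analyzed` (9800 / 500 = 19.6). A
minimal sketch of computing such statistics is shown below. It assumes a
hypothetical helper named `column_token_stats` and uses `str.split` as a
stand-in tokenizer; how the server treats empty cells is not stated here, so
the sketch simply counts whatever strings it is given.

```python
from typing import Callable, Dict, List, Sequence, Union


def column_token_stats(
    values: Sequence[str],
    encode: Callable[[str], List],
) -> Dict[str, Union[int, float]]:
    """Return total, average, max, and min token counts over a column."""
    per_row = [len(encode(value)) for value in values]
    if not per_row:
        return {
            "rows_analyzed": 0,
            "total_tokens": 0,
            "avg_tokens_per_row": 0.0,
            "max_tokens_in_row": 0,
            "min_tokens_in_row": 0,
        }
    total = sum(per_row)
    return {
        "rows_analyzed": len(per_row),
        "total_tokens": total,
        "avg_tokens_per_row": round(total / len(per_row), 2),
        "max_tokens_in_row": max(per_row),
        "min_tokens_in_row": min(per_row),
    }


descriptions = ["Solid oak writing desk", "Small LED lamp", "Chair"]
stats = column_token_stats(descriptions, str.split)
print(stats)
```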

---

### `preview_file_schema`
Preview the structure of a CSV or Excel file — column names, data types,
and non-null counts — without counting tokens.

**Arguments:**

| Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| `file_path` | string | ✅ Yes | — | Path to `.csv`, `.xlsx`, or `.xls` file |

**Example request:**
```json
{
  "tool": "preview_file_schema",
  "arguments": {
    "file_path": "/data/sales.csv"
  }
}
```

**Example response:**
```json
{
  "file": "/data/sales.csv",
  "rows": 500,
  "columns": 5,
  "column_details": [
    { "name": "id",          "dtype": "int64",   "non_null": 500 },
    { "name": "product",     "dtype": "object",  "non_null": 500 },
    { "name": "description", "dtype": "object",  "non_null": 498 },
    { "name": "price",       "dtype": "float64", "non_null": 500 },
    { "name": "stock",       "dtype": "int64",   "non_null": 500 }
  ]
}
```
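
The `dtype` names in the example (`int64`, `object`, `float64`) match what
pandas reports through `DataFrame.dtypes`, and `DataFrame.count()` gives
non-null counts. For readers who want to see the idea without pandas, here is
a simplified standard-library sketch. `infer_type` and `preview_schema` are
hypothetical names, and the type labels (`int`, `float`, `str`) are a
deliberate simplification of the pandas dtypes.

```python
import csv
import io
from typing import Dict, List


def infer_type(values: List[str]) -> str:
    """Return 'int', 'float', or 'str' for a column's non-empty values."""
    if not values:
        return "str"
    for label, cast in (("int", int), ("float", float)):
        try:
            for value in values:
                cast(value)
            return label
        except ValueError:
            continue
    return "str"


def preview_schema(csv_text: str) -> Dict:
    """Summarize row count, column count, and per-column details."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    names = list(reader.fieldnames or [])
    details = []
    for name in names:
        # Treat missing and empty cells as null.
        present = [row[name] for row in rows if row[name] not in (None, "")]
        details.append(
            {"name": name, "dtype": infer_type(present), "non_null": len(present)}
        )
    return {"rows": len(rows), "columns": len(names), "column_details": details}


sample = "id,product,price\n1,Desk,129.5\n2,,15\n"
schema = preview_schema(sample)
print(schema)
```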

---

## Using in VS Code (GitHub Copilot)

Once installed and configured in `mcp.json`, open GitHub Copilot Chat
(`Ctrl+Shift+I`), switch to **Agent mode**, and ask naturally:
```
How many tokens are in my file at C:\data\customers.xlsx?
```
```
Show me a token breakdown by column for /data/sales.csv
```
```
What columns does my file /data/report.xlsx have?
```

Copilot will automatically call the right tool and return the result.

---

## Using in Claude Desktop

After adding to the Claude Desktop config, just ask Claude:
```
Count the tokens in my CSV file at /Users/me/data/sales.csv
```
```
Which column in /data/customers.xlsx has the most tokens?
```

---

## Supported File Formats

| Format | Extension | Notes |
|---|---|---|
| CSV | `.csv` | Any delimiter, auto-detected |
| Excel | `.xlsx` | Excel 2007 and newer |
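| Excel Legacy | `.xls` | Excel 97-2003 format; pandas reads this format via `xlrd`, which is not among the declared dependencies and may need a separate install |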
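
How the server detects the CSV delimiter is not described in this README. One
common approach, shown here as a hedged sketch, is Python's `csv.Sniffer`,
which is also what pandas uses when `read_csv` is called with `sep=None` and
`engine="python"`. The helper name `detect_delimiter` and the candidate list
are assumptions made for illustration.

```python
import csv


def detect_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the delimiter of a CSV sample, falling back to a comma."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer could not decide, e.g. for a single-column file.
        return ","


semicolon_sample = "id;name;price\n1;Desk;129.5\n2;Lamp;15\n"
print(repr(detect_delimiter(semicolon_sample)))
```

Sniffing works on a sample of the file, so unusual or inconsistent rows near
the top can lead to a wrong guess. Passing an explicit delimiter is safer when
it is known.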

---

## Requirements

- Python 3.10 or higher
- No API key required
- No internet connection required once the package is installed and tiktoken has cached its encoding file, which it downloads on first use

---

## Local Development
```bash
# Clone the repo
git clone https://github.com/yourusername/csv-token-counter-mcp.git
cd csv-token-counter-mcp

# Create virtual environment
python -m venv venv
venv\Scripts\activate        # Windows
source venv/bin/activate     # Mac/Linux

# Install dependencies
pip install -r requirements.txt

# Run the MCP inspector for testing
mcp dev src/server.py
```

---

## Running Tests
```bash
# Create sample data
python create_sample_data.py

# Run test suite
python test_server.py
```

---

## Publishing a New Version
```bash
# 1. Bump version in pyproject.toml
# 2. Clean old build files
rmdir /s /q dist build        # Windows
rm -rf dist/ build/           # Mac/Linux

# 3. Build (needs the `build` package: pip install build)
python -m build

# 4. Upload (needs twine: pip install twine)
twine upload dist/*
```

---

## Changelog

### v1.0.0 — 2026-03-28
- Initial release
- `count_file_tokens` — total token count with optional column breakdown
- `count_column_tokens` — per-row token stats for a single column
- `preview_file_schema` — file structure preview
- Supports `.csv`, `.xlsx`, `.xls`
- MCP stdio transport compatible with VS Code and Claude Desktop

---

## License

MIT License — see [LICENSE](LICENSE) for full text.

---

## Author

Built by Anup.  
PyPI: [https://pypi.org/project/csv-token-counter-mcp](https://pypi.org/project/csv-token-counter-mcp)