Metadata-Version: 2.4
Name: mergelens
Version: 0.1.0
Summary: Pre-merge diagnostic framework for LLM model merging — analyze, diagnose, and optimize before you merge.
Project-URL: Homepage, https://github.com/mergelens/mergelens
Project-URL: Documentation, https://mergelens.readthedocs.io
Project-URL: Repository, https://github.com/mergelens/mergelens
Project-URL: Issues, https://github.com/mergelens/mergelens/issues
Author: MergeLens Contributors
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: diagnostics,llm,machine-learning,mergekit,model-merging,transformers
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Requires-Dist: diskcache>=5.6
Requires-Dist: huggingface-hub>=0.20
Requires-Dist: jinja2>=3.1
Requires-Dist: numpy>=1.24
Requires-Dist: plotly>=5.18
Requires-Dist: pydantic>=2.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0
Requires-Dist: safetensors>=0.4
Requires-Dist: torch>=2.0
Requires-Dist: typer>=0.9
Provides-Extra: all
Requires-Dist: accelerate>=0.25; extra == 'all'
Requires-Dist: mcp>=1.0; extra == 'all'
Requires-Dist: transformers>=4.36; extra == 'all'
Provides-Extra: audit
Requires-Dist: accelerate>=0.25; extra == 'audit'
Requires-Dist: transformers>=4.36; extra == 'audit'
Provides-Extra: dev
Requires-Dist: mypy; extra == 'dev'
Requires-Dist: pytest-cov; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff; extra == 'dev'
Provides-Extra: mcp
Requires-Dist: mcp>=1.0; extra == 'mcp'
Provides-Extra: vllm
Requires-Dist: vllm>=0.3; extra == 'vllm'
Description-Content-Type: text/markdown

<div align="center">
  <h1>MergeLens</h1>
  <p><strong>Pre-merge diagnostics for LLM model merging</strong></p>
  <p>
    <a href="https://pypi.org/project/mergelens/"><img src="https://img.shields.io/pypi/v/mergelens" alt="PyPI"></a>
    <a href="https://pypi.org/project/mergelens/"><img src="https://img.shields.io/pypi/pyversions/mergelens" alt="Python"></a>
    <a href="https://github.com/mergelens/mergelens/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue" alt="License"></a>
  </p>
</div>

---

34% of top Open LLM Leaderboard models are merges, yet merging is blind trial-and-error. MergeLens tells you **before you merge** whether it will work — and which method to use.

## What It Does

- **Merge Compatibility Index (MCI)** — A single 0-100 score: "should I merge these models?"
- **10 diagnostic metrics** — cosine similarity, spectral subspace overlap, sign disagreement, task vector interference, CKA, and more
- **Strategy recommender** — Maps your diagnostic profile to the optimal merge method (SLERP, TIES, DARE, linear) with a ready-to-paste MergeKit config
- **Conflict zone detection** — Identifies exactly which layers will cause problems
- **Interactive HTML reports** — Self-contained dashboards with Plotly charts
- **MCP server** — AI assistants (Claude, Cursor) can diagnose merges natively

## Install

```bash
pip install mergelens
```

Optional extras:

```bash
pip install "mergelens[mcp]"    # MCP server for AI assistants
pip install "mergelens[audit]"  # Capability probing (requires transformers)
pip install "mergelens[all]"    # Everything
```

## Quick Start

### CLI

```bash
# Compare two models
mergelens compare model_a/ model_b/

# Compare Hugging Face Hub models (two fine-tunes of the same base)
mergelens compare HuggingFaceH4/zephyr-7b-beta teknium/OpenHermes-2.5-Mistral-7B

# With a base model (for task vector metrics)
mergelens compare model_a/ model_b/ --base base_model/

# Generate HTML report
mergelens compare model_a/ model_b/ --report report.html

# Diagnose a MergeKit config before running it
mergelens diagnose --config merge.yaml

# Start MCP server
mergelens serve
```

### Python API

```python
from mergelens import compare_models

result = compare_models(["model_a/", "model_b/"])

# Single compatibility score (0-100)
print(f"MCI: {result.mci.score} — {result.mci.verdict}")
# MCI: 72.3 — compatible

# Per-layer metrics
for layer in result.layer_metrics:
    print(f"{layer.layer_name}: cosine={layer.cosine_similarity:.4f}")

# Conflict zones
for zone in result.conflict_zones:
    print(f"Layers {zone.start_layer}-{zone.end_layer}: {zone.severity.value}")
    print(f"  Fix: {zone.recommendation}")

# Strategy recommendation
if result.strategy:
    print(f"Use: {result.strategy.method.value}")
    print(result.strategy.mergekit_yaml)  # Copy-paste into MergeKit
```

```python
from mergelens import diagnose_config

result = diagnose_config("merge.yaml")
print(f"Overall interference: {result.overall_interference:.4f}")
for score in result.interference_scores:
    print(f"  {score.layer_name}: {score.score:.4f}")
```

```python
from mergelens import generate_report

generate_report(compare_result=result, output_path="dashboard.html")
```

## Metrics

| Metric | What It Measures | Range | Source |
|--------|-----------------|-------|--------|
| Cosine Similarity | Weight vector alignment | [-1, 1] | Standard |
| L2 Distance | Normalized weight divergence | [0, +inf) | Standard |
| KL Divergence | Weight distribution difference | [0, +inf) | Standard |
| Spectral Subspace Overlap | Top-k SVD direction alignment | [0, 1] | Zhou et al. 2026 |
| Effective Rank Ratio | Dimensionality compatibility | [0, 1] | Shannon entropy |
| Sign Disagreement Rate | Parameter sign conflicts | [0, 1] | TIES-Merging (Yadav et al. 2023) |
| TSV Interference | Cross-task singular vector conflict | [0, +inf) | Gargiulo et al. 2025 |
| Task Vector Energy | Knowledge concentration in top SVs | [0, 1] | Choi et al. 2024 |
| CKA Similarity | Activation representation similarity | [0, 1] | Kornblith et al. 2019 |
| **Merge Compatibility Index** | **Composite go/no-go score** | **[0, 100]** | **Ours** |
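To make two of these metrics concrete, here is a minimal NumPy sketch of cosine similarity and the sign disagreement rate. This is an illustration of what the metrics measure, not mergelens's implementation; edge-case handling and the synthetic "fine-tune" setup are assumptions:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two flattened weight tensors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def sign_disagreement_rate(tv_a: np.ndarray, tv_b: np.ndarray) -> float:
    """Fraction of parameters where two task vectors (fine-tune minus base)
    disagree in sign, counting only positions where both are nonzero."""
    both_nonzero = (tv_a != 0) & (tv_b != 0)
    disagree = np.sign(tv_a) != np.sign(tv_b)
    return float((disagree & both_nonzero).sum() / max(both_nonzero.sum(), 1))

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))            # stand-in base weights
a = w + 0.01 * rng.normal(size=w.shape)    # two independent "fine-tunes"
b = w + 0.01 * rng.normal(size=w.shape)
print(f"cosine: {cosine_similarity(a, b):.4f}")
print(f"sign disagreement: {sign_disagreement_rate(a - w, b - w):.4f}")
```

Two fine-tunes of the same base have near-1 weight cosine, yet their task vectors can disagree in sign on roughly half the parameters, which is exactly the conflict TIES-style trimming targets.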

## MCI Verdicts

| Score | Verdict | Meaning |
|-------|---------|---------|
| 75-100 | Highly Compatible | Merge with confidence |
| 55-74 | Compatible | Should work, monitor quality |
| 35-54 | Risky | Expect degradation, use targeted methods |
| 0-34 | Incompatible | These models likely shouldn't be merged |
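The bands above can be expressed as a small bucketing helper. This is a hypothetical illustration mirroring the table; in the library the verdict is already exposed as `result.mci.verdict`:

```python
def mci_verdict(score: float) -> str:
    """Map an MCI score (0-100) to the verdict bands in the table above."""
    if score >= 75:
        return "highly compatible"
    if score >= 55:
        return "compatible"
    if score >= 35:
        return "risky"
    return "incompatible"

print(mci_verdict(72.3))  # the Quick Start example's score
```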

## Strategy Recommendations

MergeLens maps diagnostic profiles to merge methods. Different metrics predict success for different methods ([Zhou et al. 2026](https://arxiv.org/abs/2601.22285) found only 46.7% metric overlap between methods):

| Diagnostic Profile | Recommended Method |
|--------------------|--------------------|
| High cosine similarity everywhere | SLERP |
| High sign disagreement (>30%) | TIES |
| Concentrated task vector energy | DARE |
| Low spectral overlap | Linear (small alpha) |

Each recommendation includes a ready-to-paste MergeKit YAML config.
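The rule-based decision tree behind the table might look like the following toy sketch. The rule order, cutoff values, and profile keys here are illustrative assumptions, not mergelens's actual rules:

```python
def recommend_method(profile: dict) -> str:
    """Toy recommender mirroring the diagnostic-profile table above.
    Expected keys (all floats): mean_cosine, sign_disagreement,
    energy_concentration, spectral_overlap."""
    if profile["sign_disagreement"] > 0.30:
        return "ties"    # resolve sign conflicts before averaging
    if profile["energy_concentration"] > 0.80:
        return "dare"    # concentrated task vectors tolerate random dropping
    if profile["spectral_overlap"] < 0.30:
        return "linear"  # mostly disjoint subspaces: blend with small alpha
    if profile["mean_cosine"] > 0.90:
        return "slerp"   # well-aligned weights interpolate cleanly
    return "linear"      # conservative default

print(recommend_method({"mean_cosine": 0.95, "sign_disagreement": 0.05,
                        "energy_concentration": 0.40, "spectral_overlap": 0.70}))
```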

## MCP Integration

Add to your Claude Code or Cursor config:

```json
{
  "mcpServers": {
    "mergelens": {
      "command": "mergelens",
      "args": ["serve"]
    }
  }
}
```

Available tools: `compare_models`, `diagnose_merge`, `get_conflict_zones`, `suggest_strategy`, `generate_report`, `explain_layer`, `get_compatibility_score`

## Security

- **No pickle/torch.load** — Only `safetensors.safe_open()`. No arbitrary code execution risk.
- **YAML safety** — `yaml.safe_load()` only. No deserialization attacks.
- **Path validation** — MCP server rejects path traversal attempts.
- **Tensor size limits** — SVD operations capped at 50M elements to prevent DoS.
- **No credential leakage** — HF tokens never appear in reports or logs.

## How It Works

MergeLens loads model weights lazily via memory-mapped safetensors (peak memory: 2x largest layer, not 2x full model). It computes metrics layer-by-layer, detects conflict zones, and aggregates everything into the MCI score. The strategy recommender uses a rule-based decision tree mapping diagnostic profiles to merge methods.
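The layer-by-layer pipeline can be sketched as a generator that keeps only one tensor pair resident at a time. In this sketch plain NumPy dicts stand in for memory-mapped safetensors shards, so it illustrates the streaming-aggregation pattern rather than the real loader:

```python
import numpy as np

def iter_layer_pairs(model_a: dict, model_b: dict):
    """Yield (name, tensor_a, tensor_b) one layer at a time, so peak memory
    is bounded by the largest layer pair rather than the full models."""
    for name in sorted(model_a.keys() & model_b.keys()):
        yield name, model_a[name], model_b[name]

def layerwise_cosine(model_a: dict, model_b: dict) -> dict:
    """Stream over shared layers and compute one metric per layer."""
    scores = {}
    for name, ta, tb in iter_layer_pairs(model_a, model_b):
        a, b = ta.ravel(), tb.ravel()
        scores[name] = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return scores

rng = np.random.default_rng(0)
base = {f"layers.{i}.weight": rng.normal(size=(64, 64)) for i in range(4)}
ft = {k: v + 0.05 * rng.normal(size=v.shape) for k, v in base.items()}
print(layerwise_cosine(base, ft))
```

Per-layer scores like these are what conflict-zone detection scans for contiguous runs of low values before everything is rolled up into the MCI.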

## References

- Zhou et al. 2026, "Demystifying Mergeability of Homologous LLMs" ([arXiv:2601.22285](https://arxiv.org/abs/2601.22285))
- Gargiulo et al. CVPR 2025, "Task Singular Vectors" ([arXiv:2412.00081](https://arxiv.org/abs/2412.00081))
- Yadav et al. NeurIPS 2023, "TIES-Merging" ([arXiv:2306.01708](https://arxiv.org/abs/2306.01708))
- Choi et al. 2024, "Revisiting Weight Averaging for Model Merging" ([arXiv:2412.12153](https://arxiv.org/abs/2412.12153))
- Kornblith et al. ICML 2019, "Similarity of Neural Network Representations Revisited" ([arXiv:1905.00414](https://arxiv.org/abs/1905.00414))
- Rahamim et al. 2026, "Will it Merge?" ([arXiv:2601.06672](https://arxiv.org/abs/2601.06672))

## License

Apache 2.0
