Metadata-Version: 2.4
Name: insightplot
Version: 0.1.0
Summary: A data science graphing library that visualizes, fits, and recommends predictive models.
Author: Jeff
Author-email: Jeff <your-email@example.com>
License: MIT
Project-URL: Homepage, https://github.com/yourusername/insightplot
Project-URL: Documentation, https://github.com/yourusername/insightplot#readme
Project-URL: Repository, https://github.com/yourusername/insightplot
Project-URL: Issues, https://github.com/yourusername/insightplot/issues
Keywords: data science,visualization,machine learning,model selection,feature importance,hyperparameter tuning,confusion matrix,roc curve,regression,classification
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Visualization
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: matplotlib>=3.7
Requires-Dist: numpy>=1.24
Requires-Dist: scipy>=1.10
Requires-Dist: scikit-learn>=1.2
Requires-Dist: seaborn>=0.12
Requires-Dist: pandas>=2.0
Provides-Extra: boost
Requires-Dist: xgboost>=1.7; extra == "boost"
Requires-Dist: lightgbm>=3.3; extra == "boost"
Provides-Extra: all
Requires-Dist: xgboost>=1.7; extra == "all"
Requires-Dist: lightgbm>=3.3; extra == "all"
Requires-Dist: statsmodels>=0.14; extra == "all"
Requires-Dist: shap>=0.42; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"
Requires-Dist: twine>=4.0; extra == "dev"
Dynamic: author
Dynamic: license-file
Dynamic: requires-python

# InsightPlot

**A data science graphing library that doesn't just visualize — it analyzes.**

InsightPlot wraps matplotlib with a clean, chainable API and adds three capabilities that plotting libraries rarely provide out of the box:

1. **Model Advisor** — Plot your data, and the library evaluates its statistical properties to recommend which ML models will give you the most predictive power.
2. **Fit Toolkit** — Add regression lines (linear, polynomial, exponential, logistic, LOWESS, etc.) with R², equations, and confidence intervals in one line of code.
3. **Native ML Diagnostics** — Confusion Matrix and ROC Curve are first-class plot types that display all relevant metrics (precision, recall, specificity, F1, AUC, Youden's J, Matthews correlation, Cohen's kappa).

## Installation

Install in editable mode from the root of a source checkout:

```bash
pip install -e .
```

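### Dependencies

Requires Python >= 3.9.

- matplotlib >= 3.7
- numpy >= 1.24
- scipy >= 1.10
- scikit-learn >= 1.2
- seaborn >= 0.12
- pandas >= 2.0

Optional extras:
- `boost`: xgboost >= 1.7, lightgbm >= 3.3
- `all`: xgboost >= 1.7, lightgbm >= 3.3, statsmodels >= 0.14, shap >= 0.42
- `dev`: pytest >= 7.0, pytest-cov >= 4.0, black >= 23.0, ruff >= 0.1, mypy >= 1.0, build >= 1.0, twine >= 4.0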

---

## Quick Start

```python
import insightplot as ip
import numpy as np

# Generate sample data
np.random.seed(42)
x = np.linspace(0, 10, 100)
y = 2.5 * x**2 - 3 * x + np.random.normal(0, 10, 100)

# One-liner: scatter + polynomial fit + model recommendations
fig = ip.scatter(x, y,
    title="Revenue vs. Ad Spend",
    xlabel="Ad Spend ($K)",
    ylabel="Revenue ($K)",
    fit="polynomial",
    fit_degree=2,
    show_r2=True,
    suggest_models=True
)
fig.show()
```

---

## Core Features

### 1. Multi-Axis Figures (Chainable API)

```python
import insightplot as ip
import numpy as np

months = np.arange(1, 13)
revenue = [120, 135, 150, 142, 168, 195, 210, 225, 198, 240, 260, 290]
growth  = [None, 12.5, 11.1, -5.3, 18.3, 16.1, 7.7, 7.1, -12.0, 21.2, 8.3, 11.5]

fig = ip.Figure(
    title="2024 Monthly Performance",
    xlabel="Month",
    ylabel="Revenue ($K)",
    ylabel_right="MoM Growth (%)"
)

fig.plot(months, revenue, axis="left", label="Revenue", marker="o")
fig.plot(months[1:], growth[1:], axis="right", label="Growth %",
         color="#E94F37", marker="s", linestyle="--")
fig.hline(0, axis="right", color="#999999", linestyle=":")
fig.show()
```
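
For comparison, the same dual-axis chart can be approximated in plain matplotlib with `Axes.twinx()`. This is only a sketch of an equivalent figure: the README does not describe how `ip.Figure` builds its axes internally, and the `Agg` backend line is an assumption added so the snippet runs without a display. The growth series is recomputed from the revenue list as a month-over-month percentage.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

months = np.arange(1, 13)
revenue = [120, 135, 150, 142, 168, 195, 210, 225, 198, 240, 260, 290]
# Month-over-month change in percent; undefined for the first month
growth = 100 * np.diff(revenue) / np.array(revenue[:-1])

fig, ax_left = plt.subplots()
ax_right = ax_left.twinx()  # second y-axis sharing the same x-axis

ax_left.plot(months, revenue, marker="o", label="Revenue")
ax_right.plot(months[1:], growth, color="#E94F37", marker="s",
              linestyle="--", label="Growth %")
ax_right.axhline(0, color="#999999", linestyle=":")

ax_left.set_title("2024 Monthly Performance")
ax_left.set_xlabel("Month")
ax_left.set_ylabel("Revenue ($K)")
ax_right.set_ylabel("Growth (%)")

# Merge legend entries from both axes into a single legend
handles_l, labels_l = ax_left.get_legend_handles_labels()
handles_r, labels_r = ax_right.get_legend_handles_labels()
ax_left.legend(handles_l + handles_r, labels_l + labels_r, loc="upper left")
```

The explicit legend merge at the end is the part a unified dual-axis legend has to handle, since each matplotlib axis otherwise keeps its own legend entries.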

### 2. Fit Toolkit

Add fitted curves to any data series with one call. Supports: `linear`, `polynomial`, `exponential`, `logarithmic`, `power`, `logistic`, `lowess`.

```python
# x and y are the arrays from the Quick Start example
fig = ip.Figure(title="Curve Fitting Comparison")
fig.scatter(x, y, label="Observations")

# Add multiple fits
fig.add_fit("left", "linear", show_r2=True, color="#2E86AB")
fig.add_fit("left", "polynomial", degree=3, show_r2=True,
            show_equation=True, color="#A23B72")
fig.add_fit("left", "exponential", show_r2=True, color="#F18F01")
fig.show()
```

#### Fit with Confidence Intervals

```python
fig = ip.scatter(x, y, label="Data")
fig.add_fit("left", "linear", show_r2=True, show_ci=True, ci_level=0.95)
fig.show()
```
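
The README does not state how `show_ci` computes its band. As an illustration only, the sketch below builds the textbook confidence band for the mean response of a least-squares line using NumPy and SciPy. `linear_fit_with_ci` and the demo arrays are hypothetical names, not part of InsightPlot.

```python
import numpy as np
from scipy import stats

def linear_fit_with_ci(x, y, ci_level=0.95):
    """Least-squares line plus a confidence band for the mean response."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    slope, intercept = np.polyfit(x, y, 1)
    y_hat = slope * x + intercept
    resid = y - y_hat
    dof = n - 2                                  # two fitted parameters
    s = np.sqrt(np.sum(resid ** 2) / dof)        # residual standard error
    sxx = np.sum((x - x.mean()) ** 2)
    t_crit = stats.t.ppf(0.5 + ci_level / 2, dof)
    # Band widens as x moves away from the mean of the observed x values
    half_width = t_crit * s * np.sqrt(1.0 / n + (x - x.mean()) ** 2 / sxx)
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return slope, intercept, r2, y_hat - half_width, y_hat + half_width

# Demo on synthetic data (assumed values, for illustration)
rng = np.random.default_rng(42)
x_demo = np.linspace(0, 10, 100)
y_demo = 3.0 * x_demo + 1.0 + rng.normal(0, 2.0, x_demo.size)
slope, intercept, r2, lower, upper = linear_fit_with_ci(x_demo, y_demo)
print(f"y = {slope:.2f}x + {intercept:.2f}, R² = {r2:.3f}")
```

The `lower` and `upper` arrays could be passed to something like `fill_between` to shade the band around the fitted line.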

#### Compare All Fits Programmatically

```python
from insightplot import FitToolkit

results = FitToolkit.fit_all(x, y)
for r in results:
    print(f"{r['method']:15s}  R² = {r['r_squared']:.4f}  {r['equation']}")
```

### 3. Model Advisor

The Model Advisor analyzes your data's statistical properties and recommends predictive models, with the reasoning behind each recommendation.

```python
fig = ip.Figure(title="Data with Model Recommendations")
fig.scatter(x, y, label="Training Data")
recommendations = fig.suggest_models("left", top_k=5)
fig.show()
```

**Output:**
```
========================================================================
  INSIGHTPLOT MODEL ADVISOR — DATA ANALYSIS & RECOMMENDATIONS
========================================================================

  DATA PROFILE
  ----------------------------------------
  Samples           : 100
  Pearson r         : 0.9412
  Spearman ρ        : 0.9587
  Linear R²         : 0.8859
  Nonlinearity Δ    : 0.0923
  Monotonicity      : 0.8485
  Noise level       : 0.0412
  Heteroscedasticity: 2.31x
  Binary target     : No
  Count target      : No
  Plateau detected  : No

  TOP 5 MODEL RECOMMENDATIONS
  ----------------------------------------

  #1  Polynomial Regression  (Score: 80/100)
       Reasoning:
         • Nonlinear pattern (Δ R²=0.092); polynomial can capture this.

  #2  Random Forest Regressor  (Score: 75/100)
       Reasoning:
         • Nonlinear relationship benefits from tree-based flexibility.
         ...
```

#### Standalone Advisor (No Plot Required)

```python
from insightplot import ModelAdvisor

advisor = ModelAdvisor()
results = advisor.evaluate(x, y, top_k=5)
advisor.print_report(results)
```
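
To make the data profile easier to interpret, here is a rough sketch of how a few of its entries (Pearson r, Spearman ρ, linear R², and the nonlinearity Δ as a polynomial R² gain) could be computed with NumPy and SciPy. The function names and the choice of a degree-2 polynomial are assumptions; the README does not specify the exact definitions `ModelAdvisor` uses, so the numbers need not match its report.

```python
import numpy as np
from scipy import stats

def r_squared(y, y_hat):
    """Coefficient of determination for predictions y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def profile(x, y, poly_degree=2):
    """Hypothetical helper: a handful of summary statistics for (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pearson_r, _ = stats.pearsonr(x, y)
    spearman_rho, _ = stats.spearmanr(x, y)
    linear_r2 = r_squared(y, np.polyval(np.polyfit(x, y, 1), x))
    poly_r2 = r_squared(y, np.polyval(np.polyfit(x, y, poly_degree), x))
    return {
        "samples": x.size,
        "pearson_r": pearson_r,
        "spearman_rho": spearman_rho,
        "linear_r2": linear_r2,
        # Gain in R² from allowing curvature; near zero for linear data
        "nonlinearity_delta": poly_r2 - linear_r2,
    }

# Demo on quadratic data shaped like the Quick Start example
rng = np.random.default_rng(42)
x_demo = np.linspace(0, 10, 100)
y_demo = 2.5 * x_demo**2 - 3 * x_demo + rng.normal(0, 10, 100)
p = profile(x_demo, y_demo)
for name, value in p.items():
    print(f"{name:20s}: {value:.4f}" if name != "samples" else f"{name:20s}: {value}")
```

For a simple linear fit, the linear R² equals the square of Pearson r, which is a quick sanity check on any such profile.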

### 4. Confusion Matrix

```python
import insightplot as ip
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_classes=3,
                            n_informative=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)  # fixed seed for a reproducible split

clf = RandomForestClassifier(random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_test)

fig, metrics = ip.confusion_matrix(
    y_test, y_pred,
    labels=["Class A", "Class B", "Class C"],
    title="Random Forest — 3-Class Confusion Matrix",
    show_metrics_panel=True,
    show_per_class=True,
    return_metrics=True,
)
```

**Displays:** Heatmap with counts + percentages, plus a side panel showing accuracy, balanced accuracy, weighted precision/recall/F1, Matthews correlation coefficient, Cohen's kappa, and a per-class breakdown table.
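
For readers who want the same numbers without the plot, the metrics listed above can be reproduced with scikit-learn. This is a sketch under the assumption that the panel uses the standard definitions; `summarize_classification` is a hypothetical helper and the tiny label arrays are made up for illustration.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score, balanced_accuracy_score, cohen_kappa_score,
    confusion_matrix, matthews_corrcoef, precision_recall_fscore_support,
)

def summarize_classification(y_true, y_pred):
    """Overall metrics plus per-class specificity from the confusion matrix."""
    cm = confusion_matrix(y_true, y_pred)  # rows: true class, cols: predicted
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted", zero_division=0)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - tp - fp - fn
    return {
        "confusion_matrix": cm,
        "accuracy": accuracy_score(y_true, y_pred),
        "balanced_accuracy": balanced_accuracy_score(y_true, y_pred),
        "weighted_precision": precision,
        "weighted_recall": recall,
        "weighted_f1": f1,
        "mcc": matthews_corrcoef(y_true, y_pred),
        "cohen_kappa": cohen_kappa_score(y_true, y_pred),
        # Specificity per class = TN / (TN + FP), treating the class one-vs-rest
        "specificity": tn / (tn + fp),
    }

# Made-up 3-class labels for illustration
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]
summary = summarize_classification(y_true, y_pred)
print(summary["confusion_matrix"])
print(f"accuracy = {summary['accuracy']:.2f}")  # 7 of 10 correct -> 0.70
```

Specificity has no ready-made multi-class function in scikit-learn, which is why it is derived from the confusion matrix here.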

### 5. ROC Curve

```python
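# One-vs-rest for class 1 ("Class B"), reusing clf and X_test from the example above
y_true_b = (y_test == 1).astype(int)
y_scores = clf.predict_proba(X_test)[:, 1]

fig, metrics = ip.roc_curve(
    y_true_b, y_scores,
    title="Random Forest ROC (Class B vs. Rest)",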
    show_optimal_threshold=True,
    show_metrics_panel=True,
    n_bootstrap=500,
    return_metrics=True,
)
```

**Displays:** ROC curve with AUC shading, bootstrap confidence band, Youden's J optimal threshold marker, and a side panel with AUC, threshold, accuracy, precision, recall, specificity, F1, and prevalence.
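
As a point of reference, Youden's J and a bootstrap interval can be computed directly with scikit-learn. The sketch below assumes the usual definition J = TPR - FPR and uses a percentile bootstrap for the AUC, which is a simpler relative of a band around the whole curve; the README does not say which bootstrap scheme InsightPlot uses. Both function names and the demo scores are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def youden_threshold(y_true, y_scores):
    """Return AUC and the threshold that maximizes J = TPR - FPR."""
    fpr, tpr, thresholds = roc_curve(y_true, y_scores)
    j = tpr - fpr
    best = int(np.argmax(j))  # first maximum if several thresholds tie
    return {
        "auc": roc_auc_score(y_true, y_scores),
        "youden_j": float(j[best]),
        "threshold": float(thresholds[best]),
    }

def bootstrap_auc_ci(y_true, y_scores, n_bootstrap=500, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_scores = np.asarray(y_scores)
    aucs = []
    for _ in range(n_bootstrap):
        idx = rng.integers(0, len(y_true), len(y_true))  # resample with replacement
        if np.unique(y_true[idx]).size < 2:
            continue  # AUC is undefined when a resample has only one class
        aucs.append(roc_auc_score(y_true[idx], y_scores[idx]))
    alpha = (1.0 - level) / 2.0
    return np.quantile(aucs, [alpha, 1.0 - alpha])

# Made-up binary labels and scores for illustration
y_demo = [0, 0, 0, 0, 1, 1, 1, 1]
s_demo = [0.1, 0.2, 0.35, 0.6, 0.4, 0.7, 0.8, 0.9]
result = youden_threshold(y_demo, s_demo)
ci_low, ci_high = bootstrap_auc_ci(y_demo, s_demo, n_bootstrap=200)
print(result, (ci_low, ci_high))
```

In this toy example two thresholds reach the same J, and `np.argmax` keeps the first one returned by `roc_curve`, that is, the higher threshold.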

#### Multi-Class ROC

```python
y_scores_all = clf.predict_proba(X_test)

fig = ip.roc_multiclass(
    y_test, y_scores_all,
    class_labels=["Class A", "Class B", "Class C"],
    title="Multi-Class ROC (One-vs-Rest)",
)
```
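
The one-vs-rest idea behind a multi-class ROC can be sketched with scikit-learn: each class in turn is treated as the positive label and scored against its own probability column. `one_vs_rest_auc` is a hypothetical helper, the score matrix is made up, and the sketch assumes at least three classes (with two classes `label_binarize` returns a single column).

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize

def one_vs_rest_auc(y_true, y_scores, classes):
    """AUC per class, treating that class as positive and all others as negative."""
    y_bin = label_binarize(y_true, classes=classes)  # shape (n_samples, n_classes)
    return {
        c: roc_auc_score(y_bin[:, i], y_scores[:, i])
        for i, c in enumerate(classes)
    }

# Made-up labels and per-class probabilities (rows sum to 1)
y_true = np.array([0, 0, 1, 1, 2, 2])
y_scores = np.array([
    [0.70, 0.20, 0.10],
    [0.50, 0.30, 0.20],
    [0.20, 0.60, 0.20],
    [0.45, 0.28, 0.27],   # a class-1 sample the model scores poorly
    [0.10, 0.20, 0.70],
    [0.20, 0.30, 0.50],
])
aucs = one_vs_rest_auc(y_true, y_scores, classes=[0, 1, 2])
print(aucs)
```

Here classes 0 and 2 are perfectly separated, while the weakly scored class-1 sample pulls that class's AUC down to 0.75.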

### 6. Themes

```python
ip.set_theme("dark")     # Dark mode with neon palette
ip.set_theme("minimal")  # No grid, spine-only, monochrome
ip.set_theme("ocean")    # Blue gradient palette
ip.set_theme("earth")    # Warm natural tones
ip.set_theme("insight")  # Default — professional blues/purples
```

---

## API Reference

### `insightplot.Figure`

| Method | Description |
|--------|-------------|
| `.plot(x, y, ...)` | Line/marker series |
| `.scatter(x, y, ...)` | Scatter series |
| `.bar(x, y, ...)` | Bar series |
| `.fill_between(x, y1, y2)` | Shaded region |
| `.add_fit(axis, method, ...)` | Overlay fitted curve |
| `.suggest_models(axis, ...)` | Run Model Advisor |
| `.annotate_point(x, y, text)` | Arrow annotation |
| `.hline(y)` / `.vline(x)` | Reference lines |
| `.legend()` | Unified dual-axis legend |
| `.show()` / `.save(path)` | Display or export |

### `insightplot.FitToolkit`

| Method | Description |
|--------|-------------|
| `.fit(x, y, method, ...)` | Fit a single method; returns a dict containing `predict()`, R², and the equation |
| `.fit_all(x, y)` | Fit all methods; returns the results sorted by R² |

### Fit Methods

| Method | Equation Form |
|--------|--------------|
| `linear` | y = mx + b |
| `polynomial` | y = aₙxⁿ + ... + a₁x + a₀ |
| `exponential` | y = a · eᵇˣ |
| `logarithmic` | y = a · ln(x) + b |
| `power` | y = a · xᵇ |
| `logistic` | y = L / (1 + e⁻ᵏ⁽ˣ⁻ˣ⁰⁾) |
| `lowess` | Non-parametric smoothing |
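
As an example of how one of these forms can be fitted, the sketch below estimates the `exponential` form y = a · eᵇˣ with `scipy.optimize.curve_fit`, seeding the optimizer from a log-linear fit. It illustrates the equation form only and is not necessarily how `FitToolkit` is implemented; `fit_exponential` and the synthetic data are assumptions, and the log-based starting guess requires y > 0.

```python
import numpy as np
from scipy.optimize import curve_fit

def exponential(x, a, b):
    """Exponential form from the table: y = a * exp(b * x)."""
    return a * np.exp(b * x)

def fit_exponential(x, y):
    """Hypothetical helper: fit y = a * exp(b * x) and report R²."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Starting guess from a straight-line fit to ln(y); needs strictly positive y
    b0, log_a0 = np.polyfit(x, np.log(y), 1)
    (a, b), _ = curve_fit(exponential, x, y, p0=(np.exp(log_a0), b0))
    y_hat = exponential(x, a, b)
    r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    return a, b, r2

# Synthetic data with true a = 2.0, b = 0.5 and 2% multiplicative noise
rng = np.random.default_rng(0)
x_demo = np.linspace(0, 4, 50)
y_demo = 2.0 * np.exp(0.5 * x_demo) * (1 + rng.normal(0, 0.02, x_demo.size))
a_hat, b_hat, r2_exp = fit_exponential(x_demo, y_demo)
print(f"y = {a_hat:.3f} · exp({b_hat:.3f}x), R² = {r2_exp:.4f}")
```

The `power` and `logistic` forms can be handled the same way by swapping in a different model function and starting guess.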

### `insightplot.confusion_matrix(y_true, y_pred, ...)`

Metrics displayed: Accuracy, Balanced Accuracy, Weighted Precision, Weighted Recall, Weighted F1, Matthews Correlation Coefficient, Cohen's Kappa, and per-class Precision / Recall / Specificity / F1 / Support.

### `insightplot.roc_curve(y_true, y_scores, ...)`

Metrics displayed: AUC, Youden's J, Optimal Threshold, Accuracy / Precision / Recall / Specificity / F1 at optimal threshold, class prevalence.

### `insightplot.ModelAdvisor`

Analyzes: sample size, Pearson/Spearman correlation, linearity, nonlinearity (polynomial R² gain), monotonicity, noise level, heteroscedasticity, target type (binary/count/continuous), plateau/saturation detection. Scores 15 models across regression and classification.

---

## License

MIT
