Metadata-Version: 2.4
Name: nextstat
Version: 0.10.1
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Physics
Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Rust
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: Microsoft :: Windows
Classifier: Typing :: Typed
Requires-Dist: nextstat-cli==0.10.1
Requires-Dist: httpx>=0.27 ; extra == 'agent'
Requires-Dist: jsonschema>=4.0 ; extra == 'agent'
Requires-Dist: numpy>=2.0 ; extra == 'agent'
Requires-Dist: uproot>=5.0 ; extra == 'agent'
Requires-Dist: nextstat[bayes,viz,io,remote,torch,langchain] ; extra == 'all'
Requires-Dist: arviz>=0.23.4 ; extra == 'bayes'
Requires-Dist: numpy>=2.0 ; extra == 'bayes'
Requires-Dist: emcee>=3.1.6 ; extra == 'bayes'
Requires-Dist: mypy>=1.19 ; extra == 'dev'
Requires-Dist: pytest>=9.0 ; extra == 'dev'
Requires-Dist: pytest-cov>=7.0 ; extra == 'dev'
Requires-Dist: ruff>=0.15 ; extra == 'dev'
Requires-Dist: pyarrow>=16.0 ; extra == 'io'
Requires-Dist: langchain-core>=0.3 ; extra == 'langchain'
Requires-Dist: httpx>=0.27 ; extra == 'remote'
Requires-Dist: torch>=2.2 ; extra == 'torch'
Requires-Dist: numpy>=2.0 ; extra == 'torch'
Requires-Dist: numpy>=2.0 ; extra == 'validation'
Requires-Dist: pyhf>=0.7.6 ; extra == 'validation'
Requires-Dist: uproot>=5.0 ; extra == 'validation'
Requires-Dist: tqdm>=4.0 ; extra == 'validation'
Requires-Dist: matplotlib>=3.9 ; extra == 'viz'
Requires-Dist: numpy>=2.0 ; extra == 'viz'
Provides-Extra: agent
Provides-Extra: all
Provides-Extra: bayes
Provides-Extra: dev
Provides-Extra: io
Provides-Extra: langchain
Provides-Extra: remote
Provides-Extra: torch
Provides-Extra: validation
Provides-Extra: viz
Summary: High-performance statistical inference engine — MLE, Bayesian, survival, GLM, time series. Rust core with Python bindings.
Keywords: statistics,physics,hep,fitting,likelihood,bayesian,survival,glm,inference,gpu
Home-Page: https://nextstat.io
Author: NextStat Contributors
License: AGPL-3.0-or-later OR LicenseRef-Commercial
Requires-Python: >=3.11
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Changelog, https://github.com/NextStat/nextstat.io/blob/main/CHANGELOG.md
Project-URL: Documentation, https://nextstat.io/docs
Project-URL: Homepage, https://nextstat.io
Project-URL: Issues, https://github.com/NextStat/nextstat.io/issues
Project-URL: Repository, https://github.com/NextStat/nextstat.io

# NextStat

[![License: AGPL-3.0](https://img.shields.io/badge/License-AGPL%203.0-blue.svg)](LICENSE)
[![Rust](https://img.shields.io/badge/rust-1.93%2B-orange.svg)](https://www.rust-lang.org)
[![Python](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org)

NextStat is a high-performance statistical fitting toolkit for High Energy Physics (HEP), implemented in Rust with Python bindings.

## The God Run (toy-based CLs)

- **Model:** S+B HistFactory (synthetic), 50 channels × 4 bins, 201 parameters (mu + 200 nuisances)
- **Task:** CLs via the toy-based q̃_mu test statistic
- **Load:** 10,000 toys (b-only) + 10,000 toys (s+b)
- **Machine:** Apple M5 (arm64), macOS-26.2-arm64-arm-64bit-Mach-O
- **Versions:** nextstat 0.1.0, pyhf 0.7.6, Python 3.13.11
- **Recorded:** 2026-02-07 (UTC)
- **Commit:** 88d57856

| Tool | Wall time | Relative wall time |
|---|---:|---:|
| NextStat (Rayon) | 3.47 s | 1.0× |
| pyhf (multiprocessing, 10 procs) | 50m 11.7s | 868.0× |

Reproduce:

```bash
PYTHONPATH=bindings/ns-py/python ./.venv/bin/python scripts/god_run_benchmark.py --n-toys 10000
```
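For orientation, a toy-based CLs value is assembled from the two toy ensembles above in the standard way: the observed test statistic is compared against its distribution under s+b and under b-only. A minimal pure-Python sketch of that final step (an illustrative helper, not the nextstat API; the real pipeline also generates the toys and fits each one):

```python
def cls_from_toys(q_obs, q_toys_sb, q_toys_b):
    """CLs = CLs+b / CLb, with each p-value taken as the tail fraction
    of the corresponding toy test-statistic distribution."""
    p_sb = sum(q >= q_obs for q in q_toys_sb) / len(q_toys_sb)  # CLs+b
    p_b = sum(q >= q_obs for q in q_toys_b) / len(q_toys_b)     # CLb
    return p_sb / p_b if p_b > 0 else float("nan")
```

With 10,000 toys per hypothesis, each tail fraction is just a count over the ensemble, which is why the whole computation parallelizes so well across toys.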

## What You Get

- pyhf JSON compatibility (HistFactory-style workspaces)
- Native HS3 (HEP Statistics Serialization Standard) v0.2 support — load ROOT 6.37+ HS3 JSON directly, auto-detected alongside pyhf
- Native ROOT TTree reader with mmap I/O, rayon-parallel basket decompression, and columnar extraction — no ROOT C++ dependency
- Ntuple-to-workspace pipeline: ROOT ntuples → histograms → HistFactory workspace (TRExFitter replacement)
- Expression engine for string-based selections and weights (`"njet >= 4 && pt > 25.0"`)
- Negative log-likelihood (Poisson + constraints), including Barlow-Beeston auxiliary terms
- Maximum Likelihood Estimation (L-BFGS-B) with uncertainties via (damped) Hessian-based covariance + diagonal fallback
- NUTS sampling surface (generic `Posterior` API) + optional ArviZ integration
- SIMD kernels, Rayon parallelism, Apple Accelerate (vDSP/vForce), and optional GPU acceleration (CUDA for NVIDIA, Metal for Apple Silicon)
- Rust library, Python package (PyO3/maturin), and a CLI
- Implemented packs: regression/GLM, hierarchical models, time series (Kalman/EM/forecast), econometrics/causal helpers, and PK/NLME baselines

## Docs

- Docs index: `docs/README.md`
- Tutorials (end-to-end): `docs/tutorials/README.md`
- References (CLI/Python/Rust/Server/Tools): `docs/references/`
- Demo: Physics Assistant (ROOT → anomaly scan → p-values + plots): `docs/demos/physics-assistant.md`

## Quickstart

### Install (Rust)

```bash
cargo add ns-core ns-inference ns-compute
```

### Install (Python)

> **Requires Python 3.11+.** macOS ships with Python 3.9 via Xcode CLI Tools — upgrade with `brew install python@3.13` before installing.

```bash
pip install nextstat
```

### Build From Source

```bash
git clone https://github.com/NextStat/nextstat.io.git
cd nextstat.io

# Rust workspace
cargo build --release

# Python bindings (editable dev install)
cd bindings/ns-py
maturin develop --release
```

### Try the Playground (WASM)

Run asymptotic CLs upper limits (Brazil bands) in the browser (no Python, no server) using a pyhf-style `workspace.json`.

From the repo root:

```bash
rustup target add wasm32-unknown-unknown
cargo install wasm-bindgen-cli --version 0.2.108

make playground-build-wasm
make playground-serve
```

Open `http://localhost:8000/` and drag & drop a `workspace.json` (example: `playground/examples/simple_workspace.json`).

## Usage

### Rust API

```rust
use ns_inference::mle::MaximumLikelihoodEstimator;
use ns_translate::pyhf::{HistFactoryModel, HistoSysInterpCode, NormSysInterpCode, Workspace};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let json = std::fs::read_to_string("workspace.json")?;
    let workspace: Workspace = serde_json::from_str(&json)?;

    // Default interpolation (NextStat "smooth" defaults): NormSys=Code4, HistoSys=Code4p.
    // For strict HistFactory/pyhf defaults, use Code1/Code0:
    let model = HistFactoryModel::from_workspace_with_settings(
        &workspace,
        NormSysInterpCode::Code1,
        HistoSysInterpCode::Code0,
    )?;

    let mle = MaximumLikelihoodEstimator::new();
    let result = mle.fit(&model)?;

    println!("Best-fit params: {:?}", result.parameters);
    println!("NLL at minimum: {}", result.nll);
    Ok(())
}
```

### Python API

```python
from pathlib import Path

import nextstat

spec = Path("workspace.json").read_text()
model = nextstat.from_pyhf(spec)
result = nextstat.fit(model)

poi_idx = model.poi_index()
print("POI index:", poi_idx)
print("Best-fit POI:", result.bestfit[poi_idx])
print("Uncertainty:", result.uncertainties[poi_idx])
```

### Population PK (`nlme_foce`) with multi-cpt FO/ITS/IMP

```python
import nextstat

times = [0.5, 1.0, 2.0, 4.0, 8.0] * 4
subject_idx = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
y = [8.0, 6.2, 4.1, 2.6, 1.2] * 4

fit_fo = nextstat.nlme_foce(
    times,
    y,
    subject_idx,
    4,
    model="2cpt_iv",
    method="fo",
    doses=[120.0],  # length-1 broadcasts to all subjects
    theta_init=[1.2, 15.0, 0.8, 20.0],
    omega_init=[0.2, 0.2, 0.2, 0.2],
    error_model="additive",
    sigma=0.1,
)

fit_imp = nextstat.nlme_foce(
    times,
    y,
    subject_idx,
    4,
    model="3cpt_iv",
    method="imp",
    doses=[120.0],
    theta_init=[1.1, 14.0, 0.7, 18.0, 0.5, 28.0],
    omega_init=[0.2] * 6,
    imp_n_iter=5,
    imp_n_samples=100,
    error_model="additive",
    sigma=0.1,
)
```
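The `doses=[120.0]` argument above relies on a length-1 broadcast: one dose entry is reused for every subject. A small pure-Python sketch of that rule, assuming only what the comment in the example states (illustrative; `broadcast_doses` is a hypothetical helper, not part of the nextstat API):

```python
def broadcast_doses(doses, n_subjects):
    """Length-1 dose lists broadcast to all subjects; otherwise the
    list must supply exactly one dose per subject."""
    if len(doses) == 1:
        return list(doses) * n_subjects
    if len(doses) != n_subjects:
        raise ValueError(f"expected 1 or {n_subjects} doses, got {len(doses)}")
    return list(doses)
```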

### Unbinned (event-level) API

Compile an event-level likelihood from an `unbinned_spec_v0` JSON/YAML file and run fits/scans:

```python
import nextstat

analysis = nextstat.unbinned.from_config("unbinned.json")
fit = analysis.fit()
print("NLL:", fit.nll)

scan = analysis.scan([0.0, 0.5, 1.0, 2.0])
print("mu_hat:", scan["mu_hat"])

cls = analysis.hypotest_toys(1.0, n_toys=2000, seed=42)
print("CLs:", cls)
```

### Bayesian (NUTS) + ArviZ

Install optional deps:

```bash
pip install "nextstat[bayes]"
```

Run sampling and get an ArviZ `InferenceData`:

```python
import json
from pathlib import Path

import nextstat

workspace = json.loads(Path("workspace.json").read_text())
model = nextstat.from_pyhf(json.dumps(workspace))

idata = nextstat.bayes.sample(
    model,
    n_chains=2,
    n_warmup=500,
    n_samples=1000,
    seed=42,
    target_accept=0.8,
)

print(idata)
```
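The returned `idata` is a standard ArviZ `InferenceData`, so the usual multi-chain convergence diagnostics apply. As a reminder of what the R-hat that ArviZ reports is measuring, here is a minimal pure-Python sketch of the basic (non-split) Gelman-Rubin statistic for a single parameter (illustrative only; ArviZ's production estimator is the rank-normalized split-R-hat):

```python
def rhat(chains):
    """Basic Gelman-Rubin R-hat for one parameter.

    chains: list of equal-length lists of posterior draws, one per chain.
    Values near 1.0 suggest the chains are sampling the same distribution.
    """
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance B and mean within-chain variance W
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_plus = (n - 1) / n * w + b / n  # pooled posterior-variance estimate
    return (var_plus / w) ** 0.5
```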

### Viz (CLs Brazil bands, profile scans)

Install optional deps:

```bash
pip install "nextstat[viz]"
```

Compute artifacts and plot (matplotlib):

```python
import json
import numpy as np
from pathlib import Path

import nextstat

workspace = json.loads(Path("workspace.json").read_text())
model = nextstat.from_pyhf(json.dumps(workspace))

scan = np.linspace(0.0, 5.0, 101)
cls_art = nextstat.viz.cls_curve(model, scan, alpha=0.05)
nextstat.viz.plot_cls_curve(cls_art, title="CLs Brazil band")

mu = [0.0, 0.5, 1.0, 2.0]
prof_art = nextstat.viz.profile_curve(model, mu)
nextstat.viz.plot_profile_curve(prof_art, title="Profile likelihood scan")
```
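The "Brazil band" in the CLs plot is just the median and ±1σ/±2σ quantiles of the expected-limit distribution under the background-only hypothesis. A nearest-rank sketch of how such bands can be read off a collection of toy limits (illustrative only, not the `nextstat.viz` internals, which use the asymptotic expected set):

```python
def brazil_bands(toy_limits):
    """Median and ±1σ/±2σ nearest-rank quantiles of toy upper limits."""
    s = sorted(toy_limits)
    def q(p):
        return s[min(len(s) - 1, int(p * len(s)))]
    # Quantile levels are the one- and two-sigma normal tail probabilities.
    return {"-2sigma": q(0.0228), "-1sigma": q(0.1587), "median": q(0.5),
            "+1sigma": q(0.8413), "+2sigma": q(0.9772)}
```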

### Ntuple → Workspace (TRExFitter replacement)

```rust
use ns_translate::NtupleWorkspaceBuilder;

let ws = NtupleWorkspaceBuilder::new()
    .ntuple_path("ntuples/")
    .tree_name("events")
    .measurement("meas", "mu")
    .add_channel("SR", |ch| {
        ch.variable("mbb")
          .binning(&[0., 50., 100., 150., 200., 300.])
          .selection("njet >= 4 && pt > 25.0")
          .data_file("data.root")
          .add_sample("signal", |s| {
              s.file("ttH.root")
               .weight("weight_mc * weight_sf")
               .normfactor("mu")
          })
          .add_sample("background", |s| {
              s.file("ttbar.root")
               .weight("weight_mc * weight_sf")
               .normsys("bkg_norm", 0.9, 1.1)
               .weight_sys("jes", "weight_jes_up", "weight_jes_down")
               .tree_sys("jer", "jer_up.root", "jer_down.root")
               .staterror()
          })
    })
    .build()?;  // → Workspace (same type as pyhf JSON path)
```

No ROOT C++ dependency. ~8.5x faster than uproot+numpy on the full pipeline.
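Conceptually, each (channel, sample) pair in the builder boils down to three steps: select events, evaluate the weight expression, and fill the channel's fixed-binning histogram. A minimal pure-Python sketch of the filling step (illustrative only; the actual implementation is the Rust pipeline above, with the left edge inclusive and values at or beyond the last edge dropped in this sketch):

```python
import bisect

def fill_hist(values, weights, edges):
    """Fill a weighted histogram with bin boundaries `edges`."""
    counts = [0.0] * (len(edges) - 1)
    for v, w in zip(values, weights):
        i = bisect.bisect_right(edges, v) - 1  # bin index for v
        if 0 <= i < len(counts):               # drop under/overflow
            counts[i] += w
    return counts
```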

### Low-level TTree access

```rust
use ns_root::RootFile;

let file = RootFile::open("data.root")?;
let tree = file.get_tree("events")?;

// Columnar access
let pt: Vec<f64> = file.branch_data(&tree, "pt")?;
let eta: Vec<f64> = file.branch_data(&tree, "eta")?;

// Expression engine
let expr = ns_root::CompiledExpr::compile("pt > 25.0 && abs(eta) < 2.5")?;
```

### CLI

```bash
nextstat fit --input workspace.json
nextstat --interp-defaults pyhf fit --input workspace.json   # NormSys=Code1, HistoSys=Code0
nextstat hypotest --input workspace.json --mu 1.0 --expected-set
nextstat hypotest-toys --input workspace.json --mu 1.0 --n-toys 10000 --seed 42 --threads 0
nextstat hypotest-toys --input workspace.json --mu 1.0 --n-toys 10000 --gpu cuda   # NVIDIA GPU (f64)
nextstat hypotest-toys --input workspace.json --mu 1.0 --n-toys 10000 --gpu metal  # Apple Silicon GPU (f32)
nextstat upper-limit --input workspace.json --expected --scan-start 0 --scan-stop 5 --scan-points 201
nextstat version
```

## Documentation

- Tutorial index: `docs/tutorials/README.md`
- Python API reference: `docs/references/python-api.md`
- Rust API reference: `docs/references/rust-api.md`
- CLI reference: `docs/references/cli.md`
- Playground (browser/WASM): `docs/references/playground.md` (and `playground/README.md`)

## Architecture

NextStat follows a "clean architecture" style: inference depends on stable abstractions, not on specific execution backends.

```
┌────────────────────────────────────────────────────────────────┐
│                        HIGH-LEVEL LOGIC                        │
│  ns-inference (MLE, Profile Likelihood, Hypothesis Tests, ...) │
│  - depends on core types and model interfaces                  │
└─────────────────────────┬──────────────────────────────────────┘
                          │ depends on abstractions
┌─────────────────────────┴──────────────────────────────────────┐
│                      ns-core (interfaces)                      │
│  - error types, FitResult, traits                              │
└─────────────────────────┬──────────────────────────────────────┘
                          │ implemented by
┌─────────────────────────┴──────────────────────────────────────┐
│                    LOW-LEVEL IMPLEMENTATIONS                   │
│  ns-translate (pyhf + ntuple → Workspace)                      │
│  ns-compute (SIMD/CUDA/Metal)   ns-ad (dual/tape AD)           │
│  ns-root (ROOT I/O, TTree, expressions)                        │
└────────────────────────────────────────────────────────────────┘
```

## Project Layout

```
nextstat/
├── crates/
│   ├── ns-core/         # Core types, traits, error handling
│   ├── ns-compute/      # SIMD kernels, Apple Accelerate, CUDA/Metal batch NLL+grad
│   ├── ns-ad/           # Automatic differentiation (dual/tape)
│   ├── ns-root/         # Native ROOT file reader (TH1, TTree, expressions, filler)
│   ├── ns-translate/    # Format translators (pyhf, HS3, HistFactory XML, ntuple builder)
│   ├── ns-inference/    # MLE, NUTS, CLs, GLM, time series, PK/NLME
│   ├── ns-viz/          # Visualization artifacts
│   └── ns-cli/          # CLI binary
├── bindings/
│   └── ns-py/           # Python bindings (PyO3/maturin)
├── docs/
│   ├── legal/
│   ├── plans/
│   └── references/
└── tests/
```

## Development

### Requirements

- Rust 1.93+ (edition 2024)
- Python 3.11+ (for bindings)
- maturin (for Python bindings)

### Build and Test

```bash
# Build
cargo build --workspace

# Build with CUDA support (requires nvcc)
cargo build --workspace --features cuda

# Build with Metal support (Apple Silicon, macOS)
cargo build --workspace --features metal

# Tests (default features)
cargo test --workspace

# Tests including optional backends (CUDA requires nvcc)
cargo test --workspace --all-features

# Opt-in slow Rust tests (toys, SBC, NUTS quality gates)
make rust-slow-tests

# Very slow (release) regression check
make rust-very-slow-tests

# Format and lint
cargo fmt --check
cargo clippy --workspace -- -D warnings
```

### Python Tests

Local test runs should use the repo venv (it pins a Python version compatible with the built extension).

```bash
# Run fast Python tests (parity + API contracts)
PYTHONPATH=bindings/ns-py/python ./.venv/bin/python -m pytest -q -m "not slow" tests/python

# Run slow toy regression tests (opt-in)
PYTHONPATH=bindings/ns-py/python NS_RUN_SLOW=1 NS_TOYS=200 NS_SEED=0 ./.venv/bin/python -m pytest -q -m slow tests/python
```

### Benchmarks

```bash
# Compile and run all benches
cargo bench --workspace

# Common entry points
cargo bench -p ns-translate --bench model_benchmark
cargo bench -p ns-translate --bench nll_benchmark
cargo bench -p ns-compute --bench simd_benchmark
cargo bench -p ns-inference --bench mle_benchmark
cargo bench -p ns-inference --bench hypotest_benchmark
cargo bench -p ns-ad --bench ad_benchmark
cargo bench -p ns-core --bench core_benchmark
```

Details (quick mode, baselines, CI workflows): `docs/benchmarks.md`.

### Apex2 Baselines (pyhf + P6 GLM)

Record a reference baseline (writes JSON under `tmp/baselines/` with a full environment fingerprint):

```bash
make apex2-baseline-record
```

Compare current HEAD vs the latest recorded baseline (writes `tmp/baseline_compare_report.json`):

```bash
make apex2-baseline-compare
```

Pre-release gate runbook: see `CONTRIBUTING.md` § Release Checklist.

## Additional Documentation

- White paper (Markdown): `docs/WHITEPAPER.md`
- White paper (PDF): built by `python3 scripts/build_whitepaper.py` and attached to GitHub Releases on tags (`v*`)
- Internal plans/design notes are maintained outside this public repository.

## Contributing

See `CONTRIBUTING.md`. All commits must include DCO sign-off (`git commit -s`).

## License

NextStat uses a dual-licensing model:

- Open Source: `LICENSE` (AGPL-3.0-or-later)
- Commercial: `LICENSE-COMMERCIAL`

## Contact

- Website: https://nextstat.io
- GitHub: https://github.com/NextStat/nextstat.io

