Metadata-Version: 2.4
Name: simgen-vla
Version: 5.0.5
Summary: High-precision GPU computing with exact arithmetic. Free during beta.
Home-page: https://simgen.dev
Author: Clouthier Simulation Labs
Author-email: Clouthier Simulation Labs <kyle@simgen.dev>
License: Free Beta
Project-URL: Homepage, https://simgen.dev
Project-URL: Documentation, https://simgen.dev/docs
Project-URL: Repository, https://github.com/DigitalMax321/simgen
Keywords: exact-arithmetic,GPU,precision,lossless,scientific-computing,machine-learning,deep-learning,simulation,finance,HPC,cuda,pytorch
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Scientific/Engineering :: Physics
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: cython; extra == "dev"
Requires-Dist: torch>=2.0; extra == "dev"
Requires-Dist: numpy>=1.20; extra == "dev"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# SimGen VLA - Zero-Error GPU Arithmetic

**Drop-in PyTorch replacement with exact arithmetic. No accumulation error. Ever.**

[![PyPI version](https://badge.fury.io/py/simgen-vla.svg)](https://pypi.org/project/simgen-vla/)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/license-Proprietary-blue.svg)](LICENSE)

> **Free during beta** - Use freely for research, academic, and commercial projects.

**Support development:** [ko-fi.com/kyleclouthier](https://ko-fi.com/kyleclouthier)

---

## The Problem: Floating-Point Lies

Floating-point arithmetic rounds every intermediate result. Over long chains of operations, those rounding errors compound silently until your results are wrong.

```python
import torch

# Classic floating-point failure
x = torch.tensor([1e16, 1.0, -1e16])
print(x.sum())  # 0.0  <- WRONG! Should be 1.0

# 10 million additions - error explodes
values = torch.ones(10_000_000) * 0.1
print(values.sum())  # 999999.9880... <- Should be 1000000.0
```

**This affects:** financial calculations, scientific simulations, ML training stability, physics engines, cryptography, and any computation requiring precision.
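The failure is not GPU-specific; it comes from IEEE-754 rounding itself. Here is the same cancellation in plain Python doubles, using only the standard library (`math.fsum` is Python's correctly rounded summation, unrelated to simgen):

```python
import math

# Near 1e16 the gap between adjacent float64 values is 2.0,
# so adding 1.0 rounds straight back to 1e16 and the 1.0 is lost.
naive = (1e16 + 1.0) - 1e16
print(naive)                           # 0.0

# math.fsum tracks the lost low-order bits and returns the
# correctly rounded sum of the same three values.
exact = math.fsum([1e16, 1.0, -1e16])
print(exact)                           # 1.0
```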

---

## The Solution: SimGen VLA

```python
from simgen import vla

# Exact arithmetic - mathematically correct
x = vla.tensor([1e16, 1.0, -1e16])
print(x.sum())  # 1.0  <- CORRECT!

# 10 million additions - still exact
values = vla.ones(10_000_000) * 0.1
print(values.sum())  # 1000000.0  <- EXACTLY correct
```

**No code changes.** Same PyTorch API. Just import `vla` instead of `torch`.

---

## Installation

```bash
pip install simgen-vla
```

**Requirements:**
- Python 3.10, 3.11, or 3.12
- PyTorch 2.0+ with CUDA
- NVIDIA GPU (Pascal through Hopper: sm_60 to sm_90)

**Platforms:** Windows, Linux

---

## Use Cases

### Financial Computing

Mixed-magnitude calculations where every cent matters:

```python
from simgen import vla

# Portfolio with massive range - standard FP loses the pennies
positions = vla.tensor([
    1_000_000_000.00,   # $1 billion position
    0.01,                # 1 cent transaction fee
    -999_999_999.99,     # Large short position
    50_000.50,           # Medium holding
])

total = positions.sum()
print(f"Portfolio: ${float(total):,.2f}")  # $50,000.52 - exact!

# High-frequency trading accumulation
trades = vla.randn(1_000_000) * 0.001  # Million micro-trades
pnl = trades.sum()  # Exact P&L, no drift
```
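As a CPU-side sanity check for the same portfolio, the stdlib `decimal.Decimal` type carries cents exactly. This is a plain-Python analogue for comparison, not part of simgen:

```python
from decimal import Decimal

# Build each amount from a string so it is an exact decimal value.
positions = [
    Decimal("1000000000.00"),   # $1 billion position
    Decimal("0.01"),            # 1 cent transaction fee
    Decimal("-999999999.99"),   # large short position
    Decimal("50000.50"),        # medium holding
]

total = sum(positions)
print(f"Portfolio: ${total:,.2f}")  # Portfolio: $50,000.52
```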

### Scientific Simulation

Physics simulations that don't drift over time:

```python
from simgen import vla

# Chaotic system (Lorenz attractor)
def lorenz_step(state, dt=0.01):
    x, y, z = state[0], state[1], state[2]
    sigma, rho, beta = 10.0, 28.0, 8.0/3.0

    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z

    return vla.tensor([
        x + dx * dt,
        y + dy * dt,
        z + dz * dt
    ])

# Run forward 10000 steps, then backward 10000 steps
state = vla.tensor([1.0, 1.0, 1.0])
initial = state.clone()

for _ in range(10000):
    state = lorenz_step(state, dt=0.01)

for _ in range(10000):
    state = lorenz_step(state, dt=-0.01)  # Reverse time

# With VLA: any remaining reversal error reflects only the Euler
# scheme's own truncation error -- exact arithmetic adds no rounding
# drift on top of it.
# With PyTorch float32: rounding errors compound chaotically as well.
error = (state - initial).abs().sum()
print(f"Reversal error: {float(error)}")
```

### Machine Learning

Deterministic training with stable gradients:

```python
from simgen import vla

# Gradient accumulation without drift
gradients = vla.zeros(1_000_000)

for batch in range(1000):
    grad = vla.randn(1_000_000) * 0.001
    gradients = gradients + grad  # No accumulation error!

# Mean gradient is exact
mean_grad = gradients / 1000
```
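The accumulation problem itself is easy to reproduce in plain Python doubles, with `math.fsum` again serving as the correctly rounded baseline (stdlib only, independent of simgen):

```python
import math

# Naive running sum: every `+=` rounds to the nearest float64,
# and the rounding errors drift as they accumulate.
total = 0.0
for _ in range(100_000):
    total += 0.1
print(total)                          # slightly off from 10000.0

# Correctly rounded summation of the same 100,000 values.
print(math.fsum([0.1] * 100_000))     # 10000.0
```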

### Matrix Operations

Large linear algebra with preserved precision:

```python
from simgen import vla

# Matrix multiplication - exact dot products
A = vla.randn((1000, 1000))
B = vla.randn((1000, 1000))
C = vla.matmul(A, B)

# Batch operations
batch = vla.randn((32, 64, 64))
result = vla.bmm(batch, batch.transpose(1, 2))

# Linear system solving (coming soon)
# x = vla.solve(A, b)  # Exact solution
```

### Cryptography & Verification

When bit-exact results are required:

```python
from simgen import vla

# Checksums that actually sum correctly
data = vla.tensor([...])  # Large dataset
checksum = data.sum()  # Exact, reproducible

# Verification across systems
# Same inputs -> identical outputs, guaranteed
```
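One common verification pattern this enables: hash a canonical byte encoding of a deterministic result, then compare digests across runs or machines. A stdlib sketch of the idea; the encoding and helper name here are illustrative, not a simgen API:

```python
import hashlib
import math
import struct

def result_digest(values):
    """SHA-256 digest of a correctly rounded sum, in a canonical encoding."""
    total = math.fsum(values)            # deterministic, correctly rounded
    payload = struct.pack("<d", total)   # canonical little-endian float64
    return hashlib.sha256(payload).hexdigest()

data = [1e16, 1.0, -1e16, 0.5]
digest = result_digest(data)
print(digest)  # identical on every run and every machine
```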

---

## Complete API Reference

### Tensor Creation

```python
from simgen import vla

x = vla.tensor([1.0, 2.0, 3.0])       # From list
z = vla.zeros((3, 3))                  # Zeros
o = vla.ones((100,))                   # Ones
r = vla.randn((10, 10))                # Random normal
u = vla.rand((5, 5))                   # Random uniform [0,1]
a = vla.arange(0, 10)                  # Range [0,1,2,...,9]
l = vla.linspace(0, 1, 100)            # 100 points from 0 to 1
I = vla.eye(5)                         # 5x5 identity matrix
```

### Arithmetic Operations

```python
# All operations preserve full precision
c = a + b          # Exact addition
c = a - b          # Exact subtraction
c = a * b          # Exact multiplication
c = a / b          # Exact division
c = -a             # Negation
c = a ** 2         # Power
```
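To see what "exact" buys over binary floats, compare with the stdlib `fractions.Fraction`, which performs the same operations over exact rationals. This is a CPU analogue for intuition only; it says nothing about simgen's internal representation:

```python
from fractions import Fraction

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum misses 0.3.
print(0.1 + 0.2 == 0.3)                   # False

# Exact rational arithmetic introduces no rounding at any step.
a, b = Fraction(1, 10), Fraction(2, 10)
print(a + b == Fraction(3, 10))           # True
print(a * b)                              # 1/50
```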

### Reductions (Zero Drift)

```python
total = vla.sum(x)         # Exact sum
avg = vla.mean(x)          # Exact mean
product = vla.prod(x)      # Exact product
minimum = vla.min(x)       # Minimum
maximum = vla.max(x)       # Maximum
std_dev = vla.std(x)       # Standard deviation
variance = vla.var(x)      # Variance
```

### Linear Algebra

```python
C = vla.matmul(A, B)       # Matrix multiplication
C = vla.mm(A, B)           # Matrix-matrix multiply
y = vla.mv(A, x)           # Matrix-vector multiply
d = vla.dot(a, b)          # Dot product
C = vla.bmm(A, B)          # Batched matrix multiply
```

### Math Functions

```python
y = vla.exp(x)             # Exponential
y = vla.log(x)             # Natural log
y = vla.sqrt(x)            # Square root
y = vla.abs(x)             # Absolute value
y = vla.sin(x)             # Sine
y = vla.cos(x)             # Cosine
y = vla.tan(x)             # Tangent
y = vla.tanh(x)            # Hyperbolic tangent
y = vla.sigmoid(x)         # Sigmoid
```

### Shape Operations

```python
y = vla.reshape(x, (2, 3))       # Reshape
y = vla.transpose(x, 0, 1)       # Transpose dims
y = vla.squeeze(x)               # Remove size-1 dims
y = vla.unsqueeze(x, 0)          # Add dimension
y = vla.stack([a, b, c])         # Stack tensors
y = vla.cat([a, b])              # Concatenate
```

### Tensor Methods

```python
x.shape                    # Tensor shape
x.numel()                  # Number of elements
x.clone()                  # Deep copy
x.item()                   # Extract scalar
x.tolist()                 # Convert to Python list
x.numpy()                  # Convert to NumPy array
x.to_torch()               # Convert to PyTorch tensor
```

---

## Supported GPUs

| Architecture | Example GPUs | Compute Capability |
|-------------|--------------|-------------------|
| Pascal | GTX 1080, P100, P40 | sm_60, sm_61 |
| Volta | V100, Titan V | sm_70 |
| Turing | RTX 2080, T4, Quadro RTX | sm_75 |
| Ampere | RTX 3090, A100, A10 | sm_80, sm_86 |
| Ada Lovelace | RTX 4090, 4080, 4070, L40 | sm_89 |
| Hopper | H100, H200 | sm_90 |

**Cloud Support:** AWS (P3, P4, G4, G5), GCP (T4, A100, L4), Azure (NC, ND series), Kaggle (T4 x2 free), Colab

---

## Benchmarks

| Operation | Elements | PyTorch Error | VLA Error |
|-----------|----------|---------------|-----------|
| Sum | 10M | 10^-7 relative | **0.0** |
| Dot Product | 1M | 10^-8 relative | **0.0** |
| Matrix Multiply | 1000x1000 | 10^-6 relative | **0.0** |
| Chained Ops | 1000 iterations | Diverges | **Exact** |

---

## FAQ

**Q: Is this slower than PyTorch?**
A: Yes, there is overhead - typically 2-5x. For workloads where correctness matters more than raw speed, that trade-off is usually acceptable.

**Q: Does this work with autograd?**
A: Not yet. Exact autodiff is in development.

**Q: Can I use this for training neural networks?**
A: Currently best for inference, validation, and forward passes. Training support coming soon.

**Q: What about CPU?**
A: GPU required. CPU fallback uses PyTorch (loses exactness guarantee).

---

## Support & Contact

**Website:** [simgen.dev](https://simgen.dev)

**Support Development:** [ko-fi.com/kyleclouthier](https://ko-fi.com/kyleclouthier)

**Email:** kyle@simgen.dev

**GitHub:** [github.com/DigitalMax321/simgen](https://github.com/DigitalMax321/simgen)

---

## License

Proprietary. Free during beta for research, academic, and commercial use.

(c) 2025-2026 Clouthier Simulation Labs. All rights reserved.
