Metadata-Version: 2.4
Name: bijli
Version: 0.1.0
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Debuggers
Classifier: Topic :: System :: Monitoring
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Rust
Classifier: Typing :: Typed
License-File: LICENSE
Summary: Rust-powered Python profiler: in-process, no sudo, CPU + memory
Keywords: profiler,cpu,memory,performance,sampling,rust
License: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM

# Bijli ⚡

**Rust-powered Python profiler — in-process, no sudo, CPU + memory.**

```bash
pip install bijli
```

---

## Why Bijli?

The two most common profiling approaches each have a fundamental flaw:

- **cProfile / tracemalloc** — in-process, no sudo, embeddable — but adds 30–100% overhead. Unusable in production.
- **py-spy** — near-zero overhead — but needs root to attach (ptrace), can't profile memory, and can't be embedded. Its low overhead is an architectural side effect of running out-of-process and never touching the GIL.

Bijli is the only tool that combines all four properties:

| | Bijli | cProfile + tracemalloc | py-spy |
|---|:---:|:---:|:---:|
| **No root / sudo** | ✅ | ✅ | ❌ sudo to attach (ptrace) |
| **Low overhead** | ✅ **< 5% CPU, < 1% I/O** | ❌ 30–100% | ✅ (out-of-process) |
| **Memory profiling** | ✅ RSS + allocation sites | ✅ (but always-on cost) | ❌ CPU only |
| **Embeddable** | ✅ `@profile`, `with Profile()` | ⚠️ programmatic but slow | ❌ CLI only |
| **GIL hold per sample** | ✅ < 50 µs | ❌ every call | ✅ zero (out-of-process) |
| **Python version lag** | ✅ hours (stable C API) | ✅ none (stdlib) | ❌ weeks (raw struct offsets) |

In short: if you can't use py-spy (no root, need memory, need to embed), Bijli gives you
py-spy-class overhead without the trade-offs.

---

## Quickstart

### Decorator

```python
import bijli

@bijli.profile
def process_data(items):
    return [expensive(x) for x in items]  # expensive() stands in for your own workload

result = process_data(my_list)
# Bijli prints an ANSI report to stderr when the function returns.
```

### Context manager

```python
import bijli

with bijli.Profile() as p:
    run_my_pipeline()

p.print_report()                 # ANSI to stderr
p.print_report(json_mode=True)   # JSON to stderr
```

### CLI

```bash
# Profile a script
python -m bijli my_script.py

# Profile with arguments
python -m bijli my_script.py -- arg1 arg2

# JSON output
python -m bijli --json my_script.py

# Custom sample interval
python -m bijli --interval 20 my_script.py
```
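
To save the JSON report to a file, one approach, assuming the CLI writes its report to stderr like `print_report()` does in the API examples above, is a plain stream redirect:

```bash
# Assumption: the CLI report goes to stderr, like print_report() in the API.
python -m bijli --json my_script.py 2> profile.json
```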

---

## Installation

```bash
pip install bijli
```

Prebuilt wheels for:
- Linux x86_64 / aarch64 (glibc + musl)
- macOS x86_64 / Apple Silicon
- Windows x86_64

Requires Python 3.9+. No external dependencies.

---

## Output

```
================================================================
  bijli profiler report
================================================================
  wall time  :  2.341 s
  samples    :  46
  CPU        :  avg 12.3%   peak 97.4%
  memory     :  44.2 MB -> 67.2 MB  (+23.0 MB)
  spikes     :  1 CPU detected

  CPU  [_______________________##########_____________________]  0%..97%
  RSS  [_________________-----------############################]  44..67 MB
         _ = low   - = medium   # = high

----------------------------------------------------------------
  HOT PATHS  -- where the program spent its time

   1.  process_batch              34.8%  ##############
        my_module.py:88

   2.  _transform                 21.7%  #########
        my_module.py:61

   3.  sum                        13.0%  #####
        fromnumeric.py:86

----------------------------------------------------------------
  SPIKES  (1 detected)

  1. [CPU]  0.85 s -- 1.23 s  (380 ms)
     peak 97.4%   threshold 34.2%
     Stack at peak:
       process_batch             my_module.py:88
       _transform                my_module.py:61
================================================================
```

---

## Configuration

```python
bijli.Profile(
    interval_ms=50,   # Sample interval in milliseconds (default: 50)
    top_n=20,         # Hot paths to include in report (default: 20)
)
```
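
For example, a long-running job can be sampled less aggressively. This reuses only the documented `Profile` API from above; `train_model()` is a placeholder for your own entry point:

```python
import bijli

# Sample every 200 ms and keep the report short for a long-running job.
with bijli.Profile(interval_ms=200, top_n=10) as p:
    train_model()

p.print_report()
```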

CLI flags:

| Flag | Default | Description |
|------|---------|-------------|
| `--interval MS` | 50 | Sample interval (ms) |
| `--top-n N` | 20 | Hot paths in report |
| `--json` | off | JSON output instead of ANSI |

---

## How it works

1. **Sampler thread** — a Rust background thread wakes every `interval_ms`,
   reads CPU% and RSS directly from the OS (no psutil), walks the frames
   returned by CPython's `sys._current_frames()` to capture stack snapshots,
   and pushes them into a lock-free SPSC ring buffer. GIL hold time per
   sample is < 50 µs.

2. **Spike detection** — after `stop()`, Welford's online algorithm computes
   the mean and σ of a baseline window (the first 25% of wall time) to set a
   threshold, then groups contiguous above-threshold samples into `SpikeEvent`
   objects (sketched in Python after this list).

3. **Memory spikes** — when the RSS delta exceeds the threshold, a Rust atomic
   flag is set. A Python daemon thread polls the flag and activates
   `tracemalloc` for 500 ms to capture allocation sites, then restores the
   original `tracemalloc` state (see the second sketch after this list).

4. **Report** — spike events, hot paths (by frame frequency), and
   allocation sites are rendered as an ANSI report or emitted as JSON.
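
Step 2 is easy to illustrate. Below is a pure-Python sketch of the spike-detection pass, not Bijli's actual Rust code: the Welford update and the 25% baseline window come from the description above, while the multiplier `k`, the `Spike` dataclass, and the `detect_spikes` name are illustrative assumptions.

```python
import math
from dataclasses import dataclass

@dataclass
class Spike:          # stand-in for Bijli's SpikeEvent
    start: int        # index of first above-threshold sample
    end: int          # index of last above-threshold sample
    peak: float

def detect_spikes(samples, baseline_frac=0.25, k=2.0):
    # Welford's online mean/variance over the baseline window.
    n_base = max(1, int(len(samples) * baseline_frac))
    mean = m2 = 0.0
    for n, x in enumerate(samples[:n_base], start=1):
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    sigma = math.sqrt(m2 / n_base)
    threshold = mean + k * sigma

    # Group contiguous above-threshold samples into spike events.
    spikes, start = [], None
    for i, x in enumerate(samples):
        if x > threshold and start is None:
            start = i
        elif x <= threshold and start is not None:
            spikes.append(Spike(start, i - 1, max(samples[start:i])))
            start = None
    if start is not None:
        spikes.append(Spike(start, len(samples) - 1, max(samples[start:])))
    return spikes
```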

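Step 3's on-demand `tracemalloc` capture follows an "enable briefly, snapshot, restore" pattern. Here is a minimal pure-Python sketch of that pattern, assuming the 500 ms window from the description; the polling thread and atomic flag live on the Rust side in the real implementation, and `capture_allocation_sites` is a hypothetical name.

```python
import time
import tracemalloc

def capture_allocation_sites(window_s=0.5, top_n=10):
    # Remember whether the user already had tracemalloc running,
    # so its state can be restored afterwards.
    was_tracing = tracemalloc.is_tracing()
    if not was_tracing:
        tracemalloc.start()
    try:
        time.sleep(window_s)                   # let allocations accumulate
        snapshot = tracemalloc.take_snapshot()
    finally:
        if not was_tracing:
            tracemalloc.stop()                 # restore the original state
    return snapshot.statistics("lineno")[:top_n]
```
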
---

## License

MIT

