Metadata-Version: 2.4
Name: edgefirst_hal
Version: 0.16.3
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Processing
Classifier: Topic :: Software Development :: Libraries
Requires-Dist: numpy
Requires-Dist: pytest ; extra == 'test'
Requires-Dist: psutil ; extra == 'test'
Provides-Extra: test
Summary: Hardware Abstraction Layer for edge AI with zero-copy tensors, image processing, and YOLO decoding
Keywords: edge-ai,computer-vision,machine-learning,yolo,tensor,image-processing,dma,zero-copy,embedded,nxp-imx
Home-Page: https://edgefirst.ai
Author-email: Au-Zone Technologies <support@au-zone.com>
Maintainer-email: Au-Zone Technologies <support@au-zone.com>
License: Apache-2.0
Requires-Python: >=3.8
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Changelog, https://github.com/EdgeFirstAI/hal/blob/main/CHANGELOG.md
Project-URL: Documentation, https://github.com/EdgeFirstAI/hal#readme
Project-URL: Homepage, https://edgefirst.ai
Project-URL: Issues, https://github.com/EdgeFirstAI/hal/issues
Project-URL: Repository, https://github.com/EdgeFirstAI/hal.git

# edgefirst-hal

[![PyPI](https://img.shields.io/pypi/v/edgefirst-hal.svg)](https://pypi.org/project/edgefirst-hal/)
[![Python](https://img.shields.io/pypi/pyversions/edgefirst-hal.svg)](https://pypi.org/project/edgefirst-hal/)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

Hardware-accelerated image processing, zero-copy tensors, and YOLO decoding
for edge AI inference pipelines. Built in Rust with Python bindings via PyO3.

## Installation

```bash
pip install edgefirst-hal
```

Pre-built wheels are available for Linux (x86_64, aarch64), macOS, and Windows.
No Rust toolchain required.

> **Python 3.11+** wheels use the improved stable ABI for zero-copy buffer
> protocol support. Python 3.8–3.10 wheels use a compatible fallback.
> Pip selects the best wheel automatically.

## Quick Start

```python
import edgefirst_hal as ef

# Load a source image
src = ef.Tensor.load("photo.jpg", ef.PixelFormat.Rgb)

# Create an image processor (auto-selects best backend: GPU > G2D > CPU)
processor = ef.ImageProcessor()

# Allocate a GPU-optimal output buffer — always use create_image() for
# destinations passed to convert(), so the processor can select the best
# memory type (DMA-buf, PBO, or system memory) for zero-copy GPU paths.
dst = processor.create_image(640, 640, ef.PixelFormat.Rgb)

# Convert with letterbox resize (preserves aspect ratio)
processor.convert(src, dst)

# Access pixel data as a numpy array
import numpy as np
pixels = np.frombuffer(dst.map(), dtype=np.uint8).reshape(dst.shape())
```

## Key Features

- **Zero-copy tensors** — DMA-BUF, POSIX shared memory, and PBO-backed
  buffers with automatic fallback to system memory
- **Hardware-accelerated image processing** — OpenGL, NXP G2D, and
  optimized CPU backends with automatic selection
- **Letterbox resize** — aspect-ratio-preserving resize with configurable
  padding color, rotation, and flip
- **Int8 output** — `create_image(..., dtype="int8")` for direct signed
  int8 tensor output with GPU-accelerated XOR bias
- **YOLO decoding** — YOLOv5, YOLOv8, YOLO11, and YOLO26 detection and
  instance segmentation (including end-to-end models)
- **Object tracking** — ByteTrack multi-object tracker with Kalman filtering
- **Fully typed** — ships with `.pyi` stubs for IDE autocompletion and
  type checking with mypy / pyright
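The XOR bias noted under **Int8 output** above is the standard unsigned-to-signed conversion trick: XOR-ing a uint8 pixel with `0x80` and reinterpreting the bytes as signed int8 is equivalent to subtracting 128. A quick numpy illustration of the arithmetic (not the library's GPU code path):

```python
import numpy as np

# Unsigned pixel values spanning the full 0..255 range
u8 = np.array([0, 1, 127, 128, 255], dtype=np.uint8)

# Flip the sign bit, then reinterpret the same bytes as signed int8
i8 = (u8 ^ 0x80).view(np.int8)

print(i8)  # [-128 -127   -1    0  127]

# Same result as subtracting 128 in wider arithmetic
assert np.array_equal(i8, (u8.astype(np.int16) - 128).astype(np.int8))
```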

## Image Processing

```python
import edgefirst_hal as ef

processor = ef.ImageProcessor()
src = ef.Tensor.load("frame.jpg", ef.PixelFormat.Rgb)

# Letterbox resize to model input size
dst = processor.create_image(640, 640, ef.PixelFormat.Rgb)
processor.convert(src, dst)

# With rotation and horizontal flip
processor.convert(src, dst, rotation=ef.Rotation.Rotate90, flip=ef.Flip.Horizontal)

# Crop source region
processor.convert(src, dst, src_crop=ef.Rect(100, 100, 400, 400))

# Int8 output for quantized models
dst_i8 = processor.create_image(640, 640, ef.PixelFormat.Rgb, dtype="int8")
processor.convert(src, dst_i8)
```
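For reference, letterbox resize scales the source uniformly until it fits inside the destination, then centers it with padding. The geometry reduces to a few lines (an illustrative sketch, not the library's implementation):

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Return scaled size and padding for an aspect-ratio-preserving fit."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst_w - new_w) // 2   # left/right padding
    pad_y = (dst_h - new_h) // 2   # top/bottom padding
    return new_w, new_h, pad_x, pad_y

# A 1920x1080 frame letterboxed into a 640x640 model input:
print(letterbox_geometry(1920, 1080, 640, 640))  # (640, 360, 0, 140)
```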

## Zero-Copy External Buffer (Linux)

When integrating with an NPU delegate that owns DMA-BUF buffers, render
directly into the delegate's buffer to eliminate a `memcpy`:

```python
import edgefirst_hal as ef

processor = ef.ImageProcessor()
src = ef.Tensor.load("frame.jpg", ef.PixelFormat.Rgb)

# Render directly into the delegate's DMA-BUF — zero copies
dst = processor.import_image(fd=vx_fd, width=640, height=640, format=ef.PixelFormat.Rgb)
processor.convert(src, dst)

# Reverse: HAL allocates, consumer imports the fd
hal_dst = processor.create_image(640, 640, ef.PixelFormat.Rgb)
fd = hal_dst.dmabuf_clone()  # Raises if not DMA-backed
delegate.register(fd)
```

You can also attach format metadata to any raw tensor created via `from_fd()`:

```python
t = ef.Tensor.from_fd(some_fd, [480, 640, 3])
t.set_format(ef.PixelFormat.Rgb)
processor.convert(src, t)
```

**Performance tip:** When rotating through a pool of DMA-BUFs (e.g. 2–3
from an NPU delegate), create the `Tensor` wrappers once at init and
reuse them across frames. This avoids EGL image cache misses (~100–300 µs
each on Vivante GPUs).
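That wrapper reuse can be as little as a dict keyed by file descriptor. The sketch below shows the shape of the pattern with a generic factory callable; in practice you would pass `processor.import_image` as the factory, and the `BufferPool` class itself is hypothetical, not part of the HAL API:

```python
class BufferPool:
    """Cache tensor wrappers per DMA-BUF fd so each fd is imported only once."""

    def __init__(self, import_fn):
        self._import = import_fn  # factory, e.g. processor.import_image
        self._cache = {}

    def get(self, fd, **kwargs):
        if fd not in self._cache:            # first frame: import once
            self._cache[fd] = self._import(fd=fd, **kwargs)
        return self._cache[fd]               # later frames: reuse the wrapper
```

After the first frame, `pool.get(fd, ...)` returns the cached wrapper, keeping the EGL image cache warm.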

## YOLO Decoding

```python
import edgefirst_hal as ef

# Configure decoder from model metadata
decoder = ef.Decoder(
    {"detection": {"shape": [1, 84, 8400], "dtype": "float32"}},
    score_threshold=0.5,
    iou_threshold=0.45,
)

# Decode model outputs → (boxes, scores, class_ids)
boxes, scores, classes = decoder.decode([output_tensor])
```
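For orientation, a `[1, 84, 8400]` YOLOv8-style detection head packs 4 box coordinates plus 80 class scores for each of 8400 anchors. Conceptually, the decoder's thresholding stage works like this numpy sketch (the real decoder then applies NMS with `iou_threshold`; this is not the library's code):

```python
import numpy as np

score_threshold = 0.5
out = np.random.rand(1, 84, 8400).astype(np.float32)  # stand-in for a model output

preds = out[0]                       # [84, 8400]
boxes = preds[:4].T                  # [8400, 4] box coordinates (cx, cy, w, h)
cls_scores = preds[4:]               # [80, 8400] per-class scores
scores = cls_scores.max(axis=0)      # best class score per anchor
classes = cls_scores.argmax(axis=0)  # best class id per anchor

keep = scores >= score_threshold     # drop low-confidence anchors
boxes, scores, classes = boxes[keep], scores[keep], classes[keep]
```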

## Object Tracking

`ByteTrack` is a multi-object tracker implementing the ByteTrack algorithm
with Kalman-filtered motion prediction. It assigns consistent track IDs to
detections across frames.

```python
import edgefirst_hal as ef

tracker = ef.ByteTrack(
    high_conf=0.7,         # High-confidence detection threshold
    iou=0.25,              # IoU threshold for association
    update=0.25,           # Update/low-confidence threshold
    lifespan_ns=500_000_000,  # Track lifespan without detection (nanoseconds)
)

# Decode and track in one call (returns boxes, scores, classes, masks, track_infos)
boxes, scores, classes, masks, tracks = decoder.decode_tracked(
    tracker, timestamp_ns, [output_tensor]
)
# masks is empty for detection-only models

# Or query currently active tracks
active = tracker.get_active_tracks()
```
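The `iou` threshold above gates detection-to-track association: a detection is matched to a track only if their boxes overlap enough. IoU here is the usual intersection-over-union of axis-aligned boxes; for intuition (not the tracker's internal code):

```python
def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection top-left
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])   # intersection bottom-right
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou([0, 0, 10, 10], [5, 0, 15, 10]))  # ≈ 0.333
```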

## Segmentation Mask Rendering

### draw_decoded_masks()

Draw pre-decoded masks onto a destination image:

```python
processor.draw_decoded_masks(
    dst,
    bbox,           # numpy array [N, 4]
    scores,         # numpy array [N]
    classes,        # numpy array [N]
    seg=[],         # list of segmentation arrays (optional)
    background=None,  # optional background tensor to blit before drawing
    opacity=1.0,    # mask alpha scale (0.0 – 1.0)
)
```
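The `opacity` value acts like a conventional alpha blend of the mask color over the destination pixels. As a numpy illustration of the behavior it controls (not the renderer's code):

```python
import numpy as np

opacity = 0.7
dst = np.full((2, 2, 3), 200, dtype=np.uint8)        # existing image pixels
mask_color = np.array([255, 0, 0], dtype=np.uint8)   # per-class mask color

# Standard alpha blend: dst' = (1 - a) * dst + a * color
blended = (dst * (1 - opacity) + mask_color * opacity).astype(np.uint8)
print(blended[0, 0])  # [238  60  60]
```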

### draw_masks()

Decode model outputs and draw segmentation masks in a single call. Masks never
leave Rust, eliminating the Python round-trip overhead of `decode()` +
`draw_decoded_masks()`.

Without a tracker, returns `(boxes, scores, classes)`. With a tracker, returns
`(boxes, scores, classes, track_infos)`.

```python
import edgefirst_hal as ef

processor = ef.ImageProcessor()
tracker = ef.ByteTrack()

# Without tracking
boxes, scores, classes = processor.draw_masks(decoder, outputs, dst)

# With overlay parameters
boxes, scores, classes = processor.draw_masks(
    decoder, outputs, dst,
    background=bg_tensor,  # blit bg_tensor into dst before masks
    opacity=0.7,           # semi-transparent masks
)

# With tracking (requires tracker= and timestamp=)
import time
ts = time.monotonic_ns()
boxes, scores, classes, tracks = processor.draw_masks(
    decoder, outputs, dst,
    tracker=tracker,
    timestamp=ts,
)
```

## Platform Support

| Platform | Acceleration | Memory Types |
|----------|--------------|--------------|
| Linux (NXP i.MX8/i.MX95) | OpenGL + G2D | DMA-buf, SHM, PBO, Mem |
| Linux (x86_64, other ARM) | OpenGL | SHM, PBO, Mem |
| macOS / Windows | CPU only | Mem |

Hardware acceleration is selected automatically when available; every
platform falls back to the CPU backend otherwise.

## Part of the EdgeFirst Ecosystem

`edgefirst-hal` is the runtime inference library in the
[EdgeFirst](https://edgefirst.ai) platform for deploying AI at the edge.

- **[EdgeFirst Studio](https://edgefirst.studio)** — label, train, and
  deploy models for edge devices
- **[Rust crates](https://crates.io/crates/edgefirst-hal)** — use the
  same library directly from Rust or C
- **[GitHub](https://github.com/EdgeFirstAI/hal)** — source code,
  architecture docs, benchmarks, and contribution guide

## License

Apache-2.0 — see [LICENSE](https://github.com/EdgeFirstAI/hal/blob/main/LICENSE).

