Metadata-Version: 2.4
Name: heatmaps_to_keypoints
Version: 0.0.2
Summary: Fast C++ heatmap-to-keypoints decoder for 2D human pose estimation.
Author: Mwangi Kuya
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: opencv-python>=4.0
Dynamic: license-file

# HeatmapToKeypoints

A high-performance C++/PyTorch extension for decoding 2D heatmaps into precise keypoint coordinates.

This library is designed for **human pose estimation** and **keypoint detection** workflows. It accepts standard NCHW heatmap tensors (e.g., from HRNet, SimpleBaseline, or KD-student models) and efficiently converts them into normalized (x, y) coordinates with confidence scores using sub-pixel refinement.

## Problem Statement

Pose estimation models typically output a 4D tensor of shape `[N, K, H, W]`:
- `N`: Batch size
- `K`: Number of keypoints
- `H, W`: Heatmap spatial dimensions

Each of the `K` channels holds a heatmap, typically an approximately Gaussian blob, whose peak marks the predicted location of one body part. Converting these heatmaps into coordinates requires finding the peak and refining it to sub-pixel accuracy. In pure Python, this is slow on the CPU for large batches.

## Solution

**HeatmapToKeypoints** provides a direct C++ binding to PyTorch that:
1.  **Scans Memory Linearly**: Optimizes memory access patterns for NCHW layout.
2.  **Refines Coordinates**: Uses Taylor expansion (Hessian matrix inversion) to compute sub-pixel offsets from the discrete peak.
3.  **Outputs Normalized Data**: Returns coordinates in `[0.0, 1.0]` range, independent of resolution.
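
To illustrate the first point: in a contiguous NCHW tensor, each `H × W` heatmap occupies one unbroken run of `H * W` values, so a per-channel peak search can walk the buffer front to back. The pure-Python sketch below only shows that index arithmetic. The name `argmax_per_channel` is hypothetical, and the actual C++ implementation may be organized differently.

```python
def argmax_per_channel(flat, n, k, h, w):
    """Find the peak of every heatmap in a flat, contiguous NCHW buffer.

    Hypothetical illustration; not part of heatmaps_to_keypoints.
    Returns one (row, col, value) tuple per channel, in memory order.
    """
    plane = h * w  # number of values in one H x W heatmap
    if len(flat) != n * k * plane:
        raise ValueError("buffer length does not match n * k * h * w")

    peaks = []
    for channel in range(n * k):
        base = channel * plane  # offset of this heatmap in the buffer
        best_index, best_value = 0, flat[base]
        # Sequential pass over the heatmap: indices base .. base + plane - 1.
        for i in range(1, plane):
            value = flat[base + i]
            if value > best_value:
                best_index, best_value = i, value
        # Recover 2D coordinates from the position inside the plane.
        row, col = divmod(best_index, w)
        peaks.append((row, col, best_value))
    return peaks
```

Channel `c` of batch item `b` sits at position `b * k + c` in the returned list.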

## Installation

Ensure you have a C++17-compatible compiler and PyTorch installed, then install from the project root:

```bash
pip install .
```

## Usage

The library exposes a single function, `decode_heatmaps`.

### Input format
- **Type**: `torch.Tensor` (float32)
- **Shape**: `[Batch, Keypoints, Height, Width]`
- **Device**: CPU
- **Memory**: Contiguous
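
Model outputs often arrive on a GPU, in half precision, or as a non-contiguous view. The sketch below shows one way to bring such a tensor into the layout listed above, using only standard PyTorch calls. The helper name `prepare_heatmaps` is hypothetical and not part of this library.

```python
import torch


def prepare_heatmaps(heatmaps: torch.Tensor) -> torch.Tensor:
    """Return a float32, CPU, contiguous version of a 4D heatmap tensor.

    Hypothetical helper for illustration; not part of heatmaps_to_keypoints.
    """
    if heatmaps.dim() != 4:
        raise ValueError(
            f"expected shape [N, K, H, W], got {tuple(heatmaps.shape)}"
        )
    # detach() drops autograd history, to() moves and casts in one call,
    # and contiguous() copies only if the memory is not already dense.
    return heatmaps.detach().to(device="cpu", dtype=torch.float32).contiguous()
```

The result can then be passed to `decode_heatmaps` in place of the raw model output.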

### Python Example

```python
import torch
import heatmaps_to_keypoints

# Example Tensor: Batch=8, Keypoints=17, Height=96, Width=72
heatmaps = torch.randn(8, 17, 96, 72)

# Decode
# Returns a list of lists: results[batch_index][keypoint_index]
results = heatmaps_to_keypoints.decode_heatmaps(heatmaps)

# Access data for the first image in batch, first keypoint
landmark = results[0][0]

print(f"X: {landmark.x}")     # Normalized [0-1]
print(f"Y: {landmark.y}")     # Normalized [0-1]
print(f"Score: {landmark.score}") # Heatmap peak value
```
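
Because the coordinates are normalized, mapping a landmark back to pixels is a single multiplication by the target image size. The sketch below assumes the heatmap covers the whole image, with no crop or padding. The `Landmark` tuple is a stand-in with the `x`, `y`, and `score` fields used above, and both `to_pixels` and its `min_score` threshold are hypothetical rather than part of this library.

```python
from typing import NamedTuple, Optional, Tuple


class Landmark(NamedTuple):
    """Stand-in for a decoded keypoint (assumed fields: x, y, score)."""
    x: float
    y: float
    score: float


def to_pixels(
    landmark,
    image_width: int,
    image_height: int,
    min_score: float = 0.0,
) -> Optional[Tuple[float, float]]:
    """Scale normalized coordinates to pixels, or return None if below min_score."""
    if landmark.score < min_score:
        return None  # treat low-confidence keypoints as missing
    return landmark.x * image_width, landmark.y * image_height
```

Any object with `x`, `y`, and `score` attributes works here, so the function can be applied directly to the entries of `results`.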

## Algorithm Details

For each keypoint heatmap, the decoder performs:
1.  **Global Argmax**: Finds the discrete pixel $(p_x, p_y)$ with the maximum value.
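2.  **Taylor Expansion**: Fits a second-order Taylor series around the discrete peak and solves for the sub-pixel offset $\Delta = (\Delta_x, \Delta_y)$ at which the approximation is maximal:
    $$ \Delta = -\mathbf{H}^{-1} \nabla $$
    where $\mathbf{H}$ is the 2×2 Hessian matrix and $\nabla$ is the gradient vector, both computed from the pixels neighboring the peak.
3.  **Coordinate Normalization**: Divides the refined position by the heatmap width $W$ and height $H$:
    $$ x_{final} = \frac{p_x + \Delta_x}{W}, \quad y_{final} = \frac{p_y + \Delta_y}{H} $$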
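
The three steps above can be sketched in pure Python for a single heatmap. This is a reference illustration under stated assumptions, not the library's C++ code: the gradient and Hessian are estimated with central finite differences, refinement is skipped when the peak lies on the border or the Hessian is near-singular, and the names `decode_single` and `gaussian_heatmap` are hypothetical.

```python
import math


def decode_single(heatmap):
    """Decode one H x W heatmap (list of rows) into (x, y, score).

    x and y are normalized by the heatmap width and height.
    """
    height, width = len(heatmap), len(heatmap[0])

    # Step 1: global argmax (first occurrence wins on ties).
    px = py = 0
    for row in range(height):
        for col in range(width):
            if heatmap[row][col] > heatmap[py][px]:
                py, px = row, col
    score = heatmap[py][px]

    # Step 2: Taylor refinement, offset = -Hessian^{-1} * gradient.
    off_x = off_y = 0.0
    if 0 < px < width - 1 and 0 < py < height - 1:
        h = heatmap
        gx = 0.5 * (h[py][px + 1] - h[py][px - 1])
        gy = 0.5 * (h[py + 1][px] - h[py - 1][px])
        hxx = h[py][px + 1] - 2.0 * h[py][px] + h[py][px - 1]
        hyy = h[py + 1][px] - 2.0 * h[py][px] + h[py - 1][px]
        hxy = 0.25 * (
            h[py + 1][px + 1] - h[py + 1][px - 1]
            - h[py - 1][px + 1] + h[py - 1][px - 1]
        )
        det = hxx * hyy - hxy * hxy
        if abs(det) > 1e-12:  # assumption: skip near-singular Hessians
            # Closed-form inverse of the 2x2 Hessian applied to the gradient.
            off_x = -(hyy * gx - hxy * gy) / det
            off_y = -(hxx * gy - hxy * gx) / det

    # Step 3: normalize by heatmap size.
    return (px + off_x) / width, (py + off_y) / height, score


def gaussian_heatmap(width, height, cx, cy, sigma=2.0):
    """Build a synthetic heatmap with a Gaussian peak at (cx, cy)."""
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


# A peak placed between pixel centers is recovered more closely than
# the integer argmax alone would allow.
x, y, score = decode_single(gaussian_heatmap(24, 16, cx=10.3, cy=7.6))
print(x * 24, y * 16, score)
```

Applied to raw heatmap values, the refinement is only approximate for a Gaussian peak, so the recovered position is close to, but not exactly at, the true center.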

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE.txt) file for details.
