Metadata-Version: 2.1
Name: cuda-bench
Version: 0.2.0
Summary: CUDA Kernel Benchmarking Package
Author: NVIDIA Corporation
License: Apache-2.0 WITH LLVM-exception
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Environment :: GPU :: NVIDIA CUDA
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: POSIX :: Linux
Project-URL: Homepage, https://github.com/NVIDIA/nvbench
Project-URL: Repository, https://github.com/NVIDIA/nvbench
Project-URL: Issues, https://github.com/NVIDIA/nvbench/issues
Requires-Python: >=3.10
Provides-Extra: cu12
Requires-Dist: cuda-bindings<13.0.0,>=12.0.0; extra == "cu12"
Provides-Extra: cu13
Requires-Dist: cuda-bindings<14.0.0,>=13.0.0; extra == "cu13"
Provides-Extra: test-cu12
Requires-Dist: cuda-bench[cu12]; extra == "test-cu12"
Requires-Dist: pytest; extra == "test-cu12"
Requires-Dist: cupy-cuda12x; extra == "test-cu12"
Requires-Dist: numba; extra == "test-cu12"
Provides-Extra: test-cu13
Requires-Dist: cuda-bench[cu13]; extra == "test-cu13"
Requires-Dist: pytest; extra == "test-cu13"
Requires-Dist: cupy-cuda13x; extra == "test-cu13"
Requires-Dist: numba; extra == "test-cu13"
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Requires-Dist: cupy-cuda12x; extra == "test"
Requires-Dist: numba; extra == "test"
Provides-Extra: tools
Requires-Dist: colorama; extra == "tools"
Requires-Dist: jsondiff; extra == "tools"
Requires-Dist: matplotlib; extra == "tools"
Requires-Dist: numpy; extra == "tools"
Requires-Dist: pandas; extra == "tools"
Requires-Dist: seaborn; extra == "tools"
Requires-Dist: tabulate; extra == "tools"
Description-Content-Type: text/markdown

# CUDA Kernel Benchmarking Package

This package provides a Python API to the CUDA Kernel Benchmarking
Library `NVBench`.

## Installation

Install from PyPI:

```bash
pip install "cuda-bench[cu13]"  # For CUDA 13.x
pip install "cuda-bench[cu12]"  # For CUDA 12.x
```
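To confirm that the install landed in the active environment, the distribution metadata can be queried from Python. This is a minimal sketch that relies only on the standard library and the distribution name `cuda-bench` used above; the helper name `installed_version` is hypothetical and not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "cuda-bench") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("cuda-bench is not installed in this environment")
    else:
        print(f"cuda-bench {found} is installed")
```

The check reads package metadata only, so it does not need a GPU or a CUDA runtime to be present.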

## Building from source

### Ensure recent version of CMake

`NVBench` requires a recent version of CMake (>=3.30.4). Either build CMake from source, or create a conda environment that provides a recent CMake:

```bash
conda create -n build_env --yes cmake ninja
conda activate build_env
```
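Before building, it can help to check that the `cmake` found on `PATH` meets the 3.30.4 minimum stated above. The following is a sketch, not part of the project: the helper names `parse_cmake_version` and `cmake_on_path_version` are hypothetical, and the parser assumes the usual `cmake version X.Y.Z` first line printed by `cmake --version`.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

MIN_CMAKE = (3, 30, 4)  # minimum version stated in this README


def parse_cmake_version(output: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, patch) from `cmake --version` output (hypothetical helper)."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def cmake_on_path_version() -> Optional[Tuple[int, ...]]:
    """Return the version of the cmake on PATH, or None if it cannot be determined."""
    exe = shutil.which("cmake")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    return parse_cmake_version(result.stdout)


if __name__ == "__main__":
    found = cmake_on_path_version()
    if found is None:
        print("cmake was not found on PATH (or its version could not be parsed)")
    elif found < MIN_CMAKE:
        print(f"cmake {'.'.join(map(str, found))} is older than the required 3.30.4")
    else:
        print(f"cmake {'.'.join(map(str, found))} is recent enough")
```

Versions are compared as integer tuples, so `3.9.0` correctly sorts before `3.30.4`, which a plain string comparison would get wrong.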

### Ensure CUDA compiler

Building the `NVBench` library requires a CUDA compiler, so make sure the appropriate environment variables
are set. For example, assuming the CUDA toolkit is installed system-wide and the target is an Ampere GPU (compute capability 8.6):

```bash
export CUDACXX=/usr/local/cuda/bin/nvcc
export CUDAARCHS=86
```
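A quick way to catch a missing or mistyped variable before starting a long build is to inspect the environment from Python. This is a sketch under the assumption that both `CUDACXX` and `CUDAARCHS` are meant to be set as in the example above; the helper `cuda_env_problems` is hypothetical and not provided by the package.

```python
import os
import shutil
from typing import List, Mapping


def cuda_env_problems(env: Mapping[str, str]) -> List[str]:
    """List problems with the CUDA build variables in `env` (hypothetical helper)."""
    problems = []

    compiler = env.get("CUDACXX", "")
    if not compiler:
        problems.append("CUDACXX is not set")
    elif shutil.which(compiler) is None:
        # shutil.which accepts either a bare name (searched on PATH) or a path.
        problems.append(f"CUDACXX does not resolve to an executable: {compiler}")

    if not env.get("CUDAARCHS", ""):
        problems.append("CUDAARCHS is not set")

    return problems


if __name__ == "__main__":
    issues = cuda_env_problems(os.environ)
    if issues:
        for issue in issues:
            print(f"warning: {issue}")
    else:
        print("CUDACXX and CUDAARCHS look usable")
```

The function takes the environment as an argument instead of reading `os.environ` directly, which keeps it easy to exercise with a plain dictionary.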

### Build Python project

Now switch to the `python` folder, configure and install the NVBench library, and install the package in editable mode:

```bash
cd nvbench/python
pip install -e .
```

### Verify that the package works

```bash
python test/run_1.py
```

### Run examples

```bash
# Example benchmarking a numba.cuda kernel
python examples/throughput.py
```

```bash
# Example benchmarking kernels authored using cuda.core
python examples/axes.py
```

```bash
# Example benchmarking algorithms from cuda.cccl.parallel
python examples/cccl_parallel_segmented_reduce.py
```

```bash
# Example benchmarking a CuPy function
python examples/cupy_extract.py
```
