Metadata-Version: 2.4
Name: gwatch
Version: 0.0.7
Summary: G-Watch is a toolbox for GPU profiling and program analysis.
Home-page: https://github.com/G-Watch/G-Watch
Author: Zhuobin Huang
Author-email: zhuobin@u.nus.edu
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: prettytable
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# G-Watch

<div align="center" style="display: flex; margin:10px;">
    <img src="https://raw.githubusercontent.com/mars-compute-ai/G-Watch/refs/heads/main/docs/logo.jpg" style="width: 350px; margin: 0px; padding: 0px;" />
</div>

***G-Watch*** is a toolbox for agentic GPU kernel optimization.
It features rich **Profiling** capabilities on both NVIDIA and AMD GPUs.
Additionally,
G-Watch offers **Program Analysis** tools for inspecting compiler-generated GPU binaries,
supporting downstream tasks such as register analysis and binary instrumentation.


## Installation

[![Install](https://img.shields.io/badge/Install-red.svg?logo=pypi)]() You can install G-Watch directly from PyPI:

``` bash
# install the published package from PyPI
pip install gwatch
```
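
To confirm that the package landed in the active environment, one option is to query the installed distribution metadata. The sketch below relies only on the distribution name `gwatch` and the standard-library `importlib.metadata` module; the helper name `installed_version` is illustrative and not part of G-Watch.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str = "gwatch"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"gwatch {found}" if found else "gwatch is not installed in this environment")
```

This checks the environment of the interpreter that runs it, so run it with the same `python` that `pip` installed into.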

### [![build via conda](https://img.shields.io/badge/Development-pink.svg?logo=python)]() Build from source with `Conda`

If you prefer not to use Docker, you can set up the development environment directly on your host machine using Conda.

1. Clone this repository
    ```bash
    git clone --recursive https://github.com/mars-compute-ai/G-Watch-dev.git
    cd G-Watch-dev
    ```

2. Create and activate a conda environment
    ```bash
    conda create -n gw312 python=3.12
    conda activate gw312
    ```

3. Install system dependencies and build tools via Conda
    ```bash
    # Install C/C++ compilers, build tools, and required libraries from conda-forge
    conda install -c conda-forge gcc_linux-64 gxx_linux-64 cmake make meson pkg-config \
        eigen elfutils libwebsockets protobuf libcurl openssl libdwarf
    ```


4. Install Python dependencies
    ```bash
    # install required python packages
    pip3 install pytest perfetto PyYAML tqdm packaging loguru PrettyTable matplotlib pandas ninja==1.11.1.3

    # install torch for test workloads
    pip3 install torch torchvision torchaudio
    ```

5. Build and install the Python package from source
    ```bash
    # build and install gwatch
    python3 setup.py clean  
    python3 setup.py install
    ```

6. Build the wheel of gwatch
    ```bash
    python3 setup.py clean
    python3 setup.py bdist_wheel
    
    # Optional: use auditwheel to bundle external system libraries if needed
    # Note: auditwheel requires the 'auditwheel' package (pip install auditwheel)
    auditwheel repair \
        --exclude "libcuda.so.*"        \
        --exclude "libcudart.so*"       \
        --exclude "libcupti.so*"        \
        --exclude "libnvperf.so*"       \
        --exclude "libnvidia-ml.so*"    \
        --exclude "libcheckpoint.so"    \
        ./dist/gwatch*.whl
    ```
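
Before running the build in step 5, it can help to confirm that the build tools installed in steps 3 and 4 are reachable from the activated environment. The following is a minimal sketch, not part of G-Watch: the helper `missing_tools` and the `BUILD_TOOLS` list are illustrative, and the list simply mirrors tool names from the install commands above. The compilers are left out on the assumption that the conda-forge compiler packages may expose them under target-prefixed names rather than plain `gcc`/`g++`.

```python
import shutil

# Executable names assumed to come from the packages installed in steps 3 and 4.
BUILD_TOOLS = ["cmake", "make", "meson", "ninja", "pkg-config"]


def missing_tools(tools):
    """Return the subset of tool names that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(BUILD_TOOLS)
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("all listed build tools are on PATH")
```

Run it inside the activated `gw312` environment; a tool reported as missing usually means the environment is not active or the corresponding install step was skipped.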


### [![Development](https://img.shields.io/badge/Development-pink.svg?logo=docker)]() Build and Install from source with `Docker`

1. Clone this repository
    ```bash
    git clone --recursive https://github.com/mars-compute-ai/G-Watch-dev.git
    ```

2. Start dev container for building and testing

* For CUDA 12.8, libc 2.34

    ```bash
    cd G-Watch-dev/scripts/docker
    bash run_cuda_12_8_rockylinux9.sh -s 1  # start and enter a container with id 1

    # for reference only; no need to run these now
    bash run_cuda_12_8_rockylinux9.sh -e 1  # enter a container with id 1
    bash run_cuda_12_8_rockylinux9.sh -c 1  # close and remove a container with id 1
    ```

* For CUDA 12.8, libc 2.28, use the `run_cuda_12_8_rockylinux8.sh` script instead.

3. Install prerequisites

    ```bash
    # inside the container
    cd /root

    # install dnf plugins to enable config-manager
    dnf install -y dnf-plugins-core

    # [only for rockylinux9] enable crb repository to get development packages
    dnf config-manager --set-enabled crb

    # [only for rockylinux8] enable powertools repository to get development packages
    dnf config-manager --set-enabled powertools

    # update system
    dnf update -y

    # run this before installing the packages below
    dnf install epel-release ncurses -y

    # install system dependencies and allow erasing conflicting minimal packages
    dnf install -y --allowerasing git wget curl gdb pkgconf python3 \
                    python3-pip unzip cmake meson gcc gcc-c++ make

    # these packages are required to build gwatch
    dnf install -y --allowerasing eigen3-devel \
        python3-devel elfutils-libelf-devel libwebsockets-devel numactl-devel protobuf-compiler \
        libcurl-devel vim-common libdwarf-devel protobuf-devel openssl-devel

    # install cusparselt
    dnf install -y libcusparselt0-cuda-12 libcusparselt0-devel-cuda-12

    # symlink the libdwarf headers to the include path the build expects
    ln -s /usr/include/libdwarf-0 /usr/include/libdwarf

    # install and enable gcc13
    dnf install -y gcc-toolset-13 gcc-toolset-13-gcc-c++
    echo "source /opt/rh/gcc-toolset-13/enable" | tee /etc/profile.d/gcc-toolset-13.sh
    chmod +x /etc/profile.d/gcc-toolset-13.sh

    # install miniconda
    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    bash Miniconda3-latest-Linux-x86_64.sh
    source ~/.bashrc
    conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
    conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r

    # create conda environment for building
    conda create -n gw312 python=3.12
    conda activate gw312

    # install python packages
    pip3 install pytest perfetto PyYAML tqdm packaging loguru PrettyTable matplotlib pandas ninja==1.11.1.3

    # install torch for test workloads
    pip3 install torch torchvision torchaudio
    ```

4. Build and install the Python package from source
    ```bash
    # inside the container
    cd /root

    # build and install gwatch
    python3 setup.py clean  
    python3 setup.py install
    ```

5. Build the wheel of gwatch
    ```bash
    # inside the container
    cd /root
    python3 setup.py clean
    python3 setup.py bdist_wheel

    # use auditwheel for bundling external system libraries
    # exclude nvidia libraries from bundling
    auditwheel repair \
        --exclude "libcuda.so.*"        \
        --exclude "libcudart.so*"       \
        --exclude "libcupti.so*"        \
        --exclude "libnvperf.so*"       \
        --exclude "libnvidia-ml.so*"    \
        --exclude "libcheckpoint.so"    \
        ./dist/gwatch*.whl
    ```

    The resulting wheel file is located in the `/root/wheelhouse` directory.
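
After `auditwheel repair`, the platform tag in the wheel filename shows which manylinux policy the wheel was tagged with. The sketch below lists the `gwatch` wheels in an output directory together with that tag. The helper names are illustrative and not part of G-Watch or auditwheel, and the filename in the usage comment is a made-up example. The tag auditwheel picks depends on the glibc symbols the bundled binaries require, so the exact value is not guaranteed by the container choice alone.

```python
from pathlib import Path


def platform_tag(wheel_name: str) -> str:
    """Return the platform tag, the last dash-separated field of a wheel filename."""
    if not wheel_name.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {wheel_name}")
    return wheel_name[: -len(".whl")].split("-")[-1]


def repaired_wheels(wheel_dir="wheelhouse"):
    """Map each gwatch wheel found in wheel_dir to its platform tag."""
    wheels = sorted(Path(wheel_dir).glob("gwatch*.whl"))
    return {path.name: platform_tag(path.name) for path in wheels}


if __name__ == "__main__":
    # Hypothetical filename, for illustration only:
    #   gwatch-0.0.7-cp312-cp312-manylinux_2_34_x86_64.whl
    for name, tag in repaired_wheels("/root/wheelhouse").items():
        print(f"{name}: {tag}")
```

In the Docker flow above the directory is `/root/wheelhouse`; for the Conda flow, pass the `wheelhouse` directory created under wherever `auditwheel repair` was run.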
