Metadata-Version: 2.4
Name: clinical-cad
Version: 0.1.1
Summary: Robust Clinical Association Displacement (CAD) analysis with statistical normalization
Author-email: Aditya Parikh <adipa@dtu.dk>
License: MIT
Project-URL: Homepage, https://github.com/ADE-17/clinical-cad
Project-URL: Repository, https://github.com/ADE-17/clinical-cad
Keywords: clinical,fairness,nlp,association,displacement,radiology
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.20
Requires-Dist: pandas>=1.3
Requires-Dist: scipy>=1.7
Requires-Dist: matplotlib>=3.4
Requires-Dist: seaborn>=0.11
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

# clinical-cad

**Clinical Association Displacement (CAD)** and **Weighted Association Error (WAE)** for **lexical fairness** over **any binary demographic** (sex, race, age group, etc.). Compare a reference corpus (e.g. ground-truth reports) against model predictions to measure how word associations with the two groups shift, and summarize the overall shift in a single fairness score (WAE).

---

## Installation

```bash
pip install clinical-cad
```

---

## Quick example

**1. Run with your own CSVs** (reference = ground truth, prediction = model output):

```bash
cad --reference_csv reference.csv --prediction_csv predictions.csv
```

Results go to `./cad_output` by default. Use `--output_dir my_results` to change that.

**2. Try it without data** (demo):

```bash
cad --demo
```

**3. Use a different demographic** (e.g. race instead of sex):

Your CSV must have a column containing exactly two distinct values (e.g. `White` and `Black`). Then:

```bash
cad --reference_csv ref.csv --prediction_csv pred.csv \
    --label_column race --group_a White --group_b Black
```

---

## What it does (for new users)

- **CAD**  
  For each word, it measures how strongly it’s associated with **group A** vs **group B** in the reference corpus, and how that **changes** in the prediction corpus. Words that shift a lot are “displaced” and can indicate bias.

- **WAE (Weighted Association Error)**  
  One number summarizing how much associations shifted overall. **Lower WAE = fairer** (less lexical bias). It comes with 95% confidence intervals (bootstrap).

You get JSON summaries, CSV tables, and plots in the output directory.
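
The per-word idea can be sketched as a smoothed log-odds comparison. The sketch below is an illustrative approximation, not the package's exact implementation: the function, the toy counts, and the word "cardiomegaly" are hypothetical, and `alpha` plays the role of the `--alpha` smoothing parameter:

```python
import math

def log_odds(count_a, total_a, count_b, total_b, alpha=0.1):
    """Smoothed log-odds of a word's association with group A vs group B."""
    p_a = (count_a + alpha) / (total_a + 2 * alpha)
    p_b = (count_b + alpha) / (total_b + 2 * alpha)
    return math.log(p_a / (1 - p_a)) - math.log(p_b / (1 - p_b))

# Toy counts: how often "cardiomegaly" appears in group A vs group B texts.
ref_assoc = log_odds(count_a=30, total_a=100, count_b=10, total_b=100)
pred_assoc = log_odds(count_a=5, total_a=100, count_b=12, total_b=100)

# Displacement: how much the association shifted from reference to prediction.
# Here the word flips from A-associated to B-associated, a large displacement.
displacement = pred_assoc - ref_assoc
print(round(displacement, 3))
```

Words whose displacement is large (and statistically significant) are the ones CAD flags; WAE aggregates these shifts into one weighted score.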

---

## Input format

Both CSVs need:

1. A **text column** (default: `findings` for reference, `predicted_report` for predictions; override with `--reference_text_column` / `--prediction_text_column`).
2. A **demographic column** with exactly **two** values (e.g. M/F, White/Black, young/old). Default column name is `PatientSex` if you don’t pass `--label_column`.

Example:

| PatientSex | findings              |
|------------|------------------------|
| M          | No acute findings.     |
| F          | Normal heart size.     |
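
A minimal way to produce valid inputs (the file names are illustrative; column names follow the defaults listed above):

```python
import pandas as pd

# Reference (ground truth): default text column is "findings".
reference = pd.DataFrame({
    "PatientSex": ["M", "F"],
    "findings": ["No acute findings.", "Normal heart size."],
})
reference.to_csv("reference.csv", index=False)

# Predictions: default text column is "predicted_report".
predictions = pd.DataFrame({
    "PatientSex": ["M", "F"],
    "predicted_report": ["No acute disease.", "Heart size normal."],
})
predictions.to_csv("predictions.csv", index=False)

# Then run: cad --reference_csv reference.csv --prediction_csv predictions.csv
```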

---

## Parameters (only the important ones)

| Parameter | Default | Why it matters |
|-----------|---------|----------------|
| `--reference_csv` | *(required)* | Path to reference (ground-truth) CSV. |
| `--prediction_csv` | *(required)* | Path to model-output CSV. |
| `--output_dir` | `./cad_output` | Where to write results. |
| `--label_column` | `PatientSex` | Column that holds the two groups (e.g. sex, race). |
| `--group_a` | `F` | First group value (e.g. F, White). |
| `--group_b` | `M` | Second group value (e.g. M, Black). |
| `--alpha` | `0.1` | Smoothing for rare words; keeps log-odds stable. |
| `--min_freq` | `1` | Ignore words with fewer than this many occurrences. |
| `--significance_level` | `0.0455` | P-value for “strong” association; lower = stricter. |
| `--neutral_significance` | `0.317` | P-value for “neutral”; higher = more words treated as neutral. |
| `--displacement_significance` | `0.01` | P-value for when a word is considered “displaced”. |
| `--wae_weight` | `total` | How to weight words: `total`, `ref`, or `pred`. |
| `--bootstrap_samples` | `1000` | Number of bootstrap samples for WAE confidence intervals. |
| `--stopwords_path` | — | Optional file with extra stopwords (one per line). |
| `--no_stopword_removal` | `False` | Turn off stopword removal. |
| `--demo` | `False` | Run on built-in demo data (no CSVs needed). |
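
The confidence intervals behind `--bootstrap_samples` can be illustrated with a generic percentile bootstrap over a weighted summary. This is a sketch under assumed inputs (the simulated displacements, weights, and `weighted_error` helper are hypothetical), not the package's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-word displacement magnitudes and frequency weights.
displacements = rng.normal(loc=0.5, scale=0.2, size=200)
weights = rng.uniform(1, 10, size=200)

def weighted_error(d, w):
    """Weighted mean of absolute displacements (a WAE-style summary)."""
    return np.average(np.abs(d), weights=w)

point = weighted_error(displacements, weights)

# Percentile bootstrap: resample word indices with replacement.
stats = []
for _ in range(1000):
    idx = rng.integers(0, len(displacements), size=len(displacements))
    stats.append(weighted_error(displacements[idx], weights[idx]))

lo, hi = np.percentile(stats, [2.5, 97.5])
print(f"WAE={point:.3f}, 95% CI=[{lo:.3f}, {hi:.3f}]")
```

More bootstrap samples give tighter, more stable intervals at the cost of runtime; 1000 (the default) is a common choice.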

---

## Paper and citation

This tool supports the methodology described in our paper on lexical fairness in clinical text (e.g. radiology report generation).

- **Paper**: https://arxiv.org/abs/2603.01625  

If you use **clinical-cad** in your work, please cite:

```bibtex
@article{parikh2026measuring,
  title={Measuring What VLMs Don't Say: Validation Metrics Hide Clinical Terminology Erasure in Radiology Report Generation},
  author={Parikh, Aditya and Feragen, Aasa and Das, Sneha and Frank, Stella},
  journal={arXiv preprint arXiv:2603.01625},
  year={2026}
}
```


---

## Python API

```python
from clinical_cad import parse_args, run_pipeline

# parse_args reads the same flags as the `cad` CLI (from sys.argv),
# so this script accepts --reference_csv, --prediction_csv, etc.
args = parse_args()

run_pipeline(
    args.reference_csv,
    args.reference_text_column,
    args.prediction_csv,
    args.prediction_text_column,
    args.label_column or "PatientSex",  # fall back to the default column
    args.output_dir,
    args,
)
```

---

## License

MIT
