Metadata-Version: 2.4
Name: safe-earth
Version: 0.1.12
Summary: Stratified Assessments of Forecasts over Earth
Project-URL: Repository, https://github.com/n-masi/safe
Project-URL: Homepage, https://n-masi.github.io/safe/
Author-email: Nick Masi <nicholas_masi@alumni.brown.edu>, Daniel Cai <daniel_cai@brown.edu>, Randall Balestriero <randall_balestriero@brown.edu>
License-Expression: MIT
License-File: LICENSE
Keywords: climate,earth,fairness,machine learning,stratification,weather
Requires-Dist: cfgrib
Requires-Dist: fsspec
Requires-Dist: gcsfs
Requires-Dist: geopandas
Requires-Dist: kaleido
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: plotly
Requires-Dist: pygeoboundaries-geolab
Requires-Dist: pytest
Requires-Dist: requests-cache
Requires-Dist: scikit-learn
Requires-Dist: shapely
Requires-Dist: xarray
Requires-Dist: zarr
Description-Content-Type: text/markdown

# SAFE: Stratified Assessments of Forecasts over Earth

[![arXiv](https://img.shields.io/badge/arXiv-2510.26099-b31b1b.svg)](https://arxiv.org/abs/2510.26099)
[![PyPI - Version](https://img.shields.io/pypi/v/safe-earth)](https://pypi.org/project/safe-earth/)
[![GitHub](https://img.shields.io/github/license/n-masi/safe)](https://github.com/n-masi/safe/blob/master/LICENSE)
![Static Badge](https://img.shields.io/badge/pytest-passing-brightgreen)
[![Website](https://img.shields.io/website?url=https://n-masi.github.io/safe&up_color=blue)](https://n-masi.github.io/safe)

<!-- 
[Preprint](https://arxiv.org/abs/2510.26099) and [Website](https://n-masi.github.io/safe)
-->

## Installation

`pip install safe-earth`

To build from source instead:

```
# get repo
git clone git@github.com:N-Masi/safe.git

# create dev environment
conda create -n safe.env
conda activate safe.env
pip install -r requirements.txt
conda install --channel conda-forge pygmt plotly typing_extensions
```
<!-- 
If you are an authorized contributor and want to upload a new version to pypi: 

```
python3 -m build
python3 -m twine upload dist/*
```
-->

<!-- When running directly from the source repository, run files with `python -m safe_earth.<directory>.<file_without_extension>` while in the `src/` subdirectory. -->

## Basic Usage

There are 3 basic steps to any SAFE pipeline:

1. Measure **loss**: any function that compares each prediction $\hat{y}$ with the ground truth $y$. A loss is calculated for every prediction by a given model at every combination of gridpoint, timestamp, lead time, variable, and vertical level.

    Example: the latitude-weighted squared difference of $\hat{y}$ and $y$.

2. Measure **stratified error**: any function that reduces across gridpoints to calculate a metric for each stratum.

    Example: RMSE.

3. Measure **fairness**: any function that operates on a set of stratified errors. It calculates a fairness metric for each combination of model and attribute (e.g., the fairness of GraphCast's predictions across territories).

    Example: greatest absolute difference in RMSEs.

The errors and the fairness metrics are the most useful outputs. Errors show how well a particular model performs in a specific stratum, which can help decision makers determine which model is most accurate for their country or region. Fairness metrics provide a summary statistic for the overall amount of bias in a model.

For now, each loss function should create a dataframe with a column holding its output; the name of that column is then passed into the error function. Calls to `src/safe_earth/metrics/fairness.measure_fairness` take in the fairness functions as objects and run them all internally. The first major version of the package will bring this paradigm to the errors as well by taking in loss functions as parameters.
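The three steps can be sketched with plain pandas and NumPy. This is an illustration of the concepts only, not SAFE's API: the function names, the column names (`lat`, `territory`, `pred`, `truth`, `weighted_l2`), and the toy data are all hypothetical, and the latitude weights are normalized over the rows of the toy frame for simplicity.

```python
import numpy as np
import pandas as pd


def latitude_weighted_l2(df: pd.DataFrame) -> pd.DataFrame:
    """Step 1 (loss): add a column of latitude-weighted squared differences."""
    weights = np.cos(np.deg2rad(df["lat"]))
    weights = weights / weights.mean()  # normalize so the weights average to 1
    out = df.copy()
    out["weighted_l2"] = weights * (out["pred"] - out["truth"]) ** 2
    return out


def stratified_rmse(df: pd.DataFrame, loss_column: str, attribute: str) -> pd.Series:
    """Step 2 (stratified error): RMSE within each stratum of `attribute`."""
    return np.sqrt(df.groupby(attribute)[loss_column].mean())


def max_abs_difference(errors: pd.Series) -> float:
    """Step 3 (fairness): greatest absolute difference between stratum errors."""
    return float(errors.max() - errors.min())


# Toy data with hypothetical column names (not SAFE's actual schema).
toy = pd.DataFrame({
    "lat": [0.0, 0.0, 60.0, 60.0],
    "territory": ["A", "B", "A", "B"],
    "pred": [1.0, 2.0, 3.0, 5.0],
    "truth": [0.0, 2.0, 1.0, 1.0],
})

losses = latitude_weighted_l2(toy)
# The name of the loss column is handed to the error function.
errors = stratified_rmse(losses, loss_column="weighted_l2", attribute="territory")
fairness = max_abs_difference(errors)
print(errors)
print(fairness)
```

Here `errors` holds one RMSE per territory and `fairness` condenses them into a single number, mirroring the loss, stratified error, and fairness stages described above.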

<!-- TODO: pass error function handles to losses rather than error names -->
<!-- To facilitate ease of use, you generate errors by passing in the loss functions of choice as arguments to a call from `src/safe_earth/metrics/errors`, and also submit functions as arguments in calls to `src/safe_earth/metrics/fairness.measure_fairness`. This reduces the lines of code in a SAFE pipeline, and also allows you to extend SAFE with your own functions. You can still access the losses themselves through direct calls to `src/safe_earth/metrics/losses`. -->

## Demos

An example of using SAFE to collect metrics on 6 AIWP models across the territory, subregion, income, and landcover attributes is available in `demos/iclr_workflow.py`. It generates error and fairness data by assessing the models on 2020 ERA5 data.

To see the type of analysis that can be performed with this data, you can reproduce the figures and tables from the paper by running `demos/iclr_figs.py` and `demos/iclr_tables.py`, respectively.

An interactive notebook utilizing SAFE to investigate territorial disparities is available in `demos/interactive_demo.ipynb`.

<!-- Instructions to run on OSCAR:

# TO RUN:
# 0. activate environment (conda activate faireenvconda)
# 1. move the file to the src/ directory
# 2. cd src/
# 3. python -m toy_workflow

-->

## Data Notes

To unify the coordinate system across all integrated data sources, latitude ranges over [-90, 90] with index 0 at -90, and longitude ranges over [-180, 180) with index 0 at 0 and a wraparound from 180 to -180 in the middle. Metadata sourced from pygeoboundaries_geolab follows this coordinate system, and it is easiest to bring tabular data into conformance with it.
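As a minimal sketch of this convention, assume a source grid whose longitudes run over [0, 360). Wrapping each value onto [-180, 180) without reordering keeps index 0 at longitude 0 and places the jump from 180 to -180 in the middle. The helper name `wrap_longitude` is hypothetical and not part of SAFE.

```python
def wrap_longitude(lon_deg: float) -> float:
    """Map a longitude in degrees onto the half-open interval [-180, 180)."""
    return (lon_deg + 180.0) % 360.0 - 180.0


# Hypothetical source longitudes on a [0, 360) grid, in their original order.
source_lons = [0.0, 90.0, 179.0, 180.0, 270.0, 359.0]

# Wrapping in place preserves the ordering: index 0 stays at longitude 0.
wrapped = [wrap_longitude(lon) for lon in source_lons]
print(wrapped)  # [0.0, 90.0, 179.0, -180.0, -90.0, -1.0]
```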

## Testing

Run `pytest` from the repository's root directory in a Python environment that has pytest installed.

## Citation

If you use SAFE in your work, please cite us!

```
@article{masi2025safe,
  title={SAFE: A Novel Approach to AI Weather Evaluation through Stratified Assessments of Forecasts over Earth},
  author={Masi, Nick and Balestriero, Randall},
  journal={arXiv preprint arXiv:2510.26099},
  year={2025}
}
```
