Metadata-Version: 2.3
Name: atlas-common-data
Version: 1.0.0
Summary: Shared classification data and economic indicator APIs for Growth Lab projects
License: Apache-2.0
Keywords: economics,trade,classification,imf,wdi,growth-lab
Author: Brendan Leonard
Author-email: brendan_leonard@hks.harvard.edu
Requires-Python: >=3.10,<4.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Requires-Dist: fredapi (>=0.5.2,<0.6.0)
Requires-Dist: openpyxl (>=3.1.5,<4.0.0)
Requires-Dist: pandas (>=2.2.3,<3.0.0)
Requires-Dist: requests (>=2.32.3,<3.0.0)
Requires-Dist: xlrd (>=2.0.2,<3.0.0)
Project-URL: Homepage, https://github.com/cid-harvard/atlas-common-data
Project-URL: Repository, https://github.com/cid-harvard/atlas-common-data
Description-Content-Type: text/markdown

# Atlas Common Data

Shared classification data and economic indicator APIs for [Growth Lab](https://growthlab.hks.harvard.edu/) projects.

This package provides:

- **Static classification data** — countries, products (HS92/HS12/HS22/SITC), product space networks, conversion weights, and reference datasets, all returned as pandas DataFrames
- **Data fetching classes** — clean APIs for IMF WEO, World Bank WDI, FRED, and UN Population data that return DataFrames matching the schemas of the legacy CSV files

## Installation

```bash
# Core package (static data loaders only — no external API dependencies)
pip install atlas-common-data

# With specific data source extras
pip install atlas-common-data[imf]    # IMF World Economic Outlook
pip install atlas-common-data[wdi]    # World Bank Development Indicators
pip install atlas-common-data[fred]   # Federal Reserve Economic Data
pip install atlas-common-data[all]    # All data sources
```

## Quick reference

```python
import atlas_common_data
atlas_common_data.describe()   # prints full API reference
```

---

## Static data loaders

No network required. All functions return pandas DataFrames.

### Countries and groups

```python
from atlas_common_data import (
    load_countries,
    load_groups,
    load_group_members,
    get_country_by_iso3,
    get_country_id_mapping,
    get_iso3_to_name_mapping,
)

countries = load_countries()           # iso3_code, name_en, country_id, …
groups    = load_groups()              # group_id, name_en, group_type
members   = load_group_members()       # country_id → group_id mappings

usa       = get_country_by_iso3("USA")          # → Series
id_map    = get_country_id_mapping()            # {'USA': 231, 'CHN': 44, …}
name_map  = get_iso3_to_name_mapping()          # {'USA': 'United States', …}
```

### Products

```python
from atlas_common_data import (
    load_products,
    list_available_product_classifications,
    load_product_space_edges,
    load_product_space_clusters,
    load_services,
)

print(list_available_product_classifications())   # ['hs12', 'hs22', 'hs92', 'sitc']

products  = load_products("hs92")                 # default; also 'hs12', 'hs22', 'sitc'
edges     = load_product_space_edges("hs92")      # proximity-weighted network edges
clusters  = load_product_space_clusters("hs92")   # cluster assignments
services  = load_services("bilateral")            # or 'unilateral'
```

### Classification conversion weights

```python
from atlas_common_data import (
    load_classifications,
    list_available_conversion_weights,
    load_conversion_weights,
)

load_classifications()               # {'H0': {'name': 'HS92', …}, 'H1': …}
list_available_conversion_weights()  # ['H0_to_H1', 'H0_to_S3', 'H1_to_H0', …]
load_conversion_weights("H0", "H1")  # source_code, target_code, weight, group_id
```
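
As a rough illustration of how a weights table with `source_code`, `target_code`, and `weight` columns might be applied, the sketch below redistributes values from one classification to another. The codes, weights, and values are made up, `convert_values` is a hypothetical helper rather than part of this package, and it assumes the weights for each source code sum to 1, which this README does not confirm.

```python
from collections import defaultdict

# Hypothetical rows mimicking the documented columns
# (source_code, target_code, weight); codes and weights are made up.
weights = [
    {"source_code": "0101", "target_code": "0101", "weight": 1.0},
    {"source_code": "0102", "target_code": "0102", "weight": 0.75},
    {"source_code": "0102", "target_code": "0103", "weight": 0.25},
]

# Made-up values keyed by source-classification code.
source_values = {"0101": 200.0, "0102": 80.0}


def convert_values(values, weight_rows):
    """Distribute each source value across its target codes by weight."""
    converted = defaultdict(float)
    for row in weight_rows:
        value = values.get(row["source_code"], 0.0)
        converted[row["target_code"]] += value * row["weight"]
    return dict(converted)


target_values = convert_values(source_values, weights)
print(target_values)  # {'0101': 200.0, '0102': 60.0, '0103': 20.0}
```

Under the sum-to-1 assumption the total is preserved across the conversion (280.0 on both sides here).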

### Reference data

```python
from atlas_common_data import (
    load_comparator_countries,
    load_geo_distances,
    load_sitc_natural_resource_products,
)

load_comparator_countries()             # comparator pairs per country
load_geo_distances()                    # pairwise geographic distances (~4 MB)
load_sitc_natural_resource_products()   # SITC products with natural resource identifier
```

### Growth projections

```python
from atlas_common_data import (
    load_growth_projections,
)

load_growth_projections()          # iso3_code, growth_proj, year
```

---

## Data source classes

These classes fetch live data from external APIs and return DataFrames. Each primary method (`get_imf_data`, `get_wdi_data`, etc.) returns a DataFrame whose schema matches the legacy CSV file it replaces.

### IMFData — IMF World Economic Outlook

Requires `pip install atlas-common-data[imf]`.

```python
from atlas_common_data import IMFData
import datetime

latest = datetime.date.today().year - 1
imf = IMFData(latest_year=latest)

# Primary method — matches imf_data.csv schema
# Applies FRED PPIIDC deflator to produce *_const columns
df = imf.get_imf_data(end_year=latest)
# Columns: year, iso3_code, country_id, current_account, population,
#          gdp, gdppc, gdp_const_growth, gdp_ppp, gdppc_ppp,
#          gdp_const, gdp_ppp_const, gdppc_const, gdppc_ppp_const

# Country income level & region — matches imf_country_data.csv schema
meta = imf.get_country_metadata()
# Columns: country_id, iso3_code, region, incomelevel_enum

# Average GDP/pc growth rates — matches imf_avg_growth.csv schema
growth = imf.get_average_growth_rates()
# Columns: year, iso3_code, country_id, lookback, gdppc_const_growth

# GDP-related indicators only
gdp = imf.get_gdp_data(start_year=2010, end_year=latest)

# Raw indicators without deflator
raw = imf.get_indicators(
    indicators=["NGDPD", "LP"],   # defaults to all 8 standard indicators
    country_codes=["USA", "CHN"], # optional filter
    start_year=2010,
    end_year=latest,
)

# Inflation index from IMF PCPIPCH series
idx = imf.get_inflation_index(base_year=latest, country_code="USA")

# List available indicator codes
IMFData.list_available_indicators()
# {'NGDP_R': 'gdp_const', 'NGDPD': 'gdp', 'LP': 'population', …}
```

**Available IMF indicator codes:** `NGDP_R`, `NGDP_RPCH`, `NGDPD`, `PPPGDP`, `NGDPDPC`, `PPPPC`, `LP`, `BCA`, `PCPIPCH`
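
`PCPIPCH` is a year-over-year percent change, so building an index from it means chain-linking the changes. The sketch below shows that general technique on made-up numbers. `chain_index` is a hypothetical helper, and normalising so that the base year equals 1.0 is an assumption; the README does not say how `get_inflation_index()` scales its output.

```python
def chain_index(pct_changes, base_year):
    """Chain {year: percent change vs. previous year} into an index.

    The first year has no predecessor in the data, so its own percent
    change is ignored and it only serves as the starting level.
    """
    years = sorted(pct_changes)
    level = 1.0
    levels = {}
    for position, year in enumerate(years):
        if position > 0:
            level *= 1.0 + pct_changes[year] / 100.0
        levels[year] = level
    base = levels[base_year]
    # Rescale so the index equals 1.0 in the base year (assumed convention).
    return {year: value / base for year, value in levels.items()}


# Made-up inflation rates in percent.
inflation = {2020: 1.2, 2021: 4.7, 2022: 8.0}
index = chain_index(inflation, base_year=2022)
print(index)
```

With the base year set to the last year, earlier years come out below 1.0 whenever inflation was positive.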

### WDIData — World Bank Development Indicators

Requires `pip install atlas-common-data[wdi]`.

```python
from atlas_common_data import WDIData
import datetime

latest = datetime.date.today().year - 1
wdi = WDIData(latest_year=latest)

# Primary method — matches wdi_data.csv schema
df = wdi.get_wdi_data(end_year=latest)
# Columns: iso3_code, year, country_id, current_account, exports_goods_bop,
#          gdp, gdp_const, gdp_ppp, gdp_ppp_const, gdppc, gdppc_const,
#          gdppc_ppp, gdppc_ppp_const, imports_goods_bop, population

# Country income level & region — matches wdi_country_data.csv schema
meta = wdi.get_country_metadata()
# Columns: iso3_code, region, incomelevel_enum, country_id

# Services trade — matches wdi_service_data.csv schema
svc = wdi.get_services_data()
# Columns: iso3_code, year, services_export_value, services_import_value,
#          travel_*_share, finance_*_share, transport_*_share, comms_*_share

# Convenience subsets
gdp = wdi.get_gdp_data(start_year=2015, end_year=latest)
bop = wdi.get_bop_data(start_year=2015, end_year=latest)

# Custom indicators (dict: WDI code → column name)
raw = wdi.get_indicators({"NY.GDP.MKTP.CD": "gdp", "SP.POP.TOTL": "population"})

# Discover indicators
WDIData.list_available_indicators()           # {'economic': {…}, 'services': {…}}
WDIData.get_indicator_description("NY.GDP.MKTP.CD")  # 'gdp'
```

> **Note:** The World Bank Excel API rejects requests with more than 8 indicators. `get_indicators()` handles this automatically by batching requests and merging results.
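
The batching idea can be sketched in a few lines. This is not the package's implementation: `batch_indicators` and the indicator codes are hypothetical, and only the limit of 8 comes from the note above.

```python
from itertools import islice

MAX_INDICATORS_PER_REQUEST = 8  # limit stated in the note above


def batch_indicators(indicators, size=MAX_INDICATORS_PER_REQUEST):
    """Yield successive dicts holding at most `size` indicator mappings."""
    items = iter(indicators.items())
    while True:
        chunk = dict(islice(items, size))
        if not chunk:
            return
        yield chunk


# Hypothetical codes used only to exercise the batching logic.
indicators = {f"CODE.{i}": f"column_{i}" for i in range(19)}
batches = list(batch_indicators(indicators))
print([len(batch) for batch in batches])  # [8, 8, 3]
```

Each batch would then be requested separately and the results joined on their shared key columns (presumably `iso3_code` and `year`); that merge step is not shown here.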

### FREDData — Federal Reserve Economic Data

Requires `pip install atlas-common-data[fred]` and a [FRED API key](https://fred.stlouisfed.org/docs/api/api_key.html).

Set your key as an environment variable (recommended):

```bash
export FRED_API_KEY="your_key_here"
```
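
A common pattern for this kind of fallback is to prefer an explicit argument and otherwise read the environment. The helper below is a hypothetical sketch of that pattern, not the code `FREDData` actually runs, and the key values are placeholders.

```python
import os


def resolve_fred_api_key(api_key=None):
    """Return the explicit key if given, else FRED_API_KEY from the environment."""
    key = api_key or os.environ.get("FRED_API_KEY")
    if not key:
        raise ValueError("No FRED API key: pass api_key or set FRED_API_KEY")
    return key


os.environ["FRED_API_KEY"] = "env_key_for_demo"  # placeholder for the demo only
print(resolve_fred_api_key())                    # falls back to the environment
print(resolve_fred_api_key("explicit_key"))      # explicit argument wins
```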

```python
from atlas_common_data import FREDData
import datetime

latest = datetime.date.today().year - 1

# api_key is read from FRED_API_KEY env var if not passed explicitly
fred = FREDData()

# Primary method — matches fred_ppiidc.csv schema
# base_year controls the deflator normalisation (deflator = 1.0 at that year)
# It is distinct from latest_year on IMFData/WDIData — pass it explicitly
df = fred.get_fred_data(base_year=latest)
# Columns: year, ppiidc_index, atlas_base_year, deflator

df = fred.get_inflation_index(base_year=latest, month=12)  # month=12 uses December values

FREDData.list_available_series()
# {'PPIIDC': 'Producer Price Index by Commodity: Industrial Commodities'}
```
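
To make the roles of `month` and `base_year` concrete, the sketch below picks one month per year from a made-up monthly series and rescales it so the base year equals 1.0. Both helpers are hypothetical, and dividing by the base-year value (rather than the inverse ratio) is an assumption; the README only states that the deflator is 1.0 at `base_year`.

```python
# Made-up monthly index values keyed by (year, month).
monthly_index = {
    (2022, 6): 240.0,
    (2022, 12): 250.0,
    (2023, 6): 245.0,
    (2023, 12): 200.0,
}


def annual_from_month(monthly, month):
    """Keep one observation per year: the value for the given month."""
    return {year: value for (year, m), value in monthly.items() if m == month}


def normalise_to_base_year(index_by_year, base_year):
    """Rescale so the series equals 1.0 in base_year (assumed direction)."""
    base_value = index_by_year[base_year]
    return {year: value / base_value for year, value in index_by_year.items()}


december = annual_from_month(monthly_index, month=12)
deflator = normalise_to_base_year(december, base_year=2023)
print(deflator)  # {2022: 1.25, 2023: 1.0}
```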

### UNPopulationData — UN World Population Prospects

Requires `pip install atlas-common-data[all]`.

```python
from atlas_common_data import UNPopulationData
import datetime

latest = datetime.date.today().year - 1

# variants and latest_year can be set on the constructor as defaults
un = UNPopulationData(latest_year=latest, variants=["Medium"])

# Primary method — matches un_population_forecast_medium.csv schema
# variants and end_year fall back to constructor defaults when not passed
df = un.get_population_forecast(start_year=2020)
# Columns: iso3_code, year, population  (variant column is not included)

# Override per-call
df = un.get_population_forecast(variants=["High", "Low"], start_year=2020, end_year=2050)
# Valid variants include 'Medium', 'High', 'Low', 'Constant fertility', etc.
```

---

## Missing-data warnings

`get_imf_data()` and `get_wdi_data()` emit a `UserWarning` when countries from the 252-country Atlas list appear in historical data but have no row for `end_year`. This is expected when some countries lag in reporting — data is still returned, just incomplete for the latest year.

```
UserWarning: 3 countries have no IMF data for 2024 and may not have reported yet.
  In rankings (2): ['ERI', 'SYR']
```

Each group line in the warning (for example `In rankings`) appears only when that group is non-empty.
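
The check described in this section can be sketched as follows. `warn_missing_latest` is a hypothetical helper working on plain `(iso3_code, year)` tuples; it leaves out the Atlas country list filter and the per-group breakdown, and the message wording only approximates the one shown above.

```python
import warnings


def warn_missing_latest(rows, end_year, source="IMF"):
    """Warn about countries present in the data but absent for end_year.

    rows is an iterable of (iso3_code, year) pairs.
    """
    rows = list(rows)
    seen = {iso3 for iso3, _ in rows}
    reported = {iso3 for iso3, year in rows if year == end_year}
    missing = sorted(seen - reported)
    if missing:
        warnings.warn(
            f"{len(missing)} countries have no {source} data for {end_year} "
            f"and may not have reported yet: {missing}",
            UserWarning,
            stacklevel=2,
        )
    return missing


# Made-up observations: ERI and SYR have history but no 2024 row.
rows = [("USA", 2023), ("USA", 2024), ("ERI", 2023), ("SYR", 2022)]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    missing = warn_missing_latest(rows, end_year=2024)

print(missing)                      # ['ERI', 'SYR']
print(caught[0].category.__name__)  # UserWarning
```

Because the message is a standard `UserWarning`, callers can record it as above or silence it with the usual `warnings` filters.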

---

## Migration from legacy scripts

The original per-module scripts still work but now emit deprecation warnings. No changes are required in existing code.

```python
# Old usage (still works, emits DeprecationWarning)
from atlas_common_data.imf_indicators.calculate_imf_data import calculate_imf_data
df = calculate_imf_data(latest_atlas_year=2023)

# New usage
from atlas_common_data import IMFData
df = IMFData(latest_year=2023).get_imf_data(end_year=2023)
```
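
When migrating gradually, it can help to record `DeprecationWarning`s so that remaining legacy calls are easy to find. The sketch below demonstrates the mechanism with Python's standard `warnings` module. `legacy_api` and `new_api` are hypothetical stand-ins, not functions from this package.

```python
import warnings


def new_api(year):
    """Stand-in for a replacement function."""
    return {"year": year}


def legacy_api(latest_atlas_year):
    """Stand-in for a legacy wrapper that warns, then delegates."""
    warnings.warn(
        "legacy_api is deprecated; use new_api instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_api(latest_atlas_year)


# Record deprecation warnings instead of letting default filters hide them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", DeprecationWarning)
    result = legacy_api(2023)

legacy_calls = [w for w in caught if issubclass(w.category, DeprecationWarning)]
print(result, len(legacy_calls))  # {'year': 2023} 1
```
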

---

## Development

```bash
# Install with dev dependencies
poetry install

# Run fast tests (no network)
poetry run pytest tests/ -v -m "not network"

# Run all tests (requires FRED_API_KEY and network access)
poetry run pytest tests/ -v

# Override the latest data year used in tests (default: current year - 1)
poetry run pytest tests/ -v --latest-year=2023

# Build the package
poetry build
```

## License

Apache 2.0 — see [LICENSE](LICENSE) for details.

