Hierarchical Time Series

Many real-world datasets are naturally organised as trees: a country’s electricity consumption breaks into regions, which break into cities.
HierarchicalTimeSeries lets you model that structure directly and aggregate bottom-up through the tree.

Class                   Purpose
HierarchyNode           A single node: key, level, children, and an optional TimeSeriesList
HierarchicalTimeSeries  The tree container: traversal, aggregation, conversion
AggregationMethod       SUM, MEAN, MIN, MAX

[1]:
from datetime import datetime, timedelta, timezone

import numpy as np

import timedatamodel as tdm

base = datetime(2024, 1, 15, tzinfo=timezone.utc)
timestamps = [base + timedelta(hours=i) for i in range(24)]
rng = np.random.default_rng(42)

Create leaf time series

Each leaf in the hierarchy holds a TimeSeriesList. Here we model electricity consumption for five Norwegian cities.

[2]:
def make_consumption(name: str, base_mw: float) -> tdm.TimeSeriesList:
    pattern = base_mw * (1 + 0.3 * np.sin(np.linspace(0, 2 * np.pi, 24)))
    noise = rng.normal(0, base_mw * 0.05, 24)
    return tdm.TimeSeriesList(
        tdm.Frequency.PT1H,
        timezone="Europe/Oslo",
        timestamps=timestamps,
        values=(pattern + noise).tolist(),
        name=name,
        unit="MW",
    )

ts_oslo = make_consumption("Oslo", 500)
ts_bergen = make_consumption("Bergen", 200)
ts_stavanger = make_consumption("Stavanger", 150)
ts_tromsoe = make_consumption("Tromsø", 80)
ts_bodoe = make_consumption("Bodø", 50)

Building a hierarchy with HierarchyNode

Construct the tree by nesting HierarchyNode objects. Leaves hold a TimeSeriesList; interior nodes have children.

[3]:
root = tdm.HierarchyNode(
    key="Norway",
    level="country",
    children=[
        tdm.HierarchyNode(
            key="South",
            level="region",
            children=[
                tdm.HierarchyNode(key="Oslo", level="city", timeseries=ts_oslo),
                tdm.HierarchyNode(key="Bergen", level="city", timeseries=ts_bergen),
                tdm.HierarchyNode(key="Stavanger", level="city", timeseries=ts_stavanger),
            ],
        ),
        tdm.HierarchyNode(
            key="North",
            level="region",
            children=[
                tdm.HierarchyNode(key="Tromsø", level="city", timeseries=ts_tromsoe),
                tdm.HierarchyNode(key="Bodø", level="city", timeseries=ts_bodoe),
            ],
        ),
    ],
)

hierarchy = tdm.HierarchicalTimeSeries(
    root,
    name="Norway Consumption",
    description="Hourly electricity consumption by city",
    levels=["country", "region", "city"],
    aggregation=tdm.AggregationMethod.SUM,
)
hierarchy
[3]:
HierarchicalTimeSeries
Name:         Norway Consumption
Levels:       country, region, city
Nodes:        8 (5 leaves)
Frequency:    PT1H
Timezone:     Europe/Oslo (+00:00)
Unit:         MW
Aggregation:  sum

name       level  length  begin             end
Oslo       city   24      2024-01-15 00:00  2024-01-15 23:00
Bergen     city   24      2024-01-15 00:00  2024-01-15 23:00
Stavanger  city   24      2024-01-15 00:00  2024-01-15 23:00
Tromsø     city   24      2024-01-15 00:00  2024-01-15 23:00
Bodø       city   24      2024-01-15 00:00  2024-01-15 23:00

Inspecting the tree

Basic properties tell you the shape of the hierarchy.

[4]:
print(f"Name:     {hierarchy.name}")
print(f"Levels:   {hierarchy.levels}")
print(f"# levels: {hierarchy.n_levels}")
print(f"# nodes:  {hierarchy.n_nodes}")
print(f"# leaves: {hierarchy.n_leaves}")
Name:     Norway Consumption
Levels:   ['country', 'region', 'city']
# levels: 3
# nodes:  8
# leaves: 5

Leaves and walking

leaves() returns all leaf nodes. walk() yields nodes in pre-order (default) or post-order.

[8]:
print("All leaves:")
for leaf in hierarchy.leaves():
    print(f"  {leaf.key:12s}  path={leaf.path}")
All leaves:
  Oslo          path=['Norway', 'South', 'Oslo']
  Bergen        path=['Norway', 'South', 'Bergen']
  Stavanger     path=['Norway', 'South', 'Stavanger']
  Tromsø        path=['Norway', 'North', 'Tromsø']
  Bodø          path=['Norway', 'North', 'Bodø']
[9]:
print("Pre-order walk:")
for node in hierarchy.walk(order="pre"):
    indent = "  " * node.depth
    label = f"{node.key} [{node.level}]"
    if node.is_leaf:
        label += f" — {len(node.timeseries)} pts"
    print(f"{indent}{label}")
Pre-order walk:
Norway [country]
  South [region]
    Oslo [city] — 24 pts
    Bergen [city] — 24 pts
    Stavanger [city] — 24 pts
  North [region]
    Tromsø [city] — 24 pts
    Bodø [city] — 24 pts
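
The difference between the two traversal orders can be sketched on a plain-Python tree. This is a minimal illustration, not the library's implementation: the `Node` class and the `walk` function below are hypothetical stand-ins written for this example. Pre-order visits a parent before its children; post-order visits the children first, which is the order a bottom-up computation needs.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Node:
    """Hypothetical stand-in for a hierarchy node (not the library class)."""
    key: str
    children: list["Node"] = field(default_factory=list)


def walk(node: Node, order: str = "pre") -> Iterator[Node]:
    """Yield nodes depth-first: parents first ("pre") or children first ("post")."""
    if order not in ("pre", "post"):
        raise ValueError(f"unknown order: {order!r}")
    if order == "pre":
        yield node
    for child in node.children:
        yield from walk(child, order)
    if order == "post":
        yield node


tree = Node("Norway", [
    Node("South", [Node("Oslo"), Node("Bergen"), Node("Stavanger")]),
    Node("North", [Node("Tromsø"), Node("Bodø")]),
])

pre_keys = [n.key for n in walk(tree, "pre")]
post_keys = [n.key for n in walk(tree, "post")]
print("pre: ", pre_keys)
print("post:", post_keys)
```

In post-order, every region appears after its cities and the root comes last.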

Bottom-up aggregation

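aggregate() recursively combines the leaf series beneath a node using the chosen method (default: SUM).
Called without arguments, it aggregates from the root and returns the total for the whole hierarchy; pass a node to aggregate only that subtree.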
[10]:
total = hierarchy.aggregate()
print(f"Name:   {total.name}")
print(f"Length: {len(total)} data points")
print(f"Mean:   {np.nanmean(total.arr):.1f} MW")
Name:   Norway
Length: 24 data points
Mean:   979.5 MW
[11]:
south = hierarchy.get_node("South")
south_total = hierarchy.aggregate(south)
print(f"South region total — mean: {np.nanmean(south_total.arr):.1f} MW")

north = hierarchy.get_node("North")
north_total = hierarchy.aggregate(north)
print(f"North region total — mean: {np.nanmean(north_total.arr):.1f} MW")
South region total — mean: 850.6 MW
North region total — mean: 128.8 MW
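
To make the recursion concrete, here is a minimal sketch of bottom-up SUM aggregation on plain Python lists. It assumes all leaves share the same time steps; the `Node` class, the `aggregate_sum` helper, and the toy values are hypothetical and only illustrate the idea, not how the library implements aggregate().

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical node: leaves carry values, interior nodes carry children."""
    key: str
    children: list["Node"] = field(default_factory=list)
    values: Optional[list[float]] = None


def aggregate_sum(node: Node) -> list[float]:
    """Return the element-wise sum of all leaf series beneath `node`."""
    if not node.children:
        if node.values is None:
            raise ValueError(f"leaf {node.key!r} has no data")
        return list(node.values)
    # Aggregate each child first, then add the results time step by time step.
    child_totals = [aggregate_sum(child) for child in node.children]
    return [sum(step) for step in zip(*child_totals)]


south = Node("South", [
    Node("Oslo", values=[5.0, 6.0, 7.0]),
    Node("Bergen", values=[2.0, 2.0, 3.0]),
])
north = Node("North", [Node("Tromsø", values=[1.0, 1.0, 1.0])])
country = Node("Norway", [south, north])

print(aggregate_sum(south))    # [7.0, 8.0, 10.0]
print(aggregate_sum(country))  # [8.0, 9.0, 11.0]
```

Each interior node only needs the aggregated series of its direct children, so the same function works at any depth.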

Level-wise aggregation

aggregate_level(level) aggregates every node at the named level, returning a dict.

[12]:
region_agg = hierarchy.aggregate_level("region")

for name, ts in region_agg.items():
    print(f"{name:8s}  mean={np.nanmean(ts.arr):7.1f} MW  max={np.nanmax(ts.arr):7.1f} MW")
South     mean=  850.6 MW  max= 1125.5 MW
North     mean=  128.8 MW  max=  172.1 MW
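
For SUM, level-wise aggregation can also be pictured as grouping leaves by their ancestor at the requested level. The sketch below assumes each leaf is identified by its path from the root and that keys are unique within a level; `leaf_series`, `sum_by_level`, and the toy numbers are hypothetical names and data for illustration only.

```python
# Hypothetical toy data: each leaf is keyed by its path from the root.
leaf_series = {
    ("Norway", "South", "Oslo"): [5.0, 6.0],
    ("Norway", "South", "Bergen"): [2.0, 3.0],
    ("Norway", "North", "Tromsø"): [1.0, 1.5],
    ("Norway", "North", "Bodø"): [0.5, 0.5],
}
levels = ["country", "region", "city"]


def sum_by_level(
    series: dict[tuple[str, ...], list[float]],
    level_names: list[str],
    level: str,
) -> dict[str, list[float]]:
    """Sum leaf series grouped by the path component at `level`."""
    depth = level_names.index(level)
    groups: dict[str, list[float]] = {}
    for path, values in series.items():
        key = path[depth]  # ancestor of this leaf at the requested level
        if key not in groups:
            groups[key] = list(values)
        else:
            groups[key] = [a + b for a, b in zip(groups[key], values)]
    return groups


region_totals = sum_by_level(leaf_series, levels, "region")
for name, values in region_totals.items():
    print(f"{name:8s} {values}")
```

The result is a dict keyed by node name, the same shape as the region_agg dict printed above.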

Choosing an aggregation method

Override the default method by passing a different AggregationMethod.

[13]:
for method in tdm.AggregationMethod:
    agg = hierarchy.aggregate(method=method)
    vals = agg.arr
    print(f"{method.value:5s}  mean={np.nanmean(vals):7.1f}  min={np.nanmin(vals):7.1f}  max={np.nanmax(vals):7.1f}")
sum    mean=  979.5  min=  672.3  max= 1296.0
mean   mean=  174.0  min=  119.5  max=  230.2
min    mean=   49.5  min=   31.4  max=   67.9
max    mean=  499.5  min=  326.4  max=  652.8
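
Note that the mean row (174.0) is not the total divided by the five leaves. The printed figures are consistent with MEAN being applied recursively, so the root averages the two region means rather than all leaves at once. This reading is inferred from the outputs above, not from the library's documentation, and the arithmetic below only uses numbers already printed in this notebook.

```python
# Time-averaged figures copied from the outputs above (MW).
south_sum, north_sum, total_sum = 850.6, 128.8, 979.5
n_south, n_north = 3, 2  # number of city leaves per region

# Averaging all five leaves at once.
flat_mean = total_sum / (n_south + n_north)

# Averaging within each region first, then across the two regions.
south_mean = south_sum / n_south
north_mean = north_sum / n_north
nested_mean = (south_mean + north_mean) / 2

print(f"flat mean over 5 leaves: {flat_mean:.1f} MW")
print(f"mean of region means:    {nested_mean:.1f} MW")
```

The two differ whenever regions have different numbers of leaves, so it is worth checking which one a given analysis needs.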

Subtree extraction

subtree(*path) creates a new HierarchicalTimeSeries rooted at the specified node.

[14]:
south_tree = hierarchy.subtree("South")
print(south_tree)
print(f"\nLevels: {south_tree.levels}")
print(f"Leaves: {south_tree.n_leaves}")
HierarchicalTimeSeries
┌────────────────────────────────────────────────────────────────┐
│  Name:             Norway Consumption                          │
│  Levels:           region, city                                │
│  Nodes:            4 (3 leaves)                                │
│  Frequency:        PT1H                                        │
│  Timezone:         Europe/Oslo (+00:00)                        │
│  Unit:             MW                                          │
│  Aggregation:      sum                                         │
├────────────────────────────────────────────────────────────────┤
│  name       level  length  begin             end               │
├────────────────────────────────────────────────────────────────┤
│  Oslo       city   24      2024-01-15 00:00  2024-01-15 23:00  │
│  Bergen     city   24      2024-01-15 00:00  2024-01-15 23:00  │
│  Stavanger  city   24      2024-01-15 00:00  2024-01-15 23:00  │
└────────────────────────────────────────────────────────────────┘

Levels: ['region', 'city']
Leaves: 3

Converting to other containers

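Flatten the hierarchy into a TimeSeriesCollection or TimeSeriesTable. By default the leaves are used; to_collection(level=...) returns one aggregated series per node at that level.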

[15]:
collection = hierarchy.to_collection()
print(f"Leaf-level collection: {list(collection.keys())}")
Leaf-level collection: ['Oslo', 'Bergen', 'Stavanger', 'Tromsø', 'Bodø']
[16]:
collection_regions = hierarchy.to_collection(level="region")
print("Region-level collection (aggregated):")
for key, ts in collection_regions.items():
    print(f"  {key}: {len(ts)} pts, mean={np.nanmean(ts.arr):.1f} MW")
Region-level collection (aggregated):
  South: 24 pts, mean=850.6 MW
  North: 24 pts, mean=128.8 MW
[17]:
table = hierarchy.to_table()
print(f"Table shape: {len(table)} rows × {table.n_columns} columns")
print(f"Columns: {table.names}")
table
Table shape: 24 rows × 5 columns
Columns: ['Oslo', 'Bergen', 'Stavanger', 'Tromsø', 'Bodø']
[17]:
TimeSeriesTable
Name:       unnamed
Columns:    Oslo, Bergen, Stavanger, Tromsø, Bodø
Length:     24 × 5
Frequency:  PT1H
Timezone:   Europe/Oslo (+00:00)
Unit:       MW, MW, MW, MW, MW

timestamp         Oslo     Bergen   Stavanger  Tromsø   Bodø
2024-01-15 00:00  507.618  195.717  155.092    76.3222  46.6933
2024-01-15 01:00  514.47   212.666  162.648    88.4638  51.5538
2024-01-15 02:00  596.699  236.498  175.55     93.0397  58.7932
...
2024-01-15 21:00  405.039  171.012  125.184    67.2111  42.2609
2024-01-15 22:00  490.094  192.526  128.291    66.7755  49.9575
2024-01-15 23:00  496.137  202.236  141.5      74.2116  49.4016

Building from a DataFrame

from_dataframe builds the tree from a long-format DataFrame with hierarchy columns.
Each unique combination of level columns becomes a leaf.
[18]:
import pandas as pd

rows = []
for ts_dt in timestamps:
    for region, cities in [("South", ["Oslo", "Bergen"]), ("North", ["Tromsø", "Bodø"])]:
        for city in cities:
            rows.append({
                "timestamp": ts_dt,
                "region": region,
                "city": city,
                "consumption_mw": float(rng.normal(200, 30)),
            })

df = pd.DataFrame(rows)
print(f"DataFrame shape: {df.shape}")
df.head(8)
DataFrame shape: (96, 4)
[18]:
timestamp region city consumption_mw
0 2024-01-15 00:00:00+00:00 South Oslo 169.295075
1 2024-01-15 00:00:00+00:00 South Bergen 205.378269
2 2024-01-15 00:00:00+00:00 North Tromsø 206.599901
3 2024-01-15 00:00:00+00:00 North Bodø 240.775627
4 2024-01-15 01:00:00+00:00 South Oslo 225.053337
5 2024-01-15 01:00:00+00:00 South Bergen 210.706132
6 2024-01-15 01:00:00+00:00 North Tromsø 243.899087
7 2024-01-15 01:00:00+00:00 North Bodø 164.337108
[19]:
h_from_df = tdm.HierarchicalTimeSeries.from_dataframe(
    df,
    level_columns=["region", "city"],
    value_column="consumption_mw",
    timestamp_column="timestamp",
    name="Consumption from DataFrame",
    frequency=tdm.Frequency.PT1H,
    timezone="Europe/Oslo",
)
h_from_df
[19]:
HierarchicalTimeSeries
Name:         Consumption from DataFrame
Levels:       root, region, city
Nodes:        7 (4 leaves)
Frequency:    PT1H
Timezone:     Europe/Oslo (+00:00)
Aggregation:  sum

name    level  length  begin             end
Bodø    city   24      2024-01-15 00:00  2024-01-15 23:00
Tromsø  city   24      2024-01-15 00:00  2024-01-15 23:00
Bergen  city   24      2024-01-15 00:00  2024-01-15 23:00
Oslo    city   24      2024-01-15 00:00  2024-01-15 23:00
[20]:
total_df = h_from_df.aggregate()
print(f"Total from DataFrame hierarchy: mean={np.nanmean(total_df.arr):.1f} MW")
Total from DataFrame hierarchy: mean=797.6 MW
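
The rule "each unique combination of level columns becomes a leaf" can be sketched with the standard library alone. The example below groups long-format rows by their level-column values; the toy `rows` data and the `leaves` mapping are hypothetical, and this is not the code from_dataframe actually runs. In the output above, the library also adds a level named root above the supplied level columns, which this sketch leaves out.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per (timestamp, region, city).
rows = [
    {"timestamp": 0, "region": "South", "city": "Oslo", "consumption_mw": 210.0},
    {"timestamp": 0, "region": "South", "city": "Bergen", "consumption_mw": 190.0},
    {"timestamp": 0, "region": "North", "city": "Tromsø", "consumption_mw": 205.0},
    {"timestamp": 1, "region": "South", "city": "Oslo", "consumption_mw": 220.0},
    {"timestamp": 1, "region": "South", "city": "Bergen", "consumption_mw": 180.0},
    {"timestamp": 1, "region": "North", "city": "Tromsø", "consumption_mw": 195.0},
]
level_columns = ["region", "city"]

# Each distinct tuple of level values identifies one leaf; its values are
# collected in timestamp order.
leaves: dict[tuple[str, ...], list[float]] = defaultdict(list)
for row in sorted(rows, key=lambda r: r["timestamp"]):
    path = tuple(row[col] for col in level_columns)
    leaves[path].append(row["consumption_mw"])

for path, values in leaves.items():
    print("/".join(path), values)
```

The leading components of each path name the interior nodes, so the tree structure follows directly from the set of leaf paths.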

Another example: energy production by source

Hierarchies can model any tree-shaped relationship — here, power production broken down by energy source.

[21]:
energy_root = tdm.HierarchyNode(
    key="Total",
    level="total",
    children=[
        tdm.HierarchyNode(
            key="Wind",
            level="source",
            children=[
                tdm.HierarchyNode(key="Farm A", level="farm", timeseries=make_consumption("Farm A", 100)),
                tdm.HierarchyNode(key="Farm B", level="farm", timeseries=make_consumption("Farm B", 80)),
            ],
        ),
        tdm.HierarchyNode(
            key="Solar",
            level="source",
            children=[
                tdm.HierarchyNode(key="Plant X", level="farm", timeseries=make_consumption("Plant X", 60)),
            ],
        ),
    ],
)

energy = tdm.HierarchicalTimeSeries(
    energy_root,
    name="Energy Production",
    levels=["total", "source", "farm"],
    aggregation=tdm.AggregationMethod.SUM,
)
energy
[21]:
HierarchicalTimeSeries
Name:         Energy Production
Levels:       total, source, farm
Nodes:        6 (3 leaves)
Frequency:    PT1H
Timezone:     Europe/Oslo (+00:00)
Unit:         MW
Aggregation:  sum

name     level  length  begin             end
Farm A   farm   24      2024-01-15 00:00  2024-01-15 23:00
Farm B   farm   24      2024-01-15 00:00  2024-01-15 23:00
Plant X  farm   24      2024-01-15 00:00  2024-01-15 23:00
[22]:
source_agg = energy.aggregate_level("source")
for name, ts in source_agg.items():
    print(f"{name:8s}  mean={np.nanmean(ts.arr):.1f} MW")
Wind      mean=179.2 MW
Solar     mean=60.5 MW

Sequence protocol

HierarchicalTimeSeries supports len, in, and bracket indexing with slash-separated paths.

[23]:
print(f"Total nodes:        {len(hierarchy)}")
print(f"'Oslo' in tree:     {'Oslo' in hierarchy}")
print(f"'Helsinki' in tree: {'Helsinki' in hierarchy}")

node = hierarchy["South/Oslo"]
print(f"\nBracket access:     {node.key} (leaf={node.is_leaf})")
Total nodes:        8
'Oslo' in tree:     True
'Helsinki' in tree: False

Bracket access:     Oslo (leaf=True)
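
As a rough picture of how such a container can support these operations, the sketch below implements `__len__`, `__contains__`, and `__getitem__` on a small tree. The `Node` and `Tree` classes are hypothetical and written only for this example. As in the cell above, paths are taken relative to the root, so "South/Oslo" omits "Norway". Raising KeyError for an unknown path is an assumption of this sketch, not documented library behaviour.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a hierarchy node."""
    key: str
    children: list["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


class Tree:
    """Hypothetical container showing len, in, and slash-path indexing."""

    def __init__(self, root: Node) -> None:
        self.root = root

    def _walk(self, node: Optional[Node] = None) -> Iterator[Node]:
        node = self.root if node is None else node
        yield node
        for child in node.children:
            yield from self._walk(child)

    def __len__(self) -> int:
        return sum(1 for _ in self._walk())

    def __contains__(self, key: object) -> bool:
        return any(node.key == key for node in self._walk())

    def __getitem__(self, path: str) -> Node:
        node = self.root
        for part in path.split("/"):
            # Descend one level per path component.
            for child in node.children:
                if child.key == part:
                    node = child
                    break
            else:
                raise KeyError(path)
        return node


tree = Tree(Node("Norway", [
    Node("South", [Node("Oslo"), Node("Bergen"), Node("Stavanger")]),
    Node("North", [Node("Tromsø"), Node("Bodø")]),
]))

print(len(tree), "Oslo" in tree, "Helsinki" in tree)
print(tree["South/Oslo"].key, tree["South/Oslo"].is_leaf)
```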

Summary

Feature               API
Build manually        HierarchyNode(key, level, children, timeseries)
Build from DataFrame  HierarchicalTimeSeries.from_dataframe(df, level_columns, value_column)
Build from dict       HierarchicalTimeSeries.from_dict(tree, series_map, levels=...)
Navigate              get_node(*path), get_level(name), leaves()
Walk                  walk(order="pre") / walk(order="post")
Aggregate             aggregate(node, method): bottom-up recursion
Level aggregate       aggregate_level(level) → dict[str, TimeSeriesList]
Subtree               subtree(*path) → new HierarchicalTimeSeries
Convert               to_collection(level), to_table(level)
Sequence ops          len(h), key in h, h["path/to/node"]