Hierarchical Time Series
HierarchicalTimeSeries lets you model tree-structured series (for example country, region, city) directly and aggregate bottom-up through the tree.

| Class | Purpose |
|---|---|
| HierarchyNode | A single node: key, level, children, and an optional time series |
| HierarchicalTimeSeries | The tree container: traversal, aggregation, conversion |
[1]:
from datetime import datetime, timedelta, timezone
import numpy as np
import timedatamodel as tdm
base = datetime(2024, 1, 15, tzinfo=timezone.utc)
timestamps = [base + timedelta(hours=i) for i in range(24)]
rng = np.random.default_rng(42)
Create leaf time series
Each leaf in the hierarchy holds a TimeSeriesList. Here we model electricity consumption for five Norwegian cities.
[2]:
def make_consumption(name: str, base_mw: float) -> tdm.TimeSeriesList:
    # Daily sinusoidal load shape around base_mw, plus 5 % Gaussian noise
    pattern = base_mw * (1 + 0.3 * np.sin(np.linspace(0, 2 * np.pi, 24)))
    noise = rng.normal(0, base_mw * 0.05, 24)
    return tdm.TimeSeriesList(
        tdm.Frequency.PT1H,
        timezone="Europe/Oslo",
        timestamps=timestamps,
        values=(pattern + noise).tolist(),
        name=name,
        unit="MW",
    )

ts_oslo = make_consumption("Oslo", 500)
ts_bergen = make_consumption("Bergen", 200)
ts_stavanger = make_consumption("Stavanger", 150)
ts_tromsoe = make_consumption("Tromsø", 80)
ts_bodoe = make_consumption("Bodø", 50)
Building a hierarchy with HierarchyNode
Construct the tree by nesting HierarchyNode objects. Leaves hold a TimeSeriesList; interior nodes have children.
[3]:
root = tdm.HierarchyNode(
    key="Norway",
    level="country",
    children=[
        tdm.HierarchyNode(
            key="South",
            level="region",
            children=[
                tdm.HierarchyNode(key="Oslo", level="city", timeseries=ts_oslo),
                tdm.HierarchyNode(key="Bergen", level="city", timeseries=ts_bergen),
                tdm.HierarchyNode(key="Stavanger", level="city", timeseries=ts_stavanger),
            ],
        ),
        tdm.HierarchyNode(
            key="North",
            level="region",
            children=[
                tdm.HierarchyNode(key="Tromsø", level="city", timeseries=ts_tromsoe),
                tdm.HierarchyNode(key="Bodø", level="city", timeseries=ts_bodoe),
            ],
        ),
    ],
)

hierarchy = tdm.HierarchicalTimeSeries(
    root,
    name="Norway Consumption",
    description="Hourly electricity consumption by city",
    levels=["country", "region", "city"],
    aggregation=tdm.AggregationMethod.SUM,
)
hierarchy
[3]:
| name | level | length | begin | end |
|---|---|---|---|---|
| Oslo | city | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
| Bergen | city | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
| Stavanger | city | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
| Tromsø | city | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
| Bodø | city | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
Inspecting the tree
Basic properties tell you the shape of the hierarchy.
[4]:
print(f"Name: {hierarchy.name}")
print(f"Levels: {hierarchy.levels}")
print(f"# levels: {hierarchy.n_levels}")
print(f"# nodes: {hierarchy.n_nodes}")
print(f"# leaves: {hierarchy.n_leaves}")
Name: Norway Consumption
Levels: ['country', 'region', 'city']
# levels: 3
# nodes: 8
# leaves: 5
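The counts above follow directly from the shape of the tree. As a minimal sketch, assuming a plain nested dict as a stand-in for the hierarchy (the dict layout and the helper names count_nodes, count_leaves, and count_levels are hypothetical and not part of timedatamodel), the same numbers can be derived like this:

```python
# Hypothetical stand-in for the tree: each key maps to a dict of its children,
# and a leaf is a key whose dict is empty. Not timedatamodel's data model.
tree = {
    "Norway": {
        "South": {"Oslo": {}, "Bergen": {}, "Stavanger": {}},
        "North": {"Tromsø": {}, "Bodø": {}},
    }
}


def count_nodes(children: dict) -> int:
    """Count every node in the subtrees described by `children`."""
    return sum(1 + count_nodes(sub) for sub in children.values())


def count_leaves(children: dict) -> int:
    """Count the nodes that have no children of their own."""
    return sum(1 if not sub else count_leaves(sub) for sub in children.values())


def count_levels(children: dict) -> int:
    """Number of levels from the top of `children` down to the deepest leaf."""
    if not children:
        return 0
    return 1 + max(count_levels(sub) for sub in children.values())


print(count_levels(tree), count_nodes(tree), count_leaves(tree))
```

One country, two regions, and five cities give three levels, eight nodes, and five leaves.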
Leaves and walking
leaves() returns all leaf nodes. walk() yields nodes in pre-order (default) or post-order.
[8]:
print("All leaves:")
for leaf in hierarchy.leaves():
    print(f" {leaf.key:12s} path={leaf.path}")
All leaves:
Oslo path=['Norway', 'South', 'Oslo']
Bergen path=['Norway', 'South', 'Bergen']
Stavanger path=['Norway', 'South', 'Stavanger']
Tromsø path=['Norway', 'North', 'Tromsø']
Bodø path=['Norway', 'North', 'Bodø']
[9]:
print("Pre-order walk:")
for node in hierarchy.walk(order="pre"):
    indent = " " * node.depth
    label = f"{node.key} [{node.level}]"
    if node.is_leaf:
        label += f" — {len(node.timeseries)} pts"
    print(f"{indent}{label}")
Pre-order walk:
Norway [country]
South [region]
Oslo [city] — 24 pts
Bergen [city] — 24 pts
Stavanger [city] — 24 pts
North [region]
Tromsø [city] — 24 pts
Bodø [city] — 24 pts
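To make the difference between the two orders concrete, here is a small sketch of both traversals over a nested dict. The dict layout and the walk_keys function are assumptions made for illustration; this is not how the library implements walk().

```python
from typing import Iterator

# Hypothetical nested-dict tree: key -> children, empty dict marks a leaf.
TREE = {
    "Norway": {
        "South": {"Oslo": {}, "Bergen": {}, "Stavanger": {}},
        "North": {"Tromsø": {}, "Bodø": {}},
    }
}


def walk_keys(children: dict, order: str = "pre") -> Iterator[str]:
    """Yield keys depth-first, parent first ("pre") or parent last ("post")."""
    for key, sub in children.items():
        if order == "pre":
            yield key
        yield from walk_keys(sub, order)
        if order == "post":
            yield key


pre = list(walk_keys(TREE, "pre"))
post = list(walk_keys(TREE, "post"))
print(pre)
print(post)
```

Pre-order reproduces the listing above, with each parent printed before its children. Post-order visits children before their parent, which is the order a bottom-up computation needs.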
Bottom-up aggregation
aggregate() recursively combines leaf series using the chosen method (default: SUM). Called without arguments it aggregates from the root; pass a node to aggregate only that subtree.
[10]:
total = hierarchy.aggregate()
print(f"Name: {total.name}")
print(f"Length: {len(total)} data points")
print(f"Mean: {np.nanmean(total.arr):.1f} MW")
Name: Norway
Length: 24 data points
Mean: 979.5 MW
[11]:
south = hierarchy.get_node("South")
south_total = hierarchy.aggregate(south)
print(f"South region total — mean: {np.nanmean(south_total.arr):.1f} MW")
north = hierarchy.get_node("North")
north_total = hierarchy.aggregate(north)
print(f"North region total — mean: {np.nanmean(north_total.arr):.1f} MW")
South region total — mean: 850.6 MW
North region total — mean: 128.8 MW
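The recursion behind bottom-up aggregation can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: leaves are plain lists of floats on a shared time axis, interior nodes are dicts, the function aggregate_tree is hypothetical, and the values are toy numbers.

```python
from typing import Callable, Union

Node = Union[list, dict]  # leaf: list of floats; interior: key -> child node


def aggregate_tree(node: Node, combine: Callable = sum) -> list:
    """Combine all series below `node`, one time step at a time."""
    if isinstance(node, list):
        return node  # a leaf is already a series
    # First aggregate each child, then combine the child results step by step.
    child_series = [aggregate_tree(child, combine) for child in node.values()]
    return [combine(step) for step in zip(*child_series)]


# Toy values with two time steps, chosen only for illustration.
norway = {
    "South": {"Oslo": [5.0, 6.0], "Bergen": [2.0, 3.0]},
    "North": {"Tromsø": [1.0, 1.0]},
}

print(aggregate_tree(norway))           # whole tree
print(aggregate_tree(norway["South"]))  # one subtree
```

Because `combine` is applied at every interior node, the choice of method affects each level of the tree and not only the final step.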
Level-wise aggregation
aggregate_level(level) aggregates every node at the named level, returning a dict.
[12]:
region_agg = hierarchy.aggregate_level("region")
for name, ts in region_agg.items():
    print(f"{name:8s} mean={np.nanmean(ts.arr):7.1f} MW max={np.nanmax(ts.arr):7.1f} MW")
South mean= 850.6 MW max= 1125.5 MW
North mean= 128.8 MW max= 172.1 MW
Choosing an aggregation method
Override the default method by passing a different AggregationMethod.
[13]:
for method in tdm.AggregationMethod:
    agg = hierarchy.aggregate(method=method)
    vals = agg.arr
    print(f"{method.value:5s} mean={np.nanmean(vals):7.1f} min={np.nanmin(vals):7.1f} max={np.nanmax(vals):7.1f}")
sum mean= 979.5 min= 672.3 max= 1296.0
mean mean= 174.0 min= 119.5 max= 230.2
min mean= 49.5 min= 31.4 max= 67.9
max mean= 499.5 min= 326.4 max= 652.8
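The mean row deserves a second look. Because aggregation is recursive, a mean over an unbalanced tree is a mean of means, which generally differs from a flat mean over all leaves. The arithmetic below uses only the regional figures reported earlier in this section, and it suggests that the 174.0 above is consistent with the recursive reading. This is an inference from the printed numbers, not a statement about the library's internals.

```python
from statistics import fmean

# Time-averaged regional sums reported above (MW), and the cities per region.
south_sum, south_cities = 850.6, 3
north_sum, north_cities = 128.8, 2

# Recursive mean: average the cities within each region, then average the regions.
mean_of_means = fmean([south_sum / south_cities, north_sum / north_cities])

# Flat mean: average all five cities directly, ignoring the region level.
flat_mean = (south_sum + north_sum) / (south_cities + north_cities)

print(f"mean of means: {mean_of_means:.1f} MW")  # 174.0
print(f"flat mean:     {flat_mean:.1f} MW")      # 195.9
```

With SUM, MIN, and MAX the two readings coincide, so the distinction only matters for MEAN on trees whose branches have different numbers of leaves.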
Subtree extraction
subtree(*path) creates a new HierarchicalTimeSeries rooted at the specified node.
[14]:
south_tree = hierarchy.subtree("South")
print(south_tree)
print(f"\nLevels: {south_tree.levels}")
print(f"Leaves: {south_tree.n_leaves}")
HierarchicalTimeSeries
┌────────────────────────────────────────────────────────────────┐
│ Name: Norway Consumption │
│ Levels: region, city │
│ Nodes: 4 (3 leaves) │
│ Frequency: PT1H │
│ Timezone: Europe/Oslo (+00:00) │
│ Unit: MW │
│ Aggregation: sum │
├────────────────────────────────────────────────────────────────┤
│ name level length begin end │
├────────────────────────────────────────────────────────────────┤
│ Oslo city 24 2024-01-15 00:00 2024-01-15 23:00 │
│ Bergen city 24 2024-01-15 00:00 2024-01-15 23:00 │
│ Stavanger city 24 2024-01-15 00:00 2024-01-15 23:00 │
└────────────────────────────────────────────────────────────────┘
Levels: ['region', 'city']
Leaves: 3
Converting to other containers
Flatten the hierarchy into a TimeSeriesCollection or TimeSeriesTable.
[15]:
collection = hierarchy.to_collection()
print(f"Leaf-level collection: {list(collection.keys())}")
Leaf-level collection: ['Oslo', 'Bergen', 'Stavanger', 'Tromsø', 'Bodø']
[16]:
collection_regions = hierarchy.to_collection(level="region")
print("Region-level collection (aggregated):")
for key, ts in collection_regions.items():
    print(f" {key}: {len(ts)} pts, mean={np.nanmean(ts.arr):.1f} MW")
Region-level collection (aggregated):
South: 24 pts, mean=850.6 MW
North: 24 pts, mean=128.8 MW
[17]:
table = hierarchy.to_table()
print(f"Table shape: {len(table)} rows × {table.n_columns} columns")
print(f"Columns: {table.names}")
table
Table shape: 24 rows × 5 columns
Columns: ['Oslo', 'Bergen', 'Stavanger', 'Tromsø', 'Bodø']
[17]:
| timestamp | Oslo | Bergen | Stavanger | Tromsø | Bodø |
|---|---|---|---|---|---|
| 2024-01-15 00:00 | 507.618 | 195.717 | 155.092 | 76.3222 | 46.6933 |
| 2024-01-15 01:00 | 514.47 | 212.666 | 162.648 | 88.4638 | 51.5538 |
| 2024-01-15 02:00 | 596.699 | 236.498 | 175.55 | 93.0397 | 58.7932 |
| … | … | … | … | … | … |
| 2024-01-15 21:00 | 405.039 | 171.012 | 125.184 | 67.2111 | 42.2609 |
| 2024-01-15 22:00 | 490.094 | 192.526 | 128.291 | 66.7755 | 49.9575 |
| 2024-01-15 23:00 | 496.137 | 202.236 | 141.5 | 74.2116 | 49.4016 |
Building from a DataFrame
from_dataframe builds the tree from a long-format DataFrame: one row per observation, with one column per hierarchy level.
[18]:
import pandas as pd
rows = []
for ts_dt in timestamps:
    for region, cities in [("South", ["Oslo", "Bergen"]), ("North", ["Tromsø", "Bodø"])]:
        for city in cities:
            rows.append({
                "timestamp": ts_dt,
                "region": region,
                "city": city,
                "consumption_mw": float(rng.normal(200, 30)),
            })
df = pd.DataFrame(rows)
print(f"DataFrame shape: {df.shape}")
df.head(8)
DataFrame shape: (96, 4)
[18]:
| | timestamp | region | city | consumption_mw |
|---|---|---|---|---|
| 0 | 2024-01-15 00:00:00+00:00 | South | Oslo | 169.295075 |
| 1 | 2024-01-15 00:00:00+00:00 | South | Bergen | 205.378269 |
| 2 | 2024-01-15 00:00:00+00:00 | North | Tromsø | 206.599901 |
| 3 | 2024-01-15 00:00:00+00:00 | North | Bodø | 240.775627 |
| 4 | 2024-01-15 01:00:00+00:00 | South | Oslo | 225.053337 |
| 5 | 2024-01-15 01:00:00+00:00 | South | Bergen | 210.706132 |
| 6 | 2024-01-15 01:00:00+00:00 | North | Tromsø | 243.899087 |
| 7 | 2024-01-15 01:00:00+00:00 | North | Bodø | 164.337108 |
[19]:
h_from_df = tdm.HierarchicalTimeSeries.from_dataframe(
    df,
    level_columns=["region", "city"],
    value_column="consumption_mw",
    timestamp_column="timestamp",
    name="Consumption from DataFrame",
    frequency=tdm.Frequency.PT1H,
    timezone="Europe/Oslo",
)
h_from_df
[19]:
| name | level | length | begin | end |
|---|---|---|---|---|
| Bodø | city | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
| Tromsø | city | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
| Bergen | city | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
| Oslo | city | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
[20]:
total_df = h_from_df.aggregate()
print(f"Total from DataFrame hierarchy: mean={np.nanmean(total_df.arr):.1f} MW")
Total from DataFrame hierarchy: mean=797.6 MW
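Conceptually, turning long-format rows into a tree is a nested group-by on the level columns. The sketch below shows that idea with the standard library only. The function build_tree, the tuple layout, and the toy values are assumptions for illustration and do not describe how from_dataframe works internally.

```python
from collections import defaultdict

# Toy long-format rows: (timestamp, region, city, value). Illustrative values only.
rows = [
    (0, "South", "Oslo", 10.0),
    (0, "North", "Bodø", 1.0),
    (1, "South", "Oslo", 12.0),
    (1, "North", "Bodø", 2.0),
    (1, "South", "Bergen", 5.0),
    (0, "South", "Bergen", 4.0),
]


def build_tree(rows):
    """Group rows by region, then city; each leaf is a time-sorted list of points."""
    grouped = defaultdict(lambda: defaultdict(list))
    for timestamp, region, city, value in rows:
        grouped[region][city].append((timestamp, value))
    # Sort every leaf by timestamp and convert back to plain dicts.
    return {
        region: {city: sorted(points) for city, points in cities.items()}
        for region, cities in grouped.items()
    }


tree = build_tree(rows)
print(tree["South"]["Bergen"])
```

Note that the listing printed by h_from_df above shows the leaves in a different order from the one in which the rows were appended, so it is safer not to rely on leaf order after building from a DataFrame.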
Another example: energy production by source
Hierarchies can model any tree-shaped relationship. Here, power production is broken down by energy source; the synthetic series are generated with the make_consumption helper defined earlier.
[21]:
energy_root = tdm.HierarchyNode(
    key="Total",
    level="total",
    children=[
        tdm.HierarchyNode(
            key="Wind",
            level="source",
            children=[
                tdm.HierarchyNode(key="Farm A", level="farm", timeseries=make_consumption("Farm A", 100)),
                tdm.HierarchyNode(key="Farm B", level="farm", timeseries=make_consumption("Farm B", 80)),
            ],
        ),
        tdm.HierarchyNode(
            key="Solar",
            level="source",
            children=[
                tdm.HierarchyNode(key="Plant X", level="farm", timeseries=make_consumption("Plant X", 60)),
            ],
        ),
    ],
)

energy = tdm.HierarchicalTimeSeries(
    energy_root,
    name="Energy Production",
    levels=["total", "source", "farm"],
    aggregation=tdm.AggregationMethod.SUM,
)
energy
[21]:
| name | level | length | begin | end |
|---|---|---|---|---|
| Farm A | farm | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
| Farm B | farm | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
| Plant X | farm | 24 | 2024-01-15 00:00 | 2024-01-15 23:00 |
[22]:
source_agg = energy.aggregate_level("source")
for name, ts in source_agg.items():
    print(f"{name:8s} mean={np.nanmean(ts.arr):.1f} MW")
Wind mean=179.2 MW
Solar mean=60.5 MW
Sequence protocol
HierarchicalTimeSeries supports len, in, and bracket indexing with slash-separated paths.
[23]:
print(f"Total nodes: {len(hierarchy)}")
print(f"'Oslo' in tree: {'Oslo' in hierarchy}")
print(f"'Helsinki' in tree: {'Helsinki' in hierarchy}")
node = hierarchy["South/Oslo"]
print(f"\nBracket access: {node.key} (leaf={node.is_leaf})")
Total nodes: 8
'Oslo' in tree: True
'Helsinki' in tree: False
Bracket access: Oslo (leaf=True)
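For readers curious how such behaviour maps onto Python's data model, the sketch below implements the three hooks on a hypothetical TreeView class backed by a nested dict. It mirrors the behaviour shown above (paths are relative to the root and split on "/"), but the class and its internals are assumptions for illustration and are not the library's code.

```python
class TreeView:
    """Hypothetical wrapper around a nested dict (key -> children, {} = leaf)."""

    def __init__(self, root_key: str, children: dict):
        self.root_key = root_key
        self.children = children

    def _iter_keys(self, children: dict):
        # Depth-first iteration over every key below the root.
        for key, sub in children.items():
            yield key
            yield from self._iter_keys(sub)

    def __len__(self) -> int:
        # Total node count, including the root itself.
        return 1 + sum(1 for _ in self._iter_keys(self.children))

    def __contains__(self, key: str) -> bool:
        return key == self.root_key or key in self._iter_keys(self.children)

    def __getitem__(self, path: str) -> dict:
        # Follow a slash-separated path from the root; raises KeyError if absent.
        node = self.children
        for part in path.split("/"):
            node = node[part]
        return node


view = TreeView(
    "Norway",
    {
        "South": {"Oslo": {}, "Bergen": {}, "Stavanger": {}},
        "North": {"Tromsø": {}, "Bodø": {}},
    },
)
print(len(view), "Oslo" in view, "Helsinki" in view, view["South/Oslo"])
```

In this sketch a lookup returns the node's child dict, so an empty dict signals a leaf; the library returns a node object instead, as the `node.is_leaf` call above shows.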
Summary
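| Feature | API |
|---|---|
| Build manually | HierarchyNode(...), HierarchicalTimeSeries(root, ...) |
| Build from DataFrame | HierarchicalTimeSeries.from_dataframe(...) |
| Build from dict | |
| Navigate | get_node(key) |
| Walk | walk(order=...), leaves() |
| Aggregate | aggregate(node, method=...) |
| Level aggregate | aggregate_level(level) |
| Subtree | subtree(*path) |
| Convert | to_collection(level=...), to_table() |
| Sequence ops | len(h), key in h, h["South/Oslo"] |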