Cubes and Collections
For multi-dimensional data (e.g., scenario × time, or region × time), use TimeSeriesCube. For grouping heterogeneous time series that don’t share the same timestamps, use TimeSeriesCollection.
TimeSeriesCube
A cube stores an N-dimensional array with named Dimension objects. Common use cases include ensemble forecasts, scenario analysis, and region-by-time grids.
[1]:
from datetime import datetime, timedelta, timezone
import numpy as np
import timedatamodel as tdm
base = datetime(2024, 1, 15, tzinfo=timezone.utc)
hours = [base + timedelta(hours=i) for i in range(24)]
Building a cube from scratch
Create a 3-scenario x 24-hour cube representing price forecasts under different assumptions.
[2]:
rng = np.random.default_rng(42)
base_prices = 50 + 20 * np.sin(np.linspace(0, 2 * np.pi, 24))
data = np.array([
    base_prices * 0.8 + rng.normal(0, 2, 24),  # low scenario
    base_prices + rng.normal(0, 2, 24),        # base scenario
    base_prices * 1.3 + rng.normal(0, 3, 24),  # high scenario
])
cube = tdm.TimeSeriesCube(
    tdm.Frequency.PT1H,
    timezone="UTC",
    name="price_forecast",
    unit="EUR/MWh",
    data_type=tdm.DataType.FORECAST,
    dimensions=[
        tdm.Dimension("scenario", ["low", "base", "high"]),
        tdm.Dimension("valid_time", hours),
    ],
    values=data,
)
cube
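The values array passed to the cube above has one axis per dimension: 3 scenarios by 24 hours. The sketch below illustrates that consistency check in plain Python. The helpers `shape_of` and `matches_dimensions` are hypothetical names invented for this example and are not part of timedatamodel; how the library itself validates its input is not shown in this notebook.

```python
# Hypothetical helpers for illustration only; not timedatamodel API.
def shape_of(values):
    """Return the shape of a rectangular nested list as a tuple."""
    shape = []
    level = values
    while isinstance(level, list):
        shape.append(len(level))
        level = level[0] if level else None
    return tuple(shape)


def matches_dimensions(values, dimensions):
    """dimensions is a list of (name, labels) pairs, outermost axis first."""
    expected = tuple(len(labels) for _, labels in dimensions)
    return shape_of(values) == expected


dimensions = [
    ("scenario", ["low", "base", "high"]),
    ("valid_time", list(range(24))),  # stand-in for the 24 hourly timestamps
]
values = [[0.0] * 24 for _ in range(3)]

print(matches_dimensions(values, dimensions))      # True
print(matches_dimensions(values[:2], dimensions))  # False: only 2 scenario rows
```

The order of the dimensions matters: the first dimension labels the outermost axis of the array.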
Cube properties
[3]:
print(f"Shape:       {cube.shape}")
print(f"Dimensions:  {cube.dim_names}")
print(f"Begin:       {cube.begin}")
print(f"End:         {cube.end}")
print(f"Has missing: {cube.has_missing}")
Selecting with sel() — label-based
Select a single scenario to collapse the cube into a TimeSeries.
[4]:
base_scenario = cube.sel(scenario="base")
base_scenario
Selecting with isel() — index-based
Select by integer position.
[5]:
first_scenario = cube.isel(scenario=0)
first_scenario
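To make the difference between label-based and index-based selection concrete, here is a minimal sketch on plain lists. It assumes a label lookup is just a translation from label to position followed by ordinary indexing; the functions `sel_scenario` and `isel_scenario` are made up for this illustration and say nothing about how timedatamodel implements sel() and isel().

```python
# Illustrative only: a tiny 2D grid stored as labels plus nested lists.
scenarios = ["low", "base", "high"]
hours = [0, 1, 2, 3]  # stand-in for timestamps
values = [
    [40.0, 41.0, 42.0, 43.0],  # low
    [50.0, 51.0, 52.0, 53.0],  # base
    [65.0, 66.0, 67.0, 68.0],  # high
]


def isel_scenario(position):
    """Index-based: pick a row by integer position."""
    return list(zip(hours, values[position]))


def sel_scenario(label):
    """Label-based: translate the label to a position, then reuse isel."""
    return isel_scenario(scenarios.index(label))  # ValueError if label is unknown


print(sel_scenario("base"))
print(sel_scenario("low") == isel_scenario(0))  # True: same row either way
```

Label-based access keeps working if the scenario order changes, while index-based access is convenient for "first" or "last" style lookups.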
Slicing a dimension
Select a range of labels to get a smaller cube or table.
[6]:
two_scenarios = cube.sel(scenario=slice("low", "base"))
print(f"Type: {type(two_scenarios).__name__}")
print(two_scenarios)
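The variable name two_scenarios suggests that a label slice includes its stop label, as label slices do in pandas. That is an assumption here, since the cell produced no output. The sketch below shows the inclusive convention on a plain list; `label_slice` is a hypothetical helper, not a timedatamodel function.

```python
scenarios = ["low", "base", "high"]


def label_slice(labels, start, stop):
    """Return labels from start to stop, including both endpoints."""
    first = labels.index(start)
    last = labels.index(stop)
    return labels[first:last + 1]


print(label_slice(scenarios, "low", "base"))  # ['low', 'base']
print(scenarios[0:1])                         # ['low'] (positional slices exclude the stop)
```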
Auto-collapse to Table or Series
When a sel() or isel() call removes enough dimensions, the result automatically becomes a TimeSeriesTable (2D) or TimeSeries (1D). The cell below instead converts the two-dimensional cube explicitly with to_table().
[7]:
table_view = cube.to_table()
print(f"Type: {type(table_view).__name__}")
table_view
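One way to picture the auto-collapse rule is to count the dimensions that survive a selection. The sketch below is a simplified reading of the rule stated above, assuming that a scalar selector drops its dimension while a slice keeps it; `dims_after_selection` and `container_name` are hypothetical names, and the library may handle edge cases differently.

```python
def dims_after_selection(dims, **selectors):
    """Scalar selectors drop their dimension; slice selectors keep it."""
    return [
        d for d in dims
        if d not in selectors or isinstance(selectors[d], slice)
    ]


def container_name(dims):
    """Map the number of remaining dimensions to a container type name."""
    if len(dims) == 1:
        return "TimeSeries"
    if len(dims) == 2:
        return "TimeSeriesTable"
    return "TimeSeriesCube"


dims = ["scenario", "region", "valid_time"]  # a hypothetical 3D cube

print(container_name(dims_after_selection(dims, scenario="base")))
# TimeSeriesTable
print(container_name(dims_after_selection(dims, scenario="base", region="north")))
# TimeSeries
print(container_name(dims_after_selection(dims, scenario=slice("low", "base"))))
# TimeSeriesCube
```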
Building a cube from a list of TimeSeries
from_timeseries_list() is handy when you already have individual scenario forecasts.
[8]:
series_list = [
    tdm.TimeSeries(
        tdm.Frequency.PT1H,
        timestamps=hours,
        values=(base_prices * factor + rng.normal(0, 2, 24)).tolist(),
        name="price",
        unit="EUR/MWh",
    )
    for factor in [0.7, 0.85, 1.0, 1.15, 1.3]
]
ensemble = tdm.TimeSeriesCube.from_timeseries_list(
    series_list,
    dimension=tdm.Dimension("percentile", ["p10", "p25", "p50", "p75", "p90"]),
    name="price_ensemble",
)
ensemble
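Conceptually, building a cube from a list of series means stacking their values along a new labelled axis, which only works when every series shares the same timestamps. The sketch below shows that idea with plain tuples and a dictionary. `stack_series` is a hypothetical helper written for this example; it is not how from_timeseries_list() is implemented, and the library's actual validation rules are not shown in this notebook.

```python
from datetime import datetime, timedelta, timezone


def stack_series(series_list, labels, dim_name):
    """Stack (timestamps, values) pairs along a new labelled dimension."""
    if len(series_list) != len(labels):
        raise ValueError("need exactly one label per series")
    shared_axis = series_list[0][0]
    for timestamps, _ in series_list:
        if timestamps != shared_axis:
            raise ValueError("all series must share the same timestamps")
    return {
        "dimensions": [(dim_name, list(labels)), ("valid_time", list(shared_axis))],
        "values": [list(values) for _, values in series_list],
    }


base = datetime(2024, 1, 15, tzinfo=timezone.utc)
hours = [base + timedelta(hours=i) for i in range(3)]
members = [(hours, [40.0, 41.0, 42.0]), (hours, [50.0, 51.0, 52.0])]

stacked = stack_series(members, ["p10", "p90"], "percentile")
print(len(stacked["values"]), len(stacked["values"][0]))  # 2 3
```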
TimeSeriesCollection
A TimeSeriesCollection groups time series that may have different frequencies, time ranges, or numbers of points. Think of it as a named bag of TimeSeries and TimeSeriesTable objects.
[9]:
daily_base = datetime(2024, 1, 1, tzinfo=timezone.utc)
ts_hourly = tdm.TimeSeries(
    tdm.Frequency.PT1H,
    timestamps=hours,
    values=[100.0 + rng.normal(0, 10) for _ in range(24)],
    name="wind_hourly",
    unit="MW",
)
ts_daily = tdm.TimeSeries(
    tdm.Frequency.P1D,
    timestamps=[daily_base + timedelta(days=d) for d in range(30)],
    values=[2400.0 + rng.normal(0, 200) for _ in range(30)],
    name="wind_daily_energy",
    unit="MWh",
)
ts_15min = tdm.TimeSeries(
    tdm.Frequency.PT15M,
    timestamps=[base + timedelta(minutes=15 * i) for i in range(96)],
    values=[50.0 + rng.normal(0, 5) for _ in range(96)],
    name="solar_15min",
    unit="MW",
)
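To see why these three series belong in a collection rather than a cube, it helps to compare their time axes directly. The following standard-library sketch rebuilds the same timestamp lists as the cell above (without timedatamodel) and reports the length, step, and overlap of each axis.

```python
from datetime import datetime, timedelta, timezone

base = datetime(2024, 1, 15, tzinfo=timezone.utc)
daily_base = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Same timestamp lists as in the notebook cell, keyed by series name.
axes = {
    "wind_hourly": [base + timedelta(hours=i) for i in range(24)],
    "wind_daily_energy": [daily_base + timedelta(days=d) for d in range(30)],
    "solar_15min": [base + timedelta(minutes=15 * i) for i in range(96)],
}

for name, stamps in axes.items():
    step = stamps[1] - stamps[0]
    print(f"{name:20s} points={len(stamps):3d} step={step}")

# Timestamps present in all three axes.
shared = set.intersection(*(set(stamps) for stamps in axes.values()))
print(len(shared))  # 1: only 2024-01-15 00:00 UTC appears in every series
```

With a single shared timestamp, there is no common axis on which to align the values into a regular grid.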
Creating a collection
[10]:
collection = tdm.TimeSeriesCollection(
    [ts_hourly, ts_daily, ts_15min],
    name="Plant overview",
    description="Mixed-frequency data for a single plant",
)
collection
Dictionary-like access
[11]:
print(f"Names: {collection.names}")
print(f"Count: {collection.series_count}")
collection["wind_hourly"]
Adding and removing series
Collections are immutable: add() and remove() return new collections and leave the original unchanged.
[12]:
ts_price = tdm.TimeSeries(
    tdm.Frequency.PT1H,
    timestamps=hours,
    values=[45.0 + rng.normal(0, 8) for _ in range(24)],
    name="spot_price",
    unit="EUR/MWh",
)
extended = collection.add(ts_price)
print(f"Original: {collection.names}")
print(f"Extended: {extended.names}")
[13]:
reduced = extended.remove("wind_daily_energy")
print(f"Reduced: {reduced.names}")
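The immutable add/remove pattern described above can be sketched with a frozen dataclass. `NamedBag` is a hypothetical stand-in written for this example, not the TimeSeriesCollection implementation: it stores (name, series) pairs in a tuple and builds a new object for every change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamedBag:
    """Hypothetical immutable, name-keyed container (illustration only)."""

    entries: tuple = ()  # tuple of (name, series) pairs

    @property
    def names(self):
        return [name for name, _ in self.entries]

    def __getitem__(self, name):
        for key, series in self.entries:
            if key == name:
                return series
        raise KeyError(name)

    def add(self, name, series):
        if name in self.names:
            raise ValueError(f"duplicate name: {name}")
        return NamedBag(self.entries + ((name, series),))

    def remove(self, name):
        if name not in self.names:
            raise KeyError(name)
        return NamedBag(tuple(e for e in self.entries if e[0] != name))


bag = NamedBag((("wind_hourly", [1.0, 2.0]), ("wind_daily_energy", [24.0])))
extended = bag.add("spot_price", [45.0])
reduced = extended.remove("wind_daily_energy")

print(bag.names)       # ['wind_hourly', 'wind_daily_energy']
print(extended.names)  # ['wind_hourly', 'wind_daily_energy', 'spot_price']
print(reduced.names)   # ['wind_hourly', 'spot_price']
```

Because each operation returns a new object, earlier references such as `bag` stay valid and unchanged, which makes intermediate results safe to share.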
Iterating over a collection
[14]:
for name, series in collection.items():
    print(f"{name:20s} freq={str(series.frequency):5s} len={len(series):3d} begin={series.begin}")
Summary
TimeSeriesCube: N-dimensional time series with Dimension labels; slice with sel()/isel(); auto-collapses to Table or Series
TimeSeriesCollection: heterogeneous container for series with different frequencies and time ranges; dictionary-like access; immutable add/remove
Next up: nb_07 covers data quality tools — coverage bars and validation.