I/O and Interoperability
TimeDataModel is designed to sit between your domain logic and the broader Python data ecosystem. This notebook shows how to move data between TimeSeries / TimeSeriesTable and pandas, NumPy, polars, JSON, and CSV.
[1]:
from datetime import datetime, timedelta, timezone

import numpy as np

import timedatamodel as tdm

# 24 hourly timestamps starting at midnight UTC
base = datetime(2024, 1, 15, tzinfo=timezone.utc)
timestamps = [base + timedelta(hours=i) for i in range(24)]
# One sine period over the day, centered on 100 MW
values = [100.0 + 50 * np.sin(2 * np.pi * i / 24) for i in range(24)]

ts = tdm.TimeSeries(
    tdm.Frequency.PT1H,
    timezone="UTC",
    timestamps=timestamps,
    values=values,
    name="power",
    unit="MW",
    description="Synthetic daily power curve",
    data_type=tdm.DataType.SYNTHETIC,
)
NumPy
to_numpy() returns a float64 array (None becomes NaN). Use it when you need fast vectorized computation.
[2]:
arr = ts.to_numpy()
print(f"Type: {type(arr)}")
print(f"Shape: {arr.shape}")
print(f"Mean: {arr.mean():.2f} MW")
print(f"Max: {arr.max():.2f} MW")
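To make the None-to-NaN conversion concrete without depending on timedatamodel, here is a minimal sketch in plain NumPy. The `raw` list is made up for illustration, and how to_numpy() performs the conversion internally is an assumption. The point is that NaN propagates through ordinary reductions, so NaN-aware functions are usually the safer choice on exported arrays.

```python
import numpy as np

# Hypothetical raw values with one missing entry (not the notebook's series).
raw = [100.0, None, 120.0]

# Casting to float64 turns None into NaN, as to_numpy() is described as doing.
arr = np.array(raw, dtype=np.float64)

plain_mean = arr.mean()      # NaN: one missing value propagates into the result
safe_mean = np.nanmean(arr)  # NaN entries are skipped

print(arr.dtype, plain_mean, safe_mean)
```

If the series can contain gaps, prefer np.nanmean, np.nanmax, and similar functions over the plain array methods.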
Pandas — export
to_pandas_dataframe() produces a DataFrame with a DatetimeIndex.
[3]:
df = ts.to_pandas_dataframe()
df.head()
Pandas — import with from_pandas()
Create a TimeSeries from any pandas DataFrame that has a DatetimeIndex. Frequency and timezone are auto-inferred when possible.
[4]:
import pandas as pd
idx = pd.date_range("2024-06-01", periods=48, freq="h", tz="UTC")
df_external = pd.DataFrame({"load": np.random.default_rng(42).normal(150, 20, 48)}, index=idx)
ts_from_pd = tdm.TimeSeries.from_pandas(df_external, unit="MW", data_type=tdm.DataType.MEASUREMENT)
ts_from_pd
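For intuition about what "auto-inferred when possible" can mean, the sketch below uses pandas' own `pd.infer_freq` on a DatetimeIndex. Whether from_pandas() relies on this function is an assumption; the example only shows that a regular index yields a frequency and an irregular one does not.

```python
import pandas as pd

# Same kind of index as in the cell above: hourly and timezone-aware.
idx = pd.date_range("2024-06-01", periods=48, freq="h", tz="UTC")

# Rebuild the index from a plain list so no frequency is attached to it.
plain = pd.DatetimeIndex(list(idx))

inferred = pd.infer_freq(plain)              # hourly alias; spelling depends on the pandas version
irregular = pd.infer_freq(plain.delete(5))   # None: the gap breaks regularity

print(inferred, irregular, plain.tz)
```

When inference fails like this, passing the frequency explicitly is the natural fallback, as later cells in this notebook do with tdm.Frequency.PT1H.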
Pandas — TimeSeriesTable round-trip
[5]:
rng = np.random.default_rng(42)
table = tdm.TimeSeriesTable(
    tdm.Frequency.PT1H,
    timestamps=timestamps,
    # Two columns of 24 hourly values each
    values=np.column_stack([
        100 + rng.normal(0, 10, 24),
        50 + rng.normal(0, 5, 24),
    ]),
    names=["wind", "solar"],
    units=["MW", "MW"],
)

df_table = table.to_pandas_dataframe()
print(f"Columns: {list(df_table.columns)}")

# Units are passed again on import
table_back = tdm.TimeSeriesTable.from_pandas(df_table, units=["MW", "MW"])
print(f"Round-trip equals: {table.equals(table_back)}")
Polars (optional)
If polars is installed, you get to_polars_dataframe() and from_polars().
[6]:
try:
    df_pl = ts.to_polars_dataframe()
    print(df_pl.head())

    # Frequency and unit are passed again on import
    ts_from_pl = tdm.TimeSeries.from_polars(df_pl, tdm.Frequency.PT1H, unit="MW")
    print(f"\nRound-trip length: {len(ts_from_pl)}")
except ImportError:
    print("polars not installed; skip this cell or run: pip install timedatamodel[polars]")
JSON serialization
to_json() produces a JSON string with ISO-8601 timestamps. from_json() reconstructs the series with full metadata.
[7]:
json_str = ts.to_json()
print(json_str[:200], "...")
[8]:
ts_restored = tdm.TimeSeries.from_json(json_str)
print(f"Name: {ts_restored.name}")
print(f"Unit: {ts_restored.unit}")
print(f"Frequency: {ts_restored.frequency}")
print(f"Length: {len(ts_restored)}")
print(f"Equals original: {ts.equals(ts_restored)}")
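The round trip above depends on timestamps surviving as ISO-8601 text. The standard-library sketch below shows that mechanism in isolation. The payload layout (keys such as "frequency" and "timestamps") is hypothetical and is not timedatamodel's actual JSON schema.

```python
import json
from datetime import datetime, timedelta, timezone

base = datetime(2024, 1, 15, tzinfo=timezone.utc)

# Hypothetical payload layout; the library's real schema may differ.
payload = {
    "name": "power",
    "unit": "MW",
    "frequency": "PT1H",
    "timestamps": [(base + timedelta(hours=i)).isoformat() for i in range(3)],
    "values": [100.0, None, 125.0],  # None is written as JSON null
}

text = json.dumps(payload)
restored = json.loads(text)

# ISO-8601 strings parse back into timezone-aware datetimes.
restored_times = [datetime.fromisoformat(s) for s in restored["timestamps"]]

print(restored["timestamps"][0])
print(restored_times[0] == base)
```

Because the metadata travels inside the same document as the data, nothing has to be supplied again on import, which is consistent with from_json() taking only the string.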
JSON for TimeSeriesTable
[9]:
json_table = table.to_json()
table_restored = tdm.TimeSeriesTable.from_json(json_table)
print(f"Columns: {table_restored.column_names}")
print(f"Equals original: {table.equals(table_restored)}")
CSV serialization
to_csv() and from_csv() write and read simple CSV files.
[10]:
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "power.csv"
    ts.to_csv(csv_path)

    # Print the header and the first few rows of the written file
    with open(csv_path) as f:
        for i, line in enumerate(f):
            print(line.rstrip())
            if i >= 4:
                print("...")
                break
[11]:
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "power.csv"
    ts.to_csv(csv_path)
    # Frequency and unit are passed again on import
    ts_from_csv = tdm.TimeSeries.from_csv(csv_path, tdm.Frequency.PT1H, unit="MW")

print(f"Length: {len(ts_from_csv)}")
print(f"Name: {ts_from_csv.name}")
ts_from_csv.head()
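The from_csv() call above passes the frequency and unit again. A plain CSV file carries only text columns, which the standard-library sketch below illustrates. The column names "timestamp" and "value" are assumptions for this example and may not match what to_csv() writes.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

base = datetime(2024, 1, 15, tzinfo=timezone.utc)
rows = [(base + timedelta(hours=i), 100.0 + i) for i in range(3)]

# Write: column names are assumptions made for this sketch.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["timestamp", "value"])
for stamp, value in rows:
    writer.writerow([stamp.isoformat(), value])

# Read back: every field arrives as a string and must be parsed explicitly.
buffer.seek(0)
reader = csv.DictReader(buffer)
parsed = [(datetime.fromisoformat(r["timestamp"]), float(r["value"])) for r in reader]

print(parsed == rows)
```

Timestamps and values survive the trip, but there is no column in which a unit, frequency, or description could be stored, so those have to come from the caller.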
Multi-index round-trip
TimeSeries supports tuple-based timestamps for hierarchical indexing. This is preserved through pandas and JSON round-trips.
[12]:
# Two forecast runs (issued at 00:00 and 12:00), three valid times each
issue_times = [
    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),
]
valid_times = [
    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 2, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 13, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 14, tzinfo=timezone.utc),
]

ts_multi = tdm.TimeSeries(
    tdm.Frequency.PT1H,
    timestamps=list(zip(issue_times, valid_times)),
    values=[100.0, 105.0, 110.0, 95.0, 100.0, 108.0],
    name="forecast",
    unit="MW",
    index_names=["issue_time", "valid_time"],
)

df_multi = ts_multi.to_pandas_dataframe()
df_multi
[13]:
ts_multi_back = tdm.TimeSeries.from_pandas(df_multi, tdm.Frequency.PT1H, unit="MW")
print(f"Index names: {ts_multi_back.index_names}")
print(f"Multi-index: {ts_multi_back.is_multi_index}")
print(f"Equals original: {ts_multi.equals(ts_multi_back)}")
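On the pandas side, tuple-based timestamps correspond naturally to a MultiIndex. The sketch below builds one directly with pandas to show the shape such a DataFrame can take. Whether to_pandas_dataframe() produces exactly this layout is an assumption, and the "forecast" column name is chosen only to match the series name above.

```python
from datetime import datetime, timezone

import pandas as pd

issue = datetime(2024, 1, 15, 0, tzinfo=timezone.utc)
pairs = [
    (issue, datetime(2024, 1, 15, 0, tzinfo=timezone.utc)),
    (issue, datetime(2024, 1, 15, 1, tzinfo=timezone.utc)),
    (issue, datetime(2024, 1, 15, 2, tzinfo=timezone.utc)),
]

# One index level per tuple position, named like index_names above.
index = pd.MultiIndex.from_tuples(pairs, names=["issue_time", "valid_time"])
frame = pd.DataFrame({"forecast": [100.0, 105.0, 110.0]}, index=index)

# Going back: each index entry is again an (issue_time, valid_time) tuple.
recovered = [tuple(stamp.to_pydatetime() for stamp in entry) for entry in frame.index]

print(list(frame.index.names))
print(recovered == pairs)
```

The level names carry the same information as index_names, which suggests how the names can be restored on import without being passed again.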
Summary
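| Format | Export | Import | Metadata preserved |
|---|---|---|---|
| NumPy | to_numpy() | n/a | Values only |
| pandas | to_pandas_dataframe() | from_pandas() | Name, freq, tz |
| polars | to_polars_dataframe() | from_polars() | Name |
| JSON | to_json() | from_json() | Full |
| CSV | to_csv() | from_csv() | Timestamps + values |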
Next up: nb_09 covers geographical support — GeoLocation, GeoArea, and spatial queries.