I/O and Interoperability

TimeDataModel is designed to sit between your domain logic and the broader Python data ecosystem. This notebook shows how to move data between TimeSeries / TimeSeriesTable and NumPy, pandas, polars, JSON, and CSV, and which metadata survives each conversion.

[1]:
from datetime import datetime, timedelta, timezone

import numpy as np

import timedatamodel as tdm

base = datetime(2024, 1, 15, tzinfo=timezone.utc)
timestamps = [base + timedelta(hours=i) for i in range(24)]
values = [100.0 + 50 * np.sin(2 * np.pi * i / 24) for i in range(24)]

ts = tdm.TimeSeries(
    tdm.Frequency.PT1H,
    timezone="UTC",
    timestamps=timestamps,
    values=values,
    name="power",
    unit="MW",
    description="Synthetic daily power curve",
    data_type=tdm.DataType.SYNTHETIC,
)

NumPy

to_numpy() returns a float64 array (None becomes NaN). Use it when you need fast vectorized computation.

[2]:
arr = ts.to_numpy()
print(f"Type:  {type(arr)}")
print(f"Shape: {arr.shape}")
print(f"Mean:  {arr.mean():.2f} MW")
print(f"Max:   {arr.max():.2f} MW")
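
The None-to-NaN conversion matters for aggregates: an ordinary reduction such as mean() returns NaN as soon as one value is missing. The sketch below uses plain NumPy only and does not call timedatamodel; the list raw is a hypothetical stand-in for a series with one gap.

```python
import numpy as np

# Hypothetical stand-in for series values with one missing entry.
raw = [100.0, None, 120.0]

# Mirror the None -> NaN conversion described above using plain NumPy.
arr = np.array([np.nan if v is None else v for v in raw], dtype=np.float64)

plain_mean = arr.mean()      # NaN propagates through ordinary reductions
safe_mean = np.nanmean(arr)  # NaN-aware reduction skips the gap

print(np.isnan(plain_mean))  # True
print(safe_mean)             # 110.0
```

If a series may contain gaps, the NaN-aware functions (np.nanmean, np.nanmax, and so on) are usually the safer choice for summary statistics.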

Pandas — export

to_pandas_dataframe() produces a DataFrame with a DatetimeIndex.

[3]:
df = ts.to_pandas_dataframe()
df.head()

Pandas — import with from_pandas()

Create a TimeSeries from any pandas DataFrame that has a DatetimeIndex. Frequency and timezone are auto-inferred when possible.

[4]:
import pandas as pd

idx = pd.date_range("2024-06-01", periods=48, freq="h", tz="UTC")
df_external = pd.DataFrame({"load": np.random.default_rng(42).normal(150, 20, 48)}, index=idx)

ts_from_pd = tdm.TimeSeries.from_pandas(df_external, unit="MW", data_type=tdm.DataType.MEASUREMENT)
ts_from_pd
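
To make "auto-inferred when possible" concrete, here is a sketch of the kind of inference pandas itself offers. It is not a description of how from_pandas() works internally, which this notebook does not show; it only illustrates why a regular index can be inferred and an irregular one cannot.

```python
import pandas as pd

# Hourly, timezone-aware index, as in the cell above.
idx = pd.date_range("2024-06-01", periods=48, freq="h", tz="UTC")

# pandas can recover the step from a regular index; the alias spelling
# ("h" or "H") depends on the pandas version.
inferred_freq = pd.infer_freq(idx)

# The timezone is carried on the index itself.
inferred_tz = str(idx.tz)

# With one timestamp removed the spacing is no longer regular, so
# inference returns None and the frequency has to be given explicitly.
irregular = idx.delete(5)
irregular_freq = pd.infer_freq(irregular)

print(inferred_freq, inferred_tz, irregular_freq)
```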

Pandas — TimeSeriesTable round-trip

[5]:
rng = np.random.default_rng(42)
table = tdm.TimeSeriesTable(
    tdm.Frequency.PT1H,
    timestamps=timestamps,
    values=np.column_stack([
        100 + rng.normal(0, 10, 24),
        50 + rng.normal(0, 5, 24),
    ]),
    names=["wind", "solar"],
    units=["MW", "MW"],
)

df_table = table.to_pandas_dataframe()
print(f"Columns: {list(df_table.columns)}")

table_back = tdm.TimeSeriesTable.from_pandas(df_table, units=["MW", "MW"])
print(f"Round-trip equals: {table.equals(table_back)}")

Polars (optional)

If polars is installed, you get to_polars_dataframe() and from_polars().

[6]:
try:
    df_pl = ts.to_polars_dataframe()
    print(df_pl.head())

    ts_from_pl = tdm.TimeSeries.from_polars(df_pl, tdm.Frequency.PT1H, unit="MW")
    print(f"\nRound-trip length: {len(ts_from_pl)}")
except ImportError:
    print("polars not installed — skip this cell or: pip install timedatamodel[polars]")
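
If you would rather check for the optional dependency up front than catch ImportError, the standard library can do it. This is a minimal sketch; has_polars is a hypothetical helper, not part of timedatamodel.

```python
import importlib.util


def has_polars() -> bool:
    """Return True if the polars package can be found in this environment."""
    return importlib.util.find_spec("polars") is not None


if has_polars():
    print("polars available: to_polars_dataframe() and from_polars() can be used")
else:
    print("polars missing: pip install timedatamodel[polars]")
```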

JSON serialization

to_json() produces a JSON string with ISO-8601 timestamps. from_json() reconstructs the series with full metadata.

[7]:
json_str = ts.to_json()
print(json_str[:200], "...")
[8]:
ts_restored = tdm.TimeSeries.from_json(json_str)
print(f"Name:      {ts_restored.name}")
print(f"Unit:      {ts_restored.unit}")
print(f"Frequency: {ts_restored.frequency}")
print(f"Length:    {len(ts_restored)}")
print(f"Equals original: {ts.equals(ts_restored)}")
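
The following standard-library sketch shows why ISO-8601 strings make a lossless JSON round-trip possible for timezone-aware timestamps. The payload layout is hypothetical and chosen for illustration; it is not the schema that to_json() emits.

```python
import json
from datetime import datetime, timedelta, timezone

base = datetime(2024, 1, 15, tzinfo=timezone.utc)
timestamps = [base + timedelta(hours=i) for i in range(3)]
values = [100.0, None, 120.0]

# Hypothetical payload layout, for illustration only.
payload = {
    "name": "power",
    "unit": "MW",
    "timestamps": [t.isoformat() for t in timestamps],
    "values": values,  # None is written as JSON null
}
json_str = json.dumps(payload)

# Parse the string back and rebuild timezone-aware datetimes.
restored = json.loads(json_str)
restored_timestamps = [datetime.fromisoformat(s) for s in restored["timestamps"]]

print(restored["timestamps"][0])          # 2024-01-15T00:00:00+00:00
print(restored_timestamps == timestamps)  # True
print(restored["values"])                 # [100.0, None, 120.0]
```

Because the UTC offset is part of each string, the timezone survives the round-trip without any side channel, and metadata such as name and unit can travel in the same document.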

JSON for TimeSeriesTable

[9]:
json_table = table.to_json()
table_restored = tdm.TimeSeriesTable.from_json(json_table)
print(f"Columns: {table_restored.column_names}")
print(f"Equals original: {table.equals(table_restored)}")

CSV serialization

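to_csv() and from_csv() write and read simple CSV files. The file holds only timestamps and values, so the frequency and unit are passed again when reading.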

[10]:
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "power.csv"
    ts.to_csv(csv_path)

    with open(csv_path) as f:
        for i, line in enumerate(f):
            print(line.rstrip())
            if i >= 4:
                print("...")
                break
[11]:
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "power.csv"
    ts.to_csv(csv_path)
    ts_from_csv = tdm.TimeSeries.from_csv(csv_path, tdm.Frequency.PT1H, unit="MW")

print(f"Length: {len(ts_from_csv)}")
print(f"Name:   {ts_from_csv.name}")
ts_from_csv.head()
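
As a rough illustration of why from_csv() takes the frequency and unit as arguments, the sketch below writes and reads a two-column file with the standard library. The column names are assumptions made for this example, not the header that to_csv() writes.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

base = datetime(2024, 1, 15, tzinfo=timezone.utc)
rows = [(base + timedelta(hours=i), 100.0 + i) for i in range(3)]

# Write a hypothetical two-column layout into an in-memory buffer.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["timestamp", "power"])
for stamp, value in rows:
    writer.writerow([stamp.isoformat(), value])

# Read it back: every field arrives as text and must be parsed again.
buffer.seek(0)
reader = csv.DictReader(buffer)
restored = [
    (datetime.fromisoformat(r["timestamp"]), float(r["power"])) for r in reader
]

print(restored == rows)  # True
```

Nothing in such a file records the unit or the frequency, so the reader has to supply them, which matches the "Timestamps + values" entry in the summary.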

Multi-index round-trip

TimeSeries supports tuple-based timestamps for hierarchical indexing. This is preserved through pandas and JSON round-trips.

[12]:
issue_times = [
    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),
]
valid_times = [
    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 2, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 13, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 14, tzinfo=timezone.utc),
]

ts_multi = tdm.TimeSeries(
    tdm.Frequency.PT1H,
    timestamps=list(zip(issue_times, valid_times)),
    values=[100.0, 105.0, 110.0, 95.0, 100.0, 108.0],
    name="forecast",
    unit="MW",
    index_names=["issue_time", "valid_time"],
)

df_multi = ts_multi.to_pandas_dataframe()
df_multi
[13]:
ts_multi_back = tdm.TimeSeries.from_pandas(df_multi, tdm.Frequency.PT1H, unit="MW")
print(f"Index names: {ts_multi_back.index_names}")
print(f"Multi-index: {ts_multi_back.is_multi_index}")
print(f"Equals original: {ts_multi.equals(ts_multi_back)}")
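
For readers new to hierarchical indexes, this pandas-only sketch shows the structure that tuple timestamps map onto: one index level per tuple position, with the level names taken from index_names. It builds the DataFrame by hand and does not call timedatamodel.

```python
from datetime import datetime, timezone

import pandas as pd

issue = datetime(2024, 1, 15, 0, tzinfo=timezone.utc)
pairs = [(issue, datetime(2024, 1, 15, h, tzinfo=timezone.utc)) for h in range(3)]

# Each (issue_time, valid_time) tuple becomes one row of a two-level index.
index = pd.MultiIndex.from_tuples(pairs, names=["issue_time", "valid_time"])
df = pd.DataFrame({"forecast": [100.0, 105.0, 110.0]}, index=index)

print(list(df.index.names))  # ['issue_time', 'valid_time']
print(df.index.nlevels)      # 2

# Selecting one issue time returns that forecast run, indexed by valid_time.
run = df.xs(pd.Timestamp(issue), level="issue_time")
print(len(run))  # 3
```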

Summary

Format   Export                  Import          Metadata preserved
NumPy    to_numpy()              -               Values only
pandas   to_pandas_dataframe()   from_pandas()   Name, freq, tz
polars   to_polars_dataframe()   from_polars()   Name
JSON     to_json()               from_json()     Full
CSV      to_csv()                from_csv()      Timestamps + values

Next up: nb_09 covers geographical support — GeoLocation, GeoArea, and spatial queries.