Unit Handling and Validation
- Setting and inspecting units on TimeSeriesList and TimeSeriesTable
- Converting between compatible units with convert_unit()
- Automatic unit conversion in arithmetic operations
- Resolving units to pint objects with pint_unit
- Validating timestamps and frequency with validate()
- Using DataType, TimeSeriesType, and custom attributes
[1]:
from datetime import datetime, timedelta, timezone
import numpy as np
import timedatamodel as tdm
base = datetime(2024, 1, 15, tzinfo=timezone.utc)
timestamps = [base + timedelta(hours=i) for i in range(24)]
rng = np.random.default_rng(42)
Setting units on a TimeSeriesList
The unit parameter is a free-form string. It appears in the repr and is carried through all operations.
[2]:
wind = tdm.TimeSeriesList(
    tdm.Frequency.PT1H,
    timezone="UTC",
    timestamps=timestamps,
    values=(8 + rng.normal(0, 2, 24)).tolist(),
    name="wind_speed",
    unit="m/s",
)
wind
[2]:
| timestamp | wind_speed |
|---|---|
| 2024-01-15 00:00 | 8.60943 |
| 2024-01-15 01:00 | 5.92003 |
| 2024-01-15 02:00 | 9.5009 |
| … | … |
| 2024-01-15 21:00 | 6.63814 |
| 2024-01-15 22:00 | 10.4451 |
| 2024-01-15 23:00 | 7.69094 |
[3]:
print(f"Unit: {wind.unit}")
Unit: m/s
Converting units with convert_unit()
convert_unit() uses pint under the hood to convert values. It returns a new TimeSeriesList; the original is unchanged. Pint support is installed with: pip install timedatamodel[pint]
[4]:
wind_kmh = wind.convert_unit("km/h")
wind_knot = wind.convert_unit("knot")
print(f"Original: {wind.unit:5s} mean={np.nanmean(wind.arr):.2f}")
print(f"Converted: {wind_kmh.unit:5s} mean={np.nanmean(wind_kmh.arr):.2f}")
print(f"Converted: {wind_knot.unit:5s} mean={np.nanmean(wind_knot.arr):.2f}")
Original: m/s mean=7.96
Converted: km/h mean=28.66
Converted: knot mean=15.48
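The printed means can be checked by hand. The sketch below redoes the speed conversions with hard-coded factors (1 km/h is 1000 m per 3600 s, 1 knot is 1852 m per 3600 s). The TO_M_PER_S table and the convert_speed helper are illustrative names and are not part of timedatamodel or pint.

```python
# Hand-rolled check of the speed conversions above.
# TO_M_PER_S and convert_speed are hypothetical helpers for illustration only;
# the library delegates this work to pint.
TO_M_PER_S = {
    "m/s": 1.0,
    "km/h": 1000.0 / 3600.0,  # one kilometre per hour, in metres per second
    "knot": 1852.0 / 3600.0,  # one nautical mile (1852 m) per hour
}


def convert_speed(values, source, target):
    """Return a new list in the target unit; the input list is left untouched."""
    factor = TO_M_PER_S[source] / TO_M_PER_S[target]
    return [v * factor for v in values]


speeds_ms = [7.96]  # the rounded mean printed above
print(convert_speed(speeds_ms, "m/s", "km/h"))
print(convert_speed(speeds_ms, "m/s", "knot"))
```

Starting from the rounded mean of 7.96 m/s, this gives roughly 28.66 km/h and 15.47 knots. The last digit of the knot figure differs from the cell output because the notebook converts the unrounded values before averaging.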
[5]:
energy_kwh = tdm.TimeSeriesList(
    tdm.Frequency.PT1H,
    timezone="UTC",
    timestamps=timestamps,
    values=(500 + rng.normal(0, 50, 24)).tolist(),
    name="energy",
    unit="kWh",
)
energy_mwh = energy_kwh.convert_unit("MWh")
energy_j = energy_kwh.convert_unit("J")
print(f"kWh: mean={np.nanmean(energy_kwh.arr):.1f}")
print(f"MWh: mean={np.nanmean(energy_mwh.arr):.4f}")
print(f"J: mean={np.nanmean(energy_j.arr):.0f}")
kWh: mean=508.9
MWh: mean=0.5089
J: mean=1832028606
Incompatible units raise an error
[6]:
try:
    wind.convert_unit("MW")
except ValueError as e:
    print(f"Error: {e}")
Error: cannot convert 'm/s' to 'MW': incompatible dimensions
[7]:
no_unit = tdm.TimeSeriesList(
    tdm.Frequency.PT1H, timezone="UTC",
    timestamps=timestamps,
    values=rng.normal(0, 1, 24).tolist(),
    name="dimensionless",
)
try:
    no_unit.convert_unit("MW")
except ValueError as e:
    print(f"Error: {e}")
Error: cannot convert units: source unit is None
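Both errors above come down to a dimension check before any arithmetic happens. A minimal sketch of that idea follows, assuming a tiny hand-written unit table. UNITS and conversion_factor are hypothetical names, and pint's real registry is far more general than this.

```python
# Hypothetical unit table: each unit maps to (dimension, factor to a base unit).
UNITS = {
    "m/s": ("speed", 1.0),
    "km/h": ("speed", 1000.0 / 3600.0),
    "kW": ("power", 1e3),
    "MW": ("power", 1e6),
}


def conversion_factor(source, target):
    """Return the multiplier from source to target, or raise ValueError."""
    if source is None:
        raise ValueError("cannot convert units: source unit is None")
    src_dim, src_factor = UNITS[source]
    dst_dim, dst_factor = UNITS[target]
    if src_dim != dst_dim:
        raise ValueError(
            f"cannot convert {source!r} to {target!r}: incompatible dimensions"
        )
    return src_factor / dst_factor


for pair in [("MW", "kW"), ("m/s", "MW"), (None, "MW")]:
    try:
        print(pair, "->", conversion_factor(*pair))
    except ValueError as exc:
        print(pair, "-> Error:", exc)
```

Only the first pair yields a factor. The other two fail before any value is touched, which is the behaviour the two cells above demonstrate.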
Automatic unit conversion in arithmetic
When you add or subtract two TimeSeriesList objects with compatible units, the right operand's values are automatically converted to the left operand's unit, and the result carries that unit.
[8]:
power_mw = tdm.TimeSeriesList(
    tdm.Frequency.PT1H, timezone="UTC",
    timestamps=timestamps,
    values=(100 + rng.normal(0, 10, 24)).tolist(),
    name="plant_a",
    unit="MW",
)
power_kw = tdm.TimeSeriesList(
    tdm.Frequency.PT1H, timezone="UTC",
    timestamps=timestamps,
    values=(50000 + rng.normal(0, 5000, 24)).tolist(),
    name="plant_b",
    unit="kW",
)
total = power_mw + power_kw
print(f"Result unit: {total.unit}")
print(f"plant_a mean: {np.nanmean(power_mw.arr):.1f} MW")
print(f"plant_b mean: {np.nanmean(power_kw.arr):.1f} kW = {np.nanmean(power_kw.arr)/1000:.1f} MW")
print(f"total mean: {np.nanmean(total.arr):.1f} MW")
Result unit: MW
plant_a mean: 98.4 MW
plant_b mean: 48948.6 kW = 48.9 MW
total mean: 147.4 MW
Mismatched unit presence (one has a unit, the other doesn’t) raises an error:
[9]:
try:
    _ = power_mw + no_unit
except ValueError as e:
    print(f"Error: {e}")
Error: unit mismatch: one operand has unit='MW' and the other has unit=None
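The left-operand rule and the mismatch check can be sketched on plain lists. This is an illustration under stated assumptions, not the library's code: TO_WATT and add_series are hypothetical, and treating two unitless operands as directly addable is an assumption the notebook does not confirm.

```python
# Hypothetical factors to watts; the library resolves units through pint instead.
TO_WATT = {"W": 1.0, "kW": 1e3, "MW": 1e6}


def add_series(left, left_unit, right, right_unit):
    """Add two equal-length value lists; the result is expressed in the left unit."""
    if (left_unit is None) != (right_unit is None):
        raise ValueError(
            f"unit mismatch: one operand has unit={left_unit!r} "
            f"and the other has unit={right_unit!r}"
        )
    if left_unit is None:
        factor = 1.0  # assumption: two unitless operands add without conversion
    else:
        factor = TO_WATT[right_unit] / TO_WATT[left_unit]
    return [a + b * factor for a, b in zip(left, right)], left_unit


total_values, total_unit = add_series([100.0, 98.0], "MW", [50000.0, 48000.0], "kW")
print(total_values, total_unit)
```

Swapping the operands would give the same physical totals expressed in kW, so the choice of left operand decides the unit of the result, not its meaning.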
Resolving units with pint_unit
The pint_unit property returns a pint.Unit object for programmatic inspection.
[10]:
pu = power_mw.pint_unit
print(f"pint unit: {pu}")
print(f"type: {type(pu).__name__}")
pint unit: megawatt
type: Unit
Units on TimeSeriesTable
TimeSeriesTable supports per-column units via the units parameter. convert_unit() can target a single column or all columns.
[11]:
table = tdm.TimeSeriesTable(
    tdm.Frequency.PT1H,
    timezone="UTC",
    timestamps=timestamps,
    values=np.column_stack([
        100 + rng.normal(0, 15, 24),
        8 + rng.normal(0, 2, 24),
    ]),
    names=["power", "wind_speed"],
    units=["MW", "m/s"],
)
table
[11]:
| timestamp | power | wind_speed |
|---|---|---|
| 2024-01-15 00:00 | 84.6475 | 6.37412 |
| 2024-01-15 01:00 | 102.689 | 7.16929 |
| 2024-01-15 02:00 | 103.3 | 6.77581 |
| … | … | … |
| 2024-01-15 21:00 | 85.1569 | 6.19415 |
| 2024-01-15 22:00 | 68.0193 | 9.86315 |
| 2024-01-15 23:00 | 104.016 | 8.7699 |
[12]:
table_kw = table.convert_unit("kW", column="power")
print(f"Original units: {table.units}")
print(f"After convert: {table_kw.units}")
print(f"Power mean: {table.arr[:, 0].mean():.1f} MW → {table_kw.arr[:, 0].mean():.1f} kW")
Original units: ['MW', 'm/s']
After convert: ['kW', 'm/s']
Power mean: 100.3 MW → 100261.2 kW
Validating timestamps and frequency
validate() checks that timestamps are strictly increasing and match the declared frequency. It returns a list of warnings, which is empty when no problems are found.
[13]:
good = tdm.TimeSeriesList(
    tdm.Frequency.PT1H, timezone="UTC",
    timestamps=timestamps,
    values=rng.normal(0, 1, 24).tolist(),
    name="clean",
)
warnings = good.validate()
print(f"Warnings: {warnings}")
Warnings: []
[14]:
gap_timestamps = timestamps[:12] + timestamps[14:]
gap_values = rng.normal(0, 1, len(gap_timestamps)).tolist()
gapped = tdm.TimeSeriesList(
    tdm.Frequency.PT1H, timezone="UTC",
    timestamps=gap_timestamps,
    values=gap_values,
    name="has_gap",
)
for w in gapped.validate():
    print(f"⚠ {w}")
⚠ inconsistent frequency at index 12: expected 1:00:00, got 3:00:00
[15]:
bad_order = timestamps[:12] + [timestamps[13], timestamps[12]] + timestamps[14:]
bad_values = rng.normal(0, 1, len(bad_order)).tolist()
unordered = tdm.TimeSeriesList(
    tdm.Frequency.PT1H, timezone="UTC",
    timestamps=bad_order,
    values=bad_values,
    name="unordered",
)
for w in unordered.validate():
    print(f"⚠ {w}")
⚠ inconsistent frequency at index 12: expected 1:00:00, got 2:00:00
⚠ timestamps not strictly increasing at index 13: 2024-01-15 13:00:00+00:00 >= 2024-01-15 12:00:00+00:00
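The two checks described above (strict ordering and a constant step) can be sketched with the standard library alone. check_timestamps is a hypothetical stand-in for validate(). It reports every offending index, so its output may not match the library's warnings line for line.

```python
from datetime import datetime, timedelta, timezone


def check_timestamps(timestamps, step):
    """Return a list of warning strings; an empty list means no problems found.

    Hypothetical stand-in for validate(): it reports every offending index,
    which may differ from what the library chooses to report.
    """
    warnings = []
    for i in range(1, len(timestamps)):
        prev, curr = timestamps[i - 1], timestamps[i]
        if curr <= prev:
            warnings.append(
                f"timestamps not strictly increasing at index {i}: {prev} >= {curr}"
            )
        elif curr - prev != step:
            warnings.append(
                f"inconsistent frequency at index {i}: expected {step}, got {curr - prev}"
            )
    return warnings


start = datetime(2024, 1, 15, tzinfo=timezone.utc)
hourly = [start + timedelta(hours=i) for i in range(24)]
with_gap = hourly[:12] + hourly[14:]  # drop 12:00 and 13:00

print(check_timestamps(hourly, timedelta(hours=1)))
for message in check_timestamps(with_gap, timedelta(hours=1)):
    print(message)
```

The clean series yields an empty list, and the gapped series yields a single warning at index 12 where the step jumps from one hour to three.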
Detecting missing values
The has_missing property returns True when any value is missing. Missing values are passed in as None and show up as NaN in the underlying array.
[16]:
values_with_gaps = rng.normal(100, 10, 24).tolist()
values_with_gaps[5] = None
values_with_gaps[18] = None
sparse = tdm.TimeSeriesList(
    tdm.Frequency.PT1H, timezone="UTC",
    timestamps=timestamps,
    values=values_with_gaps,
    name="sparse",
    unit="MW",
)
print(f"has_missing: {sparse.has_missing}")
print(f"NaN count: {np.isnan(sparse.arr).sum()}")
print(f"Length: {len(sparse)}")
has_missing: True
NaN count: 2
Length: 24
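The cell above passes None and then counts NaN in .arr, which suggests that None is mapped to NaN on the way in. A plain-Python sketch of that mapping follows; it is an assumption about the behaviour and is not taken from the library's source.

```python
import math

values = [101.2, None, 99.8, None, 100.5]

# Replace None with NaN so every entry is a float.
as_floats = [math.nan if v is None else float(v) for v in values]

has_missing = any(math.isnan(v) for v in as_floats)
nan_count = sum(math.isnan(v) for v in as_floats)
print(has_missing, nan_count, len(as_floats))

# A plain sum propagates NaN, so aggregates have to skip the missing entries.
present = [v for v in as_floats if not math.isnan(v)]
mean_ignoring_nan = sum(present) / len(present)
print(math.isnan(sum(as_floats)), mean_ignoring_nan)
```

This is also why the earlier cells use np.nanmean rather than a plain mean: a single NaN would otherwise turn the whole aggregate into NaN.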
DataType — classifying your data
The DataType enum communicates what kind of data a series holds.
[17]:
print("Available DataType values:")
for dt in tdm.DataType:
    print(f"  {dt.value}")
Available DataType values:
ACTUAL
OBSERVATION
DERIVED
CALCULATED
ESTIMATION
FORECAST
PREDICTION
SCENARIO
SIMULATION
RECONSTRUCTION
REFERENCE
BASELINE
BENCHMARK
IDEAL
[18]:
measured = tdm.TimeSeriesList(
    tdm.Frequency.PT1H, timezone="UTC",
    timestamps=timestamps,
    values=(100 + rng.normal(0, 10, 24)).tolist(),
    name="wind_measured",
    unit="MW",
    data_type=tdm.DataType.OBSERVATION,
)
forecast = tdm.TimeSeriesList(
    tdm.Frequency.PT1H, timezone="UTC",
    timestamps=timestamps,
    values=(105 + rng.normal(0, 15, 24)).tolist(),
    name="wind_forecast",
    unit="MW",
    data_type=tdm.DataType.FORECAST,
)
print(f"{measured.name}: data_type={measured.data_type}")
print(f"{forecast.name}: data_type={forecast.data_type}")
wind_measured: data_type=OBSERVATION
wind_forecast: data_type=FORECAST
TimeSeriesType — structural classification
TimeSeriesType describes the structural nature of the series.
[19]:
print("Available TimeSeriesType values:")
for tst in tdm.TimeSeriesType:
    print(f"  {tst.value}")
Available TimeSeriesType values:
FLAT
OVERLAPPING
[20]:
flat = tdm.TimeSeriesList(
    tdm.Frequency.PT1H, timezone="UTC",
    timestamps=timestamps,
    values=rng.normal(0, 1, 24).tolist(),
    name="flat_series",
    timeseries_type=tdm.TimeSeriesType.FLAT,
)
print(f"timeseries_type: {flat.timeseries_type}")
timeseries_type: FLAT
Custom attributes
The attributes dict stores arbitrary key-value metadata — source system, fuel type, model version, etc.
[21]:
rich = tdm.TimeSeriesList(
    tdm.Frequency.PT1H,
    timezone="UTC",
    timestamps=timestamps,
    values=(80 + rng.normal(0, 10, 24)).tolist(),
    name="wind_farm_alpha",
    unit="MW",
    description="Measured output from Wind Farm Alpha",
    data_type=tdm.DataType.OBSERVATION,
    timeseries_type=tdm.TimeSeriesType.FLAT,
    attributes={
        "source": "SCADA",
        "fuel": "wind",
        "capacity_mw": "120",
        "operator": "NorthWind Energy",
    },
)
print(f"Attributes: {rich.attributes}")
print(f"Capacity: {rich.attributes['capacity_mw']} MW")
Attributes: {'source': 'SCADA', 'fuel': 'wind', 'capacity_mw': '120', 'operator': 'NorthWind Energy'}
Capacity: 120 MW
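In the cell above every attribute value is a string, including the capacity. Whether non-string values are accepted is not shown here, so the sketch below takes the cautious route and parses the string before using it in a calculation. The sample readings are made up.

```python
attributes = {"source": "SCADA", "fuel": "wind", "capacity_mw": "120"}

capacity_mw = float(attributes["capacity_mw"])  # stored as the string "120"
output_mw = [80.0, 96.0, 60.0]                  # made-up sample readings

# Capacity factor: measured output as a fraction of installed capacity.
capacity_factor = [v / capacity_mw for v in output_mw]
print(capacity_factor)
```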
Frequency enum
Frequency is a StrEnum with helpers for calendar-based vs fixed-duration frequencies.
[22]:
print(f"{'Frequency':<8s} {'timedelta':<22s} {'calendar?'}")
print("-" * 45)
for f in tdm.Frequency:
    td = f.to_timedelta()
    td_str = str(td) if td else "-"
    print(f"{f.value:<8s} {td_str:<22s} {f.is_calendar_based}")
Frequency timedelta calendar?
---------------------------------------------
P1Y - True
P3M - True
P1M - True
P1W 7 days, 0:00:00 False
P1D 1 day, 0:00:00 False
PT1H 1:00:00 False
PT30M 0:30:00 False
PT15M 0:15:00 False
PT10M 0:10:00 False
PT5M 0:05:00 False
PT1M 0:01:00 False
PT1S 0:00:01 False
NONE - False
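The table shows why the two helpers exist: a month or a year has no single length, so those members have no timedelta. A trimmed-down sketch of that design follows. Freq, _CALENDAR and _FIXED are hypothetical names, and the real tdm.Frequency may be implemented differently.

```python
from datetime import timedelta
from enum import Enum


class Freq(str, Enum):
    """Hypothetical, trimmed-down stand-in for tdm.Frequency."""

    P1M = "P1M"
    P1D = "P1D"
    PT1H = "PT1H"
    PT15M = "PT15M"

    @property
    def is_calendar_based(self):
        return self in _CALENDAR

    def to_timedelta(self):
        """Return the fixed duration, or None for calendar-based members."""
        return _FIXED.get(self)


# Months vary between 28 and 31 days, so P1M has no fixed duration.
_CALENDAR = {Freq.P1M}
_FIXED = {
    Freq.P1D: timedelta(days=1),
    Freq.PT1H: timedelta(hours=1),
    Freq.PT15M: timedelta(minutes=15),
}

for f in Freq:
    print(f.value, f.to_timedelta(), f.is_calendar_based)
```

Returning None instead of an approximate duration forces callers to handle calendar-based steps explicitly, which matches the "-" entries in the output above.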
Metadata survives serialization
Units, data types, attributes, and other metadata round-trip through JSON.
[23]:
json_str = rich.to_json()
restored = tdm.TimeSeriesList.from_json(json_str)
print(f"unit: {restored.unit}")
print(f"data_type: {restored.data_type}")
print(f"timeseries_type: {restored.timeseries_type}")
print(f"attributes: {restored.attributes}")
print(f"description: {restored.description}")
unit: MW
data_type: OBSERVATION
timeseries_type: FLAT
attributes: {'source': 'SCADA', 'fuel': 'wind', 'capacity_mw': '120', 'operator': 'NorthWind Energy'}
description: Measured output from Wind Farm Alpha
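The round trip can be pictured with the standard json module. The payload layout below is hypothetical, and the schema actually produced by to_json() may differ. The point is only that string metadata and missing values survive serialization unchanged.

```python
import json

# Hypothetical payload layout; the actual schema produced by to_json() may differ.
payload = {
    "name": "wind_farm_alpha",
    "unit": "MW",
    "data_type": "OBSERVATION",
    "attributes": {"source": "SCADA", "capacity_mw": "120"},
    "values": [80.5, None, 79.1],  # None serializes as JSON null
}

round_tripped = json.loads(json.dumps(payload))
print(round_tripped == payload)
```

Missing values are written as None here because Python's json module emits a bare NaN token for float NaN, which is not valid JSON.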
Summary
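| Feature | API |
|---|---|
| Set unit | TimeSeriesList(..., unit="MW") |
| Convert unit | wind.convert_unit("km/h") |
| Auto-convert in arithmetic | power_mw + power_kw |
| Pint integration | power_mw.pint_unit |
| Per-column units | TimeSeriesTable(..., units=["MW", "m/s"]) |
| Column conversion | table.convert_unit("kW", column="power") |
| Validate timestamps | good.validate() |
| Missing values | sparse.has_missing |
| Data classification | data_type=tdm.DataType.OBSERVATION |
| Structural type | timeseries_type=tdm.TimeSeriesType.FLAT |
| Custom metadata | attributes={...} |
| Frequency info | f.to_timedelta(), f.is_calendar_based |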
Next up: nb_04 covers arithmetic operations and comparisons on TimeSeriesList.