{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Unit Handling and Validation\n",
    "\n",
    "TimeDataModel treats **units**, **data types**, and **validation** as first-class concerns.  \n",
    "This notebook covers:\n",
    "\n",
    "1. Setting and inspecting units on `TimeSeriesList` and `TimeSeriesTable`\n",
    "2. Converting between compatible units with `convert_unit()`\n",
    "3. Automatic unit conversion in arithmetic operations\n",
    "4. Resolving units to pint objects with `pint_unit`\n",
    "5. Validating timestamps and frequency with `validate()`\n",
    "6. Detecting missing values with `has_missing`\n",
    "7. Using `DataType`, `TimeSeriesType`, `Frequency`, and custom `attributes`\n",
    "8. Round-tripping metadata through JSON"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.596766Z",
     "iopub.status.busy": "2026-03-04T19:11:11.596567Z",
     "iopub.status.idle": "2026-03-04T19:11:11.640126Z",
     "shell.execute_reply": "2026-03-04T19:11:11.639715Z"
    }
   },
   "outputs": [],
   "source": [
    "from datetime import datetime, timedelta, timezone\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "import timedatamodel as tdm\n",
    "\n",
    "base = datetime(2024, 1, 15, tzinfo=timezone.utc)\n",
    "timestamps = [base + timedelta(hours=i) for i in range(24)]\n",
    "rng = np.random.default_rng(42)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting units on a TimeSeriesList\n",
    "\n",
    "The `unit` parameter is a free-form string. It appears in the repr and is carried through all operations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.641521Z",
     "iopub.status.busy": "2026-03-04T19:11:11.641414Z",
     "iopub.status.idle": "2026-03-04T19:11:11.644619Z",
     "shell.execute_reply": "2026-03-04T19:11:11.644322Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>\n",
       ".ts-repr { font-family: monospace; font-size: 13px; max-width: 640px; display: inline-grid; }\n",
       ".ts-repr .ts-header {\n",
       "  font-weight: bold; font-size: 14px;\n",
       "  padding: 6px 10px; border-bottom: 2px solid #4a4a4a;\n",
       "  background: #f0f0f0; color: #1a1a1a;\n",
       "}\n",
       ".ts-repr .ts-meta { padding: 6px 10px; background: #fafafa; overflow: hidden; min-width: 0; }\n",
       ".ts-repr .ts-meta table { border-collapse: collapse; width: 100%; table-layout: fixed; }\n",
       ".ts-repr .ts-meta td { padding: 1px 8px 1px 0; white-space: nowrap; }\n",
       ".ts-repr .ts-meta td:first-child { color: #475569; font-weight: 600; width: 90px; }\n",
       ".ts-repr .ts-meta td:last-child { color: #1a1a1a; overflow: hidden; text-overflow: ellipsis; }\n",
       ".ts-repr .ts-data { padding: 6px 10px; }\n",
       ".ts-repr .ts-data table {\n",
       "  border-collapse: collapse; text-align: right;\n",
       "}\n",
       ".ts-repr .ts-data th {\n",
       "  text-align: right; padding: 3px 10px; border-bottom: 1px solid #ccc;\n",
       "  color: #555; font-weight: 600;\n",
       "}\n",
       ".ts-repr .ts-data th.ts-idx { text-align: left; }\n",
       ".ts-repr .ts-data td { padding: 2px 10px; }\n",
       ".ts-repr .ts-data tr:hover { background: #f5f5f5; }\n",
       ".ts-repr .ts-data td:first-child { text-align: left; color: #1e293b; }\n",
       ".ts-repr .ts-data td.ts-idx { text-align: left; color: #1e293b; }\n",
       ".ts-repr .ts-ellipsis { text-align: center !important; color: #999; }\n",
       "@media (prefers-color-scheme: dark) {\n",
       "  .ts-repr .ts-header { background: #1e293b; color: #e2e8f0; border-color: #475569; }\n",
       "  .ts-repr .ts-meta { background: #0f172a; }\n",
       "  .ts-repr .ts-meta td:first-child { color: #94a3b8; }\n",
       "  .ts-repr .ts-meta td:last-child { color: #e2e8f0; }\n",
       "  .ts-repr .ts-data th { color: #94a3b8; border-color: #334155; }\n",
       "  .ts-repr .ts-data td { color: #e2e8f0; }\n",
       "  .ts-repr .ts-data td:first-child { color: #cbd5e1; }\n",
       "  .ts-repr .ts-data td.ts-idx { color: #cbd5e1; }\n",
       "  .ts-repr .ts-data tr:hover { background: #1e293b; }\n",
       "  .ts-repr .ts-ellipsis { color: #64748b; }\n",
       "}\n",
       "</style>\n",
       "<div class=\"ts-repr\">\n",
       "<div class=\"ts-header\">TimeSeriesList</div>\n",
       "<div class=\"ts-meta\"><table>\n",
       "<tr><td>Name</td><td>wind_speed</td></tr>\n",
       "<tr><td>Length</td><td>24</td></tr>\n",
       "<tr><td>Frequency</td><td>PT1H</td></tr>\n",
       "<tr><td>Timezone</td><td>UTC (+00:00)</td></tr>\n",
       "<tr><td>Unit</td><td>m/s</td></tr>\n",
       "</table></div>\n",
       "<div class=\"ts-data\"><table>\n",
       "<tr><th class=\"ts-idx\">timestamp</th><th>wind_speed</th></tr>\n",
       "<tr><td>2024-01-15 00:00</td><td>8.60943</td></tr>\n",
       "<tr><td>2024-01-15 01:00</td><td>5.92003</td></tr>\n",
       "<tr><td>2024-01-15 02:00</td><td>9.5009</td></tr>\n",
       "<tr><td class=\"ts-ellipsis\">&hellip;</td><td class=\"ts-ellipsis\">&hellip;</td></tr>\n",
       "<tr><td>2024-01-15 21:00</td><td>6.63814</td></tr>\n",
       "<tr><td>2024-01-15 22:00</td><td>10.4451</td></tr>\n",
       "<tr><td>2024-01-15 23:00</td><td>7.69094</td></tr>\n",
       "</table></div>\n",
       "</div>"
      ],
      "text/plain": [
       "TimeSeriesList\n",
       "┌──────────────────────────────────┐\n",
       "│  Name:             wind_speed    │\n",
       "│  Length:           24            │\n",
       "│  Frequency:        PT1H          │\n",
       "│  Timezone:         UTC (+00:00)  │\n",
       "│  Unit:             m/s           │\n",
       "├──────────────────────────────────┤\n",
       "│  2024-01-15 00:00  8.60943       │\n",
       "│  2024-01-15 01:00  5.92003       │\n",
       "│  2024-01-15 02:00   9.5009       │\n",
       "│  ...                   ...       │\n",
       "│  2024-01-15 21:00  6.63814       │\n",
       "│  2024-01-15 22:00  10.4451       │\n",
       "│  2024-01-15 23:00  7.69094       │\n",
       "└──────────────────────────────────┘"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "wind = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H,\n",
    "    timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=(8 + rng.normal(0, 2, 24)).tolist(),\n",
    "    name=\"wind_speed\",\n",
    "    unit=\"m/s\",\n",
    ")\n",
    "wind"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.645813Z",
     "iopub.status.busy": "2026-03-04T19:11:11.645739Z",
     "iopub.status.idle": "2026-03-04T19:11:11.647344Z",
     "shell.execute_reply": "2026-03-04T19:11:11.647062Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Unit: m/s\n"
     ]
    }
   ],
   "source": [
    "print(f\"Unit: {wind.unit}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Converting units with `convert_unit()`\n",
    "\n",
    "`convert_unit()` uses [pint](https://pint.readthedocs.io/) to convert values.  \n",
    "It returns a **new** `TimeSeriesList`; the original is unchanged.\n",
    "\n",
    "pint is an optional dependency, installed through the `pint` extra:\n",
    "\n",
    "```bash\n",
    "pip install \"timedatamodel[pint]\"\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.648235Z",
     "iopub.status.busy": "2026-03-04T19:11:11.648176Z",
     "iopub.status.idle": "2026-03-04T19:11:11.744139Z",
     "shell.execute_reply": "2026-03-04T19:11:11.743780Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Original:  m/s    mean=7.96\n",
      "Converted: km/h   mean=28.66\n",
      "Converted: knot   mean=15.48\n"
     ]
    }
   ],
   "source": [
    "wind_kmh = wind.convert_unit(\"km/h\")\n",
    "wind_knot = wind.convert_unit(\"knot\")\n",
    "\n",
    "print(f\"Original:  {wind.unit:5s}  mean={np.nanmean(wind.arr):.2f}\")\n",
    "print(f\"Converted: {wind_kmh.unit:5s}  mean={np.nanmean(wind_kmh.arr):.2f}\")\n",
    "print(f\"Converted: {wind_knot.unit:5s}  mean={np.nanmean(wind_knot.arr):.2f}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.745126Z",
     "iopub.status.busy": "2026-03-04T19:11:11.745056Z",
     "iopub.status.idle": "2026-03-04T19:11:11.747381Z",
     "shell.execute_reply": "2026-03-04T19:11:11.747069Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "kWh: mean=508.9\n",
      "MWh: mean=0.5089\n",
      "J:   mean=1832028606\n"
     ]
    }
   ],
   "source": [
    "energy_kwh = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H,\n",
    "    timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=(500 + rng.normal(0, 50, 24)).tolist(),\n",
    "    name=\"energy\",\n",
    "    unit=\"kWh\",\n",
    ")\n",
    "\n",
    "energy_mwh = energy_kwh.convert_unit(\"MWh\")\n",
    "energy_j = energy_kwh.convert_unit(\"J\")\n",
    "\n",
    "print(f\"kWh: mean={np.nanmean(energy_kwh.arr):.1f}\")\n",
    "print(f\"MWh: mean={np.nanmean(energy_mwh.arr):.4f}\")\n",
    "print(f\"J:   mean={np.nanmean(energy_j.arr):.0f}\")"
   ]
  },
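  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Incompatible units raise an error\n",
    "\n",
    "`convert_unit()` raises a `ValueError` when the target unit has a different dimension than the source, or when the series has no unit at all."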
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.748303Z",
     "iopub.status.busy": "2026-03-04T19:11:11.748249Z",
     "iopub.status.idle": "2026-03-04T19:11:11.749739Z",
     "shell.execute_reply": "2026-03-04T19:11:11.749399Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Error: cannot convert 'm/s' to 'MW': incompatible dimensions\n"
     ]
    }
   ],
   "source": [
    "try:\n",
    "    wind.convert_unit(\"MW\")\n",
    "except ValueError as e:\n",
    "    print(f\"Error: {e}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.750563Z",
     "iopub.status.busy": "2026-03-04T19:11:11.750508Z",
     "iopub.status.idle": "2026-03-04T19:11:11.752226Z",
     "shell.execute_reply": "2026-03-04T19:11:11.751857Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Error: cannot convert units: source unit is None\n"
     ]
    }
   ],
   "source": [
    "no_unit = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H, timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=rng.normal(0, 1, 24).tolist(),\n",
    "    name=\"dimensionless\",\n",
    ")\n",
    "\n",
    "try:\n",
    "    no_unit.convert_unit(\"MW\")\n",
    "except ValueError as e:\n",
    "    print(f\"Error: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Automatic unit conversion in arithmetic\n",
    "\n",
    "When you add or subtract two `TimeSeriesList` objects with compatible units, the right operand's values are automatically converted to the left operand's unit, and the result carries that unit."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.753056Z",
     "iopub.status.busy": "2026-03-04T19:11:11.752983Z",
     "iopub.status.idle": "2026-03-04T19:11:11.755221Z",
     "shell.execute_reply": "2026-03-04T19:11:11.754942Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Result unit: MW\n",
      "plant_a mean: 98.4 MW\n",
      "plant_b mean: 48948.6 kW = 48.9 MW\n",
      "total mean:   147.4 MW\n"
     ]
    }
   ],
   "source": [
    "power_mw = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H, timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=(100 + rng.normal(0, 10, 24)).tolist(),\n",
    "    name=\"plant_a\",\n",
    "    unit=\"MW\",\n",
    ")\n",
    "\n",
    "power_kw = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H, timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=(50000 + rng.normal(0, 5000, 24)).tolist(),\n",
    "    name=\"plant_b\",\n",
    "    unit=\"kW\",\n",
    ")\n",
    "\n",
    "total = power_mw + power_kw\n",
    "print(f\"Result unit: {total.unit}\")\n",
    "print(f\"plant_a mean: {np.nanmean(power_mw.arr):.1f} MW\")\n",
    "print(f\"plant_b mean: {np.nanmean(power_kw.arr):.1f} kW = {np.nanmean(power_kw.arr)/1000:.1f} MW\")\n",
    "print(f\"total mean:   {np.nanmean(total.arr):.1f} MW\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A mismatch in unit *presence* (one operand has a unit, the other doesn't) raises a `ValueError`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.756220Z",
     "iopub.status.busy": "2026-03-04T19:11:11.756145Z",
     "iopub.status.idle": "2026-03-04T19:11:11.757663Z",
     "shell.execute_reply": "2026-03-04T19:11:11.757372Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Error: unit mismatch: one operand has unit='MW' and the other has unit=None\n"
     ]
    }
   ],
   "source": [
    "try:\n",
    "    _ = power_mw + no_unit\n",
    "except ValueError as e:\n",
    "    print(f\"Error: {e}\")"
   ]
  },
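  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The rule used in this section (the right operand is converted to the left operand's unit, and a unit on only one side is an error) can be sketched in plain Python. This is an illustration only, not TimeDataModel's implementation: `add_with_units` and `WATTS_PER_UNIT` are hypothetical names, and the hard-coded factor table stands in for the lookup that pint performs.\n",
    "\n",
    "```python\n",
    "# Watts per unit: a tiny stand-in for a real unit registry (assumption).\n",
    "WATTS_PER_UNIT = {'W': 1.0, 'kW': 1e3, 'MW': 1e6}\n",
    "\n",
    "\n",
    "def add_with_units(left_values, left_unit, right_values, right_unit):\n",
    "    # Hypothetical helper: the result keeps the left operand's unit.\n",
    "    if (left_unit is None) != (right_unit is None):\n",
    "        raise ValueError('unit mismatch: only one operand has a unit')\n",
    "    factor = 1.0\n",
    "    if left_unit is not None:\n",
    "        # Scale the right operand into the left operand's unit.\n",
    "        factor = WATTS_PER_UNIT[right_unit] / WATTS_PER_UNIT[left_unit]\n",
    "    summed = [a + b * factor for a, b in zip(left_values, right_values)]\n",
    "    return summed, left_unit\n",
    "\n",
    "\n",
    "total_values, total_unit = add_with_units(\n",
    "    [100.0, 90.0], 'MW', [50000.0, 40000.0], 'kW'\n",
    ")\n",
    "print(total_values, total_unit)\n",
    "```"
   ]
  },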
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Resolving units with `pint_unit`\n",
    "\n",
    "The `pint_unit` property returns a `pint.Unit` object for programmatic inspection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.758490Z",
     "iopub.status.busy": "2026-03-04T19:11:11.758431Z",
     "iopub.status.idle": "2026-03-04T19:11:11.760136Z",
     "shell.execute_reply": "2026-03-04T19:11:11.759787Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "pint unit: megawatt\n",
      "type:      Unit\n"
     ]
    }
   ],
   "source": [
    "pu = power_mw.pint_unit\n",
    "print(f\"pint unit: {pu}\")\n",
    "print(f\"type:      {type(pu).__name__}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Units on TimeSeriesTable\n",
    "\n",
    "`TimeSeriesTable` supports per-column units via the `units` parameter.  \n",
    "`convert_unit()` can target a single column or all columns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.760975Z",
     "iopub.status.busy": "2026-03-04T19:11:11.760925Z",
     "iopub.status.idle": "2026-03-04T19:11:11.763084Z",
     "shell.execute_reply": "2026-03-04T19:11:11.762748Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>\n",
       ".ts-repr { font-family: monospace; font-size: 13px; max-width: 640px; display: inline-grid; }\n",
       ".ts-repr .ts-header {\n",
       "  font-weight: bold; font-size: 14px;\n",
       "  padding: 6px 10px; border-bottom: 2px solid #4a4a4a;\n",
       "  background: #f0f0f0; color: #1a1a1a;\n",
       "}\n",
       ".ts-repr .ts-meta { padding: 6px 10px; background: #fafafa; overflow: hidden; min-width: 0; }\n",
       ".ts-repr .ts-meta table { border-collapse: collapse; width: 100%; table-layout: fixed; }\n",
       ".ts-repr .ts-meta td { padding: 1px 8px 1px 0; white-space: nowrap; }\n",
       ".ts-repr .ts-meta td:first-child { color: #475569; font-weight: 600; width: 90px; }\n",
       ".ts-repr .ts-meta td:last-child { color: #1a1a1a; overflow: hidden; text-overflow: ellipsis; }\n",
       ".ts-repr .ts-data { padding: 6px 10px; }\n",
       ".ts-repr .ts-data table {\n",
       "  border-collapse: collapse; text-align: right;\n",
       "}\n",
       ".ts-repr .ts-data th {\n",
       "  text-align: right; padding: 3px 10px; border-bottom: 1px solid #ccc;\n",
       "  color: #555; font-weight: 600;\n",
       "}\n",
       ".ts-repr .ts-data th.ts-idx { text-align: left; }\n",
       ".ts-repr .ts-data td { padding: 2px 10px; }\n",
       ".ts-repr .ts-data tr:hover { background: #f5f5f5; }\n",
       ".ts-repr .ts-data td:first-child { text-align: left; color: #1e293b; }\n",
       ".ts-repr .ts-data td.ts-idx { text-align: left; color: #1e293b; }\n",
       ".ts-repr .ts-ellipsis { text-align: center !important; color: #999; }\n",
       "@media (prefers-color-scheme: dark) {\n",
       "  .ts-repr .ts-header { background: #1e293b; color: #e2e8f0; border-color: #475569; }\n",
       "  .ts-repr .ts-meta { background: #0f172a; }\n",
       "  .ts-repr .ts-meta td:first-child { color: #94a3b8; }\n",
       "  .ts-repr .ts-meta td:last-child { color: #e2e8f0; }\n",
       "  .ts-repr .ts-data th { color: #94a3b8; border-color: #334155; }\n",
       "  .ts-repr .ts-data td { color: #e2e8f0; }\n",
       "  .ts-repr .ts-data td:first-child { color: #cbd5e1; }\n",
       "  .ts-repr .ts-data td.ts-idx { color: #cbd5e1; }\n",
       "  .ts-repr .ts-data tr:hover { background: #1e293b; }\n",
       "  .ts-repr .ts-ellipsis { color: #64748b; }\n",
       "}\n",
       "</style>\n",
       "<div class=\"ts-repr\">\n",
       "<div class=\"ts-header\">TimeSeriesTable</div>\n",
       "<div class=\"ts-meta\"><table>\n",
       "<tr><td>Name</td><td>unnamed</td></tr>\n",
       "<tr><td>Columns</td><td>power, wind_speed</td></tr>\n",
       "<tr><td>Length</td><td>24 × 2</td></tr>\n",
       "<tr><td>Frequency</td><td>PT1H</td></tr>\n",
       "<tr><td>Timezone</td><td>UTC (+00:00)</td></tr>\n",
       "<tr><td>Unit</td><td>MW, m/s</td></tr>\n",
       "</table></div>\n",
       "<div class=\"ts-data\"><table>\n",
       "<tr><th class=\"ts-idx\">timestamp</th><th>power</th><th>wind_speed</th></tr>\n",
       "<tr><td>2024-01-15 00:00</td><td>84.6475</td><td>6.37412</td></tr>\n",
       "<tr><td>2024-01-15 01:00</td><td>102.689</td><td>7.16929</td></tr>\n",
       "<tr><td>2024-01-15 02:00</td><td>103.3</td><td>6.77581</td></tr>\n",
       "<tr><td class=\"ts-ellipsis\">&hellip;</td><td class=\"ts-ellipsis\">&hellip;</td><td class=\"ts-ellipsis\">&hellip;</td></tr>\n",
       "<tr><td>2024-01-15 21:00</td><td>85.1569</td><td>6.19415</td></tr>\n",
       "<tr><td>2024-01-15 22:00</td><td>68.0193</td><td>9.86315</td></tr>\n",
       "<tr><td>2024-01-15 23:00</td><td>104.016</td><td>8.7699</td></tr>\n",
       "</table></div>\n",
       "</div>"
      ],
      "text/plain": [
       "TimeSeriesTable\n",
       "┌─────────────────────────────────────────┐\n",
       "│  Name:             unnamed              │\n",
       "│  Columns:          power, wind_speed    │\n",
       "│  Length:           24 × 2               │\n",
       "│  Frequency:        PT1H                 │\n",
       "│  Timezone:         UTC (+00:00)         │\n",
       "│  Unit:             MW, m/s              │\n",
       "├─────────────────────────────────────────┤\n",
       "│                      power  wind_speed  │\n",
       "│  2024-01-15 00:00  84.6475     6.37412  │\n",
       "│  2024-01-15 01:00  102.689     7.16929  │\n",
       "│  2024-01-15 02:00    103.3     6.77581  │\n",
       "│  ...                   ...         ...  │\n",
       "│  2024-01-15 21:00  85.1569     6.19415  │\n",
       "│  2024-01-15 22:00  68.0193     9.86315  │\n",
       "│  2024-01-15 23:00  104.016      8.7699  │\n",
       "└─────────────────────────────────────────┘"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table = tdm.TimeSeriesTable(\n",
    "    tdm.Frequency.PT1H,\n",
    "    timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=np.column_stack([\n",
    "        100 + rng.normal(0, 15, 24),\n",
    "        8 + rng.normal(0, 2, 24),\n",
    "    ]),\n",
    "    names=[\"power\", \"wind_speed\"],\n",
    "    units=[\"MW\", \"m/s\"],\n",
    ")\n",
    "table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.763876Z",
     "iopub.status.busy": "2026-03-04T19:11:11.763827Z",
     "iopub.status.idle": "2026-03-04T19:11:11.765424Z",
     "shell.execute_reply": "2026-03-04T19:11:11.765160Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Original units: ['MW', 'm/s']\n",
      "After convert:  ['kW', 'm/s']\n",
      "Power mean: 100.3 MW → 100261.2 kW\n"
     ]
    }
   ],
   "source": [
    "table_kw = table.convert_unit(\"kW\", column=\"power\")\n",
    "\n",
    "print(f\"Original units: {table.units}\")\n",
    "print(f\"After convert:  {table_kw.units}\")\n",
    "print(f\"Power mean: {table.arr[:, 0].mean():.1f} MW → {table_kw.arr[:, 0].mean():.1f} kW\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Validating timestamps and frequency\n",
    "\n",
    "`validate()` checks that timestamps are strictly increasing and match the declared frequency.  \n",
    "It returns a list of warning strings — an empty list means everything is consistent."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.766314Z",
     "iopub.status.busy": "2026-03-04T19:11:11.766244Z",
     "iopub.status.idle": "2026-03-04T19:11:11.767718Z",
     "shell.execute_reply": "2026-03-04T19:11:11.767402Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warnings: []\n"
     ]
    }
   ],
   "source": [
    "good = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H, timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=rng.normal(0, 1, 24).tolist(),\n",
    "    name=\"clean\",\n",
    ")\n",
    "\n",
    "warnings = good.validate()\n",
    "print(f\"Warnings: {warnings}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.768548Z",
     "iopub.status.busy": "2026-03-04T19:11:11.768495Z",
     "iopub.status.idle": "2026-03-04T19:11:11.770458Z",
     "shell.execute_reply": "2026-03-04T19:11:11.770117Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "⚠ inconsistent frequency at index 12: expected 1:00:00, got 3:00:00\n"
     ]
    }
   ],
   "source": [
    "gap_timestamps = timestamps[:12] + timestamps[14:]\n",
    "gap_values = rng.normal(0, 1, len(gap_timestamps)).tolist()\n",
    "\n",
    "gapped = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H, timezone=\"UTC\",\n",
    "    timestamps=gap_timestamps,\n",
    "    values=gap_values,\n",
    "    name=\"has_gap\",\n",
    ")\n",
    "\n",
    "for w in gapped.validate():\n",
    "    print(f\"⚠ {w}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.771231Z",
     "iopub.status.busy": "2026-03-04T19:11:11.771183Z",
     "iopub.status.idle": "2026-03-04T19:11:11.772864Z",
     "shell.execute_reply": "2026-03-04T19:11:11.772536Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "⚠ inconsistent frequency at index 12: expected 1:00:00, got 2:00:00\n",
      "⚠ timestamps not strictly increasing at index 13: 2024-01-15 13:00:00+00:00 >= 2024-01-15 12:00:00+00:00\n"
     ]
    }
   ],
   "source": [
    "bad_order = timestamps[:12] + [timestamps[13], timestamps[12]] + timestamps[14:]\n",
    "bad_values = rng.normal(0, 1, len(bad_order)).tolist()\n",
    "\n",
    "unordered = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H, timezone=\"UTC\",\n",
    "    timestamps=bad_order,\n",
    "    values=bad_values,\n",
    "    name=\"unordered\",\n",
    ")\n",
    "\n",
    "for w in unordered.validate():\n",
    "    print(f\"⚠ {w}\")"
   ]
  },
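  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two checks reported above, ordering and spacing, can be sketched with the standard library alone. This is a rough picture of the idea, not the library's code: `check_timestamps` is a hypothetical helper, and its messages and reporting rules may differ from what `validate()` returns.\n",
    "\n",
    "```python\n",
    "from datetime import datetime, timedelta, timezone\n",
    "\n",
    "\n",
    "def check_timestamps(timestamps, step):\n",
    "    # Hypothetical helper: collect one message for every neighbouring\n",
    "    # pair that breaks strict ordering or the expected spacing.\n",
    "    problems = []\n",
    "    for i in range(1, len(timestamps)):\n",
    "        delta = timestamps[i] - timestamps[i - 1]\n",
    "        if delta <= timedelta(0):\n",
    "            problems.append(f'not strictly increasing at index {i}')\n",
    "        elif delta != step:\n",
    "            problems.append(f'index {i}: expected {step}, got {delta}')\n",
    "    return problems\n",
    "\n",
    "\n",
    "start = datetime(2024, 1, 15, tzinfo=timezone.utc)\n",
    "hourly = [start + timedelta(hours=h) for h in range(6)]\n",
    "gapped_hours = hourly[:3] + hourly[5:]  # drop two samples\n",
    "\n",
    "clean_report = check_timestamps(hourly, timedelta(hours=1))\n",
    "gap_report = check_timestamps(gapped_hours, timedelta(hours=1))\n",
    "print(clean_report)\n",
    "print(gap_report)\n",
    "```"
   ]
  },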
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Detecting missing values\n",
    "\n",
    "The `has_missing` property returns `True` when any value is missing. Values passed as `None` are stored as NaN in the underlying array."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.773626Z",
     "iopub.status.busy": "2026-03-04T19:11:11.773583Z",
     "iopub.status.idle": "2026-03-04T19:11:11.775285Z",
     "shell.execute_reply": "2026-03-04T19:11:11.774994Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "has_missing: True\n",
      "NaN count:   2\n",
      "Length:      24\n"
     ]
    }
   ],
   "source": [
    "values_with_gaps = rng.normal(100, 10, 24).tolist()\n",
    "values_with_gaps[5] = None\n",
    "values_with_gaps[18] = None\n",
    "\n",
    "sparse = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H, timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=values_with_gaps,\n",
    "    name=\"sparse\",\n",
    "    unit=\"MW\",\n",
    ")\n",
    "\n",
    "print(f\"has_missing: {sparse.has_missing}\")\n",
    "print(f\"NaN count:   {np.isnan(sparse.arr).sum()}\")\n",
    "print(f\"Length:      {len(sparse)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## DataType — classifying your data\n",
    "\n",
    "The `DataType` enum records what kind of data a series holds, for example an observation, a forecast, or a simulation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.776310Z",
     "iopub.status.busy": "2026-03-04T19:11:11.776237Z",
     "iopub.status.idle": "2026-03-04T19:11:11.777803Z",
     "shell.execute_reply": "2026-03-04T19:11:11.777461Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Available DataType values:\n",
      "  ACTUAL\n",
      "  OBSERVATION\n",
      "  DERIVED\n",
      "  CALCULATED\n",
      "  ESTIMATION\n",
      "  FORECAST\n",
      "  PREDICTION\n",
      "  SCENARIO\n",
      "  SIMULATION\n",
      "  RECONSTRUCTION\n",
      "  REFERENCE\n",
      "  BASELINE\n",
      "  BENCHMARK\n",
      "  IDEAL\n"
     ]
    }
   ],
   "source": [
    "print(\"Available DataType values:\")\n",
    "for dt in tdm.DataType:\n",
    "    print(f\"  {dt.value}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.778627Z",
     "iopub.status.busy": "2026-03-04T19:11:11.778574Z",
     "iopub.status.idle": "2026-03-04T19:11:11.780618Z",
     "shell.execute_reply": "2026-03-04T19:11:11.780265Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "wind_measured: data_type=OBSERVATION\n",
      "wind_forecast: data_type=FORECAST\n"
     ]
    }
   ],
   "source": [
    "measured = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H, timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=(100 + rng.normal(0, 10, 24)).tolist(),\n",
    "    name=\"wind_measured\",\n",
    "    unit=\"MW\",\n",
    "    data_type=tdm.DataType.OBSERVATION,\n",
    ")\n",
    "\n",
    "forecast = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H, timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=(105 + rng.normal(0, 15, 24)).tolist(),\n",
    "    name=\"wind_forecast\",\n",
    "    unit=\"MW\",\n",
    "    data_type=tdm.DataType.FORECAST,\n",
    ")\n",
    "\n",
    "print(f\"{measured.name}: data_type={measured.data_type}\")\n",
    "print(f\"{forecast.name}: data_type={forecast.data_type}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## TimeSeriesType — structural classification\n",
    "\n",
    "`TimeSeriesType` describes the structure of a series. The enum has two members, `FLAT` and `OVERLAPPING`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.781490Z",
     "iopub.status.busy": "2026-03-04T19:11:11.781447Z",
     "iopub.status.idle": "2026-03-04T19:11:11.782885Z",
     "shell.execute_reply": "2026-03-04T19:11:11.782549Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Available TimeSeriesType values:\n",
      "  FLAT\n",
      "  OVERLAPPING\n"
     ]
    }
   ],
   "source": [
    "print(\"Available TimeSeriesType values:\")\n",
    "for tst in tdm.TimeSeriesType:\n",
    "    print(f\"  {tst.value}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.783632Z",
     "iopub.status.busy": "2026-03-04T19:11:11.783588Z",
     "iopub.status.idle": "2026-03-04T19:11:11.785139Z",
     "shell.execute_reply": "2026-03-04T19:11:11.784802Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "timeseries_type: FLAT\n"
     ]
    }
   ],
   "source": [
    "flat = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H, timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=rng.normal(0, 1, 24).tolist(),\n",
    "    name=\"flat_series\",\n",
    "    timeseries_type=tdm.TimeSeriesType.FLAT,\n",
    ")\n",
    "print(f\"timeseries_type: {flat.timeseries_type}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Custom attributes\n",
    "\n",
    "The `attributes` dict stores arbitrary key-value metadata — source system, fuel type, model version, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.786039Z",
     "iopub.status.busy": "2026-03-04T19:11:11.785959Z",
     "iopub.status.idle": "2026-03-04T19:11:11.787801Z",
     "shell.execute_reply": "2026-03-04T19:11:11.787538Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Attributes: {'source': 'SCADA', 'fuel': 'wind', 'capacity_mw': '120', 'operator': 'NorthWind Energy'}\n",
      "Capacity:   120 MW\n"
     ]
    }
   ],
   "source": [
    "rich = tdm.TimeSeriesList(\n",
    "    tdm.Frequency.PT1H,\n",
    "    timezone=\"UTC\",\n",
    "    timestamps=timestamps,\n",
    "    values=(80 + rng.normal(0, 10, 24)).tolist(),\n",
    "    name=\"wind_farm_alpha\",\n",
    "    unit=\"MW\",\n",
    "    description=\"Measured output from Wind Farm Alpha\",\n",
    "    data_type=tdm.DataType.OBSERVATION,\n",
    "    timeseries_type=tdm.TimeSeriesType.FLAT,\n",
    "    attributes={\n",
    "        \"source\": \"SCADA\",\n",
    "        \"fuel\": \"wind\",\n",
    "        \"capacity_mw\": \"120\",\n",
    "        \"operator\": \"NorthWind Energy\",\n",
    "    },\n",
    ")\n",
    "\n",
    "print(f\"Attributes: {rich.attributes}\")\n",
    "print(f\"Capacity:   {rich.attributes['capacity_mw']} MW\")"
   ]
  },
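  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Frequency enum\n",
    "\n",
    "`Frequency` is a `StrEnum` whose members use ISO 8601 duration notation (plus `NONE`). `to_timedelta()` gives the fixed step where one exists, and `is_calendar_based` flags the year, quarter, and month frequencies, whose length varies."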
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.788597Z",
     "iopub.status.busy": "2026-03-04T19:11:11.788554Z",
     "iopub.status.idle": "2026-03-04T19:11:11.790293Z",
     "shell.execute_reply": "2026-03-04T19:11:11.789975Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Frequency  timedelta               calendar?\n",
      "---------------------------------------------\n",
      "P1Y        -                       True\n",
      "P3M        -                       True\n",
      "P1M        -                       True\n",
      "P1W        7 days, 0:00:00         False\n",
      "P1D        1 day, 0:00:00          False\n",
      "PT1H       1:00:00                 False\n",
      "PT30M      0:30:00                 False\n",
      "PT15M      0:15:00                 False\n",
      "PT10M      0:10:00                 False\n",
      "PT5M       0:05:00                 False\n",
      "PT1M       0:01:00                 False\n",
      "PT1S       0:00:01                 False\n",
      "NONE       -                       False\n"
     ]
    }
   ],
   "source": [
    "# Width 9 fits the header label 'Frequency', so header and rows line up\n",
    "print(f\"{'Frequency':<9s}  {'timedelta':<22s}  {'calendar?'}\")\n",
    "print(\"-\" * 45)\n",
    "for f in tdm.Frequency:\n",
    "    td = f.to_timedelta()\n",
    "    td_str = str(td) if td else \"-\"\n",
    "    print(f\"{f.value:<9s}  {td_str:<22s}  {f.is_calendar_based}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Metadata survives serialization\n",
    "\n",
    "Units, data types, attributes, and other metadata round-trip through JSON."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-04T19:11:11.791053Z",
     "iopub.status.busy": "2026-03-04T19:11:11.791005Z",
     "iopub.status.idle": "2026-03-04T19:11:11.792644Z",
     "shell.execute_reply": "2026-03-04T19:11:11.792408Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "unit:            MW\n",
      "data_type:       OBSERVATION\n",
      "timeseries_type: FLAT\n",
      "attributes:      {'source': 'SCADA', 'fuel': 'wind', 'capacity_mw': '120', 'operator': 'NorthWind Energy'}\n",
      "description:     Measured output from Wind Farm Alpha\n"
     ]
    }
   ],
   "source": [
    "json_str = rich.to_json()\n",
    "restored = tdm.TimeSeriesList.from_json(json_str)\n",
    "\n",
    "print(f\"unit:            {restored.unit}\")\n",
    "print(f\"data_type:       {restored.data_type}\")\n",
    "print(f\"timeseries_type: {restored.timeseries_type}\")\n",
    "print(f\"attributes:      {restored.attributes}\")\n",
    "print(f\"description:     {restored.description}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "| Feature | API |\n",
    "| --- | --- |\n",
    "| Set unit | `TimeSeriesList(..., unit=\"MW\")` |\n",
    "| Convert unit | `ts.convert_unit(\"kW\")`, returns a new series |\n",
    "| Auto-convert in arithmetic | `ts_mw + ts_kw` converts to the left operand's unit |\n",
    "| Pint integration | `ts.pint_unit`, resolves to a `pint.Unit` |\n",
    "| Per-column units | `TimeSeriesTable(..., units=[\"MW\", \"m/s\"])` |\n",
    "| Column conversion | `table.convert_unit(\"kW\", column=\"power\")` |\n",
    "| Validate timestamps | `ts.validate()` → list of warning strings |\n",
    "| Missing values | `ts.has_missing` |\n",
    "| Data classification | `DataType.OBSERVATION`, `.FORECAST`, `.SCENARIO`, … |\n",
    "| Structural type | `TimeSeriesType.FLAT`, `.OVERLAPPING` |\n",
    "| Custom metadata | `attributes={\"key\": \"value\"}` |\n",
    "| Frequency info | `Frequency.PT1H.to_timedelta()`, `.is_calendar_based` |\n",
    "\n",
    "Next up: **nb_04** covers arithmetic operations and comparisons on `TimeSeriesList`."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.14.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
