{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# NumPy and Pandas Transforms\n",
    "\n",
    "TimeDataModel provides clean patterns for transforming time series data using numpy and pandas. Every `TimeSeries` and `TimeSeriesTable` exposes `.arr` (numpy array) and `.df` (pandas DataFrame) properties, plus dedicated methods (`update_arr()`, `update_df()`) for writing results back. This keeps your domain model structured while letting you use the full scientific Python ecosystem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.496002Z",
     "iopub.status.busy": "2026-03-01T13:36:43.495859Z",
     "iopub.status.idle": "2026-03-01T13:36:43.581351Z",
     "shell.execute_reply": "2026-03-01T13:36:43.580940Z"
    }
   },
   "outputs": [],
   "source": [
    "from datetime import datetime, timedelta, timezone\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "import timedatamodel as tdm\n",
    "\n",
    "base = datetime(2024, 1, 15, tzinfo=timezone.utc)\n",
    "timestamps = [base + timedelta(hours=i) for i in range(24)]\n",
    "\n",
    "ts = tdm.TimeSeries(\n",
    "    tdm.Frequency.PT1H,\n",
    "    timestamps=timestamps,\n",
    "    values=[\n",
    "        120.0, 115.0, 108.0, 105.0, 102.0, 100.0,\n",
    "        110.0, 135.0, 160.0, 175.0, 180.0, 178.0,\n",
    "        172.0, 170.0, 168.0, 165.0, 175.0, 190.0,\n",
    "        200.0, 195.0, 180.0, 165.0, 145.0, 130.0,\n",
    "    ],\n",
    "    name=\"power\",\n",
    "    unit=\"MW\",\n",
    "    data_type=tdm.DataType.MEASUREMENT,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The `.arr` and `.df` properties\n",
    "\n",
    "Every `TimeSeries` has two shorthand properties for quick access to the underlying data:\n",
    "- `ts.arr` — returns a numpy `ndarray` (same as `ts.to_numpy()`)\n",
    "- `ts.df` — returns a pandas `DataFrame` (same as `ts.to_pandas_dataframe()`)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.582388Z",
     "iopub.status.busy": "2026-03-01T13:36:43.582321Z",
     "iopub.status.idle": "2026-03-01T13:36:43.588658Z",
     "shell.execute_reply": "2026-03-01T13:36:43.588239Z"
    }
   },
   "outputs": [],
   "source": [
    "ts.arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.589572Z",
     "iopub.status.busy": "2026-03-01T13:36:43.589513Z",
     "iopub.status.idle": "2026-03-01T13:36:43.595937Z",
     "shell.execute_reply": "2026-03-01T13:36:43.595411Z"
    }
   },
   "outputs": [],
   "source": [
    "ts.df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pattern 1: `apply_numpy(func)`\n",
    "\n",
    "Pass a function that receives a numpy array and returns a numpy array of the same length. Timestamps, frequency, and all metadata are preserved automatically. For transforms that change the length, see Pattern 4."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.596962Z",
     "iopub.status.busy": "2026-03-01T13:36:43.596901Z",
     "iopub.status.idle": "2026-03-01T13:36:43.604225Z",
     "shell.execute_reply": "2026-03-01T13:36:43.603814Z"
    }
   },
   "outputs": [],
   "source": [
    "# Z-score normalization: subtract the mean, then divide by the standard deviation\n",
    "normalized = ts.apply_numpy(lambda arr: (arr - arr.mean()) / arr.std())\n",
    "normalized"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.605153Z",
     "iopub.status.busy": "2026-03-01T13:36:43.605089Z",
     "iopub.status.idle": "2026-03-01T13:36:43.611601Z",
     "shell.execute_reply": "2026-03-01T13:36:43.611203Z"
    }
   },
   "outputs": [],
   "source": [
    "cumulative = ts.apply_numpy(np.cumsum)\n",
    "cumulative.head(6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.612473Z",
     "iopub.status.busy": "2026-03-01T13:36:43.612418Z",
     "iopub.status.idle": "2026-03-01T13:36:43.618736Z",
     "shell.execute_reply": "2026-03-01T13:36:43.618358Z"
    }
   },
   "outputs": [],
   "source": [
    "clipped = ts.apply_numpy(lambda arr: np.clip(arr, 110, 180))\n",
    "clipped"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pattern 2: `apply_pandas(func)`\n",
    "\n",
    "Pass a function that receives a pandas DataFrame and returns a pandas DataFrame. This lets you use the full pandas API — rolling windows, resampling, interpolation, and more."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.619598Z",
     "iopub.status.busy": "2026-03-01T13:36:43.619548Z",
     "iopub.status.idle": "2026-03-01T13:36:43.631622Z",
     "shell.execute_reply": "2026-03-01T13:36:43.631226Z"
    }
   },
   "outputs": [],
   "source": [
    "rolling_mean = ts.apply_pandas(lambda df: df.rolling(6, min_periods=1).mean())\n",
    "rolling_mean"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.632565Z",
     "iopub.status.busy": "2026-03-01T13:36:43.632508Z",
     "iopub.status.idle": "2026-03-01T13:36:43.638750Z",
     "shell.execute_reply": "2026-03-01T13:36:43.638388Z"
    }
   },
   "outputs": [],
   "source": [
    "diff = ts.apply_pandas(lambda df: df.diff())\n",
    "diff"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.639580Z",
     "iopub.status.busy": "2026-03-01T13:36:43.639530Z",
     "iopub.status.idle": "2026-03-01T13:36:43.646083Z",
     "shell.execute_reply": "2026-03-01T13:36:43.645710Z"
    }
   },
   "outputs": [],
   "source": [
    "pct_change = ts.apply_pandas(lambda df: df.pct_change() * 100)\n",
    "pct_change.head(6)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pattern 3: One-liner round-trips with `update_arr()` and `update_df()`\n",
    "\n",
    "Combine `.arr` / `.df` with `update_arr()` / `update_df()` to transform data in a single expression. The result is a new `TimeSeries` with all metadata preserved."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.647035Z",
     "iopub.status.busy": "2026-03-01T13:36:43.646988Z",
     "iopub.status.idle": "2026-03-01T13:36:43.653377Z",
     "shell.execute_reply": "2026-03-01T13:36:43.653043Z"
    }
   },
   "outputs": [],
   "source": [
    "ts.update_arr(ts.arr.clip(110, 180))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.654435Z",
     "iopub.status.busy": "2026-03-01T13:36:43.654375Z",
     "iopub.status.idle": "2026-03-01T13:36:43.660721Z",
     "shell.execute_reply": "2026-03-01T13:36:43.660281Z"
    }
   },
   "outputs": [],
   "source": [
    "ts.update_df(ts.df.resample(\"3h\").mean())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.661658Z",
     "iopub.status.busy": "2026-03-01T13:36:43.661592Z",
     "iopub.status.idle": "2026-03-01T13:36:43.667851Z",
     "shell.execute_reply": "2026-03-01T13:36:43.667478Z"
    }
   },
   "outputs": [],
   "source": [
    "ts.update_df(ts.df.diff())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.668728Z",
     "iopub.status.busy": "2026-03-01T13:36:43.668674Z",
     "iopub.status.idle": "2026-03-01T13:36:43.674780Z",
     "shell.execute_reply": "2026-03-01T13:36:43.674418Z"
    }
   },
   "outputs": [],
   "source": [
    "ts.update_arr(np.cumsum(ts.arr))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pattern 4: Manual numpy round-trip\n",
    "\n",
    "For transformations where the output shape differs from the input, export to numpy, transform freely, and construct a new `TimeSeries`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.675713Z",
     "iopub.status.busy": "2026-03-01T13:36:43.675659Z",
     "iopub.status.idle": "2026-03-01T13:36:43.681874Z",
     "shell.execute_reply": "2026-03-01T13:36:43.681604Z"
    }
   },
   "outputs": [],
   "source": [
    "arr = ts.to_numpy()\n",
    "print(f\"Type:  {type(arr)}\")\n",
    "print(f\"Shape: {arr.shape}\")\n",
    "print(f\"Mean:  {arr.mean():.1f} MW\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.682754Z",
     "iopub.status.busy": "2026-03-01T13:36:43.682705Z",
     "iopub.status.idle": "2026-03-01T13:36:43.689985Z",
     "shell.execute_reply": "2026-03-01T13:36:43.689709Z"
    }
   },
   "outputs": [],
   "source": [
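    "window = 3\n",
    "# mode=\"valid\" keeps only full windows, so the output is window - 1 points shorter\n",
    "smoothed_arr = np.convolve(arr, np.ones(window) / window, mode=\"valid\")\n",
    "# Label each average with the timestamp of the last point in its window\n",
    "smoothed_timestamps = timestamps[window - 1 :]\n",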
    "\n",
    "ts_smoothed = tdm.TimeSeries(\n",
    "    tdm.Frequency.PT1H,\n",
    "    timestamps=smoothed_timestamps,\n",
    "    values=smoothed_arr.tolist(),\n",
    "    name=ts.name,\n",
    "    unit=ts.unit,\n",
    "    data_type=ts.data_type,\n",
    ")\n",
    "print(f\"Original length: {len(ts)}, Smoothed length: {len(ts_smoothed)}\")\n",
    "ts_smoothed.head(6)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pattern 5: Manual pandas round-trip\n",
    "\n",
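    "For multi-step pandas workflows where a one-liner would be hard to read, break the work into separate steps: export with `to_pandas_dataframe()`, transform the DataFrame, then write the result back with `update_from_pandas()`."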
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.690911Z",
     "iopub.status.busy": "2026-03-01T13:36:43.690854Z",
     "iopub.status.idle": "2026-03-01T13:36:43.696894Z",
     "shell.execute_reply": "2026-03-01T13:36:43.696562Z"
    }
   },
   "outputs": [],
   "source": [
    "df = ts.to_pandas_dataframe()\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.697761Z",
     "iopub.status.busy": "2026-03-01T13:36:43.697711Z",
     "iopub.status.idle": "2026-03-01T13:36:43.704526Z",
     "shell.execute_reply": "2026-03-01T13:36:43.704118Z"
    }
   },
   "outputs": [],
   "source": [
    "df_resampled = df.resample(\"3h\").mean()\n",
    "\n",
    "ts_resampled = ts.update_from_pandas(df_resampled)\n",
    "print(f\"Original:  {len(ts)} points\")\n",
    "print(f\"Resampled: {len(ts_resampled)} points\")\n",
    "print(f\"Unit preserved: {ts_resampled.unit}\")\n",
    "ts_resampled"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.705374Z",
     "iopub.status.busy": "2026-03-01T13:36:43.705313Z",
     "iopub.status.idle": "2026-03-01T13:36:43.711779Z",
     "shell.execute_reply": "2026-03-01T13:36:43.711389Z"
    }
   },
   "outputs": [],
   "source": [
    "df_ewm = df.ewm(span=6).mean()\n",
    "\n",
    "ts_ewm = ts.update_from_pandas(df_ewm)\n",
    "ts_ewm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Transforms on TimeSeriesTable\n",
    "\n",
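    "All patterns (`apply_*`, `update_arr()`, `update_df()`, `.arr`, `.df`) also work on `TimeSeriesTable`, applying across all columns. In the numpy examples below the array is two-dimensional (one row per timestamp, one column per series), so per-column statistics use `axis=0`."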
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.712684Z",
     "iopub.status.busy": "2026-03-01T13:36:43.712634Z",
     "iopub.status.idle": "2026-03-01T13:36:43.725639Z",
     "shell.execute_reply": "2026-03-01T13:36:43.725296Z"
    }
   },
   "outputs": [],
   "source": [
    "rng = np.random.default_rng(42)\n",
    "\n",
    "table = tdm.TimeSeriesTable(\n",
    "    tdm.Frequency.PT1H,\n",
    "    timestamps=timestamps,\n",
    "    values=np.column_stack([\n",
    "        80 + 40 * np.sin(np.linspace(0, 2 * np.pi, 24)) + rng.normal(0, 5, 24),\n",
    "        np.clip(60 * np.sin(np.linspace(-0.5, np.pi + 0.5, 24)), 0, None),\n",
    "        50 + rng.normal(0, 3, 24),\n",
    "    ]),\n",
    "    names=[\"wind\", \"solar\", \"hydro\"],\n",
    "    units=[\"MW\", \"MW\", \"MW\"],\n",
    ")\n",
    "table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.726608Z",
     "iopub.status.busy": "2026-03-01T13:36:43.726550Z",
     "iopub.status.idle": "2026-03-01T13:36:43.732827Z",
     "shell.execute_reply": "2026-03-01T13:36:43.732492Z"
    }
   },
   "outputs": [],
   "source": [
    "table_rolling = table.apply_pandas(lambda df: df.rolling(4, min_periods=1).mean())\n",
    "table_rolling"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.733732Z",
     "iopub.status.busy": "2026-03-01T13:36:43.733678Z",
     "iopub.status.idle": "2026-03-01T13:36:43.740218Z",
     "shell.execute_reply": "2026-03-01T13:36:43.739852Z"
    }
   },
   "outputs": [],
   "source": [
    "table_norm = table.apply_numpy(\n",
    "    lambda arr: (arr - arr.mean(axis=0)) / arr.std(axis=0)\n",
    ")\n",
    "table_norm.head(6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.741097Z",
     "iopub.status.busy": "2026-03-01T13:36:43.741047Z",
     "iopub.status.idle": "2026-03-01T13:36:43.747274Z",
     "shell.execute_reply": "2026-03-01T13:36:43.746960Z"
    }
   },
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'table' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[31m---------------------------------------------------------------------------\u001b[39m",
      "\u001b[31mNameError\u001b[39m                                 Traceback (most recent call last)",
      "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[22]\u001b[39m\u001b[32m, line 1\u001b[39m\n\u001b[32m----> \u001b[39m\u001b[32m1\u001b[39m \u001b[43mtable\u001b[49m.update_df(table.df.rolling(\u001b[32m4\u001b[39m, min_periods=\u001b[32m1\u001b[39m).mean())\n",
      "\u001b[31mNameError\u001b[39m: name 'table' is not defined"
     ]
    }
   ],
   "source": [
    "table.update_df(table.df.rolling(4, min_periods=1).mean())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2026-03-01T13:36:43.748112Z",
     "iopub.status.busy": "2026-03-01T13:36:43.748061Z",
     "iopub.status.idle": "2026-03-01T13:36:43.754336Z",
     "shell.execute_reply": "2026-03-01T13:36:43.753930Z"
    }
   },
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'table' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[31m---------------------------------------------------------------------------\u001b[39m",
      "\u001b[31mNameError\u001b[39m                                 Traceback (most recent call last)",
      "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[23]\u001b[39m\u001b[32m, line 1\u001b[39m\n\u001b[32m----> \u001b[39m\u001b[32m1\u001b[39m \u001b[43mtable\u001b[49m.update_arr(np.clip(table.arr, \u001b[32m40\u001b[39m, \u001b[32m120\u001b[39m))\n",
      "\u001b[31mNameError\u001b[39m: name 'table' is not defined"
     ]
    }
   ],
   "source": [
    "table.update_arr(np.clip(table.arr, 40, 120))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "Five patterns for transforming time series data:\n",
    "\n",
    "| Pattern | Method | Best for |\n",
    "|---------|--------|----------|\n",
    "| `ts.apply_numpy(func)` | Functional | Same-length vectorized ops (normalize, cumsum) |\n",
    "| `ts.apply_pandas(func)` | Functional | Rolling windows, diff, pct_change |\n",
    "| `ts.update_arr(ts.arr.clip(...))` | One-liner | Quick numpy transforms via `.arr` |\n",
    "| `ts.update_df(ts.df.resample(...).mean())` | One-liner | Quick pandas transforms via `.df` |\n",
    "| Manual `to_numpy()` / `to_pandas_dataframe()` | Multi-step | Shape-changing ops, complex workflows |\n",
    "\n",
    "All patterns preserve metadata. Use `.arr` / `.df` for read access and `update_arr()` / `update_df()` to write results back.\n",
    "\n",
    "Next up: **nb_03** covers unit handling, validation, and rich metadata."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.14.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
