{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# I/O and Interoperability\n",
    "\n",
    "TimeDataModel is designed to sit between your domain logic and the broader Python data ecosystem. This notebook shows how to move data seamlessly between `TimeSeriesList` / `TimeSeriesTable` and pandas, numpy, polars, JSON, and CSV."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "from datetime import datetime, timedelta, timezone\n\nimport numpy as np\n\nimport timedatamodel as tdm\n\nbase = datetime(2024, 1, 15, tzinfo=timezone.utc)\ntimestamps = [base + timedelta(hours=i) for i in range(24)]\nvalues = [100.0 + 50 * np.sin(2 * np.pi * i / 24) for i in range(24)]\n\nts = tdm.TimeSeriesList(\n    tdm.Frequency.PT1H,\n    timezone=\"UTC\",\n    timestamps=timestamps,\n    values=values,\n    name=\"power\",\n    unit=\"MW\",\n    description=\"Synthetic daily power curve\",\n    data_type=tdm.DataType.SIMULATION,\n)"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## NumPy\n",
    "\n",
    "`to_numpy()` returns a float64 array (None becomes NaN). Use it when you need fast vectorized computation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "arr = ts.to_numpy()\n",
    "print(f\"Type:  {type(arr)}\")\n",
    "print(f\"Shape: {arr.shape}\")\n",
    "print(f\"Mean:  {arr.mean():.2f} MW\")\n",
    "print(f\"Max:   {arr.max():.2f} MW\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pandas — export\n",
    "\n",
    "`to_pandas_dataframe()` produces a DataFrame with a DatetimeIndex."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = ts.to_pandas_dataframe()\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pandas — import with `from_pandas()`\n",
    "\n",
    "Create a TimeSeriesList from any pandas DataFrame that has a DatetimeIndex. Frequency and timezone are auto-inferred when possible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "import pandas as pd\n\nidx = pd.date_range(\"2024-06-01\", periods=48, freq=\"h\", tz=\"UTC\")\ndf_external = pd.DataFrame({\"load\": np.random.default_rng(42).normal(150, 20, 48)}, index=idx)\n\nts_from_pd = tdm.TimeSeriesList.from_pandas(df_external, unit=\"MW\", data_type=tdm.DataType.OBSERVATION)\nts_from_pd"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pandas — TimeSeriesTable round-trip"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "rng = np.random.default_rng(42)\ntable = tdm.TimeSeriesTable(\n    tdm.Frequency.PT1H,\n    timestamps=timestamps,\n    values=np.column_stack([\n        100 + rng.normal(0, 10, 24),\n        50 + rng.normal(0, 5, 24),\n    ]),\n    names=[\"wind\", \"solar\"],\n    units=[\"MW\", \"MW\"],\n)\n\ndf_table = table.to_pandas_dataframe()\nprint(f\"Columns: {list(df_table.columns)}\")\n\ntable_back = tdm.TimeSeriesTable.from_pandas(df_table, units=[\"MW\", \"MW\"])\nprint(f\"Round-trip equals: {table.equals(table_back)}\")"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Polars (optional)\n",
    "\n",
    "If polars is installed, you get `to_polars_dataframe()` and `from_polars()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "try:\n    df_pl = ts.to_polars_dataframe()\n    print(df_pl.head())\n\n    ts_from_pl = tdm.TimeSeriesList.from_polars(df_pl, tdm.Frequency.PT1H, unit=\"MW\")\n    print(f\"\\nRound-trip length: {len(ts_from_pl)}\")\nexcept ImportError:\n    print(\"polars not installed — skip this cell or: pip install timedatamodel[polars]\")"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## JSON serialization\n",
    "\n",
    "`to_json()` produces an ISO-8601 JSON string. `from_json()` reconstructs the series with full metadata."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "json_str = ts.to_json()\n",
    "print(json_str[:200], \"...\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "ts_restored = tdm.TimeSeriesList.from_json(json_str)\nprint(f\"Name:      {ts_restored.name}\")\nprint(f\"Unit:      {ts_restored.unit}\")\nprint(f\"Frequency: {ts_restored.frequency}\")\nprint(f\"Length:    {len(ts_restored)}\")\nprint(f\"Equals original: {ts.equals(ts_restored)}\")"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### JSON for TimeSeriesTable"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "json_table = table.to_json()\ntable_restored = tdm.TimeSeriesTable.from_json(json_table)\nprint(f\"Columns: {table_restored.column_names}\")\nprint(f\"Equals original: {table.equals(table_restored)}\")"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## CSV serialization\n",
    "\n",
    "`to_csv()` and `from_csv()` write and read simple CSV files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tempfile\n",
    "from pathlib import Path\n",
    "\n",
    "with tempfile.TemporaryDirectory() as tmp:\n",
    "    csv_path = Path(tmp) / \"power.csv\"\n",
    "    ts.to_csv(csv_path)\n",
    "\n",
    "    with open(csv_path) as f:\n",
    "        for i, line in enumerate(f):\n",
    "            print(line.rstrip())\n",
    "            if i >= 4:\n",
    "                print(\"...\")\n",
    "                break"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "with tempfile.TemporaryDirectory() as tmp:\n    csv_path = Path(tmp) / \"power.csv\"\n    ts.to_csv(csv_path)\n    ts_from_csv = tdm.TimeSeriesList.from_csv(csv_path, tdm.Frequency.PT1H, unit=\"MW\")\n\nprint(f\"Length: {len(ts_from_csv)}\")\nprint(f\"Name:   {ts_from_csv.name}\")\nts_from_csv.head()"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Multi-index round-trip\n",
    "\n",
    "TimeSeriesList supports tuple-based timestamps for hierarchical indexing. This is preserved through pandas and JSON round-trips."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "issue_times = [\n    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),\n    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),\n    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),\n    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),\n    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),\n    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),\n]\nvalid_times = [\n    datetime(2024, 1, 15, 0, tzinfo=timezone.utc),\n    datetime(2024, 1, 15, 1, tzinfo=timezone.utc),\n    datetime(2024, 1, 15, 2, tzinfo=timezone.utc),\n    datetime(2024, 1, 15, 12, tzinfo=timezone.utc),\n    datetime(2024, 1, 15, 13, tzinfo=timezone.utc),\n    datetime(2024, 1, 15, 14, tzinfo=timezone.utc),\n]\n\nts_multi = tdm.TimeSeriesList(\n    tdm.Frequency.PT1H,\n    timestamps=list(zip(issue_times, valid_times)),\n    values=[100.0, 105.0, 110.0, 95.0, 100.0, 108.0],\n    name=\"forecast\",\n    unit=\"MW\",\n    index_names=[\"issue_time\", \"valid_time\"],\n)\n\ndf_multi = ts_multi.to_pandas_dataframe()\ndf_multi"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "ts_multi_back = tdm.TimeSeriesList.from_pandas(df_multi, tdm.Frequency.PT1H, unit=\"MW\")\nprint(f\"Index names: {ts_multi_back.index_names}\")\nprint(f\"Multi-index: {ts_multi_back.is_multi_index}\")\nprint(f\"Equals original: {ts_multi.equals(ts_multi_back)}\")"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "| Format | Export | Import | Metadata preserved |\n",
    "|--------|--------|--------|-------------------|\n",
    "| NumPy | `to_numpy()` | — | Values only |\n",
    "| pandas | `to_pandas_dataframe()` | `from_pandas()` | Name, freq, tz |\n",
    "| polars | `to_polars_dataframe()` | `from_polars()` | Name |\n",
    "| JSON | `to_json()` | `from_json()` | Full |\n",
    "| CSV | `to_csv()` | `from_csv()` | Timestamps + values |\n",
    "\n",
    "Next up: **nb_09** covers geographical support — GeoLocation, GeoArea, and spatial queries."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}