{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Sample notebook — measurement-data analysis\n",
    "\n",
    "Companion file for the MathJet tutorial: *Upgrading a Jupyter notebook with MathJet’s kernel.*\n",
    "\n",
    "Loads `measurements.csv`, computes per-condition summary statistics, and plots the distributions. Designed to be runnable in a standard Jupyter or IPython kernel and to gain interactive plots and live variable inspection when run inside MathJet’s own kernel."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "pd.set_option('display.max_rows', 10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv('measurements.csv', parse_dates=['timestamp'])\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Per-condition summary statistics.\n",
    "summary = df.groupby('condition')['measurement_value'].agg(['count', 'mean', 'std', 'min', 'max'])\n",
    "summary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
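    "# Approximate 95% confidence intervals on the per-condition means\n",
    "# (normal approximation, z = 1.96; a t-quantile is more accurate for small groups).\n",
    "ci = df.groupby('condition')['measurement_value'].agg(\n",
    "    mean='mean',\n",
    "    sem='sem',  # standard error of the mean (ddof=1); NaNs are excluded from the count\n",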
    ")\n",
    "ci['ci_low']  = ci['mean'] - 1.96 * ci['sem']\n",
    "ci['ci_high'] = ci['mean'] + 1.96 * ci['sem']\n",
    "ci"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Extract the measurement values as a NumPy array, then compute the overall mean.\n",
    "values = df['measurement_value'].to_numpy()\n",
    "overall_mean = df['measurement_value'].mean()  # pandas skips NaNs by default\n",
    "print(f'Overall mean: {overall_mean:.2f}  ({len(values)} measurements)')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Boxplot by condition.\n",
    "fig, ax = plt.subplots(figsize=(8, 5))\n",
    "df.boxplot(column='measurement_value', by='condition', ax=ax)\n",
    "ax.set_title('Measurement value by condition')\n",
    "ax.set_xlabel('Condition')\n",
    "ax.set_ylabel('Measurement value')\n",
    "fig.suptitle('')  # remove the supertitle that pandas adds automatically\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Time series: measurement value over the timestamp axis, colored by condition.\n",
    "fig, ax = plt.subplots(figsize=(10, 5))\n",
    "for cond, sub in df.groupby('condition'):\n",
    "    sub_sorted = sub.sort_values('timestamp')\n",
    "    ax.plot(sub_sorted['timestamp'], sub_sorted['measurement_value'], 'o-', label=cond, alpha=0.7)\n",
    "ax.set_xlabel('Timestamp')\n",
    "ax.set_ylabel('Measurement value')\n",
    "ax.set_title('Measurement values over time, by condition')\n",
    "ax.legend()\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What to try in MathJet’s kernel\n",
    "\n",
    "Run this notebook once in a standard Jupyter kernel to confirm it works. Then switch the kernel to MathJet’s own (Kernel → Select Kernel → MathJet kernel) and re-run from the top:\n",
    "\n",
    "- The two plots rendered by `plt.show()` should now be interactive: you can pan, zoom, and hover for values, and each plot gets a graph companion table that links cells in the spreadsheet to points on the chart.\n",
    "- The DataFrame `df` and the summary tables should be inspectable in the Environment Pane without re-running cells. If you edit a value in `df` from its variable cell block, the downstream summaries do not recompute automatically (notebook cells still run in order); re-run the summary cell to pick up the edit.\n",
    "- The `summary` and `ci` DataFrames open as editable variable cell blocks, with edits round-tripping back to the kernel.\n",
    "- Click `values` (a NumPy array) in the Environment Pane and the Overview Pane shows a line plot. Click `overall_mean` (a scalar) and the Overview Pane shows its value. Each variable type gets a shape-appropriate preview."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
