[DOC] TSC notebook (aeon-toolkit#2287)
* TSC notebook

* wording
TonyBagnall authored Nov 3, 2024
1 parent 7e24f08 commit ea6b42b
Showing 1 changed file with 25 additions and 25 deletions.
50 changes: 25 additions & 25 deletions examples/classification/classification.ipynb
@@ -76,14 +76,10 @@
 ],
 "source": [
 "# Plotting and data loading imports used in this notebook\n",
-"import warnings\n",
-"\n",
 "import matplotlib.pyplot as plt\n",
 "\n",
 "from aeon.datasets import load_arrow_head, load_basic_motions\n",
 "\n",
-"warnings.filterwarnings(\"ignore\")\n",
-"\n",
 "arrow, arrow_labels = load_arrow_head(split=\"train\")\n",
 "motions, motions_labels = load_basic_motions(split=\"train\")\n",
 "print(f\"ArrowHead series of type {type(arrow)} and shape {arrow.shape}\")\n",
@@ -96,9 +92,13 @@
 {
 "cell_type": "markdown",
 "source": [
-"We tend to use 3D numpy even if the data is univariate, although all classifiers work\n",
-" with shape (instance, time point), currently some transformers do not work correctly\n",
-" with 2D arrays. If your series are unequal length, have missing values or are\n",
+"We use 3D numpy even if the data is univariate: although classifiers can work\n",
+"with a 2D array of shape `(n_cases, n_timepoints)`, this shape can be confused\n",
+"with a single multivariate time series, which has shape `(n_channels, n_timepoints)`.\n",
+"Hence, to differentiate the two cases, we enforce the 3D format `(n_cases, n_channels,\n",
+"n_timepoints)`.\n",
+"\n",
+"If your series are unequal length, have missing values or are\n",
 " sampled at irregular time intervals, you should read the notebook\n",
 " on [data preprocessing](../utils/preprocessing.ipynb).\n",
 "\n",
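The shape convention described in this hunk can be sketched with plain numpy (a minimal illustration; the array contents are arbitrary random values):

```python
import numpy as np

rng = np.random.default_rng(0)

# A collection of 4 univariate series of length 100: 3D with a single channel.
X_univariate = rng.normal(size=(4, 1, 100))  # (n_cases, n_channels, n_timepoints)

# A collection of 4 multivariate series, each with 6 channels.
X_multivariate = rng.normal(size=(4, 6, 100))

# A SINGLE multivariate series is 2D, which is why a 2D collection of
# univariate series of shape (n_cases, n_timepoints) would be ambiguous.
x_single = rng.normal(size=(6, 100))  # (n_channels, n_timepoints)

print(X_univariate.shape, X_multivariate.shape, x_single.shape)
```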
@@ -293,9 +293,9 @@
 "collapsed": false
 },
 "source": [
-"Another accurate classifier for time series classification is version 2 of the\n",
-"[HIVE-COTE](https://link.springer.com/article/10.1007/s10994-021-06057-9) algorithm.\n",
-"(HC2) is described in the [hybrid notebook](hybrid.ipynb) notebook. HC2 is relatively\n",
+"A slower but generally more accurate classifier for time series classification is\n",
+"version 2 of the [HIVE-COTE](https://link.springer.com/article/10.1007/s10994-021-06057-9)\n",
+"algorithm (HC2), described in the [hybrid notebook](hybrid.ipynb). HC2 is particularly\n",
 "slow\n",
 "on small problems like these examples. However, it can be\n",
 "configured with an approximate maximum run time as follows (it may take a bit longer\n",
@@ -449,10 +449,9 @@
 },
 "source": [
 "An alternative for MTSC is to build a univariate classifier on each dimension, then\n",
-"ensemble. Dimension ensembling can be easily done via ``ColumnEnsembleClassifier``\n",
+"ensemble. Dimension ensembling can be easily done via ``ChannelEnsembleClassifier``,\n",
 "which fits classifiers independently to specified dimensions, then\n",
-"combines predictions through a voting scheme. The interface is\n",
-"similar to the ``ColumnTransformer`` from `sklearn`. The example below builds a DrCIF\n",
+"combines predictions through a voting scheme. The example below builds a DrCIF\n",
 "classifier on the first channel and a RocketClassifier on the fourth and fifth\n",
 "dimensions, ignoring the second, third and sixth."
@@ -613,17 +612,23 @@
 "\n",
 "#### KNeighborsTimeSeriesClassifier\n",
 "\n",
-"One nearest neighbour (1-NN) classification with Dynamic Time Warping (DTW) is one of the oldest TSC approaches, and is commonly used as a performance benchmark.\n",
+"One nearest neighbour (1-NN) classification with Dynamic Time Warping (DTW) is\n",
+"a [distance based](distance_based.ipynb) classifier and one of the most frequently used\n",
+"approaches, although it is less accurate on average than the state of the art.\n",
 "\n",
 "#### RocketClassifier\n",
-"The RocketClassifier is based on a pipeline combination of the ROCKET transformation (transformations.panel.rocket) and the sklearn RidgeClassifierCV classifier. The RocketClassifier is configurable to use variants MiniRocket and MultiRocket. ROCKET is based on generating random convolutional kernels. A large number are generated, then a linear classifier is built on the output.\n",
+"The RocketClassifier is a [convolution based](convolution_based.ipynb) classifier\n",
+"made up of a pipeline combination of the ROCKET transformation\n",
+" (transformations.panel.rocket) and the sklearn RidgeClassifierCV classifier. It is configurable to use the MiniRocket and MultiRocket variants. ROCKET generates a large number of random convolutional kernels, then builds a linear classifier on their output.\n",
 "\n",
 "[1] Dempster, Angus, François Petitjean, and Geoffrey I. Webb. \"ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels.\" Data Mining and Knowledge Discovery (2020)\n",
 "[arXiv version](https://arxiv.org/abs/1910.13051)\n",
 "[DAMI 2020](https://link.springer.com/article/10.1007/s10618-020-00701-z)\n",
 "\n",
 "#### DrCIF\n",
-"The Diverse Representation Canonical Interval Forest Classifier (DrCIF) is an interval based classifier. The algorithm takes multiple randomised intervals from each series and extracts a range of features. These features are used to build a decision tree, which in turn are ensembled into a decision tree forest, in the style of a random forest.\n",
+"The Diverse Representation Canonical Interval Forest Classifier (DrCIF) is an\n",
+"[interval based](interval_based.ipynb) classifier. The algorithm takes multiple\n",
+"randomised intervals from each series and extracts a range of features. These features are used to build decision trees, which are ensembled into a forest in the style of a random forest.\n",
 "\n",
 "Original CIF classifier:\n",
 "[2] Matthew Middlehurst and James Large and Anthony Bagnall. \"The Canonical Interval Forest (CIF) Classifier for Time Series Classification.\" IEEE International Conference on Big Data (2020)\n",
@@ -633,17 +638,12 @@
 "The DrCIF adjustment was proposed in [3].\n",
 "\n",
 "#### HIVE-COTE 2.0 (HC2)\n",
-"The HIerarchical VotE Collective of Transformation-based Ensembles is a meta ensemble that combines classifiers built on different representations. Version 2 combines DrCIF, TDE, an ensemble of RocketClassifiers called the Arsenal and the ShapeletTransformClassifier. It is one of the most accurate classifiers on the UCR and UEA time series archives.\n",
+"The HIerarchical VotE Collective of Transformation-based Ensembles is a\n",
+" [hybrid](hybrid.ipynb) meta ensemble that combines classifiers built on different representations.\n",
+" Version 2 combines DrCIF, TDE, an ensemble of RocketClassifiers called the Arsenal and the ShapeletTransformClassifier. It is one of the most accurate classifiers on the UCR and UEA time series archives.\n",
 "\n",
 "[3] Middlehurst, Matthew, James Large, Michael Flynn, Jason Lines, Aaron Bostrom, and Anthony Bagnall. \"HIVE-COTE 2.0: a new meta ensemble for time series classification.\" Machine Learning (2021)\n",
-"[ML 2021](https://link.springer.com/article/10.1007/s10994-021-06057-9)\n",
-"\n",
-"#### Catch22\n",
-"\n",
-"The CAnonical Time-series CHaracteristics (Catch22) are a set of 22 informative and low redundancy features extracted from time series data. The features were filtered from 4791 features in the `hctsa` toolkit.\n",
-"\n",
-"[4] Lubba, Carl H., Sarab S. Sethi, Philip Knaute, Simon R. Schultz, Ben D. Fulcher, and Nick S. Jones. \"catch22: Canonical time-series characteristics.\" Data Mining and Knowledge Discovery (2019)\n",
-"[DAMI 2019](https://link.springer.com/article/10.1007/s10618-019-00647-x)"
+"[ML 2021](https://link.springer.com/article/10.1007/s10994-021-06057-9)\n"
]
}
],
