[DOC] TSC notebook (aeon-toolkit#2287)
* TSC notebook

* wording
TonyBagnall authored Nov 3, 2024
1 parent 7e24f08 commit ea6b42b
Showing 1 changed file with 25 additions and 25 deletions.
50 changes: 25 additions & 25 deletions examples/classification/classification.ipynb
@@ -76,14 +76,10 @@
 ],
 "source": [
 "# Plotting and data loading imports used in this notebook\n",
-"import warnings\n",
-"\n",
 "import matplotlib.pyplot as plt\n",
 "\n",
 "from aeon.datasets import load_arrow_head, load_basic_motions\n",
 "\n",
-"warnings.filterwarnings(\"ignore\")\n",
-"\n",
 "arrow, arrow_labels = load_arrow_head(split=\"train\")\n",
 "motions, motions_labels = load_basic_motions(split=\"train\")\n",
 "print(f\"ArrowHead series of type {type(arrow)} and shape {arrow.shape}\")\n",
@@ -96,9 +92,13 @@
 {
 "cell_type": "markdown",
 "source": [
-"We tend to use 3D numpy even if the data is univariate, although all classifiers work\n",
-" with shape (instance, time point), currently some transformers do not work correctly\n",
-" with 2D arrays. If your series are unequal length, have missing values or are\n",
+"We use 3D numpy even if the data is univariate: although classifiers can work\n",
+"with a 2D array of shape `(n_cases, n_timepoints)`, this shape can be confused\n",
+"with a single multivariate time series, which has shape `(n_channels, n_timepoints)`.\n",
+"Hence, to differentiate the two cases, we enforce the 3D format `(n_cases, n_channels,\n",
+"n_timepoints)`.\n",
+"\n",
+"If your series are unequal length, have missing values or are\n",
 " sampled at irregular time intervals, you should read the notebook\n",
 " on [data preprocessing](../utils/preprocessing.ipynb).\n",
 "\n",
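The shape convention described in this hunk can be sketched with plain numpy (a minimal illustration; the array contents are arbitrary random values):

```python
import numpy as np

rng = np.random.default_rng(0)

# A collection of 4 univariate series of length 100: 3D with a single channel.
X_univariate = rng.normal(size=(4, 1, 100))  # (n_cases, n_channels, n_timepoints)

# A collection of 4 multivariate series, each with 6 channels.
X_multivariate = rng.normal(size=(4, 6, 100))

# A SINGLE multivariate series is 2D, which is why a 2D collection of
# univariate series of shape (n_cases, n_timepoints) would be ambiguous.
x_single = rng.normal(size=(6, 100))  # (n_channels, n_timepoints)

print(X_univariate.shape, X_multivariate.shape, x_single.shape)
```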
@@ -293,9 +293,9 @@
 "collapsed": false
 },
 "source": [
-"Another accurate classifier for time series classification is version 2 of the\n",
-"[HIVE-COTE](https://link.springer.com/article/10.1007/s10994-021-06057-9) algorithm.\n",
-"(HC2) is described in the [hybrid notebook](hybrid.ipynb) notebook. HC2 is relatively\n",
+"A slower but generally more accurate classifier for time series classification is\n",
+"version 2 of the [HIVE-COTE](https://link.springer.com/article/10.1007/s10994-021-06057-9)\n",
+"algorithm (HC2), described in the [hybrid notebook](hybrid.ipynb). HC2 is particularly\n",
 "slow\n",
 "on small problems like these examples. However, it can be\n",
 "configured with an approximate maximum run time as follows (it may take a bit longer\n",
@@ -449,10 +449,9 @@
 },
 "source": [
 "An alternative for MTSC is to build a univariate classifier on each dimension, then\n",
-"ensemble. Dimension ensembling can be easily done via ``ColumnEnsembleClassifier``\n",
+"ensemble. Dimension ensembling can be easily done via ``ChannelEnsembleClassifier``,\n",
 "which fits classifiers independently to specified dimensions, then\n",
-"combines predictions through a voting scheme. The interface is\n",
-"similar to the ``ColumnTransformer`` from `sklearn`. The example below builds a DrCIF\n",
+"combines predictions through a voting scheme. The example below builds a DrCIF\n",
 "classifier on the first channel and a RocketClassifier on the fourth and fifth\n",
 "dimensions, ignoring the second, third and sixth."
@@ -613,17 +612,23 @@
 "\n",
 "#### KNeighborsTimeSeriesClassifier\n",
 "\n",
-"One nearest neighbour (1-NN) classification with Dynamic Time Warping (DTW) is one of the oldest TSC approaches, and is commonly used as a performance benchmark.\n",
+"One nearest neighbour (1-NN) classification with Dynamic Time Warping (DTW) is\n",
+"a [distance based](distance_based.ipynb) classifier and one of the most frequently used\n",
+"approaches, although it is less accurate on average than the state of the art.\n",
 "\n",
 "#### RocketClassifier\n",
-"The RocketClassifier is based on a pipeline combination of the ROCKET transformation (transformations.panel.rocket) and the sklearn RidgeClassifierCV classifier. The RocketClassifier is configurable to use variants MiniRocket and MultiRocket. ROCKET is based on generating random convolutional kernels. A large number are generated, then a linear classifier is built on the output.\n",
+"The RocketClassifier is a [convolution based](convolution_based.ipynb) classifier\n",
+"made up of a pipeline combination of the ROCKET transformation\n",
+" (transformations.panel.rocket) and the sklearn RidgeClassifierCV classifier. It is configurable to use the MiniRocket and MultiRocket variants. ROCKET generates a large number of random convolutional kernels, then builds a linear classifier on their output.\n",
 "\n",
 "[1] Dempster, Angus, François Petitjean, and Geoffrey I. Webb. \"ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels.\" Data Mining and Knowledge Discovery (2020)\n",
 "[arXiv version](https://arxiv.org/abs/1910.13051)\n",
 "[DAMI 2020](https://link.springer.com/article/10.1007/s10618-020-00701-z)\n",
 "\n",
 "#### DrCIF\n",
-"The Diverse Representation Canonical Interval Forest Classifier (DrCIF) is an interval based classifier. The algorithm takes multiple randomised intervals from each series and extracts a range of features. These features are used to build a decision tree, which in turn are ensembled into a decision tree forest, in the style of a random forest.\n",
+"The Diverse Representation Canonical Interval Forest Classifier (DrCIF) is an\n",
+"[interval based](interval_based.ipynb) classifier. The algorithm takes multiple\n",
+"randomised intervals from each series and extracts a range of features. These features are used to build decision trees, which are ensembled into a forest in the style of a random forest.\n",
 "\n",
 "Original CIF classifier:\n",
 "[2] Matthew Middlehurst and James Large and Anthony Bagnall. \"The Canonical Interval Forest (CIF) Classifier for Time Series Classification.\" IEEE International Conference on Big Data (2020)\n",
@@ -633,17 +638,12 @@
 "The DrCIF adjustment was proposed in [3].\n",
 "\n",
 "#### HIVE-COTE 2.0 (HC2)\n",
-"The HIerarchical VotE Collective of Transformation-based Ensembles is a meta ensemble that combines classifiers built on different representations. Version 2 combines DrCIF, TDE, an ensemble of RocketClassifiers called the Arsenal and the ShapeletTransformClassifier. It is one of the most accurate classifiers on the UCR and UEA time series archives.\n",
+"The HIerarchical VotE Collective of Transformation-based Ensembles is a\n",
+" [hybrid](hybrid.ipynb) meta ensemble that combines classifiers built on different representations.\n",
+" Version 2 combines DrCIF, TDE, an ensemble of RocketClassifiers called the Arsenal and the ShapeletTransformClassifier. It is one of the most accurate classifiers on the UCR and UEA time series archives.\n",
 "\n",
 "[3] Middlehurst, Matthew, James Large, Michael Flynn, Jason Lines, Aaron Bostrom, and Anthony Bagnall. \"HIVE-COTE 2.0: a new meta ensemble for time series classification.\" Machine Learning (2021)\n",
-"[ML 2021](https://link.springer.com/article/10.1007/s10994-021-06057-9)\n",
-"\n",
-"#### Catch22\n",
-"\n",
-"The CAnonical Time-series CHaracteristics (Catch22) are a set of 22 informative and low redundancy features extracted from time series data. The features were filtered from 4791 features in the `hctsa` toolkit.\n",
-"\n",
-"[4] Lubba, Carl H., Sarab S. Sethi, Philip Knaute, Simon R. Schultz, Ben D. Fulcher, and Nick S. Jones. \"catch22: Canonical time-series characteristics.\" Data Mining and Knowledge Discovery (2019)\n",
-"[DAMI 2019](https://link.springer.com/article/10.1007/s10618-019-00647-x)"
+"[ML 2021](https://link.springer.com/article/10.1007/s10994-021-06057-9)\n"
]
}
],
