
Transforming time series

Transforming time series into different data representations is fundamental to time series machine learning. A transformation can involve extracting features that characterize the time series, such as the mean and variance, or changing the series into, for example, first-order differences. We use the term transformer in the scikit-learn sense, not to be confused with deep learning Transformers that employ an attention mechanism. We call transformers that extract features series-to-vector transformers, and those that change the series into a different representation that is still ordered series-to-series transformers.
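The distinction between the two kinds of transformer can be sketched with plain numpy (an illustrative example, not aeon code):

```python
import numpy as np

series = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Series-to-vector: summarize the series as a fixed-length feature vector.
features = np.array([series.mean(), series.var()])

# Series-to-series: first-order differences keep the ordered structure.
diffs = np.diff(series)

print(features)  # [6. 8.]
print(diffs)     # [2. 2. 2. 2.]
```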

We further differentiate between transformers that act on a single series and those that transform a collection of series. Single series transformers are located in the transformations/series directory and inherit from BaseSeriesTransformer. For example, AutoCorrelationSeriesTransformer is a series-to-series transformer that finds the autocorrelation function for a single series.

[23]:
from aeon.datasets import load_airline
from aeon.transformations.series import AutoCorrelationSeriesTransformer

series = load_airline()
transformer = AutoCorrelationSeriesTransformer(n_lags=10)
acf = transformer.fit_transform(series)
print(acf)
[[0.96019465 0.89567531 0.83739477 0.7977347  0.78594315 0.7839188
  0.78459213 0.79221505 0.8278519  0.8827128 ]]
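As a rough sanity check, an autocorrelation function can be computed directly with numpy. This sketch uses one common normalization (sum of lagged products over the total sum of squares, after mean-centering); aeon's estimator may differ in detail:

```python
import numpy as np


def acf(x, n_lags):
    """Sample autocorrelation for lags 1..n_lags (one common normalization)."""
    x = x - x.mean()
    denom = np.sum(x * x)
    return np.array(
        [np.sum(x[:-k] * x[k:]) / denom for k in range(1, n_lags + 1)]
    )


# A smooth periodic signal has autocorrelation close to 1 at short lags.
x = np.sin(np.linspace(0, 4 * np.pi, 50))
print(acf(x, 3))
```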

Collection transformers are located in the transformations/collection directory and inherit from BaseCollectionTransformer. For example, Truncator truncates all time series in a collection to the same length.

[24]:
from aeon.datasets import load_plaid
from aeon.transformations.collection import Truncator

X, y = load_plaid()
print(" Unequal length, first case ", X[0].shape, " tenth case ", X[10].shape)
trunc = Truncator(truncated_length=100)
X2 = trunc.fit_transform(X)
print("Truncated collection shape  =", X2.shape)
 Unequal length, first case  (1, 500)  tenth case  (1, 300)
Truncated collection shape  = (1074, 1, 100)
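What Truncator does to a ragged collection can be mimicked with plain numpy (an illustrative sketch; the case lengths below are made up):

```python
import numpy as np

# A ragged collection: each case is (n_channels, n_timepoints), lengths vary.
X = [np.zeros((1, 500)), np.zeros((1, 300)), np.zeros((1, 120))]

# Cutting every case to a shared length yields a single 3D array,
# mirroring the (n_cases, n_channels, length) output above.
length = 100
X_trunc = np.stack([x[:, :length] for x in X])
print(X_trunc.shape)  # (3, 1, 100)
```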

Truncator is a series-to-series transformer that returns a new collection of time series, all of the same length. This can then be used, for example, by a classifier that only works with equal-length series:

[25]:
from aeon.classification.feature_based import SummaryClassifier

summary = SummaryClassifier()
try:
    summary.fit(X, y)
except ValueError as e:
    print(e)

summary.fit(X2, y)
Data seen by instance of SummaryClassifier has unequal length series, but SummaryClassifier cannot handle unequal length series.
[25]:
SummaryClassifier()

Some collection transformers are supervised, meaning they fit a transform based on the class labels. For example, the shapelet transform finds shapelets that are good at separating classes. This is a series-to-vector transformer that produces tabular output of shape (n_cases, n_shapelets).
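The idea behind shapelet features can be sketched without aeon: each feature is the minimum distance between a candidate subsequence (shapelet) and any window of the series. This toy illustration uses hand-picked shapelets, whereas RandomShapeletTransform searches for and scores shapelets using the class labels:

```python
import numpy as np


def min_dist(series, shapelet):
    """Smallest Euclidean distance between a shapelet and any window of the series."""
    L = len(shapelet)
    return min(
        np.linalg.norm(series[i : i + L] - shapelet)
        for i in range(len(series) - L + 1)
    )


# Toy collection and two hand-picked candidate shapelets (illustrative only).
X = [np.array([0, 0, 5, 5, 0, 0.0]), np.array([0, 0, 0, 0, 0, 0.0])]
shapelets = [np.array([5, 5.0]), np.array([0, 0.0])]

# Series-to-vector output: one distance per shapelet, shape (n_cases, n_shapelets).
X_t = np.array([[min_dist(x, s) for s in shapelets] for x in X])
print(X_t.shape)  # (2, 2)
```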

[26]:
from aeon.transformations.collection.shapelet_based import RandomShapeletTransform

st = RandomShapeletTransform(max_shapelets=10, n_shapelet_samples=100)
X2 = st.fit_transform(X, y)
print(X2.shape)
(1074, 2)

Series-to-vector transformers produce output that is compatible with scikit-learn estimators:

[27]:
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier()
try:
    rf.fit(X, y)
except ValueError as e:
    print(e)
rf.fit(X2, y)
setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (1074, 1) + inhomogeneous part.
[27]:
RandomForestClassifier()
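The ValueError above comes from the shape of the input, which can be illustrated with numpy alone (a minimal sketch; the array sizes are made up):

```python
import numpy as np

# An unequal-length collection can only be stored as an object array of
# 2D arrays; scikit-learn estimators reject such ragged input.
ragged = np.empty(2, dtype=object)
ragged[0] = np.zeros((1, 500))
ragged[1] = np.zeros((1, 300))
print(ragged.dtype)  # object

# A series-to-vector transform yields a plain (n_cases, n_features) float
# array, which scikit-learn estimators accept directly.
tabular = np.zeros((2, 10))
print(tabular.dtype, tabular.shape)  # float64 (2, 10)
```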

A list of all the available transformers can be found in the API. We currently have specific notebooks for the following transformers:
