
Time series forecasting with aeon

This notebook describes the new, experimental forecasting module in aeon. We have recently removed a lot of legacy code that was almost entirely wrappers around other projects, mostly statsmodels. Most contributors to aeon come from a computer science/machine learning background rather than statistics and forecasting, and our objectives for forecasting have changed to reflect this. Our focus is on:

  1. not attempting to be a comprehensive forecasting package.

Forecasting is a wide field with lots of specific variants and use cases. The open source landscape is crowded with packages that focus primarily or exclusively on forecasting. We are not trying to do all things in forecasting. We want to focus on a few key use cases that reflect our research interests.

  2. fast forecasting with numpy arrays.

Whilst our forecasters will work with data frames, our design principle is to write code optimised with numba and numpy. We found that extensive use of data frames in the internal calculations of forecasters makes them much slower and harder to understand for those not used to working with data frames daily.

  3. forecasting using machine learning and deep learning.

We want to implement and assess the latest machine learning and deep learning forecasting algorithms for scenarios where it makes sense to use them. Our initial experimental focus will be on forecasting with long series for a single forecasting horizon.

Base Class

Our first design choice for forecasting is to pass the forecasting horizon in the constructor (default is 1). This is because we want a simpler use case: a forecaster is trained to predict a fixed number of steps into the future, then for unseen data it predicts the same number of steps ahead. We recognise there are other scenarios, but this is the cleanest way to start.
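As a minimal sketch of this pattern, the cell below uses RegressionForecaster (covered later in this notebook) on the airline data: the forecaster is trained to predict a fixed number of steps ahead, and the same fitted model then forecasts that many steps beyond whatever series it is given.

[ ]:
from aeon.datasets import load_airline
from aeon.forecasting import RegressionForecaster

y = load_airline()
# Train a forecaster that always predicts three steps beyond its input window
f = RegressionForecaster(window=12, horizon=3)
f.fit(y)
print(f.predict())  # three steps beyond the end of the training series
print(f.predict(y[:100]))  # three steps beyond the end of a different (truncated) series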

The base class for all forecasters is BaseForecaster. It inherits from BaseSeriesEstimator, which is also the base class for the other series estimators in aeon: BaseSegmenter, BaseAnomalyDetector and BaseSeriesTransformer. BaseSeriesEstimator contains a method to validate and possibly convert an input series. BaseForecaster has three core methods: fit, predict and forecast. It is an abstract class, and each of these public methods calls a corresponding protected method: _fit, _predict and _forecast.

[10]:
import inspect

from aeon.forecasting import BaseForecaster

# List methods
public_methods = [
    func[0]
    for func in inspect.getmembers(BaseForecaster, predicate=inspect.isfunction)
    if not func[0].startswith("_")
]
print(public_methods)
['clone', 'fit', 'forecast', 'get_fitted_params', 'get_metadata_routing', 'get_params', 'get_tag', 'get_tags', 'predict', 'reset', 'set_params', 'set_tags']

All estimators in aeon have tags. One specific to forecasting is y_inner_type. This specifies the inner type of series that a subclass of BaseForecaster expects as input to the methods _fit and _predict. The default is np.ndarray, but it can also be pd.DataFrame or pd.Series. You can pass a forecaster any of the SERIES_DATA_TYPES and it will be converted to y_inner_type in fit, predict and forecast.

[11]:
from aeon.utils.data_types import SERIES_DATA_TYPES

print(" Possible data structures for input to forecaster ", SERIES_DATA_TYPES)
print("\n Tags for BaseForecaster: ", BaseForecaster.get_class_tags())
 Possible data structures for input to forecaster  ['pd.Series', 'pd.DataFrame', 'np.ndarray']

 Tags for BaseForecaster:  {'python_version': None, 'python_dependencies': None, 'cant_pickle': False, 'non_deterministic': False, 'algorithm_type': None, 'capability:missing_values': False, 'capability:multithreading': False, 'capability:univariate': True, 'capability:multivariate': False, 'X_inner_type': 'np.ndarray', 'fit_is_empty': False, 'y_inner_type': 'np.ndarray'}

We use the standard airline dataset for examples. This can be stored as a pd.Series, pd.DataFrame or np.ndarray.

[12]:
import pandas as pd

from aeon.datasets import load_airline

y = load_airline()
print(type(y))
y2 = pd.Series(y)
y3 = pd.DataFrame(y)
<class 'numpy.ndarray'>

DummyForecaster

A dummy forecaster can illustrate the use cases for forecasting. This forecaster simply returns the last value of the training data as the forecast. By default the horizon is 1, although it makes no difference for this forecaster. Its inner type is np.ndarray, so all three allowable input types are internally converted to numpy arrays.

[13]:
# Fit then predict
from aeon.forecasting import DummyForecaster

d = DummyForecaster()
print(d.get_tag("y_inner_type"))
d.fit(y)
p = d.predict()
print(p)
np.ndarray
432.0
[14]:
# forecast is equivalent to fit_predict in other estimators
p2 = d.forecast(y)
print(p2)
432.0
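Because the y_inner_type of DummyForecaster is np.ndarray, the pd.Series and pd.DataFrame versions of the airline data created above should be converted internally and produce the same forecast. A quick check:

[ ]:
# All three input containers are converted to numpy arrays internally,
# so each call should return the same last value of the series
for series in (y, y2, y3):
    print(DummyForecaster().forecast(series))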

Regression based forecasting

Our main focus will be forecasting with a sliding window and a regressor. We provide a basic implementation of this in RegressionForecaster. This class can take a regressor as a constructor parameter. It trains the regressor on the windowed series, then applies the fitted model to new series. There will be a separate notebook with more details on the use of RegressionForecaster. By default it uses a linear regressor, but our goal is to use it with aeon time series regressors.

[15]:
from aeon.forecasting import RegressionForecaster

r = RegressionForecaster(window=20)
r.fit(y)
p = r.predict()
print(p)
r2 = RegressionForecaster(window=10, horizon=5)
r2.fit(y)
p = r2.predict(y)
print(p)
[451.67541971]
[527.36897094]
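The regressor itself can be swapped. The sketch below assumes the constructor parameter is named regressor and uses a scikit-learn RandomForestRegressor in place of the default linear model:

[ ]:
from sklearn.ensemble import RandomForestRegressor

# Hypothetical usage: the parameter name `regressor` is assumed here
rf = RegressionForecaster(window=20, regressor=RandomForestRegressor(n_estimators=100))
rf.fit(y)
print(rf.predict())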

With this setup, we can make predictions on previously unseen data, more closely mirroring machine learning approaches. Alternatively, we can use the forecast method to fit and predict in a single call.

[19]:
p1 = r.forecast(y)
p2 = r2.forecast(y)
print(p1, ",\n", p2)
[451.67541971] ,
 [527.36897094]

Exponential Smoothing

The base exponential smoothing module is implemented in stripped-down code with numba, and is very fast.

[20]:
from aeon.forecasting import ETSForecaster

ets = ETSForecaster()
ets.fit(y)
ets.predict()

[20]:
460.302772481884
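As a rough illustration of the speed claim (timings vary by machine, and the first call may include numba compilation overhead), you could time a second fit and predict:

[ ]:
import time

ets = ETSForecaster()
ets.fit(y)  # warm-up call: triggers any numba JIT compilation
start = time.perf_counter()
ets.fit(y)
ets.predict()
print(f"fit + predict took {time.perf_counter() - start:.6f} seconds")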