erp_distance

erp_distance(x: ndarray, y: ndarray, window: float | None = None, g: float = 0.0, g_arr: ndarray | None = None, itakura_max_slope: float | None = None) → float

Compute the ERP distance between two time series.

Edit Distance with Real Penalty (ERP), first proposed in [1], aligns time series by changing how unmatched points are handled in the cost matrix. In the DTW cost matrix, a move off the diagonal carries the previous value forward, so one point is matched against several. ERP instead introduces gaps: points, or runs of points, that have no match. Each gap is penalised by the distance between the unmatched point and the parameter $g$.

\[\begin{split}\text{match} &= D_{i-1,j-1} + d(x_{i}, y_{j})\\ \text{delete} &= D_{i-1,j} + d(x_{i}, g)\\ \text{insert} &= D_{i,j-1} + d(g, y_{j})\\ D_{i,j} &= \min(\text{match}, \text{insert}, \text{delete})\end{split}\]

Where $D_{0,j}$ and $D_{i,0}$ are initialised to the sum of distances to $g$ for each series.
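The recurrence above can be sketched in plain NumPy for two univariate series. This is an illustration only, not aeon's implementation: `erp_sketch` is a hypothetical helper, $d$ is assumed to be the absolute difference, the borders are assumed to hold the running sum of distances to $g$, and no window or Itakura constraint is applied.

```python
import numpy as np


def erp_sketch(x, y, g=0.0):
    """Illustrative ERP between two 1D series (hypothetical helper, not aeon code)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)

    D = np.zeros((n + 1, m + 1))
    # Assumption: each border cell holds the cost of leaving a whole prefix
    # unmatched, i.e. the running sum of distances to g.
    D[1:, 0] = np.cumsum(np.abs(x - g))
    D[0, 1:] = np.cumsum(np.abs(y - g))

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumption: d(a, b) = |a - b| for univariate points.
            match = D[i - 1, j - 1] + abs(x[i - 1] - y[j - 1])
            delete = D[i - 1, j] + abs(x[i - 1] - g)  # x_i left unmatched
            insert = D[i, j - 1] + abs(y[j - 1] - g)  # y_j left unmatched
            D[i, j] = min(match, insert, delete)
    return float(D[n, m])


x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([2, 2, 2, 2, 5, 6, 7, 8, 9, 10])
result = erp_sketch(x, y)
```

With $g = 0$, every gap in this pair costs at least the magnitude of the skipped value, so the cheapest alignment is the pure diagonal and `result` is 4.0.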

The value of $g$ defaults to 0 in aeon, but in [1] it is data dependent, selected from the range $[\sigma/5, \sigma]$, where $\sigma$ is the average standard deviation of the training time series. When a series is multivariate (more than one channel), $g$ is an array where the $j^{th}$ value is the standard deviation of the $j^{th}$ channel.
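One possible way to derive data-dependent values for $g$ is sketched below. The helper names `candidate_g_values` and `per_channel_g`, the evenly spaced grid, and the use of the population standard deviation (`np.std` with its default `ddof=0`) are assumptions made for illustration; neither aeon nor [1] is claimed to prescribe them.

```python
import numpy as np


def candidate_g_values(train, n_candidates=5):
    """Evenly spaced candidates for g in [sigma / 5, sigma].

    train: univariate training set, shape (n_cases, n_timepoints).
    sigma: mean of the per-series standard deviations (assumed definition).
    """
    sigma = float(np.mean(np.std(train, axis=1)))
    return np.linspace(sigma / 5.0, sigma, n_candidates)


def per_channel_g(series):
    """One g per channel for a series of shape (n_channels, n_timepoints)."""
    return np.std(series, axis=1)


train = np.array([[0.0, 2.0, 0.0, 2.0],
                  [0.0, 4.0, 0.0, 4.0]])
g_grid = candidate_g_values(train)   # candidates to try as the g argument

series = np.array([[0.0, 2.0, 0.0, 2.0],
                   [0.0, 4.0, 0.0, 4.0]])
g_arr = per_channel_g(series)        # shape (n_channels,), usable as g_arr
```

A value from `g_grid` would typically be chosen by cross-validation on the training set, and `g_arr` has the shape `(n_channels,)` that the `g_arr` parameter expects.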

Parameters:

x (np.ndarray): First time series, either univariate, shape (n_timepoints,), or multivariate, shape (n_channels, n_timepoints).
y (np.ndarray): Second time series, either univariate, shape (n_timepoints,), or multivariate, shape (n_channels, n_timepoints).
window (float, default=None): The window to use for the bounding matrix. If None, no bounding matrix is used.
g (float, default=0.0): The reference constant used to penalise moves off the diagonal.
g_arr (np.ndarray, default=None): Array of shape (n_channels,) holding a separate g value for each channel. Its length must equal the number of channels in x and y.
itakura_max_slope (float, default=None): Maximum slope, as a proportion of the number of time points, used to create the Itakura parallelogram on the bounding matrix. Must be between 0.0 and 1.0.

Returns:

float: ERP distance between x and y.

Raises:

ValueError: If x and y are not 1D or 2D arrays.

References

[1] Lei Chen and Raymond Ng. 2004. On the marriage of Lp-norms and edit distance. In Proceedings of the Thirtieth International Conference on Very Large Data Bases, Volume 30 (VLDB '04). VLDB Endowment, 792–803.

Examples

>>> import numpy as np
>>> from aeon.distances import erp_distance
>>> x = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
>>> y = np.array([[2, 2, 2, 2, 5, 6, 7, 8, 9, 10]])
>>> erp_distance(x, y)
4.0