
sbd_distance(x: ndarray, y: ndarray, standardize: bool = True) -> float

Compute the shape-based distance (SBD) between two time series.

Shape-based distance (SBD) [1] is a normalized version of cross-correlation (CC) that is invariant to shifting and, if standardization is used, to scaling.

For two series, possibly of unequal length, \(\mathbf{x}=\{x_1,x_2,\ldots, x_n\}\) and \(\mathbf{y}=\{y_1,y_2, \ldots,y_m\}\), SBD (optionally) first standardizes both time series using the z-score (\(x' = \frac{x - \mu}{\sigma}\)). It then computes the cross-correlation between \(\mathbf{x}\) and \(\mathbf{y}\) (\(CC(\mathbf{x}, \mathbf{y})\)) and divides it by the geometric mean of the autocorrelations of the individual sequences, which normalizes it to \([-1, 1]\) (coefficient normalization). Finally, it takes the shift \(w\) with the maximum normalized cross-correlation:

\[SBD(\mathbf{x}, \mathbf{y}) = 1 - \max_w\left( \frac{ CC_w(\mathbf{x}, \mathbf{y}) }{ \sqrt{ (\mathbf{x} \cdot \mathbf{x}) \, (\mathbf{y} \cdot \mathbf{y}) } }\right)\]

This distance measure takes values between 0 and 2; 0 indicates perfect similarity.
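The formula above can be sketched directly with NumPy for two univariate series. This is a minimal illustration, not the library's implementation: the helper name `sbd_naive` is hypothetical, standardization is left out, and both inputs are assumed to have non-zero energy so the denominator is never zero.

```python
import numpy as np


def sbd_naive(x, y):
    """Hypothetical reference sketch of SBD for two univariate series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Cross-correlation at every shift w (quadratic time).
    cc = np.correlate(x, y, mode="full")
    # Coefficient normalization: geometric mean of the zero-lag autocorrelations.
    norm = np.sqrt(np.dot(x, x) * np.dot(y, y))
    # Keep only the best-aligned shift.
    return 1.0 - cc.max() / norm


# The same bump at two different positions: the shift is ignored.
bump_early = [0, 0, 1, 2, 1, 0, 0, 0]
bump_late = [0, 0, 0, 0, 1, 2, 1, 0]
d_shifted = sbd_naive(bump_early, bump_late)

# Two series with different shapes give a strictly positive distance.
d_different = sbd_naive([1, -1], [1, 1])
```

In this sketch `d_shifted` is zero because one of the candidate shifts aligns the two bumps exactly, while `d_different` is positive because no shift makes the two shapes coincide.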

Computing the cross-correlation \(CC(\mathbf{x}, \mathbf{y})\) for all values of \(w\) requires \(O(m^2)\) time, where \(m\) is the maximum time-series length. However, by the convolution theorem, the fast Fourier transform (FFT) and its inverse can compute \(CC(\mathbf{x}, \mathbf{y})\) in \(O(m \log m)\):

\[CC(\mathbf{x}, \mathbf{y}) = \mathcal{F}^{-1}\{ \mathcal{F}(\mathbf{x}) \cdot \overline{\mathcal{F}(\mathbf{y})} \}\]

where \(\overline{\,\cdot\,}\) denotes the complex conjugate.
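As a rough sketch of the FFT route, the snippet below computes the full cross-correlation with `numpy.fft` and can be compared against `numpy.correlate`. The function name `cross_correlation_fft` is hypothetical, and the zero-padding length and lag ordering are choices made for this illustration rather than details taken from the library.

```python
import numpy as np


def cross_correlation_fft(x, y):
    """Hypothetical sketch: cross-correlation of two real 1D series via FFT."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Zero-pad to len(x) + len(y) - 1 so the circular result does not wrap.
    n_out = len(x) + len(y) - 1
    fx = np.fft.rfft(x, n_out)
    fy = np.fft.rfft(y, n_out)
    # Multiply by the complex conjugate to get correlation, not convolution.
    cc = np.fft.irfft(fx * np.conj(fy), n_out)
    # Negative lags sit at the end of the circular output; rotate them to the
    # front so the order matches np.correlate(x, y, mode="full").
    return np.roll(cc, len(y) - 1)


x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 0.5]
cc_fft = cross_correlation_fft(x, y)
cc_direct = np.correlate(x, y, mode="full")
matches = np.allclose(cc_fft, cc_direct)
```

Both routes should agree up to floating-point error; the FFT version only pays off for longer series.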

For multivariate time series, SBD is computed independently for each channel and then averaged. Both time series must have the same number of channels!
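The per-channel averaging can be illustrated as follows. This is a sketch under stated assumptions: `sbd_univariate` and `sbd_multivariate` are hypothetical helpers, standardization is omitted for brevity, and raising `ValueError` on a channel mismatch is a choice made for this example only.

```python
import numpy as np


def sbd_univariate(x, y):
    """Hypothetical helper: SBD of two 1D series without standardization."""
    cc = np.correlate(x, y, mode="full")
    return 1.0 - cc.max() / np.sqrt(np.dot(x, x) * np.dot(y, y))


def sbd_multivariate(x, y):
    """Hypothetical sketch: average the per-channel SBD values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Inputs have shape (n_channels, n_timepoints).
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same number of channels")
    per_channel = [sbd_univariate(xc, yc) for xc, yc in zip(x, y)]
    return float(np.mean(per_channel))


# Channel 0 is identical in both series, channel 1 differs in shape.
x = [[1, -1], [1, -1]]
y = [[1, -1], [1, 1]]
dist = sbd_multivariate(x, y)
```

Here the first channel contributes a distance of 0 and the second a positive distance, so `dist` is their mean.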


x : ndarray

First time series, either univariate, shape (n_timepoints,), or multivariate, shape (n_channels, n_timepoints).


y : ndarray

Second time series, either univariate, shape (n_timepoints,), or multivariate, shape (n_channels, n_timepoints).

standardize : bool, default=True

Apply z-score to both input time series for standardization before computing the distance. This makes SBD scaling invariant. Default is True.
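To see why standardization gives scaling invariance, consider the z-score on its own. The sketch below uses a hypothetical `zscore` helper with the population standard deviation (`ddof=0`), which is an assumption of this example, and assumes the input is not constant.

```python
import numpy as np


def zscore(a):
    """Hypothetical helper: z-score of a non-constant 1D series."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean()) / a.std()


x = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
# Same shape as x, but rescaled and offset.
rescaled = 3.0 * x + 5.0

# After standardization the two series coincide, so any distance computed
# on the standardized values cannot tell them apart.
same_shape = np.allclose(zscore(x), zscore(rescaled))
```

Any positive rescaling and any offset are removed by the z-score, which is why the two standardized series match.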


SBD distance between x and y.


If x and y are not 1D or 2D arrays.

See also


Compute the shape-based distance (SBD) between all pairs of time series.



[1] Paparrizos, John, and Luis Gravano: Fast and Accurate Time-Series Clustering. ACM Transactions on Database Systems 42, no. 2 (2017): 8:1-8:49.


>>> import numpy as np
>>> from aeon.distances import sbd_distance
>>> x = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
>>> y = np.array([[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
>>> dist = sbd_distance(x, y)