range_roc_auc_score

range_roc_auc_score(y_true: ndarray, y_score: ndarray, buffer_size: int | None = None) -> float

Compute the range-based area under the ROC curve.

Computes the area under the receiver-operating-characteristic-curve using the range-based TPR and range-based FPR definition from Paparrizos et al. published at VLDB 2022 [1].

We first extend the anomaly labels with two slopes of length buffer_size//2, one on each side of every anomaly, uniformly sample thresholds from the anomaly scores, and then compute the confusion matrix for each threshold. Using the resulting range-based TPR and FPR values, we can plot the ROC curve and compute the area under it.
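The procedure above can be illustrated with a simplified sketch. It is not the library's implementation: the helper names `extend_labels_with_slopes` and `simplified_range_roc_auc`, the linear ramp shape, and the `n_thresholds` parameter are assumptions made for illustration, and the existence-reward term of the range-based TPR from [1] is omitted. The sketch also assumes that the series contains both anomalous and normal points.

```python
import numpy as np


def extend_labels_with_slopes(y_true, buffer_size):
    """Hypothetical helper: add linear ramps of buffer_size//2 points around anomalies."""
    y_true = np.asarray(y_true, dtype=float)
    half = buffer_size // 2
    soft = y_true.copy()
    n = len(y_true)
    # Locate anomaly ranges as half-open intervals [start, end).
    edges = np.diff(np.concatenate(([0.0], y_true, [0.0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    for s, e in zip(starts, ends):
        for k in range(1, half + 1):
            weight = 1.0 - k / (half + 1)  # assumed linear decay towards 0
            if s - k >= 0:
                soft[s - k] = max(soft[s - k], weight)  # increasing slope before
            if e - 1 + k < n:
                soft[e - 1 + k] = max(soft[e - 1 + k], weight)  # decreasing slope after
    return soft


def simplified_range_roc_auc(y_true, y_score, buffer_size, n_thresholds=50):
    """Simplified range-based ROC AUC (no existence reward); illustration only."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    soft = extend_labels_with_slopes(y_true, buffer_size)
    # Range-based positives: average of the original and the extended label mass.
    positives = (y_true.sum() + soft.sum()) / 2
    negatives = len(y_true) - positives
    # Uniformly sampled thresholds, from the highest to the lowest score.
    thresholds = np.linspace(y_score.max(), y_score.min(), n_thresholds)
    tpr, fpr = [0.0], [0.0]
    for t in thresholds:
        pred = (y_score >= t).astype(float)
        tp = float(soft @ pred)
        fp = float((1.0 - soft) @ pred)
        tpr.append(min(tp / positives, 1.0))
        fpr.append(min(fp / negatives, 1.0))
    tpr.append(1.0)
    fpr.append(1.0)
    tpr, fpr = np.array(tpr), np.array(fpr)
    # Trapezoidal rule over the (FPR, TPR) points.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))


y_true = np.array([0, 0, 0, 1, 1, 0, 0, 0])
good = simplified_range_roc_auc(y_true, y_true.astype(float), buffer_size=4)
bad = simplified_range_roc_auc(y_true, 1.0 - y_true, buffer_size=4)
print(good > bad)
```

In this sketch, a scorer that fires only on the labeled points gets a higher area than the inverted scorer, but it does not reach 1 because it misses the buffer regions.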

Parameters:
y_true : np.ndarray

True binary labels of shape (n_instances,).

y_score : np.ndarray

Anomaly scores for each point of the time series of shape (n_instances,).

buffer_size : int, optional

Size of the buffer region around an anomaly. We add an increasing slope of length buffer_size//2 before the beginning of each anomaly and a decreasing slope of length buffer_size//2 after its end. By default (when buffer_size is None), buffer_size is the median length of the anomalies within the time series. However, you can also set it to the period size of the dominant frequency or to any other desired value.
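As a rough illustration of the default, the median anomaly length can be derived from the binary labels as follows. The helper name `median_anomaly_length` is hypothetical, and truncating the median to an integer is an assumption; the library may handle ties and empty label arrays differently.

```python
import numpy as np


def median_anomaly_length(y_true):
    """Hypothetical helper: median length of contiguous anomalous ranges."""
    y = np.asarray(y_true).astype(int)
    # +1 marks the start of an anomaly, -1 marks the point just after its end.
    edges = np.diff(np.concatenate(([0], y, [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    lengths = ends - starts
    if lengths.size == 0:
        return 0  # no anomalies labeled
    return int(np.median(lengths))  # assumed: truncate to an integer


labels = [0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]
print(median_anomaly_length(labels))  # anomaly lengths are 2, 4, and 3
```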

Returns:
float

Range-based ROC AUC score.

References

[1]

John Paparrizos, Paul Boniol, Themis Palpanas, Ruey S. Tsay, Aaron Elmore, and Michael J. Franklin. Volume Under the Surface: A New Accuracy Evaluation Measure for Time-Series Anomaly Detection. PVLDB, 15(11): 2774-2787, 2022. doi:10.14778/3551793.3551830