range_pr_roc_auc_support

range_pr_roc_auc_support(y_true: ndarray, y_score: ndarray, buffer_size: int | None = None, skip_check: bool = False) -> tuple[float, float]

Compute the range-based PR and ROC AUC.

Computes the area under the precision-recall (PR) curve and the area under the receiver operating characteristic (ROC) curve using the range-based precision and range-based recall definitions from Paparrizos et al., published at VLDB 2022 [1].

We first extend the anomaly labels with a slope of length buffer_size//2 on each side of every anomaly. We then sample thresholds uniformly from the anomaly scores and compute the confusion matrix for each threshold. The resulting precision and recall values define a curve, and the area under that curve is the reported score.
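The label extension step can be sketched as follows. This is a minimal illustration and not the library's implementation: the helper name `extend_labels` is hypothetical, and a linear ramp is assumed, whereas the slope shape actually used may differ.

```python
def extend_labels(y_true, buffer_size):
    """Soften binary labels with ramps of length buffer_size // 2.

    Assumption: a linear ramp; the library's slope shape may differ.
    """
    half = buffer_size // 2
    n = len(y_true)
    extended = [float(v) for v in y_true]

    # Collect [start, end) index ranges of consecutive anomalous points.
    ranges = []
    start = None
    for i, v in enumerate(y_true):
        if v and start is None:
            start = i
        elif not v and start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, n))

    # Add an increasing slope before and a decreasing slope after each range.
    for s, e in ranges:
        for k in range(1, half + 1):
            weight = 1.0 - k / (half + 1)
            if s - k >= 0:
                extended[s - k] = max(extended[s - k], weight)
            if e - 1 + k < n:
                extended[e - 1 + k] = max(extended[e - 1 + k], weight)
    return extended


labels = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
print([round(v, 2) for v in extend_labels(labels, buffer_size=4)])
# -> [0.0, 0.0, 0.33, 0.67, 1.0, 1.0, 0.67, 0.33, 0.0, 0.0]
```

Points inside the buffer receive a partial label between 0 and 1, so a detection that is slightly early or late still earns partial credit.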

Parameters:
y_true : np.ndarray

True binary labels of shape (n_instances,).

y_score : np.ndarray

Anomaly scores for each point of the time series of shape (n_instances,).

buffer_size : int, optional

Size of the buffer region around an anomaly. We add an increasing slope of size buffer_size//2 before the beginning of each anomaly and a decreasing slope of size buffer_size//2 after its end. By default (when buffer_size is None), buffer_size is set to the median length of the anomalies within the time series. You can also set it to the period size of the dominant frequency or to any other desired value.
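The default described above can be illustrated with a short sketch. The helper names `anomaly_lengths` and `default_buffer_size` are hypothetical. The truncation of a fractional median to an integer and the fallback of 0 when no anomaly is present are assumptions, not documented library behavior.

```python
from statistics import median


def anomaly_lengths(y_true):
    """Return the length of every run of consecutive anomalous points."""
    lengths = []
    run = 0
    for v in y_true:
        if v:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:  # a run that reaches the end of the series
        lengths.append(run)
    return lengths


def default_buffer_size(y_true):
    """Median anomaly length (assumed: truncated to int, 0 if no anomaly)."""
    lengths = anomaly_lengths(y_true)
    return int(median(lengths)) if lengths else 0


labels = [0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]
print(anomaly_lengths(labels))      # -> [2, 4, 3]
print(default_buffer_size(labels))  # -> 3
```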

skip_check : bool, default False

Whether to skip the input checks.

Returns:
Tuple[float, float]

Range-based PR AUC and range-based ROC AUC.
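Both returned values are areas under curves traced by sweeping a threshold over the anomaly scores. The sketch below shows that idea for a ROC-style curve only, using a simplified point-wise confusion matrix on soft labels. The range-based precision and recall definitions from [1] are more involved and are not reproduced here. The names `sweep_thresholds` and `trapezoid_area`, and the parameter `n_thresholds`, are hypothetical.

```python
def sweep_thresholds(soft_labels, y_score, n_thresholds=5):
    """Return (FPR, TPR) points for thresholds spread uniformly over the scores.

    Assumes both anomalous and normal points are present and that the
    scores are not all identical.
    """
    lo, hi = min(y_score), max(y_score)
    step = (hi - lo) / (n_thresholds - 1)
    thresholds = [hi - i * step for i in range(n_thresholds)]

    pos = sum(soft_labels)            # total (soft) anomaly mass
    neg = len(soft_labels) - pos      # total (soft) normal mass
    points = []
    for t in thresholds:
        pred = [1.0 if s >= t else 0.0 for s in y_score]
        tp = sum(p * l for p, l in zip(pred, soft_labels))
        fp = sum(p * (1.0 - l) for p, l in zip(pred, soft_labels))
        points.append((fp / neg, tp / pos))
    return points


def trapezoid_area(points):
    """Area under a curve through (x, y) points, starting at the origin."""
    pts = [(0.0, 0.0)] + sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


labels = [0, 0, 1, 1, 0, 0]
scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.0]
print(trapezoid_area(sweep_thresholds(labels, scores)))  # -> 1.0
```

In this toy input every anomalous point scores higher than every normal point, so the area is 1.0. Soft labels such as those produced by a buffer extension can be passed in place of the binary ones.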

References

[1]

John Paparrizos, Paul Boniol, Themis Palpanas, Ruey S. Tsay, Aaron Elmore, and Michael J. Franklin. Volume Under the Surface: A New Accuracy Evaluation Measure for Time-Series Anomaly Detection. PVLDB, 15(11): 2774–2787, 2022. doi:10.14778/3551793.3551830