range_pr_roc_auc_support
- range_pr_roc_auc_support(y_true: ndarray, y_score: ndarray, buffer_size: int | None = None, skip_check: bool = False) -> tuple[float, float]
Compute the range-based PR and ROC AUC.
Computes the area under the precision-recall curve and the area under the receiver operating characteristic (ROC) curve using the range-based precision and range-based recall definitions from Paparrizos et al., published at VLDB 2022 [1].
We first extend the anomaly labels by two slopes of buffer_size//2 length on both sides of each anomaly, uniformly sample thresholds from the anomaly score, and then compute the confusion matrix for all thresholds. Using the resulting precision and recall values, we can plot a curve and compute its area.
- Parameters:
- y_true : np.ndarray
True binary labels of shape (n_instances,).
- y_score : np.ndarray
Anomaly scores for each point of the time series of shape (n_instances,).
- buffer_size : int, optional
Size of the buffer region around an anomaly. We add an increasing slope of size buffer_size//2 to the beginning of anomalies and a decreasing slope of size buffer_size//2 to the end of anomalies. By default (when buffer_size is None), buffer_size is the median length of the anomalies within the time series. However, you can also set it to the period size of the dominant frequency or any other desired value.
- skip_check : bool, default False
Whether to skip the input checks.
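The buffer_size behaviour described above can be illustrated with a small pure-Python sketch. It is not the library's implementation: the helper names (anomaly_ranges, default_buffer_size, extend_labels) are hypothetical, and the linear slope shape is an assumption made for illustration, since this page does not state the exact shape of the slopes.

```python
from statistics import median


def anomaly_ranges(y_true):
    """Return (start, end) index pairs (end exclusive) for each run of 1s."""
    ranges = []
    start = None
    for i, label in enumerate(y_true):
        if label and start is None:
            start = i
        elif not label and start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, len(y_true)))
    return ranges


def default_buffer_size(y_true):
    """Median anomaly length, mirroring the documented default."""
    lengths = [end - start for start, end in anomaly_ranges(y_true)]
    return int(median(lengths)) if lengths else 0


def extend_labels(y_true, buffer_size=None):
    """Add a rising slope before and a falling slope after each anomaly.

    Assumption: the slope is linear; the library may use a different shape.
    """
    if buffer_size is None:
        buffer_size = default_buffer_size(y_true)
    half = buffer_size // 2
    soft = [float(v) for v in y_true]
    n = len(y_true)
    for start, end in anomaly_ranges(y_true):
        for k in range(1, half + 1):
            # Weight decays with the distance k from the anomaly border.
            weight = 1.0 - k / (half + 1)
            before, after = start - k, end - 1 + k
            if before >= 0:
                soft[before] = max(soft[before], weight)
            if after < n:
                soft[after] = max(soft[after], weight)
    return soft


labels = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
soft_labels = extend_labels(labels)
```

Here the single anomaly has length 4, so the default buffer_size is 4 and each slope covers buffer_size//2 = 2 points on either side of the anomaly.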
- Returns:
- Tuple[float, float]
Range-based PR AUC and range-based ROC AUC.
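To show how the two returned areas relate to the threshold sweep, the following sketch computes point-wise PR and ROC areas in pure Python. It is a deliberate simplification: it uses the ordinary point-wise confusion matrix rather than the range-based precision and recall of [1], and all names (confusion_counts, uniform_thresholds, trapezoid_area, pointwise_pr_roc_auc, n_thresholds) are hypothetical and not part of the library. Its results will generally differ from what range_pr_roc_auc_support returns.

```python
def confusion_counts(y_true, y_score, threshold):
    """Point-wise TP, FP, FN, TN for predictions score >= threshold."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, y_score):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def uniform_thresholds(y_score, n_thresholds=50):
    """Evenly spaced thresholds from max(y_score) down to min(y_score).

    Requires n_thresholds >= 2.
    """
    lo, hi = min(y_score), max(y_score)
    last = n_thresholds - 1
    return [lo + (hi - lo) * (last - i) / last for i in range(n_thresholds)]


def trapezoid_area(xs, ys):
    """Area under the piecewise-linear curve through (xs[i], ys[i])."""
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
        for i in range(len(xs) - 1)
    )


def pointwise_pr_roc_auc(y_true, y_score, n_thresholds=50):
    """Return (PR AUC, ROC AUC) from a uniform threshold sweep."""
    recalls, precisions, fprs = [], [], []
    for t in uniform_thresholds(y_score, n_thresholds):
        tp, fp, fn, tn = confusion_counts(y_true, y_score, t)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
        precisions.append(tp / (tp + fp) if tp + fp else 1.0)
        fprs.append(fp / (fp + tn) if fp + tn else 0.0)
    # Anchor both curves at recall 0 before integrating.
    pr_auc = trapezoid_area([0.0] + recalls, [1.0] + precisions)
    roc_auc = trapezoid_area([0.0] + fprs, [0.0] + recalls)
    return pr_auc, roc_auc


y_true = [0, 0, 0, 1, 1, 0, 0, 0]
y_score = [0.1, 0.2, 0.1, 0.9, 0.8, 0.2, 0.1, 0.0]
pr_auc, roc_auc = pointwise_pr_roc_auc(y_true, y_score)
```

In this toy input the scores separate the anomaly from the normal points perfectly, so both areas reach their maximum value.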
References
[1] John Paparrizos, Paul Boniol, Themis Palpanas, Ruey S. Tsay, Aaron Elmore, and Michael J. Franklin. Volume Under the Surface: A New Accuracy Evaluation Measure for Time-Series Anomaly Detection. PVLDB, 15(11): 2774-2787, 2022. doi:10.14778/3551793.3551830