Benchmarking - comparing estimator performance

The benchmarking module allows you to easily orchestrate benchmarking experiments that compare the performance of one or more algorithms over one or more datasets and benchmark configurations. This module is still in development.

Benchmarking in general is very easy to get wrong, producing false conclusions about estimator performance - see this 2022 research from Princeton for numerous examples of such mistakes in peer-reviewed academic papers.

aeon’s benchmarking module provides benchmarking functionality while enforcing best practices and structure to help users avoid mistakes (such as data leakage) that invalidate their results. The module is designed with ease of use in mind, so it interfaces directly with aeon objects and classes. Previously developed estimators should be usable as they are, without alteration.
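The kind of experiment the module orchestrates can be sketched as follows. This is a minimal, self-contained illustration, not aeon's API: the estimators and datasets are toy stand-ins, and the point is the structure - every estimator is evaluated on every dataset with the same fixed train/test split, so results are directly comparable and no test data leaks into training.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_dataset(n=60, n_features=8):
    """A toy two-class dataset with a shifted mean for class 1."""
    X = rng.normal(size=(n, n_features))
    y = rng.integers(0, 2, size=n)
    X[y == 1] += 0.75
    return X, y


class OneNN:
    """1-nearest-neighbour classifier (Euclidean distance)."""

    def fit(self, X, y):
        self.X_, self.y_ = X, y
        return self

    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.X_[None, :, :], axis=2)
        return self.y_[d.argmin(axis=1)]


class MajorityClass:
    """Baseline that always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = int(np.bincount(y).argmax())
        return self

    def predict(self, X):
        return np.full(len(X), self.label_)


estimators = {"1NN": OneNN, "Majority": MajorityClass}
datasets = {f"toy{i}": make_dataset() for i in range(3)}

results = {}  # (estimator name, dataset name) -> test accuracy
for data_name, (X, y) in datasets.items():
    # One split per dataset, shared by every estimator: a fair comparison.
    split = len(X) // 2
    X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]
    for est_name, Est in estimators.items():
        pred = Est().fit(X_tr, y_tr).predict(X_te)
        results[(est_name, data_name)] = float(np.mean(pred == y_te))

for key, acc in sorted(results.items()):
    print(key, round(acc, 3))
```

The benchmarking module automates this estimator-by-dataset loop (and the bookkeeping around it) for real aeon estimators and datasets.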

We also include tools for comparing your results to published work and for testing and visualising the relative performance of algorithms.
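One common way to summarise relative performance over many datasets is the average rank of each algorithm, the statistic underlying critical-difference diagrams. The sketch below is illustrative only - the accuracy table is made up, not published results, and ties are ignored for simplicity (real implementations assign mid-ranks to ties).

```python
import numpy as np

# rows = datasets, columns = algorithms (made-up accuracies)
accuracies = np.array([
    [0.81, 0.79, 0.85],
    [0.72, 0.74, 0.73],
    [0.90, 0.88, 0.91],
    [0.65, 0.66, 0.70],
])
algorithms = ["A", "B", "C"]

# Rank within each dataset: rank 1 = best (highest accuracy).
# argsort of -accuracy orders algorithms best-first; a second argsort
# converts that order into per-algorithm ranks.
order = np.argsort(-accuracies, axis=1)
ranks = np.argsort(order, axis=1) + 1

avg_rank = ranks.mean(axis=0)
for name, r in zip(algorithms, avg_rank):
    print(f"{name}: average rank {r:.2f}")
```

The lower the average rank, the better the algorithm performs across the collection; significance of the differences is then assessed with a statistical test before drawing a diagram.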

These notebooks demonstrate usage of the benchmarking module.
