load_classification

load_classification(name, split=None, extract_path=None, return_metadata=False, load_equal_length: bool = True, load_no_missing: bool = True)

Load a classification dataset.

This function loads TSC problems into memory, downloading them from https://timeseriesclassification.com/ if the data is not available at the specified local path. To load a problem from local files, give their location in extract_path. The function assumes the data is stored as <extract_path>/<name>/<name>_TRAIN.ts and <extract_path>/<name>/<name>_TEST.ts. To load a file from a full path, use the function load_from_tsfile directly. If you do not specify extract_path, the path is set to aeon/datasets/local_data.
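The directory layout described above can be sketched in a few lines. This is a minimal illustration only: expected_ts_paths is a hypothetical helper, not part of aeon, and it only builds the paths the loader is documented to look for, without touching the disk or the network.

```python
from pathlib import Path


def expected_ts_paths(name, extract_path, split=None):
    """Return the .ts paths implied by <extract_path>/<name>/<name>_<SPLIT>.ts.

    Hypothetical helper for illustration; not part of aeon.
    """
    root = Path(extract_path) / name
    # With split=None both partitions are expected, otherwise only the one asked for.
    splits = ["TRAIN", "TEST"] if split is None else [split.upper()]
    return [root / f"{name}_{s}.ts" for s in splits]


paths = expected_ts_paths("ArrowHead", "Temp")
print([p.as_posix() for p in paths])
# ['Temp/ArrowHead/ArrowHead_TRAIN.ts', 'Temp/ArrowHead/ArrowHead_TEST.ts']
```

If none of these files exist under extract_path, the function described here falls back to downloading the problem.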

This function can load timestamped data, but it does not store the time stamps. Time stamp loading is fragile: it only works if all data are floats.

Data is assumed to be in the standard .ts format: each row is a (possibly multivariate) time series. Each dimension is separated by a colon, and each value in a series is comma separated. For examples, see aeon.datasets.data. ArrowHead is an example of a univariate equal length problem, BasicMotions an equal length multivariate problem. See https://www.aeon-toolkit.org/en/stable/api_reference/file_specifications/ts.html for formatting details.
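To make the row format concrete, here is a rough sketch of how a single data row could be split. parse_ts_row is a hypothetical helper, not aeon's parser. It assumes that the last colon-separated field is the class label and that every value is a float, and it ignores the header lines that real .ts files also contain.

```python
def parse_ts_row(line):
    """Split one .ts data row into a list of dimensions and a class label.

    Hypothetical helper, not aeon's parser. Assumes the final
    colon-separated field is the class label and all values are floats.
    """
    *dimensions, label = line.strip().split(":")
    # Each dimension is a comma-separated run of values.
    series = [[float(v) for v in dim.split(",")] for dim in dimensions]
    return series, label


series, label = parse_ts_row("1.0,2.0,3.0:4.0,5.0,6.0:standing")
print(len(series), len(series[0]), label)  # 2 3 standing
```

For real files, use load_classification or load_from_tsfile rather than a hand-written parser like this one.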

Parameters:

name: str: Name of data set. If a dataset that is listed in tsc_datasets is given, this function will look in the extract_path first, and if it is not present, attempt to download the data from www.timeseriesclassification.com, saving it to the extract_path.
split: None or str {"train", "test"}, default=None: Whether to load the train or test partition of the problem. By default it loads both into a single dataset, otherwise it looks only for files of the format <name>_TRAIN.ts or <name>_TEST.ts.
extract_path: str, default=None: The path to look for the data. If no path is provided, the function looks in aeon/datasets/data/. If a path is given, it can be absolute, e.g. C:/Temp/ or relative, e.g. Temp/ or ./Temp/.
return_metadata: boolean, default=False: If True, returns a tuple (X, y, metadata); otherwise returns (X, y).
load_equal_length: boolean, default=True: This is for the case when the standard release has unequal length series. The downloaded zips for these contain a version made equal length through truncation. These versions all have the suffix _eq after the name. If this flag is set to True, the function first attempts to load files called <name>_eq_TRAIN.ts/TEST.ts. If these are not present, it will load the normal version.
load_no_missing: boolean, default=True: This is for the case when the standard release has missing values. The downloaded zips for these contain a version with imputed missing values. These versions all have the suffix _nmv after the name. If this flag is set to True, the function first attempts to load files called <name>_nmv_TRAIN.ts/TEST.ts. If these are not present, it will load the normal version.
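The fallback behaviour of load_equal_length and load_no_missing can be pictured as a list of file names tried in turn. The sketch below is illustrative only: candidate_filenames is a hypothetical helper, the problem name "MyProblem" is made up, and the relative order of the _eq and _nmv variants when both flags are set is an assumption, not something the documentation above states.

```python
def candidate_filenames(name, split, load_equal_length=True, load_no_missing=True):
    """List file names to try, preferred variants first.

    Hypothetical helper, not aeon's code. The relative order of the
    _eq and _nmv variants is an assumption made for this sketch.
    """
    split = split.upper()
    candidates = []
    if load_equal_length:
        candidates.append(f"{name}_eq_{split}.ts")
    if load_no_missing:
        candidates.append(f"{name}_nmv_{split}.ts")
    # The normal version is always the last resort.
    candidates.append(f"{name}_{split}.ts")
    return candidates


print(candidate_filenames("MyProblem", "train"))
# ['MyProblem_eq_TRAIN.ts', 'MyProblem_nmv_TRAIN.ts', 'MyProblem_TRAIN.ts']
```

With both flags set to False, only the normal <name>_TRAIN.ts or <name>_TEST.ts file is considered.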

Returns:

X: np.ndarray or list of np.ndarray
y: numpy array: The class labels for each case in X
metadata: optional: Returned only if return_metadata is True. Contains the following metadata: problemname, timestamps, missing, univariate, equallength, class_values. For classification problems, targetlabel should be False and classlabel True.

Raises:

URLError or HTTPError: If the website is not accessible.
ValueError: If a dataset name that does not exist on the repo is given or if a webpage is requested that does not exist.

Examples

>>> from aeon.datasets import load_classification
>>> X, y = load_classification(name="ArrowHead")