load_classification

load_classification(name, split=None, extract_path=None, return_metadata=False, load_equal_length: bool = True, load_no_missing: bool = True)

Load a classification dataset.

This function loads TSC problems into memory, downloading from https://timeseriesclassification.com/ if the data is not available at the specified local path. To load a problem from a local file, specify its location in extract_path. The function assumes the data is stored as <extract_path>/<name>/<name>_TRAIN.ts and <extract_path>/<name>/<name>_TEST.ts. To load a file directly from a full path, use the function load_from_tsfile instead. If you do not specify extract_path, the path is set to aeon/datasets/local_data. If the problem is not present in extract_path, the function attempts to download the data from https://timeseriesclassification.com/.
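The file layout described above can be sketched as follows. This is a minimal illustration using only the standard library; the helper name expected_ts_paths is hypothetical and is not part of aeon.

```python
from pathlib import Path


def expected_ts_paths(extract_path, name):
    """Return the TRAIN and TEST file paths implied by the layout above.

    Hypothetical helper for illustration only; it is not part of aeon.
    """
    problem_dir = Path(extract_path) / name
    return (
        problem_dir / f"{name}_TRAIN.ts",
        problem_dir / f"{name}_TEST.ts",
    )


train_path, test_path = expected_ts_paths("Temp", "ArrowHead")
print(train_path.as_posix())  # Temp/ArrowHead/ArrowHead_TRAIN.ts
print(test_path.as_posix())   # Temp/ArrowHead/ArrowHead_TEST.ts
```

Passing the same "Temp" directory as extract_path with name="ArrowHead" is what would make the loader look for these two files.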

This function can load timestamped data, but it does not store the timestamps. Timestamp loading is fragile: it only works if all data are floats.

Data is assumed to be in the standard .ts format: each row is a (possibly multivariate) time series. Dimensions are separated by colons, and the values within a series are separated by commas. For examples, see aeon.datasets.data. ArrowHead is an example of a univariate equal length problem, and BasicMotions is an equal length multivariate problem. See https://www.aeon-toolkit.org/en/stable/api_reference/file_specifications/ts.html for formatting details.
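As a rough illustration of the row layout just described, the sketch below splits one data row into dimensions and values. It assumes every value is a float and ignores everything else a real .ts file may contain (such as header lines, class labels, timestamps, or missing values). parse_row is a hypothetical helper, not an aeon function, and the input row is made up.

```python
def parse_row(row):
    """Split one row into a list of dimensions, each a list of floats.

    Hypothetical helper: dimensions are colon separated and values are
    comma separated. Assumes the row holds only numeric series data.
    """
    return [
        [float(value) for value in dimension.split(",")]
        for dimension in row.strip().split(":")
    ]


# A made-up two-dimensional series with three time points per dimension.
series = parse_row("1.0,2.0,3.0:4.0,5.0,6.0")
print(series)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

For real files, load_from_tsfile or load_classification remain the appropriate entry points; this sketch only shows how the two separators nest.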

Parameters:
name: str

Name of the dataset. If the name is listed in tsc_datasets, this function looks in extract_path first and, if the data is not present, attempts to download it from www.timeseriesclassification.com, saving it to extract_path.

split: None or str {"train", "test"}, default=None

Whether to load the train or test partition of the problem. By default, both are loaded into a single dataset; otherwise the function looks only for files of the format <name>_TRAIN.ts or <name>_TEST.ts.

extract_path: str, default=None

The path to look in for the data. If no path is provided, the function looks in aeon/datasets/data/. If a path is given, it can be absolute, e.g. C:/Temp/, or relative, e.g. Temp/ or ./Temp/.

return_metadata: boolean, default=False

If True, returns a tuple (X, y, metadata).

load_equal_length: boolean, default=True

This applies when the standard release has unequal length series. The downloaded zip for such problems contains a version made equal length through truncation. These versions all have the suffix _eq after the name. If this flag is set to True, the function first attempts to load files called <name>_eq_TRAIN.ts/TEST.ts. If these are not present, it loads the normal version.

load_no_missing: boolean, default=True

This applies when the standard release has missing values. The downloaded zip for such problems contains a version with imputed missing values. These versions all have the suffix _nmv after the name. If this flag is set to True, the function first attempts to load files called <name>_nmv_TRAIN.ts/TEST.ts. If these are not present, it loads the normal version.
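The fallback described for load_equal_length and load_no_missing can be sketched as below. This is an illustration of the lookup order for a single suffix under stated assumptions, not aeon's implementation: the helper name pick_ts_file, the problem name "Demo", and the temporary directory are all hypothetical, and how the two flags combine is not covered.

```python
import tempfile
from pathlib import Path


def pick_ts_file(problem_dir, name, split, suffix):
    """Prefer <name><suffix>_<SPLIT>.ts, else fall back to <name>_<SPLIT>.ts.

    Hypothetical helper for illustration; suffix would be "_eq" or "_nmv".
    """
    problem_dir = Path(problem_dir)
    preferred = problem_dir / f"{name}{suffix}_{split.upper()}.ts"
    if preferred.exists():
        return preferred
    return problem_dir / f"{name}_{split.upper()}.ts"


with tempfile.TemporaryDirectory() as tmp:
    # Only the standard file exists, so the lookup falls back to it.
    (Path(tmp) / "Demo_TRAIN.ts").touch()
    first = pick_ts_file(tmp, "Demo", "train", "_eq").name

    # Once the equal-length variant exists, it is preferred.
    (Path(tmp) / "Demo_eq_TRAIN.ts").touch()
    second = pick_ts_file(tmp, "Demo", "train", "_eq").name

print(first)   # Demo_TRAIN.ts
print(second)  # Demo_eq_TRAIN.ts
```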

Returns:
X: np.ndarray or list of np.ndarray
y: np.ndarray

The class label for each case in X.

metadata: optional

Returns the following metadata: 'problemname', timestamps, missing, univariate, equallength, class_values. targetlabel should be false and classlabel true.

Raises:
URLError or HTTPError

If the website is not accessible.

ValueError

If a dataset name that does not exist on the repo is given or if a webpage is requested that does not exist.

Examples

>>> from aeon.datasets import load_classification
>>> X, y = load_classification(name="ArrowHead")