load_monster_dataset
- load_monster_dataset(dataset_name: str, fold: int = 0) -> tuple[ndarray, ndarray, ndarray, ndarray]
Load a MONSTER dataset from Hugging Face Hub.
MONSTER (the MONash Scalable Time Series Evaluation Repository), introduced in [1], is a collection of large datasets for time series classification. The collection is hosted on Hugging Face Hub.
- Parameters:
- dataset_name : str
The name of the dataset to load (e.g., “CornellWhaleChallenge”, “AudioMNIST”).
- fold : int, default=0
The specific cross-validation fold index to load. This determines which samples are used for the test set. Defaults to fold 0.
- Returns:
- X_train : np.ndarray
The training data, shape (n_train_cases, n_channels, n_timepoints), with n_channels=1 for univariate datasets.
- y_train : np.ndarray
The training class labels, shape (n_train_cases,).
- X_test : np.ndarray
The testing data, shape (n_test_cases, n_channels, n_timepoints).
- y_test : np.ndarray
The testing class labels, shape (n_test_cases,).
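The shapes listed above can be checked with a few lines of code once the four arrays are loaded. The following is a minimal sketch, not part of the library: `check_split_shapes` is a hypothetical helper, the stand-in objects and their sizes are made up so the example runs without downloading anything, and the requirement that train and test share n_channels and n_timepoints is an assumption drawn from the shared symbols in the shape descriptions.

```python
from types import SimpleNamespace


def check_split_shapes(X_train, y_train, X_test, y_test):
    """Check the four arrays against the documented shapes and summarise them.

    Hypothetical helper: it only reads the ``.shape`` attribute, so it works
    with NumPy arrays or any stand-in object that exposes ``.shape``.
    """
    for split, X, y in (("train", X_train, y_train), ("test", X_test, y_test)):
        if len(X.shape) != 3:
            raise ValueError(
                f"X_{split} must be 3D (n_cases, n_channels, n_timepoints), "
                f"got shape {X.shape}"
            )
        if tuple(y.shape) != (X.shape[0],):
            raise ValueError(
                f"y_{split} must have shape ({X.shape[0]},), got {y.shape}"
            )
    # Assumption: both splits share n_channels and n_timepoints.
    if tuple(X_train.shape[1:]) != tuple(X_test.shape[1:]):
        raise ValueError("train and test disagree on (n_channels, n_timepoints)")
    return {
        "n_train_cases": X_train.shape[0],
        "n_test_cases": X_test.shape[0],
        "n_channels": X_train.shape[1],
        "n_timepoints": X_train.shape[2],
    }


# Stand-ins with made-up sizes, so the sketch runs without downloading data.
fake_split = (
    SimpleNamespace(shape=(8, 1, 100)),  # X_train
    SimpleNamespace(shape=(8,)),         # y_train
    SimpleNamespace(shape=(2, 1, 100)),  # X_test
    SimpleNamespace(shape=(2,)),         # y_test
)
summary = check_split_shapes(*fake_split)
```

With real data, the same helper could be applied directly to the returned tuple, for example `check_split_shapes(*load_monster_dataset("AudioMNIST"))`, since the function returns the arrays in the order X_train, y_train, X_test, y_test.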
- Raises:
- ModuleNotFoundError
If the required optional dependency ‘huggingface-hub’ is not installed.
- ValueError
If the dataset_name is not recognized or the fold number is invalid.
- OSError
If the download fails due to network issues.
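The three documented exceptions call for different responses: install a package, fix an argument, or retry later. The sketch below shows one way a caller might separate them. `load_with_diagnostics` and `offline_loader` are hypothetical names introduced for illustration, and the loader is passed in as an argument so the example runs without network access; in practice `load_monster_dataset` would be passed in its place.

```python
def load_with_diagnostics(loader, dataset_name, fold=0):
    """Call ``loader`` and map each documented exception to a readable hint.

    Returns ``(data, None)`` on success and ``(None, message)`` on failure.
    Hypothetical wrapper, not part of the library.
    """
    try:
        return loader(dataset_name, fold=fold), None
    except ModuleNotFoundError:
        return None, "missing optional dependency: install 'huggingface-hub'"
    except ValueError as err:
        return None, f"unknown dataset name or invalid fold: {err}"
    except OSError as err:
        return None, f"download failed, check the network and retry: {err}"


def offline_loader(dataset_name, fold=0):
    """Stand-in loader that simulates a network failure."""
    raise OSError("connection timed out")


data, problem = load_with_diagnostics(offline_loader, "AudioMNIST")
print(problem)
```

The order of the `except` clauses does not matter here, because `ModuleNotFoundError`, `ValueError`, and `OSError` are unrelated branches of Python's exception hierarchy.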
Notes
The data files are cached locally by the huggingface-hub library, avoiding repeated downloads. This function requires the optional dependency huggingface-hub.
References
[1] Dempster, A., Mohammadi Foumani, N., Tan, C. W., Miller, L., Mishra, A., Salehi, M., Pelletier, C., Schmidt, D. F., & Webb, G. I. (2025). MONSTER: Monash Scalable Time Series Evaluation Repository. arXiv preprint arXiv:2502.15122.