ShapeletTransformClassifier#
- class ShapeletTransformClassifier(n_shapelet_samples=10000, max_shapelets=None, max_shapelet_length=None, estimator=None, transform_limit_in_minutes=0, time_limit_in_minutes=0, contract_max_n_shapelet_samples=inf, save_transformed_data=False, n_jobs=1, batch_size=100, random_state=None)[source]#
A shapelet transform classifier (STC).
Implementation of the binary shapelet transform classifier pipeline along the lines of [1][2] but with random shapelet sampling. Transforms the data using the configurable RandomShapeletTransform and then builds a RotationForestClassifier on the transformed data.
Because some implementations and applications contract only the transformation, contracting is available either for the transform alone or for both the transform and the classifier.
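The transform step described above can be illustrated with a minimal pure-Python sketch. It assumes the usual definition of a shapelet feature (the smallest Euclidean distance between the shapelet and any equal-length window of the series) and leaves out details such as normalisation, random sampling and information-gain filtering. The function names are hypothetical and are not part of the library.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any window of the series."""
    m = len(shapelet)
    best = math.inf  # stays inf if the shapelet is longer than the series
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, shapelet)))
        best = min(best, dist)
    return best


def shapelet_transform(X, shapelets):
    """Map each series to one distance feature per shapelet."""
    return [[shapelet_distance(series, s) for s in shapelets] for series in X]
```

The resulting feature matrix has one row per series and one column per shapelet, which is the kind of tabular input a standard classifier can be fitted on.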
- Parameters:
- n_shapelet_samplesint, default=10000
The number of candidate shapelets to be considered for the final transform. Filtered down to <= max_shapelets, keeping the shapelets with the most information gain.
- max_shapeletsint or None, default=None
Max number of shapelets to keep for the final transform. Each class value will have its own max, set to max_shapelets / n_classes_. If None, uses the minimum between 10 * n_instances_ and 1000.
- max_shapelet_lengthint or None, default=None
Upper bound on candidate shapelet lengths for the transform. If None, no max length is used.
- estimatorBaseEstimator or None, default=None
Base estimator for the ensemble; an sklearn BaseEstimator can be supplied. If None, a default RotationForestClassifier is used.
- transform_limit_in_minutesint, default=0
Time contract to limit transform time in minutes for the shapelet transform, overriding n_shapelet_samples. A value of 0 means n_shapelet_samples is used.
- time_limit_in_minutesint, default=0
Time contract to limit build time in minutes, overriding n_shapelet_samples and transform_limit_in_minutes. The estimator will only be contracted if it has a time_limit_in_minutes parameter. Default of 0 means n_shapelet_samples or transform_limit_in_minutes is used.
- contract_max_n_shapelet_samplesint, default=np.inf
Max number of shapelets to extract when contracting the transform with transform_limit_in_minutes or time_limit_in_minutes.
- save_transformed_databool, default=False
Save the data transformed in fit in transformed_data_ for use in _get_train_probs.
- n_jobsint, default=1
The number of jobs to run in parallel for both fit and predict. -1 means using all processors.
- batch_sizeint or None, default=100
Number of shapelet candidates processed before being merged into the set of best shapelets in the transform.
- random_stateint, RandomState instance or None, default=None
If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
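The interaction between the sampling and contract parameters can be summarised in a short sketch. It only encodes the precedence rules and the max_shapelets default stated in the parameter descriptions. The per-class cap assumes the total is split evenly across classes with a floor of one, and both function names are hypothetical helpers rather than library functions.

```python
def resolve_max_shapelets(max_shapelets, n_instances, n_classes):
    """Return (total cap, per-class cap) for the shapelets kept by the transform."""
    if max_shapelets is None:
        # Documented default: the minimum of 10 * n_instances_ and 1000.
        max_shapelets = min(10 * n_instances, 1000)
    # Assumption: the per-class cap is the total split evenly, never below 1.
    per_class = max(1, max_shapelets // n_classes)
    return max_shapelets, per_class


def governing_budget(n_shapelet_samples=10000, transform_limit_in_minutes=0,
                     time_limit_in_minutes=0):
    """Return the (name, value) of the setting that limits the shapelet search."""
    if time_limit_in_minutes > 0:
        # The overall contract overrides both other settings.
        return ("time_limit_in_minutes", time_limit_in_minutes)
    if transform_limit_in_minutes > 0:
        # The transform contract overrides the sample count.
        return ("transform_limit_in_minutes", transform_limit_in_minutes)
    return ("n_shapelet_samples", n_shapelet_samples)
```

For instance, with 20 training cases and two classes, the default cap resolves to 200 shapelets in total.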
- Attributes:
- classes_list
The unique class labels in the training set.
- n_classes_int
The number of unique classes in the training set.
- fit_time_int
The time (in milliseconds) for fit to run.
- n_instances_int
The number of train cases in the training set.
- n_dims_int
The number of dimensions per case in the training set.
- series_length_int
The length of each series in the training set.
- transformed_data_list of shape (n_estimators) of ndarray
The transformed training dataset for all classifiers. Only saved when
save_transformed_data
is True.
See also
RandomShapeletTransform
The randomly sampled shapelet transform.
RotationForestClassifier
The default rotation forest classifier used.
Notes
For the Java version, see tsml.
References
[1]Jon Hills et al., “Classification of time series by shapelet transformation”, Data Mining and Knowledge Discovery, 28(4), 851–881, 2014.
[2]A. Bostrom and A. Bagnall, “Binary Shapelet Transform for Multiclass Time Series Classification”, Transactions on Large-Scale Data and Knowledge Centered Systems, 32, 2017.
Examples
>>> from aeon.classification.shapelet_based import ShapeletTransformClassifier
>>> from aeon.classification.sklearn import RotationForestClassifier
>>> from aeon.datasets import load_unit_test
>>> X_train, y_train = load_unit_test(split="train", return_X_y=True)
>>> X_test, y_test = load_unit_test(split="test", return_X_y=True)
>>> clf = ShapeletTransformClassifier(
...     estimator=RotationForestClassifier(n_estimators=3),
...     n_shapelet_samples=100,
...     max_shapelets=10,
...     batch_size=20,
... )
>>> clf.fit(X_train, y_train)
ShapeletTransformClassifier(...)
>>> y_pred = clf.predict(X_test)
Methods
check_is_fitted() Check if the estimator has been fitted.
clone() Obtain a clone of the object with same hyper-parameters.
clone_tags(estimator[, tag_names]) Clone/mirror tags from another estimator as dynamic override.
create_test_instance([parameter_set]) Construct Estimator instance if possible.
create_test_instances_and_names([parameter_set]) Create list of all test instances and a list of names for them.
fit(X, y) Fit time series classifier to training data.
get_class_tag(tag_name[, tag_value_default]) Get tag value from estimator class (only class tags).
get_class_tags() Get class tags from estimator class and all its parent classes.
get_fitted_params([deep]) Get fitted parameters.
get_param_defaults() Get parameter defaults for the object.
get_param_names() Get parameter names for the object.
get_params([deep]) Get parameters for this estimator.
get_tag(tag_name[, tag_value_default, ...]) Get tag value from estimator class and dynamic tag overrides.
get_tags() Get tags from estimator class and dynamic tag overrides.
get_test_params([parameter_set]) Return testing parameter settings for the estimator.
is_composite() Check if the object is composite.
load_from_path(serial) Load object from file location.
load_from_serial(serial) Load object from serialized memory container.
predict(X) Predicts labels for time series in X.
predict_proba(X) Predicts label probabilities for sequences in X.
reset() Reset the object to a clean post-init state.
save([path]) Save serialized self to bytes-like object or to (.zip) file.
score(X, y) Scores predicted labels against ground truth labels on X.
set_params(**params) Set the parameters of this object.
set_tags(**tag_dict) Set dynamic tags to given values.
- classmethod get_test_params(parameter_set='default')[source]#
Return testing parameter settings for the estimator.
- Parameters:
- parameter_setstr, default=”default”
Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return “default” set. ShapeletTransformClassifier provides the following special sets:
- “results_comparison” - used in some classifiers to compare against
previously generated results where the default set of parameters cannot produce suitable probability estimates
- “contracting” - used in classifiers that set the
“capability:contractable” tag to True to test contracting functionality
- “train_estimate” - used in some classifiers that set the
“capability:train_estimate” tag to True to allow for more efficient testing when relevant parameters are available
- Returns:
- paramsdict or list of dict, default={}
Parameters to create testing instances of the class. Each dict contains the parameters to construct an “interesting” test instance, i.e., MyClass(**params) or MyClass(**params[i]) creates a valid test instance. create_test_instance uses the first (or only) dictionary in params.
- check_is_fitted()[source]#
Check if the estimator has been fitted.
- Raises:
- NotFittedError
If the estimator has not been fitted yet.
- clone()[source]#
Obtain a clone of the object with same hyper-parameters.
A clone is a different object without shared references, in post-init state. This function is equivalent to returning sklearn.clone of self. Equal in value to type(self)(**self.get_params(deep=False)).
- Returns:
- instance of type(self), clone of self (see above)
- clone_tags(estimator, tag_names=None)[source]#
Clone/mirror tags from another estimator as a dynamic override.
- Parameters:
- estimatorestimator inheriting from :class:BaseEstimator
- tag_namesstr or list of str, default = None
Names of tags to clone. If None then all tags in estimator are used as tag_names.
- Returns:
- Self
Reference to self.
Notes
Changes object state by setting tag values in tag_names from estimator as dynamic tags in self.
- classmethod create_test_instance(parameter_set='default')[source]#
Construct Estimator instance if possible.
- Parameters:
- parameter_setstr, default=”default”
Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return “default” set.
- Returns:
- instanceinstance of the class with default parameters
Notes
get_test_params can return dict or list of dict. This function takes first or single dict that get_test_params returns, and constructs the object with that.
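The first-or-only rule from the note above, together with the naming convention used by create_test_instances_and_names, can be sketched as follows. ToyEstimator and the two helper functions are hypothetical stand-ins, not the library's implementation.

```python
class ToyEstimator:
    """Hypothetical estimator with two test parameter sets."""

    def __init__(self, a=1, b=2):
        self.a = a
        self.b = b

    @classmethod
    def get_test_params(cls):
        # May return a single dict or a list of dicts.
        return [{"a": 1}, {"a": 3, "b": 4}]


def make_test_instance(cls):
    """Build one instance from the first (or only) parameter dict."""
    params = cls.get_test_params()
    if isinstance(params, list):
        params = params[0]
    return cls(**params)


def make_test_instances_and_names(cls):
    """Build every test instance plus a name for each."""
    params = cls.get_test_params()
    if isinstance(params, dict):
        params = [params]
    objs = [cls(**p) for p in params]
    if len(objs) > 1:
        names = [f"{cls.__name__}-{i}" for i in range(len(objs))]
    else:
        names = [cls.__name__]
    return objs, names
```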
- classmethod create_test_instances_and_names(parameter_set='default')[source]#
Create list of all test instances and a list of names for them.
- Parameters:
- parameter_setstr, default=”default”
Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return “default” set.
- Returns:
- objslist of instances of cls
i-th instance is cls(**cls.get_test_params()[i])
- nameslist of str, same length as objs
The i-th element is the name of the i-th instance of obj in tests. The convention is {cls.__name__}-{i} if there is more than one instance, otherwise {cls.__name__}.
- fit(X, y)[source]#
Fit time series classifier to training data.
- Parameters:
- X3D np.array (any number of channels, equal length series)
of shape [n_instances, n_channels, series_length]
- or 2D np.array (univariate, equal length series)
of shape [n_instances, series_length]
- y1D np.array of int, of shape [n_instances] - class labels for fitting
indices correspond to instance indices in X
- Returns:
- selfReference to self.
Notes
Changes state by creating a fitted model that updates attributes ending in “_” and sets is_fitted flag to True.
- classmethod get_class_tag(tag_name, tag_value_default=None)[source]#
Get tag value from estimator class (only class tags).
- Parameters:
- tag_namestr
Name of tag value.
- tag_value_defaultany type
Default/fallback value if tag is not found.
- Returns:
- tag_value
Value of the tag_name tag in self. If not found, returns tag_value_default.
- classmethod get_class_tags()[source]#
Get class tags from estimator class and all its parent classes.
- Returns:
- collected_tagsdict
Dictionary of tag name : tag value pairs. Collected from _tags class attribute via nested inheritance. NOT overridden by dynamic tags set by set_tags or clone_tags.
- get_fitted_params(deep=True)[source]#
Get fitted parameters.
- State required:
Requires state to be “fitted”.
- Parameters:
- deepbool, default=True
Whether to return fitted parameters of components.
If True, will return a dict of parameter name : value for this object, including fitted parameters of fittable components (= BaseEstimator-valued parameters).
If False, will return a dict of parameter name : value for this object, but not include fitted parameters of components.
- Returns:
- fitted_paramsdict with str-valued keys
Dictionary of fitted parameters, paramname : paramvalue key-value pairs include:
- always: all fitted parameters of this object, as via get_param_names. Values are the fitted parameter value for that key, of this object.
- if deep=True, also contains key/value pairs of component parameters. Parameters of components are indexed as [componentname]__[paramname]; all parameters of componentname appear as paramname with its value.
- if deep=True, also contains arbitrary levels of component recursion, e.g., [componentname]__[componentcomponentname]__[paramname], etc.
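The key naming for nested components can be illustrated with a small sketch. ToyFitted is a hypothetical class that stores its own fitted values and its fittable components in plain dictionaries; it only mirrors the documented [componentname]__[paramname] convention.

```python
class ToyFitted:
    """Hypothetical fitted object with optional fitted components."""

    def __init__(self, fitted, components=None):
        self.fitted = fitted                # this object's own fitted parameters
        self.components = components or {}  # component name -> ToyFitted

    def get_fitted_params(self, deep=True):
        params = dict(self.fitted)
        if deep:
            for name, component in self.components.items():
                # Recursion yields keys such as estimator__pca__n_features_.
                for key, value in component.get_fitted_params(deep=True).items():
                    params[f"{name}__{key}"] = value
        return params
```

With deep=False only the object's own fitted parameters are returned, so the dictionary has no keys containing a double underscore.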
- classmethod get_param_defaults()[source]#
Get parameter defaults for the object.
- Returns:
- default_dict: dict with str keys
keys are all parameters of cls that have a default defined in __init__ values are the defaults, as defined in __init__
- classmethod get_param_names()[source]#
Get parameter names for the object.
- Returns:
- param_names: list of str, alphabetically sorted list of parameter names of cls
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- get_tag(tag_name, tag_value_default=None, raise_error=True)[source]#
Get tag value from estimator class and dynamic tag overrides.
- Parameters:
- tag_namestr
Name of tag to be retrieved
- tag_value_defaultany type, optional; default=None
Default/fallback value if tag is not found
- raise_errorbool
whether a ValueError is raised when the tag is not found
- Returns:
- tag_value
Value of the tag_name tag in self. If not found, raises an error if raise_error is True, otherwise it returns tag_value_default.
- Raises:
- ValueError
If raise_error is True and tag_name is not in self.get_tags().keys().
- get_tags()[source]#
Get tags from estimator class and dynamic tag overrides.
- Returns:
- collected_tagsdict
Dictionary of tag name : tag value pairs. Collected from _tags class attribute via nested inheritance and then any overrides and new tags from _tags_dynamic object attribute.
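The difference between class tags and dynamic overrides can be sketched in a few lines. The example assumes tags live in a _tags class attribute and a _tags_dynamic instance attribute, as the descriptions above state. The classes and helper functions are hypothetical and simplified.

```python
class ToyBase:
    _tags = {"capability:multivariate": False, "capability:contractable": False}


class ToyChild(ToyBase):
    _tags = {"capability:contractable": True}


def collect_class_tags(cls):
    """Merge _tags from the most distant parent down to cls itself."""
    tags = {}
    for parent in reversed(cls.__mro__):
        # vars() reads only what the class defines itself, not inherited values.
        tags.update(vars(parent).get("_tags", {}))
    return tags


def collect_tags(obj):
    """Class tags, then any per-instance dynamic overrides."""
    tags = collect_class_tags(type(obj))
    tags.update(getattr(obj, "_tags_dynamic", {}))
    return tags


def get_tag(obj, tag_name, tag_value_default=None, raise_error=True):
    """Look up one tag, raising or falling back when it is missing."""
    tags = collect_tags(obj)
    if tag_name in tags:
        return tags[tag_name]
    if raise_error:
        raise ValueError(f"Tag {tag_name!r} not found")
    return tag_value_default
```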
- is_composite()[source]#
Check if the object is composite.
A composite object is an object which contains objects, as parameters. Called on an instance, since this may differ by instance.
- Returns:
- composite: bool, whether self contains a parameter which is BaseObject
- classmethod load_from_path(serial)[source]#
Load object from file location.
- Parameters:
- serialresult of ZipFile(path).open(“object”)
- Returns:
- deserialized self resulting in output at path, of cls.save(path)
- classmethod load_from_serial(serial)[source]#
Load object from serialized memory container.
- Parameters:
- serial1st element of output of cls.save(None)
- Returns:
- deserialized self resulting in output serial, of cls.save(None)
- predict(X) ndarray [source]#
Predicts labels for time series in X.
- Parameters:
- X3D np.array of shape (n_instances, n_channels, series_length)
or 2D np.array of shape (n_instances, series_length)
- Returns:
- y1D np.array of int, of shape [n_instances] - predicted class labels
indices correspond to instance indices in X
- predict_proba(X) ndarray [source]#
Predicts label probabilities for sequences in X.
- Parameters:
- X3D np.array of shape (n_cases, n_channels, series_length)
or 2D np.array of shape (n_cases, series_length)
- Returns:
- y2D array of shape (n_cases, n_classes) - predicted class probabilities
First dimension indices correspond to instance indices in X, second dimension indices correspond to class labels, (i, j)-th entry is estimated probability that i-th instance is of class j
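The relation between the probability matrix and class labels can be shown with a small sketch. It assumes the common convention that the predicted label is the class whose column holds the largest probability, with columns ordered as in classes_. The helper name is hypothetical.

```python
def labels_from_proba(proba, classes):
    """Pick, for each row, the class label with the highest probability."""
    labels = []
    for row in proba:
        # Index of the largest entry; ties resolve to the first such index.
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best])
    return labels
```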
- reset()[source]#
Reset the object to a clean post-init state.
Equivalent to sklearn.clone but overwrites self. After self.reset() call, self is equal in value to type(self)(**self.get_params(deep=False)).
Detailed behaviour: removes any object attributes, except:
- hyper-parameters = arguments of __init__
- object attributes containing double-underscores, i.e., the string “__”
Runs __init__ with current values of hyper-parameters (result of get_params).
Not affected by the reset are:
- object attributes containing double-underscores
- class and object methods, class attributes
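The reset rules above can be sketched with a hypothetical class. ToyResettable is not the library's base class; it only reproduces the documented behaviour of dropping fitted state, keeping attributes that contain “__”, and re-running __init__ with the current hyper-parameters.

```python
import inspect


class ToyResettable:
    """Hypothetical object illustrating the documented reset rules."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self):
        # Hyper-parameters are the arguments of __init__.
        names = inspect.signature(type(self).__init__).parameters
        return {name: getattr(self, name) for name in names if name != "self"}

    def reset(self):
        params = self.get_params()
        for name in list(vars(self)):
            # Attributes containing a double underscore survive the reset.
            if "__" not in name:
                delattr(self, name)
        # Restore the hyper-parameters by running __init__ again.
        self.__init__(**params)
        return self
```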
- save(path=None)[source]#
Save serialized self to bytes-like object or to (.zip) file.
Behaviour:
- if path is None, returns an in-memory serialized self
- if path is a file location, stores self at that location as a zip file
Saved files are zip files with the following contents:
- _metadata - contains class of self, i.e., type(self)
- _obj - serialized self. This class uses the default serialization (pickle).
- Parameters:
- pathNone or file location (str or Path)
If None, self is saved to an in-memory object. If a file location, self is saved to that file location. For example:
- path=”estimator”: a zip file estimator.zip will be made at cwd.
- path=”/home/stored/estimator”: a zip file estimator.zip will be stored in /home/stored/.
- Returns:
- if path is None - in-memory serialized self
- if path is file location - ZipFile with reference to the file
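The zip layout described for save can be reproduced with the standard library alone. The sketch below assumes both entries are written with pickle and uses hypothetical helper names; how the library itself encodes _metadata may differ.

```python
import pickle
import zipfile
from pathlib import Path


def save_to_zip(obj, path):
    """Write obj to '<path>.zip' with _metadata and _obj entries."""
    zip_path = Path(str(path) + ".zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("_metadata", pickle.dumps(type(obj)))  # class of obj
        archive.writestr("_obj", pickle.dumps(obj))             # serialized obj
    return zip_path


def load_from_zip(zip_path):
    """Read the serialized object back from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return pickle.loads(archive.read("_obj"))
```

As with any pickle-based format, archives should only be loaded from trusted sources.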
- score(X, y) float [source]#
Scores predicted labels against ground truth labels on X.
- Parameters:
- X3D np.array (any number of channels, equal length series)
of shape [n_instances, n_channels, series_length]
- or 2D np.array (univariate, equal length series)
of shape [n_instances, series_length]
- y1D np.ndarray of shape [n_instances] - class labels (ground truth)
indices correspond to instance indices in X
- Returns:
- float, accuracy score of predict(X) vs y
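The accuracy returned by score is the fraction of predictions that match the ground truth. A minimal sketch, with a hypothetical helper name and plain lists in place of arrays:

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)
```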
- set_params(**params)[source]#
Set the parameters of this object.
The method works on simple estimators as well as on nested objects. The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
BaseObject parameters
- Returns:
- selfreference to self (after parameters have been set)
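The <component>__<parameter> convention can be sketched with two hypothetical classes. The mixin below splits each key at the first double underscore and forwards the remainder to the named component. It performs no validation of parameter names, which a real implementation would be expected to add.

```python
class NestedParams:
    """Hypothetical mixin implementing the <component>__<parameter> convention."""

    def set_params(self, **params):
        nested = {}
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            if sep:
                # Defer nested keys so a replaced component receives them.
                nested.setdefault(name, {})[sub_key] = value
            else:
                setattr(self, name, value)
        for name, sub_params in nested.items():
            getattr(self, name).set_params(**sub_params)
        return self


class ToyForest(NestedParams):
    def __init__(self, n_estimators=200):
        self.n_estimators = n_estimators


class ToyClassifier(NestedParams):
    def __init__(self, estimator=None, n_shapelet_samples=10000):
        self.estimator = estimator
        self.n_shapelet_samples = n_shapelet_samples
```

Calling set_params(n_shapelet_samples=100, estimator__n_estimators=3) on a ToyClassifier updates the outer object and its estimator in a single call.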