SupervisedIntervalClassifier
- class SupervisedIntervalClassifier(n_intervals=50, min_interval_length=3, features=None, metric='fisher', randomised_split_point=True, normalise_for_search=True, estimator=None, random_state=None, n_jobs=1, parallel_backend=None)
Bases: BaseClassifier
Supervised Interval Classifier.
Extracts multiple intervals from series using a supervised process and concatenates them into a feature vector. Builds an estimator on the transformed data.
- Parameters:
- n_intervals : int, default=50
The number of times the supervised interval selection process is run. Each run extracts more than one interval and outputs a varying number of features, depending on the series length, the number of dimensions and the number of features.
- min_interval_length : int, default=3
The minimum length of extracted intervals. Minimum value of 3.
- features : callable, list of callables, default=None
Functions used to extract features from selected intervals. Each must take a 2d array of shape (n_cases, interval_length) and return a 1d array of shape (n_cases) containing the features. If None, defaults to the following statistics used in [2]: [mean, median, std, slope, min, max, iqr, count_mean_crossing, count_above_mean].
- metric : ["fisher"] or callable, default="fisher"
The metric used to evaluate the usefulness of a feature extracted on an interval. If "fisher", the Fisher score is used. If a callable, it must take a 1d array of shape (n_cases) and return a 1d array of scores of shape (n_cases).
- randomised_split_point : bool, default=True
If True, the split point for interval extraction is randomised, as is done in [2], rather than splitting in half.
- normalise_for_search : bool, default=True
If True, the data is normalised for the supervised interval search process. Features extracted for the transform output will not use normalised data.
- estimator : sklearn classifier, optional, default=None
An sklearn estimator to be built using the transformed data. Defaults to sklearn RandomForestClassifier(n_estimators=200).
- random_state : None, int or instance of RandomState, default=None
Seed or RandomState object used for random number generation. If random_state is None, use the RandomState singleton used by np.random. If random_state is an int, use a new RandomState instance seeded with seed.
- n_jobs : int, default=1
The number of jobs to run in parallel for both fit and transform functions. -1 means using all processors.
- parallel_backend : str, ParallelBackendBase instance or None, default=None
Specify the parallelisation backend implementation in joblib. If None, a 'prefer' value of "threads" is used by default. Valid options are "loky", "multiprocessing", "threading" or a custom backend. See the joblib Parallel documentation for more details.
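To make the default metric concrete, the following is a minimal sketch of the textbook Fisher score for a single feature: the class-size-weighted spread of class means around the overall mean, divided by the class-size-weighted within-class variance. It uses only the standard library and is an illustration of the idea, not aeon's implementation; the function name fisher_score and the choice to return 0.0 for a zero denominator are assumptions made here.

```python
from statistics import fmean, pvariance


def fisher_score(values, labels):
    """Textbook Fisher score of one feature (illustrative, not aeon's code)."""
    overall_mean = fmean(values)
    between, within = 0.0, 0.0
    for cls in set(labels):
        members = [v for v, lab in zip(values, labels) if lab == cls]
        # Spread of this class mean around the overall mean, weighted by size.
        between += len(members) * (fmean(members) - overall_mean) ** 2
        # Variance inside the class, weighted by size.
        within += len(members) * pvariance(members)
    # Assumption: return 0.0 when every class is constant on this feature.
    return between / within if within > 0 else 0.0


feature = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
separating = fisher_score(feature, [0, 0, 0, 1, 1, 1])  # classes well apart
mixed = fisher_score(feature, [0, 1, 0, 1, 0, 1])       # classes interleaved
print(separating, mixed)
```

A feature that separates the classes scores much higher than one whose values are interleaved across classes, which is why such a score can rank candidate intervals.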
- Attributes:
- n_cases_ : int
The number of train cases.
- n_channels_ : int
The number of dimensions per case.
- n_timepoints_ : int
The length of each series.
- n_classes_ : int
Number of classes. Extracted from the data.
- classes_ : ndarray of shape (n_classes)
Holds the label for each class.
See also
SupervisedIntervals
Notes
Capabilities
- Missing Values: No
- Multithreading: Yes
- Univariate: Yes
- Multivariate: Yes
- Unequal Length: No
- Train Estimate: No
- Contractable: No
Examples
>>> from aeon.classification.interval_based import SupervisedIntervalClassifier
>>> from sklearn.ensemble import RandomForestClassifier
>>> from aeon.testing.data_generation import make_example_3d_numpy
>>> X, y = make_example_3d_numpy(n_cases=10, n_channels=1, n_timepoints=12,
...                              return_y=True, random_state=0)
>>> clf = SupervisedIntervalClassifier(
...     estimator=RandomForestClassifier(n_estimators=5),
...     n_intervals=2,
...     random_state=0,
... )
>>> clf.fit(X, y)
SupervisedIntervalClassifier(...)
>>> clf.predict(X)
array([0, 1, 0, 1, 0, 0, 1, 1, 1, 0])
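The features parameter accepts callables that map a 2d array of shape (n_cases, interval_length) to a 1d array of shape (n_cases). A minimal sketch of two such functions follows; interval_range and interval_energy are hypothetical names chosen for illustration and are not part of aeon.

```python
import numpy as np


def interval_range(X):
    """Hypothetical feature: max minus min of each interval, one value per case."""
    X = np.asarray(X)
    return X.max(axis=1) - X.min(axis=1)


def interval_energy(X):
    """Hypothetical feature: sum of squared values in each interval."""
    X = np.asarray(X)
    return (X ** 2).sum(axis=1)


# Two cases, interval length 3.
intervals = np.array([[1.0, 2.0, 3.0], [4.0, 4.0, 10.0]])
for func in (interval_range, interval_energy):
    out = func(intervals)
    # Each function returns a 1d array with one feature value per case.
    assert out.shape == (intervals.shape[0],)
```

Under the contract described for features, a list such as [interval_range, interval_energy] would be passed as the features argument.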
Methods
clone([random_state]): Obtain a clone of the object with the same hyperparameters.
fit(X, y): Fit time series classifier to training data.
fit_predict(X, y, **kwargs): Fits the classifier and predicts class labels for X.
fit_predict_proba(X, y, **kwargs): Fits the classifier and predicts class label probabilities for X.
get_class_tag(tag_name[, raise_error, ...]): Get tag value from estimator class (only class tags).
get_class_tags(): Get class tags from estimator class and all its parent classes.
get_fitted_params([deep]): Get fitted parameters.
get_params([deep]): Get parameters for this estimator.
get_tag(tag_name[, raise_error, ...]): Get tag value from estimator class.
get_tags(): Get tags from estimator.
predict(X): Predicts class labels for time series in X.
predict_proba(X): Predicts class label probabilities for time series in X.
reset([keep]): Reset the object to a clean post-init state.
score(X, y[, metric, use_proba, metric_params]): Scores predicted labels against ground truth labels on X.
set_params(**params): Set the parameters of this estimator.
set_tags(**tag_dict): Set dynamic tags to given values.
- clone(random_state=None)
Obtain a clone of the object with the same hyperparameters.
A clone is a different object without shared references, in post-init state. This function is equivalent to returning sklearn.clone of self. Equal in value to type(self)(**self.get_params(deep=False)).
- Parameters:
- random_state : int, RandomState instance, or None, default=None
Sets the random state of the clone. If None, the random state is not set. If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator.
- Returns:
- estimator : object
Instance of type(self), clone of self (see above).
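The expression type(self)(**self.get_params(deep=False)) can be tried on a toy class. The sketch below uses a hypothetical ToyEstimator, which is not an aeon class, to show that the result is a new object with the same hyperparameters and no fitted state.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator; not an aeon class."""

    def __init__(self, n_intervals=50, random_state=None):
        self.n_intervals = n_intervals
        self.random_state = random_state

    def get_params(self, deep=True):
        return {"n_intervals": self.n_intervals, "random_state": self.random_state}

    def fit(self):
        # Fitted state lives outside the hyperparameters.
        self.is_fitted_ = True
        return self


original = ToyEstimator(n_intervals=5, random_state=0).fit()
copy = type(original)(**original.get_params(deep=False))

print(copy is original)                             # False: a different object
print(copy.get_params() == original.get_params())   # True: same hyperparameters
print(hasattr(copy, "is_fitted_"))                  # False: post-init state
```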
- fit(X, y) -> BaseCollectionEstimator
Fit time series classifier to training data.
- Parameters:
- X : np.ndarray or list
Input data in one of the following forms: a 3D np.ndarray of shape (n_cases, n_channels, n_timepoints) (any number of channels, equal length series); a 2D np.ndarray of shape (n_cases, n_timepoints) (univariate, equal length series); or a list of n_cases 2D np.ndarrays, each of shape (n_channels, n_timepoints_i), where n_timepoints_i is the length of series i (any number of channels, unequal length series). Other types are allowed and converted into one of the above.
Different estimators have different capabilities to handle different types of input. If self.get_tag("capability:multivariate") is False, they cannot handle multivariate series, so either n_channels == 1 is true or X is 2D of shape (n_cases, n_timepoints). If self.get_tag("capability:unequal_length") is False, they cannot handle unequal length input. In both situations, a ValueError is raised if X has a characteristic that the estimator does not have the capability to handle.
- y : np.ndarray
1D np.array of float or str, of shape (n_cases). Class labels (ground truth) for fitting; indices correspond to instance indices in X.
- Returns:
- self : BaseClassifier
Reference to self.
Notes
Changes state by creating a fitted model that updates attributes ending in "_" and sets is_fitted flag to True.
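The difference between equal and unequal length input can be shown with plain nested lists. The sketch below is only an illustration of the (n_channels, n_timepoints_i) layout described for X; series_lengths and is_equal_length are hypothetical helpers, and aeon's own input validation is not reproduced here.

```python
def series_lengths(X):
    """Return the set of series lengths found in a list of cases.

    Each case is a list of channels and each channel is a list of values,
    mirroring the (n_channels, n_timepoints_i) layout.
    """
    return {len(channel) for case in X for channel in case}


def is_equal_length(X):
    """True when every series in the collection has the same length."""
    return len(series_lengths(X)) <= 1


equal = [[[1, 2, 3]], [[4, 5, 6]]]   # two univariate cases, both of length 3
unequal = [[[1, 2, 3]], [[4, 5]]]    # lengths 3 and 2

print(is_equal_length(equal), is_equal_length(unequal))
```

An estimator whose "capability:unequal_length" tag is False would be expected to reject a collection like the second one with a ValueError.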
- fit_predict(X, y, **kwargs) -> ndarray
Fits the classifier and predicts class labels for X.
fit_predict produces prediction estimates using just the train data. By default, this is through 10-fold cross-validation, although some estimators may utilise specialist techniques such as out-of-bag estimates or leave-one-out cross-validation.
Classifiers which override _fit_predict will have the capability:train_estimate tag set to True.
Generally, this will not be the same as fitting on the whole train data then making train predictions. To do that, call fit(X, y).predict(X).
- Parameters:
- X : np.ndarray or list
Input data in one of the following forms: a 3D np.ndarray of shape (n_cases, n_channels, n_timepoints) (any number of channels, equal length series); a 2D np.ndarray of shape (n_cases, n_timepoints) (univariate, equal length series); or a list of n_cases 2D np.ndarrays, each of shape (n_channels, n_timepoints_i), where n_timepoints_i is the length of series i (any number of channels, unequal length series). Other types are allowed and converted into one of the above.
Different estimators have different capabilities to handle different types of input. If self.get_tag("capability:multivariate") is False, they cannot handle multivariate series, so either n_channels == 1 is true or X is 2D of shape (n_cases, n_timepoints). If self.get_tag("capability:unequal_length") is False, they cannot handle unequal length input. In both situations, a ValueError is raised if X has a characteristic that the estimator does not have the capability to handle.
- y : np.ndarray
1D np.array of float or str, of shape (n_cases). Class labels (ground truth) for fitting; indices correspond to instance indices in X.
- kwargs : dict
Keyword arguments to configure the default cross-validation if the base class default fit_predict is used (i.e. if the function _fit_predict is not overridden). If _fit_predict is overridden, kwargs may not function as expected. If _fit_predict is not overridden, the valid input is cv_size, an integer giving the number of cross-validation folds used to estimate the train data. If cv_size is not passed, the default is 10. If cv_size is greater than the minimum number of samples in any class, it is set to this minimum.
- Returns:
- predictions : np.ndarray
Shape [n_cases]. Predicted class labels; indices correspond to instance indices in X.
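The rule for cv_size described above (default 10, capped at the size of the smallest class) can be written out as a short sketch. The helper effective_cv_size is a hypothetical name used for illustration and is not an aeon function.

```python
from collections import Counter


def effective_cv_size(y, cv_size=10):
    """Cap the requested number of folds at the size of the smallest class."""
    smallest_class = min(Counter(y).values())
    return min(cv_size, smallest_class)


labels = ["a"] * 6 + ["b"] * 4
print(effective_cv_size(labels))             # 4: class "b" has only 4 samples
print(effective_cv_size(labels, cv_size=3))  # 3: request is already small enough
```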
- fit_predict_proba(X, y, **kwargs) -> ndarray
Fits the classifier and predicts class label probabilities for X.
fit_predict_proba produces probability estimates using just the train data. By default, this is through 10-fold cross-validation, although some estimators may utilise specialist techniques such as out-of-bag estimates or leave-one-out cross-validation.
Classifiers which override _fit_predict_proba will have the capability:train_estimate tag set to True.
Generally, this will not be the same as fitting on the whole train data then making train predictions. To do that, call fit(X, y).predict_proba(X).
- Parameters:
- X : np.ndarray or list
Input data in one of the following forms: a 3D np.ndarray of shape (n_cases, n_channels, n_timepoints) (any number of channels, equal length series); a 2D np.ndarray of shape (n_cases, n_timepoints) (univariate, equal length series); or a list of n_cases 2D np.ndarrays, each of shape (n_channels, n_timepoints_i), where n_timepoints_i is the length of series i (any number of channels, unequal length series). Other types are allowed and converted into one of the above.
Different estimators have different capabilities to handle different types of input. If self.get_tag("capability:multivariate") is False, they cannot handle multivariate series, so either n_channels == 1 is true or X is 2D of shape (n_cases, n_timepoints). If self.get_tag("capability:unequal_length") is False, they cannot handle unequal length input. In both situations, a ValueError is raised if X has a characteristic that the estimator does not have the capability to handle.
- y : np.ndarray
1D np.array of float or str, of shape (n_cases). Class labels (ground truth) for fitting; indices correspond to instance indices in X.
- kwargs : dict
Keyword arguments to configure the default cross-validation if the base class default fit_predict is used (i.e. if the function _fit_predict is not overridden). If _fit_predict is overridden, kwargs may not function as expected. If _fit_predict is not overridden, the valid input is cv_size, an integer giving the number of cross-validation folds used to estimate the train data. If cv_size is not passed, the default is 10. If cv_size is greater than the minimum number of samples in any class, it is set to this minimum.
- Returns:
- probabilities : np.ndarray
2D array of shape (n_cases, n_classes). Predicted class probabilities. First dimension indices correspond to instance indices in X, second dimension indices correspond to class labels. The (i, j)-th entry is the estimated probability that the i-th instance is of class j.
- classmethod get_class_tag(tag_name, raise_error=True, tag_value_default=None)
Get tag value from estimator class (only class tags).
- Parameters:
- tag_name : str
Name of tag value.
- raise_error : bool, default=True
Whether a ValueError is raised when the tag is not found.
- tag_value_default : any type, default=None
Default/fallback value if tag is not found and error is not raised.
- Returns:
- tag_value
Value of the tag_name tag in cls. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.
- Raises:
- ValueError
If raise_error is True and tag_name is not in self.get_tags().keys().
Examples
>>> from aeon.classification import DummyClassifier
>>> DummyClassifier.get_class_tag("capability:multivariate")
True
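The interaction between raise_error and tag_value_default can be sketched with a plain dictionary. The helper lookup_tag and its error message are hypothetical and only mirror the lookup rules documented above; they are not aeon code.

```python
def lookup_tag(tags, tag_name, raise_error=True, tag_value_default=None):
    """Hypothetical helper mirroring the documented tag lookup rules."""
    if tag_name in tags:
        return tags[tag_name]
    if raise_error:
        raise ValueError(f"Tag with name {tag_name} could not be found.")
    # Tag missing and errors suppressed: fall back to the supplied default.
    return tag_value_default


tags = {"capability:multivariate": True}

found = lookup_tag(tags, "capability:multivariate")
fallback = lookup_tag(tags, "no_such_tag", raise_error=False, tag_value_default="n/a")

try:
    lookup_tag(tags, "no_such_tag")
    raised = False
except ValueError:
    raised = True

print(found, fallback, raised)
```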
- classmethod get_class_tags()
Get class tags from estimator class and all its parent classes.
- Returns:
- collected_tags : dict
Dictionary of tag name and tag value pairs. Collected from _tags class attribute via nested inheritance. These are not overridden by dynamic tags set by set_tags or class __init__ calls.
- get_fitted_params(deep=True)
Get fitted parameters.
- State required:
Requires state to be "fitted".
- Parameters:
- deep : bool, default=True
If True, will return the fitted parameters for this estimator and contained subobjects that are estimators.
- Returns:
- fitted_params : dict
Fitted parameter names mapped to their values.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- get_tag(tag_name, raise_error=True, tag_value_default=None)
Get tag value from estimator class.
Includes dynamic and overridden tags.
- Parameters:
- tag_name : str
Name of tag to be retrieved.
- raise_error : bool, default=True
Whether a ValueError is raised when the tag is not found.
- tag_value_default : any type, default=None
Default/fallback value if tag is not found and error is not raised.
- Returns:
- tag_value
Value of the tag_name tag in self. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.
- Raises:
- ValueError
If raise_error is True and tag_name is not in self.get_tags().keys().
Examples
>>> from aeon.classification import DummyClassifier
>>> d = DummyClassifier()
>>> d.get_tag("capability:multivariate")
True
- get_tags()
Get tags from estimator.
Includes dynamic and overridden tags.
- Returns:
- collected_tags : dict
Dictionary of tag name and tag value pairs. Collected from _tags class attribute via nested inheritance and then any overridden and new tags from __init__ or set_tags.
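The phrase "collected from the _tags class attribute via nested inheritance, then overridden by dynamic tags" can be sketched as two dictionary merges. The classes Base and Child, the helper collect_class_tags and the dynamic_tags dictionary below are hypothetical; the sketch shows one plausible way to obtain this behaviour and is not aeon's implementation.

```python
class Base:
    _tags = {"capability:multivariate": False, "capability:multithreading": False}


class Child(Base):
    _tags = {"capability:multivariate": True}


def collect_class_tags(cls):
    """Merge _tags dicts from the most distant parent down to cls."""
    collected = {}
    for klass in reversed(cls.__mro__):
        # vars() looks only at attributes defined directly on klass.
        collected.update(vars(klass).get("_tags", {}))
    return collected


class_tags = collect_class_tags(Child)
# Assumption: dynamic tags (for example from set_tags) are layered on top.
dynamic_tags = {"capability:multithreading": True}
all_tags = {**class_tags, **dynamic_tags}

print(class_tags)
print(all_tags)
```

In this picture, get_class_tags corresponds to class_tags alone, while get_tags corresponds to all_tags.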
- predict(X) -> ndarray
Predicts class labels for time series in X.
- Parameters:
- X : np.ndarray or list
Input data in one of the following forms: a 3D np.ndarray of shape (n_cases, n_channels, n_timepoints) (any number of channels, equal length series); a 2D np.ndarray of shape (n_cases, n_timepoints) (univariate, equal length series); or a list of n_cases 2D np.ndarrays, each of shape (n_channels, n_timepoints_i), where n_timepoints_i is the length of series i (any number of channels, unequal length series). Other types are allowed and converted into one of the above.
Different estimators have different capabilities to handle different types of input. If self.get_tag("capability:multivariate") is False, they cannot handle multivariate series, so either n_channels == 1 is true or X is 2D of shape (n_cases, n_timepoints). If self.get_tag("capability:unequal_length") is False, they cannot handle unequal length input. In both situations, a ValueError is raised if X has a characteristic that the estimator does not have the capability to handle.
- Returns:
- predictions : np.ndarray
1D np.array of float, of shape (n_cases). Predicted class labels; indices correspond to instance indices in X.
- predict_proba(X) -> ndarray
Predicts class label probabilities for time series in X.
- Parameters:
- X : np.ndarray or list
Input data in one of the following forms: a 3D np.ndarray of shape (n_cases, n_channels, n_timepoints) (any number of channels, equal length series); a 2D np.ndarray of shape (n_cases, n_timepoints) (univariate, equal length series); or a list of n_cases 2D np.ndarrays, each of shape (n_channels, n_timepoints_i), where n_timepoints_i is the length of series i (any number of channels, unequal length series). Other types are allowed and converted into one of the above.
Different estimators have different capabilities to handle different types of input. If self.get_tag("capability:multivariate") is False, they cannot handle multivariate series, so either n_channels == 1 is true or X is 2D of shape (n_cases, n_timepoints). If self.get_tag("capability:unequal_length") is False, they cannot handle unequal length input. In both situations, a ValueError is raised if X has a characteristic that the estimator does not have the capability to handle.
- Returns:
- probabilities : np.ndarray
2D array of shape (n_cases, n_classes). Predicted class probabilities. First dimension indices correspond to instance indices in X, second dimension indices correspond to class labels. The (i, j)-th entry is the estimated probability that the i-th instance is of class j.
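Reading the probability matrix is easiest with a small worked example. The sketch below assumes that column j corresponds to the j-th entry of a list of class labels (as classes_ would hold) and picks the most probable class per case; labels_from_probabilities is a hypothetical helper, and ties are resolved in favour of the first maximum, which is a choice made here rather than documented behaviour.

```python
def labels_from_probabilities(probabilities, classes):
    """Pick, for each case, the class whose column holds the largest probability."""
    predictions = []
    for row in probabilities:
        # Index of the largest entry in this row (first one wins on ties).
        best_column = max(range(len(row)), key=row.__getitem__)
        predictions.append(classes[best_column])
    return predictions


classes = ["cat", "dog"]
probabilities = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]  # shape (3 cases, 2 classes)

print(labels_from_probabilities(probabilities, classes))  # ['cat', 'dog', 'cat']
```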
- reset(keep=None)
Reset the object to a clean post-init state.
After a self.reset() call, self is equal or similar in value to type(self)(**self.get_params(deep=False)), assuming no other attributes were kept using keep.
- Detailed behaviour:
- removes any object attributes, except:
  - hyper-parameters (arguments of __init__)
  - object attributes containing double-underscores, i.e., the string "__"
- runs __init__ with current values of hyperparameters (result of get_params)
- Not affected by the reset are:
  - object attributes containing double-underscores
  - class and object methods, class attributes
  - any attributes specified in the keep argument
- Parameters:
- keep : None, str, or list of str, default=None
If None, all attributes are removed except hyperparameters. If str, only the attribute with this name is kept. If list of str, only the attributes with these names are kept.
- Returns:
- self : object
Reference to self.
- Raises:
- TypeError
If 'keep' is not a string or a list of strings.
- score(X, y, metric='accuracy', use_proba=False, metric_params=None) -> float
Scores predicted labels against ground truth labels on X.
- Parameters:
- X : np.ndarray or list
Input data in one of the following forms: a 3D np.ndarray of shape (n_cases, n_channels, n_timepoints) (any number of channels, equal length series); a 2D np.ndarray of shape (n_cases, n_timepoints) (univariate, equal length series); or a list of n_cases 2D np.ndarrays, each of shape (n_channels, n_timepoints_i), where n_timepoints_i is the length of series i (any number of channels, unequal length series). Other types are allowed and converted into one of the above.
Different estimators have different capabilities to handle different types of input. If self.get_tag("capability:multivariate") is False, they cannot handle multivariate series, so either n_channels == 1 is true or X is 2D of shape (n_cases, n_timepoints). If self.get_tag("capability:unequal_length") is False, they cannot handle unequal length input. In both situations, a ValueError is raised if X has a characteristic that the estimator does not have the capability to handle.
- y : np.ndarray
1D np.array of float or str, of shape (n_cases). Class labels (ground truth); indices correspond to instance indices in X.
- metric : Union[str, callable], default="accuracy"
Defines the scoring metric to test the fit of the model. For supported string arguments, see sklearn.metrics.get_scorer_names.
- use_proba : bool, default=False
Whether the scorer works on probability estimates.
- metric_params : dict, default=None
Contains parameters to be passed to the scoring function. If None, no parameters are passed.
- Returns:
- score : float
Score of the predictions for X against y under the chosen metric (accuracy by default).
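With the default metric, the returned value is the fraction of cases whose predicted label matches the ground truth. A minimal standard-library sketch of that calculation follows; the function name accuracy is chosen here for illustration and the code does not call aeon or scikit-learn.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


y_true = [0, 1, 0, 1]
y_pred = [0, 1, 1, 1]
print(accuracy(y_true, y_pred))  # 0.75: three of four labels match
```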
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
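The <component>__<parameter> naming can be illustrated by splitting each key at its first double underscore. The helper split_nested_params below is a hypothetical sketch of that routing step only; it does not set anything on a real estimator, and the parameter names in the example are illustrative.

```python
def split_nested_params(params):
    """Group '<component>__<parameter>' keys by component name."""
    own, nested = {}, {}
    for key, value in params.items():
        component, delimiter, sub_key = key.partition("__")
        if delimiter:
            # Key addresses a parameter of a nested component.
            nested.setdefault(component, {})[sub_key] = value
        else:
            # Key addresses a parameter of the estimator itself.
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"n_intervals": 10, "estimator__n_estimators": 100, "estimator__max_depth": 3}
)
print(own)     # {'n_intervals': 10}
print(nested)  # {'estimator': {'n_estimators': 100, 'max_depth': 3}}
```

Under this convention, a key such as estimator__n_estimators would target the n_estimators parameter of the object held in the estimator parameter.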