GreedyGaussianSegmenter
- class GreedyGaussianSegmenter(k_max: int = 10, lamb: float = 1.0, max_shuffles: int = 250, verbose: bool = False, random_state: int | None = None)
Bases: BaseSegmenter
Greedy Gaussian Segmentation Estimator.
The method approximates solutions to the problem of breaking a multivariate time series into segments, where the data in each segment can be modeled as independent samples from a multivariate Gaussian distribution. It uses a dynamic programming search algorithm with a heuristic that finds an approximate solution in time linear in the data length and always makes a locally optimal choice.
Greedy Gaussian Segmentation (GGS) fits a segmented Gaussian model (SGM) to the data by approximately solving the combinatorial problem of maximizing the covariance-regularized log-likelihood for a fixed number of change points and a given regularization strength. It follows an iterative procedure in which a new breakpoint is added and all breakpoints are then adjusted to (approximately) maximize the objective. It is similar to the top-down search used in other change point detection problems.
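The add-then-adjust loop described above can be sketched in a few lines. This is a minimal, univariate toy version, not the aeon or cvxgrp implementation: the names segment_loglik, best_split, adjust and greedy_segment are hypothetical, the objective is a simplified regularized Gaussian log-likelihood standing in for Eq (1) in [1], and breakpoints are revisited in a fixed order.

```python
import math


def segment_loglik(values, lamb):
    """Toy regularized Gaussian log-likelihood of one univariate segment.

    Assumption: a simplified stand-in for the real objective, valid for lamb > 0.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    # Adding lamb / n keeps the variance positive for short or constant segments.
    return -0.5 * n * math.log(var + lamb / n)


def best_split(series, start, stop, lamb):
    """Return (gain, index) for the best single split of series[start:stop]."""
    base = segment_loglik(series[start:stop], lamb)
    best_gain, best_idx = 0.0, None
    for t in range(start + 1, stop):
        gain = (
            segment_loglik(series[start:t], lamb)
            + segment_loglik(series[t:stop], lamb)
            - base
        )
        if gain > best_gain:
            best_gain, best_idx = gain, t
    return best_gain, best_idx


def adjust(series, bounds, lamb, max_passes):
    """Move each interior breakpoint to its best position between its neighbours."""
    for _ in range(max_passes):
        changed = False
        for i in range(1, len(bounds) - 1):
            _, idx = best_split(series, bounds[i - 1], bounds[i + 1], lamb)
            if idx is not None and idx != bounds[i]:
                bounds[i] = idx
                changed = True
        if not changed:
            break
    return bounds


def greedy_segment(series, k_max=1, lamb=1.0, max_passes=5):
    """Greedily add up to k_max breakpoints, adjusting all of them after each add."""
    bounds = [0, len(series)]  # boundaries include the start and the last index + 1
    for _ in range(k_max):
        candidates = [
            best_split(series, a, b, lamb) for a, b in zip(bounds, bounds[1:])
        ]
        gain, idx = max(candidates, key=lambda c: c[0])
        if idx is None:  # no split improves the toy objective
            break
        bounds = adjust(series, sorted(bounds + [idx]), lamb, max_passes)
    return bounds


# Two flat regimes with a level shift at index 10.
series = [0.0, 0.1, -0.1, 0.0, 0.1, -0.1, 0.0, 0.1, -0.1, 0.0,
          5.0, 5.1, 4.9, 5.0, 5.1, 4.9, 5.0, 5.1, 4.9, 5.0]
print(greedy_segment(series, k_max=1))  # [0, 10, 20]
```

The sketch stops early when no candidate split increases the toy objective, so it may return fewer than k_max breakpoints.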
- Parameters:
- k_max: int, default=10
Maximum number of change points to find. The number of segments is thus k+1.
- lamb: float, default=1.0
Regularization parameter lambda (>= 0), which controls the amount of (inverse) covariance regularization, see Eq (1) in [1]. Regularization is introduced to reduce issues for high-dimensional problems. Setting lamb to zero will ignore regularization, whereas large values of lambda will favour simpler models.
- max_shuffles: int, default=250
Maximum number of shuffles.
- verbose: bool, default=False
If True, verbose output is enabled.
- random_state: int or np.random.RandomState, default=None
Either random seed or an instance of np.random.RandomState.
- Attributes:
- change_points_: array_like, default=[]
Locations of change points as integer indexes. By convention change points include the identity segmentation, i.e. first and last index + 1 values.
- _intermediate_change_points: List[List[int]], default=[]
Intermediate values of change points for each value of k = 1…k_max
- _intermediate_ll: List[float], default=[]
Intermediate values for log-likelihood for each value of k = 1…k_max
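As an illustration of the change_points_ convention (boundaries include the first index and the last index + 1), consecutive boundaries can be paired into half-open segments. This is a sketch under that reading of the convention; the helper boundaries_to_segments and the example values are hypothetical and not part of the library.

```python
def boundaries_to_segments(change_points):
    """Pair consecutive boundaries into half-open (start, stop) segments."""
    return list(zip(change_points[:-1], change_points[1:]))


# Hypothetical boundaries for a series of length 10 with one change at index 4.
change_points = [0, 4, 10]
print(boundaries_to_segments(change_points))  # [(0, 4), (4, 10)]
```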
Notes
Capabilities
Missing Values: No
Multithreading: No
Univariate: No
Multivariate: Yes
Based on the work from [1].
source code adapted based on: https://github.com/cvxgrp/GGS
paper available at: https://stanford.edu/~boyd/papers/pdf/ggs.pdf
References
[1] Hallac, D., Nystrup, P. & Boyd, S., “Greedy Gaussian segmentation of multivariate time series.”, Adv Data Anal Classif 13, 727–751 (2019). https://doi.org/10.1007/s11634-018-0335-0
Examples
>>> from aeon.testing.data_generation import make_example_dataframe_series
>>> from sklearn.preprocessing import MinMaxScaler
>>> from aeon.segmentation import GreedyGaussianSegmenter
>>> X = make_example_dataframe_series(n_channels=2, random_state=10)
>>> X_scaled = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)
>>> ggs = GreedyGaussianSegmenter(k_max=3, max_shuffles=5)
>>> y = ggs.fit_predict(X_scaled, axis=0)
Methods
clone([random_state]): Obtain a clone of the object with the same hyperparameters.
fit(X[, y, axis]): Fit time series segmenter to X.
fit_predict(X[, y, axis]): Fit segmentation to data and return it.
get_class_tag(tag_name[, raise_error, ...]): Get tag value from estimator class (only class tags).
get_class_tags(): Get class tags from estimator class and all its parent classes.
get_fitted_params([deep]): Get fitted parameters.
get_params([deep]): Get parameters for this estimator.
get_tag(tag_name[, raise_error, ...]): Get tag value from estimator class.
get_tags(): Get tags from estimator.
predict(X[, axis]): Create and return segmentation of X.
reset([keep]): Reset the object to a clean post-init state.
set_params(**params): Set the parameters of this estimator.
set_tags(**tag_dict): Set dynamic tags to given values.
to_classification(change_points, length): Convert change point locations to a classification vector.
to_clusters(change_points, length): Convert change point locations to a clustering vector.
- clone(random_state=None)
Obtain a clone of the object with the same hyperparameters.
A clone is a different object without shared references, in post-init state. This function is equivalent to returning sklearn.clone of self. Equal in value to type(self)(**self.get_params(deep=False)).
- Parameters:
- random_state: int, RandomState instance, or None, default=None
Sets the random state of the clone. If None, the random state is not set. If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator.
- Returns:
- estimator: object
Instance of type(self), clone of self (see above).
- fit(X, y=None, axis=1)
Fit time series segmenter to X.
If the tag fit_is_empty is true, this just sets the is_fitted tag to true. Otherwise, it checks that self can handle X, formats X into the structure required by self, then passes X (and possibly y) to _fit.
- Parameters:
- X: One of VALID_SERIES_INPUT_TYPES
Input time series to fit a segmenter.
- y: One of VALID_SERIES_INPUT_TYPES or None, default=None
Training time series, a labeled 1D series of the same length as X for supervised segmentation.
- axis: int, default=1
Axis along which to segment if passed a multivariate X series (2D input). If axis is 0, it is assumed each column is a time series and each row is a time point, i.e. the shape of the data is (n_timepoints, n_channels). axis == 1 indicates the time series are in rows, i.e. the shape of the data is (n_channels, n_timepoints). axis is None indicates that the axis of X is the same as self.axis.
- Returns:
- self
Fitted estimator.
- classmethod get_class_tag(tag_name, raise_error=True, tag_value_default=None)
Get tag value from estimator class (only class tags).
- Parameters:
- tag_name: str
Name of tag value.
- raise_error: bool, default=True
Whether a ValueError is raised when the tag is not found.
- tag_value_default: any type, default=None
Default/fallback value if tag is not found and error is not raised.
- Returns:
- tag_value
Value of the tag_name tag in cls. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.
- Raises:
- ValueError
If raise_error is True and tag_name is not in self.get_tags().keys().
Examples
>>> from aeon.classification import DummyClassifier
>>> DummyClassifier.get_class_tag("capability:multivariate")
True
- classmethod get_class_tags()
Get class tags from estimator class and all its parent classes.
- Returns:
- collected_tags: dict
Dictionary of tag name and tag value pairs. Collected from _tags class attribute via nested inheritance. These are not overridden by dynamic tags set by set_tags or class __init__ calls.
- get_fitted_params(deep=True)
Get fitted parameters.
- State required:
Requires state to be “fitted”.
- Parameters:
- deep: bool, default=True
If True, will return the fitted parameters for this estimator and contained subobjects that are estimators.
- Returns:
- fitted_params: dict
Fitted parameter names mapped to their values.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep: bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params: dict
Parameter names mapped to their values.
- get_tag(tag_name, raise_error=True, tag_value_default=None)
Get tag value from estimator class.
Includes dynamic and overridden tags.
- Parameters:
- tag_name: str
Name of tag to be retrieved.
- raise_error: bool, default=True
Whether a ValueError is raised when the tag is not found.
- tag_value_default: any type, default=None
Default/fallback value if tag is not found and error is not raised.
- Returns:
- tag_value
Value of the tag_name tag in self. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.
- Raises:
- ValueError
If raise_error is True and tag_name is not in self.get_tags().keys().
Examples
>>> from aeon.classification import DummyClassifier
>>> d = DummyClassifier()
>>> d.get_tag("capability:multivariate")
True
- get_tags()
Get tags from estimator.
Includes dynamic and overridden tags.
- Returns:
- collected_tags: dict
Dictionary of tag name and tag value pairs. Collected from _tags class attribute via nested inheritance and then any overridden and new tags from __init__ or set_tags.
- predict(X, axis=1)
Create and return segmentation of X.
- Parameters:
- X: One of VALID_SERIES_INPUT_TYPES
Input time series.
- axis: int, default=1
Axis along which to segment if passed a multivariate series (2D input) with n_channels time series. If axis is 0, it is assumed each column is a time series and each row is a time point, i.e. the shape of the data is (n_timepoints, n_channels). axis == 1 indicates the time series are in rows, i.e. the shape of the data is (n_channels, n_timepoints). axis is None indicates that the axis of X is the same as self.axis.
- Returns:
- List
Either a list of indexes of X indicating where each segment begins or a list of integers of len(X) indicating which segment each time point belongs to.
- reset(keep=None)
Reset the object to a clean post-init state.
After a self.reset() call, self is equal or similar in value to type(self)(**self.get_params(deep=False)), assuming no other attributes were kept using keep.
- Detailed behaviour:
- removes any object attributes, except:
  - hyper-parameters (arguments of __init__)
  - object attributes containing double-underscores, i.e., the string “__”
- runs __init__ with current values of hyperparameters (result of get_params)
- Not affected by the reset are:
  - object attributes containing double-underscores
  - class and object methods, class attributes
  - any attributes specified in the keep argument
- Parameters:
- keep: None, str, or list of str, default=None
If None, all attributes are removed except hyperparameters. If str, only the attribute with this name is kept. If list of str, only the attributes with these names are kept.
- Returns:
- self: object
Reference to self.
- Raises:
- TypeError
If ‘keep’ is not a string or a list of strings.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params: dict
Estimator parameters.
- Returns:
- self: estimator instance
Estimator instance.
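To make the <component>__<parameter> naming concrete, the sketch below separates an estimator's own parameters from those addressed to a nested component. It only illustrates the naming convention and is not the library's implementation; the helper split_nested_params and the component name scaler are hypothetical.

```python
def split_nested_params(params):
    """Split keys of the form <component>__<parameter> from plain parameter names."""
    own, nested = {}, {}
    for key, value in params.items():
        # partition splits on the first "__" only, so deeper nesting stays in sub_key.
        name, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params({"k_max": 5, "scaler__feature_range": (0, 1)})
print(own)     # {'k_max': 5}
print(nested)  # {'scaler': {'feature_range': (0, 1)}}
```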
- set_tags(**tag_dict)
Set dynamic tags to given values.
- Parameters:
- **tag_dict: dict
Dictionary of tag name and tag value pairs.
- Returns:
- self: object
Reference to self.
- classmethod to_classification(change_points: list[int], length: int)
Convert change point locations to a classification vector.
Change point detection results can be treated as a classification in which each change point location is marked with a 1 and every other location with a 0.
For example, change points [2, 8] for a time series of length 10 would result in: [0, 0, 1, 0, 0, 0, 0, 0, 1, 0].
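The conversion can be reproduced with a few lines of plain Python. This is a sketch of the behaviour described above, not the library's code; the function name to_classification_sketch is hypothetical.

```python
def to_classification_sketch(change_points, length):
    """Mark each change point index with 1 and every other index with 0."""
    labels = [0] * length
    for cp in change_points:
        labels[cp] = 1
    return labels


print(to_classification_sketch([2, 8], 10))  # [0, 0, 1, 0, 0, 0, 0, 0, 1, 0]
```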
- classmethod to_clusters(change_points: list[int], length: int)
Convert change point locations to a clustering vector.
Change point detection results can be treated as a clustering in which each segment delimited by change points is assigned a distinct dummy label.
For example, change points [2, 8] for a time series of length 10 would result in: [0, 0, 1, 1, 1, 1, 1, 1, 2, 2].
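The same example can be reproduced in plain Python by incrementing a segment label each time a change point is reached. This is a sketch of the behaviour described above, not the library's code; the function name to_clusters_sketch is hypothetical.

```python
def to_clusters_sketch(change_points, length):
    """Label each time point with the index of the segment it falls in."""
    boundaries = set(change_points)
    labels = []
    segment = 0
    for i in range(length):
        if i in boundaries:
            segment += 1  # a change point starts a new segment
        labels.append(segment)
    return labels


print(to_clusters_sketch([2, 8], 10))  # [0, 0, 1, 1, 1, 1, 1, 1, 2, 2]
```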