HMM
- class HMM(emission_funcs: list, transition_prob_mat: ndarray, initial_probs: Optional[ndarray] = None)
Implements a simple HMM fitted with the Viterbi algorithm.
The HMM annotation estimator uses the Viterbi algorithm to fit a sequence of ‘hidden state’ class annotations (represented by an array of integers the same size as the observation) to a sequence of observations.
This is done by finding the most likely path given the emission probabilities (i.e. the probability that a particular observation would be generated by a given hidden state), the transition probabilities (i.e. the probability of transitioning from one state to another or staying in the same state) and the initial probabilities (i.e. the belief about the probability distribution of hidden states at the start of the observation sequence).
- Current assumptions/limitations of this implementation:
- the spacing of time series points is assumed to be equivalent.
- it only works on univariate data.
- the emission parameters and transition probabilities are assumed to be known.
- if no initial probs are passed, uniform probabilities are assigned (i.e. rather than the stationary distribution).
- it requires and returns np.ndarrays.
_fit is currently empty, as the parameters of the probability distributions must be passed to the algorithm.
_predict: first the transition_probability and transition_id matrices are calculated. These are both n x m matrices, where n is the number of hidden states and m is the number of observations. The transition probability matrix records the probability of the most likely sequence in which observation m is assigned to hidden state n. The transition_id matrix records the hidden state that precedes hidden state n in that most likely path. This logic is mostly carried out by the helper function _calculate_trans_mats. Next, these matrices are used to calculate the most likely path (by backtracing from the final most likely state through the ids that preceded it). This logic is carried out by the helper function hmm_viterbi_label.
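The two stages described above (fill the two matrices, then backtrace) can be sketched in plain Python. This is a minimal illustration, not the library's code: the names gaussian_pdf, safe_log and viterbi_labels are hypothetical, the matrices are indexed [observation][state], and the sketch works with log-probabilities to avoid underflow, which is an assumption about numerical handling that the description above does not state.

```python
import math


def gaussian_pdf(x, loc, scale):
    """Normal density, used here as a stand-in emission function."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))


def safe_log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def viterbi_labels(obs, emission_funcs, trans_mat, initial_probs):
    """Return the most likely hidden-state label for each observation."""
    n_obs, n_states = len(obs), len(emission_funcs)
    if n_obs == 0:
        return []
    # trans_prob[t][s]: log-probability of the best path ending in state s at step t
    # trans_id[t][s]: state occupied at step t - 1 on that best path
    trans_prob = [[-math.inf] * n_states for _ in range(n_obs)]
    trans_id = [[0] * n_states for _ in range(n_obs)]
    for s in range(n_states):
        trans_prob[0][s] = safe_log(initial_probs[s]) + safe_log(emission_funcs[s](obs[0]))
    for t in range(1, n_obs):
        for s in range(n_states):
            scores = [trans_prob[t - 1][p] + safe_log(trans_mat[p][s]) for p in range(n_states)]
            best_prev = max(range(n_states), key=scores.__getitem__)
            trans_id[t][s] = best_prev
            trans_prob[t][s] = scores[best_prev] + safe_log(emission_funcs[s](obs[t]))
    # Backtrace from the most likely final state through the stored ids.
    path = [max(range(n_states), key=trans_prob[-1].__getitem__)]
    for t in range(n_obs - 1, 0, -1):
        path.append(trans_id[t][path[-1]])
    return path[::-1]


emission_funcs = [
    lambda x: gaussian_pdf(x, loc=3.5, scale=0.25),
    lambda x: gaussian_pdf(x, loc=-5.0, scale=0.25),
]
trans_mat = [[0.25, 0.75], [0.666, 0.333]]
obs = [3.7, 3.2, 3.4, 3.6, -5.1, -5.2, -4.9]
labels = viterbi_labels(obs, emission_funcs, trans_mat, [0.5, 0.5])
```

With these well-separated emission densities the sketch labels the first four observations as state 0 and the last three as state 1.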
- Parameters:
- emission_funcs: list, shape = [num hidden states]
List should be of length n (the number of hidden states). Either a list of callables [fx_1, fx_2] with signature fx_1(X) -> float, or a list of callables and matched keyword arguments for those callables [(fx_1, kwarg_1), (fx_2, kwarg_2)] with signature fx_1(X, **kwargs) -> float (or a list with some mixture of the two). The callables should take a single observation and return a probability. All functions should be properly normalized PDFs over the same space as the observed data.
- transition_prob_mat: 2D np.ndarray, shape = [num_states, num_states]
Each row should sum to 1 in order to be properly normalized (i.e. the j’th column in the i’th row represents the probability of transitioning from state i to state j).
- initial_probs: 1D np.ndarray, shape = [num hidden states], optional
An array of probabilities that the sequence of hidden states starts in each of the hidden states. If passed, it should be of length n (the number of hidden states) and should match the length of both the emission_funcs list and the transition_prob_mat. The initial probs should reflect prior beliefs. If none is passed, each hidden state is assigned an equal initial prob.
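The three inputs can be checked before they are handed to the estimator. The sketch below is an assumption-laden illustration, not library code: as_single_arg_callables, rows_are_normalized, uniform_initial_probs and uniform_pdf are hypothetical helpers, and the tolerance of 1e-2 is an arbitrary choice made here so that rounded rows such as [0.666, 0.333] pass.

```python
def as_single_arg_callables(emission_funcs):
    """Wrap each entry so it can be called as f(x) -> float."""
    wrapped = []
    for entry in emission_funcs:
        if callable(entry):
            wrapped.append(entry)
        else:
            func, kwargs = entry
            # Bind func and kwargs as defaults to avoid late binding in the loop.
            wrapped.append(lambda x, _f=func, _kw=kwargs: _f(x, **_kw))
    return wrapped


def rows_are_normalized(trans_mat, tol=1e-2):
    """Check that every row of the transition matrix sums to 1 within tol."""
    return all(abs(sum(row) - 1.0) <= tol for row in trans_mat)


def uniform_initial_probs(num_states):
    """Equal prior probability for every hidden state."""
    return [1.0 / num_states] * num_states


def uniform_pdf(x, low=0.0, high=1.0):
    """Density of a uniform distribution on [low, high]."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


# A mixture of a bare callable and a (callable, kwargs) pair.
funcs = as_single_arg_callables([uniform_pdf, (uniform_pdf, {"low": 0.0, "high": 4.0})])
densities = [f(0.5) for f in funcs]
```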
- Attributes:
- emission_funcs: list, shape = [num_hidden_states]
The functions to use in calculating the emission probabilities. Taken from the __init__ param of same name.
- transition_prob_mat: 2D np.ndarray, shape = [num_states, num_states]
Matrix of transition probabilities from hidden state to hidden state. Taken from the __init__ param of same name.
- initial_probs: 1D np.ndarray, shape = [num_hidden_states]
Probability over the hidden state identity of the first state. If the __init__ param of same name was passed it will take on that value. Otherwise it is set to be uniform over all hidden states.
- num_states: int
The number of hidden states. Set to be the length of the emission_funcs parameter which was passed.
- states: list
A list of integers from 0 to num_states-1. Integer labels for the hidden states.
- num_obs: int
The length of the observations data. Extracted from data.
- trans_prob: 2D np.ndarray, shape = [num_observations, num_hidden_states]
The max probability that that observation is assigned to that hidden state. Calculated in _calculate_trans_mat and assigned in _predict.
- trans_id: 2D np.ndarray, shape = [num_observations, num_hidden_states]
The id of the state preceding the observation being assigned to that hidden state, in the most likely path where that occurs. Calculated in _calculate_trans_mat and assigned in _predict.
Examples
>>> from aeon.annotation.hmm import HMM
>>> from scipy.stats import norm
>>> from numpy import asarray
>>> # define the emission probs for our HMM model:
>>> centers = [3.5,-5]
>>> sd = [.25 for i in centers]
>>> emi_funcs = [(norm.pdf, {'loc': mean,
...  'scale': sd[ind]}) for ind, mean in enumerate(centers)]
>>> hmm_est = HMM(emi_funcs, asarray([[0.25,0.75], [0.666, 0.333]]))
>>> # generate synthetic data (or of course use your own!)
>>> obs = asarray([3.7,3.2,3.4,3.6,-5.1,-5.2,-4.9])
>>> hmm_est = hmm_est.fit(obs)
>>> labels = hmm_est.predict(obs)
Methods
- check_is_fitted(): Check if the estimator has been fitted.
- clone(): Obtain a clone of the object with same hyper-parameters.
- clone_tags(estimator[, tag_names]): Clone/mirror tags from another estimator as dynamic override.
- create_test_instance([parameter_set]): Construct Estimator instance if possible.
- create_test_instances_and_names([parameter_set]): Create list of all test instances and a list of names for them.
- fit(X[, Y]): Fit to training data.
- fit_predict(X[, Y]): Fit to data, then predict it.
- get_class_tag(tag_name[, tag_value_default]): Get tag value from estimator class (only class tags).
- get_class_tags(): Get class tags from estimator class and all its parent classes.
- get_fitted_params([deep]): Get fitted parameters.
- get_param_defaults(): Get parameter defaults for the object.
- get_param_names(): Get parameter names for the object.
- get_params([deep]): Get parameters for this estimator.
- get_tag(tag_name[, tag_value_default, ...]): Get tag value from estimator class and dynamic tag overrides.
- get_tags(): Get tags from estimator class and dynamic tag overrides.
- get_test_params([parameter_set]): Return testing parameter settings for the estimator.
- is_composite(): Check if the object is composite.
- load_from_path(serial): Load object from file location.
- load_from_serial(serial): Load object from serialized memory container.
- predict(X): Create annotations on test/deployment data.
- predict_scores(X): Return scores for predicted annotations on test/deployment data.
- reset(): Reset the object to a clean post-init state.
- save([path]): Save serialized self to bytes-like object or to (.zip) file.
- set_params(**params): Set the parameters of this object.
- set_tags(**tag_dict): Set dynamic tags to given values.
- update(X[, Y]): Update model with new data and optional ground truth annotations.
- update_predict(X): Update model with new data and create annotations for it.
- classmethod get_test_params(parameter_set='default')
Return testing parameter settings for the estimator.
- Parameters:
- parameter_set: str, default="default"
Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return "default" set.
- Returns:
- params: dict or list of dict
- check_is_fitted()
Check if the estimator has been fitted.
- Raises:
- NotFittedError
If the estimator has not been fitted yet.
- clone()
Obtain a clone of the object with same hyper-parameters.
A clone is a different object without shared references, in post-init state. This function is equivalent to returning sklearn.clone of self. Equal in value to type(self)(**self.get_params(deep=False)).
- Returns:
- instance of type(self), clone of self (see above)
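The value equality stated above can be illustrated with a toy class. ToyEstimator is a hypothetical stand-in written for this sketch and is not part of the library; the point is only that rebuilding from get_params(deep=False) yields a fresh, unfitted object with the same hyper-parameters.

```python
class ToyEstimator:
    """Hypothetical estimator with two hyper-parameters and one fitted attribute."""

    def __init__(self, alpha=1.0, window=3):
        self.alpha = alpha
        self.window = window

    def get_params(self, deep=True):
        # Only the __init__ arguments count as hyper-parameters.
        return {"alpha": self.alpha, "window": self.window}

    def fit(self, X):
        self.mean_ = sum(X) / len(X)  # fitted state, not a hyper-parameter
        return self


fitted = ToyEstimator(alpha=0.5).fit([1.0, 2.0, 3.0])
# Rebuild from hyper-parameters only: same settings, no fitted state.
cloned = type(fitted)(**fitted.get_params(deep=False))
```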
- clone_tags(estimator, tag_names=None)
Clone/mirror tags from another estimator as dynamic override.
- Parameters:
- estimator: estimator inheriting from BaseEstimator
- tag_names: str or list of str, default = None
Names of tags to clone. If None then all tags in estimator are used as tag_names.
- Returns:
- Self
Reference to self.
Notes
Changes object state by setting tag values in tag_set from estimator as dynamic tags in self.
- classmethod create_test_instance(parameter_set='default')
Construct Estimator instance if possible.
- Parameters:
- parameter_set: str, default="default"
Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return "default" set.
- Returns:
- instance: instance of the class with default parameters
Notes
get_test_params can return dict or list of dict. This function takes first or single dict that get_test_params returns, and constructs the object with that.
- classmethod create_test_instances_and_names(parameter_set='default')
Create list of all test instances and a list of names for them.
- Parameters:
- parameter_set: str, default="default"
Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return "default" set.
- Returns:
- objs: list of instances of cls
i-th instance is cls(**cls.get_test_params()[i])
- names: list of str, same length as objs
i-th element is the name of the i-th instance of obj in tests. The convention is {cls.__name__}-{i} if there is more than one instance, otherwise {cls.__name__}.
- fit(X, Y=None)
Fit to training data.
- Parameters:
- X: pd.DataFrame
Training data to fit model to (time series).
- Y: pd.Series, optional
Ground truth annotations for training if annotator is supervised.
- Returns:
- self
Reference to self.
Notes
Creates fitted model that updates attributes ending in “_”. Sets _is_fitted flag to True.
- fit_predict(X, Y=None)
Fit to data, then predict it.
Fits model to X and Y with given annotation parameters and returns the annotations made by the model.
- Parameters:
- X: pd.DataFrame, pd.Series or np.ndarray
Data to be transformed.
- Y: pd.Series or np.ndarray, optional (default=None)
Target values of data to be predicted.
- Returns:
- self: pd.Series
Annotations for sequence X; the exact format depends on the annotation type.
- classmethod get_class_tag(tag_name, tag_value_default=None)
Get tag value from estimator class (only class tags).
- Parameters:
- tag_name: str
Name of tag value.
- tag_value_default: any type
Default/fallback value if tag is not found.
- Returns:
- tag_value
Value of the tag_name tag in self. If not found, returns tag_value_default.
- classmethod get_class_tags()
Get class tags from estimator class and all its parent classes.
- Returns:
- collected_tags: dict
Dictionary of tag name : tag value pairs. Collected from _tags class attribute via nested inheritance. NOT overridden by dynamic tags set by set_tags or mirror_tags.
- get_fitted_params(deep=True)
Get fitted parameters.
- State required:
Requires state to be “fitted”.
- Parameters:
- deep: bool, default=True
Whether to return fitted parameters of components.
If True, will return a dict of parameter name : value for this object, including fitted parameters of fittable components (= BaseEstimator-valued parameters).
If False, will return a dict of parameter name : value for this object, but not include fitted parameters of components.
- Returns:
- fitted_params: dict with str-valued keys
Dictionary of fitted parameters, paramname : paramvalue key-value pairs, which include:
always: all fitted parameters of this object, as via get_param_names. Values are the fitted parameter value for that key, of this object.
if deep=True, also contains key/value pairs of component parameters. Parameters of components are indexed as [componentname]__[paramname]; all parameters of componentname appear as paramname with its value.
if deep=True, also contains arbitrary levels of component recursion, e.g., [componentname]__[componentcomponentname]__[paramname], etc.
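The double-underscore key convention described above can be sketched with plain dictionaries. The helper flatten_fitted_params and the component and parameter names are hypothetical; the sketch only shows how nesting turns into prefixed keys, not how the library collects fitted parameters.

```python
def flatten_fitted_params(own_params, components):
    """Merge an object's own fitted params with those of its components.

    own_params: dict of this object's fitted parameters.
    components: dict mapping component name to that component's
        (already flattened) fitted-parameter dict.
    """
    flat = dict(own_params)
    for comp_name, comp_params in components.items():
        for key, value in comp_params.items():
            # Prefix every component key with "<componentname>__".
            flat[f"{comp_name}__{key}"] = value
    return flat


scaler = flatten_fitted_params({"mean": 0.0}, {})
pipe = flatten_fitted_params({"n_steps": 1}, {"scaler": scaler})
outer = flatten_fitted_params({"score": 0.9}, {"pipe": pipe})
```

Applying the helper at each level produces the recursive form, for example the key pipe__scaler__mean in outer.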
- classmethod get_param_defaults()
Get parameter defaults for the object.
- Returns:
- default_dict: dict with str keys
Keys are all parameters of cls that have a default defined in __init__; values are the defaults, as defined in __init__.
- classmethod get_param_names()
Get parameter names for the object.
- Returns:
- param_names: list of str, alphabetically sorted list of parameter names of cls
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep: bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params: dict
Parameter names mapped to their values.
- get_tag(tag_name, tag_value_default=None, raise_error=True)
Get tag value from estimator class and dynamic tag overrides.
- Parameters:
- tag_name: str
Name of tag to be retrieved.
- tag_value_default: any type, optional; default=None
Default/fallback value if tag is not found.
- raise_error: bool
Whether a ValueError is raised when the tag is not found.
- Returns:
- tag_value
Value of the tag_name tag in self. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.
- Raises:
- ValueError if raise_error is True and tag_name is not in self.get_tags().keys()
- get_tags()
Get tags from estimator class and dynamic tag overrides.
- Returns:
- collected_tags: dict
Dictionary of tag name : tag value pairs. Collected from _tags class attribute via nested inheritance and then any overrides and new tags from _tags_dynamic object attribute.
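The layering of class tags and dynamic overrides can be sketched as a dictionary merge. BaseToy, ChildToy and the tag names below are hypothetical and chosen for illustration; the sketch assumes class tags are merged from parent to child and that dynamic tags are applied last, as the description above suggests.

```python
class BaseToy:
    """Hypothetical base class carrying class-level tags."""

    _tags = {"univariate_only": False, "fit_is_empty": False}

    def __init__(self):
        self._tags_dynamic = {}

    @classmethod
    def get_class_tags(cls):
        collected = {}
        # Walk from the most basic class to cls so subclasses override parents.
        for klass in reversed(cls.__mro__):
            collected.update(vars(klass).get("_tags", {}))
        return collected

    def set_tags(self, **tag_dict):
        self._tags_dynamic.update(tag_dict)
        return self

    def get_tags(self):
        # Dynamic tags are applied last, so they win over class tags.
        return {**self.get_class_tags(), **self._tags_dynamic}


class ChildToy(BaseToy):
    _tags = {"univariate_only": True}


est = ChildToy().set_tags(fit_is_empty=True)
```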
- is_composite()
Check if the object is composite.
A composite object is an object which contains objects, as parameters. Called on an instance, since this may differ by instance.
- Returns:
- composite: bool, whether self contains a parameter which is BaseObject
- classmethod load_from_path(serial)
Load object from file location.
- Parameters:
- serial: result of ZipFile(path).open("object")
- Returns:
- deserialized self resulting in output at path, of cls.save(path)
- classmethod load_from_serial(serial)
Load object from serialized memory container.
- Parameters:
- serial: 1st element of output of cls.save(None)
- Returns:
- deserialized self resulting in output serial, of cls.save(None)
- predict(X)
Create annotations on test/deployment data.
- Parameters:
- X: pd.DataFrame
Data to annotate (time series).
- Returns:
- Y: pd.Series
Annotations for sequence X; the exact format depends on the annotation type.
- predict_scores(X)
Return scores for predicted annotations on test/deployment data.
- Parameters:
- X: pd.DataFrame
Data to annotate (time series).
- Returns:
- Y: pd.Series
Scores for sequence X; the exact format depends on the annotation type.
- reset()
Reset the object to a clean post-init state.
Equivalent to sklearn.clone but overwrites self. After a self.reset() call, self is equal in value to type(self)(**self.get_params(deep=False)).
Detailed behaviour: removes any object attributes, except:
- hyper-parameters = arguments of __init__
- object attributes containing double-underscores, i.e., the string “__”
Then runs __init__ with current values of hyper-parameters (result of get_params).
Not affected by the reset are:
- object attributes containing double-underscores
- class and object methods, class attributes
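The reset behaviour described above can be sketched on a toy class. ToyEstimator and its attributes are hypothetical; the sketch follows the stated rules (keep hyper-parameters and attributes containing "__", drop everything else, then re-run __init__) and is not the library's implementation.

```python
class ToyEstimator:
    """Hypothetical estimator used to illustrate reset semantics."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self, deep=True):
        return {"alpha": self.alpha}

    def reset(self):
        params = self.get_params(deep=False)
        for name in list(vars(self)):
            # Keep hyper-parameters and attributes containing "__".
            if name not in params and "__" not in name:
                delattr(self, name)
        # Re-run __init__ with the current hyper-parameter values.
        self.__init__(**params)


est = ToyEstimator(alpha=0.3)
est.mean_ = 2.0          # fitted state, removed by reset
est.keep__me = "kept"    # contains "__", survives reset
est.reset()
```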
- save(path=None)
Save serialized self to bytes-like object or to (.zip) file.
Behaviour:
- if path is None, returns an in-memory serialized self
- if path is a file location, stores self at that location as a zip file
Saved files are zip files with the following contents:
- _metadata - contains class of self, i.e., type(self)
- _obj - serialized self. This class uses the default serialization (pickle).
- Parameters:
- path: None or file location (str or Path)
If None, self is saved to an in-memory object. If a file location, self is saved to that file location. For example:
- path="estimator": a zip file estimator.zip will be made at cwd.
- path="/home/stored/estimator": a zip file estimator.zip will be stored in /home/stored/.
- Returns:
- if path is None - in-memory serialized self
- if path is file location - ZipFile with reference to the file
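The file layout described above can be sketched with the standard library. The helpers save_to_zip and load_from_zip are hypothetical, and pickling the class for the _metadata entry is an assumption made for this sketch; only the two member names and the ".zip" suffix rule come from the description above.

```python
import pickle
import tempfile
import zipfile
from pathlib import Path


def save_to_zip(obj, path):
    """Write obj to '<path>.zip' with _metadata and _obj members."""
    zip_path = Path(str(path) + ".zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("_metadata", pickle.dumps(type(obj)))  # class of obj
        zf.writestr("_obj", pickle.dumps(obj))             # serialized obj
    return zip_path


def load_from_zip(zip_path):
    """Read the pickled object back from the _obj member."""
    with zipfile.ZipFile(zip_path) as zf:
        return pickle.loads(zf.read("_obj"))


with tempfile.TemporaryDirectory() as tmp:
    zip_path = save_to_zip({"weights": [1, 2, 3]}, Path(tmp) / "estimator")
    saved_name = zip_path.name
    with zipfile.ZipFile(zip_path) as zf:
        members = sorted(zf.namelist())
    restored = load_from_zip(zip_path)
```

As with any pickle-based format, such files should only be loaded from trusted sources.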
- set_params(**params)
Set the parameters of this object.
The method works on simple estimators as well as on nested objects. The latter have parameters of the form
<component>__<parameter>
so that it’s possible to update each component of a nested object.
- Parameters:
- **params: dict
BaseObject parameters
- Returns:
- self: reference to self (after parameters have been set)
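The nested key form can be sketched as a split on the first double underscore. The Toy class is hypothetical and the routing logic is a simplified assumption about how such keys are resolved, not the library's implementation.

```python
class Toy:
    """Hypothetical object that may hold another Toy as a component."""

    def __init__(self, scale=1.0, inner=None):
        self.scale = scale
        self.inner = inner

    def set_params(self, **params):
        for key, value in params.items():
            # "inner__scale" -> component "inner", parameter "scale"
            name, sep, sub_key = key.partition("__")
            if sep:
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


outer = Toy(inner=Toy(inner=Toy()))
outer.set_params(scale=2.0, inner__scale=5.0, inner__inner__scale=7.0)
```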
- set_tags(**tag_dict)
Set dynamic tags to given values.
- Parameters:
- tag_dict: dict
Dictionary of tag name : tag value pairs.
- Returns:
- Self
Reference to self.
Notes
Changes object state by setting tag values in tag_dict as dynamic tags in self.
- update(X, Y=None)
Update model with new data and optional ground truth annotations.
- Parameters:
- X: pd.DataFrame
Training data to update model with (time series).
- Y: pd.Series, optional
Ground truth annotations for training if annotator is supervised.
- Returns:
- self
Reference to self.
Notes
Updates the fitted model, changing attributes ending in “_”.
- update_predict(X)
Update model with new data and create annotations for it.
- Parameters:
- X: pd.DataFrame
Training data to update model with (time series).
- Returns:
- Y: pd.Series
Annotations for sequence X; the exact format depends on the annotation type.
Notes
Updates the fitted model, changing attributes ending in “_”.