RotationForestRegressor
- class RotationForestRegressor(n_estimators: int = 200, min_group: int = 3, max_group: int = 3, remove_proportion: float = 0.5, base_estimator: BaseEstimator | None = None, pca_solver: str = 'auto', time_limit_in_minutes: float = 0.0, contract_max_n_estimators: int = 500, n_jobs: int = 1, random_state: int | RandomState | None = None)
Bases: RegressorMixin, BaseEstimator
A Rotation Forest (RotF) vector regressor.
Implementation of the Rotation Forest regressor described in Rodriguez et al. (2006) [1]. Builds a forest of trees, each built on random portions of the data transformed using PCA.
Intended as a benchmark for time series data and as a base regressor for transformation-based approaches such as FreshPRINCERegressor. This aeon implementation only works with continuous attributes.
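The idea in the description above can be illustrated with a minimal sketch. This is not aeon's implementation: the helper `fit_one_rotated_tree` and its `group_size` and `seed` arguments are hypothetical names introduced here, and averaging the tree predictions is an assumption about how the ensemble combines its members.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeRegressor


def fit_one_rotated_tree(X, y, group_size=3, remove_proportion=0.5, seed=0):
    """Fit one tree on a PCA-rotated copy of X (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    n_cases, n_atts = X.shape

    # Shuffle the attribute indices and cut them into disjoint groups.
    order = rng.permutation(n_atts)
    groups = [order[i:i + group_size] for i in range(0, n_atts, group_size)]

    # Fit one PCA per group on a random subsample of the cases.
    n_keep = max(2, round(n_cases * (1.0 - remove_proportion)))
    pcas = []
    for group in groups:
        rows = rng.choice(n_cases, size=n_keep, replace=False)
        pcas.append(PCA().fit(X[np.ix_(rows, group)]))

    def rotate(data):
        # Project each group's columns onto that group's principal axes.
        return np.hstack([p.transform(data[:, g]) for p, g in zip(pcas, groups)])

    tree = DecisionTreeRegressor(random_state=seed).fit(rotate(X), y)
    return tree, rotate


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 7))
y = X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=40)

# Assumption: the forest averages the predictions of independently rotated trees.
forest = [fit_one_rotated_tree(X, y, seed=s) for s in range(10)]
y_pred = np.mean([tree.predict(rotate(X)) for tree, rotate in forest], axis=0)
```

Each tree sees a different rotation of the feature space, which is what gives the ensemble its diversity even though every tree is trained on all cases.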
- Parameters:
- n_estimators : int, default=200
Number of estimators to build for the ensemble.
- min_group : int, default=3
The minimum size of an attribute subsample group.
- max_group : int, default=3
The maximum size of an attribute subsample group.
- remove_proportion : float, default=0.5
The proportion of cases to be removed per group.
- base_estimator : BaseEstimator or None, default=None
Base estimator for the ensemble. By default, a scikit-learn DecisionTreeRegressor with MSE as the splitting measure is used.
- pca_solver : str, default="auto"
Solver to use for the PCA svd_solver parameter. See the scikit-learn PCA implementation for options.
- time_limit_in_minutes : float, default=0.0
Time contract to limit build time in minutes, overriding n_estimators. The default of 0 means n_estimators is used.
- contract_max_n_estimators : int, default=500
Maximum number of estimators to build when time_limit_in_minutes is set.
- n_jobs : int, default=1
The number of jobs to run in parallel for both fit and predict. -1 means using all processors.
- random_state : int, RandomState instance or None, default=None
If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
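The interaction between n_estimators, time_limit_in_minutes and contract_max_n_estimators can be sketched as a plain loop. This only mirrors the semantics documented above and is not aeon's code; `build_under_contract` and `build_one` are hypothetical names used for illustration.

```python
import time


def build_under_contract(build_one, n_estimators=200,
                         time_limit_in_minutes=0.0,
                         contract_max_n_estimators=500):
    """Build estimators either by count or under a time contract (sketch)."""
    estimators = []
    if time_limit_in_minutes > 0:
        # Contract mode: n_estimators is ignored; stop at the deadline
        # or at the estimator cap, whichever comes first.
        deadline = time.monotonic() + time_limit_in_minutes * 60
        while (time.monotonic() < deadline
               and len(estimators) < contract_max_n_estimators):
            estimators.append(build_one(len(estimators)))
    else:
        # Default mode: build exactly n_estimators members.
        for i in range(n_estimators):
            estimators.append(build_one(i))
    return estimators


# The stand-in builder just returns its index.
by_count = build_under_contract(lambda i: i, n_estimators=5)
by_contract = build_under_contract(lambda i: i, n_estimators=5,
                                   time_limit_in_minutes=0.001,
                                   contract_max_n_estimators=3)
```

With a trivial builder the contract run hits the cap of 3 long before the deadline, while the default run returns 5 members.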
- Attributes:
- n_cases_ : int
The number of cases in the training set.
- n_atts_ : int
The number of attributes in the training set.
- estimators_ : list of shape (n_estimators) of BaseEstimator
The collection of estimators trained in fit.
References
[1] Rodriguez, Juan José, Ludmila I. Kuncheva, and Carlos J. Alonso. “Rotation forest: A new classifier ensemble method.” IEEE transactions on pattern analysis and machine intelligence 28.10 (2006).
[2] Bagnall, A., et al. “Is rotation forest the best classifier for problems with continuous features?.” arXiv preprint arXiv:1809.06705 (2018).
Examples
>>> from aeon.regression.sklearn import RotationForestRegressor
>>> from aeon.testing.data_generation import make_example_2d_numpy_collection
>>> X, y = make_example_2d_numpy_collection(n_cases=10, n_timepoints=12,
...                                         regression_target=True, random_state=0)
>>> reg = RotationForestRegressor(n_estimators=10)
>>> reg.fit(X, y)
RotationForestRegressor(n_estimators=10)
>>> reg.predict(X)
array([0.7252543 , 1.50132442, 0.95608366, 1.64399016, 0.42385504,
       0.60639322, 1.01919317, 1.30157483, 1.66017354, 0.2900776 ])
Methods
fit(X, y)                               Fit a forest of trees on cases (X, y), where y is the target variable.
get_params([deep])                      Get parameters for this estimator.
predict(X)                              Predict for all cases in X.
score(X, y[, sample_weight])            Return coefficient of determination on test data.
set_params(**params)                    Set the parameters of this estimator.
set_score_request(*[, sample_weight])   Configure whether metadata should be requested to be passed to the score method.
fit_predict
- fit(X, y)
Fit a forest of trees on cases (X, y), where y is the target variable.
- Parameters:
- X : 2d ndarray or DataFrame of shape = [n_cases, n_attributes]
The training data.
- y : array-like, shape = [n_cases]
The output values.
- Returns:
- self
Reference to self.
Notes
Changes state by creating a fitted model that updates attributes ending in “_”.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- predict(X) -> ndarray
Predict for all cases in X.
- Parameters:
- X : 2d ndarray or DataFrame of shape = [n_cases, n_attributes]
The data to make predictions for.
- Returns:
- y : array-like, shape = [n_cases]
Predicted output values.
- score(X, y, sample_weight=None)
Return coefficient of determination on test data.
The coefficient of determination, \(R^2\), is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for X.
- sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
- Returns:
- score : float
\(R^2\) of self.predict(X) w.r.t. y.
Notes
The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of r2_score. This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
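The definition of \(R^2\) given above can be checked by hand with a few lines of NumPy. The helper `r2` is a hypothetical name for this illustration and ignores sample weights and multiple outputs.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination computed as 1 - u / v."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    u = ((y_true - y_pred) ** 2).sum()         # residual sum of squares
    v = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares
    return 1.0 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r2(y_true, y_true)                   # 1.0
constant = r2(y_true, [2.5, 2.5, 2.5, 2.5])    # 0.0, always predicts the mean
reversed_ = r2(y_true, [4.0, 3.0, 2.0, 1.0])   # -3.0, worse than the constant model
```

Here the total sum of squares is 5.0, so the reversed predictions, with a residual sum of squares of 20.0, score 1 - 20/5 = -3.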
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
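The <component>__<parameter> naming is easiest to see on a small nested object. The sketch below uses a scikit-learn Pipeline with a DecisionTreeRegressor as a stand-in for any scikit-learn compatible estimator; the step names "scale" and "tree" are arbitrary choices for this example.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

pipe = Pipeline([("scale", StandardScaler()), ("tree", DecisionTreeRegressor())])

# Update a parameter of the nested "tree" step via <component>__<parameter>.
pipe.set_params(tree__max_depth=3)

# deep=False lists only the pipeline's own parameters;
# deep=True also lists the parameters of the contained estimators.
shallow = pipe.get_params(deep=False)
deep = pipe.get_params(deep=True)

print("tree__max_depth" in shallow)
print(deep["tree__max_depth"])
```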
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> RotationForestRegressor
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for sample_weight parameter in score.
- Returns:
- self : object
The updated object.