autoprognosis.explorers.regression module

class RegressionSeeker

Bases: object

AutoML core logic for regression tasks.

Parameters:

study_name – str. Study ID, used for caching.
num_iter – int. Maximum number of optimization trials for each base estimator in the "regressors" list. Used in combination with the "timeout" parameter: for each estimator, the search ends after "num_iter" trials or "timeout" seconds, whichever comes first.
metric –
str. The metric to use for optimization. Available metrics:
- "r2"
n_folds_cv – int. Number of cross-validation folds to use for evaluation.
top_k – int. Number of candidates to return.
timeout – int. Maximum wait time, in seconds, for each estimator's hyperparameter search. The timeout applies separately to each estimator in the "regressors" list.
feature_scaling –
list. Plugin search pool to use in the pipeline for scaling. Defaults to ["maxabs_scaler", "scaler", "feature_normalizer", "normal_transform", "uniform_transform", "nop", "minmax_scaler"]. Available plugins, retrieved using Preprocessors(category="feature_scaling").list_available():
- 'maxabs_scaler'
- 'scaler'
- 'feature_normalizer'
- 'normal_transform'
- 'uniform_transform'
- 'nop' # empty operation
- 'minmax_scaler'
feature_selection –
list. Plugin search pool to use in the pipeline for feature selection. Defaults to ["nop", "variance_threshold", "pca", "fast_ica"]. Available plugins, retrieved using Preprocessors(category="dimensionality_reduction").list_available():
- 'feature_agglomeration'
- 'fast_ica'
- 'variance_threshold'
- 'gauss_projection'
- 'pca'
- 'nop' # no operation
imputers –
list. Plugin search pool to use in the pipeline for imputation. Defaults to ["mean", "ice", "missforest", "hyperimpute"]. Available plugins, retrieved using Imputers().list_available():
- 'sinkhorn'
- 'EM'
- 'mice'
- 'ice'
- 'hyperimpute'
- 'most_frequent'
- 'median'
- 'missforest'
- 'softimpute'
- 'nop'
- 'mean'
- 'gain'
regressors –
list. Plugin search pool to use in the pipeline for prediction. Defaults to ["random_forest_regressor", "xgboost_regressor", "linear_regression", "catboost_regressor"]. Available plugins, retrieved using Regression().list_available():
- 'kneighbors_regressor'
- 'bayesian_ridge'
- 'tabnet_regressor'
- 'catboost_regressor'
- 'random_forest_regressor'
- 'mlp_regressor'
- 'xgboost_regressor'
- 'neural_nets_regression'
- 'linear_regression'
hooks – Hooks. Custom callbacks to be notified about the search progress.
random_state – int. Random seed.
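
The parameters above can be assembled and checked before a seeker is constructed. The following is a minimal sketch, not part of the library: the helper build_seeker_kwargs and the AVAILABLE table are hypothetical, and the plugin names are copied from the lists above (the authoritative lists are the ones returned by the list_available() calls). The sketch assumes the constructor accepts the documented parameters as keyword arguments.

```python
# Plugin names copied from the parameter documentation above (assumption:
# they match what list_available() returns in the installed version).
AVAILABLE = {
    "feature_scaling": {
        "maxabs_scaler", "scaler", "feature_normalizer", "normal_transform",
        "uniform_transform", "nop", "minmax_scaler",
    },
    "feature_selection": {
        "feature_agglomeration", "fast_ica", "variance_threshold",
        "gauss_projection", "pca", "nop",
    },
    "imputers": {
        "sinkhorn", "EM", "mice", "ice", "hyperimpute", "most_frequent",
        "median", "missforest", "softimpute", "nop", "mean", "gain",
    },
    "regressors": {
        "kneighbors_regressor", "bayesian_ridge", "tabnet_regressor",
        "catboost_regressor", "random_forest_regressor", "mlp_regressor",
        "xgboost_regressor", "neural_nets_regression", "linear_regression",
    },
}


def build_seeker_kwargs(study_name, **overrides):
    """Hypothetical helper: collect keyword arguments and reject unknown plugin names."""
    kwargs = {"study_name": study_name, "metric": "r2"}  # "r2" is the only metric listed
    kwargs.update(overrides)
    for pool, allowed in AVAILABLE.items():
        unknown = set(kwargs.get(pool, [])) - allowed
        if unknown:
            raise ValueError(f"unknown {pool} plugin(s): {sorted(unknown)}")
    return kwargs


kwargs = build_seeker_kwargs(
    "demo_study",          # illustrative study ID
    num_iter=10,           # at most 10 trials per estimator
    timeout=60,            # at most 60 seconds per estimator
    n_folds_cv=3,
    top_k=2,
    imputers=["mean"],
    regressors=["linear_regression", "random_forest_regressor"],
)

# With the package installed, the arguments would be passed as:
#   from autoprognosis.explorers.regression import RegressionSeeker
#   seeker = RegressionSeeker(**kwargs)
```

Restricting "regressors" and "imputers" to a few plugins, as here, shrinks the search pool, which matters because "num_iter" and "timeout" apply to each estimator separately.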

search(X: DataFrame, Y: Series, group_ids: Series | None = None) → List
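
A minimal usage sketch for search, assuming the autoprognosis package is installed and that the constructor accepts the parameters listed above as keyword arguments. The wrapper name run_search and the argument values are illustrative only. The import sits inside the function so that the definition loads even where the package is absent.

```python
def run_search(X, Y, group_ids=None):
    """Hypothetical wrapper: build a RegressionSeeker and run one search.

    X is expected to be a pandas DataFrame and Y a pandas Series, matching
    the documented signature of search().
    """
    # Lazy import (assumption: autoprognosis is installed in the environment).
    from autoprognosis.explorers.regression import RegressionSeeker

    seeker = RegressionSeeker(
        study_name="demo_study",           # illustrative study ID
        num_iter=5,                        # small budget for a quick run
        timeout=30,
        top_k=2,
        regressors=["linear_regression"],  # single-plugin search pool
    )
    # search() is documented to return a List; its contents are not
    # specified in this section, so the result is returned unchanged.
    return seeker.search(X, Y, group_ids=group_ids)
```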

search_best_args_for_estimator(estimator: Any, X: DataFrame, Y: Series, group_ids: Series | None = None) → Tuple[List[float], List[float]]
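
The per-estimator budget described under "num_iter" and "timeout" (stop after "num_iter" trials or "timeout" seconds) can be pictured with the sketch below. It illustrates the stopping rule only and is not the library's implementation; run_with_budget, objective and clock are hypothetical names.

```python
import time


def run_with_budget(objective, num_iter, timeout, clock=time.monotonic):
    """Call objective(trial_index) until num_iter trials have run or
    timeout seconds have elapsed, whichever comes first.

    The time check happens between trials, so a trial that is already
    running is not interrupted in this sketch.
    """
    scores = []
    start = clock()
    for trial in range(num_iter):
        if clock() - start >= timeout:
            break  # time budget exhausted before the trial limit
        scores.append(objective(trial))
    return scores


# A cheap objective finishes all five trials well inside the time budget.
scores = run_with_budget(lambda trial: 1.0 / (trial + 1), num_iter=5, timeout=60)
```

Passing the clock as a parameter keeps the rule easy to test with a fake timer.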