autoprognosis.explorers.risk_estimation_combos module

class RiskEnsembleSeeker

Bases: object

AutoML core logic for risk estimation ensemble search.

Parameters:
  • study_name – str. Study ID, used for caching.

  • time_horizons – list. List of time horizons at which risk is estimated.

  • num_iter – int. Maximum number of optimization trials for each base estimator in the “estimators” list. Used in combination with the “timeout” parameter: for each estimator, the search ends after “num_iter” trials or “timeout” seconds, whichever comes first.

  • num_ensemble_iter – int. Number of optimization trials for the ensemble weights.

  • timeout – int. Maximum wait time (in seconds) for each estimator hyperparameter search. The timeout applies separately to each estimator in the “estimators” list.

  • n_folds_cv – int. Number of cross-validation folds to use for evaluation.

  • feature_scaling

    list. Plugin search pool to use in the pipeline for scaling. Defaults to ['maxabs_scaler', 'scaler', 'feature_normalizer', 'normal_transform', 'uniform_transform', 'nop', 'minmax_scaler']. Available plugins, retrieved using Preprocessors(category="feature_scaling").list_available():

    • 'maxabs_scaler'

    • 'scaler'

    • 'feature_normalizer'

    • 'normal_transform'

    • 'uniform_transform'

    • 'nop' (no operation)

    • 'minmax_scaler'

  • feature_selection

    list. Plugin search pool to use in the pipeline for feature selection. Defaults to ['nop', 'variance_threshold', 'pca', 'fast_ica']. Available plugins, retrieved using Preprocessors(category="dimensionality_reduction").list_available():

    • 'feature_agglomeration'

    • 'fast_ica'

    • 'variance_threshold'

    • 'gauss_projection'

    • 'pca'

    • 'nop' (no operation)

  • imputers

    list. Plugin search pool to use in the pipeline for imputation. Defaults to ['mean', 'ice', 'missforest', 'hyperimpute']. Available plugins, retrieved using Imputers().list_available():

    • 'sinkhorn'

    • 'EM'

    • 'mice'

    • 'ice'

    • 'hyperimpute'

    • 'most_frequent'

    • 'median'

    • 'missforest'

    • 'softimpute'

    • 'nop'

    • 'mean'

    • 'gain'

  • estimators

    list. Plugin search pool to use in the pipeline for risk estimation. Defaults to ['survival_xgboost', 'loglogistic_aft', 'deephit', 'cox_ph', 'weibull_aft', 'lognormal_aft', 'coxnet']. Available plugins:

    • 'survival_xgboost'

    • 'loglogistic_aft'

    • 'deephit'

    • 'cox_ph'

    • 'weibull_aft'

    • 'lognormal_aft'

    • 'coxnet'

  • hooks – Hooks. Custom callbacks to be notified about the search progress.

  • random_state – int. Random seed.
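The way “num_iter” and “timeout” jointly bound each estimator's search can be illustrated with a small stand-alone loop. This is only a sketch of the documented stopping rule, not the library's implementation: run_trials, objective, and clock are hypothetical names introduced for illustration, and the sketch assumes that a higher score is better.

```python
import time
from typing import Callable, Optional, Tuple


def run_trials(
    objective: Callable[[int], float],
    num_iter: int,
    timeout: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, Optional[float]]:
    """Run up to `num_iter` trials, stopping early once `timeout` seconds elapse.

    Hypothetical helper: it only illustrates the documented stopping rule.
    """
    start = clock()
    best: Optional[float] = None
    trials_done = 0
    for trial in range(num_iter):
        # Check the time budget before starting a new trial.
        if clock() - start >= timeout:
            break
        score = objective(trial)
        trials_done += 1
        # Assumption: higher scores are better.
        if best is None or score > best:
            best = score
    return trials_done, best


# Here the trial budget is the binding limit: 5 trials, generous timeout.
done, best = run_trials(lambda t: t * 0.1, num_iter=5, timeout=60.0)
```

In this sketch the budget applies to a single estimator; with several entries in the “estimators” list, each one would get its own “num_iter” and “timeout” allowance, as the parameter descriptions state.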

pretrain_for_cv(ensemble: List, X: DataFrame, T: DataFrame, Y: DataFrame, time_horizon: int, seed: int = 0, group_ids: str | None = None) → List
search(X: DataFrame, T: Series, Y: Series, skip_recap: bool = False, group_ids: Series | None = None) → RiskEnsemble
search_weights(ensemble: List, X: DataFrame, T: DataFrame, Y: DataFrame, time_horizon: int, skip_recap: bool = False, group_ids: Series | None = None) → List[float]
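
search_weights returns a list of floats for a given ensemble and time horizon. The source does not say how these weights are applied, so the following is only a rough sketch under an assumption: one non-negative weight per ensemble member, normalized and used in a weighted average of the members' risk predictions. combine_risks is a hypothetical helper, not part of the library.

```python
from typing import List, Sequence


def combine_risks(
    member_risks: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted average of per-member risk predictions.

    member_risks[i][j] is member i's predicted risk for sample j.
    Assumption: weights are non-negative; they are normalized to sum to 1 here.
    """
    if len(member_risks) != len(weights):
        raise ValueError("one weight per ensemble member is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    norm = [w / total for w in weights]
    n_samples = len(member_risks[0])
    return [
        sum(norm[i] * member_risks[i][j] for i in range(len(norm)))
        for j in range(n_samples)
    ]


# Two members, two samples; the first member gets three times the weight.
risks = [[0.2, 0.8], [0.4, 0.6]]
combined = combine_risks(risks, weights=[3.0, 1.0])
```

Normalizing inside the helper keeps the combined output on the same scale as the individual risk predictions regardless of how the raw weights are scaled.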