autoprognosis.plugins.imputers.plugin_hyperimpute module

class HyperImputePlugin(random_state: int = 0, **kwargs: Any)

Bases: ImputerPlugin

HyperImpute strategy, a generalized iterative imputation framework for adaptively and automatically configuring column-wise models and their hyperparameters.

Parameters:
  • classifier_seed – list. List of ClassifierPlugin names for the search pool.

  • regression_seed – list. List of RegressionPlugin names for the search pool.

  • imputation_order – int. 0 - ascending, 1 - descending, 2 - random

  • baseline_imputer – int. 0 - mean, 1 - median, 2 - most_frequent

  • optimizer – str. Hyperparam search strategy. Options: simple, hyperband, bayesian

  • class_threshold – int. Maximum number of unique items in a categorical column.

  • optimize_thresh – int. The number of subsamples used for the model search.

  • n_inner_iter – int. Number of imputation iterations.

  • select_model_by_column – bool. If False, reuse the first model selected in the current iteration for all columns. Else, search the model for each column.

  • select_model_by_iteration – bool. If False, reuse the models selected in the first iteration. Otherwise, refresh the models on each iteration.

  • select_lazy – bool. If True and there is a trend towards a certain model architecture, the loop reuses that architecture for all columns instead of calling the optimizer.

  • inner_loop_hook – Callable. Debug hook, called before each iteration.

  • random_state – int. Random seed.
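
To make the integer codes of baseline_imputer concrete, here is a minimal pure-Python sketch of the three baseline strategies applied column by column. It is not the plugin's implementation: the function name `baseline_fill`, the `STRATEGIES` mapping, and the use of `None` for missing values are assumptions made for illustration only.

```python
from statistics import mean, median, mode

# Hypothetical mapping that mirrors the documented codes:
# 0 - mean, 1 - median, 2 - most_frequent.
STRATEGIES = {0: mean, 1: median, 2: mode}


def baseline_fill(rows, baseline_imputer=0):
    """Fill None entries column by column with the chosen statistic."""
    fill = STRATEGIES[baseline_imputer]
    n_cols = len(rows[0])

    # One fill value per column, computed from the observed entries only.
    fill_values = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        fill_values.append(fill(observed))

    # Replace only the missing cells; observed cells are left untouched.
    return [
        [fill_values[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


data = [[1, 1], [None, None], [1, 2], [2, 2]]
filled_mean = baseline_fill(data, baseline_imputer=0)
filled_median = baseline_fill(data, baseline_imputer=1)
filled_mode = baseline_fill(data, baseline_imputer=2)
```

The sketch does not handle a column with no observed values; `statistics.mean` and friends raise `StatisticsError` in that case.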

Example

>>> import numpy as np
>>> from autoprognosis.plugins.imputers import Imputers
>>> plugin = Imputers().get("hyperimpute")
>>> plugin.fit_transform([[1, 1, 1, 1], [np.nan, np.nan, np.nan, np.nan], [1, 2, 2, 1], [2, 2, 2, 2]])

Reference: “HyperImpute: Generalized Iterative Imputation with Automatic Model Selection”

change_output(output: str) None
fit(X: DataFrame, *args: Any, **kwargs: Any) Plugin

Train the plugin.

Parameters:

X – pd.DataFrame

fit_predict(X: DataFrame, *args: Any, **kwargs: Any) DataFrame

Fit the model and predict the training data. Used by predictors.

fit_transform(X: DataFrame, *args: Any, **kwargs: Any) DataFrame

Fit the model and transform the training data. Used by imputers and preprocessors.

classmethod fqdn() str

The fully-qualified name of the plugin: type->subtype->name
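
As a rough illustration of how a type->subtype->name identifier could be assembled, the sketch below joins the three components on a toy class. The class `ToyImputer`, the returned values "imputer" and "default", and the "." separator are all assumptions for this example; they are not taken from the library.

```python
class ToyImputer:
    """Hypothetical plugin exposing the three name components."""

    @staticmethod
    def type() -> str:
        return "imputer"  # assumed value, for illustration

    @staticmethod
    def subtype() -> str:
        return "default"  # assumed value, for illustration

    @staticmethod
    def name() -> str:
        return "hyperimpute"

    @classmethod
    def fqdn(cls, sep: str = ".") -> str:
        # Join the components from most general to most specific.
        return sep.join([cls.type(), cls.subtype(), cls.name()])


qualified_name = ToyImputer.fqdn()
```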

static hyperparameter_space(*args: Any, **kwargs: Any) List[Params]

The hyperparameter search domain, used for tuning.

classmethod hyperparameter_space_fqdn(*args: Any, **kwargs: Any) List[Params]

The hyperparameter domain, using the fully-qualified name.

is_fitted() bool

Check whether the model was trained.

classmethod load(buff: bytes) ImputerPlugin

Load the plugin from bytes

static name() str

The name of the plugin, e.g.: xgboost

predict(X: DataFrame, *args: Any, **kwargs: Any) DataFrame

Run predictions for the input. Used by predictors.

Parameters:

X – pd.DataFrame

classmethod sample_hyperparameters(trial: optuna.trial.Trial, *args: Any, **kwargs: Any) Dict[str, Any]

Sample hyperparameters for Optuna.

classmethod sample_hyperparameters_fqdn(trial: optuna.trial.Trial, *args: Any, **kwargs: Any) Dict[str, Any]

Sample hyperparameters using the fully-qualified name.

classmethod sample_hyperparameters_np(random_state: int = 0, *args: Any, **kwargs: Any) Dict[str, Any]

Sample hyperparameters as a dict.

save() bytes

Save the plugin to bytes
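
The save()/load() pair describes a bytes round trip: serialize an instance, then rebuild an equivalent one from the buffer. The sketch below shows that contract on a toy class. `ToyPlugin` and its JSON encoding are assumptions chosen to keep the example dependency-free; the library's actual on-disk format is not specified here and may differ.

```python
import json


class ToyPlugin:
    """Hypothetical plugin whose whole state is a single fill value."""

    def __init__(self, fill_value: float = 0.0) -> None:
        self.fill_value = fill_value

    def save(self) -> bytes:
        # Encode the state as UTF-8 JSON bytes (an assumed format).
        return json.dumps({"fill_value": self.fill_value}).encode("utf-8")

    @classmethod
    def load(cls, buff: bytes) -> "ToyPlugin":
        # Rebuild a new instance from the serialized state.
        state = json.loads(buff.decode("utf-8"))
        return cls(fill_value=state["fill_value"])


buff = ToyPlugin(fill_value=1.5).save()
restored = ToyPlugin.load(buff)
```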

static subtype() str

The subtype of the plugin, e.g.: classifier

transform(X: DataFrame) DataFrame

Transform the input. Used by imputers and preprocessors.

Parameters:

X – pd.DataFrame
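
The relationship between fit, transform, fit_transform and is_fitted can be sketched with a toy column-mean imputer. This is a hypothetical stand-in, not the HyperImpute algorithm: `ToyMeanImputer`, its use of `None` for missing values, and the error raised when transform is called before fit are assumptions for illustration.

```python
from statistics import mean


class ToyMeanImputer:
    """Hypothetical imputer that fills missing cells with column means."""

    def __init__(self):
        self._fill = None  # per-column fill values, learned by fit()

    def is_fitted(self):
        return self._fill is not None

    def fit(self, rows):
        n_cols = len(rows[0])
        # Learn one mean per column from the observed training values.
        self._fill = [
            mean(row[j] for row in rows if row[j] is not None)
            for j in range(n_cols)
        ]
        return self  # returning self allows chaining

    def transform(self, rows):
        if not self.is_fitted():
            raise RuntimeError("call fit() before transform()")
        # Reuse the statistics learned during fit(); nothing is re-estimated.
        return [
            [self._fill[j] if value is None else value for j, value in enumerate(row)]
            for row in rows
        ]

    def fit_transform(self, rows):
        return self.fit(rows).transform(rows)


train = [[1.0, 2.0], [3.0, None], [5.0, 6.0]]
new = [[None, 10.0]]

imputer = ToyMeanImputer()
train_filled = imputer.fit_transform(train)
new_filled = imputer.transform(new)
```

In this sketch, transform applied to new data uses the column means learned from the training rows, which is the usual reason to keep fit and transform as separate steps.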

static type() str

The type of the plugin, e.g.: prediction

plugin

alias of HyperImputePlugin