autoprognosis.plugins.imputers.plugin_mice module

class MicePlugin(random_state: int = 0, **kwargs: Any)

Bases: ImputerPlugin

Imputation plugin for completing missing values using Multivariate Imputation by Chained Equations (MICE) with multiple imputations.

Method: Multivariate Imputation by Chained Equations (MICE) methods model each feature with missing values as a function of the other features in a round-robin fashion. Each step of the round-robin imputation uses a BayesianRidge estimator, which performs a regularized linear regression. The class sklearn.impute.IterativeImputer can generate multiple imputations of the same incomplete dataset, and a regression or classification model can then be trained on each of them. Setting sample_posterior=True makes the IterativeImputer draw each missing value at random from the Gaussian posterior of the predictions. If each IterativeImputer uses a different random_state, this yields multiple imputations, each of which can be used to train a predictive model. The final result is the average of all n_imputations estimates.
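The procedure described above can be sketched directly with scikit-learn. This is a minimal illustration under stated assumptions, not the plugin's actual implementation: `mice_average` is a hypothetical helper, its argument names only mirror the documented parameters, and `max_iter` is kept small here for speed.

```python
import numpy as np
# IterativeImputer is experimental in scikit-learn and must be enabled explicitly.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge


def mice_average(X, n_imputations=5, max_iter=10, random_state=0):
    """Hypothetical helper: average several posterior-sampled imputations."""
    X = np.asarray(X, dtype=float)
    imputations = []
    for k in range(n_imputations):
        imputer = IterativeImputer(
            estimator=BayesianRidge(),      # regularized linear regression per feature
            sample_posterior=True,          # draw imputed values from the Gaussian posterior
            max_iter=max_iter,              # number of round-robin imputation rounds
            random_state=random_state + k,  # a different seed gives a different imputation
        )
        imputations.append(imputer.fit_transform(X))
    # Observed entries are identical across imputations; only the
    # originally missing entries are actually averaged.
    return np.mean(imputations, axis=0)


X = [[1, 1, 1, 1], [np.nan, np.nan, np.nan, np.nan], [1, 2, 2, 1], [2, 2, 2, 2]]
X_imputed = mice_average(X)
print(X_imputed)
```

Offsetting the seed per imputation is one simple way to make the draws differ while keeping the overall result reproducible; the exact imputed values depend on the scikit-learn version and the seed.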

Parameters:

n_imputations – int, default=5. Number of multiple imputations to perform.
max_iter – int, default=500. Maximum number of imputation rounds to perform.
random_state – int, default=0 (per the class signature). Seed of the pseudo-random number generator to use.

Example

>>> import numpy as np
>>> from autoprognosis.plugins.imputers import Imputers
>>> plugin = Imputers().get("mice")
>>> plugin.fit_transform([[1, 1, 1, 1], [np.nan, np.nan, np.nan, np.nan], [1, 2, 2, 1], [2, 2, 2, 2]])
          0        1         2         3
0  1.000000  1.00000  1.000000  1.000000
1  1.222412  1.68686  1.687483  1.221473
2  1.000000  2.00000  2.000000  1.000000
3  2.000000  2.00000  2.000000  2.000000

change_output(output: str) → None

fit(X: DataFrame, *args: Any, **kwargs: Any) → Plugin

Train the plugin

Parameters: X – pd.DataFrame

fit_predict(X: DataFrame, *args: Any, **kwargs: Any) → DataFrame: Fit the model and predict the training data. Used by predictors.

fit_transform(X: DataFrame, *args: Any, **kwargs: Any) → DataFrame: Fit the model and transform the training data. Used by imputers and preprocessors.

classmethod fqdn() → str: The fully-qualified name of the plugin: type->subtype->name

static hyperparameter_space(*args: Any, **kwargs: Any) → List[Params]: The hyperparameter search domain, used for tuning.

classmethod hyperparameter_space_fqdn(*args: Any, **kwargs: Any) → List[Params]: The hyperparameter domain, using the fully-qualified name.

is_fitted() → bool: Check if the model was trained

classmethod load(buff: bytes) → ImputerPlugin: Load the plugin from bytes

static name() → str: The name of the plugin, e.g.: xgboost

predict(X: DataFrame, *args: Any, **kwargs: Any) → DataFrame

Run predictions for the input. Used by predictors.

Parameters: X – pd.DataFrame

classmethod sample_hyperparameters(trial: optuna.trial.Trial, *args: Any, **kwargs: Any) → Dict[str, Any]: Sample hyperparameters for Optuna.

classmethod sample_hyperparameters_fqdn(trial: optuna.trial.Trial, *args: Any, **kwargs: Any) → Dict[str, Any]: Sample hyperparameters using the fully-qualified name.

classmethod sample_hyperparameters_np(random_state: int = 0, *args: Any, **kwargs: Any) → Dict[str, Any]: Sample hyperparameters as a dict.

save() → bytes: Save the plugin to bytes

static subtype() → str: The subtype of the plugin, e.g.: classifier

transform(X: DataFrame) → DataFrame

Transform the input. Used by imputers and preprocessors.

Parameters: X – pd.DataFrame

static type() → str: The type of the plugin, e.g.: prediction

plugin: alias of MicePlugin