autoprognosis.plugins.imputers.plugin_sinkhorn module
- class SinkhornPlugin(random_state: int = 0, **kwargs: Any)
Bases: ImputerPlugin
Sinkhorn imputation can be used to impute quantitative data. It relies on the idea that two batches drawn at random from the same dataset should share the same distribution, and it fills in the missing values by minimizing optimal transport distances between batches.
- Args:
- eps: float, default=0.01
Sinkhorn regularization parameter.
- lr: float, default=0.01
Learning rate.
- opt: torch.optim.Optimizer, default=torch.optim.Adam
Optimizer class to use for fitting.
- n_epochs: int, default=15
Number of gradient updates for each model within a cycle.
- batch_size: int, default=256
Size of the batches on which the Sinkhorn divergence is evaluated.
- n_pairs: int, default=10
Number of batch pairs used per gradient update.
- noise: float, default=0.1
Noise used for the missing values initialization.
- scaling: float, default=0.9
Scaling parameter in Sinkhorn iterations.
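The noise parameter above controls how the missing entries are seeded before optimization. The following is a minimal NumPy sketch of one plausible initialization, assuming each missing value starts at the mean of the observed values in its column plus Gaussian noise scaled by noise. The helper name init_missing is hypothetical and is not part of the plugin's API.

```python
import numpy as np


def init_missing(X, noise=0.1, random_state=0):
    """Seed NaN entries with the observed column mean plus scaled Gaussian noise.

    Hypothetical helper for illustration only; not the plugin's own code.
    """
    rng = np.random.default_rng(random_state)
    X = np.asarray(X, dtype=float)
    mask = np.isnan(X)                      # True where a value is missing
    col_means = np.nanmean(X, axis=0)       # means over observed entries only
    candidates = col_means[None, :] + noise * rng.standard_normal(X.shape)
    filled = X.copy()
    filled[mask] = candidates[mask]         # observed entries stay untouched
    return filled, mask


X = [[1, 1, 1, 1],
     [np.nan, np.nan, np.nan, np.nan],
     [1, 2, 2, 1],
     [2, 2, 2, 2]]
filled, mask = init_missing(X, noise=0.1)
```

Only the entries flagged in mask would then be treated as free variables during optimization; the observed entries are never modified.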
Example
>>> import numpy as np
>>> from autoprognosis.plugins.imputers import Imputers
>>> plugin = Imputers().get("sinkhorn")
>>> plugin.fit_transform([[1, 1, 1, 1], [np.nan, np.nan, np.nan, np.nan], [1, 2, 2, 1], [2, 2, 2, 2]])
          0         1         2         3
0  1.000000  1.000000  1.000000  1.000000
1  1.404637  1.651113  1.651093  1.404638
2  1.000000  2.000000  2.000000  1.000000
3  2.000000  2.000000  2.000000  2.000000
- Reference: “Missing Data Imputation using Optimal Transport”, Boris Muzellec, Julie Josse, Claire Boyer, Marco Cuturi
Original code: https://github.com/BorisMuzellec/MissingDataOT
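To make the batch idea in the class description concrete, the sketch below computes a Sinkhorn divergence between two point clouds with log-domain Sinkhorn iterations in plain NumPy. It is an illustration under stated assumptions, not the plugin's implementation: the function names, the squared-Euclidean cost, the uniform weights and the fixed iteration count are all choices made here, and the plugin's scaling parameter is not modelled. It shows that two batches drawn from the same data give a small divergence, while a shifted batch gives a large one.

```python
import numpy as np


def _logsumexp(a, axis):
    """Numerically stable log-sum-exp along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True)), axis=axis)


def entropic_ot(x, y, eps=0.1, n_iters=200):
    """Entropy-regularized OT cost between two uniformly weighted point clouds."""
    n, m = len(x), len(y)
    # Pairwise cost: half the squared Euclidean distance (an assumption of this sketch).
    cost = 0.5 * ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(n_iters):
        # Alternating updates of the dual potentials in the log domain.
        f = -eps * _logsumexp(log_b[None, :] + (g[None, :] - cost) / eps, axis=1)
        g = -eps * _logsumexp(log_a[:, None] + (f[:, None] - cost) / eps, axis=0)
    # Dual objective <a, f> + <b, g> for uniform weights.
    return float(f.mean() + g.mean())


def sinkhorn_divergence(x, y, eps=0.1):
    """Debiased divergence: OT(x, y) - (OT(x, x) + OT(y, y)) / 2."""
    return entropic_ot(x, y, eps) - 0.5 * (entropic_ot(x, x, eps) + entropic_ot(y, y, eps))


rng = np.random.default_rng(0)
data = rng.normal(size=(512, 2))
batch_a = data[rng.choice(512, size=64, replace=False)]
batch_b = data[rng.choice(512, size=64, replace=False)]

d_same = sinkhorn_divergence(batch_a, batch_b)           # same distribution
d_shifted = sinkhorn_divergence(batch_a, batch_b + 3.0)  # shifted distribution
print(d_same, d_shifted)
```

In an imputation setting, the imputed entries of each batch would be adjusted by gradient steps so that divergences of this kind between random batch pairs shrink.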
- change_output(output: str) -> None
- fit_predict(X: DataFrame, *args: Any, **kwargs: Any) -> DataFrame
Fit the model and predict the training data. Used by predictors.
- fit_transform(X: DataFrame, *args: Any, **kwargs: Any) -> DataFrame
Fit the model and transform the training data. Used by imputers and preprocessors.
- classmethod fqdn() -> str
The fully-qualified name of the plugin: type->subtype->name
- static hyperparameter_space(*args: Any, **kwargs: Any) -> List[Params]
The hyperparameter search domain, used for tuning.
- classmethod hyperparameter_space_fqdn(*args: Any, **kwargs: Any) -> List[Params]
The hyperparameter domain, using their fully-qualified names.
- is_fitted() -> bool
Check whether the model has been trained.
- classmethod load(buff: bytes) -> ImputerPlugin
Load the plugin from bytes.
- static name() -> str
The name of the plugin, e.g.: xgboost
- predict(X: DataFrame, *args: Any, **kwargs: Any) -> DataFrame
Run predictions for the input. Used by predictors.
- Parameters:
X – pd.DataFrame
- classmethod sample_hyperparameters(trial: optuna.trial.Trial, *args: Any, **kwargs: Any) -> Dict[str, Any]
Sample hyperparameters for Optuna.
- classmethod sample_hyperparameters_fqdn(trial: optuna.trial.Trial, *args: Any, **kwargs: Any) -> Dict[str, Any]
Sample hyperparameters using their fully-qualified names.
- classmethod sample_hyperparameters_np(random_state: int = 0, *args: Any, **kwargs: Any) -> Dict[str, Any]
Sample hyperparameters as a dict.
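As a rough illustration of seeded sampling into a dict, the sketch below draws one configuration from a toy search space with NumPy. The parameter names follow the Args list of the class, but the ranges, the SEARCH_SPACE structure and the function body are assumptions made for this example; they do not reproduce the plugin's actual search domain.

```python
import numpy as np

# Hypothetical search space: names mirror the documented arguments,
# ranges and choices are illustrative only.
SEARCH_SPACE = {
    "eps": ("float", 1e-3, 1e-1),
    "lr": ("float", 1e-3, 1e-1),
    "n_epochs": ("int", 5, 50),
    "batch_size": ("categorical", [64, 128, 256]),
}


def sample_hyperparameters_np(random_state=0):
    """Draw one configuration as a plain dict, reproducibly for a given seed."""
    rng = np.random.default_rng(random_state)
    sample = {}
    for name, spec in SEARCH_SPACE.items():
        kind = spec[0]
        if kind == "float":
            sample[name] = float(rng.uniform(spec[1], spec[2]))
        elif kind == "int":
            # Upper bound is inclusive in this sketch.
            sample[name] = int(rng.integers(spec[1], spec[2] + 1))
        else:
            choices = spec[1]
            sample[name] = choices[int(rng.integers(len(choices)))]
    return sample


config = sample_hyperparameters_np(random_state=0)
```

Fixing random_state makes repeated calls return the same configuration, which is useful when comparing runs.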
- save() -> bytes
Save the plugin to bytes.
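The save and load pair describes a round trip through bytes. The sketch below shows that pattern on a hypothetical stand-in class, ToyImputer, using the standard-library pickle module on the object's state. The serialization format the plugin actually uses is not stated here, so both the class and the choice of pickle are assumptions.

```python
import pickle


class ToyImputer:
    """Hypothetical stand-in with the same save/load shape as the plugin."""

    def __init__(self, eps: float = 0.01) -> None:
        self.eps = eps
        self.fitted = False

    def fit(self) -> "ToyImputer":
        self.fitted = True  # a real imputer would learn from data here
        return self

    def is_fitted(self) -> bool:
        return self.fitted

    def save(self) -> bytes:
        # Serialize only the instance state, not the class itself.
        return pickle.dumps(self.__dict__)

    @classmethod
    def load(cls, buff: bytes) -> "ToyImputer":
        obj = cls.__new__(cls)  # skip __init__, then restore the saved state
        obj.__dict__.update(pickle.loads(buff))
        return obj


buff = ToyImputer(eps=0.05).fit().save()
restored = ToyImputer.load(buff)
```

As with any pickle-based format, bytes should only be loaded from a trusted source.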
- static subtype() -> str
The subtype of the plugin, e.g.: classifier
- transform(X: DataFrame) -> DataFrame
Transform the input. Used by imputers and preprocessors.
- Parameters:
X – pd.DataFrame
- static type() -> str
The type of the plugin, e.g.: prediction
- plugin
alias of SinkhornPlugin