ForeTiS.model._base_model

Module Contents

Classes

BaseModel

BaseModel parent class for all models that can be used within the framework.

class ForeTiS.model._base_model.BaseModel(optuna_trial, datasets, featureset_name, pca_transform, target_column, optimize_featureset)[source]

Bases: abc.ABC

BaseModel parent class for all models that can be used within the framework.

Every model must be based on BaseModel directly or BaseModel’s child classes, e.g. SklearnModel or TorchModel

Attributes

Instance attributes:

  • optuna_trial (optuna.trial.Trial): trial of optuna for optimization

  • datasets (list<pd.DataFrame>): all datasets that are available

  • n_outputs (int): number of outputs of the prediction model

  • all_hyperparams (dict): dictionary with all hyperparameters with related info that can be tuned (structure see define_hyperparams_to_tune)

  • dataset (pd.DataFrame): the dataset for this optimization trial

  • model: model object

  • target_column: the target column for the prediction

  • pca_transform: whether PCA transformation should be treated as a hyperparameter to optimize

  • featureset: the current feature set

Parameters:
  • optuna_trial (optuna.trial.Trial) – Trial of optuna for optimization

  • datasets (list) – all datasets that are available

  • featureset_name (str) – the name of the current feature set

  • target_column (str) – the target column for the prediction

  • pca_transform (bool) – whether PCA transformation should be treated as a hyperparameter to optimize

  • optimize_featureset (bool) – whether the feature set should be optimized or not

abstract define_model()[source]

Method that defines the model that needs to be optimized. Hyperparameters to tune have to be specified in all_hyperparams and suggested via suggest_hyperparam_to_optuna(). The hyperparameters have to be included directly in the model definition to be optimized. For example, if you want to optimize the number of layers, do something like:

n_layers = self.suggest_hyperparam_to_optuna('n_layers')  # same name as in define_hyperparams_to_tune()
for layer in range(n_layers):
    # add a layer to the model
    ...

Then the number of layers will be optimized by optuna.

abstract define_hyperparams_to_tune()[source]

Method that defines the hyperparameters that should be tuned during optimization and their ranges. Required format is a dictionary with:

{
    'name_hyperparam_1':
        {
        # MANDATORY ITEMS
        'datatype': 'float' | 'int' | 'categorical',
        FOR DATATYPE 'categorical':
            'list_of_values': []  # List of all possible values
        FOR DATATYPE ['float', 'int']:
            'lower_bound': value_lower_bound,
            'upper_bound': value_upper_bound,
            # OPTIONAL ITEMS (only for ['float', 'int']):
            'log': True | False  # sample value from log domain or not
            'step': step_size  # step of discretization
                               # Caution: cannot be combined with log=True
                               #   - in case of 'float' in general and
                               #   - for step != 1 in case of 'int'
        },
    'name_hyperparam_2':
        {
        ...
        },
    ...
    'name_hyperparam_k':
        {
        ...
        }
}

If you want to use a similar hyperparameter multiple times (e.g. Dropout after several layers), you only need to specify the hyperparameter once. Individual parameters for every suggestion will be created.

Return type:

dict
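As an illustration of the format above (the hyperparameter names 'lr', 'n_estimators', and 'kernel' are invented for this sketch and are not part of ForeTiS), a subclass might return a dictionary such as:

```python
def define_hyperparams_to_tune():
    # Sketch of a possible return value following the documented schema.
    return {
        'lr': {
            'datatype': 'float',
            'lower_bound': 1e-4,
            'upper_bound': 1e-1,
            'log': True,   # sample on a log scale
        },
        'n_estimators': {
            'datatype': 'int',
            'lower_bound': 50,
            'upper_bound': 500,
            'step': 50,    # discretize; step != 1 cannot be combined with log=True
        },
        'kernel': {
            'datatype': 'categorical',
            'list_of_values': ['linear', 'rbf'],
        },
    }
```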

abstract retrain(retrain)[source]

Method that runs the retraining of the model

Parameters:

retrain (pandas.DataFrame) – data for retraining

abstract update(update, period)[source]

Method that runs the updating of the model

Parameters:
  • update (pandas.DataFrame) – data for updating

  • period (int) –

abstract predict(X_in)[source]

Method that predicts target values based on the input X_in

Parameters:

X_in (pandas.DataFrame) – feature matrix as input

Returns:

numpy array with the predicted values

Return type:

numpy.array

abstract train_val_loop(train, val)[source]

Method that runs the whole training and validation loop

Parameters:
  • train (pandas.DataFrame) – data for the training

  • val (pandas.DataFrame) – data for validation

Returns:

predictions on validation set

Return type:

numpy.array

suggest_hyperparam_to_optuna(hyperparam_name)[source]

Suggest a hyperparameter of all_hyperparams to the optuna trial to optimize it.

If you want a parameter in your model or pipeline to be optimized, you need to call this method for it.

Parameters:

hyperparam_name (str) – name of the hyperparameter to be tuned (see define_hyperparams_to_tune)

Returns:

suggested value
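The mapping from the dictionary format described in define_hyperparams_to_tune to optuna's suggest API can be sketched as follows. This is an illustration, not the ForeTiS implementation; trial is assumed to be an optuna.trial.Trial (whose public methods suggest_categorical, suggest_int, and suggest_float are used here):

```python
def suggest_hyperparam(trial, name, spec):
    # Dispatch one entry of the all_hyperparams dictionary to the matching
    # optuna suggest call (sketch; mirrors optuna.trial.Trial's public API).
    if spec['datatype'] == 'categorical':
        return trial.suggest_categorical(name, spec['list_of_values'])
    if spec['datatype'] == 'int':
        return trial.suggest_int(name, spec['lower_bound'], spec['upper_bound'],
                                 step=spec.get('step', 1), log=spec.get('log', False))
    if spec['datatype'] == 'float':
        return trial.suggest_float(name, spec['lower_bound'], spec['upper_bound'],
                                   step=spec.get('step'), log=spec.get('log', False))
    raise ValueError(f"unknown datatype: {spec['datatype']}")
```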

suggest_all_hyperparams_to_optuna()[source]

Some models accept a dictionary with the model parameters. This method suggests all hyperparameters in all_hyperparams and returns a dictionary containing them.

Returns:

dictionary with suggested hyperparameters

Return type:

dict
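A self-contained sketch of this idea (not the ForeTiS implementation): suggest every entry of all_hyperparams and collect the results into one dict, e.g. so a model that accepts its parameters as keyword arguments can be built with Model(**params):

```python
def suggest_all_hyperparams(trial, all_hyperparams):
    # Collect one suggested value per entry of all_hyperparams into a dict.
    # Sketch only; the real method lives on BaseModel and uses self.all_hyperparams.
    params = {}
    for name, spec in all_hyperparams.items():
        if spec['datatype'] == 'categorical':
            params[name] = trial.suggest_categorical(name, spec['list_of_values'])
        elif spec['datatype'] == 'int':
            params[name] = trial.suggest_int(name, spec['lower_bound'], spec['upper_bound'])
        else:  # 'float'
            params[name] = trial.suggest_float(name, spec['lower_bound'], spec['upper_bound'])
    return params
```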

featureset_hyperparam()[source]

Method that defines the feature set hyperparameter that should be tuned during optimization and its ranges.

pca_transform()[source]

Method that defines the pca transform hyperparameter that should be tuned during optimization and its ranges.

save_model(path, filename)[source]

Persist the whole model object to disk (it can be loaded again with load_model)

Parameters:
  • path (str) – path where the model will be saved

  • filename (str) – filename of the model
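The documentation does not state the serialization backend; a minimal sketch using the standard-library pickle module (an assumption; the actual ForeTiS backend may differ, e.g. joblib for sklearn-based models) might look like:

```python
import os
import pickle

def save_model(model, path, filename):
    # Persist the whole model object to disk.
    # pickle is an assumption here, not confirmed by the ForeTiS docs.
    with open(os.path.join(path, filename), 'wb') as f:
        pickle.dump(model, f)

def load_model(path, filename):
    # Counterpart loader, mirroring the load_model mentioned in the docs.
    with open(os.path.join(path, filename), 'rb') as f:
        return pickle.load(f)
```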