:py:mod:`ForeTiS.optimization.optuna_optim`
===========================================

.. py:module:: ForeTiS.optimization.optuna_optim


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   ForeTiS.optimization.optuna_optim.OptunaOptim


.. py:class:: OptunaOptim(save_dir, data, config_file_section, featureset_name, datasplit, test_set_size_percentage, val_set_size_percentage, n_trials, save_final_model, batch_size, n_epochs, current_model_name, datasets, periodical_refit_frequency, refit_drops, refit_window, intermediate_results_interval, pca_transform, config, optimize_featureset, scale_thr, scale_seasons, scale_window_factor, cf_r, cf_order, cf_smooth, cf_thr_perc, scale_window_minimum, max_samples_factor, valtest_seasons, seasonal_valtest, n_splits, config_model_featureset)

   Class that contains all info for the whole optimization using optuna for one model and dataset.

   **Attributes**

       - study (*optuna.study.Study*): optuna study for optimization run
       - current_best_val_result (*float*): the best validation result so far
       - early_stopping_point (*int*): point at which early stopping occurred (relevant for some models)
       - seasonal_periods (*int*): number of samples in one season of the used dataset
       - target_column (*str*): target column for which predictions shall be made
       - best_trials (*list*): list containing the numbers of the best trials
       - user_input_params (*dict*): all params handed over to the constructor that are needed in the whole class
       - base_path (*str*): base_path for save_path
       - save_path (*str*): path for model and results storing

   :param save_dir: directory for saving the results
   :param data: the dataset that you want to use
   :param config_file_section: the section of the config file for the used dataset
   :param featureset_name: name of the feature set used
   :param datasplit: the datasplit method to use: one of 'timeseries-cv', 'train-val-test', or 'cv'
   :param test_set_size_percentage: size of the test set in percent, relevant for cv-test and train-val-test
   :param val_set_size_percentage: size of the validation set in percent, relevant for train-val-test
   :param n_trials: number of trials for optuna
   :param save_final_model: specify if the final model should be saved
   :param batch_size: batch size for neural network models
   :param n_epochs: number of epochs for neural network models
   :param current_model_name: name of the current model according to naming of .py file in package model
   :param datasets: the Dataset class containing the feature sets
   :param periodical_refit_frequency: whether and at which intervals periodical refitting should be performed
   :param refit_drops: number of periods after which the model should be updated
   :param refit_window: number of seasons used for refitting
   :param intermediate_results_interval: number of trials after which intermediate results will be saved
   :param pca_transform: whether pca dimensionality reduction will be optimized or not
   :param config: the information from dataset_specific_config.ini
   :param optimize_featureset: whether the feature set will be optimized or not
   :param scale_thr: only relevant for evars-gpr: output scale threshold
   :param scale_seasons: only relevant for evars-gpr: output scale seasons taken into account
   :param scale_window_factor: only relevant for evars-gpr: scale window factor based on seasonal periods
   :param cf_r: only relevant for evars-gpr: changefinder's r param (decay factor for older values)
   :param cf_order: only relevant for evars-gpr: changefinder's SDAR model order param
   :param cf_smooth: only relevant for evars-gpr: changefinder's smoothing param
   :param cf_thr_perc: only relevant for evars-gpr: percentile of the train set anomaly factors used as threshold for change point detection (cpd) with changefinder
   :param scale_window_minimum: only relevant for evars-gpr: scale window minimum
   :param max_samples_factor: only relevant for evars-gpr: max samples factor of seasons to keep for gpr pipeline
   :param valtest_seasons: define the number of seasons to be used when seasonal_valtest is True
   :param seasonal_valtest: whether validation and test sets should be a multiple of the season length
   :param n_splits: splits to use for 'timeseries-cv' or 'cv'
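
   A minimal sketch of how ``test_set_size_percentage`` and ``val_set_size_percentage`` could translate into a chronological 'train-val-test' split. The helper ``chronological_split_sizes`` is hypothetical and not part of ForeTiS, and it assumes that both percentages refer to the full dataset; the actual implementation may differ.

   .. code-block:: python

      def chronological_split_sizes(n_samples, test_pct, val_pct):
          """Return (n_train, n_val, n_test) for a chronological split.

          Assumption: both percentages refer to the full dataset.
          """
          n_test = int(n_samples * test_pct / 100)
          n_val = int(n_samples * val_pct / 100)
          n_train = n_samples - n_val - n_test
          if n_train <= 0:
              raise ValueError("validation and test sets leave no training data")
          return n_train, n_val, n_test


      # Stand-in for a time-ordered dataset (oldest sample first).
      samples = list(range(200))
      n_train, n_val, n_test = chronological_split_sizes(
          len(samples), test_pct=20, val_pct=20
      )

      # Keep the temporal order: train first, then validation, then test.
      train = samples[:n_train]
      val = samples[n_train:n_train + n_val]
      test = samples[n_train + n_val:]

   Slicing in order, rather than shuffling, keeps the validation and test sets strictly later in time than the training data.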

   .. py:method:: create_new_study()

      Create a new optuna study.

      :return: a new optuna study instance


   .. py:method:: objective(trial)

      Objective function for optuna optimization that returns a score

      :param trial: trial of optuna for optimization

      :return: score of the current hyperparameter config


   .. py:method:: clean_up_after_exception(trial_number, trial_params, reason)

      Clean up things after an exception: delete unfitted model if it exists and update runtime csv

      :param trial_number: number of the trial
      :param trial_params: parameters of the trial
      :param reason: hint for the reason of the Exception


   .. py:method:: write_runtime_csv(dict_runtime)

      Write runtime info to runtime csv file

      :param dict_runtime: dictionary with runtime information
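
      As an illustration only, appending one runtime record per trial to a CSV file might look like the sketch below. The helper ``append_runtime_row``, the file name, and the column names are assumptions for this example and are not taken from ForeTiS.

      .. code-block:: python

         import csv
         import os
         import tempfile


         def append_runtime_row(csv_path, dict_runtime):
             """Append one row; write the header only when the file is new."""
             is_new = not os.path.exists(csv_path)
             with open(csv_path, "a", newline="") as f:
                 writer = csv.DictWriter(f, fieldnames=list(dict_runtime))
                 if is_new:
                     writer.writeheader()
                 writer.writerow(dict_runtime)


         # Hypothetical file name and columns, written to a temporary directory.
         tmp_dir = tempfile.mkdtemp()
         runtime_csv = os.path.join(tmp_dir, "runtime_overview.csv")
         append_runtime_row(runtime_csv, {"Trial": 0, "process_time_s": 1.5})
         append_runtime_row(runtime_csv, {"Trial": 1, "process_time_s": 2.5})

         # Read the file back to check what was stored.
         with open(runtime_csv, newline="") as f:
             rows = list(csv.DictReader(f))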


   .. py:method:: calc_runtime_stats()

      Calculate runtime stats for saved csv file.

      :return: dict with runtime info enhanced with runtime stats
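
      A rough sketch of the kind of summary statistics such a method could derive from a runtime CSV file. The function ``runtime_stats`` and the column name ``process_time_s`` are hypothetical, and the sketch returns only the statistics rather than the enhanced runtime dictionary described above.

      .. code-block:: python

         import csv
         import io
         import statistics

         # In-memory stand-in for a saved runtime CSV file (hypothetical columns).
         csv_text = "Trial,process_time_s\n0,1.0\n1,2.0\n2,3.0\n"


         def runtime_stats(csv_file, column="process_time_s"):
             """Compute mean, sample standard deviation, max and min of one column."""
             values = [float(row[column]) for row in csv.DictReader(csv_file)]
             return {
                 column + "_mean": statistics.mean(values),
                 column + "_std": statistics.stdev(values),
                 column + "_max": max(values),
                 column + "_min": min(values),
             }


         stats = runtime_stats(io.StringIO(csv_text))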


   .. py:method:: check_params_for_duplicate(current_params)

      Check if params were already suggested which might happen by design of TPE sampler.

      :param current_params: dictionary with current parameters

      :return: bool reflecting if current params were already used in the same study
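
      The idea can be sketched as a plain dictionary comparison. The helper ``params_already_used`` and the parameter names are hypothetical; in an optuna study, the parameter dictionaries of earlier trials are available as ``trial.params`` for each trial in ``study.trials``, but how ForeTiS collects them is not shown here.

      .. code-block:: python

         def params_already_used(current_params, past_trial_params):
             """Return True if an identical parameter dict was seen before."""
             return any(current_params == past for past in past_trial_params)


         # Hypothetical parameter sets of earlier trials in the same study.
         past = [
             {"n_estimators": 100, "max_depth": 3},
             {"n_estimators": 200, "max_depth": 5},
         ]

         # Dict equality ignores key order, so this counts as a duplicate.
         seen_first = params_already_used({"max_depth": 3, "n_estimators": 100}, past)
         # A single differing value makes the configuration new.
         seen_second = params_already_used({"max_depth": 4, "n_estimators": 100}, past)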


   .. py:method:: pca_transform_train_test(train, test)

      Deliver PCA transformed train and test set

      :param train: data for the training
      :param test: data for the testing

      :return: tuple of transformed train and test dataset


   .. py:method:: load_retrain_model(path, filename, retrain, early_stopping_point = None, test = None)

      Load and retrain persisted model

      :param path: path where the model is saved
      :param filename: filename of the model
      :param retrain: data for retraining
      :param early_stopping_point: optional early stopping point relevant for some models
      :param test: data for testing

      :return: model instance


   .. py:method:: generate_results_on_test()

      Generate the results on the testing data

      :return: evaluation metrics dictionary


   .. py:method:: get_feature_importance(model, period)

      Get feature importances for models that possess such a feature, e.g. XGBoost

      :param model: model to analyze
      :param period: refitting period

      :return: DataFrame with feature importance information


   .. py:method:: plot_results(final_results)