:py:mod:`ForeTiS.preprocess.raw_data_functions`
===============================================

.. py:module:: ForeTiS.preprocess.raw_data_functions


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   ForeTiS.preprocess.raw_data_functions.drop_columns
   ForeTiS.preprocess.raw_data_functions.drop_rows_by_dates
   ForeTiS.preprocess.raw_data_functions.custom_resampler
   ForeTiS.preprocess.raw_data_functions.get_one_hot_encoded_df
   ForeTiS.preprocess.raw_data_functions.get_simple_imputer
   ForeTiS.preprocess.raw_data_functions.get_iter_imputer
   ForeTiS.preprocess.raw_data_functions.get_knn_imputer
   ForeTiS.preprocess.raw_data_functions.encode_cyclical_features


.. py:function:: drop_columns(df, columns)

   Drop all specified columns from the dataset.

   :param df: dataset used for dropping
   :param columns: columns which should be dropped


.. py:function:: drop_rows_by_dates(df, start, end)

   Drop all rows that fall within the specified date range.

   :param df: dataset used for dropping
   :param start: start date of the dropped period
   :param end: end date of the dropped period


.. py:function:: custom_resampler(arraylike, target_column)

   Custom aggregation function applied when resampling the dataset to a different frequency.

   :param arraylike: Series to use for calculation
   :param target_column: chosen target column

   :return: sum or mean of arraylike or 1


.. py:function:: get_one_hot_encoded_df(df, columns_to_encode)

   Return the dataset with the specified columns one-hot encoded.

   :param df: dataset to use for encoding
   :param columns_to_encode: columns to encode

   :return: dataset with encoded columns


.. py:function:: get_simple_imputer(df, strategy = 'mean')

   Get a simple imputer that fills each column according to the specified strategy.

   :param df: DataFrame to impute
   :param strategy: strategy to use, e.g. 'mean' or 'median'

   :return: imputer


.. py:function:: get_iter_imputer(df, sample_posterior = True, max_iter = 100, min_value = 0, max_value = None)

   Get a multivariate, iterative imputer fitted to df with the specified parameters.

   :param df: DataFrame to fit for imputation
   :param sample_posterior: whether to sample from the predictive posterior of the fitted estimator (default: BayesianRidge())
   :param max_iter: maximum number of iterations for imputation
   :param min_value: minimum value for imputation
   :param max_value: maximum value for imputation

   :return: imputer


.. py:function:: get_knn_imputer(df, n_neighbors = 10)

   Get an imputer that fills missing values from the k nearest neighbors in feature space.

   :param df: DataFrame to use for imputation
   :param n_neighbors: number of neighbors to use for imputation

   :return: imputer


.. py:function:: encode_cyclical_features(df, columns)

   Encode cyclical features as sine and cosine components.

   :param df: DataFrame to use for encoding
   :param columns: columns that should be encoded
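
The idea behind sine/cosine encoding of cyclical features can be illustrated with a minimal sketch. This is not the ForeTiS implementation: the helper name ``encode_cyclical_sketch``, the explicit ``periods`` argument, the ``_sin``/``_cos`` column suffixes, and keeping the original column are all assumptions made for illustration. Each value is mapped to an angle on the unit circle, so the end of a cycle (e.g. December) lands next to its start (January).

```python
import numpy as np
import pandas as pd


def encode_cyclical_sketch(df, periods):
    """Hypothetical helper: add sine/cosine columns for each cyclical column.

    `periods` maps a column name to the length of its cycle (assumption:
    the caller knows the period, e.g. 12 for months).
    """
    out = df.copy()
    for col, period in periods.items():
        # Map each value to an angle on the unit circle
        angle = 2 * np.pi * out[col] / period
        out[f"{col}_sin"] = np.sin(angle)
        out[f"{col}_cos"] = np.cos(angle)
    return out


months = pd.DataFrame({"month": [1, 6, 12]})
encoded = encode_cyclical_sketch(months, {"month": 12})
print(encoded)
```

In the encoded space, December and January are close together while December and June sit on opposite sides of the circle, which a plain integer encoding would not reflect.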