ForeTiS.preprocess.raw_data_functions
Module Contents
Functions
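drop_columns(df, columns) | Function dropping all columns specified |
drop_rows_by_dates(df, start, end) | Function dropping rows within specified dates |
custom_resampler(arraylike, target_column) | Custom resampling function when resampling frequency of dataset |
get_one_hot_encoded_df(df, columns_to_encode) | Function delivering dataframe with specified columns one hot encoded |
get_simple_imputer(df, strategy='mean') | Get simple imputer for each column according to specified strategy |
get_iter_imputer(df, sample_posterior=True, max_iter=100, min_value=0, max_value=None) | Multivariate, iterative imputer fitted to df with specified parameters |
get_knn_imputer(df, n_neighbors=10) | Imputer of missing values according to k-nearest neighbors in feature space |
| Function that encodes cyclic features as sine and cosine components |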
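For the cyclic encoding mentioned in the last row, the following is a minimal sketch of the general technique rather than the ForeTiS implementation. The column name `month`, the period of 12, and the output column names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: a month-of-year feature (1..12)
df = pd.DataFrame({"month": [1, 4, 7, 10, 12]})

period = 12  # assumption: the feature repeats every 12 steps
angle = 2 * np.pi * df["month"] / period

# Map each value onto the unit circle so that month 12 and month 1 end up close together
df["month_sin"] = np.sin(angle)
df["month_cos"] = np.cos(angle)

print(df.round(3))
```

Using both components keeps the encoding unambiguous, since a single sine value corresponds to two different positions in the cycle.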
- ForeTiS.preprocess.raw_data_functions.drop_columns(df, columns)
Function dropping all columns specified
- Parameters:
df (pandas.DataFrame) – dataset used for dropping
columns (list) – columns which should be dropped
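As a rough illustration of the effect, the sketch below uses plain pandas instead of calling the function itself. The column names are made up, and because the entry lists no return value, whether `drop_columns` modifies `df` in place is an assumption left open here.

```python
import pandas as pd

# Hypothetical dataset with two columns that are not needed downstream
df = pd.DataFrame({
    "sales": [10, 12, 9],
    "store_id": [1, 1, 1],
    "comment": ["a", "b", "c"],
})

# Plain-pandas stand-in for drop_columns(df, columns=["store_id", "comment"])
reduced = df.drop(columns=["store_id", "comment"])

print(list(reduced.columns))
```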
- ForeTiS.preprocess.raw_data_functions.drop_rows_by_dates(df, start, end)
Function dropping rows within specified dates
- Parameters:
df (pandas.DataFrame) – dataset used for dropping
start (datetime.date) – start date for dropped period
end (datetime.date) – end date for dropped period
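The behaviour can be sketched with plain pandas as follows. This assumes a DataFrame with a DatetimeIndex and treats both `start` and `end` as inclusive; neither detail is stated in the entry above.

```python
import datetime

import pandas as pd

# Hypothetical daily dataset indexed by date
idx = pd.date_range("2021-01-01", periods=6, freq="D")
df = pd.DataFrame({"sales": range(6)}, index=idx)

start = datetime.date(2021, 1, 2)
end = datetime.date(2021, 1, 4)

# Assumption: both boundaries belong to the dropped period
in_period = (df.index >= pd.Timestamp(start)) & (df.index <= pd.Timestamp(end))
kept = df.loc[~in_period]

print(kept)
```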
- ForeTiS.preprocess.raw_data_functions.custom_resampler(arraylike, target_column)
Custom resampling function when resampling frequency of dataset
- Parameters:
arraylike (pandas.Series) – Series to use for calculation
target_column (str) – chosen target column
- Returns:
sum or mean of arraylike or 1
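A resampler of this kind might look like the sketch below. The function `sketch_resampler` is hypothetical: it assumes the target column is summed and every other column is averaged, and it does not cover the case in which the documented function returns 1.

```python
import pandas as pd


def sketch_resampler(arraylike: pd.Series, target_column: str):
    """Hypothetical stand-in: sum the target column, average everything else."""
    if arraylike.name == target_column:
        return arraylike.sum()
    return arraylike.mean()


# Hypothetical daily data that is resampled to two-day windows
idx = pd.date_range("2021-01-01", periods=4, freq="D")
df = pd.DataFrame(
    {"sales": [1.0, 2.0, 3.0, 4.0], "temp": [10.0, 12.0, 14.0, 16.0]},
    index=idx,
)

# Apply the resampler column by column within each window
rows = {}
for window_start, chunk in df.resample("2D"):
    rows[window_start] = {
        col: sketch_resampler(chunk[col], target_column="sales")
        for col in chunk.columns
    }
resampled = pd.DataFrame.from_dict(rows, orient="index")

print(resampled)
```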
- ForeTiS.preprocess.raw_data_functions.get_one_hot_encoded_df(df, columns_to_encode)
Function delivering dataframe with specified columns one hot encoded
- Parameters:
df (pandas.DataFrame) – dataset to use for encoding
columns_to_encode (list) – columns to encode
- Returns:
dataset with encoded columns
- Return type:
pandas.DataFrame
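One-hot encoding of selected columns can be illustrated with `pandas.get_dummies`. Whether the function uses this call internally, and how it names the resulting columns, are assumptions; the data is made up.

```python
import pandas as pd

# Hypothetical dataset with one categorical column
df = pd.DataFrame({"sales": [10, 12, 9], "weekday": ["Mon", "Tue", "Mon"]})

# Plain-pandas stand-in for get_one_hot_encoded_df(df, columns_to_encode=["weekday"])
encoded = pd.get_dummies(df, columns=["weekday"])

print(list(encoded.columns))
```

Columns that are not listed for encoding are passed through unchanged.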
- ForeTiS.preprocess.raw_data_functions.get_simple_imputer(df, strategy='mean')
Get simple imputer for each column according to specified strategy
- Parameters:
df (pandas.DataFrame) – DataFrame to impute
strategy (str) – strategy to use, e.g. ‘mean’ or ‘median’
- Returns:
imputer
- Return type:
sklearn.impute.SimpleImputer
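The sketch below shows how a `SimpleImputer` with the `mean` strategy behaves when used directly through scikit-learn. Whether the returned imputer is already fitted to `df` is not stated above, so the sketch fits it explicitly; the data is made up.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical dataset with one missing value per column
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 6.0, np.nan]})

# Each column is imputed independently with its own mean
imputer = SimpleImputer(strategy="mean").fit(df)
filled = pd.DataFrame(imputer.transform(df), columns=df.columns, index=df.index)

print(filled)
```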
- ForeTiS.preprocess.raw_data_functions.get_iter_imputer(df, sample_posterior=True, max_iter=100, min_value=0, max_value=None)
Multivariate, iterative imputer fitted to df with specified parameters
- Parameters:
df (pandas.DataFrame) – DataFrame to fit for imputation
sample_posterior (bool) – whether to sample from the predictive posterior of the fitted estimator (default estimator: BayesianRidge())
max_iter (int) – maximum number of iterations for imputation
min_value (int) – minimum possible imputed value
max_value (int) – maximum possible imputed value
- Returns:
imputer
- Return type:
sklearn.impute.IterativeImputer
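For orientation, this is how an `IterativeImputer` with comparable settings can be built directly in scikit-learn. The data, the `random_state`, and the choice to leave `max_value` at the scikit-learn default are assumptions made for this sketch.

```python
import numpy as np
import pandas as pd

# IterativeImputer is experimental and must be enabled explicitly
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Hypothetical dataset: y depends on x, with two y values missing
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=20)
df = pd.DataFrame({"x": x, "y": 2 * x + rng.normal(0, 0.1, size=20)})
df.loc[[3, 7], "y"] = np.nan

imputer = IterativeImputer(
    sample_posterior=True,  # draw imputations from the predictive posterior
    max_iter=100,
    min_value=0,            # imputed values are clipped to be non-negative
    random_state=0,         # assumption: fixed seed for reproducibility
)
imputer.fit(df)
filled = imputer.transform(df)

print(filled[[3, 7]])
```

Because `sample_posterior=True` draws random samples, the imputed values change between runs unless a seed is fixed.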
- ForeTiS.preprocess.raw_data_functions.get_knn_imputer(df, n_neighbors=10)
Imputer of missing values according to k-nearest neighbors in feature space
- Parameters:
df (pandas.DataFrame) – DataFrame to use for imputation
n_neighbors (int) – number of neighbors to use for imputation
- Returns:
imputer
- Return type:
sklearn.impute.KNNImputer
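A minimal sketch of k-nearest-neighbors imputation with scikit-learn follows. It uses `n_neighbors=2` instead of the documented default of 10 because the made-up dataset has only four rows.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Hypothetical dataset: the second row is missing its "b" value
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10.0, np.nan, 30.0, 40.0],
})

# The missing value is replaced by the mean of "b" over the nearest rows,
# where distance is measured on the features that are present
imputer = KNNImputer(n_neighbors=2).fit(df)
filled = imputer.transform(df)

print(filled)
```

Here the two nearest rows to `a = 2.0` are those with `a = 1.0` and `a = 3.0`, so the imputed value is the mean of 10.0 and 30.0.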