prepare_data_for_training

hcrystalball.model_selection.prepare_data_for_training(df, frequency, partition_columns, parallel_over_columns=None, country_code_column=None)

Prepare data for model selection.

Transforms data into the form expected by model selection / training, ensuring the correct frequency and filling NaN values.

Parameters
  • df (pandas.DataFrame) – Data to be transformed; must have a date column of type str

  • frequency (str) – Frequency identifier ('D', 'M', etc.)

  • partition_columns (list) – Column(s) that identify the individual time series (partitions) in the data

  • parallel_over_columns (list) – Column(s) that define the logical segmentation of the data for training

  • country_code_column (str) – Name of the column from which to take holiday ISO information

Returns

Resampled, aggregated data

Return type

pandas.DataFrame
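
Example

The description above mentions ensuring the correct frequency and filling NaN per time series. The following is a minimal pandas-only sketch of that idea; it is not the hcrystalball implementation. The helper name resample_partitions, the column names date, store and sales, the sum aggregation, and the fill value of 0.0 are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical input: two stores, each with a gap in its daily data.
df = pd.DataFrame(
    {
        "date": ["2024-01-01", "2024-01-03", "2024-01-01", "2024-01-02", "2024-01-04"],
        "store": ["A", "A", "B", "B", "B"],
        "sales": [10.0, 12.0, 5.0, 7.0, 6.0],
    }
)


def resample_partitions(df, frequency, partition_columns, date_column="date", fill_value=0.0):
    """Illustrative helper (not part of hcrystalball): resample each partition
    to a regular frequency and fill the gaps that resampling exposes."""
    data = df.copy()
    # The date column arrives as str, so convert it before resampling.
    data[date_column] = pd.to_datetime(data[date_column])

    pieces = []
    for _, group in data.groupby(partition_columns):
        # Keep only the numeric value columns, indexed by date.
        values = (
            group.drop(columns=partition_columns)
            .set_index(date_column)
            .select_dtypes("number")
        )
        # min_count=1 leaves empty periods as NaN so they can be filled explicitly.
        resampled = values.resample(frequency).sum(min_count=1).fillna(fill_value)
        # Restore the partition labels on the resampled rows.
        for col in partition_columns:
            resampled[col] = group[col].iloc[0]
        pieces.append(resampled)

    return pd.concat(pieces).sort_index()


result = resample_partitions(df, "D", ["store"])
print(result)
```

Each partition is resampled separately, so a gap in one series is filled without borrowing rows from another. The actual function may aggregate and fill differently.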