select_model

hcrystalball.model_selection.select_model(df, target_col_name, partition_columns, grid_search, parallel_over_dict=None, frequency=None, country_code_column=None)

Find the best model from grid_search for each partition.

Parameters

df (pandas.DataFrame) – Data to be used for model selection
target_col_name (str) – Name of the target column
partition_columns (list) – List of column names in the input dataframe to be used for partitioning the data.
grid_search (sklearn.model_selection.GridSearchCV) – Grid search instance compatible with scikit-learn's GridSearchCV
parallel_over_dict (dict) – Dictionary with the partition label used for parallelization (e.g. {'Region': 'Africa'})
frequency (str) – Frequency at which model selection was done. It is mainly for bookkeeping purposes and is later used to instantiate result objects.
country_code_column (str) – Name of the column with the ISO code of the country/region, which can be used for supplying holidays, e.g. 'State' with values like 'DE', 'CZ' or 'Region' with values like 'DE-NW', 'DE-HE', etc.

Returns

List of ModelSelectorResults containing all important information for each partition

Return type

list
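
Example

The following is a minimal sketch of how partition_columns and parallel_over_dict might relate to each other. It uses only pandas and does not call hcrystalball. The helper split_into_partitions, the toy data and the column names are hypothetical and exist only for illustration, and the library's actual partitioning logic may differ in its details.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
df = pd.DataFrame(
    {
        "Region": ["Africa", "Africa", "Africa", "Europe", "Europe", "Europe"],
        "Product": ["A", "A", "B", "A", "B", "B"],
        "Sales": [10, 12, 7, 20, 15, 16],
    }
)


def split_into_partitions(df, partition_columns, parallel_over_dict=None):
    """Illustrative helper (not part of hcrystalball): return {label: sub-frame}."""
    if parallel_over_dict:
        # Keep only rows matching every key/value pair, e.g. {"Region": "Africa"}.
        for column, value in parallel_over_dict.items():
            df = df[df[column] == value]
    partitions = {}
    # One group per unique combination of values in partition_columns.
    for key, group in df.groupby(partition_columns):
        # Older pandas versions return a scalar key for a single column.
        key = key if isinstance(key, tuple) else (key,)
        partitions[tuple(zip(partition_columns, key))] = group
    return partitions


# Every (Region, Product) combination becomes its own partition.
all_partitions = split_into_partitions(df, ["Region", "Product"])

# Restricting to one label mimics what a single parallel worker would handle.
africa_only = split_into_partitions(
    df, ["Region", "Product"], parallel_over_dict={"Region": "Africa"}
)
```

In this sketch each worker receives a different parallel_over_dict and so handles only its own slice of the partitions. In select_model, the grid_search would then be fitted once per partition.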