get_gridsearch

hcrystalball.model_selection.get_gridsearch(frequency, horizon=10, n_splits=5, between_split_lag=None, scoring='neg_mean_absolute_error', country_code_column=None, country_code=None, holidays_days_before=0, holidays_days_after=0, holidays_bridge_days=False, sklearn_models=True, sklearn_models_optimize_for_horizon=False, autosarimax_models=False, autoarima_dict=None, prophet_models=False, tbats_models=False, exp_smooth_models=False, theta_models=False, average_ensembles=False, stacking_ensembles=False, stacking_ensembles_train_horizon=10, stacking_ensembles_train_n_splits=20, clip_predictions_lower=None, clip_predictions_upper=None, exog_cols=None, hcb_verbose=False)

Get grid search object based on selection criteria.

Parameters

frequency (str) – Frequency of the time series; any pandas-compatible frequency string
horizon (int) – How many units of frequency (e.g. 4 quarters) should be used to find the best models
n_splits (int) – How many cross-validation folds should be used in model selection
between_split_lag (int) – Lag, in observations, between consecutive cv_splits. If kept as None, horizon is used, resulting in non-overlapping cv_splits
scoring (str, callable) – Name of an sklearn regression metric as a string, or an hcrystalball-compatible scorer. To create an hcrystalball-compatible scorer, use the make_ts_scorer function.
country_code_column (str, list) – Column(s) in the data that contain the country code as str (e.g. ‘DE’). Used in the holiday transformer. Only one of country_code_column or country_code can be set.
country_code (str, list) – Country code(s) as str (e.g. ‘DE’). Used in the holiday transformer. Only one of country_code_column or country_code can be set.
holidays_days_before (int) – Number of days before the holiday which will be taken into account (e.g. 2 means that a new bool column will be created and will be True for the 2 days before holidays, otherwise False)
holidays_days_after (int) – Number of days after the holiday which will be taken into account (e.g. 2 means that a new bool column will be created and will be True for the 2 days after holidays, otherwise False)
holidays_bridge_days (bool) – Feature marking days where holidays_days_before and holidays_days_after overlap, which serves for modeling working days between holidays
sklearn_models (bool) – Whether to consider sklearn models
sklearn_models_optimize_for_horizon (bool) – Whether to add, on top of the default sklearn models, models that optimize predictions for each horizon
autosarimax_models (bool) – Whether to consider auto sarimax models
autoarima_dict (dict) – Specification of the pmdarima AutoARIMA search space
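prophet_models (bool) – Whether to consider FB prophet models
tbats_models (bool) – Whether to consider TBATS models
exp_smooth_models (bool) – Whether to consider exponential smoothing models
theta_models (bool) – Whether to consider theta models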
average_ensembles (bool) – Whether to consider average ensemble models
stacking_ensembles (bool) – Whether to consider stacking ensemble models
stacking_ensembles_train_horizon (int) – Which horizon should be used in meta model in stacking ensembles
stacking_ensembles_train_n_splits (int) – Number of splits used in meta model in stacking ensembles
clip_predictions_lower (float, int) – Minimal number allowed in the predictions
clip_predictions_upper (float, int) – Maximal number allowed in the predictions
exog_cols (list) – List of columns to be used as exogenous variables
hcb_verbose (bool) – Whether to keep (True) or suppress (False) messages to stdout and stderr from the wrapper and 3rd party libraries during fit and predict
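
The interplay of horizon, n_splits and between_split_lag can be illustrated with a small standalone sketch. It is not hcrystalball's own splitter: the helper name sketch_cv_splits is hypothetical, and the layout (expanding training window, last test window ending at the final observation) is an assumption made for illustration.

```python
def sketch_cv_splits(n_samples, horizon=10, n_splits=5, between_split_lag=None):
    """Return (train, test) index ranges for an assumed expanding-window layout.

    Hypothetical helper, not part of hcrystalball.
    """
    # As documented above: None falls back to horizon, so test windows do not overlap.
    lag = horizon if between_split_lag is None else between_split_lag

    # Assumption: the last test window ends at the final observation.
    first_test_start = n_samples - horizon - (n_splits - 1) * lag
    if first_test_start <= 0:
        raise ValueError("not enough observations for the requested splits")

    splits = []
    for i in range(n_splits):
        test_start = first_test_start + i * lag
        train = range(0, test_start)
        test = range(test_start, test_start + horizon)
        splits.append((train, test))
    return splits


for train, test in sketch_cv_splits(n_samples=100, horizon=10, n_splits=3):
    print(f"train 0..{train[-1]}, test {test[0]}..{test[-1]}")
```

Passing a between_split_lag smaller than horizon (for example 5 with horizon=10) makes consecutive test windows overlap in this sketch, which is the case the None default avoids.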

Returns

CV / Model selection configuration

Return type

sklearn.model_selection.GridSearchCV
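
A minimal usage sketch, assuming hcrystalball is installed. The keyword names come from the signature above; the chosen values and the wrapper build_grid_search are illustrative assumptions, not recommended settings.

```python
# Keyword arguments taken from the documented signature; values are illustrative only.
gridsearch_kwargs = dict(
    frequency="D",                       # daily data (pandas frequency string)
    horizon=7,                           # evaluate models on 7-step forecasts
    n_splits=3,                          # three cross-validation folds
    between_split_lag=None,              # None: horizon is used as the lag
    scoring="neg_mean_absolute_error",   # the documented default metric
    sklearn_models=True,
    exp_smooth_models=True,
    clip_predictions_lower=0.0,          # no negative predictions
    hcb_verbose=False,
)


def build_grid_search(**kwargs):
    """Hypothetical wrapper: import lazily so this file loads without hcrystalball."""
    from hcrystalball.model_selection import get_gridsearch

    return get_gridsearch(**kwargs)
```

Calling build_grid_search(**gridsearch_kwargs) should return the GridSearchCV object described above, which can then be fitted through the usual sklearn interface, grid_search.fit(X, y). The expected form of X and y is not specified in this section.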