get_gridsearch

hcrystalball.model_selection.get_gridsearch(frequency, horizon=10, n_splits=5, between_split_lag=None, scoring='neg_mean_absolute_error', country_code_column=None, country_code=None, holidays_days_before=0, holidays_days_after=0, holidays_bridge_days=False, sklearn_models=True, sklearn_models_optimize_for_horizon=False, autosarimax_models=False, autoarima_dict=None, prophet_models=False, tbats_models=False, exp_smooth_models=False, theta_models=False, average_ensembles=False, stacking_ensembles=False, stacking_ensembles_train_horizon=10, stacking_ensembles_train_n_splits=20, clip_predictions_lower=None, clip_predictions_upper=None, exog_cols=None, hcb_verbose=False)

Get grid search object based on selection criteria.

Parameters
  • frequency (str) – Frequency of the time series, given as a pandas-compatible frequency string

  • horizon (int) – How many units of frequency (e.g. 4 quarters) should be used to find the best models

  • n_splits (int) – How many cross-validation folds should be used in model selection

  • between_split_lag (int) – Lag, in observations, between consecutive cv_splits. If kept as None, horizon is used, resulting in non-overlapping cv_splits

  • scoring (str, callable) – Name of an sklearn regression metric, or an hcrystalball-compatible scorer. To create an hcrystalball-compatible scorer, use the make_ts_scorer function.

  • country_code_column (str, list) – Column(s) in the data that contain the country code as a string (e.g. ‘DE’). Used in the holiday transformer. Only one of country_code_column or country_code can be set.

  • country_code (str, list) – Country code(s) as strings (e.g. ‘DE’). Used in the holiday transformer. Only one of country_code_column or country_code can be set.

  • holidays_days_before (int) – Number of days before the holiday which will be taken into account (e.g. 2 means that a new bool column will be created that is True for the 2 days before holidays and False otherwise)

  • holidays_days_after (int) – Number of days after the holiday which will be taken into account (e.g. 2 means that a new bool column will be created that is True for the 2 days after holidays and False otherwise)

  • holidays_bridge_days (bool) – Whether to add a feature for overlapping holidays_days_before and holidays_days_after, which serves for modeling working days that fall between holidays

  • sklearn_models (bool) – Whether to consider sklearn models

  • sklearn_models_optimize_for_horizon (bool) – Whether to add, on top of the default sklearn models, models that optimize predictions for each horizon

  • autosarimax_models (bool) – Whether to consider auto sarimax models

  • autoarima_dict (dict) – Specification of the pmdarima auto-ARIMA search space

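  • prophet_models (bool) – Whether to consider FB prophet models

  • tbats_models (bool) – Whether to consider TBATS models

  • exp_smooth_models (bool) – Whether to consider exponential smoothing models

  • theta_models (bool) – Whether to consider theta models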

  • average_ensembles (bool) – Whether to consider average ensemble models

  • stacking_ensembles (bool) – Whether to consider stacking ensemble models

  • stacking_ensembles_train_horizon (int) – Horizon used to train the meta model in stacking ensembles

  • stacking_ensembles_train_n_splits (int) – Number of splits used to train the meta model in stacking ensembles

  • clip_predictions_lower (float, int) – Minimum value allowed in the predictions

  • clip_predictions_upper (float, int) – Maximum value allowed in the predictions

  • exog_cols (list) – List of columns to be used as exogenous variables

  • hcb_verbose (bool) – Whether to keep (True) or suppress (False) messages to stdout and stderr from the wrapper and 3rd party libraries during fit and predict
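
The interplay of horizon, n_splits and between_split_lag can be pictured with a small index calculation. The following is a minimal sketch and not hcrystalball's own splitter: the helper sketch_test_windows is hypothetical, and it assumes that test windows are counted back from the end of the series, that each window is horizon observations long, and that all observations before a window are available for training.

```python
def sketch_test_windows(n_samples, horizon=10, n_splits=5, between_split_lag=None):
    """Return (test_start, test_end) index pairs, oldest window first.

    Hypothetical helper for illustration only; test_end is exclusive and
    everything before test_start is assumed to be available for training.
    """
    # Mirror the documented default: a lag of None falls back to the horizon.
    lag = horizon if between_split_lag is None else between_split_lag
    windows = []
    for i in range(n_splits):
        # Assumption: windows are counted back from the end of the series.
        test_end = n_samples - i * lag
        test_start = test_end - horizon
        if test_start <= 0:
            raise ValueError("not enough observations for the requested splits")
        windows.append((test_start, test_end))
    return windows[::-1]


# Default lag (None): consecutive windows touch but do not overlap.
print(sketch_test_windows(100, horizon=10, n_splits=3))
# Lag smaller than the horizon: consecutive windows share 5 observations.
print(sketch_test_windows(100, horizon=10, n_splits=3, between_split_lag=5))
```

Under these assumptions, the default lag requires a series with more than horizon * n_splits observations, while a smaller between_split_lag fits the same number of splits into a shorter series at the cost of overlapping test windows.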

Returns

CV / Model selection configuration

Return type

sklearn.model_selection.GridSearchCV