ModelSelectorResult
-
class hcrystalball.model_selection.ModelSelectorResult(best_model, cv_results, cv_data, model_reprs, partition, X_train, y_train, frequency, horizon, country_code_column)
Bases: object
Consolidate information and methods from cross-validation for one time series
Stores all relevant information about model selection and provides utility methods (e.g. plot_result) and data (e.g. df_plot) for easier access to further insights.
- Parameters
best_model (sklearn compatible estimator) – best model found during model selection
cv_results (pandas.DataFrame) – cv_results of sklearn.model_selection.GridSearchCV in form of DataFrame
cv_data (pandas.DataFrame) – data with models predictions, cv split indication and true target values
model_reprs (dict) – dictionary of model representations used in model selection in form of {model_hash : model_repr}
partition (dict) – dictionary indicating which part of the data the model selection results belong to, e.g. {"Region": "Canada", "Product": "Chips"}
X_train (pandas.DataFrame) – training data features
y_train (pandas.Series) – training data target
frequency (str) – temporal frequency of data on which the model was trained / selected
horizon (int) – how many steps ahead predictions were made
country_code_column (str) – name of the column with the ISO code of the country/region, which can be used for supplying holidays, e.g. 'State' with values like 'DE', 'CZ' or 'Region' with values like 'DE-NW', 'DE-HE', etc.
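As a rough illustration of the dictionary-shaped parameters above, the following sketch builds a partition mapping and a model_reprs mapping of the form {model_hash : model_repr}. The helper name make_model_reprs, the MD5 hashing scheme, and the placeholder representation strings are assumptions made for this example; they are not necessarily what hcrystalball uses internally.

```python
import hashlib


def make_model_reprs(model_descriptions):
    """Map a hash of each model representation to the representation itself.

    Hypothetical helper: hashing the repr string with MD5 is an assumption
    made for illustration only.
    """
    return {
        hashlib.md5(desc.encode("utf-8")).hexdigest(): desc
        for desc in model_descriptions
    }


# Which slice of the data these model selection results belong to.
partition = {"Region": "Canada", "Product": "Chips"}

# Plain placeholder strings stand in for the repr() of sklearn-compatible estimators.
model_reprs = make_model_reprs(
    ["ExponentialSmoothing(trend='add')", "LinearRegression()"]
)

for model_hash, model_repr in model_reprs.items():
    print(model_hash[:8], model_repr)
```

Keying representations by hash keeps tabular results compact while still allowing a lookup of the full model description.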
Attributes Summary
cv_splits_overlap – Indicator for cv_splits overlap in training data
df_plot – Training data suitable for plotting.
Methods Summary
persist([attribute_name, path]) – Persist whole object or particular object attributes
plot_error(**plot_params) – Plot model absolute error during model selection
plot_result([plot_from]) – Plot model performance from the given plot_from timestamp
Attributes Documentation
-
cv_splits_overlap
Indicator for cv_splits overlap in training data
- Returns
Whether cv_splits in training data contain overlap
- Return type
-
df_plot
Training data suitable for plotting.
Utility that prepares data from model selection for further model performance analysis.
- Returns
Data suitable for plotting
- Return type
Methods Documentation
-
persist(attribute_name=None, path='')
Persist whole object or particular object attributes
- Parameters
- Raises
ValueError – If attribute_name is not a valid option. The error message lists the available ones.
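The behavior described above (persist everything, or one named attribute, and raise ValueError for an unknown name) could be sketched as follows. This is not the library's implementation: persist_sketch, VALID_ATTRIBUTES, FakeResult, the pickle format, and the file naming are all assumptions chosen for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical set of attribute names accepted by this sketch.
VALID_ATTRIBUTES = {"best_model", "cv_results", "cv_data", "partition"}


def persist_sketch(result, attribute_name=None, path=""):
    """Pickle the whole object, or a single named attribute, under `path`."""
    if attribute_name is None:
        payload, file_stem = result, "model_selector_result"
    elif attribute_name in VALID_ATTRIBUTES:
        payload, file_stem = getattr(result, attribute_name), attribute_name
    else:
        # Mirror the documented behavior: reject and list the valid options.
        raise ValueError(
            f"'{attribute_name}' is not a valid option. "
            f"Available ones: {sorted(VALID_ATTRIBUTES)}"
        )
    file_path = os.path.join(path, f"{file_stem}.pkl")
    with open(file_path, "wb") as handle:
        pickle.dump(payload, handle)
    return file_path


class FakeResult:
    """Minimal stand-in for a model selection result."""

    def __init__(self):
        self.partition = {"Region": "Canada", "Product": "Chips"}


# Persist one attribute into a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    saved = persist_sketch(FakeResult(), attribute_name="partition", path=tmp_dir)
    with open(saved, "rb") as handle:
        restored = pickle.load(handle)
    print(restored)

# An unknown attribute name is rejected before anything is written.
try:
    persist_sketch(FakeResult(), attribute_name="not_an_attribute")
except ValueError as err:
    error_message = str(err)
    print(error_message)
```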
-
plot_error(**plot_params)
Plot model absolute error during model selection
- Parameters
plot_params (kwargs) – plotting parameters passed down to pandas.DataFrame.plot(), dependent on your plotting backend, e.g. figsize=(16, 9), title='Performance of Model'
- Returns
Plot produced by your plotting backend (matplotlib by default)
- Return type
pandas.DataFrame.plot()
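For orientation, the quantity being plotted can be pictured as the element-wise absolute difference between actuals and cross-validation forecasts. The sketch below uses plain lists and a hypothetical absolute_errors helper; how the library computes and aggregates the error internally is an assumption here.

```python
def absolute_errors(y_true, y_pred):
    """Return |actual - predicted| for each pair of values."""
    return [abs(actual - predicted) for actual, predicted in zip(y_true, y_pred)]


# Made-up numbers, purely for illustration.
actuals = [100.0, 120.0, 90.0]
cv_forecast = [110.0, 115.0, 95.0]

errors = absolute_errors(actuals, cv_forecast)
print(errors)  # [10.0, 5.0, 5.0]

# Mean absolute error over the illustrated points.
mean_absolute_error = sum(errors) / len(errors)
print(mean_absolute_error)
```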
-
plot_result(plot_from=None, **plot_params)
Plot model performance from the given plot_from timestamp
- Parameters
plot_from (str) – date from which to show actuals, cv_forecast and forecast. The default behavior does not filter dates.
plot_params (kwargs) – plotting parameters passed down to pandas.DataFrame.plot(), dependent on your plotting backend, e.g. figsize=(16, 9), title='Performance of Model'
- Returns
Plot produced by your plotting backend (matplotlib by default)
- Return type
pandas.DataFrame.plot()
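The effect of plot_from can be pictured as a date filter applied before plotting. The following sketch uses plain dictionaries instead of a DataFrame; the filter_from helper and the inclusive lower bound are assumptions, not a statement about the library's internals.

```python
from datetime import date


def filter_from(rows, plot_from=None):
    """Keep rows dated on or after `plot_from`; keep everything if it is None."""
    if plot_from is None:
        return list(rows)
    start = date.fromisoformat(plot_from)
    return [row for row in rows if row["date"] >= start]


# Made-up observations, purely for illustration.
rows = [
    {"date": date(2020, 1, 1), "actuals": 10.0},
    {"date": date(2020, 1, 2), "actuals": 12.0},
    {"date": date(2020, 1, 3), "actuals": 11.0},
]

print(len(filter_from(rows)))  # 3
print(len(filter_from(rows, plot_from="2020-01-02")))  # 2
```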