HolidayTransformer

class hcrystalball.feature_extraction.HolidayTransformer(country_code=None, country_code_column=None, days_before=0, days_after=0, bridge_days=False)

Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator

Generate holiday feature based on provided ISO code

Parameters

country_code (str) – ISO code of the country/region
country_code_column (str) – name of the column which holds the ISO code of the country/region
days_before (int) – number of days before the holiday which will be taken into account (e.g. 2 means that a new boolean column will be created and will be True for the 2 days before each holiday, otherwise False)
days_after (int) – number of days after the holiday which will be taken into account (e.g. 2 means that a new boolean column will be created and will be True for the 2 days after each holiday, otherwise False)
bridge_days (bool) – feature marking days where the days_before and days_after windows overlap, which serves for modeling working days between holidays
Please be aware that you cannot provide both country_code and country_code_column during initialization, since this would be ambiguous. If you provide country_code_column instead of country_code, the ISO code found in that column will be used as the country code.
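The days_before, days_after and bridge_days semantics described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the transformer's implementation: `HOLIDAYS` is a made-up calendar, `window_flags` is a hypothetical helper, and the bridge flag follows one reading of the parameter description (a non-holiday day that falls inside both windows).

```python
from datetime import date, timedelta

# Hypothetical holiday calendar; the real transformer derives holidays from the ISO code.
HOLIDAYS = {date(2020, 12, 25), date(2021, 1, 1)}


def window_flags(day, holidays, days_before=0, days_after=0):
    """Return (before, after, bridge) flags for one calendar day.

    before: a holiday occurs within the next `days_before` days
    after:  a holiday occurred within the previous `days_after` days
    bridge: assumed to mean a non-holiday day for which both flags are True
    """
    before = any(day + timedelta(days=k) in holidays for k in range(1, days_before + 1))
    after = any(day - timedelta(days=k) in holidays for k in range(1, days_after + 1))
    bridge = before and after and day not in holidays
    return before, after, bridge


# With 4-day windows, 28 December sits between the two holidays.
flags_bridge = window_flags(date(2020, 12, 28), HOLIDAYS, days_before=4, days_after=4)
flags_before = window_flags(date(2020, 12, 23), HOLIDAYS, days_before=4, days_after=4)
```

With the default of 0 for both windows, no day is flagged, which matches the defaults in the class signature.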

Attributes Summary

unified_country_code

Utility storing country code or unique value from country_code_column

Methods Summary

`fit`(X[, y])	Check if `date_col` has daily frequency
`fit_transform`(X[, y])	Fit to data, then transform it.
`get_feature_names`()	Return list with features which the transformer generates
`get_params`([deep])	Get parameters for this estimator.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X[, y])	Create data with ‘holiday’ column

Attributes Documentation

unified_country_code: Utility storing country code or unique value from country_code_column

Methods Documentation

fit(X, y=None)

Check if date_col has daily frequency

This check lives in the fit method because it relies on pandas.infer_freq, which requires at least 3 observations.

Parameters

X (pandas.DataFrame) – Input features.
y (Any) – Ignored

Returns

self

Return type

HolidayTransformer

Raises

ValueError – if the data in X does not have a daily frequency or X contains too few data points for the frequency to be inferred
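The kind of check performed in fit can be approximated as follows. This is a minimal standard-library stand-in, assuming the dates are already sorted; `check_daily` is a hypothetical helper and is neither part of hcrystalball nor a replacement for pandas.infer_freq, which handles many more cases.

```python
from datetime import date, timedelta


def check_daily(dates):
    """Raise ValueError unless `dates` holds at least 3 consecutive calendar days."""
    # Mirrors the requirement of at least 3 observations mentioned above.
    if len(dates) < 3:
        raise ValueError("need at least 3 observations to infer a frequency")
    # Collect the distinct gaps between neighbouring dates.
    steps = {later - earlier for earlier, later in zip(dates, dates[1:])}
    if steps != {timedelta(days=1)}:
        raise ValueError("dates do not have a daily frequency")
    return True


daily = [date(2021, 1, 1) + timedelta(days=i) for i in range(5)]
weekly = [date(2021, 1, 1) + timedelta(weeks=i) for i in range(5)]
daily_ok = check_daily(daily)
```

Passing `weekly` or a list with fewer than three dates raises ValueError in this sketch.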

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

ndarray array of shape (n_samples, n_features_new)

get_feature_names(): Return list with features which the transformer generates

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(X, y=None)

Create data with ‘holiday’ column

The column contains the names of the holidays based on the provided ‘date’ column

Parameters

X (pandas.DataFrame) – Input features.
y (numpy.ndarray) – Ignored.

Returns

DataFrame with self._col_name column containing the names of the holidays for each date

Return type

pandas.DataFrame

Raises

KeyError – if ‘country_code_column’ is not found in X
ValueError – if country_code_column has more than 1 value in X
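The two error conditions above, together with the rule that country_code and country_code_column are mutually exclusive, can be sketched like this. It is an illustration only: `resolve_country_code` is a hypothetical helper operating on plain dictionaries rather than a pandas.DataFrame, and raising ValueError when both arguments are given (or when no rows are supplied) is an assumption, since the source does not name an exception for those cases.

```python
def resolve_country_code(rows, country_code=None, country_code_column=None):
    """Return the single country code that applies to `rows` (a list of dicts)."""
    # Assumption: providing both arguments is rejected with a ValueError.
    if country_code is not None and country_code_column is not None:
        raise ValueError("provide either country_code or country_code_column, not both")
    if country_code_column is None:
        return country_code
    # KeyError if the column is missing from the data.
    if any(country_code_column not in row for row in rows):
        raise KeyError(country_code_column)
    unique = {row[country_code_column] for row in rows}
    # ValueError if the column does not hold exactly one distinct value.
    if len(unique) != 1:
        raise ValueError(f"expected exactly one country code, found {sorted(unique)}")
    return unique.pop()


rows = [{"country": "DE", "sales": 10}, {"country": "DE", "sales": 12}]
from_column = resolve_country_code(rows, country_code_column="country")
from_argument = resolve_country_code(rows, country_code="DE")
```

The returned value plays the role that the unified_country_code attribute is described as having: either the code passed directly or the unique value found in the column.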