HolidayTransformer

class hcrystalball.feature_extraction.HolidayTransformer(country_code=None, country_code_column=None, days_before=0, days_after=0, bridge_days=False)[source]

Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator

Generate a holiday feature based on the provided ISO country code

Parameters
  • country_code (str) – ISO code of the country/region

  • country_code_column (str) – name of the column that contains the ISO code of the country/region

  • days_before (int) – number of days before the holiday to take into account (e.g. 2 means that a new bool column will be created that is True for the 2 days before each holiday and False otherwise)

  • days_after (int) – number of days after the holiday to take into account (e.g. 2 means that a new bool column will be created that is True for the 2 days after each holiday and False otherwise)

  • bridge_days (bool) – feature marking days where the days_before and days_after windows overlap, which serves for modeling working days that fall between holidays

Please be aware that you cannot provide both country_code and country_code_column during initialization, since this would be ambiguous. If you provide country_code_column instead of country_code, the ISO code found in that column is used as the country code.
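The days_before and days_after windows described above can be illustrated with a small standard-library sketch. The helper window_flags is hypothetical and is not part of hcrystalball. It also assumes that the holiday itself is not flagged by either window, which the description above does not state.

```python
from datetime import date, timedelta


def window_flags(days, holidays, days_before=0, days_after=0):
    """Return {day: (before_flag, after_flag)} for every day in `days`.

    Hypothetical helper for illustration only, not hcrystalball code.
    Assumption: the holiday itself is flagged by neither window.
    """
    holidays = set(holidays)
    flags = {}
    for day in days:
        # True if a holiday falls 1..days_before days after this day.
        before = any(
            day + timedelta(days=k) in holidays
            for k in range(1, days_before + 1)
        )
        # True if a holiday falls 1..days_after days before this day.
        after = any(
            day - timedelta(days=k) in holidays
            for k in range(1, days_after + 1)
        )
        flags[day] = (before, after)
    return flags


christmas = date(2020, 12, 25)
# Six consecutive days: 22 to 27 December 2020.
days = [date(2020, 12, 22) + timedelta(days=i) for i in range(6)]
flags = window_flags(days, [christmas], days_before=2, days_after=2)

for day, (before, after) in flags.items():
    print(day, "before:", before, "after:", after)
```

With days_before=2 and days_after=2, the two days preceding the holiday get the first flag and the two days following it get the second.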

Attributes Summary

unified_country_code

Utility storing country code or unique value from country_code_column

Methods Summary

fit(X[, y])

Check if date_col has daily frequency

fit_transform(X[, y])

Fit to data, then transform it.

get_feature_names()

Return the list of features that the transformer generates

get_params([deep])

Get parameters for this estimator.

set_params(**params)

Set the parameters of this estimator.

transform(X[, y])

Create data with ‘holiday’ column

Attributes Documentation

unified_country_code

Utility storing country code or unique value from country_code_column

Methods Documentation

fit(X, y=None)[source]

Check if date_col has daily frequency

This check is in fit method since pandas.infer_freq is used which requires at least 3 observations.
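The constraint mentioned above comes from pandas.infer_freq itself, which raises a ValueError when given fewer than 3 dates. The following is a minimal sketch of such a check, not the library's actual implementation; the helper name check_daily_frequency is hypothetical.

```python
import pandas as pd


def check_daily_frequency(index):
    """Raise ValueError unless `index` has an inferable daily frequency.

    Hypothetical helper for illustration only, not hcrystalball code.
    """
    # pandas.infer_freq raises ValueError on its own for fewer than 3 dates.
    freq = pd.infer_freq(index)
    if freq != "D":
        raise ValueError(f"Expected daily frequency, got {freq!r}")
    return freq


daily = pd.date_range("2020-01-01", periods=5, freq="D")
check_daily_frequency(daily)  # passes: five consecutive days

try:
    check_daily_frequency(daily[:2])  # only two observations
except ValueError as err:
    print("rejected:", err)
```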

Parameters
Returns

self

Return type

HolidayTransformer

Raises

ValueError – if X does not have daily frequency or contains too few data points

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

ndarray array of shape (n_samples, n_features_new)

get_feature_names()[source]

Return the list of features that the transformer generates

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

dict

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
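The <component>__<parameter> form can be seen in a short scikit-learn sketch. StandardScaler is used here only as a stand-in step so that the example depends on scikit-learn alone. Because HolidayTransformer derives from BaseEstimator, a pipeline step holding it should be addressable in the same way, for example holidays__days_before=2 for a step that is (hypothetically) named "holidays".

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# "scaler" is the step name; it becomes the prefix before the double underscore.
pipe = Pipeline([("scaler", StandardScaler())])

# Update a parameter of the nested step through the pipeline.
pipe.set_params(scaler__with_mean=False)

# The same key is exposed by get_params on the pipeline.
print(pipe.get_params()["scaler__with_mean"])
```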

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

estimator instance

transform(X, y=None)[source]

Create data with ‘holiday’ column

The column contains the names of the holidays based on the provided ‘date’ column

Parameters
Returns

DataFrame with a self._col_name column containing the name of the holiday for each date

Return type

pandas.DataFrame

Raises
  • KeyError – if ‘country_code_column’ is not found in X

  • ValueError – if country_code_column has more than 1 value in X
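The two error conditions listed above can be sketched as follows. This is a reading of the documented behaviour, not the library's code; the helper resolve_country_code and the column name "country" are hypothetical.

```python
import pandas as pd


def resolve_country_code(X, country_code_column):
    """Return the single country code stored in `country_code_column`.

    Hypothetical helper mirroring the documented checks, not hcrystalball code.
    """
    if country_code_column not in X.columns:
        raise KeyError(f"Column {country_code_column!r} not found in X")
    codes = X[country_code_column].unique()
    if len(codes) > 1:
        raise ValueError(
            f"Expected one country code, found {len(codes)}: {list(codes)}"
        )
    return codes[0]


index = pd.date_range("2020-12-24", periods=3, freq="D")
X = pd.DataFrame({"country": ["DE", "DE", "DE"]}, index=index)
print(resolve_country_code(X, "country"))

# A column mixing several codes is rejected.
X_mixed = pd.DataFrame({"country": ["DE", "FR", "DE"]}, index=index)
try:
    resolve_country_code(X_mixed, "country")
except ValueError as err:
    print("rejected:", err)
```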