HolidayTransformer
- class hcrystalball.feature_extraction.HolidayTransformer(country_code=None, country_code_column=None, days_before=0, days_after=0, bridge_days=False)
Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator
Generate a holiday feature based on the provided ISO code
- Parameters
country_code (str) – ISO code of the country/region
country_code_column (str) – name of the column which holds the ISO code of the country/region
days_before (int) – number of days before the holiday which will be taken into account (e.g. 2 means that a new bool column will be created and will be True for the 2 days before holidays, otherwise False)
days_after (int) – number of days after the holiday which will be taken into account (e.g. 2 means that a new bool column will be created and will be True for the 2 days after holidays, otherwise False)
bridge_days (bool) – overlapping days_before and days_after feature which serves for modeling working days between holidays
Please be aware that you cannot provide both country_code and country_code_column during initialization since this would be ambiguous. If you provide country_code_column instead of country_code, the ISO code found in the column will be assigned into country_code column.
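The interplay of days_before, days_after and bridge_days can be pictured with a small standard-library sketch. It is not the transformer's implementation: the helper window_flags, the flag names and the holiday dates are hypothetical, and it assumes a bridge day is any day covered by both windows at once.

```python
from datetime import date, timedelta


def window_flags(days, holidays, days_before=0, days_after=0, bridge_days=False):
    """Return one dict of boolean flags per day (illustrative only)."""
    rows = []
    for day in days:
        # True when a holiday falls 1..days_before days after `day`.
        before = any(day + timedelta(k) in holidays for k in range(1, days_before + 1))
        # True when a holiday falls 1..days_after days before `day`.
        after = any(day - timedelta(k) in holidays for k in range(1, days_after + 1))
        row = {"date": day, "before": before, "after": after}
        if bridge_days:
            # Assumption: a bridge day is covered by both windows at once.
            row["bridge"] = before and after
        rows.append(row)
    return rows


# Hypothetical holidays, chosen only so that the two windows overlap.
holidays = {date(2021, 1, 1), date(2021, 1, 4)}
days = [date(2020, 12, 30) + timedelta(i) for i in range(8)]
flags = window_flags(days, holidays, days_before=2, days_after=2, bridge_days=True)
bridge = [row["date"] for row in flags if row["bridge"]]
```

Here the two days between the hypothetical holidays fall into both windows, so they are the only ones flagged as bridge days.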
Attributes Summary
unified_country_code – Utility storing country code or unique value from country_code_column
Methods Summary
fit(X[, y]) – Check if date_col has daily frequency
fit_transform(X[, y]) – Fit to data, then transform it.
Return list with features which the transformer generates
get_params([deep]) – Get parameters for this estimator.
set_params(**params) – Set the parameters of this estimator.
transform(X[, y]) – Create data with ‘holiday’ column
Attributes Documentation
- unified_country_code
Utility storing country code or unique value from country_code_column
Methods Documentation
- fit(X, y=None)
Check if date_col has daily frequency
This check is in the fit method since pandas.infer_freq is used, which requires at least 3 observations.
- Parameters
X (pandas.DataFrame) – Input features.
y (Any) – Ignored
- Returns
self
- Return type
- Raises
ValueError – in case daily frequency is not used or very few datapoints are provided in X
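The kind of check described for fit can be sketched with the standard library alone. This is a stand-in for the pandas.infer_freq based check, not the library's code: the helper check_daily_frequency and its error messages are hypothetical, and it simply requires at least 3 observations spaced exactly one day apart.

```python
from datetime import date, timedelta


def check_daily_frequency(dates):
    """Raise ValueError unless `dates` holds at least 3 observations one day apart."""
    if len(dates) < 3:
        # Mirrors the minimum number of observations needed to infer a frequency.
        raise ValueError("Need at least 3 observations to infer the frequency.")
    one_day = timedelta(days=1)
    for previous, current in zip(dates, dates[1:]):
        if current - previous != one_day:
            raise ValueError("Only daily frequency is supported.")
    return True


daily = [date(2021, 1, 1) + timedelta(days=i) for i in range(5)]
weekly = [date(2021, 1, 1) + timedelta(weeks=i) for i in range(5)]
check_daily_frequency(daily)  # passes; `weekly` or `daily[:2]` would raise ValueError
```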
- fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters.
- Returns
X_new – Transformed array.
- Return type
ndarray array of shape (n_samples, n_features_new)
- get_params(deep=True)
Get parameters for this estimator.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
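The <component>__<parameter> convention can be illustrated with a toy class. This is only a sketch of the routing idea, not scikit-learn's implementation; Component and the parameter names passed to it are hypothetical.

```python
class Component:
    """Hypothetical stand-in for an estimator whose parameters are plain attributes."""

    def __init__(self, **params):
        self.__dict__.update(params)

    def set_params(self, **params):
        for key, value in params.items():
            # Split "<component>__<parameter>" on the first double underscore.
            name, sep, rest = key.partition("__")
            if sep:
                # Delegate the remainder of the key to the nested component.
                getattr(self, name).set_params(**{rest: value})
            else:
                if not hasattr(self, name):
                    raise ValueError(f"Invalid parameter {name!r}")
                setattr(self, name, value)
        return self


holiday = Component(days_before=0, days_after=0)
pipeline = Component(holiday=holiday, verbose=False)
# Update a nested parameter and a top-level one in a single call.
pipeline.set_params(holiday__days_before=2, verbose=True)
```

Returning self from set_params is what allows calls to be chained, matching the "Returns self" entry above.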
- transform(X, y=None)
Create data with ‘holiday’ column
The column contains the names of the holidays based on the provided ‘date’ column
- Parameters
X (pandas.DataFrame) – Input features.
y (numpy.ndarray) – Ignored.
- Returns
DataFrame with self._col_name column including names of holidays for each of the dates
- Return type
- Raises
KeyError – if ‘country_code_column’ is not found in X
ValueError – if country_code_column has more than 1 value in X
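The validation rules stated on this page (country_code and country_code_column are mutually exclusive, the column must exist, and it must hold a single value) can be sketched as follows. The helper resolve_country_code is hypothetical and uses a plain dict of lists as a stand-in for the DataFrame; the treatment of an empty column is an assumption.

```python
def resolve_country_code(columns, country_code=None, country_code_column=None):
    """Return the single country code to use (illustrative only).

    `columns` maps column names to lists of values, standing in for a DataFrame.
    """
    if country_code is not None and country_code_column is not None:
        # Providing both would be ambiguous.
        raise ValueError("Provide either country_code or country_code_column, not both.")
    if country_code_column is None:
        return country_code
    if country_code_column not in columns:
        raise KeyError(country_code_column)
    unique_codes = set(columns[country_code_column])
    # Assumption: an empty column is rejected in the same way as several values.
    if len(unique_codes) != 1:
        raise ValueError("country_code_column must hold exactly one unique value.")
    return unique_codes.pop()


data = {"country": ["DE", "DE", "DE"], "sales": [1, 2, 3]}
code = resolve_country_code(data, country_code_column="country")
```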