HolidayTransformer
-
class hcrystalball.feature_extraction.HolidayTransformer(country_code=None, country_code_column=None, days_before=0, days_after=0, bridge_days=False)
Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator
Generate holiday features based on the provided ISO code
- Parameters
country_code (str) – ISO code of the country/region
country_code_column (str) – name of the column that holds the ISO code of the country/region
days_before (int) – number of days before the holiday to take into account (e.g. 2 means that a new bool column will be created that is True for the 2 days before a holiday and False otherwise)
days_after (int) – number of days after the holiday to take into account (e.g. 2 means that a new bool column will be created that is True for the 2 days after a holiday and False otherwise)
bridge_days (bool) – overlapping days_before and days_after feature, which serves for modeling working days between holidays
Please be aware that you cannot provide both country_code and country_code_column during initialization, since this would be ambiguous. If you provide country_code_column instead of country_code, the ISO code found in that column is used in place of country_code.
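The effect of days_before and days_after can be illustrated with a small standard-library sketch. It is not hcrystalball's implementation: the helper name flag_days_near_holidays is hypothetical, the holiday date is arbitrary, and the sketch assumes the holiday itself is not counted in either window.

```python
from datetime import date, timedelta


def flag_days_near_holidays(dates, holidays, days_before=0, days_after=0):
    """Return two lists of bools: (before_flags, after_flags).

    Hypothetical helper for illustration; not part of hcrystalball.
    Assumption: the holiday itself is not counted in either window.
    """
    holidays = set(holidays)
    # A day is flagged "before" if a holiday falls 1..days_before days after it
    before_flags = [
        any(d + timedelta(days=k) in holidays for k in range(1, days_before + 1))
        for d in dates
    ]
    # A day is flagged "after" if a holiday falls 1..days_after days before it
    after_flags = [
        any(d - timedelta(days=k) in holidays for k in range(1, days_after + 1))
        for d in dates
    ]
    return before_flags, after_flags


# Ten consecutive days around one arbitrary date treated as a holiday
dates = [date(2021, 1, 1) + timedelta(days=i) for i in range(10)]
holidays = [date(2021, 1, 6)]
before, after = flag_days_near_holidays(dates, holidays, days_before=2, days_after=2)
```

With days_before=2 and days_after=2, the two days preceding the holiday are flagged in the first list and the two days following it in the second; every other day is False in both.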
Attributes Summary
unified_country_code – Utility storing country code or unique value from country_code_column
Methods Summary
fit(X[, y]) – Check if date_col has daily frequency
fit_transform(X[, y]) – Fit to data, then transform it.
Return list with features which the transformer generates
get_params([deep]) – Get parameters for this estimator.
set_params(**params) – Set the parameters of this estimator.
transform(X[, y]) – Create data with ‘holiday’ column
Attributes Documentation
-
unified_country_code
Utility storing country code or unique value from country_code_column
Methods Documentation
-
fit(X, y=None)
Check if date_col has daily frequency.
This check is in the fit method since pandas.infer_freq is used, which requires at least 3 observations.
- Parameters
X (pandas.DataFrame) – Input features.
y (Any) – Ignored
- Returns
self
- Return type
- Raises
ValueError – if the data does not have daily frequency or too few data points are provided in X
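The kind of check described for fit can be sketched with the standard library alone. This is a stand-in, not the library's code: the documented method uses pandas.infer_freq, whereas the hypothetical helper check_daily_frequency below simply compares the gaps between consecutive dates and enforces the same minimum of 3 observations.

```python
from datetime import date, timedelta


def check_daily_frequency(dates):
    """Raise ValueError unless `dates` has at least 3 entries spaced one day apart.

    Hypothetical stand-in for illustration; the documented fit method
    relies on pandas.infer_freq instead of this manual comparison.
    """
    if len(dates) < 3:
        raise ValueError("Need at least 3 observations to infer the frequency.")
    ordered = sorted(dates)
    # Collect the distinct gaps between consecutive dates
    steps = {later - earlier for earlier, later in zip(ordered, ordered[1:])}
    if steps != {timedelta(days=1)}:
        raise ValueError("Only daily frequency is supported.")


daily = [date(2021, 3, 1) + timedelta(days=i) for i in range(5)]
check_daily_frequency(daily)  # daily data passes silently

weekly = [date(2021, 3, 1) + timedelta(weeks=i) for i in range(5)]
try:
    check_daily_frequency(weekly)
except ValueError as err:
    weekly_error = str(err)  # weekly data is rejected
```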
-
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
X ({array-like, sparse matrix, dataframe} of shape (n_samples, n_features)) –
y (ndarray of shape (n_samples,), default=None) – Target values.
**fit_params (dict) – Additional fit parameters.
- Returns
X_new – Transformed array.
- Return type
ndarray of shape (n_samples, n_features_new)
-
get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
mapping of string to any
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
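To make the <component>__<parameter> naming concrete, here is a minimal sketch of how such a key can be split into its two parts. The helper split_nested_param and the step name 'holiday' are hypothetical and exist only for this illustration; they are not part of scikit-learn or hcrystalball.

```python
def split_nested_param(key):
    """Split '<component>__<parameter>' into (component, parameter).

    Hypothetical helper for illustration. A key without '__' refers to a
    parameter of the estimator itself, so the component is returned as None.
    """
    component, delimiter, parameter = key.partition("__")
    if not delimiter:
        return None, key
    return component, parameter


# 'holiday' is a made-up step name for a pipeline containing the transformer
nested = split_nested_param("holiday__days_before")
simple = split_nested_param("days_after")
```

Under this naming scheme, a key such as holiday__days_before addresses the days_before parameter of the component named holiday, while a plain days_after addresses the estimator directly.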
-
transform(X, y=None)
Create data with ‘holiday’ column.
The column contains the names of the holidays based on the provided ‘date’ column.
- Parameters
X (pandas.DataFrame) – Input features.
y (numpy.ndarray) – Ignored.
- Returns
DataFrame with self._col_name column including names of holidays for each of the dates
- Return type
- Raises
KeyError – if ‘country_code_column’ is not found in X
ValueError – if country_code_column has more than 1 value in X
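The two error conditions listed above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: resolve_country_code is a hypothetical helper, rows are plain dicts standing in for DataFrame rows, and the column name 'country' is made up.

```python
def resolve_country_code(rows, country_code_column):
    """Return the single ISO code stored in `country_code_column`.

    Hypothetical helper for illustration; `rows` is a list of dicts
    standing in for the rows of a DataFrame.
    """
    # Missing column: mirrors the documented KeyError
    if any(country_code_column not in row for row in rows):
        raise KeyError(f"Column '{country_code_column}' is not present in the data.")
    # More than one distinct code: mirrors the documented ValueError
    codes = {row[country_code_column] for row in rows}
    if len(codes) != 1:
        raise ValueError(f"Expected exactly one country code, found {sorted(codes)}.")
    return codes.pop()


rows = [
    {"date": "2021-12-24", "country": "DE"},
    {"date": "2021-12-25", "country": "DE"},
]
code = resolve_country_code(rows, "country")
```

A single resolved value of this kind is what the unified_country_code attribute is described as storing when country_code_column is used.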