partition_data_by_values
hcrystalball.model_selection.partition_data_by_values(df, column, partition_values, default_df=None)
Partition data by one column and a fixed set of values within that column.
If a value is not present in the column, default_df can optionally supply the data for that partition.
- Parameters
df (pandas.DataFrame) – Data to be partitioned
column (str) – column with values to partition by
partition_values (list) – values to partition by
default_df (pandas.DataFrame) – data to be used as default in case a value is not present
- Returns
Partition dictionary with keys:
labels : Tuple of dictionaries whose keys are the column names
and values are the actual values in the column
data : Tuple of pandas.DataFrame objects holding the subset of the data with
- Return type
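
The behaviour described above can be illustrated with a minimal pandas-only sketch. This is not the library's implementation: `partition_by_values_sketch`, the `region` and `sales` columns, and the `fallback` frame are all hypothetical names chosen for the example. Whether the real function keeps the partition column inside each subset is not stated here, so the sketch simply keeps it.

```python
import pandas as pd


def partition_by_values_sketch(df, column, partition_values, default_df=None):
    """Hypothetical stand-in that mimics the documented return shape."""
    labels = []
    data = []
    for value in partition_values:
        # Rows whose `column` equals the requested partition value.
        subset = df[df[column] == value]
        labels.append({column: value})
        # Assumption: a value with no matching rows falls back to default_df,
        # which may itself be None if no default was given.
        data.append(subset if not subset.empty else default_df)
    return {"labels": tuple(labels), "data": tuple(data)}


df = pd.DataFrame({"region": ["north", "north", "south"], "sales": [10, 12, 7]})
fallback = pd.DataFrame({"region": ["unknown"], "sales": [0]})

# "east" does not occur in df, so its partition receives the fallback frame.
result = partition_by_values_sketch(
    df, "region", ["north", "east"], default_df=fallback
)
```

In this sketch `result["labels"]` and `result["data"]` are tuples of equal length, aligned by position, so the i-th label describes the i-th DataFrame.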