partition_data
hcrystalball.model_selection.partition_data(df, partition_by)
Partition data by values found in one or more columns.
The unique values of each selected column are determined, and one selection of the data is made for each element of the cross product of those unique values.
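The cross-product selection described above can be illustrated with a minimal pandas sketch. This is not the library's implementation: `partition_sketch`, the example columns, and the choice to keep empty subsets for combinations that never occur together are all assumptions made for illustration. Only the `labels` and `data` keys are taken from the Returns description.

```python
import itertools

import pandas as pd


def partition_sketch(df, partition_by):
    """Hypothetical illustration of the described behaviour, not the library code."""
    # Unique values of each selected column, in order of first appearance.
    uniques = [df[col].unique() for col in partition_by]

    labels, data = [], []
    # One selection per element of the cross product of the unique values.
    for combo in itertools.product(*uniques):
        mask = pd.Series(True, index=df.index)
        for col, value in zip(partition_by, combo):
            mask &= df[col] == value
        labels.append(dict(zip(partition_by, combo)))
        # Assumption: combinations with no matching rows yield an empty frame.
        data.append(df[mask])

    return {"labels": tuple(labels), "data": tuple(data)}


# Made-up example data.
df = pd.DataFrame(
    {
        "country": ["DE", "DE", "FR", "FR"],
        "store": ["A", "B", "A", "A"],
        "sales": [10, 20, 30, 40],
    }
)
parts = partition_sketch(df, ["country", "store"])
```

With two unique countries and two unique stores, the sketch produces four label dictionaries, one per combination, and the subset at each position in `data` corresponds to the label at the same position in `labels`.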
- Parameters
df (pandas.DataFrame) – Data to be partitioned
partition_by (list) – Column names to partition by
- Returns
Partition dictionary with keys:
- labels : Tuple of dictionaries whose keys are the column names and values are the actual values in the column
- data : Tuple of pandas.DataFrame objects holding the subset of the data with
- Return type