The product preparation Zone takes the [product_holdings](dataset:product_holdings) dataset and preprocesses it.

![Screenshot 2023-06-01 at 11.54.38.png](61M29AwJmALw)

The [Grouping Recipe](recipe:compute_products) retrieves the product and product type for each product_id to build the [correspondence table](dataset:products).
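
For readers more comfortable with code than with visual recipes, the grouping step can be approximated in pandas. This is a minimal sketch, not the recipe itself: the column names `product` and `product_type` are assumptions, and it assumes each `product_id` maps to a single product and product type.

```python
import pandas as pd


def build_products(product_holdings: pd.DataFrame) -> pd.DataFrame:
    """Return one row per product_id with its product and product type.

    Column names are assumed for illustration; adapt them to the real schema.
    """
    return (
        product_holdings
        .groupby("product_id", as_index=False)[["product", "product_type"]]
        .first()  # take the first value seen for each product_id
    )


# Toy data: two customers hold the same product, so it appears twice.
holdings = pd.DataFrame(
    {
        "customer_id": ["c1", "c2", "c1"],
        "product_id": ["p1", "p1", "p2"],
        "product": ["Savings account", "Savings account", "Credit card"],
        "product_type": ["deposit", "deposit", "credit"],
    }
)
products = build_products(holdings)
print(products)
```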

In the upper thread, the [Joining Recipe](recipe:compute_product_holdings_history) performs the Cartesian product of the dates_history and product_holdings datasets, so that every holding is paired with every reference date. The next [Prepare Recipe](recipe:compute_product_holdings_prepared) computes the creation age and the termination age as the difference between the reference date and the start date or the end date, respectively. Only rows with a positive creation age are kept, so that each reference date only sees holdings that started before it.

The relevant features are created in the subsequent [Grouping Recipe](recipe:compute_product_holdings_by_customer):
 - created window: whether the creation age is larger than the defined lookback window (boolean)
 - terminated window: whether the termination age is larger than the defined lookback window (boolean)
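
The cross join, age computation, filter, and window flags described above could be sketched in pandas roughly as follows. This is only an illustration under stated assumptions: the column names (`customer_id`, `start_date`, `end_date`, `reference_date`), ages measured in days, and the 365-day lookback window are all hypothetical and not taken from the project.

```python
import pandas as pd

LOOKBACK_DAYS = 365  # hypothetical lookback window, in days


def prepare_history(
    holdings: pd.DataFrame,
    dates_history: pd.DataFrame,
    lookback_days: int = LOOKBACK_DAYS,
) -> pd.DataFrame:
    # Cartesian product: every reference date paired with every holding.
    history = dates_history.merge(holdings, how="cross")

    # Ages in days relative to the reference date (NaN if end_date is missing).
    history["creation_age"] = (
        history["reference_date"] - history["start_date"]
    ).dt.days
    history["termination_age"] = (
        history["reference_date"] - history["end_date"]
    ).dt.days

    # Keep only holdings that started before the reference date.
    history = history[history["creation_age"] > 0].copy()

    # Boolean flags: was the event older than the lookback window?
    history["created_window"] = history["creation_age"] > lookback_days
    history["terminated_window"] = history["termination_age"] > lookback_days
    return history


holdings = pd.DataFrame(
    {
        "customer_id": ["c1", "c1", "c2"],
        "product_id": ["p1", "p2", "p1"],
        "start_date": pd.to_datetime(["2020-01-01", "2022-09-01", "2023-03-01"]),
        "end_date": pd.to_datetime(["2021-06-01", None, None]),
    }
)
dates_history = pd.DataFrame({"reference_date": pd.to_datetime(["2023-01-01"])})

history = prepare_history(holdings, dates_history)
print(history)
```

In this toy example the third holding starts after the reference date, so it is dropped. A missing end date yields a missing termination age, which compares as `False` against the lookback window.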
 
Four aggregations are then defined per customer and reference date:

 - count: the number of subscribed products at a given reference date.
 - count on end date: the number of terminated products at a given reference date.
 - sum on created window: the number of products subscribed before the lookback period (creation age larger than the lookback window).
 - sum on terminated window: the number of products terminated before the lookback period (termination age larger than the lookback window).
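
As an illustration, the four aggregations can be expressed with pandas named aggregation. This is a sketch under assumptions, not the recipe's actual configuration: the input and output column names are hypothetical, and "count on end date" is interpreted here as counting non-missing end dates.

```python
import pandas as pd


def aggregate_holdings(history: pd.DataFrame) -> pd.DataFrame:
    """Aggregate prepared holdings per customer and reference date."""
    return (
        history
        .groupby(["customer_id", "reference_date"], as_index=False)
        .agg(
            # count: number of subscribed products
            product_count=("product_id", "count"),
            # count on end date: "count" ignores missing end dates
            end_date_count=("end_date", "count"),
            # sums of booleans count the rows where the flag is True
            created_window_sum=("created_window", "sum"),
            terminated_window_sum=("terminated_window", "sum"),
        )
    )


# Toy prepared data: one customer, one reference date, two holdings.
history = pd.DataFrame(
    {
        "customer_id": ["c1", "c1"],
        "reference_date": pd.to_datetime(["2023-01-01", "2023-01-01"]),
        "product_id": ["p1", "p2"],
        "end_date": pd.to_datetime(["2021-06-01", None]),
        "created_window": [True, False],
        "terminated_window": [True, False],
    }
)
aggregated = aggregate_holdings(history)
print(aggregated)
```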

In the final [Prepare Recipe](recipe:compute_product_holdings_portfolio), missing values are filled with 0 so that the clustering process can handle them, and the columns created by the Grouping Recipe are given more descriptive names.
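
A rough pandas equivalent of this last step is shown below. The source and target column names in `RENAMES` are invented for illustration; the real names are defined in the recipe.

```python
import pandas as pd

# Hypothetical mapping from aggregated column names to descriptive ones.
RENAMES = {
    "product_count": "nb_products",
    "end_date_count": "nb_products_terminated",
    "created_window_sum": "nb_products_created_before_lookback",
    "terminated_window_sum": "nb_products_terminated_before_lookback",
}


def finalize_portfolio(aggregated: pd.DataFrame) -> pd.DataFrame:
    """Fill missing feature values with 0, then rename the feature columns."""
    feature_cols = list(RENAMES)
    out = aggregated.copy()
    # Restrict the fill to feature columns so keys and dates are untouched.
    out[feature_cols] = out[feature_cols].fillna(0)
    return out.rename(columns=RENAMES)


aggregated = pd.DataFrame(
    {
        "customer_id": ["c1", "c2"],
        "reference_date": pd.to_datetime(["2023-01-01", "2023-01-01"]),
        "product_count": [2, 1],
        "end_date_count": [1, 0],
        "created_window_sum": [1.0, None],
        "terminated_window_sum": [None, None],
    }
)
portfolio = finalize_portfolio(aggregated)
print(portfolio)
```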




