This Flow Zone marks the beginning of the **sales per category segmentation branch**.

The aim of the [transactions_feature_engineering](flow_zone:6GCmqd3) Flow Zone is to create the features used to segment stores by category sales performance, such as the total revenue of each product category in each store.

![transactions-feature-engineering.png](wqS4IItqjF3J)

To better understand this Flow Zone, here is a reminder of the meanings of the ```target_category``` and ```sub_category_X``` columns. You can find further information in the Data Model article.

The ```target_category``` is a categorical column corresponding to a level in the hierarchy of the product categories. This column contains all the categories at this level. For example, it can contain the categories "MEN," "WOMEN," and "KIDS" at a macro level in a clothing retail company. This column is called "target" because it corresponds to the level where the user wants to analyze the performance of one or multiple categories.

First, the [compute_stores_sales_per_product_category_sum](recipe:compute_stores_sales_per_product_category_sum) recipe pivots, for each store, on the selected values of the ```target_category``` column (the values chosen in the Project Setup) and sums ```product_revenue``` to get the total revenue of each selected category. For example, if the user chose in the Project Setup to segment the stores based on the sales performance of the "MEN" category only, this recipe computes the total revenue of the "MEN" category for each store. The new column is named "MEN_product_revenue_sum".

![compute_stores_sales_per_product_category_sum.png](jN9HgykYknsm)
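
As a rough illustration of the pivot-and-sum step described above, here is a minimal pandas sketch. It is not the recipe itself, which is a visual recipe in the Flow. The "store_id" column name, the sample rows, and the `selected_categories` variable are assumptions made for the example; only "target_category", "product_revenue", and the output column naming come from this article.

```python
import pandas as pd

# Hypothetical transactions; "store_id" and the sample values are assumptions.
transactions = pd.DataFrame({
    "store_id": ["S1", "S1", "S1", "S2", "S2"],
    "target_category": ["MEN", "MEN", "WOMEN", "WOMEN", "KIDS"],
    "product_revenue": [10.0, 5.5, 20.0, 7.25, 3.0],
})

# Stand-in for the categories chosen in the Project Setup (assumption).
selected_categories = ["MEN"]

# One row per store, one column per category, revenue summed in each cell.
pivoted = transactions.pivot_table(
    index="store_id",
    columns="target_category",
    values="product_revenue",
    aggfunc="sum",
)

# Keep only the selected categories and apply the naming used in this article.
pivoted = pivoted[selected_categories]
pivoted.columns = [f"{cat}_product_revenue_sum" for cat in pivoted.columns]
stores_sales = pivoted.reset_index()

print(stores_sales)
```

In this sketch, store "S2" has no "MEN" transactions, so its "MEN_product_revenue_sum" value is empty at this stage.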

Then, the prepare recipe [compute_stores_sales_per_product_category_sum_prepared](recipe:compute_stores_sales_per_product_category_sum_prepared) fills the empty values in the newly created columns "CATEGORY_NAME_product_revenue_sum" with zeros and rounds the values to two decimal places.
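
The cleaning step above can be sketched in pandas as follows. This is only an approximation of what the prepare recipe does, under the assumption that every column to clean ends with "_product_revenue_sum"; the "store_id" column and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical pivot output: store S2 has no sales in the "MEN" category.
stores_sales = pd.DataFrame({
    "store_id": ["S1", "S2"],
    "MEN_product_revenue_sum": [15.456, None],
})

# Select the pivoted revenue columns by their suffix (assumption).
revenue_cols = [
    col for col in stores_sales.columns if col.endswith("_product_revenue_sum")
]

# Replace empty values with zeros, then round to two decimal places.
stores_sales[revenue_cols] = stores_sales[revenue_cols].fillna(0).round(2)

print(stores_sales)
```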

At this point, **if the user has selected "Percentage sales per category" in the Project Setup** (see image below), the Python recipe [compute_stores_sales_per_product_category_ratio](recipe:compute_stores_sales_per_product_category_ratio) calculates, for each store, the ratio of each selected category's sales to all sales in that store.

![transactions aggregations type project set up.png](38qB5ct9oUDQ)
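
To make the ratio computation concrete, here is a minimal pandas sketch of one way it could be done. It is not the code of the project's Python recipe: the "store_id", "store_total_revenue", and "MEN_sales_ratio" names are hypothetical, and the result is expressed as a fraction between 0 and 1, which is an assumption (the recipe may scale it differently).

```python
import pandas as pd

# Hypothetical inputs; "store_id" and the sample values are assumptions.
transactions = pd.DataFrame({
    "store_id": ["S1", "S1", "S1", "S2", "S2"],
    "target_category": ["MEN", "MEN", "WOMEN", "WOMEN", "KIDS"],
    "product_revenue": [10.0, 5.5, 20.0, 7.25, 3.0],
})
stores_sales = pd.DataFrame({
    "store_id": ["S1", "S2"],
    "MEN_product_revenue_sum": [15.5, 0.0],
})

# Total revenue of each store across all categories.
store_totals = (
    transactions.groupby("store_id", as_index=False)["product_revenue"]
    .sum()
    .rename(columns={"product_revenue": "store_total_revenue"})
)

# Attach the store total, then divide the category revenue by it.
with_ratio = stores_sales.merge(store_totals, on="store_id", how="left")
with_ratio["MEN_sales_ratio"] = (
    with_ratio["MEN_product_revenue_sum"] / with_ratio["store_total_revenue"]
)

print(with_ratio)
```

A store with a total revenue of zero would produce an undefined ratio in this sketch and would need separate handling.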

Now, we have a dataset containing, for each store, either the total revenue of the selected category or categories, or their percentage of the store's sales. To get the location of each store, which is used later for the visual outputs in the dashboard, we join this [stores_with_all_sales_per_category_data](dataset:stores_with_all_sales_per_category_data) dataset with the [distinct_stores_prepared](dataset:distinct_stores_prepared) dataset, which is the output of the [store_preprocessing](flow_zone:Um5gJ7R) Flow Zone.
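
The join described above could look like the following pandas sketch. The join key "store_id", the "latitude" and "longitude" columns, and the choice of a left join are all assumptions made for illustration; the actual join recipe in the Flow defines the real key and join type.

```python
import pandas as pd

# Hypothetical stand-in for stores_with_all_sales_per_category_data.
stores_sales = pd.DataFrame({
    "store_id": ["S1", "S2"],
    "MEN_product_revenue_sum": [15.5, 0.0],
})

# Hypothetical stand-in for distinct_stores_prepared.
distinct_stores = pd.DataFrame({
    "store_id": ["S1", "S2", "S3"],
    "latitude": [48.85, 45.76, 43.30],
    "longitude": [2.35, 4.84, 5.37],
})

# Left join keeps every store that has sales features and adds its location.
stores_with_location = stores_sales.merge(
    distinct_stores, on="store_id", how="left"
)

print(stores_with_location)
```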