# Slide presentation

This page gives you insights to help you evaluate your ***Product Recommendations*** model.

Main sections:
- [Metrics](#metrics-1)
- [Calibration curve](#calibration-curve-1)
- [Density chart](#density-chart-1)
- [Confusion matrix](#confusion-matrix-1)
- [Subpopulation analysis](#subpopulation-analysis-1)


## Metrics

Start by reviewing the summary of the model's classification performance metrics.
![model_performances_metrics.png](bOBVXjA1VukL)

## Calibration curve

The **calibration curve** lets you compare your model's predicted probabilities with what actually happens.
The intuition behind this curve is that if, on a given sample, *k%* of your records belong to the positive class, your model should output an average *proba_1* of *k%* on that sample.
In other words, the curve shows how well the model's probabilities align with the phenomenon it is meant to predict.

![calibration_curve.png](DjGFMvGOk48a)

### Calibration curve/Computation method
- 1: The *proba_1* values produced by your model on the test set are grouped into bins.
- 2: For each bin:
    - We compute the average *proba_1* of the records in the bin.
    - We compute the frequency of records that actually belong to the positive class.
    - We plot the frequency of the positive class on the y-axis (ordinate) against the average predicted *proba_1* on the x-axis (abscissa).
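
The steps above can be sketched in a few lines of plain Python. This is only an illustration, assuming equal-width bins over [0, 1]; the function name `calibration_curve`, the `n_bins` parameter, and the toy data are hypothetical, and the binning actually used to build the chart may differ.

```python
def calibration_curve(y_true, proba_1, n_bins=10):
    """Return one (mean proba_1, positive frequency) pair per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for label, p in zip(y_true, proba_1):
        # Assumption: equal-width bins; p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((label, p))

    points = []
    for records in bins:
        if not records:
            continue  # empty bins produce no point on the curve
        mean_proba = sum(p for _, p in records) / len(records)
        positive_freq = sum(label for label, _ in records) / len(records)
        points.append((mean_proba, positive_freq))
    return points


# Toy data (illustrative only, not taken from the model described here).
y_true = [0, 0, 1, 0, 1, 1, 1, 1]
proba_1 = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
curve = calibration_curve(y_true, proba_1, n_bins=2)
```

Each pair in `curve` is one point of the curve: the closer the points lie to the diagonal, the better calibrated the model.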
    
### Calibration curve/With a perfect model
A perfectly calibrated model has a diagonal **calibration curve**. The example shown in the chart below is close to this :arrow_down::
![perfect_model_calibration_curve.png](T887vMZJgF9f)

## Density chart
A classification model outputs probabilities for each record.

The probability of belonging to the positive class, *proba_1*, is at the heart of the **density chart**. The chart represents the density of *proba_1* for each of the two classes of the test set, as illustrated below (the positive class is shown in orange) :arrow_down::

![density_chart.png](1OsdP4d07S16)

:arrow_up: Here you can see that the two probability densities overlap. This means that some probability ranges contain both:
- Records of the "false" class, which may be predicted as "true"
- Records of the "true" class, which may be predicted as "false"

In other words, the model makes some mistakes in its predictions and "confuses" the classes, which is what a **confusion matrix** makes visible.
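
As a rough illustration of what overlapping densities mean, the sketch below builds one normalized histogram of *proba_1* per class and sums the bin-wise minimum of the two. Both the histogram approximation and the overlap score are assumptions made for this example (the chart itself most likely uses a smoothed density estimate), and the names `class_histograms` and `overlap` are hypothetical.

```python
def class_histograms(y_true, proba_1, n_bins=10):
    """Return normalized proba_1 histograms for (negative class, positive class)."""
    counts = {0: [0] * n_bins, 1: [0] * n_bins}
    for label, p in zip(y_true, proba_1):
        # Assumption: equal-width bins; p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[label][idx] += 1

    hists = {}
    for label, class_counts in counts.items():
        total = sum(class_counts)
        # Normalize so that each histogram sums to 1 (unless the class is empty).
        hists[label] = [n / total for n in class_counts] if total else class_counts
    return hists[0], hists[1]


def overlap(hist_neg, hist_pos):
    """Shared mass of two histograms: 0.0 means no overlap, 1.0 means identical."""
    return sum(min(a, b) for a, b in zip(hist_neg, hist_pos))


# Toy data (illustrative only): one record of each class falls on the "wrong" side.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
proba_1 = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
hist_neg, hist_pos = class_histograms(y_true, proba_1, n_bins=2)
score = overlap(hist_neg, hist_pos)
```

A score near 0 corresponds to two well-separated densities, while a higher score means that more records of the two classes share the same probability ranges.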

### Density chart/With a perfect model
While some overlap between the probability densities is common and even normal, a near-perfect model has very little confusion and produces a **density chart** in which the two *proba_1* densities barely overlap, like the one below :arrow_down::
![perfect_model_density_chart.png](xFXyF5NLqjWG)


## Confusion matrix

The **confusion matrix** is the tool that lets you decide the final behavior of the model.
The model outputs probabilities, but probabilities alone are not enough: we also need the model to make predictions. We therefore define a threshold:
- Above which the model predicts the positive class.
- Below which the model predicts the negative class.

The **confusion matrix** lets you observe the performance of the model's predictions on the test set for a given threshold. **It illustrates how the model will "confuse" classes for a given threshold**.
![confusion_matrix.gif](me1aIp3mr1xW)
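
The effect of the threshold can be sketched as follows. This is a minimal example, not the tool's implementation: the function name `confusion_matrix` and the toy data are hypothetical, and treating a probability exactly equal to the threshold as a positive prediction is an assumption.

```python
def confusion_matrix(y_true, proba_1, threshold=0.5):
    """Count true/false positives and negatives for a given threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for label, p in zip(y_true, proba_1):
        # Assumption: a probability equal to the threshold counts as positive.
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and label == 1:
            counts["tp"] += 1
        elif predicted == 1 and label == 0:
            counts["fp"] += 1
        elif predicted == 0 and label == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Toy data (illustrative only).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
proba_1 = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

low = confusion_matrix(y_true, proba_1, threshold=0.35)
mid = confusion_matrix(y_true, proba_1, threshold=0.5)
high = confusion_matrix(y_true, proba_1, threshold=0.65)
```

On this toy data, lowering the threshold removes the false negative, while raising it removes the false positive: the threshold trades one kind of error for the other.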

## Subpopulation analysis
Finally, the subpopulation analysis lets you assess your model's performance depending on your input variables.
![subpopulation_analysis.png](rSPwI1yQplcB)
In the example shown above :arrow_up: we can see the different classification metrics depending on the *item_universe* attribute (in our case, five universes: 'Ladieswear', 'Divided', 'Menswear', 'Sport' and 'Baby/Children').
- The model has a precision of 0.985 on the *Ladieswear* universe, while it has a precision of 1.0 on the *Sport* universe.

:arrow_forward: This feature helps you better understand where your interaction predictions are good and where they could be improved.
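
To make the idea concrete, here is a minimal sketch of a per-subpopulation metric: precision computed separately for each value of an attribute. The function name `precision_by_group` is hypothetical and the toy data is invented for the example, so the resulting values do not match the figures quoted above.

```python
from collections import defaultdict


def precision_by_group(groups, y_true, y_pred):
    """Return precision (TP / (TP + FP)) for each distinct group value."""
    tp = defaultdict(int)
    fp = defaultdict(int)
    for group, label, pred in zip(groups, y_true, y_pred):
        if pred == 1:
            if label == 1:
                tp[group] += 1
            else:
                fp[group] += 1

    result = {}
    for group in set(groups):
        predicted_positive = tp[group] + fp[group]
        # Precision is undefined when the group has no positive prediction.
        result[group] = tp[group] / predicted_positive if predicted_positive else None
    return result


# Toy data (illustrative only, not the model's real predictions).
item_universe = ["Ladieswear", "Ladieswear", "Ladieswear", "Sport", "Sport", "Sport"]
y_true = [1, 0, 1, 1, 1, 0]
y_pred = [1, 1, 1, 1, 1, 0]
precisions = precision_by_group(item_universe, y_true, y_pred)
```

The same grouping logic applies to any other metric or attribute, for example recall per *user_age_cluster*.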

The subpopulation analysis is also an interactive feature that lets you:
- Analyze the density chart and the confusion matrix for each segment of a subpopulation within the model test set.
- Change the features on which the subpopulation analysis is computed.

These two kinds of interactions are showcased in the animation below :arrow_down:, where:
- We start by comparing the model's performance depending on the *item_family*.
- We then compare the performance depending on the *item_universe* and finally the *user_age_cluster*.
![subpopulation_analysis.gif](FRw0RWNhU79R)