Models are evaluated with different sample weights
Models are evaluated with different class averaging methods
Models are evaluated with different class averaging methods and different sample weights
Models are evaluated with different causal weighting methods
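As a rough illustration of how sample weights and class averaging can interact in the settings listed above, here is a minimal sketch in plain Python. The source does not name a metric, so F1 is an assumption, as are the helper name `weighted_f1` and the toy labels and weights. It covers only macro and micro averaging and does not attempt the causal weighting case.

```python
from collections import defaultdict


def weighted_f1(y_true, y_pred, sample_weight=None, average="macro"):
    """Hypothetical helper: F1 with optional per-sample weights.

    average="macro" averages per-class F1 scores equally;
    average="micro" pools weighted counts over all classes first.
    """
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)

    # Weighted true-positive, false-positive, false-negative mass per class.
    tp, fp, fn = defaultdict(float), defaultdict(float), defaultdict(float)
    for t, p, w in zip(y_true, y_pred, sample_weight):
        if t == p:
            tp[t] += w
        else:
            fp[p] += w  # predicted class gains a false positive
            fn[t] += w  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    labels = set(y_true) | set(y_pred)
    if average == "macro":
        return sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    if average == "micro":
        return f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    raise ValueError("average must be 'macro' or 'micro'")


# Toy data (assumed for illustration): one misclassified sample, index 2.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
heavy = [1.0, 1.0, 3.0, 1.0]  # up-weights the misclassified sample

for avg in ("macro", "micro"):
    print(avg, weighted_f1(y_true, y_pred, average=avg),
          weighted_f1(y_true, y_pred, sample_weight=heavy, average=avg))
```

On this toy data, the same predictions score differently under each combination of averaging method and weighting, which is why the settings need to be reported alongside the scores.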