Each selected stress test generates an altered copy of the model's test set; the altered dataset is then preprocessed and scored with the model.
After the tests run, the model view displays several metrics that show how the model's performance changes between the unaltered and the altered data.
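The flow described above (alter the test set, preprocess, score, compare) can be illustrated with a minimal sketch. Everything here is hypothetical and only for illustration: the `corrupt_missing` corruption, the toy model `toy_predict`, its zero-imputation preprocessing, and the use of accuracy as the performance metric are assumptions, not the product's actual implementation.

```python
import random


def corrupt_missing(rows, feature, fraction, seed=0):
    """Return a copy of `rows` where `feature` is set to None in `fraction` of the rows."""
    rng = random.Random(seed)
    n_corrupt = round(fraction * len(rows))
    corrupted_idx = set(rng.sample(range(len(rows)), n_corrupt))
    return [
        dict(row, **{feature: None}) if i in corrupted_idx else dict(row)
        for i, row in enumerate(rows)
    ]


def accuracy(predict, rows, labels):
    """Fraction of rows whose prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def toy_predict(row):
    # Hypothetical model: preprocessing imputes a missing "x" with 0.0,
    # then the model thresholds the value at 0.5.
    x = row["x"] if row["x"] is not None else 0.0
    return int(x > 0.5)


# Unaltered test set: x = 0.0, 0.1, ..., 0.9 with labels from the same rule.
rows = [{"x": i / 10} for i in range(10)]
labels = [int(r["x"] > 0.5) for r in rows]

# Altered test set: half of the rows lose their "x" value.
altered = corrupt_missing(rows, "x", fraction=0.5)

base_score = accuracy(toy_predict, rows, labels)
stressed_score = accuracy(toy_predict, altered, labels)
print(f"unaltered: {base_score:.3f}, altered: {stressed_score:.3f}")
```

Comparing `base_score` with `stressed_score` is the kind of before/after comparison the metrics in the model view summarize; the original test set is left untouched because the corruption works on a copy.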
{{ CORRUPTION_TYPES[stressTestType].description }}
{{ perfMetric.name(metric.name, metric.base_metric, true) }} {{ perfMetric.description(metric.name, metric.base_metric, modelInfo.predType === 'REGRESSION') }}
{{ perfMetric.name(metric.name, metric.base_metric) }} | |
---|---|
{{ uiState[test_results.name].displayName }}
{{ test_results.warning }}
|
{{ metric.value | toFixedIfNeeded: 3 }}
{{ displayWithuserFriendlyMetricName(metric.warning) }}
|
Critical samples are the rows in the test set identified as the most vulnerable to the selected feature corruptions. They are the records where the {{ modelInfo.predType === 'REGRESSION' ? 'predicted value' : 'true class probability' }} varied the most across all the tested feature corruption scenarios.
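One way to picture this selection is the sketch below. It assumes that "varied the most" is measured by the standard deviation of each row's true class probability (or predicted value) across scenarios, and that the reported uncertainty is that same standard deviation; both are assumptions for illustration, and `critical_samples` is a hypothetical helper, not a documented function.

```python
from statistics import mean, pstdev


def critical_samples(values_by_scenario, top_k=2):
    """Rank rows by how much their value varies across corruption scenarios.

    `values_by_scenario` is a list of scenarios; each scenario is a list with
    one value per test-set row (true class probability or predicted value).
    Returns up to `top_k` tuples (row_index, mean, std), most variable first.
    """
    per_row = list(zip(*values_by_scenario))  # transpose: one tuple per row
    stats = [(i, mean(vals), pstdev(vals)) for i, vals in enumerate(per_row)]
    stats.sort(key=lambda s: s[2], reverse=True)
    return stats[:top_k]


# Three hypothetical scenarios scored on the same three rows.
values = [
    [0.9, 0.8, 0.6],  # scenario A
    [0.9, 0.4, 0.5],  # scenario B
    [0.9, 0.0, 0.7],  # scenario C
]

top = critical_samples(values, top_k=2)
for row_index, avg, spread in top:
    print(f"row {row_index}: {avg:.3f} ± {spread:.3f}")
```

In this toy input, row 0 is stable across all scenarios and is not selected, while row 1 swings the most and is ranked first.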
Average {{ modelInfo.predType === 'REGRESSION' ? 'predicted value' : 'true class probability' }}: {{ results[stressTestType].critical_samples.means[$index] | toFixedIfNeeded: 3}} ± {{ results[stressTestType].critical_samples.uncertainties[$index] | toFixedIfNeeded: 3}}
{{ detail }}
|
---|
{{key}}
= {{value}}
|