07B: Sample Hardness

Materials:

Date: Friday, 13-Sep-2024

Pre-work:

AIC a criteria for model selection
cords for a collection of works/implementations based on subset selection

In-Class

Characterizing data difficulty or sample hardness.
Look at some statistics like Relative Mahalanobis Distance (which some used to flag OOD, others used to measure sample hardness), Perplexity (cross entropy between two models, one model and data or between), Trust Scores
Sample easiness on training performance and generalization error
See this notebook where we walk through these concept on a toy dataset

Post-class

Additional Reading (optional)

[paper] Understanding Dataset difficulty
[tools] Pytorch-ood - a collection of techniques to detect OOD in PyTroch. Mostly image focussed.
[tools] PyOD - a collection of anomaly detection techniques
[tools] DEEL - a collection of OOD, XAI, and other techniques

Notes

Not all examples are equal (in the eyes of the model). There can be many reasons.
There can be outliers (in the feature space, in the label space or both in the feature and label space).
Outliers affect the model performance in different ways.
A suite of techniques, preferably model-agnostic, are needed to quantify sample hardness and make them available at dataset level (train set) and also at inference time.