Things to check when I test a model.
General rules to debug DL models.
This is a logical continuation of my first blog post, about training the model.
Sometimes, when testing a new model whose validation accuracy is above 90%, the test accuracy comes out below 0.1%. That is a pretty strong signal that there is a bug in the testing code. I would like to list the most common ones, so that next time something is not working I can eliminate them quickly.
- When the results are close to 0:
    - Check that the loaded model is in eval mode.
    - Check that the input is normalized the same way it was during training (see the first sketch after this list).
- When the results are bad but not 0:
    - Validate on balanced data, so that the long-tail classes don't skew the accuracy (see the per-class accuracy sketch below).
- If you aim for accuracy and don't have inference-time constraints:
    - Use test-time augmentation (TTA).
    - Consider ensembling predictions from multiple checkpoints (see the last sketch below).
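The two "results are close to 0" checks take only a few lines to verify. Here is a minimal PyTorch sketch; the architecture (`resnet18`), the number of classes, the checkpoint and image paths, and the normalization statistics are all placeholders, so substitute whatever your own pipeline uses:

```python
import torch
from PIL import Image
from torchvision import transforms
from torchvision.models import resnet18

# Placeholder model and checkpoint -- use your own architecture and weights.
model = resnet18(num_classes=10)
model.load_state_dict(torch.load("checkpoint.pt", map_location="cpu"))

# 1) Eval mode: disables dropout and makes batch norm use its running
#    statistics instead of per-batch statistics.
model.eval()

# 2) Preprocessing must match training exactly: resize/crop, value range,
#    channel order, and the mean/std used for normalization.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # must match training
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("test_image.jpg").convert("RGB")
with torch.no_grad():                     # no gradients needed at test time
    logits = model(preprocess(image).unsqueeze(0))
    prediction = logits.argmax(dim=1).item()
```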
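For the "bad but not 0" case, a quick way to see whether the long tail is dragging the number down is to break accuracy out per class; macro-averaging those numbers is roughly what you would get from a balanced validation set. A sketch, assuming PyTorch and a standard `DataLoader` that yields `(images, labels)` batches; `model` and `test_loader` are placeholders:

```python
import torch
from collections import Counter

def per_class_accuracy(model, loader, device="cpu"):
    """Accuracy broken down by ground-truth class."""
    correct, total = Counter(), Counter()
    model.eval()
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1).cpu()
            for label, pred in zip(labels.tolist(), preds.tolist()):
                total[label] += 1
                correct[label] += int(label == pred)
    return {cls: correct[cls] / total[cls] for cls in total}

# Macro-average over classes: a "balanced" accuracy that the head
# classes cannot dominate.
accs = per_class_accuracy(model, test_loader)
balanced_acc = sum(accs.values()) / len(accs)
```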
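When inference time doesn't matter, both TTA and checkpoint ensembling boil down to averaging predictions. A rough sketch under the same PyTorch assumption, for a classifier that returns logits over `(N, C, H, W)` image batches; `build_model` and the checkpoint paths are hypothetical:

```python
import torch

def predict_tta(model, images):
    """Average probabilities over simple test-time augmentations
    (here just the identity and a horizontal flip)."""
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(images), dim=1)
        probs += torch.softmax(model(torch.flip(images, dims=[3])), dim=1)
    return probs / 2

def predict_ensemble(build_model, checkpoint_paths, images):
    """Average TTA probabilities over several training checkpoints."""
    probs = 0
    for path in checkpoint_paths:
        model = build_model()                    # fresh instance per checkpoint
        model.load_state_dict(torch.load(path, map_location="cpu"))
        probs = probs + predict_tta(model, images)
    return probs / len(checkpoint_paths)

# Example usage (hypothetical checkpoint names):
# preds = predict_ensemble(build_model,
#                          ["epoch_08.pt", "epoch_09.pt", "epoch_10.pt"],
#                          images).argmax(dim=1)
```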