Stanford Cars - Image Classification

Binder: launch the notebooks on Binder or share the Binder link.

Image classification of the stanford-cars dataset leveraging fastai v1. The goal is to hit 90%+ accuracy (shoot for the stars), starting with a basic fastai image classification workflow and iterating from there.

This was all run on a Paperspace P4000 machine apart from the EfficientNet-b7 results which were run on a P6000.

Current best score - 94.79%

SOTA = 96.2% (Domain Adaptive Transfer Learning with Specialist Models)

TL;DR

  • NOTEBOOK: 10_stanford_cars_EfficientNet_b7_Ranger_Mish_Trial.ipynb
  • Continuing on from my EfficientNet-b3 result of 93.8%, I matched the EfficientNet paper's b7 result
  • Achieved 94.79% (standard deviation of 0.094) 5-run, 40-epoch mean test set accuracy on Stanford Cars using Mish EfficientNet-b7 + Ranger
  • Matched the EfficientNet paper's EfficientNet-b7 result of 94.7% (current SOTA is 96.2%)
  • Used MEfficientNet-b7, created by swapping the Swish activation function for the Mish activation function (see the activation-swap sketch below)
  • Used the Ranger optimiser (a combination of RAdam and Lookahead) and trained with FlatCosAnnealScheduler
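
As a reference for the Mish swap, here is a minimal sketch in plain PyTorch. The `Swish` class name and the `swap_activation` helper are illustrative, assuming an EfficientNet implementation that stores its activation functions as submodules (as the popular EfficientNet-PyTorch package does):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x)) (Misra, 2019)."""
    def forward(self, x):
        return x * torch.tanh(F.softplus(x))

def swap_activation(module, old_cls, new_cls):
    """Recursively replace every instance of old_cls with a fresh new_cls()."""
    for name, child in module.named_children():
        if isinstance(child, old_cls):
            setattr(module, name, new_cls())
        else:
            swap_activation(child, old_cls, new_cls)

# Hypothetical usage: Swish is whatever activation class your
# EfficientNet implementation uses internally.
# swap_activation(model, Swish, Mish)
```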

Notebook Results

10_stanford_cars_EfficientNet_b7_Ranger_Mish_Trial.ipynb

exp_stanford_cars_EfficientNet_Mish Range913A.ipynb

  • Ran 8 experiments testing a beta version of the new Ranger913A optimiser from @lessw2020
  • Matched the previous best using "vanilla" Ranger (93.8%), but needed a higher lr to do so (1e-2 vs 1.5e-3)
  • See the notebook for full results and plots of validation loss and accuracy

9_stanford_cars_EfficientNet_Ranger_Mish_Trial.ipynb

  • Achieved 93.8% 5-run, 40-epoch mean test set accuracy on Stanford Cars using Mish EfficientNet-b3 + Ranger
  • Using the Mish activation and Ranger with EfficientNet-b3; see the notebook for implementation details and the schedule sketch after this list
  • Validation loss and accuracy (I used the test set as the validation set) are saved in mefficientnet_b3_ranger_results.xlsx
  • Fastai Forums post and discussion
  • Quick Medium post
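
The flat-then-cosine schedule behind FlatCosAnnealScheduler can be sketched as a plain PyTorch LambdaLR multiplier: hold the base learning rate flat for the first part of training, then cosine-anneal it towards zero. The `flat_pct` value and step counts below are illustrative, not the notebook's exact settings:

```python
import math

def flat_cos_lambda(total_steps, flat_pct=0.72):
    """LR multiplier: 1.0 during the flat phase, then cosine decay to 0."""
    flat_steps = int(total_steps * flat_pct)
    def f(step):
        if step < flat_steps:
            return 1.0
        progress = (step - flat_steps) / max(1, total_steps - flat_steps)
        return 0.5 * (1.0 + math.cos(math.pi * progress))
    return f

# Hypothetical usage with any optimiser (e.g. Ranger):
# scheduler = torch.optim.lr_scheduler.LambdaLR(opt, flat_cos_lambda(total_steps))
# then call scheduler.step() once per batch.
```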

6_stanford_cars_cutout.ipynb

  • Used the Cutout data augmentation alongside the default fastai data transforms; the side of each square was 25% of the image side (e.g. 0.25 × 224), as sketched below
  • 88.3% Accuracy achieved
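
A minimal sketch of the Cutout idea on a CHW image tensor, with the square's side set to 25% of the image side as above; this illustrates the technique rather than reproducing fastai's own cutout transform:

```python
import random
import torch

def cutout(img, frac=0.25):
    """Zero out one random square patch; side = frac * image side."""
    _, h, w = img.shape
    size = int(h * frac)              # e.g. 56 px on a 224 px image
    y = random.randint(0, h - size)
    x = random.randint(0, w - size)
    img = img.clone()                 # avoid mutating the caller's tensor
    img[:, y:y + size, x:x + size] = 0.0
    return img
```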

5_stanford_cars_mixup_and_dropout.ipynb

  • Tuning the dropout parameters while also using the Mixup protocol
  • 89.2% Accuracy achieved with aggressive dropout (ps = [0.35, 0.7]); accuracy more or less the same as NB4 (see the snippet below)
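
In fastai v1 that combination looks roughly like the snippet below; the `ps` values are the ones quoted above, while `data` (an ImageDataBunch, built as in the baseline sketch further down) and the epoch count are illustrative:

```python
from fastai.vision import cnn_learner, models, accuracy

# ps sets the dropout probabilities for the two dropout layers in the head
learn = cnn_learner(data, models.resnet152, ps=[0.35, 0.7], metrics=accuracy)
learn = learn.mixup()     # enable fastai v1's mixup callback
learn.fit_one_cycle(40)
```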

4_stanford_cars_mixup.ipynb

  • Tuning the model using the Mixup protocol, blending pairs of input images to provide stronger regularisation (sketched below)
  • 89.4% Accuracy, up 1% since NB2
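
Conceptually, mixup draws a Beta-distributed weight and blends both the images and their targets; fastai v1 ships this as a callback, so the sketch below is just the underlying idea:

```python
import numpy as np
import torch

def mixup_batch(x, y, alpha=0.4):
    """Blend a batch with a shuffled copy of itself."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(x.size(0))
    x_mixed = lam * x + (1 - lam) * x[perm]
    # train with: lam * loss(pred, y) + (1 - lam) * loss(pred, y[perm])
    return x_mixed, y, y[perm], lam
```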

3_stanford_cars_cropped.ipynb

  • Training the model on the cropped images, based on the bounding boxes provided (see the cropping sketch below)
  • 78.54% Accuracy, down 9.5% from Notebook 2
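
A minimal sketch of the cropping step with PIL; the bounding-box argument names (`x1, y1, x2, y2`) are assumptions about the annotation format rather than the dataset's exact field names:

```python
from PIL import Image

def crop_to_bbox(src_path, x1, y1, x2, y2, dst_path):
    """Crop an image to its bounding box and save it for training."""
    img = Image.open(src_path)
    img.crop((x1, y1, x2, y2)).save(dst_path)
```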

2_stanford_cars_lr_tuning.ipynb

  • Tuning of the learning rate and differential learning rates, again using fastai's implementation of the 1-cycle policy (sketched below)
  • 88.19% Accuracy, up 3.2%
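
In fastai v1 that tuning loop looks roughly like this; the learning-rate values and epoch counts are illustrative, not the notebook's, and `learn` is a cnn_learner built as in the baseline sketch below:

```python
learn.lr_find()
learn.recorder.plot()     # pick a learning rate from the loss plot

learn.fit_one_cycle(10, max_lr=1e-3)                # train the new head first
learn.unfreeze()
learn.fit_one_cycle(10, max_lr=slice(1e-5, 1e-3))   # differential LRs: lower for early layers
```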

1_stanford_cars_basic.ipynb

  • Benchmark model using the basic fastai image classification workflow, including the 1-cycle policy (see the sketch below)
  • 84.95% Accuracy
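
A minimal sketch of that baseline workflow in fastai v1; the data path, image size and epoch count are illustrative:

```python
from fastai.vision import (ImageDataBunch, cnn_learner, models,
                           get_transforms, imagenet_stats, accuracy)

data = (ImageDataBunch.from_folder('data/stanford-cars', valid_pct=0.2,
                                   ds_tfms=get_transforms(), size=224)
        .normalize(imagenet_stats))
learn = cnn_learner(data, models.resnet152, metrics=accuracy)
learn.fit_one_cycle(10)   # fastai's implementation of the 1-cycle policy
```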

labels_df.csv contains the labels, filepath and test/train flag for each image file.
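
For example, the file could be loaded and turned into a databunch like this; the column names (`filepath`, `label`, `is_test`) are assumptions about its layout based on the description above:

```python
import pandas as pd
from fastai.vision import ImageDataBunch, get_transforms, imagenet_stats

df = pd.read_csv('labels_df.csv')
train_df = df[~df['is_test'].astype(bool)]   # keep only the training rows

data = (ImageDataBunch.from_df('.', train_df, fn_col='filepath', label_col='label',
                               ds_tfms=get_transforms(), size=224)
        .normalize(imagenet_stats))
```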

SOTA

Potential Avenues of Investigation

Fine tune first on Cars from Google Open Images

Use DAT (Domain Adaptive Transfer Learning with Specialist Models)

FORNAX - a great roundup of advances in 2018, some of which can be applied here: https://github.com/kmkolasinski/deep-learning-notes/blob/master/seminars/2018-12-Improving-DL-with-tricks/Improving_deep_learning_models_with_bag_of_tricks.pdf

AMAZON - Bag of Tricks for Image Classification with Convolutional Neural Networks: https://arxiv.org/pdf/1812.01187.pdf

Multi-Attention CNN: https://github.com/Jianlong-Fu/Multi-Attention-CNN

Data Augmentation

Training Regimes

Architecture

  • Try alternate ResNet sizes (benchmark used ResNet152)
  • Try alternate architectures, e.g. DenseNet, U-Net
  • Try XResNet152

Credits

My 90%+ goal was based on @sgugger's code implementing Adam for the Stanford Cars dataset: https://github.com/sgugger/Adam-experiments
