CS-GY 6953 / ECE-GY 7123 Miniproject

Convolutional neural networks and, most recently, transformer-based models dominate deep learning for computer vision. Models based on these architectures achieve high accuracy often at the expense of computational complexity, with even the tiniest examples rarely having less than 5 million parameters. For our miniproject, we experimented with a newer architecture, ConvMixer, a convolutional network with skip connections that incorporates aspects of transformers and other more recent architectures to outperform similarly sized models despite its relative simplicity. Our model achieves 94.3% accuracy on CIFAR-10 classification after 100 training epochs and 95.0% after 200 epochs while using less than 12% of our allocated 5 million parameter budget.

You can find notebooks that show how to train, test and analyze experiments using this repository under examples/. The notebook final_results.ipynb can be used to reproduce our final results from the model checkpoints saved in saved_models/.

PyTorch + CIFAR-10 scripts

The repository contains a set of extensible scripts for training and evaluating PyTorch models on the CIFAR-10 image dataset. Heavily inspired by timm.

Currently available architectures:

The scripts can be extended by adding different architectures, optimization algorithms, learning rate schedulers, and data augmentations.

Instructions

All scripts can be run from the command line. The minimum arguments for each can be seen below. Whether you are running the script locally or in a hosted Jupyter notebook, all you need to do is clone the repository and run the script:

git clone https://github.com/cprimel/cautious-fiesta.git
cd cautious-fiesta && python train.py --config experiments/convmixer256_8_k5_p2_00.yml --batch-size=512

To evaluate a model, similarly making sure to specify the correct model and experiment identifiers:

cd cautious-fiesta && python test.py --model=convmixer256_8_k5_p2 --experiment=convmixer256_8_k5_p2_00 --checkpoint=<path-to-checkpoint> --logs=<directory-for-log-output>

Outputs

All scripts log information to standard output.

summarize.py with argument--save-graph=True outputs a standard TensorFlow Event protocol buffer that can be ingested by TensorBoard for examining the models conceptual graph.

train.py outputs a .yml file containing the final arguments (e.g., values from the passed config file or, if provided, the values from the command line), a .json file containing the training log {epoch_num: {train_loss, train_acc, val_loss, val_acc, last_lr, epoch_time}}, and any saved checkpoints.

test.py outputs a .json file containing the evaluation metrics and the list of predicted and true labels{batch_index:{test_acc, predicted_labels, true_labels}}

TODOs

Refactor CutMix operation into its own function and integrate into dataloader+transform pipeline.
Add ability to feed config file to test.py
Modify model_registry so models for available architectures can be fully defined from the command line
Add option for ShakeDrop regularization
Add options for weight initialization
Add option for gradient clipping
Add support for mixed precision operations (torch.amp)
Add support for exporting ONXX models from summarize.py

Name		Name	Last commit message	Last commit date
Latest commit History 70 Commits
examples		examples
experiments		experiments
models		models
publish		publish
saved_models		saved_models
utils		utils
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
summarize.py		summarize.py
test.py		test.py
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CS-GY 6953 / ECE-GY 7123 Miniproject

PyTorch + CIFAR-10 scripts

Instructions

Outputs

TODOs

About

Releases

Packages

Languages

ic3b3rger/cifar10-pytorch

Folders and files

Latest commit

History

Repository files navigation

CS-GY 6953 / ECE-GY 7123 Miniproject

PyTorch + CIFAR-10 scripts

Instructions

Outputs

TODOs

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages