This repo contains tutorials covering image classification using PyTorch 1.7, torchvision 0.8, matplotlib 3.3 and scikit-learn 0.24, with Python 3.8.
We'll start by implementing a multilayer perceptron (MLP) and then move on to architectures using convolutional neural networks (CNNs). Specifically, we'll implement LeNet, AlexNet, VGG and ResNet.
If you find any mistakes or disagree with any of the explanations, please do not hesitate to submit an issue. I welcome any feedback, positive or negative!
To install PyTorch, see the installation instructions on the PyTorch website.
Those instructions should also cover installing torchvision, but it can also be installed via:
pip install torchvision
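After installing, it can help to check which versions ended up in your environment. The snippet below is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, not something these tutorials provide.

```python
import importlib


def installed_versions(packages=("torch", "torchvision", "matplotlib", "sklearn")):
    """Return {package: version string}, or None for packages that fail to import."""
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
            # Most packages expose __version__; fall back to "unknown" otherwise.
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions


for package, version in installed_versions().items():
    print(f"{package}: {version if version is not None else 'not installed'}")
```

Note that scikit-learn is imported as `sklearn`.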
1 - Multilayer Perceptron
This tutorial provides an introduction to PyTorch and TorchVision. We'll learn how to: load datasets, augment data, define a multilayer perceptron (MLP), train a model, view the outputs of our model, visualize the model's representations, and view the weights of the model. The experiments will be carried out on the MNIST dataset - a set of 28x28 handwritten grayscale digits.
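To give a flavour of what defining and training an MLP looks like, here is a minimal sketch of a single training step. It is not the notebook's exact code: the hidden size of 250, the Adam optimizer, and the random tensors standing in for a batch of MNIST images are all assumptions made so the example runs without downloading anything.

```python
import torch
import torch.nn as nn


class MLP(nn.Module):
    """A small MLP for 28x28 inputs; the hidden size is an arbitrary choice."""

    def __init__(self, input_dim=28 * 28, hidden_dim=250, output_dim=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, output_dim),
        )

    def forward(self, x):
        # Flatten [batch, 1, 28, 28] images into [batch, 784] vectors.
        x = x.flatten(start_dim=1)
        return self.net(x)


torch.manual_seed(0)
model = MLP()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random stand-ins for one batch of MNIST images and labels.
images = torch.randn(8, 1, 28, 28)
labels = torch.randint(0, 10, (8,))

# One training step: forward pass, loss, backward pass, parameter update.
logits = model(images)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

print(logits.shape, loss.item())
```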
2 - LeNet
In this tutorial we'll implement the classic LeNet architecture. We'll look into convolutional neural networks and how convolutional layers and subsampling (aka pooling) layers work.
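As a rough illustration of how convolutional and subsampling layers change the shape of an image, the sketch below passes a random 28x28 input through one convolution and one pooling layer. The layer sizes (6 filters, 5x5 kernel, 2x2 pooling) and the use of max pooling are assumptions for the example; `conv_output_size` is a hypothetical helper that applies the standard output-size formula.

```python
import torch
import torch.nn as nn


def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1


conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
pool = nn.MaxPool2d(kernel_size=2)  # stride defaults to the kernel size

x = torch.randn(1, 1, 28, 28)  # random stand-in for one grayscale image
conv_out = conv(x)             # 28 - 5 + 1 = 24
pool_out = pool(conv_out)      # 24 / 2 = 12

print(conv_out.shape)  # [1, 6, 24, 24]
print(pool_out.shape)  # [1, 6, 12, 12]
print(conv_output_size(28, kernel_size=5))            # 24
print(conv_output_size(24, kernel_size=2, stride=2))  # 12
```

The convolution changes the number of channels and trims the borders, while the pooling layer only halves the height and width.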
3 - AlexNet
In this tutorial we will implement AlexNet, the convolutional neural network architecture that helped start the current interest in deep learning. We will move on to the CIFAR10 dataset - 32x32 color images in ten classes. We show: how to define architectures using nn.Sequential, how to initialize the parameters of your neural network, and how to use the learning rate finder to determine a good initial learning rate.
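The following is a minimal sketch of defining a network with nn.Sequential and initializing its parameters with `model.apply`. The network is a toy stand-in and far smaller than AlexNet, and the choice of Kaiming initialization for convolutions and Xavier initialization for linear layers is an assumption for illustration; `initialize_parameters` is a hypothetical function name.

```python
import torch
import torch.nn as nn

# A toy network for 32x32 color images; much smaller than AlexNet.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # [3, 32, 32] -> [16, 32, 32]
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2),                             # -> [16, 16, 16]
    nn.Flatten(),                                # -> [16 * 16 * 16]
    nn.Linear(16 * 16 * 16, 10),
)


def initialize_parameters(m):
    """Initialize one module; model.apply calls this on every submodule."""
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
        nn.init.constant_(m.bias, 0)
    elif isinstance(m, nn.Linear):
        nn.init.xavier_normal_(m.weight, gain=nn.init.calculate_gain("relu"))
        nn.init.constant_(m.bias, 0)


model.apply(initialize_parameters)

out = model(torch.randn(4, 3, 32, 32))  # random stand-in for a CIFAR10 batch
print(out.shape)  # [4, 10]
```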
4 - VGG
This tutorial will cover implementing the VGG model. However, instead of training the model from scratch, we will load a VGG model pre-trained on the ImageNet dataset and show how to perform transfer learning to adapt its weights to the CIFAR10 dataset using a technique called discriminative fine-tuning. We'll also explain how adaptive pooling layers and batch normalization work.
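Discriminative fine-tuning gives different parts of the model different learning rates, which PyTorch supports through per-parameter-group options in the optimizer. The sketch below assumes a hypothetical `TinyVGG` stand-in, with random weights rather than pre-trained ones, so that it runs without downloading anything; the learning rates are arbitrary. It also shows an adaptive pooling layer producing a fixed 7x7 output for different input sizes.

```python
import torch
import torch.nn as nn


class TinyVGG(nn.Module):
    """Hypothetical stand-in with the features / avgpool / classifier layout."""

    def __init__(self, output_dim=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.BatchNorm2d(8),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        # Adaptive pooling always outputs 7x7, whatever the input resolution.
        self.avgpool = nn.AdaptiveAvgPool2d(7)
        self.classifier = nn.Linear(8 * 7 * 7, output_dim)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        return self.classifier(x.flatten(start_dim=1))


model = TinyVGG()

# Transfer learning: replace the output layer with one sized for 10 classes.
model.classifier = nn.Linear(8 * 7 * 7, 10)

# Discriminative fine-tuning: earlier layers get a smaller learning rate.
base_lr = 5e-4
optimizer = torch.optim.Adam(
    [
        {"params": model.features.parameters(), "lr": base_lr / 10},
        {"params": model.classifier.parameters()},  # uses the default lr below
    ],
    lr=base_lr,
)

shapes = [tuple(model(torch.randn(2, 3, size, size)).shape) for size in (32, 64)]
print(shapes)
print([group["lr"] for group in optimizer.param_groups])
```

The idea behind the smaller rate for `features` is that earlier layers of a pre-trained model tend to hold more general features and need less adjustment than the newly added output layer.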
5 - ResNet
In this tutorial we will be implementing the ResNet model. We'll show how to load your own dataset, using the CUB200 dataset as an example, and also how to use learning rate schedulers which dynamically alter the learning rate of your model whilst training. Specifically, we'll use the one cycle policy introduced in this paper, which is now starting to be commonly used for training computer vision models.
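As a minimal sketch of the one cycle policy, the snippet below records the learning rate produced by `torch.optim.lr_scheduler.OneCycleLR` over a short run. The linear model, the maximum learning rate, and the number of steps are placeholders chosen for illustration, and no real training happens here.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # placeholder model
max_lr = 1e-2
epochs, steps_per_epoch = 3, 10
total_steps = epochs * steps_per_epoch

optimizer = torch.optim.SGD(model.parameters(), lr=max_lr, momentum=0.9)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=max_lr, total_steps=total_steps
)

lrs = []
for step in range(total_steps):
    lrs.append(optimizer.param_groups[0]["lr"])
    # In a real loop, the forward pass and loss.backward() would come first.
    optimizer.step()
    scheduler.step()  # the one cycle scheduler is stepped once per batch

peak_step = lrs.index(max(lrs))
print(f"start={lrs[0]:.2e} peak={max(lrs):.2e} (step {peak_step}) end={lrs[-1]:.2e}")
```

With the scheduler's default settings, the learning rate starts well below `max_lr`, rises to `max_lr` early in training, and then anneals to a value far below where it started.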
Here are some things I looked at while making these tutorials. Some of it may be out of date.
- https://github.com/pytorch/tutorials
- https://github.com/pytorch/examples
- https://colah.github.io/posts/2014-10-Visualizing-MNIST/
- https://distill.pub/2016/misread-tsne/
- https://towardsdatascience.com/visualising-high-dimensional-datasets-using-pca-and-t-sne-in-python-8ef87e7915b
- https://github.com/activatedgeek/LeNet-5
- https://github.com/ChawDoe/LeNet5-MNIST-PyTorch
- https://github.com/kuangliu/pytorch-cifar
- https://github.com/akamaster/pytorch_resnet_cifar10
- https://sgugger.github.io/the-1cycle-policy.html