Digit recognizer from scratch with NumPy and Pandas
The MNIST dataset is used for training and testing the Neural Network. More information regarding the dataset can be found here.
The data preprocessing consists of two steps: loading and decoding the binary files, and generating the Neural Network input. The binary files are read with the `gzip` Python library and decoded following the guide on the MNIST website. The pixel data for each image are then extracted and saved in a NumPy array, which is transposed so that each column holds the pixel data of one image (as shown in the image below).
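A minimal loader along these lines could look as follows. The function names `load_images` and `load_labels` are illustrative (not taken from the repository); the header layout follows the IDX format described on the MNIST website.

```python
import gzip

import numpy as np


def load_images(path):
    """Read a gzip-compressed IDX image file into a (n_pixels, n_images) array."""
    with gzip.open(path, "rb") as f:
        data = f.read()
    # IDX header: magic number, image count, rows, cols (4-byte big-endian each).
    n_images = int.from_bytes(data[4:8], "big")
    rows = int.from_bytes(data[8:12], "big")
    cols = int.from_bytes(data[12:16], "big")
    pixels = np.frombuffer(data[16:], dtype=np.uint8)
    # Reshape to one image per row, then transpose: one image's pixels per column.
    return pixels.reshape(n_images, rows * cols).T


def load_labels(path):
    """Read a gzip-compressed IDX label file into a 1-D array."""
    with gzip.open(path, "rb") as f:
        data = f.read()
    # IDX header: magic number, label count (4-byte big-endian each).
    return np.frombuffer(data[8:], dtype=np.uint8)
```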
The training data is then shuffled to avoid any ordering bias during learning.
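The key detail when shuffling is to apply the same permutation to the image columns and to the labels; a short sketch (function name and seed are illustrative):

```python
import numpy as np


def shuffle_data(X, y, seed=0):
    """Shuffle the columns of X (one image per column) and y together."""
    perm = np.random.default_rng(seed).permutation(X.shape[1])
    return X[:, perm], y[perm]
```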
The NN model consists of one input layer with 784 neurons (one per image pixel), one hidden layer, and one output layer with 10 neurons each. The rectified linear (ReLU) activation function is applied between the hidden layer and the output layer, while the softmax function is applied after the output layer.
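A sketch of the parameter initialization and the two activation functions, assuming the layer sizes stated above (the small random scale and function names are illustrative choices, not taken from the repository):

```python
import numpy as np


def init_params(n_in=784, n_hidden=10, n_out=10, seed=0):
    """Randomly initialize the weight matrices; biases start at zero."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((n_hidden, n_in)) * 0.01
    b1 = np.zeros((n_hidden, 1))
    W2 = rng.standard_normal((n_out, n_hidden)) * 0.01
    b2 = np.zeros((n_out, 1))
    return W1, b1, W2, b2


def relu(z):
    """Rectified linear unit: max(z, 0) elementwise."""
    return np.maximum(z, 0)


def softmax(z):
    """Column-wise softmax; the column max is subtracted for numerical stability."""
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)
```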
In the forward pass the input image is fed into the neural network as an array of pixels. Between the input and the hidden layer, the pixels of the input image are multiplied by the corresponding randomly initialized weight matrix, and the bias is added. Then, the activation function is applied to the result of weight matrix * input + bias. In this way, the activation value of each neuron is computed layer by layer up to the output layer (as shown in the image below).
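The forward pass described above can be sketched in a few lines; the helper definitions are repeated here so the snippet is self-contained, and the function name `forward` is illustrative:

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0)


def softmax(z):
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)


def forward(X, W1, b1, W2, b2):
    """Forward pass for a batch X of shape (n_pixels, n_images)."""
    Z1 = W1 @ X + b1   # weight matrix * input + bias
    A1 = relu(Z1)      # hidden-layer activations
    Z2 = W2 @ A1 + b2
    A2 = softmax(Z2)   # class probabilities, one column per image
    return Z1, A1, Z2, A2
```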
For the backward pass the mean squared error is used to calculate the loss. Then, the partial derivatives are computed, and the weights and biases are updated with the vanilla (plain gradient descent) update. The accuracy is computed as the number of correct predictions over the total number of target values, every 10 iterations of the gradient descent algorithm.
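The backward pass and the vanilla update could be sketched as below. Note one simplification: the output-layer error is taken as `A2 - Y` (the exact gradient for softmax with cross-entropy), a shortcut many from-scratch MNIST implementations use in place of the full MSE-through-softmax derivative; the function names and the learning rate are illustrative, not taken from the repository.

```python
import numpy as np


def one_hot(y, n_classes=10):
    """Turn integer labels into a (n_classes, n_samples) one-hot matrix."""
    Y = np.zeros((n_classes, y.size))
    Y[y, np.arange(y.size)] = 1.0
    return Y


def backward_and_update(X, y, Z1, A1, A2, W1, b1, W2, b2, lr=0.1):
    """Compute the gradients and apply the vanilla update: param -= lr * grad."""
    m = X.shape[1]
    dZ2 = A2 - one_hot(y)             # output-layer error (see note above)
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (Z1 > 0)     # ReLU derivative is 1 where Z1 > 0
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    return W1 - lr * dW1, b1 - lr * db1, W2 - lr * dW2, b2 - lr * db2


def accuracy(A2, y):
    """Fraction of columns whose argmax matches the target label."""
    return np.mean(np.argmax(A2, axis=0) == y)
```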