A Python notebook that implements backpropagation from scratch and achieves 85% accuracy on MNIST with no regularization or data preprocessing.
The network has two hidden layers, with sigmoid activations on every layer except the last, which applies softmax. The cost function is cross-entropy, and the layer sizes (in neurons) are 784 (input), 32, 32, and 10 (output).
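For reference, here is a minimal NumPy sketch of the forward and backward pass for a network of this shape (784-32-32-10, sigmoid hidden layers, softmax output, cross-entropy cost). The names (`W`, `b`, `forward`, `backward`, the learning rate `lr`) are illustrative assumptions, not the notebook's actual identifiers:

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 32, 32, 10]  # input, two hidden layers, output

# Small random weights and zero biases for each layer.
W = [rng.standard_normal((n_out, n_in)) * 0.1
     for n_in, n_out in zip(sizes[:-1], sizes[1:])]
b = [np.zeros((n_out, 1)) for n_out in sizes[1:]]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=0, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=0, keepdims=True)

def forward(x):
    """Return the activations of every layer for one column-vector input x."""
    a, acts = x, [x]
    for l, (Wl, bl) in enumerate(zip(W, b)):
        z = Wl @ a + bl
        a = softmax(z) if l == len(W) - 1 else sigmoid(z)
        acts.append(a)
    return acts

def backward(x, y, lr=0.5):
    """One SGD step on a single (x, y) pair; y is a one-hot column vector."""
    acts = forward(x)
    # With softmax + cross-entropy, the output-layer error simplifies to a - y.
    delta = acts[-1] - y
    for l in range(len(W) - 1, -1, -1):
        gW = delta @ acts[l].T
        gb = delta
        if l > 0:
            # Propagate the error backwards through the previous sigmoid layer.
            delta = (W[l].T @ delta) * acts[l] * (1 - acts[l])
        W[l] -= lr * gW
        b[l] -= lr * gb
```

The softmax/cross-entropy pairing is what keeps the output-layer gradient so simple: the two derivatives cancel, leaving `a - y`.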
Includes a PDF covering the theoretical foundations, also available on Mathcha.