Neural Network from scratch with CUDA Support

🤔 What is this project?

This project is a neural network implemented from scratch in C#, with CUDA support written in C++. It currently supports Optical Digit Recognition (ODR), trained on 60,000 images, and can also learn XOR as a simple initial test. More complex image classification is in progress: training on 2,000 RGB images of 150x150 pixels has produced acceptable results so far.

❗Info

At this point I would not recommend using this in any production environment. For me it's just a fun project to learn more about CUDA and neural networks. I also tried to implement convolution and pooling layers from scratch, but could not get their backpropagation to work, so they currently do not work at all 😢

🛠️ Features

Optical Digit Recognition (ODR): Trained with the MNIST dataset of 60,000 images.
XOR Test: A simple test to demonstrate the neural network's basic functionality.
CUDA Support: Accelerates neural network training using GPU resources.

📎See also DeepReinforcementLearning from scratch using this project

📊 Benchmarks

Training Details	GPU (CUDA, RTX 3050)	CPU (i9-10900)	CPU (Ryzen 5 3500U)
100 images, 150x150x3 (67500 inputs, 1024 hidden, 512 hidden, 256 hidden, 6 outputs)	2.231 sec	9.514 sec	34.472 sec
100 images, 150x150x3 (67500 inputs, 2048 hidden, 1024 hidden, 6 outputs)	6.832 sec (old)	9.426 sec	31.467 sec
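
To give a sense of how large the benchmarked networks are, here is a small sketch that counts trainable parameters from a list of layer sizes. It assumes every layer is fully connected and every non-input neuron has exactly one bias; the README does not confirm this layout, and `ParameterCount` is a hypothetical helper, not part of the project.

```csharp
public static class ParameterCount
{
    // Counts weights and biases for a fully connected network.
    // Assumption: each layer is dense and each non-input neuron has one bias.
    public static long Count(int[] layerSizes)
    {
        long total = 0;
        for (int i = 1; i < layerSizes.Length; i++)
        {
            // One weight per (previous neuron, current neuron) pair.
            long weights = (long)layerSizes[i - 1] * layerSizes[i];
            // One bias per neuron in the current layer.
            long biases = layerSizes[i];
            total += weights + biases;
        }
        return total;
    }
}
```

Under these assumptions, `ParameterCount.Count(new[] { 67500, 1024, 512, 256, 6 })` gives 69,778,694 parameters for the first row and `ParameterCount.Count(new[] { 67500, 2048, 1024, 6 })` gives 140,346,374 for the second. Almost all of them sit in the first layer, because of the 67,500 inputs.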

🚀 Performance History

Sequential to true Parallel 📈 ...

The initial Optical Digit Recognition (ODR) implementation used 28x28 black-and-white images as input and a network with two hidden layers of 128 and 64 neurons and 10 output neurons. It took 2.8 seconds to train on 1000 images.
To improve performance, I added Parallel.For support, which sped up training. Building in Release mode reduced the training time further, to around 780 ms for 1000 images.
That was still not fast enough, so I began integrating CUDA support. This proved challenging but paid off: with CUDA, training time dropped to 400 ms for 1000 images, and in the latest build it is approximately 200 ms per 1000 images.
Overall, this is roughly a 14-fold speedup (from 2.8 s to about 200 ms per 1000 images).
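
As a rough illustration of the Parallel.For step, the sketch below computes one dense layer's forward pass sequentially and in parallel over its neurons. It is not the project's actual code: the class `DenseForward`, the `weights[neuron][input]` layout, and the sigmoid activation are assumptions made for this example.

```csharp
using System;
using System.Threading.Tasks;

public static class DenseForward
{
    static float Sigmoid(float x) => 1f / (1f + MathF.Exp(-x));

    // Weighted sum plus bias for one neuron, followed by the activation.
    static float Neuron(float[] input, float[] row, float bias)
    {
        float sum = bias;
        for (int i = 0; i < input.Length; i++)
            sum += row[i] * input[i];
        return Sigmoid(sum);
    }

    // Baseline: neurons are processed one after another.
    public static float[] ForwardSequential(float[] input, float[][] weights, float[] biases)
    {
        var output = new float[biases.Length];
        for (int j = 0; j < biases.Length; j++)
            output[j] = Neuron(input, weights[j], biases[j]);
        return output;
    }

    // Parallel version: each neuron is independent and writes only to its
    // own slot in the output array, so no locking is needed.
    public static float[] ForwardParallel(float[] input, float[][] weights, float[] biases)
    {
        var output = new float[biases.Length];
        Parallel.For(0, biases.Length, j =>
        {
            output[j] = Neuron(input, weights[j], biases[j]);
        });
        return output;
    }
}
```

Both methods evaluate each neuron with the same operations in the same order, so they return identical values. The parallel version only pays off when a layer has enough neurons to outweigh the scheduling overhead.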

🏗️ Get Started

Clone the repository.
Make sure the dependencies for C# and CUDA development are installed (CUDA downloads: https://developer.nvidia.com/cuda-downloads).
Open the solution file (.sln) in Visual Studio.
Build and run the project.

Example code

// XOR prediction: build a 2-4-1 network with sigmoid activations
var nnmodel = NetworkBuilder.Create()
    .Stack(new InputLayer(2))
    .Stack(new DenseLayer(4, ActivationType.Sigmoid))
    .Stack(new OutputLayer(1, ActivationType.Sigmoid))
    .Build();

nnmodel.Summary();

// The four XOR input pairs and their expected outputs
float[][] inputs = new float[][] { new float[] { 0, 0 }, new float[] { 0, 1 }, new float[] { 1, 0 }, new float[] { 1, 1 } };
float[][] desired = new float[][] { new float[] { 0 }, new float[] { 1 }, new float[] { 1 }, new float[] { 0 } };
nnmodel.Train(inputs, desired, 15900, 0.01f, 1000, 100);

// The network has a single output, so print its value directly.
// MathHelper.GetMaximumIndex on a one-element output would always return 0.
// Assumes Predict returns a float[], as the GetMaximumIndex call suggests.
var prediction = nnmodel.Predict(new float[] { 0, 0 });
Console.WriteLine("Prediction: " + prediction[0]);
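
How a prediction is read depends on the output layer. With 10 outputs (digit recognition), the index of the largest value is the predicted class. With a single sigmoid output (XOR), an arg-max always returns 0, so thresholding the value is the usual choice. The sketch below shows both. `OutputDecoding` is a hypothetical helper, and `ArgMax` only mirrors what `MathHelper.GetMaximumIndex` appears to do judging by its name.

```csharp
using System;

public static class OutputDecoding
{
    // Index of the largest value. Hypothetical stand-in for the
    // project's MathHelper.GetMaximumIndex (behavior assumed from its name).
    public static int ArgMax(float[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("values must not be empty", nameof(values));

        int best = 0;
        for (int i = 1; i < values.Length; i++)
            if (values[i] > values[best])
                best = i;
        return best;
    }

    // Reads a single sigmoid output as a binary class (0 or 1).
    public static int ThresholdBit(float value, float threshold = 0.5f)
    {
        return value >= threshold ? 1 : 0;
    }
}
```

For the XOR model above, `OutputDecoding.ThresholdBit(prediction[0])` would turn the raw output into 0 or 1, again assuming `Predict` returns a `float[]`.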

FrozenAssassine/NeuralNetwork-FromScratch
