Diffusion models from scratch
- Results
- Architecture
- Additional Experiments
1) Results
Forward Diffusion Process
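The forward process gradually noises a clean image. A minimal sketch of the closed-form DDPM noising step, q(x_t | x_0) = N(sqrt(ᾱ_t) x_0, (1 − ᾱ_t) I), using an illustrative linear beta schedule (the variable names here are assumptions, not the exact code from the notebooks):

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # linear beta schedule from the DDPM paper
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)      # cumulative product: alpha_bar_t

def forward_diffuse(x0, t, rng=np.random.default_rng(0)):
    """Sample x_t directly from x_0 at timestep t (0-indexed)."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return xt, noise

x0 = np.zeros((3, 32, 32))           # dummy CIFAR-10-sized image (C, H, W)
xt, eps = forward_diffuse(x0, t=500)
```

Because ᾱ_t shrinks toward zero, x_T is almost pure Gaussian noise, which is what the model learns to reverse.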
Reverse Diffusion Process
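The reverse process denoises step by step using the model's noise prediction. A hedged sketch of one DDPM sampling step, x_{t−1} = (x_t − β_t/√(1 − ᾱ_t) · ε_θ)/√(α_t) + σ_t z with σ_t² = β_t; `eps_pred` would come from the trained UNet, and here is just a stand-in:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # same schedule as the forward process
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def reverse_step(xt, eps_pred, t, rng=np.random.default_rng(0)):
    """Compute x_{t-1} from x_t and the predicted noise eps_pred."""
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    mean = (xt - coef * eps_pred) / np.sqrt(alphas[t])
    if t == 0:
        return mean                  # no noise is added at the final step
    z = rng.standard_normal(xt.shape)
    return mean + np.sqrt(betas[t]) * z   # sigma_t^2 = beta_t variant

xt = np.random.default_rng(1).standard_normal((3, 32, 32))
x_prev = reverse_step(xt, eps_pred=np.zeros_like(xt), t=500)
```

Running this loop from t = T−1 down to 0, starting from pure noise, produces the progressive generation shown above.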
Below are some results showing progressive generation on the CIFAR-10 dataset. The model, a Residual Attention UNet, was trained for 100 epochs on a P100 GPU. Some CIFAR-10 classes such as ship, dog, and frog are visible here.
Some Generated Samples
2) Architecture
The DDPM paper uses a model similar to a Residual Attention UNet. I implemented several different architectures, but have uploaded three here: a Convolutional UNet (vanilla, with tweaks!), a Residual UNet, and a Residual Attention UNet. Below is the diagram I drew for the Residual UNet; the Residual Attention UNet diagram is a work in progress :)
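A minimal PyTorch sketch of the kind of block a Residual UNet stacks: two convolutions with a timestep-embedding injection and a skip (residual) path. The layer choices and sizes here are illustrative assumptions, not the exact code from the uploaded notebooks:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Conv block with a residual connection and time-embedding injection (sketch)."""
    def __init__(self, in_ch, out_ch, t_dim=128):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.norm1 = nn.GroupNorm(8, out_ch)
        self.norm2 = nn.GroupNorm(8, out_ch)
        self.t_proj = nn.Linear(t_dim, out_ch)   # project time embedding to channels
        # 1x1 conv on the shortcut when channel counts differ
        self.skip = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x, t_emb):
        h = torch.relu(self.norm1(self.conv1(x)))
        h = h + self.t_proj(t_emb)[:, :, None, None]   # broadcast over H, W
        h = torch.relu(self.norm2(self.conv2(h)))
        return h + self.skip(x)                        # the residual connection
```

The `h + self.skip(x)` line is the key difference from the plain Conv UNet: it gives gradients a direct path through the block, which matters in the experiments below.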
3) Additional Experiments - 40 Epochs
I used WandB (Weights & Biases) for experiment tracking. Below are results for only the three uploaded models: the Conv UNet, the Residual UNet (see diagram), and the Residual Attention UNet (Notebook 1). All models were trained for just 40 epochs on a P100 GPU.

As the results show, the Conv UNet fails to generate any images. The model is very deep with large channel sizes and has no residual connections, and the UNet skip connections alone were not sufficient here. A smaller model works, but image generation quality is severely impacted; I am currently exploring other ways to make this work. Adding residual blocks to the same model (see diagram) works wonders: performance and image generation quality leap significantly, as the images below show.

Finally, the Residual Attention UNet performs best. The difference is marginal at 40 epochs, but by around 100 epochs objects are more clearly defined, rather than diffused-looking, compared to the Residual UNet.
Residual UNet
Residual Attention UNet
Conv UNet