
DropGrad: A Simple Method for Regularization and Accelerated Optimization of Neural Networks


DropGrad is a regularization method for neural networks that works by randomly (and independently) setting gradient values to zero before an optimization step. Like Dropout, it has a single parameter, drop_rate: the probability of setting each gradient value to zero. To keep the remaining gradient unbiased, the surviving gradient values are divided by 1.0 - drop_rate.
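The following is a minimal conceptual sketch of this drop-and-rescale operation on a single gradient tensor (illustrative only, not the package internals):

import torch

# Conceptual sketch of DropGrad's drop-and-rescale step: each gradient entry is
# zeroed independently with probability drop_rate, and the surviving entries are
# divided by (1 - drop_rate) so the expected gradient is unchanged.
drop_rate = 0.1
grad = torch.randn(4, 4)                              # stand-in for param.grad
keep_mask = (torch.rand_like(grad) >= drop_rate).float()
dropped_grad = grad * keep_mask / (1.0 - drop_rate)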

Table of Contents

  • Features
  • What's New in Version 0.3.5?
  • Directory Structure
  • Installation
  • Usage
  • Examples
  • Testing
  • Analysis
  • Windows CUDA Setup
  • Contributing
  • License

Features

  • Simple and easy-to-use gradient regularization technique
  • Compatible with various optimizers and learning rate schedulers
  • Supports per-parameter drop rates for fine-grained control
  • Implements drop rate schedulers for dynamic regularization
  • Provides an option to apply "full" update drop for further regularization
  • Utilizes mixed-precision training for improved performance and memory efficiency (CUDA devices only)
  • Cross-platform compatibility: Works seamlessly on macOS, Windows, and Linux

What's New in Version 0.3.5?

  • Added support for the Lion optimizer in the ViT experiments
  • Implemented gradient clipping to prevent gradient explosion and improve training stability
  • Enhanced data augmentation techniques for better model generalization
  • Improved error handling and user interruption handling during training
  • Updated test suite to cover various aspects of DropGrad, including initialization, optimization step, drop rate scheduling, and saving of loss values
  • Code refactoring and documentation enhancements for better readability and maintainability

Directory Structure


Getting Started

The examples directory contains sample code demonstrating various use cases of DropGrad, including basic usage, integration with learning rate schedulers, applying full update drop, and training a Vision Transformer (ViT) on the CIFAR-10 dataset under different regularization scenarios.
└── examples
    ├── basic_usage.py
    ├── lr_scheduler_integration.py
    ├── full_update_drop.py
    └── vit_experiments
        ├── vit_model.py
        ├── train.py
        ├── visualize.py
        ├── mathematical_analysis.py
        ├── benchmark_visualizations.py
        └── *.pth

Documentation

The docs directory contains detailed documentation and analysis of the DropGrad method, as well as instructions for setting up CUDA on Windows for PyTorch and DropGrad.
└── docs
    ├── analysis.md
    └── windows_cuda_setup.md

Core DropGrad Implementation

The dropgrad directory contains the core implementation of the DropGrad optimizer and drop rate schedulers.
└── dropgrad
    ├── __init__.py
    ├── dropgrad_opt.py
    └── dropgrad_scheduler.py

Testing

The tests directory contains the test suite for DropGrad, ensuring the correctness of the implementation. The tests cover the functionality of the DropGrad optimizer and the drop rate schedulers.
└── tests
    ├── __init__.py
    ├── test_dropgrad.py
    ├── test_dropgrad_optimizer.py
    └── test_dropgrad_scheduler.py

Configuration and Setup

This section highlights the key files related to project configuration, requirements, and licensing.
├── .gitignore
├── LICENSE
├── pyproject.toml
├── README.md
└── requirements.txt

Installation

Requirements

  • Python >= 3.7
  • PyTorch >= 1.12.0
  • torchvision >= 0.13.0
  • torchaudio >= 0.12.0
  • matplotlib
  • scipy

Using pip

To install DropGrad using pip, run the following command:

pip install dropgrad

From source

To install DropGrad from source, follow these steps:

git clone https://github.com/dingo-actual/dropgrad.git
cd dropgrad
pip install -r requirements.txt
pip install .

Usage

Basic Usage

To use DropGrad in your neural network optimization, simply import the DropGrad class and wrap your optimizer:

import torch
from dropgrad import DropGrad

# `model` is any torch.nn.Module; wrap the base optimizer with DropGrad
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
optimizer = DropGrad(optimizer, drop_rate=0.1)

During training, call .step() on the wrapped optimizer to apply DropGrad, and then call .zero_grad() to reset the gradients:

optimizer.step()
optimizer.zero_grad()
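Putting these pieces together, here is a minimal end-to-end sketch; the toy model, data, and hyperparameters are illustrative and not taken from the repository:

import torch
import torch.nn as nn
from dropgrad import DropGrad

# Toy regression model and random data, used only to illustrate the training loop
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = DropGrad(torch.optim.Adam(model.parameters(), lr=1e-3), drop_rate=0.1)

for _ in range(100):
    inputs, targets = torch.randn(32, 10), torch.randn(32, 1)
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()       # gradients are dropped and rescaled before the update
    optimizer.zero_grad()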

Drop Rate Schedulers

DropGrad supports drop rate schedulers to dynamically adjust the drop rate during training. The package provides several built-in schedulers, including LinearDropRateScheduler, CosineAnnealingDropRateScheduler, and StepDropRateScheduler. To use a drop rate scheduler, pass an instance of a scheduler to the DropGrad constructor:

from dropgrad import DropGrad, LinearDropRateScheduler

scheduler = LinearDropRateScheduler(initial_drop_rate=0.1, final_drop_rate=0.0, num_steps=1000)
optimizer = DropGrad(optimizer, drop_rate_scheduler=scheduler)
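Based on its constructor arguments, the scheduler above is expected to interpolate the drop rate linearly from initial_drop_rate to final_drop_rate over num_steps optimization steps. A quick sketch of that assumed schedule:

# Assumed behaviour of the linear schedule above (inferred from its arguments):
# the drop rate moves linearly from 0.1 to 0.0 over 1000 steps.
initial_drop_rate, final_drop_rate, num_steps = 0.1, 0.0, 1000
for step in range(0, num_steps + 1, 250):
    drop_rate = initial_drop_rate + (final_drop_rate - initial_drop_rate) * step / num_steps
    print(f"step {step:4d}: drop_rate = {drop_rate:.3f}")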

Full Update Drop

DropGrad provides an option to apply "full" update drop, in which the entire update step is occasionally dropped (by interrupting the .step() method) rather than individual gradient values. To enable this feature, pass full_update_drop=True to the DropGrad constructor:

optimizer = DropGrad(optimizer, drop_rate=0.1, full_update_drop=True)

Varying Drop Rates per Parameter

DropGrad allows specifying different drop rates for individual parameters or parameter groups. This enables fine-grained control over the regularization applied to different parts of the model. To vary drop rates per parameter, pass a dictionary mapping parameters to drop rates:

params = {
    'encoder': 0.1,
    'decoder': 0.2
}
optimizer = DropGrad(optimizer, params=params)

Examples

See the examples directory (outlined above) for sample code covering basic usage, integration with learning rate schedulers, full update drop, and training a Vision Transformer (ViT) on the CIFAR-10 dataset under different regularization scenarios.

Testing

DropGrad includes a test suite to ensure the correctness of the implementation. The tests cover the functionality of the DropGrad optimizer and the drop rate schedulers. To run the tests, use the following command:

pytest tests/

Analysis

For a detailed analysis of the DropGrad method, including its theoretical foundations, advantages, and empirical results, please refer to the docs/analysis.md file.

Windows CUDA Setup

For instructions on setting up CUDA on Windows for PyTorch and DropGrad, please refer to the docs/windows_cuda_setup.md file.

Contributing

Contributions to DropGrad are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request on the GitHub repository.

License

DropGrad is released under the MIT License. See the LICENSE file for more details.

Star History

Star History Chart