DropGrad is a regularization method for neural networks that works by randomly (and independently) setting gradient values to zero before an optimization step. Like Dropout, it has a single parameter, `drop_rate`, the probability of setting each parameter gradient to zero. To de-bias the remaining gradient values, they are divided by `1.0 - drop_rate`, so the expected value of each gradient entry is unchanged.
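To illustrate the idea, here is a minimal sketch of the per-gradient operation (not the package's internal implementation):

```python
import torch

drop_rate = 0.1
grad = torch.randn(4, 4)                              # a parameter's gradient
mask = (torch.rand_like(grad) >= drop_rate).float()   # keep each entry with probability 1 - drop_rate
dropped_grad = grad * mask / (1.0 - drop_rate)        # zero the dropped entries, re-scale the rest to de-bias
```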
- Features
- What's New in Version 0.3.5?
- Directory Structure
- Installation
- Usage
- Examples
- Testing
- Analysis
- Windows CUDA Setup
- Contributing
- License
- Star History
- Simple and easy-to-use gradient regularization technique
- Compatible with various optimizers and learning rate schedulers
- Supports per-parameter drop rates for fine-grained control
- Implements drop rate schedulers for dynamic regularization
- Provides an option to apply "full" update drop for further regularization
- Utilizes mixed-precision training for improved performance and memory efficiency (CUDA devices only)
- Cross-platform compatibility: Works seamlessly on macOS, Windows, and Linux
- Added support for the Lion optimizer in the ViT experiments
- Implemented gradient clipping to prevent gradient explosion and improve training stability
- Enhanced data augmentation techniques for better model generalization
- Improved error handling and user interruption handling during training
- Updated test suite to cover various aspects of DropGrad, including initialization, optimization step, drop rate scheduling, and saving of loss values
- Code refactoring and documentation enhancements for better readability and maintainability
| Description | Quick Access |
|---|---|
| The `examples` directory contains sample code demonstrating various use cases of DropGrad, including basic usage, integration with learning rate schedulers, applying full update drop, and training a Vision Transformer (ViT) on the CIFAR-10 dataset under different regularization scenarios. | └── `examples`<br>&emsp;├── `basic_usage.py`<br>&emsp;├── `lr_scheduler_integration.py`<br>&emsp;├── `full_update_drop.py`<br>&emsp;└── `vit_experiments`<br>&emsp;&emsp;├── `vit_model.py`<br>&emsp;&emsp;├── `train.py`<br>&emsp;&emsp;├── `visualize.py`<br>&emsp;&emsp;├── `mathematical_analysis.py`<br>&emsp;&emsp;├── `benchmark_visualizations.py`<br>&emsp;&emsp;└── `*.pth` |
| The `docs` directory contains detailed documentation and analysis of the DropGrad method, as well as instructions for setting up CUDA on Windows for PyTorch and DropGrad. | └── `docs`<br>&emsp;├── `analysis.md`<br>&emsp;└── `windows_cuda_setup.md` |
| The `dropgrad` directory contains the core implementation of the DropGrad optimizer and drop rate schedulers. | └── `dropgrad`<br>&emsp;├── `__init__.py`<br>&emsp;├── `dropgrad_opt.py`<br>&emsp;└── `dropgrad_scheduler.py` |
| The `tests` directory contains the test suite for DropGrad, ensuring the correctness of the implementation. The tests cover the functionality of the DropGrad optimizer and the drop rate schedulers. | └── `tests`<br>&emsp;├── `__init__.py`<br>&emsp;├── `test_dropgrad.py`<br>&emsp;├── `test_dropgrad_optimizer.py`<br>&emsp;└── `test_dropgrad_scheduler.py` |
| This section highlights the key files related to project configuration, requirements, and licensing. | ├── `.gitignore`<br>├── `LICENSE`<br>├── `pyproject.toml`<br>├── `README.md`<br>└── `requirements.txt` |
- Python >= 3.7
- PyTorch >= 1.12.0
- torchvision >= 0.13.0
- torchaudio >= 0.12.0
- matplotlib
- scipy
To install DropGrad using pip, run the following command:
```bash
pip install dropgrad
```
To install DropGrad from source, follow these steps:
```bash
git clone https://github.com/dingo-actual/dropgrad.git
cd dropgrad
pip install -r requirements.txt
pip install .
```
To use DropGrad in your neural network optimization, simply import the `DropGrad` class and wrap your optimizer:

```python
import torch

from dropgrad import DropGrad

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
optimizer = DropGrad(optimizer, drop_rate=0.1)
```
During training, call `.step()` on the wrapped optimizer to apply DropGrad, and then call `.zero_grad()` to reset the gradients:

```python
optimizer.step()
optimizer.zero_grad()
```
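For context, here is a minimal training-loop sketch showing where those calls fit; `model`, `loss_fn`, and `dataloader` are assumed to be defined elsewhere and are not part of the DropGrad API:

```python
for inputs, targets in dataloader:
    outputs = model(inputs)
    loss = loss_fn(outputs, targets)
    loss.backward()        # compute gradients as usual
    optimizer.step()       # DropGrad drops/re-scales gradients, then steps the wrapped optimizer
    optimizer.zero_grad()  # reset gradients before the next batch
```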
DropGrad supports drop rate schedulers to dynamically adjust the drop rate during training. The package provides several built-in schedulers, including `LinearDropRateScheduler`, `CosineAnnealingDropRateScheduler`, and `StepDropRateScheduler`. To use a drop rate scheduler, pass an instance of a scheduler to the `DropGrad` constructor:

```python
from dropgrad import DropGrad, LinearDropRateScheduler

scheduler = LinearDropRateScheduler(initial_drop_rate=0.1, final_drop_rate=0.0, num_steps=1000)
optimizer = DropGrad(optimizer, drop_rate_scheduler=scheduler)
```
DropGrad provides an option to apply "full" update drop by interrupting the `.step()` method. To enable this feature, pass `full_update_drop=True` to the `DropGrad` constructor:

```python
optimizer = DropGrad(optimizer, drop_rate=0.1, full_update_drop=True)
```
DropGrad allows specifying different drop rates for individual parameters or parameter groups, enabling fine-grained control over the regularization applied to different parts of the model. To vary drop rates per parameter, pass a dictionary mapping parameters to drop rates:

```python
params = {
    'encoder': 0.1,
    'decoder': 0.2
}

optimizer = DropGrad(optimizer, params=params)
```
The `examples` directory contains sample code demonstrating various use cases of DropGrad, including basic usage, integration with learning rate schedulers, applying full update drop, and training a Vision Transformer (ViT) on the CIFAR-10 dataset under different regularization scenarios.
DropGrad includes a test suite to ensure the correctness of the implementation. The tests cover the functionality of the `DropGrad` optimizer and the drop rate schedulers. To run the tests, use the following command:

```bash
pytest tests/
```
For a detailed analysis of the DropGrad method, including its theoretical foundations, advantages, and empirical results, please refer to the `docs/analysis.md` file.
For instructions on setting up CUDA on Windows for PyTorch and DropGrad, please refer to the `docs/windows_cuda_setup.md` file.
Contributions to DropGrad are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request on the GitHub repository.
DropGrad is released under the MIT License. See the `LICENSE` file for more details.