DeComFL is a library designed for training/fine-tuning deep learning models in the federated learning scenario. Its unique feature is the utilization of zeroth-order optimization, enabling communication between clients to be limited to just a few scalars, irrespective of the original model's size. This dimension-free communication is the inspiration behind the library's name.
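To make the "few scalars" idea concrete, here is a minimal NumPy sketch of a forward-difference random gradient estimate (RGE). It is not the library's code: the function names (`loss`, `perturbation`, `rge_forward_scalar`, `apply_update`), the toy linear-regression problem, and the assumption that both parties regenerate the perturbation from a shared integer seed are all illustrative. The `mu` and `lr` arguments presumably play the same role as the `--mu` and `--lr` flags used later in this README.

```python
import numpy as np

def loss(w, X, y):
    """Mean squared error of a linear model; stands in for any black-box loss."""
    return float(np.mean((X @ w - y) ** 2))

def perturbation(seed, dim):
    """Regenerate the same random direction from a shared integer seed (assumption)."""
    return np.random.default_rng(seed).standard_normal(dim)

def rge_forward_scalar(w, X, y, seed, mu=1e-3):
    """Forward-difference estimate of the directional derivative along z(seed)."""
    z = perturbation(seed, w.size)
    return (loss(w + mu * z, X, y) - loss(w, X, y)) / mu

def apply_update(w, seed, grad_scalar, lr):
    """Rebuild z from the seed and take one SGD step along it."""
    return w - lr * grad_scalar * perturbation(seed, w.size)

# Toy problem: recover a 20-dimensional linear model.
rng = np.random.default_rng(0)
X = rng.standard_normal((64, 20))
y = X @ rng.standard_normal(20)

w_sender = np.zeros(20)    # the party that evaluates the loss
w_receiver = np.zeros(20)  # a party that only ever sees (seed, scalar)
initial_loss = loss(w_sender, X, y)

for step in range(500):
    seed = 1000 + step
    g = rge_forward_scalar(w_sender, X, y, seed)
    # Only the pair (seed, g) would need to be transmitted, never a 20-dim vector.
    w_sender = apply_update(w_sender, seed, g, lr=0.01)
    w_receiver = apply_update(w_receiver, seed, g, lr=0.01)

final_loss = loss(w_sender, X, y)
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}")
print("models identical:", np.array_equal(w_sender, w_receiver))
```

The message size in this sketch is one integer and one float per step, regardless of how many parameters `w` has, which is the sense in which the communication is dimension-free.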
We use conda as our cross-platform environment management tool. However, because macOS lacks CUDA support, we provide two environment setup files:
- Use environment.yml on macOS or if you do not have CUDA at hand.
- Use environment_cuda.yml otherwise.
In this README, we use environment.yml whenever an environment file is needed.
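The choice between the two files can be sketched as a small helper. This is only an illustration: `pick_environment_file` is a hypothetical function, and treating the presence of `nvidia-smi` on the PATH as a sign that CUDA is available is a heuristic assumption, not something this repo prescribes.

```python
import platform
import shutil

def pick_environment_file(system=None, has_cuda=None):
    """Return the environment file name suggested by the rule above."""
    if system is None:
        system = platform.system()  # "Darwin" on macOS
    if has_cuda is None:
        # Heuristic (assumption): an NVIDIA driver install usually ships nvidia-smi.
        has_cuda = shutil.which("nvidia-smi") is not None
    if system == "Darwin" or not has_cuda:
        return "environment.yml"
    return "environment_cuda.yml"

print(pick_environment_file())
```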
- Make sure conda is available. See https://conda.io/projects/conda/en/latest/user-guide/install/index.html for more detail.
- At the root of this repo, run: conda env create -f environment.yml -y
- Once installation is finished, activate the created virtual env by running: conda activate decomfl
- (Optional) If you see a message like "conda init before activate", run conda init, restart your terminal/PowerShell, and then repeat the previous step (conda activate decomfl).
- Run any command provided in the Run Experiments section. If the code works, then congratulations, you have successfully set up the environment for this repo!
- Update the environment if some dependencies are missing; the most recent change was the addition of grpc. Try: conda env update --file environment.yml --prune
  The --prune flag is optional; when it is given, conda removes any dependencies that are no longer required from the environment.
- Run zeroth-order random gradient estimate + SGD training: train a model using ZOO RGE. Usage example:
python zo_rge_main.py --dataset=mnist --num-pert=10 --lr=1e-5 --mu=1e-3 --momentum=0.9
- Run DeComFL: follow the FL routine, split data into chunks, and train on different clients. Usage example:
python decomfl_main.py --large-model=opt-125m --dataset=sst2 --iterations=1000 --train-batch-size=32 --test-batch-size=200 --eval-iterations=25 --num-clients=3 --num-sample-clients=2 --local-update-steps=1 --num-pert=5 --lr=1e-5 --mu=1e-3 --grad-estimate-method=rge-forward
- Run FedAvg: run the standard FedAvg algorithm. Usage example:
python fo_fl_main.py --dataset=sst2 --lr=1e-3 --num-clients=5 --num-sample-clients=3 --local-update-steps=1 --train-batch-size=32 --test-batch-size=200 --momentum=0.9
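As a rough illustration of the "split data into chunks and train on different clients" routine, the sketch below partitions example indices across clients and samples a subset of clients each round, mirroring the `--num-clients=3 --num-sample-clients=2` setting of the DeComFL example. The helper names (`split_into_chunks`, `sample_clients`) and the uniform random split are assumptions for illustration; the repo's scripts may partition and sample differently.

```python
import numpy as np

def split_into_chunks(num_examples, num_clients, rng):
    """Shuffle example indices and split them into near-equal chunks, one per client."""
    indices = rng.permutation(num_examples)
    return np.array_split(indices, num_clients)

def sample_clients(num_clients, num_sample_clients, rng):
    """Pick the clients that participate in one round, without replacement."""
    chosen = rng.choice(num_clients, size=num_sample_clients, replace=False)
    return sorted(chosen.tolist())

rng = np.random.default_rng(0)
num_clients, num_sample_clients = 3, 2
chunks = split_into_chunks(100, num_clients, rng)

for round_idx in range(3):
    active = sample_clients(num_clients, num_sample_clients, rng)
    sizes = {client: len(chunks[client]) for client in active}
    # Each active client would run its local update steps on its own chunk here.
    print(f"round {round_idx}: active clients and chunk sizes = {sizes}")
```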
@article{li2024achieving,
title={Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization},
author={Li, Zhe and Ying, Bicheng and Liu, Zidong and Dong, Chaosheng and Yang, Haibo},
journal={arXiv preprint arXiv:2405.15861},
year={2024}
}
DeComFL is currently contributed to and maintained by Zidong Liu (ComboCurve), Bicheng Ying (Google), and Zhe Li (RIT), and advised by Prof. Haibo Yang (RIT).