This repository contains the code and experiments for the paper: Personalized Privacy-Preserving Framework for Cross-Silo Federated Learning
Federated learning (FL) has recently emerged as a promising decentralized deep learning (DL) framework that enables models to be trained collaboratively across clients without sharing private data. However, when the central party is active and dishonest, individual clients' data can be perfectly reconstructed, creating a high risk of sensitive information leakage. Moreover, FL also suffers from non-independent and identically distributed (non-IID) data among clients, which degrades inference performance on local clients' data. In this paper, we propose a novel framework, namely Personalized Privacy-Preserving Federated Learning (PPPFL), focusing on cross-silo FL to overcome these challenges. Specifically, we introduce a stabilized variant of the Model-Agnostic Meta-Learning (MAML) algorithm to collaboratively train a global initialization from clients' synthetic data generated by Differentially Private Generative Adversarial Networks (DP-GANs). After convergence, the global initialization is locally adapted by each client to its private data. Through extensive experiments, we empirically show that our proposed framework outperforms multiple FL baselines on different datasets, including MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100.
The overall architecture of the PPPFL framework consists of three consecutive stages.
First, each participant locally synthesizes a DP-guaranteed dataset ${D^*}$ by training its own DP data generator
$\mathcal{G}$ on its secret dataset $D$.
Next, in the federated training process, the synthetic data, split into a training set
${D^*}^{tr}$ and a held-out validation set ${D^*}^{val}$, are used to update the global model parameters $\theta$.
Lastly, the clients emerge from the federated training rounds and adapt their local model parameters
$\theta^{*}$ to their secret training data.
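The sketch below illustrates one such communication round in PyTorch, using a first-order approximation of the MAML meta-gradient: each client adapts the global model on its synthetic training split, computes gradients on its synthetic validation split, and the server averages those gradients to update the global initialization. This is a minimal illustration, not the repository's implementation; names such as inner_adapt, server_round, and the clients structure are hypothetical.

```python
# Minimal first-order MAML-style sketch of one PPPFL communication round.
# Illustrative only: `synthetic train/val` loaders and the `clients` structure
# are hypothetical placeholders, not objects defined in this repository.
import copy
import torch
import torch.nn.functional as F

def inner_adapt(global_model, synthetic_train, inner_lr=3e-4, steps=1):
    """Client-side adaptation on the DP-guaranteed synthetic training split."""
    local_model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(local_model.parameters(), lr=inner_lr)
    for _ in range(steps):
        for x, y in synthetic_train:
            opt.zero_grad()
            F.cross_entropy(local_model(x), y).backward()
            opt.step()
    return local_model

def outer_gradients(local_model, synthetic_val):
    """Gradients of the validation loss w.r.t. the adapted parameters
    (first-order approximation of the MAML meta-gradient)."""
    local_model.zero_grad()
    for x, y in synthetic_val:
        F.cross_entropy(local_model(x), y).backward()
    return [p.grad.clone() for p in local_model.parameters()]

def server_round(global_model, clients, outer_lr=3e-4):
    """One federated round: average the clients' meta-gradients and
    apply them to the global initialization."""
    meta_grads = [
        outer_gradients(inner_adapt(global_model, c["train"]), c["val"])
        for c in clients
    ]
    with torch.no_grad():
        for i, p in enumerate(global_model.parameters()):
            p -= outer_lr * sum(g[i] for g in meta_grads) / len(meta_grads)
    return global_model
```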
This repository contains submodules. To clone this repository, please run
git clone --recurse-submodules
We adopt DataLens for the DP-guaranteed synthetic data generation. Please refer to this README for more details.
We use the LEAF data format for the datasets. Please refer to this README. The datasets should be placed in the datasets/[dataset_name] folder.
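For reference, the snippet below sketches a LEAF-style per-client layout. The field names ("users", "num_samples", "user_data") follow the LEAF benchmark convention; the file path and toy values are assumptions for illustration only, not this repository's exact files.

```python
# Illustrative LEAF-style data layout (field names follow the LEAF benchmark
# convention; the path and toy values are assumptions, not this repo's files).
import json
import os

leaf_split = {
    "users": ["client_0", "client_1"],   # client identifiers
    "num_samples": [2, 2],               # number of samples per client
    "user_data": {                       # per-client features and labels
        "client_0": {"x": [[0.0] * 784, [1.0] * 784], "y": [0, 1]},
        "client_1": {"x": [[0.5] * 784, [0.2] * 784], "y": [3, 7]},
    },
}

os.makedirs("datasets/mnist/train", exist_ok=True)
with open("datasets/mnist/train/example.json", "w") as f:
    json.dump(leaf_split, f)
```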
The experiments are conducted with Python 3.8 and PyTorch 1.7.1. Please install the dependencies by running
pip install -r requirements.txt
python run.py --algo fedmeta --data cifar10 --lr 3e-4 --num_epochs 1 --model=cnn \
--clients_per_round 5 --batch_size 64 --data_format pkl --num_rounds 30 --meta_algo=maml --outer_lr=3e-4 \
--result_prefix ./cifar10_results --device cuda:0 --wd 1e-6 --use_pppfl
python run.py --algo fedmeta --data cifar100 --lr 3e-4 --num_epochs 1 --model=cnn \
--clients_per_round 5 --batch_size 64 --data_format pkl --num_rounds 30 --meta_algo=maml --outer_lr=3e-4 \
--result_prefix ./cifar100_results --device cuda:0 --wd 1e-6 --use_pppfl
python run.py --algo fedmeta --data mnist --lr 3e-4 --num_epochs 1 --model=cnn \
--clients_per_round 5 --batch_size 64 --data_format pkl --num_rounds 30 --meta_algo=maml --outer_lr=3e-4 \
--result_prefix ./mnist_results --device cuda:0 --wd 1e-6 --use_pppfl
python run.py --algo fedmeta --data fmnist --lr 3e-4 --num_epochs 1 --model=cnn \
--clients_per_round 5 --batch_size 64 --data_format pkl --num_rounds 30 --meta_algo=maml --outer_lr=3e-4 \
--result_prefix ./fmnist_results --device cuda:0 --wd 1e-6 --use_pppfl
algo: FL algorithm, e.g., fedavg, fedmeta.
data: Dataset, e.g., mnist, cifar10, cifar100, fmnist.
lr: Inner learning rate.
num_epochs: Number of local epochs.
model: Model architecture, e.g., cnn, resnet18.
clients_per_round: Number of clients per round.
batch_size: Batch size.
data_format: Data format, e.g., pkl, hdf5.
num_rounds: Number of communication rounds.
meta_algo: Meta-learning algorithm, e.g., maml, reptile.
outer_lr: Outer learning rate.
result_prefix: Prefix of the result file.
device: Device, e.g., cuda:0.
wd: Weight decay.
use_pppfl: Whether to use PPPFL.
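For comparison against a non-personalized baseline, a FedAvg run can presumably be launched with the same script by changing the algorithm flag and dropping the meta-learning options; the flag combination below is an assumption based on the argument list above, not a command taken from the repository.
python run.py --algo fedavg --data cifar10 --lr 3e-4 --num_epochs 1 --model=cnn \
--clients_per_round 5 --batch_size 64 --data_format pkl --num_rounds 30 \
--result_prefix ./cifar10_fedavg_results --device cuda:0 --wd 1e-6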
We would like to thank the authors from Xtra-Computing, AI-Secure, TalwalkarLab and Shu for their open-source implementations.