This is the repository for *Adversarial Robustness via Runtime Masking and Cleansing* by Yi-Hsuan Wu, Chia-Hung Yuan, and Shan-Hung Wu, published in the Proceedings of ICML 2020. Our code is implemented in TensorFlow 2.0.
We devise a new defense method, called runtime masking and cleansing (RMC), to improve adversarial robustness. At runtime, before making a prediction, RMC adapts the network to dynamically mask network gradients and cleanse the model of the non-robust features inevitably learned during training due to the limited size of the training set.
The defense mechanism in RMC consists of the following steps (a code sketch follows the list):
- Augment dataset with adversarial examples
- Find K-nearest neighbors (KNN) of test data from the augmented dataset
- Adapt the network with KNN
- Make predictions
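For intuition, here is a minimal, hypothetical sketch of this test-time adaptation loop in TensorFlow 2. The function name, argument names, and training details are assumptions for illustration only, not the repo's API; the real implementation is in `evaluate.py`.

```python
import tensorflow as tf

def rmc_predict(classifier, feature_extractor, x_test,
                aug_images, aug_labels, aug_features,
                k=2048, lr=1e-5, epochs=100):
    """Hypothetical sketch of RMC's runtime adaptation for a single test example."""
    # 1. Embed the test example with the feature extractor.
    feat = feature_extractor(x_test[None, ...])                # shape (1, d)

    # 2. Retrieve the k nearest neighbors of the test example from the
    #    augmented (clean + adversarial) dataset in feature space.
    dists = tf.norm(aug_features - feat, axis=-1)              # shape (N,)
    nn_idx = tf.argsort(dists)[:k]
    x_nn = tf.gather(aug_images, nn_idx)
    y_nn = tf.gather(aug_labels, nn_idx)

    # 3. Adapt the network on the retrieved neighbors before predicting.
    classifier.compile(
        optimizer=tf.keras.optimizers.Adam(lr),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
    classifier.fit(x_nn, y_nn, batch_size=64, epochs=epochs, verbose=0)

    # 4. Make the prediction with the adapted weights.
    return tf.argmax(classifier(x_test[None, ...]), axis=-1)
```

In practice the adaptation epochs are also subject to the early-stopping criterion (`EARLY_STOP`) listed in the configuration below.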
For more details, please refer to our main paper, supplementary materials, video, or slides.
Clone and install requirements.
```bash
git clone https://github.com/nthu-datalab/Runtime-Masking-and-Cleansing.git
cd Runtime-Masking-and-Cleansing
pip install -r requirements.txt
```

RMC works with any existing model architecture. The following command evaluates a pretrained ResNet-152v2, downloaded from the TensorFlow website, on ImageNet:
```bash
python evaluate.py
```

Note that before running `evaluate.py`, we have to create the augmented dataset and the adversarial examples for evaluation, and extract features (hidden representations) from those data. All corresponding code can be found in the `/prepare` folder. For example, the following commands create the perturbed training dataset with the PGD (Projected Gradient Descent) attack:
```bash
cd prepare
python augment_dataset.py
```

To visualize the test data and its corresponding nearest neighbors, please refer to `visualize.ipynb`.
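For reference, extracting hidden representations with a pretrained ResNet-152v2 might look roughly like the sketch below. The function name and pooling choice are assumptions; the actual extraction scripts live in `/prepare`.

```python
import tensorflow as tf

def extract_features(images, batch_size=64):
    """Hypothetical sketch: pooled hidden representations from a pretrained ResNet-152v2."""
    backbone = tf.keras.applications.ResNet152V2(
        include_top=False, pooling="avg", weights="imagenet")
    ds = tf.data.Dataset.from_tensor_slices(images).batch(batch_size)
    feats = [backbone(
                 tf.keras.applications.resnet_v2.preprocess_input(tf.cast(x, tf.float32)),
                 training=False)
             for x in ds]
    return tf.concat(feats, axis=0)   # shape (N, 2048)
```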
Before running any code, please set the following directories first:

- `BASE_DIR`: Path to the "Runtime-Masking-and-Cleansing" folder.
- `TRAIN_DATA_DIR`: Path to the training dataset.
- `TRAIN_LABEL_DIR`: Path to the labels of the training dataset.
- `AUG_DATA_DIR`: Path to the augmented dataset.
- `AUG_FEATURES`: Path to the features of the augmented dataset.
- `EVAL_DATA_DIR`: Path to the evaluation dataset.
- `EVAL_LABEL_DIR`: Path to the labels of the evaluation dataset.
- `EVAL_FEATURES`: Path to the features of the evaluation dataset.
- `ATTACK_DATA_DIR`: Path to the perturbed evaluation dataset.
- `ATTACK_LABEL_DIR`: Path to the target labels of the perturbed evaluation data. Only used when evaluating targeted attacks.
- `ATTACK_FEATURES`: Path to the features of the perturbed evaluation dataset.
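For example, these might be set as plain constants at the top of each script; every path below is a placeholder, not the repo's actual layout.

```python
# Placeholder paths -- adjust to wherever you keep the data and extracted features.
BASE_DIR = "/path/to/Runtime-Masking-and-Cleansing"
TRAIN_DATA_DIR = "/path/to/imagenet/train"
TRAIN_LABEL_DIR = "/path/to/imagenet/train_labels"
AUG_DATA_DIR = "/path/to/augmented_images"
AUG_FEATURES = "/path/to/augmented_features"
EVAL_DATA_DIR = "/path/to/imagenet/val"
EVAL_LABEL_DIR = "/path/to/imagenet/val_labels"
EVAL_FEATURES = "/path/to/eval_features"
ATTACK_DATA_DIR = "/path/to/attacked_images"
ATTACK_LABEL_DIR = "/path/to/attack_target_labels"
ATTACK_FEATURES = "/path/to/attacked_features"
```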
Evaluate with different configurations:

```python
K = 2048
EPOCHS = 100
EARLY_STOP = 5
LEARNING_RATE = 1e-5
BUFFER_SIZE = 10000
IMG_SIZE = 224
RESIZE_SIZE = 256
BATCH_SIZE = 64
IMG_SHAPE = (IMG_SIZE, IMG_SIZE, 3)
EPSILON = 16/255
EPS_ITERS = 1/255
NB_ITERS = 100
```

- `K`: Number of nearest neighbors used in the k-NN search.
- `EARLY_STOP`: Early-stopping criterion.
- `EPSILON`: Allowable perturbation when computing adversarial examples.
- `EPS_ITERS`: Step size used in the PGD attack.
- `NB_ITERS`: Number of iterations used in the PGD attack.
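As a point of reference for how `EPSILON`, `EPS_ITERS`, and `NB_ITERS` fit together, a minimal untargeted L∞ PGD sketch in TensorFlow 2 could look like the following; the repo's attack code in `/prepare` may differ in details such as targeted losses or random starts.

```python
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

def pgd_attack(model, x, y, epsilon=16/255, step_size=1/255, nb_iters=100):
    """Minimal untargeted L-infinity PGD sketch; assumes inputs scaled to [0, 1]."""
    x_adv = tf.identity(x)
    for _ in range(nb_iters):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            loss = loss_fn(y, model(x_adv, training=False))
        grad = tape.gradient(loss, x_adv)
        x_adv = x_adv + step_size * tf.sign(grad)                  # ascend the loss
        x_adv = tf.clip_by_value(x_adv, x - epsilon, x + epsilon)  # project into the eps-ball
        x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)                  # keep valid pixel range
    return x_adv
```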
We use the MNIST, CIFAR-10, and ImageNet datasets in our paper. The first two can be downloaded through the TensorFlow API.
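For example, both can be fetched with the Keras datasets API:

```python
import tensorflow as tf

# MNIST and CIFAR-10 ship with TensorFlow and download on first use.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
(x_train_c, y_train_c), (x_test_c, y_test_c) = tf.keras.datasets.cifar10.load_data()

# ImageNet must be downloaded separately and referenced via the *_DIR paths above.
```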
The table below reports results on ImageNet. The Clean Images column shows clean accuracy (%); the attack columns report accuracy (%) / attack success rate (%).

| Model | Clean Images | 10-step PGD (8/255) | 10-step PGD (16/255) | 100-step PGD (16/255) |
|---|---|---|---|---|
| None | 72.9 | 8.5 / 54.69 | 5.2 / 61.7 | 0.6 / 98.1 |
| Adv. Trained | 62.3 | N/A | 52.5 / 5.5 | 41.7 / 31.0 |
| Denoising Block | 65.3 | N/A | 55.7 / 4.9 | 45.5 / 26.6 |
| DeepNN | 26.6 | 12.9 / 0.16 | 8.7 / 1.2 | 7.8 / 1.2 |
| WebNN | 27.8 | 18.8 / 0.54 | 15.2 / 0.3 | 13.9 / 0.3 |
| RMC | 73.6 | 62.4 / 0.28 | 55.9 / 1.6 | 55.6 / 1.3 |
If you find this code helpful for your research, please cite our ICML 2020 paper:
```
@inproceedings{wu2020adversarial,
  title={Adversarial Robustness via Runtime Masking and Cleansing},
  author={Wu, Yi-Hsuan and Yuan, Chia-Hung and Wu, Shan-Hung},
  booktitle={International Conference on Machine Learning},
  pages={10399--10409},
  year={2020},
  organization={PMLR}
}
```
