This repo contains implementations of several AE detection methods and provides a standard evaluation for them. The implemented methods include:
- Feature Squeezing (fs), by Xu et al., 2017
- MagNet, by Meng and Chen, 2017
- Trapdoor, by Shan et al., 2020
extra:
- Rectified-Rejection (RR), by Pang et al., 2021; see flymin/Rectified-Rejection
The pretrained weights for each classifier can be found at Google Drive. After downloading and extracting, the directory should look like:
pretrain/
├── densenet169.pt
├── gtsrb_ResNet18_E87_97.85.pth
└── MNIST_Net.pth
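As a quick sanity check after extracting, a minimal sketch like the following can confirm that the three files listed above are present. The helper name `missing_weights` and the default `pretrain` path relative to the working directory are assumptions made for illustration; they are not part of this repo.

```python
from pathlib import Path

# File names copied from the directory listing above.
EXPECTED_WEIGHTS = (
    "densenet169.pt",
    "gtsrb_ResNet18_E87_97.85.pth",
    "MNIST_Net.pth",
)


def missing_weights(pretrain_dir="pretrain"):
    """Return the expected weight files that are absent from pretrain_dir.

    Hypothetical helper: it only checks that the files exist, not that
    their contents are valid checkpoints.
    """
    root = Path(pretrain_dir)
    return [name for name in EXPECTED_WEIGHTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print(f"missing weight files: {missing}")
    else:
        print("all expected weight files found")
```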
Since Trapdoor modifies the training procedure of the classifier, we provide a Python script to generate AEs for the Trapdoor model. Please check tools/attack_trapdoor.py.
Except for Trapdoor, we do not provide code for AE generation in this repo, since AEs for the other methods can be generated with the classifier alone. We generate these AEs through foolbox. The directory containing generated AEs should look like:
results/whitebox/
├── BIM
│ ├── cifar10_BIMinf_2352.pt
│ ├── gtsrb_BIMinf_2404.pt
│ └── MNIST_BIMinf_2416.pt
├── BIML2
│ ├── cifar10_BIML2_3296.pt
│ ├── gtsrb_BIML2_2404.pt
│ └── MNIST_BIML2_2416.pt
├── CW
│ ├── cifar10_CW_2416.pt
│ ├── gtsrb_CW_2404.pt
│ └── MNIST_CW_2416.pt
├── CWinf
│ ├── cifar10_CWinf_2416.pt
│ ├── gtsrb_CWinf_2408.pt
│ └── MNIST_CWinf_2416.pt
└── ...
Each .pt file is a dict of the following format:
'x_ori': [torch.Tensor with size (B, C, H, W), ...], batches of original image data in range of (0, 1)
'y_ori': [torch.Tensor with size (B), ...], batches of ground-truth labels
'x_adv': [torch.Tensor with size (B, C, H, W), ...], batches of AEs in range of (0, 1)
This is also the format of AEs generated by tools/attack_trapdoor.py. Users can also check the script to confirm the format.
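To make the format concrete, here is a minimal sketch that checks a loaded dict against the layout described above. The function names `count_ae_samples` and `load_ae_file` are hypothetical and not part of this repo. The check relies only on each batch's `.shape`, and it assumes that the three lists hold the same number of batches and that matching batches share the same batch size.

```python
REQUIRED_KEYS = ("x_ori", "y_ori", "x_adv")


def count_ae_samples(data):
    """Validate the AE dict layout and return the total number of samples.

    Hypothetical helper: raises KeyError or ValueError when the dict does
    not match the documented format.
    """
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise KeyError(f"missing keys: {missing}")

    x_ori, y_ori, x_adv = (data[key] for key in REQUIRED_KEYS)
    if not (len(x_ori) == len(y_ori) == len(x_adv)):
        raise ValueError("the three lists must hold the same number of batches")

    total = 0
    for i, (x, y, xa) in enumerate(zip(x_ori, y_ori, x_adv)):
        # Images are expected as (B, C, H, W).
        if len(x.shape) != 4:
            raise ValueError(f"batch {i}: x_ori is not 4-dimensional")
        # Each adversarial batch must match its original batch in shape.
        if tuple(xa.shape) != tuple(x.shape):
            raise ValueError(f"batch {i}: x_adv and x_ori shapes differ")
        # Labels are expected as (B,).
        if tuple(y.shape) != (x.shape[0],):
            raise ValueError(f"batch {i}: y_ori does not have shape (B,)")
        total += x.shape[0]
    return total


def load_ae_file(path):
    """Load a .pt file onto the CPU; torch is imported only when needed."""
    import torch

    return torch.load(path, map_location="cpu")
```

For example, `count_ae_samples(load_ae_file("results/whitebox/BIM/cifar10_BIMinf_2352.pt"))` would return the number of samples stored in that file, provided the file exists at that path.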
Please clone this repo through:
git clone --recursive https://github.com/flymin/AEdetection.git
to make sure all submodules are checked out correctly.
We provide several script examples in scripts/*.sh. Please check those commands for details.