PatchFool implementation #2163
base: main
Conversation
Codecov Report

Attention: ❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

@@            Coverage Diff             @@
## main #2163 +/- ##
==========================================
+ Coverage 85.08% 85.16% +0.07%
==========================================
Files 324 325 +1
Lines 29331 29480 +149
Branches 5409 5431 +22
==========================================
+ Hits 24956 25106 +150
+ Misses 2997 2973 -24
- Partials 1378 1401 +23
This is only a draft implementation, but I wanted to discuss a few issues that I am facing. The first one comes from getting the attention weights of a transformer model; I added one implementation for a pre-trained ViT model. The second issue is that the PyTorch model I used behaves incorrectly if the benign input is cast to float, which makes it hard to test the attack (there is an example in the attack's notebook). Is this a problem coming from the mixture of frameworks? Have you seen such behaviour before?
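For illustration, here is a minimal sketch (not the PR code) of one way to read per-layer attention weights out of a pre-trained ViT via torchvision's FX feature extractor; the model name and the "softmax" node filter are assumptions and depend on the timm/torchvision versions in use.

import timm
import torch
from torchvision.models.feature_extraction import (
    create_feature_extractor,
    get_graph_node_names,
)

# Hedged sketch: expose the attention softmax of each transformer block.
model = timm.create_model("vit_base_patch16_224", pretrained=True).eval()

# Inspect the traced graph to find the softmax node of every attention block.
_, eval_nodes = get_graph_node_names(model)
attn_nodes = [name for name in eval_nodes if "softmax" in name]

extractor = create_feature_extractor(model, return_nodes=attn_nodes)
x = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    attention_per_layer = extractor(x)  # dict: node name -> attention tensor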
Hi @sechkova Thank you very much for your pull request! I agree on your first question that general support for all possible architectures is challenging, or not reasonably possible; ART does have multiple model-specific estimators. About your second question: does the model you are working with expect integer arrays as input? If yes, you could accept float arrays as input to your new ART tools to follow the ART APIs, and convert them to integer arrays inside the tools before providing the input data to the model. We would have to investigate how this conversion affects the adversarial attacks.
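A rough sketch of that suggestion (the helper name is hypothetical, not an ART API): keep the public interface in floating point, and cast to the integer dtype the model expects right before the forward pass.

import numpy as np

def _to_model_input(x_float: np.ndarray) -> np.ndarray:
    # Assumes float pixel values in [0, 255]; round and cast for a model that
    # only behaves correctly on integer-valued inputs.
    return np.rint(x_float).clip(0, 255).astype(np.uint8)

x_int = _to_model_input(np.random.uniform(0, 255, size=(1, 3, 224, 224)).astype(np.float32))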
At the end I used …
For now I added …
@beat-buesser the PR is updated and the attack algorithm now shows good results. What I think still needs to be resolved is the custom PyTorch DeiT classifier. For now I have implemented just the very basics for the attack to work with a pre-trained model from timm. It involves hardcoding the layer names, so there is a difference between PyTorch versions, which I've circumvented by setting 'TIMM_FUSED_ATTN' = '0' (you can see the example notebook below). It is not a very subtle approach, for sure. Here is an example notebook that I wish to contribute once the implementation is finalised:
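For reference, a minimal sketch of that workaround (the model name is an assumption; the exact behaviour depends on the timm version): newer timm releases can use fused scaled-dot-product attention, which removes the explicit softmax op the attack hooks into, so the flag has to be set before timm is imported.

import os

# Disable timm's fused attention so the per-block attention softmax stays visible
# as a separate graph node; set before timm reads its configuration.
os.environ["TIMM_FUSED_ATTN"] = "0"

import timm  # noqa: E402  (imported after setting the flag on purpose)

model = timm.create_model("deit_base_patch16_224", pretrained=True).eval()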
Hi @sechkova Thank you very much for implementing the PatchFool attack in ART! I have added a few comments in my review, please take a look and let me know what you think. In addition to that could you please add a unit test in pytest format for the new attack class and a notebook showing how the implementation reproduces the original paper?
art/attacks/evasion/patchfool.py
Outdated
@@ -0,0 +1,258 @@
# MIT License
#
# Copyright (C) The Adversarial Robustness Toolbox (ART) Authors 2022
Suggested change:
- # Copyright (C) The Adversarial Robustness Toolbox (ART) Authors 2022
+ # Copyright (C) The Adversarial Robustness Toolbox (ART) Authors 2023
art/attacks/evasion/__init__.py
Outdated
@@ -67,3 +67,4 @@
from art.attacks.evasion.wasserstein import Wasserstein
from art.attacks.evasion.zoo import ZooAttack
from art.attacks.evasion.sign_opt import SignOPTAttack
from art.attacks.evasion.patchfool import PatchFool
Suggested change:
- from art.attacks.evasion.patchfool import PatchFool
+ from art.attacks.evasion.patchfool import PatchFoolPyTorch
art/attacks/evasion/patchfool.py
Outdated
    ):
        """
        Create a :class:`PatchFool` instance.
        TODO
Is there still a TODO here?
art/attacks/evasion/patchfool.py
Outdated
    def _generate_batch(self, x: "torch.Tensor", y: Optional["torch.Tensor"] = None) -> "torch.Tensor":
        """
        TODO
Please update docstring.
art/attacks/evasion/patchfool.py
Outdated
    def _get_patch_index(self, x: "torch.Tensor", layer: int) -> "torch.Tensor":
        """
        Select the most influential patch according to a predefined `layer`.
        TODO
Please update docstring.
art/attacks/evasion/patchfool.py
Outdated
    def _get_attention_loss(self, x: "torch.Tensor", patch_idx: "torch.Tensor") -> "torch.Tensor":
        """
        Sum the attention weights from each layer for the most influential patches.
        TODO
Please update docstring.
art/attacks/evasion/patchfool.py
Outdated
    def pcgrad(self, grad1, grad2):
        """
        TODO
Please update docstring.
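For context, a hedged sketch of what a PCGrad-style projection usually looks like (the PR's actual pcgrad may differ): when the cross-entropy and attention-loss gradients conflict, the first gradient is projected onto the normal plane of the second so the combined update does not fight itself.

import torch

def pcgrad(grad1: torch.Tensor, grad2: torch.Tensor) -> torch.Tensor:
    # Project grad1 away from grad2 when the two gradients point in conflicting
    # directions (negative inner product), as in the PCGrad technique.
    flat1, flat2 = grad1.flatten(), grad2.flatten()
    dot = torch.dot(flat1, flat2)
    if dot < 0:
        flat1 = flat1 - dot / (flat2.norm() ** 2 + 1e-12) * flat2
    return flat1.view_as(grad1)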
""" | ||
return self.model.patch_embed.patch_size[0] | ||
|
||
def get_attention_weights(self, x: Union[np.ndarray, "torch.Tensor"]) -> "torch.Tensor": |
I think this method could be of interest for other models too. Please move it to PyTorchEstimator and generalise it by making return_nodes a list of strings provided by the user as an argument.
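A rough sketch of what such a generalised method could look like (signature and placement are assumptions, not the final ART API), written here as a standalone helper for clarity:

from typing import Dict, List

import torch
from torchvision.models.feature_extraction import create_feature_extractor

def get_attention_weights(
    model: torch.nn.Module, x: torch.Tensor, return_nodes: List[str]
) -> Dict[str, torch.Tensor]:
    # Extract the outputs of caller-supplied graph nodes (e.g. the attention
    # softmax of each transformer block) instead of hardcoding layer names.
    extractor = create_feature_extractor(model, return_nodes=return_nodes)
    return extractor(x)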
        )

    @property
    def patch_size(self):
Shouldn't the patch size be defined on the attack side? If yes, we could just reuse the existing PyTorchClassifier.
art/attacks/evasion/patchfool.py
Outdated
        optim = torch.optim.Adam([perturbation], lr=self.learning_rate)
        scheduler = torch.optim.lr_scheduler.StepLR(optim, step_size=self.step_size, gamma=self.step_size_decay)

        for i_max_iter in tqdm(range(self.max_iter)):
The variable i_max_iter seems not to be used; you can replace it with _ to avoid the CodeQL alert.
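A minimal illustration of the suggested change (values are placeholders):

from tqdm.auto import trange

max_iter = 250  # placeholder value for illustration
for _ in trange(max_iter):  # "_" marks the loop counter as intentionally unused
    pass  # one attack iteration would run here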
Add a new evasion attack on vision transformers. Signed-off-by: Teodora Sechkova <[email protected]>
Signed-off-by: Teodora Sechkova <[email protected]>
Signed-off-by: Teodora Sechkova <[email protected]>
Skip the class token when calculating the most influential image patch. Signed-off-by: Teodora Sechkova <[email protected]>
Signed-off-by: Teodora Sechkova <[email protected]>
Update classifier to use DeiT from the timm library. Fix algorithm details. Signed-off-by: Teodora Sechkova <[email protected]>
- Calculate the attention loss as negative log likelihood - Clamp perturbations after random init Signed-off-by: Teodora Sechkova <[email protected]>
- Fix input normalisation and scaling. - Fix patch application to happen only once after final iteration - Add skip_loss_att option Signed-off-by: Teodora Sechkova <[email protected]>
Use tqdm indication bar showing the attack iterations. Signed-off-by: Teodora Sechkova <[email protected]>
- Move get_attention_weights to PyTorchEstimator and generalise it by making return_nodes a list of strings provided by the user as an argument. - Define patch size on the attack side. - Remove PyTorchClassifierDeiT and reuse the existing PyTorchClassifier. Signed-off-by: Teodora Sechkova <[email protected]>
Signed-off-by: Teodora Sechkova <[email protected]>
Signed-off-by: Teodora Sechkova <[email protected]>
Add verbose option for tqdm. Remove unused variable i_max_iter. Signed-off-by: Teodora Sechkova <[email protected]>
Use directly the attribute patch_layer. Signed-off-by: Teodora Sechkova <[email protected]>
Signed-off-by: Teodora Sechkova <[email protected]>
Signed-off-by: Teodora Sechkova <[email protected]>
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
import os
import pytest

from art.attacks.evasion import PatchFoolPyTorch
from art.estimators.classification.classifier import ClassGradientsMixin
from art.estimators.classification.pytorch import PyTorchClassifier
from art.estimators.estimator import BaseEstimator

from tests.attacks.utils import backend_test_classifier_type_check_fail

Check notice — Code scanning / CodeQL: Unused import (note, test). The alert is raised on several of the imports above.
@beat-buesser Can you advise how the tests should be defined? The PatchFool attack works on transformer models, using information from the attention layers to compute the attack. I can use a downloaded pre-trained model for the tests, but those are usually trained on ImageNet, while the tests in ART use other, smaller test datasets. This causes issues with the number of classes, etc. I added one initial draft test with the last commit (da05de1).
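For discussion, a hedged sketch of one possible test (fixture-free, with hypothetical attack arguments, not the PR's final test): build a small, randomly initialised ViT whose num_classes matches the test data, so no ImageNet checkpoint is needed and the test only checks that the attack runs and perturbs the input.

import numpy as np
import timm
import torch

from art.attacks.evasion import PatchFoolPyTorch
from art.estimators.classification.pytorch import PyTorchClassifier


def test_patchfool_generate():
    # Randomly initialised ViT: no ImageNet weights, class count chosen to match
    # the small synthetic test data below instead of the 1000 ImageNet classes.
    model = timm.create_model("vit_tiny_patch16_224", pretrained=False, num_classes=10)
    classifier = PyTorchClassifier(
        model=model,
        loss=torch.nn.CrossEntropyLoss(),
        input_shape=(3, 224, 224),
        nb_classes=10,
        clip_values=(0.0, 1.0),
    )

    x = np.random.rand(2, 3, 224, 224).astype(np.float32)
    y = np.eye(10)[[1, 3]].astype(np.float32)

    # Constructor arguments are hypothetical; keep iterations tiny for speed.
    attack = PatchFoolPyTorch(classifier, patch_layer=4, max_iter=2, verbose=False)
    x_adv = attack.generate(x=x, y=y)

    assert x_adv.shape == x.shape
    assert not np.allclose(x_adv, x)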
Description
Initial draft implementation of PatchFool attack from the paper:
Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?
Currently there is an example notebook of the attack in Colab. I plan to contribute the notebook too once it is ready.
Fixes # (issue)
Type of change
Please check all relevant options.
Testing
Please describe the tests that you ran to verify your changes. Consider listing any relevant details of your test configuration.
Test Configuration:
Checklist