
[DLSP25] Add Attention Visualization with ViT tutorial#854

Open
lucas-oma wants to merge 1 commit into Atcold:master from lucas-oma:visualize-attention-ViT

Conversation

@lucas-oma
Contributor

ViT Attention Visualization – No Spoilers Edition

lom2017@nyu.edu

This PR introduces a new tutorial: Vision Transformer (ViT) attention visualization, inspired by the last recitation class where PhD student Alex Wang showed us how to visualize attention in ViTs. However, his notebook contained a "from-scratch" ViT implementation, which, if uploaded to the repo, would basically be like handing over Homework 3 to future students on a silver platter. 😅

So instead, I took inspiration from it but recreated the visualization using Hugging Face's pre-trained ViT model. Same cool insights, no spoilers!

🚀 What’s New?

  • ViT attention visualization powered by transformers.ViTModel
  • A brief look at how different attention heads focus on different parts of an image.
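For readers curious what the visualization step looks like: below is a minimal sketch of the kind of post-processing involved, assuming per-layer attention tensors like those returned by `transformers.ViTModel` with `output_attentions=True` (ViT-Base defaults: 12 heads, 16×16 patches on a 224×224 image, so 196 patch tokens plus one `[CLS]` token). The helper name and the random demo input are illustrative, not taken from the notebook.

```python
import numpy as np

def cls_attention_maps(attn, grid_size=14):
    """Turn one layer's attention weights into per-head spatial maps.

    attn: array of shape (num_heads, seq_len, seq_len), where seq_len is
    1 (the [CLS] token) + grid_size**2 patch tokens -- the per-layer shape
    produced by a ViT with output_attentions=True.
    """
    num_heads, seq_len, _ = attn.shape
    assert seq_len == 1 + grid_size ** 2
    # Attention from the [CLS] query (index 0) to every patch token.
    cls_to_patches = attn[:, 0, 1:]                       # (num_heads, 196)
    maps = cls_to_patches.reshape(num_heads, grid_size, grid_size)
    # Normalize each head's map to [0, 1] so it can be overlaid on the image.
    lo = maps.min(axis=(1, 2), keepdims=True)
    hi = maps.max(axis=(1, 2), keepdims=True)
    return (maps - lo) / (hi - lo + 1e-8)

# Demo with random softmax-normalized weights instead of real model output.
rng = np.random.default_rng(0)
logits = rng.normal(size=(12, 197, 197))
attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
maps = cls_attention_maps(attn)
print(maps.shape)  # (12, 14, 14): one 14x14 heatmap per head
```

Each 14×14 map can then be upsampled to the input resolution and blended over the photo, which is how the per-head overlays in the notebook are produced.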

📸 Why This Image?

input image

I tried a couple of images, and this one stood out. The attention heads clearly focus on different elements, like tracking people walking, locking onto the NYU banner, and highlighting the building's structure. I think it’s a great example of how ViTs "see" an image!

Also, this image was obtained from NYU's website, so we should have no copyright issues (I hope).

Follow-ups:

If you spot any inconsistencies or mistakes in the descriptions, comments, etc., don’t hesitate to let me know! I did proofread it a couple of times, though.


