Thank you to the authors and contributors for this novel, inspiring work; it is very well presented.
Here is my question: I was wondering whether SAEs could also be used to interpret the intermediate-layer activations of pure vision models like ResNet.
Compared to ViT, ResNet activations are far less discriminative, or to put it another way, far simpler. But as the authors found, "SAEs enable systematic observation of learned features, revealing fundamental differences between models". ResNet should have its own characteristic features, so what would we see if we used an SAE to interpret it? For concreteness, I sketch roughly what I have in mind below.
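Here is a minimal sketch of the kind of setup I am imagining, not a claim about the authors' method: hook an intermediate ResNet block, flatten its feature maps into per-position vectors, and fit a standard ReLU SAE with an L1 sparsity penalty on them. The choice of `layer3`, the expansion factor, and the `l1_coef` value are all placeholder assumptions on my part.

```python
# Hypothetical sketch: train a simple SAE on ResNet intermediate activations.
import torch
import torch.nn as nn
import torchvision

class SparseAutoencoder(nn.Module):
    def __init__(self, d_in, d_hidden, l1_coef=1e-3):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_in)
        self.l1_coef = l1_coef

    def forward(self, x):
        z = torch.relu(self.encoder(x))                 # sparse latent codes
        recon = self.decoder(z)
        loss = ((recon - x) ** 2).mean() + self.l1_coef * z.abs().mean()
        return recon, z, loss

# Grab activations from an intermediate ResNet block via a forward hook.
model = torchvision.models.resnet50(weights="IMAGENET1K_V2").eval()
acts = []
handle = model.layer3.register_forward_hook(
    lambda module, inputs, output: acts.append(output.detach())
)

images = torch.randn(8, 3, 224, 224)                    # stand-in for a real batch
with torch.no_grad():
    model(images)
handle.remove()

# Flatten (B, C, H, W) feature maps into per-position vectors of size C.
feats = acts[0].permute(0, 2, 3, 1).reshape(-1, acts[0].shape[1])

sae = SparseAutoencoder(d_in=feats.shape[1], d_hidden=8 * feats.shape[1])
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
_, _, loss = sae(feats)                                  # one illustrative step
loss.backward()
opt.step()
print(loss.item())
```

This is just to make the question concrete; I am mainly curious what kinds of features such an SAE would recover from convolutional activations compared to ViT tokens.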
Additionally, I noticed that SAEs are typically trained on very large-scale datasets; why is that necessary? If we trained an SAE only on a small-scale dataset, say CIFAR-100, what would the result look like?
(I understand that text-modality supervision tends to be important, so this is just a question out of curiosity, and general or coarse-grained answers would be fine. I hope it does not sound silly :D)