Some questions about implementation details #13
Comments
I have the same question.
I also wonder how 1 is done. I think it is reasonable that the authors only care about recall. The activated neurons contribute most of the output, while non-activated neurons matter far less, so we want to find all activated neurons to preserve model accuracy. Neurons that are not activated but are predicted as activated have no negative impact on the results, whereas activated neurons predicted as not activated do have a significant impact. That's why recall is the metric used.
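To make the asymmetry concrete, here is a small sketch with made-up activation labels (the arrays are hypothetical, not from the paper): a predictor that over-predicts still gets perfect recall, and the false positives only cost extra compute, not accuracy.

```python
import numpy as np

# Hypothetical example: true vs. predicted activation for 8 neurons.
true_active = np.array([1, 0, 1, 1, 0, 0, 1, 0])  # 1 = neuron actually fires
predicted   = np.array([1, 1, 1, 1, 0, 1, 1, 0])  # predictor over-predicts

# Recall: fraction of truly active neurons the predictor caught.
tp = np.sum((true_active == 1) & (predicted == 1))
fn = np.sum((true_active == 1) & (predicted == 0))
recall = tp / (tp + fn)
print(recall)  # 1.0 -- no active neuron was missed

# Precision is lower, but false positives only mean computing a few
# neurons that turn out to be zero, so accuracy is unaffected.
fp = np.sum((true_active == 0) & (predicted == 1))
precision = tp / (tp + fp)
print(round(precision, 2))  # 0.67
```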
Hi guys, I would like to ask whether the term 'activated neurons' in the FFN in the paper refers to a row or a column of parameters in a linear layer. For example, suppose a network has a single linear layer (256, 512), with input x of shape (1, 256) and output of shape (1, 512). For predicting neuron activation, should the MLP predictor take x as input and output a (1, 256) or a (1, 512) tensor as the activation_mask indicating which rows/columns of the weights are activated? I don't know if I understand it correctly.
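For what it's worth, my understanding (a sketch with random weights, not the paper's code) is that a "neuron" is one hidden unit of the FFN, i.e. one output column of the first projection together with the matching input row of the second projection, so the mask for your example would be (1, 512):

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_ffn = 256, 512
x  = rng.standard_normal((1, d_model))
W1 = rng.standard_normal((d_model, d_ffn))  # first FFN projection
W2 = rng.standard_normal((d_ffn, d_model))  # second FFN projection

# One "neuron" = one hidden unit: a column of W1 plus the matching
# row of W2. A predictor's mask therefore has shape (1, d_ffn).
h = np.maximum(x @ W1, 0.0)  # ReLU activations, shape (1, 512)
mask = h > 0.0               # which neurons actually fired

# Sparse evaluation: keep only active columns of W1 / rows of W2.
idx = np.nonzero(mask[0])[0]
out_sparse = h[:, idx] @ W2[idx, :]
out_dense  = h @ W2          # inactive neurons contribute exactly 0

print(np.allclose(out_sparse, out_dense))  # True
```

With a perfect (oracle) mask the sparse and dense outputs match exactly; a learned predictor only needs high recall on `mask` for them to stay close.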
Hi, dejavu is really fascinating! Thanks a lot for releasing the corresponding code.
I have some questions about implementation details.
Thank you! Hope to hear from you!