A-GEM implementation #448

Answered by AndreaCossu
JonasFrey96 asked this question in Q&A
Mar 24, 2021 · 2 comments · 2 replies

Hi @JonasFrey96 😄 In our experiments, GEM works quite well with single-head models, while A-GEM performs poorly. With multi-head models, A-GEM recovers part of that performance and shows less catastrophic forgetting. The same holds for Synaptic Intelligence: the original paper uses multi-head models, so the strategy performs poorly with a single head.
If you have managed to make A-GEM work with a single-head model, please let us know, since that could indicate a bug in our implementation. So far, however, we have seen no sign of one.
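
For readers comparing their own code against the update being discussed, here is a minimal sketch of the A-GEM gradient projection as it is usually described: when the current-task gradient conflicts with a reference gradient computed on a memory batch, the conflicting component is removed. This is not the implementation discussed in this thread; the names `dot` and `agem_project` are hypothetical, and gradients are plain flat lists rather than framework tensors.

```python
def dot(u, v):
    """Dot product of two equal-length flat gradient vectors."""
    return sum(a * b for a, b in zip(u, v))


def agem_project(grad, ref_grad):
    """Project `grad` so it no longer conflicts with `ref_grad`.

    grad:     gradient of the loss on the current mini-batch
    ref_grad: gradient of the loss on a batch sampled from episodic memory
    (Both names are illustrative assumptions, not a library API.)
    """
    g_dot_ref = dot(grad, ref_grad)
    if g_dot_ref >= 0:
        # No conflict: the update does not increase the memory loss
        # to first order, so the gradient is used unchanged.
        return list(grad)
    # Conflict: subtract the component of grad along ref_grad.
    # ref_grad is necessarily non-zero here because g_dot_ref < 0.
    scale = g_dot_ref / dot(ref_grad, ref_grad)
    return [g - scale * r for g, r in zip(grad, ref_grad)]


# A conflicting gradient: its dot product with ref_grad is negative.
projected = agem_project([1.0, -2.0], [0.0, 1.0])
# After projection the gradient is orthogonal to ref_grad.
print(dot(projected, [0.0, 1.0]))
```

The projection itself is the same for single-head and multi-head models; the head configuration only changes which outputs contribute to the two losses whose gradients are compared.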

Answer selected by JonasFrey96
3 participants
This discussion was converted from issue #445 on March 25, 2021 07:35.