Way to relax the constraint on prefix length #2
@inspirit but you need at least one token that outputs a logit
yeah, or maybe the other way is to randomly curtail the prefix during training, in which case it will generalize to being conditioned on anything from a 0-length prefix up to the maximum
feel like the paper should have addressed this, especially if book-level autoregressive generation is the goal here
i think having the prefix and query dynamically sized is best, both for robustness and for inference usage
@inspirit yeah, maybe i'll just have to push this responsibility to the dataloading
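
A minimal sketch of what that could look like on the dataloading side, assuming each sample is a `(seq_len, dim)` tensor of latents; `random_prefix_split` and its `min_query_len` argument are hypothetical names for illustration, not from the repo:

```python
import torch

def random_prefix_split(latents, min_query_len = 1):
    # latents: (seq_len, dim) tensor for a single sample
    # sample a random prefix length, keeping at least one query token
    # so there is always a logit to compute the loss on
    seq_len = latents.shape[0]
    prefix_len = torch.randint(0, seq_len - min_query_len + 1, (1,)).item()
    return latents[:prefix_len], latents[prefix_len:]
```

Sampling a different prefix length per example means the model is trained on everything from an empty prefix up to the maximum, which is what lets it generalize at inference.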
@lucidrains I searched the paper and don't see a prefix length mentioned. I am confused about this prefix length issue: wouldn't you want the prefix length to be the full size of the context window? (I guess you mean the length of the latents)
I think we can get away with having learnable null-latents in case we don't have a prefix initially
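
A rough sketch of the learnable null-latent idea: when no real prefix is available, substitute a small set of learned latents so the model is never conditioned on an empty sequence. `NullPrefix` and its `num_null` parameter are made-up names, not from the paper or the repo:

```python
import torch
from torch import nn

class NullPrefix(nn.Module):
    # learned latents used in place of a real prefix when none exists
    def __init__(self, num_null, dim):
        super().__init__()
        self.null_latents = nn.Parameter(torch.randn(num_null, dim) * 0.02)

    def forward(self, prefix, batch_size):
        # prefix: (batch, prefix_len, dim) or None
        if prefix is None or prefix.shape[1] == 0:
            # broadcast the learned null latents across the batch
            return self.null_latents.unsqueeze(0).expand(batch_size, -1, -1)
        return prefix
```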