XLNet cached memory/recurrence on segments for fine-tuning #1474

paultran47 · 2022-11-08T20:29:40Z

Hi there,

Apologies if this is a dumb question, but does fine-tuning XLNet make use of the cached memory (segment recurrence) feature for long sequences?

My preliminary understanding is that XLNet gets around the fixed segment length limitation that BERT has, and this discussion in this repo seems to show that pretraining was done with cached memory.
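To make concrete what I mean by cached memory across segments, here is a rough pure-Python sketch of the idea as I understand it. It is not how simpletransformers or any XLNet implementation actually does it: the function name `iter_segments_with_memory` and the parameters `seg_len` and `mem_len` are made up for illustration, and the "memory" is just raw token ids, whereas I assume the real model caches hidden states.

```python
from typing import List, Tuple


def iter_segments_with_memory(
    tokens: List[int], seg_len: int, mem_len: int
) -> List[Tuple[List[int], List[int]]]:
    """Split a long sequence into segments, pairing each with cached memory.

    Hypothetical helper for illustration only. The memory is the last
    `mem_len` token ids seen before the current segment; a real model
    would cache hidden states rather than token ids.
    """
    memory: List[int] = []
    pairs: List[Tuple[List[int], List[int]]] = []
    for start in range(0, len(tokens), seg_len):
        segment = tokens[start:start + seg_len]
        # The current segment can "see" the memory carried over from earlier segments.
        pairs.append((list(memory), segment))
        # Keep only the most recent mem_len positions for the next segment.
        memory = (memory + segment)[-mem_len:] if mem_len > 0 else []
    return pairs


if __name__ == "__main__":
    for mem, seg in iter_segments_with_memory(list(range(10)), seg_len=4, mem_len=3):
        print("memory:", mem, "segment:", seg)
```

With `mem_len=0` every segment is processed in isolation, which is the fixed-length behaviour I associate with BERT.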

But when using XLNet through simpletransformers, is there a particular setting I need to enable for cached memory to be used during fine-tuning? Or is it already on by default?

Thanks!
