Your question

If the number of samples in the final epoch is less than 80% of a standard epoch, `GPTDataset` will separate it. However, the loss appears to drop sharply when entering the final epoch. I suspect the problem is the different data distribution between the final epoch and the standard epochs. Is this expected in the design of `separate_last_epoch`?
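For reference, the 80% decision described above can be sketched roughly as follows. This is an illustrative sketch, not the actual Megatron-LM source: the function name `needs_separate_last_epoch` and its parameters are hypothetical, and only the threshold logic from the question is assumed.

```python
def needs_separate_last_epoch(num_samples: int,
                              samples_per_epoch: int,
                              threshold: float = 0.80) -> bool:
    """Illustrative sketch (not Megatron-LM code): return True when the
    leftover samples in the final, partial epoch amount to less than
    `threshold` of a full epoch, so they would come from a noticeably
    smaller pool if shuffled together with the full epochs."""
    num_full_epochs = num_samples // samples_per_epoch
    samples_in_final_epoch = num_samples - num_full_epochs * samples_per_epoch
    return 0 < samples_in_final_epoch < int(threshold * samples_per_epoch)

# 10 full epochs of 1000 samples plus 300 leftover: 300 < 800, so separated.
print(needs_separate_last_epoch(10_300, 1_000))
# 10 full epochs plus 900 leftover: 900 >= 800, so not separated.
print(needs_separate_last_epoch(10_900, 1_000))
```

Under this logic the final epoch is drawn and shuffled separately from the standard epochs, which would explain a distribution shift (and hence a loss change) at the epoch boundary.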