Question about <unk> in the subsample function of chapter_natural-language-processing-pretraining

In the subsample function of this section, why is the <unk> token discarded when computing the word counts, but not added back to the sentences afterwards?
This may harm the model's ability to learn rare tokens.
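
To make the question concrete, here is a minimal sketch of the behaviour as I understand it. It is not the book's exact code: `subsample_sketch`, `known_tokens`, and the `t` and `seed` parameters are names I made up for illustration, and a plain set stands in for the vocabulary lookup.

```python
import collections
import math
import random


def subsample_sketch(sentences, known_tokens, t=1e-4, seed=0):
    """Drop unknown tokens, then subsample frequent ones (simplified)."""
    rng = random.Random(seed)
    # Step 1: tokens outside the vocabulary (those that would map to <unk>)
    # are removed from the sentences themselves, not only from the count.
    sentences = [[tok for tok in line if tok in known_tokens]
                 for line in sentences]
    counter = collections.Counter(tok for line in sentences for tok in line)
    num_tokens = sum(counter.values())

    # Step 2: keep a token with probability min(1, sqrt(t / f(tok))),
    # where f(tok) = counter[tok] / num_tokens is its relative frequency.
    def keep(tok):
        return rng.uniform(0, 1) < math.sqrt(t / counter[tok] * num_tokens)

    return [[tok for tok in line if keep(tok)] for line in sentences], counter


# Toy data: 'zyx' and 'qqq' are assumed to be out-of-vocabulary words.
sentences = [['the', 'zyx', 'cat'], ['the', 'dog', 'qqq']]
known_tokens = {'the', 'cat', 'dog'}

# t=1.0 makes the keep probability 1 for every token, so only Step 1 acts.
kept, counter = subsample_sketch(sentences, known_tokens, t=1.0)
print(kept)  # [['the', 'cat'], ['the', 'dog']]
```

In this toy run the unknown words never come back, so 'the' and 'cat' end up adjacent even though another word originally sat between them. That change in the context windows is what I am asking about.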