@akutuzov suggests producing an updated set of monolingual BERTs on the 3.0 data. @oepen asks, half-skeptically, how these would improve over the ones we have on the 2.0 data, seeing as we do not really believe we have substantially better data now, just a lot more of it. But for BERT training, we only use a small sample anyway ...
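A minimal sketch of the subsampling point, for concreteness (the file name, sample size, and seed below are hypothetical, not our actual pipeline): since pretraining draws a fixed-size random sample of sentences, a larger 3.0 corpus would mostly feed a same-sized, if perhaps more diverse, sample than 2.0 did.

```python
import random

def sample_sentences(corpus_path, k=10_000_000, seed=42):
    """Reservoir-sample k lines (sentences) from a corpus too large to load at once."""
    rng = random.Random(seed)
    reservoir = []
    with open(corpus_path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i < k:
                reservoir.append(line)
            else:
                # Replace an existing entry with probability k / (i + 1),
                # keeping the sample uniform over all lines seen so far.
                j = rng.randint(0, i)
                if j < k:
                    reservoir[j] = line
    return reservoir

# The sample size is fixed regardless of corpus size, so growing the
# corpus from 2.0 to 3.0 changes what can land in the sample, not how
# much of it the BERT pretraining actually sees.
sample = sample_sentences("corpus-3.0.txt")
```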