We haven't run any experiments with this model on non-English data, but it should work for other languages out of the box.
The quickest way to try this model with French data is to use a pre-trained BERT-like model, for example, model.language_model.pretrained_model_name=bert-base-multilingual-cased or amine/bert-base-5lang-cased. To prepare data for the punctuation and capitalization tasks, please see this tutorial. The Tatoeba dataset contains French sentences as well; you would need to modify this line to get the French data.
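
For illustration, here is a rough sketch of what the French data preparation could look like. It is not the script from the repo. It assumes the Tatoeba export is tab-separated with an id, an ISO 639-3 language code ("fra" for French), and the sentence, and that labels are two characters per word: the punctuation mark that follows the word (O for none), then U or O for capitalization. The helper names `french_sentences` and `make_example` are made up for this example, so please check the tutorial for the exact format your version expects.

```python
import re

SUPPORTED_PUNCT = {",", ".", "?"}  # assumed label set; adjust to your setup


def french_sentences(lines):
    """Yield the sentence text of rows tagged 'fra' (assumed id<TAB>lang<TAB>text)."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 3 and parts[1] == "fra":
            yield parts[2]


def make_example(sentence):
    """Return (text, labels): lowercased words and one 2-char label per word."""
    # Drop punctuation outside the supported set (e.g. '!', ';', quotes).
    cleaned = re.sub(r"[^\w\s'’\-,.?]", " ", sentence)
    # Detach supported marks so 'va?' and the French 'va ?' tokenize alike.
    tokens = re.sub(r"([,.?])", r" \1 ", cleaned).split()
    words, labels = [], []
    for tok in tokens:
        if tok in SUPPORTED_PUNCT:
            if labels:  # attach the mark to the preceding word
                labels[-1] = tok + labels[-1][1]
            continue
        if not any(ch.isalnum() for ch in tok):
            continue  # skip stray hyphens or apostrophes
        words.append(tok.lower())
        labels.append("O" + ("U" if tok[0].isupper() else "O"))
    return " ".join(words), " ".join(labels)


if __name__ == "__main__":
    rows = ["1\teng\tHello.\n", "2\tfra\tBonjour, comment ça va ?\n"]
    for sent in french_sentences(rows):
        text, labels = make_example(sent)
        print(text)
        print(labels)
```

Writing the two returned strings line by line into parallel text and labels files gives one training example per line. Numbers with decimal points and abbreviations would need extra handling that this sketch leaves out.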

ekmb Jun 8, 2021
Collaborator

ekmb Jun 15, 2021
Collaborator

Answer selected by ekmb