generated from freelawproject/new-project-template
Open
5 of 8 issues completed
Description
Pretrain a domain-adapted ModernBERT model on docket entry description text.
Run initial experiments on a small set of ~6M entries to compare strategies, including:
- Vanilla masked language modeling (MLM) with ModernBERT models
- Training small models from scratch
- Finetuning models with a distillation loss + MLM
- Finetuning "sliced" variants of ModernBERT models
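For the vanilla MLM strategy in the list above, the masking step can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's training code: it uses the BERT-style 80/10/10 corruption scheme, the function name `mask_tokens`, the `-100` ignore label, and the default `mask_prob=0.15` are all choices made here and do not come from the issue. In practice the masking rate is a hyperparameter worth tuning in the ~6M-entry experiments.

```python
import random

# Label value meaning "do not compute loss at this position".
# -100 is a common convention; it is an assumption here, not from the issue.
IGNORE_INDEX = -100


def mask_tokens(token_ids, mask_id, vocab_size, mask_prob=0.15,
                special_ids=(), rng=None):
    """Return (inputs, labels) for one MLM training example.

    Each non-special token is selected with probability `mask_prob`.
    A selected token is replaced by `mask_id` 80% of the time, by a
    random vocabulary id 10% of the time, and left unchanged 10% of
    the time. Labels hold the original id at selected positions and
    IGNORE_INDEX everywhere else.
    """
    rng = rng or random.Random()
    inputs = list(token_ids)
    labels = [IGNORE_INDEX] * len(inputs)
    for i, tok in enumerate(token_ids):
        # Never corrupt special tokens; skip positions that are not selected.
        if tok in special_ids or rng.random() >= mask_prob:
            continue
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id
        elif roll < 0.9:
            # Random replacement; may occasionally draw the same id again.
            inputs[i] = rng.randrange(vocab_size)
        # Otherwise keep the original token but still predict it.
    return inputs, labels
```

Passing a seeded `random.Random` as `rng` makes the masking reproducible, which helps when comparing strategies on the same data.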
Then launch a larger training run on ~50M entries using the best-performing strategies.
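For the "distillation loss + MLM" strategy among the experiments above, one common formulation combines the hard-label cross-entropy at a masked position with a temperature-softened KL term against a teacher model. The sketch below assumes that formulation; the issue does not say which distillation objective is intended, and the names `mlm_distill_loss`, `alpha`, and `temperature` are hypothetical. It uses only the standard library and operates on a single masked position, so it shows the arithmetic rather than a batched training loss.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits / temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def mlm_distill_loss(student_logits, teacher_logits, label,
                     alpha=0.5, temperature=2.0):
    """Combined loss for one masked position.

    loss = alpha * CE(student, label)
         + (1 - alpha) * T^2 * KL(teacher_T || student_T)

    where the _T distributions are softened by `temperature`.
    The weighting and the T^2 factor are assumptions of this sketch.
    """
    # Hard-label MLM term: negative log-probability of the true token.
    ce = -log_softmax(student_logits)[label]
    # Soft-label term: KL divergence from teacher to student.
    s = log_softmax(student_logits, temperature)
    t = log_softmax(teacher_logits, temperature)
    kl = sum(math.exp(tl) * (tl - sl) for tl, sl in zip(t, s))
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl
```

Setting `alpha=1.0` recovers plain MLM cross-entropy, which gives a direct baseline against the vanilla strategy when sweeping the weighting.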