Any lessons from Imbue for training-in-the-large? #270

lukstafi · 2024-07-04T09:04:15Z

https://imbue.com/research/70b-infrastructure/

"In the span of a few months, with a small team of researchers and engineers, we trained a 70B parameter model from scratch on our own infrastructure that outperformed zero-shot GPT-4o on reasoning-related tasks.

Today, we’re sharing an end-to-end guide for setting up the required infrastructure: from bringing up the initial cluster and installing the OS, to automatically recovering from errors encountered during training."

lukstafi · 2024-07-13T16:10:04Z

Also, from llm.c:
karpathy/llm.c#677

lukstafi added the "explore" label (Priority below "enhancement", non-blocking for milestones) on Jul 4, 2024
