I've been getting a lot of feedback from users on how long MMvec takes to run. I'll integrate these comments into the README eventually.
Below are some tips on how to speed up training; a combined example command is sketched after the list.
- Try to get GPUs. You're going to get a 10x boost in runtime off the bat. These days, you can use Google Colab to get free GPUs (first come, first served, and you may get booted), or you can rent GPUs at AWS. See the `--arm-the-gpu` flag.
- Increase your batch size, as large as you can. I'm talking 100,000 reads at a time or more. The larger it is, the faster your iterations will complete, and it'll also reduce the noise in your training. This is particularly true for GPUs, though you'll probably have a tight upper cap on how much you can load.
- Once you have settled on your batch size, bump up your learning rate (e.g. to about 0.1). This lets gradient descent take larger steps, which also speeds up convergence. Having a large batch size will help with this.
- Watch your epochs. You probably don't need more than 100 epochs. This can be justified by the cross-validation plots.
- Reduce the number of summary intervals. You probably don't need to record summaries every second, and having too many summaries will bog down your training time. Recording every minute or so should be fine. Setting `--p-summary-interval 60` will record a summary every 60 seconds.
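Putting these together, a run following the tips above might look something like the sketch below. It's illustrative only: the file names are placeholders, and apart from `--p-summary-interval` and `--arm-the-gpu` mentioned above, the exact parameter spellings can differ between the QIIME 2 plugin and the standalone interface, so check `qiime mmvec paired-omics --help` (or `mmvec --help`) for your installed version.

```bash
# Illustrative sketch only: file names are placeholders, and parameter
# spellings other than --p-summary-interval are assumptions -- verify them
# with `qiime mmvec paired-omics --help` before running.
# Large batch, larger learning rate, ~100 epochs, summaries once a minute.
qiime mmvec paired-omics \
  --i-microbes microbes.qza \
  --i-metabolites metabolites.qza \
  --p-batch-size 100000 \
  --p-learning-rate 0.1 \
  --p-epochs 100 \
  --p-summary-interval 60 \
  --o-conditionals conditionals.qza \
  --o-conditional-biplot biplot.qza
```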
Unfortunately, this type of tuning is on a case-by-case basis, since hardware and datasets come in all sorts of shapes and sizes. We'll try to make this easier in future iterations.