
continue training #14

Open
geovedi opened this issue Apr 27, 2017 · 2 comments

Comments

@geovedi

geovedi commented Apr 27, 2017

I'm planning to train on a large dataset that might be too big for my GPU card, and I'm thinking of splitting it into several chunks. Is there any way to continue training? The train command refuses to start training on an existing model.

@odashi
Owner

odashi commented May 4, 2017

Hello, thank you for the comment!

The train command currently supports only a single run, so some additional code is needed to resume a training process.
Okay, I'll develop this option soon.

Fortunately, the model directory already contains enough information to resume the training process:

  • the two latest parameter files
  • config.ini
  • previous evaluation scores in training.log (or separate log files, 18b5e32)

If you'd like to implement it yourself, train.cc and decode.cc may help you develop the code.
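To make the list above concrete, here is a minimal sketch (in Python for illustration; the tool itself is C++) of how a resume step could locate that state inside a model directory. The parameter file naming scheme (`params.<epoch>`), the `[Train]` section, and the log format are hypothetical; only `config.ini` and `training.log` are named in the thread.

```python
import configparser
import os
import tempfile

def find_resume_state(model_dir):
    """Return (config, latest parameter file, last log line) from a model dir."""
    config = configparser.ConfigParser()
    config.read(os.path.join(model_dir, "config.ini"))

    # Pick the newest parameter file by its embedded epoch number,
    # assuming hypothetical names like "params.<epoch>".
    params = [f for f in os.listdir(model_dir) if f.startswith("params.")]
    latest = max(params, key=lambda f: int(f.split(".")[1])) if params else None

    # The last line of training.log would hold the most recent evaluation score.
    log_path = os.path.join(model_dir, "training.log")
    last_line = None
    if os.path.exists(log_path):
        with open(log_path) as fp:
            lines = fp.read().splitlines()
            last_line = lines[-1] if lines else None
    return config, latest, last_line

# Usage with a throwaway model directory:
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "config.ini"), "w") as fp:
        fp.write("[Train]\nn_epochs = 10\n")
    for epoch in (3, 4):
        open(os.path.join(d, "params.%d" % epoch), "w").close()
    with open(os.path.join(d, "training.log"), "w") as fp:
        fp.write("epoch 3: BLEU 20.1\nepoch 4: BLEU 21.5\n")
    cfg, latest, last = find_resume_state(d)
    print(latest, last)  # params.4 epoch 4: BLEU 21.5
```

A real implementation would then rebuild the model from the config, load the latest parameters, and continue the epoch counter from the log.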

@odashi
Owner

odashi commented May 4, 2017

Oops, I found some problems with re-loading parameters.
The current train command saves the structure of the translation model and its parameters, but doesn't preserve the link between the trainer and the parameters, which is required for model updates.
This comes from a policy difference between this tool and the DyNet backend, and it might take me somewhat longer to fix this issue...
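A generic illustration of why this link matters (plain Python, not DyNet code): optimizers with per-parameter state, such as momentum SGD, keep a velocity buffer inside the trainer. If a checkpoint saves only the parameters and the trainer is recreated from scratch on reload, the resumed run diverges from a run that never stopped. The numbers below are a made-up toy, not the tool's actual behavior.

```python
def momentum_step(param, velocity, grad, lr=0.1, mu=0.9):
    """One momentum-SGD update; the velocity lives in the trainer, not the model."""
    velocity = mu * velocity - lr * grad
    return param + velocity, velocity

# Continuous training: two steps with the same gradient.
p, v = 1.0, 0.0
p, v = momentum_step(p, v, grad=1.0)        # step 1
p_cont, v = momentum_step(p, v, grad=1.0)   # step 2 reuses the velocity

# "Resume" after saving only the parameter: the velocity resets to zero.
p, v = 1.0, 0.0
p, v = momentum_step(p, v, grad=1.0)        # step 1, then checkpoint
p_resume, _ = momentum_step(p, 0.0, grad=1.0)  # trainer state lost on reload

print(p_cont, p_resume)  # the two runs disagree after the reload
```

So serializing the model alone is not enough; the trainer's internal state and its association with each parameter must also be saved or rebuilt.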
