Hi Jiwei! Why you try a little bit more bigger learning rate in your training phase? Or You try some bigger lr like `lr = 3e-4` or `lr = 1e-6`, can you suggest some useful experiential value?