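Hi, I found a bug in your code. In the training part, you wrote 'if i + epo * len(dataloader) % decay_step == 0 and i != 0:'. Because % has the same precedence as * and binds tighter than +, this is evaluated as 'i + ((epo * len(dataloader)) % decay_step) == 0'. Both terms are non-negative, so the sum is zero only when i == 0, and that case is excluded by 'i != 0'. As a result, the learning rate is never updated. I think the correct version should be 'if (i + epo * len(dataloader)) % decay_step == 0 and i != 0:'.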
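
Here is a small sketch that shows the difference between the two conditions. It is not taken from your training code: `steps_per_epoch` stands in for `len(dataloader)`, and the values (3 epochs, 10 steps per epoch, decay_step of 4) are made up just for illustration.

```python
# Hypothetical stand-ins: steps_per_epoch plays the role of len(dataloader).

def buggy_condition(i, epo, steps_per_epoch, decay_step):
    # Parsed as: i + ((epo * steps_per_epoch) % decay_step) == 0 and i != 0
    return i + epo * steps_per_epoch % decay_step == 0 and i != 0


def fixed_condition(i, epo, steps_per_epoch, decay_step):
    # Parentheses make the modulo apply to the global step count.
    return (i + epo * steps_per_epoch) % decay_step == 0 and i != 0


def count_triggers(condition, epochs=3, steps_per_epoch=10, decay_step=4):
    # Count how many iterations would trigger a learning rate decay.
    return sum(
        condition(i, epo, steps_per_epoch, decay_step)
        for epo in range(epochs)
        for i in range(steps_per_epoch)
    )


buggy_hits = count_triggers(buggy_condition)
fixed_hits = count_triggers(fixed_condition)
print(buggy_hits, fixed_hits)
```

With these example values the original condition never fires, while the parenthesized one fires on the global steps that are multiples of decay_step (except those where i is 0).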