Nan Values During DeePMD Training #4461
VenkatanarayananMridhula asked this question in Q&A (unanswered)
-
Just to confirm: did you use the latest version?
-
Hello,
I am currently using DeePMD to develop a model for predicting the energies of a single molecule (C₂H₄) using the se_e2_a descriptor. My dataset consists of over 5000 frames sampled from various configurations, which I have split into training, validation, and test sets in a 70:20:10 ratio. The model is trained for 1 million steps. During training, the training and validation RMSE both decrease steadily and closely overlap, as illustrated in the attached plot. However, NaN values appear near the end of training, specifically around step 944,500. The training parameters were taken from the supplementary material of the paper "End-to-End Symmetry Preserving Inter-Atomic Potential Energy Model for Molecular Dynamics Simulations" for single-molecule systems. I have attached the following files for reference:
input.json
lcurve.txt
Despite the model showing a promising and stable trend throughout most of the training, the occurrence of NaN values near the very end is perplexing. I would greatly appreciate any insights into what might be causing this issue and how it can be resolved.
Thank you for your time and assistance!
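For anyone reproducing this, a minimal sketch for locating the first non-finite entry in the learning curve is given below. It assumes lcurve.txt is whitespace-separated, has header lines starting with "#", and keeps the training step in the first column. The helper name first_nonfinite_step, the column names, and the sample rows are hypothetical and purely illustrative; they are not taken from the attached file.

```python
import math


def first_nonfinite_step(lines):
    """Return (step, column_index) of the first NaN/inf value, or None.

    Assumes each data line is whitespace-separated with the step in
    column 0; lines starting with '#' are treated as headers.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            values = [float(field) for field in line.split()]
        except ValueError:
            continue  # skip lines that are not purely numeric
        # Check every column except the step counter itself.
        for idx, value in enumerate(values[1:], start=1):
            if not math.isfinite(value):
                return int(values[0]), idx
    return None


# Synthetic rows for illustration only (not data from the attached file).
sample = [
    "# step rmse_val rmse_trn lr",
    "944000 1.2e-03 1.1e-03 3.6e-08",
    "944500 nan nan 3.6e-08",
    "945000 nan nan 3.6e-08",
]
result = first_nonfinite_step(sample)
print(result)
```

On the real file, the same function can be called as first_nonfinite_step(open("lcurve.txt")). Knowing which column goes non-finite first (validation or training error) and what the learning rate was at that step can help narrow down the cause.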