We have netcdf-4 logging and it has a lot of useful information. Here at NOAA it's being used to debug problems on big HPC systems.
One piece of information that would be super useful is timing info for data reads/writes.
What I have in mind is a new constant for nc_set_log_level(), which would turn on timing of reads/writes, and cause that to be output to the log(s). This would help large data producers/readers when trying to figure out their IO performance on HPC systems.
IO is becoming very much the limiting factor. Computation is no problem, but writing all that data is taking too long! Detailed info on what is taking up the time would help users optimize large modeling systems.
Caching would be in play, and that certainly complicates the situation, but right now users don't even have a good idea of how each model is using I/O. Overall numbers would help them adjust the caching to improve performance.
What I have in mind is something very simple, just a few extra lines of code to provide basic read/write times in the log. Of course, the profiler is also available to anyone who wants more detailed info.
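To make the "few extra lines" idea concrete, here is a minimal sketch of what timing a read/write call and writing the result to a log could look like. It is not netCDF code: `timed_io`, `io_fn`, and `stand_in_write` are hypothetical names made up for illustration, and the stand-in function only takes the place of a real data write. The only real API used is POSIX `clock_gettime` with `CLOCK_MONOTONIC`.

```c
#define _POSIX_C_SOURCE 199309L /* for clock_gettime and CLOCK_MONOTONIC */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical signature for an I/O operation we want to time. */
typedef int (*io_fn)(void *arg);

/* Difference between two monotonic timestamps, in seconds. */
static double elapsed_seconds(const struct timespec *start,
                              const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Run fn(arg), measure wall-clock time around it, and write one line to
 * the log if a log stream is given. Returns the status from fn. */
static int timed_io(const char *label, io_fn fn, void *arg,
                    FILE *log, double *seconds_out)
{
    struct timespec start, end;
    int status;
    double secs;

    assert(fn != NULL);

    clock_gettime(CLOCK_MONOTONIC, &start);
    status = fn(arg);
    clock_gettime(CLOCK_MONOTONIC, &end);

    secs = elapsed_seconds(&start, &end);
    if (log != NULL)
        fprintf(log, "%s: status=%d elapsed=%.6f s\n", label, status, secs);
    if (seconds_out != NULL)
        *seconds_out = secs;
    return status;
}

/* Stand-in for a real write: counts how often it was called. */
static int stand_in_write(void *arg)
{
    int *calls = arg;
    (*calls)++;
    return 0;
}
```

A caller would wrap each read or write, for example `timed_io("write", stand_in_write, &calls, stderr, NULL)`. With caching involved, a number measured this way only reflects the time spent inside the call, not necessarily the time to reach disk, which is why it is best read as an overall figure.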
In PIO I added optional support for MPE. This is a little more involved, but it gives excellent output for parallel programs, something like this: