Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Venado optimizations #755
base: master
Are you sure you want to change the base?
Venado optimizations #755
Changes from 10 commits
e827603
096d3fe
eac6576
e7a8fc0
b1dcbb1
7ded60f
87d02dd
d766123
84be6e9
559bc8e
fc03d39
0fc96d2
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a use case for setting N after construction. Is it to avoid constructing a new matrix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeanlucf22 Yes. This was addressed in the commit comment but the bullets needed fixing to clarify (now corrected). This is to avoid repeated memory allocations which became a bottleneck in MD simulations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks dangerous to me: you modify N without modifying anything else will lead to matrices in inconsistent state!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mention in the comments, it is unsafe. It's possible to change N so that the matrix will exceed its original allocation. It's exposed only as bml_set_N_dense(), for experts. It works. Can you explain how the matrix will be inconsistent? It's needed for an application, we need to figure out how to make this method available in some form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If needed we can protect against exceeding initial size by adding an extra variable to the struct keeping track of the size of the originally allocated matrix. But it's less intrusive to leave the struct alone...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the performance issue comes from the allocation of the struct members domain and domain2:
bml/src/C-interface/dense/bml_types_dense.h
Line 30 in ef2ee26
I looked into it at some point, but the fix I tried was breaking qmd-progress.
These two struct members could be static or initialized as needed.
Increasing N would obviously lead to a memory allocation too small to hold an NxN matrix.
What is the use case? Is N decreasing and you want to reuse the allocation? Whatever it is, I think looking into domain and domain2 is a safer and better alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is needing to use a MAGMA working array of differing size on different MD iterations, where max N is known. The problem is that creating a new array each time is slow, due to the GPU allocation. The solution is to allocate using max N once and to resize the array using bml_set_N_dense() as needed. How can I use domain and domain2 to do this? I'm not familiar with those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand now.
I was not suggesting to use domain and domain2. I was just saying their allocation may be the main culprit when it comes to allocation time for a dense matrix. Maybe an issue to deal with another time.
Another suggestion: having a function resizeNoAlloc(int n) that would just change N if n<=N, otherwise would change N and reallocate memory? Having an extra struct member keeping track of allocated memory size would be good in that case.