"Buddy Allocator" in DART: Status #220
Comments
@devreal We got in touch with the author of the original buddy allocator code, and there is no problem with respect to copyright. The code does not match our requirements anyhow, so replacing it is an ancient refactoring ToDo. If you see a way to replace it without putting too much effort into it, I will help as best as I can.
Thanks for the clarification. I think the permission should be documented somewhere. The question still remains: Do we really need to have a custom allocator implementation? The buddy allocator is only used in An additional drawback of using a custom allocator is the loss of memory checking ability such as double free checks in libc, usage of valgrind, you name it.
I agree with @devreal, however local memory allocation is not a collective team allocation. If we do not use a Buddy Allocator we still need a data structure to internally track all locally allocated memory segments and free them either at runtime if not needed anymore or at |
Good point. But we do not need an internal data structure to free the memory at runtime, i.e., by calling On a more general point: What is the point of having
which it apparently does not because the buddy allocator only uses local memory (I put a |
Hmyaa, the wording is specific. Memory is allocated in the unit's global memory space (because it is accessible via global pointer), but not in the unit's shared global memory space. If I remember correctly, I use But I'm not religious about this part of the DART interface in particular.
Actually there are two things to clarify. At the moment we have two windows:
The question is actually what does LOCAL memory allocation mean? And does this in turn require a buddy allocator? It is currently used in only a few places. One use case is for example None of the examined examples really needs a buddy allocator. We can do it the way as described |
The original idea was to have two different kinds of memory allocation in DART: (A) A collective operation: every unit in a team participates and contributes an equal amount of memory. Since window creation in MPI is always a collective operation, in DART-MPI (B) can only work by pre-allocating a chunk of memory and managing that chunk manually. This is where the buddy allocator comes in. It was originally used for a similar purpose in the DART-CUDA implementation and was at some point incorporated by Huan into the DART-MPI code. It's true that this feature (the non-collective allocation) is not currently used much in DASH, but I would retain it.
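To make "pre-allocating a chunk and managing it manually" concrete, here is a minimal sketch of a buddy-style allocator that hands out offsets into a fixed pool instead of pointers. It is not the code used in DART: every name (`offset_buddy`, `ob_init`, `ob_alloc`, `ob_free`, `POOL_LEVELS`) is hypothetical, the pool is kept tiny for readability, and the sketch is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; this is not the DART implementation. */
#define POOL_LEVELS 4u                    /* tree depth below the root */
#define POOL_SIZE   (1u << POOL_LEVELS)   /* 16 bytes, smallest block 1 byte */
#define NODE_COUNT  (2u * POOL_SIZE)      /* heap-style indices 1..2*POOL_SIZE-1 */

enum { NODE_FREE = 0, NODE_SPLIT = 1, NODE_USED = 2 };

typedef struct {
    uint8_t state[NODE_COUNT];            /* one byte of state per tree node */
} offset_buddy;

static void ob_init(offset_buddy *b)
{
    memset(b->state, NODE_FREE, sizeof b->state);
}

/* Allocate `want` bytes (a power of two) inside node `n`, which covers
 * [off, off + size). Returns the block offset, or -1 if nothing fits. */
static long ob_alloc_in(offset_buddy *b, unsigned n, size_t off,
                        size_t size, size_t want)
{
    if (b->state[n] == NODE_USED || size < want)
        return -1;
    if (size == want) {
        if (b->state[n] != NODE_FREE)
            return -1;
        b->state[n] = NODE_USED;
        return (long)off;
    }
    uint8_t prev = b->state[n];
    b->state[n] = NODE_SPLIT;
    long r = ob_alloc_in(b, 2 * n, off, size / 2, want);
    if (r < 0)
        r = ob_alloc_in(b, 2 * n + 1, off + size / 2, size / 2, want);
    if (r < 0)
        b->state[n] = prev;               /* restore state if neither half fits */
    return r;
}

/* Round the request up to a power of two and search from the root. */
static long ob_alloc(offset_buddy *b, size_t bytes)
{
    size_t want = 1;
    if (bytes == 0 || bytes > POOL_SIZE)
        return -1;
    while (want < bytes)
        want <<= 1;
    return ob_alloc_in(b, 1, 0, POOL_SIZE, want);
}

/* Release the block starting at `off` and merge free buddies upwards. */
static void ob_free(offset_buddy *b, size_t off)
{
    unsigned n = 1;
    size_t lo = 0, size = POOL_SIZE;
    assert(off < POOL_SIZE);
    while (b->state[n] == NODE_SPLIT) {   /* descend to the block holding off */
        size /= 2;
        if (off < lo + size) {
            n = 2 * n;
        } else {
            n = 2 * n + 1;
            lo += size;
        }
    }
    assert(b->state[n] == NODE_USED && lo == off);
    b->state[n] = NODE_FREE;
    while (n > 1) {                       /* merge while both buddies are free */
        unsigned parent = n / 2;
        if (b->state[2 * parent] != NODE_FREE ||
            b->state[2 * parent + 1] != NODE_FREE)
            break;
        b->state[parent] = NODE_FREE;
        n = parent;
    }
}
```

A caller would turn a returned offset into an address with something like `base + off`, where `base` points at the separately allocated pool. The allocator itself never touches that memory; it only does bookkeeping on the tree.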
Ahh thanks for your clarifications. I was not aware of the subtle difference between global and shared global memory... I understand that there is a need for non-collective allocations and I can see the use-case. Two points to make here:
Given the explanations above, I would make the case for keeping |
Argh, it's only after I started to shuffle around code to get rid of the buddy allocator that I understood what the buddy allocator is used for. Good documentation ftw! Turns out the allocator does not actually allocate memory from the OS but returns an index to be used for some externally allocated chunk of memory. To add to the confusion, the variable holding the "allocator" instance in DART is called Google did a good job in translating the Chinese blog post, which is literally the only source of documentation at all:
Anyway, I realized that in Fun fact I discovered: the way we currently use it (16MB with 24 levels in the binary tree --> smallest chunk size 1 byte), the allocator uses 32MB of metadata to manage these 16MB. Maybe going down to 21 levels (smallest chunk 8 bytes) would be sufficient and would only require 4MB of metadata. I apologize for my lack of understanding earlier and the proposals made based on it.
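The numbers above can be reproduced with a few lines of arithmetic. The sketch below assumes one byte of state per node of a full binary tree, which is consistent with the 32MB and 4MB figures quoted but is an assumption about the implementation, not something confirmed in this thread. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Metadata size for a full binary tree with `levels` levels below the root,
 * ASSUMING one byte of state per node: 2^(levels+1) - 1 nodes in total. */
static size_t buddy_metadata_bytes(unsigned levels)
{
    assert(levels < 8 * sizeof(size_t) - 1);   /* keep the shift in range */
    return ((size_t)2 << levels) - 1;
}

/* Smallest chunk the tree can hand out: the pool is halved once per level. */
static size_t buddy_min_chunk_bytes(size_t pool_bytes, unsigned levels)
{
    return pool_bytes >> levels;
}
```

With a 16MB pool (`(size_t)16 << 20`), 24 levels give a 1-byte minimum chunk and 2^25 - 1 bytes (about 32MB) of metadata, while 21 levels give an 8-byte minimum chunk and 2^22 - 1 bytes (about 4MB).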
I've been meaning to close this as we will not get rid of the buddy allocator anytime soon (unless someone wants to get their hands dirty...). Closing it as mostly fixed in #118.
Inside DART, we use a buddy allocator that was taken from https://github.com/cloudwu/buddy. However, there is no license shipped with it. I assume that we cannot just use it without explicit permission?
There is a refactoring todo but so far no one seems to have touched it. Do we actually need our own allocator here? Can we show that `malloc`/`free` is significantly slower? As it stands right now, this allocator is not thread-safe and the implementation in general does not suit even our (non-existent) code quality standards. Any reason to keep it?