Support array and string re-embedding and resizing #107

Open
wks opened this issue Sep 27, 2024 · 0 comments


wks commented Sep 27, 2024

CRuby's default GC can convert a non-embedded Array or String back into an embedded one during GC. When the GC finds that the object size (i.e. the space allocated by the GC) is big enough to hold the Array or String elements, it copies the elements from the malloc-ed off-heap buffer back into the object itself and makes the Array or String instance embedded. This capability became more important after Variable Width Allocation (VWA) was introduced, because an embedded object can now be as large as 640 bytes, giving it enough space to hold non-trivial strings or arrays.

MMTk, in theory, has greater capability to do this kind of re-embedding. When using an evacuating collector (such as SemiSpace, GenCopy, and Immix-based collectors), MMTk allows an object to be resized when it is copied (in ObjectModel::copy). JikesRVM already takes advantage of this to implement address-based hashing: when copying an object, it adds one extra word in front of the object to accommodate the hash code.
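To make the resize-on-copy idea concrete, here is a minimal, self-contained sketch of address-based hashing in a toy model. It does not use the MMTk or JikesRVM APIs; `ToyObject`, `HashState`, and `copy_hashed_object` are hypothetical names invented for illustration, and the extra word is modelled as an `Option` field rather than a real word in front of the object.

```rust
/// Hash state of a toy object (hypothetical; loosely follows address-based hashing).
#[derive(Clone, Copy, Debug, PartialEq)]
enum HashState {
    /// Nobody has asked for the hash code yet.
    Unhashed,
    /// The hash code was requested; it equals the current address.
    Hashed,
    /// The object moved after being hashed; the hash code lives in an extra word.
    HashedAndMoved,
}

#[derive(Clone, Debug, PartialEq)]
struct ToyObject {
    address: usize,
    state: HashState,
    /// Stands in for the extra word added when a hashed object is copied.
    saved_hash: Option<usize>,
    payload: Vec<u64>,
}

impl ToyObject {
    /// Object size in words, including the extra hash word if present.
    fn size_in_words(&self) -> usize {
        self.payload.len() + usize::from(self.saved_hash.is_some())
    }

    /// Return a hash code that stays stable even if the object is moved later.
    fn hash_code(&mut self) -> usize {
        match self.state {
            HashState::Unhashed => {
                self.state = HashState::Hashed;
                self.address
            }
            HashState::Hashed => self.address,
            HashState::HashedAndMoved => {
                self.saved_hash.expect("moved hashed object must carry its hash")
            }
        }
    }
}

/// Copy an object to `to_address`, growing it by one word if it was hashed.
fn copy_hashed_object(from: &ToyObject, to_address: usize) -> ToyObject {
    let (state, saved_hash) = match from.state {
        HashState::Unhashed => (HashState::Unhashed, None),
        // First move after hashing: remember the old address as the hash code.
        HashState::Hashed => (HashState::HashedAndMoved, Some(from.address)),
        HashState::HashedAndMoved => (HashState::HashedAndMoved, from.saved_hash),
    };
    ToyObject { address: to_address, state, saved_hash, payload: from.payload.clone() }
}
```

The point of the sketch is only that the copy is allowed to have a different size from the original, which is the same property re-embedding would rely on.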

For Ruby, during a copying GC, when copying an Array or String, we can always allocate an embedded Array or String that is big enough to hold all of its elements. (Note that MMTk imposes no limit on object size when allocating.) Then we can copy the elements from the imemo:mmtk_objbuf or imemo:mmtk_strbuf into the newly allocated embedded Array or String, and abandon the strbuf or objbuf.
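The copy-time re-embedding described above could look roughly like the following toy model. This is a sketch under stated assumptions, not the mmtk-ruby implementation: `ToyArray`, `ToyStorage`, `copy_and_reembed`, and the header and word sizes are all hypothetical, and the `External` variant merely stands in for an array whose elements live in an imemo:mmtk_objbuf.

```rust
/// Hypothetical layout constants for the toy model.
const TOY_HEADER_BYTES: usize = 16;
const TOY_WORD_BYTES: usize = 8;

#[derive(Clone, Debug, PartialEq)]
enum ToyStorage {
    /// Elements live inside the object itself.
    Embedded(Vec<u64>),
    /// Elements live in a separate buffer object (stand-in for imemo:mmtk_objbuf).
    External(Vec<u64>),
}

#[derive(Clone, Debug, PartialEq)]
struct ToyArray {
    /// Bytes the GC allocated for this object.
    object_size: usize,
    storage: ToyStorage,
}

/// Copy `from` into a new object sized to hold every element inline.
/// The old external buffer is simply not referenced by the copy, so a
/// tracing GC would reclaim it as garbage.
fn copy_and_reembed(from: &ToyArray) -> ToyArray {
    let elements = match &from.storage {
        ToyStorage::Embedded(elems) => elems.clone(),
        ToyStorage::External(buf) => buf.clone(),
    };
    ToyArray {
        // The copy may be larger than the original object.
        object_size: TOY_HEADER_BYTES + elements.len() * TOY_WORD_BYTES,
        storage: ToyStorage::Embedded(elements),
    }
}
```

For example, copying a 40-byte `ToyArray` whose 100 elements sit in an `External` buffer yields an 816-byte object with the same elements stored as `Embedded`.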

Of course, we can only do this if the Array or String is not shared, a shared root, frozen, or nofree (i.e. it satisfies rb_ary_embeddable_p or rb_str_reembeddable_p). Even then, we should probably only re-embed arrays or strings up to a certain size. Otherwise memory would be wasted if the Array or String shrinks soon after the re-embedding.
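A policy check combining both conditions might be sketched as follows. Everything here is hypothetical: `ToyFlags` only mirrors the four conditions listed above and is not the real definition of rb_ary_embeddable_p or rb_str_reembeddable_p, and the 640-byte cap is an arbitrary placeholder borrowed from the VWA limit mentioned earlier, not a proposed value.

```rust
/// Hypothetical flag set mirroring the conditions listed above.
#[derive(Clone, Copy, Debug, Default)]
struct ToyFlags {
    shared: bool,
    shared_root: bool,
    frozen: bool,
    nofree: bool,
}

/// Placeholder cap on how large a payload we are willing to re-embed.
const TOY_MAX_REEMBED_BYTES: usize = 640;

/// Decide whether to re-embed an Array or String while copying it.
fn should_reembed(flags: ToyFlags, payload_bytes: usize) -> bool {
    // Correctness condition: none of the disqualifying flags may be set.
    let embeddable = !(flags.shared || flags.shared_root || flags.frozen || flags.nofree);
    // Policy condition: avoid large embedded objects that may shrink soon after.
    embeddable && payload_bytes <= TOY_MAX_REEMBED_BYTES
}
```

Keeping the correctness condition separate from the size cap makes it easy to tune the cap later without touching the part that must never be relaxed.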

Related issues

#91 (comment) mentioned a bug where the existing re-embedding code for Array is erroneously executed when using MMTk, without forwarding the members. I'll disable array re-embedding for now and re-enable it later (and do it right).
