-
-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use "interned" String
s when possible to avoid uneccessary heap allocation.
#276
Conversation
…cation. Fixes ohler55#275 - Background info: https://en.wikipedia.org/wiki/String_interning - Available to C extensions since MRI Ruby 3.0: - https://bugs.ruby-lang.org/issues/13381 - https://bugs.ruby-lang.org/issues/16029 This change makes `Ox` use "interned" (frozen and deduplicated) `String`s anywhere possible, the same effect as calling `String#-@` except without the heap churn of `RVALUE` allocation: https://ruby-doc.org/core/String.html#method-i-2B-40 - Uses the `HAVE_<funcname>` preprocessor macros to detect this functionality and avoid breaking older Rubies (confirmed with at least MRI 2.7.2). - I explicitly use interned `String`s anywhere an allocated `String` seemed entirely internal to `Ox`, e.g. in `sax_value_as_time` when allocating the `String` argument to `ox_time_class`. - Adds a new user-facing option `intern_strings` to control the behavior of `String` return values from user-facing functions, e.g. from `sax_as_s`. - Results in a ~25% reduction in startup GC pressure in my library! :D Inspired by similar changes to `json` and `msgpack`: - ruby/json#451 - msgpack/msgpack-ruby#196 My goal is `Object` allocation reduction in the `Ox::Sax` handler of my MIME Type file-identification library caused by the large number of duplicate `String`s in the `shared-mime-info`-format XML it uses as a data source. I was already calling `#-@` (or using the `-some_str_variable` syntax) everywhere possible in my handler, but that only freezes and deduplicates an already-allocated `String`: irb(main):068:0> oid = proc { puts "#{_1} is object_id #{_1.object_id}" } => #<Proc:0x0000564d53eb6850 (irb):66> irb(main):069:0> lol = 'lol'.tap(&oid)[email protected](&oid) lol is object_id 19800 lol is object_id 19740 => "lol" irb(main):070:0> -lol.tap(&oid)[email protected](&oid) lol is object_id 19740 lol is object_id 19740 => "lol" That avoids duplicate retained `String`s but still causes my library to take a big GC hit right at startup when it loads its data. All the stats below were collected using SamSaffron's `memory_profiler`: https://github.com/SamSaffron/memory_profiler First, a sanity-check to make sure I didn't break MRI < 3.0, using MRI 2.7 + latest official `Ox` from RubyGems (2.14.5): [okeeblow@emi#CHECKING-YOU-OUT] ruby -v ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-linux] [okeeblow@emi#CHECKING-YOU-OUT] ./bin/are-we-unallocated-yet|grep Total Total allocated: 20561351 bytes (401372 objects) Total retained: 1680971 bytes (22102 objects) versus same MRI 2.7 but with my patched `Ox`: [okeeblow@emi#CHECKING-YOU-OUT] gem install ../../ox/ox-2.14.5.gem [okeeblow@emi#CHECKING-YOU-OUT] ./bin/are-we-unallocated-yet|grep Total Total allocated: 20561111 bytes (401366 objects) Total retained: 1680971 bytes (22102 objects) No difference between the two, showing that this patch is a no-op for pre-3.0. Sorry I did not test older versions of MRI or other Rubies like J/Truffle/etc since I don't have them on my system. Now the real gainz can be seen when comparing MRI 3.0 with unpatched `Ox`: [okeeblow@emi#CHECKING-YOU-OUT] bundle install Fetching gem metadata from https://rubygems.org/...... … Installing ox 2.14.5 with native extensions Using checking-you-out 0.7.0 from source at `.` Bundle complete! 11 Gemfile dependencies, 19 gems now installed. [okeeblow@emi#CHECKING-YOU-OUT] ./bin/are-we-unallocated-yet|grep Total Total allocated: 20081080 bytes (390133 objects) Total retained: 1713209 bytes (22095 objects) against same MRI 3.0 with my patched `Ox`: [okeeblow@emi#CHECKING-YOU-OUT] gem install ../../ox/ox-2.14.5.gem Building native extensions. This could take a while... Successfully installed ox-2.14.5 Parsing documentation for ox-2.14.5 unknown encoding name ""UTF-8"" for README.md, skipping Installing ri documentation for ox-2.14.5 Done installing documentation for ox after 0 seconds 1 gem installed [okeeblow@emi#CHECKING-YOU-OUT] ./bin/are-we-unallocated-yet|grep Total Total allocated: 17298804 bytes (322860 objects) Total retained: 1713202 bytes (22073 objects) against same MRI 3.0, my patched `Ox`, and `Ox.parse_sax(intern_strings: true)`: [okeeblow@emi#CHECKING-YOU-OUT] ./bin/are-we-unallocated-yet|grep Total Total allocated: 16414938 bytes (301382 objects) Total retained: 1713226 bytes (22073 objects) This shows that just having Ruby 3.0 available results in a huge win, and opting into immutable `String`s from `Value#as_s` helps me even more! Unit tests pass: [okeeblow@emi#test] ruby tests.rb Loaded suite tests Started ................. 16 tests, 40 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications 100% passed ..... 18 tests, 42 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications 100% passed ............................................................................................................................... 151 tests, 249 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications 100% passed Apologies for possible indentation issues in this patch. There seem to be lots of existing lines with a mix of tabs and spaces — nbd, it happens — so I tried to match the surrounding lines everywhere I made an addition to make sure things line up and look okay in `git diff`. I only use `Ox` for Sax parsing, not for marshalling, so please scrutinize my changes in `gen_load.c` and friends extra hard in case I omitted any `intern_strings` option checks that could result in an unwary user getting an immutable `String` where there wasn't one before. The one change I had to make to a `String#force_encoding`-using test case is an example of what I want to avoid surprising anyone with.
For context this is my And this is how I am collecting these stats: https://github.com/okeeblow/DistorteD/blob/master/CHECKING-YOU-OUT/bin/are-we-unallocated-yet |
rb_hash_aset(ah, rstr, rb_str_new2(attrs->value)); | ||
#if HAVE_RB_INTERNED_STR_CSTR | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole section here with the various #if
checks is next to impossible to read. I think the volatile VALUE rstr
statements are not used in the set and would expect a compile error under the right set of conditions. Hard to tell though. How about tolerating some code duplication in favour of readability?
|
||
#if HAVE_RB_ENC_ASSOCIATE | ||
if (0 != pi->options->rb_enc) { | ||
rb_enc_associate(s, pi->options->rb_enc); | ||
} | ||
#endif | ||
#if HAVE_RB_ENC_INTERNED_STR_CSTR && HAVE_RB_INTERNED_STR_CSTR | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment to the above here. Don't cross if/else with #if
conditionals. I know I might have violated that in some places but I have been fixing them as I run into them.
I see the same code duplicated many times. Can this be a put into a function?
VALUE s = rb_str_new2(name); | ||
VALUE c = Qnil; | ||
VALUE s; | ||
VALUE c; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please initialize with Qnil
. If not interned and content is NULL then c will not be set.
I am excited about using interned strings. Great that you got this started. I'd like to see some code changes before merging though. It might be a little too extreme to internal all strings when a single options is set. For keys it often makes sense to intern but string values are different in that some a very long and really should not be interned. I added a cache (intern) feature to Oj that I will change to use the ruby intern functions and in that I have two options. One for keys and one for other strings. The other strings option is a number size limit. So for example if the intern limit is 10 then only strings less than 10 characters are cached. That might be better for Ox as well. If you are willing to make those changes that would be great. If you would prefer I helped and made some of the changes I can join you on the MR edits. It's up to you. |
Thank you for the quick turnaround! It will probably be a few days before I get back to working on this, but that all sounds very reasonable. Would you prefer additional commits fixing up these issues, or amending/squashing and force-pushing my branch so there's only one commit? How do the names Perhaps I would try to come up with a more intuitive name than |
No rush. How about simply As for a hard coded limit I think that misses the point. Caching a dictionary of strings less than 40 characters means there would be whole lot of memory used for interned strings. Better to let the user/developer decide what make sense for their situation. |
Thought you might be interested in some benchmarking I just tried out. In the Oj project, new-parser branch I've been writing a new parser and have been using my own cache (interned) string function and comparing it to a previous version of the parser. Most of the time is spent in the Ruby calls. The file is test/perf_parser.rb. Anyway it runs the old SAJ callback parser against the new one. The new one has a cache option. So without the cache options the new one is 15% faster. With the caching it is 25% faster. Now using I was planning on getting to Ox after finishing with Oj but if you want to try the Oj hash implementation with Ox that might give an even greater performance boost. Are you interested in working with me on that? |
That's a promising result. I've admittedly never used Oj but will take a look and familiarize myself with that codebase assuming your new-parser branch is online. I'd be interested to compare the memory consumption too and not just the parsing speed, since that is the resource I'm most trying to optimize right now in my own library. I'm currently allocating ~12MiB despite retaining only ~1.5MiB, but I've at least managed to get that down from over 20. |
I think the Oj hash will show memory savings as well. You would have to test that of course though. |
I haven't forgotten this PR. I am just finishing up a cache for Oj that looks like something worth copying over to Ox. Some changes will be needed to support multiple encodings but it performs better than the Ruby intern so will use the Oj cache instead. Like Oj, Ox will need options to intern or not as with interning on there there is a performance penalty when all new keys and strings are encounter. |
Please take as much time as you like. I haven't had time to come back to this and improve the readability of my patch but also haven't forgotten about that :) |
I have started and realized the current cache has some issue that will be fixed with a new approach. I'll post here when I make progress. |
A new set of intern functions were added that interns strings in a flexible cache. This PR is no longer relevant. |
Fixes #275
This change makes
Ox
use "interned" (frozen and deduplicated)String
s anywhere possible,the same effect as calling
String#-@
except without the heap churn ofRVALUE
allocation:https://ruby-doc.org/core/String.html#method-i-2B-40
HAVE_<funcname>
preprocessor macros to detect this functionalityand avoid breaking older Rubies (confirmed with at least MRI 2.7.2).
String
s anywhere an allocatedString
seemedentirely internal to
Ox
, e.g. insax_value_as_time
when allocatingthe
String
argument toox_time_class
.intern_strings
to control the behaviorof
String
return values from user-facing functions, e.g. fromsax_as_s
.Inspired by similar changes to
json
andmsgpack
:rb_enc_interned_str
if available ruby/json#451My goal is
Object
allocation reduction in theOx::Sax
handler of my MIME Typefile-identification library caused by the large number of duplicate
String
s in theshared-mime-info
-format XML it uses as a data source.I was already calling
#-@
(or using the-some_str_variable
syntax) everywhere possiblein my handler, but that only freezes and deduplicates an already-allocated
String
:That avoids duplicate retained
String
s but still causes my library to takea big GC hit right at startup when it loads its data.
All the stats below were collected using SamSaffron's
memory_profiler
:https://github.com/SamSaffron/memory_profiler
First, a sanity-check to make sure I didn't break MRI < 3.0,
using MRI 2.7 + latest official
Ox
from RubyGems (2.14.5):versus same MRI 2.7 but with my patched
Ox
:No difference between the two, showing that this patch is a no-op for pre-3.0.
Sorry I did not test older versions of MRI or other Rubies like J/Truffle/etc
since I don't have them on my system.
Now the real gainz can be seen when comparing MRI 3.0 with unpatched
Ox
:against same MRI 3.0 with my patched
Ox
:against same MRI 3.0, my patched
Ox
, andOx.parse_sax(intern_strings: true)
:This shows that just having Ruby 3.0 available results in a huge win,
and opting into immutable
String
s fromValue#as_s
helps me even more!Unit tests pass:
Apologies for possible indentation issues in this patch. There seem to be lots of existing
lines with a mix of tabs and spaces — nbd, it happens — so I tried to match the surrounding
lines everywhere I made an addition to make sure things line up and look okay in
git diff
.I only use
Ox
for Sax parsing, not for marshalling, so please scrutinize my changesin
gen_load.c
and friends extra hard in case I omitted anyintern_strings
optionchecks that could result in an unwary user getting an immutable
String
wherethere wasn't one before. The one change I had to make to a
String#force_encoding
-usingtest case is an example of what I want to avoid surprising anyone with.