
Use 32 per-type container lists instead of one #29

Merged: 1 commit merged into droundy:master on Nov 22, 2021

Conversation

stepancheg (Contributor)

  • Reduce lock contention
  • Make lookup of the container for a type faster

Ideally there would be something like 1024 lists, but copy-paste at that scale would not look good.
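For context, here is a minimal sketch of the general sharding idea, not the crate's actual code (the PR itself reportedly uses 32 copy-pasted per-type lists rather than an array, and the names below are made up): a single Mutex-protected TypeId-to-container map is split into 32 shards selected by hashing the TypeId, so threads interning values of different types usually take different locks and each shard's map stays small.

```rust
// Illustrative sketch only (not internment's actual internals).
use std::any::{Any, TypeId};
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

const SHARDS: usize = 32;

// One map per shard, each guarded by its own lock.
type ContainerMap = HashMap<TypeId, Box<dyn Any + Send>>;

struct ShardedContainers {
    shards: [Mutex<ContainerMap>; SHARDS],
}

impl ShardedContainers {
    fn new() -> Self {
        ShardedContainers {
            shards: std::array::from_fn(|_| Mutex::new(HashMap::new())),
        }
    }

    // Hash the TypeId to pick a shard; only that shard's lock is taken,
    // so interning values of different types rarely contends on the
    // same Mutex, and the per-type lookup scans a smaller map.
    fn shard_for(&self, ty: TypeId) -> &Mutex<ContainerMap> {
        let mut h = DefaultHasher::new();
        ty.hash(&mut h);
        &self.shards[(h.finish() as usize) % SHARDS]
    }
}
```

With more shards (the 1024 mentioned above) collisions between types would be even rarer, at the cost of more boilerplate if the shards are written out by hand.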

@droundy (Owner)

droundy commented Nov 16, 2021

Same question as before: do you have any reason to believe that contention here is a performance problem, or that this speeds things up? I'm all for internment being highly optimized, but I'm doubtful that this contention is an issue. It seems this will only help if multiple types are interned at high rates.

@stepancheg (Contributor, Author)

> do you have any reason to believe that contention here is a performance problem, or that this speeds things up?

The general reply is in that issue: #28 (comment)

I didn't benchmark this PR or the other one specifically. I applied all the optimizations I could think of, and the overall speedup was measurable. Some of the optimizations are probably useless.

@droundy (Owner)

droundy commented Nov 18, 2021

Any chance you could describe (or better, produce) a reasonable benchmark that might trigger this? We've got benchmarks, but none that use threads to trigger lock contention. I'd like to be sure we're improving performance, and not hurting the unthreaded case too much, before applying.

@stepancheg (Contributor, Author)

> describe

50 threads that continuously intern objects of 100 different types in random order, preferably objects that take longer to intern (e.g. preallocated, but not pre-hashed, long strings). A hypothetical sketch along these lines appears at the end of this comment.

> produce

I may do it one day, but since we've forked the library, this is no longer a big issue for us. Sorry.

> not hurting the unthreaded case

This PR adds very little overhead, much smaller than acquiring the lock or doing the hashing. The other PR should have no negative effects.
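No such benchmark was produced in this thread; purely to illustrate the description above, a scaled-down sketch might look like the following. It assumes internment's `Intern::new` constructor; the wrapper types, thread count, and value counts are made up.

```rust
// Hypothetical contention stress test, scaled down from the
// "50 threads x 100 types" description: several threads interning
// values of distinct types concurrently, so a single global lock
// would become the bottleneck.
use internment::Intern;
use std::thread;

// Distinct newtypes so that each gets its own per-type container.
#[derive(PartialEq, Eq, Hash)]
struct A(String);
#[derive(PartialEq, Eq, Hash)]
struct B(String);

fn main() {
    let handles: Vec<_> = (0..8)
        .map(|t| {
            thread::spawn(move || {
                for i in 0..100_000u64 {
                    // Reuse a limited pool of values so lookups of
                    // already-interned data are exercised too.
                    let s = format!("value-{}-{}", t, i % 1000);
                    // Alternate types so different containers (and, with
                    // this PR, usually different locks) are hit.
                    if i % 2 == 0 {
                        let _ = Intern::new(A(s));
                    } else {
                        let _ = Intern::new(B(s));
                    }
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
```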

droundy merged commit a176408 into droundy:master on Nov 22, 2021
@droundy (Owner)

droundy commented Nov 22, 2021

I've confirmed with benchmarks that this doesn't measurably slow down the single-threaded case, and that it gives a dramatic improvement when allocating Interns of several types on multiple threads.
