Random keys, the birthday paradox, and performance #1076

LourensVeen · 2024-10-08T16:09:44Z

By default, AMUSE generates particle keys as 64-bit (pseudo)random numbers. By the birthday paradox, you need about 5 billion particles to have a 50% chance of a collision. For 100 million particles (which is more realistic in AMUSE), the probability is about 1 in 3700 runs, for a billion particles it's about 1 in 37. Particle ids are 32-bit signed integers, so if you have enough memory it's possible to get in trouble.

It seems to me that it should be possible to just have a single key generator object (a singleton) that is used to generate keys in batches, and which hands out subsequent ranges of keys. There's already a class that does that actually, BasicUniqueKeyGenerator, it's just not used by default. If particles are made in batches, and codes give them subsequent ids (which they probably do, since it's probably the index into the data array), and keys are generated in the same batches, and are also given subsequent ids, then the mapping between them becomes a piecewise linear function with slope 1 that can be described as a list of boundaries and offsets.

That would save a ton of memory compared to the full index lists currently in amuse.datamodel, and would therefore probably speed things up.

Do note that the code that deals with this is complex, and I don't understand it all, so there may be something keeping this from working. Also, I don't know if this keys to indices conversion is actually a bottleneck. Still, it seemed worth recording, so here's an issue.

The text was updated successfully, but these errors were encountered:

LourensVeen changed the title ~~Random keys and the birthday paradox~~ Random keys, the birthday paradox, and performance Oct 8, 2024

LourensVeen mentioned this issue Oct 8, 2024

64-bit particle ids #1077

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Random keys, the birthday paradox, and performance #1076

Random keys, the birthday paradox, and performance #1076

LourensVeen commented Oct 8, 2024

Random keys, the birthday paradox, and performance #1076

Random keys, the birthday paradox, and performance #1076

Comments

LourensVeen commented Oct 8, 2024