Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider Alternatives To DashMap #837

Open
XAMPPRocky opened this issue Oct 23, 2023 · 2 comments
Open

Consider Alternatives To DashMap #837

XAMPPRocky opened this issue Oct 23, 2023 · 2 comments
Labels
area/performance Anything to do with Quilkin being slow, or making it go faster. help wanted Extra attention is needed kind/design Proposal discussing new features / fixes and how they should be implemented priority/high Issues that should be addressed as soon as possible.

Comments

@XAMPPRocky
Copy link
Collaborator

XAMPPRocky commented Oct 23, 2023

(See: xacrimon/dashmap#285)

While doing loadtests and benchmarking, I've noticed a significant amount of time is spent dropping Arcs from read or write references to DashMap. DashMap is great for concurrent access to multiple shards, but our usage is much more "access everything", where we iterate over the whole map, this causes a tonne of extra overhead that I worked around in a commit by just cloning the endpoints into a separate vec as a cache.

We should consider alternatives to dashmap that are much more designed for our usage patterns.

pub struct ClusterMap(DashMap<Option<Locality>, BTreeSet<Endpoint>>);

@XAMPPRocky XAMPPRocky added kind/bug Something isn't working help wanted Extra attention is needed area/performance Anything to do with Quilkin being slow, or making it go faster. kind/design Proposal discussing new features / fixes and how they should be implemented priority/high Issues that should be addressed as soon as possible. and removed kind/bug Something isn't working labels Oct 23, 2023
@xacrimon
Copy link

xacrimon commented Feb 1, 2025

See xacrimon/dashmap#285 (comment) for some alternative suggestions based on your usage, though I only briefly skimmed over how it's used in quilkin. Should be an easy swap, not a lot of work to do.

@xacrimon
Copy link

xacrimon commented Feb 1, 2025

My suggestion notably makes other significant optimizations trivial. For example your endpoint iteration spends a lot of time collecting a vector of endpoints, just to return it as an iterator. This may be useful under dashmap to release locks earlier but with an ArcSwap<HashMap> you can simply use load the arcswap into a guard and use the ::map method to "project" a Guard<HashMap> into a Guard<impl Iterator>, iterating and flatmapping on the maps entries like currently , and since readers no longer block writers from swapping out the value, while the one you have stays the same, there's no need to perform any intermediate collections, looks like this is #828.

Really on hot paths any kind of iteration + collection is probably a significant bottleneck. Unless you need sub millisecond update latency, using this clone swap update scheme should be fine, even with many thousands of map entries, should allow you to eliminate all of the intermediate allocations introduced in #855. Do note that cloning the btreesets might be a potential perf bottleneck on updates.

What you want to do to resolve that depends on if you have many endpoints comparatively to locality entries or not. I would consider and benchmark flattening them into a Vec of endpoints keyed by locality, always storing them sorted allowing you to use Vec::binary_search_* for "set lookups" if you don't have extremely large amounts of them per locality. If update performance is still prohibitive, I'd solve that by wrapping each key value (as either btreeset or vector, but my hunch is that vector will be better for your usecase) in another arcswap, so that they're not deep-cloned when updating clustermap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/performance Anything to do with Quilkin being slow, or making it go faster. help wanted Extra attention is needed kind/design Proposal discussing new features / fixes and how they should be implemented priority/high Issues that should be addressed as soon as possible.
Projects
None yet
Development

No branches or pull requests

2 participants