Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove dashmap and replace with static array #16

Merged
merged 10 commits into from
Dec 3, 2024
Merged

Remove dashmap and replace with static array #16

merged 10 commits into from
Dec 3, 2024

Conversation

Robo210
Copy link
Collaborator

@Robo210 Robo210 commented Nov 26, 2024

Since the event metadata lookup never changes after initialization, dashmap's (and every other map's) locking around mutability is just extra unnecessary complexity. A static array to binary search will have less overhead.

The one complexity is that the key to lookup is a tracing::callsite::Identity, which is an opaque type that only implements PartialEq and Hash. That means we need to search based off its hash, which in turn means we need to compute the hash every time... This isn't a regression over dashmap, but it's still less than ideal because we know the Identity is just a pointer and could be compared quicker if that wasn't meant to be opaque to callers or if it implemented a total ordering for the type.

Building the sample "main":

  • Removes 17 crates from the dependencies.
  • Removes 25 KB from the binary size.

@Robo210 Robo210 requested a review from a team as a code owner November 26, 2024 02:15
src/statics.rs Outdated Show resolved Hide resolved
src/statics.rs Outdated Show resolved Hide resolved
src/statics.rs Outdated Show resolved Hide resolved
src/_details.rs Outdated Show resolved Hide resolved
src/statics.rs Outdated Show resolved Hide resolved
src/statics.rs Outdated

impl core::cmp::PartialEq for crate::_details::ParsedEventMetadata {
fn eq(&self, other: &Self) -> bool {
self.identity_hash == other.identity_hash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.hash == other.hash && self.identity == other.identity

src/statics.rs Outdated

pub(crate) fn get_event_metadata(id: &tracing::callsite::Identifier) -> Option<&'static crate::_details::ParsedEventMetadata> {
let hash = BH.hash_one(id);
let etw_meta = EVENT_METADATA.binary_search_by_key(&hash, |m| { m.identity_hash });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use partition_point to get the index of the lower bound, then iterate until you find one of:

  • The end of the slice. (Not found.)
  • An item where hash doesn't match. (Not found.)
  • An item where this.identity == other.identity. (Found.)

src/statics.rs Outdated Show resolved Hide resolved
src/statics.rs Outdated
let mut cur = idx;
while cur <EVENT_METADATA.len() {
let meta = &EVENT_METADATA[cur];
if meta.identity_hash == identity_hash {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: If you test for meta.identity_hash != identity_hash then you can reduce the level of nesting. I also think it reads more clearly "if the hashes don't match, break" rather than "if the hashes match, do stuff and more stuff and more stuff, else break".

if meta.identity_hash != identity_hash {
    break;
}
if (meta.meta.identity == *id {
    return Some(meta.meta);
}
cur += 1;

Optionally you can make it an if-else chain, but that's up to your personal preference.

src/statics.rs Show resolved Hide resolved
src/statics.rs Outdated
let map = dashmap::DashMap::with_capacity(events_slice.len());
let bh = FnvHasher::default();

let mut map: Box<[core::mem::MaybeUninit<crate::_details::ParsedEventMetadata>]> = Box::new_uninit_slice(good_pos + 1);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clippy says new_uninit_slice was only stabilized last month (1.82), so maybe I should hold off on using it. Though github's CI is apparently running 1.82 already and so are the internal pipelines, so it's probably fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My (strong) recommendation is to create a vec, reserve space, push items into it, and call into_boxed_slice when you're done.


#[cfg(target_os = "windows")]
let start = start.add(1);
assert!(!start.is_null());
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure an assert! belongs in a lib, but this would be a pretty critical issue that needs to be surfaced, or else events will end up with the wrong keywords.

src/statics.rs Outdated
let map = dashmap::DashMap::with_capacity(events_slice.len());
let bh = FnvHasher::default();

let mut map: Box<[core::mem::MaybeUninit<crate::_details::ParsedEventMetadata>]> = Box::new_uninit_slice(good_pos + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My (strong) recommendation is to create a vec, reserve space, push items into it, and call into_boxed_slice when you're done.

src/statics.rs Outdated
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
match self.identity_hash.cmp(&other.identity_hash) {
cmp::Ordering::Equal => {
match self.meta.identity == other.meta.identity {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me nervous. You're using one rule for equality and a different rule for GT/LT. That can get into bad behavior.

Do we use GT/LT for anything other than sorting/probing the map?

If so, you should be comparing ONLY the hash. Two ParsedEventMetadata items with equal hashes will compare equal, even if the identities are not equal. When you look them up, you'll hit the lower bound of them and perform identity == for each one to get actual equality, i.e. you'll implement standard hashtable functionality but with less allocation.

@Robo210 Robo210 merged commit 56d6fcb into main Dec 3, 2024
3 checks passed
@Robo210 Robo210 deleted the dashmap branch December 3, 2024 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants