Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to pass Miri test with the stacked borrows model #46

Merged
merged 5 commits into from
Mar 17, 2024
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 7 additions & 11 deletions src/core/option.rs
Original file line number Diff line number Diff line change
Expand Up @@ -225,8 +225,9 @@ impl<T> Core<T> for OptionCore<T> {
// We can't just have two mutable references, so we use `ptr::swap`
// instead of `mem::swap`. We do not use the slice's `swap` method as
// that performs bound checks.
let pa: *mut _ = self.data.get_unchecked_mut(a);
let pb: *mut _ = self.data.get_unchecked_mut(b);
let p = self.data.as_mut_ptr();
let pa: *mut _ = p.offset(a as isize);
ratmice marked this conversation as resolved.
Show resolved Hide resolved
let pb: *mut _ = p.offset(b as isize);
ptr::swap(pa, pb);
}
}
Expand All @@ -241,15 +242,10 @@ impl<T: Clone> Clone for OptionCore<T> {
// is important for us.
// - The memory after its length is probably uninitialized.
//
// To fix both issues, we get a slice to the complete memory of the
// original `Vec` and create a `Vec` from it. Then we reset the length
// to the old value. Both is safe as all the elements that are included
// and excluded by the "fake length" are `None`.
let data = unsafe {
let mut data_clone = self.data.get_unchecked(0..self.data.capacity()).to_vec();
data_clone.set_len(self.data.len());
data_clone
};
// To fix both issues, we create a new vec with the appopriate capacity
// and extend it with the values of the other.
let mut data = Vec::with_capacity(self.data.capacity());
data.extend(self.data.iter().cloned());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct, as it does not initialize elements beyond self.data.len(), right? But this is required. See realloc that initializes all those elements to None. And the docs on the data field say so as well. I suppose has_element_at was one of the methods relying on this behavior.

But I think this implementation relying that elements beyond vec.len() can be initialized and stay that way is... dubious? The easy way to do this properly is to actually initialize all elements and have the Vec's len == cap at all times. But that wastes a usize of course and would require us to store a separate len field. So I think this implementation should use a pointer and do everything manually like bitvec. I don't think Vec can give us the guarantees that we need. Or, alternatively, we could find all the methods that rely on the current "initialized beyond len" behavior and rewrite them. Hopefully without making them slower.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof I think many methods rely on the "initialized beyond len". Lots call self.data.get_unchecked_mut(idx) where idx is only guaranteed to be <= cap. Vec::get_unchecked_mut specifically says that calling it with an out of bounds index is immediately UB. So I think this whole implementation is super broken... sigh

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the key point here is this line in the std docs for Slice::get_unchecked and Slice::get_uncheckec_mut

Calling this method with an out-of-bounds index is undefined behavior even if the resulting reference is not used.

Copy link
Contributor Author

@ratmice ratmice Mar 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think with the has_element_at change that we committed though, we aren't hitting that anymore.
because those get_unchecked_mut calls seem to be predicated on has_element_at, it is very much a can of worms though.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So yes, that means it has to be completely rewritten to either waste a usize and keep the vec's len == cap or not use Vec at all, but raw pointers. Ouch.

I assume you are not volunteering? :D

I don't know when I'll have the time but of course I'd try it asap. I wonder if I should release your other patch already. Or release a new major version where the OptionCore is removed until I get time to re-add it. And then of course yank the old ones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a bit involved and a diversion for me, I mostly wanted to compare the performance of Lox vs a naive doubly linked list approach for cases where we are adding/removing lots of elements like decimation and subdivision.
And I was pondering whether modified RRB-vectors could be used in any way that improves the deletion/insertion with the vector stability guarantees but without having to keep the entire vector gaps.
(Not that I have any actual ideas yet that would work, but it is part of the thinking that lead to me playing with these crates)

Copy link
Contributor Author

@ratmice ratmice Mar 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through github at least I don't see anywhere that OptionCore/InlineStableVec gets used outside of
this benchmark1, so removing it until it can be re-added does seem like a viable option.
I don't have any particular preferences between that and releasing with my patch, so both seem like viable options to me.

Footnotes

  1. https://github.com/mooman219/generational_arena_bench

Copy link
Contributor Author

@ratmice ratmice Mar 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose one other thing we could consider doing rather than abandoning the approach the crate currently takes is to override clear to for item in self.data.iter_mut() { item.take() }, then replace the uses of get_unchecked* with their vec.ast_ptr().offset(idx) which has the weaker precondition the current implementation assumes.

This would probably require quite a bit of auditing of anything in the implementation that removes elements though so they take().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding your initial comment in this thread I did push a fix that initializes the extra capacity with None values, but in a way that doesn't read the extra capacity of the other vector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So i've gone ahead and come to the conclusion that there were 2 problems,
the failure to maintain the invariant, in clear, and the get_unchecked usage in has_element_at.

Subsequently, I've changed the clear implementation and avoided get_unchecked,
and now because calls to get_unchecked* are predicated on the return value of has_element_at all the current usage I believe should be correct.

This should allow keeping the current implementation intact, while removing that bounds check from my previous patch


Self { data }
}
Expand Down
Loading