Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit/document/remove unsafe code #27

Open
1 of 4 tasks
jneem opened this issue Aug 16, 2021 · 10 comments
Open
1 of 4 tasks

Audit/document/remove unsafe code #27

jneem opened this issue Aug 16, 2021 · 10 comments

Comments

@jneem
Copy link
Owner

jneem commented Aug 16, 2021

We have unsafe code in a few places, and the invariants that need to be upheld are not always well-documented.

  • in Vector::swap we use unsafe because rust doesn't know that pointers to different indices are non-overlapping. I think this is fine, although it could be removed at the cost of a clone() and an extra swap.
  • in the hash table, there is unsafe code in insert, because we need to move out of something before we're ready to move a new value in. This could be removed at the cost of an additional "uninitialized" enum variant in Entry. I'm not sure what the cost of that is.
  • there's some unsafe code related to the "pool" feature. This code is currently disabled in imbl (it's only relevant when using Rc instead of Arc, and we don't have that yet). This part is therefore a lower priority.
  • there is a bunch of unsafe code dealing with Focus and FocusMut. This will take some time to figure out. There's some subtlety in FocusMut::split_at and FocusMut::unmut.
@nomad010
Copy link
Contributor

Hi there,

I contributed a bit to im on the Vector front and I have my own experimental implementation of RRB trees including the Focus/FocusMut. Although my implementation has substantial differences to im's, I did use im' as 'code inspiration'. I feel confident I can help with task item 4 if you would like.

@jneem
Copy link
Owner Author

jneem commented Aug 16, 2021

That would be awesome! Given that I'm attempting to maintain a crate that I didn't write (and with the original author not around to give feedback), I'm particularly concerned about having good documentation about the invariants that we need to preserve.

@nomad010
Copy link
Contributor

Greetings, just checking in. I haven't forgotten about this, I didn't have time to much work on this this past weekend. I have gone through the code and just made sure that I can explain everything when I do write it up.

I just have one or two questions. Would you mind if I prepare the documentation in a fork for the repo? Or would you like it in a separate document (Google Docs, etc)? I might do a bit more than just unsafe with the documentation, there are some gotchas in places that I think should be documented as well.

@jneem
Copy link
Owner Author

jneem commented Aug 26, 2021

My ideal location for documentation would be as doc comments in the code (on private items, so they aren't visible in the API docs but you can compile them with cargo doc --document-private-items). But honestly, any implementation docs will be an improvement over the status quo, so I'll take them in whatever format is convenient for you.

@jneem
Copy link
Owner Author

jneem commented Aug 28, 2021

Based on the discussion on #29, my understanding is that all of the target_ptr pointers in FocusMut point to chunks that are "unique", in the sense that their Arcs have a ref-count of 1. These chunks can be shared between other instances of FocusMut that were created by splitting (or instances of Focus that were created by splitting and then unmuting), but they are not shared by any other instances of the tree. And since FocusMut borrows its tree mutably, it follows that this collection of FocusMuts are the only things that can modify these chunks.

Assuming I'm right, I think this is something worth documenting. (Edit: I wasn't right)

@jneem
Copy link
Owner Author

jneem commented Sep 12, 2021

Ok, I think I've understood the motivation behind the owning Arc in Focus: it's because if you create a Focus by doing something like

let (f1, mut f2) = vec.focus_mut().split_at(100);
let f1 = f1.unmut();

then you have a Focus and a FocusMut that have disjoint ranges, but share the same tree. If I then modify f2, it will modify the root of the tree, but because f1 has its own copy of the root Arc, the root will get copied and f1 will be unmodified.

A consequence of this is that the signature of Focus::get is indeed correct (the lifetime of the returned reference should be tied to the Focus, not to the original vector). The lifetime transmutation in the Iter::next method is incorrect in general, but it's ok in this case because we know that (1) the Focus was created directly from a Vector, and not by unmuting a FocusMut, and (2) the vector can't be modified while the Focus is alive, so the root pointer in the Focus is the same as the root pointer in the Vector, and so the returned reference will actually be valid for as long as the Vector is alive.

@nomad010
Copy link
Contributor

Yes, that sounds correct. One thing I've realised unmut is weird, I'm not sure why you'd ever need it. You'd already need to have gotten mutable access to the Vector in order to use it.

@jneem
Copy link
Owner Author

jneem commented Sep 12, 2021

I guess I can think of a couple of possible motivations, although I'm not really sure how compelling they are:

  • Focus could give better performance, because you aren't locking when resetting the focus
  • you might already have some fn foo(f: Focus<'_, A>) -> ?? {} but you have access to a FocusMut, and you want to apply foo to it without writing some abstraction that would allow handling either case.

As for what to do about it, I suppose I should grep through some im-dependent crates and see if anyone is actually using unmut. One possible way to continue to support it would be to just have an extra Focus variant, so it would be

enum Focus<'a, A> {
    Single(&'a [A]),
    Full(TreeFocus<'a, A>),
    Unmuted(FocusMut<'a, A>),
}

@nomad010
Copy link
Contributor

Ah yes, sorry in my experimental repo I have worked around the need for having a lock. When split is called, I do a (A)rc::make_mut call, also I have a structure that allows multiple mutable references to it to exist which I use for the FocusMut in this instance. The downside to this is that it involves a lot of unsafe.

This makes Focus and FocusMut on a more level playing field.

@maboesanman
Copy link

For the second bullet point (about moving before having a value to move in), you could use the take_mut crate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants