-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
context: introduce alternate global context scheme based on #346 #539
base: master
Are you sure you want to change the base?
context: introduce alternate global context scheme based on #346 #539
Conversation
This is in its own commit so that we can argue about it independently of the rest of the PR. I believe this is correct and safe because the docs for `AtomicPtr` say that (a) it is #[repr(C)] and (b) that it has the same in-memory representation as a *mut T.
This covers both the std and no-std cases. This is perhaps an unconventional strategy; see rust-bitcoin#346 for discussion leading up to it.
Weirdly we have no tests that don't require std or alloc. We should revisit this. In many cases it is because of the dependency on rand-std but we should really try to narrow that down to specific tests.
This commit should probably be removed before merging; it can be part of the next PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this code is unsound. First how I understand what the code is doing:
There are two contexts, one gets re-randomized, the other doesn't. The pointer to currently usable context is stored in secp256k1_context_signing_1
. When the context is being re-randomized it atomically sets a "busy", flag switches the pointers, modifies the context, then restores the pointers and finally unsets the busy flag.
The problem with this is that when setting the busy flag we don't know if the context is still being used by other thread (for reading). So it could happen that another thread holds the pointer it obtained before while it's being modified.
The only possible fix I see here is to implement RW lock - count the number of threads using the primary context and only rerandomize if there are none. It kinda worries me though that this may significantly decrease the frequency of rerandomizations. One thing we should take care of is to ensure that the rerandomized context is used for signing only. Verification should always use the alternate context.
I'm also thinking about alternative design:
- Have 3 context: sig1, sig2, verify
- verify is constant and accessed from appropriate operations
- A thread accessing signing context tries sig1 and if it's busy goes for sig2
- seed is 64B - rerandomization writes the remaining 32B into a (now locked) global buffer
- A thread that is releasing read lock tries to obtain write lock for sig2 to rerandomize sig2
- If successful it rerandomizes it with the seed obtained from global buffer
- There needs to be some synchronization to ensure only one context is locked.
- This whole thing probably should be modeled as a queue or something similar.
The expected outcome is that while it may happen that two threads reused the same context at least they wouldn't use a default, completely non-randomized one.
|
||
thread_local! { | ||
pub static CONTEXT: RefCell<Secp256k1<All>> = RefCell::new(Secp256k1::new()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could actually use pure UnsafeCell
and just make sure we don't access it twice from our methods nor call any user-provided callbacks. Maybe not worth optimizing though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should do this, but not on the first iteration. RefCell
will let us do development "in debug mode" :)
/// | ||
/// Safety: you are being given a raw pointer. Do not mutate through it. Do not | ||
/// move it, or any reference to it, across thread boundaries. | ||
pub unsafe fn with_global_context<F: FnOnce(*const ffi::Context) -> R, R>(f: F) -> R { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use &ffi::Context
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, neat! I spent so long going back and forth on ffi::Context
vs Secp256k1<All>
vs some custom type that I didn't even consider the pointer type.
This is a good idea.
// The reason is that a panic at this point in the code would leave | ||
// `LOCK` at true (as well as both `signing_1` and `signing_2` pointing | ||
// at the same, never-rerandomized, place) and therefore prevent any | ||
// future rerandomizations from succeeding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could still panic after unlocking 🤷♂️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, good idea! Again, I was so worked up over an unrelated thing (whether to call randomize
or let the user pass in an arbitrary-maybe-panicking closure) that I didn't consider this.
Code review ftw.
|
||
unsafe { | ||
// Obtain a pointer to the alternate context. | ||
let alt_ctx = ffi::secp256k1_context_signing_2.load(Ordering::Acquire); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is never modified so we shouldn't need atomic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. It was modified in a first iteration of this, and my on-paper design (when I was trying to use AtomicPtr::swap
on two AtomicPtr
s directly, which annoyingly you cannot do).
But as you noticed in your above comment, this whole scheme doesn't seem to work, so it's a moot point.
Ah, yes, you're correct. The pointers are okay but the actual context object may be seen while it's being modified. Dammit.
OK, this isn't too bad. I was scared of doing this because it could mean that reads might fail due to contention, but now I see that this won't be an issue. I'll think a bit about whether having a rwlock would let me get rid of the second context object and the pointer-swaps, which would be nice (and might make this upstreamable).
So, in theory if you have multiple threads constantly signing in a tight loop, it would reduce the frequency to 0 (since the RW lock would never count down to zero). It's not really a fair comparison, but this is the same as the current situation ;). Less unfairly, I don't think this is going to be a very common case -- even if you are signing in a multi-threaded context, I'd expect contention to only happen in short bursts, and if you can pull a rerandomization off even once in 100 sigantures that offers significant protection. (It's hard to quantify this stuff because the attack model is so murky "the attacker can see everything you're doing with the key...except not the actual key". But "typical" timing attacks on not-egregiously-broken code take well over 100 signatures to get enough signal to usefully weaken a key.) In practice I think it'll be really rare to be signing in more than one thread anyway, and verification we already have planned to use a different context (the existing
Heh, I like this in princple, but given that (a) we have no real locking facilities other than spinlocks which we've decided to eschew; (b) even having unbounded queues will be difficult. |
I don't see how. I don't suggest traditional RW lock but fall back to the read-only context when the lock is locked for writing. This read only context must exist. Otherwise we would have to wait which we don't want to.
That's pretty comforting but I was pointing out at fall backs: if it happens that signing falls back to the constant, attacker-known context enough times the attacker might extract the key. Ironically, the more often re-randomization happens the more likely it is to trigger this. My idea with three contexts prevents that. (We could have two but then verifications would cause contention.)
I meant bounded queue containing two items. But since then I tried to think about that and came up with a design without a queue: Data structuresContexts
Main lockThe main lock is
Rerandomization lockSimple Seed
RNGRNG is Signing thread operations
ReseedingPublic reseeding function takes 64B seed.
Alternatively we could take 32B, first overwrite the seed and then call rerandomization. Probably simpler code but calls into HMAC even in hypothetical case when supplying 64B is cheaper. Rerandomization
|
Just letting you know that I haven't forgotten about this, but I've read it twice and need to read it some more times before I can grok everything that's going on. |
If it helps the main idea is for signing threads to use whichever one the two contexts while the other is available for rerandomization. When there's no contention the contexts are swapped so that the used one can be re-randomized. |
This implements, but does not use, the new global context scheme described in #346. As described in #538, we want to actually use the global context API throughout the entire codebase, rather than maintaining parallel global-context and non-global-context APIs.