Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[draft] Constant-time equality checks for sensitive values #1712

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

swenson
Copy link

@swenson swenson commented Oct 10, 2024

We always want to use constant-time and constant-power when doing operations on private keys, and generally we should use constant-time and power operations for hashes and signatures. Sometimes it can be difficult to tell when a given hash is sensitive or not (for instance, if it relates to potential customer code, we don't know how sensitive that code or its hash might be). I think it makes sense to take a cautious approach and protect too many things rather than too few, especially if it doesn't cost us much in terms of code size to protect additional things.

The built-in PartialEq and Eq trait derivations are not constant time or power.

In this PR, I'm removing the PartialEq and Eq derivations from potentially sensitive areas or putting them behind #[cfg_attr(test, derive(PartialEq, Eq))] blocks, and re-adding PartialEq implementations as necessary that use constant-time and constant-power versions.

I ported the OpenTitan hardened_memeq C function to Rust for now, but I'm open to suggestion on that.

Also, when we use type aliases for sensitive values, like type SomeValue = [u32; 48], I'm changing them to a 1-tuple so that they can't be treated as an array accidentally compared insecurely. It's a little more annoying (and makes the PR noisier), but it helps prevent accidental insecure comparisons in the future.

Overall, this should have a small impact on the overall ROM and firmware sizes, mostly due to the new memeq function. I'm hoping to minimize that impact, and possibly reverse it entirely, before merging this.

TODOs:

  • Add constant-time ct* functions from OpenTitan to make it truly branchless.
  • Check the rest of the code base for digest, signature, and key arrays that are embedded in other structs.
  • Evaluate if we need all of the OpenTitan protections, like decoy arrays
  • Add tests for OpenTitan-ported code
  • Update RandomWalk to use a Gray code to actually do a "random" walk with no randomness.
  • Check if inlining the RandomWalk Gray code will save instruction space
  • Update the dpe submodule as well with the same mitigations, e.g., for CryptoBuf.
  • Update the common cfi submodule, if that is where we want to put the memory comparison
  • Evaluate the instruction size impact of core::hint::black_box() vs. doing a simple inline-assembly laundering. (In the memeq function, we only use laundering to force values to be loaded into registers, so we might use a simpler laundering approach).
  • Clean up the unsafe code
  • Reduce instruction code in places by adding more reuse or with small inline-assembly versions of certain pieces

fn advance(&mut self) -> u32 {
// TODO: The current implementation is just a skeleton, and currently just
// traverses from 0 to `min_len * 2`.
let s = self.state;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Gray code comment I made above is basically changing this to be self.state ^ (self.state >> 1), and going to min_len.next_power_of_two() instead of min_len * 2. This added a few extra bytes of instruction code, which pushed it over the ROM limit, so I've removed it for now, and left this functionally equivalent to the not-at-all random_walk functions in OpenTitan.

@swenson swenson marked this pull request as draft October 10, 2024 16:19
@jhand2
Copy link
Collaborator

jhand2 commented Oct 10, 2024

There is also the constant_time_eq crate, although I don't have any reason to believe this is any better than porting the OT implementation.

@swenson
Copy link
Author

swenson commented Oct 10, 2024

I looked at the constant_time_eq crate, which we use in dpe. It contains a much simpler set equality check (it doesn't use decoys or random walks), but it would probably be fine if we don't think all of the extra precautions are necessary.

@swenson
Copy link
Author

swenson commented Oct 11, 2024

I split off a cfi_launder cleanup PR that saves about 1 KB of ROM and 600 bytes from runtime. #1714

The impact from this PR is an additional ~200-300 bytes in ROM, depending on a few choices (about 150 or so from the hardened memory equality function). But if you count the other PR, then we're still saving hundreds of bytes :)

I'm still digging through the impact of a few choices to try to get this down more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants