Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uefi-raw: Add explicit padding field for MemoryDescriptor #1334

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

phip1611
Copy link
Contributor

@phip1611 phip1611 commented Aug 17, 2024

IMHO, having implicit padding bytes in Low-Level types (used for Casting etc) is a bad idea. When you cast a &[MemoryDescriptor] to &[u8], Miri complains about uninitialized bytes.

Exporting one (or multiple) public _pad0 fields however is also not nice. So, we should identify all structs where this is relevant and come up with a solution. Perhaps, lets move to constructor-based types with private fields? This is breaking, how-ever.

By convention, structs having a C representation should be explicit with their
in-between padding fields. This allows users memory-safe trivial serialization
and deserialization
by accessing the underlying memory as byte slice. Without
this fields being explicit, Miri complains about uninitialized memory accesses,
which is UB.

Checklist

  • Sensible git history (for example, squash "typo" or "fix" commits). See the Rewriting History guide for help.
  • Update the changelog (if necessary)

@nicholasbishop
Copy link
Member

When you cast a &[MemoryDescriptor] to &[u8], Miri complains about uninitialized bytes.

I'd like to understand this better -- why is this cast being performed? In general we can't make the problems of uninit bytes go away by adding explicit padding fields, because our types often go over FFI boundaries and we don't control the C definition.

If you do need a "raw bytes" view of a type that may contain uninit bytes, use &[MaybeUnint].

@phip1611 phip1611 force-pushed the miri branch 2 times, most recently from 6b2bd00 to a86c72e Compare October 1, 2024 07:10
@phip1611 phip1611 changed the title uefi-raw: only explicit padding bytes to fight Miri problems (for upstream users) uefi-raw: Add explicit padding field for MemoryDescriptor Oct 1, 2024
@phip1611
Copy link
Contributor Author

phip1611 commented Oct 1, 2024

I've updated the PR description. Please re-check and let me know what you think.

@phip1611 phip1611 marked this pull request as ready for review October 1, 2024 07:11
@phip1611
Copy link
Contributor Author

phip1611 commented Oct 1, 2024

I don't get this check. What is its purpose? Why do we have it? @nicholasbishop

image

@phip1611 phip1611 force-pushed the miri branch 3 times, most recently from 451716a to 0609ea2 Compare October 6, 2024 13:56
This allows to use `dbg!` for example. Having this is beneficial in all no_std
crates. There are no disadvantages or negative implications.
By convention, structs having a C representation should be explicit with their
in-between padding fields. This allows users memory-safe trivial serialization
and deserialization by accessing the underlying memory as byte slice. Without
this fields being explicit, Miri complains about uninitialized memory accesses,
which is UB.

uefi-raw: add test for MemoryDescriptor to catch Miri problems

The underlying issue is that if someone wants to cast a slice of descriptors to a
byte slice, the uninitialized **implicit** padding bytes are UB. Miri complains
about that.
@nicholasbishop
Copy link
Member

I don't get this check. What is its purpose?

The check-raw code assumes nothing is allowed by default, then adds various exceptions for code we do want to allow. Rust is a big language, and we should be very intentional about any new types of code being added to uefi-raw since it's intended to hew closely to the C API. For this particular use case std isn't needed, anything that reads each byte of raw_bytes will work, e.g. let _ = raw_bytes.iter().fold(0, |acc, v| acc + v).

By convention, structs having a C representation should be explicit with their in-between padding fields.

This is debatable I think. For example, bindgen has an option to enable explicit padding but its not enabled by default.

MemoryDescriptor actually strikes me as a place where explicit padding is less useful, because the actual size of MemoryDescriptor can in general only be known at runtime by checking the desc_size returned from get_memory_map. In effect, there may be implicit padding between each element.

In general, adding padding bytes can indeed make serialization and deserialization easier, but there are a couple cons to consider:

  1. The type's fields can become cluttered. Perhaps not such a big deal in this one case, but if we apply it everywhere for UEFI-ABI-compat types, some structs are going to have many padding fields.
  2. If the type is constructed externally (e.g. by UEFI firmware), the padding bytes may not be initialized anyway, and so we're just kind of tricking Miri by making the padding bytes explicit in tests, but not helping with real-world usage.

I think it would really help to have an example of the code where you want this so we can discuss whether alternatives might work acceptably well. For example, perhaps there's something we can do with MaybeUninit<u8> that would work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants