find_device_object can be extremely inefficient #65

Open
chrisguikema opened this issue Jun 7, 2024 · 0 comments

find_device_object searches through the device untypeds and, once it reaches the untyped containing the target, retypes it one page at a time until the requested physical address is reached. This isn't a huge deal on ARM platforms, where device objects are defined by the device tree. On x86, however, a single region of device memory spans from the end of RAM to the PaddrUserTop value (1 << 47 on x86_64).

The problem with this setup is that x86 processor cards can place MMIO regions seemingly arbitrarily within this memory. For example, this is a snippet of a PCI scan for a COTS Ice Lake processor card:

                Region 0: Memory at 20fffaf0000 (64-bit, non-prefetchable) [size=16K]
                Region 0: Memory at 20fffaec000 (64-bit, non-prefetchable) [size=16K]
                Region 0: Memory at 20fffae8000 (64-bit, non-prefetchable) [size=16K]
                Region 0: Memory at 20fffae4000 (64-bit, non-prefetchable) [size=16K]

Trying to give out one of these regions causes find_device_object to take so long that the system is unusable:

[email protected]:817 Creating object vm0_mmio_frame_2267737452544 in slot 31572, from untyped 7b16...
[email protected]:682  device frame/untyped, paddr = 0x20fffaf7000, size = 12 bits
[email protected]:507
[email protected]:532 8000000000 408000000000

In my case, it would take ~419 million calls to reach that memory, and it would be even worse if the memory were located higher.
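To make that number concrete, here is a rough sketch of the arithmetic. It assumes that the first value on the last log line (0x8000000000) is the base of the device untyped being walked, which the log does not label explicitly. `retype_steps` is a hypothetical helper for illustration, not a function in the loader.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of fixed-size objects that must be retyped
 * (and skipped) when walking linearly from the base of an untyped to the
 * object containing `target`. Assumes ut_base is aligned to the object size. */
static uint64_t retype_steps(uint64_t ut_base, uint64_t target, unsigned size_bits)
{
    assert(target >= ut_base);
    return (target - ut_base) >> size_bits;
}
```

Under that assumption, `retype_steps(0x8000000000, 0x20fffaf7000, 12)` evaluates to 419,429,111, which matches the ~419 million figure. The same call with `size_bits = 30` evaluates to 1599.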

I'm not sure if I can release my code, but I came up with a solution that retypes Huge Pages instead of single pages. This reduced the time needed to find the proper physical address, but the whole function could use a review to optimize it further.
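As an illustration of why coarser steps help, the following is a small standalone model, not the code mentioned above and not the loader's implementation. It only counts how many objects would be skipped if the walk used the largest size first and fell back to smaller sizes for the remainder. The size bits 30, 21 and 12 (the 1 GiB, 2 MiB and 4 KiB x86_64 page sizes) and the untyped base 0x8000000000 are assumptions, and `coarse_to_fine_steps` is a hypothetical name. Capability slot management for the skipped objects is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: count the objects skipped when advancing from ut_base
 * towards target using each size in size_bits[] in turn. The sizes must be
 * listed largest first, and ut_base must be aligned to the largest size. */
static uint64_t coarse_to_fine_steps(uint64_t ut_base, uint64_t target,
                                     const unsigned *size_bits, size_t n)
{
    assert(target >= ut_base);

    uint64_t pos = ut_base;
    uint64_t steps = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t size = (uint64_t)1 << size_bits[i];
        /* Objects of this size that fit entirely before the target. */
        uint64_t count = (target - pos) / size;
        steps += count;
        pos += count * size;
    }
    return steps;
}
```

With the addresses from the log and sizes {30, 21, 12}, this model counts 1599 + 509 + 247 = 2355 skipped objects before the target page, compared with 419,429,111 when only 4 KiB pages are used.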

https://github.com/seL4/capdl/blob/master/capdl-loader-app/src/main.c#L502
