GPU passthrough availability? #62
Replies: 24 comments 69 replies
-
Hey @vivekpatani - we do not currently support this. If you have feedback on what you'd like to see specifically, let us know. Converting this to a discussion.
-
So that tools such as ollama or PyTorch can use MPS inside the container.
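For reference, a quick sanity check: this is the PyTorch probe that would need to succeed inside a container (today it only prints True when run natively on macOS, since a Linux guest sees no Metal device):

```
# Probe PyTorch's MPS (Metal Performance Shaders) backend.
python3 -c "import torch; print(torch.backends.mps.is_available())"
```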
-
@egernst not sure if my issue belongs here or not, but I think it's relevant. To quote my use case, which is pretty much the same as @yongjer's:
-
We need GPU passthrough (preferably), or some workaround like the Model Runner that Docker/Podman shipped for macOS, so we can use AI on macOS.
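For context, the workaround mentioned above looks roughly like this (a sketch based on Docker's Model Runner; the model name is just an example). The model runs natively on the host with Metal acceleration and is exposed to containers over an API, rather than the GPU being passed through:

```
# Pull and run a model with Docker Model Runner (Docker Desktop on macOS).
docker model pull ai/smollm2
docker model run ai/smollm2 "Hello from macOS"
```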
-
This was the first thing I was hoping for when I heard Apple was making a native container solution. Honestly, I have no reason to switch from Podman or Docker if nothing new is added; the existing solutions work well enough now. Please do something well that the existing solutions do poorly.
-
I'd also like to see sharing of the onboard GPU with Vulkan support on the guest. Support for passing through a Thunderbolt/USB4 eGPU to the Linux container would be fantastic too.
-
Apple deliberately used a VM-per-container approach. I expect this is for security, which makes me doubt they will be okay with insecure GPU acceleration approaches like passing Vulkan commands through to the host. Secure GPU acceleration without hardware support (for SR-IOV or PCI passthrough) requires exposing the host kernel driver to the guest and running the shader compiler there. This means either custom container images with a (presumably proprietary) Apple-specific runtime, or providing an interface that Mesa can use and porting Mesa to M3 and M4.
-
I want to see support for external GPU passthrough, without having to trust the external GPU not to emulate a keyboard or mouse.
-
Linux is very necessary for scientific computing in bioinformatics most of the time. Apple's native container support is very exciting, but the lack of native Metal support will undoubtedly limit the value of Apple silicon's unified-memory design. I am currently using my Mac for bioinformatics analysis. Although PyTorch (MPS) runs well most of the time, the Mac is still a second-class citizen, hardware-wise, in my research field, even though I have personally tested the overall analysis pipeline and found the Mac's performance very satisfactory. I hope the development team pays enough attention to the great value of the Mac in academic research and seriously considers adding native Metal support to the container.
-
@mavenugo You seem to be asking several individuals about their use case for containerization and why this abstraction would improve quality of life. I'm genuinely confused as to why there is such focus on this... essentially you're asking why containerization itself has become so popular over the years. It certainly makes rapid development significantly easier, cleaner, and much more portable. There is no shortage of reasons users would be interested in GPU access in Linux containers on a macOS host, and the hardware is insanely powerful compared to the energy required to run it. Is it really so difficult to comprehend why the community would want this feature?

Ideally, I would like to keep individual components abstracted away from other services and the host, especially in dev environments where several services run alongside each other. Of course users can develop on the dedicated host, but the build/test/deploy workflow is not nearly as clean compared to templating a base image with pinned requirements in a containerized deployment. As the maintainer of this project, I'm certain you're well aware of all of this.

So the real question is: what is the hesitation with implementing GPU access within this native containerization solution? No doubt it is complex, but this seems like such an obvious feature request. The only hesitation I could understand is a known limitation in current macOS and existing hardware... and reading through this thread, it seems like this topic is being avoided, or at least delayed, for whatever reason. I sincerely hope this is not the case. Any chance that GPU access is already being considered internally?
-
I'm actually quite amazed this wasn't one of the first things added when designing this project. Looking online, there are massive numbers of articles where people are frustrated by not being able to do this without workarounds or performance loss. This should 100% be a part of this project. For me, without this, I really don't see a reason to use it instead of Docker or Podman.
-
Any plans to add this for the v1 release? This would be a real differentiator against any other competitors out there.
-
Would love to be able to attach an NVIDIA or AMD eGPU via a Thunderbolt dock and have access to Vulkan & CUDA/ROCm within the VM container 🙏
-
At least give us a macOS guest on a macOS host with GPU sharing, if Linux isn't an option. It would help a lot with running automated, isolated environments.
-
PoC for Metal flash attention with Python bindings, tested for images.
-
That's not a general-purpose solution, though; it only works for applications relying on the GGML library. It follows the same idea as the Mesa/Venus/Vulkan API forwarding already available in libkrun VMs.
-
Apple Silicon is limiting robotics and STEM students. We need GPU/OpenGL support in Linux VMs for essential tools like ROS 2 and Gazebo.

I'm a CS graduate student studying ML and robotics while using an Apple M4 Max for coursework and research. I run ROS 2 Humble and Gazebo Fortress inside an Ubuntu VM (via UTM) because these tools aren't fully supported natively on macOS. ROS 2 provides official prebuilt binaries only for Ubuntu. Packages like Cartographer and Navigation2 rely on Ubuntu's apt-based dependency chain, and Gazebo's ROS 2 plugins are built and distributed through the same Ubuntu package ecosystem.

The main limitation is that Virtualization.framework doesn't expose GPU or OpenGL 3.3+ to Linux guests. Gazebo's GUI immediately fails with "OpenGL 3.3 not supported," leaving only headless mode. That blocks visualization and debugging, which are central to robotics education and simulation. The hardware is more than capable, but without GPU/OpenGL passthrough, students and developers can't use Apple Silicon for modern robotics or simulation workflows.

Extending ParavirtualizedGraphics or adding GPU/3D acceleration passthrough for Linux guests would make Apple Silicon a first-class platform for robotics, AI, and research. It would let us stay within the macOS ecosystem instead of relying on external, non-Apple devices or resorting to a compromised VM workflow with impactful limitations.
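To make the limitation concrete, here is the kind of check one might run in the guest (a sketch; assumes mesa-utils is installed in the Ubuntu VM). It shows which renderer and OpenGL version the VM actually exposes, which is what Gazebo's check fails on:

```
# Inside the Ubuntu guest: report the exposed renderer and OpenGL version.
sudo apt-get install -y mesa-utils
glxinfo -B | grep -E "OpenGL (renderer|version)"
```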
-
I'm just adding my comment to show I also agree with everything said above. I tend to speak directly, so I'm not trying to be a jerk here :) My apologies if I don't sound kind.

I understand there are security implications any time you share a resource. But why would you ever NOT want to enable access to a large part of your system's resources? If containers are CPU-only, you are eliminating 60% of what I got a Mac Studio for. I've been doing some homelab projects this year, and it has been SO much more difficult than it should be. Why handicap the products in any way? "It just works" would win over pros as well as 'normal' users. I'm not a dev, just a long-time tech enthusiast.

Final rant: to me this is right up there with a young guy being able to develop Whisky for FREE to be able to play Windows games on the Mac. Why wouldn't Apple produce that natively? They are really good at virtualisation. I love my Mac, but the biggest thing that tempts me away is compatibility. OK, that's all I have to say. :D If anyone with pull reads this, realize that when you make our lives easy, we throw money at you.
-
Until Apple comes up with their own solution, krunkit does support GPU acceleration for Linux guests/containers on Apple Silicon Macs using libkrun, Venus, and MoltenVK. Podman Desktop has supported krunkit/libkrun for quite a while (it's now the default virtualization engine for Podman Desktop), and Lima just gained support for it too. You can also use it standalone with your favorite Linux distro; just make sure the context (VM, or container in a VM) where you run the AI workload has a patched Mesa package installed. (We're working on removing this requirement, but since it requires changes in the Linux kernel and Mesa, it's going to take a while to get upstream and trickle down to every distro.)
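If you want to try this with plain Podman (not Podman Desktop), a minimal setup sketch, assuming a recent Podman on macOS (the libkrun provider switch is the documented mechanism, though details can change across versions):

```
# Create the Podman machine VM with libkrun/krunkit instead of applehv.
export CONTAINERS_MACHINE_PROVIDER=libkrun
podman machine init
podman machine start

# Containers should then see the virtio-GPU render node.
podman run --rm --device /dev/dri alpine ls -l /dev/dri
```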
-
Working GPU Solution: krunkit + Podman + Vulkan

For anyone still looking for GPU acceleration in containers on macOS, I've implemented a solution using krunkit + Podman.

What Works

✅ GPU device passthrough via /dev/dri

Tech Stack

Quick Start

```
# Install krunkit
brew tap slp/krunkit && brew install krunkit podman

# Pull ML/AI environment
podman pull wisejnrs/carbon-compute-macos:latest

# Run with GPU
podman run -d --device /dev/dri -p 8888:8888 wisejnrs/carbon-compute-macos:latest
```

Performance

Tested on Apple Silicon M-series:

Resources

I wrote a guide with benchmarks and step-by-step setup:

Note: This uses krunkit, not Apple Container. It's a workaround until native GPU passthrough is available. But it works today for containerized ML/AI workflows on macOS.
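One way to verify the GPU is actually visible from a container (a hedged check; package names vary by distro, and as noted elsewhere in this thread the guest needs a Venus-capable/patched Mesa for the device to be usable):

```
# Should list a virtio-GPU / Venus Vulkan device when running under krunkit.
podman run --rm --device /dev/dri fedora \
  sh -c "dnf install -y vulkan-tools mesa-vulkan-drivers && vulkaninfo --summary"
```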
-
It is incredible that we cannot run a normal app using the GPU in containerized mode for safety and testing, especially now in the age of AI. Apple should consider this a top priority. Instead they respond with WONTFIX?!? Apple is literally forcing their developers to switch to Windows machines... I'm speechless...
-
Ignoring AI, which is the most commonly cited use case, there is nonetheless an argument to be had for using containers, rather than a full desktop VM, for hardware-accelerated and graphically interactive tasks. For example, the development and testing of graphical Linux utilities in containerized environments. Another example: shipping and running Linux-exclusive software on macOS.

Containers also make it easier to "mix" native ARM & x86 applications, and to spin up x86 distributions, compared with a full desktop VM. Granted, changes to the frontend set of utilities here at apple/container would need to be made to display rendered content (when specified), alongside the ability to enable such paravirtualized graphics acceleration in any given container.

Excerpt from apple/containerization#480
-
How Containers Work on macOS

Containers don't run macOS. They are full VMs running a Linux kernel. They also use userspace libraries meant for Linux. This means that they usually [^1] use Mesa for GPU acceleration, rather than a proprietary vendor driver. It also means that containers do not use Metal. They use open standards, namely Vulkan, OpenGL, and OpenCL. For technical reasons, it is very difficult to inject external libraries into containers; fixing this would require changes to Linux dynamic library loading that nobody appears interested in. Furthermore, Apple doesn't even have a Vulkan driver, and its OpenGL driver is old and buggy. Therefore, the only way to support GPU acceleration in containers is to work with Mesa.

GPU Acceleration via Mesa's Apple Silicon Support

Thanks to the Asahi project, Mesa does have support for M1 and M2 GPUs! However, there is no support for M3+. Apple could certainly afford to pay someone to add this support, though. For this to be accepted, they would also need to fund a kernel driver, but they could afford that as well. One difficulty is that Mesa is written for Linux. Its support for Apple GPUs expects to talk to the Linux kernel driver for Apple M1 and M2 GPUs. It can do so either directly or via a proxy layer ("virtio-GPU native contexts"). The former is not an option because containers do not have access to a physical GPU. The latter is very much possible, and Apple could provide an implementation on their host. This implementation would need to translate calls to a Linux-specific API into calls that the macOS kernel driver can understand. I don't know if this is possible.

GPU Acceleration via Generic Vulkan Passthrough

Another option is to proxy the whole Vulkan API surface via Venus. This avoids the problems mentioned above. It does require a Vulkan driver on the host, but KosmicKrisp provides one. Unfortunately, this runs into unfixable security problems. Vulkan implementations, including KosmicKrisp, assume their input is trusted. The entire design of both this tool and the Containerization package it depends on assumes that containers are not trusted. These use a single virtual machine per container to ensure that one container cannot compromise another. Allowing containers to talk to KosmicKrisp on the host would blow a giant hole in this isolation. I suspect Apple's security team would instantly veto this.

Why Not Everyone Needs GPU-Accelerated Containers

Despite how popular machine learning workloads are right now, there are many, many workloads that have nothing to do with ML. GPU acceleration is not needed for all sorts of other use cases, such as developing and/or testing web applications. I am almost certain that these are the kinds of workloads this tool is intended for, and they would not be able to use a GPU even if one were provided to them. These are the bread-and-butter tasks that pay the bills for many, many developers.

For ML workloads, GPU acceleration wouldn't even be the best option in many cases. The best option is Core ML, which supports the Apple Neural Engine. The Neural Engine is a dedicated ML accelerator. Therefore, the best option would be to ship a model (in a form that can be executed safely even if malicious) from the container to the host and run it there.
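To make that last suggestion concrete, a rough sketch of the host-side half (model.mlpackage is a placeholder for a model the container would hand to the host; coremlcompiler ships with Xcode):

```
# On the macOS host: compile a Core ML model so inference can run on the
# Neural Engine/GPU. The container ships the model out and gets results back
# over a socket or HTTP, with no GPU passthrough required.
xcrun coremlcompiler compile model.mlpackage ./compiled
```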
-
Would I be able to pass through GPU devices to the container, either atomically or in slices? Thanks.