Skip to content

Conversation

@gin66
Copy link

@gin66 gin66 commented Dec 7, 2025

Proposed changes

This is just a PoC for usage of Metal Compiler Plugin. It breaks with NAE support due to upstream compile error.

./.build/arm64-apple-macosx/debug/mlx-swift_Cmlx.bundle/default.metallib
./.build/plugins/outputs/mlx-swift/Cmlx/destination/MetalCompilerPlugin/default.metallib

Size is 225.428.760 Bytes.

Moreover loading the library may need investigation, because macOS/iOS apps work with bundles.

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

@davidkoski
Copy link
Collaborator

Interesting approach! I like the fact that there is a plugin metal compiler that we could use.

The reason the NAX piece doesn't build is because these arguments are missing: -Wno-c++20-extensions -std=metal4.0. Currently the CMake build specifies those only for the NAX files.

I think the very large output size indicates that this is building all of the .metal files, but not all of them are meant to build this way. Perhaps this is picking up all the metal files in Cmlx/mlx rather than just the ones in Cmlx/mlx-generated (which is done to produce working #includes, though it looks like this plugin might solve that). Anyway, we would need a way to restrict which files it actually built.

@gin66
Copy link
Author

gin66 commented Dec 8, 2025

Thanks for the feedback. Even though it is large, at least it appears to work in my test project. I just do not know, what a reasonable size would be.

You mean, only the Cmlx/mlx-generated files are sufficient for metal and header files ? Or is there still a mix of mlx and mlx_generated files ? The mlx_generated bf16.h version appears to be the wrong one. It includes algorithm header, which is not available in the metal SDK. I could not figure out, how the build system shall work. Perhaps you can briefly explain, so I can try to fix this. I will give the missing compiler switch a try, too. If successful, I need to create a new PR with a more recent baseline.

@davidkoski
Copy link
Collaborator

Thanks for the feedback. Even though it is large, at least it appears to work in my test project. I just do not know, what a reasonable size would be.

In a build from main (using xcodebuild) it is 3.6M, so this is considerably larger :-)

You mean, only the Cmlx/mlx-generated files are sufficient for metal and header files ? Or is there still a mix of mlx and mlx_generated files ? The mlx_generated bf16.h version appears to be the wrong one. It includes algorithm header, which is not available in the metal SDK. I could not figure out, how the build system shall work. Perhaps you can briefly explain, so I can try to fix this. I will give the missing compiler switch a try, too. If successful, I need to create a new PR with a more recent baseline.

It excludes the metal shaders directory in mlx and builds only from mlx-generated.

I am not sure what version you are forked from, but here is bf16.h from main:

Perhaps you are forked from the latest tag?

That said, I don't see an "algorithm" header in any version.

@gin66
Copy link
Author

gin66 commented Dec 9, 2025

Thanks. I have successfully made the changes to make it work with HEAD.

Check #313 as new PR.

@gin66 gin66 closed this Dec 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants