Trouble training on M3 Max #18966
-
👋 Hello @agoransson, thank you for bringing this to our attention and for your detailed report 🚀! We recommend starting by visiting the Docs for guidance on setup and troubleshooting. Key references include our guides on Python and CLI usage, as well as Tips for Best Training Results.

MPS Performance on macOS

From the logs you've shared, it seems like training is falling back to CPU instead of leveraging MPS on your Apple M3 Max. Since MPS support is still evolving in PyTorch, we recommend verifying that your PyTorch version is compatible with your macOS and hardware architecture. You can refer to PyTorch's MPS Documentation for the latest status. For slow training issues or unexpected behavior on specific environments, please:
Additionally, as noted in your log, you can upgrade to the latest Ultralytics release with: pip install -U ultralytics

Pre-configured Environments

To streamline experimentation, you can also try running YOLO in one of the verified environments below:
Community and Support

Join the Ultralytics community to share insights or troubleshoot interactively:
Lastly, please note that this is an automated response, and an Ultralytics engineer will review your query and provide further assistance soon. 🚀 Looking forward to resolving this together!
-
You can upgrade to macOS 15 or use a smaller batch size instead of batch=-1, which is probably selecting a very large batch size.
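To make the batch-size suggestion concrete, here is a minimal sketch of passing a fixed batch size to an Ultralytics training run. The dataset file `coco8.yaml`, the epoch count, the weights name `yolo11m.pt`, and the helper names `build_train_args` and `run_training` are assumptions for illustration, not details from this thread; substitute your own dataset and settings.

```python
def build_train_args(batch=16, device="mps", data="coco8.yaml", epochs=100):
    """Collect keyword arguments for a training run (hypothetical helper).

    This sketch only accepts a fixed positive batch size, so batch=-1
    (automatic batch selection) is rejected on purpose.
    """
    if not isinstance(batch, int) or batch < 1:
        raise ValueError("use a fixed positive batch size rather than batch=-1")
    return {"data": data, "epochs": epochs, "batch": batch, "device": device}


def run_training(weights="yolo11m.pt", **overrides):
    """Start training with explicit settings; needs the ultralytics package."""
    from ultralytics import YOLO  # imported lazily so the helper above works without it

    model = YOLO(weights)
    return model.train(**build_train_args(**overrides))
```

Calling `run_training(batch=8)` would then train with a batch size of 8 on the MPS device. If memory pressure is the problem, halving the batch size until training is stable is a reasonable first experiment.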
-
Thank you @Y-T-G for the recommendations. Upgrading to macOS 15 got rid of the warnings and errors. However, training still seemed quite slow, though perhaps that's expected when retraining the YOLO11 medium model. After the upgrade it selected a batch size of 16 today; I'm unsure whether that's large or not. What is the expected time to train this model?
The training fails regardless; it seems I need to use the CPU for at least some operations.
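If the failure comes from operations that are not implemented for MPS (an assumption, since the error itself is not shown here), PyTorch has a `PYTORCH_ENABLE_MPS_FALLBACK` environment variable that lets those operations run on the CPU while the rest stays on MPS. A minimal sketch, where `enable_mps_fallback` is a hypothetical helper name and the variable is set before `torch` is imported:

```python
import os


def enable_mps_fallback():
    """Ask PyTorch to run MPS-unsupported ops on the CPU.

    Call this before importing torch. An existing value of the
    variable is left untouched.
    """
    os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")
    return os.environ["PYTORCH_ENABLE_MPS_FALLBACK"]


enable_mps_fallback()
```

The same effect can be had by exporting the variable in the shell before launching training. Operations that fall back to the CPU will be slower than native MPS ones, so this may help training finish without making it fast.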
-
Hello, I'm trying to run training on my Apple M3 Max but I'm running into problems.
I get the following message when I start, which I interpret as a failure to use MPS and a fallback to training on my CPU.
And sure enough, training seems to take quite some time. When I ran the same training command in the fall, during some initial tests to evaluate my concept, it trained much faster.
Also worth noting: I can't seem to install the latest Ultralytics package. When starting the training I see:
I have followed the instructions to create a new conda environment and installed the required packages (except the CUDA-specific ones, as I gather those are not needed for me).
Yet, it's so slow. What am I doing wrong? Or is this expected?
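A quick way to check whether PyTorch can see the MPS backend at all in this environment is sketched below. It assumes a reasonably recent PyTorch that exposes `torch.backends.mps`; `pick_device` is a hypothetical helper name, and it simply reports "cpu" when PyTorch or the MPS backend is missing.

```python
def pick_device():
    """Return "mps", "cuda", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed in this environment

    # Older PyTorch builds may not have the mps backend module at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


print(pick_device())
```

If this prints `cpu` on an Apple Silicon machine, the installed PyTorch build or the macOS version is the likely thing to look at before tuning any training settings.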