The latest driver 32.0.101.6559 broke inference speed #12806
Comments
Hi @xyang2013, We have reproduced this issue. Would you mind unsetting the environment variable
Hi @Oscilloscope98, Thank you! I was able to return to the expected speed (between 155 and 160 tokens per second). However, the documentation states that setting the following should improve performance: Is your suggestion a temporary fix?
By the way, is it possible to communicate with Microsoft and ask them to stop rolling back the graphics driver to a version that is several months old? It constantly creates a lot of unnecessary work. As you know, performance can depend on the graphics driver.
Hi @xyang2013, In our Ollama documentation, the setting is described with this caveat:
rem under most circumstances, the following environment variable may improve performance, but sometimes this may also cause performance degradation
set SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
We may update our documentation to make this clearer.
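Since the flag can help or hurt depending on the driver, it may be worth comparing both settings on the same machine. The following is a minimal sketch, not part of the project's documentation: the helper name `env_with_flag` is hypothetical, and it only prepares an environment that a server process could then be launched with.

```python
import os

FLAG = "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS"

def env_with_flag(enabled, base=None):
    """Return a copy of the environment with the flag set to "1" or removed.

    `base` defaults to the current process environment; the original
    mapping is never modified.
    """
    env = dict(os.environ if base is None else base)
    if enabled:
        env[FLAG] = "1"
    else:
        # Equivalent to unsetting the variable before starting the server.
        env.pop(FLAG, None)
    return env

# Hypothetical usage (assumes the `ollama` executable is on PATH):
#   import subprocess
#   server = subprocess.Popen(["ollama", "serve"], env=env_with_flag(False))
```

Running the same prompt once per setting, with the server restarted in between, keeps everything else constant so that any speed difference can be attributed to the flag.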
For the second issue, if the driver is rolled back after you update it, you could try the following: open Device Manager (设备管理器) -> Display adapters (显示适配器) -> right-click the specific Intel Graphics device -> Update driver (更新驱动程序) -> Browse my computer for drivers (浏览我的电脑以查找驱动程序) -> Let me pick from a list of available drivers on my computer (让我从计算机上的可用驱动程序中选取) -> double-click the driver version you want to restore. This way, the driver may not roll back again :) Please let us know if you run into any further problems.
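To notice a silent rollback before it shows up as a slowdown, the installed driver version can be checked from a script. This is a rough sketch under a few assumptions: it is Windows-only, it queries the standard `Win32_VideoController` class through PowerShell, and it assumes every reported `DriverVersion` is a dotted numeric string like the one in this issue. The function names are illustrative.

```python
import subprocess

def parse_version(text):
    """Turn a dotted driver version such as '32.0.101.6559' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))

def installed_display_driver_versions():
    """Ask Windows for the driver version of each display adapter (Windows only)."""
    result = subprocess.run(
        [
            "powershell", "-NoProfile", "-Command",
            "Get-CimInstance Win32_VideoController | "
            "Select-Object -ExpandProperty DriverVersion",
        ],
        capture_output=True, text=True, check=True,
    )
    return [parse_version(line) for line in result.stdout.splitlines() if line.strip()]

def was_rolled_back(installed, expected):
    """True if the installed version is older than the one you expect to have."""
    return installed < expected

# Hypothetical usage on the affected machine:
#   expected = parse_version("32.0.101.6559")
#   print([was_rolled_back(v, expected) for v in installed_display_driver_versions()])
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes when a component changes its number of digits.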
Hi @Oscilloscope98, Thank you for suggesting a way to prevent driver rollbacks. I just did what you suggested. I mentioned the SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS flag because, with the previous driver, it provided a 5–10% performance boost. However, with the new driver, I can’t match the performance shown here: My best run on the new driver is 158 t/s, about 7% lower than the previous 170 t/s. I checked that the GPU (ASRock Intel Arc B580 Steel Legend) was running at an expected clock speed, so I suspect the difference is related to either settings or the driver version. Unfortunately, I don’t have the exact driver version used for the 170 t/s run, but I’m sure I had SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 set at that time.
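For reference, the "about 7%" figure follows directly from the two throughput numbers quoted in the comment above. A small sketch of the arithmetic (the function name is illustrative):

```python
def relative_drop(before, after):
    """Percentage decrease going from `before` to `after`."""
    return (before - after) / before * 100.0

# 170 t/s on the earlier driver versus 158 t/s on the new one.
print(f"{relative_drop(170, 158):.1f}%")  # prints 7.1%
```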
Hi @xyang2013, Thank you for pointing out this performance degradation issue even with
Intel Arc B580
Driver: 32.0.101.6559
OS: Windows
Model: llama3.2:1B
Prompt: 'write js code hello world'
Inference speed dropped from 150 tokens per second to 20.06 tokens per second
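One way to reproduce a tokens-per-second figure like this is to read the timing fields that Ollama includes in a non-streaming `/api/generate` response: `eval_count` (generated tokens) and `eval_duration` (nanoseconds). The sketch below assumes a server listening on the default `http://localhost:11434` and uses the model tag `llama3.2:1b`; both are assumptions about the local setup, and the function names are illustrative.

```python
import json
import urllib.request

def tokens_per_second(response):
    """Generation speed from an Ollama response; eval_duration is in nanoseconds."""
    return response["eval_count"] * 1e9 / response["eval_duration"]

def generate(prompt, model="llama3.2:1b", host="http://localhost:11434"):
    """Send one non-streaming generate request and return the parsed JSON reply."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)

# Hypothetical usage (requires a running Ollama server):
#   print(tokens_per_second(generate("write js code hello world")))
```

Taking the rate from the server's own counters excludes model load time and prompt processing, which makes runs on different driver versions easier to compare.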