The latest driver 32.0.101.6559 broke inference speed #12806

Open
xyang2013 opened this issue Feb 11, 2025 · 6 comments

@xyang2013

Intel Arc B580
Driver: 32.0.101.6559
OS: Windows
Model: llama3.2:1B
Prompt: 'write js code hello world'
Inference speed dropped from 150 tokens per second to 20.06 tokens per second

@Oscilloscope98
Contributor

Oscilloscope98 commented Feb 11, 2025

Hi @xyang2013,

We have reproduced this issue. Would you mind unsetting the environment variable SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS (for example, with set SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=) and trying again?
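For anyone launching the server from a script rather than from a command prompt, the same effect can be had by starting the process with the variable removed from its environment. This is a minimal sketch, not part of the project's tooling: the helper name `env_without` is hypothetical, and the commented-out launch line assumes the server is started with `ollama serve`.

```python
import os

VAR = "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS"

def env_without(name, base=None):
    """Return a copy of the environment mapping with `name` removed."""
    env = dict(os.environ if base is None else base)
    env.pop(name, None)  # no error if the variable was never set
    return env

# Environment for the child process; the current process is left untouched.
child_env = env_without(VAR)

# Assumption: the server is started with `ollama serve`. Adjust to your setup.
# import subprocess
# subprocess.Popen(["ollama", "serve"], env=child_env)
```

Passing a modified copy through `env=` keeps the change local to the launched process, so other programs that rely on the variable are unaffected.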

@xyang2013
Author

Hi @Oscilloscope98,

Thank you! I was able to return to the expected speed (between 155 and 160 tokens per second).

However, the documentation states that setting the following should improve performance:
set SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

Is your suggestion a temporary fix?

@xyang2013
Author

By the way, is it possible to communicate with Microsoft and ask them to stop rolling back the graphics driver to a version that is several months old? It constantly creates a lot of unnecessary work. As you know, performance can depend on the graphics driver.

https://x.com/XYang2023/status/1889451590031122456

@Oscilloscope98
Contributor

Oscilloscope98 commented Feb 12, 2025

> Hi @Oscilloscope98,
>
> Thank you! I was able to return to the expected speed (between 155 and 160 tokens per second).
>
> However, the documentation states that setting the following should improve performance: set SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
>
> Is your suggestion a temporary fix?

Hi @xyang2013,

In our Ollama documentation, we treat SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 as an optional environment variable:

rem under most circumstances, the following environment variable may improve performance, but sometimes this may also cause performance degradation
set SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

We may update our documentation to make this clearer.
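Since the variable can help or hurt depending on the driver, it may be worth measuring both configurations on your own machine. The sketch below assumes responses in the format of Ollama's `/api/generate` endpoint, where `eval_count` is the number of generated tokens and `eval_duration` is the generation time in nanoseconds. The helper name `tokens_per_second` and the sample response are made up for illustration.

```python
def tokens_per_second(response):
    """Generation speed from an Ollama-style response dict.

    Assumes `eval_count` (tokens generated) and `eval_duration`
    (nanoseconds spent generating) are present in the response.
    """
    return response["eval_count"] * 1_000_000_000 / response["eval_duration"]

# Hypothetical response: 300 tokens generated in 2 seconds.
sample_response = {"eval_count": 300, "eval_duration": 2_000_000_000}
print(f"{tokens_per_second(sample_response):.2f} tokens/s")
```

Running the same prompt once with the variable set and once with it unset, then comparing the two figures, shows which setting suits a given driver.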

> By the way, is it possible to communicate with Microsoft and ask them to stop rolling back the graphics driver to a version that is several months old? It constantly creates a lot of unnecessary work. As you know, performance can depend on the graphics driver.
>
> https://x.com/XYang2023/status/1889451590031122456

For the second issue, if you run into the driver rollback problem after updating your driver, you could try the following:

Open Device Manager -> Display adapters -> right-click the specific Intel Graphics device -> Update driver -> Browse my computer for drivers -> Let me pick from a list of available drivers on my computer -> double-click the driver version that you want to reinstall.

This way, the driver may not roll back again :)

Please let us know if you run into any further problems.

@xyang2013
Author

Hi @Oscilloscope98,

Thank you for suggesting a way to prevent driver rollbacks. I just did what you suggested.

I mentioned the SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS flag because, with the previous driver, it provided a 5–10% performance boost. However, with the new driver, I can’t match the performance shown here:
https://x.com/XYang2023/status/1888537903124582803

My best run on the new driver is 158 t/s—about 7% lower than the previous 170 t/s. I checked that the GPU (ASRock Intel Arc B580 Steel Legend) was running at an expected clock speed, so I suspect the difference is related to either settings or the driver version. Unfortunately, I don’t have the exact driver version used for the 170 t/s run, but I’m sure I had SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 set at that time.
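The percentages quoted in this thread follow from a simple relative-change calculation. A minimal sketch using the figures reported above (the helper name `percent_drop` is hypothetical):

```python
def percent_drop(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0

# Figures reported in this thread, in tokens per second.
# Best earlier run vs. best run on the new driver:
print(f"{percent_drop(170, 158):.1f}%")
# Original regression with the variable set:
print(f"{percent_drop(150, 20.06):.1f}%")
```

The first figure comes out at roughly 7%, matching the estimate above, while the original regression from 150 to 20.06 tokens per second is a drop of roughly 87%.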

@Oscilloscope98
Contributor

Hi @xyang2013,

Thank you for pointing out this performance degradation, which persists on driver 32.0.101.6559 even with SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS unset. We will investigate further and let you know of any updates :)
