Bug: SYCL builds >= b4069 have half the context limit of previous builds #10421
Labels
bug-unconfirmed
critical severity
What happened?
Command line:
llama-server.exe -t 16 --threads-http 8 --mlock -ngl 99 -m C:\LLM\Qwen2.5-3B-Instruct_Q4_1.gguf --port 8888 --ctx-size 112000 -np 48 --sampling-seq mt --min-p 0.1 --temp 1.5 -dt .1
This command line works fine on builds before b4069.
With b4069, I have to lower the context size all the way to 60k.
GPU: Intel Arc A770 (16GB)
OS: Windows
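For reference, here is a rough sketch of the arithmetic behind these numbers. It assumes that llama-server divides the total `--ctx-size` evenly across the `-np` slots, and that the KV cache is stored as f16. The layer count, KV head count, and head dimension are assumed values for Qwen2.5-3B, not read from the model file, so check them against the GGUF metadata. The function names are hypothetical helpers and not part of llama.cpp.

```python
# Rough estimate only: ignores compute buffers, model weights, and any
# backend-specific allocation overhead.

def per_slot_context(ctx_size: int, n_parallel: int) -> int:
    """Tokens available to each server slot, assuming an even split."""
    return ctx_size // n_parallel


def kv_cache_bytes(ctx_size: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """Size of the K and V caches for ctx_size tokens (f16 by default)."""
    # Factor 2 accounts for storing both K and V.
    return 2 * ctx_size * n_layers * n_kv_heads * head_dim * bytes_per_elem


if __name__ == "__main__":
    # ASSUMED Qwen2.5-3B shape; verify against the GGUF metadata.
    N_LAYERS, N_KV_HEADS, HEAD_DIM = 36, 2, 128

    for ctx in (112_000, 60_000):
        slot = per_slot_context(ctx, 48)
        gib = kv_cache_bytes(ctx, N_LAYERS, N_KV_HEADS, HEAD_DIM) / 2**30
        print(f"ctx={ctx}: {slot} tokens per slot, ~{gib:.2f} GiB KV cache")
```

If those assumptions hold, the KV cache at either context size should fit well inside 16GB, which would suggest the regression comes from some other allocation rather than the cache itself.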
Name and Version
ZE_LOADER_DEBUG_TRACE:Using Loader Library Path:
ZE_LOADER_DEBUG_TRACE:Tracing Layer Library Path: ze_tracing_layer.dll
ggml_sycl_init: GGML_SYCL_FORCE_MMQ: no
ggml_sycl_init: SYCL_USE_XMX: yes
ggml_sycl_init: found 1 SYCL devices:
version: 4069 (2e82ffa)
built with MSVC 19.41.34123.0 for
What operating system are you seeing the problem on?
Windows
Relevant log output
No response