
[BUG] Issue with Zero Optimization for Llama-2-7b Fine-Tuning on Intel GPUs #6713

Open
molang66 opened this issue Nov 5, 2024 · 9 comments
Labels: bug (Something isn't working), training

@molang66
molang66 commented Nov 5, 2024

Describe the bug
I’m experiencing an issue when fine-tuning the Llama-2-7b model from Hugging Face with ZeRO optimization enabled. I am running on 8 Intel Max 1550 GPUs, using the example code provided in Intel Extension for DeepSpeed.

The model loads and runs successfully without ZeRO optimization, but when I enable it (particularly stage 3), I encounter the following errors:
[rank0]: RuntimeError: could not create an engine
2024:11:05-02:39:09:(678567) |CCL_INFO| finalizing level-zero
2024:11:05-02:39:09:(678567) |CCL_INFO| finalized level-zero
0%| | 0/50 [00:00<?, ?it/s]
2024:11:05-02:39:09:(678572) |CCL_INFO| finalizing level-zero
2024:11:05-02:39:09:(678566) |CCL_INFO| finalizing level-zero
...
[2024-11-05 02:39:10,447] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 678572

**System info**
Model: Llama-2-7b from Hugging Face
GPUs: 8x Intel Data Center GPU Max 1550
Software:
• Intel Extension for PyTorch
• DeepSpeed with ZeRO optimization (stage 3)
• oneCCL communication backend

Launcher context
cd transformers
deepspeed --num_gpus=8 examples/pytorch/language-modeling/run_clm.py \
  --deepspeed tests/deepspeed/ds_config_zero3.json \
  --model_name_or_path meta-llama/Llama-2-7b-hf \
  --dataset_name wikitext \
  --dataset_config_name wikitext-2-raw-v1 \
  --dataloader_num_workers 0 \
  --per_device_train_batch_size 1 \
  --warmup_steps 10 \
  --max_steps 50 \
  --bf16 \
  --do_train \
  --output_dir /tmp/test-clm \
  --overwrite_output_dir

molang66 added the bug (Something isn't working) and training labels Nov 5, 2024
@tjruwase
Contributor

tjruwase commented Nov 5, 2024

@delock, can you please help? Thanks!

@Liangliang-Ma
Contributor

@molang66 Hi, I reran the command you pasted in this issue and no such error appeared, so I suspect a version mismatch or an outdated package.

I verified the command with the following versions:

Ubuntu 22.04.2 LTS
torch 2.3
intel-extension-for-pytorch 2.3.110
oneccl-bind-pt 2.3.0+gpu
(torch/ipex/onecclbindpt wheels can be found at https://pytorch-extension.intel.com/release-whl/stable/xpu/us/)
oneAPI 2024.2.1
GPU Driver 950.13 (rolling stable version)

Can you provide more details about your development environment? Or you can try using my verified versions :)
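To gather those environment details quickly, a small standard-library-only script can report the installed version of each relevant package (a minimal sketch; the package list below just mirrors the ones discussed in this thread and can be adjusted):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(package):
    """Return the installed version of `package`, or 'not installed'."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"

if __name__ == "__main__":
    # Packages relevant to this issue; adjust as needed.
    for pkg in ("torch", "intel-extension-for-pytorch",
                "oneccl-bind-pt", "deepspeed", "transformers"):
        print(f"{pkg:30s} {installed_version(pkg)}")
```

Pasting its output into the issue makes version mismatches easy to spot.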

@delock
Collaborator

delock commented Nov 8, 2024

@delock, can you please help? Thanks!

Hi @tjruwase, @Liangliang-Ma will follow up on this issue. Thanks!

@molang66
Author

molang66 commented Nov 8, 2024

Thanks so much for the help. I have updated my oneCCL version, and now I am encountering this issue:

[rank0]: Traceback (most recent call last):
[rank0]: File "/work2/09250/molang66/stampede3/transformers/examples/pytorch/language-modeling/run_clm.py", line 657, in
[rank0]: main()
[rank0]: File "/work2/09250/molang66/stampede3/transformers/examples/pytorch/language-modeling/run_clm.py", line 605, in main
[rank0]: train_result = trainer.train(resume_from_checkpoint=checkpoint)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/transformers/trainer.py", line 2141, in train
[rank0]: return inner_training_loop(
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/transformers/trainer.py", line 2495, in _inner_training_loop
[rank0]: tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/transformers/trainer.py", line 3613, in training_step
[rank0]: loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/transformers/trainer.py", line 3667, in compute_loss
[rank0]: outputs = model(**inputs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 18, in wrapped_fn
[rank0]: ret_val = func(*args, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1899, in forward
[rank0]: loss = self.module(*inputs, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 1199, in forward
[rank0]: outputs = self.model(
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 926, in forward
[rank0]: position_embeddings = self.rotary_emb(hidden_states, position_ids)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/work2/09250/molang66/stampede3/miniconda3/envs/intel_xpu/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 160, in forward
[rank0]: freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2)
[rank0]: RuntimeError: could not create an engine

I was running on the Stampede3 cluster, and my environment is as follows:

OS: CentOS
Python (conda): 3.9
intel_extension_for_pytorch   2.3.110+xpu
oneccl-bind-pt                2.3.100+xpu
torch                         2.3.1+cxx11.abi
oneAPI                        2024.2.1

GPU driver:
[level_zero:gpu][level_zero:7] Intel(R) Level-Zero, Intel(R) Data Center GPU Max 1550 1.3 [1.3.27642]
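For reference, the bracketed number at the end of that `sycl-ls`-style line appears to be the Level Zero driver version (note this is a different numbering scheme from the "950.13" package version mentioned earlier). A small sketch to pull it out of such a line (the helper name is hypothetical; the sample string is just the line quoted above):

```python
import re

def driver_version(sycl_ls_line):
    """Extract the trailing bracketed driver version, e.g. '1.3.27642'."""
    match = re.search(r"\[(\d+(?:\.\d+)+)\]\s*$", sycl_ls_line)
    return match.group(1) if match else None

# Sample input: the sycl-ls line quoted above.
line = ("[level_zero:gpu][level_zero:7] Intel(R) Level-Zero, "
        "Intel(R) Data Center GPU Max 1550 1.3 [1.3.27642]")
print(driver_version(line))  # → 1.3.27642
```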

@Liangliang-Ma
Contributor

Do you have the latest version of DeepSpeed? I have seen a similar issue with an outdated DeepSpeed.

@molang66
Author

@Liangliang-Ma My DeepSpeed version is 0.15.3; I think this is the latest version.
Here is my pip list:

absl-py                       2.1.0
accelerate                    1.1.1
aiohappyeyeballs              2.4.3
aiohttp                       3.10.10
aiosignal                     1.3.1
annotated-types               0.7.0
async-timeout                 4.0.3
attrs                         24.2.0
bitsandbytes                  0.44.1
certifi                       2024.8.30
chardet                       5.2.0
charset-normalizer            3.4.0
click                         8.1.7
colorama                      0.4.6
cpuid                         0.0.11
cpuid-native                  0.0.8
DataProperty                  1.0.1
datasets                      3.1.0
deepspeed                     0.15.3
diffusers                     0.31.0
dill                          0.3.8
dpcpp-cpp-rt                  2024.2.1
einops                        0.8.0
evaluate                      0.4.3
filelock                      3.16.1
frozenlist                    1.5.0
fsspec                        2024.9.0
hjson                         3.1.0
huggingface-hub               0.26.2
idna                          3.10
impi-devel                    2021.13.1
impi-rt                       2021.13.1
importlib_metadata            8.5.0
intel-cmplr-lib-rt            2024.2.1
intel-cmplr-lib-ur            2024.2.1
intel-cmplr-lic-rt            2024.2.1
intel_extension_for_pytorch   2.3.110+xpu
intel-opencl-rt               2024.2.1
intel-openmp                  2024.2.1
intel-sycl-rt                 2024.2.1
Jinja2                        3.1.4
joblib                        1.4.2
jsonlines                     4.0.0
lm_eval                       0.4.2
lxml                          5.3.0
MarkupSafe                    3.0.2
mbstrdecoder                  1.1.3
mkl                           2024.2.1
mkl-dpcpp                     2024.2.1
more-itertools                10.5.0
mpi4py                        4.0.1
mpmath                        1.3.0
msgpack                       1.1.0
multidict                     6.1.0
multiprocess                  0.70.16
networkx                      3.2.1
ninja                         1.11.1.1
nltk                          3.9.1
numexpr                       2.10.1
numpy                         2.0.2
oneccl-bind-pt                2.3.100+xpu
oneccl-devel                  2021.13.1
onemkl-sycl-blas              2024.2.1
onemkl-sycl-datafitting       2024.2.1
onemkl-sycl-dft               2024.2.1
onemkl-sycl-lapack            2024.2.1
onemkl-sycl-rng               2024.2.1
onemkl-sycl-sparse            2024.2.1
onemkl-sycl-stats             2024.2.1
onemkl-sycl-vm                2024.2.1
packaging                     24.1
pandas                        2.2.3
pathvalidate                  3.2.1
peft                          0.13.2
pillow                        11.0.0
pip                           24.3.1
portalocker                   2.10.1
propcache                     0.2.0
protobuf                      3.20.3
psutil                        6.1.0
py-cpuinfo                    9.0.0
pyarrow                       18.0.0
pybind11                      2.13.6
pydantic                      2.9.2
pydantic_core                 2.23.4
pytablewriter                 1.2.0
python-dateutil               2.9.0.post0
pytz                          2024.2
PyYAML                        6.0.2
regex                         2024.11.6
requests                      2.32.3
rouge_score                   0.1.2
ruamel.yaml                   0.18.6
ruamel.yaml.clib              0.2.12
sacrebleu                     2.4.3
safetensors                   0.4.5
scikit-learn                  1.5.2
scipy                         1.13.1
sentencepiece                 0.2.0
setuptools                    75.3.0
six                           1.16.0
sqlitedict                    2.1.0
sympy                         1.13.3
tabledata                     1.3.3
tabulate                      0.9.0
tbb                           2021.13.1
tcolorpy                      0.1.6
threadpoolctl                 3.5.0
tokenizers                    0.20.3
torch                         2.3.1+cxx11.abi
torchaudio                    2.3.1+cxx11.abi
torchvision                   0.18.1+cxx11.abi
tqdm                          4.67.0
tqdm-multiprocess             0.0.11
transformers                  4.47.0.dev0
transformers-stream-generator 0.0.5
typepy                        1.3.2
typing_extensions             4.12.2
tzdata                        2024.2
unzip                         1.0.0
urllib3                       2.2.3
wheel                         0.44.0
word2number                   1.1
xxhash                        3.5.0
yarl                          1.17.1
zipp                          3.20.2
zstandard                     0.23.0 

Could it be my GPU driver version? I don't know what the latest version of the driver is.
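As an aside, when checking whether a package like DeepSpeed is outdated, version strings should be compared numerically rather than lexicographically (as plain strings, "0.9.0" sorts after "0.15.3"). A minimal pure-Python sketch, with the version numbers below purely illustrative:

```python
def version_tuple(version):
    """Convert a dotted version string like '0.15.3' to a comparable tuple."""
    return tuple(int(part) for part in version.split("."))

def is_at_least(installed, minimum):
    """True if `installed` is >= `minimum`, compared numerically."""
    return version_tuple(installed) >= version_tuple(minimum)

print(is_at_least("0.15.3", "0.15.0"))  # → True
print("0.9.0" > "0.15.3")  # lexicographic comparison misleads → True
```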

@Liangliang-Ma
Contributor

@molang66 I think the GPU driver version is the difference. Can you try GPU Driver 950.13 (the rolling stable version) and test again?

@molang66
Author

@Liangliang-Ma Thank you for your response. I'd like to know which command checks the GPU driver version; I didn't see any indication of the rolling stable version.
Additionally, I tried updating oneAPI to 25.0 and DeepSpeed to 0.15.4, but encountered the following error when compiling FusedAdam:

Using /home1/09250/molang66/.cache/torch_extensions/py39_xpu as PyTorch extensions root...
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/intel_extension_for_pytorch/utils/_logger.py:67: UserWarning: [MissingDependency]
                               !! WARNING !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/icpx 1.9.0) may be ABI-incompatible with PyTorch!
Please use a compiler that is ABI-compatible with GCC 5.0 and above.
See https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html.
See https://gist.github.com/goldsborough/d466f43e8ffc948ff92de7486c5216d6
for instructions on how to install GCC 5 or higher.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
                              !! WARNING !!

  warnings.warn(msg)
2024-11-14 01:20:59,864 - _logger.py - IPEX - WARNING - [MissingDependency]
                               !! WARNING !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/icpx 1.9.0) may be ABI-incompatible with PyTorch!
Please use a compiler that is ABI-compatible with GCC 5.0 and above.
See https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html.
See https://gist.github.com/goldsborough/d466f43e8ffc948ff92de7486c5216d6
for instructions on how to install GCC 5 or higher.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
                              !! WARNING !!

Emitting ninja build file /home1/09250/molang66/.cache/torch_extensions/py39_xpu/fused_adam/build.ninja...
Building extension module fused_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/icpx -MMD -MF multi_tensor_adam.dp.o.d -DTORCH_EXTENSION_NAME=fused_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1016\" -I/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/includes -I/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam -isystem /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include -isystem /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/TH -isystem /scratch/projects/compilers/intel25.0/compiler/2025.0/linux/include -isystem /scratch/projects/compilers/intel25.0/compiler/2025.0/linux/include/sycl -isystem /scratch/projects/compilers/intel25.0/mkl/2025.0/include -isystem /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/intel_extension_for_pytorch/include -isystem /work2/09250/molang66/stampede3/miniconda3/envs/intel/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -std=c++17 -fsycl -fsycl-targets=spir64_gen -g -gdwarf-4 -O3 -std=c++17 -fPIC -DMKL_ILP64 -fno-strict-aliasing -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -fsycl -c /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp -o multi_tensor_adam.dp.o
FAILED: multi_tensor_adam.dp.o
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/icpx -MMD -MF multi_tensor_adam.dp.o.d -DTORCH_EXTENSION_NAME=fused_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1016\" -I/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/includes -I/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam -isystem /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include -isystem /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/TH -isystem /scratch/projects/compilers/intel25.0/compiler/2025.0/linux/include -isystem /scratch/projects/compilers/intel25.0/compiler/2025.0/linux/include/sycl -isystem /scratch/projects/compilers/intel25.0/mkl/2025.0/include -isystem /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/intel_extension_for_pytorch/include -isystem /work2/09250/molang66/stampede3/miniconda3/envs/intel/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -std=c++17 -fsycl -fsycl-targets=spir64_gen -g -gdwarf-4 -O3 -std=c++17 -fPIC -DMKL_ILP64 -fno-strict-aliasing -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -fsycl -c /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp -o multi_tensor_adam.dp.o
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:18:
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_apply.dp.hpp:192:70: warning: 'constant_buffer' is deprecated: use 'target::device' instead [-Wdeprecated-declarations]
  192 |                                                        sycl::target::constant_buffer>(cgh);
      |                                                                      ^
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/access/access.hpp:24:19: note: 'constant_buffer' has been explicitly marked deprecated here
   24 |   constant_buffer __SYCL2020_DEPRECATED("use 'target::device' instead") = 2015,
      |                   ^
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/detail/defines_elementary.hpp:62:40: note: expanded from macro '__SYCL2020_DEPRECATED'
   62 | #define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
      |                                        ^
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/detail/defines_elementary.hpp:53:38: note: expanded from macro '__SYCL_DEPRECATED'
   53 | #define __SYCL_DEPRECATED(message) [[deprecated(message)]]
      |                                      ^
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:18:
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_apply.dp.hpp:199:17: warning: expression result unused [-Wunused-value]
  199 |                 0;
      |                 ^
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:19:
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/includes/type_shim.h:91:54: warning: 'this_nd_item<3>' is deprecated: use sycl::ext::oneapi::this_work_item::get_nd_item() instead [-Wdeprecated-declarations]
   91 |     auto item_ct1 = sycl::ext::oneapi::experimental::this_nd_item<3>();
      |                                                      ^
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/ext/oneapi/free_function_queries.hpp:50:1: note: 'this_nd_item<3>' has been explicitly marked deprecated here
   50 | __SYCL_DEPRECATED(
      | ^
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/detail/defines_elementary.hpp:53:38: note: expanded from macro '__SYCL_DEPRECATED'
   53 | #define __SYCL_DEPRECATED(message) [[deprecated(message)]]
      |                                      ^
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:45:58: warning: 'this_nd_item<3>' is deprecated: use sycl::ext::oneapi::this_work_item::get_nd_item() instead [-Wdeprecated-declarations]
   45 |         auto item_ct1 = sycl::ext::oneapi::experimental::this_nd_item<3>();
      |                                                          ^
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/ext/oneapi/free_function_queries.hpp:50:1: note: 'this_nd_item<3>' has been explicitly marked deprecated here
   50 | __SYCL_DEPRECATED(
      | ^
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/detail/defines_elementary.hpp:53:38: note: expanded from macro '__SYCL_DEPRECATED'
   53 | #define __SYCL_DEPRECATED(message) [[deprecated(message)]]
      |                                      ^
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:11:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/ATen.h:7:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/Context.h:3:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/CPUGeneratorImpl.h:3:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/core/Generator.h:21:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/GeneratorImpl.h:8:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/TensorImpl.h:11:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/ScalarType.h:7:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_e4m3fnuz.h:136:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_e4m3fnuz-inl.h:4:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_fnuz_cvt.h:4:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/sycl.hpp:25:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/detail/core.hpp:21:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/accessor.hpp:15:
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/buffer.hpp:174:17: error: static assertion failed due to requirement 'is_device_copyable_v<multi_tensor_apply_kernel<TensorListMetadata<4>, AdamFunctor<double>, float, float, float, float, float, float, adamMode_t, float>>': Underlying type of a buffer must be device copyable!
  174 |   static_assert(is_device_copyable_v<T>,
      |                 ^~~~~~~~~~~~~~~~~~~~~~~
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_apply.dp.hpp:187:34: note: in instantiation of template class 'sycl::buffer<multi_tensor_apply_kernel<TensorListMetadata<4>, AdamFunctor<double>, float, float, float, float, float, float, adamMode_t, float>>' requested here
  187 |                     sycl::buffer params(const_cast<const decltype(capture)*>(&capture),
      |                                  ^
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:146:36: note: in instantiation of function template specialization 'multi_tensor_apply<4, AdamFunctor<double>, float, float, float, float, float, float, adamMode_t, float>' requested here
  146 |                                    multi_tensor_apply<4>(BLOCK_SIZE,
      |                                    ^
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:18:
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_apply.dp.hpp:199:17: warning: expression result unused [-Wunused-value]
  199 |                 0;
      |                 ^
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:146:36: note: in instantiation of function template specialization 'multi_tensor_apply<4, AdamFunctor<double>, float, float, float, float, float, float, adamMode_t, float>' requested here
  146 |                                    multi_tensor_apply<4>(BLOCK_SIZE,
      |                                    ^
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:11:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/ATen.h:7:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/Context.h:3:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/CPUGeneratorImpl.h:3:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/core/Generator.h:21:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/GeneratorImpl.h:8:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/TensorImpl.h:11:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/ScalarType.h:7:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_e4m3fnuz.h:136:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_e4m3fnuz-inl.h:4:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_fnuz_cvt.h:4:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/sycl.hpp:25:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/detail/core.hpp:21:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/accessor.hpp:15:
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/buffer.hpp:174:17: error: static assertion failed due to requirement 'is_device_copyable_v<multi_tensor_apply_kernel<TensorListMetadata<4>, AdamFunctor<float>, float, float, float, float, float, float, adamMode_t, float>>': Underlying type of a buffer must be device copyable!
  174 |   static_assert(is_device_copyable_v<T>,
      |                 ^~~~~~~~~~~~~~~~~~~~~~~
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_apply.dp.hpp:187:34: note: in instantiation of template class 'sycl::buffer<multi_tensor_apply_kernel<TensorListMetadata<4>, AdamFunctor<float>, float, float, float, float, float, float, adamMode_t, float>>' requested here
  187 |                     sycl::buffer params(const_cast<const decltype(capture)*>(&capture),
      |                                  ^
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:146:36: note: in instantiation of function template specialization 'multi_tensor_apply<4, AdamFunctor<float>, float, float, float, float, float, float, adamMode_t, float>' requested here
  146 |                                    multi_tensor_apply<4>(BLOCK_SIZE,
      |                                    ^
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:18:
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_apply.dp.hpp:199:17: warning: expression result unused [-Wunused-value]
  199 |                 0;
      |                 ^
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:146:36: note: in instantiation of function template specialization 'multi_tensor_apply<4, AdamFunctor<float>, float, float, float, float, float, float, adamMode_t, float>' requested here
  146 |                                    multi_tensor_apply<4>(BLOCK_SIZE,
      |                                    ^
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:11:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/ATen.h:7:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/Context.h:3:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/CPUGeneratorImpl.h:3:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/core/Generator.h:21:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/GeneratorImpl.h:8:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/TensorImpl.h:11:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/ScalarType.h:7:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_e4m3fnuz.h:136:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_e4m3fnuz-inl.h:4:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_fnuz_cvt.h:4:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/sycl.hpp:25:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/detail/core.hpp:21:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/accessor.hpp:15:
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/buffer.hpp:174:17: error: static assertion failed due to requirement 'is_device_copyable_v<multi_tensor_apply_kernel<TensorListMetadata<4>, AdamFunctor<c10::Half>, float, float, float, float, float, float, adamMode_t, float>>': Underlying type of a buffer must be device copyable!
  174 |   static_assert(is_device_copyable_v<T>,
      |                 ^~~~~~~~~~~~~~~~~~~~~~~
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_apply.dp.hpp:187:34: note: in instantiation of template class 'sycl::buffer<multi_tensor_apply_kernel<TensorListMetadata<4>, AdamFunctor<c10::Half>, float, float, float, float, float, float, adamMode_t, float>>' requested here
  187 |                     sycl::buffer params(const_cast<const decltype(capture)*>(&capture),
      |                                  ^
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:146:36: note: in instantiation of function template specialization 'multi_tensor_apply<4, AdamFunctor<c10::Half>, float, float, float, float, float, float, adamMode_t, float>' requested here
  146 |                                    multi_tensor_apply<4>(BLOCK_SIZE,
      |                                    ^
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:18:
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_apply.dp.hpp:199:17: warning: expression result unused [-Wunused-value]
  199 |                 0;
      |                 ^
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:146:36: note: in instantiation of function template specialization 'multi_tensor_apply<4, AdamFunctor<c10::Half>, float, float, float, float, float, float, adamMode_t, float>' requested here
  146 |                                    multi_tensor_apply<4>(BLOCK_SIZE,
      |                                    ^
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:11:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/ATen.h:7:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/Context.h:3:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/CPUGeneratorImpl.h:3:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/ATen/core/Generator.h:21:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/GeneratorImpl.h:8:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/TensorImpl.h:11:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/core/ScalarType.h:7:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_e4m3fnuz.h:136:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_e4m3fnuz-inl.h:4:
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/torch/include/c10/util/Float8_fnuz_cvt.h:4:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/sycl.hpp:25:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/detail/core.hpp:21:
In file included from /scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/accessor.hpp:15:
/scratch/projects/compilers/intel25.0/compiler/2025.0/bin/compiler/../../include/sycl/buffer.hpp:174:17: error: static assertion failed due to requirement 'is_device_copyable_v<multi_tensor_apply_kernel<TensorListMetadata<4>, AdamFunctor<c10::BFloat16>, float, float, float, float, float, float, adamMode_t, float>>': Underlying type of a buffer must be device copyable!
  174 |   static_assert(is_device_copyable_v<T>,
      |                 ^~~~~~~~~~~~~~~~~~~~~~~
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_apply.dp.hpp:187:34: note: in instantiation of template class 'sycl::buffer<multi_tensor_apply_kernel<TensorListMetadata<4>, AdamFunctor<c10::BFloat16>, float, float, float, float, float, float, adamMode_t, float>>' requested here
  187 |                     sycl::buffer params(const_cast<const decltype(capture)*>(&capture),
      |                                  ^
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:146:36: note: in instantiation of function template specialization 'multi_tensor_apply<4, AdamFunctor<c10::BFloat16>, float, float, float, float, float, float, adamMode_t, float>' requested here
  146 |                                    multi_tensor_apply<4>(BLOCK_SIZE,
      |                                    ^
In file included from /work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:18:
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_apply.dp.hpp:199:17: warning: expression result unused [-Wunused-value]
  199 |                 0;
      |                 ^
/work2/09250/molang66/stampede3/miniconda3/envs/intel/lib/python3.9/site-packages/deepspeed/ops/csrc/xpu/adam/multi_tensor_adam.dp.cpp:146:36: note: in instantiation of function template specialization 'multi_tensor_apply<4, AdamFunctor<c10::BFloat16>, float, float, float, float, float, float, adamMode_t, float>' requested here
  146 |                                    multi_tensor_apply<4>(BLOCK_SIZE,
      |                                    ^
8 warnings and 4 errors generated.
ninja: build stopped: subcommand failed.

Is this expected? Compilation worked fine with version 24.2.1.

Liangliang-Ma commented Nov 15, 2024

@molang66 You can run dpkg -l | grep -P "intel|level-zero|libigc|libigd|libigf|opencl" to see the installed driver components. If you install the same GPU driver version as mine, you will see output like this:

ii intel-fw-gpu 2024.24.5-337-22.04
ii intel-level-zero-gpu-devel 1.5.9999.17145-embargo
ri level-zero 1.16.11
ii libigc-dev 1.0.17537.24-996-22.04
ii libze-intel-gpu-dev 24.35.30872.31-996-22.04
(This is just a sample of the output; you can compare the version numbers against yours.)

And we suggest you keep using oneAPI 2024.2.1 with IPEX 2.3.110, since these versions are currently matched for use together.
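If you want to script the version check rather than eyeball it, a minimal sketch is below. It only uses the standard library; the package name and the recommended 2.3.110 version come from the comment above, and the exact pip distribution name is an assumption that may need adjusting for your install:

```python
# Sketch: verify an installed package matches a recommended version prefix.
# Assumption: IPEX is pip-installed as "intel-extension-for-pytorch";
# the recommended version (2.3.110 with oneAPI 2024.2.1) is from this thread.
from importlib import metadata


def check_version(package: str, expected_prefix: str) -> bool:
    """Return True if `package` is installed and its version starts with expected_prefix."""
    try:
        return metadata.version(package).startswith(expected_prefix)
    except metadata.PackageNotFoundError:
        return False


if __name__ == "__main__":
    if not check_version("intel-extension-for-pytorch", "2.3.110"):
        print("Warning: IPEX 2.3.110 is the version suggested for oneAPI 2024.2.1")
```

Running this before launching DeepSpeed gives an early warning when the environment drifts from the suggested pairing.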
