XPU code update #4048

Draft · wants to merge 12 commits into develop
Conversation

@eunwoosh (Contributor) commented Oct 21, 2024

Summary

  • torch version update
    • IPEX has been integrated into torch since torch 2.4 (a dedicated index URL is needed when installing torch).
    • torch 2.4 supports the Max dGPU; torch 2.5 also supports ARC.
  • mixed precision
    • According to the documentation, both fp16 and bf16 mixed precision are supported.
    • However, there are problems when using the gradient scaler with fp16.
    • The torch gradient scaler unscales tensors in fp64 (fp32 -> fp64), but XPU doesn't support fp64 yet.
    • The existing precision plugin for XPU in OTX isn't necessary anymore and can be removed.
  • OTX code update
    • CUDA-oriented code (e.g. torch.cuda.amp) needs to be updated; see the sketch after this list.
    • ipex.optimize no longer exists, so the code for it in the XPU strategy can be removed.
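To illustrate the mixed-precision and torch.cuda.amp points above, here is a minimal sketch of one bf16 training step on XPU using the device-agnostic torch.autocast API instead of torch.cuda.amp. It assumes torch >= 2.5 with built-in XPU support; the model, data, and optimizer are placeholders rather than OTX code, and the install URL should be checked against the torch version in use.

# Assumed install from the dedicated XPU wheel index (verify for your torch version):
#   pip install torch --index-url https://download.pytorch.org/whl/xpu
import torch

device = torch.device("xpu")
model = torch.nn.Linear(8, 2).to(device)                  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # placeholder optimizer

for _ in range(3):
    x = torch.randn(4, 8, device=device)                  # placeholder batch
    y = torch.randint(0, 2, (4,), device=device)
    optimizer.zero_grad()
    # device-agnostic autocast replaces torch.cuda.amp.autocast
    with torch.autocast(device_type="xpu", dtype=torch.bfloat16):
        loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()   # bf16 keeps fp32 dynamic range, so no GradScaler is needed
    optimizer.step()

With fp16, a GradScaler would normally be required, which is exactly where the fp64 unscale limitation described above becomes a problem; bf16 sidesteps it.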

How to test

Checklist

  • I have added unit tests to cover my changes.
  • I have added integration tests to cover my changes.
  • I have run e2e tests and there are no issues.
  • I have added the description of my changes into CHANGELOG in my target branch (e.g., CHANGELOG in develop).
  • I have updated the documentation in my target branch accordingly (e.g., documentation in develop).
  • I have linked related issues.

License

  • I submit my code changes under the same Apache License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

@github-actions bot added the DEPENDENCY, BUILD, and OTX 2.0 labels on Oct 21, 2024
@github-actions bot added the TEST label on Oct 22, 2024
@@ -1107,7 +1108,16 @@ def _build_trainer(self, **kwargs) -> None:
            self._cache.update(strategy="xpu_single")
            # add plugin for Automatic Mixed Precision on XPU
            if self._cache.args.get("precision", 32) == 16:
                self._cache.update(plugins=[MixedPrecisionXPUPlugin()])
                msg = "XPU doesn't support fp16 now, so bfp16 will be used instead."
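As a standalone, hedged sketch of the fp16-to-bf16 fallback this hunk appears to introduce (only the lines above are visible, so the helper name, the warning call, and the "bf16-mixed" precision string are assumptions, the latter following Lightning's precision naming):

import warnings

def resolve_xpu_precision(requested):
    """Hypothetical helper: map a requested trainer precision to one XPU can run."""
    if requested in (16, "16", "16-mixed"):
        warnings.warn("XPU doesn't support fp16 now, so bf16 will be used instead.", stacklevel=2)
        return "bf16-mixed"
    return requested

print(resolve_xpu_precision(16))   # -> "bf16-mixed", with a warning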
Contributor: The feedback from IPEX is that BF16 is preferable for computer vision.

Contributor: I think we can remove the warning then.

@@ -100,9 +100,6 @@ def test_perf(
        fxt_benchmark: Benchmark,
        fxt_accelerator: str,
    ):
        if fxt_model.name == "dino_v2" and fxt_accelerator == "xpu":
Contributor: Are these models supported now?
