Optimized inference of XGLM model on HPU #1323

XinyuYe-Intel · 2024-09-10T08:00:07Z

What does this PR do?

Optimizes inference of the XGLM model on HPU.

Before submitting

Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Signed-off-by: Ye, Xinyu <[email protected]>

libinta · 2024-09-18T21:45:43Z

@XinyuYe-Intel can you provide the Gaudi2 test results (RUN_SLOW=true GAUDI2_CI test) on the latest 1.17/1.18 docker, and the Gaudi1 test results?

XinyuYe-Intel · 2024-09-20T07:14:43Z

> @XinyuYe-Intel can you provide the Gaudi2 test results (RUN_SLOW=true GAUDI2_CI test) on the latest 1.17/1.18 docker, and the Gaudi1 test results?

Perf on Gaudi2 on 1.17.1 with RUN_SLOW=true is as below:

For Gaudi1, I don't have the machine, so I can't provide the result.

Signed-off-by: Ye, Xinyu <[email protected]>

ssarkar2

@XinyuYe-Intel could you please resolve the conflicts on this PR? Looks good otherwise.

XinyuYe-Intel · 2024-10-12T06:12:14Z

> @XinyuYe-Intel could you please resolve the conflicts on this PR? Looks good otherwise.

Resolved conflicts.

XinyuYe-Intel added 2 commits September 10, 2024 03:29

Optimized inference of XGLM model on HPU

2f78787

Signed-off-by: Ye, Xinyu <[email protected]>

add test.

c48fc46

Signed-off-by: Ye, Xinyu <[email protected]>

XinyuYe-Intel requested review from ssarkar2, bhargaveede, vivekgoe and regisss as code owners September 10, 2024 08:00

add readme.

45d4d4c

Signed-off-by: Ye, Xinyu <[email protected]>

XinyuYe-Intel force-pushed the xglm branch from fa21cef to 45d4d4c Compare September 13, 2024 06:23

XinyuYe-Intel added 3 commits September 26, 2024 11:25

Merge branch 'main' into xglm

a5d729e

style fix

f748171

Signed-off-by: Ye, Xinyu <[email protected]>

Merge branch 'main' into xglm

48dc721

ssarkar2 reviewed Oct 11, 2024

Merge branch 'main' into xglm

6b807a3

ssarkar2 approved these changes Oct 16, 2024
