
Usage of Gluten Functions At Query Level #8451

Open
rajatma1993 opened this issue Jan 7, 2025 · 1 comment
Labels
bug Something isn't working triage


Backend

VL (Velox)

Bug description

Hello Team,

I am currently benchmarking Gluten with TPC-H. For TPC-H Query 16, the flame graph shows Gluten functions accounting for only 4.69% of the overall total.

I would like to understand whether this level of utilization is expected. Additionally, could you confirm whether this level of utilization contributes significantly to the execution speedup?

Attached is the Flame Graph we have generated.
[Image: gluten_snap_q16]

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

No response

Contributor

FelixYBW commented Jan 7, 2025

It depends on the per-partition data size of the stage. You can increase the TPC-H scale factor (SF) or decrease the number of partitions so that each task gets a larger partition. The larger the partition a task processes, the more time it spends in native code.
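To make the two knobs above concrete, here is a rough sketch of how the average partition size per task changes with data volume and partition count. The data sizes are made-up inputs, not measurements from this issue, and the helper name `avg_partition_mb` is hypothetical. The two Spark settings listed at the end are standard Spark SQL options, but the values shown are only illustrative and would need tuning for a real cluster.

```python
# Illustrative arithmetic only: sizes are hypothetical, not taken from the flame graph.

def avg_partition_mb(stage_data_gb: float, num_partitions: int) -> float:
    """Average data volume per task in MiB, assuming evenly sized partitions."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return stage_data_gb * 1024 / num_partitions


# Baseline: 10 GiB of stage data spread over 200 partitions.
baseline = avg_partition_mb(stage_data_gb=10, num_partitions=200)

# Option 1: same data, fewer partitions, so each task gets a larger partition.
fewer_partitions = avg_partition_mb(stage_data_gb=10, num_partitions=50)

# Option 2: same partition count, 10x larger scale factor (about 10x the data).
larger_sf = avg_partition_mb(stage_data_gb=100, num_partitions=200)

# Standard Spark SQL settings that influence partition count and size.
# The values are assumptions for illustration, not recommendations.
spark_conf_sketch = {
    "spark.sql.shuffle.partitions": "50",         # fewer shuffle partitions
    "spark.sql.files.maxPartitionBytes": "512m",  # larger file-scan partitions
}

if __name__ == "__main__":
    print(f"baseline:         {baseline:.1f} MiB per task")
    print(f"fewer partitions: {fewer_partitions:.1f} MiB per task")
    print(f"larger SF:        {larger_sf:.1f} MiB per task")
```

Either change increases the data each task handles, which is the condition the comment describes for spending a larger share of task time in native code.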
