Replies: 2 comments
Delta tables perform slightly worse than plain Hive tables. Delta uses SQL to query its metadata during SQL processing, but some operators in that metadata query are not supported. In some cases this causes frequent C2R/R2C (columnar-to-row and row-to-columnar) transitions, and the query ends up slower than vanilla Spark. Contributions to fix this are welcome.
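One rough way to check whether a given query is affected is to count the C2R/R2C transitions in its physical plan text. The snippet below is only a sketch: the node-name patterns are an assumption (the exact names differ across Spark and Gluten versions), and `SAMPLE_PLAN` is a made-up plan for illustration, not output from a real run.

```python
import re
from collections import Counter

# Assumed node-name patterns for the transitions; the real names depend on
# the Spark/Gluten version, so adjust them to what your plans actually show.
TRANSITION_PATTERNS = {
    "C2R": re.compile(r"\b\w*ColumnarToRow\w*\b"),
    "R2C": re.compile(r"\bRowTo\w*Columnar\w*\b"),
}


def count_transitions(plan_text: str) -> Counter:
    """Count columnar-to-row and row-to-columnar nodes in a plan string."""
    counts = Counter()
    for label, pattern in TRANSITION_PATTERNS.items():
        counts[label] = len(pattern.findall(plan_text))
    return counts


# Hypothetical plan text, written by hand for this example only.
SAMPLE_PLAN = """\
VeloxColumnarToRow
+- ^(2) ProjectExecTransformer [id#1]
   +- RowToVeloxColumnar
      +- Filter (isnotnull(id#1))
         +- VeloxColumnarToRow
            +- ^(1) FileSourceScanExecTransformer parquet [id#1]
"""

if __name__ == "__main__":
    print(dict(count_transitions(SAMPLE_PLAN)))
```

To use it on a real query, paste in the plan text printed by `df.explain()`. A count that is much higher for the Delta query than for the equivalent Parquet query would be consistent with the fallback behaviour described above.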
Thanks @FelixYBW for your detailed response.
Hi Gluten Community,
I am currently exploring the performance of Apache Gluten with the Velox backend specifically for Delta Lake workloads.
While there are several TPC-DS benchmark reports available for Parquet/ORC, I am looking for insights or existing benchmarking results for the following specific setup:
Context:
We are evaluating the overhead of the Delta Log reading process versus the native acceleration provided by Velox. Specifically, we are interested in:
If anyone has run these benchmarks or has a performance comparison (Native Spark vs. Gluten+Velox) for this setup, I would greatly appreciate it if you could share your findings or any tuning tips!
Thanks!