Refine Qualification AutoTuner recommendations for shuffle partitions for CPU event logs #1479

parthosa · 2024-12-31T04:18:32Z

This PR updates the Qualification AutoTuner to recommend setting shuffle partitions to 200 when processing CPU event logs. For GPU event logs, the existing logic in the Profiling AutoTuner remains unchanged.

Changes

Improvements to `AutoTuner`:

Changed the visibility of `limitedLogicRecommendations` from private to protected to allow subclass access.
Moved the initialization of `shuffleStagesWithPosSpilling` inside a conditional block to avoid unnecessary computations.

Enhancements to `QualificationAutoTuner`:

Added an override of `limitedLogicRecommendations` that includes `spark.sql.shuffle.partitions`, so the Qualification AutoTuner skips calculations for that property and depends on its default value.
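
As a rough illustration of the override described above, here is a minimal sketch. The class names `SketchAutoTuner` and `SketchQualificationAutoTuner`, the `defaults` map, and the `calculate`/`recommend` methods are hypothetical stand-ins and not the actual classes in this PR; only `limitedLogicRecommendations`, its `mutable.HashSet[String]` type, and `spark.sql.shuffle.partitions` come from the change itself. The value 200 is Spark's documented default for `spark.sql.shuffle.partitions`.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the base AutoTuner (not the real class).
class SketchAutoTuner {
  // Properties for which calculations are skipped and defaults are used.
  // Declared `protected` so that subclasses can override it.
  // Assumption: empty in this sketch; the real base set is not shown in the PR text.
  protected val limitedLogicRecommendations: mutable.HashSet[String] =
    mutable.HashSet.empty[String]

  // Assumed table of default values (hypothetical helper).
  protected val defaults: Map[String, String] =
    Map("spark.sql.shuffle.partitions" -> "200")

  // Placeholder for event-log-driven logic (hypothetical; returns nothing here).
  protected def calculate(key: String): Option[String] = None

  // Limited-logic properties bypass the calculation and use the default.
  def recommend(key: String): Option[String] =
    if (limitedLogicRecommendations.contains(key)) defaults.get(key)
    else calculate(key)
}

// Hypothetical stand-in for QualificationAutoTuner.
class SketchQualificationAutoTuner extends SketchAutoTuner {
  override protected val limitedLogicRecommendations: mutable.HashSet[String] =
    mutable.HashSet("spark.sql.shuffle.partitions")
}
```

With this sketch, `new SketchQualificationAutoTuner().recommend("spark.sql.shuffle.partitions")` yields the default, while the base class falls through to its own calculation path. Note that the overridden `val` is only read from methods called after construction, which avoids Scala's initialization-order pitfalls with overridden vals.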

Tests

Introduced a helper method `buildDefaultAutoTuner` in `QualificationAutoTunerSuite` to create instances with default properties.
Added a new test to verify that `QualificationAutoTuner` sets shuffle partitions to 200.

Signed-off-by: Partho Sarthi <[email protected]>

amahussein

Thanks @parthosa
LGTM

amahussein · 2024-12-31T16:20:53Z

core/src/main/scala/com/nvidia/spark/rapids/tool/tuning/QualificationAutoTuner.scala

+   * List of recommendations for which the Qualification AutoTuner skips calculations and only
+   * depend on default values.
+   */
+  override protected val limitedLogicRecommendations: mutable.HashSet[String] = mutable.HashSet(


QQ:
`limitedLogicRecommendations` is expected to be platform-specific.
Will it be feasible to change the behavior based on platform when needed in the future?

Yes, we could use a similar class-based approach (say `PlatformSpecificAutoTunerProvider`) to provide any platform-specific tunings.
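
One possible shape for such a provider, purely as a sketch: `PlatformSpecificAutoTunerProvider` is only a name floated in the reply above, and everything else here (`DefaultQualificationProvider`, `ExamplePlatformProvider`, `ProviderLookup`, the platform string, and the extra config key) is hypothetical and not part of this PR.

```scala
import scala.collection.mutable

// Hypothetical trait: supplies the limited-logic properties for one platform.
trait PlatformSpecificAutoTunerProvider {
  def limitedLogicRecommendations: mutable.HashSet[String]
}

// Default behavior, matching what this PR does for the Qualification AutoTuner.
object DefaultQualificationProvider extends PlatformSpecificAutoTunerProvider {
  override def limitedLogicRecommendations: mutable.HashSet[String] =
    mutable.HashSet("spark.sql.shuffle.partitions")
}

// A made-up platform that adds one more default-only property.
object ExamplePlatformProvider extends PlatformSpecificAutoTunerProvider {
  override def limitedLogicRecommendations: mutable.HashSet[String] =
    mutable.HashSet(
      "spark.sql.shuffle.partitions",
      "spark.example.hypothetical.conf" // placeholder key, not a real Spark conf
    )
}

// Hypothetical lookup from a platform name to its provider.
object ProviderLookup {
  def forPlatform(platform: String): PlatformSpecificAutoTunerProvider =
    platform match {
      case "example-platform" => ExamplePlatformProvider
      case _                  => DefaultQualificationProvider
    }
}
```

A tuner could then read its set from `ProviderLookup.forPlatform(...)` rather than hard-coding it, keeping platform differences out of the tuner class itself.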

Add shuffle partition conf to limited logic recommendation

61fd0f9

Signed-off-by: Partho Sarthi <[email protected]>

parthosa added the bug and core_tools labels Dec 31, 2024

parthosa self-assigned this Dec 31, 2024

parthosa marked this pull request as ready for review December 31, 2024 16:03

parthosa requested review from cindyyuanjiang and amahussein December 31, 2024 16:03

amahussein approved these changes Dec 31, 2024


parthosa merged commit 891b3b5 into NVIDIA:dev Dec 31, 2024
16 checks passed

parthosa deleted the spark-rapids-tools-1400 branch December 31, 2024 21:55
