Refine Qualification AutoTuner recommendations for shuffle partitions for CPU event logs #1479

Merged (1 commit) on Dec 31, 2024

parthosa
Collaborator

Fixes #1400.

This PR updates the Qualification AutoTuner to recommend setting shuffle partitions to 200 when processing CPU event logs. For GPU event logs, the existing logic in the Profiling AutoTuner remains unchanged.

Changes

Improvements to AutoTuner:

  • Changed the visibility of limitedLogicRecommendations from private to protected to allow subclass access.
  • Moved the initialization of shuffleStagesWithPosSpilling inside a conditional block to avoid unnecessary computations.
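
The second bullet can be illustrated with a minimal sketch. It assumes a simplified tuner in which the spill analysis is only needed when the shuffle-partition value is actually computed. `StageInfo`, `SpillAwareTuner`, `skipSpillLogic`, and the doubling rule are hypothetical names and placeholders invented for this example; only `shuffleStagesWithPosSpilling` is a name taken from the PR description, and its real implementation may differ.

```scala
// Hypothetical, simplified types; not the real spark-rapids-tools classes.
final case class StageInfo(id: Int, isShuffle: Boolean, spilledBytes: Long)

class SpillAwareTuner(stages: Seq[StageInfo], skipSpillLogic: Boolean) {
  // Counts scans of the stage list so the saving is observable.
  var scans: Int = 0

  // Stand-in for the (potentially expensive) spill analysis.
  private def shuffleStagesWithPosSpilling(): Seq[StageInfo] = {
    scans += 1
    stages.filter(s => s.isShuffle && s.spilledBytes > 0)
  }

  def recommendShufflePartitions(current: Int): Int = {
    if (skipSpillLogic) {
      200 // fixed default; the stage list is never scanned
    } else {
      // Computed only inside the branch that consumes it.
      val spilling = shuffleStagesWithPosSpilling()
      // Doubling is a placeholder rule chosen for this sketch.
      if (spilling.nonEmpty) current * 2 else current
    }
  }
}
```

With `skipSpillLogic = true` the stage list is never traversed, which is the kind of saving the bullet describes.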

Enhancements to QualificationAutoTuner:

  • Added an override for limitedLogicRecommendations in QualificationAutoTuner to include spark.sql.shuffle.partitions.
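
A rough sketch of the override pattern follows, assuming the base class exposes the set as a `protected val` and consults it only after construction. `BaseTunerSketch` and `QualTunerSketch` are hypothetical stand-ins, and the base set is left empty here because its real contents are not shown in this PR.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the base AutoTuner.
class BaseTunerSketch {
  // `protected` rather than `private`, so a subclass can override it.
  // Left empty in this sketch; the real base set is not shown in the PR.
  protected val limitedLogicRecommendations: mutable.HashSet[String] =
    mutable.HashSet.empty[String]

  // Read only after construction: an overridden val is not yet
  // initialized while the base-class constructor is still running.
  def usesDefaultOnly(key: String): Boolean =
    limitedLogicRecommendations.contains(key)
}

// Hypothetical stand-in for QualificationAutoTuner.
class QualTunerSketch extends BaseTunerSketch {
  override protected val limitedLogicRecommendations: mutable.HashSet[String] =
    mutable.HashSet("spark.sql.shuffle.partitions")
}
```

Under these assumptions, `new QualTunerSketch().usesDefaultOnly("spark.sql.shuffle.partitions")` is `true`, while the same call on `BaseTunerSketch` is `false`.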

Tests

  • Introduced a helper method buildDefaultAutoTuner in QualificationAutoTunerSuite to create instances with default properties.
  • Added a new test to verify that QualificationAutoTuner sets shuffle partitions to 200.
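
The shape of such a test might look like the sketch below. It uses plain `assert` rather than the suite's actual test framework, and `DefaultOnlyTuner`, `TunerTestSupport`, the property values, and the signature given to `buildDefaultAutoTuner` are all assumptions made for illustration, not the code in `QualificationAutoTunerSuite`.

```scala
// Hypothetical, simplified tuner: always recommends the fixed default
// for shuffle partitions, regardless of the input value.
class DefaultOnlyTuner(val props: Map[String, String]) {
  def recommendations: Map[String, String] =
    props.updated("spark.sql.shuffle.partitions", "200")
}

object TunerTestSupport {
  // Baseline properties shared by tests; the values are illustrative only.
  val defaultProps: Map[String, String] = Map(
    "spark.executor.cores" -> "16",
    "spark.sql.shuffle.partitions" -> "1000"
  )

  // Same idea as a default-building helper: defaults first, overrides win.
  def buildDefaultAutoTuner(
      overrides: Map[String, String] = Map.empty): DefaultOnlyTuner =
    new DefaultOnlyTuner(defaultProps ++ overrides)

  // The check a test could make on the recommended value.
  def shufflePartitionsAre200(tuner: DefaultOnlyTuner): Boolean =
    tuner.recommendations.get("spark.sql.shuffle.partitions").contains("200")
}
```

A test body would then reduce to `assert(TunerTestSupport.shufflePartitionsAre200(TunerTestSupport.buildDefaultAutoTuner()))`.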

@parthosa added the bug (Something isn't working) and core_tools (Scope the core module (scala)) labels on Dec 31, 2024
@parthosa self-assigned this on Dec 31, 2024
@parthosa marked this pull request as ready for review on December 31, 2024, 16:03
Collaborator

@amahussein amahussein left a comment

Thanks @parthosa
LGTM

* List of recommendations for which the Qualification AutoTuner skips calculations and only
* depend on default values.
*/
override protected val limitedLogicRecommendations: mutable.HashSet[String] = mutable.HashSet(
Collaborator

QQ:
limitedLogicRecommendations is expected to be platform specific.
Will it be feasible to change the behavior based on platform when needed in the future?

Collaborator Author

Yes, we could use a similar class-based approach (say, PlatformSpecificAutoTunerProvider) to provide any platform-specific tunings.
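
One way such a provider could look is sketched below. This is purely illustrative: the thread only names `PlatformSpecificAutoTunerProvider` as a possibility, so the trait's method, the `Platform` hierarchy, `SketchProvider`, and the `example.dataproc.only.property` key are all assumptions invented for this example.

```scala
// Hypothetical platform model for this sketch.
sealed trait Platform
case object OnPrem extends Platform
case object Dataproc extends Platform

// Hypothetical provider: maps a platform to the properties that should
// receive default values without further calculation.
trait PlatformSpecificAutoTunerProvider {
  def limitedLogicRecommendations(platform: Platform): Set[String]
}

object SketchProvider extends PlatformSpecificAutoTunerProvider {
  // Shared across every platform in this sketch.
  private val common: Set[String] = Set("spark.sql.shuffle.partitions")

  def limitedLogicRecommendations(platform: Platform): Set[String] =
    platform match {
      // Made-up property key, present only to show a per-platform addition.
      case Dataproc => common + "example.dataproc.only.property"
      case OnPrem   => common
    }
}
```

Because `Platform` is sealed, the compiler can warn when a newly added platform is not handled in the match.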

@parthosa merged commit 891b3b5 into NVIDIA:dev on Dec 31, 2024
16 checks passed
@parthosa deleted the spark-rapids-tools-1400 branch on December 31, 2024, 21:55
Development

Successfully merging this pull request may close these issues.

[BUG] Shuffle partitions for the first GPU run should not be set for more than 200
2 participants