Add qualx support for platform runtime variants (DB AWS) #1417

Merged
leewyang merged 1 commit into NVIDIA:dev on Nov 13, 2024

leewyang
Collaborator

This PR follows #1414 to support variants of platform runtimes (e.g. photon) in the qualx models.

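Model variants are delimited by the underscore character '_', i.e. the platform name followed by an underscore and the variant name (for example, databricks-aws_photon).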
This PR also updates the models from the latest code and datasets, including the new databricks-aws_photon model.
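
As an illustration of the naming convention, here is a minimal sketch of splitting a model name into its platform and variant parts. The helper `split_model_name` is hypothetical and is not taken from the qualx code; it only assumes that platform names themselves contain no underscore, which holds for databricks-aws.

```python
def split_model_name(name):
    """Split '<platform>_<variant>' into (platform, variant).

    Hypothetical helper for illustration only. Returns variant=None
    when the name has no underscore, i.e. the default runtime model.
    """
    # partition splits at the first underscore and returns empty
    # strings for the separator and tail when none is found.
    platform, sep, variant = name.partition("_")
    return platform, (variant if sep else None)


photon = split_model_name("databricks-aws_photon")
default = split_model_name("databricks-aws")
```

Because platform names use hyphens rather than underscores, the first underscore is an unambiguous delimiter under this assumption.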

Changes

  1. adds sparkRuntime as a new expected_raw_feature.
  2. uses the sparkRuntime column to detect datasets with mixed runtimes (i.e. more than one).
  3. modifies the prediction loop to operate on input rows grouped by runtime variant.
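
The three changes above can be sketched roughly as follows. This is not the code from the PR: the runtime values ("SPARK", "PHOTON"), the `RUNTIME_TO_VARIANT` mapping, and all function names are assumptions made for illustration, and plain dicts stand in for the real input rows.

```python
from collections import defaultdict

# Assumed mapping from a sparkRuntime value to a model-name suffix.
# The actual runtime values and suffixes used by qualx may differ.
RUNTIME_TO_VARIANT = {"SPARK": None, "PHOTON": "photon"}


def model_name_for(platform, runtime):
    """Return '<platform>' or '<platform>_<variant>' for a runtime value.

    Unknown runtimes fall back to the plain platform model.
    """
    variant = RUNTIME_TO_VARIANT.get(runtime)
    return f"{platform}_{variant}" if variant else platform


def group_by_runtime(rows):
    """Group input rows (dicts) by their sparkRuntime value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["sparkRuntime"]].append(row)
    return dict(groups)


def predict_by_variant(rows, platform, predict_fn):
    """Call predict_fn(model_name, rows) once per runtime group."""
    groups = group_by_runtime(rows)
    if len(groups) > 1:
        # More than one runtime in the dataset means mixed runtimes.
        print(f"Mixed runtimes detected: {sorted(groups)}")
    return {
        runtime: predict_fn(model_name_for(platform, runtime), group)
        for runtime, group in groups.items()
    }


rows = [
    {"appId": "app-1", "sparkRuntime": "SPARK"},
    {"appId": "app-2", "sparkRuntime": "PHOTON"},
    {"appId": "app-3", "sparkRuntime": "PHOTON"},
]
# Stand-in predictor: report which model would score how many rows.
out = predict_by_variant(rows, "databricks-aws", lambda m, g: (m, len(g)))
```

Grouping first means each model variant is selected once per group rather than once per row, which is the main reason to restructure the loop this way.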

Test

The following commands have been tested:

spark_rapids prediction

Internal Usage:

python qualx_main.py preprocess
python qualx_main.py train
python qualx_main.py evaluate

@leewyang leewyang added the user_tools Scope the wrapper module running CSP, QualX, and reports (python) label Nov 12, 2024
@leewyang leewyang self-assigned this Nov 12, 2024
@leewyang leewyang requested a review from parthosa November 12, 2024 17:10
@parthosa
Collaborator

parthosa commented Nov 12, 2024

Thanks, @leewyang! This PR introduces Photon models for Databricks on AWS. Could we update the title to reflect “DB AWS”?

In the future, when we add support for DB Azure models, this will help provide clarity.

@leewyang leewyang changed the title from "Add qualx support for platform runtime variants" to "Add qualx support for platform runtime variants (DB AWS)" Nov 13, 2024
Collaborator

@parthosa parthosa left a comment

Thanks @leewyang for adding the Photon model.

@leewyang leewyang merged commit 4fac4c3 into NVIDIA:dev Nov 13, 2024
15 checks passed
@leewyang leewyang deleted the qualx_platform_qualifier branch November 13, 2024 19:41
@tgravescs
Collaborator

So DB AWS is in the title because we use a model trained on event logs from Databricks AWS? What does it do on Azure or GCP? Are we expecting differences there? I would expect the Databricks code to be the same, but you have different machine types and possibly different I/O characteristics, so I'm wondering if we have seen differences from those.

@parthosa
Collaborator

So DB AWS is in the title because we use a model trained on event logs from Databricks AWS?

Yes, it adds a model trained on DB AWS Photon event logs.

What does it do on Azure or GCP? Are we expecting differences there?
