Qualx model updates from weekly KPI run 2025-01-10 #1495
Conversation
Description:

The latest /ssd0/qual/spark-rapids-tools-private/qual-kpis/kpi_summary_xgboost-2025-01-10.csv:

platform,tp_count,fp_count,tn_count,fn_count,precision,recall
databricks-aws,15,8,26,1,65.22,93.75
databricks-aws_photon,10,18,18,4,35.71,71.43
databricks-azure,18,1,21,8,94.74,69.23
databricks-azure_photon,20,7,6,15,74.07,57.14
dataproc,34,20,29,2,62.96,94.44
emr,13,9,25,4,59.09,76.47
onprem,27,19,28,2,58.7,93.1

Signed-off-by: nvauto <[email protected]>
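As a sanity check on the table above, the reported precision and recall columns can be recomputed from the raw counts (precision = tp / (tp + fp), recall = tp / (tp + fn)). This is a sketch of my own, not part of the actual KPI pipeline; the function name and the assumption that the CSV column order matches the header shown are mine:

```shell
# Recompute precision/recall from the tp/fp/fn counts in a KPI summary CSV.
# Column order assumed: platform,tp_count,fp_count,tn_count,fn_count,...
kpi_check() {
  awk -F',' 'NR > 1 {
    precision = 100 * $2 / ($2 + $3)   # tp / (tp + fp)
    recall    = 100 * $2 / ($2 + $5)   # tp / (tp + fn)
    printf "%s precision=%.2f recall=%.2f\n", $1, precision, recall
  }' "$1"
}
```

Running this against the file above should reproduce the reported columns, e.g. dataproc comes out as precision 62.96 and recall 94.44.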
KPIs look good on the updates. The only minor changes are one more TN and one less FP for dataproc and onprem (which is good). @leewyang any concerns on your side?
It's interesting that while only dataproc and onprem had KPI deltas, all of the model binaries were different from before. Since there were no dataset changes, these changes were likely due to:
For (2), I had been using a fairly fixed conda environment on a dev box, so I think building from a controlled CI/CD environment is actually better. Note that I had previously seen binary deltas when building models in different environments, hence the use of a fixed conda env. I will run the old process today to confirm the expected deltas for dataproc and onprem and see whether the other models change as well (or stay the same in my build environment).
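Comparing the two builds' model binaries per platform can be done mechanically; this is a hypothetical helper (the directory layout and function name are assumptions, not the repo's actual structure):

```shell
# Report which per-platform model files differ between two build outputs.
diff_models() {
  old_dir=$1
  new_dir=$2
  for f in "$new_dir"/*; do
    name=$(basename "$f")
    # cmp -s exits non-zero if the files differ (or one is missing).
    if ! cmp -s "$old_dir/$name" "$f"; then
      echo "CHANGED: $name"
    fi
  done
}
```

A byte-level `cmp` is deliberate here: even when KPIs are unchanged, it surfaces platforms whose binaries drifted due to environment differences.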
Question on this CI pipeline: can we use a bash script in the pipeline that converts the CSV to markdown format so that it renders better in GitHub?
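A minimal sketch of such a conversion, assuming a simple CSV with no quoted fields containing commas (the function name is hypothetical):

```shell
# Convert a simple CSV file to a GitHub-flavored markdown table.
csv_to_md() {
  awk -F',' '
    NR == 1 {
      # Header row, then a separator row with one "---" per column.
      printf "|"; for (i = 1; i <= NF; i++) printf " %s |", $i; printf "\n"
      printf "|"; for (i = 1; i <= NF; i++) printf " --- |"; printf "\n"
      next
    }
    {
      printf "|"; for (i = 1; i <= NF; i++) printf " %s |", $i; printf "\n"
    }
  ' "$1"
}
```

The pipeline step would then paste the function's output into the PR body instead of the raw CSV text.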
OK, finished running through the old process and comparing results.
So, basically, I think these models would be fine to merge now. Ideally, it would be nice to also auto-generate a PR for the

@mattahrens should we address @parthosa's comment first before committing these models?
Did you specifically mean the
We can address this in the next weekly update, as it is more of a cosmetic feature.
No need to block this PR to make that change.
Also, one other note: the script doesn't build/evaluate the