
[punet] Add integration tests. #84

Merged 1 commit into main on Jun 29, 2024
Conversation

stellaraccident
Contributor

  • Imports/validates the FP16 model running eagerly.
  • Imports/validates the int8 model running eagerly.
  • Exports the models.

Progress on #76

@stellaraccident merged commit dbc50eb into main Jun 29, 2024
3 checks passed
@stellaraccident deleted the punet_integration_test branch June 29, 2024 02:21
Member

@ScottTodd left a comment


Nice!

Comment on lines +23 to +24
REPO_ID = "amd-shark/sharktank-goldens"
REVISION = "230dad4d85fbcb8759a331dcf1d45f0562875abe"

Ooh using huggingface (instead of Azure, GCS, etc.)? Nice!

May want to link https://huggingface.co/amd-shark/sharktank-goldens somewhere for easy referencing

Comment on lines +85 to +90
def get_best_torch_device() -> str:
import torch

if torch.cuda.is_available() and torch.cuda.device_count() > 0:
return "cuda:0"
return "cpu"

May want a way to set this device index via environment variable or flag.

On some of our CI runners we try to only use a specific GPU (like 6), since we and possibly other groups are running multiple jobs on the same "machine" / node.
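One way to act on this suggestion is an environment-variable override layered on top of the PR's `get_best_torch_device` fallback logic. This is a hypothetical sketch, not code from the PR; the variable name `SHARKTANK_TORCH_DEVICE` is an assumption chosen for illustration.

```python
import os


def get_torch_device(env_var: str = "SHARKTANK_TORCH_DEVICE") -> str:
    """Pick a torch device string, honoring an env-var override.

    Hypothetical extension of the PR's get_best_torch_device: a CI runner
    pinned to GPU 6 could export SHARKTANK_TORCH_DEVICE=cuda:6.
    """
    override = os.environ.get(env_var)
    if override:
        return override  # e.g. "cuda:6" to pin a specific GPU
    try:
        import torch

        if torch.cuda.is_available() and torch.cuda.device_count() > 0:
            return "cuda:0"
    except ImportError:
        pass
    return "cpu"
```

A pytest flag (e.g. via `pytest_addoption`) would work equally well; the env var has the advantage of needing no plumbing through the test invocation.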


Until this is running on CI, it would be useful to see the logs produced (e.g. in a gist).

Things I'd look for:

  • Time taken for each step / test case
  • Format of logs on success
  • Format of logs on failure
  • How easy the flow is to understand and reproduce outside of pytest
  • What artifacts are downloaded
  • What artifacts are produced

return "cpu"


def assert_golden_safetensors(actual_path, ref_path):

This function is nice, giving a good reason to stay in Python instead of using bash or just the native tools like iree-run-module.

(I'm still slowly coming to terms with moving infrastructure from C/C++ to python :P)
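For readers without the diff at hand, the comparison core behind a golden-check like `assert_golden_safetensors` can be sketched as follows. This is an illustrative reconstruction, not the PR's actual implementation: the function name, tolerances, and the use of dicts of arrays (as produced by loading `.safetensors` files, e.g. with the `safetensors` library) are all assumptions.

```python
import numpy as np


def assert_tensor_dicts_close(actual: dict, ref: dict, rtol=1e-4, atol=1e-4):
    """Assert two name->array mappings match in names, shapes, and values.

    Hypothetical sketch of a golden-tensor comparison; loading the
    .safetensors files into these dicts is left to the caller.
    """
    # Same set of tensor names on both sides.
    assert set(actual) == set(ref), f"Tensor names differ: {set(actual) ^ set(ref)}"
    for name in sorted(ref):
        a, r = np.asarray(actual[name]), np.asarray(ref[name])
        assert a.shape == r.shape, f"{name}: shape {a.shape} != {r.shape}"
        # Elementwise tolerance check with a per-tensor error message.
        np.testing.assert_allclose(a, r, rtol=rtol, atol=atol, err_msg=name)
```

Doing this in Python rather than shelling out to native tools makes the failure output (which tensor, which elements, by how much) immediately useful in a pytest log.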
