Unable to add a data asset reference to the model in ML Studio with Python SDK #38513

Open
TCodingB opened this issue Nov 13, 2024 · 4 comments
Assignees
Labels
  • Client: This issue points to a problem in the data-plane of the library.
  • customer-reported: Issues that are reported by GitHub users external to the Azure organization.
  • Machine Learning
  • needs-author-feedback: Workflow: More information is needed from author to address the issue.
  • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that
  • Service Attention: Workflow: This issue is responsible by Azure service team.

Comments

@TCodingB

  • Package Name: azureml.core
  • Package Version: 1.58.0
  • Operating System: Linux
  • Python Version: 3.10.12

Describe the bug
When adding data asset references to a model in ML Studio with the Python SDK, I'm unable to add a data asset of type MLTable that is composed of a parquet file and its complementary MLTable definition. There doesn't seem to be any issue adding an MLTable derived from a .csv file.

To Reproduce

  1. Register a model in ML Studio.
  2. Register the .csv and parquet datasets as MLTable data assets in ML Studio.
  3. Get the data assets by name:
    • Dataset.get_by_name(ws, name=data_asset_name)
    • Save the data assets as reference_dataset_csv and reference_dataset_parquet
  4. Add the data asset references to the registered model:
    • model.add_dataset_references([("dataset csv", reference_dataset_csv), ("dataset parquet", reference_dataset_parquet)])
  5. The code runs without any error or warning to signal that anything is wrong.
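
The steps above can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual script: the helper name `add_references_by_name`, the model name, and the data asset names are all hypothetical, and `Workspace.from_config()` assumes a local `config.json`. Only `Dataset.get_by_name` and `model.add_dataset_references` come from the report itself.

```python
from typing import Any, Callable, List, Sequence, Tuple


def add_references_by_name(
    model: Any,
    lookup: Callable[[str], Any],
    purpose_to_asset: Sequence[Tuple[str, str]],
) -> List[Tuple[str, Any]]:
    """Resolve each data asset name and attach it to the model under its purpose."""
    references = [(purpose, lookup(name)) for purpose, name in purpose_to_asset]
    # Same call as in step 4: a list of (purpose, dataset) tuples.
    model.add_dataset_references(references)
    return references


def reproduce() -> None:
    """Steps 3 and 4 against a real workspace (requires azureml-core, SDK v1)."""
    # Imported lazily so the helper above can be exercised without the SDK.
    from azureml.core import Dataset, Model, Workspace

    ws = Workspace.from_config()  # assumes a local config.json
    model = Model(ws, name="my-model")  # hypothetical model name
    add_references_by_name(
        model,
        lambda name: Dataset.get_by_name(ws, name=name),
        [
            ("dataset csv", "my-csv-mltable"),  # hypothetical asset name
            ("dataset parquet", "my-parquet-mltable"),  # hypothetical asset name
        ],
    )
```

Calling `reproduce()` in an environment with `azureml-core` installed should mirror the reported scenario; as described, it is expected to finish without raising even when the parquet-backed reference does not show up in ML Studio.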

Expected behavior
Both data assets are referenced on the model in ML Studio.

Screenshots
.csv data asset:
Image

.parquet data asset:
Image

Model data references:
Image

@github-actions github-actions bot added Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. Machine Learning needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Attention Workflow: This issue is responsible by Azure service team. labels Nov 13, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Azure/azure-ml-sdk @azureml-github.

@TCodingB
Author

Hi, we managed to find a better solution by registering a pandas DataFrame as a data asset in ML Studio. Still, I find it odd that a data asset that is otherwise completely functional (it can be downloaded, processed, etc.) can't be referenced on the model. Another odd thing was that, after running the code, we didn't get any feedback (good or bad) on whether the dataset was referenced to the model.
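
Since the call gives no feedback either way, one way to get some is to compare the purposes that were requested with the ones the model reports afterwards. The sketch below is only illustrative: `recorded_purposes` and `missing_purposes` are hypothetical helpers, and it assumes that a re-fetched `Model` object exposes its references through a `datasets` attribute keyed by purpose, which has not been confirmed in this thread.

```python
from typing import Any, Iterable, List


def recorded_purposes(model: Any) -> List[str]:
    """Return the purposes the model currently reports.

    ASSUMPTION: the model exposes a `datasets` attribute whose keys (or items)
    are the reference purposes. If the attribute is absent or empty, an empty
    list is returned.
    """
    datasets = getattr(model, "datasets", None) or {}
    # Iterating a dict yields its keys; a plain list yields its items.
    return list(datasets)


def missing_purposes(expected: Iterable[str], recorded: Iterable[str]) -> List[str]:
    """Return the expected purposes that are absent from the recorded ones."""
    recorded_set = set(recorded)
    return [purpose for purpose in expected if purpose not in recorded_set]
```

After adding the references and fetching the model again, something like `missing_purposes(["dataset csv", "dataset parquet"], recorded_purposes(model))` would list any reference that was silently dropped, so the script can raise or log instead of finishing quietly.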

Thank you for taking the time,

Kind regards,

Tadej

@jaga-work
Member

We need sample files and the model to investigate this issue. @TCodingB, kindly provide them.

@jaga-work jaga-work added the needs-author-feedback Workflow: More information is needed from author to address the issue. label Nov 21, 2024

Hi @TCodingB. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

@github-actions github-actions bot removed the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label Nov 21, 2024
2 participants