Skip to content

Conversation

@harvey0100
Copy link
Contributor

@harvey0100 harvey0100 commented Jun 11, 2025

Introduced Caching Method Before Network Requests. A few changes to Error Handling / Params Formatting

Reference: #6167

Summary by CodeRabbit

  • New Features

    • Find and match cached VM images across compatible architectures with checksum verification.
  • Bug Fixes

    • Clearer network-related error messages; network failures handled immediately.
    • Provider selection now aggregates and reports provider errors when relevant.
    • More robust asset metadata reading and direct-file asset handling with improved logging.
  • Chores

    • Raised Fedora image URL threshold from version 38 to 40.

@harvey0100 harvey0100 added this to the 112 - Codename TBD milestone Jun 11, 2025
@harvey0100 harvey0100 requested review from clebergnu and richtja June 11, 2025 07:26
@harvey0100 harvey0100 self-assigned this Jun 11, 2025
@harvey0100 harvey0100 added the bug label Jun 11, 2025
@github-project-automation github-project-automation bot moved this to Review Requested in Avocado Kanban Jun 11, 2025
@mr-avocado mr-avocado bot moved this to Review Requested in Default project Jun 11, 2025
@codecov
Copy link

codecov bot commented Jun 11, 2025

Codecov Report

❌ Patch coverage is 25.00000% with 36 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.75%. Comparing base (fa25203) to head (5596cd0).
⚠️ Report is 60 commits behind head on master.

Files with missing lines Patch % Lines
avocado/utils/vmimage.py 17.14% 29 Missing ⚠️
avocado/utils/asset.py 46.15% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6169      +/-   ##
==========================================
+ Coverage   68.33%   69.75%   +1.42%     
==========================================
  Files         205      206       +1     
  Lines       22413    22477      +64     
==========================================
+ Hits        15316    15679     +363     
+ Misses       7097     6798     -299     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@harvey0100 harvey0100 force-pushed the cachefix branch 2 times, most recently from 0e5307e to 5f2c75a Compare June 11, 2025 10:10
Copy link
Contributor

@richtja richtja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @harvey0100, thank you for investigating the #6167 and coming up with solution. Unfortunately, I am not sure if this is the right path, because all the cache directory manipulation which you introduced are already solved in the Asset utility, see the fetch method.. Therefore, if you try to get the vmimage avocado will check the cache before downloading it.

IMO the problem in #6167 is different. The vmimage.Image.from_parameters() will get stuck without internet connection, because it tries to read the image url, which connects to the provider server and since this connection is not hadled properly we have the issue #6167.

@harvey0100
Copy link
Contributor Author

harvey0100 commented Jun 16, 2025

Hi @harvey0100, thank you for investigating the #6167 and coming up with solution. Unfortunately, I am not sure if this is the right path, because all the cache directory manipulation which you introduced are already solved in the Asset utility, see the fetch method.. Therefore, if you try to get the vmimage avocado will check the cache before downloading it.

IMO the problem in #6167 is different. The vmimage.Image.from_parameters() will get stuck without internet connection, because it tries to read the image url, which connects to the provider server and since this connection is not hadled properly we have the issue #6167.

Hi @richtja ,

The tricky part I ran into is that vmimage.py can't just ask asset.py to fetch an image, because it first has to figure out the exact URL for the latest build. As you know, that's the step that requires an internet connection and fails when offline.

So, the new logic doesn't touch the caching system itself. It just does a quick, smart search of the cache first. It asks asset.py for a list of everything it has stored and checks the metadata to see if we already have what we need. If we do, it completely skips the online check.

This felt like the cleanest way to solve the offline problem while letting asset.py continue to manage all the file caching.

Please let me know what you think and if there is a better solution?

@richtja
Copy link
Contributor

richtja commented Jun 17, 2025

Hi @harvey0100, thank you for investigating the #6167 and coming up with solution. Unfortunately, I am not sure if this is the right path, because all the cache directory manipulation which you introduced are already solved in the Asset utility, see the fetch method.. Therefore, if you try to get the vmimage avocado will check the cache before downloading it.
IMO the problem in #6167 is different. The vmimage.Image.from_parameters() will get stuck without internet connection, because it tries to read the image url, which connects to the provider server and since this connection is not hadled properly we have the issue #6167.

Hi @richtja ,

The tricky part I ran into is that vmimage.py can't just ask asset.py to fetch an image, because it first has to figure out the exact URL for the latest build. As you know, that's the step that requires an internet connection and fails when offline.

So, the new logic doesn't touch the caching system itself. It just does a quick, smart search of the cache first. It asks asset.py for a list of everything it has stored and checks the metadata to see if we already have what we need. If we do, it completely skips the online check.

This felt like the cleanest way to solve the offline problem while letting asset.py continue to manage all the file caching.

Please let me know what you think and if there is a better solution?

I still think that you try to reinvent a wheel here, because all the needed features are in the asset utility already. The problem of your solution is that it doesn't cover all cache features like expire and make it less maintainable in the future ( we will have to make updates when we decided to change asset utility).

IMO you could use Asset.find_asset_file() and you will omit the url(name) and if it fails, you will ask providers to fetch the url. Maybe it will need some updates to asset utility, but over all it should work and it will reuse current solutions.

@harvey0100
Copy link
Contributor Author

@coderabbitai full review

@coderabbitai
Copy link

coderabbitai bot commented Jun 19, 2025

✅ Actions performed

Full review triggered.

@coderabbitai
Copy link

coderabbitai bot commented Jun 19, 2025

📝 Walkthrough

Walkthrough

Enhanced VM image provider selection and caching, extended Fedora legacy URL threshold to ≤40, added Image._find_cached_image for cached asset lookup and checksum verification, and hardened Asset file/metadata handling with direct-path support and JSON/OSError logging.

Changes

Cohort / File(s) Change Summary
VM Image Handling Enhancements
avocado/utils/vmimage.py
- HTTP/network failures during provider/feed operations now raise ImageProviderError prefixed with "Network error: ...". - Fedora legacy-cloud URL threshold changed from ≤38 to ≤40. - Added classmethod Image._find_cached_image(cache_dirs, name=None, version=None, build=None, arch=None, checksum=None, algorithm=None, snapshot_dir=None) to enumerate cached assets via Asset.get_all_assets(...), load metadata, match by type/name/version/build/arch (with arch compatibility groups), verify checksum, and return an Image or None. - get_best_provider now collects provider errors and raises a combined ImageProviderError if providers fail; network-related instantiation failures (HTTPError, OSError) raise immediately.
Asset File and Metadata Robustness
avocado/utils/asset.py
- Asset.find_asset_file(self, create_metadata=False) now treats self.name as a direct filesystem path and returns it immediately if it exists; docstring updated. - Asset.get_metadata(self) now wraps metadata JSON reading in try/except, logs json.JSONDecodeError and OSError warnings, and logs a debug message when asset file is missing before raising.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Pylint (3.3.7)
avocado/utils/asset.py
avocado/utils/vmimage.py

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
avocado/utils/vmimage.py (2)

612-659: Review architecture compatibility mapping and asset iteration efficiency.

The method has good logic but a few considerations:

  1. Architecture mapping: The arch_map provides reasonable compatibility (e.g., x86_64 ↔ amd64), but consider if this should be centralized or configurable.

  2. Performance concern: asset.Asset.get_all_assets() might be expensive for large cache directories. Consider if there's a more efficient way to query specific assets.

Consider these improvements:

+# Move architecture mapping to module level for reusability
+ARCH_COMPATIBILITY_MAP = {
+    "x86_64": ["amd64", "64"],
+    "aarch64": ["arm64"],
+    "amd64": ["x86_64"],
+    "arm64": ["aarch64"],
+}

 @classmethod
 def _find_cached_image(cls, cache_dirs, ...):
-    arch_map = {
-        "x86_64": ["amd64", "64"],
-        "aarch64": ["arm64"],
-        "amd64": ["x86_64"],
-        "arm64": ["aarch64"],
-    }
-    compatible_arches = [arch] + arch_map.get(arch, []) if arch else []
+    compatible_arches = [arch] + ARCH_COMPATIBILITY_MAP.get(arch, []) if arch else []

870-892: Enhanced provider error handling with smart network error detection.

The improvements to get_best_provider are well thought out:

  1. Error tracking: last_error properly captures the most recent failure
  2. Network error detection: The "Cannot open" string check identifies network failures
  3. Exception variety: Catches both HTTPError and OSError for comprehensive network error handling

However, the string-based error detection could be more robust.

Consider using exception type checking instead of string matching:

-    if (
-        last_error
-        and isinstance(last_error, ImageProviderError)
-        and "Cannot open" in str(last_error)
-    ):
-        # Re-raise network errors to be handled by caller
-        raise last_error
+    if last_error and isinstance(last_error, ImageProviderError):
+        # Check if the error is network-related by examining the cause
+        if (hasattr(last_error, '__cause__') and 
+            isinstance(last_error.__cause__, (HTTPError, OSError))):
+            # Re-raise network errors to be handled by caller
+            raise last_error

This approach is more reliable than string matching and follows Python's exception chaining patterns.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 199475a and 5f2c75a.

📒 Files selected for processing (1)
  • avocado/utils/vmimage.py (6 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
avocado/utils/vmimage.py

[warning] 122-122: avocado/utils/vmimage.py#L122
Added line #L122 was not covered by tests


[warning] 579-581: avocado/utils/vmimage.py#L579-L581
Added lines #L579 - L581 were not covered by tests


[warning] 583-587: avocado/utils/vmimage.py#L583-L587
Added lines #L583 - L587 were not covered by tests


[warning] 590-590: avocado/utils/vmimage.py#L590
Added line #L590 was not covered by tests


[warning] 604-608: avocado/utils/vmimage.py#L604-L608
Added lines #L604 - L608 were not covered by tests


[warning] 610-610: avocado/utils/vmimage.py#L610
Added line #L610 was not covered by tests


[warning] 631-631: avocado/utils/vmimage.py#L631
Added line #L631 was not covered by tests


[warning] 637-637: avocado/utils/vmimage.py#L637
Added line #L637 was not covered by tests


[warning] 640-643: avocado/utils/vmimage.py#L640-L643
Added lines #L640 - L643 were not covered by tests


[warning] 646-648: avocado/utils/vmimage.py#L646-L648
Added lines #L646 - L648 were not covered by tests


[warning] 659-659: avocado/utils/vmimage.py#L659
Added line #L659 was not covered by tests


[warning] 677-679: avocado/utils/vmimage.py#L677-L679
Added lines #L677 - L679 were not covered by tests


[warning] 682-682: avocado/utils/vmimage.py#L682
Added line #L682 was not covered by tests


[warning] 689-690: avocado/utils/vmimage.py#L689-L690
Added lines #L689 - L690 were not covered by tests


[warning] 692-694: avocado/utils/vmimage.py#L692-L694
Added lines #L692 - L694 were not covered by tests


[warning] 765-767: avocado/utils/vmimage.py#L765-L767
Added lines #L765 - L767 were not covered by tests


[warning] 772-772: avocado/utils/vmimage.py#L772
Added line #L772 was not covered by tests


[warning] 782-783: avocado/utils/vmimage.py#L782-L783
Added lines #L782 - L783 were not covered by tests


[warning] 785-785: avocado/utils/vmimage.py#L785
Added line #L785 was not covered by tests


[warning] 791-794: avocado/utils/vmimage.py#L791-L794
Added lines #L791 - L794 were not covered by tests


[warning] 815-816: avocado/utils/vmimage.py#L815-L816
Added lines #L815 - L816 were not covered by tests


[warning] 870-870: avocado/utils/vmimage.py#L870
Added line #L870 was not covered by tests


[warning] 877-879: avocado/utils/vmimage.py#L877-L879
Added lines #L877 - L879 were not covered by tests


[warning] 882-882: avocado/utils/vmimage.py#L882
Added line #L882 was not covered by tests


[warning] 885-885: avocado/utils/vmimage.py#L885
Added line #L885 was not covered by tests


[warning] 891-891: avocado/utils/vmimage.py#L891
Added line #L891 was not covered by tests

⏰ Context from checks skipped due to timeout of 90000ms (360)
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
🔇 Additional comments (6)
avocado/utils/vmimage.py (6)

19-19: LGTM: Import addition for JSON metadata handling.

The json import is correctly added to support the new metadata functionality for cached images.


122-123: Good improvement: Enhanced error handling for network operations.

Adding OSError to the exception handling covers additional network-related failures beyond HTTP errors, improving robustness of the URL opening operations.


592-610: LGTM: Well-structured metadata matching logic.

The _metadata_matches method correctly implements flexible matching with proper type conversions and case-insensitive name matching. The logic handles optional parameters appropriately by treating None values as "match any".


675-702: Excellent cache-first download logic with proper metadata handling.

The modified download method correctly:

  • Checks if URL is a local file path (from cache) first
  • Provides comprehensive metadata for asset management
  • Handles both string and iterable cache_dir types
  • Uses appropriate timeout multiplier for large VM images

The approach efficiently avoids redundant downloads while maintaining compatibility with the existing asset system.


749-784: Well-implemented cache-first strategy with proper fallback.

The new from_parameters logic correctly implements the cache-first approach:

  • Normalizes parameters (cache_dir, arch) appropriately
  • Searches cache before making network requests
  • Provides informative logging for debugging
  • Falls back to provider-based image resolution when cache miss occurs

This addresses the core offline usability issue mentioned in the PR objectives.


571-590: Verify metadata file naming convention and error handling.

The _get_metadata_for_asset method assumes a specific naming pattern for metadata files ({basename}_metadata.json). This should align with how the asset.Asset class stores metadata.

Run this script to verify the metadata file naming convention used by the asset system:

#!/bin/bash
# Description: Check how asset.Asset stores metadata files to ensure naming convention alignment

# Search for metadata file creation/access patterns in the asset module
rg -A 5 -B 5 "metadata.*json|\.json.*metadata" avocado/utils/asset.py

# Look for existing metadata file patterns in the codebase
fd -e json | head -20

coderabbitai bot added a commit that referenced this pull request Jun 19, 2025
Docstrings generation was requested by @harvey0100.

* #6169 (comment)

The following files were modified:

* `avocado/utils/vmimage.py`
@avocado-framework avocado-framework deleted a comment from coderabbitai bot Jun 19, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
avocado/utils/vmimage.py (1)

842-872: Duplicate comment: Error message clarity needs improvement.

This is similar to the previous review comment about improving error message clarity and exception chaining.

The error handling is comprehensive but the error messages could be more actionable for users. Please refer to the previous review comment for specific improvement suggestions.

🧹 Nitpick comments (2)
avocado/utils/vmimage.py (2)

646-679: Simplify boolean logic as suggested by static analysis.

The method logic is correct, but the return statement can be simplified.

Apply the static analysis suggestion to improve readability:

        # Check architecture compatibility
-       if compatible_arches and metadata.get("arch") not in compatible_arches:
-           return False
-
-       return True
+       return not (compatible_arches and metadata.get("arch") not in compatible_arches)

Or even better, make the logic more explicit:

        # Check architecture compatibility
-       if compatible_arches and metadata.get("arch") not in compatible_arches:
-           return False
-
-       return True
+       return not compatible_arches or metadata.get("arch") in compatible_arches

706-784: Consider performance optimization for cache search.

While the logic is correct, the method creates multiple Asset instances in a loop which could be expensive for large cache directories.

Consider these optimizations:

  1. Early metadata filtering: Check if metadata files exist before creating Asset instances
  2. Batch processing: Group asset paths and process metadata more efficiently
  3. Metadata caching: Cache metadata parsing results to avoid repeated JSON parsing

Example optimization:

# Before creating Asset instances, quickly filter for potential candidates
potential_assets = []
for asset_path in asset.Asset.get_all_assets(cache_dirs, sort=False):
    metadata_file = f"{os.path.splitext(asset_path)[0]}_metadata.json"
    if os.path.isfile(metadata_file):
        potential_assets.append(asset_path)

# Now process only assets that have metadata
for asset_path in potential_assets:
    # ... existing logic
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5f2c75a and 3795fce.

📒 Files selected for processing (2)
  • avocado/utils/asset.py (1 hunks)
  • avocado/utils/vmimage.py (5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
avocado/utils/vmimage.py (1)
avocado/utils/asset.py (4)
  • Asset (61-738)
  • fetch (382-442)
  • get_all_assets (535-549)
  • get_metadata (481-526)
🪛 Ruff (0.11.9)
avocado/utils/vmimage.py

675-678: Return the negated condition directly

Inline condition

(SIM103)

⏰ Context from checks skipped due to timeout of 90000ms (18)
  • GitHub Check: Fedora selftests
  • GitHub Check: Podman Egg task
  • GitHub Check: Podman spawner with 3rd party runner plugin
  • GitHub Check: Version task fedora:41
  • GitHub Check: Egg task debian:11.0
  • GitHub Check: Egg task ubuntu:20.04
  • GitHub Check: Version task debian:12.4
  • GitHub Check: Egg task ubi:8.8
  • GitHub Check: Version task debian:11.0
  • GitHub Check: Build Package (wheel/tarball) for Python 3.8
  • GitHub Check: Version task ubuntu:22.04
  • GitHub Check: Egg task fedora:40
  • GitHub Check: Windows with Python 3.13
  • GitHub Check: Windows with Python 3.10
  • GitHub Check: Static checks
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Code Coverage (3.11)
🔇 Additional comments (3)
avocado/utils/vmimage.py (3)

121-122: Good enhancement to error handling.

Adding OSError to the exception handling alongside HTTPError is appropriate since network operations can raise either exception type depending on the underlying failure mode.


584-622: Optimize download logic for cached files.

The enhancement correctly separates cached file handling from remote downloads, improving performance by avoiding unnecessary Asset.fetch() calls for local files.

The metadata structure is well-defined and will support the enhanced caching functionality.


922-944: Enhance error handling for provider initialization.

The enhanced error handling correctly captures both HTTP and network errors, improving robustness of provider selection.

The logic to re-raise network errors with "Cannot open" messages is particularly good as it allows the caller to handle offline scenarios appropriately.

Copy link
Contributor

@richtja richtja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @harvey0100, thank you for the update it looks much better now, because it uses the asset features without code duplications. I still have so comments, but they are more about simplifying things then completely changing the implementation. IMO you are on the right track here.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
avocado/utils/vmimage.py (2)

817-831: Consider the error message improvements from past review.

The error handling is comprehensive, but the error messages could be more actionable for users. There's a past review comment suggesting improvements to make these messages clearer and more helpful.

The previous suggestion to improve error message clarity still applies - consider making the messages more actionable by clearly stating what users can do (check internet connection, verify image exists, etc.).


847-853: Apply previous error message improvement suggestions.

Similar to the earlier error handling block, this error message could be more actionable for users based on past review feedback.

The previous suggestion to enhance error messages for better troubleshooting applies here as well - consider making it clearer that the provider failed to supply a valid image URL and both cached and network options failed.

🧹 Nitpick comments (2)
avocado/utils/vmimage.py (2)

705-708: Simplify the condition as suggested by static analysis.

The negated condition can be returned directly for better readability.

-            # Check architecture compatibility
-            if compatible_arches and metadata.get("arch") not in compatible_arches:
-                return False
-
-            return True
+            # Check architecture compatibility
+            return not (compatible_arches and metadata.get("arch") not in compatible_arches)

735-735: Consider using public API instead of private method.

The code accesses the private _verify_hash method. Based on past review comments, temp_asset.validate_hash() should be available as a public method.

-                    # pylint: disable-next=W0212
-                    if checksum and not temp_asset._verify_hash(asset_path):
+                    if checksum and not temp_asset.validate_hash():

However, verify that validate_hash() method exists and works as expected before making this change.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3795fce and 12a1c56.

📒 Files selected for processing (2)
  • avocado/utils/asset.py (3 hunks)
  • avocado/utils/vmimage.py (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • avocado/utils/asset.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
avocado/utils/vmimage.py (1)
avocado/utils/asset.py (4)
  • Asset (61-744)
  • get_all_assets (541-555)
  • get_metadata (506-532)
  • _verify_hash (371-380)
🪛 Ruff (0.12.2)
avocado/utils/vmimage.py

705-708: Return the negated condition directly

Inline condition

(SIM103)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (51)
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: Egg task fedora:40
  • GitHub Check: Fedora selftests
  • GitHub Check: Egg task ubuntu:20.04
  • GitHub Check: Egg task ubuntu:22.04
  • GitHub Check: Build Package (wheel/tarball) for Python 3.11
  • GitHub Check: Build Package (wheel/tarball) for Python 3.12
  • GitHub Check: Build Package (wheel/tarball) for Python 3.8
  • GitHub Check: Build Package (wheel/tarball) for Python 3.13
  • GitHub Check: Build Package (wheel/tarball) for Python 3.10
  • GitHub Check: Build Package (wheel/tarball) for Python 3.9
  • GitHub Check: Version task fedora:40
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Windows with Python 3.10
  • GitHub Check: Version task debian:12.4
  • GitHub Check: Smokecheck on Linux with Python 3.10
  • GitHub Check: Smokecheck on Linux with Python 3.11
  • GitHub Check: Version task ubi:8.8
  • GitHub Check: Static checks
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Code Coverage (3.11)
🔇 Additional comments (4)
avocado/utils/vmimage.py (4)

121-122: LGTM! Improved network error handling.

Adding OSError to the exception handling improves robustness by catching additional network-related failures like connection timeouts and DNS resolution errors. The "Network error:" prefix provides clear error categorization.


585-615: LGTM! Improved download logic with better cache handling.

The enhanced download method efficiently handles both cached and remote images:

  • Direct use of local cached files avoids unnecessary processing
  • Proper metadata preparation supports better asset management
  • Delegation to asset.Asset.fetch() leverages existing caching infrastructure

This aligns well with the PR objective to improve offline usability.


791-809: LGTM! Excellent cache-first approach implementation.

The parameter normalization and cache-first lookup logic effectively addresses the PR objective of avoiding network calls when images are already cached. The structured approach of trying cache first, then falling back to providers is well-implemented.


903-921: LGTM! Enhanced error tracking and propagation.

The improvements to error handling are excellent:

  • Tracking last_error preserves provider-specific error context
  • Catching OSError handles additional network failure scenarios like connection timeouts
  • Re-raising the last error provides users with more specific troubleshooting information

This enhances the overall robustness of provider selection.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
avocado/utils/vmimage.py (1)

775-853: Excellent implementation of cache-first approach.

The refactored method successfully addresses the PR objectives by:

  1. Cache-first strategy: Checking local cache before network requests (lines 797-809)
  2. Proper parameter normalization: Handling cache_dir and arch consistently (lines 791-795)
  3. Enhanced error handling: Detailed error messages with proper exception chaining
  4. Provider integration: Merging checksum/algorithm from providers when not specified

The implementation effectively solves the offline usability issue mentioned in the PR objectives.

However, note that past review comments suggested improving error message clarity to be more actionable for users. Consider the suggestions from previous reviews about making error messages more user-friendly.

🧹 Nitpick comments (1)
avocado/utils/vmimage.py (1)

646-757: Well-implemented cache search with room for optimization.

The method effectively implements cached image discovery with good architecture compatibility mapping and proper error handling. However, there are several areas for improvement:

Improvements to consider:

  1. Simplify condition logic (addressing static analysis hint):
-    # Check architecture compatibility
-    if compatible_arches and metadata.get("arch") not in compatible_arches:
-        return False
-
-    return True
+    # Check architecture compatibility
+    return not compatible_arches or metadata.get("arch") in compatible_arches
  1. Move inline function outside (as suggested in past reviews):
    The matches_image_criteria function could be moved outside this method or made a separate class method to reduce complexity and improve reusability.

  2. Consider extracting architecture compatibility logic:
    The architecture mapping could be extracted to a class-level constant or separate method for better maintainability.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 12a1c56 and 371101a.

📒 Files selected for processing (2)
  • avocado/utils/asset.py (3 hunks)
  • avocado/utils/vmimage.py (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • avocado/utils/asset.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
avocado/utils/vmimage.py (1)
avocado/utils/asset.py (4)
  • Asset (61-747)
  • get_all_assets (544-558)
  • get_metadata (506-535)
  • _verify_hash (371-380)
🪛 Ruff (0.12.2)
avocado/utils/vmimage.py

705-708: Return the negated condition directly

Inline condition

(SIM103)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (41)
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: Fedora selftests
  • GitHub Check: Egg task fedora:41
  • GitHub Check: Egg task debian:12.4
  • GitHub Check: Egg task ubi:9.2
  • GitHub Check: Egg task fedora:40
  • GitHub Check: Podman spawner with 3rd party runner plugin
  • GitHub Check: Version task ubuntu:20.04
  • GitHub Check: Smokecheck on Linux with Python 3.10
  • GitHub Check: Version task ubi:8.8
  • GitHub Check: Version task debian:12.4
  • GitHub Check: Build Package (wheel/tarball) for Python 3.13
  • GitHub Check: Windows with Python 3.12
  • GitHub Check: Smokecheck on Linux with Python 3.11
  • GitHub Check: Static checks
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Code Coverage (3.11)
🔇 Additional comments (4)
avocado/utils/vmimage.py (4)

121-122: Good improvement to network error handling.

Adding OSError to the exception handling is appropriate since network operations can fail with various OS-level errors beyond HTTP errors. The "Network error:" prefix in the error message also provides better context for debugging.


216-216: LGTM: Fedora version threshold update.

The version threshold update from 38 to 40 appears to be a necessary configuration change to reflect changes in Fedora's image repository structure.


584-615: Excellent optimization for cached image handling.

The changes improve efficiency by:

  1. Direct local file usage: Checking if the URL is already a local file path avoids unnecessary Asset operations
  2. Proper cache directory normalization: Handling both string and iterable cache_dir types correctly
  3. Enhanced metadata: Adding vmimage-specific metadata fields for better cache management
  4. Extended timeout: Using 3x the default timeout is appropriate for large image downloads

The logic flow is well-structured and aligns perfectly with the PR's caching objectives.


903-921: Improved error handling and propagation.

The changes significantly enhance error handling by:

  1. Better error context: Tracking last_error provides more specific error information than the generic AttributeError
  2. Network error handling: Properly catching and handling network errors during provider instantiation
  3. Immediate network error raising: Correctly failing fast on network errors as suggested in past reviews
  4. Proper error chaining: Using from e maintains the exception chain for better debugging

The transition from AttributeError("Provider not available") to re-raising the specific ImageProviderError is a substantial improvement for error diagnostics.

Copy link
Contributor

@richtja richtja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @harvey0100, thank you for the update, I have a couple more comments to it. Also, could you run the vmimage tests which are run only during the pre_release testing to check if this is stable. Thank you.

# Use Asset.get_all_assets() to find cached assets
for asset_path in asset.Asset.get_all_assets(cache_dirs, sort=False):
try:
# Create temp asset as Jan suggested
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not needed, because it is understandable from var name temp_asset

)
for cache_dir in self.cache_dirs
],
# Then try by name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The self.relative_dir is already based on self.name why it is needed to add search by self.name?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
avocado/utils/vmimage.py (1)

763-841: Cache-first approach implemented correctly but error messages need improvement.

The cache-first strategy successfully addresses the PR objective of avoiding network calls when images are already cached. The parameter normalization and error handling flow are logical.

However, the error messages from past review comments haven't been fully improved. The current error messages could be more actionable for users.

The error handling suggestions from previous reviews still apply - consider implementing the more user-friendly error messages that were previously suggested to make the errors more actionable and informative.

🧹 Nitpick comments (1)
avocado/utils/vmimage.py (1)

645-745: Method functions correctly but consider past review feedback.

The caching logic is sound and properly matches image criteria with architecture compatibility. However, several improvements from past reviews could be addressed:

  1. Architecture compatibility: Consider making the ARCH_COMPATIBILITY mapping configurable rather than hardcoded, as suggested in past reviews.

  2. Hash verification: The method uses temp_asset._verify_hash() which accesses a protected method. Consider using public Asset methods for hash validation.

  3. Inner function placement: As suggested in past reviews, the matches_image_criteria function could be moved inside _find_cached_image as an inner method to avoid polluting the API.

  4. Condition simplification: The static analysis correctly identifies that lines 697-700 can be simplified by returning the negated condition directly.

Apply this diff to simplify the condition check:

-            if compatible_arches and metadata.get("arch") not in compatible_arches:
-                return False
-
-            return True
+            return not (compatible_arches and metadata.get("arch") not in compatible_arches)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 371101a and c9be127.

📒 Files selected for processing (4)
  • avocado/utils/asset.py (3 hunks)
  • avocado/utils/data_structures.py (2 hunks)
  • avocado/utils/vmimage.py (5 hunks)
  • selftests/unit/utils/data_structures.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • avocado/utils/asset.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
selftests/unit/utils/data_structures.py (1)
avocado/utils/data_structures.py (2)
  • DataSize (311-374)
  • b (357-358)
🪛 Ruff (0.12.2)
avocado/utils/vmimage.py

697-700: Return the negated condition directly

Inline condition

(SIM103)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (44)
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: Fedora selftests
  • GitHub Check: Egg task ubi:8.8
  • GitHub Check: Podman Egg task
  • GitHub Check: Windows with Python 3.13
  • GitHub Check: Windows with Python 3.10
  • GitHub Check: Windows with Python 3.12
  • GitHub Check: Windows with Python 3.9
  • GitHub Check: Windows with Python 3.11
  • GitHub Check: Version task fedora:41
  • GitHub Check: Build Package (wheel/tarball) for Python 3.13
  • GitHub Check: Version task debian:12.4
  • GitHub Check: Smokecheck on Linux with Python 3.10
  • GitHub Check: Smokecheck on Linux with Python 3.11
  • GitHub Check: Version task ubuntu:22.04
  • GitHub Check: Version task ubi:8.8
  • GitHub Check: Static checks
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Analyze (python)
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Code Coverage (3.11)
🔇 Additional comments (5)
avocado/utils/data_structures.py (2)

28-28: LGTM!

The import of the re module is correctly placed and necessary for the regex-based parsing implementation.


332-346: Well-implemented regex-based parsing with improved validation.

The refactored constructor uses a clear regex pattern that properly handles:

  • Integer and decimal values (\d+(\.\d+)?)
  • Optional whitespace before units (\s*)
  • Case-insensitive unit matching (after lowercase conversion)
  • Comprehensive validation of units and non-negative values

The error message is now descriptive with concrete examples, which improves user experience.

selftests/unit/utils/data_structures.py (1)

142-142: LGTM! Test correctly validates decimal parsing.

The test assertion correctly validates that "0.5g" produces 536870912 bytes (0.5 × 1GB), which aligns perfectly with the enhanced regex-based parsing in the DataSize constructor.

avocado/utils/vmimage.py (2)

121-122: LGTM: Improved network error handling.

Adding OSError to the exception handling is appropriate as it catches broader network-related failures (DNS resolution, connection timeouts, etc.) beyond just HTTP errors. The "Network error" prefix provides clear error categorization.


585-614: LGTM: Proper handling of local vs remote URLs.

The logic correctly distinguishes between local file paths (from cache) and remote URLs, using direct file access for cached images and proper Asset.fetch() for downloads. The metadata preparation ensures proper cache management.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
avocado/utils/vmimage.py (2)

763-841: Excellent implementation of cache-first approach.

This refactoring successfully implements the PR's main objective of checking cache before making network requests. The method now:

  • Properly normalizes parameters and cache directories
  • Attempts to find cached images first, avoiding unnecessary network calls
  • Provides comprehensive error handling with informative messages
  • Maintains proper error chaining for debugging

The implementation aligns perfectly with the goal of improving offline usability while maintaining compatibility with existing asset management.

Note: There's already a past review comment suggesting improvements to the error message clarity in the exception handling blocks around lines 790-820.


891-909: Good error collection improvements, but consider network error strategy.

The error collection and consolidated reporting are excellent improvements for debugging. However, immediately raising network errors on lines 899-903 may prevent other providers from being tried.

Note: There's already a past review comment suggesting that network errors should be collected rather than immediately raised, allowing all providers to be attempted for better resilience.

🧹 Nitpick comments (2)
avocado/utils/vmimage.py (2)

697-700: Simplify condition logic as suggested by static analysis.

The negated condition can be simplified for better readability.

Apply this diff to simplify the condition:

-            if compatible_arches and metadata.get("arch") not in compatible_arches:
-                return False
-
-            return True
+            return not compatible_arches or metadata.get("arch") in compatible_arches

740-743: Consider more specific exception handling.

The broad exception catching could hide important errors. Consider being more specific about expected exceptions during metadata processing.

Consider this more targeted approach:

-            except (OSError, ValueError, KeyError, AttributeError) as e:
-                # Skip assets that can't be processed (common errors during metadata processing)
-                LOG.debug("Skipping asset %s due to error: %s", asset_path, e)
+            except OSError as e:
+                # Skip assets with file access issues
+                LOG.debug("Skipping asset %s due to file access error: %s", asset_path, e)
+            except (ValueError, KeyError) as e:
+                # Skip assets with malformed metadata
+                LOG.debug("Skipping asset %s due to metadata error: %s", asset_path, e)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9be127 and f200241.

📒 Files selected for processing (1)
  • avocado/utils/vmimage.py (6 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
avocado/utils/vmimage.py

697-700: Return the negated condition directly

Inline condition

(SIM103)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (44)
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: Fedora develop install/uninstall task
  • GitHub Check: Fedora selftests
  • GitHub Check: Build Package (wheel/tarball) for Python 3.12
  • GitHub Check: Egg task ubi:8.8
  • GitHub Check: Build Package (wheel/tarball) for Python 3.9
  • GitHub Check: Build Package (wheel/tarball) for Python 3.11
  • GitHub Check: Build Package (wheel/tarball) for Python 3.13
  • GitHub Check: Version task ubuntu:20.04
  • GitHub Check: Version task ubi:8.8
  • GitHub Check: Build Package (wheel/tarball) for Python 3.10
  • GitHub Check: Build Package (wheel/tarball) for Python 3.8
  • GitHub Check: Version task debian:12.4
  • GitHub Check: Smokecheck on Linux with Python 3.11
  • GitHub Check: Smokecheck on Linux with Python 3.10
  • GitHub Check: Windows with Python 3.13
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Static checks
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Analyze (python)
🔇 Additional comments (4)
avocado/utils/vmimage.py (4)

121-122: LGTM! Enhanced error handling for network operations.

Adding OSError to the exception handling is appropriate as it covers additional network-related failures like connection timeouts and DNS resolution issues. The "Network error:" prefix provides consistent error categorization that enables proper error handling downstream.


216-216: LGTM! Version threshold update.

The version threshold update from 38 to 40 appears to be routine maintenance for Fedora infrastructure changes.


585-615: Excellent implementation for offline caching support.

The new logic properly handles cached images by:

  • Checking if the URL is already a local file path and using it directly
  • Properly preparing metadata for asset management
  • Normalizing cache directory handling

This implementation aligns perfectly with the PR objective of improving offline usability by avoiding unnecessary network calls when images are already cached locally.


645-745: Well-architected caching solution that addresses PR objectives.

This implementation successfully addresses the core issue described in the PR objectives - avoiding network calls when images are already cached locally. The approach:

  • Leverages existing Asset infrastructure appropriately with Asset.get_all_assets() and Asset.get_metadata()
  • Provides comprehensive metadata-based matching for cached images
  • Includes proper architecture compatibility handling
  • Supports the offline use case without reinventing Asset functionality

The method effectively solves the problem where vmimage.Image.from_parameters() would hang offline by providing a cache-first approach that skips network provider queries when suitable cached images are found.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
avocado/utils/vmimage.py (3)

763-841: Excellent implementation of cache-first approach.

This directly addresses the PR objectives by checking the local cache before attempting network requests. The parameter normalization and error handling improvements are well done.

Consider incorporating the error message improvements from previous reviews for better user experience:

         except ImageProviderError as e:
             # Preserve network error information for proper categorization
             if "Network error:" in str(e):
                 raise ImageProviderError(
-                    f"Network error: No provider found for {name or 'default'} "
-                    f"(version={version}, build={build}, arch={final_arch}). "
-                    f"No cached image available. {e}"
+                    f"No suitable VM image found for {name or 'any provider'} "
+                    f"(version={version or 'latest'}, build={build or 'any'}, arch={final_arch}). "
+                    f"Try running with internet connection or check if the requested image exists. "
+                    f"Details: {e}"
                 ) from e
             raise ImageProviderError(
-                f"No provider found for {name or 'default'} "
-                f"(version={version}, build={build}, arch={final_arch}). "
-                f"No cached image available. Original error: {e}"
+                f"No suitable VM image found for {name or 'any provider'} "
+                f"(version={version or 'latest'}, build={build or 'any'}, arch={final_arch}). "
+                f"Try running with internet connection or check if the requested image exists. "
+                f"Details: {e}"
             ) from e

645-745: Avoid using private Asset methods and simplify implementation.

The caching functionality addresses the PR objectives well, but there are several implementation concerns:

The use of temp_asset._verify_hash() on line 723 couples this code to Asset's internal implementation and may break if the Asset class changes.

Consider this cleaner implementation:

    @classmethod
    def _find_cached_image(
        cls,
        cache_dirs,
        name=None,
        version=None,
        build=None,
        arch=None,
        checksum=None,
        algorithm=None,
        snapshot_dir=None,
    ):
        """Find a cached image matching the specified criteria."""
        
+        def get_compatible_arches(arch):
+            """Get list of compatible architectures."""
+            compatibility = {
+                "x86_64": ["x86_64", "amd64", "64"],
+                "amd64": ["x86_64", "amd64", "64"],
+                "aarch64": ["aarch64", "arm64"],
+                "arm64": ["aarch64", "arm64"],
+            }
+            return compatibility.get(arch, [arch]) if arch else []
+        
+        def matches_criteria(metadata):
+            """Check if metadata matches the specified criteria."""
+            if metadata.get("type") != "vmimage":
+                return False
+            if name and metadata.get("name", "").lower() != name.lower():
+                return False
+            if version and str(metadata.get("version", "")) != str(version):
+                return False
+            if build and str(metadata.get("build", "")) != str(build):
+                return False
+            
+            compatible_arches = get_compatible_arches(arch)
+            if compatible_arches and metadata.get("arch") not in compatible_arches:
+                return False
+            return True
        
-        arch_compatibility = {
-            "x86_64": ["x86_64", "amd64", "64"],
-            "amd64": ["x86_64", "amd64", "64"],
-            "aarch64": ["aarch64", "arm64"],
-            "arm64": ["aarch64", "arm64"],
-        }
-        compatible_arches = arch_compatibility.get(arch, [arch]) if arch else []
-
-        def matches_image_criteria(metadata, name, version, build, compatible_arches):
-            # ... existing logic ...
        
         for asset_path in asset.Asset.get_all_assets(cache_dirs, sort=False):
             try:
-                temp_asset = asset.Asset(
-                    name=asset_path,
-                    asset_hash=checksum,
-                    algorithm=algorithm,
-                    cache_dirs=cache_dirs,
-                )
-                
-                try:
-                    metadata = temp_asset.get_metadata()
-                    if not metadata:
-                        continue
-                except OSError:
+                # Use Asset class methods instead of instance methods
+                temp_asset = asset.Asset(name=asset_path, cache_dirs=cache_dirs)
+                metadata = temp_asset.get_metadata()
+                if not metadata or not matches_criteria(metadata):
                     continue
                 
-                if matches_image_criteria(metadata, name, version, build, compatible_arches):
-                    if checksum and not temp_asset._verify_hash(asset_path):
-                        LOG.debug("Hash mismatch for cached image: %s", asset_path)
-                        continue
+                # Validate hash using public interface
+                if checksum:
+                    hash_asset = asset.Asset(
+                        name=asset_path,
+                        asset_hash=checksum,
+                        algorithm=algorithm,
+                        cache_dirs=cache_dirs,
+                    )
+                    if not hash_asset.validate_hash():
+                        LOG.debug("Hash mismatch for cached image: %s", asset_path)
+                        continue
                 
                 # Return matching image
                 return cls(...)
             except (OSError, ValueError, KeyError, AttributeError) as e:
                 LOG.debug("Skipping asset %s due to error: %s", asset_path, e)
                 continue

891-909: Consider allowing other providers after network errors.

The error collection and consolidated reporting are excellent improvements. However, the immediate raising of network errors (lines 899-903) prevents other providers from being tried.

Reconsider network-error short-circuit in get_best_provider

Currently any HTTPError or OSError immediately aborts trying other IMAGE_PROVIDERS. To maximize resilience, catch and record network errors in provider_errors so all providers get a chance, then surface a combined error if none succeed.

             except (HTTPError, OSError) as e:
                 LOG.debug(
                     "Network error while creating provider %s: %s", provider.name, e
                 )
-                raise ImageProviderError(f"Network error: {e}") from e
+                provider_errors.append(f"{provider.name}: Network error - {e}")
🧹 Nitpick comments (3)
avocado/utils/vmimage.py (3)

585-614: Good optimization for local file handling.

The logic to directly use local files when the URL is already a file path is a sensible optimization. The metadata preparation ensures proper asset tracking.

Minor suggestion: Consider extracting the cache_dirs preparation logic since it's used in multiple places:

+    def _prepare_cache_dirs(self):
+        """Prepare cache directories list from cache_dir."""
+        if isinstance(self.cache_dir, str):
+            return [self.cache_dir]
+        return self.cache_dir
+
     def download(self):
         # If URL is already a local file path (from cache), just use it directly
         if os.path.isfile(self.url):
             LOG.debug("Using cached image file: %s", self.url)
             asset_path = self.url
         else:
             # Prepare cache_dirs and metadata
-            if isinstance(self.cache_dir, str):
-                cache_dirs = [self.cache_dir]
-            else:
-                cache_dirs = self.cache_dir
+            cache_dirs = self._prepare_cache_dirs()

697-700: Simplify boolean return logic.

The static analysis tool correctly identifies that this can be simplified by returning the boolean condition directly.

-            if compatible_arches and metadata.get("arch") not in compatible_arches:
-                return False
-
-            return True
+            return not (compatible_arches and metadata.get("arch") not in compatible_arches)

Or more readably as part of a broader refactor:

-            if compatible_arches and metadata.get("arch") not in compatible_arches:
-                return False
-
-            return True
+            # Return True if no arch constraint or arch matches
+            return not compatible_arches or metadata.get("arch") in compatible_arches

645-841: Implementation meets PR objectives but consider maintainability improvements.

The changes successfully implement the cache-first approach to solve the offline usability issue described in the PR objectives. The _find_cached_image() method and the refactored from_parameters() method work together to check local cache before attempting network requests.

However, several maintainability concerns from previous reviews remain:

  1. Code duplication: The new caching logic duplicates functionality already present in the Asset utility
  2. Private method usage: Using _verify_hash() couples code to Asset's internal implementation
  3. Missed reuse opportunities: The Asset utility's find_asset_file() method could potentially be leveraged more effectively

Consider the reviewer's earlier suggestion to enhance the existing Asset utility methods rather than implementing parallel functionality. This would provide:

  • Better maintainability by reusing existing, tested code
  • Consistent handling of cache features like expiration
  • Reduced code complexity

Would you like me to help design an approach that better leverages the existing Asset utility while still meeting the cache-first requirements?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f200241 and 42101ab.

📒 Files selected for processing (1)
  • avocado/utils/vmimage.py (5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
avocado/utils/vmimage.py (1)
avocado/utils/asset.py (5)
  • Asset (61-735)
  • fetch (382-442)
  • get_all_assets (532-546)
  • get_metadata (494-523)
  • _verify_hash (371-380)
🪛 Ruff (0.12.2)
avocado/utils/vmimage.py

697-700: Return the negated condition directly

Inline condition

(SIM103)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (43)
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: Egg task debian:12.4
  • GitHub Check: Fedora selftests
  • GitHub Check: Podman Egg task
  • GitHub Check: Egg task ubuntu:22.04
  • GitHub Check: Egg task ubi:9.2
  • GitHub Check: Egg task debian:11.0
  • GitHub Check: Build Package (wheel/tarball) for Python 3.10
  • GitHub Check: Podman spawner with 3rd party runner plugin
  • GitHub Check: Version task ubuntu:22.04
  • GitHub Check: Build Package (wheel/tarball) for Python 3.9
  • GitHub Check: Version task debian:12.4
  • GitHub Check: Smokecheck on Linux with Python 3.10
  • GitHub Check: Smokecheck on Linux with Python 3.11
  • GitHub Check: Version task ubi:8.8
  • GitHub Check: Analyze (python)
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Static checks
🔇 Additional comments (1)
avocado/utils/vmimage.py (1)

121-122: Good improvement to network error handling.

Adding OSError to the exception handling provides more comprehensive coverage of network-related failures, and the "Network error" prefix helps with downstream error categorization and handling.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
avocado/utils/vmimage.py (3)

696-701: Avoid unnecessary Asset instance creation.

Creating Asset instances just for metadata retrieval is inefficient and goes against the existing Asset utility design patterns.

As mentioned in previous reviews, you don't need to create Asset instances for this purpose. The Asset utility likely provides direct methods for metadata access.


714-714: Avoid using private Asset methods.

Using temp_asset._verify_hash() couples this code to Asset's internal implementation and may break if the Asset class changes.

As mentioned in previous reviews, avoid calling private methods like _verify_hash. Use the public Asset interface instead or implement hash verification directly without relying on private methods.


636-737: Consider integrating with existing Asset utility methods.

Based on the PR discussion and past review comments, this method attempts to duplicate functionality already available in the Asset utility. The Asset utility already provides caching and metadata features.

As mentioned in previous reviews, consider using existing Asset utility methods like Asset.find_asset_file() to locate cached assets without needing URLs, rather than reimplementing caching logic. This would be more maintainable and leverage existing cache features like expiration.

🧹 Nitpick comments (2)
avocado/utils/vmimage.py (2)

688-691: Simplify the condition return.

The static analysis tool correctly identifies that this condition can be simplified by returning the negated condition directly.

Apply this diff to simplify the condition:

-            if compatible_arches and metadata.get("arch") not in compatible_arches:
-                return False
-
-            return True
+            return not (compatible_arches and metadata.get("arch") not in compatible_arches)

703-708: Improve metadata retrieval error handling.

The broad OSError catch might mask important issues during metadata retrieval.

Consider more specific error handling:

                try:
                    metadata = temp_asset.get_metadata()
                    if not metadata:
                        continue
-                except OSError:
+                except (OSError, FileNotFoundError, PermissionError):
                    continue
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 42101ab and 6fa13a2.

📒 Files selected for processing (3)
  • avocado/utils/data_structures.py (2 hunks)
  • avocado/utils/vmimage.py (1 hunks)
  • selftests/unit/utils/data_structures.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • avocado/utils/data_structures.py
  • selftests/unit/utils/data_structures.py
🧰 Additional context used
🪛 Ruff (0.12.2)
avocado/utils/vmimage.py

688-691: Return the negated condition directly

Inline condition

(SIM103)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: Egg task ubi:9.2
  • GitHub Check: Egg task fedora:41
  • GitHub Check: Build Egg for Python 3.10
  • GitHub Check: Egg task debian:12.4
  • GitHub Check: Build Egg for Python 3.13
  • GitHub Check: Version task ubuntu:20.04
  • GitHub Check: Build Package (wheel/tarball) for Python 3.10
  • GitHub Check: Version task fedora:40
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Smokecheck on Linux with Python 3.11
  • GitHub Check: Smokecheck on Linux with Python 3.10
  • GitHub Check: Static checks
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Analyze (python)
🔇 Additional comments (3)
avocado/utils/vmimage.py (3)

656-662: Architecture compatibility mapping looks good.

The architecture compatibility groups are well-defined and handle the common aliases correctly (e.g., x86_64/amd64/64, aarch64/arm64).


719-729: Return construction looks correct.

The Image instance creation with cached asset metadata is properly implemented, handling fallbacks for missing metadata fields appropriately.


731-734: Exception handling is appropriately broad.

The broad exception handling for common errors during metadata processing is appropriate here, as various asset types might fail in different ways during processing.

@harvey0100 harvey0100 force-pushed the cachefix branch 2 times, most recently from 21bbe91 to 83d37fe Compare July 31, 2025 13:29
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
avocado/utils/vmimage.py (3)

216-216: Consider separating unrelated changes.

While this version threshold update appears correct for Fedora's image hosting, it's unrelated to the caching functionality. Consider moving such changes to separate commits for cleaner change tracking.


715-715: Avoid using private Asset methods.

Using temp_asset._verify_hash() couples this code to Asset's internal implementation and may break if the Asset class changes.

Consider using the public Asset interface instead:

-                    # pylint: disable-next=W0212
-                    if checksum and not temp_asset._verify_hash(asset_path):
+                    if checksum and not asset.Asset._has_valid_hash(asset_path, checksum, algorithm):

This uses the public class method _has_valid_hash which is designed for this purpose.


853-858: Reconsider network-error short-circuit strategy.

Currently any HTTPError or OSError immediately aborts trying other IMAGE_PROVIDERS. For better resilience, consider allowing other providers to be attempted even after network errors.

Consider this approach for better fault tolerance:

             except (HTTPError, OSError) as e:
                 LOG.debug(
                     "Network error while creating provider %s: %s", provider.name, e
                 )
-                # Network errors should raise immediately
-                raise ImageProviderError(f"Network error: {e}") from e
+                provider_errors.append(f"{provider.name}: Network error - {e}")

This would allow other providers to be tried even if one has network issues.

🧹 Nitpick comments (3)
avocado/utils/vmimage.py (3)

657-662: Consider making architecture compatibility a class constant.

The hardcoded architecture compatibility mapping could be extracted to a class constant to improve maintainability and allow for easier testing or extension.

+    # Architecture compatibility groups
+    ARCH_COMPATIBILITY = {
+        "x86_64": ["x86_64", "amd64", "64"],
+        "amd64": ["x86_64", "amd64", "64"],
+        "aarch64": ["aarch64", "arm64"],
+        "arm64": ["aarch64", "arm64"],
+    }
+
     @classmethod
     def _find_cached_image(
         # ... parameters
     ):
         # ...
-        # pylint: disable-next=invalid-name
-        ARCH_COMPATIBILITY = {
-            "x86_64": ["x86_64", "amd64", "64"],
-            "amd64": ["x86_64", "amd64", "64"],
-            "aarch64": ["aarch64", "arm64"],
-            "arm64": ["aarch64", "arm64"],
-        }
-        compatible_arches = ARCH_COMPATIBILITY.get(arch, [arch]) if arch else []
+        compatible_arches = cls.ARCH_COMPATIBILITY.get(arch, [arch]) if arch else []

689-692: Simplify boolean logic per static analysis hint.

The static analysis tool correctly identifies that this condition can be simplified by returning the negated condition directly.

-            if compatible_arches and metadata.get("arch") not in compatible_arches:
-                return False
-
-            return True
+            return not (compatible_arches and metadata.get("arch") not in compatible_arches)

732-735: Consider more specific exception handling.

The broad exception handling catches many unrelated exceptions. Consider being more specific about which exceptions to handle.

-            except (OSError, ValueError, KeyError, AttributeError) as e:
-                # Skip assets that can't be processed (common errors during metadata processing)
-                LOG.debug("Skipping asset %s due to error: %s", asset_path, e)
-                continue
+            except OSError as e:
+                LOG.debug("Cannot access asset %s: %s", asset_path, e)
+                continue
+            except (ValueError, KeyError) as e:
+                LOG.debug("Invalid metadata for asset %s: %s", asset_path, e)
+                continue
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fa13a2 and 21bbe91.

📒 Files selected for processing (4)
  • avocado/utils/asset.py (3 hunks)
  • avocado/utils/data_structures.py (2 hunks)
  • avocado/utils/vmimage.py (4 hunks)
  • selftests/unit/utils/data_structures.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • selftests/unit/utils/data_structures.py
  • avocado/utils/data_structures.py
🧰 Additional context used
🪛 Ruff (0.12.2)
avocado/utils/vmimage.py

689-692: Return the negated condition directly

Inline condition

(SIM103)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (35)
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: Podman Egg task
  • GitHub Check: Egg task fedora:40
  • GitHub Check: Fedora develop install/uninstall task
  • GitHub Check: Egg task fedora:41
  • GitHub Check: Fedora selftests
  • GitHub Check: Version task ubuntu:20.04
  • GitHub Check: Podman spawner with 3rd party runner plugin
  • GitHub Check: Build Egg for Python 3.8
  • GitHub Check: Version task ubuntu:22.04
  • GitHub Check: Version task ubi:8.8
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Build Package (wheel/tarball) for Python 3.12
  • GitHub Check: Smokecheck on Linux with Python 3.11
  • GitHub Check: Smokecheck on Linux with Python 3.10
  • GitHub Check: Version task debian:12.4
  • GitHub Check: Static checks
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Analyze (python)
  • GitHub Check: Code Coverage (3.11)
🔇 Additional comments (7)
avocado/utils/asset.py (5)

446-448: LGTM - Clear documentation update.

The docstring appropriately reflects the enhanced functionality to handle direct file paths.


459-461: LGTM - Efficient direct file path handling.

The early return for direct file paths is well-placed and efficiently handles the case where self.name is already a valid file path.


463-463: LGTM - More specific comment.

The updated comment provides better clarity about the search mechanism.


492-492: LGTM - Consistent documentation.

The docstring update appropriately reflects the enhanced functionality, consistent with the find_asset_file method.


500-500: Excellent error handling improvements.

The enhanced error handling significantly improves robustness:

  • Added debug logging when asset file is not found
  • Graceful handling of JSON parsing errors and file access issues
  • Appropriate fallback behavior (returning None) when metadata is unavailable

These changes address the brittleness concerns from previous reviews.

Also applies to: 506-513

avocado/utils/vmimage.py (2)

122-122: Good error message consistency.

Adding the "Network error:" prefix improves error message clarity and consistency across the codebase.


845-845: Excellent error collection and reporting improvements.

The enhanced error handling with provider_errors list and consolidated error reporting provides much better debugging information when providers fail.

Also applies to: 852-852, 861-863

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
avocado/utils/vmimage.py (3)

216-216: Consider separating this change into a dedicated commit.

The version threshold update from 38 to 40 appears unrelated to the main caching functionality. As previously noted by a reviewer, this change would be better suited for a separate commit focused on Fedora version handling.


715-715: Avoid using private Asset methods.

Using temp_asset._verify_hash() couples this code to Asset's internal implementation. This was flagged in previous reviews and should be addressed.

Replace the private method call with a public interface:

-                    # pylint: disable-next=W0212
-                    if checksum and not temp_asset._verify_hash(asset_path):
+                    if checksum and not temp_asset.validate_hash():

Or implement hash verification directly without relying on private methods.


853-858: Reconsider immediate network error raising strategy.

The current approach of immediately raising network errors prevents trying other providers that might be accessible. This was previously flagged by a reviewer.

Consider allowing other providers to be tried even after network errors:

             except (HTTPError, OSError) as e:
                 LOG.debug(
                     "Network error while creating provider %s: %s", provider.name, e
                 )
-                # Network errors should raise immediately
-                raise ImageProviderError(f"Network error: {e}") from e
+                provider_errors.append(f"{provider.name}: Network error - {e}")

This would maximize resilience by trying all providers before failing.

🧹 Nitpick comments (1)
avocado/utils/vmimage.py (1)

657-663: Consider moving architecture compatibility mapping to module level.

The ARCH_COMPATIBILITY dictionary is a good addition but could be moved to module level as a constant to improve reusability and reduce method complexity.

+# Architecture compatibility mapping for cached image matching
+ARCH_COMPATIBILITY = {
+    "x86_64": ["x86_64", "amd64", "64"],
+    "amd64": ["x86_64", "amd64", "64"],
+    "aarch64": ["aarch64", "arm64"],
+    "arm64": ["aarch64", "arm64"],
+}
+
 class Image:
     # ... existing code ...
     
     @classmethod
     def _find_cached_image(
         # ... parameters ...
     ):
         # ... existing code ...
-        # pylint: disable-next=invalid-name
-        ARCH_COMPATIBILITY = {
-            "x86_64": ["x86_64", "amd64", "64"],
-            "amd64": ["x86_64", "amd64", "64"],
-            "aarch64": ["aarch64", "arm64"],
-            "arm64": ["aarch64", "arm64"],
-        }
         compatible_arches = ARCH_COMPATIBILITY.get(arch, [arch]) if arch else []
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 21bbe91 and 83d37fe.

📒 Files selected for processing (2)
  • avocado/utils/asset.py (3 hunks)
  • avocado/utils/vmimage.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • avocado/utils/asset.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
avocado/utils/vmimage.py (1)
avocado/utils/asset.py (4)
  • Asset (61-726)
  • get_all_assets (523-537)
  • get_metadata (488-514)
  • _verify_hash (371-380)
🪛 Ruff (0.12.2)
avocado/utils/vmimage.py

689-692: Return the negated condition directly

Inline condition

(SIM103)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (35)
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Code Coverage (3.11)
🔇 Additional comments (3)
avocado/utils/vmimage.py (3)

122-122: LGTM: Improved error message consistency.

The "Network error:" prefix improves error message clarity and consistency with other network error handling in the codebase.


845-852: LGTM: Improved error collection and reporting.

The addition of provider_errors list and enhanced error messages will greatly improve debugging when multiple providers fail.


861-863: LGTM: Clear error reporting for provider failures.

The consolidated error reporting provides better visibility into why providers failed, which will help users diagnose issues.

Copy link
Contributor

@richtja richtja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @harvey0100, thanks for the update, I have two more comments, but IMO we are very close to the merge of this. Also, if you use AI to help you with this please add Assisted-by: tag to your commits. Thank you

@harvey0100 harvey0100 force-pushed the cachefix branch 2 times, most recently from 60b6d50 to 8d5e33d Compare August 12, 2025 14:02
compatible_arches = ARCH_COMPATIBILITY.get(arch, [arch]) if arch else []

def matches_image_criteria(metadata, name, version, build, compatible_arches):
return (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't express myself correctly when I ask to change this method into the one-liner, I meant to remove this method and put the logic into the if on line 691.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah my misunderstanding, thank you i've fixed it now!

Copy link
Contributor

@richtja richtja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @harvey0100, after the latest changes requested by me, the style checks are not passing, can you please revert the changes and move the metadata comparison back to separated method. Sorry about that, my mistake.

Improve VMImage caching and asset handling functionality.

- Enhanced asset.py to handle direct file paths and normal asset names
- Better error handling and logging in get_metadata()
- Refactored _find_cached_image() to use Asset functionality
- Fixed multiple Pylint warnings (too-many-branches, protected-access, etc.)
- Improved network error handling in provider selection

Assisted By: Cursor GPT-5
Signed-off-by: Harvey Lynden <[email protected]>
Update Fedora provider to use archive URLs for versions <= 40
instead of <= 38 to fix download failures for Fedora 39/40.

Assisted By: Cursor GPT-5
Signed-off-by: Harvey Lynden <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
avocado/utils/vmimage.py (1)

665-673: Inline the matching predicate; drop the nested function

Simplify by inlining the match condition where it’s used. It improves readability and avoids an extra indirection.

Apply this diff:

-        def matches_image_criteria(metadata, name, version, build, compatible_arches):
-            return (
-                metadata.get("type") == "vmimage"
-                and (not name or metadata.get("name", "").lower() == name.lower())
-                and (not version or str(metadata.get("version", "")) == str(version))
-                and (not build or str(metadata.get("build", "")) == str(build))
-                and (not compatible_arches or metadata.get("arch") in compatible_arches)
-            )
+        # Inline predicate at use site to avoid nested function

@@
-                if matches_image_criteria(
-                    metadata, name, version, build, compatible_arches
-                ):
+                if (
+                    metadata.get("type") == "vmimage"
+                    and (not name or metadata.get("name", "").lower() == name.lower())
+                    and (not version or str(metadata.get("version", "")) == str(version))
+                    and (not build or str(metadata.get("build", "")) == str(build))
+                    and (not compatible_arches or metadata.get("arch") in compatible_arches)
+                ):

Also applies to: 691-693

🧹 Nitpick comments (6)
avocado/utils/vmimage.py (6)

122-122: Include original exception details (and consider URLError) in message

Right now the raised message loses the original reason; users will only see “Cannot open …”. Consider including the exception string for actionable context. Optionally also catch URLError.

Apply this diff:

-        except HTTPError as exc:
-            raise ImageProviderError(f"Network error: Cannot open {url}") from exc
+        except HTTPError as exc:
+            raise ImageProviderError(
+                f"Network error while fetching {url}: {exc}"
+            ) from exc

Optional (if you want broader coverage):

  • add from urllib.error import URLError to imports
  • extend the except clause to (HTTPError, URLError).

674-676: Defensive: accept either str or list for cache_dirs

If callers pass a single directory as str (common elsewhere in this module), get_all_assets() will be fed a string, which is not the expected iterable of directories.

Apply this diff:

-        for asset_path in asset.Asset.get_all_assets(cache_dirs, sort=False):
+        normalized_cache_dirs = [cache_dirs] if isinstance(cache_dirs, str) else cache_dirs
+        for asset_path in asset.Asset.get_all_assets(normalized_cache_dirs, sort=False):

699-710: Nit: reduce log noise and set cache_dir from the actual asset location

  • “Found matching cached image” is fine at DEBUG; INFO is too chatty.
  • Using cache_dirs[0] may be incorrect if the asset was found in a different cache dir. Derive from asset_path.

Apply this diff:

-                    LOG.info("Found matching cached image: %s", asset_path)
+                    LOG.debug("Found matching cached image: %s", asset_path)
@@
-                        cache_dir=cache_dirs[0],
+                        cache_dir=os.path.dirname(asset_path),

712-715: Narrow exception scope; don’t hide programming errors

Catching ValueError, KeyError, AttributeError here may mask real bugs. Keep to expected IO/JSON issues.

Apply this diff and add import json at the top:

-            except (OSError, ValueError, KeyError, AttributeError) as e:
-                # Skip assets that can't be processed (common errors during metadata processing)
+            except (OSError, json.JSONDecodeError) as e:
+                # Skip assets with inaccessible files or invalid metadata
                 LOG.debug("Skipping asset %s due to error: %s", asset_path, e)
                 continue

825-839: Confirm short-circuit on network errors; it prevents trying other providers

On any (HTTPError, OSError) during provider creation, you raise immediately. That can be desirable, but it also prevents trying other providers that might still work.

Given the PR goal (robust offline handling via cache-first), once the cache-first pre-check is in place, this becomes less critical. Still, please confirm the intended behavior. If you prefer resilience, accumulate these errors and raise after all providers are attempted.

Alternative:

-            except (HTTPError, OSError) as e:
+            except (HTTPError, OSError) as e:
                 LOG.debug(
                     "Network error while creating provider %s: %s", provider.name, e
                 )
-                # Network errors should raise immediately
-                raise ImageProviderError(f"Network error: {e}") from e
+                provider_errors.append(f"{provider.name}: Network error - {e}")
+                continue

695-695: Avoid calling private Asset._verify_hash; expose a public API or verify locally

Relying on _verify_hash() (private) couples to Asset internals and may break if the implementation changes. You have two paths:

Option A – Local verification (no dependency on Asset internals):

  1. At the top of avocado/utils/vmimage.py add:

    import hashlib
    
    def _has_valid_hash(asset_path, checksum, algorithm):
        if not checksum:
            return True
        h = hashlib.new(algorithm)
        with open(asset_path, 'rb') as f:
            for chunk in iter(lambda: f.read(8192), b''):
                h.update(chunk)
        return h.hexdigest().lower() == checksum.lower()
  2. Replace the call at line 695:

    --- a/avocado/utils/vmimage.py
    +++ b/avocado/utils/vmimage.py
    @@ -692,7 +692,7 @@
                         asset_path = temp_asset.local_path
    -                    if checksum and not temp_asset._verify_hash(asset_path):
    +                    if checksum and not _has_valid_hash(asset_path, checksum, temp_asset.algorithm):
                            LOG.debug("Hash mismatch for cached image: %s", asset_path)
                            continue

Option B – Public Asset API (preferred if you can change Asset):

  1. In avocado/utils/asset.py add a wrapper:

    class Asset:
        # …
        def verify_hash(self, asset_path):
            """Public API for integrity-checking the asset file."""
            return self._verify_hash(asset_path)
  2. Then call:

    if checksum and not temp_asset.verify_hash(asset_path):
        LOG.debug("Hash mismatch for cached image: %s", asset_path)
        continue

Since Asset.validate_hash() or a public verifier doesn’t exist today, Option A is the quickest path.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled
  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 025f731 and 5596cd0.

📒 Files selected for processing (2)
  • avocado/utils/asset.py (3 hunks)
  • avocado/utils/vmimage.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • avocado/utils/asset.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
avocado/utils/vmimage.py (2)
avocado/utils/asset.py (4)
  • Asset (61-728)
  • get_all_assets (525-539)
  • get_metadata (488-516)
  • _verify_hash (371-380)
avocado/utils/download.py (1)
  • _verify_hash (194-197)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (44)
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-43-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:fedora-43-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-43-x86_64
  • GitHub Check: rpm-build:fedora-42-x86_64
  • GitHub Check: rpm-build:centos-stream-9-x86_64
  • GitHub Check: rpm-build:fedora-41-x86_64
  • GitHub Check: rpm-build:fedora-41-aarch64
  • GitHub Check: rpm-build:fedora-41-ppc64le
  • GitHub Check: rpm-build:epel-9-x86_64
  • GitHub Check: rpm-build:fedora-41-s390x
  • GitHub Check: rpm-build:fedora-rawhide-x86_64
  • GitHub Check: rpm-build:fedora-43-x86_64
  • GitHub Check: Smokecheck on Linux with Python 3.10
  • GitHub Check: macOS with Python 3.11
  • GitHub Check: Static checks
  • GitHub Check: Smokecheck on Linux with Python 3.11
  • GitHub Check: Version task debian:12.4
  • GitHub Check: Fedora selftests
  • GitHub Check: Code Coverage (3.11)
  • GitHub Check: Code Coverage (3.11)
🔇 Additional comments (1)
avocado/utils/vmimage.py (1)

841-844: Aggregated provider error reporting is a nice improvement

Summarizing provider-specific failures into a single ImageProviderError is helpful for diagnostics. LGTM.

Comment on lines +216 to 219
if self.url_old_images and int(self.version) <= 40:
self.url_versions = self.url_old_images

self.url_images = self.url_versions + "{version}/" + cloud + "/{arch}/images/"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Fedora legacy URL threshold bumped to ≤40 — please verify

Changing the cutoff from older releases to include 40 can be correct, but it’s brittle and provider-specific. Please verify the hosting layout for Fedora 39/40 images (main vs archives) to avoid regressions.


🌐 Web query:

For Fedora 39 and 40 Cloud Base (Generic) QCOW2 images, are releases served from dl.fedoraproject.org or archives.fedoraproject.org? Provide authoritative references and URL patterns.

💡 Result:

Fedora 39 and 40 Cloud Base (Generic) QCOW2 images are served from dl.fedoraproject.org for active releases like 39 and 40, not from archives.fedoraproject.org.

For currently supported Fedora releases, official Cloud Base QCOW2 images use a URL pattern like:

https://download.fedoraproject.org/pub/fedora/linux/releases/<version>/Cloud/x86_64/images/Fedora-Cloud-Base-Generic.x86_64-<version>-<release>.qcow2

Authoritative example for Fedora 40:

https://download.fedoraproject.org/pub/fedora/linux/releases/40/Cloud/x86_64/images/Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2
```[1]

**dl.fedoraproject.org** is the canonical download server for all active (non-EOL) Fedora releases, and images are accessed via the `/pub/fedora/linux/releases/` subdirectory[5][1][10]. This is confirmed by Fedora’s own documentation and current community tutorials referencing these URLs[1][10].

**archives.fedoraproject.org** is used only for **End-Of-Life (EOL) releases**, historical, and unsupported images and packages; it does not host live links for currently-supported Fedora versions[2][8]. Archived image patterns include `/pub/archive/fedora/linux/releases/`, which is not used for active releases.

**Relevant URL patterns:**

| Type               | Download Server               | URL Pattern Example                                                                                  |
|--------------------|------------------------------|------------------------------------------------------------------------------------------------------|
| Active releases    | dl.fedoraproject.org         | `/pub/fedora/linux/releases/40/Cloud/x86_64/images/Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2`   |
| Archived releases  | archives.fedoraproject.org   | `/pub/archive/fedora/linux/releases/36/Cloud/x86_64/images/Fedora-Cloud-Base-Generic.x86_64-36-1.5.qcow2`|

*For Fedora 39 and 40, the images are thus found on* **dl.fedoraproject.org** *using the plain* `/pub/fedora/linux/releases/` *URL pattern*[1][5].

Citations:

- 1: https://joshrnoll.com/deploying-fedora-servers-with-cloudinit-on-proxmox/
- 2: https://discussion.fedoraproject.org/t/how-to-use-fedora-archive-site/76086
- 3: https://discussion.fedoraproject.org/t/cloud-image-update-frequency/130520
- 4: https://blog.christophersmart.com/2016/06/17/booting-fedora-24-cloud-image-with-kvm/
- 5: https://dl.fedoraproject.org
- 6: https://lists.fedoraproject.org/archives/list/[email protected]/2016/6/?count=100
- 7: https://docs.openstack.org/image-guide/obtain-images.html
- 8: https://realtechtalk.com/Centos_567_Install_EPEL_from_Fedora_for_extra_packages_Howto-1438-articles
- 9: https://github.com/containerd/containerd/issues/12124
- 10: https://fedoramagazine.org/setting-up-a-vm-on-fedora-server-using-cloud-images-and-virt-install-version-3/

---



</details>

**Threshold adjustment needed for Fedora URL logic**

Based on Fedora’s current hosting layout, versions 39 and 40 are still served from dl.fedoraproject.org (via `/pub/fedora/linux/releases/...`) and not the archives site. By treating all ≤ 40 as “old” you’ll end up pointing Fedora 40 to the archive URL, causing 404s.

Please update the cutoff so only truly EOL releases use `url_old_images` (e.g. change `<= 40` back to `<= 38`, or better yet, drive this from a maintained EOL list).

Locations to update:
- avocado/utils/vmimage.py: lines 216–219

Suggested diff snippet:
```diff
-       if self.url_old_images and int(self.version) <= 40:
+       # Only use archives for End-Of-Life Fedora releases (currently ≤ 38)
+       if self.url_old_images and int(self.version) <= 38:
            self.url_versions = self.url_old_images
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if self.url_old_images and int(self.version) <= 40:
self.url_versions = self.url_old_images
self.url_images = self.url_versions + "{version}/" + cloud + "/{arch}/images/"
# Only use archives for End-Of-Life Fedora releases (currently ≤ 38)
if self.url_old_images and int(self.version) <= 38:
self.url_versions = self.url_old_images
self.url_images = self.url_versions + "{version}/" + cloud + "/{arch}/images/"
🤖 Prompt for AI Agents
In avocado/utils/vmimage.py around lines 216–219, the Fedora cutoff is too high
(<= 40) causing Fedora 39/40 to be treated as archived; change the conditional
to only use self.url_old_images for truly EOL releases (e.g., replace
"int(self.version) <= 40" with "int(self.version) <= 38") or, preferably,
implement a maintained EOL lookup (e.g., a set/list of EOL versions and test
membership) and use that to decide when to assign self.url_versions =
self.url_old_images; ensure the rest of the URL construction remains the same
and update/add tests if present to cover Fedora 39/40 using the current
dl.fedoraproject.org path.

Comment on lines +636 to +647
@classmethod
def _find_cached_image( # pylint: disable=too-many-locals
cls,
cache_dirs,
name=None,
version=None,
build=None,
arch=None,
checksum=None,
algorithm=None,
snapshot_dir=None,
):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Cache-first helper is not wired in; offline behavior remains unchanged

_find_cached_image() looks good, but it’s not used. from_parameters() calls get_best_provider() first, which triggers network-dependent provider logic. That defeats the PR’s objective (avoid network when a cached image exists).

  • Verify current usage:

  • Suggested integration (add at the start of from_parameters, after setting a default arch):

# Cache-first pre-check before any provider/network call
effective_cache_dirs = [cache_dir] if isinstance(cache_dir, str) else (cache_dir or [tempfile.gettempdir()])
cached = cls._find_cached_image(
    cache_dirs=effective_cache_dirs,
    name=name,
    version=version,
    build=build,
    arch=arch,
    checksum=checksum,
    algorithm=algorithm,
    snapshot_dir=snapshot_dir,
)
if cached:
    return cached

This ensures we return a cached image immediately when available, and only hit providers if not.


🏁 Script executed:

#!/bin/bash
set -e
# Where is _find_cached_image referenced?
rg -n -C2 '\b_find_cached_image\s*\('

Length of output: 307


🏁 Script executed:

#!/bin/bash
set -e
# Locate the `from_parameters` implementation and its signature
rg -n -C3 'def from_parameters' avocado/utils/vmimage.py

Length of output: 220


🏁 Script executed:

#!/bin/bash
set -e
# Show calls to get_best_provider() within the file to confirm order of operations
rg -n -C3 'get_best_provider' avocado/utils/vmimage.py

Length of output: 578


Cache lookup isn’t invoked before network calls in from_parameters

The newly added _find_cached_image() helper is never called, so from_parameters() always calls get_best_provider() (and may hit the network) even when a cached image exists.

Locations to update in avocado/utils/vmimage.py:

  • In from_parameters() (around line 755), immediately after setting the default arch and before provider = get_best_provider(...)

Suggested insertion:

     if arch is None:
         arch = DEFAULT_ARCH

+    # Cache-first pre-check before any provider/network call
+    effective_cache_dirs = (
+        [cache_dir] if isinstance(cache_dir, str)
+        else (cache_dir or [tempfile.gettempdir()])
+    )
+    cached = cls._find_cached_image(
+        cache_dirs=effective_cache_dirs,
+        name=name,
+        version=version,
+        build=build,
+        arch=arch,
+        checksum=checksum,
+        algorithm=algorithm,
+        snapshot_dir=snapshot_dir,
+    )
+    if cached:
+        return cached
+
     provider = get_best_provider(name, version, build, arch)

This ensures an available cached image is returned immediately, avoiding any network-dependent provider logic.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@classmethod
def _find_cached_image( # pylint: disable=too-many-locals
cls,
cache_dirs,
name=None,
version=None,
build=None,
arch=None,
checksum=None,
algorithm=None,
snapshot_dir=None,
):
if arch is None:
arch = DEFAULT_ARCH
# Cache-first pre-check before any provider/network call
effective_cache_dirs = (
[cache_dir] if isinstance(cache_dir, str)
else (cache_dir or [tempfile.gettempdir()])
)
cached = cls._find_cached_image(
cache_dirs=effective_cache_dirs,
name=name,
version=version,
build=build,
arch=arch,
checksum=checksum,
algorithm=algorithm,
snapshot_dir=snapshot_dir,
)
if cached:
return cached
provider = get_best_provider(name, version, build, arch)
🤖 Prompt for AI Agents
In avocado/utils/vmimage.py around lines 755 (inside from_parameters), the new
helper _find_cached_image is not being used so from_parameters always calls
get_best_provider() and may hit the network; after the default arch is
determined (immediately before the provider = get_best_provider(...) call) call
cls._find_cached_image(...) with the same cache_dirs, name, version, build,
arch, checksum, algorithm and snapshot_dir arguments used by from_parameters,
and if it returns a non-None image immediately return that cached image instead
of proceeding to provider selection; ensure the call uses the class (cls) and
mirrors the parameter names so cached lookups short-circuit network provider
logic.

Copy link
Contributor

@richtja richtja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you

@richtja richtja merged commit e0c356c into avocado-framework:master Aug 18, 2025
65 checks passed
@github-project-automation github-project-automation bot moved this from Review Requested to Done 112 in Default project Aug 18, 2025
@github-project-automation github-project-automation bot moved this from Review Requested to Done #100(the 100) in Avocado Kanban Aug 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: Done #100(the 100)
Status: Done 112

Development

Successfully merging this pull request may close these issues.

avocado.utils.vmimage.Image.from_parameters() hangs on network even if base image is in cache

2 participants