Async URL checks #375

Open: wants to merge 8 commits into main


@glenn-jocher (Member) commented Jan 19, 2025

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

This PR refactors URL and link-checking utilities to leverage asynchronous programming, significantly improving efficiency and performance.

📊 Key Changes

  • Replaced synchronous HTTP requests with the asynchronous aiohttp library for URL validation and link checking.
  • Added new async functions is_url_async and check_links_in_string_async; their synchronous counterparts (is_url and check_links_in_string) now call them through asyncio.run.
  • Removed ThreadPoolExecutor and redundant requests code, streamlining logic.
  • Implemented exponential backoff with asyncio.sleep for retry attempts when checking URLs.
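The retry pattern described in the list above can be sketched as follows. This is a minimal illustration, not the code in this PR: the names `check_url_with_backoff`, `check_urls_async`, `check_urls`, and the injected `fetch_status` coroutine are hypothetical, and the retry count and delays are assumed values. Injecting the fetcher keeps the sketch runnable without network access.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable

# Hypothetical type: a coroutine that returns the HTTP status code for a URL.
FetchStatus = Callable[[str], Awaitable[int]]


async def check_url_with_backoff(
    url: str,
    fetch_status: FetchStatus,
    max_attempts: int = 3,  # assumed value, not taken from the PR
    base_delay: float = 0.5,  # assumed value, not taken from the PR
) -> bool:
    """Return True if fetch_status(url) yields a status below 400 within max_attempts."""
    for attempt in range(max_attempts):
        try:
            if await fetch_status(url) < 400:
                return True
        except Exception:
            pass  # treat any request error as a failed attempt
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * 2**attempt)
    return False


async def check_urls_async(
    urls: Iterable[str], fetch_status: FetchStatus, **kwargs
) -> Dict[str, bool]:
    """Check all URLs concurrently on one event loop."""
    urls = list(urls)
    results = await asyncio.gather(
        *(check_url_with_backoff(u, fetch_status, **kwargs) for u in urls)
    )
    return dict(zip(urls, results))


def check_urls(urls: Iterable[str], fetch_status: FetchStatus, **kwargs) -> Dict[str, bool]:
    """Synchronous wrapper that drives the async version with asyncio.run."""
    return asyncio.run(check_urls_async(urls, fetch_status, **kwargs))
```

With aiohttp, `fetch_status` could be a small coroutine that issues `session.get(url)` on a shared `aiohttp.ClientSession` and returns `response.status`; that wiring is left out here so the sketch depends only on the standard library.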

🎯 Purpose & Impact

  • Boosts efficiency 🚀: Asynchronous operations handle multiple requests simultaneously, improving performance when processing many URLs.
  • Better resource utilization ✅: Eliminates blocking I/O and reduces dependency on ThreadPoolExecutor, making the script lighter and more scalable.
  • Improved reliability: Adds fine-tuned error handling and session management to ensure cleaner, more robust link-checking workflows.

This refactor makes link validation faster and more dependable, benefiting developers by reducing delays when working with large datasets or documents containing many URLs.

@glenn-jocher changed the title from "Update URLs to redirects" to "Async URL checks" on Jan 19, 2025
@UltralyticsAssistant added the labels devops (GitHub Devops or MLops) and enhancement (New feature or request) on Jan 19, 2025

@UltralyticsAssistant (Member) commented:

👋 Hello @glenn-jocher, thank you for submitting this Async URL checks 🚀 PR to the ultralytics/actions repository! We appreciate your work in improving the codebase, and the switch to asynchronous programming here is exciting 🌟. To ensure your PR gets reviewed and merged smoothly, please go through the checklist below:

  • ✅ Define a Purpose: You’ve included a detailed summary and purpose in your PR description, great job! Make sure all key details remain accurate, and include any relevant issues if applicable.
  • ✅ Keep Your Branch Up-to-Date: Check if your branch is synced with the latest main branch of ultralytics/actions. You can update it by clicking the 'Update branch' button (if visible) or running git fetch && git rebase origin/main locally.
  • ✅ Pass CI Checks: Ensure all Continuous Integration (CI) pipelines pass. If any errors occur, see the CI logs for troubleshooting.
  • ✅ Add/Update Documentation: Since this introduces asynchronous functions, double-check that the relevant documentation is updated to reflect these changes, especially if they affect existing usage or APIs.
  • ✅ Testing: Validate your changes with relevant tests. If no tests currently cover this, please add them to avoid potential regressions.
  • ✅ Sign CLA: If this is your first contribution to Ultralytics, don’t forget to sign the Contributor License Agreement (CLA) by commenting, "I have read the CLA Document and I sign the CLA."

Regarding Your Changes:

The asynchronous enhancements you’ve introduced promise a massive leap in efficiency and resource utilization for URL validation workflows 🚀. However, a couple of pointers for consideration:

  1. Are all edge cases (like unavailable URLs, 500 responses, etc.) properly handled under the new async implementation? Comprehensive testing would be valuable here.
  2. If applicable, could you supply examples or tests demonstrating the performance benefits of moving from ThreadPoolExecutor to aiohttp with asyncio? This would help in validating and showcasing the improvements.
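One possible shape for the comparison requested in point 2 is sketched below. It is a sketch under stated assumptions, not a measurement of this PR: network latency is simulated with `time.sleep` and `asyncio.sleep`, the `LATENCY` value and the `fake_check_*` and `run_*` helpers are hypothetical, and real results would depend on the target hosts and the thread pool size.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.05  # simulated per-request latency in seconds (assumption, not measured)


def fake_check_sync(url: str) -> bool:
    """Stand-in for a blocking HTTP check."""
    time.sleep(LATENCY)
    return not url.endswith("/missing")


async def fake_check_async(url: str) -> bool:
    """Stand-in for a non-blocking HTTP check."""
    await asyncio.sleep(LATENCY)
    return not url.endswith("/missing")


def run_threaded(urls, max_workers=8):
    """Check URLs with a thread pool; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fake_check_sync, urls))
    return results, time.perf_counter() - start


def run_async(urls):
    """Check URLs concurrently on one event loop; return (results, elapsed seconds)."""

    async def _main():
        return await asyncio.gather(*(fake_check_async(u) for u in urls))

    start = time.perf_counter()
    results = asyncio.run(_main())
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    sample = [f"https://example.com/{i}" for i in range(16)] + ["https://example.com/missing"]
    threaded_results, threaded_time = run_threaded(sample)
    async_results, async_time = run_async(sample)
    print(f"threads: {threaded_time:.3f}s  asyncio: {async_time:.3f}s")
```

Because the pool here is capped at 8 workers while `asyncio.gather` starts every check at once, the async run would be expected to finish in roughly one latency period for this input; a fair benchmark of the actual change would replace the stand-ins with the real checkers and a fixed URL list.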

If you encounter any blockers, refer to the Contributing Guide or leave a comment here. An Ultralytics engineer will review this PR soon to assist you further.

Thanks again for contributing to Ultralytics. We’re excited to see this in action! 🚀✨
