-
Notifications
You must be signed in to change notification settings - Fork 1.2k
downloaded file debugger accessible #3338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
<!-- ELLIPSIS_HIDDEN --> > [!IMPORTANT] > Enhance browser session file management by adding S3 file handling and session-specific file listing capabilities. > > - **Behavior**: > - Add `need_date` parameter to `_watch_and_sync_directory()` in `browser_controller.py` to control date inclusion in S3 key. > - Handle file deletion in S3 when `Change.deleted` is detected in `browser_controller.py`. > - Update `start_artifact_sync()` in `browser_controller.py` to set `need_date=False`. > - **Session Management**: > - Add `browser_session_id` and `organization_id` parameters to `get_browser_state_from_network_info()` in `persistent_sessions_manager.py`. > - Add `browser_session_id` to various browser creation functions in `special_browsers.py`. > - **File Management**: > - Implement `list_downloaded_files_in_browser_session()` and `list_downloading_files_in_browser_session()` in `base.py`, `local.py`, and `s3.py`. > - Update `download_file()` and `wait_for_download_finished()` in `files.py` to handle S3 URIs. > - Add `delete_file()` method to `aws.py` for S3 file deletion. > - **Misc**: > - Change `DownloadFileMaxWaitingTime` exception to use `list[str]` instead of `list[Path]` in `exceptions.py`. > - Update `handle_click_to_download_file_action()` in `handler.py` to check files in browser session storage. > > <sup>This description was created by </sup>[<img alt="Ellipsis" src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=Skyvern-AI%2Fskyvern-cloud&utm_source=github&utm_medium=referral)<sup> for 465124432b6e9e05df32437aacacd00cbe731378. You can [customize](https://app.ellipsis.dev/Skyvern-AI/settings/summaries) this summary. It will automatically update as commits are pushed.</sup> <!-- ELLIPSIS_HIDDEN -->
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed 97568e4 in 1 minute and 23 seconds. Click for details.
- Reviewed
314
lines of code in9
files - Skipped
0
files when reviewing. - Skipped posting
8
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. skyvern/forge/sdk/api/aws.py:117
- Draft comment:
Replace 'if storage_class not in S3StorageClass' with an explicit Enum type check (e.g. using isinstance or comparing storage_class.value) so the check works reliably. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. skyvern/forge/sdk/api/files.py:100
- Draft comment:
Consider refactoring the download_file function to separate HTTP request handling and file writing for improved readability and testability. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. skyvern/forge/sdk/artifact/storage/local.py:236
- Draft comment:
Abstract out repeated directory creation logic (e.g. _create_directories_if_not_exists) to reduce duplication. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. skyvern/forge/sdk/artifact/storage/s3.py:303
- Draft comment:
Ensure robust handling of S3 key formatting and presigned URL generation; add error-handling for edge cases in upload/download flows. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. skyvern/webeye/actions/handler.py:860
- Draft comment:
Action handler functions are deeply nested and catch broad exceptions; refactor into smaller helper functions and use more specific exception types for improved maintainability. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. skyvern/webeye/browser_factory.py:510
- Draft comment:
Subprocess.Popen usage to launch Chrome with a fixed sleep may conflict in containerized environments; make delays and port checks more robust and configurable. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. skyvern/forge/agent.py:526
- Draft comment:
Typo: In the warning message, "incompleted download file" should be updated to "incomplete download file" for clarity. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. skyvern/webeye/actions/handler.py:903
- Draft comment:
Typo: The comment text "check if there's any file is still downloading" is unclear. Consider rephrasing it to "check if there are any files still downloading". - Reason this comment was not posted:
Comment was on unchanged code.
Workflow ID: wflow_AoGR7rVSKTfkoFjW
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed everything up to 97568e4 in 1 minute and 23 seconds. Click for details.
- Reviewed
314
lines of code in9
files - Skipped
0
files when reviewing. - Skipped posting
9
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. skyvern/webeye/browser_factory.py:42
- Draft comment:
Consider abstracting the duplicate logic between _create_headless_chromium and _create_headful_chromium. Both functions share similar steps (e.g. setting up user data, download directories, and updating browser preferences) and could be refactored into a common helper to DRY up the code. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. skyvern/webeye/browser_factory.py:676
- Draft comment:
The navigate_to_url method uses a fixed sleep (5 seconds) after page.goto. Consider replacing this fixed delay with a dynamic wait (e.g., using page.wait_for_load_state('networkidle')) to make navigation more robust and reduce unnecessary waiting. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. skyvern/webeye/browser_factory.py:716
- Draft comment:
The get_working_page() method assumes the last page in browser_context.pages is the working page. This heuristic might be unsafe with multi-page scenarios. Consider a more robust selection mechanism or documenting this assumption clearly. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. skyvern/webeye/browser_factory.py:46
- Draft comment:
In set_browser_console_log, the console event listener is registered on the browser context but there is no removal logic when closing the context. This may lead to resource leaks if contexts are frequently recreated. Consider removing the listener on cleanup. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. skyvern/webeye/browser_factory.py:498
- Draft comment:
In _create_cdp_connection_browser, synchronous time.sleep(1) is used after starting Chrome via subprocess. Since this code runs in an async context, consider replacing time.sleep(1) with await asyncio.sleep(1) to avoid blocking the event loop. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. skyvern/webeye/browser_factory.py:240
- Draft comment:
The code relies on 'asyncio.timeout' for timeout handling. Ensure that the target Python runtime supports this (Python 3.11+) or provide a compatibility layer for earlier versions. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. skyvern/forge/agent.py:526
- Draft comment:
Typo: The log message uses 'incompleted'. Consider changing it to 'incomplete' for grammatical correctness. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. skyvern/forge/sdk/api/files.py:199
- Draft comment:
Typographical issue: The comment reads "check if there's any file is still downloading". Consider rephrasing it, e.g., "check if any file is still downloading", for clarity. - Reason this comment was not posted:
Comment was on unchanged code.
9. skyvern/webeye/actions/handler.py:904
- Draft comment:
The comment 'check if there's any file is still downloading' is a bit unclear grammatically. Consider rephrasing it to something like 'check if any files are still downloading' for clarity. - Reason this comment was not posted:
Comment was on unchanged code.
Workflow ID: wflow_miSJUM8K3aP3xMMU
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Enhance file download handling with browser session awareness and S3 support, updating several functions and classes for improved file management.
agent.py
andhandler.py
.s3.py
.DownloadFileMaxWaitingTime
to acceptlist[str]
instead oflist[Path]
inexceptions.py
.execute_step()
inagent.py
to check and list files in browser sessions.list_downloaded_files_in_browser_session()
andlist_downloading_files_in_browser_session()
inbase.py
,local.py
, ands3.py
.download_file()
inaws.py
to handle S3 URIs and download files locally.list_downloading_files_in_directory()
infiles.py
to returnlist[str]
.delete_file()
method inaws.py
for S3 file deletion.check_and_fix_state()
inbrowser_factory.py
to handle browser session pages.This description was created by
for 97568e4. You can customize this summary. It will automatically update as commits are pushed.
📁 This PR enhances browser session file management by adding comprehensive S3 file handling capabilities and session-specific file listing for downloaded files in browser automation workflows.
🔍 Detailed Analysis
Key Changes
list_downloaded_files_in_browser_session()
andlist_downloading_files_in_browser_session()
to the base storage class with implementations for both local and S3 storagePath
objects to string paths to accommodate S3 URIs in theDownloadFileMaxWaitingTime
exceptionTechnical Implementation
Impact
Created with Palmier