-
Notifications
You must be signed in to change notification settings - Fork 501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Globus support: download optimizations #11125
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…d/counted with gestbook popups enabled. #11057
Plus implements proper checks before deleting access rules, ensuring that no other active tasks are still using them. #11057
I un-drafter the PR; waiting for a Jenkins test, will put the PR on the board if it passes |
This comment has been minimized.
This comment has been minimized.
I just killed the last Jenkins build (no. 14), because it was triggered by a commit of a typo fix in a comment. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…l debt from 1.5 yrs and 6 releases ago; an annoying inefficiency in the dataset page that we forgot about, that came up during the work on the pr. #11057
This comment has been minimized.
This comment has been minimized.
📦 Pushed preview images as
🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name. |
Looks like the error is not related to the PR - a consequence of testing #11068 first which drops a datasetversion column. |
@landreev How would I get past this?: Screen.Recording.2025-03-06.at.12.34.45.PM.movEdit: Jim hooked me up. All's I needed to do was download connect-personal and set that up. |
Testing Evidences: Parallel Upload successful with large files: Parallel Downloading: https://github.com/user-attachments/assets/4540dda7-2cb0-4989-8ee3-c6bb49c876cf |
Maybe not related to the PR but here is an observation I made: Screen.Recording.2025-03-06.at.2.32.47.PM.mov |
A separate PR has been opened to track this: #11316. Merging PR. |
Thank you for merging the PR and for the detailed report! Opening the new issue was the right thing to do, that's what I would have recommended myself. I'll add a comment there. |
What this PR does / why we need it:
The PR improves the reliability of the Globus download framework, primarily by extending the new task-monitoring system first implemented for uploads in #10781. There are a few other improvements. For example, it fixes a somewhat exotic bug where Globus downloads weren't counted, but only for multi-file downloads and only when a guestbook popup was enabled.
The improvements are implemented in response to/based on the experience with the NESE Data Storage integration with Dataverse at IQSS. Most of these are already in prod. use there, deployed as an experimental beta build.
Which issue(s) this PR closes:
Special notes for your reviewer:
The single most important improvement is that, similarly to what was implemented in #10781 for uploads, the ongoing download tasks can now be monitored asynchronously, with the state saved in the database. This makes the management of the temporary access rules more robust, guaranteeing that Dataverse will register the completion of each task, even if there was a server restart in between.
Assorted other fixes and improvements were added. For example, it is now possible for a user to have simultaneous downloads runnig on files from the same dataset (when the task state is saved in the database, upon completion of a task it is easy to check if any other active tasks are using the same access rule and, if so, avoid deleting it). I fixed something misguided I did in #10781 when I first implemented saving the task state in the database: I misunderstood how the client tokens worked; and tried to save the token for each task, thinking that it needed to be reused throughout the life of the task. In reality, the same token can be used for multiple tasks on the same endpoint; also, for a long-running task that saved token has a good chance of expiring - and I didn't have a provision for that. Now the monitoring service simply caches the access tokens for each endpoint that it manages, and refreshes them as needed. General error handling, logging and state checking has been improved.
Suggestions on how to test this:
dataverse-internal has active Globus configuration tied to a fully-functioning remote storage volume at NESE, identical to our prod. volume there. There are existing datasets with multiple large-ish (100MB+) Globus-stored files (for example: https://dataverse-internal.iq.harvard.edu/dataset.xhtml?persistentId=doi:10.70122/FK2/QZQPQE); more data can be uploaded for testing as needed. QA would involve verifying that such files can be downloaded; with an emphasis on repeating downloads from the same dataset, and perhaps parallel downloads of different files from the same dataset at the same time. For the end user, everything should work as described in the Globus download instruction (written for the prod. users, but fully applies to the test configuration on internal).
One extra thing not directly related to the Globus tech fixed in this PR: there was a glaring inefficiency in the dataset page code in how it was looking up various external tools and previewers for each file on the page, introduced 6 releases ago (!) but then forgotten about. It was re-discovered during this effort and I committed the fixes since they were trivial. So, please check on the previewers and such, before and after, and confirm that they are still shown on the page properly.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: