Skip to content

Conversation

peterhuene
Copy link
Contributor

@peterhuene peterhuene commented Aug 31, 2025

Previously, File and Directory task inputs were evaluated and host paths were translated to guest paths just prior to command evaluation.

However, private declarations of tasks may have already evaluated references to the inputs which can cause host paths to end up in the evaluated command.

This commit refactors input evaluation so that the evaluation scope stores the already mapped values of File and Directory inputs.

Additionally, input localization was also refactored out of the backends and into task evaluation; backends no longer have to concern themselves with mapping guest paths or downloading remote inputs.

The standard library functions dealing with reading and writing files had to be updated to translate guest paths to host paths in the case of reading and host paths to guest paths in case of writing.

The InputTrie was refactored to immediately calculate a guest path for a newly inserted input (assuming the backend uses containers). Additionally, it is not responsible for performing guest to host path translation.

Fixes #574

Before submitting this PR, please make sure:

For external contributors:

  • You have read the CONTRIBUTING guide in its entirety.
  • You have not used AI on any parts of this pull request.

For all contributors:

  • You have added a few sentences describing the PR here.
  • Your code builds clean without any errors or warnings.
  • You have added tests (when appropriate).
  • You have updated the README or other documentation to account for these
    changes (when appropriate).
  • You have added an entry to the relevant CHANGELOG.md (see
    "keep a changelog" for more information).
  • Your commit messages follow the conventional commit style.
  • Your PR title follows the conventional commit style.

@peterhuene peterhuene self-assigned this Aug 31, 2025
@peterhuene peterhuene requested a review from a team as a code owner August 31, 2025 04:32
@peterhuene peterhuene requested a review from acfoltzer August 31, 2025 04:32
@github-actions github-actions bot added the S-awaiting-CI State: awaiting the CI to pass label Aug 31, 2025
@peterhuene peterhuene force-pushed the fix-localization branch 6 times, most recently from 5f5b7c1 to 68bc3ed Compare August 31, 2025 16:08
@github-actions github-actions bot removed the S-awaiting-CI State: awaiting the CI to pass label Aug 31, 2025
@github-actions github-actions bot added the S-awaiting-CI State: awaiting the CI to pass label Aug 31, 2025
@github-actions github-actions bot added S-awaiting-CI State: awaiting the CI to pass and removed S-awaiting-CI State: awaiting the CI to pass labels Aug 31, 2025
@github-actions github-actions bot removed the S-awaiting-CI State: awaiting the CI to pass label Aug 31, 2025
Copy link
Member

@acfoltzer acfoltzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review not yet complete, but I've gotta step away for the moment

Comment on lines 639 to 652
/// Constructs a new inputs trie with the given guest inputs directory.
pub fn new(guest_inputs_dir: &'static str) -> Self {
Self {
guest_inputs_dir: Some(guest_inputs_dir),
..Default::default()
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having both a new() and a default() is ripe for confusion. If there are two primary configurations for new tries, I think they should each have constructors with descriptive names, and we should forego a Default implementation.

Copy link
Contributor Author

@peterhuene peterhuene Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I'll remove the Default impl.

Actually, clippy prefers a Default impl, so I'll remove the new method instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so that's why we have so many Default impls! I don't think that is a good lint: now we have new_with_guest_dir(), but if you're looking through the Rustdoc for this type that is the only method that starts with new. It's easy to assume, therefore, that this is the only way an InputTrie can be created, which of course is not the case. I made a PR to disable it here: #595

Now, freed of clippy's hectoring, we can name these the way we'd like ✊🏻

@acfoltzer acfoltzer self-requested a review September 3, 2025 21:33
Previously, `File` and `Directory` task inputs were evaluated and host paths
were translated to guest paths just prior to command evaluation.

However, private declarations of tasks may have already evaluated references to
the inputs which can cause host paths to end up in the evaluated command.

This commit refactors input evaluation so that the evaluation scope stores the
already mapped values of `File` and `Directory` inputs.

Additionally, input localization was also refactored out of the backends and
into task evaluation; backends no longer have to concern themselves with
mapping guest paths or downloading remote inputs.

The standard library functions dealing with reading and writing files had to be
updated to translate guest paths to host paths in the case of reading and host
paths to guest paths in case of writing.

The `InputTrie` was refactored to immediately calculate a guest path for a
newly inserted input (assuming the backend uses containers). Additionally, it
is not responsible for performing guest to host path translation.

Fixes stjude-rust-labs#574
WDL 1.2 made required `File` or `Directory` inputs that don't exist an error.

If the inputs are optional and do not exist, they evaluate to `None`.

However, in 1.1, they are explicitly allowed to not exist until accessed.

This commit limits the check for existence to 1.2+ documents.
peterhuene and others added 3 commits September 3, 2025 23:01
Co-authored-by: Adam C. Foltzer <[email protected]>
Fix check in the Docker backend that requires an input to exist for it to be
included in the bind mounts.

The underlying problem is that a backend input should only be added for a
specified input as it may contain host or remote paths; for inputs that are
defaulted, the evaluated expression will only contain guest paths and therefore
does not need an input.
@github-actions github-actions bot added the S-awaiting-CI State: awaiting the CI to pass label Sep 4, 2025
@github-actions github-actions bot removed the S-awaiting-CI State: awaiting the CI to pass label Sep 4, 2025
peterhuene and others added 2 commits September 4, 2025 10:41
Add an assertion for ensuring the first entry in the inputs collection is the
temp directory.
Copy link
Member

@claymcleod claymcleod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a bit of a tough one. That being said, I've now looked at everything, and it looks good the best I can tell.

("https://example.com/foo/baz/bar.txt", "/inputs/22/bar.txt"),
("https://example.com/bar/foo/foo.txt", "/inputs/26/foo.txt"),
("https://example.com/bar/foo/bar.txt", "/inputs/26/bar.txt"),
("https://foo.com/bar", "/inputs/29/bar"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I didn't get far enough to leave this comment yesterday, but seeing the line count I recommend we move InputTrie and its supporting types into its own module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, it could use its own module; I'll refactor that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will confess that I do not really understand the workings of the changes in this file. I reviewed them for obvious safety and low-level style issues, but I think I'll need more hands-on time implementing localization myself to really understand the intricacies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: File to String coercion in task private declaration not mapped to guest
3 participants