Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process.run(command, shell: true) inconsistent on Windows and POSIX #12873

Open
straight-shoota opened this issue Dec 28, 2022 · 20 comments · May be fixed by #13567
Open

Process.run(command, shell: true) inconsistent on Windows and POSIX #12873

straight-shoota opened this issue Dec 28, 2022 · 20 comments · May be fixed by #13567
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. platform topic:stdlib:system tough-cookie Multi-faceted and challenging topic, making it difficult to arrive at a straightforward decision.

Comments

@straight-shoota
Copy link
Member

Process.run with shell: true behaves differently on Windows and POSIX platforms when the command executable does not exist:

Process.run("filedoesnotexist", shell: true)
  • On POSIX system it returns Process::Status with exit code 127 (which indicates failure).
  • On Windows it raises FileNotFoundError.

Other error conditions might show similar inconsistent behaviour.

IMO raising FileNotFoundError is a more reasonable choice. The user asks to run a command and when that command doesn't even exist, this should usually be considered an error and raise. However, this might need to be considered a breaking change.

Btw. in Ruby `filedoesnotexist` raises the exception Errno::ENOENT on both Linux and Windows.

@straight-shoota straight-shoota added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:system platform labels Dec 28, 2022
@straight-shoota
Copy link
Member Author

Maybe not directly relevant, but there's also this related discussion about shell: true with args on Windows: #9030

@j8r
Copy link
Contributor

j8r commented Dec 28, 2022

FileNotFoundError will be incorrect on Unix-like systems, because the command can also be a built-in shell command.
crystal eval 'Process.run("set", shell: true)' will succeed for instance, but set is not a file.

@straight-shoota
Copy link
Member Author

I don't think that's exactly incorrect. Yes, a command could also be a shell built-in or alias. But when invoking that command fails with error 127 it means there is neither a built-in, alias nor file by that name. So file not found would be acceptable IMO.
It could also use a more specific error type, though.

@j8r
Copy link
Contributor

j8r commented Dec 28, 2022

Other possible valid one word "commands" I found:

  • variables: Process.run("$some_variable", shell: true)
  • one-liner with no spaces: Process.run("ls>/tmp/output", shell: true)

@straight-shoota
Copy link
Member Author

straight-shoota commented Dec 29, 2022

What's the significance of "one word commands? Does that have any special meaning?

It shouldn't matter what the command is (it can even be a complex shell expression). If a command is not found, the shell returns exit code 127. That implicitly means no file was found that would match the command name.
If there would be a more specific error type (such as CommandNotFound error), it should probably be a subtype of FileNotFound. I doubt that such specificity is necessary, though.
Note that Process.run with shell: false also raises FileNotFound if the command cannot be found.

@straight-shoota
Copy link
Member Author

straight-shoota commented Dec 30, 2022

Another difference is that on POSIX the shell besides returning exit code 127 also prints a message to stderr (something like --: 1: filedoesnotexist: not found). On Windows there's no output. I suppose this could be acceptable as functional differences of the operating system / shell being used for executing the command. It might still be worth considering to unify this aspect as well.

@j8r
Copy link
Contributor

j8r commented Dec 30, 2022

I mean, there isn't necessary a file involved in the process for shell expressions.
I wasn't sure if there was any parsing done, with spaces for instance, so in my examples I didn't include any - but that's not the case.

@straight-shoota
Copy link
Member Author

straight-shoota commented Dec 30, 2022

Well, yes when running a command that is actually a shell built-in, no file will be looked up and it doesn't matter if it exists or not. But then it's not in the error track because the command exists.
When it's not a built-in, the shell looks up a file by the name of the command. A file is involved in that scenario. Or rather its absence. And that's a perfect fit for FileNotFound.

@j8r
Copy link
Contributor

j8r commented Dec 30, 2022

Faire enough. Another alternative is to still return a Process::Status with an #exit_reason getter, which will be FileNotFound (can be an enum member).

The advantage being not a breaking change.

@j8r
Copy link
Contributor

j8r commented Dec 30, 2022

I think I may have confused myself with the behavior of shell: false.
It can be good to have in this case the same result between the two.

@straight-shoota
Copy link
Member Author

Another alternative is to still return a Process::Status with an #exit_reason getter, which will be FileNotFound

Yeah that would align with #12284

But I think communicating this as Process::Status is not a good model.

When the command to be executed is not found, the shell doesn't even attempt to run any process because there is nothing to run. So this should be expressed as an error state of Process.run, i.e. raise an exception. Process.run("commandnotexists", shell: true) returning a Process::Status with exit code 127 exposes an implementation detail on POSIX. shell: true is implemented via /bin/sh -c which communicates that the command was not found via exit code 127. The public API has nothing to do with /bin/sh, so this specific error communication should not leak outside of Process.run.

@straight-shoota
Copy link
Member Author

straight-shoota commented Dec 31, 2022

Due to the implementation of shell: true and the mechanics of spawning a process on POSIX systems, there is some inherently different behaviour. On Windows, Process.new(shell: true) directly spawns a process with the given command and raises if the command cannot be found. On POSIX, Process.new(shell: true) only spawns the shell (/bin/sh) which should usually succeed. The shell then executes the command and we can't know whether that was successful until the shell process exits, which requires to wait for it.

So a command not found error can be observed in different places: On windows it's directly raised in Process.new(shell: true). On POSIX, it's later in Process#wait. At that point the original command and whether it's run with a shell are unknown, because Process is a very thin wrapper and doesn't persist that information. That impacts the amount of information that can be available for an error message.

Process.new(shell: false) raises immediately on all platforms.

Process.run calls .new and #wait internally and can handle any kind of error.
But when using the Process.new API with shell: true, errors about the command not being found will be reported in different places (.new vs #wait).

@straight-shoota
Copy link
Member Author

straight-shoota commented Jan 4, 2023

There's an additional complication with mapping exit codes 127 and 126 to File::NotFoundError and File::AccessDeniedError on POSIX: Those exit codes don't necessarily need to come from the shell process telling that it couldn't run the command it received. It could also be passed through from the command that's run within the shell. Still telling that a command couldn't be run, but then it's inside the user-supplied command and not from the transparent wrapper around it.
For a completely transparent implementation of shell: true we would need to differentiate where the exit status originates. If it's from the shell, raise an exception. Otherwise it's just the exit status of the command that's being executed in the shell.
But that would be difficult to implement. I suppose it should probably be okay to just raise on 127 and 126 exit codes.

@mominshaikhdevs
Copy link

mominshaikhdevs commented Jan 10, 2023

FileNotFoundError will be incorrect on Unix-like systems, because the command can also be a built-in shell command.

"aliases" and "built in shell commands" are not unix-clones only thing.

@straight-shoota
Copy link
Member Author

For reference I have an experimental branch for dabbling with this: https://github.com/straight-shoota/crystal/pull/new/fix/process-run-raise-file_not_found

@HertzDevil
Copy link
Contributor

HertzDevil commented Feb 14, 2023

I find it odd that POSIX calls /bin/sh -c but Windows doesn't call %COMSPEC% /C. At least for the backtick (where #9030 does not apply because there are no extra arguments) it should be possible to use built-in shell commands:

`exit 123`   # raises on Windows
$?.exit_code # 123 on POSIX, unreachable on Windows

Now if we do this explicitly, we could obtain the equivalent conventional exit code used to represent a non-existent file or command, and also an error will be printed to the console:

`cmd.exe /v /c filedoesnotexist & exit !ERRORLEVEL!`
# 'filedoesnotexist' is not recognized as an internal or external command,
# operable program or batch file.
$?.exit_code # => 9009

Process.run("filedoesnotexist", shell: true) should probably expand to something like that. The /v and exit !ERRORLEVEL! are necessary to return the actual exit code rather than letting cmd.exe overwrite it, although I imagine it makes escaping certain commands much harder or even impossible.

Also, we are dealing with the shell's syntax here, which is beyond the scope of Process.quote_windows; interpolating the result of Process.quote_windows into the argument to cmd.exe is still unsafe.

I'm not sure if turning `exit 127` into a file-not-found exception is a good idea, because the exit built-in command does indeed exist. And if we are dealing with arbitrary shell commands we cannot reliably identify the file that was meant to be found.

@SamantazFox
Copy link

There's an additional complication with mapping exit codes 127 and 126 to File::NotFoundError and File::AccessDeniedError on POSIX: Those exit codes don't necessarily need to come from the shell process telling that it couldn't run the command it received. It could also be passed through from the command that's run within the shell.

That's not supposed to happen. 126 and 127 are reserved return codes under POSIX:

Exit Code Meaning Example Comments
126 Command invoked cannot execute /dev/null Permission problem or command is not an executable
127 "command not found" illegal_command Possible problem with $PATH or a typo

Source: https://tldp.org/LDP/abs/html/exitcodes.html
(I'd like to link to the POSIX specification, but it's paywalled)

@straight-shoota
Copy link
Member Author

straight-shoota commented Feb 19, 2023

@SamantazFox Yes, the meaning of these exit codes is clearly defined. That shouldn't be an issue. But the reference can be ambiguous.

On windows, there's a clear indicator whether the specified shell command could be executed or not. Process.run("foobar", shell: true) raises when it fails to execute foobar.
On POSIX, the exit code 126 or 127 can mean that foobar failed to execute. But it could also mean that foobar itself executed correctly, but it failed executing another command.

Compare the result of these largely equivalent command invokations on Windows and POSIX:

# Windows
Process.run("cmd /c NUL", shell: true).exit_code # => 1
Process.run("NUL", shell: true).exit_code # File::NotFoundError

# POSIX
Process.run("bash -c /dev/null", shell: true).exit_code # => 126
Process.run("/dev/null", shell: true).exit_code # => 126

@SamantazFox
Copy link

SamantazFox commented Feb 19, 2023

Ah, hmm, I see. I'd expect exit code 126/127 to raise on POSIX systems anyway.
I spent an hour yesterday trying to figure out why Process.run wasn't raising, where the documentation mentions it should.

@straight-shoota
Copy link
Member Author

Yes, the documentation is incorrect when using shell: true on a POSIX system.

@HertzDevil HertzDevil moved this from Todo to In Progress in Windows Support Jun 15, 2023
@straight-shoota straight-shoota added the tough-cookie Multi-faceted and challenging topic, making it difficult to arrive at a straightforward decision. label Oct 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. platform topic:stdlib:system tough-cookie Multi-faceted and challenging topic, making it difficult to arrive at a straightforward decision.
Projects
Status: In Progress
Development

Successfully merging a pull request may close this issue.

5 participants