Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --filter argument that filters results based on command exit-code #1545

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
was often useful, it also broke some existing workflows, and there wasn't a good way to opt out of it. And there isn't really a good way for us to add
a way to opt out of it. And you can easily get similar behavior by adding `.git/` to your global fdignore file.
See #1457.
- Add `-f` \ `--filter <command>` argument that allows you to filter results based on the output of a command, as requested in #400.

## Bugfixes

Expand Down
37 changes: 35 additions & 2 deletions src/cli.rs
Original file line number Diff line number Diff line change
Expand Up @@ -776,6 +776,11 @@ impl clap::FromArgMatches for Exec {
.get_occurrences::<String>("exec_batch")
.map(CommandSet::new_batch)
})
.or_else(|| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be implemented in a way that would allow using it along with exec or exec-batch?

matches
.get_occurrences::<String>("filter")
.map(CommandSet::new_filter)
})
.transpose()
.map_err(|e| clap::Error::raw(ErrorKind::InvalidValue, e))?;
Ok(Exec { command })
Expand All @@ -797,7 +802,7 @@ impl clap::Args for Exec {
.allow_hyphen_values(true)
.value_terminator(";")
.value_name("cmd")
.conflicts_with("list_details")
.conflicts_with_all(["list_details", "exec_batch"])
.help("Execute a command for each search result")
.long_help(
"Execute a command for each search result in parallel (use --threads=1 for sequential command execution). \
Expand Down Expand Up @@ -851,7 +856,35 @@ impl clap::Args for Exec {
fd -g 'test_*.py' -X vim\n\n \
- Find all *.rs files and count the lines with \"wc -l ...\":\n\n \
fd -e rs -X wc -l\
"
"
),
)
.arg(
Arg::new("filter")
.action(ArgAction::Append)
.long("filter")
.short('f')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should use a short option for this. After all we don't have a short option for --exec or --exec-batch which I think are probably more common than --filter. And we may want to use -f for something else in the future (maybe a --format option? ). It is easier to add this alias later if desired than to remove it, so let's leave it off for now.

.num_args(1..)
.allow_hyphen_values(true)
.value_terminator(";")
.value_name("cmd")
.conflicts_with_all(["exec", "exec_batch", "list_details"])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to conflict?

Logically, it makes sense to allow filtering items first, and then passing the results that succeed through to exec or exec-batch.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea, I'll consider how we can make these commands compatible.

.help("Execute a command to determine whether each result should be filtered")
.long_help(
"Execute a command in parallel for each search result, filtering out results where the exit code is non-zero. \
There is no guarantee of the order commands are executed in, and the order should not be depended upon. \
All positional arguments following --filter are considered to be arguments to the command - not to fd. \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention that you can end with a \;?

It is therefore recommended to place the '-f'/'--filter' option last.\n\
The following placeholders are substituted before the command is executed:\n \
'{}': path (of the current search result)\n \
'{/}': basename\n \
'{//}': parent directory\n \
'{.}': path without file extension\n \
'{/.}': basename without file extension\n \
'{{': literal '{' (for escaping)\n \
'}}': literal '}' (for escaping)\n\n\
If no placeholder is present, an implicit \"{}\" at the end is assumed.\n\n\
"
),
)
}
Expand Down
62 changes: 59 additions & 3 deletions src/exec/command.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@ struct Outputs {
stdout: Vec<u8>,
stderr: Vec<u8>,
}

/// Used to print the results of commands that run on results in a thread-safe way
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do weweant to print output from the filter command, or just suppress it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it would make more sense for the filter command to simply reduce the results that are matched before they're output to the user as usual.

struct OutputBuffer<'a> {
output_permission: &'a Mutex<()>,
outputs: Vec<Outputs>,
Expand Down Expand Up @@ -67,8 +69,9 @@ pub fn execute_commands<I: Iterator<Item = io::Result<Command>>>(
let output = if enable_output_buffering {
cmd.output()
} else {
// If running on only one thread, don't buffer output
// Allows for viewing and interacting with intermediate command output
// If running on only one thread, don't buffer output; instead just
// write directly to stdout. Allows for viewing and interacting with
// intermediate command output
cmd.spawn().and_then(|c| c.wait_with_output())
};

Expand All @@ -78,7 +81,7 @@ pub fn execute_commands<I: Iterator<Item = io::Result<Command>>>(
if enable_output_buffering {
output_buffer.push(output.stdout, output.stderr);
}
if output.status.code() != Some(0) {
if !output.status.success() {
output_buffer.write();
return ExitCode::GeneralError;
}
Expand All @@ -93,6 +96,59 @@ pub fn execute_commands<I: Iterator<Item = io::Result<Command>>>(
ExitCode::Success
}

/// Executes a command and pushes the path to the buffer if it succeeded with a
/// non-zero exit code.
pub fn execute_commands_filtering<I: Iterator<Item = io::Result<Command>>>(
path: &std::path::Path,
cmds: I,
out_perm: &Mutex<()>,
enable_output_buffering: bool,
) -> ExitCode {
let mut output_buffer = OutputBuffer::new(out_perm);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing to this output buffer skips all the logic in output.rs.

The filter code shouldn't be handling output at all.

Rather, this should be used as one of the filters that we use to exclude files to be processed in walk.rs in the spawn_senders function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I based this on how the --exec command does the same. Should we change that as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, with exec, the output of the command itself is the desired output, but with filter, the desired output is the path (but possibly with some transformations done on it).

The question here is what should we do with the output from the filter command, if any (I imagine in most cases there won't be any).

Some options are:

  1. Pipe to /dev/null or equivalent
  2. Read, and immediately drop
  3. Collect into buffer, then write to stderr, so that error messages are preserved.
  4. Collect all filter output, then print all of it at the end (probably on stderr).

I do think if we do print the output it should be to stderr.


// Convert path to bufferable path string
let path_str = match path.to_str() {
Some(path) => format!("{}\n", path),
None => {
// Probably had non UTF-8 chars in the path somehow
return ExitCode::GeneralError;
}
};
let path_u8 = path_str.as_bytes().to_vec();

for result in cmds {
let mut cmd = match result {
Ok(cmd) => cmd,
Err(e) => return handle_cmd_error(None, e),
};

// Spawn the supplied command.
let output = cmd.output();

match output {
Ok(output) => {
if output.status.success() {
if enable_output_buffering {
// Push nothing to stderr because, well, there's nothing to push.
output_buffer.push(path_u8.clone(), vec![]);
} else {
print!("{}", path_str);
}
} else {
return ExitCode::GeneralError;
}
}
Err(why) => {
return handle_cmd_error(Some(&cmd), why);
}
}
}
output_buffer.write();
ExitCode::Success
}

/// Displays user-friendly error message based on the kind of error that occurred while
/// running a command
pub fn handle_cmd_error(cmd: Option<&Command>, err: io::Error) -> ExitCode {
match (cmd, err) {
(Some(cmd), err) if err.kind() == io::ErrorKind::NotFound => {
Expand Down
3 changes: 3 additions & 0 deletions src/exec/job.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ pub fn job(
cmd: &CommandSet,
out_perm: &Mutex<()>,
config: &Config,
filter: bool,
) -> ExitCode {
// Output should be buffered when only running a single thread
let buffer_output: bool = config.threads > 1;
Expand All @@ -39,7 +40,9 @@ pub fn job(
config.path_separator.as_deref(),
out_perm,
buffer_output,
filter,
);

ret = merge_exitcodes([ret, code]);
}
// Returns error in case of any error.
Expand Down
31 changes: 29 additions & 2 deletions src/exec/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ use argmax::Command;

use crate::exit_codes::{merge_exitcodes, ExitCode};

use self::command::{execute_commands, handle_cmd_error};
use self::command::{execute_commands, execute_commands_filtering, handle_cmd_error};
use self::input::{basename, dirname, remove_extension};
pub use self::job::{batch, job};
use self::token::{tokenize, Token};
Expand All @@ -28,6 +28,8 @@ pub enum ExecutionMode {
OneByOne,
/// Command is run for a batch of results at once
Batch,
/// Command is executed for each search result to determine if it should be filtered
FilterResults,
}

#[derive(Debug, Clone, PartialEq)]
Expand Down Expand Up @@ -76,6 +78,25 @@ impl CommandSet {
})
}

pub fn new_filter<I, T, S>(input: I) -> Result<CommandSet>
where
I: IntoIterator<Item = T>,
T: IntoIterator<Item = S>,
S: AsRef<str>,
{
Ok(CommandSet {
mode: ExecutionMode::FilterResults,
commands: input
.into_iter()
.map(CommandTemplate::new)
.collect::<Result<_>>()?,
})
}

pub fn get_mode(&self) -> ExecutionMode {
self.mode
}

pub fn in_batch_mode(&self) -> bool {
self.mode == ExecutionMode::Batch
}
Expand All @@ -86,12 +107,18 @@ impl CommandSet {
path_separator: Option<&str>,
out_perm: &Mutex<()>,
buffer_output: bool,
filter: bool,
) -> ExitCode {
let commands = self
.commands
.iter()
.map(|c| c.generate(input, path_separator));
execute_commands(commands, out_perm, buffer_output)

if filter {
execute_commands_filtering(input, commands, out_perm, buffer_output)
} else {
execute_commands(commands, out_perm, buffer_output)
}
}

pub fn execute_batch<I>(&self, paths: I, limit: usize, path_separator: Option<&str>) -> ExitCode
Expand Down
1 change: 1 addition & 0 deletions src/exit_codes.rs
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ impl ExitCode {
}
}

/// If any of the exit codes was an error, this returns a GeneralError
pub fn merge_exitcodes(results: impl IntoIterator<Item = ExitCode>) -> ExitCode {
if results.into_iter().any(ExitCode::is_error) {
return ExitCode::GeneralError;
Expand Down
47 changes: 25 additions & 22 deletions src/walk.rs
Original file line number Diff line number Diff line change
Expand Up @@ -408,28 +408,31 @@ impl WorkerState {

// This will be set to `Some` if the `--exec` argument was supplied.
if let Some(ref cmd) = config.command {
if cmd.in_batch_mode() {
exec::batch(rx.into_iter().flatten(), cmd, config)
} else {
let out_perm = Mutex::new(());

thread::scope(|scope| {
// Each spawned job will store its thread handle in here.
let threads = config.threads;
let mut handles = Vec::with_capacity(threads);
for _ in 0..threads {
let rx = rx.clone();

// Spawn a job thread that will listen for and execute inputs.
let handle = scope
.spawn(|| exec::job(rx.into_iter().flatten(), cmd, &out_perm, config));

// Push the handle of the spawned thread into the vector for later joining.
handles.push(handle);
}
let exit_codes = handles.into_iter().map(|handle| handle.join().unwrap());
merge_exitcodes(exit_codes)
})
match cmd.get_mode() {
exec::ExecutionMode::Batch => exec::batch(rx.into_iter().flatten(), cmd, config),
exec::ExecutionMode::OneByOne | exec::ExecutionMode::FilterResults => {
let out_perm = Mutex::new(());
let filter = cmd.get_mode() == exec::ExecutionMode::FilterResults;

thread::scope(|scope| {
// Each spawned job will store its thread handle in here.
let threads = config.threads;
let mut handles = Vec::with_capacity(threads);
for _ in 0..threads {
let rx = rx.clone();

// Spawn a job thread that will listen for and execute inputs.
let handle = scope.spawn(|| {
exec::job(rx.into_iter().flatten(), cmd, &out_perm, config, filter)
});

// Push the handle of the spawned thread into the vector for later joining.
handles.push(handle);
}
let exit_codes = handles.into_iter().map(|handle| handle.join().unwrap());
merge_exitcodes(exit_codes)
})
}
}
} else {
let stdout = io::stdout().lock();
Expand Down