Help with regex matching syntax #1753
-
Hello, I'm new to this crate entirely, and I'm trying to figure out how to use the grep crate to find matches of words, with a word list like this:
```rust
const WORDLIST: &'static [u8] = b"\n word\n book\n sphinx\n someotherword\n";
```
and then a search function like this:
```rust
// Imports assumed from the `grep` facade crate's re-exports.
use std::error::Error;

use grep::matcher::Matcher;
use grep::regex::RegexMatcher;
use grep::searcher::sinks::UTF8;
use grep::searcher::Searcher;

fn search(pattern: &str, wordlist: &'static [u8]) -> Result<(), Box<dyn Error>> {
    let matcher = RegexMatcher::new_line_matcher(pattern)?;
    let mut matches: Vec<(u64, String)> = vec![];
    Searcher::new().search_slice(&matcher, wordlist, UTF8(|lnum, line| {
        // We are guaranteed to find a match, so the unwrap is OK.
        let mymatch = matcher.find(line.as_bytes())?.unwrap();
        matches.push((lnum, line[mymatch].to_string()));
        Ok(true)
    }))?;
    println!("matches:{:?}", matches);
    Ok(())
}
```
And I want to be able to use a pattern like this
Replies: 3 comments 1 reply
-
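Note: I edited your comment to use backticks instead of double quotes around your regexes. Without those, the `*` characters are usually interpreted as Markdown, which means it's rendered in a way you don't expect.

Note: I converted this to a discussion question since it seems more like a support question rather than a bug report or a feature request.

As for your question, it looks like you're asking for help with constructing the regex itself? And not the actual code? If so, you're pretty close. The `*` itself is a *unary operator*, meaning that it takes a single argument. In regex syntax, the `*` applies to the "thing" preceding it. In your case, `s*nx`, the `*` applies to the `s`, which modifies it to mean, "match the letter `s` zero or more times." Here are some other examples:

- `ab*c`, the `*` applies to `b` only.
- `(ab)*c`, the `*` applies to `ab`, and says, "match zero or more occurrences of `ab`."

If you're used to globs, then this syntax is different. In globs, `*` is not an operator but a wildcard itself. The regex equivalent of it is more like, `.*`, which says, "match any character zero or more times." In regex syntax, the `.` is a wildcard meaning, "match any character."

So in your case, it *sounds* like you want `s.*nx` instead of `s*nx`.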
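
To make the difference between `s*nx` and `s.*nx` concrete, here is a toy matcher written with only the standard library. It is a sketch for illustration, not the `regex` or `grep` crate's implementation: `toy_find` and `match_len` are hypothetical names, it handles only literal bytes, `.`, and `*` applied to a single preceding byte (no groups such as `(ab)*`), and it assumes ASCII input.

```rust
/// Toy matcher (illustration only, NOT the `regex`/`grep` crates).
/// Returns the leftmost match of `pattern` in `text`, if any.
/// Assumes ASCII input; supports literals, `.` and `*` only.
fn toy_find<'t>(pattern: &str, text: &'t str) -> Option<&'t str> {
    let (p, t) = (pattern.as_bytes(), text.as_bytes());
    // Leftmost match: try each start offset in order.
    (0..=t.len()).find_map(|start| {
        match_len(p, &t[start..]).map(|len| &text[start..start + len])
    })
}

/// Length of a match of `p` anchored at the start of `t`, if there is one.
fn match_len(p: &[u8], t: &[u8]) -> Option<usize> {
    match p {
        // Empty pattern: matches the empty string.
        [] => Some(0),
        // `item*`: the star binds only to the single item just before it.
        [item, b'*', rest @ ..] => {
            // How many leading bytes of `t` the starred item could consume.
            let max = t
                .iter()
                .take_while(|&&b| *item == b'.' || b == *item)
                .count();
            // Greedy: try the longest repetition first, then back off to zero.
            (0..=max)
                .rev()
                .find_map(|n| match_len(rest, &t[n..]).map(|m| n + m))
        }
        // A single item: `.` matches any byte, anything else matches itself.
        [item, rest @ ..] => match t {
            [first, tail @ ..] if *item == b'.' || item == first => {
                match_len(rest, tail).map(|m| 1 + m)
            }
            _ => None,
        },
    }
}

fn main() {
    for pattern in ["s*nx", "s.*nx", "ab*c"] {
        for word in ["sphinx", "book", "ac"] {
            println!("{:>6} on {:<8} -> {:?}", pattern, word, toy_find(pattern, word));
        }
    }
}
```

With `s*nx`, the only thing that can match in `sphinx` is the trailing `nx`, because the `s*` part matches zero `s` characters there. With `s.*nx`, the match covers the whole word.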
-
Ah, ok. That makes sense. So potentially I could also do something like this: `[a-z]*` if I wanted to match zero or more of any character between `a` and `z`.
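
For reference, `*` means zero or more repetitions, so `[a-z]*` can also match an empty string. The standard `+` operator is the one that requires at least one repetition. A minimal std-only sketch that mimics the anchored patterns `^[a-z]*` and `^[a-z]+` follows. The function names are hypothetical, and this is not how the regex crate works internally.

```rust
/// Mimics `^[a-z]*`: the longest run of a..=z at the start of `text`.
/// The run may be empty, so this always "matches".
fn star_lowercase(text: &str) -> &str {
    let len = text.bytes().take_while(|b| b.is_ascii_lowercase()).count();
    // Every counted byte is ASCII, so `len` is a valid char boundary.
    &text[..len]
}

/// Mimics `^[a-z]+`: same run, but an empty run is not a match.
fn plus_lowercase(text: &str) -> Option<&str> {
    let run = star_lowercase(text);
    if run.is_empty() {
        None
    } else {
        Some(run)
    }
}

fn main() {
    for text in ["word", "word1", "123"] {
        println!(
            "{:<6} star: {:?}  plus: {:?}",
            text,
            star_lowercase(text),
            plus_lowercase(text)
        );
    }
}
```

On `123`, the star version still succeeds with an empty match, while the plus version reports no match.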
On Sun, Dec 6, 2020 at 5:30 AM Andrew Gallant wrote:

> Note: I edited your comment to use backticks instead of double quotes around your regexes. Without those, the `*` characters are usually interpreted as Markdown, which means it's rendered in a way you don't expect.
>
> Note: I converted this to a discussion question since it seems more like a support question rather than a bug report or a feature request.
>
> As for your question, it looks like you're asking for help with constructing the regex itself? And not the actual code? If so, you're pretty close. The `*` itself is a *unary operator*, meaning that it takes a single argument. In regex syntax, the `*` applies to the "thing" preceding it. In your case, `s*nx`, the `*` applies to the `s`, which modifies it to mean, "match the letter `s` zero or more times." Here are some other examples:
>
> - `ab*c`, the `*` applies to `b` only.
> - `(ab)*c`, the `*` applies to `ab`, and says, "match zero or more occurrences of `ab`."
>
> If you're used to globs, then this syntax is different. In globs, `*` is not an operator but a wildcard itself. The regex equivalent of it is more like, `.*`, which says, "match any character zero or more times." In regex syntax, the `.` is a wildcard meaning, "match any character."
>
> So in your case, it *sounds* like you want `s.*nx` instead of `s*nx`.
-
Alrighty, thanks for the help.
On Mon, Dec 7, 2020 at 11:37 AM Andrew Gallant wrote:

> Yes. I might suggest looking at a few regex tutorials to get your bearings. I don't have any particular one to recommend though unfortunately.