recognize \R for multiline regex in groovy parsers #1901

Open
wants to merge 1 commit into
base: main

Bananeweizen
Contributor

\R matches any Unicode newline sequence and may appear in custom Groovy parsers in addition to the usual \r and \n.
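
For illustration, a minimal sketch of how \R behaves in java.util.regex compared with an explicit \r/\n alternation. The class and method names are hypothetical and not part of the plugin; the sample characters are a selection, not an exhaustive list.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of warnings-ng-plugin.
final class LinebreakMatcherSketch {
    // \R is the linebreak matcher of java.util.regex (Java 8 and later).
    private static final Pattern ANY_LINEBREAK = Pattern.compile("\\R");
    // The "usual" escapes: CR, LF, or CR followed by LF.
    private static final Pattern CR_OR_LF = Pattern.compile("\\r\\n?|\\n");

    /** Returns true if the whole input is one linebreak sequence according to \R. */
    static boolean isAnyLinebreak(String input) {
        return ANY_LINEBREAK.matcher(input).matches();
    }

    /** Returns true if the whole input is CR, LF, or CRLF. */
    static boolean isCrOrLf(String input) {
        return CR_OR_LF.matcher(input).matches();
    }

    public static void main(String[] args) {
        // LF, CRLF, CR, NEXT LINE (U+0085), LINE SEPARATOR (U+2028)
        String[] samples = {"\n", "\r\n", "\r", "\u0085", "\u2028"};
        for (String sample : samples) {
            System.out.println(isAnyLinebreak(sample) + " / " + isCrOrLf(sample));
        }
    }
}
```

A custom parser that uses \R therefore also spans line boundaries that a check for \r and \n alone would not notice.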

Testing done

None. I am currently behind a company proxy that prevents the Maven build from downloading all dependencies. If you need specific tests, I would have to rework this from a private machine.

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests that demonstrate the feature works or fix the issue

@KalleOlaviNiemitalo
Contributor

If the regex contains \\n (or some other even number of consecutive backslashes) then this will incorrectly assume that the regex can match a newline. And now this PR expands the flaw to \\R too. Will the incorrect heuristic cause any harmful effect?
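
To make the concern concrete, here is a sketch of a substring-based check as described in this thread, next to a stricter variant that counts the backslashes in front of the escape letter. Both methods and the class name are hypothetical illustrations, not the plugin's code. The stricter variant is still only a heuristic: it ignores \Q...\E quoting and other constructs that can match a newline, such as \s.

```java
// Hypothetical demo class, not part of warnings-ng-plugin.
final class NewlineEscapeHeuristicSketch {
    private static final String NEWLINE_ESCAPE_LETTERS = "nrR";

    /** Substring check: reports a newline escape wherever backslash + letter appears. */
    static boolean naiveCheck(String regex) {
        return regex.contains("\\n") || regex.contains("\\r") || regex.contains("\\R");
    }

    /**
     * Parity check: the letter only counts as an escape if it is preceded by
     * an odd number of consecutive backslashes. An even number means the
     * backslashes escape each other and the letter is a plain literal.
     */
    static boolean parityCheck(String regex) {
        int backslashes = 0;
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {
                backslashes++;
            } else {
                if (backslashes % 2 == 1 && NEWLINE_ESCAPE_LETTERS.indexOf(c) >= 0) {
                    return true;
                }
                backslashes = 0;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Regex text a\\nb: a literal backslash followed by the letter n.
        String escapedBackslash = "a\\\\nb";
        System.out.println(naiveCheck(escapedBackslash));
        System.out.println(parityCheck(escapedBackslash));
    }
}
```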

@Bananeweizen
Contributor Author

Bananeweizen commented Dec 2, 2024

@KalleOlaviNiemitalo I'm not sure, but I think that switching to multi-line parsing because of a false-positive newline match in the regex has no negative effect other than possibly performance or memory use. As far as I can see, it only leads to "whole document" parsing instead of line-based parsing. See https://github.com/jenkinsci/warnings-ng-plugin/blob/fc29fc0ce6f1ec3ab85e2a6d4b10d59c723460a7/plugin/src/main/java/io/jenkins/plugins/analysis/warnings/groovy/GroovyParser.java#L203-208
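
A rough sketch of the difference between the two modes, using plain java.util.regex rather than the plugin's parser classes. The class, the method names, and the sample log are made up for illustration. Patterns that rely on anchors such as ^ or $ may behave differently between the two modes and are not covered here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of warnings-ng-plugin.
final class ParsingModeSketch {
    /** Applies the pattern to each line separately and counts all matches. */
    static int countLineByLine(Pattern pattern, String text) {
        int count = 0;
        for (String line : text.split("\\R")) {
            Matcher matcher = pattern.matcher(line);
            while (matcher.find()) {
                count++;
            }
        }
        return count;
    }

    /** Applies the pattern once to the whole text and counts all matches. */
    static int countWholeDocument(Pattern pattern, String text) {
        int count = 0;
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String log = "warning:\n  unused variable\nok\n";
        // Needs a line break between "warning:" and the message.
        Pattern multiLine = Pattern.compile("warning:\\R\\s+(.*)");
        // Never needs to cross a line boundary.
        Pattern singleLine = Pattern.compile("unused (\\w+)");

        System.out.println(countLineByLine(multiLine, log));
        System.out.println(countWholeDocument(multiLine, log));
        System.out.println(countLineByLine(singleLine, log));
        System.out.println(countWholeDocument(singleLine, log));
    }
}
```

In this sketch the multi-line pattern only matches in whole-document mode, while the single-line pattern finds the same match in both modes. That is consistent with the point above: treating a single-line regex as multi-line changes how much text is scanned at once, not what it finds.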

@uhafner added the "enhancement" label (Enhancement of existing functionality) on Dec 6, 2024
Member

@uhafner uhafner left a comment

Can you please add the new character to the following test?

    /**
     * Tries to expose JENKINS-35262: multi-line regular expression parser.
     *
     * @see <a href="https://issues.jenkins-ci.org/browse/JENKINS-35262">Issue 35262</a>
     */
    @Test
    void issue35262() throws IOException {
        matchMultiLine("(make(?:(?!make)[\\s\\S])*?make-error:.*(?:\\n|\\r\\n?))");
        matchMultiLine("(make(?:(?!make)[\\s\\S])*?make-error:.*(?:\\r?))");
    }
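
As a standalone sanity check, the first regex from issue35262 can be tried with \R in place of the explicit \n and \r alternation. This sketch uses plain java.util.regex and does not call the project's matchMultiLine helper, whose behavior is not shown in this thread. The class name, the regex variant, and the sample log are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of warnings-ng-plugin.
final class Issue35262LinebreakSketch {
    // Assumed variant of the issue35262 regex: (?:\n|\r\n?) replaced by \R.
    static final String REGEX_WITH_R =
            "(make(?:(?!make)[\\s\\S])*?make-error:.*(?:\\R))";

    /** Returns the first captured block, or null if the regex does not match. */
    static String firstMatch(String log) {
        Matcher matcher = Pattern.compile(REGEX_WITH_R).matcher(log);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up log excerpt with Unix and Windows line endings.
        String unixLog = "make: entering directory\ncompiling\nmake-error: failed\n";
        String windowsLog = unixLog.replace("\n", "\r\n");

        System.out.println(firstMatch(unixLog) != null);
        System.out.println(firstMatch(windowsLog) != null);
    }
}
```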

@uhafner
Member

uhafner commented Dec 6, 2024

> If the regex contains \\n (or some other even number of consecutive backslashes) then this will incorrectly assume that the regex can match a newline. And now this PR expands the flaw to \\R too. Will the incorrect heuristic cause any harmful effect?

This has always been the case, and nobody has reported a problem so far.
