Multiline issue regarding index() function when matching ^ and/or $ #10

Open
jurvei opened this issue Oct 28, 2021 · 1 comment

jurvei commented Oct 28, 2021

When using '-re' to match a regex, the position of the match is determined with index(), which gives wrong results for patterns anchored with ^ or $.

Example (simplified, but it represents the code at Example.pm:759-785):

my $string = "test\r\ntest-end-test\r\nend";
my $pattern = qr/^end$/m;
if (my @matchlist = ($string =~ m/($pattern)/)) {
    # The match succeeds: the "end" at the end of the string is matched and stored in @matchlist.

    # index() is then used to determine the position, but it returns the position of the
    # first "end" (inside "test-end-test"), NOT the matched "end" at the end of the string.
    my $start = index $string, $matchlist[0];

    # As a result, $before and $after are wrong.
    my $before = substr $string, 0, $start;
    my $after = substr $string, $start + length($matchlist[0]);

    print "Before: $before\n";
    print "After: $after\n";
}

String:

test
test-end-test
end

Output:

Before: test
test-
After: -test
end

Expected output:

Before: test
test-end-test

After:

The code used in v1.25 did not use index() and seems to work as expected:

@matchlist = ( ${*$exp}{exp_Accum} =~ m/$pattern->[2]()/m );
( $match, $before, $after ) = ( $&, $`, $' );
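
One possible way to avoid the index() lookup while keeping the simplified example's structure is to read the match offsets from Perl's built-in `@-` and `@+` arrays immediately after the match. This is only a sketch under that assumption, not the actual patch; `split_at_match` is a hypothetical helper name and is not part of Expect.pm.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into (before, match, after) using the
# offsets of the regex match itself instead of searching for the matched text.
sub split_at_match {
    my ($string, $pattern) = @_;
    return unless $string =~ m/($pattern)/;

    # $-[0] and $+[0] are the start and end offsets of the whole match.
    # They must be read right after the successful match, in the same scope.
    my ($start, $end) = ($-[0], $+[0]);

    my $before = substr $string, 0, $start;
    my $match  = substr $string, $start, $end - $start;
    my $after  = substr $string, $end;
    return ($before, $match, $after);
}

my $string  = "test\r\ntest-end-test\r\nend";
my $pattern = qr/^end$/m;

my ($before, $match, $after) = split_at_match($string, $pattern);
printf "Before length: %d, match: %s, after length: %d\n",
    length $before, $match, length $after;
```

For the string above, the regex matches at offset 21, whereas index() returns 11 (the "end" inside "test-end-test"), which is why $before and $after come out wrong in the index()-based version.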
jurvei added a commit to eramon-gmbh/expect.pm that referenced this issue Jan 5, 2022
- fixes before/after matches described in jacoby#10
@Karl-Hungus-Autobahn

I am also affected by this problem (see details below). The patched version of Expect.pm referenced above in this thread solves it for me. Is there a reason why the patch has not been considered so far? Merging it would be greatly appreciated.

Relevant debug output, notice the unwanted residue in "After match string":

spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroot@router'
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.


spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroot@router> '
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? YES!!
    Before match string: `\r\n\r\n{master:0}'
    Match string: `\r\nroot@router> '
    After match string: `\r\n\r\n{master:0}\r\nroot@router> '
    Matchlist: ()
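
The residue in the debug output is consistent with the index() behaviour described in the issue: the matched text `\r\nroot@router> ` occurs twice in the buffer, and a text search finds the first occurrence rather than the one the regex matched. The sketch below reconstructs the buffer and pattern from the log, assuming the doubled backslashes in the log are only display escaping; `match_offsets` is a hypothetical helper, not part of Expect.pm.

```perl
use strict;
use warnings;

# Buffer reconstructed from the second "spawn id(3)" line of the debug output.
my $buffer = "\r\n\r\n{master:0}\r\nroot\@router> "
           . "\r\n\r\n{master:0}\r\nroot\@router> ";

# Pattern reconstructed from the log (assumption: no extra modifiers).
my $prompt_re = qr/[\r\n]+[^\r\n<]+[#>%] ?$/;

# Hypothetical helper: return where index() finds the matched text and where
# the regex actually matched, plus the length of the matched text.
sub match_offsets {
    my ($string, $re) = @_;
    return unless $string =~ $re;
    my ($start, $end) = ($-[0], $+[0]);
    my $matched = substr $string, $start, $end - $start;
    return (index($string, $matched), $start, length $matched);
}

my ($by_index, $by_regex, $len) = match_offsets($buffer, $prompt_re);
printf "index() offset: %d, regex offset: %d\n", $by_index, $by_regex;
```

With this buffer, index() reports offset 14 while the regex matched at offset 43, so everything after the first prompt is left over in the "After match string".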

See also https://stackoverflow.com/questions/79247491/solved-expect-module-end-of-string-is-correctly-matched-but-characters-remain
