You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm using your package as part of research and want to collaborate on adding anchor assertion support. From a group of ~835k regexes, approximately 53% were supported for conversion without anchor support and approximately 93% were supported for conversion when excluding those with anchors. Any thoughts on how I would go about implementing anchor (^$) support?
Cheers and great work on the package.
The text was updated successfully, but these errors were encountered:
albert-sun
changed the title
Anchor assertion support
Anchor (^$) assertion support
Oct 23, 2020
I'm currently looking into implementing support for lookarounds via RE AST transformations (#9). When finished, refa will have full support for all JS assertions (^$\b\B) as well as limited support for general lookarounds.
The main problem is that there is no NFA that can express the regex /^foo$/, so I'll use a little trick. The idea is that I reserve one special character to represent the "outside" of the string - let's call this character %. With this trick, I can rewrite the regex: /^foo$/ -> /(?<=%)foo(?=%)/.
Since we might end up with lookarounds that go beyond the consumed characters of the string, the transformation will have two option: Either it will make those lookarounds a part of the pattern (/(?<=%)foo(?=%)/ -> /%foo%/) or it will remove them (/(?<=%)foo(?=%)/ -> /foo/).
Since these two transformation steps are independent of each other, you will also be able to just remove all assertions that go beyond. That alone might be enough for your use case.
Also, may I ask what regex set you use?
I currently test refa using Prism's 2.5k regexes but it would be nice to have more to test with.
I'm using your package as part of research and want to collaborate on adding anchor assertion support. From a group of ~835k regexes, approximately 53% were supported for conversion without anchor support and approximately 93% were supported for conversion when excluding those with anchors. Any thoughts on how I would go about implementing anchor (^$) support?
Cheers and great work on the package.
The text was updated successfully, but these errors were encountered: