Description
In 2.1.14 we updated some data and this uncovered some issues with joni and JRuby interactions involving warnings. The main visible issue is some regexps are generating the warning:
character class has duplicated range
This warning sometimes comes from internal expansions (like \X). If an expansion duplicates a range internally, we definitely do not want end users to be warned. We already fixed one case where we were making a regexp UTF-8 when it shouldn't have been, but we are still seeing some other missed cases.
Joni's design compounds this issue because some constructor paths use the DEFAULT WarnCallback, which is literally a System.err.println() call. This means no change made only in JRuby can guarantee DEFAULT is never used, since not all joni Regex code comes from JRuby core. We also have native extension authors who might be calling a constructor that uses DEFAULT.
That probably was not a super clear description but the solution should be reasonably easy to follow:
- uncomment warn for 'character class has duplicated range' (in ScanEnvironment)
- add the ability to register a default WarnCallback handler
- (on jruby side) use this new register API
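To make the registration idea concrete, here is a minimal sketch of what a registerable default could look like. Everything in it is an assumption for illustration: the `WarnCallback` interface is a stand-in rather than joni's actual type, and `DefaultWarnCallback`, `register`, and `get` are hypothetical names, not an existing joni API.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for joni's WarnCallback; not the real org.joni interface.
interface WarnCallback {
    void warn(String message);
}

// Hypothetical holder for the process-wide default callback.
final class DefaultWarnCallback {
    // Mirrors the behavior described in this issue: print straight to stderr.
    static final WarnCallback STDERR = message -> System.err.println(message);

    private static final AtomicReference<WarnCallback> CURRENT =
            new AtomicReference<>(STDERR);

    private DefaultWarnCallback() {}

    // Swap in a new default; the previous one is returned so it can be restored.
    static WarnCallback register(WarnCallback callback) {
        Objects.requireNonNull(callback, "callback");
        return CURRENT.getAndSet(callback);
    }

    // Constructor paths that use DEFAULT today would look the callback up here.
    static WarnCallback get() {
        return CURRENT.get();
    }
}
```

With something like this, JRuby could call `DefaultWarnCallback.register(...)` once at startup and route warnings through its own warning machinery, and code outside JRuby core that relies on the default would pick up the same handler.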
Additional things to do:
- (on jruby side) audit all regex constructors and figure out where our remaining duplicated class warnings are coming from
- augment the joni warning to include the actual regexp that generates it (MRI prints the failing regexp). A warn(message, regexp) API would be great for debugging issues like this, so we should change joni to work that way.
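One possible shape for the warn(message, regexp) idea is sketched below. It is only an illustration under assumptions: `SourceAwareWarnCallback` is a hypothetical name, the `message: /regexp/` output format is made up for the example, and the default method is just one way existing single-argument implementations could keep working.

```java
// Hypothetical callback shape; not joni's current interface.
interface SourceAwareWarnCallback {
    // Existing single-argument form, kept so current implementations still compile.
    void warn(String message);

    // Proposed two-argument form: appends the offending regexp source to the message.
    // The "message: /regexp/" layout is an assumption made for this sketch.
    default void warn(String message, String regexp) {
        warn(message + ": /" + regexp + "/");
    }
}
```

Because the interface still has a single abstract method, a lambda or method reference such as `System.err::println` remains a valid implementation, and callers that know the regexp source can pass it along without breaking anyone else.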