lf-
Contributor

@lf- lf- commented Jul 5, 2025

This is a basic implementation of the same string pattern system as in
the revset language. It's currently only used for string.matches, so
you can now do:

"foo".matches(regex:'[a-f]o+')

In the future this could be added to more string functions (and e.g.
the ability to parse things out of strings could be added).

CC: #6893
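
For readers unfamiliar with the revset string pattern syntax, the `kind:value` idea can be sketched roughly as follows. This is a Python illustration only, not jj's Rust implementation: the `compile_pattern` helper, the set of kinds it handles, and the substring default for bare strings are all assumptions, and regex matching is assumed to be unanchored.

```python
import fnmatch
import re


def compile_pattern(spec):
    """Turn a 'kind:value' string into a predicate over strings.

    Hypothetical helper for illustration; jj's real parser and its
    defaults may differ.
    """
    kind, sep, value = spec.partition(":")
    if not sep:
        # Assumption: a bare string is treated as a substring pattern.
        kind, value = "substring", spec
    if kind == "exact":
        return lambda s: s == value
    if kind == "substring":
        return lambda s: value in s
    if kind == "glob":
        return lambda s: fnmatch.fnmatchcase(s, value)
    if kind == "regex":
        compiled = re.compile(value)
        # Unanchored search: the regex may match anywhere in the string.
        return lambda s: compiled.search(s) is not None
    raise ValueError(f"unknown pattern kind: {kind}")


matches = compile_pattern("regex:[a-f]o+")
print(matches("foo"))  # True
print(matches("bar"))  # False
```

Returning a predicate keeps the pattern kind out of the call site, which is roughly what lets one `matches` function accept any pattern kind.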

Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added/updated tests to cover my changes

@lf- lf- requested a review from a team as a code owner July 5, 2025 02:02
@lf- lf- force-pushed the jade/push-nlllwnwltntw branch 4 times, most recently from 8ffafc9 to cc7eb2f Compare July 5, 2025 02:20
@lf-
Contributor Author

lf- commented Jul 5, 2025

cc @yuja

@lf- lf- force-pushed the jade/push-nlllwnwltntw branch 2 times, most recently from 3ebcc42 to 4cae57d Compare July 5, 2025 04:36
@PhilipMetzger
Contributor

just a reminder to adhere to the commit guidelines here: https://github.com/jj-vcs/jj/blob/main/docs/contributing.md#commit-guidelines since feat: isn't really a topic. My suggestion would be templater:.

@lf- lf- force-pushed the jade/push-nlllwnwltntw branch 3 times, most recently from 3ea5b1a to 7c68277 Compare July 5, 2025 23:55
@lf-
Contributor Author

lf- commented Jul 6, 2025

The full completion of this feature to be able to pull out the matching text in question probably involves introducing a new type that represents a match result (and match groups in regex), but doing so requires some rework and semantics design of the string pattern system itself.

I think that would probably be .match. Which probably means this boolean one should be called is_match actually as matches might mean "what is the list of matches in this string". Either that or this one should be called match and then we extend the output type to be a more substantive type later and just assume that everyone's using it as truthiness now. Thoughts?

@lf- lf- changed the title from "feat: support string patterns in template language" to "templates: support string patterns in template language" Jul 6, 2025
@yuja
Contributor

yuja commented Jul 6, 2025

> The full completion of this feature to be able to pull out the matching text in question probably involves introducing a new type that represents a match result (and match groups in regex), but doing so requires some rework and semantics design of the string pattern system itself.

For this use case, I think we can add .match/es and .replace. The user might have to write the same regex twice, but that seems okay?

# .match(pat) -> String and/or .matches(pat) -> List<String>
# .replace_all(pat, subst) -> String
.match(regex:'#\d+\b').replace_all(regex:'^#(.*)', 'issue-$1')

(match/matches might be a bad name, but I'm not sure.)

Implementation-wise, we'll need StringPattern::to_regex() or something, and use string-based substitution provided by Regex. We could add lambda syntax for subst expression, but that would be more complex, and I don't know if we'll need expressivity provided by arbitrary templates.

https://docs.rs/regex/latest/regex/struct.Regex.html#method.replace
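
The two-step idea above (first pull out the matching text, then rewrite it with a substitution string) can be tried out with Python's `re` module. This is only an analogy: the proposed `.match` and `.replace_all` template methods are not defined here, the sample description is made up, and Python writes group references as `\1` where the Rust `regex` crate uses `$1`.

```python
import re

# Made-up commit description for illustration.
description = "Fix crash in parser (#6893)"

# Step 1: extract the first issue reference, like the proposed .match(pat).
found = re.search(r"#\d+\b", description)
issue_ref = found.group(0) if found else ""

# Step 2: rewrite it, like the proposed .replace_all(pat, subst).
# The pattern has a single capture group, referenced as \1.
slug = re.sub(r"^#(.*)", r"issue-\1", issue_ref)

print(issue_ref)  # #6893
print(slug)       # issue-6893
```

Note that the regex is effectively written twice, once to find the text and once to reshape it, which is the trade-off mentioned above.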

@lf-
Contributor Author

lf- commented Jul 6, 2025

The issue I have is that it seems nontrivial to make a to_regex translator for globs, and that IME it is most ideal to have the ability to match groups with a regex. But that's probably going to require creating a language type to be able to extract groups and matches; a regex feature that only does full matches is quite a bit harder to use.

So I'm partially trying to figure out how to incrementally deliver something useful without simply inventing good regex support in my first code PR to the project heh. I think the path forward on that is calling the function introduced in this PR is_match.

@yuja
Contributor

yuja commented Jul 6, 2025

> The issue I have is that it seems nontrivial to make a to_regex translator for globs,

I think we'll need to switch to the globset crate at some point. The current StringPattern API doesn't have a separate .to_matcher() step, so adapting to GlobSet might not be straightforward. Still I think we can leverage globset to implement .to_regex().

https://docs.rs/globset/0.4.16/globset/struct.Glob.html
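
As a rough illustration of what a glob-to-regex translation involves, Python's standard library exposes one via `fnmatch.translate`. This is not a description of how `globset` works, and the accepted syntax differs (for instance, `fnmatch` has no `{a,b}` alternation); the `glob_to_regex` wrapper name is made up for this sketch.

```python
import fnmatch
import re


def glob_to_regex(glob):
    """Compile a shell-style glob into a regex that matches whole strings."""
    return re.compile(fnmatch.translate(glob))


pattern = glob_to_regex("feat*")
print(pattern.match("feature-x") is not None)   # True
print(pattern.match("my-feature") is not None)  # False
```

Once a glob is available as a regex, the same match and substitution machinery can in principle serve both pattern kinds.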

@lf- lf- mentioned this pull request Jul 7, 2025
@lf- lf- force-pushed the jade/push-nlllwnwltntw branch 3 times, most recently from 52da308 to a6dbfd7 Compare July 18, 2025 00:10
@lf-
Contributor Author

lf- commented Jul 18, 2025

I've rebased and rewritten this based on #6967. It should be ready to go from my end minus one code question I couldn't figure out and need help with :)

@lf- lf- force-pushed the jade/push-nlllwnwltntw branch from a6dbfd7 to c5b798d Compare July 20, 2025 23:22
@lf- lf- force-pushed the jade/push-nlllwnwltntw branch 3 times, most recently from 7775609 to d971626 Compare July 22, 2025 03:55
@lf-
Contributor Author

lf- commented Jul 22, 2025

Should be sufficiently rebased and feedback addressed.

Contributor

@PhilipMetzger PhilipMetzger left a comment

last minor thing from me.

@lf-
Contributor Author

lf- commented Aug 29, 2025

rebased, I'm going to fix up the feedback now

edit: done!

@lf- lf- force-pushed the jade/push-nlllwnwltntw branch 3 times, most recently from 7a7bea5 to 1a1e6ed Compare August 29, 2025 00:32
Contributor

@yuja yuja left a comment

LG, thanks.

Contributor

@PhilipMetzger PhilipMetzger left a comment

LG

@lf- lf- force-pushed the jade/push-nlllwnwltntw branch from 1a1e6ed to c417a3c Compare August 29, 2025 18:05
@lf- lf- force-pushed the jade/push-nlllwnwltntw branch from c417a3c to 583edf9 Compare August 29, 2025 18:07
@lf-
Contributor Author

lf- commented Aug 29, 2025

okay, i think that's good to go now

lf- added 2 commits September 1, 2025 14:54
This is a basic implementation of the same string pattern system as in
the revset language. It's currently only used for `string.matches`, so
you can now do:

```
"foo".matches(regex:'[a-f]o+')
```

In the future this could be added to more string functions (and e.g.
the ability to parse things out of strings could be added).

CC: jj-vcs#6893

This allows for any matcher type and allows extracting a capture group
by number.
@lf- lf- force-pushed the jade/push-nlllwnwltntw branch from 583edf9 to a4f1704 Compare September 1, 2025 21:55
@lf-
Contributor Author

lf- commented Sep 1, 2025

Feedback should be fully addressed, PTAL.

@lf- lf- added this pull request to the merge queue Sep 2, 2025
Merged via the queue into jj-vcs:main with commit 735a27d Sep 2, 2025
29 checks passed
@lf- lf- deleted the jade/push-nlllwnwltntw branch September 2, 2025 02:36