Feedback for replace_regex tool #145
Replies: 3 comments 5 replies
-
Thanks for the feedback! I'd rather address the verification problems with automated linting of the new code; the language servers should be able to do that. Only if the linting fails should some info on the generated code, possibly the diff, be returned. What do you think? See also here
-
Could you pls provide the examples where it generated wrong syntax?
-
Yeah, I get it. It's probably because I talk so much about NOT adding indentation for the symbolic editing tools in the same prompt that the LLM got confused. I pushed an update of the prompt about an hour ago, specifically instructing the model to use indentation in the regex tool. Could you pls try again with the same example? The model is smart enough not to screw up indentation; if problems still persist, we should be able to solve them with prompting.
-
TLDR: It's amazing! But the new prompt that tells the LLM to trust the tool and not verify can cause problems.
Excellent work on the new tool. I now understand why you went down the regex path and how it's way better than my line-based, token-efficient editing tool.
It's fast too. It sped through my tests at incredible speed thanks to the efficiency of regex.
The only issue I found is that it will accept anything as input for the replacement lines, even bad code. And the new instructions tell the LLM not to verify, but to trust the tool's "ok" output. Once again, I propose my git diff verification strategy as a lightweight, low-token way for the LLM to verify that no syntax or indentation errors were inadvertently introduced.
I had Claude write up a memory for my testing session. Here it is:
ReplaceRegexTool Testing Results and Validation Strategy
Overview
The new replace_regex tool represents a major advancement in token-efficient code editing, providing surgical editing capabilities that can achieve 80-90% token savings compared to traditional symbol-based approaches.
Tool Architecture
ReplaceRegexTool (src/serena/agent.py:1404)
EditedFileContext - robust context manager for atomic file operations
Test Results Summary
Performed 5 comprehensive tests following developer guidance:
.*? wildcards
Token Efficiency: Achieved ~90% reduction compared to reading/replacing entire symbols.
Critical Discovery: The Replacement Content Validation Gap
The Problem
While the tool excellently validates pattern matching (no matches, multiple matches, file errors), it performs NO validation of replacement content quality:
Demonstrated Risk
Successfully inserted completely broken code (wrong indentation + syntax errors) that returned "OK" from the tool, creating a false sense of success.
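To make the gap concrete, here is a minimal sketch, assuming the edited file is Python source. The helper names `replace_once` and `python_syntax_error` are hypothetical and are not Serena's actual implementation; they only mimic the behaviour described above: the pattern side is checked (exactly one match), while the replacement text is spliced in unchecked. A cheap `ast.parse` call afterwards catches both kinds of breakage.

```python
import ast
import re
from typing import Optional


def replace_once(source: str, pattern: str, replacement: str) -> str:
    """Hypothetical stand-in for a regex edit tool.

    Validates the pattern side only: exactly one match is required.
    The replacement text itself is never inspected.
    """
    matches = list(re.finditer(pattern, source, flags=re.DOTALL))
    if len(matches) != 1:
        raise ValueError(f"expected exactly 1 match, found {len(matches)}")
    m = matches[0]
    # Splice the replacement in literally (no backreference expansion).
    return source[:m.start()] + replacement + source[m.end():]


def python_syntax_error(source: str) -> Optional[str]:
    """Return a short error description, or None if the source parses."""
    try:
        ast.parse(source)
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return f"line {exc.lineno}: {exc.msg}"
    return None


original = "def area(w, h):\n    return w * h\n"

# Both edits "succeed" from the tool's point of view ...
bad_syntax = replace_once(original, r"return w \* h", "retrun w * h")
bad_indent = replace_once(original, r"    return w \* h", "return w * h")

# ... but a parse check flags them immediately.
for label, text in [("original", original),
                    ("bad_syntax", bad_syntax),
                    ("bad_indent", bad_indent)]:
    print(label, "->", python_syntax_error(text) or "parses cleanly")
```

A parse check like this only covers syntax and indentation for one language; it says nothing about whether the edit is semantically what was intended, which is where a diff or a linter comes in.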
The Token Efficiency Paradox
Universal Mitigation Strategy: The Git Diff Approach
Primary Validation: git diff
Advantages:
Secondary Validation: Syntax Checking
Advantages:
Recommended Validation Workflow
replace_regex operation
git diff filename for visual verification
Best Practices for Regex Patterns
End Pattern Uniqueness Strategies
Next Method Boundary (Most Reliable):
Specific Content End:
Context + Generic Return:
Wildcard Usage
.*? (non-greedy) for spanning large sections
Strategic Guidelines
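As an illustration of the pattern guidance in this section, the sketch below uses plain Python `re`, not the tool itself. The sample class and the patterns are made up for the example. It contrasts a greedy and a non-greedy wildcard, then uses the start of the next method as the end boundary so that exactly one method body is replaced.

```python
import re

source = '''class Greeter:
    def hello(self):
        print("hi")
        return "hi"

    def goodbye(self):
        print("bye")
        return "bye"
'''

# Greedy .* runs to the LAST "return" in the file; non-greedy .*? stops at
# the first one. DOTALL lets "." cross line breaks.
greedy = re.search(r"def hello\(self\):.*return", source, flags=re.DOTALL)
lazy = re.search(r"def hello\(self\):.*?return", source, flags=re.DOTALL)
print("greedy swallows goodbye():", "goodbye" in greedy.group(0))
print("lazy swallows goodbye():", "goodbye" in lazy.group(0))

# Next-method boundary: span the body non-greedily and stop just before the
# next method header. The lookahead leaves that header untouched.
pattern = r"    def hello\(self\):.*?(?=\n\n    def goodbye)"
matches = list(re.finditer(pattern, source, flags=re.DOTALL))
assert len(matches) == 1  # the end anchor must make the match unique

new_method = '    def hello(self):\n        return "hello"'
m = matches[0]
updated = source[:m.start()] + new_method + source[m.end():]
print(updated)
```

Note that the replacement carries its own indentation, since the matched span starts at the beginning of the line.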
Impact Assessment
This tool + validation strategy combination creates a robust, token-efficient editing pipeline that:
The git diff validation approach solves the "replacement content blind spot" and makes the replace_regex tool suitable for production use in token-constrained environments.
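The git diff check could be wrapped in a small helper like the one below. This is a sketch under assumptions: the file lives in a git working tree, the `git` CLI is on the PATH, and the function name `git_diff` and its parameters are hypothetical. It returns only the changed hunks for one file, so the verification cost scales with the size of the edit rather than the size of the file.

```python
import subprocess


def git_diff(path: str, repo_dir: str = ".", context_lines: int = 3) -> str:
    """Return the unstaged diff for a single file ("" if it is unchanged).

    Assumes `git` is installed and `repo_dir` is inside a git working tree.
    """
    result = subprocess.run(
        ["git", "diff", f"--unified={context_lines}", "--", path],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git itself fails
    )
    return result.stdout


if __name__ == "__main__":
    import sys

    # Usage: python git_diff_check.py path/to/edited_file.py
    if len(sys.argv) == 2:
        diff = git_diff(sys.argv[1])
        print(diff if diff else "no changes detected")
```

Lowering `context_lines` trims the output further when only the changed lines themselves need a look.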