Conversation

MischaPanch
Contributor

Claude Code and most other IDE assistants have capable pattern-search tools. Ours has quite a long description that clutters the context, and I see time and time again that Claude fails to write non-greedy regexes, leading to long matches. I think we should disable it by default in this context.

@MischaPanch MischaPanch requested a review from opcode81 August 25, 2025 19:14
@opcode81
Contributor

opcode81 commented Aug 25, 2025

claude fails to write non-greedy regexes, leading to long matches

Could this be a problem of the description?
Are we applying the match in multi-line mode without telling the model that this is how matching works?
Perhaps it assumes that matches will only ever be single-line by default (like grep).

Ours has quite a long description

Also, I don't see why it would need a lengthy description.

@MischaPanch
Contributor Author

We had that discussion in the past and adjusted the description (partly the reason for the huge docstring):

a snippet:

        Pattern Matching Logic:
            For each match, the returned result will contain the full lines where the
            substring pattern is found, as well as optionally some lines before and after it. The pattern will be compiled with
            DOTALL, meaning that the dot will match all characters including newlines.
            This also means that it never makes sense to have .* at the beginning or end of the pattern,
            but it may make sense to have it in the middle for complex patterns.
            If a pattern matches multiple lines, all those lines will be part of the match.
            Be careful to not use greedy quantifiers unnecessarily, it is usually better to use non-greedy quantifiers like .*? to avoid
            matching too much content.
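
A minimal sketch of the behaviour the docstring describes, using only Python's standard `re` module. The sample text and patterns are made up for illustration; they are not taken from Serena's code or tests.

```python
import re

# Two small functions; the pattern is meant to find one "def ... pass" body.
text = "def a():\n    pass\n\ndef b():\n    pass\n"

# With DOTALL, a greedy .* runs across newlines up to the LAST "pass".
greedy = re.search(r"def.*pass", text, re.DOTALL)

# The non-greedy .*? stops at the FIRST "pass".
lazy = re.search(r"def.*?pass", text, re.DOTALL)

# Without DOTALL (grep-like), the dot cannot cross a newline,
# so this pattern cannot match at all in this text.
single_line = re.search(r"def.*pass", text)

print(greedy.group(0).count("\n") + 1)  # lines spanned by the greedy match: 5
print(lazy.group(0).count("\n") + 1)    # lines spanned by the lazy match: 2
print(single_line)                      # None
```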

@opcode81
Contributor

OK, so the LLM is too stupid to apply it correctly. But perhaps we should cater to its stupidity and make it behave more like grep.

@MischaPanch
Contributor Author

Thing is, we really should have a good reason for each tool, and I don't think there is a good reason for this one

@MischaPanch
Contributor Author

In the IDE context, I mean

@opcode81
Contributor

opcode81 commented Aug 25, 2025

IDE context does not mean that there necessarily will be an alternative to this tool, so I don't really agree.
But the tool has unusual defaults and partly incorrect descriptions.
It should be improved rather than disabled.
We can make "dotall"/multi-line an option instead of the default.
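
A rough sketch of what "multi-line as an option" could look like. The function name `search_pattern` and the `multiline` parameter are hypothetical and only illustrate the idea; they are not Serena's actual API.

```python
import re


def search_pattern(text: str, pattern: str, multiline: bool = False) -> list[str]:
    """Hypothetical helper: grep-like by default, DOTALL only on request."""
    if multiline:
        # Opt-in: the dot matches newlines, so one match may span many lines.
        return [m.group(0) for m in re.finditer(pattern, text, re.DOTALL)]
    # Default: test each line on its own, so a match can never span lines.
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]


sample = "a = 1\nb = 2\na = 3\n"
print(search_pattern(sample, r"a.*\d"))                  # ['a = 1', 'a = 3']
print(search_pattern(sample, r"a.*\d", multiline=True))  # ['a = 1\nb = 2\na = 3']
```

With this default, a greedy quantifier can at worst match one full line, which is the behaviour a model used to grep would likely expect.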

    Generally, symbolic operations like find_symbol or find_referencing_symbols
    should be preferred if you know which symbols you are looking for.

This sort of thing should never be in a tool description. It should be in system prompts.

    :param restrict_search_to_code_files: whether to restrict the search to only those files where
        analyzed code symbols can be found. Otherwise, will search all non-ignored files.
        Set this to True if your search is only meant to discover code that can be manipulated with symbolic tools.
        For example, for finding classes or methods from a name pattern.
        Setting to False is a better choice if you also want to search in non-code files, like in html or yaml files,
        which is why it is the default.

This is overly lengthy.
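
For illustration, one possible shortening of that parameter description, shown on a stub. The function name `search_files_sketch` is hypothetical and the body is a placeholder; only the parameter name and its default come from the quoted docstring.

```python
import inspect


def search_files_sketch(pattern: str, restrict_search_to_code_files: bool = False) -> list[str]:
    """Search files for a regex pattern (stub for illustration only).

    :param pattern: the regular expression to search for
    :param restrict_search_to_code_files: if True, search only files that contain
        analyzable code symbols; if False (default), search all non-ignored files
    """
    return []  # placeholder: a real tool would return the matches


print(inspect.getdoc(search_files_sketch))
```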

@MischaPanch
Contributor Author

We should then both improve it and have a dedicated, optimized context for Claude Code (where the tool is not needed), which is what the majority of our users are using.

@opcode81
Contributor

I think it should be a mode, not a context.

@MischaPanch
Contributor Author

I don't know if this was affected by recent refactorings, but ignored files should never be searched and I hope they are not searched now

@MischaPanch
Contributor Author

Why a mode? Modes can be switched dynamically. Claude Code is a prime example of an execution context, is it not?

@opcode81
Contributor

opcode81 commented Aug 25, 2025

I was considering the mode as an option to specialise the context without undue repetition.
But the context spec is not very large, so I suppose we could create one for Claude Code, no problem.

We need to take a closer look at all the prompts. There are so many contradictions, questionable assumptions and imprecise formulations...

For example, the ide-assistant context

  • assumes that certain internal tools are available (although we cannot know this)
  • talks about using the read_file tool even though it is disabled (while instructing the LLM never to use excluded tools)
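
A contradiction like the second one could be caught mechanically. The following is only a sketch under assumptions: the helper `find_disabled_tool_mentions` and the sample prompt are invented for illustration, and the set of disabled tools would have to come from the actual context configuration.

```python
import re


def find_disabled_tool_mentions(prompt: str, disabled_tools: set[str]) -> list[str]:
    """Return the disabled tool names that the prompt text still mentions."""
    mentioned = []
    for tool in sorted(disabled_tools):
        # Word boundaries avoid flagging e.g. "read_file_lines" for "read_file".
        if re.search(rf"\b{re.escape(tool)}\b", prompt):
            mentioned.append(tool)
    return mentioned


# Invented example prompt, not the real ide-assistant context text.
prompt = "Use the read_file tool to inspect files. Never use excluded tools."
print(find_disabled_tool_mentions(prompt, {"read_file", "create_text_file"}))
# ['read_file']
```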

@MischaPanch
Contributor Author

yes, we should go through all prompts before the next release

@rubas
Contributor

rubas commented Aug 31, 2025

I vote for disabling it. It brings no additional value over existing tools and just introduces complexity.

I was just looking into options to improve the situation and maybe limit the output/context, as this tool got abused by Claude and just cluttered the context, and not only because of greedy regexes. But why do the work when there are already existing tools ...
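
As a sketch of what limiting the output could mean in practice: cap both the number of matches and the length of each match before returning them to the model. The function `limit_matches` and its default limits are assumptions made for illustration, not something that exists in Serena.

```python
def limit_matches(matches: list[str], max_matches: int = 20, max_chars: int = 200) -> list[str]:
    """Truncate overly long matches and drop matches beyond a fixed count."""
    limited = []
    for match in matches[:max_matches]:
        if len(match) > max_chars:
            omitted = len(match) - max_chars
            match = f"{match[:max_chars]}... [{omitted} more characters omitted]"
        limited.append(match)
    if len(matches) > max_matches:
        # Tell the caller that results were dropped instead of hiding it.
        limited.append(f"[{len(matches) - max_matches} more matches omitted]")
    return limited


print(limit_matches(["x" * 10, "y"], max_matches=1, max_chars=4))
# ['xxxx... [6 more characters omitted]', '[1 more matches omitted]']
```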

And for instructions, those will always be interpreted quite differently by different models. YOU MUST is a given if you try to teach Claude something and even with this, he decides to ignore you so often. On the other hand GPT5 would never cross you like this.

@MischaPanch
Contributor Author

MischaPanch commented Aug 31, 2025

@rubas
I have been thinking for a while about enhancing our prompt templating to allow customization based on the model used and maybe even the project language. It can already be done by enabling a custom mode, but then the user needs to remember to parametrize it, which should not be necessary. E.g., when specifying the context codex, the user should automatically get prompts tuned to work well with GPT5. For some models, it may even make sense to have language-specific prompts so that they work better with particular programming languages. For rarer languages like Elixir, additional examples may be needed to enhance performance.

There are various ways in which this can be achieved. Conceptually it is quite a big change to Serena, although technically none of it will be hard to implement. Just sharing my thoughts here.

In summary - at startup Serena already knows the project language, the execution context, the active tools, and any specifics through the active modes (we could introduce special modes for models). If we make full use of that info in an extended prompt templating mechanism, we could probably enhance performance in the default configs by quite a bit, and this also nicely scales to a community of contributors.
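
One way such a templating mechanism could resolve a prompt is a most-specific-first lookup with fallbacks. This is only a sketch: the function `resolve_prompt`, the key layout `(context, model, language)`, and the template strings are all hypothetical and not part of Serena.

```python
from typing import Optional

# Keys are (context, model, language); None means "any".
TemplateKey = tuple[Optional[str], Optional[str], Optional[str]]


def resolve_prompt(
    templates: dict[TemplateKey, str],
    context: str,
    model: Optional[str] = None,
    language: Optional[str] = None,
) -> str:
    """Pick the most specific template available, falling back to the default."""
    candidates: list[TemplateKey] = [
        (context, model, language),  # fully specific
        (context, model, None),      # tuned for the model
        (context, None, language),   # tuned for the language
        (context, None, None),       # context default
        (None, None, None),          # global default
    ]
    for key in candidates:
        if key in templates:
            return templates[key]
    raise KeyError("no default template registered")


# Placeholder template texts, invented for the example.
templates: dict[TemplateKey, str] = {
    (None, None, None): "default prompt",
    ("codex", None, None): "codex prompt",
    ("codex", "gpt-5", None): "codex prompt tuned for gpt-5",
    ("codex", None, "elixir"): "codex prompt with elixir examples",
}

print(resolve_prompt(templates, "codex", model="gpt-5", language="elixir"))
# codex prompt tuned for gpt-5
print(resolve_prompt(templates, "claude-code"))
# default prompt
```

Contributors would then only need to add an entry for the combination they care about, and every other configuration keeps its existing prompt.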

@rubas
Contributor

rubas commented Aug 31, 2025

@MischaPanch
Not to hijack this thread, but I'm approaching this from a different direction:

  • My goal is to leverage LSP information and functionality through a slim and efficient layer.

I would prefer to keep anything else out of that layer. I'm not sure (as I haven't tested) whether the edit capability via Serena represents an improvement. As I understand it, this project started with a different mindset. But this is a fast-moving target ;-)

I'm concerned about long-term maintainability if we introduce too much model-specific/language-specific context. The project's goal shouldn't be teaching LLMs about languages—it should provide clear, unambiguous tools that enhance the understanding and add specific functionality. Fewer, well-designed tools are likely more effective...

I'm not here to present solutions, but I've started thinking about different approaches lately.


I find what's happening in the Elixir world around the MCP part of Tidewave very interesting:

The Elixir ecosystem treats LSP as a first-class citizen (🥳 new Expert Elixir LSP got announced yesterday). With Tidewave's MCP implementation, they're experimenting with exposing all information sources - LSP data, documentation, framework-specific tooling, logs, eval, test and more - through one unified interface.

It's not perfect or complete yet, but this unified, framework-specific approach feels very promising to me.

@MischaPanch
Contributor Author

MischaPanch commented Aug 31, 2025

Fewer, well-designed tools are likely more effective...

I agree, and I feel like Serena does provide this. With the configuration one can reduce the tools to a minimum. I was talking about an additional goal - about possibly better default behavior for non-power users. You can reduce Serena to just find_symbol and find_referencing_symbols through the config, and already have a lot of the main benefits. In my mind these are not conflicting goals, as long as the configurability remains in place.

@MischaPanch
Contributor Author

Another aspect is that Serena is not just meant to enhance an agent but also to turn something like Claude Desktop into a capable agent. For that we need more tools than would be needed for, say, just improving Claude Code.

@rubas
Contributor

rubas commented Aug 31, 2025

I see your point.

Still, I do wonder whether you are overshooting...

  • Why would you want to improve the agent behaviour of Claude Desktop? If you want an agent, there are a ton of options out there. Don't use Claude Desktop in its current form ... look at the integrated (but messy) approach from Codex, which now merges the agent in the cloud, via CLI, and in the editor. It's up to Anthropic to improve their tooling, not you.

  • Do you think non-power users will find their way here and stay? And should you care? There is an endless stream of .. stuff .. coming out daily that promises improvements and markets itself as such. Perfect for people who love to play with their toys instead of getting work done ;-)

Anyway, maybe I'll give it a run myself one day.

@MischaPanch
Contributor Author

Claude Desktop doesn't have any agent behavior as it can't access your code by default ;). Do you mean Claude Code?

The initial idea was not to improve Claude Code but simply to write a good toolkit, to make it available through MCP, and to democratize the development of good agents.

But Serena's popularity grew almost entirely because it does improve on Claude Code's internal tools. If Anthropic had better tooling, this project would likely have remained a niche thing with a much smaller community.

3 participants