
linediff number-prefixed diff format #2174

Open
wants to merge 1 commit into base: main

Conversation

@dceluis commented Oct 28, 2024

Hi there,

Would a new diff format be of interest to the project?

I wanted to write a simple format that would let me simplify the line-matching logic. It took me a while to get it to work reliably on gpt-4o-mini (it's what I was able to test with extensively). More advanced models should be able to comply without such a huge system prompt.

Pros:

  • Simple rules that leverage LLMs' recitation capabilities.
  • Client-side: simpler line-matching logic.

Cons:

  • More verbose, although much of that is the big system reminder I had to add so that 4o-mini would follow the specification.

You can also check out https://github.com/dceluis/ln-diff for a deeper look into the decisions behind the format.

Benchmark for context: (benchmark results screenshot)

I understand the implementation is nowhere near mergeable, but I figured I'd show it anyway to make it easier to consider.

Cheers!

@CLAassistant commented Oct 28, 2024

CLA assistant check
All committers have signed the CLA.

@Finndersen

Hi @dceluis,
When thinking about building my own AI coding agent project, I also assumed that a line-number-based editing approach would make sense. However, according to the Aider docs:

> GPT is terrible at working with source code line numbers. This is a general observation about any use of line numbers in editing formats, backed up by many quantitative benchmark experiments.

Has this not been the case in your experimentation?

@dceluis (Author) commented Jan 7, 2025

> Hi @dceluis,
> When thinking about building my own AI coding agent project, I also assumed that a line-number-based editing approach would make sense. However, according to the Aider docs:
>
> > GPT is terrible at working with source code line numbers. This is a general observation about any use of line numbers in editing formats, backed up by many quantitative benchmark experiments.
>
> Has this not been the case in your experimentation?

Yes & no.

By far the biggest problem I found was convincing the LLM to construct diffs that don't assume that patch hunks will be applied sequentially.
(so the second hunk would have to reference the source code lines +/- the number of lines the previous hunk added/deleted)

This is straightforward to implement in software, but LLMs get confused pretty quickly.
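
For illustration, the applier side only needs to track a running offset. A minimal sketch in Python (not the actual linediff implementation):

```python
# Minimal sketch (not the actual linediff/aider code): apply hunks that all
# reference ORIGINAL line numbers, tracking a running offset as edits land.
def apply_hunks(source_lines, hunks):
    """hunks: list of (start, end, new_lines), with 1-based, inclusive
    line numbers into the ORIGINAL file."""
    result = list(source_lines)
    offset = 0
    # Apply in order of original position; the offset absorbs earlier edits.
    for start, end, new_lines in sorted(hunks, key=lambda h: h[0]):
        result[start - 1 + offset : end + offset] = new_lines
        offset += len(new_lines) - (end - start + 1)
    return result
```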

My hypothesis was that a coding LLM would generate more correct patches if instructed to reference the original line numbers no matter what.

It was surprisingly hard to convince 4o-mini to do this, though, since there must be conflicting references in the models' training data (e.g. https://github.com/google/diff-match-patch/wiki/Unidiff#3-rolling-context). It took a huge prompt, but it worked, and the logic for parsing the patches is arguably much simpler.

Stronger models have fewer issues, so you might try those and get better results.

So I think the docs are mostly true but could be updated; that's partly why I made this PR, to contribute my findings. And I still use linediff on a day-to-day basis, although not through aider :)

https://x.com/dceluis/status/1854601543963525576?t=OuuEDHsjo9Bsr10LBft3og&s=19

https://github.com/dceluis/kznllm.nvim/tree/main

@Finndersen

> By far the biggest problem I found was convincing the LLM to construct diffs that don't assume that patch hunks will be applied sequentially.

Do you mean that the LLM would always assume that changes would be applied sequentially, so that later changes would have line numbers that don't match the original file?

> My hypothesis was that a coding LLM would generate more correct patches if instructed to reference the original line numbers no matter what.

Yes, I would've thought so too... but if it seems to want to account for prior changes, what if you just let it do that? Does it do it accurately?

@Finndersen

I wonder if a tool-based approach using line numbers would work well... I read that it's generally less effective because the code content has to be JSON-escaped, but I think that only reduced performance by a small amount. Could be worth trying.
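
For example, something like this hypothetical tool definition (an OpenAI-style function schema; the name and parameters are made up for illustration, not an existing aider feature):

```python
# Hypothetical tool definition (OpenAI-style function schema); the name and
# parameters are illustrative only, not an existing aider feature.
replace_lines_tool = {
    "type": "function",
    "function": {
        "name": "replace_lines",
        "description": "Replace a range of lines in a file with new content.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "start_line": {"type": "integer", "description": "1-based, inclusive"},
                "end_line": {"type": "integer", "description": "1-based, inclusive"},
                "new_content": {
                    "type": "string",
                    "description": "Replacement text; code has to be JSON-escaped here.",
                },
            },
            "required": ["path", "start_line", "end_line", "new_content"],
        },
    },
}
```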

@Finndersen

Also, regarding your diff format:

> The REMOVE section line numbers and contents must match the SOURCE file exactly

Isn't it kind of redundant to require both line numbers AND content to match? Wouldn't it be best to have just one or the other? You could just use a number range and not repeat the original content. Or does the LLM struggle with this?

@dceluis (Author) commented Jan 7, 2025

> > By far the biggest problem I found was convincing the LLM to construct diffs that don't assume that patch hunks will be applied sequentially.

> Do you mean that the LLM would always assume that changes would be applied sequentially, so that later changes would have line numbers that don't match the original file?

Yes, it tends to do that. And this also hurts its ability to produce code, since a wrong line could be interpreted as a source or destination line.

> > My hypothesis was that a coding LLM would generate more correct patches if instructed to reference the original line numbers no matter what.

> Yes, I would've thought so too... but if it seems to want to account for prior changes, what if you just let it do that? Does it do it accurately?

Not very accurately, hence the notice in the Aider docs.

> I wonder if a tool-based approach using line numbers would work well... I read that it's generally less effective because the code content has to be JSON-escaped, but I think that only reduced performance by a small amount. Could be worth trying.

I wouldn't discard it, but my prior is that, since there isn't much source code represented as JSON in the training data, the models would have more difficulty producing quality code.

@dceluis (Author) commented Jan 7, 2025

> Also, regarding your diff format:

> > The REMOVE section line numbers and contents must match the SOURCE file exactly

> Isn't it kind of redundant to require both line numbers AND content to match? Wouldn't it be best to have just one or the other? You could just use a number range and not repeat the original content. Or does the LLM struggle with this?

From what I could see in my tests, having both the line numbers AND the source line helps the models reason; more context is good. Also, there are empty lines in the source, so having both helps clear up many ambiguities.

It's for the same reason that I set the destination lines to NOT have line numbers, so that the models do not get confused.
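
For illustration, the redundancy also gives the applier a cheap sanity check before touching the file. A minimal sketch in Python, assuming the REMOVE side has been parsed into (line number, content) pairs (not the actual linediff implementation):

```python
# Minimal sketch (not the actual linediff code): verify that the REMOVE side's
# (line number, content) pairs match the SOURCE file exactly before applying.
def check_remove_section(source_lines, remove_pairs):
    for line_no, content in remove_pairs:
        actual = source_lines[line_no - 1]
        if actual != content:
            raise ValueError(
                f"line {line_no}: expected {content!r}, file has {actual!r}"
            )
```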

I predict that with better models you would need fewer of these prompting tricks.

@Finndersen commented Jan 7, 2025

But if it is putting in the exact source line content then the numbers aren't even required at all, right? (Assuming the content is unique in the file.) Then it basically just becomes like the existing diff method used by aider.

Have you tried an approach of having the LLM only provide the line numbers to replace? I'm tempted to try it out.

@dceluis (Author) commented Jan 8, 2025

> But if it is putting in the exact source line content then the numbers aren't even required at all, right? (Assuming the content is unique in the file.) Then it basically just becomes like the existing diff method used by aider.

> Have you tried an approach of having the LLM only provide the line numbers to replace? I'm tempted to try it out.

I have, although I wish I had documented more of what did and didn't work. IIRC it performs a bit worse, because printing out the number + code turns source-referencing into a recitation exercise: the simplest rule I could enforce to reduce confusion.

But don't refrain from running the benchmark yourself, with any tweaks you think will improve results. I'm most interested in learning what you find!
