Use zero-width delimiters for role tracking in gptel-mode #565

lispy-ai · 2025-01-13T11:16:12Z

Title: Use zero-width delimiters for role tracking with overlay-based highlighting in gptel-mode buffers

This is a proposal for a new approach to tracking and visually distinguishing assistant/user roles in gptel buffers that addresses several long-standing issues (#321, #343) while maintaining compatibility with the existing system.

The Problem

Currently gptel uses text properties to track which sections of text are assistant responses. This approach has proven problematic because:

Text properties don't interact naturally with standard Emacs editing operations
Property stickiness creates ambiguous cases during editing
Yanked text carries properties that can cause confusion
Visual feedback about roles is difficult to implement reliably

Proposed Solution

Use zero-width Unicode characters as role delimiters with overlay-based highlighting, but only when gptel-mode is active:

U+200B (zero-width space) marks response start
U+200C (zero-width non-joiner) marks response end
Overlays provide visual distinction for responses

Key aspects:

Delimiters are invisible and don't affect buffer display
Standard editing operations work naturally
Cut/paste preserves role boundaries correctly
Overlays provide clean visual feedback
Works with all major modes
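
To make the delimiter scheme concrete, here is a minimal sketch of how response regions could be located by scanning for the two characters. The `demo-` names are hypothetical and not part of gptel; only the two code points come from the proposal above.

```elisp
;; Sketch only: every `demo-' name is hypothetical, not gptel API.

(defconst demo-response-start ?\u200B
  "Zero-width space, the proposed marker for the start of a response.")

(defconst demo-response-end ?\u200C
  "Zero-width non-joiner, the proposed marker for the end of a response.")

(defun demo-response-regions ()
  "Return a list of (BEG . END) pairs for delimited responses.
BEG is the position just after a start delimiter and END is the
position of the matching end delimiter, so the pair bounds the
response text itself.  A start delimiter with no end is ignored."
  (save-excursion
    (goto-char (point-min))
    (let (regions)
      (while (search-forward (string demo-response-start) nil t)
        (let ((beg (point)))
          (when (search-forward (string demo-response-end) nil t)
            (push (cons beg (1- (point))) regions))))
      (nreverse regions))))
```

Everything outside the returned regions would be treated as prompt text when building the conversation history.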

Implementation

The solution uses two phases:

When gptel-mode is enabled:
- Convert existing gptel text properties to delimiter pairs
- Remove gptel properties
- Enable delimiter-based role tracking
- Create overlays for responses
When gptel-mode is disabled:
- Convert delimiters back to gptel properties
- Remove delimiters
- Remove overlays
- Restore property-based tracking
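
The two conversion phases above could look roughly like the following. This is a sketch under the assumption that responses carry the text property `gptel` with value `response`; the `demo-` function names are hypothetical.

```elisp
;; Sketch only: assumes responses are marked with the text property
;; `gptel' = `response'.  All `demo-' names are hypothetical.

(defun demo-property-regions ()
  "Return (BEG . END) pairs, last region first, where `gptel' is `response'."
  (let ((pos (point-min)) regions)
    (while (setq pos (text-property-any pos (point-max) 'gptel 'response))
      (let ((end (or (next-single-property-change pos 'gptel) (point-max))))
        (push (cons pos end) regions)
        (setq pos end)))
    regions))

(defun demo-props-to-delimiters ()
  "On enabling the mode: wrap each response in delimiters, drop the property."
  (save-excursion
    ;; Regions arrive last-first, so insertions never shift the
    ;; positions of regions that are still to be processed.
    (dolist (region (demo-property-regions))
      (remove-text-properties (car region) (cdr region) '(gptel nil))
      (goto-char (cdr region))
      (insert ?\u200C)
      (goto-char (car region))
      (insert ?\u200B))))

(defun demo-delimiters-to-props ()
  "On disabling the mode: remove delimiter pairs, restore the property."
  (save-excursion
    (goto-char (point-min))
    (while (search-forward "\u200B" nil t)
      (delete-char -1)
      (let ((beg (point)))
        (when (search-forward "\u200C" nil t)
          (delete-char -1)
          (put-text-property beg (point) 'gptel 'response))))))
```

A round trip through both functions should leave the buffer text and the property ranges unchanged, which is the compatibility guarantee the proposal relies on.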

Response Highlighting

Use overlays for visual distinction:

Clean visual distinction
No interference with text properties
Preserves other modes' fontification
Easy to customize appearance
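
As an illustration, overlays could be rebuilt from the delimiters along these lines. The face, the overlay tag, and the function name are all hypothetical choices for this sketch.

```elisp
;; Sketch only: the face, the `demo-response' overlay tag, and the
;; function name are hypothetical.

(defface demo-response-face
  '((t :inherit highlight))
  "Hypothetical face used to highlight responses.")

(defun demo-refresh-overlays ()
  "Recreate one overlay per delimited response in the current buffer."
  ;; Remove only overlays created by this function, leaving others alone.
  (remove-overlays (point-min) (point-max) 'demo-response t)
  (save-excursion
    (goto-char (point-min))
    (while (search-forward "\u200B" nil t)
      (let ((beg (point)))
        (when (search-forward "\u200C" nil t)
          (let ((ov (make-overlay beg (1- (point)))))
            (overlay-put ov 'demo-response t)
            (overlay-put ov 'face 'demo-response-face)
            ;; Delete the overlay automatically if its text is removed.
            (overlay-put ov 'evaporate t)))))))
```

Because the face lives on an overlay rather than in text properties, it sits on top of whatever fontification the major mode applies, and customizing `demo-response-face` changes the appearance everywhere.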

Benefits

Reliable editing operations:
- Cut/paste works naturally
- Undo/redo maintains role boundaries
- No property stickiness issues
Better user experience:
- Clear visual distinction of responses
- Predictable editing behaviour
- Compatible with standard Emacs commands
- Non-intrusive highlighting
Technical improvements:
- Simple to parse conversation history
- Clean visual feedback via overlays
- Works with all major modes
- Separation of tracking and display

Testing

To test this change:

Enable gptel-mode in a buffer with existing responses
Verify properties convert to delimiters correctly
Test editing operations (especially cut/paste)
Verify overlay highlighting
Disable mode and verify cleanup
Check property restoration

Notes

Only affects buffers with gptel-mode active
Zero-width characters don't affect buffer display or export
Maintains compatibility with existing gptel features
Solves long-standing editing issues
Provides clean visual distinction via overlays

Caveat

There is an obvious caveat here: enabling the mode mutates the buffer. The characters I have chosen are highly unlikely to appear in regular text, but that is not guaranteed. Instead of predefining two characters, one option is to make them configurable via buffer-local variables or customisation. Another is to select them automatically on entering gptel-mode, by scanning the buffer and picking from a set of candidate characters that do not already appear in it.
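
A rough sketch of the automatic selection idea follows. The candidate pool and the `demo-` names are assumptions made for illustration; the only requirement is that both chosen characters are absent from the buffer.

```elisp
;; Sketch only: the candidate pool and all names are hypothetical.

(defvar demo-delimiter-candidates
  '(?\u200B ?\u200C ?\u200D ?\u2060 ?\uFEFF)
  "Pool of zero-width characters to choose delimiters from.")

(defun demo-pick-delimiters ()
  "Return (START . END), the first two candidates absent from the buffer.
Return nil if fewer than two candidates are unused."
  (let (free)
    (dolist (ch demo-delimiter-candidates)
      (unless (save-excursion
                (goto-char (point-min))
                (search-forward (string ch) nil t))
        (push ch free)))
    (setq free (nreverse free))
    (when (cdr free)
      (cons (car free) (cadr free)))))
```

The result would need to be stored buffer-locally (for example with `setq-local`) so that later scans use the same pair that was chosen on entry.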

Related issues: #321, #343

axelknock · 2025-01-16T22:33:51Z

I like this as a solution that would also make it very simple to edit responses, which is quite a powerful method for guiding output.

I can foresee situations where an odd number of separators exist in the buffer, which would cause gptel-send to fail. In that case a function gptel-mark-response could simply wrap a selected region with the separators, deleting any that are inside the active region. gptel-show-separators/gptel-hide-separators could also replace the separators with something visible for inspection. The latter would most usefully replace the separators with some indicator of message count, probably xml-like (<message_1> </message_1>).
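
For illustration, the balance check and the marking command described above might be sketched as follows. The `demo-` names are hypothetical stand-ins, and the delimiters are the two code points from the original proposal.

```elisp
;; Sketch only: hypothetical stand-ins for a validity check and for
;; the suggested gptel-mark-response command.

(defun demo-delimiters-balanced-p ()
  "Return non-nil if start and end delimiters strictly alternate."
  (save-excursion
    (goto-char (point-min))
    (let ((open nil) (ok t))
      (while (and ok (re-search-forward "[\u200B\u200C]" nil t))
        (if (eq (char-before) ?\u200B)
            ;; A start delimiter is only valid outside a response.
            (if open (setq ok nil) (setq open t))
          ;; An end delimiter is only valid inside a response.
          (if open (setq open nil) (setq ok nil))))
      (and ok (not open)))))

(defun demo-mark-response (beg end)
  "Wrap the region BEG..END in delimiters, removing any inside it."
  (interactive "r")
  (save-excursion
    ;; Use a marker so END stays correct while delimiters are deleted.
    (let ((end (copy-marker end)))
      (goto-char beg)
      (while (re-search-forward "[\u200B\u200C]" end t)
        (replace-match ""))
      (goto-char end)
      (insert ?\u200C)
      (goto-char beg)
      (insert ?\u200B)
      (set-marker end nil))))
```

Marking a region that only partly overlaps an existing response can still leave a stray delimiter outside the region, so the check would need to run again after marking.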

I also feel this violates the central ethos of gptel that prevented karthik from using response indicators in the first place. You would end up with documents containing invisible characters if you copy-and-paste responses. But I do think it addresses the main issues in #546 without introducing more unacceptable problems. Backwards compatibility could be maintained by automatically dropping the separators in buffers where the previous method was used.

lispy-ai · 2025-01-17T00:32:09Z

I think the zero-width delimiter approach effectively addresses these concerns while maintaining gptel's simplicity:

Invisible but robust role tracking:
- Zero-width delimiters mark response boundaries (carefully chosen to avoid text conflicts)
- Overlays provide clear visual feedback of boundaries
- Delimiters aren't saved to disk, preserving clean file format
- Existing GPTEL_BOUNDS continue working normally
Optional safe editing operations in gptel-mode buffers:
- Add advice to Emacs editing primitives to handle delimiters:
```
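;; Proposed only: neither advice function exists yet.
;; Note: advice on C primitives is ignored by calls made from C code.
(advice-add 'insert-before-markers :around #'gptel--clean-insertion-advice)
(advice-add 'delete-region :around #'gptel--preserve-delimiters-advice)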
```
- Strip delimiters from inserted text
- Preserve necessary delimiters at region boundaries during deletion
- External editors can modify files without corruption
- Copy/paste operations work cleanly
Recovery tools:
- gptel-mark-response to mark region as response (or with prefix to mark as prompt)
- gptel-validate-buffer to check and repair delimiter integrity
- gptel-show-separators/gptel-hide-separators for visual inspection
- Overlay system shows current prompt/response status clearly
Backwards compatibility:
- No migration needed for existing chat logs
- Delimiters recreated from bounds when loading buffer
- Maintains the "everything up to cursor" interaction model

A simpler alternative would be to:

Skip the safe editing operations entirely
Rely on clear overlay feedback to show prompt/response regions
Trust users to maintain/repair their chat buffers as needed
Provide the same robust recovery tools above

This simpler approach might be preferable - users get immediate visual feedback about response regions and can easily fix any corruption using gptel-mark-response. The editing safeguards may be unnecessary complexity given good overlay feedback and repair tools.

All that said, and backtracking a bit, @daedsidog suggested in #343 that simply making regions explicitly visible and allowing them to be fixed up with gptel-mark-response (or gptel-toggle-response-role per his suggestion) might be easiest because

It maintains the existing text property mechanism but adds explicit user control
It avoids introducing new delimiter-related complexity and edge cases
The visual feedback is also through overlays and makes it clear what's prompt vs response
Manual region marking with gptel-mark-response gives users direct control of prompt vs response

The zero-width delimiter approach I've suggested, while elegant in some ways, introduces:

New edge cases around delimiter handling
Possibly complex advice on editing primitives if you take it that far
Potential for delimiter corruption requiring repair tools
Additional complexity in buffer management

On balance perhaps the simplest solution would be:

Keep existing text property mechanism
Add clear overlay-based visual feedback
Provide gptel-mark-response command for manual region control
Trust users to maintain their chat buffers with these tools

This would maintain gptel's existing M.O. while giving users the tools they need to manage/edit prompt/response regions effectively. The visual feedback through overlays addresses the "what is marked as what" problem, while manual region control handles edge cases without introducing new complexity.

The benefit of the zero-width delimiter solution in #565 is that with careful editing (avoiding region boundaries) you won't break the prompt/response sequence within a buffer. With the existing text property mechanism, by contrast, you will almost always break the sequence because of the way text properties are handled in Emacs. That is, more often than not, prompt/response regions will need to be fixed up after editing the buffer, notwithstanding the sticky patch 25efd55 that @karthink recently introduced to mitigate this (I've found myriad ways to break this with yank and other editing commands).

[2025-01-17 Fri 11:32]

axelknock · 2025-01-17T15:21:50Z

A potential way to introduce this without changing the way gptel fundamentally works would be to add two customizable variables, such as gptel-response-start/gptel-response-end, which, when both are non-nil, break up the buffer as described above. Surfacing this in the transient menu would allow users to opt in to this behavior in some buffers and not others.
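
A minimal sketch of that opt-in switch, assuming string-valued options that default to nil. The `demo-opt-` names and the custom group are hypothetical and stand in for the suggested variables.

```elisp
;; Sketch only: hypothetical stand-ins for the suggested
;; gptel-response-start / gptel-response-end options.

(defgroup demo-gptel nil
  "Hypothetical customization group for this sketch."
  :group 'applications)

(defcustom demo-opt-response-start nil
  "String marking the start of a response, or nil to disable."
  :type '(choice (const :tag "Disabled" nil) string)
  :group 'demo-gptel)

(defcustom demo-opt-response-end nil
  "String marking the end of a response, or nil to disable."
  :type '(choice (const :tag "Disabled" nil) string)
  :group 'demo-gptel)

(defun demo-opt-delimiters-enabled-p ()
  "Return t when both delimiter options are set in the current buffer."
  (and demo-opt-response-start demo-opt-response-end t))
```

Setting both variables with `setq-local` in one buffer would enable delimiter tracking there while every other buffer keeps the property-based behavior.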

lispy-ai added the enhancement New feature or request label Jan 13, 2025

lispy-ai mentioned this issue Jan 13, 2025

Naive ideas for making gptel simpler yet more powerful #546

Open

axelknock mentioned this issue Jan 17, 2025

Dedicated buffer like gptel-inspect-query, but just to edit the response at point with the same major mode #573

Open
