Naive ideas for making gptel simpler yet more powerful #546
Comments
I agree! It doesn't work well, I'd like to change it. The reasons it's still the default are:
Point 3 continued: More than changing the defaults, I'd like some way to transparently indicate to the users that the prefixes are cosmetic as far as the LLM is concerned. 1 and 2 are not minor issues, we can live with these breakages. If you have suggestions for 3 and 4 however, please let me know.
This tree-of-conversations feature only works in Org mode, not markdown.
As for "something else", again suggestions are welcome. The only constraint is that I don't plan to switch gptel to using syntax to track responses instead of invisible text-properties.

[1]: I know this because this is also my default behavior when installing Emacs packages. There's just too much to read considering how many Emacs packages we use, and I like reading documentation.

These are all important points. In my opinion all of them could be solved by one (unfortunately huge) change:
This is some pain and some work, but the result would be a huge improvement IMHO.
I’m aware of the tree-of-conversations feature, but I didn’t even mean it here. (I also agree that it can be confusing.) My thinking is that it is often useful to have (org or markdown) document structure in a “gptel-notebook” even without tree-of-conversations. This could be as simple as having a few sections that split up the conversation into subsequent sections. The prompt prefixes should not interfere with this.
OK, I understand now that gptel also uses text properties in other buffers, not only in But I think that this is not a problem: currently, text properties are only preserved in notebooks (through the local variables mechanism), not in other files. If the gptel "response" text property was preserved through markings instead of storing I think that having special markers |
After playing a bit with the effect of gptel text properties in non-notebook buffers, I fear that gptel's current model of using hidden text properties is more confusing than useful: For example let's say that the user is in a programming mode buffer and asks the LLM to write a function. The function code is inserted and invisibly marked as a response. Now if the user modifies this function by introducing some code in the middle, and later asks some question to the LLM about this function, gptel will split up the code of the function into "user" and "assistant" chunks, thus totally confusing the LLM. To make things slightly more confusing, this splitting up is not operational if one uses gptel's rewrite mechanism. I understood all of this only because I enabled inspection. My current thinking is that gptel's hidden text properties are not worth the complexity and potential confusion. Here is an idea about how I believe the operation of gptel could be made easier to understand and more powerful at the same time:
For example, if the system message is "Assist in composing a letter. Only output letter text, without any explanations." and the user wants some section to be edited, they would mark the section, and the directive (knowing that the region is active) would emit an additional request to "reformulate the highlighted region". If no region is active, the request would be to suggest some text based on the surroundings.

Note that in this new way of operating, Emacs' narrow-down functionality would be used to restrict what the LLM gets to see of the buffer. The region is used to indicate what part should be replaced. Using both narrow-down and region/point is the key to allowing more powerful/unified/consistent directives in a transparent way (you get what you see).

The way static directives could be specified is by the user providing three strings: one general instruction, one additional instruction for the case that the region is active, and one for when it is not. The second or third string could be It seems to me that these simple directives would already cover many use cases in a transparent way. They would unify rewrite and non-rewrite. Explain-the-thing-at-point could be realized as a directive as well. Of course Note that with current

Note also that with my proposal it would still be possible to use gptel quickly in any buffer. For example, replacing a region of the minibuffer would work very naturally - it would no longer require pressing

I'm sorry if my ideas are too radical or unrealistic. (Sometimes, ideas of a newcomer can be useful.) In any case, I would appreciate your comments on these ideas.
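The three-string idea above can be sketched in a few lines of plain Emacs Lisp. This is only an illustration of the proposal, not anything gptel provides: `my/compose-instruction` and `my/current-instruction` are hypothetical names, and passing `nil` for a string is assumed to mean "add nothing in that case".

```elisp
;; Hypothetical helpers illustrating the proposal; not part of gptel.

(defun my/compose-instruction (general with-region without-region region-active-p)
  "Combine GENERAL with the instruction matching REGION-ACTIVE-P.
WITH-REGION is appended when REGION-ACTIVE-P is non-nil, WITHOUT-REGION
otherwise.  A nil extra instruction leaves GENERAL unchanged."
  (let ((extra (if region-active-p with-region without-region)))
    (if extra
        (concat general "\n\n" extra)
      general)))

(defun my/current-instruction (general with-region without-region)
  "Like `my/compose-instruction', using the current buffer's region state."
  (my/compose-instruction general with-region without-region (use-region-p)))
```

With the letter example, `(my/current-instruction "Assist in composing a letter." "Reformulate the highlighted region." "Suggest text to insert at point.")` would pick the second string while a region is active and the third otherwise. Keeping the region test in a thin wrapper keeps the combining logic easy to check in isolation.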
* gptel.el (gptel--attach-response-history, gptel--insert-response, gptel--restore-state): Make the `gptel' text property front-sticky, so that it is now front-sticky and rear-nonsticky. This means text typed at the start of a response is considered part of the response, while text typed at the end is not. (#249, #321, #343, #546)

This is another experiment to address the longstanding problem of being able to edit gptel responses as responses without creating the many edge cases caused by rear-sticky text properties. In the process we hopefully also avoid a whole bunch of API validation errors caused by whitespace user-edits in the buffer. (#351, #409, This change is tentative and might be reverted in the future.

* gptel-org.el (gptel-org--restore-state): Concomitant changes.

* gptel-curl.el (gptel-curl--stream-insert-response): Concomitant changes.

There are many suggestions here, so I'll respond in detail when I have time. There are several prior discussions about all these topics; I suggest searching the issues and discussions page for "tracking", "track", "whitespace", "prefix" and org-mode, among other terms. See also #249, #321, #343 among others. (Feel free to skip past the many parts that don't have to do with the issues you bring up in this thread.)
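To make the stickiness semantics described in the commit message concrete, here is a minimal sketch in plain Emacs Lisp. It does not touch gptel: `my-response` is a hypothetical stand-in for the real property, and `my/sticky-demo` is an illustrative name. `insert-and-inherit` is used because it applies the same inheritance rules as ordinary typing.

```elisp
;; Stand-alone illustration; `my-response' is a made-up property name.

(defun my/sticky-demo ()
  "Return (AT-START AT-END): whether text inserted at each edge inherits.
The marked text is front-sticky and rear-nonsticky for `my-response'."
  (with-temp-buffer
    (insert "user text ")
    (let ((start (point)))
      (insert "response")
      (add-text-properties start (point)
                           '(my-response t
                             front-sticky (my-response)
                             rear-nonsticky (my-response)))
      ;; Insert at the front edge of the marked text.
      (goto-char start)
      (insert-and-inherit "X")
      (let ((at-start (get-text-property start 'my-response)))
        ;; Insert right after the last marked character.
        (goto-char (point-max))
        (let ((end (point)))
          (insert-and-inherit "Y")
          (list at-start (get-text-property end 'my-response)))))))
```

Evaluating `(my/sticky-demo)` should return `(t nil)`: the character inserted at the start picks up the property, the one inserted at the end does not.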
Thanks for the pointers. (Looking forward to your response!) Here is just some more context for what I wrote above: I tried to use
This seems to work (assuming that the region is active), but in interactive usage I haven’t found a way to change the system message, which is always the default one. It seems like I would have to provide different versions of this directive for different roles. Such experiments made me think about how
Addressing a couple of your points here (out of many, so I'll respond more later).
I don't follow. Do you mean that you haven't been able to change the system message dynamically, from gptel's menu?
All "directives" are the same kind of object: a system message + a canned conversation. That some are useful for rewriting and some for other kinds of interaction doesn't make them different in kind. You can have as many rewrite directives as you want, fit for different purposes. Here's a rewrite directive I use when the task needs more context:

    (defun my/gptel-code-infill ()
      "Fill in code at point based on buffer context.  Note: Sends the whole buffer."
      (let ((lang (gptel--strip-mode-suffix major-mode)))
        ;; The list is: system message, then alternating user/LLM turns.
        `(,(format "You are a %s programmer and assistant in a code buffer in a text editor.
    Follow my instructions and generate %s code to be inserted at the cursor.
    For context, I will provide you with the code BEFORE and AFTER the cursor.
    Generate %s code and only code without any explanations or markdown code fences. NO markdown.
    You may include code comments.
    Do not repeat any of the BEFORE or AFTER code." lang lang lang)
          nil                           ; nil skips the first user turn
          "What is the code AFTER the cursor?"
          ;; Everything after point (or after the region, if one is active).
          ,(format "AFTER\n```\n%s\n```\n"
                   (buffer-substring-no-properties
                    (if (use-region-p) (max (point) (region-end)) (point))
                    (point-max)))
          "And what is the code BEFORE the cursor?"
          ;; Everything before point (or before the region, if one is active).
          ,(format "BEFORE\n```%s\n%s\n```\n" lang
                   (buffer-substring-no-properties
                    (point-min)
                    (if (use-region-p) (min (point) (region-beginning)) (point))))
          ;; `,@' splices a list, so the final string is wrapped in `list'.
          ,@(when (use-region-p) (list "What should I insert at the cursor?")))))

(You could achieve the same effect by adding the buffer to gptel's context and using the default rewrite directive, but that gets tedious.) That said, the discussion in #375 hasn't concluded and meain's idea of templated interactions hasn't been implemented in full yet, so the way that instructions are specified for rewrites will probably change some more.
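To illustrate the "system message + canned conversation" shape described above, here is a small sketch that splits such a list into its parts. It assumes the layout (SYSTEM USER LLM USER ...) with `nil` marking a skipped turn, as in the directive above; `my/directive-turns` is a hypothetical helper for inspection only and is not how gptel itself parses directives.

```elisp
;; Hypothetical inspection helper; not part of gptel.

(defun my/directive-turns (directive)
  "Split DIRECTIVE into (:system STRING :turns ((ROLE . TEXT) ...)).
Assumes DIRECTIVE is (SYSTEM USER ASSISTANT USER ...), where a nil
element stands for a turn that is skipped."
  (let ((role 'user)
        turns)
    (dolist (text (cdr directive))
      ;; Record non-nil turns; nil still consumes a slot in the alternation.
      (when text
        (push (cons role text) turns))
      (setq role (if (eq role 'user) 'assistant 'user)))
    (list :system (car directive) :turns (nreverse turns))))
```

For example, `(my/directive-turns '("Be terse." nil "What is the code AFTER the cursor?" "AFTER ..."))` labels the question as an assistant turn and the "AFTER ..." text as a user turn, which matches how the canned conversation in the directive above is meant to read.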
Yes, |
No, I mean that I can activate my split directive, and it works, but then I’m not able to vary the system message, because choosing the system message and choosing a directive is the same menu point. Here are the details: In order to enable the directive function listed above, I use: Now let’s try it out on some short shell script. I do the following:
And I get a useful response, as expected. For reference, the following query is sent in the background:
So far so good, but I’d like to be able to change the system message in the query to something else. Perhaps I feel a bit strange and would like to tell the LLM “You are a poet. Reply only in verse.”, but otherwise keep everything as above. I understand that the comment from which I took
Thus, since I added my version of the directive to My point is that there doesn’t seem to be a way to specify the system message independently from the directive. Note that the unfinished directive Thanks for Going further, gptel already varies the default system message based on the major mode of the buffer. It would be great if there was a code-infill-directive that would work both for email text and for Python code, because the mode-specific bit would be automatically set through the system message, and one could still choose a directive (and also modify the system message if one so chooses). |
Please use #375 to discuss this particular idea/workflow. We can continue in this thread the discussion of other topics you've brought up.
Thanks. I will have a closer look at #375. Within this issue, it is not my purpose to discuss templates and dynamic directives. It is rather that I got into playing with gptel's dynamic directives as a vehicle for experimenting with my ideas for an alternative mechanism for static directives. Still, here is one short question which IMHO fits best into this thread: is the example directive |
Returning to the topic of exploring alternative mechanisms for static directives, let me try to summarize my points - I fear that they got a bit mixed up in the conversation above, in particular because my understanding of gptel evolved during the discussion. This comment tries to distill my previous ideas, but also contains some new ones towards the end. Disclaimer: for simplicity, I frame the following as a concrete proposal. I am aware that I've been using gptel only for a couple of days, so I am not in any way trying to impose anything. I see this just as a form of brainstorming, and hope that others here may find some of these ideas useful. I would greatly appreciate getting feedback about this proposal. I really like gptel’s basic premise of being a simple and general tool! I think that this fits Emacs and LLMs very well. My proposal aims to get even better at this.
(1) (2) A general-purpose command similar to Let’s call this command
@grothesque Thanks for putting your thoughts together into a single post, that makes it easier to follow. I'm currently busy addressing bugs and issues opened in the past week, and couldn't get to this topic (which deserves a long response) in time this weekend. Posting here to mention that I will reply -- haven't lost track of this! Re: some of your points, you may want to look at |
Wrote #565 in partial response to this thread.
Sure, take your time. I'm very interested in your thoughts on the topics I raised. Even if you don't want to alter the way gptel works, I might be able to write something myself that uses
I've set Thanks for the pointer to |
Edit: Originally, the title of this issue was “Please consider not making prompt prefixes markdown/org headings by default”. I changed it to better reflect how the discussion has evolved.
Thank you very much for this impressive tool that is helping me discover a new world of possibilities.
I have the impression that the default choice of prompt prefixes (`gptel-prompt-prefix-alist` and `gptel-response-prefix-alist`) being headings is sub-optimal both in principle and in practice:

In principle, it means that the first line of each prompt is marked up as a heading, while the following lines are regular text. This distinction, however, is not in any way visible to the model, and therefore has no effect. Often the first line of a multi-line prompt will not have any particular importance - why mark it up differently?

In practice, if one tries to use other headings in a `M-x gptel` buffer, the fixed "prompt headings" tend to mess up the structure. At least with org, the possibility to structure the document as a tree is at the heart of what it is about. Gptel's readme file suggests a solution: https://github.com/karthink/gptel?tab=readme-ov-file#use-branching-context-in-org-mode-tree-of-conversations. How about making this the actual default? Or something else that does not introduce an inconsistency between visuals and semantics, and that does not interfere with the document structure.
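As a user-side workaround for the heading prefixes discussed above, the two alists named in this issue can be overridden per major mode. The following is a configuration sketch only: the prefix strings are arbitrary examples chosen so that they are not headings, and the exact defaults they replace may differ between gptel versions.

```elisp
;; Configuration sketch: replace heading-style prefixes with plain labels.
;; The "@user"/"@assistant" strings are illustrative, not gptel defaults.
(with-eval-after-load 'gptel
  (setf (alist-get 'org-mode gptel-prompt-prefix-alist) "@user\n")
  (setf (alist-get 'org-mode gptel-response-prefix-alist) "@assistant\n")
  (setf (alist-get 'markdown-mode gptel-prompt-prefix-alist) "@user\n")
  (setf (alist-get 'markdown-mode gptel-response-prefix-alist) "@assistant\n"))
```

Because the prefixes are cosmetic as far as the model is concerned, this changes only how the buffer looks and how it interacts with Org or Markdown heading structure.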