Would be good to add the equivalent of the Copilot Chat #35

Open
leolivier opened this issue Feb 4, 2024 · 9 comments

Comments

@leolivier

As the title says, a chat feature is really missing. For example, if I want the assistant to explain some code, I currently can't do that with Llama Coder.

@ex3ndr
Owner

ex3ndr commented Feb 8, 2024

I wish, but there is a problem: auto-complete and chat models are different, and for now Ollama only lets you run a single model at a time.

@leolivier
Author

Hi @ex3ndr,
I don't follow this point: you can have as many models (chat or not) in Ollama as you want. For example, on my basic laptop:

$ ollama list
NAME                            ID              SIZE    MODIFIED
codellama:7b-code-q4_K_M        a4350f4d99d9    4.1 GB  3 hours ago
deepseek-coder:base             3b417b786925    776 MB  4 days ago
mistral:latest                  61e88e884507    4.1 GB  2 weeks ago
stable-code:3b-code-q4_0        aa5ab8afb862    1.6 GB  4 days ago
tinyllama:latest                2644915ede35    637 MB  2 weeks ago

Just pull them and you can query them.
As for the query itself, Ollama supports both /api/chat and /api/generate, so I think it could work.
Or am I missing something?
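
To make the two endpoints concrete, here is a minimal Python sketch of the request body each one expects. The JSON field names (`model`, `prompt`, `messages`, `role`, `content`) come from the Ollama API; the helper names `build_generate_request` and `build_chat_request` are hypothetical, and the URL assumes the default local port 11434. Nothing is sent over the network here.

```python
import json
from typing import Optional, Tuple

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address


def build_generate_request(model: str, prompt: str) -> Tuple[str, str]:
    """Return (url, JSON body) for a raw completion via /api/generate."""
    body = {"model": model, "prompt": prompt}
    return f"{OLLAMA_URL}/api/generate", json.dumps(body)


def build_chat_request(model: str, user_message: str,
                       system_message: Optional[str] = None) -> Tuple[str, str]:
    """Return (url, JSON body) for a single chat turn via /api/chat."""
    messages = []
    if system_message is not None:
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": user_message})
    body = {"model": model, "messages": messages}
    return f"{OLLAMA_URL}/api/chat", json.dumps(body)


if __name__ == "__main__":
    # Completion-style request for a code model.
    print(*build_generate_request("codellama:7b-code-q4_K_M", "def add(a, b):"))
    # Chat-style request for a general model.
    print(*build_chat_request("mistral:latest", "Explain what `def add(a, b)` does."))
```

The only difference on the client side is the path and whether the input is a single `prompt` string or a list of `messages`; whether a given model gives useful answers through both is a separate question.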

@ex3ndr
Owner

ex3ndr commented Feb 8, 2024

@leolivier it would unload one model and load another, which takes a lot of time and degrades performance.
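
One way to put a number on the switching cost: when streaming, the last JSON object Ollama sends (the one with `"done": true`) carries timing fields, including `load_duration` and `total_duration`, both in nanoseconds. A minimal Python sketch for reading them follows. The helper name `load_seconds` is hypothetical and the sample values are made up for illustration, not measurements.

```python
import json


def load_seconds(final_chunk: dict) -> float:
    """Convert the `load_duration` field (nanoseconds) of the final chunk to seconds.

    Returns 0.0 when the field is absent.
    """
    return final_chunk.get("load_duration", 0) / 1e9


if __name__ == "__main__":
    # Illustrative values only, not real measurements.
    sample = json.loads(
        '{"done": true, "total_duration": 5000000000, "load_duration": 2500000000}'
    )
    print(f"model load took {load_seconds(sample):.2f}s")
```

Comparing this value for a request that reuses the already-loaded model with one that forces a switch would show how much of the latency is due to loading on a given machine.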

@leolivier
Author

leolivier commented Feb 8, 2024

Actually, switching between models is very fast if you have a decent machine.
There is also the "enterprise use case" (the one I'm currently interested in), where a "big enough" server (i.e. one with some GPUs and enough VRAM) supports a whole development team and thus replaces GitHub Copilot.

@leolivier
Author

leolivier commented Feb 9, 2024

I thought about it again, and I think you can use the same model in both cases.
You just need to switch between /api/chat and /api/generate to tell the two cases apart. That may not be optimal when models trained specifically for chat exist, but if you are using, for instance, codellama or deepseek-coder, there are no separate models for chat...

@ex3ndr
Owner

ex3ndr commented Feb 9, 2024

I tried this and it didn’t work; the models are incompatible.

@leolivier
Author

I tried this myself on stable-code with Ollama:

$ curl http://localhost:11434/api/generate     -H "Content-Type: application/json"     -d '{
        "model": "stable-code:3b-code-q4_0",
        "prompt": "generate a python script that says \"Helloworld!\""
    }'
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:28:54.821698783Z","response":"\n","done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:28:54.92666358Z","response":"    ","done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:28:55.031251477Z","response":"\"\"\"","done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:28:55.134103674Z","response":"\n","done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:28:55.238668971Z","response":"    ","done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:28:55.351221768Z","response":"return","done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:28:55.456901866Z","response":" '''","done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:28:55.574238862Z","response":"#","done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:28:55.699279559Z","response":"!/","done":false}
...
$ curl http://localhost:11434/api/chat     -H "Content-Type: application/json"     -d '{
        "model": "stable-code:3b-code-q4_0",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful coding assistant."
            },
            {
                "role": "user",
                "content": "generate a python script that says \"Helloworld!\""
            }
        ]
    }'
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:19.262063523Z","message":{"role":"assistant","content":"\n"},"done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:19.368737329Z","message":{"role":"assistant","content":"    "},"done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:19.475170735Z","message":{"role":"assistant","content":"#"},"done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:19.582929841Z","message":{"role":"assistant","content":"     "},"done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:19.697075448Z","message":{"role":"assistant","content":"return"},"done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:19.813000654Z","message":{"role":"assistant","content":" ("},"done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:19.92903326Z","message":{"role":"assistant","content":"200"},"done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:20.038715066Z","message":{"role":"assistant","content":","},"done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:20.148516872Z","message":{"role":"assistant","content":" b"},"done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:20.254684878Z","message":{"role":"assistant","content":"'"},"done":false}
{"model":"stable-code:3b-code-q4_0","created_at":"2024-02-09T16:33:20.362591484Z","message":{"role":"assistant","content":"Hello"},"done":false}

So, it seems to work, doesn't it? (At least with stable-code; I will retry the same with codellama.)
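
The output above is newline-delimited JSON, one object per token: `/api/generate` puts the text in `response`, while `/api/chat` puts it in `message.content`. A small Python sketch for joining the fragments back into text makes the two results easier to compare. The function name `join_stream` is hypothetical, and the sample lines are shortened from the output above (the `created_at` field is dropped).

```python
import json
from typing import Iterable


def join_stream(lines: Iterable[str]) -> str:
    """Concatenate the text fragments of an Ollama streaming response."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        if "message" in chunk:
            # /api/chat shape: {"message": {"role": ..., "content": ...}}
            parts.append(chunk["message"].get("content", ""))
        else:
            # /api/generate shape: {"response": ...}
            parts.append(chunk.get("response", ""))
    return "".join(parts)


if __name__ == "__main__":
    generate_lines = [
        '{"model":"stable-code:3b-code-q4_0","response":"    ","done":false}',
        '{"model":"stable-code:3b-code-q4_0","response":"return","done":false}',
    ]
    chat_lines = [
        '{"model":"stable-code:3b-code-q4_0","message":{"role":"assistant","content":"return"},"done":false}',
        '{"model":"stable-code:3b-code-q4_0","message":{"role":"assistant","content":" ("},"done":false}',
        '{"model":"stable-code:3b-code-q4_0","message":{"role":"assistant","content":"200"},"done":false}',
    ]
    print(repr(join_stream(generate_lines)))
    print(repr(join_stream(chat_lines)))
```

Reading the joined text, rather than the raw token stream, makes it easier to judge whether a response actually answers the prompt or is just a plausible continuation.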

@leolivier
Author

With codellama:

$ curl http://localhost:11434/api/generate     -H "Content-Type: application/json"     -d '{
        "model": "codellama:7b-code-q4_K_M",
        "prompt": "generate a python script that says \"Helloworld!\""
    }'
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:27.216743107Z","response":"\n","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:27.483732142Z","response":"\"\"","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:27.82007234Z","response":"\"","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:28.171497443Z","response":"\n","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:28.527946748Z","response":"#","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:28.879778551Z","response":" write","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:29.229749054Z","response":" your","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:29.595589961Z","response":" code","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:29.954491766Z","response":" here","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:30.20745624Z","response":"\n","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:30.44548511Z","response":"print","done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:38:30.695191883Z","response":"(\"","done":false}
...
$ curl http://localhost:11434/api/chat     -H "Content-Type: application/json"     -d '{
        "model": "codellama:7b-code-q4_K_M",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful coding assistant."
            },
            {
                "role": "user",
                "content": "generate a python script that says \"Helloworld!\""
            }
        ]
    }'
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:00.355504103Z","message":{"role":"assistant","content":"\n"},"done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:00.607601514Z","message":{"role":"assistant","content":"print"},"done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:00.875965231Z","message":{"role":"assistant","content":"(\""},"done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:01.138125445Z","message":{"role":"assistant","content":"hello"},"done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:01.386460654Z","message":{"role":"assistant","content":" world"},"done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:01.643014066Z","message":{"role":"assistant","content":"!"},"done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:02.021229831Z","message":{"role":"assistant","content":"\")"},"done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:02.49918594Z","message":{"role":"assistant","content":"\n"},"done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:02.948030236Z","message":{"role":"assistant","content":"\n"},"done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:03.548053683Z","message":{"role":"assistant","content":"#"},"done":false}
{"model":"codellama:7b-code-q4_K_M","created_at":"2024-02-09T16:40:04.098876089Z","message":{"role":"assistant","content":" to"},"done":false}

So it's also working...

@richarddli

Note that GitHub Copilot Chat is a separate extension (https://marketplace.visualstudio.com/items?itemName=GitHub.copilot-chat) from the main code completion extension. I agree it would be super useful; I'm currently looking for a plug-in that does chat (I find it more useful than code completion).
