Hey! Thanks for the snippet and the detailed issue description -- I love this idea!
Vision is definitely in the long-term plans for Stagehand. We are actually in the process of removing the current implementation and completely revamping it with a different approach, which we hope to add in the very near future.
I think after we get that in, a PR for this issue would be greatly appreciated. Gonna share this with the team and get their thoughts.
#352 is a fantastic addition. I'm currently using deepseek-r1 via Ollama to run everything locally / offline. That said, afaik ds-r1 doesn't support vision.
I'd like to pass two LLMClients when initializing Stagehand: one for vision, and the other for everything else. This would allow me to try out something like LLaVA locally.
My guess is that the interface would look something like:
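A rough, hypothetical sketch of such an interface, not Stagehand's actual API: the `visionClient` option name, the `LLMClientLike` and `ChatRequest` types, and the `selectClient` helper are all assumptions made for illustration, and the clients here are stubs rather than real model wrappers.

```typescript
// Hypothetical sketch: none of these names are taken from Stagehand itself.

// A request that may or may not carry an image (e.g. a page screenshot).
interface ChatRequest {
  prompt: string;
  image?: Uint8Array; // present only when vision is needed
}

// Minimal stand-in for an LLM client abstraction.
interface LLMClientLike {
  name: string;
  complete(request: ChatRequest): Promise<string>;
}

// Assumed shape of the init options: one default client, one optional vision client.
interface DualClientOptions {
  llmClient: LLMClientLike; // text-only model, e.g. deepseek-r1 via Ollama
  visionClient?: LLMClientLike; // vision-capable model, e.g. LLaVA
}

// Route requests with an image to the vision client and everything else
// to the default client. Fail loudly if vision is needed but not configured.
function selectClient(
  options: DualClientOptions,
  request: ChatRequest,
): LLMClientLike {
  if (request.image !== undefined) {
    if (options.visionClient === undefined) {
      throw new Error("Vision request received but no visionClient configured");
    }
    return options.visionClient;
  }
  return options.llmClient;
}

// Stub clients that just echo the prompt, standing in for real model calls.
function stubClient(name: string): LLMClientLike {
  return {
    name,
    complete: async (request: ChatRequest) => `${name}: ${request.prompt}`,
  };
}

const options: DualClientOptions = {
  llmClient: stubClient("deepseek-r1"),
  visionClient: stubClient("llava"),
};

const textOnly: ChatRequest = { prompt: "click the login button" };
const withImage: ChatRequest = {
  prompt: "describe this screenshot",
  image: new Uint8Array([1, 2, 3]),
};

console.log(selectClient(options, textOnly).name);
console.log(selectClient(options, withImage).name);
```

Throwing when no vision client is configured, rather than silently falling back to the text model, seems the safer default here, since a text-only model like ds-r1 can't do anything useful with the image.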
Has this been discussed internally? Is vision going to be kept longer term in Stagehand? Would a PR for this be welcome?
Thanks! 🤙
(related: #184)