Hello!
After seeing that Sonnet is trained for computer use (with exact pixel coordinates), I tried using it for bounding box detection, both open-vocabulary with text input and few-shot with image input. However, my results have been worse than I expected given Claude's performance on computer use. I tried following the best practices outlined in this repo.
My questions are:
1. Can you share which coordinate normalization and origin location Claude was trained with for computer use, so that I can use the same setup?
2. Are there any bounding box grounding suggestions I should try beyond what is given in the cookbooks?
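To make the first question concrete, here is a minimal sketch of the conventions it refers to. The helper names `to_pixels` and `flip_origin` are hypothetical, and the scales and origins shown are only the variables I am asking about, not confirmed facts about how the model was trained.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def to_pixels(box: Box, width: int, height: int, scale: float = 1.0) -> Tuple[int, int, int, int]:
    """Map a box given in [0, scale] coordinates to integer pixel coordinates.

    Assumes a top-left origin. scale=1.0 means fractions of the image size,
    scale=1000 means a 0-1000 grid; both are illustrative choices.
    """
    def px(value: float, size: int) -> int:
        # Rescale to pixels, then clamp to the image bounds.
        return min(max(round(value / scale * size), 0), size)

    x0, y0, x1, y1 = box
    return (px(x0, width), px(y0, height), px(x1, width), px(y1, height))


def flip_origin(box: Tuple[int, int, int, int], height: int) -> Tuple[int, int, int, int]:
    """Convert a pixel box between top-left and bottom-left origin conventions."""
    x0, y0, x1, y1 = box
    return (x0, height - y1, x1, height - y0)
```

For a 640x480 image, `to_pixels((0.25, 0.5, 0.75, 1.0), 640, 480)` and `to_pixels((250, 500, 750, 1000), 640, 480, scale=1000)` describe the same box, so knowing which convention the model expects determines how its output should be decoded.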
Thank you very much!