Bounding Box Detection #123

batu · 2024-12-19T20:48:58Z

Hello!

After seeing that Sonnet is trained for computer use (with exact pixel coordinates), I tried using it for bounding box detection (both open-vocabulary with text input and few-shot with image input). However, my results have been worse than I expected given Claude's performance with computer use. I tried following the best practices outlined in this repo.

My questions are:

Can you share which coordinate normalization and origin location Claude is trained on for computer use, so I can use the same setup?
Are there any bounding box grounding suggestions I should try beyond what is given in the cookbooks?
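
To make the first question concrete, here is a minimal sketch of the conventions I am choosing between. The `Box` class, the `to_pixels` helper, and the `scale` and `origin` parameters are hypothetical names for illustration only, and none of these conventions is confirmed to be what Claude was trained on. The helper maps a box from a given convention into absolute pixels with a top-left origin, so outputs produced under different prompts can be compared against the same ground truth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box as (x_min, y_min, x_max, y_max)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def to_pixels(box: Box, width: int, height: int,
              scale: float = 1.0, origin: str = "top-left") -> Box:
    """Convert a box in [0, scale] coordinates to top-left-origin pixels.

    scale=1.0 means coordinates normalized to [0, 1]; scale=1000 means a
    0-1000 grid. origin="bottom-left" flips the y axis. All of these are
    assumed candidate conventions, not a documented model format.
    """
    x0 = box.x_min / scale * width
    x1 = box.x_max / scale * width
    y0 = box.y_min / scale * height
    y1 = box.y_max / scale * height
    if origin == "bottom-left":
        # Flip the y axis and swap so that y_min stays the smaller value.
        y0, y1 = height - y1, height - y0
    elif origin != "top-left":
        raise ValueError(f"unknown origin: {origin!r}")
    return Box(x0, y0, x1, y1)


if __name__ == "__main__":
    # Same model output interpreted under two candidate conventions.
    raw = Box(0.25, 0.5, 0.75, 1.0)
    print(to_pixels(raw, width=200, height=100))
    print(to_pixels(raw, width=200, height=100, origin="bottom-left"))
```

Knowing which of these (if any) matches the computer use setup would let me drop the guesswork and prompt for that format directly.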

Thank you very much!
