Giving more context on image requirements #119

n-sviridenko · 2025-01-21T09:29:57Z

Hi everyone,

I have a few questions about the image requirements:

What's the minimum device pixel ratio an image can have and still give proper results? We currently send screenshots from retina displays, which are pretty big, and I'm not sure they need to be that big. Could it even go below a 1x device pixel ratio (e.g. 0.5)?
How significantly do the image dimensions affect the inference time (if anyone has done some benchmarking)? For example, does doubling the size increase the inference time by only about 10%?
Are there already optimisations in the OmniParser code that reduce the image dimensions?
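
To make the first question concrete, here is a minimal sketch of the kind of downscaling I have in mind. The helper name `target_size` and the example resolution are hypothetical. It only computes the new dimensions and does not call anything from OmniParser.

```python
from typing import Tuple


def target_size(width: int, height: int,
                source_dpr: float, target_dpr: float) -> Tuple[int, int]:
    """Return pixel dimensions for a screenshot rescaled from source_dpr to target_dpr.

    Hypothetical helper for illustration only; not part of OmniParser.
    """
    if source_dpr <= 0 or target_dpr <= 0:
        raise ValueError("device pixel ratios must be positive")
    scale = target_dpr / source_dpr
    # Round to the nearest pixel and never go below 1 px per side.
    return max(1, round(width * scale)), max(1, round(height * scale))


# Example: a 2880x1800 screenshot captured at 2x (retina).
print(target_size(2880, 1800, 2.0, 1.0))  # (1440, 900)
print(target_size(2880, 1800, 2.0, 0.5))  # (720, 450)
```

Going from 2x to 1x halves each side, so the image has a quarter of the pixels, which is why I'm curious how much inference time that would actually save.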

Best,
Nikita
