Which model should be preferred for desktop use?
Paper states:
"We evaluate both UI-TARS-SFT and UI-TARS-DPO for OSWorld in § 5.4, as this benchmark benefits most from the iterative improvement from the DPO phase. For other benchmarks, however, we report the model trained after the annealing phase (i.e., UI-TARS-SFT)."
Does this mean that the SFT model is likely to be preferred for other tasks, especially Midscene Chrome navigation?