Description
This issue concerns the second test sample in https://github.com/X-PLUG/MobileAgent/blob/main/UI-S1/datasets/android_control_evaluation_std.jsonl:
{"goal": "I'd like to add this item to my cart.", "is_successful": true, "steps": [{"action_content": {"action": "click", "coordinate": [542, 2226]}, "screenshot": "/datasets/AndroidControl/images/8197_0.png", "step_instruction": "click on the Add to cart button", "check_options": {"action": "click", "candidate_bbox": [[0, 262, 1084, 2339]], "coordinate": [542, 2226]}}, {"action_content": {"action": "wait", "time": 2}, "screenshot": "/datasets/AndroidControl/images/8197_1.png", "step_instruction": "click on the Add to cart button", "check_options": {"action": "wait", "time": 2}}, {"action_content": {"action": "wait", "time": 2}, "screenshot": "/datasets/AndroidControl/images/8197_2.png", "step_instruction": "click on the Add to cart button", "check_options": {"action": "wait", "time": 2}}], "episode_id": 8197}
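The annotated action {"action": "click", "coordinate": [542, 2226]} places the click off the icon (red dot in Figure 1).
The open-source UI-S1 7B model, however, outputs {"action": "click", "coordinate": [795, 2261]}, which places the click at the center of the icon (red dot in Figure 2).
Is this a problem with the test dataset, or is something else wrong?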
When running the evaluation with the command python /evaluation/eval_qwenvl.py --model_name UI-S1-7B, the image is resized.
The values before and after the resize compare as follows:
current_check_pam: {'action': 'click', 'coordinate': [0.5018518518518519, 0.9275], 'candidate_bbox': []}
Original image: (width, height) = (1080, 2400)
Annotated data: {"action": "click", "coordinate": [542, 2226]}
pred_action: {'action': 'click', 'coordinate': [0.728021978021978, 0.938953488372093]}
Resized image: (resized_width, resized_height) = (1092, 2408)
Eval output: {"action": "click", "coordinate": [795, 2261]}
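The normalized values in the log can be reproduced from the pixel coordinates and image sizes listed here. The sketch below is only an illustration of that arithmetic. The helper names normalize and to_pixels are hypothetical and are not taken from eval_qwenvl.py.

```python
def normalize(point, size):
    """Convert a pixel (x, y) into fractions of (width, height)."""
    x, y = point
    width, height = size
    return [x / width, y / height]


def to_pixels(norm_point, size):
    """Map a normalized (x, y) back onto an image of the given size."""
    x, y = norm_point
    width, height = size
    return [x * width, y * height]


original_size = (1080, 2400)  # (width, height) of 8197_0.png
resized_size = (1092, 2408)   # (resized_width, resized_height)

# The annotation is expressed in original-image pixels.
gt_norm = normalize([542, 2226], original_size)

# The model output is expressed in resized-image pixels.
pred_norm = normalize([795, 2261], resized_size)

# The prediction mapped back onto the original image, for comparison
# with the annotation in the same pixel space.
pred_on_original = to_pixels(pred_norm, original_size)

print("gt_norm:", gt_norm)
print("pred_norm:", pred_norm)
print("pred_on_original:", pred_on_original)
```

Working in normalized coordinates removes the effect of the resize, so the gap between the two points (about 0.23 of the image width horizontally) is not explained by the resize alone.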
The function check_click(click, candidate_bbox, gt_point) returned False, so the step for 8197_0.png was judged as not extract_match.
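To make the failure easier to reason about, here is a hypothetical stand-in for a click check. It is not the repository's check_click, whose actual logic is not shown in this issue. The sketch assumes a click passes if it falls inside any candidate box or lies within a distance threshold of the ground-truth point, all in normalized coordinates. The function name check_click_sketch and the threshold max_dist=0.14 are assumptions chosen for illustration.

```python
import math


def check_click_sketch(pred, gt_point, candidate_bbox, max_dist=0.14):
    """Hypothetical click check, NOT the repository's check_click.

    pred, gt_point: normalized [x, y] points.
    candidate_bbox: list of normalized [x1, y1, x2, y2] boxes (may be empty).
    max_dist: assumed distance threshold in normalized units.
    """
    px, py = pred
    # Accept the click if it lands inside any candidate box.
    for x1, y1, x2, y2 in candidate_bbox:
        if x1 <= px <= x2 and y1 <= py <= y2:
            return True
    # Otherwise fall back to the distance from the ground-truth point.
    gx, gy = gt_point
    return math.hypot(px - gx, py - gy) <= max_dist


# Values copied from the log in this issue.
gt_point = [0.5018518518518519, 0.9275]
pred = [0.728021978021978, 0.938953488372093]

distance = math.hypot(pred[0] - gt_point[0], pred[1] - gt_point[1])
result = check_click_sketch(pred, gt_point, candidate_bbox=[])
print("distance:", distance, "match:", result)
```

Under these assumptions, the empty candidate_bbox from the log leaves only the distance test, and the two points are roughly 0.23 apart, so the sketch also rejects the prediction.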