Deploying Agent S in OSWorld

To use Agent S with OSWorld, first follow the setup instructions in the OSWorld repository: https://github.com/xlang-ai/OSWorld.git

After completing the setup instructions, import the GraphSearchAgent into the run.py file in OSWorld. The GraphSearchAgent is the parent agent used in the Agent S framework. To understand the architecture of this GraphSearchAgent, refer to Agent S Architecture.

from agent_s.GraphSearchAgent import GraphSearchAgent

Replace the PromptAgent on line 138 in the test() method with the GraphSearchAgent. Specify the engine parameters and instantiate the agent as shown:

# Pick the engine from the model name; anything else is served through vLLM.
if args.model.startswith("claude"):
    engine_type = "anthropic"
elif args.model.startswith("gpt"):
    engine_type = "openai"
else:
    engine_type = "vllm"

engine_params = {
    "engine_type": engine_type,
    "model": args.model,
}

agent = GraphSearchAgent(
    engine_params=engine_params,
    experiment_type='osworld',
    platform="ubuntu",
    max_tokens=args.max_tokens,
    top_p=args.top_p,
    temperature=args.temperature,
    action_space=args.action_space,
    observation_type=args.observation_type,
    max_trajectory_length=args.max_trajectory_length,
    vm_version='latest'
)
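The engine selection above can also be factored into small helpers, which makes the mapping easy to check before launching a run. The following is a minimal sketch, not part of Agent S or OSWorld: the function names `select_engine_type` and `build_engine_params` are hypothetical, and the logic simply mirrors the branch shown above.

```python
def select_engine_type(model: str) -> str:
    """Map a model name to an engine type (hypothetical helper)."""
    if model.startswith("claude"):
        return "anthropic"
    if model.startswith("gpt"):
        return "openai"
    # Any other model name falls through to vLLM, as in the branch above.
    return "vllm"


def build_engine_params(model: str) -> dict:
    """Build the engine_params dict in the shape passed to GraphSearchAgent."""
    return {"engine_type": select_engine_type(model), "model": model}


if __name__ == "__main__":
    for name in ("gpt-4o", "claude-3-5-sonnet-20240620", "my-local-model"):
        print(name, "->", build_engine_params(name))
```

In run.py, the result of `build_engine_params(args.model)` would take the place of the hand-built `engine_params` dictionary.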

The permissible values for the model argument are gpt-4o and gpt-4o-mini for OpenAI models, and claude-3-5-sonnet-20240620 for Anthropic models.
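A quick check against the model names listed above can catch typos before a long evaluation run starts. This is a sketch under stated assumptions: `DOCUMENTED_MODELS` and `is_documented_model` are hypothetical names, and the check is advisory only, since any other model name is routed to the vllm engine by the selection code.

```python
# Model names listed in this document, grouped by provider (hypothetical constant).
DOCUMENTED_MODELS = {
    "openai": ("gpt-4o", "gpt-4o-mini"),
    "anthropic": ("claude-3-5-sonnet-20240620",),
}


def is_documented_model(model: str) -> bool:
    """Return True if the model name is one of the values listed above."""
    return any(model in names for names in DOCUMENTED_MODELS.values())


if __name__ == "__main__":
    for name in ("gpt-4o", "gpt4o"):
        status = "documented" if is_documented_model(name) else "not listed"
        print(f"{name}: {status}")
```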

The latest Agent S is configured to use the latest Ubuntu VM image from OSWorld. Our reported experiments, however, were run on the older VM image. To reproduce those results, set the vm_version argument to 'old' when instantiating the agent.
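One way to switch between the two VM images without editing the code each time is to expose the choice as a command-line option. The sketch below assumes a hypothetical `--vm_version` flag; this document does not state that OSWorld's run.py defines such a flag, so it would need to be added to the existing argument parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser with a hypothetical --vm_version flag (not an existing OSWorld option)."""
    parser = argparse.ArgumentParser(description="Hypothetical run.py options")
    parser.add_argument(
        "--vm_version",
        choices=["latest", "old"],
        default="latest",
        help="'old' selects the VM image used in the reported experiments",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--vm_version", "old"])
    # The parsed value would be passed on as GraphSearchAgent(..., vm_version=args.vm_version).
    print(args.vm_version)
```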