Ketan simplified co act workflow #3793

Draft: wants to merge 19 commits into main

ketan1741

Short description of the problem this fixes or functionality that this introduces. This may be used for the CHANGELOG

  • This PR implements a simplified multi-agent workflow inspired by the CoAct paper.
  • Currently, in swe-bench eval, there are complex instances that OpenHands fails, especially ones that single CodeActAgent overlooks the buggy location. If we have a grounding test case for the issue, this workflow seems to help.
  • An overkill-ish successful trajectory with replanning can be found here.
  • A task that CoActPlannerAgent completed but CodeActAgent failed (I expected both to be able to complete it):
    CoAct traj
    CodeAct traj

Give a summary of what the PR does, explaining any non-trivial design decisions

  • Modify CodeAct so that it accepts delegated tasks.
  • Implement two new agents, a planner and an executor, with the same abilities as CodeAct but with different system prompts and additional action parsers.
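
For readers unfamiliar with the planner/executor split, here is a minimal sketch of the control flow described above: a planner produces steps, delegates each one to an executor, and replans when a step fails. It is not the code in this PR and does not use the OpenHands API. Every name in it (`DelegatedTask`, `ExecutionResult`, `Planner`, `Executor`, `make_plan`, `run_step`) is hypothetical, and the LLM calls are replaced by plain Python callables.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DelegatedTask:
    """A unit of work the planner hands to the executor (hypothetical type)."""
    description: str


@dataclass
class ExecutionResult:
    """What the executor reports back to the planner (hypothetical type)."""
    success: bool
    summary: str


class Executor:
    """Stand-in for an executor agent; `run_step` replaces the real agent loop."""

    def __init__(self, run_step: Callable[[str], bool]):
        self.run_step = run_step

    def execute(self, task: DelegatedTask) -> ExecutionResult:
        ok = self.run_step(task.description)
        status = "done" if ok else "failed"
        return ExecutionResult(ok, f"{status}: {task.description}")


class Planner:
    """Stand-in for a planner agent; `make_plan` replaces the real LLM call."""

    def __init__(
        self,
        executor: Executor,
        make_plan: Callable[[str, List[ExecutionResult]], List[str]],
        max_replans: int = 1,
    ):
        self.executor = executor
        self.make_plan = make_plan
        self.max_replans = max_replans

    def solve(self, issue: str) -> List[ExecutionResult]:
        history: List[ExecutionResult] = []
        # One initial attempt plus up to `max_replans` replanning rounds.
        for _ in range(self.max_replans + 1):
            plan = self.make_plan(issue, history)
            failed = False
            for step in plan:
                result = self.executor.execute(DelegatedTask(step))
                history.append(result)
                if not result.success:
                    failed = True  # stop this plan and let the planner replan
                    break
            if not failed:
                break
        return history


# Toy stand-ins so the sketch runs without any model.
def demo_make_plan(issue: str, history: List[ExecutionResult]) -> List[str]:
    if any(not r.success for r in history):
        return ["reproduce with the grounding test", "patch the located function"]
    return ["patch the first suspicious file"]


def demo_run_step(description: str) -> bool:
    return "first suspicious" not in description


history = Planner(Executor(demo_run_step), demo_make_plan).solve("example issue")
for result in history:
    print(result.summary)
```

In this toy run the first plan fails, the planner sees the failure in `history`, and the second plan succeeds. Stopping a plan at its first failed step is a choice made for the sketch; the actual agents may handle failures differently.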

Link of any specific issues this addresses

neubig (Contributor) commented Sep 10, 2024

Hey @ketan1741, I'm confused. How does this relate to #3770?

ketan1741 (Author) commented Sep 10, 2024

> Hey @ketan1741, I'm confused. How does this relate to #3770?

Hey, Prof. @neubig! Hoang wanted me to have a different branch if I were to make updates to our workflow implementation to make it more reliable/improve it.

neubig (Contributor) commented Sep 10, 2024

Ah, I see. We can figure this out.
