Weird behavior from trained Connect-4 agent #23

tonberry22 · 2018-09-03T07:00:54Z

Hi there, I'd like to ask whether anyone else has run into this problem. I trained an agent for a couple of days using the high-quality settings (higher self-play and simulation parameters, etc.). When I play test games against it, I sometimes fail to block its win (it already has 3-in-a-row on a diagonal) and play somewhere else, and the agent then ignores its own win and also plays elsewhere. This can go on for more than a couple of moves, and it never takes the win. I am loading the weights from my model and using act() to get the agent's moves, with tau set to 0 so that it acts deterministically.

Is this a problem with the code, or can it be explained in terms of exploitation vs. exploration? My guess is that the agent is confused in such situations because it has never explored them: it always blocks 3-in-a-rows when given the opportunity, so it never sees a position where a win is left open. Is there any way to discourage this behavior apart from hard-coding a 'win-lose check' that prioritizes completing 3-in-a-rows first?
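For reference, the kind of 'win-lose check' I have in mind could look roughly like the sketch below. It is a minimal, standalone illustration, not code from this repository: the board layout (a 6x7 list of lists with 0 for empty and 1 / -1 for the two players, row 0 at the top) and the names `drop_row`, `wins_at`, and `forced_move` are all assumptions made for the example.

```python
from typing import List, Optional

ROWS, COLS = 6, 7
# Assumed layout: board[row][col], row 0 is the top; 0 = empty, 1 / -1 = players.
Board = List[List[int]]


def drop_row(board: Board, col: int) -> Optional[int]:
    """Return the row a piece would land in for this column, or None if full."""
    for row in range(ROWS - 1, -1, -1):
        if board[row][col] == 0:
            return row
    return None


def wins_at(board: Board, row: int, col: int, player: int) -> bool:
    """Check whether placing `player` at (row, col) completes four in a row."""
    # Directions: horizontal, vertical, and the two diagonals.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1  # the piece being placed
        for sign in (1, -1):  # walk both ways along the line
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < ROWS and 0 <= c < COLS and board[r][c] == player:
                count += 1
                r, c = r + sign * dr, c + sign * dc
        if count >= 4:
            return True
    return False


def forced_move(board: Board, player: int) -> Optional[int]:
    """Return a column that wins immediately, else a column that blocks an
    immediate opponent win, else None (meaning: defer to the agent)."""
    for who in (player, -player):  # own wins are checked before blocks
        for col in range(COLS):
            row = drop_row(board, col)
            if row is not None and wins_at(board, row, col, who):
                return col
    return None
```

The idea would be to call `forced_move` before asking the agent for a move and to fall back to act() only when it returns `None`. This only covers one-move wins and blocks, so it would mask the symptom rather than fix whatever the network or search is missing.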

xuehy · 2018-12-23T07:10:36Z

I have run into similar cases in my experiments, so I am eager to know whether anyone has succeeded in training a model that consistently beats humans.
