What is the role of the actor network in the training of a PPO agent? #918
Hi, this question may already have been answered, but I was unable to find it. Thanks in advance for the answer :)
Replies: 2 comments 5 replies
@101AlexMartin If you don't need further clarification about the role of the actor network in the training of a PPO agent, please mark the question as answered. Thanks.
Basically, the actor network determines which actions the agent takes in its environment: it represents the policy, mapping the current state to a distribution over actions.
Once the actor network has selected actions according to the policy, the value network estimates the expected cumulative reward (the value prediction; see the comment at the top of ppo_agent.py) for the states in which those actions were taken. Comparing the return actually observed with that estimate gives the advantage, and the actor network's parameters are then adjusted so that actions with a higher advantage become more likely.
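To make that interplay concrete, here is a minimal sketch in plain Python. It is not the code from ppo_agent.py: the helper names (`softmax`, `clipped_surrogate`), the numbers, and the one-step advantage (observed return minus value prediction) are all assumptions chosen for illustration, and real PPO implementations often use a more elaborate advantage estimate.

```python
import math

def softmax(logits):
    """Turn the actor's raw outputs (logits) into action probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def clipped_surrogate(new_prob, old_prob, advantage, clip_eps=0.2):
    """PPO clipped surrogate objective for a single (state, action) pair."""
    ratio = new_prob / old_prob
    clipped = max(min(ratio, 1.0 + clip_eps), 1.0 - clip_eps)
    return min(ratio * advantage, clipped * advantage)

# Hypothetical data for one step with two possible actions.
old_logits = [0.0, 0.0]   # actor output when the action was sampled
action = 1                # action the agent actually took
observed_return = 1.0     # cumulative reward that followed (made up)
value_prediction = 0.4    # value network's estimate for that state (made up)

# Simplified advantage: how much better the outcome was than predicted.
advantage = observed_return - value_prediction

# Suppose an update nudges the actor towards the taken action.
new_logits = [0.0, 0.1]
old_prob = softmax(old_logits)[action]
new_prob = softmax(new_logits)[action]

# Positive advantage: raising the action's probability raises the objective.
objective = clipped_surrogate(new_prob, old_prob, advantage)
print(f"advantage={advantage:.2f} old_prob={old_prob:.3f} "
      f"new_prob={new_prob:.3f} objective={objective:.3f}")
```

The clipping keeps the new policy close to the one that collected the data: once the probability ratio leaves the range [1 - clip_eps, 1 + clip_eps], pushing it further in the favourable direction no longer improves the objective.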
In summary:
The actor network (actor_net) selects actions within the environment, taking the current state as input.
The value network estimates the expected future reward f…