Cube stacking training - EfficientNet-B0 - trial reward
Pre-release
Note that this is a brief run (6k iterations) compared with most others (20-30k iterations), so we cannot yet tell whether it will become the best stacking neural network model. However, progress so far is promising.
Command:
export CUDA_VISIBLE_DEVICES="0" && python3 main.py --is_sim --obj_mesh_dir 'objects/blocks' --num_obj 4 --push_rewards --experience_replay --explore_rate_decay --trial_reward --place
Output:
Training iteration: 6209
WARNING variable mismatch num_trials + 1: 583 nonlocal_variables[stack].trial: 584
Change detected: True (value: 645)
Primitive confidence scores: 1.990686 (push), 3.271586 (grasp), 3.296898 (place)
Action: place at (0, 116, 75)
Executing: place at (-0.574000, 0.008000, 0.102957)
gripper position: 0.0036782920360565186
gripper position: 0.003635406494140625
Trainer.get_label_value(): Current reward: 1.562500 Current reward multiplier: 2.000000 Predicted Future reward: 3.760068 Expected reward: 1.562500 + 0.500000 x 3.760068 = 3.442534
Training loss: 0.325041
Experience replay 11110: history timestep index 3430, action: place, surprise value: 2.583875
current_position: [-0.48557281 -0.14327906 0.03296793]
current_obj_z_location: 0.0629679325222969
goal_position: 0.16295654107920776 goal_position_margin: 0.26295654107920774
has_moved: True near_goal: False place_success: False
check_stack() False, not enough nearby objects for a successful stack! expected at least 3 nearby objects, but only counted: 1
check_stack() current detected stack height: 1
check_stack() stack_height: 1 stack matches current goal: False partial_stack_success: False Does the code think a reset is needed: True
main.py check_stack() DETECTED A MISMATCH between the goal height: 2 and current workspace stack height: 1, RESETTING the objects, goals, and action success to FALSE...
Training loss: 0.660970
Experience replay 11111: history timestep index 177, action: push, surprise value: 2.441809
Training loss: 0.015068
Experience replay 11112: history timestep index 459, action: place, surprise value: 2.292749
STACK: trial: 585 actions/partial: 7.446043165467626 actions/full stack: 69.7752808988764 (lower is better) Grasp Count: 2360, grasp success rate: 0.6029661016949153 place_on_stack_rate: 0.5919091554293825 place_attempts: 1409 partial_stack_successes: 834 stack_successes: 89 trial_success_rate: 0.15213675213675212 stack goal: [1 2 0] current_height: 1
Training loss: 0.001867
Time elapsed: 12.592669
Trainer iteration: 6210.000000
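
The arithmetic behind two of the log lines above can be checked with a short sketch. This is not the code in main.py or the Trainer class. The function and parameter names are hypothetical, the 0.5 factor is simply the value printed in the `Trainer.get_label_value()` line, and the use of the trainer iteration count (6210) as the action count is an assumption inferred from the printed ratios.

```python
def expected_reward(current_reward, future_reward, discount=0.5):
    """Label value as printed in the log: current + discount * predicted future.

    The 0.5 default is the factor shown in the log line; it is an
    assumption that this is a fixed discount rather than a tuned value.
    """
    return current_reward + discount * future_reward


def stack_stats(actions, place_attempts, partial_successes, full_stacks, trials):
    """Recompute the ratios reported on the STACK line (hypothetical helper)."""
    return {
        "actions/partial": actions / partial_successes,
        "actions/full stack": actions / full_stacks,
        "place_on_stack_rate": partial_successes / place_attempts,
        "trial_success_rate": full_stacks / trials,
    }


# Values copied from the log output above.
label = expected_reward(current_reward=1.5625, future_reward=3.760068)
stats = stack_stats(
    actions=6210,           # assumed equal to the trainer iteration count
    place_attempts=1409,
    partial_successes=834,
    full_stacks=89,
    trials=585,
)

print(f"expected reward: {label:.6f}")
for name, value in stats.items():
    print(f"{name}: {value}")
```

With these inputs the recomputed values agree with the logged ones to within rounding, which suggests that the efficiency ratios are counted per action over the whole run and that the trial success rate is full stacks divided by trials.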