RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 7.79 GiB total capacity; 6.03 GiB already allocated; 10.94 MiB free; 6.17 GiB reserved in total by PyTorch)
#20
Traceback (most recent call last):
  File "main.py", line 1755, in <module>
    one_train_test_run(args)
  File "main.py", line 1563, in one_train_test_run
    training_base_directory, best_dict = main(args)
  File "main.py", line 1141, in main
    trainer.backprop(prev_color_heightmap, prev_valid_depth_heightmap, prev_primitive_action, prev_best_pix_ind, label_value, goal_condition=prev_goal_condition)
  File "/home/khan/cop_ws/src/good_robot/trainer.py", line 717, in backprop
    push_predictions, grasp_predictions, place_predictions, state_feat, output_prob = self.forward(color_heightmap, depth_heightmap, is_volatile=False, specific_rotation=best_pix_ind[0], goal_condition=goal_condition)
  File "/home/khan/cop_ws/src/good_robot/trainer.py", line 445, in forward
    output_prob, state_feat = self.model.forward(input_color_data, input_depth_data, is_volatile, specific_rotation, goal_condition=goal_condition)
  File "/home/khan/cop_ws/src/good_robot/models.py", line 246, in forward
    interm_push_feat, interm_grasp_feat, interm_place_feat, tiled_goal_condition = self.layers_forward(rotate_theta, input_color_data, input_depth_data, goal_condition, tiled_goal_condition)
  File "/home/khan/cop_ws/src/good_robot/models.py", line 301, in layers_forward
    interm_place_depth_feat = self.place_depth_trunk.features(rotate_depth)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/khan/anaconda3/envs/raisim_env/lib/python3.8/site-packages/torchvision/models/densenet.py", line 33, in forward
    new_features = super(_DenseLayer, self).forward(x)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 135, in forward
    return F.batch_norm(
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 2149, in batch_norm
    return torch.batch_norm(
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 7.79 GiB total capacity; 6.09 GiB already allocated; 28.69 MiB free; 6.26 GiB reserved in total by PyTorch)
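Note the numbers in the error: the allocation that failed is tiny (2.00 MiB), but only 28.69 MiB of the 7.79 GiB card is still free because 6.26 GiB is already reserved by PyTorch for earlier activations. A quick back-of-envelope helper makes it clear how activations add up; this is a hypothetical sketch (not from the repository), and the (1, 3, 640, 640) shape is only an illustrative guess at an input size, not the repository's actual heightmap resolution:

```python
# Hypothetical helper (not part of the repository): estimate the memory
# footprint of a dense float32 tensor from its shape.
from math import prod

def tensor_mib(shape, bytes_per_elem=4):
    """Approximate size in MiB of a dense tensor with the given shape."""
    return prod(shape) * bytes_per_elem / (1024 ** 2)

# A single illustrative (1, 3, 640, 640) float32 input is under 5 MiB,
# but every intermediate feature map of every layer stays resident for
# backprop, which is how training fills the GPU long before one small
# 2 MiB allocation finally fails.
print(round(tensor_mib((1, 3, 640, 640)), 2))  # 4.69
```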
After reading some blog posts, I found a discussion here pointing out that this error is caused by the batch size, and that it can be fixed by reducing the batch size. To try that, I looked through the trainer.py file and found the following section of code:
    # Construct minibatch of size 1 (b, c, h, w)
    input_color_image.shape = (input_color_image.shape[0], input_color_image.shape[1], input_color_image.shape[2], 1)
    input_depth_image.shape = (input_depth_image.shape[0], input_depth_image.shape[1], input_depth_image.shape[2], 1)
    input_color_data = torch.from_numpy(input_color_image.astype(np.float32)).permute(3, 2, 0, 1)
    input_depth_data = torch.from_numpy(input_depth_image.astype(np.float32)).permute(3, 2, 0, 1)

    if self.flops:
        # sorry for the super random code here, but this is where we will check the
        # floating point operations (flops) counts and parameters counts for now...
        print('input_color_data trainer: ' + str(input_color_data.size()))
        class Wrapper(object):
            custom_params = {'input_color_data': input_color_data, 'input_depth_data': input_depth_data, 'goal_condition': goal_condition}
        def input_constructor(shape):
            return Wrapper.custom_params
        flops, params = get_model_complexity_info(self.model, color_heightmap.shape, as_strings=True, print_per_layer_stat=True, input_constructor=input_constructor)
        print('flops: ' + flops + ' params: ' + params)
        exit(0)

    # Pass input data through model
    output_prob, state_feat = self.model.forward(input_color_data, input_depth_data, is_volatile, specific_rotation, goal_condition=goal_condition)
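As the comment in the snippet says, this code already constructs a minibatch of size 1, so there is no batch dimension left to divide. The shape juggling can be sketched in NumPy (toy dimensions, with np.transpose standing in for torch's permute):

```python
import numpy as np

# Sketch of what the trainer snippet does to a single image.
# Toy dimensions, not the repository's real heightmap size.
H, W, C = 4, 5, 3
image = np.zeros((H, W, C), dtype=np.float32)

# Append a trailing batch axis of length 1: (h, w, c) -> (h, w, c, 1)
batched = image.reshape(H, W, C, 1)

# Reorder axes (3, 2, 0, 1), the same order as permute(3, 2, 0, 1),
# giving the (b, c, h, w) layout PyTorch models expect.
nchw = np.transpose(batched, (3, 2, 0, 1))
print(nchw.shape)  # (1, 3, 4, 5) -> a batch containing exactly one image
```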
I think I need to change something here to reduce the memory usage, but I am not sure what. Can anyone help, please?
For context: I was trying to follow this repository, and the error above occurred when I ran

export CUDA_VISIBLE_DEVICES="0" && python3 main.py --is_sim --obj_mesh_dir objects/blocks --num_obj 4 --push_rewards --experience_replay --explore_rate_decay --check_row --tcp_port 19997 --place --future_reward_discount 0.65 --max_train_actions 20000 --random_actions --common_sense --trial_reward