RuntimeError('cannot join current thread',) in <object repr() failed> #50

SeekPoint · 2018-12-29T08:51:38Z

(.venv) ub16c9@ub16c9-gpu:~/ub16_prj/QANet$ python config.py --mode train
Building model...
WARNING:tensorflow:From /home/ub16c9/ub16_prj/QANet/layers.py:52: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
WARNING:tensorflow:From /home/ub16c9/ub16_prj/QANet/model.py:134: calling softmax (from tensorflow.python.ops.nn_ops) with dim is deprecated and will be removed in a future version.
Instructions for updating:
dim is deprecated, use axis instead
WARNING:tensorflow:From /home/ub16c9/ub16_prj/QANet/model.py:174: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

Total number of trainable parameters: 788673
2018-12-29 11:14:48.345129: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-12-29 11:14:48.431530: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-29 11:14:48.431955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.6575
pciBusID: 0000:01:00.0
totalMemory: 10.92GiB freeMemory: 10.43GiB
2018-12-29 11:14:48.431971: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-12-29 11:14:48.733045: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-29 11:14:48.733079: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-12-29 11:14:48.733085: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-12-29 11:14:48.733318: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10086 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-12-29 11:14:50.042331: W tensorflow/core/framework/allocator.cc:122] Allocation of 109906800 exceeds 10% of system memory.
2018-12-29 11:14:50.174758: W tensorflow/core/framework/allocator.cc:122] Allocation of 109906800 exceeds 10% of system memory.
2018-12-29 11:14:50.507489: W tensorflow/core/framework/allocator.cc:122] Allocation of 109906800 exceeds 10% of system memory.
2018-12-29 11:14:50.691090: W tensorflow/core/framework/allocator.cc:122] Allocation of 109906800 exceeds 10% of system memory.
2018-12-29 11:14:50.825623: W tensorflow/core/framework/allocator.cc:122] Allocation of 109906800 exceeds 10% of system memory.
55%|██████████████████████████████████████████████████████████████████████████████████████▏ | 32935/60000 [3:15:35<2:19:53, 3.22it/s] 90%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 53999/60000 [5:17:29<29:48, 3.36it/sException RuntimeError: RuntimeError('cannot join current thread',) in <object repr() failed> ignored██████████████████████████████████████████████████████████████████████| 328/328 [00:36<00:00, 9.07it/s]
(.venv) ub16c9@ub16c9-gpu:~/ub16_prj/QANet$

The text was updated successfully, but these errors were encountered:

natureLanguageQing · 2019-04-24T07:39:26Z

我也遇到了一样的问题，减小了batch size之后就好了或者是重新运行一次。很偶然

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RuntimeError('cannot join current thread',) in <object repr() failed> #50

RuntimeError('cannot join current thread',) in <object repr() failed> #50

SeekPoint commented Dec 29, 2018

natureLanguageQing commented Apr 24, 2019

RuntimeError('cannot join current thread',) in <object repr() failed> #50

RuntimeError('cannot join current thread',) in <object repr() failed> #50

Comments

SeekPoint commented Dec 29, 2018

natureLanguageQing commented Apr 24, 2019