I prepared a public dataset (which includes many bank documents and medicine wrappers) and converted it to ICDAR2015 format.
I wrote the config file below and then tried to train the DBNet++ model.
However, training raises an error that I can't fix.
Could you help me fix this error? Thanks.
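For reference, here is a minimal sketch of how the converted annotation files could be sanity-checked before training. It assumes the usual ICDAR2015 layout of one box per line, `x1,y1,x2,y2,x3,y3,x4,y4,transcription`; the helper names (`parse_icdar_line`, `quad_area`, `find_suspect_lines`) and the `min_area` threshold are hypothetical, not part of MMOCR. Boxes with near-zero area are one possible, unconfirmed source of geometry warnings like the shapely ones in the log further down.

```python
# Sketch only: sanity-check ICDAR2015-style annotation lines.
# Assumed layout per line: x1,y1,x2,y2,x3,y3,x4,y4,transcription
# All helper names here are hypothetical, not MMOCR APIs.


def parse_icdar_line(line):
    """Split one annotation line into four (x, y) points and the transcription."""
    # ICDAR2015 ground-truth files often start with a UTF-8 BOM; drop it.
    parts = line.strip().lstrip("\ufeff").split(",")
    if len(parts) < 9:
        raise ValueError(f"expected 8 coordinates and a transcription, got: {line!r}")
    coords = [float(v) for v in parts[:8]]
    points = list(zip(coords[0::2], coords[1::2]))
    # The transcription may itself contain commas, so re-join the remainder.
    text = ",".join(parts[8:])
    return points, text


def quad_area(points):
    """Absolute polygon area via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def find_suspect_lines(lines, min_area=1.0):
    """Return (line_number, reason) for lines that fail to parse or have tiny area."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            points, _ = parse_icdar_line(line)
        except ValueError as err:
            suspects.append((number, str(err)))
            continue
        if quad_area(points) < min_area:
            suspects.append((number, "near-zero area (degenerate or possibly self-crossing)"))
    return suspects


if __name__ == "__main__":
    sample = [
        "10,10,50,10,50,30,10,30,hello",
        "0,0,10,10,10,0,0,10,bowtie",  # corner order crosses itself
        "1,2,3",                        # too few fields
    ]
    for number, reason in find_suspect_lines(sample):
        print(f"line {number}: {reason}")
```

The area test only catches some self-crossing boxes (those whose signed area cancels out), so a clean result here does not prove every polygon is valid.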
System environment:
sys.platform: linux
Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
CUDA available: True
numpy_random_seed: 373617217
GPU 0: NVIDIA A100-SXM4-40GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.1, V11.1.105
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0
PyTorch compiling details: PyTorch built with:
GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2021.2-Product Build 20210312 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.1
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.0.5
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.10.0
OpenCV: 4.8.0
MMEngine: 0.8.4
Runtime environment:
cudnn_benchmark: False
mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
dist_cfg: {'backend': 'nccl'}
seed: 373617217
Distributed launcher: none
Distributed training: False
GPU number: 1
Code in config.py
python command
CUDA_VISIBLE_DEVICES=6 python tools/train.py configs/textdet/dbnetpp/dbnetpp_resnet50-dcnv2_fpnc_medfin.py --work-dir dbnetpp/
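The following error occurs
Training prints shapely "RuntimeWarning" messages and runs until the first epoch ends. Then, when the validation loop starts, it stops with "KeyError: 'inputs'", followed by a "RuntimeError" from a DataLoader worker at exit.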
/opt/conda/lib/python3.7/site-packages/shapely/set_operations.py:133: RuntimeWarning: invalid value encountered in intersection
return lib.intersection(a, b, **kwargs)
/opt/conda/lib/python3.7/site-packages/shapely/set_operations.py:133: RuntimeWarning: invalid value encountered in intersection
return lib.intersection(a, b, **kwargs)
/opt/conda/lib/python3.7/site-packages/shapely/set_operations.py:133: RuntimeWarning: invalid value encountered in intersection
return lib.intersection(a, b, **kwargs)
/opt/conda/lib/python3.7/site-packages/shapely/set_operations.py:133: RuntimeWarning: invalid value encountered in intersection
return lib.intersection(a, b, **kwargs)
10/11 06:16:28 - mmengine - INFO - Epoch(train) [1][485/500] lr: 1.0000e-03 eta: 5:54:33 time: 1.6872 data_time: 1.1822 memory: 2318 loss: 5.0179 loss_prob: 2.8286 loss_thr: 1.1893 loss_db: 1.0000
10/11 06:16:31 - mmengine - INFO - Epoch(train) [1][490/500] lr: 1.0000e-03 eta: 5:53:28 time: 1.6945 data_time: 1.1353 memory: 2318 loss: 5.0350 loss_prob: 2.8305 loss_thr: 1.2060 loss_db: 0.9985
10/11 06:16:34 - mmengine - INFO - Epoch(train) [1][495/500] lr: 1.0000e-03 eta: 5:51:35 time: 0.5262 data_time: 0.0041 memory: 2318 loss: 5.0349 loss_prob: 2.8268 loss_thr: 1.2096 loss_db: 0.9985
10/11 06:16:36 - mmengine - INFO - Exp name: dbnetpp_resnet50-dcnv2_fpnc_medfin_20231011_060914
10/11 06:16:36 - mmengine - INFO - Epoch(train) [1][500/500] lr: 1.0000e-03 eta: 5:50:01 time: 0.4604 data_time: 0.0043 memory: 2318 loss: 5.0168 loss_prob: 2.8240 loss_thr: 1.1928 loss_db: 1.0000
Traceback (most recent call last):
File "tools/train.py", line 114, in <module>
main()
File "tools/train.py", line 110, in main
runner.train()
File "/opt/conda/lib/python3.7/site-packages/mmengine/runner/runner.py", line 1745, in train
model = self.train_loop.run() # type: ignore
File "/opt/conda/lib/python3.7/site-packages/mmengine/runner/loops.py", line 102, in run
self.runner.val_loop.run()
File "/opt/conda/lib/python3.7/site-packages/mmengine/runner/loops.py", line 363, in run
self.run_iter(idx, data_batch)
File "/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/mmengine/runner/loops.py", line 383, in run_iter
outputs = self.runner.model.val_step(data_batch)
File "/opt/conda/lib/python3.7/site-packages/mmengine/model/base_model/base_model.py", line 132, in val_step
data = self.data_preprocessor(data, False)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/OCR/lyh/mmocr/mmocr/models/textdet/data_preprocessors/data_preprocessor.py", line 86, in forward
data = super().forward(data=data, training=training)
File "/opt/conda/lib/python3.7/site-packages/mmengine/model/base_model/data_preprocessor.py", line 247, in forward
_batch_inputs = data['inputs']
KeyError: 'inputs'
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/multiprocessing/popen_fork.py", line 28, in poll
pid, sts = os.waitpid(self.pid, flag)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 3181791) is killed by signal: Terminated.
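The traceback fails at `data['inputs']` inside `val_step`, so the batch coming out of the validation dataloader had no `inputs` key. In MMOCR 1.x that key is normally produced by the last transform of a pipeline (for text detection, `PackTextDetInputs`), so one thing worth checking is whether the val/test pipeline ends with a packing transform. The config is not shown here, so this is only a guess at the cause. Below is a minimal sketch of such a check; `has_pack_transform` is a hypothetical helper, and both pipelines are illustrative stand-ins rather than the actual config.

```python
# Sketch only: check whether a pipeline (a list of transform dicts) ends with
# a packing transform. `has_pack_transform` is a hypothetical helper, and the
# pipelines below are illustrative, not the config used in this post.


def has_pack_transform(pipeline, prefix="Pack"):
    """Return True if the last transform's type name starts with `prefix`."""
    if not pipeline:
        return False
    return str(pipeline[-1].get("type", "")).startswith(prefix)


train_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(type="LoadOCRAnnotations", with_bbox=True, with_polygon=True, with_label=True),
    dict(type="PackTextDetInputs"),
]

# Deliberately missing the final packing step, to show what the check reports.
val_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(type="Resize", scale=(4068, 1024), keep_ratio=True),
]

for name, pipeline in [("train", train_pipeline), ("val", val_pipeline)]:
    status = "ok" if has_pack_transform(pipeline) else "no Pack* transform at the end"
    print(f"{name}: {status}")
```

With a real config, the same check could be applied to the pipeline lists read via `mmengine.config.Config.fromfile(...)`, for example under `val_dataloader` and `test_dataloader`; where exactly the `pipeline` key sits depends on how the dataset is wrapped.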