Replies: 2 comments 6 replies
-
Try removing the space near the
-
@RRThivyan, @GreatV, I have the same issue: the image folder is reported as not found even though the images are inside the PaddleOCR folder. Could you show me how you set up your dataset and the multi-language file? I use the following format: "image_name label".
-
Hello community,
I am trying to read handwritten text from an image. The base model reads clearly legible text well, but it performs poorly on cursive writing. I have a dataset to train the model, but I am getting an error at the configuration level. I am not sure whether the dataset I created is incorrect or whether there is some other issue. Kindly help me with this. Also, please let me know where I can find materials on fine-tuning, if possible.
The labels I created are as follows:
gs://iam_words_images/images/a01-000u-00-00.png A
gs://iam_words_images/images/a01-000u-00-01.png MOVE
Each line contains the path of the image and the text present in the image. Do I need to include any further details apart from this?
This is the command I execute to run the train.py script and start the fine-tuning.
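A label file in this format can be sanity-checked before training. The sketch below is not PaddleOCR code: `check_label_file` is a hypothetical helper, and it assumes (as I understand the `SimpleDataSet` loader to work) that each line is split on a tab character and that the image path is joined to `data_dir` and looked up on the local filesystem. Under those assumptions, a space-separated line or a `gs://` path would both be rejected.

```python
import os
import tempfile


def check_label_file(label_path, data_dir, delimiter="\t"):
    """Return a list of (line_number, problem) for a recognition label file.

    Assumption: every line is '<image path><TAB><label>' and the image path
    is relative to data_dir on the local filesystem.
    """
    problems = []
    with open(label_path, encoding="utf-8") as f:
        for n, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # ignore blank lines
            parts = line.split(delimiter)
            if len(parts) != 2:
                problems.append((n, "expected '<image path><TAB><label>'"))
                continue
            img_path = os.path.join(data_dir, parts[0])
            if not os.path.exists(img_path):
                problems.append((n, f"image not found: {img_path}"))
    return problems


# Demo on a throwaway directory: one good line, one space-separated line,
# and one line that points at a bucket URL instead of a local file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "images"))
    open(os.path.join(root, "images", "a01-000u-00-00.png"), "wb").close()
    label_path = os.path.join(root, "train.txt")
    with open(label_path, "w", encoding="utf-8") as f:
        f.write("images/a01-000u-00-00.png\tA\n")
        f.write("images/a01-000u-00-01.png MOVE\n")
        f.write("gs://iam_words_images/images/a01-000u-00-01.png\tMOVE\n")
    problems = check_label_file(label_path, root)

for line_number, problem in problems:
    print(f"line {line_number}: {problem}")
```

If the loader really does read from the local filesystem, the images and label files would need to be copied out of the bucket (or mounted locally) before training.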
python3 /home/thivyan/PaddleOCR/tools/train.py -c /home/thivyan/PaddleOCR/configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model= /home/thivyan/en_PP-OCRv3_rec_train/best_accuracy
The error I am getting is as follows. It seems to say that some detail is missing in the config file, but I don't know which one.
Traceback (most recent call last):
  File "/home/thivyan/PaddleOCR/tools/train.py", line 252, in <module>
    config, device, logger, vdl_writer = program.preprocess(is_train=True)
  File "/home/thivyan/PaddleOCR/tools/program.py", line 712, in preprocess
    FLAGS = ArgsParser().parse_args()
  File "/home/thivyan/PaddleOCR/tools/program.py", line 58, in parse_args
    args.opt = self._parse_opt(args.opt)
  File "/home/thivyan/PaddleOCR/tools/program.py", line 67, in _parse_opt
    k, v = s.split("=")
ValueError: not enough values to unpack (expected 2, got 1)
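The `k, v = s.split("=")` line in the traceback suggests the failure happens while parsing the `-o` overrides, before the YAML file matters. A minimal sketch of what is probably going on: the space after `=` makes the shell pass the path as a separate token, and that token contains no `=`. The functions `parse_opt` and `opts_after_flag` are hypothetical stand-ins modeled on the traceback, not the actual PaddleOCR implementation, and the paths are shortened placeholders. It also assumes that `-o` collects every token that follows it.

```python
import shlex


def parse_opt(opts):
    """Mimic the `k, v = s.split("=")` step shown in the traceback."""
    config = {}
    for s in opts:
        k, v = s.strip().split("=")  # raises ValueError if s has no "="
        config[k] = v
    return config


def opts_after_flag(command, flag="-o"):
    """Return the shell tokens that follow `flag` in a command line."""
    tokens = shlex.split(command)
    return tokens[tokens.index(flag) + 1:]


# Same shape as the command in the post, with a space after "=".
broken = "python3 tools/train.py -c config.yml -o Global.pretrained_model= /models/best_accuracy"
# Same command with the space removed.
fixed = "python3 tools/train.py -c config.yml -o Global.pretrained_model=/models/best_accuracy"

print(opts_after_flag(broken))  # the path arrives as its own token
try:
    parse_opt(opts_after_flag(broken))
except ValueError as err:
    print("broken:", err)

print("fixed:", parse_opt(opts_after_flag(fixed)))
```

In this sketch the first token, `Global.pretrained_model=`, still splits into two parts (the value is just empty); it is the bare path token that fails to unpack.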
Below is the YAML file I created for this purpose.
Global:
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_en_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/en_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: gs://iam_words_images/ocr_output/rec/predicts_ppocrv3_en.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length
Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False
Train:
  dataset:
    name: SimpleDataSet
    data_dir: gs://iam_words_images/train_set
    ext_op_transform_idx: 1
    label_file_list:
      - gs://iam_words_images/train.txt
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - RecConAug:
          prob: 0.5
          ext_data_num: 2
          image_shape: [48, 320, 3]
          max_text_length: *max_text_length
      - RecAug:
      - MultiLabelEncode:
      - RecResizeImg:
          image_shape: [3, 48, 320]
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_sar
            - length
            - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: gs://iam_words_images/val_set
    label_file_list:
      - gs://iam_words_images/val.txt
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - MultiLabelEncode:
      - RecResizeImg:
          image_shape: [3, 48, 320]
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_sar
            - length
            - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4