add version branch #26

Open. Wants to merge 4 commits into base: master
27 changes: 27 additions & 0 deletions README.md
@@ -4,6 +4,10 @@

***
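Recent updates:
- 2021.05.19 Added DBNet-based multilingual text detection.
- 2021.05.01 Updated CRNN training: fixed the multi-GPU training issue and switched to LMDB-based training. Images must first be converted to LMDB (the script folder contains multi-process code for converting images to LMDB). Also made some training optimizations and changed the model structure (use the yaml files with lmdb in their names for training). Actual training results are shown in the table below.
- 2021.03.26 Updated CRNN training results; code uploaded after cleanup
- 2021.03.06 Updated CRNN backbones (resnet and mobilev3) and config files
- 2020.12.22 Added CRNN+CTCLoss+CenterLoss training
- 2020.09.18 Updated the text detection documentation
- 2020.09.12 Updated training/testing code and pretrained models for DB, pse, pan, sast, crnn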
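
The 2021.05.01 entry says images must be converted to LMDB before CRNN training. Below is a minimal sketch of what such a conversion might produce, assuming the `image-%09d` / `label-%09d` / `num-samples` key layout that is common in open-source CRNN tooling. This PR does not show the repo's actual layout, and `build_lmdb_cache` is a hypothetical helper, not a function from the repo.

```python
from typing import Dict, Iterable, Tuple


def build_lmdb_cache(samples: Iterable[Tuple[bytes, str]]) -> Dict[bytes, bytes]:
    """Map (encoded image bytes, label) pairs to LMDB-style key/value pairs.

    Assumed key layout (not confirmed by this PR):
      image-000000001 -> raw image file bytes
      label-000000001 -> UTF-8 encoded transcription
      num-samples     -> total number of samples
    """
    cache: Dict[bytes, bytes] = {}
    count = 0
    for image_bytes, label in samples:
        count += 1  # keys are 1-indexed in this assumed layout
        cache[b"image-%09d" % count] = image_bytes
        cache[b"label-%09d" % count] = label.encode("utf-8")
    cache[b"num-samples"] = str(count).encode("utf-8")
    return cache


# Placeholder byte strings stand in for real encoded image files.
demo = build_lmdb_cache([(b"<jpeg bytes 1>", "hello"), (b"<jpeg bytes 2>", "world")])
print(sorted(demo))
```

With the `lmdb` package, these pairs could then be written inside a write transaction (`with env.begin(write=True) as txn: txn.put(key, value)`) on an environment opened with `lmdb.open(path, map_size=...)`.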
@@ -27,6 +31,17 @@
- [ ] Train a general-purpose OCR model
- [ ] Deploy in combination with chinese_lite
- [ ] Mobile deployment
***
### CRNN model results (experimental)
Trained on MJSynth (MJ) and SynthText (ST) with batchsize=512, and tested on the following datasets:

| Model | Iterations | CUTE80 | IC03_867 |IC13_1015|IC13_857|IC15_1811|IC15_2077|IIIT5k_3000|SVT|SVTP|mean|
|-|-|-|-|-|-|-|-|-|-|-|-|
| resnet34+lstm+ctc |120000| 82.98| 91.92|90.93|91.59|73.10|67.98|90.16|85.16|78.29|83.56|
| mobilev3_large+lstm+ctc | 210000| 73.61| 92.50|90.34|91.59|74.82|68.89|87.56|83.46|77.20|82.21|
| mobilev3_small+lstm+ctc | 210000| 66.31| 90.77|88.76|91.13|73.66|69.52|88.80|84.54|72.24|80.64|
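
The `mean` column appears to be the unweighted average of the nine benchmark columns. The sketch below recomputes it from the values in the table as a sanity check; the averaging rule is an assumption inferred from the numbers, not something the README states.

```python
# Per-benchmark accuracies copied from the table, in column order:
# CUTE80, IC03_867, IC13_1015, IC13_857, IC15_1811, IC15_2077,
# IIIT5k_3000, SVT, SVTP. The second element is the reported mean.
rows = {
    "resnet34+lstm+ctc": (
        [82.98, 91.92, 90.93, 91.59, 73.10, 67.98, 90.16, 85.16, 78.29], 83.56),
    "mobilev3_large+lstm+ctc": (
        [73.61, 92.50, 90.34, 91.59, 74.82, 68.89, 87.56, 83.46, 77.20], 82.21),
    "mobilev3_small+lstm+ctc": (
        [66.31, 90.77, 88.76, 91.13, 73.66, 69.52, 88.80, 84.54, 72.24], 80.64),
}


def unweighted_mean(scores):
    """Plain average that ignores the differing test-set sizes."""
    return sum(scores) / len(scores)


for name, (scores, reported) in rows.items():
    computed = unweighted_mean(scores)
    print(f"{name}: computed {computed:.3f}, reported {reported}")
```

The computed averages agree with the reported column to within 0.01, so the mean does not seem to be weighted by the number of test images per benchmark.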


***
### Detection model results (experimental)

@@ -93,6 +108,18 @@
<img src="./doc/show/ocr1.jpg" width=600 height=600 />
<img src="./doc/show/ocr2.jpg" width=600 height=600 />

***

### DBNet multilingual text detection results

#### Generated dataset:
<img src="./doc/show/2.jpg" width=600 height=600 />

#### Public dataset:
<img src="./doc/show/1.jpg" width=600 height=600 />
<img src="./doc/show/3.jpg" width=500 height=600 />


***

### For questions and discussion, add us on WeChat
Binary file added bg_img/1.jpg
Binary file added bg_img/2.jpg
Binary file added bg_img/3.jpg
Binary file added bg_img/4.jpg
Binary file added bg_img/5.jpg
Binary file added bg_img/6.jpg
Binary file added bg_img/7.jpg
Binary file added bg_img/8.jpg
Binary file added bg_img/9.jpg
Empty file removed checkpoint/新建文本文档.txt
Empty file.
8 changes: 4 additions & 4 deletions config/det_DB_mobilev3.yaml
@@ -9,12 +9,12 @@ base:
crop_shape: [640,640]
shrink_ratio: 0.4
n_epoch: 1200
-start_val: 400
+start_val: 700
show_step: 20
checkpoints: ./checkpoint
save_epoch: 100
-restore: True
-restore_file : ./checkpoint/ag_DB_bb_mobilenet_v3_small_he_DB_Head_bs_16_ep_1200_mobile_slim_all/DB_best.pth.tar
+restore: False
+restore_file : ./checkpoint/DB_best.pth.tar

backbone:
function: ptocr.model.backbone.det_mobilev3,mobilenet_v3_small
@@ -82,6 +82,6 @@ postprocess:
min_size: 3

infer:
-model_path: './checkpoint/ag_DB_bb_mobilenet_v3_small_he_DB_Head_bs_16_ep_1200/DB_best.pth.tar'
+model_path: './checkpoint/DB_best.pth.tar'
path: '/src/notebooks/detect_text/icdar2015/ch4_test_images'
save_path: './result'
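
Each `function:` entry in these configs pairs a dotted module path with an attribute name, separated by a comma (for example `ptocr.model.backbone.det_mobilev3,mobilenet_v3_small`). The sketch below shows one way such a string could be resolved with `importlib`. `load_function` is a hypothetical helper, and it is an assumption that the repo resolves these entries in a similar way.

```python
import importlib


def load_function(spec: str):
    """Resolve a 'package.module,attribute' string to the named object."""
    module_path, _, attr_name = spec.partition(",")
    if not attr_name:
        raise ValueError(f"expected 'module,attribute', got {spec!r}")
    # import_module needs a valid dotted module path.
    module = importlib.import_module(module_path.strip())
    return getattr(module, attr_name.strip())


# Demonstrated on standard-library modules so the sketch runs without ptocr.
ordered_dict_cls = load_function("collections,OrderedDict")
sqrt = load_function("math,sqrt")
print(ordered_dict_cls.__name__, sqrt(9))
```

Under this reading, the module part of every `function:` value has to be an importable module name, and the part after the comma has to exist as an attribute of that module.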
87 changes: 0 additions & 87 deletions config/det_DB_mobilev3_common.yaml

This file was deleted.

88 changes: 0 additions & 88 deletions config/det_DB_mobilev3_pytorch_qua.yaml

This file was deleted.

20 changes: 10 additions & 10 deletions config/det_DB_resnet50_3_3.yaml
@@ -1,27 +1,27 @@
base:
gpu_id: '0'
algorithm: DB
-pretrained: True
+pretrained: False
in_channels: [256, 512, 1024, 2048]
inner_channels: 256
k: 50
adaptive: True
crop_shape: [640,640]
shrink_ratio: 0.4
-n_epoch: 1201
-start_val: 400
+n_epoch: 600
+start_val: 6000
show_step: 20
checkpoints: ./checkpoint
save_epoch: 100
restore: False
restore_file : ./DB.pth.tar

backbone:
-function: ptocr.model.backbone.det_resnet_3*3,resnet50
+function: ptocr.model.backbone.det_resnet_3_3,resnet50

head:
-function: ptocr.model.head.det_DBHead,DB_Head
-# function: ptocr.model.head.det_FPEM_FFM_Head,FPEM_FFM_Head
+# function: ptocr.model.head.det_DBHead,DB_Head
+function: ptocr.model.head.det_FPEM_FFM_Head,FPEM_FFM_Head
# function: ptocr.model.head.det_FPNHead,FPN_Head

segout:
@@ -59,7 +59,7 @@ optimizer_decay:

trainload:
function: ptocr.dataloader.DetLoad.DBProcess,DBProcessTrain
-train_file: /src/notebooks/detect_text/icdar2015/train_list.txt
+train_file: /src/notebooks/MyworkData/huayandang/train_list.txt
num_workers: 10
batch_size: 8

@@ -75,10 +75,10 @@ testload:
postprocess:
function: ptocr.postprocess.DBpostprocess,DBPostProcess
is_poly: False
-thresh: 0.5
-box_thresh: 0.6
+thresh: 0.2
+box_thresh: 0.3
max_candidates: 1000
-unclip_ratio: 2
+unclip_ratio: 1.5
min_size: 3

infer:
89 changes: 89 additions & 0 deletions config/det_DB_resnet50_mul.yaml
@@ -0,0 +1,89 @@
base:
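gpu_id: '1' # GPU id(s) used for training; for multi-GPU training set to '0,1,2'
algorithm: DB # algorithm name
pretrained: True # whether to load pretrained weights
in_channels: [256, 512, 1024, 2048] #
inner_channels: 256 #
k: 50
n_class: 3
adaptive: True
crop_shape: [640,640] # size of the image crop used during training
shrink_ratio: 0.4 # inward shrink ratio of the kernel
n_epoch: 1200 # number of training epochs
start_val: 400 # epoch at which validation starts; to skip validation, set a value greater than n_epoch
show_step: 20 # print the loss every this many iterations
checkpoints: ./checkpoint # directory where models are saved
save_epoch: 100 # save the model every this many epochs
restore: False # whether to resume training
restore_file : ./DB.pth.tar # path of the model to load when resuming training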

backbone:
function: ptocr.model.backbone.det_resnet,resnet50

head:
function: ptocr.model.head.det_DBHead,DB_Head
# function: ptocr.model.head.det_FPEM_FFM_Head,FPEM_FFM_Head
# function: ptocr.model.head.det_FPNHead,FPN_Head

segout:
function: ptocr.model.segout.det_DB_segout,SegDetectorMul

architectures:
model_function: ptocr.model.architectures.det_model,DetModel
loss_function: ptocr.model.architectures.det_model,DetLoss

loss:
function: ptocr.model.loss.db_loss,DBLossMul
l1_scale: 10
bce_scale: 1
class_scale: 1

#optimizer:
# function: ptocr.optimizer,AdamDecay
# base_lr: 0.002
# beta1: 0.9
# beta2: 0.999

optimizer:
function: ptocr.optimizer,SGDDecay
base_lr: 0.002
momentum: 0.99
weight_decay: 0.0005

optimizer_decay:
function: ptocr.optimizer,adjust_learning_rate_poly
factor: 0.9

#optimizer_decay:
# function: ptocr.optimizer,adjust_learning_rate
# schedule: [1,2]
# gama: 0.1

trainload:
function: ptocr.dataloader.DetLoad.DBProcess,DBProcessTrainMul
train_file: /src/notebooks/fangxuwei_96/TextGenerator-master/output/train/train_list.txt
num_workers: 10
batch_size: 8

testload:
function: ptocr.dataloader.DetLoad.DBProcess,DBProcessTest
test_file: /src/notebooks/detect_text/icdar2015/test_list.txt
test_gt_path: /src/notebooks/detect_text/icdar2015/ch4_test_gts/
test_size: 736
stride: 32
num_workers: 5
batch_size: 4

postprocess:
function: ptocr.postprocess.DBpostprocess,DBPostProcessMul
is_poly: False # at test time, set to True to detect curved text; otherwise rectangular boxes are output
thresh: 0.5
box_thresh: 0.6
max_candidates: 1000
unclip_ratio: 2
min_size: 3

infer:
model_path: './checkpoint/ag_DB_bb_resnet50_he_DB_Head_bs_8_ep_601_train_mul/DB_best.pth.tar'
path: '/src/notebooks/fangxuwei_96/TextGenerator-master/output/img/'
save_path: './result'
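
The `shrink_ratio` and `unclip_ratio` settings can be read against the offset formulas from the DB paper: training labels shrink each text polygon inward by D = A(1 - r^2) / L, and post-processing expands each detected region outward by D' = A' * r' / L', where A is the polygon area and L its perimeter. The sketch below evaluates both offsets for a sample box. The helper names are hypothetical, and it is an assumption that `DBProcessTrainMul` and `DBPostProcessMul` follow the paper's formulas.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Absolute polygon area via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points: List[Point]) -> float:
    """Sum of edge lengths, closing the polygon back to its first vertex."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1])
    )


def shrink_offset(points: List[Point], shrink_ratio: float) -> float:
    """Inward offset used to build the shrunk training kernel."""
    return polygon_area(points) * (1 - shrink_ratio ** 2) / polygon_perimeter(points)


def unclip_offset(points: List[Point], unclip_ratio: float) -> float:
    """Outward offset applied to a detected region at inference time."""
    return polygon_area(points) * unclip_ratio / polygon_perimeter(points)


# A 100 x 20 axis-aligned text box: area 2000, perimeter 240.
box = [(0.0, 0.0), (100.0, 0.0), (100.0, 20.0), (0.0, 20.0)]
print("shrink offset (shrink_ratio=0.4):", shrink_offset(box, 0.4))
print("unclip offset (unclip_ratio=2):", unclip_offset(box, 2))
print("unclip offset (unclip_ratio=1.5):", unclip_offset(box, 1.5))
```

Under these formulas a larger `unclip_ratio` expands predicted boxes further, so lowering it from 2 to 1.5 would give tighter boxes for the same probability map.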