If this is your first time, please read our contributor guidelines:
https://github.com/mindspore-lab/mindcv/blob/main/CONTRIBUTING.md
Describe the bug/ 问题描述 (Mandatory / 必填)
When training vgg16 and vgg19 on a 5-class flower dataset, on both GPU and NPU (Ascend), the loss does not converge and the accuracy is wrong.
Hardware Environment (Ascend/GPU/CPU) / 硬件环境: Ascend/GPU
Software Environment / 软件环境 (Mandatory / 必填):
-- MindSpore version (e.g., 2.2.11) :
-- Python version (e.g., Python 3.9.18) :
-- OS platform and distribution (e.g., Linux Ubuntu 22.04):
-- GCC/Compiler version (if compiled from source):
Execute Mode / 执行模式 (Mandatory / 必填) (PyNative/Graph):
To Reproduce / 重现步骤 (Mandatory / 必填)
Steps to reproduce the behavior:
Train using the YAML config file.
Command: python train.py --config ./configs/vgg/vgg16_ascend.yaml
Expected behavior / 预期结果 (Mandatory / 必填)
A clear and concise description of what you expected to happen.
Screenshots/ 日志 / 截图 (Mandatory / 必填)
If applicable, add screenshots to help explain your problem.
YAML file contents:
# system
mode: 1
distribute: False
num_parallel_workers: 8
val_while_train: True

# dataset
dataset: 'imagenet'
data_dir: './imageNet'
shuffle: True
dataset_download: False
batch_size: 32
drop_remainder: True

# augmentation
image_resize: 224
scale: [0.08, 1.0]
ratio: [0.75, 1.333]
hflip: 0.5
interpolation: 'bilinear'
crop_pct: 0.875

# model
model: 'vgg16'
num_classes: 5
pretrained: True
ckpt_path: ''
keep_checkpoint_max: 1
ckpt_save_dir: './ckpt3'
epoch_size: 20
dataset_sink_mode: True
amp_level: 'O0'

# loss
loss: 'CE'
label_smoothing: 0.1

# lr scheduler
scheduler: 'warmup_cosine_decay'
lr: 0.01
min_lr: 0.0001
decay_epochs: 198
warmup_epochs: 2

# optimizer
opt: 'momentum'
momentum: 0.9
weight_decay: 0.00004
loss_scale: 1024
use_nesterov: False
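One thing that stands out in the config above: `decay_epochs: 198` is paired with `epoch_size: 20`. The sketch below is a generic linear-warmup plus cosine-decay curve evaluated once per epoch. It is an assumption about how `warmup_cosine_decay` behaves, not MindCV's actual implementation (which may update per step), and the function name `warmup_cosine_lr` is made up for illustration. Under that assumption, the learning rate barely decays during the 20 epochs that are actually run.

```python
import math

def warmup_cosine_lr(epoch, lr=0.01, min_lr=0.0001, warmup_epochs=2, decay_epochs=198):
    """Hypothetical per-epoch schedule: linear warmup, then cosine decay to min_lr."""
    if epoch < warmup_epochs:
        # Linear ramp that reaches the peak lr at the end of warmup.
        return lr * (epoch + 1) / warmup_epochs
    # Fraction of the cosine decay window that has elapsed (clamped to 1).
    progress = min(epoch - warmup_epochs, decay_epochs) / decay_epochs
    return min_lr + 0.5 * (lr - min_lr) * (1 + math.cos(math.pi * progress))

# Values taken from the YAML above; only 20 of the 200 scheduled epochs are run.
schedule = [warmup_cosine_lr(e) for e in range(20)]
for e, value in enumerate(schedule, start=1):
    print(f"epoch {e:2d}: lr = {value:.6f}")
```

With these values the rate is still roughly 0.0098 at epoch 20, so the run effectively trains at a near-constant 0.01. This does not by itself explain the flat accuracy, but it may be worth checking whether `decay_epochs` was meant to match the shortened `epoch_size`.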
Training results:
Epoch TrainLoss Top_1_Accuracy Top_5_Accuracy TrainTime EvalTime TotalTime
1 1.659075 25.2044% 100.0000% 22.04 0.99 27.67
2 1.790772 19.0736% 100.0000% 6.21 0.84 10.10
3 1.747301 19.0736% 100.0000% 6.46 0.84 10.10
4 1.628069 19.0736% 100.0000% 6.18 0.78 9.68
5 1.661704 19.0736% 100.0000% 6.33 0.85 10.33
6 1.725484 19.0736% 100.0000% 6.19 0.85 10.06
7 1.674596 18.9373% 100.0000% 6.40 0.89 10.36
8 1.607921 19.0736% 100.0000% 6.25 0.75 10.25
9 1.670359 19.0736% 100.0000% 6.17 0.80 10.14
10 1.685464 19.0736% 100.0000% 6.22 0.87 10.75
11 1.688051 19.0736% 100.0000% 6.41 0.83 10.23
12 1.720397 19.0736% 100.0000% 6.22 0.78 10.54
13 1.750791 19.0736% 100.0000% 6.29 0.79 10.29
14 1.598438 19.0736% 100.0000% 6.18 0.83 9.85
15 1.609399 19.0736% 100.0000% 6.14 0.84 9.81
16 1.617299 19.0736% 100.0000% 6.17 0.95 10.13
17 1.744891 19.0736% 100.0000% 6.23 0.86 10.30
18 1.776682 19.0736% 100.0000% 6.18 0.83 9.81
19 1.670697 19.0736% 100.0000% 6.12 0.93 10.03
20 1.782085 19.0736% 100.0000% 6.36 0.83 10.14
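For reference when reading the table above, here is a small check of what a chance-level loss looks like for 5 classes. It is a sketch under the assumption that `label_smoothing: 0.1` uses the common convention of mixing the one-hot target with a uniform distribution; the helper `smoothed_ce` is hypothetical and not part of MindCV or MindSpore.

```python
import math

def smoothed_ce(probs, target, smoothing=0.1):
    """Cross-entropy of predicted probabilities against a label-smoothed target."""
    k = len(probs)
    total = 0.0
    for i, p in enumerate(probs):
        # Smoothed target: smoothing/k on every class, plus (1 - smoothing) on the true class.
        q = smoothing / k + ((1.0 - smoothing) if i == target else 0.0)
        total -= q * math.log(p)
    return total

num_classes = 5
uniform = [1.0 / num_classes] * num_classes  # a model that has learned nothing
chance_loss = smoothed_ce(uniform, target=0)
print(f"{chance_loss:.4f}")  # 1.6094, i.e. ln(5)
```

The reported training losses (about 1.60 to 1.79) sit at or above ln(5) ≈ 1.6094, which is what a uniform guess would score. Top-5 accuracy is necessarily 100% with only 5 classes, and a Top-1 value that stays at 19.0736% across epochs would be consistent with the model predicting a single class for every validation image, though the log alone does not confirm that.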
Additional context / 备注 (Optional / 选填)
Add any other context about the problem here.
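The loss does not converge and the accuracy is wrong. Could you please take a look at what the problem is? Also, I have already downloaded the pretrained model; how do I point the training at it? Currently, with pretrained: True, the weights are downloaded automatically to a fixed location, and I would like to know how to specify the path myself.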