
When I continue training yolov9-extensive on my specific dataset, the model fails to detect some categories, and the failed categories differ in each training run #626

Open
linzangsc opened this issue Jan 21, 2025 · 0 comments

Comments

@linzangsc

Hello, I have recently encountered a strange (at least to me) problem when training yolov9. Here is the situation:

First, I trained yolov9 from scratch on a large general dataset, BigDetection, which has about 3 million images across 600 categories. This gave me a pretrained model.

Then I continued training the pretrained model on a specific dataset of about 100k images covering about 80 categories. I split it into train and test sets (8:2) and used an initial lr0=0.0004, with a batch size of 8 per device across 32 devices (an effective batch size of 256).

The problem: after around 100 epochs of continued training, the model fails to detect some categories entirely, i.e. those categories end up with zero precision, zero recall, and so on. Visualization shows the model predicts nothing in those regions, even at a confidence threshold of 0.001.
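
To rule out an evaluation artifact, I count class occurrences directly in YOLO-format .txt files. This is only a sketch: it assumes predictions were exported with something like val.py --save-txt --conf-thres 0.001 (flag names inherited from the YOLOv5-style codebase yolov9 builds on), and the paths are placeholders. Pointing it at the ground-truth label directory instead gives the per-class instance counts of the dataset itself.

```python
from collections import Counter
from pathlib import Path

def count_classes(txt_dir):
    """Count class ids across YOLO-format .txt files (one 'cls x y w h ...' row per object)."""
    counts = Counter()
    for txt in Path(txt_dir).glob("*.txt"):
        for line in txt.read_text().splitlines():
            if line.strip():
                counts[int(line.split()[0])] += 1
    return counts

# Placeholder path: predictions saved by val.py --save-txt, or a labels/ directory.
preds = count_classes("runs/val/exp/labels")
for cls_id in range(80):  # ~80 categories in my dataset
    print(cls_id, preds.get(cls_id, 0))
```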

You might say there must be something wrong with my dataset. But the strangest part is that the failed categories differ from run to run. For example, in the first training run the model fails to detect cat and dog, while in a second run with exactly the same settings it fails on pig and horse but detects cat and dog successfully. I ran about 10 experiments and no two shared the same set of failed categories. I am not claiming my dataset is perfect, but if it had an obvious problem, mislabeled classes for example, shouldn't every run share the same failed cases?
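
One way to separate data problems from optimization noise is to pin every source of randomness and rerun: if the same categories fail twice under the same seed, the instability lives in the optimization rather than in the data. A minimal seeding sketch using standard PyTorch calls (train.py may already seed some of this; where exactly to hook it in is an assumption on my part):

```python
import os
import random

import numpy as np
import torch

def seed_everything(seed: int = 0):
    """Pin the common sources of randomness so two runs are directly comparable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # Trade kernel speed for determinism in cuDNN.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Under DDP the dataloader seed is usually offset per rank, so each of the 32 workers would need a consistent, rank-derived seed as well.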

My speculation is that the initial lr0 is still too large given the batch size relative to the dataset, so the model converges to a bad local minimum. Are there any other thoughts? It would be really helpful if someone has run into something like this before.
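
To make that concrete, here is the arithmetic for the setup above; every number comes from the description, nothing is measured:

```python
# Effective batch size and optimizer steps per epoch for the setup described above.
per_device_batch = 8
num_devices = 32
effective_batch = per_device_batch * num_devices   # 256

train_images = 100_000 * 0.8                       # 8:2 split -> ~80k train images
steps_per_epoch = train_images / effective_batch
print(effective_batch, int(steps_per_epoch))       # 256 312
```

So each epoch is only ~312 optimizer steps at lr0=0.0004; whether that is "too large" presumably depends on the learning rate the BigDetection pretraining used with its own effective batch size.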
