train strategy #18900
Replies: 2 comments
-
👋 Hello @yeonhyochoi, thank you for your detailed post and for sharing your training strategy and insights! 🚀 It's great to see how much effort you've put into optimizing your training process and analyzing your results.

We recommend checking out the Docs for guidance on topics such as dataset balancing and augmentation strategies. For more in-depth information on improving training performance, take a look at our Tips for Best Training Results 📚. These might help refine your approach further.

If your post is related to a 🐛 Bug Report, we kindly ask for a minimum reproducible example so we can assist you quickly and efficiently. If you're looking for additional feedback on your custom training ❓, providing more details, such as logs, sample images (if possible), or snippets of your dataset metadata, could help the community and our team give better advice.

Regarding the slower processing speed of your approach, you may also want to explore dataset sampling techniques or caching optimizations. Check the Dataset Guide for tips on efficiently handling large datasets (such as your 2,000,000-image dataset).

For real-time help, join the discussion on our Discord server 🎧. If you prefer topic-specific threads or detailed posts, our Discourse Community or Subreddit would be excellent places to collaborate with other developers.

**Upgrade**

First, please ensure you're running the latest version of the package with `pip install -U ultralytics`, and make sure all requirements are installed. This ensures you're leveraging the latest improvements for data handling, training, and optimization.

**Environments**

Consider running your experiments in one of the verified environments, which include preinstalled dependencies like CUDA/cuDNN.
This is an automated response 🤖, but an Ultralytics engineer will review your discussion and provide further assistance shortly. Thank you for being a part of the Ultralytics community! 🌟
-
@yeonhyochoi it seems you're experimenting with data strategies to improve training outcomes. While replacing or dynamically sampling datasets can be beneficial in some cases, it introduces complexity and can slow down training significantly, as you've observed. Instead, consider techniques like data augmentation, multi-scale training, or mosaic augmentation, which can increase data variety without the need for dataset replacement. For background images, maintaining a 3–10% ratio is a sound practice, provided they are representative of your deployment environment. If you need further optimization, you can explore hyperparameter tuning or early stopping when metrics plateau to save resources. For more details, refer to the training tips here.
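As a minimal sketch of these suggestions (not the poster's exact setup): the `ultralytics` training API exposes mosaic augmentation, scale jitter, and early stopping directly as training arguments, so no dataset swapping is needed. The checkpoint name and `data.yaml` path below are placeholders for your own files.

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # any pretrained checkpoint works here

model.train(
    data="data.yaml",  # placeholder: your dataset config
    epochs=300,
    patience=50,   # early stopping: halt if no val improvement for 50 epochs
    mosaic=1.0,    # mosaic augmentation (enabled by default)
    scale=0.5,     # random image-scale jitter for multi-scale variety
    fliplr=0.5,    # horizontal flips
)
```

With `patience`, a 500-epoch budget simply becomes an upper bound: training stops once the metrics plateau, which addresses the "improvement has stopped" observation without manual intervention.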
-
The training results continued to improve, but now the improvement has stopped. As stated in the documentation, background images should make up less than 10% of the dataset, but I don't think seeing the same backgrounds throughout 300- or 500-epoch training will help learning improve. So we are trying to replace the training dataset every 5 or 10 epochs.
I also tried it like this, but the process is very slow, so I think it is the wrong approach.
Is there anyone who can give me some advice on my code?
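One cheap way to get the effect described above, sketched here as plain Python (the function name and file lists are illustrative, not part of any Ultralytics API): keep the object images fixed and re-draw only which background (label-free) images are included, every N epochs, holding backgrounds at roughly 3–10% of the epoch. This avoids rebuilding the whole dataset.

```python
import random

def resample_backgrounds(object_imgs, background_pool, epoch,
                         interval=5, bg_ratio=0.05, seed=0):
    """Return this epoch's image list: all object images plus a freshly
    sampled slice of backgrounds, re-drawn every `interval` epochs."""
    # Seeding by epoch // interval keeps the draw stable within an interval
    # and changes it only at interval boundaries.
    rng = random.Random(seed + epoch // interval)
    n_bg = min(len(background_pool), int(len(object_imgs) * bg_ratio))
    return object_imgs + rng.sample(background_pool, n_bg)

# Illustrative usage with dummy file names:
objs = [f"obj_{i}.jpg" for i in range(1000)]
pool = [f"bg_{i}.jpg" for i in range(500)]
epoch0 = resample_backgrounds(objs, pool, epoch=0)  # 1000 objects + 50 backgrounds
epoch5 = resample_backgrounds(objs, pool, epoch=5)  # new background draw
```

The resulting list could be written out as the training image list between runs; the key point is that only the small background slice changes, so the bulk of the dataset (and any caching built on it) stays untouched.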