YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications
For years, YOLO series have been de facto industry-level standard for efficient object detection. The YOLO community has prospered overwhelmingly to enrich its use in a multitude of hardware platforms and abundant scenarios. In this technical report, we strive to push its limits to the next level, stepping forward with an unwavering mindset for industry application. Considering the diverse requirements for speed and accuracy in the real environment, we extensively examine the up-to-date object detection advancements either from industry or academy. Specifically, we heavily assimilate ideas from recent network design, training strategies, testing techniques, quantization and optimization methods. On top of this, we integrate our thoughts and practice to build a suite of deployment-ready networks at various scales to accommodate diversified use cases. With the generous permission of YOLO authors, we name it YOLOv6. We also express our warm welcome to users and contributors for further enhancement. For a glimpse of performance, our YOLOv6-N hits 35.9% AP on COCO dataset at a throughput of 1234 FPS on an NVIDIA Tesla T4 GPU. YOLOv6-S strikes 43.5% AP at 495 FPS, outperforming other mainstream detectors at the same scale (YOLOv5-S, YOLOX-S and PPYOLOE-S). Our quantized version of YOLOv6-S even brings a new state-of-the-art 43.3% AP at 869 FPS. Furthermore, YOLOv6-M/L also achieves better accuracy performance (i.e., 49.5%/52.3%) than other detectors with the similar inference speed. We carefully conducted experiments to validate the effectiveness of each component.
Backbone | Arch | Size | Epoch | SyncBN | AMP | Mem (GB) | Box AP | Config | Download |
---|---|---|---|---|---|---|---|---|---|
YOLOv6-n | P5 | 640 | 400 | Yes | Yes | 6.04 | 36.2 | config | model \| log |
YOLOv6-t | P5 | 640 | 400 | Yes | Yes | 8.13 | 41.0 | config | model \| log |
YOLOv6-s | P5 | 640 | 400 | Yes | Yes | 8.88 | 44.0 | config | model \| log |
YOLOv6-m | P5 | 640 | 300 | Yes | Yes | 16.69 | 48.4 | config | model \| log |
YOLOv6-l | P5 | 640 | 300 | Yes | Yes | 20.86 | 51.0 | config | model \| log |
Note:
- The official m and l models use knowledge distillation, which our version does not yet support. It will be implemented in MMRazor in the future.
- The performance is unstable and may fluctuate by about 0.3 mAP.
- If users need 300-epoch weights for the nano, tiny and small models, they can either train with the 300-epoch configs we provide or convert the official weights with the converter script.
- A base model has recently been officially released in YOLOv6. Its accuracy is lower, but it is more efficient. We will also provide the base model configuration in the future.
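
When reproducing these results, the fluctuation mentioned in the notes is worth accounting for before concluding that a run has regressed. The following is a minimal sketch, not part of MMYOLO: the helper name `within_reported_range` and the constants are hypothetical, the Box AP values are copied from the table above, and the note's "about 0.3 mAP" is treated as a symmetric tolerance, which is an assumption.

```python
# Box AP values copied from the results table above.
REPORTED_BOX_AP = {
    "YOLOv6-n": 36.2,
    "YOLOv6-t": 41.0,
    "YOLOv6-s": 44.0,
    "YOLOv6-m": 48.4,
    "YOLOv6-l": 51.0,
}

# Run-to-run fluctuation quoted in the notes, in AP points.
# Treating it as a symmetric bound is an assumption of this sketch.
TOLERANCE = 0.3


def within_reported_range(model: str, measured_ap: float,
                          tol: float = TOLERANCE) -> bool:
    """Return True if measured_ap is within tol of the table value for model."""
    reported = REPORTED_BOX_AP[model]
    # Small epsilon so values on the boundary survive float rounding.
    return abs(measured_ap - reported) <= tol + 1e-9


if __name__ == "__main__":
    # 0.2 below the table value for YOLOv6-s: inside the quoted fluctuation.
    print(within_reported_range("YOLOv6-s", 43.8))
    # 0.5 below the table value: outside it, so worth investigating.
    print(within_reported_range("YOLOv6-s", 43.5))
```

A result that falls outside the tolerance is not necessarily a bug; for the m and l models in particular, the gap to the official numbers may also reflect the missing knowledge distillation noted above.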
@article{li2022yolov6,
title={YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications},
author={Li, Chuyi and Li, Lulu and Jiang, Hongliang and Weng, Kaiheng and Geng, Yifei and Li, Liang and Ke, Zaidan and Li, Qingyuan and Cheng, Meng and Nie, Weiqiang and others},
journal={arXiv preprint arXiv:2209.02976},
year={2022}
}