Skip to content

Latest commit

 

History

History
118 lines (83 loc) · 6.77 KB

README.md

File metadata and controls

118 lines (83 loc) · 6.77 KB

FastPillars

FastPillars: A Deployment-friendly Pillar-based 3D Detector
Sifan Zhou, Zhi Tian, Xiangxiang Chu, Xinyu Zhang, Bo Zhang, Xiaobo Lu, Chengjian Feng, Zequn Jie, Patrick Yin Chiang and Lin Ma
Southeast University, Meituan, Fudan University
Paper (arXiv 2302.02367)

ToDo List

  • release the implementation code for KITTI dataset
  • release the TensorRT implementation code
  • release the training code and details
  • release the model weights on nuScenes val set
  • release the inference code on nuScenes dataset

Contact

Any questions or suggestions are welcome! Sifan Zhou [email protected]

Abstract

The deployment of 3D detectors strikes one of the major challenges in real-world self-driving scenarios. Existing BEV-based (i.e., Bird Eye View) detectors favor sparse convolutions (known as SPConv) to speed up training and inference, which puts a hard barrier for deployment, especially for on-device applications. In this paper, to tackle the challenge of efficient 3D object detection from an industry perspective, we devise a deployment-friendly pillar-based 3D detector, termed FastPillars. First, we introduce a novel lightweight Max-and-Attention Pillar Encoding (MAPE) module specially for enhancing small 3D objects. Second, we propose a simple yet effective principle for designing a backbone in pillar-based 3D detection. We construct FastPillars based on these designs, achieving high performance and low latency without SPConv. Extensive experiments on two large-scale datasets demonstrate the effectiveness and efficiency of FastPillars for on-device 3D detection regarding both performance and speed. Specifically, FastPillars delivers state-of-the-art accuracy on Waymo Open Dataset with 1.8X speed up and 3.8 mAPH/L2 improvement over CenterPoint (SPConv-based).

FastPillars Framework

FastPillars architecture
FastPillars architecture

Highlights

  • Simple and SPConv-free: Two sentences method summary: We employ a Max-and-Attention Pillar Encoding (MAPE) module to enhance pillar feature extraction and a computation reallocation principle to improve the bev feature representation. The SPConv-free design enables our method to be seamlessly accelerated using TensorRT.

  • Fast and Accurate: Our best single model achieves 73.3 mAPH on Waymo val set and 71.8 NDS on nuScenes test set while running at 20FPS+.

Main results

3D detection on Waymo val set

#Frame Veh_L2 (mAP/mAPH) Ped_L2 (mAP/mAPH) Cyc_L2 (mAP/mAPH) Mean_L2 mAPH FPS
FastPillars 1 71.5 / 71.1 73.2 / 67.2 70.5 / 69.5 69.3 27.4
FastPillars 2 72.5 / 72.0 75.5 / 72.4 73.9 / 73.0 72.5 24.3
FastPillars 3 73.2 / 72.8 76.3 / 73.2 74.6 / 73.8 73.3 21.7

3D detection on Waymo test set (mAPH)

#Frame Veh_L2 (mAP/mAPH) Ped_L2 (mAP/mAPH) Cyc_L2 (mAP/mAPH)
FastPillars 1 75.4 / 75.0 75.0 / 69.2 70.3 / 69.2
FastPillars 2 76.5 / 76.1 77.2 / 73.9 74.1 / 73.1
FastPillars 3 77.1 / 76.7 77.8 / 74.6 74.2 / 73.2

3D detection on nuScenes val set

Model Weights mAP NDS car truck bus trailer CV Ped Motor Bic TC barrier
FastPillars Model Weights 61.3 68.2 87.3 58.7 71.9 39.8 20.2 86.6 63.3 45.8 72.8 66.7

3D detection on nuScenes test set

mAP NDS car truck bus trailer CV Ped Motor Bic TC barrier
FastPillars 66.8 71.8 87.3 58.0 66.0 62.3 34.5 87.4 70.3 47.9 81.5 72.8

All results are tested on a Titan RTX GPU with batch size 1.

Use FastPillars

Installation

Please refer to INSTALL to set up libraries needed for distributed training and sparse convolution.

Benchmark Evaluation and Training

Please refer to GETTING_START to prepare the data. Then follow the instruction there to reproduce our detection and tracking results. All detection configurations are included in configs.

Develop

If you are interested in training CenterPoint on a new dataset, use CenterPoint in a new task, or use a new network architecture for CenterPoint, please refer to DEVELOP. Feel free to send us an email for discussions or suggestions.

License

FastPillars is release under MIT license (see LICENSE). It is developed based on a forked version of det3d. We also incorperate a large amount of code from CenterPoint. See the NOTICE for details. Note that both nuScenes and Waymo datasets are under non-commercial licenses.

Citation

If you think our paper or code is helpful, please consider citing our work.

@article{zhou2023fastpillars,
  title={FastPillars: A Deployment-friendly Pillar-based 3D Detector},
  author={Zhou, Sifan and Tian, Zhi and Chu, Xiangxiang and Zhang, Xinyu and Zhang, Bo and Lu, Xiaobo and Feng, Chengjian and Jie, Zequn and Chiang, Patrick Yin and Ma, Lin},
  journal={arXiv preprint arXiv:2302.02367},
  year={2023}
}

Star History

Star History Chart

Acknowlegement

This project is not possible without multiple great opensourced codebases. We list some notable examples below.