VisionCycle automates image labeling for machine learning using Neural Radiance Fields (NeRF). It first reconstructs a 3D model from a small set of 2D images with Instant NGP, then renders that model from various angles to generate new 2D views. This cycle from 2D to 3D and back to 2D is the core of the approach. Finally, VisionCycle annotates each rendered image with a bounding box, producing a dataset that is ready for YOLO training. Because the views and annotations are generated rather than drawn by hand, the labeling process is both efficient and scalable.
The accompanying paper (in Korean) is available 'here'.
- 2D to 3D Reconstruction: Convert 2D photos to 3D models using Instant NGP.
- 3D to 2D Rendering: Render 3D models from various angles to produce new 2D images.
- Bounding Box Creation: Auto-generate bounding boxes on the new 2D images.
- YOLO Training: Use the images for machine learning model training.
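The "render from various angles" step can be pictured as placing a virtual camera on rings around the reconstructed object. The snippet below is a minimal sketch of that idea only, not the code in this repository: the function name `orbit_camera_positions`, the z-up convention, and the choice of radius and elevations are all assumptions made for illustration.

```python
import math


def orbit_camera_positions(radius, n_azimuth, elevations_deg):
    """Return (x, y, z) camera positions on rings around the origin (z is up).

    Hypothetical helper: the object is assumed to sit at the origin and every
    camera is assumed to look toward it.
    """
    positions = []
    for elev_deg in elevations_deg:
        elev = math.radians(elev_deg)
        for i in range(n_azimuth):
            # Spread the cameras evenly around the full circle.
            azim = 2.0 * math.pi * i / n_azimuth
            x = radius * math.cos(elev) * math.cos(azim)
            y = radius * math.cos(elev) * math.sin(azim)
            z = radius * math.sin(elev)
            positions.append((x, y, z))
    return positions


# 8 azimuth steps on each of 3 elevation rings gives 24 viewpoints.
views = orbit_camera_positions(radius=2.0, n_azimuth=8, elevations_deg=[0, 30, 60])
print(len(views))  # 24
```

Each position would then be handed to the renderer together with a look-at orientation toward the origin. Denser rings give more training images at the cost of rendering time.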
- NeRF.py: Executes Instant NGP to transform 2D images into 3D models.
- AutoCapture.py: Manages the rendering of 3D models into 2D views.
- AutoBounding.py: Automates bounding box creation in the new 2D images.
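To illustrate what automatic bounding box creation can look like for a rendered view, here is a minimal sketch that assumes the render has a transparent background, so the object is exactly the set of pixels with non-zero alpha. This is an assumption for illustration and may differ from what AutoBounding.py does. The function name `mask_to_yolo_label` is hypothetical. The output follows the standard YOLO label format: class id, then box center x, center y, width, and height, all normalized by the image size.

```python
def mask_to_yolo_label(alpha, class_id=0, threshold=0):
    """Turn a 2D alpha mask (list of rows) into one YOLO label line.

    Hypothetical helper. Returns None when no pixel exceeds the threshold.
    """
    height = len(alpha)
    width = len(alpha[0]) if height else 0

    # Rows and columns that contain at least one foreground pixel.
    rows = [r for r, row in enumerate(alpha) if any(v > threshold for v in row)]
    cols = [c for c in range(width)
            if any(alpha[r][c] > threshold for r in range(height))]
    if not rows or not cols:
        return None

    # Pixel-edge coordinates of the tight box (max edge is exclusive).
    x_min, x_max = cols[0], cols[-1] + 1
    y_min, y_max = rows[0], rows[-1] + 1

    # Normalize to the 0..1 range expected by YOLO label files.
    x_center = (x_min + x_max) / 2 / width
    y_center = (y_min + y_max) / 2 / height
    box_w = (x_max - x_min) / width
    box_h = (y_max - y_min) / height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"


# A 4x8 image whose object covers rows 1-2 and columns 2-5.
demo_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 255, 255, 255, 255, 0, 0],
    [0, 0, 255, 255, 255, 255, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(mask_to_yolo_label(demo_mask))  # 0 0.500000 0.500000 0.500000 0.500000
```

Plain Python lists keep the sketch dependency-free. With numpy or OpenCV, which are already listed as requirements, the same extents can be computed on an image array far faster.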
- Gao, J., Shen, T., Wang, Z., Chen, W., Yin, K., Li, D., Litany, O., Gojcic, Z. and Fidler, S., 2022. GET3D: A generative model of high quality 3D textured shapes learned from images. Advances in Neural Information Processing Systems, 35, pp.31841-31854.
- Müller, T., Evans, A., Schied, C. and Keller, A., 2022. Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (TOG), 41(4), pp.1-15.
Required Python packages:
- TensorFlow
- PyTorch
- OpenCV
- bpy
- pillow
- numpy
- OpenGL
- CUDA (optional, if available)
For improved performance, install CUDA to enable GPU acceleration. Please follow the official CUDA Installation Guide to download and install CUDA suitable for your system.
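Before running the pipeline, it can help to confirm that the GPU is actually visible. The following is a small sketch, not part of this repository: the function name `detect_gpu_support` and the keys of the returned dictionary are made up for illustration. It relies only on `torch.cuda.is_available()` from PyTorch and on `shutil.which` from the standard library, and it degrades gracefully when PyTorch is not installed.

```python
import shutil


def detect_gpu_support():
    """Report whether PyTorch is importable, whether it sees a CUDA device,
    and whether the nvidia-smi tool is on PATH. Hypothetical helper."""
    report = {
        "torch_installed": False,
        "cuda_available": False,
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }
    try:
        import torch  # imported lazily so the check works without PyTorch
    except ImportError:
        return report
    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    print(detect_gpu_support())
```

If `nvidia_smi_found` is true but `cuda_available` is false, the driver is probably present while the installed PyTorch build lacks CUDA support.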
pip install -r requirements.txt
git clone https://github.com/tersite1/VisionCycle.git
cd VisionCycle
pip install -r requirements.txt
python3 main.py 'path_to_dataset'
- Minsuk Jang
Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.
- Support and funding from Yonsei University
- Resources provided by NVIDIA