
Using NeRF, this pipeline generates a large number of YOLO-ready images from a small set of input images and then creates bounding boxes for them.


tersite1/VisionCycle


VisionCycle: Automated Labeling Pipeline Utilizing Neural Radiance Fields

Introduction

VisionCycle is an automated labeling pipeline for machine learning built on Neural Radiance Fields (NeRF). It first reconstructs a 3D model from a small set of 2D images using Instant NGP, then renders that model from various angles to generate new 2D views. This cycle from 2D to 3D and back to 2D is the core of the approach and the origin of the name. Finally, VisionCycle automatically annotates the rendered images with bounding boxes so they are ready for YOLO training, which makes dataset labeling more efficient and scalable.



The accompanying paper (in Korean) is available 'here'.


Pipeline Process

  1. 2D to 3D Reconstruction: Convert 2D photos to 3D models using Instant NGP.
  2. 3D to 2D Rendering: Render 3D models from various angles to produce new 2D images.
  3. Bounding Box Creation: Auto-generate bounding boxes on the new 2D images.
  4. YOLO Training: Use the images for machine learning model training.
Screenshot 2024-05-11 10.32.06 PM
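The four steps above can be sketched as a small driver that chains the stages. This is only an illustration of the data flow: `run_pipeline` and its three callable parameters are hypothetical names, not functions from this repository, and writing one `.txt` label file next to each rendered image is an assumption based on the common YOLO dataset layout.

```python
from pathlib import Path
from typing import Callable, List, Sequence


def run_pipeline(
    dataset_dir: str,
    reconstruct: Callable[[Path], object],
    render_views: Callable[[object], Sequence[Path]],
    annotate: Callable[[Path], str],
) -> List[Path]:
    """Chain steps 1-3 and write one label file per rendered image.

    All three callables are placeholders (assumptions) for the real stages.
    """
    model = reconstruct(Path(dataset_dir))   # step 1: 2D photos -> 3D model
    images = render_views(model)             # step 2: 3D model -> new 2D views
    label_paths = []
    for image_path in images:                # step 3: one label file per image
        label_path = Path(image_path).with_suffix(".txt")
        label_path.write_text(annotate(Path(image_path)) + "\n")
        label_paths.append(label_path)
    # Step 4 (YOLO training) would consume the image/label pairs on disk.
    return label_paths
```

Passing the stages in as callables keeps the sketch independent of any particular NeRF or rendering backend.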

Bounding Box Creation

Screenshot 2024-05-12 9.52.48 PM
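One way a bounding box can be derived automatically is from a foreground mask of the rendered object. The sketch below is not taken from AutoBounding.py. It assumes each rendered view comes with a binary mask (for example, the alpha channel of a render on a transparent background) and converts the tight box around that mask into a YOLO label line (`class x_center y_center width height`, all normalized to the image size). The function name and the default `class_id` are illustrative.

```python
import numpy as np


def mask_to_yolo_bbox(mask, class_id=0):
    """Return a YOLO label line for the tight box around a binary mask.

    Returns None when the mask contains no foreground pixels.
    """
    mask = np.asarray(mask, dtype=bool)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    height, width = mask.shape
    # Pixel-edge coordinates: the max edge is one past the last foreground pixel.
    x_min, x_max = xs.min(), xs.max() + 1
    y_min, y_max = ys.min(), ys.max() + 1
    # Normalize the center and size to the [0, 1] range expected by YOLO.
    x_center = (x_min + x_max) / 2.0 / width
    y_center = (y_min + y_max) / 2.0 / height
    box_w = (x_max - x_min) / width
    box_h = (y_max - y_min) / height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"


# Example: a 100x200 image with an object covering rows 20-59 and columns 50-149.
demo_mask = np.zeros((100, 200), dtype=bool)
demo_mask[20:60, 50:150] = True
print(mask_to_yolo_bbox(demo_mask))  # 0 0.500000 0.400000 0.500000 0.400000
```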

Model Testing Result

Screenshot 2024-05-16 1.42.31 AM

Components Description

  • NeRF.py: Executes Instant NGP to transform 2D images into 3D models.
  • AutoCapture.py: Manages the rendering of 3D models into 2D views.
  • AutoBounding.py: Automates bounding box creation in the new 2D images.
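To illustrate what rendering from various angles involves, the sketch below generates camera poses on a circular orbit around an object at the origin. It is not the code in AutoCapture.py. The function names, the Z-up world, and the OpenGL-style convention (camera looks down its own -Z axis) are assumptions, and the renderer actually used may expect a different convention.

```python
import numpy as np


def look_at(eye, target=(0.0, 0.0, 0.0), up=(0.0, 0.0, 1.0)):
    """Return a 4x4 camera-to-world matrix for a camera at `eye` facing `target`.

    Degenerate when the viewing direction is parallel to `up`.
    """
    eye = np.asarray(eye, dtype=float)
    forward = np.asarray(target, dtype=float) - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, np.asarray(up, dtype=float))
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    pose = np.eye(4)
    pose[:3, 0] = right      # camera +X
    pose[:3, 1] = true_up    # camera +Y
    pose[:3, 2] = -forward   # camera +Z points away from the target
    pose[:3, 3] = eye        # camera position
    return pose


def orbit_poses(n_views=36, radius=2.0, elevation_deg=30.0):
    """Return an (n_views, 4, 4) array of poses evenly spaced in azimuth."""
    elev = np.radians(elevation_deg)
    poses = []
    for azim in np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False):
        eye = radius * np.array([
            np.cos(elev) * np.cos(azim),
            np.cos(elev) * np.sin(azim),
            np.sin(elev),
        ])
        poses.append(look_at(eye))
    return np.stack(poses)


poses = orbit_poses()
print(poses.shape)  # (36, 4, 4)
```

Each pose would then be handed to the renderer to produce one 2D view. Varying `elevation_deg` across several orbits gives broader coverage of the object.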

Used Papers

  • Gao, J., Shen, T., Wang, Z., Chen, W., Yin, K., Li, D., Litany, O., Gojcic, Z. and Fidler, S., 2022. Get3d: A generative model of high quality 3d textured shapes learned from images. Advances In Neural Information Processing Systems, 35, pp.31841-31854.
  • Müller, T., Evans, A., Schied, C. and Keller, A., 2022. Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (TOG), 41(4), pp.1-15.

Dependencies

Required Python packages:

  • TensorFlow
  • PyTorch
  • OpenCV
  • bpy
  • pillow
  • numpy
  • OpenGL
  • CUDA (optional, if available)

CUDA Installation

For improved performance, install CUDA to enable GPU acceleration. Please follow the official CUDA Installation Guide to download and install CUDA suitable for your system.
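After installing CUDA, a quick check like the following can confirm whether a GPU is visible from Python. This is a minimal sketch: it relies on PyTorch's `torch.cuda.is_available()` when PyTorch is installed and otherwise falls back to looking for the `nvidia-smi` tool, which is only a hint that a driver is present. The helper name `cuda_available` is illustrative.

```python
import shutil


def cuda_available() -> bool:
    """Best-effort check for a usable CUDA GPU."""
    try:
        import torch  # PyTorch is listed among the dependencies
        return bool(torch.cuda.is_available())
    except ImportError:
        # Fallback: nvidia-smi ships with the NVIDIA driver; presence is only a hint.
        return shutil.which("nvidia-smi") is not None


device = "cuda" if cuda_available() else "cpu"
print(f"Running on: {device}")
```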

Install dependencies:

pip install -r requirements.txt

Installation

git clone https://github.com/tersite1/VisionCycle.git
cd VisionCycle
pip install -r requirements.txt

Usage

python3 main.py 'path_to_dataset'

Contact

Contributing

Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgement

  • Support and funding from Yonsei University
  • Resources provided by NVIDIA
