This repository contains the source code for the paper Weighted Voxel: a novel voxel representation for 3D reconstruction (Xie et al. 2018).
3D reconstruction has attracted increasing attention in the past few years. With the surge of deep neural networks, the performance of 3D reconstruction has improved significantly. However, the voxels reconstructed by existing approaches usually contain a lot of noise and lead to heavy computation. In this paper, we define a new voxel representation, named Weighted Voxel. It provides richer information, facilitating the subsequent learning and generalization steps. Unlike a regular voxel, which consists of zeros and ones, the proposed Weighted Voxel makes full use of the structural information of voxels. Experimental results demonstrate that Weighted Voxel not only performs better in reconstruction but also takes less time to train.
If you find this work useful in your research, please consider citing:
@inproceedings{xie2018weighted,
title={Weighted Voxel: a novel voxel representation for 3D reconstruction},
author={Xie, Haozhe and Yao, Hongxun and Sun, Xiaoshuai and Zhou, Shangchen and Tong, Xiaojun},
booktitle={International Conference on Internet Multimedia Computing and Service {ICIMCS} 2018},
year={2018},
organization={ACM}
}
The project page is available at https://haozhexie.com/project/weighted-voxel.
The generation of Weighted Voxels can be regarded as applying a convolutional kernel to regular voxels. The value of each voxel in the Weighted Voxel is a weighted sum over the voxel values of its immediate neighbors. The formula is expressed in terms of the value in the regular voxel and a constant that is set to 26, with a special case defined separately; the exact expression is given in the paper.
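The idea of applying a convolutional kernel to a regular voxel grid can be illustrated with a minimal NumPy sketch. It is not the repository's utils/binvox_weighting.py, and the kernel weights are assumptions made for illustration: each of the 26 immediate neighbors in the 3x3x3 neighborhood gets weight 1, the center voxel gets a hypothetical `center_weight` (defaulting to 26), and neighbors outside the grid are treated as zero. The function name `weight_voxels` is also hypothetical.

```python
import numpy as np


def weight_voxels(regular, center_weight=26):
    """Apply an assumed 3x3x3 kernel to a binary voxel grid.

    Assumptions (not taken from the paper): neighbor weight 1,
    center weight `center_weight`, zero padding outside the grid.
    """
    regular = np.asarray(regular, dtype=np.int32)
    # Zero padding makes out-of-range neighbors contribute nothing.
    padded = np.pad(regular, 1, mode="constant", constant_values=0)
    weighted = np.zeros_like(regular)
    d0, d1, d2 = regular.shape
    # Accumulate the 27 shifted copies of the grid, one per kernel cell.
    for da in range(3):
        for db in range(3):
            for dc in range(3):
                w = center_weight if (da, db, dc) == (1, 1, 1) else 1
                weighted += w * padded[da:da + d0, db:db + d1, dc:dc + d2]
    return weighted


# Usage: a single occupied voxel in the middle of a 3x3x3 grid.
grid = np.zeros((3, 3, 3), dtype=np.int32)
grid[1, 1, 1] = 1
result = weight_voxels(grid)
print(result[1, 1, 1])  # 26: the center weight times the occupied voxel
print(result[0, 0, 0])  # 1: one occupied immediate neighbor
```

Under these assumed weights the output is integer-valued, and a voxel's value grows with the number of occupied neighbors around it.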
The network architecture of 3D-R2N2 and Weighted Voxel. Both of them consist of an encoder, a 3D convolutional LSTM, and a decoder. In 3D-R2N2, the reconstructed voxels are composed of zeros and ones, while in Weighted Voxel, the voxel values are filled with integers.
Reconstruction samples of (a) cars, (b) cabinets, (c) speakers, and (d) sofas on the ShapeNet testing dataset. The Weighted Voxel preserves more structural details of 3D objects.
Please follow the installation instructions on the homepage of gpuarray.
pip3 install -r requirements.txt
Please paste the following lines into ~/.theanorc:
[cuda]
# Please change it to your CUDA installation path
root = /opt/cuda
[global]
# Please change it to your GPU device ID
device = cuda0
floatX = float32
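One thing worth checking in an INI-style file like this is where comments are placed. The sketch below assumes the file is parsed with the default settings of Python's standard configparser module; whether Theano reads ~/.theanorc in exactly this way is an assumption. With those defaults, text after a # on the same line as a value is kept as part of the value, while a comment on its own line is ignored. The helper name `read_option` is hypothetical.

```python
import configparser


def read_option(text, section, option):
    """Parse INI text with default configparser settings and return one value."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get(section, option)


# Comment on the same line as the value.
inline = "[cuda]\nroot = /opt/cuda # Please change it\n"
# Comment on its own line above the value.
separate = "[cuda]\n# Please change it\nroot = /opt/cuda\n"

inline_value = read_option(inline, "cuda", "root")
separate_value = read_option(separate, "cuda", "root")

print(repr(inline_value))    # '/opt/cuda # Please change it'
print(repr(separate_value))  # '/opt/cuda'
```

If the path or device ID appears not to take effect, a trailing comment on the same line is one possible cause to rule out.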
Use the following commands to download the ShapeNet dataset:
cd /path/to/the/repository
mkdir -p datasets/ShapeNet
cd datasets/ShapeNet
wget http://cvgl.stanford.edu/data2/ShapeNetRendering.tgz
tar -xf ShapeNetRendering.tgz
wget http://cvgl.stanford.edu/data2/ShapeNetVox32.tgz
tar -xf ShapeNetVox32.tgz
Use the following commands to generate the Weighted Voxel dataset:
cd /path/to/the/repository
python3 utils/binvox_weighting.py datasets/ShapeNet/ShapeNetVox32 datasets/ShapeNet/WeightedShapeNetVox32
Use the following command to train the neural network:
python3 runner.py
Use the following command to test the neural network:
python3 runner.py \
--test \
--weights output/weights.npy
The pretrained model can be downloaded from here (206 MB).
This project is open-sourced under the MIT license.