Jiaming Liu1, Linghe Kong1, Yue Wu2✉, Maoguo Gong2, Hao Li2, Qiguang Miao2, Wenping Ma2, Can Qin3
1 Shanghai Jiao Tong University · 2 Xidian University · 3 Salesforce AI Research
(✉️) Corresponding author
Existing 3D mask learning methods encounter performance bottlenecks under limited data, and our objective is to overcome this limitation. In this paper, we introduce a Triple Point Masking scheme, named TPM, a scalable plug-and-play framework for MAE pre-training that enables multi-mask learning for 3D point clouds. Specifically, we augment the baseline methods with two additional mask choices (i.e., a medium mask and a low mask), based on our core insight that the recovery of an object can proceed in diverse ways. Previous high-masking schemes focus on capturing global representations but lack fine-grained recovery capability, so the resulting pre-trained weights tend to play a limited role during fine-tuning. With the proposed TPM, current methods exhibit more flexible and accurate completion capabilities, allowing the autoencoder in the pre-training stage to consider multiple representations of a single 3D object. In addition, during the fine-tuning stage, an SVM-guided weight selection module is proposed to initialize the encoders of downstream networks with the optimal pre-trained weights, maximizing linear accuracy and facilitating the acquisition of intricate representations of new objects. Extensive experiments show that the five baselines equipped with the proposed TPM achieve comprehensive performance improvements on various downstream tasks.
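As a rough illustration of the scheme, the sketch below draws three random masks (high, medium and low ratio) over the grouped point patches and sums the reconstruction losses of the three masked views. The specific ratios, the plain summation, and the `autoencoder` callable are illustrative assumptions, not the exact settings used in the paper.

```python
import torch

def random_patch_mask(num_groups: int, ratio: float, device=None) -> torch.Tensor:
    """Return a boolean mask over point patches; True = masked (to be reconstructed)."""
    num_mask = int(num_groups * ratio)
    perm = torch.randperm(num_groups, device=device)
    mask = torch.zeros(num_groups, dtype=torch.bool, device=device)
    mask[perm[:num_mask]] = True
    return mask

# Illustrative high / medium / low mask ratios (assumed values).
MASK_RATIOS = (0.8, 0.6, 0.4)

def tpm_pretrain_step(autoencoder, patches):
    """One pre-training step: reconstruct the same object under three different masks.

    `autoencoder(patches, mask)` is a hypothetical callable returning the
    reconstruction loss (e.g. Chamfer distance) for the masked patches.
    `patches` has shape (B, G, K, 3): G groups of K points per object.
    """
    num_groups = patches.shape[1]
    loss = 0.0
    for ratio in MASK_RATIOS:
        mask = random_patch_mask(num_groups, ratio, device=patches.device)
        loss = loss + autoencoder(patches, mask)
    return loss
```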
Please refer to the baseline repositories to set up the environment; the Point-MAE setup is recommended:
# Install basic required packages
pip install -r requirements.txt
# PointNet++
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
# GPU kNN
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
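Optionally, a quick sanity check verifies that the two compiled extensions installed above import correctly; the snippet only assumes the packages installed by the commands above.

```python
# Sanity check for the compiled CUDA extensions installed above.
import torch
from pointnet2_ops import pointnet2_utils  # PointNet++ ops
from knn_cuda import KNN                   # GPU kNN

assert torch.cuda.is_available(), "A CUDA-capable GPU is required."
print("pointnet2_ops and KNN_CUDA imported successfully.")
```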
- For Point-MAE, Point-M2AE and PointGPT-S, please see DATASET.md for details.
- For Inter-MAE, please download the ShapeNetRender dataset from here.
- For PointGPT-B, please download both the unlabeled hybrid dataset and the labeled hybrid dataset from here.
First, go to the folder of the baseline method, then run the pre-training command. For example, for Point-MAE:
cd Point-MAE
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/pretrain-tpm.yaml --exp_name <output_file_name>
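The pre-training objective in these MAE-style baselines is a Chamfer-distance reconstruction loss on the masked patches. For reference, a plain PyTorch version is sketched below; the repositories ship their own (typically CUDA) implementation, so this is only for illustration.

```python
import torch

def chamfer_l2(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    """Symmetric L2 Chamfer distance between two point sets.

    pred: (B, N, 3) predicted points; gt: (B, M, 3) ground-truth points.
    """
    dist = torch.cdist(pred, gt, p=2) ** 2           # (B, N, M) squared distances
    loss_pred = dist.min(dim=2).values.mean(dim=1)   # nearest gt point for each prediction
    loss_gt = dist.min(dim=1).values.mean(dim=1)     # nearest prediction for each gt point
    return (loss_pred + loss_gt).mean()
```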
# 1k input
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/finetune_modelnet.yaml --ckpts <path/to/pre-trained/model> --finetune_model --exp_name <name>
# 8k input
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/finetune_modelnet_8k.yaml --ckpts <path/to/pre-trained/model> --finetune_model --exp_name <name>
# voting
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/finetune_modelnet.yaml --test --vote --ckpts <path/to/best/model> --exp_name <name>
# few-shot learning
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/fewshot.yaml --finetune_model --ckpts <path/to/pre-trained/model> --exp_name <output_file_name> --way <5 or 10> --shot <10 or 20> --fold <0-9>
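Few-shot results are typically reported as the mean over the 10 folds. A small driver loop such as the following (an assumption, not part of the repository) can launch all folds for one (way, shot) setting using the same flags as above:

```python
import subprocess

# Hypothetical driver: run all 10 few-shot folds for the 5-way 10-shot setting.
for fold in range(10):
    subprocess.run([
        "python", "main.py",
        "--config", "cfgs/fewshot.yaml",
        "--finetune_model",
        "--ckpts", "path/to/pre-trained/model",
        "--exp_name", f"fewshot_w5_s10_f{fold}",
        "--way", "5", "--shot", "10", "--fold", str(fold),
    ], check=True)
```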
# Fine-tuning on OBJ-BG variant
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/finetune_scan_objbg.yaml --ckpts <path/to/pre-trained/model> --finetune_model --exp_name <name>
# Fine-tuning on OBJ-ONLY variant
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/finetune_scan_objonly.yaml --ckpts <path/to/pre-trained/model> --finetune_model --exp_name <name>
# Fine-tuning on PB-T50-RS variant
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/finetune_scan_hardest.yaml --ckpts <path/to/pre-trained/model> --finetune_model --exp_name <name>
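The SVM-guided weight selection described above can be approximated as follows: evaluate each candidate pre-trained checkpoint with a linear SVM on frozen encoder features and fine-tune from the checkpoint with the highest linear accuracy. This is a minimal sketch; `extract_features` is a hypothetical helper that runs the frozen encoder over a dataset and returns numpy arrays.

```python
from sklearn.svm import LinearSVC

def select_best_checkpoint(checkpoints, extract_features, train_set, test_set):
    """Pick the pre-trained weights whose frozen features give the best linear SVM accuracy.

    `extract_features(ckpt_path, dataset)` is a hypothetical helper returning
    (features, labels) from the frozen encoder.
    """
    best_ckpt, best_acc = None, -1.0
    for ckpt in checkpoints:
        x_train, y_train = extract_features(ckpt, train_set)
        x_test, y_test = extract_features(ckpt, test_set)
        acc = LinearSVC().fit(x_train, y_train).score(x_test, y_test)
        if acc > best_acc:
            best_ckpt, best_acc = ckpt, acc
    return best_ckpt, best_acc
```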
For part segmentation, run:
cd segmentation
python main.py --ckpts <path/to/pre-trained/model> --root path/to/data --learning_rate 0.0002 --epoch 300
To visualize the reconstructions of the pre-trained model on the validation set, run:
python main_vis.py --config cfgs/pretrain.yaml --test --ckpts <path/to/pre-trained/model> --exp_name <name>
In addition, after converting the .ply files to .obj files, we use KeyShot to render the 3D point clouds and showcase the completion quality achieved with TPM.
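For reference, one simple way to perform this .ply-to-.obj conversion is to read the point cloud with Open3D and write its points as vertex lines; this helper is an assumption for illustration, not part of the repository.

```python
import numpy as np
import open3d as o3d

def ply_to_obj(ply_path: str, obj_path: str) -> None:
    """Convert a point-cloud .ply file to a vertex-only .obj file."""
    points = np.asarray(o3d.io.read_point_cloud(ply_path).points)
    with open(obj_path, "w") as f:
        for x, y, z in points:
            f.write(f"v {x} {y} {z}\n")

# Example usage (hypothetical paths):
# ply_to_obj("vis/input.ply", "vis/input.obj")
```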
| Input | Point-MAE | Point-MAE with TPM |
|---|---|---|
This project is based on Point-MAE (paper, code), Point-M2AE (paper, code), Inter-MAE (paper, code), and PointGPT (paper, code). Thanks for their wonderful work.
If you find this work useful in your research, please consider giving this repository a star ⭐ and citing our paper:
@article{liu2024triple,
title={Triple Point Masking},
author={Liu, Jiaming and Kong, Linghe and Wu, Yue and Gong, Maoguo and Li, Hao and Miao, Qiguang and Ma, Wenping and Qin, Can},
journal={arXiv preprint arXiv:2409.17547},
year={2024}
}