ECCV 2024, Oral Presentation | Project Page
Authors: Prachi Garg, K J Joseph, Vineeth N Balasubramanian, Necati Cihan Camgoz, Chengde Wan, Kenrick Kin, Weiguang Si, Shugao Ma, and Fernando De La Torre
POET enables users to personalize their experience by adding new action classes efficiently and continually whenever they want.
We demonstrate the efficacy of prompt tuning a significantly more lightweight backbone, pretrained exclusively on the base-class data. We propose a novel spatio-temporal learnable prompt offset tuning approach, and are the first to apply such prompt tuning to Graph Neural Networks.
We contribute two new benchmarks for our new problem setting in human action recognition: (i) NTU RGB+D dataset for activity recognition, and (ii) SHREC-2017 dataset for hand gesture recognition.
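To make the spatio-temporal prompt offset idea concrete, here is a minimal, illustrative PyTorch sketch. The module and variable names are ours, not the repository's; the actual POET implementation also involves a prompt pool, prompt keys, and a query adapter.

```python
import torch
import torch.nn as nn

class PromptOffset(nn.Module):
    """Illustrative spatio-temporal prompt offset for a skeleton GCN.

    A learnable additive offset over (channels, frames, joints) is applied
    to intermediate node features, so only the offset receives gradients
    while the backbone stays frozen. This is a sketch of the idea, not
    POET's exact implementation.
    """

    def __init__(self, channels: int, frames: int, joints: int):
        super().__init__()
        # Zero-initialized so the module is an identity at the start.
        self.offset = nn.Parameter(torch.zeros(1, channels, frames, joints))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames, joints) skeleton features.
        return x + self.offset
```

In POET, prompts are injected after a chosen backbone layer; the --prompt_layer argument described below selects that layer.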
⬜ Code for the Gesture Recognition benchmark on SHREC 2017, with the DG-STA graph transformer backbone.
✅ [Jan 3, 2025] Released our 10+1 sets of few-shot splits of the NTU RGB+D 60 skeleton joints dataset for full reproducibility here.
✅ Released POET training and evaluation code for our Activity Recognition benchmark on the NTU RGB+D dataset, using the CTR-GCN backbone.
✅ Additionally, this release includes (i) the base step model checkpoints and (ii) a few-shot data file.
📌 Note: additional code for adapting the various baselines and ablations can be made available upon request.
- Clone the repository:
git clone <repository-url>
- Create a new conda environment:
conda create -n poet python=3.8
conda activate poet
- Install PyTorch and the CUDA toolkit:
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
- Install the remaining dependencies from requirements.txt:
pip install -r requirements.txt
- We divide the 60 daily action classes of the NTU RGB+D skeleton action recognition dataset into 40 base classes and 20 incremental classes (4 sessions x 5 classes). We train the base model with full supervision and initialize the prompts.
- We then add 5 new classes to the model sequentially, over 4 continual user sessions, each class trained using only 5 training samples. We fine-tune only the expanded classifier and the prompt components (prompt pool, prompt keys, and query adapter), freezing the rest of the network; see the sketch below.
- Our privacy-aware setting is rehearsal-free: no previous-class samples or exemplars are stored. POET is therefore a prompt-tuning-only solution that acts as a plug-and-play module for most graph convolutional and graph transformer architectures.
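As a rough sketch of what fine-tuning only the prompt components and classifier looks like in PyTorch (the substring filters below are assumptions for illustration; the repository's actual parameter names may differ):

```python
import torch

def freeze_all_but_prompts(model: torch.nn.Module) -> None:
    """Freeze the backbone; leave prompt components and classifier trainable."""
    for name, param in model.named_parameters():
        # Hypothetical name filters; adapt to the actual module names.
        param.requires_grad = any(
            key in name for key in ("prompt", "classifier", "query_adapter")
        )

# Only the unfrozen parameters are handed to the optimizer, e.g.:
# optimizer = torch.optim.SGD(
#     (p for p in model.parameters() if p.requires_grad), lr=0.01)
```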
- We downloaded the NTU RGB+D 60 dataset and preprocessed it following the instructions in the original CTR-GCN repository. A sample few-shot data file is here.
- Provide the paths to your data files in temp_24nov.yaml via the feeder -> data_path and few_shot_data_file variables; a sketch of the relevant section follows this list.
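The relevant portion of the config might look like the following. The nesting and the paths are placeholders for illustration; only data_path and few_shot_data_file are named above, so check the actual file for the exact layout.

```yaml
feeder:
  data_path: /path/to/ntu60/preprocessed/train_data.npy   # preprocessed NTU RGB+D data
  few_shot_data_file: /path/to/few_shot_splits/set1.pkl   # 5-shot split for this run
```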
The POET_final_10run.sh script performs incremental learning over 4 user sessions (steps):
- Step 1: Classes 40-45
- Step 2: Classes 45-50
- Step 3: Classes 50-55
- Step 4: Classes 55-60
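Read each range as half-open, so every step adds exactly 5 new classes (step 1 covers labels 40-44, and so on). In Python terms:

```python
NUM_BASE = 40         # classes 0-39 are learned in the base session
CLASSES_PER_STEP = 5  # new classes added per continual session

sessions = [
    list(range(NUM_BASE + i * CLASSES_PER_STEP,
               NUM_BASE + (i + 1) * CLASSES_PER_STEP))
    for i in range(4)
]
# sessions[0] -> [40, 41, 42, 43, 44]
# sessions[3] -> [55, 56, 57, 58, 59]
```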
To run training:
# Run for a specific few-shot data file
./POET_train.sh 1 # For set1
# Run for multiple sets
./POET_train.sh 1 2 3 4 5 6 7 8 9 10
This script will:
- Train on each incremental step
- Evaluate performance: (A) average accuracy over all classes; (B) average class accuracy on old classes only; (C) average class accuracy on new classes only; (D) harmonic mean (HM) of old and new accuracies (see the sketch after this list).
- Save model checkpoints and evaluation metrics
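For reference, the harmonic mean in (D) combines the old- and new-class accuracies as follows; it rewards models that balance stability on old classes with plasticity on new ones, rather than excelling at only one:

```python
def harmonic_mean(old_acc: float, new_acc: float) -> float:
    """Harmonic mean of old- and new-class average accuracies (in %).

    Low if either term is low, so forgetting old classes cannot be
    masked by strong performance on the new ones.
    """
    if old_acc + new_acc == 0:
        return 0.0
    return 2 * old_acc * new_acc / (old_acc + new_acc)

# Example: harmonic_mean(80.0, 60.0) ≈ 68.57
```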
To run only evaluation:
# Run for a specific few-shot data file
./POET_eval.sh 1 # For set1
# Run for multiple sets
./POET_eval.sh 1 2 3 4 5 6 7 8 9 10
The per-run performance is in this file for reproducibility and comparison.
📌 Note: parameters that are experiment-specific and stay fixed across continual sessions are specified in this config file. Arguments that change across continual sessions regardless of the experiment (e.g., the new class labels added in a session) live in argparser_continual.py.
- --k_shot: Number of samples per class for few-shot learning (default: 5)
- --prompt_layer: Which layer to add prompts to (default: 1)
- --device: GPU device ID to use
- --save_name_args: Experiment name for saving results
- --prompt_sim_reg: Enable prompt similarity regularization
- --classifier_average_init: Initialize new classifier weights as the average of the old ones
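As described, --classifier_average_init initializes each newly added class's weights to the average of the existing class weights. A minimal sketch of such a classifier expansion, assuming a plain linear head (function and variable names are ours):

```python
import torch

@torch.no_grad()
def expand_classifier(fc: torch.nn.Linear, num_new: int) -> torch.nn.Linear:
    """Append num_new output units, each initialized to the mean of the
    old class weights (cf. --classifier_average_init)."""
    new_fc = torch.nn.Linear(fc.in_features, fc.out_features + num_new)
    # Copy over the old class weights unchanged.
    new_fc.weight[: fc.out_features] = fc.weight
    new_fc.bias[: fc.out_features] = fc.bias
    # New classes start from the average of the old ones.
    new_fc.weight[fc.out_features:] = fc.weight.mean(dim=0, keepdim=True)
    new_fc.bias[fc.out_features:] = fc.bias.mean()
    return new_fc
```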
Results are saved in:
work_dir/ntu60/csub/ctrgcn_prompt/
├── checkpoints/
├── logs/
├── plots/
└── results.csv
We thank the authors of CTR-GCN and Learning to Prompt for Continual Learning, as well as the latter's PyTorch reimplementation. These were useful starting points for our project.
If you find our work useful for your project, please consider citing our work:
@inproceedings{garg2024poet,
title={POET: Prompt Offset Tuning for Continual Human Action Adaptation},
author={Garg, Prachi and Joseph, KJ and Balasubramanian, Vineeth N and Camgoz, Necati Cihan and Wan, Chengde and Kin, Kenrick and Si, Weiguang and Ma, Shugao and De La Torre, Fernando},
booktitle={European Conference on Computer Vision},
pages={436--455},
year={2024},
organization={Springer}
}