SynTalker: Enabling Synergistic Full-Body Control in Prompt-Based Co-Speech Motion Generation

Project Page • Arxiv Paper • Demo Video • Web Gradio Demo • Colab • Citation

📝 Release Plans

A simple and powerful co-speech model (corresponds to SynTalker (w/o both) in Table 2 of the paper)
Training scripts (including RVQVAE and diffusion training)
A web demo (we strongly suggest you try it!)
SynTalker can receive both speech and text input simultaneously
Training scripts (including data preprocessing and training of the RVQVAE, the text-motion alignment space, and diffusion)

💖 Online Demo

Thanks to Hugging Face🤗 for providing us with GPUs! Feel free to experience our online web demo!

⚒️ Installation

Build Environment

conda create -n syntalker python=3.12
conda activate syntalker
pip install -r requirements.txt
bash demo/install_mfa.sh

Download Model

gdown "https://drive.google.com/drive/folders/1tGTB40jF7v0RBXYU-VGRDsDOZp__Gd0_?usp=drive_link" -O ./ckpt --folder
gdown "https://drive.google.com/drive/folders/1MCks7CMNBtAzU2XihYezNmiGT_6pWex8?usp=drive_link" -O ./datasets/hub --folder

Download Dataset

Required for evaluation and training; not needed to run the web demo or inference.

Download the original raw data

bash bash_raw_cospeech_download.sh

🚩 Running

Run a web demo

python demo.py -c ./configs/diffusion_rvqvae_128_hf.yaml

Note: If you connect over SSH and run the code on a headless machine, you may encounter the error pyglet.canvas.xlib.NoSuchDisplayException: Cannot connect to "None". We recommend the following workaround.

sudo apt-get install libegl1-mesa-dev libgles2-mesa-dev
PYOPENGL_PLATFORM='egl' python demo.py -c ./configs/diffusion_rvqvae_128_hf.yaml
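
If the same launch script is used on both desktop and headless machines, the workaround above can be applied conditionally. The following is a minimal sketch, assuming that an empty or unset DISPLAY variable is a good enough sign of a headless session; the helper name choose_gl_platform is hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch: select the EGL backend only when no X display is available.
# Assumption: an empty or unset DISPLAY means the session is headless.
# choose_gl_platform is a hypothetical helper, not part of this repository.
choose_gl_platform() {
  if [ -z "${DISPLAY:-}" ]; then
    echo "egl"
  fi
}

platform="$(choose_gl_platform)"
if [ -n "$platform" ]; then
  export PYOPENGL_PLATFORM="$platform"
fi
echo "PYOPENGL_PLATFORM=${PYOPENGL_PLATFORM:-<unset>}"

# Then start the demo as usual:
#   python demo.py -c ./configs/diffusion_rvqvae_128_hf.yaml
```

On a machine with a working display the variable is left untouched, so the default rendering backend is still used there.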

Eval

Requires the dataset to be downloaded first.

python test.py -c configs/diffusion_rvqvae_128.yaml

We also provide a Colab notebook for running the evaluation.

📺 Visualize

Following EMAGE, you can download the SMPLX Blender add-on and install it in Blender 3.x or 4.x. Click the Add Animation button to visualize the generated SMPLX file (e.g., xxx.npz).

🔥 Training from scratch

1. Train RVQVAE

If you have multiple GPUs, you can run these three commands in parallel.

python rvq_beatx_train.py --batch-size 256 --lr 2e-4 --total-iter 300000 --lr-scheduler 200000 --nb-code 512 --code-dim 512 --down-t 2 --depth 3 --dilation-growth-rate 3 --out-dir outputs/rvqvae --vq-act relu --quantizer ema_reset --loss-vel 0.5 --recons-loss l1_smooth --exp-name RVQVAE --body_part upper

python rvq_beatx_train.py --batch-size 256 --lr 2e-4 --total-iter 300000 --lr-scheduler 200000 --nb-code 512 --code-dim 512 --down-t 2 --depth 3 --dilation-growth-rate 3 --out-dir outputs/rvqvae --vq-act relu --quantizer ema_reset --loss-vel 0.5 --recons-loss l1_smooth --exp-name RVQVAE --body_part hands

python rvq_beatx_train.py --batch-size 256 --lr 2e-4 --total-iter 300000 --lr-scheduler 200000 --nb-code 512 --code-dim 512 --down-t 2 --depth 3 --dilation-growth-rate 3 --out-dir outputs/rvqvae --vq-act relu --quantizer ema_reset --loss-vel 0.5 --recons-loss l1_smooth --exp-name RVQVAE --body_part lower_trans
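
One way to run the three commands above side by side is to pin each one to its own GPU and start it in the background. This is a sketch rather than a script shipped with the repository: it assumes GPUs 0, 1 and 2 are free, and the train_part helper and the DRY_RUN switch are hypothetical names introduced here. With the default DRY_RUN=1 it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: launch the three RVQVAE trainings concurrently, one GPU per body part.
# Assumption: GPUs 0, 1 and 2 are available; adjust the indices to your machine.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Flags shared by all three runs, copied from the commands above.
COMMON_ARGS="--batch-size 256 --lr 2e-4 --total-iter 300000 --lr-scheduler 200000 --nb-code 512 --code-dim 512 --down-t 2 --depth 3 --dilation-growth-rate 3 --out-dir outputs/rvqvae --vq-act relu --quantizer ema_reset --loss-vel 0.5 --recons-loss l1_smooth --exp-name RVQVAE"

# train_part is a hypothetical helper: $1 = GPU index, $2 = body part.
train_part() {
  gpu_id="$1"
  body_part="$2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "CUDA_VISIBLE_DEVICES=$gpu_id python rvq_beatx_train.py $COMMON_ARGS --body_part $body_part"
  else
    # COMMON_ARGS is left unquoted on purpose so it splits into separate flags.
    CUDA_VISIBLE_DEVICES="$gpu_id" python rvq_beatx_train.py $COMMON_ARGS --body_part "$body_part"
  fi
}

gpu=0
for part in upper hands lower_trans; do
  train_part "$gpu" "$part" &   # start in the background
  gpu=$((gpu + 1))
done
wait   # block until all three background jobs have finished
```

CUDA_VISIBLE_DEVICES restricts each process to a single device, so the three runs do not compete for the same GPU memory. Run with DRY_RUN=0 to start the actual training.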

2. Train Diffusion Model

python train.py -c configs/diffusion_rvqvae_128.yaml

🙏 Acknowledgments

Thanks to EMAGE, DiffuseStyleGesture, MDM, T2M-GPT, MoMask, MotionCLIP, TMR, OpenTMA, HumanML3D, and human_body_prior, from which our code partially borrows. Please check out these useful repos.

📖 Citation

If you find our code or paper helpful, please consider citing:

@inproceedings{chen2024syntalker,
  author = {Bohong Chen and Yumeng Li and Yao-Xiang Ding and Tianjia Shao and Kun Zhou},
  title = {Enabling Synergistic Full-Body Control in Prompt-Based Co-Speech Motion Generation},
  booktitle = {Proceedings of the 32nd ACM International Conference on Multimedia},
  year = {2024},
  publisher = {ACM},
  address = {New York, NY, USA},
  pages = {10},
  doi = {10.1145/3664647.3680847}
}
