The `./examples_of_results` folder contains example results for 32 test videos.
Please first clone the repo, then create the environment and install the required dependencies:

```bash
conda create -n gvmgen python=3.9.0
conda activate gvmgen
cd GVMGen
pip install -r requirements.txt
```
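As a quick sanity check (a minimal sketch, assuming `requirements.txt` installs PyTorch, which MusicGen-based training requires), you can verify the environment before proceeding:

```python
# Minimal environment check; assumes requirements.txt installs PyTorch.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # training needs a GPU
```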
For data preprocessing, please refer to the `./data_preprocess` folder.
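The repository follows audiocraft-style (MusicGen) conventions, so each datasource folder referenced in the configs below presumably holds a `data.jsonl` manifest describing the audio files. The sketch below shows one plausible way to build such a manifest; the helper `build_manifest` and its fields are illustrative assumptions, so check `./data_preprocess` for the exact format GVMGen expects.

```python
# Illustrative sketch: build an audiocraft-style data.jsonl manifest.
# The field names follow audiocraft's AudioMeta (path, duration, sample_rate);
# GVMGen's actual preprocessing may require different or additional fields.
import json
from pathlib import Path

import torchaudio


def build_manifest(audio_dir: str, out_path: str) -> None:
    """Hypothetical helper: list every .wav under audio_dir in a jsonl manifest."""
    with open(out_path, "w") as f:
        for wav in sorted(Path(audio_dir).rglob("*.wav")):
            info = torchaudio.info(str(wav))
            entry = {
                "path": str(wav),
                "duration": info.num_frames / info.sample_rate,
                "sample_rate": info.sample_rate,
            }
            f.write(json.dumps(entry) + "\n")


build_manifest("path/to/train_folder", "path/to/train_folder/data.jsonl")
```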
- Download the MusicGen model weights (small or medium) and put them into the `./checkpoints` folder; one option is shown below.
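  One way to fetch the weights, assuming the official `facebook/musicgen-small` release on the Hugging Face Hub is the intended source (the local folder name is illustrative):

  ```python
  # Download MusicGen weights from the Hugging Face Hub (folder name illustrative).
  from huggingface_hub import snapshot_download

  snapshot_download(repo_id="facebook/musicgen-small", local_dir="./checkpoints/musicgen-small")
  ```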
- Modify the config files. Some variables must be set before training; other changes are optional, and you can refer to each `default.yaml` for details:

  | Config file | Variable | Value |
  | --- | --- | --- |
  | `config/dset/train.yaml` | `datasource.evaluate` | `path/to/eval_folder` |
  | `config/dset/train.yaml` | `datasource.generate` | `path/to/eval_folder` |
  | `config/dset/train.yaml` | `datasource.train` | `path/to/train_folder` |
  | `config/dset/train.yaml` | `datasource.valid` | `path/to/eval_folder` |
  | `config/solver/gvmgen/gvmgen.yaml` | `compression_model_checkpoint` | `path/to/musicgen_compression_model` |
  | `config/teams/default.yaml` | `default.dora_dir` | `path/to/GVMGen` |
  | `config/teams/default.yaml` | `default.reference_dir` | `path/to/GVMGen` |
  | `config/teams/default.yaml` | `darwin.dora_dir` | `path/to/GVMGen` |
  | `config/teams/default.yaml` | `darwin.reference_dir` | `path/to/GVMGen` |
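  For example, after editing, `config/dset/train.yaml` might look like the sketch below (the `datasource` nesting follows audiocraft's dset configs and is an assumption; only the keys from the table above are shown):

  ```yaml
  # Sketch of config/dset/train.yaml; nesting assumed from audiocraft conventions.
  datasource:
    train: path/to/train_folder
    valid: path/to/eval_folder
    evaluate: path/to/eval_folder
    generate: path/to/eval_folder
  ```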
- Run training:

  ```bash
  bash run.sh
  ```
- Transform the model weights:

  ```bash
  python load_model.py --checkpoint_path path/to/your_checkpoint --output_path path/to/output
  ```
- Inference:

  ```bash
  python test.py --model_path ./checkpoints/state_dict.bin --video_path test.mp4 --syn_path output --fps 1 --duration 30
  ```
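  To process a whole folder of videos, a simple loop over the documented `test.py` flags works; the `videos/` directory and per-video output naming below are illustrative:

  ```python
  # Batch inference: run test.py on every .mp4 in a folder (paths illustrative).
  import subprocess
  from pathlib import Path

  for video in sorted(Path("videos").glob("*.mp4")):
      subprocess.run(
          [
              "python", "test.py",
              "--model_path", "./checkpoints/state_dict.bin",
              "--video_path", str(video),
              "--syn_path", f"output/{video.stem}",  # one output folder per video
              "--fps", "1",
              "--duration", "30",
          ],
          check=True,
      )
  ```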
For evaluation, please refer to the `./evaluation_model` folder.
We will release our dataset and pretrained model weights soon.
You may refer to the related works that serve as foundations for our framework and code repository: CLIP and MusicGen. Thanks for their wonderful work.
```bibtex
@inproceedings{zuo2025gvmgen,
  title={GVMGen: A General Video-to-Music Generation Model With Hierarchical Attentions},
  author={Zuo, Heda and You, Weitao and Wu, Junxian and Ren, Shihong and Chen, Pei and Zhou, Mingxu and Lu, Yujia and Sun, Lingyun},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2025}
}
```