The `./examples_of_results` folder contains example results for 32 test videos.
Please first clone the repo, then create the environment and install the required dependencies:

```bash
conda create -n gvmgen python=3.9.0
conda activate gvmgen
cd GVMGen
pip install -r requirements.txt
```
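As a quick sanity check (a minimal sketch, assuming `requirements.txt` installs PyTorch, which MusicGen-based training requires), you can verify the environment before proceeding:

```python
# Minimal environment check; assumes requirements.txt installs PyTorch.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # training needs a GPU
```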
For data preprocessing, please refer to the `./data_preprocess` folder.
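The repository follows audiocraft-style (MusicGen) conventions, so each datasource folder referenced in the configs below presumably holds a `data.jsonl` manifest describing the audio files. The sketch below shows one plausible way to build such a manifest; the helper `build_manifest` and its fields are illustrative assumptions, so check `./data_preprocess` for the exact format GVMGen expects.

```python
# Illustrative sketch: build an audiocraft-style data.jsonl manifest.
# The field names follow audiocraft's AudioMeta (path, duration, sample_rate);
# GVMGen's actual preprocessing may require different or additional fields.
import json
from pathlib import Path

import torchaudio


def build_manifest(audio_dir: str, out_path: str) -> None:
    """Hypothetical helper: list every .wav under audio_dir in a jsonl manifest."""
    with open(out_path, "w") as f:
        for wav in sorted(Path(audio_dir).rglob("*.wav")):
            info = torchaudio.info(str(wav))
            entry = {
                "path": str(wav),
                "duration": info.num_frames / info.sample_rate,
                "sample_rate": info.sample_rate,
            }
            f.write(json.dumps(entry) + "\n")


build_manifest("path/to/train_folder", "path/to/train_folder/data.jsonl")
```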
- Download the MusicGen model weights (small or medium) and put them into the `./checkpoints` folder; one option is shown below.
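  One way to fetch the weights, assuming the official `facebook/musicgen-small` release on the Hugging Face Hub is the intended source (the local folder name is illustrative):

  ```python
  # Download MusicGen weights from the Hugging Face Hub (folder name illustrative).
  from huggingface_hub import snapshot_download

  snapshot_download(repo_id="facebook/musicgen-small", local_dir="./checkpoints/musicgen-small")
  ```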
- Modify the config files. Some variables must be set before training; other changes are optional, and you can refer to each `default.yaml` for details:

  | Config file | Variable | Value |
  | --- | --- | --- |
  | `config/dset/train.yaml` | `datasource.evaluate` | `path/to/eval_folder` |
  | `config/dset/train.yaml` | `datasource.generate` | `path/to/eval_folder` |
  | `config/dset/train.yaml` | `datasource.train` | `path/to/train_folder` |
  | `config/dset/train.yaml` | `datasource.valid` | `path/to/eval_folder` |
  | `config/solver/gvmgen/gvmgen.yaml` | `compression_model_checkpoint` | `path/to/musicgen_compression_model` |
  | `config/teams/default.yaml` | `default.dora_dir` | `path/to/GVMGen` |
  | `config/teams/default.yaml` | `default.reference_dir` | `path/to/GVMGen` |
  | `config/teams/default.yaml` | `darwin.dora_dir` | `path/to/GVMGen` |
  | `config/teams/default.yaml` | `darwin.reference_dir` | `path/to/GVMGen` |
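  For example, after editing, `config/dset/train.yaml` might look like the sketch below (the `datasource` nesting follows audiocraft's dset configs and is an assumption; only the keys from the table above are shown):

  ```yaml
  # Sketch of config/dset/train.yaml; nesting assumed from audiocraft conventions.
  datasource:
    train: path/to/train_folder
    valid: path/to/eval_folder
    evaluate: path/to/eval_folder
    generate: path/to/eval_folder
  ```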
- Run training:

  ```bash
  bash run.sh
  ```
- Transform the model weights:

  ```bash
  python load_model.py --checkpoint_path path/to/your_checkpoint --output_path path/to/output
  ```
- Inference:

  ```bash
  python test.py --model_path ./checkpoints/state_dict.bin --video_path test.mp4 --syn_path output --fps 1 --duration 30
  ```
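  To process a whole folder of videos, a simple loop over the documented `test.py` flags works; the `videos/` directory and per-video output naming below are illustrative:

  ```python
  # Batch inference: run test.py on every .mp4 in a folder (paths illustrative).
  import subprocess
  from pathlib import Path

  for video in sorted(Path("videos").glob("*.mp4")):
      subprocess.run(
          [
              "python", "test.py",
              "--model_path", "./checkpoints/state_dict.bin",
              "--video_path", str(video),
              "--syn_path", f"output/{video.stem}",  # one output folder per video
              "--fps", "1",
              "--duration", "30",
          ],
          check=True,
      )
  ```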
For evaluation, please refer to the `./evaluation_model` folder.
We will release our dataset and pretrained model weights soon.
You may refer to the related works that serve as foundations for our framework and code repository: CLIP and MusicGen. Thanks for their wonderful work.
```bibtex
@inproceedings{zuo2025gvmgen,
  title={GVMGen: A General Video-to-Music Generation Model With Hierarchical Attentions},
  author={Zuo, Heda and You, Weitao and Wu, Junxian and Ren, Shihong and Chen, Pei and Zhou, Mingxu and Lu, Yujia and Sun, Lingyun},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2025}
}
```