4.0: DDSP Cascade Diffusion Model
Unzip the demo model into the exp directory and the sample audio files into the main directory, then run the demo samples:
# opencpop (1st speaker)
python main_diff.py -i samples/source.wav -diff exp/diffusion-new-demo/model_200000.pt -o samples/svc-opencpop+12key.wav -id 1 -k 12 -kstep 100
# kiritan (2nd speaker)
python main_diff.py -i samples/source.wav -diff exp/diffusion-new-demo/model_200000.pt -o samples/svc-kiritan+12key.wav -id 2 -k 12 -kstep 100
# mix the timbre of opencpop and kiritan in a 0.5 to 0.5 ratio
python main_diff.py -i samples/source.wav -diff exp/diffusion-new-demo/model_200000.pt -o samples/svc-opencpop_kiritan_mix+12key.wav -mix "{1:0.5,2:0.5}" -k 12 -kstep 100
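The three demo commands above differ only in the output path and in whether a single speaker (-id) or a timbre mix (-mix) is selected, so they can be generated from one small script. The following is a minimal sketch, not part of the repository: build_command is a hypothetical helper, and it uses only the flags and paths that appear in the commands above. It prints the commands rather than running them.

```python
import shlex

# Paths copied from the demo commands above.
SOURCE = "samples/source.wav"
MODEL = "exp/diffusion-new-demo/model_200000.pt"


def build_command(output, speaker_id=None, mix=None, key=12, kstep=100):
    """Return the argument list for one main_diff.py demo run (hypothetical helper)."""
    if (speaker_id is None) == (mix is None):
        raise ValueError("pass exactly one of speaker_id or mix")
    cmd = ["python", "main_diff.py", "-i", SOURCE, "-diff", MODEL, "-o", output]
    if mix is not None:
        # Format the speaker-to-weight mapping as in the demo: {1:0.5,2:0.5}
        mix_arg = "{" + ",".join(f"{spk}:{w}" for spk, w in mix.items()) + "}"
        cmd += ["-mix", mix_arg]
    else:
        cmd += ["-id", str(speaker_id)]
    cmd += ["-k", str(key), "-kstep", str(kstep)]
    return cmd


commands = [
    build_command("samples/svc-opencpop+12key.wav", speaker_id=1),
    build_command("samples/svc-kiritan+12key.wav", speaker_id=2),
    build_command("samples/svc-opencpop_kiritan_mix+12key.wav", mix={1: 0.5, 2: 0.5}),
]

for cmd in commands:
    # Print only; pass cmd to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Keeping each command as an argument list avoids shell-quoting problems with the braces in the -mix value if the list is later handed to subprocess.run.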
The training data for this 2-speaker model comes from opencpop and kiritan.
Thanks to CN_ChiTu for helping to train this model.