The following table shows examples of copy synthesis and pitch shifting with the vocoder. Shifts are given in semitones (smt.).
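Sample | Source | -12 smt. | -8 smt. | -4 smt. | 0 smt. (normal) | +4 smt. | +8 smt. | +12 smt. |
---|---|---|---|---|---|---|---|---|
01 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |
02 | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) | (audio) |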
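For orientation, the semitone offsets in the table map to F0 scaling factors under 12-tone equal temperament: a shift of n semitones multiplies the fundamental frequency by 2^(n/12). The snippet below is a minimal sketch of that arithmetic only. The helper names `semitone_ratio` and `shift_f0` are hypothetical and are not part of the vocoder's API, and treating a value of 0 as an unvoiced frame is an assumption.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


def shift_f0(f0_hz, semitones):
    """Scale each F0 value (Hz) by the semitone ratio.

    Assumption: unvoiced frames are marked with 0.0, which stays 0.0 after scaling.
    """
    ratio = semitone_ratio(semitones)
    return [f * ratio for f in f0_hz]


# The shifts demonstrated in the table above.
SHIFTS = [-12, -8, -4, 0, 4, 8, 12]

if __name__ == "__main__":
    for s in SHIFTS:
        print(f"{s:+d} smt. -> F0 x {semitone_ratio(s):.4f}")
```

A shift of +12 smt. therefore doubles F0 (one octave up), and -12 smt. halves it.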
The training data for the vocoders was provided by, and is used with the permission of, the following organizations, societies, or individuals:
The following public datasets are used:
Dataset | Link |
---|---|
Opencpop | https://wenet.org.cn/opencpop/ |
CCMUSIC | https://ccmusic-database.github.io/index.html |
SingingVoiceDataset | http://isophonics.net/SingingVoiceDataset |
東北イタコ | https://zunko.jp/itadev/login.php |
御丹宮くるみ | https://onikuru.info/db-download/ |
PJS | https://sites.google.com/site/shinnosuketakamichi/research-topics/pjs_corpus |
HTS | https://hts.sp.nitech.ac.jp/ |
TIGER 2.0 | https://tigermeat.xyz |
The model weights are licensed under CC BY-NC-SA 4.0. Anyone who distributes the model weights should include a copy of the license, a notice stating that the models are provided by the OpenVPI Community (or DiffSinger Community), and a link to this page (or a complete contribution list).
We regret having to publish a verified Registry of Hostile Conduct (shown below). This registry documents individuals and entities who have engaged in long-term destructive activities against the development team.
We solemnly declare:
- We strongly recommend that all users review this registry before downloading and using this vocoder.
- No technical or legal restrictions are currently imposed on listed parties, as the vocoder is still licensed under CC BY-NC-SA 4.0.
- We reserve the right to apply further restrictions in case of persistent malicious acts.
Name | Identifiers | Reason |
---|---|---|
旋转_turning_point | QQ: 2673587414; Bilibili UID: 285801087; Discord username: colstone233 | Engaging in long-term hostile and personal attacks against developers, repeatedly spreading false information about DiffSinger and the development team, and interfering with the development process of the vocoder and other projects in the community |
Model | Date | Specifications | Dataset | Iterations | Link |
---|---|---|---|---|---|
NSF-HiFiGAN | 2022-12-11 | 44.1 kHz sampling rate, hop size 512, 128 mel bins, input frequency range 40-16000 Hz | ~93h singing | >= 1M | link |
NSF-HiFiGAN | 2024-02-19 | 44.1 kHz sampling rate, hop size 512, 128 mel bins, input frequency range 40-16000 Hz | ~72h singing (for fine-tuning) | 110K | link |
PC-NSF-HiFiGAN | 2025-02-27 | 44.1 kHz sampling rate, hop size 512, 128 mel bins, input frequency range 40-16000 Hz, pitch-controllable | ~79h singing (for fine-tuning) | 40K+108K | link |
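The specifications shared by all three models imply a fixed frame rate and mel-spectrogram shape, which is useful when checking that an acoustic model's output matches the vocoder's expected input. The following is a rough sketch of that arithmetic. The constant and function names are hypothetical, the reading of "input frequency" as the lower and upper bounds of the mel filterbank is an assumption, and the frame-count convention (ceiling of samples divided by hop size) is also an assumption, since padding conventions differ between implementations.

```python
# Values taken from the specifications column above.
SAMPLE_RATE = 44100   # Hz
HOP_SIZE = 512        # samples between consecutive mel frames
NUM_MELS = 128        # mel bins per frame
FMIN_HZ = 40          # assumed: lower edge of the mel filterbank
FMAX_HZ = 16000       # assumed: upper edge of the mel filterbank


def frames_per_second(sample_rate: int = SAMPLE_RATE, hop_size: int = HOP_SIZE) -> float:
    """Number of mel frames produced per second of audio."""
    return sample_rate / hop_size


def frame_duration_ms(sample_rate: int = SAMPLE_RATE, hop_size: int = HOP_SIZE) -> float:
    """Duration covered by one hop, in milliseconds."""
    return 1000.0 * hop_size / sample_rate


def mel_shape(num_samples: int, hop_size: int = HOP_SIZE, num_mels: int = NUM_MELS):
    """(mel bins, frames) for a clip, assuming frames = ceil(num_samples / hop_size)."""
    num_frames = -(-num_samples // hop_size)  # integer ceiling division
    return (num_mels, num_frames)


# The upper mel bound must not exceed the Nyquist frequency.
assert FMAX_HZ <= SAMPLE_RATE / 2

if __name__ == "__main__":
    print(f"{frames_per_second():.2f} frames/s, {frame_duration_ms():.2f} ms per frame")
    print("mel shape for 1 s of audio:", mel_shape(SAMPLE_RATE))
```

Under these assumptions, one second of audio corresponds to roughly 86 frames of 128 mel bins each.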