Attempting! #20

binarythinktank · 2020-09-11T11:57:37Z

Hi

Trying to get this to work, with a custom sample set. I'm stuck at the end of this: https://github.com/ivanvovk/DurIAN#6-how-to-align-your-own-data

I don't see how to go from the many .TextGrid files to the 2 filelists that the config needs.
You mentioned using textgrid in python but I'm not a python dev, do you have a script to convert all of the textgrid files?

Cheers

binarythinktank · 2020-09-12T02:39:48Z

ok, i have figured out how to read the data files:

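import sys

import textgrid

# Path to one aligned TextGrid file; the file stem is passed as the first argument.
f = './alligned/wavs3/{}.TextGrid'.format(sys.argv[1])
tg = textgrid.TextGrid.fromFile(f)

# Print every interval of the second tier.
for i in tg.tiers[1].intervals:
    print(i)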

which gives me a bunch of these: Interval(0.0, 0.12, sil)

i don't entirely see how this converts to your filelist format though...

looks something like:
[original text]|[the first numbers from all of the intervals??]|[the second numbers from all of the intervals??]|[the text from all of the intervals]|[filename]

In the doc you say to "convert to frames"; is that just x 100?
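
Whether "x 100" is right depends on the mel-spectrogram settings: multiplying seconds by 100 is only correct when the hop is 10 ms (for example, 16000 Hz audio with a hop of 160 samples). In general, the number of frames per second is sample_rate / hop_length. Below is a minimal sketch of the conversion, assuming contiguous intervals that start at 0 and plain (start, end, label) tuples. The function name and the default sample_rate and hop_length values are illustrative assumptions, not values taken from this repo's config.

```python
def intervals_to_frames(intervals, sample_rate=22050, hop_length=256):
    """Convert (start_sec, end_sec, label) tuples to (label, n_frames) pairs.

    Assumes the intervals are contiguous and the first one starts at 0.
    The default sample_rate and hop_length are placeholders; use the
    values from your own audio config.
    """
    frames_per_second = sample_rate / hop_length
    result = []
    prev_edge = 0  # frame index of the previous interval boundary
    for _start, end, label in intervals:
        # Round the cumulative boundary, then take the difference, so the
        # durations always sum to the rounded total length in frames.
        edge = round(end * frames_per_second)
        result.append((label, edge - prev_edge))
        prev_edge = edge
    return result


if __name__ == "__main__":
    example = [(0.0, 0.12, "sil"), (0.12, 0.30, "HH"), (0.30, 0.50, "AH0")]
    for label, n_frames in intervals_to_frames(example):
        print(label, n_frames)
```

With the textgrid package, the tuples could be built as `[(i.minTime, i.maxTime, i.mark) for i in tg.tiers[1].intervals]`. Rounding the cumulative boundary rather than each duration keeps the total frame count from drifting.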

carankt · 2020-09-13T13:47:50Z

In https://github.com/carankt/FastSpeech2.git, check Generate_FileList.ipynb.
Hope it helps!

binarythinktank · 2020-09-14T00:08:19Z

many thanks!

binarythinktank · 2020-09-14T06:09:20Z

@carankt tried running the notebook but getting:

ModuleNotFoundError Traceback (most recent call last)
in
3 import librosa
4 import textgrid
----> 5 from utils.files import get_files

ModuleNotFoundError: No module named 'utils.files'

carankt · 2020-09-14T06:30:20Z

@binarythinktank clone the repo and then run. The notebook requires a module from the repo located in utils/files

binarythinktank · 2020-09-14T06:49:44Z

@carankt same...
(to be clear, i cloned your repo, https://github.com/carankt/FastSpeech2)

carankt · 2020-09-14T06:57:19Z

Pull the latest version of the repo; I have added the missing module.

binarythinktank · 2020-09-14T07:12:57Z

thanks, now it works.
this approach is definitely different to my attempt to pull the relevant data out.
Your train.txt file is only 4 MB; in my attempt it somehow ended up being 10 GB...

python3 .\train_fastspeech.py is failing though, I'll need to debug again.

binarythinktank · 2020-09-14T08:37:21Z

nope, i'm stuck again. If anyone has any idea...

2020-09-14 16:28:53,162 - INFO - ngpu: 1
2020-09-14 16:28:53,162 - INFO - random seed = 1
Using cache found in C:\Users\ml/.cache\torch\hub\seungwonpark_melgan_master
Model is loaded ...
New Training
Batch Size : 16
Trainable Parameters: 26.691M
Loading train data: 0%| | 0/160 [00:06<?, ?it/s]
Traceback (most recent call last):
File ".\train_fastspeech.py", line 452, in <module>
main(sys.argv[1:])
File ".\train_fastspeech.py", line 448, in main
train(args, hp, hp_str, logger, vocoder)
File ".\train_fastspeech.py", line 93, in train
for data in pbar:
File "C:\Users\ml\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\tqdm\std.py", line 1130, in __iter__
for obj in iterable:
File "C:\Users\ml\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py", line 354, in __next__
data = self._next_data()
File "C:\Users\ml\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py", line 980, in _next_data
return self._process_data(data)
File "C:\Users\ml\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py", line 1005, in _process_data
data.reraise()
File "C:\Users\ml\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\_utils.py", line 395, in reraise
raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "C:\Users\ml\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\_utils\worker.py", line 185, in _worker_loop
data = fetcher.fetch(index)
File "C:\Users\ml\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "C:\Users\ml\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\home\Downloads\FastSpeech2\dataset\dataloader.py", line 51, in __getitem__
x = phonemes_to_sequence(x)
File "D:\home\Downloads\FastSpeech2\dataset\texts\__init__.py", line 175, in phonemes_to_sequence
sequence = [_phoneme_to_id[s] for s in string]
File "D:\home\Downloads\FastSpeech2\dataset\texts\__init__.py", line 175, in <listcomp>
sequence = [_phoneme_to_id[s] for s in string]
KeyError: 'ER0'

carankt · 2020-09-15T04:33:07Z

The filelist you generated with my repo uses a different set of symbols than this repo does. To move forward, you must either adjust the filelist or change the symbols this repo expects.
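
The `KeyError: 'ER0'` in the traceback suggests that the filelist contains stress-marked ARPAbet phonemes (ER0, AH1, and so on) that are not keys in this repo's phoneme table. One way to see the full mismatch is to list every token the table does not know. Below is a rough sketch, assuming the phonemes in a filelist field are space-separated. `find_unknown_symbols` and `strip_stress` are hypothetical helpers, not functions from either repo, and whether dropping the stress digit is the right fix depends on the symbol set the model expects.

```python
import re


def find_unknown_symbols(phoneme_lines, known_symbols):
    """Return the set of space-separated tokens missing from known_symbols."""
    known = set(known_symbols)
    unknown = set()
    for line in phoneme_lines:
        unknown.update(tok for tok in line.split() if tok not in known)
    return unknown


def strip_stress(phoneme):
    """Drop a trailing ARPAbet stress digit, e.g. 'ER0' becomes 'ER'."""
    return re.sub(r"\d$", "", phoneme)


if __name__ == "__main__":
    # Illustrative data only; the real symbol table lives in the repo.
    table = ["HH", "ER", "sil"]
    print(find_unknown_symbols(["sil HH ER0 sil"], table))
```

Running the check over the phoneme field of every filelist line shows whether the mismatch is limited to stress digits or involves other symbols as well.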

binarythinktank · 2020-09-15T08:52:39Z

Which symbols? Surely not the separator, since both use the pipe character (|)?

I will, in parallel, also attempt to use your version.

Have I put the correct values into your config?

wav_path = lab_path = '../melgan/wavs3' (contains the .wav and .lab files)
csv_path = '../melgan/texts.csv' (not sure what this should be, I put a csv file I used to create the .lab files, it has [filename without .wav]|[spoken text]\n )
dict_path = 'cmudict.txt' (not sure what this is, is it the dictionary file in MFA? mine is called english.dict)
output_directory = 'training_log' (created this folder)
log_directory = 'fastspeech2-wavs3' (created this folder)
data_path = data_dir = '../melgan/data/' (output folder of preprocess from this repo, will it work?)
filelist_alignment_dir=teacher_dir = '../melgan/alligned/wavs3' (raw output from mfa, will it work?)
training_files='../melgan/filelists/train.txt' (converted from mfa dir by your repo)
validation_files='../melgan/filelists/valid.txt' (converted from mfa dir by your repo)

wrong csv_path file format?

Traceback (most recent call last):
File "train.py", line 144, in <module>
main()
File "train.py", line 48, in main
train_loader, val_loader, collate_fn = prepare_dataloaders(hparams)
File "FastSpeech2\utils\utils.py", line 13, in prepare_dataloaders
trainset = TMDPESet(hparams.training_files, hparams)
File "FastSpeech2\utils\data_utils.py", line 20, in __init__
self.audiopaths_and_text = load_filepaths_and_text(audiopaths_and_text, hparams.data_path)
File "FastSpeech2\utils\data_utils.py", line 12, in load_filepaths_and_text
file_name, text1, text2 = line.strip().split('|')
ValueError: too many values to unpack (expected 3)
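
The ValueError comes from the line `file_name, text1, text2 = line.strip().split('|')`, which expects exactly three pipe-separated fields per line, so a filelist with more fields per line will fail there. Below is a small sketch for checking a filelist before training. `find_bad_lines` is a hypothetical helper, not part of either repo.

```python
def find_bad_lines(lines, expected_fields=3, sep="|"):
    """Return (line_number, field_count) for lines with the wrong field count."""
    bad = []
    for number, line in enumerate(lines, start=1):
        n_fields = len(line.strip().split(sep))
        if n_fields != expected_fields:
            bad.append((number, n_fields))
    return bad


if __name__ == "__main__":
    # Illustrative lines only; in practice, read them from train.txt.
    sample = ["name|text one|text two", "name|a|b|c|d"]
    for number, n_fields in find_bad_lines(sample):
        print(f"line {number}: {n_fields} fields")
```

If every line reports the same unexpected count, the filelist was probably generated for a different loader format rather than being corrupted.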
