MelSpectrogram() and unspecified sampling rate #74

dsplog · 2024-05-05T01:49:48Z

in meldataset.py, i can see that all wav files are resampled to 24000sps. however, the MelSpectrogram() transform is called without a sample_rate argument, so it defaults to 16000sps.

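import torch
import torchaudio

# sample_rate is not passed here, so torchaudio's default of 16000 applies
to_mel = torchaudio.transforms.MelSpectrogram(
    n_mels=80, n_fft=2048, win_length=1200, hop_length=300)
mean, std = -4, 4  # constants used to normalize the log-mel values

def preprocess(wave):
    # wave: numpy array of audio samples
    wave_tensor = torch.from_numpy(wave).float()
    mel_tensor = to_mel(wave_tensor)
    # log-compress (1e-5 avoids log(0)), add a leading dimension, then normalize
    mel_tensor = (torch.log(1e-5 + mel_tensor.unsqueeze(0)) - mean) / std
    return mel_tensor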
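To make the mismatch concrete, here is a rough standard-library sketch of where 80 mel filter centers would land under each sampling rate. It assumes the HTK mel formula and an upper edge of sample_rate / 2, which to my understanding are the torchaudio defaults; the helper names (hz_to_mel, mel_to_hz, mel_centers) are made up for illustration and are not taken from meldataset.py.

```python
import math

def hz_to_mel(f):
    # HTK-style mel formula (assumed here to match torchaudio's default mel scale)
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_centers(sample_rate, n_mels=80, f_min=0.0):
    """Center frequencies (Hz) of n_mels triangular filters spaced evenly
    on the mel scale between f_min and sample_rate / 2."""
    m_min = hz_to_mel(f_min)
    m_max = hz_to_mel(sample_rate / 2)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]

ASSUMED_SR = 16000  # what MelSpectrogram() uses when sample_rate is omitted
ACTUAL_SR = 24000   # what the wav files are resampled to, per the issue

# Frequencies the filterbank believes it is centered on
nominal = mel_centers(ASSUMED_SR)

# FFT bin k of 24 kHz audio sits at k * ACTUAL_SR / n_fft Hz, but the filterbank
# was laid out as if it sat at k * ASSUMED_SR / n_fft Hz, so each filter ends up
# at ACTUAL_SR / ASSUMED_SR (1.5x) times its nominal frequency.
effective = [f * ACTUAL_SR / ASSUMED_SR for f in nominal]

# Centers that would result from passing sample_rate=24000 explicitly
matched = mel_centers(ACTUAL_SR)

for i in (0, 39, 79):
    print(f"filter {i:2d}: nominal {nominal[i]:8.1f} Hz, "
          f"effective {effective[i]:8.1f} Hz, matched {matched[i]:8.1f} Hz")
```

Under these assumptions the current call still covers the full 0 to 12000 Hz band of the audio, but on a frequency axis stretched by 1.5 rather than a true mel scale. The sketch does not say whether that difference matters for a model trained and run with the same transform.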

questions:

i believe feeding 24000sps audio to a transform left at its 16000sps default was an oversight?
also, how were the mean/std values of -4, 4 arrived at?

gnitoah · 2024-09-07T19:54:17Z

yl4579/StarGANv2-VC#10 and #57 should be helpful.
