fix: change padding style to tf style #4

unemployed-denizen · 2022-06-10T15:13:45Z

Main changes are the paddings in convolution layers (I have done the experiments on tf to make sure that the padding works). I also rewiring the last down conv layer, since it is not used in the tf version. Please check the examples to see if the quality of output is increased.
I also edited test_estimator.py so that it fits the new version of pytorch, and the inference is also supporting GPU.

tuan3w · 2022-06-18T12:12:39Z

Hi @unemployed-denizen,
Thanks for this investigation. I haven't learned more about Pytorch/TensorFlow for years, so I'm not sure about this.
Just checked Spectrogram in Audacity and spectrogram differences. Here is what I found.

Spectrogram in Audacity:

From top to bottom: Old result, reference result, new result.

The shape of output:
- Old result: [2, 481280]
- Reference & new result: [2, 479832]
AVG Spec difference between each result and the original output:
- Old result: 1.8341962099075317
- New result: 0.0863308310508728

The shape of the new output matches the shape of the original one but there is still a large enough difference. Do you have any ideas to improve this?

unemployed-denizen · 2022-06-22T07:22:08Z

Hi @unemployed-denizen, Thanks for this investigation. I haven't learned more about Pytorch/TensorFlow for years, so I'm not sure about this. Just checked Spectrogram in Audacity and spectrogram differences. Here is what I found.

Spectrogram in Audacity:

From top to bottom: Old result, reference result, new result.

The shape of output:

Old result: [2, 481280]

Reference & new result: [2, 479832]

AVG Spec difference between each result and the original output:

Old result: 1.8341962099075317

New result: 0.0863308310508728

The shape of the new output matches the shape of the original one but there is still a large enough difference. Do you have any ideas to improve this?

For shape issue, it's all about how the padding of stft and istft works. I changed both of them to 'centre' so that the padding works in the right way.
For computation accuracy, it might depend on the stft lib and framwork itself. Even if the procedure is 100% same, same input will give different output (the difference could be small but would not be 100% same). It would be hard to figure out what happends in the underlying computation process.
By the way, I would recommend Demucs v3.0.4, since it is SOTA in 2022.

tuan3w · 2022-06-24T13:05:39Z

Hi @unemployed-denizen,

Sorry for the late reply. Actually, I don't have much passion for ML nowadays because ML is now more about scaling computing power.
Yes, I'm using the latest version of Demucs. My evaluation shows that your result is better, but checking the spectrogram shows that my old result seems better at higher frequencies.

However, I'm going to merge this. Anyone who uses this library should be aware of this difference.

Again, thanks for your time.

Ref issue: #2

changed padding to tf style

d8f1356

tuan3w changed the title ~~changed padding to tf style~~ fix: change padding style to tf style Jun 24, 2022

tuan3w merged commit 88249f5 into tuan3w:master Jun 24, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: change padding style to tf style #4

fix: change padding style to tf style #4

unemployed-denizen commented Jun 10, 2022

tuan3w commented Jun 18, 2022 •

edited

Loading

unemployed-denizen commented Jun 22, 2022

tuan3w commented Jun 24, 2022

fix: change padding style to tf style #4

fix: change padding style to tf style #4

Conversation

unemployed-denizen commented Jun 10, 2022

tuan3w commented Jun 18, 2022 • edited Loading

unemployed-denizen commented Jun 22, 2022

tuan3w commented Jun 24, 2022

tuan3w commented Jun 18, 2022 •

edited

Loading