Last layer of the generator in the CNN (size 16384) #103

Open
spagliarini opened this issue Mar 16, 2021 · 1 comment

Comments

@spagliarini

Hi!

I would like to clear up a doubt I have.
Is the last layer of the generator 16384 samples long because of a constraint of the CNN architecture, or is it important for the output to be slightly longer than the duration one wants to obtain (here, 1 s)?

Thank you in advance!!!

@chrisdonahue
Owner

Ah, good question. The last layer of the generator has dimension 16384 simply because it is a power of four: each of the five layers of the generator increases the number of timesteps by a factor of four, starting from 16 (an arbitrary choice), so the output length is 16 × 4^5 = 16384.
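
For illustration, the length arithmetic can be sketched in a few lines of Python. This is not code from the repository; the function name `generator_lengths` and its parameter names are hypothetical and only mirror the numbers mentioned above (start of 16, factor of four, five layers).

```python
def generator_lengths(start=16, factor=4, num_layers=5):
    """Return the number of timesteps before and after each upsampling layer.

    Hypothetical helper: it only reproduces the arithmetic described in the
    comment (start length 16, each layer multiplies the length by 4).
    """
    lengths = [start]
    for _ in range(num_layers):
        # Each layer multiplies the current number of timesteps by `factor`.
        lengths.append(lengths[-1] * factor)
    return lengths


lengths = generator_lengths()
print(lengths)      # [16, 64, 256, 1024, 4096, 16384]
print(lengths[-1])  # 16384, which equals 4 ** 7
```

With a different starting length or number of layers, the same loop gives the corresponding output length, which is why 16384 follows from the design choices rather than from a hard constraint.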

This output length could represent any amount of time depending on the sampling rate, but 16 kHz is a common sampling rate in speech processing, and at that rate 16384 samples conveniently works out to around one second of generated audio (1.024 s, close to our goal), so we went with that.
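
As a quick check of how the duration depends on the sampling rate, here is a minimal sketch. The helper `duration_seconds` is a hypothetical name, and the sampling rates other than 16 kHz are just example values, not settings taken from the project.

```python
def duration_seconds(num_samples, sample_rate_hz):
    """Duration in seconds of `num_samples` audio samples at `sample_rate_hz`."""
    return num_samples / sample_rate_hz


NUM_SAMPLES = 16384  # generator output length discussed above

# 16000 Hz is the rate mentioned in the comment; the others are illustrative.
for rate in (8000, 16000, 44100):
    print(f"{rate} Hz -> {duration_seconds(NUM_SAMPLES, rate):.3f} s")
```

At 16000 Hz this gives 1.024 s, which is the "around one second" figure; the same 16384 samples would last longer at a lower rate and shorter at a higher one.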
