Error in the docs and in combining layers #329
Thanks for raising these, @marcobonici! Both are now addressed in #330. For the first issue, this is a mistake in the docs. For your second issue, the following works just fine :)

using Zygote, Bijectors, Functors, Distributions, Statistics  # Distributions for MvNormal/logpdf, Statistics for mean/cov
b = PlanarLayer(2) ∘ PlanarLayer(2)
θs, reconstruct = Functors.functor(b);
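# Objective: negative log-likelihood of the data under the flow-transformed base distribution.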
struct NLLObjective{R,D,T}
reconstruct::R
basedist::D
data::T
end
# CHANGED: `θs...` -> `θs` so we maintain the same container-structure.
function (obj::NLLObjective)(θs)
transformed_dist = transformed(obj.basedist, obj.reconstruct(θs))
return -sum(Base.Fix1(logpdf, transformed_dist), eachcol(obj.data))
end
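# Synthetic training data; the base distribution is a standard 2-D normal.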
xs = randn(2, 1000);
f = NLLObjective(reconstruct, MvNormal(2, 1), xs);
@info "Initial loss: $(f(θs))"
# CHANGED: `1e-3` -> `1e-4` because I ran into divergences when using `1e-3`.
ε = 1e-4;
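# Plain gradient descent on the flow parameters (at the REPL; a script would need `global θs` inside the loop).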
for i in 1:200
# CHANGED: `θs... -> θs` again due to above.
# CHANGED: `Zygote.gradient` returns a tuple of length equal to number of args,
# so we need to unpack the first component to get the gradient of `θs`.
(∇s,) = Zygote.gradient(f, θs)
# CHANGED: `map` -> `fmap` so we can map over nested `NamedTuple`s, etc.
θs = fmap(θs, ∇s) do θ, ∇
θ - ε .* ∇
end
end
@info "Finall loss: $(f(θs))"
samples = rand(transformed(f.basedist, f.reconstruct(θs)), 1000);
mean(eachcol(samples)) # ≈ [0, 0]
cov(samples; dims=2) # ≈ I

Basically, the example in the docs is not really the correct way to do things; it just happened to work out 😬 To be compatible with nested transformations, etc., we need to use `fmap`.
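For illustration, a minimal sketch of the difference (with made-up parameter trees `ps` and `gs`, not from this thread): `fmap` recurses through nested tuples and `NamedTuple`s and can walk two trees of the same shape in lockstep, while `map` only touches the outermost container:

using Functors
ps = ((w = [1.0, 2.0], b = [0.5]), (w = [3.0, 4.0], b = [0.1]))
gs = ((w = [0.1, 0.1], b = [0.2]), (w = [0.3, 0.3], b = [0.4]))
# `fmap` applies the update at each array leaf, recursing through the nesting;
# `map` would hand whole `NamedTuple`s to the function and error on `-`.
fmap((p, g) -> p - 0.1 .* g, ps, gs)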
It seems like #330 fixed this, so I'm closing; please feel free to reopen if there are still problems :)
I need to train a normalizing flow on some samples and then use it as a distribution. @Red-Portal suggested that I use Bijectors.jl. However, when trying to follow the example in the documentation, I noticed there is an error in it. For my use case that is not important, but I thought it was worth mentioning.
So I tried a PlanarLayer for my use case, but it did not work: the training ran, but the result was not satisfying. Then, as suggested by the tutorial, I tried composing a couple of layers to see if it would improve performance... and I got an error when I reached the training step.
Below is the MWE to reproduce the error. I am doing what is suggested in the docs, but maybe I misinterpreted something...? Thanks in advance for your help!
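(A hedged sketch of the failing docs-style update step, reconstructed from the CHANGED notes in the comment above and reusing `f`, `θs`, and `ε` from that setup; the exact MWE is assumed, not verbatim:)

# Hypothetical reconstruction of the old docs-style training step:
∇s = Zygote.gradient(f, θs...)   # splats the parameters, as in the old docs example
θs = map(θs, ∇s) do θ, ∇         # `map` cannot recurse into nested `NamedTuple`s,
    θ - ε .* ∇                   # so this errors once the layers are composed
end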