Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quadrature with Dirichlet and Matrix Beta #157

Open
CharlesRSmith44 opened this issue Mar 24, 2021 · 4 comments
Open

Quadrature with Dirichlet and Matrix Beta #157

CharlesRSmith44 opened this issue Mar 24, 2021 · 4 comments

Comments

@CharlesRSmith44
Copy link

Hello, I'm trying to use quadrature to compute basic statistics for different multivariate distributions such as Dirichlet and Matrix Beta, such as the sum(x_i), variance(sum(x_i)) and mean of sqrt(sum(x_i^2)) where x_i = 1, 2, ..., N is drawn from the Dirichlet or Matrix Beta distribution with parameters a0[1:N] or a0[1:N] and b0[1:N].

For a univariate distribution, I can do this just fine: i.e.

N =2 

#User Inputs
α0 = 10 .* (1 .- rand(N))
β0 = 50 .* (1 .- rand(N))
reltol_val = 1e-5
abstol_val = 1e-5
alg = HCubatureJL()

for i = 1:N
    prob = QuadratureProblem(f,0.0,1.0,[α0[i]...,β0[i]...])
    sol = Quadrature.solve(prob,alg,reltol=reltol_val,abstol=abstol_val)[1]
    final_sum += sol
end    

However, since these are multivariate distributions, I can't just forloop to compute the mean. To compute a similar metric, I try:

N =2 

#User Inputs
α0 = 10 .* (1 .- rand(N))
β0 = 50 .* (1 .- rand(N))
reltol_val = 1e-5
abstol_val = 1e-5
alg = HCubatureJL()

dist_pdf(x,p) = exp(DistributionsAD.logpdf(DistributionsAD.TuringDirichlet(N,p[1]),x[1]))
f(x,p) = dist_pdf(x,p) .* x
prob = QuadratureProblem(f,0.0,1.0,[α0...])
sol = Quadrature.solve(prob,alg,reltol=reltol_val,abstol=abstol_val)[1]

Here I get the error: MethodError: no method matching logpdf(::DistributionsAD.TuringDirichlet{Float64,Array{Float64,1}}, ::Float64)

N =2 

#User Inputs
α0 = 10 .* (1 .- rand(N))
β0 = 50 .* (1 .- rand(N))
reltol_val = 1e-5
abstol_val = 1e-5
alg = HCubatureJL()

dist_pdf(x,p) = exp(DistributionsAD.logpdf(DistributionsAD.MatrixBeta(N,p[1],p[2]),x[1]))
f(x,p) = dist_pdf(x,p) .* x
prob = QuadratureProblem(f,0.0,1.0,[α0...])
sol = Quadrature.solve(prob,alg,reltol=reltol_val,abstol=abstol_val)[1]

Here I get the error: no method matching logpdf(::MatrixBeta{Float64,Wishart{Float64,PDMats.ScalMat{Float64},Int64}}, ::Float64)

Is there simply no way to do this for multivariate distributions? Or am I making a mistake somewhere?

Thank you very much,

@devmotion
Copy link
Member

Hej!

It's not exactly clear to me what you want to do. Do you want to compute the expectation E f(X) for some function f and random variable X, where X is distributed according to a univariate, multi-, or matrixvariate distribution?

For the univariate case Distributions.jl provides a function expectation and Expectations.jl is a whole package dedicated to these computations. It should be possible to extend these approaches to multi- and matrixvariate distributions (AFAIK currently they are not supported) but you would have to study these packages and open issues over there.

Regarding your examples, they fail since you try to evaluate the logpdf of some higher-dimensional distribution at a scalar value which is not possible. You need something in the support of these distributions. More concretely, x[1] is always a scalar but would have to be a vector from the probability simplex in the case of (Turing)Dirichlet and a matrix from the support of the matrixvariate Beta distribution in the case of MatrixBeta (see https://en.wikipedia.org/wiki/Matrix_variate_beta_distribution).

In general, it is not clear to me why you use DistributionsAD.jl, and e.g. TuringDirichlet instead of Distributions.Dirichlet. DistributionsAD also just reexports Distributions.logpdf and Distributions.MatrixBeta. Additionally, you can simplify your code a bit by using Distributions.pdf(distribution, x) which is defined as exp(Distributions.logpdf(distribution, x)).

@fernando-duarte
Copy link

Hi David! Thanks for your reply. I am working with @smitch151 on the same project.

You are right that we were passing a scalar as the input to the (Turing)Dirichlet pdf when it should have been a vector. Thanks for catching that! In addition, I agree that in the code, as provided, there is no reason to use DistributionsAD rather than Distributions, and little reason to use Quadrature.jl over Expectations.jl. However, in the end, we do want to use DistributionsAD to solve a larger optimization problem where AD over the parameters of distributions is very useful, and integrals are high-dimensional, which is not to my understanding doable in Expectations.jl.

After fixing the scalar vs vector mistake and going a bit further, we realized one source of our issues is that the latest release of Distributions (v0.24.15) always outputs 0.0 for the pdf of the Dirichlet distribution at a point x whenever x is not in the support. In contrast, DistributionsAD (I tried with the latest release v0.6.20 and the master branch) returns either a non-zero value for some points outside of the support, and errors out if any of the elements of x is negative. The behavior of DistributionsAD is the same as in older versions of Distributions (I checked v0.23.8).

So one super useful fix would be to have DistributionsAD be consistent with Distribtions v0.24.15. Is there a way for us to achieve this?

The code that shows what I just wrote:

using Pkg
Pkg.add(Pkg.PackageSpec(name="Distributions", version="0.23.8"))
Pkg.add(Pkg.PackageSpec(name="DistributionsAD", version="0.6.20"))
using Distributions, DistributionsAD

# test pdf of Dirichlet distribution
α = [0.43,0.21]
d = Distributions.Dirichlet(α)
Turing_d = DistributionsAD.TuringDirichlet(α)

x_not_sup = [0.5,0.8] # a point not in the support
x_neg = [-0.5,1.5] # a point with a negative entry

Distributions.insupport(d,x_not_sup)
Distributions.insupport(d,x_neg)

Distributions.pdf(d,x_not_sup) # a non-zero number
Distributions.pdf(d,x_neg) # throws an error

DistributionsAD.pdf(Turing_d,x_not_sup) # a non-zero number equal to Distributions.pdf(d,x_not_sup)
DistributionsAD.pdf(Turing_d,x_neg) # throws an error

Pkg.add(Pkg.PackageSpec(name="Distributions", version="0.24.15"))
Pkg.add(Pkg.PackageSpec(url="https://github.com/TuringLang/DistributionsAD.jl", rev="master"))
using Distributions, DistributionsAD

# test pdf of Dirichlet distribution
α = [0.43,0.21]
d = Distributions.Dirichlet(α)
Turing_d = DistributionsAD.TuringDirichlet(α)

x_not_sup = [0.5,0.8] # a point not in the support
x_neg = [-0.5,1.5] # a point with a negative entry

#insupport works well
Distributions.insupport(d,x_not_sup)
Distributions.insupport(d,x_neg)

# in the new version of Distributions, the issue is solved
Distributions.pdf(d,x_not_sup) # now gives zero! (the desired behavior)
Distributions.pdf(d,x_neg)  # now gives zero! (the desired behavior)

# the master branch of DistributionsAD behaves like Distributions v0.23.8
# would be great if pdf evaluates to 0.0 if evaluated outside its support
DistributionsAD.pdf(Turing_d,x_not_sup) # a non-zero number
DistributionsAD.pdf(Turing_d,x_neg) # throws an error

@fernando-duarte
Copy link

I realized the issue is now quite different. I will open a new issue and maybe @smitch151 can close this one?

@CharlesRSmith44
Copy link
Author

CharlesRSmith44 commented Mar 30, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants