mean([1±1, 2±1]) should give a scalar #104
The problem here is described in #88 (comment).
Yeah, this is not good:

```julia
julia> using Distributions
[ Info: Precompiling Distributions [31c24e10-a181-5473-b8eb-7969acd0382f]
julia> d = MvNormal(ones(3))
ZeroMeanDiagNormal(
dim: 3
μ: 3-element FillArrays.Zeros{Float64,1,Tuple{Base.OneTo{Int64}}} = 0.0
Σ: [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0]
)
julia> mean(d)
3-element FillArrays.Zeros{Float64,1,Tuple{Base.OneTo{Int64}}} = 0.0
julia> mean(rand(d,1000))
-0.018331363866233977
```

MeasureTheory doesn't have anything for this yet, but we should make sure we don't end up with strange behavior like this. I think the core of the problem is that a sample from `d` is a flat matrix, so `mean` averages over every entry instead of over the samples. Compare:

```julia
julia> mean(eachcol(rand(d,1000)))
3-element Array{Float64,1}:
-0.007034913250783585
-0.02964041764216485
-0.028973082765793752
```

Distributions.jl made this choice for efficiency, but now it could make more sense to use something like ArraysOfArrays.jl. @baggepinnen, if MeasureTheory were to use `typeof(mean(d)) == typeof(mean(rand(d, n)))`, would that fix things?
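As a minimal sketch of the invariant being discussed (nothing here is MeasureTheory's API; it only restates the examples above with explicit checks):

```julia
# Minimal sketch of the shape mismatch discussed above; the asserts just
# restate the observed behavior, this is not MeasureTheory's API.
using Distributions, Statistics

d = MvNormal(ones(3))                  # zero-mean, unit-variance, dim 3

μ     = mean(d)                        # 3-element vector
flat  = mean(rand(d, 1000))            # a single Float64: mean over *all* matrix entries
bycol = mean(eachcol(rand(d, 1000)))   # 3-element Vector{Float64}: mean over samples

@assert flat isa Number                # the surprising scalar from the issue
@assert length(bycol) == length(μ)     # samples-as-vectors restores the expected shape
```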
More:

```julia
julia> mean(Any[0.411 ± 0.12, 0.342 ± 0.099])
Particles{Float64,2000}
0.3765 ± 0.0763
julia> mean([0.411 ± 0.12, 0.342 ± 0.099])
2-element Array{Float64,1}:
0.41099999999999987
0.342
```

and

```julia
julia> std([1.0±1, 10±1])
Particles{Float64,2000}
10.1 ± 0.994
```
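For illustration only (this is not MCM's actual code), the discrepancy above is what you get when a `mean` method is specialized on the element type: `Vector{T}` hits the specialized method while `Vector{Any}` falls back to the generic `sum(x)/length(x)`:

```julia
# Toy illustration, not MonteCarloMeasurements' implementation: a mean method
# specialized on the element type makes Vector{Toy} and Vector{Any} disagree.
using Statistics

struct Toy
    samples::Vector{Float64}   # plays the role of the particles
end

# Arithmetic needed by the generic fallback sum(x)/length(x):
Base.:+(a::Toy, b::Toy) = Toy(a.samples .+ b.samples)
Base.:/(a::Toy, n::Number) = Toy(a.samples ./ n)

# Specialized method: treat the vector as multivariate and reduce each element
# over its internal samples, returning plain numbers.
Statistics.mean(v::Vector{Toy}) = [mean(t.samples) for t in v]

a, b = Toy(randn(1000) .+ 1), Toy(randn(1000) .+ 2)

mean([a, b])     # Vector{Float64} ≈ [1.0, 2.0]  (specialized method)
mean(Any[a, b])  # a Toy whose samples average ≈ 1.5  (generic fallback, like the Any[] case above)
```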
I have a feeling that it's a step in the right direction, but it might not completely solve the problem experienced here; the problem in MCM can be explained in much the same way. There is still one problem though: the data storage layout between the two examples is transposed, and so their meaning is also transposed. @mauro3 is expecting the sample mean of the vector of particles, viewing the vector of particles not as a multivariate distribution but as a sample from a univariate distribution (which happens to be particle valued). The only way I can see out of this is to have two different functions: one that treats the data as coming from a multivariate distribution, and one that considers the data as coming from a vector-valued univariate distribution (as weird as that sounds), i.e., one function per interpretation.
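A minimal sketch of what two such functions could look like; the names `univariate_mean` and `multivariate_mean` are made up for illustration and are not MCM's API (the `particles` field is used directly to avoid assuming how `mean` behaves on a single `Particles`):

```julia
# Hypothetical names, not MCM's API: one function per interpretation of a
# Vector{Particles}.
using MonteCarloMeasurements, Statistics

v = [1 ± 1, 2 ± 1]                 # Vector{Particles{Float64,2000}}

# Interpretation 1: v is a sample from a particle-valued univariate quantity,
# so reduce over the vector and keep the particle dimension.
univariate_mean(v) = sum(v) / length(v)                      # returns a Particles

# Interpretation 2: v is one multivariate quantity, so reduce each coordinate
# over its own particles.
multivariate_mean(v) = [mean(p.particles) for p in v]        # returns a Vector{Float64}

univariate_mean(v)     # ≈ 1.5 ± 0.71
multivariate_mean(v)   # ≈ [1.0, 2.0]
```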
I haven't thought about this problem in a while in the MCM context, but I've been having some good luck lately with ArraysOfArrays.jl. Maybe that can help? Example:

```julia
julia> using Distributions
julia> using ArraysOfArrays
julia> d = MvNormal(ones(3))
ZeroMeanDiagNormal(
dim: 3
μ: 3-element FillArrays.Zeros{Float64, 1, Tuple{Base.OneTo{Int64}}} = 0.0
Σ: [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0]
)
julia> mean(d)
3-element FillArrays.Zeros{Float64, 1, Tuple{Base.OneTo{Int64}}} = 0.0
help?> nestedview
search: nestedview
nestedview(A::AbstractArray{T,M+N}, M::Integer)
nestedview(A::AbstractArray{T,2})
AbstractArray{<:AbstractArray{T,M},N}
View array A in as an M-dimensional array of N-dimensional arrays by wrapping it into an ArrayOfSimilarArrays.
It's also possible to use a StaticVector of length S as the type of the inner arrays via
nestedview(A::AbstractArray{T}, ::Type{StaticArrays.SVector{S}})
nestedview(A::AbstractArray{T}, ::Type{StaticArrays.SVector{S,T}})
julia> mean(nestedview(rand(d,1000),1))
3-element Vector{Float64}:
0.02048182849608377
-0.03202973353056356
0.01099568630469374
```
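As a small follow-up with a plain random matrix, just to show that the wrapping is free: `nestedview` views the matrix without copying, and `flatview` gives the flat matrix back:

```julia
# nestedview/flatview round trip on a plain matrix (same idea as above).
using ArraysOfArrays, Statistics

X  = randn(3, 1000)      # 3 coordinates × 1000 samples
nv = nestedview(X, 1)    # 1000-element vector of 3-element column views, no copy
mean(nv)                 # 3-element Vector{Float64}: per-coordinate sample means
flatview(nv) == X        # true: the underlying data is still X
```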
But however this is resolved, my example above shows that `mean` and `std` don't even agree with each other on how to treat a vector of particles; at the very least they should be consistent.
They definitely should, and I believe the only way of making that happen is to let them behave like a vector of scalars rather than an MV distribution if there is a conflict. One can then supply new functions that treat them as an MV distribution instead.
But wouldn't it then make sense to wrap the dimensions of an MV distribution into something other than a `Vector`? Then you could keep the same function names. Also: the "if there is a conflict" part seems like it would be very confusing, no? As a stats noob, I have no clear picture in my head of which functions make sense only for MV, so I'd likely try things with my vector of univariate distributions and end up doing something totally inappropriate. But I guess it's also fine for you to say that you're not going to do any hand-holding with MCM.jl.
Many of these problems should be fixed on master now:

```julia
julia> mean([1±1, 2±1])
1.5 ± 0.71
julia> mean(Any[0.411 ± 0.12, 0.342 ± 0.099])
0.3765 ± 0.0792
julia> mean([0.411 ± 0.12, 0.342 ± 0.099])
0.3765 ± 0.078
julia> std([1.0±1, 10±1])
6.36396 ± 0.997
```
A wrapper type around `Particles` that gives them the semantics of a distribution is planned. I think all the problems in this issue are resolved by today's changes.
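A rough sketch of what such a wrapper could look like; the type name and both method definitions below are hypothetical, not the planned implementation:

```julia
# Hypothetical wrapper giving a Vector{Particles} multivariate-distribution
# semantics; none of this is MCM's planned API.
using MonteCarloMeasurements, Statistics

struct MvParticlesWrapper{P}
    components::Vector{P}
end

# Per-coordinate reduction over the particles:
Statistics.mean(m::MvParticlesWrapper) = [mean(p.particles) for p in m.components]
# Sample covariance across coordinates (rows = particles, columns = coordinates):
Statistics.cov(m::MvParticlesWrapper) = cov(reduce(hcat, [p.particles for p in m.components]))

v = [1 ± 1, 2 ± 1]
mean(v)                       # a Particles, ≈ 1.5 ± 0.71 (vector-of-scalars semantics)
mean(MvParticlesWrapper(v))   # ≈ [1.0, 2.0]              (multivariate semantics)
cov(MvParticlesWrapper(v))    # 2×2 matrix, ≈ 1.0 on the diagonal, ≈ 0 off-diagonal
```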
Currently:

Shouldn't this be what `mean.([1±1, 2±1])` returns? I would expect it to return: