-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support intervals #1809
base: master
Are you sure you want to change the base?
support intervals #1809
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #1809 +/- ##
==========================================
- Coverage 85.98% 85.82% -0.17%
==========================================
Files 144 145 +1
Lines 8648 8654 +6
==========================================
- Hits 7436 7427 -9
- Misses 1212 1227 +15 ☔ View full report in Codecov by Sentry. |
bump... |
I wonder if there is general interest for this functionality, the application seems a bit niche to me since most distributions have a fixed support that can't be adjusted based on a given interval. My gut feeling is that it is easy enough to extract the endpoints of an interval and call the standard constructors if one wants to construct e.g. a |
Many interval operations are "easy enough" to do without a special interval type at all, by working with endpoints directly. The whole point of an interval type (same with many other types, such as We use these functions not because they are hard to implement without a special type (they're not!), but because they are natural and convenient: julia> i = 1..3
julia> 2 ∈ i
true
julia> range(i, length=5)
1.0:0.5:3.0 Isn't it natural to create a uniform distribution on a given interval?.. Uniform(i)
# instead of
Uniform(leftendpoint(i), rightendpoint(i)) |
Not sure if I understand this part. Of course it doesn't make sense to have a |
I just want to point out that intervals are somewhat fundamental objects in probability (the Borel algebra), and that prob(interval,dist) = cdf(dist,interval.upper) - cdf(dist,interval.lower) or similar could be pretty neat. Likewise conditional probabilities, empirical samplers, histograms, all kinds of stuff! |
Yes, this kind of probability calculations is convenient sometimes, but orthogonal to this PR. Here, I add intervals support to already existing functions, no new functions added. For playing with distributions and probabilities, I have a snippet defining an interface similar but more general than your ℙ(∈(0..1), Normal(0, 1))
ℙ(<(1), Normal(0, 1))
... Some function definitionsℙ(f::Base.Fix2{typeof(<=)}, d::UnivariateDistribution) = cdf(d, f.x)
ℙ(f::Base.Fix2{typeof(> )}, d::UnivariateDistribution) = ccdf(d, f.x)
function ℙ(f::Base.Fix2{typeof(∈), <:Interval}, d::UnivariateDistribution)
Pr = isrightclosed(f.x) ? ℙ(<=(rightendpoint(f.x)), d) : ℙ(<(rightendpoint(f.x)), d)
Pl = isleftclosed(f.x) ? ℙ(<(leftendpoint(f.x)), d) : ℙ(<=(leftendpoint(f.x)), d)
return Pr - Pl
end |
Gentle bump... |
I guess such definitions might be convenient for someone working with IntervalSets, but my feeling is that these should rather be put in an extension of IntervalSets (similar to existing extensions for Random, Statistics, and StatsBase). |
Oh, sorry, I forgot to explicitly state the motivation to put this into Distributions indeed! |
My concern is that putting these definitions in an extension in Distributions adds maintenance burden for developers that seemingly so far have never expressed a desire to use IntervalSets for constructing the few univariate distributions whose support is an interval that fully characterizes the parameters of this distribution type. The use case still seems quite limited to me, and hence I think one has not be too worried about new distributions. Since the PR also only uses exported and official APIs, I don't think potential breakages are a problem either. To me it seems contributors of IntervalSets are much more interested in such a functionality than developers of Distributions, so I guess an extension in IntervalSets would be preferable. |
In case others come looking for this functionality: the interval support (this PR) is available in https://github.com/JuliaAPlavin/DistributionsExtra.jl. Not sure if I want to register that package yet... It has more than that, btw – including what @chelate suggested above, stuff like |
In the last commit, I tried a complete switch from RealInterval to intervals from IntervalSets – please take a look. It turned out to be very straightforward with few lines modified! This change is non-breaking afaict, and enables all kinds of operations on distributions supports – IntervalSets are quite widely supported. A few random examples: support(d1) ⊆ support(d2)
clamp(x, support(d))
censored(d1, support(d2)) using Makie
lines(support(d), x -> logpdf(d, x)) using ApproxFun
Fun(x -> pdf(d, x), support(d)) If there's no specific reason to stick to the current IntervalSets supports open/closed endpoints natively, can utilize it in the future – discussed long ago in #457. |
Existing tools in external packages could surely be useful. Note, however, that my comment was about discrete distributions for which AFAICT IntervalSets would be insufficient and we would need DomainSets or something different such as InfiniteArrays. Regarding the previous changes (constructors with intervals) my opinion remains unchanged (see above). So if you'd like to replace RealInterval with IntervalSets.Interval I suggest opening a separate PR (or reverting the other changes). |
Sure, IntervalSets doesn't solve all problems in the world :) I'm simply suggesting to use those intervals wherever makes sense, not literally everywhere – for one, discrete distributions stay as-is.
IMO, going halfway here is weird: use intervalsets in some places, but not everywhere where they make sense. That's why the PR remains coherent with the same goal (as stated in its title) – to support an actual widely-adopted interval type throughout Distributions. I've personally been using DistributionsExtra for convenience, see https://github.com/JuliaAPlavin/DistributionsExtra.jl/blob/master/src/intervals_integrations.jl for intervals integration specifically. Unfortunately, this way |
My point is that I think it doesn't make sense to use them to construct distributions but it does make sense to use them as an improvement of |
Add extension for IntervalSets (the most popular intervals package) to support its intervals where it makes sense in Distributions.
I defined and used this
Uniform
method a few times myself, others are included here for completeness.No tests for now, looking for a general confirmation first.