Skip to content

Commit

Permalink
Re-write median(Binomial) to improve speed (#1928)
Browse files Browse the repository at this point in the history
* Speed up median(Binomial)

Follow suggestion from #1926 to reduce `cdf(Binomial)` evaluations.

* Simplify bound condition

Co-authored-by: Andreas Noack <[email protected]>

* Fix name of argument variable

Co-authored-by: David Widmann <[email protected]>

* Simplify b

Co-authored-by: Andreas Noack <[email protected]>

* Fix cummulative probability check

Co-authored-by: David Widmann <[email protected]>

* Incorporate suggestions from PR review

* Further simplifications following suggestions

* Add more tests

* Fixed checks for median bounds

* Simplified tests a bit

* Update src/univariate/discrete/binomial.jl

Co-authored-by: David Widmann <[email protected]>

* Update src/univariate/discrete/binomial.jl

Co-authored-by: David Widmann <[email protected]>

* Update src/univariate/discrete/binomial.jl

Co-authored-by: David Widmann <[email protected]>

* Avoid test memory allocation with an iterator

Co-authored-by: David Widmann <[email protected]>

---------

Co-authored-by: Andreas Noack <[email protected]>
Co-authored-by: David Widmann <[email protected]>
  • Loading branch information
3 people authored Dec 17, 2024
1 parent 790411a commit 445c2fe
Show file tree
Hide file tree
Showing 2 changed files with 33 additions and 1 deletion.
31 changes: 31 additions & 0 deletions src/univariate/discrete/binomial.jl
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,37 @@ function mode(d::Binomial{T}) where T<:Real
end
modes(d::Binomial) = Int[mode(d)]

function median(dist::Binomial)
# The median is floor(Int, mean) or ceil(Int, mean)
# As shown in https://doi.org/10.1016/0167-7152(94)00090-U,
# |median - mean| <= 1 - bound
# where the equality is strict except for the case p = 1/2 and n odd.
# Thus if |k - mean| < bound for one of the two candidates if p = 1/2 and n odd
# or |k - mean| <= bound for one of the two candidates otherwise,
# the other candidate can't satisfy the condition and hence k must be the median
bound = max(min(dist.p, 1-dist.p), loghalf)
dist_mean = mean(dist)

floor_mean = floor(Int, dist_mean)
difference = dist_mean - floor_mean

if difference <= bound
# The only case where the median satisfies |median - mean| <= 1 - bound with equality
# is p = 1/2 and n odd
# However, in that case we also want to return floor(mean)
floor_mean
elseif difference >= 1 - bound
# The case p = 1/2 and n odd was already covered above,
# thus only cases with |median - mean| < 1 - bound are left here
# Therefore difference >= 1 - bound implies that floor(mean) cannot be the median
floor_mean + 1
elseif cdf(dist, floor_mean) >= 0.5
floor_mean
else
floor_mean + 1
end
end

function skewness(d::Binomial)
n, p1 = params(d)
p0 = 1 - p1
Expand Down
3 changes: 2 additions & 1 deletion test/univariate/discrete/binomial.jl
Original file line number Diff line number Diff line change
Expand Up @@ -27,11 +27,12 @@ end
@test Distributions.expectation(x -> -x, Binomial(10, 0.2)) -2.0

# Test median
@test median(Binomial(5,3//10)) == 1
@test median(Binomial(25,3//10)) == 7
@test median(Binomial(45,3//10)) == 13
@test median(Binomial(65,3//10)) == 19
@test median(Binomial(85,3//10)) == 25

@test all(median(Binomial(7, p)) == quantile(Binomial(7, p), 1//2) for p in 0:0.1:1)

# Test mode
@test Distributions.mode(Binomial(100, 0.4)) == 40
Expand Down

0 comments on commit 445c2fe

Please sign in to comment.