Fluctuation complexity, restrict possibilities to formally defined self-informations #413
base: main
Conversation
This all sounds good to me, however I dislike the term |
Oh, just now I saw the wiki https://en.wikipedia.org/wiki/Information_content Right, I would still stay away from self, but we can use |
I'm fine with whatever term we use, as long as it is rooted in common literature usage. We can go for |
src/core/information_functions.jl
Outdated
@@ -279,6 +280,39 @@ function information(::InformationMeasure, ::DifferentialInfoEstimator, args...)
    ))
end

"""
    self_information(measure::InformationMeasure, pᵢ)
Suggested change:
- self_information(measure::InformationMeasure, pᵢ)
+ self_information(measure::InformationMeasure, p::Real)
I'd suggest that we use just `p` here and make it clear in the docstring that this is a number. We use `probs` for `Probabilities` in most places in the library.
Another argument is that the subscript `i` is out of context here, and may confuse instead of clarify.
Agreed, we can call this simply `p`. It is clear that `p` is from a distribution.
src/core/information_functions.jl
Outdated
Compute the "self-information"/"surprisal" of a single probability `pᵢ` under the given | ||
information measure. | ||
|
||
This function assumes `pᵢ > 0`, so make sure to pre-filter your probabilities. |
Suggested change:
- This function assumes `pᵢ > 0`, so make sure to pre-filter your probabilities.
+ This function requires `pᵢ > 0`.
Just require it and throw an error in the function body.
Ah, I see the problem here. You want to define all information content functions with their simple syntax, as we anyway filter 0 probabilities when we compute entropy.
Okay, let's say then: "This function requires `pᵢ > 0`; giving `0` will yield `Inf` or `NaN`."
Yep, we can say that.
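A quick sketch of that behavior, using the plain Shannon surprisal `-log(p)` rather than the PR's actual code:

```julia
# Behavior of the Shannon surprisal -log(p) at and near p = 0.
-log(0.5)          # ≈ 0.6931, finite for p > 0
-log(0.0)          # Inf
0.0 * -log(0.0)    # NaN, which is why zero probabilities must be filtered
                   # before forming the weighted sum H = Σ pᵢ I(pᵢ)
```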
@@ -57,3 +57,8 @@ function information_maximum(e::TsallisExtropy, L::Int)

    return ((L - 1) * L^(q - 1) - (L - 1)^q) / ((q - 1) * L^(q - 1))
end

function self_information(e::TsallisExtropy, pᵢ, N) # must have N
What is `N` here? This is not reflected in the function definition or call signature.
If `N` is either the length of `probs`, or the total number of outcomes, then this quantity does not satisfy the definition of an information content, as it isn't an exclusive function of a single real number. You can explore options in the paper, but here I'd say we don't keep it in the software until it is more solid.
I'll see if I can define a functional that doesn't depend on `N`; if not, we keep it out.
Co-authored-by: George Datseris <[email protected]>
@Datseris I'm also a bit hesitant about the name This goes a bit against the terminology used in the literature, but I think it is more precise. What do you think? EDIT: or maybe just |
We can keep |
I don't like the generic |
Compute the "self-information"/"surprisal" of a single probability `pᵢ` under the given | ||
information measure. | ||
Compute the "self-information"/"surprisal" of a single probability `p` under the given | ||
information measure, assuming that `p` is part of a length-`N` probability distribution. |
Unfortunately I don't agree with the latest change of requiring `N`. It seems simpler, and more reasonable, to simply not allow Curado to be part of this interface. The opposite, defining the information unit as depending on N, doesn't make much sense, at least not with how Shannon introduced it.
@Datseris `Curado` is not the only information measure whose surprisal/self-information depends explicitly on `N`, when following the definition of an information measure as a probability-weighted average of the surprisal (as I do in the paper).

> The opposite, defining the information unit as depending on N, doesn't make much sense, at least not with how Shannon introduced it.

In the context of the Shannon information unit alone, I agree. But the point of this interface is to generalize the Shannon information unit. This inevitably introduces `N` as a parameter.

Can we discuss the final API when I'm done with writing up the paper? I'm not too far from finishing it; I just need to generate a few example applications. Since I am using this PR for the paper analyses, it would be nice not to change anything in this draft PR until the paper is ready.
The alternative is to have `information_content`/`information_unit`, which dispatches to a subset of `InformationMeasure`s, and then `generalized_information_content`/`generalized_information_unit`, which dispatches to those `InformationMeasure`s whose generalization of the information unit depends on `N`. But that kind of defeats the purpose of having an interface to begin with, since we're back at defining multiple functions with different names for things that are fundamentally identical (modulo the parameter `N`).
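Purely as an illustration of that alternative (all type names, function names, and bodies below are made up for the example; none of this is code from the PR):

```julia
# Hypothetical split interface: two differently named functions for what is
# conceptually the same quantity, differing only in whether N is needed.
abstract type ExampleMeasure end
struct ShannonLike <: ExampleMeasure end
struct NDependentLike <: ExampleMeasure end

# Measures whose information content depends on the probability p alone...
information_content(::ShannonLike, p) = -log(p)

# ...versus measures whose generalized information content also needs the
# total number of outcomes N (placeholder body, not a real formula).
generalized_information_content(::NDependentLike, p, N) = -log(p) / N

information_content(ShannonLike(), 0.25)                    # ≈ 1.386
generalized_information_content(NDependentLike(), 0.25, 4)  # ≈ 0.347
```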
PS: I sent you a link to the paper draft, @Datseris
Here I address #410 and restrict the fluctuation complexity to information measures for which it is possible to define "self-information" in the following sense.

Given an information measure `H`, I define the "generalized" self-information as the functional `I(p_i)` that allows us to re-write the expression for `H` as a probability-weighted sum `H = sum_i (p_i I(p_i))` (a weighted average, but since `sum(p_i) = 1`, the denominator in the weighted average doesn't appear explicitly).

Next, the fluctuation complexity is the square root of `sum_{i=1}^N p_i (I(p_i) - H)^2`. Hence, using the formulation above, we can meaningfully speak about a fluctuation of local information around the mean of information, regardless of which measure is chosen.

I also require that the generalized self-information yields a fluctuation complexity that has the same properties as the original Shannon-based fluctuation complexity, e.g. that it vanishes for distributions where `p_k = 1` and `p_i = 0` for `i != k`.

Note that we don't involve the axioms which Shannon self-information fulfills at all: we only demand that the generalized self-information is the functional with the properties above. I haven't been able, at least until now, to find any papers in the literature that deal with this concept for Tsallis or other generalized entropies, so I think it is safe to explore with this naming convention.
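As a concrete reading of these definitions, here is a small numerical sketch for the Shannon case only, where the self-information reduces to `I(p) = -log(p)` (plain Julia, independent of the PR's code):

```julia
# Shannon case of the definitions above: I(p) = -log(p), and
# H = Σᵢ pᵢ I(pᵢ) recovers the usual Shannon entropy (here in nats).
probs = [0.5, 0.25, 0.125, 0.125]    # normalized distribution, all pᵢ > 0

I(p) = -log(p)                       # generalized self-information, Shannon case
H = sum(p * I(p) for p in probs)     # probability-weighted sum H = Σᵢ pᵢ I(pᵢ)

# Fluctuation complexity: square root of the probability-weighted squared
# deviation of the local self-information from its mean value H.
fluct = sqrt(sum(p * (I(p) - H)^2 for p in probs))

(H, fluct)                           # ≈ (1.213, 0.575)
```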
New design

- A new function `self_information(measure::InformationMeasure, p_i, args...)`.
- It is used by `information(measure::FluctuationComplexity, args...)` to compute the `I(p_i)` terms inside the sum of the fluctuation complexity. Only `measure`s that implement `self_information` are valid; otherwise an error will be thrown.

Progress
I've made the necessary derivations for the measures where calculations looked easiest: Shannon entropy/extropy, Tsallis entropy and Curado entropy. I'll fill in the gaps for the rest of the measures whenever I get some free time.
I'm writing this all up in a paper, where I also highlight ComplexityMeasures.jl and how easy it is to use the measure practically due to our discrete estimation API. I've essentially finished the intro and method, but the experimental part remains to be done. For that, I need functional code. So before I proceed, I'd like to get your input on this code proposal, @Datseris. Does this dispatch-based system make sense?
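For readers of this thread, a rough, self-contained sketch of the kind of dispatch this proposal relies on; the type and function names are simplified stand-ins, not the actual PR code:

```julia
# Stand-in types; in the package these would be InformationMeasure subtypes.
abstract type DemoMeasure end
struct DemoShannon <: DemoMeasure; base::Float64; end
struct DemoUnsupported <: DemoMeasure end

# Fallback: measures without a defined self-information cannot be used.
self_information(m::DemoMeasure, p) = throw(ArgumentError(
    "`self_information` is not implemented for $(typeof(m)), so it cannot " *
    "be used with the fluctuation complexity."))

# Measures that do define a (generalized) self-information add a method.
self_information(m::DemoShannon, p) = -log(m.base, p)

# Fluctuation-complexity-style computation built on top of `self_information`.
function demo_fluctuation_complexity(m::DemoMeasure, probs)
    H = sum(p * self_information(m, p) for p in probs)
    return sqrt(sum(p * (self_information(m, p) - H)^2 for p in probs))
end

demo_fluctuation_complexity(DemoShannon(2), [0.5, 0.25, 0.25])   # works
# demo_fluctuation_complexity(DemoUnsupported(), [0.5, 0.5])     # would throw
```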
Pending the paper, I verify correctness by numerical comparison in the test suites. I re-write the information measures as weighted sums involving `self_information`, and check that we obtain the same value as if computing the measure using the traditional formulations.
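As an illustration only, such a consistency check could look roughly like the following, assuming the `Probabilities`/`information` API of ComplexityMeasures.jl and the `self_information` function proposed in this PR (the actual tests may differ):

```julia
using ComplexityMeasures, Test

probs = Probabilities([0.5, 0.25, 0.125, 0.125])

# Traditional computation of the measure from the full distribution...
h_direct = information(Shannon(), probs)

# ...compared with the probability-weighted sum of self-informations
# proposed in this PR.
h_weighted = sum(p * self_information(Shannon(), p) for p in probs)

@test h_direct ≈ h_weighted
```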