
Mutual information between two timeseries? #33

Datseris opened this issue Apr 9, 2020 · 9 comments
Datseris commented Apr 9, 2020

Hi @heliosdrm, I am wondering... We have this great mutual information function. At the moment I have two timeseries x, y, and I want to calculate the mutual information between x and y delayed by τ.

At the moment I am using InformationMeasures, but I am not really happy with that package. It has bad syntax and doesn't even have a Project.toml... I want to get rid of it and add such a method here. I was thinking of copy-pasting their code here and making it drastically smaller, with simpler syntax, but first I should ask you: is it possible to add mutual information between two timeseries here, given the method you have written?

(To clarify: I don't have a good idea of how to get mutual information from two variables besides doing all the histograms from scratch; that's why I use a package.)
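Just to illustrate what I mean (only a sketch, not any package's API; the function name, the fixed equal-width binning and the bin count are placeholders), the delayed mutual information could be estimated from a joint 2D histogram roughly like this:

function delayed_mutualinfo(x::AbstractVector, y::AbstractVector, τ::Int; nbins = 16)
    # pair x(t) with y(t + τ)
    xs = @view x[1:end-τ]
    ys = @view y[1+τ:end]
    lox, hix = extrema(xs)
    loy, hiy = extrema(ys)
    binidx(v, lo, hi) = clamp(floor(Int, (v - lo) / (hi - lo) * nbins) + 1, 1, nbins)
    # joint histogram, normalized to a probability distribution
    pxy = zeros(nbins, nbins)
    for (a, b) in zip(xs, ys)
        pxy[binidx(a, lox, hix), binidx(b, loy, hiy)] += 1
    end
    pxy ./= sum(pxy)
    px = vec(sum(pxy, dims = 2))
    py = vec(sum(pxy, dims = 1))
    # I(X; Y_τ) = Σ p(x, y) * log(p(x, y) / (p(x) * p(y)))
    mi = 0.0
    for i in 1:nbins, j in 1:nbins
        p = pxy[i, j]
        p > 0 && (mi += p * log(p / (px[i] * py[j])))
    end
    return mi
end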


Datseris commented Jun 9, 2020

Hi @heliosdrm, have you seen this message? If you don't have time to do anything, it would still be good to just say so, so that I know you have seen this and I can then try to do things on my own.


kahaaga commented Jun 9, 2020

@Datseris In CausalityToolsBase, I define RectangularBinning. It basically provides different pre-defined ways of partitioning the state space. I use this for the binning-based transfer entropy estimators.

The current implementation of the mutualinformation function here is based on binning. It would be nice if mutualinformation here could dispatch on different estimators, so that the estimator types contain the parameters necessary for the computation.

I am also currently working on symbolic estimators for transfer entropy over in CausalityTools.jl, which use different variants of permutation entropy. It would be straightforward to customize these estimators for delayed MI. If you're interested, I can attempt a PR.

What I am imagining is something like this:

mutualinformation(x, y, lag::Int, method::RectangularBinning)  # customized rectangular binning
mutualinformation(x, y, lag::Int, method::RecursiveBisection)  # recursive bisection binning
mutualinformation(x, y, lag::Int, method::KraskovKNN)  # the "old" mi
mutualinformation(x, y, lag::Int, method::PermutationEntropy) 
mutualinformation(x, y, lag::Int, method::WeightedPermutationEntropy)

If desired, I can start a PR with an API and contribute the permutation-based methods, and someone else can take care of the binning-based methods?
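To make the dispatch idea a bit more concrete, here is a rough sketch (type names and fields are only illustrative, not a final API) of how each estimator type could carry its own parameters:

abstract type MIEstimator end

struct RectangularBinningMI <: MIEstimator
    ϵ        # binning specification, e.g. number of bins per axis or edge lengths
end

struct KraskovKNN <: MIEstimator
    k::Int   # number of nearest neighbors
end

# the call signature then stays the same for every estimator:
# mutualinformation(x, y, lag::Int, est::RectangularBinningMI)
# mutualinformation(x, y, lag::Int, est::KraskovKNN)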


Datseris commented Jun 9, 2020

Yeah, I've been thinking about that, and I think it is worth combining efforts.

The method here uses binning, correct, but it is an optimized version because it can only do the self mutual information with a time delay. But I was wondering: in CausalityTools.jl you would have a mutual information calculation anyway, right? So we could potentially use that version for the two-timeseries case. Is it in TransferEntropy.jl?

Datseris commented

entropy(x, method::EntropyEstimator)


Datseris commented Jun 17, 2020

@kahaaga I think it is worth also exposing the interface

probabilities(x, method::EntropyEstimator)

that simply calculates the probabilities p_k, which are then passed to the generalized entropy formula.
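As a sketch of the idea (the function name here is only illustrative): the estimator only has to produce the p_k, and the generalized (Rényi) entropy of order q is then computed from them:

function generalized_entropy(p::AbstractVector; q = 1.0, base = ℯ)
    # Shannon entropy in the limit q → 1
    q ≈ 1.0 && return -sum(pk * log(base, pk) for pk in p if pk > 0)
    return log(base, sum(pk^q for pk in p)) / (1 - q)
end

# usage sketch:
# p = probabilities(x, method)    # method::EntropyEstimator, as above
# h = generalized_entropy(p; q = 2)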


Datseris commented Jun 17, 2020

And for clarity, perhaps we should be using est instead of method (for "estimator"), and maybe change the abstract type to ProbabilitiesEstimator?
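With that renaming, the signatures would read (again just a sketch):

abstract type ProbabilitiesEstimator end

# probabilities(x, est::ProbabilitiesEstimator)
# entropy(x, est::ProbabilitiesEstimator)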


kahaaga commented Jun 17, 2020

To get the (unordered) probabilities for my marginal x, I call

probabilities(x, est::ProbabilitiesEstimator)

Am I understanding you correctly?

Datseris commented

Yeah, but for me x is an ordered set, or an ordered timeseries, not a marginal (but maybe we are saying the same thing). The probabilities themselves are indeed typically unordered, or their order doesn't matter.


kahaaga commented Jun 17, 2020

I think we're speaking of the same thing.

x is typically some Dataset (which may be the entire multidimensional dataset, or some subset/marginal of it), and it is ordered. The probabilities, however, are unordered.
