-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Louvain algorithm for undirected graphs #1277
base: main
Are you sure you want to change the base?
Conversation
…for partition instead of `Vec`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this is a great start! I would call it more than a draft given it already has good tests.
I left some Rust comments, I need to study the algorithm more myself to give more "algorithmic" advice
} | ||
|
||
#[test] | ||
fn test_louvain_karate_club_graph() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Friendly reminder to myself that we should add a Karate club graph generator. It could be a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once #1280 gets merged this test will hopefully be much shorter
// happen to get the same result as: | ||
// import networkx as nx | ||
// g = nx.karate_club_graph() | ||
// communities = nx.community.louvain_communities(g, weight='weight', seed=12) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a brittle test (https://testing.googleblog.com/2024/04/how-i-learned-to-stop-writing-brittle.html) but leave a TODO there and we can revisit it later. I am glad it passes though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure yet what to do with this. I can adjust the resolution parameter so that we consistently get two output communities. However, there is still a lot of dependence on the seed (in networkx I get at least 5 different partitions with the same parameters)
}; | ||
|
||
let m: f64 = total_edge_weight(self.graph); | ||
sigma_internal / m - resolution * sigma_total_squared / (m * m) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to reorganize this expression to avoid floating point erros, specially because of the common factor 1/m
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started doing this in 58857fd. Other than the common factor what are the possible issues? I think sigma_internal
, sigma_total_squared / m
, and m
are all about the same order of magnitude but could be missing something.
It looks as if the CI is failing because I used associated type bounds. |
Indeed our current MSRV is Rust 1.70 from June 2023. Rust 1.79 is from June 2024. In the past we used to follow Debian stable's Rustc which believe it or not is in 1.63 still. I'd try rewriting it before we discuss bumping MSR, 1.79 might break a couple people that install from source |
Pull Request Test Coverage Report for Build 10877937563Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
I finally had some time to work on #1141, so far only on the Rust side and only for undirected graphs. Opening this for feedback on how the Rust code is organized, and maybe to restart the discussion on whether this should go here or in petgraph.
The big additions here are:
metrics.rs
Modularity
: a trait for graphs for which we can compute modularity (by extension this means we can apply the Louvain method)modularity
: computes the modularity, given a graph and a set of subsets.Partition
: a struct holding a graph partition. This mainly exists to keep the implementation ofmodularity
organized.louvain.rs
InnerGraph
: at each level of the algorithm we work with an aggregated graph, where each node of this graph corresponds to one of the communities that were identified in the previous level).InnerGraph
keeps track of this correspondence.LouvainAlgo
: helper functions for updating thePartition
at each level of the algorithm.one_level_undirected
: most of the business logic is in here since it constructs the new communities at each level, roughly equivalent to_one_level
in the networkx implementation.louvain_communities
: repeatedly callsone_level_undirected
, stopping either after a fixed number of iterations or if the modularity improvement falls below the given threshold.I realize this is a lot to review at once and can go into more detail about each piece if necessary.