Differences to GeometricFlux.jl? #2

Closed
AriMKatz opened this issue Aug 30, 2021 · 5 comments

Comments

@AriMKatz

Here's the inevitable question ;)

What are the differences (philosophical, implementation etc) between this and geometric flux?

Are you covering a smaller scope? I think graphs are a subset of geometric deep learning

@CarloLucibello
Member

CarloLucibello commented Aug 30, 2021

> Here's the inevitable question ;)

Indeed, I wasn't expecting it already on day 1 though :)

> What are the differences (philosophical, implementation etc) between this and geometric flux?
> Are you covering a smaller scope? I think graphs are a subset of geometric deep learning

The scope is the same, I just needed another name :) While geometric deep learning is broader than GNNs (you also have CNNs, deep learning on manifolds, and possibly other stuff I don't know about), in practice GeometricFlux.jl, and also Python's PyTorch Geometric and DGL, are only about GNNs.

Here I'm trying to address some major (in my opinion) design issues in GeometricFlux that I couldn't fix there, since I couldn't find common ground with the author. For background, see
FluxML/GeometricFlux.jl#215
FluxML/GeometricFlux.jl#204
and a few linked issues.

The main differences at the moment are the following (this list may evolve in the near future):

  • Layers and graphs are now decoupled: layers don't store graphs
  • Message passing has been revisited to allow batched operations instead of a mapreduce approach. I'll present some benchmarks soon, but in practice I see a 10x-100x speedup
  • Support for COO and sparse adjacency matrix implementations
  • CUDA support (currently broken in GeometricFlux)
  • A simpler code base: fewer repos, a flatter type hierarchy, some major code simplifications
  • Graphs are always directed
  • Support for batched graphs in order to leverage CPU/GPU parallelism. Graph-level tasks are well supported
  • Layers are not tied to a single message function but can use as many as they wish
  • Extensive CUDA and gradient test coverage
  • Examples of node-, edge-, and graph-level tasks
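To illustrate the batched-graphs point, here is a minimal sketch in plain Python of how several graphs in COO form could be merged into one larger graph by offsetting node indices. The function name `batch_graphs` and the tuple layout are hypothetical and chosen for illustration only; this is not the package's API, and indices are 0-based here, unlike in Julia.

```python
# Hypothetical helper (not the package's API): merge several directed graphs
# given in COO form into a single graph with disjoint components.
def batch_graphs(graphs):
    """graphs: list of (num_nodes, sources, targets) tuples, 0-based indices."""
    sources, targets, graph_index = [], [], []
    offset = 0
    for gid, (num_nodes, src, dst) in enumerate(graphs):
        # Shift node indices so they do not collide with earlier graphs.
        sources.extend(s + offset for s in src)
        targets.extend(t + offset for t in dst)
        # Remember which original graph each node came from
        # (needed later for graph-level pooling).
        graph_index.extend([gid] * num_nodes)
        offset += num_nodes
    return offset, sources, targets, graph_index


g1 = (3, [0, 1, 2], [1, 2, 0])  # directed 3-cycle
g2 = (2, [0, 1], [1, 0])        # two opposite edges
num_nodes, src, dst, graph_index = batch_graphs([g1, g2])
```

Since no edge crosses between components, one message passing step on the merged graph processes all the original graphs at once, and `graph_index` allows per-graph readouts afterwards.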

Having 2 GNN libraries in Julia instead of joining efforts is probably not ideal, but I really think these changes will benefit the ecosystem, so here we are. I hope there will be room for collaboration down the road.

@AriMKatz
Author

Ok, that makes sense. Thanks for explaining

FYI @Wimmerer is doing a lot of interesting work on sparse matrices, graphs, GraphBLAS, and eventually custom sparse codegen: mcabbott/Tullio.jl#114

@rayegun

rayegun commented Sep 8, 2021

> Support COO and sparse adjacency matrix implementations

How decoupled can this be? Can layers support an arbitrary AbstractSparse in the future?

How are you implementing the core message passing operations?

How are you storing features? Node features in a dense matrix I imagine, what about edge features?

@CarloLucibello
Member

CarloLucibello commented Sep 8, 2021

> Support COO and sparse adjacency matrix implementations

> How decoupled can this be? Can layers support an arbitrary AbstractSparse in the future?

@Wimmerer Yes, they already can in principle, provided that linear algebra, map, and broadcasting operations are supported by the sparse matrix type. I opened #19 to experiment with GBMatrix. Many tests are failing at the moment, so GBMatrix doesn't seem ready as a drop-in replacement for SparseMatrixCSC, but I haven't looked into the details yet. Feel free to experiment on that branch!

@CarloLucibello
Member

> How are you implementing the core message passing operations?

The most generic message passing scheme is based on gather and scatter operations; you can see it here
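As a rough illustration of the gather/scatter idea, here is a minimal sketch in plain Python. The name `propagate`, its signature, and the list-of-lists feature layout are assumptions made for this example, not the package's actual implementation; aggregation is a plain sum and indices are 0-based.

```python
# Hypothetical sketch of gather/scatter message passing (not the package's code).
def propagate(sources, targets, x, num_nodes, message=lambda xj: xj):
    """x[i] is the feature vector (a list) of node i; edges go sources[e] -> targets[e]."""
    # Gather: fetch the source node features of every edge, then apply the
    # message function to all of them.
    messages = [message(x[s]) for s in sources]
    dim = len(messages[0]) if messages else len(x[0])
    # Scatter: sum every message into the slot of its target node.
    out = [[0.0] * dim for _ in range(num_nodes)]
    for t, m in zip(targets, messages):
        for k in range(dim):
            out[t][k] += m[k]
    return out


sources = [0, 0, 1]
targets = [1, 2, 2]
x = [[1.0], [2.0], [4.0]]
h = propagate(sources, targets, x, num_nodes=3)
```

In an array library, the gather step is a single indexing operation and the scatter step a single scatter-add, so all edges are handled in one batch rather than one neighborhood at a time.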

Many commonly used specific schemes can leverage algebraic operations on the adjacency matrix though. An example is provided by the GCNConv layer.
https://github.com/CarloLucibello/GraphNeuralNetworks.jl/blob/cba6565e593e2e02c9395c1734a9ef05d948d8df/src/layers/conv.jl#L43

An equivalent forward pass for GCNConv that stays within the gather/scatter scheme is currently used for GPU operations, since sparse CUDA matrix support is not good enough yet (e.g. broadcasting is not supported).
https://github.com/CarloLucibello/GraphNeuralNetworks.jl/blob/cba6565e593e2e02c9395c1734a9ef05d948d8df/src/layers/conv.jl#L51
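To make the equivalence between the two forward passes concrete, here is a small plain Python sketch. It assumes the standard GCN propagation rule with self-loops and symmetric degree normalization, one scalar feature per node, and a graph without duplicate edges or pre-existing self-loops. The weight matrix and the nonlinearity are omitted, and the function names are hypothetical; this is not the code linked above.

```python
import math

# Hypothetical sketch: the same normalized aggregation computed two ways.
def gcn_dense(n, sources, targets, x):
    # Algebraic route: build A + I, then multiply by the features
    # with symmetric degree normalization.
    a = [[0.0] * n for _ in range(n)]
    for s, t in zip(sources, targets):
        a[t][s] = 1.0
    for i in range(n):
        a[i][i] = 1.0  # self-loops
    deg = [sum(row) for row in a]
    return [
        sum(a[i][j] * x[j] / math.sqrt(deg[i] * deg[j]) for j in range(n))
        for i in range(n)
    ]


def gcn_gather_scatter(n, sources, targets, x):
    # Gather/scatter route: the same computation, edge by edge.
    src = list(sources) + list(range(n))  # append self-loops
    dst = list(targets) + list(range(n))
    deg = [0.0] * n
    for t in dst:
        deg[t] += 1.0
    out = [0.0] * n
    for s, t in zip(src, dst):
        out[t] += x[s] / math.sqrt(deg[s] * deg[t])
    return out


# Path graph 0 - 1 - 2, stored as directed edges in both directions.
sources = [0, 1, 1, 2]
targets = [1, 0, 2, 1]
x = [1.0, 2.0, 3.0]
dense_out = gcn_dense(3, sources, targets, x)
scatter_out = gcn_gather_scatter(3, sources, targets, x)
```

Both routes give the same result up to floating-point rounding; the second one needs only indexing and scatter-add, which is why it can serve as a fallback where sparse matrix support is limited.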

> How are you storing features? Node features in a dense matrix I imagine, what about edge features?

Node features are stored as Dn x num_nodes dense matrices and edge features as De x num_edges dense matrices.
Scalar edge features (commonly called edge weights) can be treated as a separate, specialized case, represented by the adjacency matrix elements or by the value vector in the COO representation.
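A small plain Python sketch of this layout follows. All variable names and numbers are made up for illustration, indices are 0-based, and the rows-as-sources convention for the adjacency matrix is an assumption of this example.

```python
# Hypothetical toy data illustrating the storage layout described above.
num_nodes = 3
sources = [0, 0, 1]          # COO source indices
targets = [1, 2, 2]          # COO target indices
weights = [0.5, 2.0, 1.5]    # scalar edge weights: the COO value vector

# Node features: Dn x num_nodes (column j belongs to node j), here Dn = 2.
node_features = [[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]]

# Edge features: De x num_edges (column e belongs to edge e), here De = 1.
edge_features = [[0.1, 0.2, 0.3]]


def coo_to_dense(num_nodes, sources, targets, values):
    # The same weights stored as adjacency matrix elements
    # (row = source, column = target, by assumption).
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for s, t, v in zip(sources, targets, values):
        a[s][t] = v
    return a


adjacency = coo_to_dense(num_nodes, sources, targets, weights)
```

The value vector and the adjacency matrix carry the same information; which one is more convenient depends on whether the layer is written with gather/scatter or with matrix algebra.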
