
Releases: pyg-team/pytorch_geometric

PyG 2.1.0: Principled aggregations, link-level and temporal samplers, data pipe support, ...

17 Aug 10:32
07bf02f

We are excited to announce the release of PyG 2.1.0 🎉🎉🎉

PyG 2.1.0 is the culmination of work from over 60 contributors who have worked on features and bug-fixes for a total of over 320 commits since torch-geometric==2.0.4.

Highlights

Principled Aggregations

See here for the accompanying tutorial.

Aggregation functions play an important role in the message passing framework and the readout functions of Graph Neural Networks. Specifically, many works in the literature (Hamilton et al. (2017), Xu et al. (2018), Corso et al. (2020), Li et al. (2020), Tailor et al. (2021), Bartunov et al. (2022)) demonstrate that the choice of aggregation functions contributes significantly to the representational power and performance of the model.

To facilitate further experimentation and unify the concepts of aggregation within GNNs across both MessagePassing and global readouts, we have made the concept of Aggregation a first-class principle in PyG (#4379, #4522, #4687, #4721, #4731, #4762, #4749, #4779, #4863, #4864, #4865, #4866, #4872, #4927, #4934, #4935, #4957, #4973, #4986, #4995, #5000, #5021, #5034, #5036, #5039, #5033, #5085, #5097, #5099, #5104, #5113, #5130, #5098, #5191). As of now, PyG provides support for various aggregations, ranging from simple ones (e.g., mean, max, sum) to advanced ones (e.g., median, var, std), learnable ones (e.g., SoftmaxAggregation, PowerMeanAggregation), and exotic ones (e.g., LSTMAggregation, SortAggregation, EquilibriumAggregation). Furthermore, multiple aggregations can be combined and stacked together:

from torch_geometric.nn import MessagePassing, SoftmaxAggregation

class MyConv(MessagePassing):
    def __init__(self, **kwargs):
        # Combines a set of aggregations and concatenates their results.
        # String entries such as 'mean' are resolved to aggregation modules automatically.
        super().__init__(aggr=['mean', 'std', SoftmaxAggregation(learn=True)], **kwargs)
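To make the "combine and concatenate" behaviour concrete, the following is a plain-Python sketch of the idea, not PyG's implementation. The function name `multi_aggregate`, the `REDUCERS` table, and the list-based data layout are assumptions made for illustration only: each named reducer is applied to the messages arriving at a node, and the per-reducer results are concatenated along the feature dimension.

```python
from statistics import mean

# Hypothetical reducer table: each entry maps a sequence of floats to one float.
REDUCERS = {
    "sum": sum,
    "mean": mean,
    "max": max,
}


def multi_aggregate(messages, index, num_nodes, aggrs):
    """Aggregate per-edge messages into per-node features.

    messages: list of feature vectors, one per edge.
    index:    index[i] is the target node of messages[i].
    aggrs:    reducer names; their outputs are concatenated per node.
    """
    # Group the messages by their target node.
    groups = [[] for _ in range(num_nodes)]
    for msg, target in zip(messages, index):
        groups[target].append(msg)

    out = []
    for group in groups:
        row = []
        for name in aggrs:
            reduce_fn = REDUCERS[name]
            if group:
                # Reduce every feature dimension independently.
                row.extend(reduce_fn(column) for column in zip(*group))
            else:
                # Nodes without incoming messages get zeros (a choice of this sketch).
                row.extend(0.0 for _ in messages[0])
        out.append(row)
    return out


messages = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
index = [0, 0, 1]  # The first two messages target node 0, the last one node 1.
result = multi_aggregate(messages, index, num_nodes=2, aggrs=["mean", "max"])
print(result)  # [[2.0, 3.0, 3.0, 4.0], [5.0, 6.0, 5.0, 6.0]]
```

With two reducers and two input features, every node ends up with four output features, which is why a layer that follows a multi-aggregation has to expect a wider input.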

Link-level Neighbor Loader

We added a new LinkNeighborLoader class for training scalable GNNs that perform edge-level predictions on giant graphs (#4396, #4439, #4441, #4446, #4508, #4509, #4868). LinkNeighborLoader comes with automatic support for both homogeneous and heterogeneous data, and supports link prediction via automatic negative sampling as well as edge-level classification and regression models:

from torch_geometric.loader import LinkNeighborLoader

loader = LinkNeighborLoader(
    data,
    num_neighbors=[30] * 2,  # Sample 30 neighbors for each node for 2 iterations
    batch_size=128,  # Use a batch size of 128 for sampling training links
    edge_label_index=data.edge_index,  # Use the entire graph for supervision
    negative_sampling_ratio=1.0,  # Sample negative edges
)

sampled_data = next(iter(loader))
print(sampled_data)
# Data(x=[1368, 1433], edge_index=[2, 3103], edge_label_index=[2, 256], edge_label=[256])
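The printed batch holds 256 supervision edges although batch_size is 128, which is consistent with 128 positive links plus an equal number of sampled negatives at negative_sampling_ratio=1.0. The following plain-Python sketch illustrates that bookkeeping. It is not PyG's sampler: the helper name `add_negative_samples` and the rejection of known positive edges are assumptions of this sketch.

```python
import random


def add_negative_samples(pos_edges, num_nodes, ratio=1.0, seed=0):
    """Return (edge_label_index, edge_label) for one batch of positive edges."""
    rng = random.Random(seed)
    existing = set(pos_edges)
    num_neg = int(len(pos_edges) * ratio)

    neg_edges = []
    while len(neg_edges) < num_neg:
        src = rng.randrange(num_nodes)
        dst = rng.randrange(num_nodes)
        # Sketch-only assumption: reject pairs that are known positive edges.
        if (src, dst) not in existing:
            neg_edges.append((src, dst))

    # Positives are labeled 1 and come first; negatives are labeled 0.
    edge_label_index = list(pos_edges) + neg_edges
    edge_label = [1] * len(pos_edges) + [0] * num_neg
    return edge_label_index, edge_label


# A batch of 128 positive edges taken from a ring graph with 1,000 nodes.
pos_batch = [(i, (i + 1) % 1000) for i in range(128)]
edge_label_index, edge_label = add_negative_samples(pos_batch, num_nodes=1000)
print(len(edge_label_index), sum(edge_label))  # 256 128
```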

Neighborhood Sampling based on Temporal Constraints

Both NeighborLoader and LinkNeighborLoader now support temporal sampling via the time_attr argument (#4025, #4877, #4908, #5137, #5173). If set, temporal sampling will be used such that neighbors are guaranteed to fulfill temporal constraints, i.e. neighbors have an earlier timestamp than the center node:

import torch
from torch_geometric.loader import NeighborLoader

data['paper'].time = torch.arange(data['paper'].num_nodes)

loader = NeighborLoader(
    data,
    input_nodes='paper',
    time_attr='time',  # Only sample papers that appeared before the seed paper
    num_neighbors=[30] * 2,
    batch_size=128,
)

Note that this feature requires torch-sparse>=0.6.14.
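The temporal constraint can be pictured with a small plain-Python sketch. It is not the torch-sparse implementation: the function name `sample_temporal_neighbors` and the dictionary-based graph are hypothetical, and the strict "earlier than" comparison simply follows the wording above.

```python
import random


def sample_temporal_neighbors(seed, neighbors, time, num_samples, rng):
    """Sample up to num_samples neighbors of seed that are older than seed."""
    # Keep only neighbors whose timestamp precedes the seed node's timestamp.
    candidates = [n for n in neighbors.get(seed, []) if time[n] < time[seed]]
    if len(candidates) <= num_samples:
        return candidates
    return rng.sample(candidates, num_samples)


# Node 3 is connected to older nodes (0, 1, 2) and newer nodes (4, 5).
neighbors = {3: [0, 1, 2, 4, 5]}
time = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
rng = random.Random(0)
sampled = sample_temporal_neighbors(3, neighbors, time, num_samples=2, rng=rng)
print(sampled)  # Two nodes drawn from {0, 1, 2}; nodes 4 and 5 are never returned.
```

Filtering before sampling means a node with few older neighbors can return fewer than the requested number of samples.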

Functional DataPipes

See here for the accompanying example.

PyG now fully supports data loading using the newly introduced concept of DataPipes in PyTorch for easily constructing flexible and performant data pipelines (#4302, #4345, #4349). PyG provides DataPipe support for batching multiple PyG data objects together and for applying any PyG transform:

# FileOpener and FileLister are provided by torchdata.
from torchdata.datapipes.iter import FileLister, FileOpener

# Example 1: Molecular graphs parsed from SMILES strings in a CSV file.
datapipe = FileOpener(['SMILES_HIV.csv'])
datapipe = datapipe.parse_csv_as_dict()
datapipe = datapipe.parse_smiles(target_key='HIV_active')
datapipe = datapipe.in_memory_cache()  # Cache graph instances in-memory.
datapipe = datapipe.shuffle()
datapipe = datapipe.batch_graphs(batch_size=32)

# Example 2: Point clouds sampled from mesh files found below `root_dir`.
datapipe = FileLister([root_dir], masks='*.off', recursive=True)
datapipe = datapipe.read_mesh()
datapipe = datapipe.in_memory_cache()  # Cache graph instances in-memory.
datapipe = datapipe.sample_points(1024)  # Use PyG transforms from here.
datapipe = datapipe.knn_graph(k=8)
datapipe = datapipe.shuffle()
datapipe = datapipe.batch_graphs(batch_size=32)
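The chained calls above follow a functional pattern in which every step wraps the previous pipeline and returns a new one. The sketch below shows that pattern in plain Python. The `Pipe` class and its `map`, `shuffle` and `batch` methods are hypothetical stand-ins written for this illustration, not torchdata's or PyG's classes.

```python
import random


class Pipe:
    """A tiny iterable wrapper whose methods each return a new Pipe."""

    def __init__(self, source):
        # Either a re-iterable object or a zero-argument function returning an iterator.
        self._source = source

    def __iter__(self):
        return iter(self._source() if callable(self._source) else self._source)

    def map(self, fn):
        # Lazily apply fn to every item of the upstream pipe.
        return Pipe(lambda: (fn(item) for item in self))

    def shuffle(self, seed=0):
        def generate():
            # This sketch materializes all items before shuffling them.
            items = list(self)
            random.Random(seed).shuffle(items)
            return iter(items)
        return Pipe(generate)

    def batch(self, batch_size):
        def generate():
            batch = []
            for item in self:
                batch.append(item)
                if len(batch) == batch_size:
                    yield batch
                    batch = []
            if batch:  # Emit the final, possibly smaller batch.
                yield batch
        return Pipe(generate)


pipe = Pipe(range(10)).map(lambda i: i * i).shuffle(seed=0).batch(4)
batches = list(pipe)
print([len(b) for b in batches])  # [4, 4, 2]
```

Because each step only holds a reference to its upstream pipe, nothing is computed until the final pipe is iterated, and steps can be reordered or swapped without touching the others.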

Breaking Changes


2.0.4

12 Mar 16:43

PyG 2.0.4 🎉

A new minor PyG version release, bringing PyTorch 1.11 support to PyG. It further includes a variety of new features and bugfixes:

Features

Datasets

Minor Changes

Bugfixes


2.0.3

22 Dec 06:49
d47d9cd

PyG 2.0.3 🎉

A new minor PyG version release, including a variety of new features and bugfixes:

Features

Datasets

Minor Changes


2.0.2

26 Oct 12:41

A new minor version release, including further bugfixes, official PyTorch 1.10 support, as well as additional features and operators:

Features

Minor Changes

  • Data.to_homogeneous will now add node_type information to the homogeneous Data object
  • GINEConv can now transform edge features automatically in case their dimensionality does not match that of the node features (thanks to @CaypoH)
  • OGB_MAG will now add node_year information to paper nodes
  • Entities datasets now allow the processing of HeteroData objects via the hetero=True option
  • Batch objects can now be batched together to form super batches
  • Added heterogeneous graph support for Center, Constant and LinearTransformation transformations
  • HeteroConv can now return "stacked" embeddings
  • The batch vector of a Batch object will now be initialized on the GPU in case other attributes are held in GPU memory

Bugfixes

  • Fixed the num_neighbors argument of NeighborLoader in order to specify an edge-type specific number of neighbors
  • Fixed the collate policy of lists of integers/strings to return nested lists
  • Fixed the Delaunay transformation in case the face attribute is not present in the data
  • Fixed the TGNMemory module to only read from the latest update (thanks to @cwh104504)
  • Fixed the pickle.PicklingError when Batch objects are used in a torch.multiprocessing.manager.Queue() (thanks to @RasmusOrsoe)
  • Fixed an issue with _parent state changing after pickling of Data objects (thanks to @zepx)
  • Fixed the ToUndirected transformation in case the numbers of edges and nodes are equal (thanks to @lmkmkrcc)
  • Fixed the from_networkx routine in case node-level and edge-level features share the same names
  • Removed the num_nodes warning when creating PairData objects
  • Fixed the initialization of the GeneralMultiLayer module in GraphGym (thanks to @fjulian)
  • Fixed custom model registration in GraphGym
  • Fixed a clash in the run_dir naming of GraphGym (thanks to @fjulian)
  • Fixed a GraphGym crash in case the ROC score is undefined (thanks to @fjulian)
  • Fixed the Batch.from_data_list routine on dataset slices (thanks to @dtortorella)
  • Fixed the MetaPath2Vec model in case there exist isolated nodes
  • Fixed torch_geometric.utils.coalesce with CUDA tensors

2.0.1

16 Sep 07:22

PyG 2.0.1

This is a minor release, bringing some emergency fixes to PyG 2.0.

Bugfixes

2.0.0

13 Sep 07:48

PyG 2.0 🎉 🎉 🎉

PyG (PyTorch Geometric) has been moved from my own personal account rusty1s to its own organization account pyg-team to emphasize the ongoing collaboration between TU Dortmund University, Stanford University and many great external contributors. With this, we are releasing PyG 2.0, a new major release that brings sophisticated heterogeneous graph support, GraphGym integration and many other exciting features to PyG.

If you encounter any bugs in this new release, please do not hesitate to create an issue.

Heterogeneous Graph Support

We finally provide full heterogeneous graph support in PyG 2.0. See here for the accompanying tutorial.

Highlights

  • Heterogeneous Graph Storage: Heterogeneous graphs can now be stored in their own dedicated data.HeteroData class (thanks to @yaoyaowd):

    import torch
    from torch_geometric.data import HeteroData
    
    data = HeteroData()
    
    # Create two node types "paper" and "author" holding a single feature matrix:
    data['paper'].x = torch.randn(num_papers, num_paper_features)
    data['author'].x = torch.randn(num_authors, num_authors_features)
    
    # Create an edge type ("paper", "written_by", "author") holding its graph connectivity:
    data['paper', 'written_by', 'author'].edge_index = ...  # [2, num_edges]

    data.HeteroData behaves similarly to a regular homogeneous data.Data object:

    print(data['paper'].num_nodes)
    print(data['paper', 'written_by', 'author'].num_edges)
    data = data.to('cuda')
  • Heterogeneous Mini-Batch Loading: Heterogeneous graphs can be converted to mini-batches for many small and single giant graphs via the loader.DataLoader and loader.NeighborLoader loaders, respectively. These loaders can now handle both homogeneous and heterogeneous graphs:

    from torch_geometric.loader import DataLoader
    
    loader = DataLoader(heterogeneous_graph_dataset, batch_size=32, shuffle=True)
    
    from torch_geometric.loader import NeighborLoader
    
    loader = NeighborLoader(heterogeneous_graph, num_neighbors=[30, 30], batch_size=128,
                            input_nodes=('paper', heterogeneous_graph['paper'].train_mask), shuffle=True)
  • Heterogeneous Graph Neural Networks: Heterogeneous GNNs can now easily be created from homogeneous ones via nn.to_hetero and nn.to_hetero_with_bases. These functions take an existing GNN model and duplicate its message functions to account for different node and edge types:

    import torch
    from torch_geometric.nn import SAGEConv, to_hetero
    
    class GNN(torch.nn.Module):
        def __init__(self, hidden_channels, out_channels):
            super().__init__()
            self.conv1 = SAGEConv((-1, -1), hidden_channels)
            self.conv2 = SAGEConv((-1, -1), out_channels)
    
        def forward(self, x, edge_index):
            x = self.conv1(x, edge_index).relu()
            x = self.conv2(x, edge_index)
            return x
    
    model = GNN(hidden_channels=64, out_channels=dataset.num_classes)
    model = to_hetero(model, data.metadata(), aggr='sum')
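The effect of duplicating message functions per edge type and grouping their outputs with aggr='sum' can be sketched in plain Python. This is an illustration of the idea only, not how to_hetero is implemented: the function `hetero_layer`, the scalar node features, and the single scalar weight per edge type (standing in for a full SAGEConv) are simplifying assumptions.

```python
def hetero_layer(x_dict, edge_dict, weight_dict):
    """Run one message-passing step with separate parameters per edge type."""
    # One output value per node, grouped by node type and initialized to zero.
    out_dict = {ntype: [0.0] * len(x) for ntype, x in x_dict.items()}
    for (src_type, rel, dst_type), edges in edge_dict.items():
        # Each edge type owns its own parameters (here: a single scalar).
        weight = weight_dict[(src_type, rel, dst_type)]
        for src, dst in edges:
            # Results of all edge types ending in dst_type accumulate into the
            # same output, which mirrors grouping with "sum".
            out_dict[dst_type][dst] += weight * x_dict[src_type][src]
    return out_dict


x_dict = {"paper": [1.0, 2.0], "author": [10.0]}
edge_dict = {
    ("author", "writes", "paper"): [(0, 0), (0, 1)],
    ("paper", "cites", "paper"): [(0, 1)],
}
weight_dict = {
    ("author", "writes", "paper"): 0.5,
    ("paper", "cites", "paper"): 2.0,
}
out = hetero_layer(x_dict, edge_dict, weight_dict)
print(out)  # {'paper': [5.0, 7.0], 'author': [0.0]}
```

Paper node 1 receives 5.0 through the "writes" relation and 2.0 through the "cites" relation, and the two contributions are summed into 7.0. The author node has no incoming edges in this toy graph, so its output stays at zero.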

Additional Features

Managing Experiments with GraphGym

GraphGym is now officially supported in PyG 2.0 via torch_geometric.graphgym. See here for the accompanying tutorial. Overall, GraphGym is a platform for designing and evaluating Graph Neural Networks from configuration files via a highly modularized pipeline (thanks to @JiaxuanYou):

  1. GraphGym is the perfect place to start learning about standardized GNN implementation and evaluation
  2. GraphGym provides a simple interface to try out thousands of GNN architectures in parallel to find the best design for your specific task
  3. GraphGym lets you easily do hyper-parameter search and visualize what design choices are better

Breaking Changes


1.7.2

26 Jun 08:50

Datasets

Bugfixes

1.7.1

17 Jun 08:19

A minor release that brings PyTorch 1.9.0 and Python 3.9 support to PyTorch Geometric. In case you are in the process of updating to PyTorch 1.9.0, please re-install the external dependencies for PyTorch 1.9.0 as well (torch-scatter and torch-sparse).

Features

Datasets

Issues

1.7.0

09 Apr 08:44

Major Features

Additional Features

Minor Changes

Datasets

Bugfixes

1.6.3

02 Dec 14:57