QoL improvements and more docs (#126)

This PR is (unfortunately) fairly massive, but is mostly renaming things, or removing things that aren't needed anymore. Considering all the renaming and breaking, bumping the version to 0.2.0 🎉 - Updated `README.md` - Added `docs/contributing.md` and `docs/getting_started.md` - Added (a link to) an introductory notebook, runnable on colab - Renamed `FlatRewards` into `ObjectProperties`, as well as related function names - Renamed `RewardScalar` into either `LogScalar` or `LinScalar`, as well as changed the function signature of functions accordingly - Removed all passing around of `rng` instances, replaced by `get_worker_rng` to remove any confusion around worker seeding - Similarly, `dev` should now be acquired via `get_worker_device` - `None`-proofed `cond_info` in models. There should now be a sensible default behavior. Eventually we should properly make sure the code support unconditional tasks - Made `cfg.git_hash` computational conditional on being in a git repo - Renamed generic functions and methods working with objects to `obj[s]` rather than `mol[s]` - Made `terminate` close log files (to avoid issues when a single process runs multiple trials) - Added a `read_all_results` function in `sqlite_log.py`. This is probably the fastest way to load log data - Fixed a number of `mypy` issues.
recursionpharma · May 8, 2024 · f106cde · f106cde
1 parent 02eaed3
commit f106cde
Show file tree

Hide file tree

Showing 36 changed files with 500 additions and 433 deletions.
diff --git a/README.md b/README.md
@@ -11,34 +11,14 @@ GFlowNet-related training and environment code on graphs.
 
 **Primer**
 
-[GFlowNet](https://yoshuabengio.org/2022/03/05/generative-flow-networks/), short for Generative Flow Network, is a novel generative modeling framework, particularly suited for discrete, combinatorial objects. Here in particular it is implemented for graph generation.
+GFlowNet [[1]](https://yoshuabengio.org/2022/03/05/generative-flow-networks/), [[2]](https://www.gflownet.org/), [[3]](https://github.com/zdhNarsil/Awesome-GFlowNets), short for Generative Flow Network, is a novel generative modeling framework, particularly suited for discrete, combinatorial objects. Here in particular it is implemented for graph generation.
 
-The idea behind GFN is to estimate flows in a (graph-theoretic) directed acyclic network*. The network represents all possible ways of constructing an object, and so knowing the flow gives us a policy which we can follow to sequentially construct objects. Such a sequence of partially constructed objects is a _trajectory_. *Perhaps confusingly, the _network_ in GFN refers to the state space, not a neural network architecture.
+The idea behind GFN is to estimate flows in a (graph-theoretic) directed acyclic network*. The network represents all possible ways of constructing objects, and so knowing the flow gives us a policy which we can follow to sequentially construct objects. Such a sequence of partially constructed objects is a _trajectory_. *Perhaps confusingly, the _network_ in GFN refers to the state space, not a neural network architecture.
 
-Here the objects we construct are themselves graphs (e.g. graphs of atoms), which are constructed node by node. To make policy predictions, we use a graph neural network. This GNN outputs per-node logits (e.g. add an atom to this atom, or add a bond between these two atoms), as well as per-graph logits (e.g. stop/"done constructing this object").
+The main focus of this library (although it can do other things) is to construct graphs (e.g. graphs of atoms), which are constructed node by node. To make policy predictions, we use a graph neural network. This GNN outputs per-node logits (e.g. add an atom to this atom, or add a bond between these two atoms), as well as per-graph logits (e.g. stop/"done constructing this object").
 
-The GNN model can be trained on a mix of existing data (offline) and self-generated data (online), the latter being obtained by querying the model sequentially to obtain trajectories. For offline data, we can easily generate trajectories since we know the end state.
+This library supports a variety of GFN algorithms (as well as some baselines), and supports training on a mix of existing data (offline) and self-generated data (online), the latter being obtained by querying the model sequentially to obtain trajectories.
 
-## Repo overview
-
-- [algo](src/gflownet/algo), contains GFlowNet algorithms implementations ([Trajectory Balance](https://arxiv.org/abs/2201.13259), [SubTB](https://arxiv.org/abs/2209.12782), [Flow Matching](https://arxiv.org/abs/2106.04399)), as well as some baselines. These implement how to sample trajectories from a model and compute the loss from trajectories.
-- [data](src/gflownet/data), contains dataset definitions, data loading and data sampling utilities.
-- [envs](src/gflownet/envs), contains environment classes; a graph-building environment base, and a molecular graph context class. The base environment is agnostic to what kind of graph is being made, and the context class specifies mappings from graphs to objects (e.g. molecules) and torch geometric Data.
-- [examples](docs/examples), contains simple example implementations of GFlowNet.
-- [models](src/gflownet/models), contains model definitions.
-- [tasks](src/gflownet/tasks), contains training code.
-    -  [qm9](src/gflownet/tasks/qm9/qm9.py), temperature-conditional molecule sampler based on QM9's HOMO-LUMO gap data as a reward.
-    -  [seh_frag](src/gflownet/tasks/seh_frag.py), reproducing Bengio et al. 2021, fragment-based molecule design targeting the sEH protein
-    -  [seh_frag_moo](src/gflownet/tasks/seh_frag_moo.py), same as the above, but with multi-objective optimization (incl. QED, SA, and molecule weight objectives).
-- [utils](src/gflownet/utils), contains utilities (multiprocessing, metrics, conditioning).
-- [`trainer.py`](src/gflownet/trainer.py), defines a general harness for training GFlowNet models.
-- [`online_trainer.py`](src/gflownet/online_trainer.py), defines a typical online-GFN training loop.
-
-See [implementation notes](docs/implementation_notes.md) for more.
-
-## Getting started
-
-A good place to get started is with the [sEH fragment-based MOO task](src/gflownet/tasks/seh_frag_moo.py). The file `seh_frag_moo.py` is runnable as-is (although you may want to change the default configuration in `main()`).
 
 ## Installation
 
@@ -62,6 +42,30 @@ pip install git+https://github.com/recursionpharma/[email protected] --find-l
 
 If package dependencies seem not to work, you may need to install the exact frozen versions listed `requirements/`, i.e. `pip install -r requirements/main-3.10.txt`.
 
+## Getting started
+
+A good place to get started immediately is with the [sEH fragment-based MOO task](src/gflownet/tasks/seh_frag_moo.py). The file `seh_frag_moo.py` is runnable as-is (although you may want to change the default configuration in `main()`).
+
+For a gentler introduction to the library, see [Getting Started](docs/getting_started.md). For a more in-depth look at the library, see [Implementation Notes](docs/implementation_notes.md).
+
+## Repo overview
+
+- [algo](src/gflownet/algo), contains GFlowNet algorithms implementations ([Trajectory Balance](https://arxiv.org/abs/2201.13259), [SubTB](https://arxiv.org/abs/2209.12782), [Flow Matching](https://arxiv.org/abs/2106.04399)), as well as some baselines. These implement how to sample trajectories from a model and compute the loss from trajectories.
+- [data](src/gflownet/data), contains dataset definitions, data loading and data sampling utilities.
+- [envs](src/gflownet/envs), contains environment classes; the base environment is agnostic to what kind of graph is being made, and context classes specify mappings from graphs to objects (e.g. molecules) and torch geometric Data.
+- [examples](docs/examples), contains simple example implementations of GFlowNet.
+- [models](src/gflownet/models), contains model definitions.
+- [tasks](src/gflownet/tasks), contains training code.
+    -  [qm9](src/gflownet/tasks/qm9/qm9.py), temperature-conditional molecule sampler based on QM9's HOMO-LUMO gap data as a reward.
+    -  [seh_frag](src/gflownet/tasks/seh_frag.py), reproducing Bengio et al. 2021, fragment-based molecule design targeting the sEH protein
+    -  [seh_frag_moo](src/gflownet/tasks/seh_frag_moo.py), same as the above, but with multi-objective optimization (incl. QED, SA, and molecule weight objectives).
+- [utils](src/gflownet/utils), contains utilities (multiprocessing, metrics, conditioning).
+- [`trainer.py`](src/gflownet/trainer.py), defines a general harness for training GFlowNet models.
+- [`online_trainer.py`](src/gflownet/online_trainer.py), defines a typical online-GFN training loop.
+
+See [implementation notes](docs/implementation_notes.md) for more.
+
+
 ## Developing & Contributing
 
 External contributions are welcome.
@@ -73,3 +77,5 @@ pip install -e '.[dev]' --find-links https://data.pyg.org/whl/torch-2.1.2+cu121.
 
 We use `tox` to run tests and linting, and `pre-commit` to run checks before committing.
 To ensure that these checks pass, simply run `tox -e style` and `tox run` to run linters and tests, respectively.
+
+For more information, see [Contributing](docs/contributing.md).
diff --git a/docs/contributing.md b/docs/contributing.md
@@ -0,0 +1,38 @@
+# Contributing
+
+Contributions to the repository are welcome, and we encourage you to open issues and pull requests. In general, it is recommended to fork this repository and open a pull request from your fork to the `trunk` branch. PRs are encouraged to be short and focused, and to include tests and documentation where appropriate.
+
+## Installation
+
+To install the developers dependencies run:
+```
+pip install -e '.[dev]' --find-links https://data.pyg.org/whl/torch-2.1.2+cu121.html
+```
+
+## Dependencies
+
+Dependencies are defined in `pyproject.toml`, and frozen versions that are known to work are provided in `requirements/`. 
+
+To regenerate the frozen versions, run `./generate_requirements.sh <ENV-NAME>`. See comments within.
+
+## Linting and testing
+
+We use `tox` to run tests and linting, and `pre-commit` to run checks before committing.
+To ensure that these checks pass, simply run `tox -e style` and `tox run` to run linters and tests, respectively.
+
+`tox` itself runs many linters, but the most important ones are `black`, `ruff`, `isort`, and `mypy`. The full list
+of linting tools is found in `.pre-commit-config.yaml`, while `tox.ini` defines the environments under which these 
+linters (as well as tests) are run.
+
+## Github Actions
+
+We use Github Actions to run tests and linting on every push and pull request. The configuration for these actions is found in `.github/workflows/`.
+
+The cascade of events is as follows:
+- For `build-and-test`, `tox -> testenv:py310 -> pytest` is run.
+- For `code-quality`, `tox -e style -> testenv:style -> pre-commit -> {isort, black, mypy, bandit, ruff, & others}`. This and the "others" are defined in `.pre-commit-config.yaml` and include things like checking for secrets and trailing whitespace.
+
+## Style Guide
+
+On top of `black`-as-a-style-guide, we generally adhere to the [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html). 
+Our docstrings follow the [numpydoc](https://numpydoc.readthedocs.io/en/latest/format.html) format, and we use type hints throughout the codebase.
diff --git a/docs/getting_started.md b/docs/getting_started.md
@@ -0,0 +1,11 @@
+# Getting Started 
+
+For an introduction to the library, see [this colab notebook](https://colab.research.google.com/drive/1wANyo6Y-ceYEto9-p50riCsGRb_6U6eH).
+
+For an introduction to using `wandb` to log experiments, see [this demo](../src/gflownet/hyperopt/wandb_demo).
+
+For more general introductions to GFlowNets, check out the following:
+- The 2023 [GFlowNet workshop](https://gflownet.org/) has several introductory talks and colab tutorials.
+- This high-level [GFlowNet colab tutorial](https://colab.research.google.com/drive/1fUMwgu2OhYpQagpzU5mhe9_Esib3Q2VR) (updated versions of which were written for the 2023 workshop, in particular for continuous GFNs).
+
+A good place to get started immediately is with the [sEH fragment-based MOO task](src/gflownet/tasks/seh_frag_moo.py). The file `seh_frag_moo.py` is runnable as-is (although you may want to change the default configuration in `main()`).
diff --git a/src/gflownet/__init__.py b/src/gflownet/__init__.py
@@ -1,17 +1,18 @@
-from typing import Dict, List, NewType, Optional, Tuple
+from typing import Any, Dict, List, NewType, Optional, Tuple
 
 import torch_geometric.data as gd
-from rdkit.Chem.rdchem import Mol as RDMol
 from torch import Tensor, nn
 
 from .config import Config
 
-# This type represents an unprocessed list of reward signals/conditioning information
-FlatRewards = NewType("FlatRewards", Tensor)  # type: ignore
+# This type represents a set of scalar properties attached to each object in a batch.
+ObjectProperties = NewType("ObjectProperties", Tensor)  # type: ignore
 
-# This type represents the outcome for a multi-objective task of
-# converting FlatRewards to a scalar, e.g. (sum R_i omega_i) ** beta
-RewardScalar = NewType("RewardScalar", Tensor)  # type: ignore
+# This type represents log-scalars, in particular log-rewards at the scale we operate with with GFlowNets
+# for example, converting a reward ObjectProperties to a log-scalar with log [(sum R_i omega_i) ** beta]
+LogScalar = NewType("LogScalar", Tensor)  # type: ignore
+# This type represents linear-scalars
+LinScalar = NewType("LinScalar", Tensor)  # type: ignore
 
 
 class GFNAlgorithm:
@@ -75,15 +76,15 @@ def get_random_action_prob(self, it: int):
 
 
 class GFNTask:
-    def cond_info_to_logreward(self, cond_info: Dict[str, Tensor], flat_reward: FlatRewards) -> RewardScalar:
+    def cond_info_to_logreward(self, cond_info: Dict[str, Tensor], obj_props: ObjectProperties) -> LogScalar:
         """Combines a minibatch of reward signal vectors and conditional information into a scalar reward.
 
         Parameters
         ----------
         cond_info: Dict[str, Tensor]
             A dictionary with various conditional informations (e.g. temperature)
-        flat_reward: FlatRewards
-            A 2d tensor where each row represents a series of flat rewards.
+        obj_props: ObjectProperties
+            A 2d tensor where each row represents a series of object properties.
 
         Returns
         -------
@@ -92,18 +93,35 @@ def cond_info_to_logreward(self, cond_info: Dict[str, Tensor], flat_reward: Flat
         """
         raise NotImplementedError()
 
-    def compute_flat_rewards(self, mols: List[RDMol]) -> Tuple[FlatRewards, Tensor]:
-        """Compute the flat rewards of mols according the the tasks' proxies
+    def compute_obj_properties(self, objs: List[Any]) -> Tuple[ObjectProperties, Tensor]:
+        """Compute the flat rewards of objs according the the tasks' proxies
 
         Parameters
         ----------
-        mols: List[RDMol]
-            A list of RDKit molecules.
+        objs: List[Any]
+            A list of n objects.
         Returns
         -------
-        reward: FlatRewards
-            A 2d tensor, a vector of scalar reward for valid each molecule.
+        obj_probs: ObjectProperties
+            A 2d tensor (m, p), a vector of scalar properties for the m <= n valid objects.
         is_valid: Tensor
-            A 1d tensor, a boolean indicating whether the molecule is valid.
+            A 1d tensor (n,), a boolean indicating whether each object is valid.
+        """
+        raise NotImplementedError()
+
+    def sample_conditional_information(self, n: int, train_it: int) -> Dict[str, Tensor]:
+        """Sample conditional information for n objects
+
+        Parameters
+        ----------
+        n: int
+            The number of objects to sample conditional information for.
+        train_it: int
+            The training iteration number.
+
+        Returns
+        -------
+        cond_info: Dict[str, Tensor]
+            A dictionary with various conditional informations (e.g. temperature)
         """
         raise NotImplementedError()
diff --git a/src/gflownet/algo/advantage_actor_critic.py b/src/gflownet/algo/advantage_actor_critic.py
@@ -1,4 +1,3 @@
-import numpy as np
 import torch
 import torch.nn as nn
 import torch_geometric.data as gd
@@ -16,7 +15,6 @@ def __init__(
         self,
         env: GraphBuildingEnv,
         ctx: GraphBuildingEnvContext,
-        rng: np.random.RandomState,
         cfg: Config,
     ):
         """Advantage Actor-Critic implementation, see
@@ -34,15 +32,12 @@ def __init__(
             A graph environment.
         ctx: GraphBuildingEnvContext
             A context.
-        rng: np.random.RandomState
-            rng used to take random actions
         cfg: Config
             The experiment configuration
 
         """
         self.ctx = ctx
         self.env = env
-        self.rng = rng
         self.max_len = cfg.algo.max_len
         self.max_nodes = cfg.algo.max_nodes
         self.illegal_action_logreward = cfg.algo.illegal_action_logreward
@@ -54,7 +49,7 @@ def __init__(
         # Experimental flags
         self.sample_temp = 1
         self.do_q_prime_correction = False
-        self.graph_sampler = GraphSampler(ctx, env, self.max_len, self.max_nodes, rng, self.sample_temp)
+        self.graph_sampler = GraphSampler(ctx, env, self.max_len, self.max_nodes, self.sample_temp)
 
     def create_training_data_from_own_samples(
         self, model: nn.Module, n: int, cond_info: Tensor, random_action_prob: float
@@ -82,7 +77,7 @@ def create_training_data_from_own_samples(
         """
         dev = get_worker_device()
         cond_info = cond_info.to(dev)
-        data = self.graph_sampler.sample_from_model(model, n, cond_info, dev, random_action_prob)
+        data = self.graph_sampler.sample_from_model(model, n, cond_info, random_action_prob)
         return data
 
     def create_training_data_from_graphs(self, graphs):
@@ -152,12 +147,12 @@ def compute_batch_losses(self, model: nn.Module, batch: gd.Batch, num_bootstrap:
         # of length 4, trajectory 1 of length 3, and so on.
         batch_idx = torch.arange(num_trajs, device=dev).repeat_interleave(batch.traj_lens)
 
-        # Forward pass of the model, returns a GraphActionCategorical and per molecule predictions
+        # Forward pass of the model, returns a GraphActionCategorical and per graph predictions
         # Here we will interpret the logits of the fwd_cat as Q values
         policy, per_state_preds = model(batch, cond_info[batch_idx])
         V = per_state_preds[:, 0]
         G = rewards[batch_idx]  # The return is the terminal reward everywhere, we're using gamma==1
-        G = G + (1 - batch.is_valid[batch_idx]) * self.invalid_penalty  # Add in penalty for invalid mol
+        G = G + (1 - batch.is_valid[batch_idx]) * self.invalid_penalty  # Add in penalty for invalid object
         A = G - V
         log_probs = policy.log_prob(batch.actions)