BioNeuralNet: A Multi-Omics Integration and GNN-Based Embedding Framework

Welcome to BioNeuralNet Beta 0.1

Note: This is a beta version of BioNeuralNet. It is under active development, and certain features may be incomplete or subject to change. Feedback and bug reports are highly encouraged to help us improve the tool.

BioNeuralNet is a Python-based software tool designed to streamline the integration of multi-omics data with Graph Neural Network (GNN) embeddings. It supports graph clustering, subject representation, and disease prediction, enabling advanced analyses of complex multi-omics networks.

Key Features

A typical BioNeuralNet workflow has five core steps:

1. Graph Construction

Not performed internally. You supply adjacency matrices built externally (e.g., via WGCNA, SmCCNet, or your own scripts).
Lightweight wrappers are available in bioneuralnet.external_tools (e.g., WGCNA, SmCCNet) for convenience, but BioNeuralNet’s pipeline does not require them.

2. Graph Clustering

Identify functional modules or communities using PageRank.
The PageRank module enables finding subnetwork clusters through personalized sweep cuts, capturing local neighborhoods influenced by seed nodes.
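
To make the idea of a personalized sweep cut concrete, here is a minimal pure-Python sketch. It is not the BioNeuralNet PageRank module: the function names, the damping value, and the dict-of-dicts graph layout are assumptions for illustration. It runs personalized PageRank from seed nodes, orders nodes by rank divided by degree, and keeps the prefix with the lowest conductance.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=100):
    """Power iteration on an undirected weighted graph (dict of dicts)."""
    seeds = set(seeds)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in graph}
    degree = {n: sum(graph[n].values()) for n in graph}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in graph}
        for n in graph:
            if degree[n] == 0:
                # Isolated node: send its mass back to the seeds.
                for s in seeds:
                    new_rank[s] += damping * rank[n] / len(seeds)
                continue
            for m, w in graph[n].items():
                new_rank[m] += damping * rank[n] * w / degree[n]
        rank = new_rank
    return rank


def sweep_cut(graph, rank):
    """Return the rank-ordered prefix with the lowest conductance."""
    degree = {n: sum(graph[n].values()) for n in graph}
    total_volume = sum(degree.values())
    order = sorted(
        (n for n in graph if degree[n] > 0 and rank[n] > 0),
        key=lambda n: rank[n] / degree[n],
        reverse=True,
    )
    members, volume, cut = set(), 0.0, 0.0
    best_cluster, best_conductance = set(), float("inf")
    for node in order:
        # Edges into the current set stop being cut; all others become cut.
        for neighbor, weight in graph[node].items():
            cut += -weight if neighbor in members else weight
        members.add(node)
        volume += degree[node]
        denominator = min(volume, total_volume - volume)
        if denominator <= 0:
            continue  # the prefix covers the whole graph
        conductance = cut / denominator
        if conductance < best_conductance:
            best_cluster, best_conductance = set(members), conductance
    return best_cluster, best_conductance


# Two triangles joined by a single edge (c-d); no self-loops assumed.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
graph = {}
for u, v in edges:
    graph.setdefault(u, {})[v] = 1.0
    graph.setdefault(v, {})[u] = 1.0

rank = personalized_pagerank(graph, seeds=["a"])
cluster, conductance = sweep_cut(graph, rank)
print(sorted(cluster), conductance)
```

Seeding at "a" recovers the triangle containing it, since cutting the single bridge edge gives the lowest conductance among the prefixes.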

3. Network Embedding

Generate embeddings using methods like GCN, GAT, GraphSAGE, and GIN.
You can attach numeric labels to nodes or remain “unsupervised,” relying solely on graph structure and node features (e.g., correlation with clinical data).

4. Subject Representation

Integrate node embeddings back into omics data, enriching each subject’s feature vector by weighting columns with the learned embedding scalars.
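
The weighting step can be pictured with the following sketch. It assumes each node's embedding vector is collapsed to a single scalar by taking its mean; that reduction, the function names, and the dict layout are illustrative assumptions, and the library may reduce embeddings differently.

```python
def reduce_embeddings(embeddings):
    """Collapse each node's embedding vector to one scalar (assumed: the mean)."""
    return {node: sum(vec) / len(vec) for node, vec in embeddings.items()}


def weight_omics(subjects, scalars):
    """Multiply every omics column by the scalar of its matching node."""
    return {
        subject: {feature: value * scalars[feature] for feature, value in row.items()}
        for subject, row in subjects.items()
    }


# Hypothetical embeddings for two nodes and omics values for two subjects.
embeddings = {"gene_1": [0.5, 1.5], "protein_1": [0.25, 0.75]}
subjects = {
    "s1": {"gene_1": 2.0, "protein_1": 4.0},
    "s2": {"gene_1": 1.0, "protein_1": 6.0},
}

scalars = reduce_embeddings(embeddings)
enriched = weight_omics(subjects, scalars)
print(scalars)
print(enriched)
```

Each subject keeps the same columns; only their scale changes, so features with larger embedding scalars contribute more to downstream models.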

5. Downstream Tasks

Perform advanced analyses, such as disease prediction, via DPMON, which trains a GNN end-to-end alongside a classifier to incorporate both local and global network information.

Installation

BioNeuralNet supports Python 3.10 and 3.11 in this beta release. Follow the steps below to set up BioNeuralNet and its dependencies.

1. Install BioNeuralNet via pip

To install the core BioNeuralNet modules for GNN embeddings, subject representation, disease prediction (DPMON), and clustering:

pip install bioneuralnet==0.1.0b1

2. Install PyTorch and PyTorch Geometric (Separately)

BioNeuralNet relies on PyTorch and PyTorch Geometric for GNN operations:

Install PyTorch:

pip install torch torchvision torchaudio

Install PyTorch Geometric:
```
pip install torch_geometric
```

For GPU-accelerated builds or other configurations, consult the official guides and select the appropriate build for your system (e.g., Stable, Linux, pip, Python, CPU/GPU):

PyTorch Installation Guide

PyTorch Geometric Installation Guide
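
After installing, a quick check along these lines can confirm that the packages are discoverable and that the interpreter is one of the supported versions. The helper name and report keys are made up for this sketch; only standard-library calls and torch.cuda.is_available() are used.

```python
import sys
from importlib.util import find_spec


def check_environment(packages=("torch", "torch_geometric", "bioneuralnet")):
    """Report which packages can be found and whether Python is 3.10 or 3.11."""
    report = {name: find_spec(name) is not None for name in packages}
    report["python_supported"] = sys.version_info[:2] in {(3, 10), (3, 11)}
    return report


report = check_environment()
for name, ok in report.items():
    print(f"{name}: {'yes' if ok else 'no'}")

# Only import torch when it is present, then ask whether a CUDA device is usable.
if report["torch"]:
    import torch
    print("CUDA available:", torch.cuda.is_available())
```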

3. (Optional) Install R and External Tools

If you plan to use WGCNA or SmCCNet for network construction:

Install R from The R Project.
Install the required R packages. Open R and run the following:

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
install.packages(c("dplyr", "jsonlite"))
BiocManager::install(c("impute", "preprocessCore", "GO.db", "AnnotationDbi"))
install.packages("SmCCNet")
install.packages("WGCNA")

4. Additional Notes for External Tools

For Node2Vec, feature selection, or visualization modules, refer to the external tools documentation in bioneuralnet.external_tools. Examples include:

Node2Vec: Node2Vec-based embeddings.
FeatureSelector: Basic feature selection strategies like LassoCV and random forest.
HierarchicalClustering: Agglomerative clustering and silhouette scoring.
StaticVisualizer and DynamicVisualizer: Static or interactive network visualization.
SmCCNet / WGCNA: Build adjacency matrices using R-based libraries.

These integrations are optional and do not form part of the core pipeline.

5. Development Setup (Optional)

If you plan to contribute to BioNeuralNet:

git clone https://github.com/UCD-BDLab/BioNeuralNet.git
cd BioNeuralNet
pip install -r requirements-dev.txt
pre-commit install
pytest

Quick Example: Transforming Multi-Omics for Enhanced Disease Prediction


External Tools

Several external tools are available through the bioneuralnet.external_tools module.

These tools were implemented to facilitate testing and are not part of the package's core functionality.
The classes in the external_tools module are lightweight wrappers around existing tools and libraries and offer minimal functionality.
We encourage users to explore these tools outside of BioNeuralNet to fully leverage their capabilities.

Steps

Below is a quick example demonstrating the following:

Building or Importing a Network Adjacency Matrix:
- For instance, using external tools like SmCCNet.
Using DPMON for Disease Prediction:
- A detailed explanation follows.

1. Data Preparation

Input your multi-omics data (e.g., proteomics, metabolomics, genomics) along with phenotype data.

2. Network Construction

Not performed internally. You supply adjacency matrices built externally (e.g., via WGCNA, SmCCNet, or your own scripts).
Lightweight wrappers are available in bioneuralnet.external_tools (e.g., WGCNA, SmCCNet) for convenience, but BioNeuralNet’s pipeline does not require them.

3. Disease Prediction

DPMON integrates GNN-based node embeddings with a downstream neural network to predict disease phenotypes.

Code Example:

import pandas as pd

from bioneuralnet.external_tools import SmCCNet
from bioneuralnet.downstream_task import DPMON

# 1) Prepare data
omics_data = pd.read_csv("data/omics_data.csv")
phenotype_data = pd.read_csv("data/phenotype_data.csv")
clinical_data = pd.read_csv("data/clinical_data.csv")

# 2) Run SmCCNet to get the adjacency matrix
smccnet = SmCCNet(
    phenotype_df=phenotype_data,
    omics_df=omics_data,
    data_types=["genes", "proteins"],
    kfolds=5,
    summarization="NetSHy",
    seed=127,
)
adjacency_matrix = smccnet.run()

# 3) Disease prediction with DPMON
dpmon = DPMON(
    adjacency_matrix=adjacency_matrix,
    omics_list=[omics_data],
    phenotype_data=phenotype_data,
    clinical_data=clinical_data,
    model="GAT",
    gnn_hidden_dim=64,
    layer_num=3,
    nn_hidden_dim1=2,
    nn_hidden_dim2=2,
    epoch_num=10,
    repeat_num=5,
    lr=0.01,
    weight_decay=1e-4,
    tune=True,
    gpu=False,  # set to True to train on a GPU
)
predictions = dpmon.run()
print("Disease predictions:\n", predictions)

Output

Adjacency Matrix: The multi-omics network representation.
Predictions: Disease phenotype predictions for each sample as a DataFrame linking subjects to predicted classes.
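
Once predictions are available, a common next step is to compare them with known phenotypes. The sketch below assumes both have already been converted to plain subject-to-class dictionaries; the structure of the DataFrame returned by DPMON is not specified here, so the conversion step and all names in this example are hypothetical.

```python
from collections import Counter


def evaluate(predicted, actual):
    """Return accuracy and (actual, predicted) counts over shared subjects."""
    shared = sorted(set(predicted) & set(actual))
    confusion = Counter((actual[s], predicted[s]) for s in shared)
    correct = sum(n for (truth, guess), n in confusion.items() if truth == guess)
    accuracy = correct / len(shared) if shared else float("nan")
    return accuracy, confusion


# Hypothetical class labels for four subjects.
predicted = {"s1": 0, "s2": 1, "s3": 1, "s4": 0}
actual = {"s1": 0, "s2": 1, "s3": 0, "s4": 0}

accuracy, confusion = evaluate(predicted, actual)
print("accuracy:", accuracy)
print("confusion counts:", dict(confusion))
```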

Documentation & Tutorials

Extensive documentation at Read the Docs
Tutorials illustrating:
- Unsupervised vs. label-based GNN usage
- PageRank clustering and hierarchical clustering
- Subject representation
- Integrating external tools like WGCNA, SmCCNet

Frequently Asked Questions (FAQ)

Key topics include:

GPU acceleration vs. CPU-only
External Tools usage (R-based adjacency construction, Node2Vec, etc.)
DPMON for local/global structure-based disease prediction
PageRank or HierarchicalClustering for subnetwork identification

See FAQ for more.

Acknowledgments

BioNeuralNet relies on and interfaces with various open-source libraries. We extend our gratitude to the developers and contributors of these projects for their invaluable tools and resources.

Core Dependencies

PyYAML - MIT License
pandas - BSD 3-Clause License
numpy - BSD 3-Clause License
scikit-learn - BSD 3-Clause License
node2vec - MIT License
matplotlib - Matplotlib License
ray - Apache 2.0 License
tensorboardX - MIT License
networkx - BSD License
pyvis - MIT License
leidenalg - GNU LGPL v3
dtt - MIT License
pyreadr - MIT License
torch - BSD License
torch_geometric - MIT License

Development Dependencies

These tools are essential for the development and maintenance of BioNeuralNet but are not required for end-users.

pytest - MIT License
pytest-cov - MIT License
pytest-mock - MIT License
Sphinx - BSD License
Sphinx RTD Theme - BSD License
sphinx-autosummary-accessors - MIT License
sphinxcontrib-napoleon - BSD License
flake8 - MIT License
Black - MIT License
mypy - MIT License
pre-commit - MIT License
tox - MIT License
setuptools - MIT License
twine - MIT License

External Tools

BioNeuralNet integrates with external tools to enhance functionality:

WGCNA - GPL-3.0 License
SmCCNet - GPL-3.0 License

Special Thanks

We appreciate the efforts of these communities and all contributors who make open-source development possible. Your dedication and hard work enable projects like BioNeuralNet to thrive and evolve.

Testing & CI

Local Testing:

pytest --cov=bioneuralnet --cov-report=html
open htmlcov/index.html

Continuous Integration:
- GitHub Actions run our test suite and code checks on each commit and PR.

Contributing

Fork the repository, create a new branch, and implement your changes.
Add/Update tests, docstrings, and examples if appropriate.
Open a pull request describing your modifications.

For more details, see our FAQ or open an issue.

License

License: MIT License

Contact

Questions or Feature Requests: Open an issue
Email: [email protected]

BioNeuralNet aims to streamline multi-omics network analysis by providing graph clustering, GNN embedding, subject representation, and disease prediction tools. We hope it helps uncover new insights in multi-omics research.
