Welcome to BioNeuralNet Beta 0.1
Note: This is a beta version of BioNeuralNet. It is under active development, and certain features may be incomplete or subject to change. Feedback and bug reports are highly encouraged to help us improve the tool.
BioNeuralNet is a Python-based software tool designed to streamline the integration of multi-omics data with Graph Neural Network (GNN) embeddings. It supports graph clustering, subject representation, and disease prediction, enabling advanced analyses of complex multi-omics networks.
BioNeuralNet offers five core steps in a typical workflow:
- Graph construction is not performed internally. You provide or build adjacency matrices externally (e.g., via WGCNA, SmCCNet, or your own scripts).
- Lightweight wrappers are available in bioneuralnet.external_tools (e.g., WGCNA, SmCCNet) for convenience. These wrappers are optional; BioNeuralNet’s pipeline does not require them.
- Identify functional modules or communities using PageRank.
- The PageRank module enables finding subnetwork clusters through personalized sweep cuts, capturing local neighborhoods influenced by seed nodes.
- Generate embeddings using methods like GCN, GAT, GraphSAGE, and GIN.
- You can attach numeric labels to nodes or remain “unsupervised,” relying solely on graph structure and node features (e.g., correlation with clinical data).
- Integrate node embeddings back into omics data, enriching each subject’s feature vector by weighting columns with the learned embedding scalars.
- Perform advanced analyses, such as disease prediction, via DPMON, which trains a GNN end-to-end alongside a classifier to incorporate both local and global network information.
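To make the clustering step concrete, here is a minimal pure-Python sketch of personalized PageRank followed by a sweep cut. It is an illustration of the general technique, not BioNeuralNet's PageRank module: the function names, the nested-dict graph format, and the default damping factor `alpha=0.85` are all assumptions made for this example.

```python
def personalized_pagerank(adj, seeds, alpha=0.85, iters=100):
    """Power iteration for personalized PageRank on a weighted, undirected graph.

    adj   : dict mapping node -> {neighbor: edge weight}
    seeds : nodes that receive all of the teleport probability
    """
    seeds = set(seeds)
    nodes = list(adj)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    degree = {n: sum(adj[n].values()) for n in nodes}
    rank = dict(teleport)
    for _ in range(iters):
        nxt = {n: (1.0 - alpha) * teleport[n] for n in nodes}
        for u in nodes:
            if degree[u] == 0:
                # Isolated node: hand its mass back to the seeds.
                for s in seeds:
                    nxt[s] += alpha * rank[u] / len(seeds)
                continue
            for v, w in adj[u].items():
                nxt[v] += alpha * rank[u] * w / degree[u]
        rank = nxt
    return rank


def sweep_cut(adj, rank):
    """Scan nodes by rank/degree and keep the prefix with the lowest conductance."""
    degree = {n: sum(adj[n].values()) for n in adj}
    total_volume = sum(degree.values())
    order = sorted(
        (n for n in adj if rank[n] > 0 and degree[n] > 0),
        key=lambda n: rank[n] / degree[n],
        reverse=True,
    )
    best_set, best_phi = set(), float("inf")
    current, volume, cut = set(), 0.0, 0.0
    for node in order:
        for nbr, w in adj[node].items():
            if nbr == node:
                continue  # self-loops never cross the cut
            # Edges to nodes already inside stop being cut; the rest start being cut.
            cut += -w if nbr in current else w
        current.add(node)
        volume += degree[node]
        denom = min(volume, total_volume - volume)
        if denom <= 0:
            break
        phi = cut / denom
        if phi < best_phi:
            best_phi, best_set = phi, set(current)
    return best_set, best_phi


# Toy network: two triangles joined by a single bridge edge (c-d).
graph = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0, "b": 1.0, "d": 1.0},
    "d": {"c": 1.0, "e": 1.0, "f": 1.0},
    "e": {"d": 1.0, "f": 1.0},
    "f": {"d": 1.0, "e": 1.0},
}
scores = personalized_pagerank(graph, seeds={"a"})
cluster, conductance = sweep_cut(graph, scores)
print(sorted(cluster), round(conductance, 3))
```

Seeding at node a, the sweep recovers the triangle containing the seed, because cutting the single bridge edge gives the lowest conductance.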
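The subject representation step can be pictured with a small sketch. It assumes each node embedding is collapsed to a single scalar by averaging, which is only one possible reduction and may differ from what BioNeuralNet does internally; the helper names `embedding_scalars` and `weight_omics` are hypothetical.

```python
def embedding_scalars(embeddings):
    """Collapse each node's embedding vector to one scalar (here: its mean)."""
    return {node: sum(vec) / len(vec) for node, vec in embeddings.items()}


def weight_omics(omics_rows, scalars):
    """Multiply every omics column by the scalar of the matching node.

    Columns without an embedding are left unchanged (weight 1.0).
    """
    return [
        {col: value * scalars.get(col, 1.0) for col, value in row.items()}
        for row in omics_rows
    ]


# Two features (network nodes) with 2-dimensional embeddings.
embeddings = {"gene_a": [0.5, 1.5], "gene_b": [0.0, 0.5]}
# One row per subject, one column per feature.
omics_rows = [
    {"gene_a": 2.0, "gene_b": 4.0},
    {"gene_a": 1.0, "gene_b": 8.0},
]
scalars = embedding_scalars(embeddings)
enriched = weight_omics(omics_rows, scalars)
print(enriched)
```

Each subject keeps the same columns, but features that the network embedding scores highly contribute more to downstream models.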
BioNeuralNet supports Python 3.10 and 3.11 in this beta release. Follow the steps below to set up BioNeuralNet and its dependencies.
To install the core BioNeuralNet modules for GNN embeddings, subject representation, disease prediction (DPMON), and clustering:
pip install bioneuralnet==0.1.0b1
BioNeuralNet relies on PyTorch and PyTorch Geometric for GNN operations:
- Install PyTorch:
pip install torch torchvision torchaudio
- Install PyTorch Geometric:
pip install torch_geometric
For GPU-accelerated builds or other configurations, visit their official guides:
- PyTorch Installation Guide: select the appropriate build for your system (e.g., Stable, Linux, pip, Python, CPU/GPU).
- PyTorch Geometric Installation Guide
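After installing, a quick sanity check can confirm that the interpreter version and the packages named above are visible. This is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, not a BioNeuralNet function, and it only tests that the packages can be located, not that they work with your GPU.

```python
import importlib.util
import sys


def check_environment(packages=("torch", "torch_geometric", "bioneuralnet")):
    """Report whether the Python version is supported and each package can be found."""
    report = {"python_supported": sys.version_info[:2] in {(3, 10), (3, 11)}}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'missing or unsupported'}")
```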
If you plan to use WGCNA or SmCCNet for network construction:
- Install R from The R Project.
- Install the required R packages. Open R and run the following:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
install.packages(c("dplyr", "jsonlite"))
BiocManager::install(c("impute", "preprocessCore", "GO.db", "AnnotationDbi"))
install.packages("SmCCNet")
install.packages("WGCNA")
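Before using the R-based wrappers, it can help to confirm from Python that R and the packages above are reachable. The sketch below assumes R is invoked through the `Rscript` executable on your PATH; how BioNeuralNet's wrappers actually locate R is not covered here, and both helper names are hypothetical.

```python
import shutil
import subprocess


def r_available(executable="Rscript"):
    """Return True if the R command-line executable is on PATH."""
    return shutil.which(executable) is not None


def missing_r_packages(packages, executable="Rscript"):
    """Return the packages that R cannot load (all of them if R is absent)."""
    if not r_available(executable):
        return list(packages)
    missing = []
    for pkg in packages:
        # requireNamespace returns TRUE/FALSE without attaching the package.
        expr = f'cat(requireNamespace("{pkg}", quietly = TRUE))'
        result = subprocess.run(
            [executable, "-e", expr], capture_output=True, text=True
        )
        if not result.stdout.strip().endswith("TRUE"):
            missing.append(pkg)
    return missing


if __name__ == "__main__":
    print("R found:", r_available())
    print("Missing R packages:", missing_r_packages(["WGCNA", "SmCCNet"]))
```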
For Node2Vec, feature selection, or visualization modules, refer to the external tools documentation in bioneuralnet.external_tools. Examples include:
- Node2Vec: Node2Vec-based embeddings.
- FeatureSelector: Basic feature selection strategies like LassoCV and random forest.
- HierarchicalClustering: Agglomerative clustering and silhouette scoring.
- StaticVisualizer and DynamicVisualizer: Static or interactive network visualization.
- SmCCNet / WGCNA: Build adjacency matrices using R-based libraries.
These integrations are optional and do not form part of the core pipeline.
If you plan to contribute to BioNeuralNet:
git clone https://github.com/UCD-BDLab/BioNeuralNet.git
cd BioNeuralNet
pip install -r requirements-dev.txt
pre-commit install
pytest
BioNeuralNet: Transforming Multi-Omics for Enhanced Disease Prediction
We offer a number of external tools available through the bioneuralnet.external_tools module:
- These tools were implemented to facilitate testing and should not be considered part of the package's core functionality.
- The classes inside the external_tools module are lightweight wrappers around existing tools and libraries offering minimal functionality.
- We highly encourage users to explore these tools outside of BioNeuralNet to fully leverage their capabilities.
Below is a quick example demonstrating the following:
- Building or Importing a Network Adjacency Matrix:
  - For instance, using external tools like SmCCNet.
- Using DPMON for Disease Prediction:
  - A detailed explanation follows.
- Input your multi-omics data (e.g., proteomics, metabolomics, genomics) along with phenotype data.
- Network construction is not performed internally: you need to provide or build adjacency matrices externally (e.g., via WGCNA, SmCCNet, or your own scripts).
- Lightweight wrappers are available in bioneuralnet.external_tools (e.g., WGCNA, SmCCNet) for convenience. These wrappers are optional; BioNeuralNet’s pipeline does not require them.
- DPMON integrates GNN-based node embeddings with a downstream neural network to predict disease phenotypes.
import pandas as pd

from bioneuralnet.downstream_task import DPMON
from bioneuralnet.external_tools import SmCCNet

# 1) Prepare data
omics_data = pd.read_csv("data/omics_data.csv")
phenotype_data = pd.read_csv("data/phenotype_data.csv")
clinical_data = pd.read_csv("data/clinical_data.csv")

# 2) Run SmCCNet to get the adjacency matrix
smccnet = SmCCNet(
    phenotype_df=phenotype_data,
    omics_df=omics_data,
    data_types=["genes", "proteins"],
    kfolds=5,
    summarization="NetSHy",
    seed=127,
)
adjacency_matrix = smccnet.run()

# 3) Disease prediction with DPMON
dpmon = DPMON(
    adjacency_matrix=adjacency_matrix,
    omics_list=[omics_data],
    phenotype_data=phenotype_data,
    clinical_data=clinical_data,
    model="GAT",
    gnn_hidden_dim=64,
    layer_num=3,
    nn_hidden_dim1=2,
    nn_hidden_dim2=2,
    epoch_num=10,
    repeat_num=5,
    lr=0.01,
    weight_decay=1e-4,
    tune=True,
    gpu=False,
)
predictions = dpmon.run()
print("Disease predictions:\n", predictions)
- Adjacency Matrix: The multi-omics network representation.
- Predictions: Disease phenotype predictions for each sample as a DataFrame linking subjects to predicted classes.
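If no R tooling is available, the adjacency matrix in step 2 can come from your own script instead. The following is a minimal sketch of one common approach, a thresholded absolute Pearson correlation network, written in plain Python so it runs without extra dependencies. The helper names and the 0.5 threshold are assumptions for illustration, and the nested-dict output would still need converting to whatever format DPMON expects, which is not confirmed here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def correlation_adjacency(features, threshold=0.5):
    """Build a symmetric adjacency with |r| as edge weight where |r| >= threshold.

    features : dict mapping feature name -> list of values, one per subject
    """
    names = list(features)
    adj = {a: {b: 0.0 for b in names} for a in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = abs(pearson(features[a], features[b]))
            if r >= threshold:
                adj[a][b] = adj[b][a] = r
    return adj


# Toy omics data: three features measured on four subjects.
features = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "protein_c": [1.0, -1.0, 1.0, -1.0],
}
adjacency = correlation_adjacency(features)
for name, row in adjacency.items():
    print(name, row)
```

With pandas installed, `DataFrame.corr()` followed by thresholding achieves the same result more concisely on real data.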
- Extensive documentation at Read the Docs
- Tutorials illustrating:
  - Unsupervised vs. label-based GNN usage
  - PageRank clustering and hierarchical clustering
  - Subject representation
  - Integrating external tools like WGCNA and SmCCNet
Key topics include:
- GPU acceleration vs. CPU-only
- External Tools usage (R-based adjacency construction, Node2Vec, etc.)
- DPMON for local/global structure-based disease prediction
- PageRank or HierarchicalClustering for subnetwork identification
See FAQ for more.
BioNeuralNet relies on and interfaces with various open-source libraries. We extend our gratitude to the developers and contributors of these projects for their invaluable tools and resources.
- PyYAML - MIT License
- pandas - BSD 3-Clause License
- numpy - BSD 3-Clause License
- scikit-learn - BSD 3-Clause License
- node2vec - MIT License
- matplotlib - Matplotlib License
- ray - Apache 2.0 License
- tensorboardX - MIT License
- networkx - BSD License
- pyvis - MIT License
- leidenalg - GNU LGPL v3
- dtt - MIT License
- pyreadr - MIT License
- torch - BSD License
- torch_geometric - MIT License
These tools are essential for the development and maintenance of BioNeuralNet but are not required for end-users.
- pytest - MIT License
- pytest-cov - MIT License
- pytest-mock - MIT License
- Sphinx - BSD License
- Sphinx RTD Theme - BSD License
- sphinx-autosummary-accessors - MIT License
- sphinxcontrib-napoleon - BSD License
- flake8 - MIT License
- Black - MIT License
- mypy - MIT License
- pre-commit - MIT License
- tox - MIT License
- setuptools - MIT License
- twine - MIT License
BioNeuralNet integrates with external tools such as WGCNA, SmCCNet, and Node2Vec to enhance functionality.
We appreciate the efforts of these communities and all contributors who make open-source development possible. Your dedication and hard work enable projects like BioNeuralNet to thrive and evolve.
- Local Testing:
  pytest --cov=bioneuralnet --cov-report=html
  open htmlcov/index.html
- Continuous Integration:
  - GitHub Actions run our test suite and code checks on each commit and PR.
- Fork the repository, create a new branch, and implement your changes.
- Add/Update tests, docstrings, and examples if appropriate.
- Open a pull request describing your modifications.
For more details, see our FAQ or open an issue.
- License: MIT License
- Questions or Feature Requests: Open an issue
- Email: [email protected]
BioNeuralNet aims to streamline multi-omics network analysis by providing graph clustering, GNN embedding, subject representation, and disease prediction tools. We hope it helps uncover new insights in multi-omics research.