Integrated Molecular Modeling and Machine Learning for Drug Design

In this tutorial, we demonstrate how to use the calculation and analysis tools described in the perspective paper Integrated Molecular Modeling and Machine Learning for Drug Design.

Clone this Repo

git clone --recurse-submodules -j8 https://github.com/SongXia-NYU/drug_discovery_perspective.git

A. Protein Pocket Identification and Analysis with AlphaSpace 2.0

AlphaSpace 2.0 Environment Setup; tutorials

conda create -n alphaspace2 python==3.8 -y
conda activate alphaspace2
git clone https://github.com/lenhsherr/AlphaSpace2.git
cd AlphaSpace2
pip install .

If you have trouble installing mdtraj, try installing it with conda:

conda install -c conda-forge mdtraj

Properly preparing receptors for the β-score calculation requires third-party tools: pdb2pqr30 and the prepare_receptor4.py script from MGLTools, both used in the preparation steps below.

To use this conda environment in Jupyter notebooks, keep the environment active, install ipykernel if it is not installed yet, and register the kernel:

pip install ipykernel
python -m ipykernel install --user --name alphaspace2 --display-name AS2

Now this environment will be available to run with jupyter notebooks.

Proper preparation of receptors to calculate the β-score of the 1M17 structure

As the β-score is analogous to the Vina score, we follow a protocol similar to Vina preparation, except that hydrogens are excluded.

First, remove the crystal waters and split the ligand and receptor into separate files. Model any missing residues as needed.
Use pdb2pqr30 to add missing heavy atoms to individual residues and protonate them. This generates a .pqr file for the receptor.
Then delete the last two columns of the .pqr file (the per-atom charge and radius) so that the prepare_receptor4.py tool can read it. Pass the altered .pqr file to the prepare_receptor4.py script from MGLTools to assign charges and write a .pdbqt file.
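The first step above (dropping crystal waters and separating ligand from receptor) can be sketched in a few lines of Python. This is a minimal illustration, not part of the repository: the function name split_complex is hypothetical, the ligand residue name AQ4 is assumed to be the erlotinib label in 1M17 (check your own file), and the example records are synthetic rather than real 1M17 coordinates. Modeling missing residues is not covered here.

```python
def split_complex(pdb_lines, ligand_resname, keep_waters=False):
    """Split PDB atom records into (receptor_lines, ligand_lines).

    Hypothetical helper for illustration only. Waters (HOH/WAT) are
    dropped unless keep_waters is True; non-atom records are skipped.
    """
    receptor, ligand = [], []
    for line in pdb_lines:
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue  # headers, TER, CONECT, END are ignored in this sketch
        resname = line[17:20].strip()  # PDB columns 18-20 hold the residue name
        if resname == ligand_resname:
            ligand.append(line)
        elif resname in ("HOH", "WAT"):
            if keep_waters:
                receptor.append(line)
        else:
            receptor.append(line)
    return receptor, ligand


# Synthetic example records (assumption: ligand residue name is AQ4).
example = [
    "ATOM      1  N   GLY A 672      22.631  31.508  10.000  1.00 30.00           N",
    "HETATM    2  C1  AQ4 A 999      20.000  30.000  11.000  1.00 25.00           C",
    "HETATM    3  O   HOH A1001      18.000  29.000  12.000  1.00 40.00           O",
]
receptor_lines, ligand_lines = split_complex(example, "AQ4")
print(len(receptor_lines), "receptor record(s),", len(ligand_lines), "ligand record(s)")
```

Passing keep_waters=True keeps the water records in the receptor, which is what a protocol that retains crystal waters would need.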

Calculation of AlphaSpace2.0 features

You can now run the corresponding Jupyter notebook, jupyter_notebooks/1M17_AS2.ipynb, to calculate the features of the erlotinib binding pockets of EGFR:

jupyter notebook jupyter_notebooks/1M17_AS2.ipynb

B. Binding affinity prediction of crystal EGFR-Erlotinib pose with $\Delta_\mathrm{{Lin F9}}\mathrm{XGB}$

Environment Setup

You can follow this link to install $\Delta_\mathrm{{Lin F9}}\mathrm{XGB}$. A Linux system is required to run Smina, the docking tool that incorporates the Lin_F9 scoring function.

As with AlphaSpace2, third-party packages must be installed to properly prepare the receptors.

Preparation of input files

Split the ligand and receptor into separate files. Retain the crystal waters in the receptor file. Model any missing residues as needed.
Use pdb2pqr30 to add missing heavy atoms to individual residues and protonate them. This generates a .pqr file for the receptor.
Then delete the last two columns of the .pqr file (the per-atom charge and radius) so that the prepare_receptor4.py tool can read it. Pass the altered .pqr file to the prepare_receptor4.py script from MGLTools to assign charges and write a .pdbqt file. Use the -U nphs flag, which merges the non-polar hydrogens so that only the polar hydrogens are retained.
Use the prepare_ligand4.py tool to assign charges to the ligand and write a .pdbqt file. Use the -U nphs flag here as well to merge the non-polar hydrogens.
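The column-deletion step above can be sketched as follows. This is an illustrative helper, not a script shipped with the repository: the name strip_charge_radius is hypothetical, and it assumes the charge and radius are the last two whitespace-separated fields of each ATOM/HETATM record, as in standard PQR output. The example record is synthetic.

```python
def strip_charge_radius(pqr_lines):
    """Remove the trailing charge and radius fields from PQR atom records.

    Hypothetical helper for illustration. The fixed-width prefix of each
    record is preserved; non-atom lines are passed through unchanged
    (apart from the trailing newline).
    """
    cleaned = []
    for line in pqr_lines:
        line = line.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")):
            # Split off the last two whitespace-separated fields only.
            parts = line.rsplit(None, 2)
            if len(parts) == 3:
                line = parts[0]
        cleaned.append(line)
    return cleaned


# Synthetic example record (not real 1M17 data).
pqr_example = [
    "ATOM      1  N   GLY A 672      22.631  31.508  10.000 -0.4157 1.8240\n",
]
for row in strip_charge_radius(pqr_example):
    print(row)
```

Splitting from the right, rather than re-joining all fields, leaves the column alignment of the remaining record untouched, which matters for tools that read PDB-style fixed-width columns.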

Calculation of $\Delta_\mathrm{{Lin F9}}\mathrm{XGB}$

Once these preparations are done, generate the features used by the XGBoost model with delta_LinF9_XGB/script/runFeatures.sh. Make sure to pass the altered .pqr file and the .pdbqt files to the runFeatures.sh script.

Then you can run the delta_LinF9_XGB/script/runXGB.py script. This script will run Smina with the Lin_F9 scoring function, calculate all of the appropriate features, and run the XGBoost model to predict the correction to the Lin_F9 scoring function.
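The "correction" idea described above can be written as a simple sum. The symbols below are notation introduced here for illustration and are not taken from the repository: $S_{\mathrm{LinF9}}$ stands for the Lin_F9 score of the pose and $\Delta_{\mathrm{XGB}}$ for the correction predicted by the XGBoost model, both assumed to be expressed in the same units.

$$
S_{\mathrm{final}} = S_{\mathrm{LinF9}} + \Delta_{\mathrm{XGB}}
$$

Under this reading, the XGBoost model only has to learn the residual error of Lin_F9 rather than the full binding affinity.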

C. Small Molecule Properties Prediction with sPhysNet-MT-ens5

Environment Setup

conda create -n sphysnet-mt python==3.8 -y
conda activate sphysnet-mt
bash sPhysNet-MT/bash_scripts/install_env_linux.bash

Prediction with sPhysNet-MT-ens5

The SMILES of erlotinib is COCCOc1cc2c(cc1OCCOC)ncnc2Nc3cccc(c3)C#C. To run the prediction:

bash run_sphysnet_mt.sh "COCCOc1cc2c(cc1OCCOC)ncnc2Nc3cccc(c3)C#C"

You can run multiple compounds at once. For example:

bash run_sphysnet_mt.sh "COCCOc1cc2c(cc1OCCOC)ncnc2Nc3cccc(c3)C#C C[C@H](c1c(ccc(c1Cl)F)Cl)Oc2cc(cnc2N)c3cnn(c3)C4CCNCC4 C[C@@H]1CCN(C[C@@H]1N(C)c2c3cc[nH]c3ncn2)C(=O)CC#N"
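When the compound list comes from a file or another program, the single space-separated argument shown above can be assembled programmatically. The sketch below is an assumption-laden convenience wrapper, not part of the repository: build_sphysnet_command is a hypothetical name, and it only relies on the calling convention shown above (one quoted argument containing space-separated SMILES).

```python
import shlex


def build_sphysnet_command(smiles_list, script="run_sphysnet_mt.sh"):
    """Build the argument list for a batch run of run_sphysnet_mt.sh.

    Hypothetical helper: joins the SMILES with spaces into one argument,
    mirroring the multi-compound example above.
    """
    for smiles in smiles_list:
        # A SMILES containing whitespace would be split into two compounds.
        if not smiles or any(ch.isspace() for ch in smiles):
            raise ValueError(f"invalid SMILES entry: {smiles!r}")
    return ["bash", script, " ".join(smiles_list)]


cmd = build_sphysnet_command([
    "COCCOc1cc2c(cc1OCCOC)ncnc2Nc3cccc(c3)C#C",  # erlotinib, from the text above
    "CCO",                                        # ethanol, added as a toy example
])
print(shlex.join(cmd))  # shell-quoted form, for inspection
# To actually run it from the repository root (requires the environment above):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list to subprocess avoids shell-quoting issues with characters such as parentheses and # that are common in SMILES.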
