In this tutorial, we demonstrate how to use the calculation and analysis tools mentioned in the perspective paper "Integrated Molecular Modeling and Machine Learning for Drug Design".
git clone --recurse-submodules -j8 https://github.com/SongXia-NYU/drug_discovery_perspective.git
AlphaSpace 2.0 Environment Setup; tutorials
conda create -n alphaspace2 python==3.8 -y
conda activate alphaspace2
git clone https://github.com/lenhsherr/AlphaSpace2.git
cd AlphaSpace2
pip install .
If you have trouble installing mdtraj, try installing it with conda:
conda install -c conda-forge mdtraj
Proper preparation of the receptors for the calculation of the β-score requires the installation of third-party tools.
To use this conda environment in Jupyter notebooks (if the kernel is not installed yet), keep the conda environment active and run:
pip install ipykernel
python -m ipykernel install --user --name alphaspace2 --display-name AS2
The environment is now available in Jupyter notebooks as a kernel named AS2.
As the β-score is analogous to the Vina score, we follow a preparation protocol similar to Vina's, except that we exclude hydrogens.
- First remove the crystal waters and split the ligand and receptor into separate files. Model the missing residues as needed.
- Use pdb2pqr30 to add missing heavy atoms to individual residues and protonate them. This generates a .pqr file for the receptor.
- Then delete the last two columns of the .pqr file so that it can be input into the prepare_receptor4.py tool. Pass the altered .pqr file to the prepare_receptor4.py script from MGLTools to generate the charges and output a .pdbqt file.
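The column-deletion step above can be sketched as follows. This is a minimal illustration, assuming that the last two whitespace-separated fields of each ATOM/HETATM record are the charge and radius written by pdb2pqr30; the function name `strip_pqr_columns` and the sample record are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch: drop the last two whitespace-separated fields (assumed to be
# charge and radius) from ATOM/HETATM records of a .pqr file.
# All other lines, and the leading column alignment, are left untouched.
strip_pqr_columns() {
    sed -E '/^(ATOM|HETATM)/ s/[[:space:]]+[^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]*$//' "$@"
}

# Illustrative record with made-up values (not taken from 1M17).
printf '%s\n' 'ATOM      1  N   MET A   1      27.340  24.430   2.614 -0.3200 1.8240' \
    | strip_pqr_columns
```

On a real file, the function could be used as `strip_pqr_columns receptor.pqr > receptor_altered.pqr`, where both file names are placeholders. Check the result by eye before passing it to prepare_receptor4.py, since the exact .pqr layout can vary with the pdb2pqr30 options used.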
You can now run the corresponding Jupyter notebook in script/1M17_AS2.ipynb. There you can compute and inspect the features of the Erlotinib binding pockets of EGFR.
jupyter notebook jupyter_notebooks/1M17_AS2.ipynb
B. Binding affinity prediction of crystal EGFR-Erlotinib pose with $\Delta_\mathrm{{Lin F9}}\mathrm{XGB}$
You can follow this link to install $\Delta_\mathrm{{Lin F9}}\mathrm{XGB}$. A Linux system is required to run Smina, the docking tool that is used to incorporate the Lin_F9 scoring function.
As with AlphaSpace2, third-party packages must be installed to prepare the receptors properly:
- Split the ligand and receptor into separate files. Retain the crystal waters in the receptor file. Model the missing residues as needed.
- Use pdb2pqr30 to add missing heavy atoms to individual residues and protonate them. This generates a .pqr file for the receptor.
- Then delete the last two columns of the .pqr file so that it can be input into the prepare_receptor4.py tool. Pass the altered .pqr file to the prepare_receptor4.py script from MGLTools to generate the charges and output a .pdbqt file. Use the -U nphs flag, which merges the non-polar hydrogens into their heavy atoms so that only the polar hydrogens are retained.
- Use the prepare_ligand4.py tool to generate charges for the ligand and output a .pdbqt file. Use the -U nphs flag here as well, so that only the polar hydrogens are retained.
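The two MGLTools calls in the steps above can be sketched as a small dry-run wrapper. This is only an illustration under stated assumptions: the file names are placeholders, the helper names `run` and `prepare_inputs` are hypothetical, and the scripts are assumed to be on your PATH (depending on your MGLTools installation they may need to be launched through its bundled Python interpreter instead).

```shell
#!/usr/bin/env bash
# Sketch of the receptor and ligand preparation commands.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

prepare_inputs() {
    local receptor_pqr=$1 ligand_file=$2
    # Receptor: altered .pqr in, .pdbqt out, with the -U nphs cleanup flag.
    run prepare_receptor4.py -r "$receptor_pqr" -o "${receptor_pqr%.pqr}.pdbqt" -U nphs
    # Ligand: same cleanup flag, via prepare_ligand4.py.
    run prepare_ligand4.py -l "$ligand_file" -o "${ligand_file%.*}.pdbqt" -U nphs
}

# Hypothetical file names; replace them with your own.
prepare_inputs receptor_altered.pqr ligand.mol2
```

Printing the commands first makes it easy to confirm the input and output names before running the tools with `DRY_RUN=0`.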
Once these preparation steps are done, generate the features used by the XGBoost model with delta_LinF9_XGB/script/runFeatures.sh. It is important to input the altered .pqr file and the .pdbqt files into the runFeatures.sh script.
Then you can run the delta_LinF9_XGB/script/runXGB.py script. This script will run Smina with the Lin_F9 scoring function, calculate all of the appropriate features, and run the XGBoost model to predict the correction to the Lin_F9 scoring function.
conda create -n sphysnet-mt python==3.8 -y
conda activate sphysnet-mt
bash sPhysNet-MT/bash_scripts/install_env_linux.bash
The SMILES of Erlotinib is COCCOc1cc2c(cc1OCCOC)ncnc2Nc3cccc(c3)C#C. To run the prediction:
bash run_sphysnet_mt.sh "COCCOc1cc2c(cc1OCCOC)ncnc2Nc3cccc(c3)C#C"
You can run multiple compounds at once. For example:
bash run_sphysnet_mt.sh "COCCOc1cc2c(cc1OCCOC)ncnc2Nc3cccc(c3)C#C C[C@H](c1c(ccc(c1Cl)F)Cl)Oc2cc(cnc2N)c3cnn(c3)C4CCNCC4 C[C@@H]1CCN(C[C@@H]1N(C)c2c3cc[nH]c3ncn2)C(=O)CC#N"
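For longer compound lists, the space-separated argument shown above can be built from a file instead of typed by hand. The sketch below assumes a plain text file that contains only one SMILES string per line (no name column); the helper name `join_smiles` and the short example SMILES are illustrative and not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch: join one-SMILES-per-line input into a single space-separated
# string, skipping blank lines. Reads the given files, or stdin if none.
join_smiles() {
    grep -v '^[[:space:]]*$' "$@" | paste -sd ' ' -
}

# Illustrative input: ethanol and benzene, with a blank line in between.
smiles_arg=$(printf '%s\n' 'CCO' '' 'c1ccccc1' | join_smiles)
printf '%s\n' "$smiles_arg"
```

With a hypothetical file compounds.txt, the call would then read `bash run_sphysnet_mt.sh "$(join_smiles compounds.txt)"`. The double quotes matter: they keep all SMILES in one argument, as in the example above.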