What is MHCSeqNet?

MHCSeqNet is an MHC ligand prediction Python package developed by the Computational Molecular Biology Group at Chulalongkorn University, Bangkok, Thailand. MHCSeqNet uses recurrent neural networks to process the amino acid sequences of both the input ligand and the MHC allele, and can therefore be extended to handle peptides of any length and any MHC allele with a known amino acid sequence.

The current release was trained using only data from MHC class I and supports peptides ranging from 8 to 15 amino acids in length, but the model can be re-trained to support more alleles and wider ranges of peptide length.

Please see our preprint on bioRxiv for more information.


MHCSeqNet offers two versions of prediction models:

  1. One-hot model: This model uses data from each MHC allele to train a separate predictor for that allele. The list of supported MHC alleles for the current release can be found here

  2. Sequence-based model: This model uses data from all MHC alleles to train a single predictor that can handle any MHC allele whose amino acid sequence is known. For more information on how our model learns MHC allele information in the form of amino acid sequences, please see our preprint on bioRxiv. The list of MHC alleles used to train this model can be found here

How to install?

MHCSeqNet requires Python 3 (>= 3.4) and the following Python packages:

numpy (>= 1.14.3)
Keras (>= 2.2.0)
tensorflow (>= 1.6.0)
scipy (>= 1.1.0)
scikit-learn (>= 0.19.1)

If your system has both Python 2 and Python 3, please make sure that Python 3 is being used when following these instructions. Note that we cannot guarantee that MHCSeqNet will work with older versions of these packages.
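The version requirements above can be checked before installing. The following is a minimal sketch, not part of MHCSeqNet: the helper names (`version_tuple`, `meets_minimum`, `check_requirements`) are hypothetical, and it assumes that scikit-learn is imported as `sklearn`, that Keras is imported as `keras`, and that each package exposes a `__version__` attribute.

```python
from importlib import import_module
from itertools import takewhile

# Minimum versions as listed above, keyed by importable module name.
# Assumption: scikit-learn imports as "sklearn" and Keras as "keras".
MINIMUM_VERSIONS = {
    "numpy": "1.14.3",
    "keras": "2.2.0",
    "tensorflow": "1.6.0",
    "scipy": "1.1.0",
    "sklearn": "0.19.1",
}


def version_tuple(version):
    """Turn '1.14.3' (or '2.0.0rc1') into a comparable tuple such as (1, 14, 3)."""
    parts = []
    for piece in version.split(".")[:3]:
        # Keep only the leading digits so suffixes like 'rc1' are ignored.
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    # Pad short versions ('1.6') so they compare correctly with '1.6.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return a dict mapping each module name to a short status string."""
    report = {}
    for module_name, minimum in minimums.items():
        try:
            module = import_module(module_name)
        except ImportError:
            report[module_name] = "not installed"
            continue
        installed = getattr(module, "__version__", None)
        if installed is None:
            report[module_name] = "version unknown"
        elif meets_minimum(installed, minimum):
            report[module_name] = "ok ({})".format(installed)
        else:
            report[module_name] = "too old ({} < {})".format(installed, minimum)
    return report
```

Calling `check_requirements()` returns one status per package, so a missing or outdated dependency can be spotted before running the installation steps.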

To install MHCSeqNet:

  1. Clone this repository
git clone

Or you may find other methods for cloning a GitHub repository here

  2. Install the latest version of 'pip' and 'setuptools' packages for Python 3 if your system does not already have them
python -m ensurepip --default-pip
pip install setuptools

If you have trouble with this step, more information can be found here

  3. Run the following inside the MHCSeqNet directory to install MHCSeqNet
cd MHCSeqNet
python install

How to use MHCSeqNet?

MHCSeqNet can be launched through the script or by editing the sample scripts explained below.

Instructions on how to use the script can be found by running:

python -h

usage: python [options] peptide_file allele_file output_file
         'peptide_file' and 'allele_file' should each contain only one column, without header row
    -p, --path                             REQUIRED: Specify the path to pre-trained model directory
                                           This should be either the 'one_hot_model' or the 'sequence_model'
                                            directory located in 'PATH/PretrainedModels/' where PATH is where
                                            MHCSeqNet was downloaded to
    -m, --model        [onehot sequence]   REQUIRED: Specify whether the one-hot model or sequence-based model will be used
    -i, --input-mode   [paired complete]   REQUIRED: Specify whether the prediction should be made for each pair of peptide
                                            and allele on the same row of each input file [paired] or for all
                                            combinations of peptides and alleles [complete]
    -h, --help                             Print this message

Sample peptide and MHC allele files can be found in the 'Sample' directory
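To make the input file layout and the two input modes concrete, here is a minimal sketch. It only illustrates what the help text describes (one column per file, no header row; 'paired' matches rows, 'complete' takes all combinations) and is not MHCSeqNet's own code. The function names and the ordering of combinations in 'complete' mode are assumptions.

```python
from itertools import product


def write_single_column(path, values):
    """Write one value per line, with no header row, as the script expects."""
    with open(path, "w") as handle:
        for value in values:
            handle.write(value + "\n")


def expand_inputs(peptides, alleles, input_mode):
    """Return the (peptide, allele) pairs implied by each input mode.

    'paired'   -> row i of the peptide file goes with row i of the allele file
    'complete' -> every peptide is combined with every allele
    """
    if input_mode == "paired":
        if len(peptides) != len(alleles):
            raise ValueError("paired mode needs the same number of rows in both files")
        return list(zip(peptides, alleles))
    if input_mode == "complete":
        # Assumption: the order of combinations here is illustrative only.
        return list(product(peptides, alleles))
    raise ValueError("input_mode must be 'paired' or 'complete'")
```

With two peptides and two alleles, 'paired' yields two predictions while 'complete' yields four, so 'paired' requires both files to have the same number of rows.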

Sample scripts

Sample scripts for running MHCSeqNet in either the 'one-hot' or the 'sequence-based' mode can be found in the 'Sample' directory. Continuing from the installation process, you may test the installation of MHCSeqNet with the following commands:

python Sample/
python Sample/

To run the sample scripts from a different location on your system, please edit the path to the pretrained model in the respective script.


To replace the sample peptides and MHC alleles with your own lists, please edit the 'sample_data' array accordingly.

sample_data = np.array([['TYIGSLPGK', 'HLA-B*58:01'],
                        ['TYIHALDNGLF', 'HLA-A*24:02'],
                        ['AAAWICGEF', 'HLA-B*15:01'],
                        ['TWLTYHGAI', 'HLA-A*30:02'],
                        ['TWLVNSAAHLF', 'HLA-A*24:02']])
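For longer lists, the rows of 'sample_data' could be read from a file instead of typed by hand. The sketch below is an assumption-laden example: the two-column, comma-separated layout and the `load_pairs` helper are hypothetical and not part of MHCSeqNet. It returns a list of [peptide, allele] rows with the same shape as the array above.

```python
import csv
import io


def load_pairs(handle, delimiter=","):
    """Read 'peptide,allele' rows and return a list of [peptide, allele] lists."""
    rows = []
    for record in csv.reader(handle, delimiter=delimiter):
        if not record:
            continue  # skip blank lines
        if len(record) != 2:
            raise ValueError("expected exactly two columns, got: {}".format(record))
        rows.append([record[0].strip(), record[1].strip()])
    return rows


# In-memory stand-in for a file, reusing two rows from the sample above.
example = io.StringIO("TYIGSLPGK,HLA-B*58:01\nTYIHALDNGLF,HLA-A*24:02\n")
pairs = load_pairs(example)
```

Passing the result to `np.array(pairs)` gives a two-column array laid out like 'sample_data'.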

To adjust the behavior of how prediction results are output (e.g. print results to file rather than on the screen), please edit the following line:


Input format

Peptide: The current release supports peptides of length 8 - 15 and does not accept ambiguous amino acids.

MHC allele: For alleles included in the training set (i.e. supported alleles listed in the models section), the model requires the 'HLA-A*XX:YY' format.
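The input rules above can be checked before running a prediction. This is a minimal sketch rather than MHCSeqNet's own validation: the function names are hypothetical, "ambiguous amino acids" is assumed to mean anything outside the 20 standard one-letter codes in upper case, and the allele pattern assumes a single-letter gene name with two or three digits in each field.

```python
import re

# The 20 standard amino acids; ambiguous codes such as X, B, Z are excluded.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# Assumed reading of the 'HLA-A*XX:YY' format: one gene letter, then two
# numeric fields of two or three digits each.
ALLELE_PATTERN = re.compile(r"^HLA-[A-Z]\*\d{2,3}:\d{2,3}$")


def is_valid_peptide(peptide, min_length=8, max_length=15):
    """Check the length range and that every residue is a standard amino acid."""
    if not min_length <= len(peptide) <= max_length:
        return False
    return all(residue in STANDARD_AMINO_ACIDS for residue in peptide)


def is_valid_allele(allele):
    """Check that the allele name looks like 'HLA-A*24:02'."""
    return ALLELE_PATTERN.match(allele) is not None
```

A name that passes `is_valid_allele` is only well-formed; whether the allele is actually supported still depends on the lists referenced in the models section.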

To add new MHC alleles to the sequence-based model, the names and amino acid sequences of the new alleles must first be added to AlleleInformation.txt and supported_alleles.txt in the sequence-based model's directory.


MHCSeqNet outputs a binding probability ranging from 0.0 to 1.0, where 0.0 indicates an unlikely ligand and 1.0 indicates a likely ligand.
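One way to work with these probabilities is to rank the predictions and keep those above a cutoff. The sketch below is illustrative only: `rank_ligands` is a hypothetical helper, and the default cutoff of 0.5 is an arbitrary assumption, not a threshold recommended by the authors.

```python
def rank_ligands(predictions, cutoff=0.5):
    """Keep predictions at or above the cutoff, most likely ligand first.

    predictions: iterable of (peptide, allele, probability) tuples, where
    probability is the binding probability between 0.0 and 1.0.
    """
    kept = []
    for peptide, allele, probability in predictions:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability out of range: {}".format(probability))
        if probability >= cutoff:
            kept.append((peptide, allele, probability))
    # Sort by probability, highest first.
    return sorted(kept, key=lambda row: row[2], reverse=True)
```

A suitable cutoff depends on the application, so it is worth choosing one against your own validation data rather than relying on the default used here.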

How to re-train MHCSeqNet?

This feature and its instructions will be added in the future.

