A trainable PyTorch reproduction of AlphaFold 3.
For more information on the model's performance and capabilities, see our technical report.
You can follow us on Twitter or join the conversation on our Discord server.
pip3 install protenix
If you're interested in model training, we recommend running with Docker.
If you set up Protenix by pip, you can run the following commands to do model inference:
# run with example.json, which contains precomputed msa dir.
protenix predict --input examples/example.json --out_dir ./output --seeds 101
# run with multiple json files, the default seed is 101.
protenix predict --input ./jsons_dir/ --out_dir ./output
# if the json does not contain a precomputed msa dir,
# add --use_msa_server to search msa and then predict.
# if multiple seeds are provided, separate them with commas.
protenix predict --input examples/example_without_msa.json --out_dir ./output --seeds 101,102 --use_msa_server
Detailed information on the format of the input JSON file and the output files can be found in the input and output documentation.
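When predictions are launched from a Python script rather than typed by hand, the command line above can be assembled programmatically. The following is a minimal sketch, not part of Protenix: `build_predict_command` is a hypothetical helper, and it uses only the flags shown in the commands above (`--input`, `--out_dir`, `--seeds`, `--use_msa_server`).

```python
import shlex
from typing import List, Optional, Sequence


def build_predict_command(
    input_path: str,
    out_dir: str,
    seeds: Optional[Sequence[int]] = None,
    use_msa_server: bool = False,
) -> List[str]:
    """Assemble the argv list for `protenix predict` (hypothetical helper)."""
    cmd = ["protenix", "predict", "--input", input_path, "--out_dir", out_dir]
    if seeds:
        # Multiple seeds are passed as a single comma-separated value.
        cmd += ["--seeds", ",".join(str(seed) for seed in seeds)]
    if use_msa_server:
        # Only needed when the JSON has no precomputed MSA directory.
        cmd.append("--use_msa_server")
    return cmd


if __name__ == "__main__":
    cmd = build_predict_command(
        "examples/example_without_msa.json",
        "./output",
        seeds=[101, 102],
        use_msa_server=True,
    )
    # Print the command for inspection; pass `cmd` to subprocess.run(cmd, check=True)
    # to execute it on a machine where protenix is installed.
    print(shlex.join(cmd))
```

Building the command as a list instead of a single string avoids shell quoting problems when paths contain spaces.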
Alternatively, you can run inference by:
Note: to keep the configuration simple, the layernorm and EvoformerAttention kernels are not used by default. If you want to speed up inference, see the setting up kernels documentation.
bash inference_demo.sh
The arguments in this script are explained as follows:
input_json_path: path to a JSON file that fully describes the input.
dump_dir: path to a directory where the results of the inference will be saved.
dtype: data type used in inference. Valid options include "bf16" and "fp32".
use_msa: whether to use the MSA feature, the default is true.
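The constraints on these arguments can be checked before a long inference job starts. The sketch below is illustrative only: `InferenceArgs` is a hypothetical container, not a Protenix class, and it encodes just what is stated above (the two listed `dtype` values and `use_msa` defaulting to true). The text says the valid options "include" these two values, so treating them as the full set is an assumption.

```python
from dataclasses import dataclass

# Assumption: only the two dtype values listed above are accepted.
VALID_DTYPES = ("bf16", "fp32")


@dataclass
class InferenceArgs:
    """Hypothetical container mirroring the arguments described above."""

    input_json_path: str
    dump_dir: str
    dtype: str
    use_msa: bool = True  # the documented default

    def __post_init__(self) -> None:
        # Fail early on an unsupported dtype instead of midway through a run.
        if self.dtype not in VALID_DTYPES:
            raise ValueError(
                f"dtype must be one of {VALID_DTYPES}, got {self.dtype!r}"
            )


if __name__ == "__main__":
    args = InferenceArgs("examples/example.json", "./output", dtype="bf16")
    print(args)
```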
If your input is a PDB or CIF file, you can convert it to a JSON file for inference.
# run with pdb/cif file, and convert it to json file for inference.
protenix tojson --input examples/7pzb.pdb --out_dir ./output
We also provide an independent MSA search function. You can run an MSA search from a JSON file or a FASTA file.
# run msa search with json file, it will write precomputed msa dir info to a new json file.
protenix msa --input examples/example_without_msa.json --out_dir ./output
# run msa search with fasta file which only contains protein.
protenix msa --input examples/prot.fasta --out_dir ./output
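The preprocessing commands above differ only in the subcommand, which follows from the input file type. As a rough sketch, a script handling mixed inputs could pick the subcommand from the file extension. The helper `build_preprocess_command` and the suffix table are hypothetical; the table simply restates the examples above (`tojson` for pdb/cif, `msa` for json/fasta).

```python
from pathlib import Path
from typing import List

# Mapping derived from the example commands above; not an official Protenix table.
SUBCOMMAND_BY_SUFFIX = {
    ".pdb": "tojson",
    ".cif": "tojson",
    ".json": "msa",
    ".fasta": "msa",
}


def build_preprocess_command(input_path: str, out_dir: str) -> List[str]:
    """Return the argv list for the preprocessing step matching the file type."""
    suffix = Path(input_path).suffix.lower()
    if suffix not in SUBCOMMAND_BY_SUFFIX:
        raise ValueError(f"unsupported input type: {suffix!r}")
    subcommand = SUBCOMMAND_BY_SUFFIX[suffix]
    return ["protenix", subcommand, "--input", input_path, "--out_dir", out_dir]


if __name__ == "__main__":
    for path in ("examples/7pzb.pdb", "examples/prot.fasta"):
        print(" ".join(build_preprocess_command(path, "./output")))
```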
If you're interested in model training, see the training documentation.
See the performance documentation for memory and time consumption in training and inference.
The implementation of the layernorm operators draws on OneFlow and FastFold. We used OpenFold for some module implementations, except for LayerNorm.
Please check Contributing for more details. If you encounter problems using Protenix, feel free to create an issue! We also welcome pull requests from the community.
Please check Code of Conduct for more details.
If you discover a potential security issue in this project, or think you may have discovered a security issue, we ask that you notify Bytedance Security via our security center or vulnerability reporting email.
Please do not create a public GitHub issue.
The Protenix project, including code and model parameters, is made available under the Apache 2.0 License and is free for both academic research and commercial use.
We welcome inquiries and collaboration opportunities for advanced applications of our model, such as developing new features, fine-tuning for specific use cases, and more. Please feel free to contact us at [email protected].