Protein residue modification issues #55

Open
kanghw0325 opened this issue Jan 21, 2025 · 2 comments

@kanghw0325

Thank you for the great work. I am having an issue with protein residue modifications.

Here is my JSON file:

[
    {
        "sequences": [
            {
                "proteinChain": {
                    "sequence": "MFERFTDRARRVVVLAQEEARMLNHNYIGTEHILLGLIHEGEGVAAKSLESLGISLEGVRSQVEEIIGQGQQAPSGHIPFTPRAKKVLELSLREALQLGHNYIGTEHILLGLIREGEGVAAQVLVKLGAELTRVRQQVIQLLSGYKLAAALEHHHHHH",
                    "count": 1,
                    "modifications": []
                }
            },
            {
                "proteinChain": {
                    "sequence": "WLYALLK",
                    "count": 1,
                    "modifications": [
                        {
                            "ptmType": "CCD_F7P",
                            "ptmPosition": 1
                        },
                        {
                            "ptmType": "CCD_MLE",
                            "ptmPosition": 2
                        },
                        {
                            "ptmType": "CCD_NIY",
                            "ptmPosition": 3
                        },
                        {
                            "ptmType": "CCD_F7S",
                            "ptmPosition": 5
                        },
                        {
                            "ptmType": "CCD_F7V",
                            "ptmPosition": 7
                        }
                    ]
                }
            }
        ],
        "name": "6CN8"
    }
]

Then I got an AssertionError like this:

2025-01-21 15:10:57,493 [/workdir_efs/hwkang/.conda/envs/protenix0.3.9/lib/python3.12/site-packages/runner/inference.py:232] INFO runner.inference: :
Traceback (most recent call last):
  File "/workdir_efs/hwkang/.conda/envs/protenix0.3.9/lib/python3.12/site-packages/protenix/data/infer_data_pipeline.py", line 211, in __getitem__
    data, atom_array, _ = self.process_one(
                          ^^^^^^^^^^^^^^^^^
  File "/workdir_efs/hwkang/.conda/envs/protenix0.3.9/lib/python3.12/site-packages/protenix/data/infer_data_pipeline.py", line 114, in process_one
    InferenceMSAFeaturizer.make_msa_feature(
  File "/workdir_efs/hwkang/.conda/envs/protenix0.3.9/lib/python3.12/site-packages/protenix/data/msa_featurizer.py", line 1144, in make_msa_feature
    msa_feats = InferenceMSAFeaturizer.get_inference_prot_msa_features_for_assembly(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workdir_efs/hwkang/.conda/envs/protenix0.3.9/lib/python3.12/site-packages/protenix/data/msa_featurizer.py", line 1083, in get_inference_prot_msa_features_for_assembly
    sequence_feat = InferenceMSAFeaturizer.process_prot_single_sequence(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workdir_efs/hwkang/.conda/envs/protenix0.3.9/lib/python3.12/site-packages/protenix/data/msa_featurizer.py", line 969, in process_prot_single_sequence
    sequence_features = process_single_sequence(
                        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workdir_efs/hwkang/.conda/envs/protenix0.3.9/lib/python3.12/site-packages/protenix/data/msa_featurizer.py", line 797, in process_single_sequence
    msa_features = load_and_process_msa(
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/workdir_efs/hwkang/.conda/envs/protenix0.3.9/lib/python3.12/site-packages/protenix/data/msa_utils.py", line 558, in load_and_process_msa
    msa_data = parse_msa_data(
               ^^^^^^^^^^^^^^^
  File "/workdir_efs/hwkang/.conda/envs/protenix0.3.9/lib/python3.12/site-packages/protenix/data/msa_utils.py", line 450, in parse_msa_data
    return parse_prot_msa_data(raw_msa_paths, seq_limits)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workdir_efs/hwkang/.conda/envs/protenix0.3.9/lib/python3.12/site-packages/protenix/data/msa_utils.py", line 518, in parse_prot_msa_data
    assert all([len(seq) == len(aligned_sequences[0]) for seq in aligned_sequences])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
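
The assertion in the last frame fails when the rows of a parsed alignment do not all have the same length. A minimal sketch for checking that condition outside Protenix follows. It assumes the MSA is stored as A3M text, where lowercase letters mark insertions that do not occupy an alignment column; `aligned_lengths` is a hypothetical helper written for illustration, not part of Protenix's API, and its parsing may differ from what `parse_prot_msa_data` does.

```python
from collections import Counter


def aligned_lengths(a3m_text: str) -> Counter:
    """Count aligned row lengths in A3M text (hypothetical helper)."""
    lengths = Counter()
    row = []

    def flush():
        # Finish the record collected so far, if any.
        if row:
            seq = "".join(row)
            # Assumption: lowercase letters are insertions relative to the
            # query and are dropped before comparing lengths.
            aligned = "".join(c for c in seq if not c.islower())
            lengths[len(aligned)] += 1
            row.clear()

    for line in a3m_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        if line.startswith(">"):
            flush()  # a new header closes the previous record
        else:
            row.append(line)
    flush()
    return lengths


# Toy alignment: three rows of aligned length 7 and one truncated row.
sample = ">query\nWLYALLK\n>hit1\nWLY-LLK\n>hit2\nWLYaALLK\n>stale\nWLYALL\n"
counts = aligned_lengths(sample)
print(counts)
```

More than one distinct length in the result is the situation the assertion rejects.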

100%|██████████| 1/1 [00:06<00:00,  6.77s/it]

Is there anything wrong with my JSON file? Other JSON files I made ran fine and produced the right outputs.
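
For a quick sanity check of an input like the one above, the modification entries can be validated independently of Protenix. The sketch below assumes that `ptmPosition` is a 1-based index into `sequence`; `check_modifications` is a hypothetical helper, and it does not verify that a CCD code is chemically appropriate for the residue at that position.

```python
def check_modifications(jobs):
    """Return a list of human-readable problems found in `modifications`."""
    problems = []
    for job in jobs:
        name = job.get("name", "<unnamed>")
        for entity in job.get("sequences", []):
            chain = entity.get("proteinChain")
            if chain is None:
                continue  # not a protein chain; nothing to check here
            length = len(chain["sequence"])
            seen = set()
            for mod in chain.get("modifications", []):
                pos = mod["ptmPosition"]
                # Assumption: positions are 1-based and must fall in the chain.
                if not 1 <= pos <= length:
                    problems.append(
                        f"{name}: {mod['ptmType']} at {pos} is outside 1..{length}"
                    )
                elif pos in seen:
                    problems.append(
                        f"{name}: position {pos} is modified more than once"
                    )
                seen.add(pos)
    return problems


# Shortened version of the second chain from the JSON above.
jobs = [{
    "name": "6CN8",
    "sequences": [{
        "proteinChain": {
            "sequence": "WLYALLK",
            "count": 1,
            "modifications": [
                {"ptmType": "CCD_F7P", "ptmPosition": 1},
                {"ptmType": "CCD_F7V", "ptmPosition": 7},
            ],
        }
    }],
}]
problems = check_modifications(jobs)
print(problems)
```

An empty list means the positions are in range and unique; it says nothing about the MSA files that the traceback points to.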

@zhangyuxuann
Collaborator

Hi @kanghw0325, this was fixed after v0.3.6. Could you remove the output directory that contains the MSA, update Protenix to v0.4.1, and retry? Your JSON ran successfully for me.


@kanghw0325
Author

Thank you for the kind response!
