A5_A3 get_peptide_sequence.py associates same TPMs to different samples #5

FraPria · 2022-05-17T15:08:49Z

Hello, thank you for developing this useful pipeline!
I have a technical question for you.

I noticed in the file A5_A3_NetMHC-4.0_junctions_ORF_neoantigens.tab that samples sharing the same event also share the same Transcript_TPM.
You can see it in the first rows of the file (showing only the columns of interest):

Sample_id       Alt_Junction_id Transcript_id   Transcript_TPM
pat1   chr6;41090308;41091546;+        ENST00000353205.5       3.09240654483773
pat2   chr6;41090308;41091546;+        ENST00000353205.5       3.09240654483773
pat3   chr6;41090308;41091546;+        ENST00000353205.5       3.09240654483773
pat4   chr6;41090308;41091546;+        ENST00000353205.5       3.09240654483773
pat5   chr6;41090308;41091546;+        ENST00000353205.5       3.09240654483773

However, if you select the same transcript from the iso_tpms.txt matrix, the values differ between samples:

pat1	pat2	pat3	pat4	pat5
3.092407	3.750489	7.15175	13.89057	4.364625
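A quick way to spot this symptom in the output is to group rows by junction and transcript and flag the groups where every sample reports the same TPM. The following is only a minimal sketch: `find_shared_tpms` is a hypothetical helper, not part of the pipeline, and it assumes the output is tab-separated with the four column names shown above. Identical values can occasionally be legitimate, so a hit is a hint rather than proof.

```python
import csv
import io
from collections import defaultdict


def find_shared_tpms(handle):
    """Return (junction, transcript) keys whose TPM is identical in all samples.

    Hypothetical helper: assumes a tab-separated file with the columns
    Sample_id, Alt_Junction_id, Transcript_id and Transcript_TPM.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    tpms = defaultdict(dict)  # (junction, transcript) -> {sample: tpm string}
    for row in reader:
        key = (row["Alt_Junction_id"], row["Transcript_id"])
        tpms[key][row["Sample_id"]] = row["Transcript_TPM"]
    # Only groups seen in more than one sample can be suspicious.
    return [
        key
        for key, by_sample in tpms.items()
        if len(by_sample) > 1 and len(set(by_sample.values())) == 1
    ]


# Small in-memory demo built from the rows quoted above.
demo = io.StringIO(
    "Sample_id\tAlt_Junction_id\tTranscript_id\tTranscript_TPM\n"
    "pat1\tchr6;41090308;41091546;+\tENST00000353205.5\t3.09240654483773\n"
    "pat2\tchr6;41090308;41091546;+\tENST00000353205.5\t3.09240654483773\n"
    "pat3\tchr6;41090308;41091546;+\tENST00000353205.5\t3.09240654483773\n"
)
suspicious = find_shared_tpms(demo)
print(suspicious)
```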

This seems to arise from line 136 of lib/A5_A3/get_peptide_sequence.py, which reads only the first sample column of the iso_tpms.txt matrix:

tokens = line.rstrip().split("\t")
transcript = tokens[0]
tpm = tokens[1]
if (transcript not in transcript_expression):
    transcript_expression[transcript] = tpm

So I tested whether swapping the columns of iso_tpms.txt would change the results, and it did.
This does not happen for the other events, where the code is slightly different. For example, for the Exonizations it reads all the iso_tpms.txt columns:

tokens = line.rstrip().split("\t")
transcript = tokens[0]
tpm = tokens[1:]
for i in range(0,len(tpm)):
    if (transcript not in transcript_expression[header[i]]):
        transcript_expression[header[i]][transcript] = float(tpm[i])

Should I use this piece of code for A5_A3 as well?
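To make the difference concrete, here is a minimal self-contained sketch of the per-sample lookup that the Exonizations snippet builds. It is not the pipeline's actual code: `load_transcript_expression` is a hypothetical name, and the sketch assumes the header row of iso_tpms.txt lists only the sample names (one field fewer than the data rows), as in the excerpt above.

```python
import io


def load_transcript_expression(handle):
    """Return {sample: {transcript: tpm}} from a tab-separated TPM matrix.

    Assumption: the header row holds only sample names, so header[i]
    corresponds to tokens[1:][i] on every data row.
    """
    header = handle.readline().rstrip("\n").split("\t")
    transcript_expression = {sample: {} for sample in header}
    for line in handle:
        tokens = line.rstrip("\n").split("\t")
        if len(tokens) < 2:
            continue  # skip blank or malformed lines
        transcript, tpms = tokens[0], tokens[1:]
        for sample, tpm in zip(header, tpms):
            # Keep the first value seen for a transcript, as the original does.
            transcript_expression[sample].setdefault(transcript, float(tpm))
    return transcript_expression


# In-memory demo using the values quoted from iso_tpms.txt.
demo = io.StringIO(
    "pat1\tpat2\tpat3\tpat4\tpat5\n"
    "ENST00000353205.5\t3.092407\t3.750489\t7.15175\t13.89057\t4.364625\n"
)
expr = load_transcript_expression(demo)
print(expr["pat2"]["ENST00000353205.5"])
```

Keying the outer dictionary by sample means a lookup has to name the sample explicitly, so one sample's TPM cannot silently be reused for another.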
Thank you in advance


JLTrincado · 2022-05-20T10:07:11Z

Hi,

Yes, this does seem to be a bug. I have changed the code accordingly and quickly tested it, and it seems to run smoothly. Could you test it as well? I created a new branch for this.

Thanks for your help.

Best regards,

Juanlu.

FraPria · 2022-05-20T12:04:36Z

Hi, thanks for your feedback!

I just tested it, but it raises the error:
2022-05-20 13:54:04,566 - lib.A5_A3.get_peptide_sequence - ERROR - ERROR: NameError("name 'sample_id' is not defined")

I added
sample_id = tokens[0].replace(" ","")
at lines 253 and 1005, and it worked.

Thank you,
have a nice day!


EduEyras · 2022-05-20T12:57:08Z

Thanks, I've added those lines to the code in master. I've also merged the other fixes. I hope it is fine now.

Thanks,
E.



JLTrincado added a commit that referenced this issue May 20, 2022

Bug in lib/A5_A3/get_peptide_sequence.py

8045eed

Pointed by #5

JLTrincado mentioned this issue May 20, 2022

Bug in lib/A5_A3/get_peptide_sequence.py #6

Merged

EduEyras added a commit that referenced this issue May 20, 2022

Update get_peptide_sequence.py

7478b31

Related issue #5 - Thanks FraPria
