I am trying to run SplAdder on 83 samples, following the "Use on large cohorts" instructions (https://spladder.readthedocs.io/en/latest/spladder_cohort.html). In testing mode, I have several subgroups of samples to compare based on different conditions. Some of the tests failed with the following error message:
raise ValueError(self.msg.format('endog'))
ValueError NaN, inf or invalid value detected in endog, estimation infeasible.
These error messages were accompanied by these warnings:
/users/kstankie/anaconda3/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
/users/kstankie/anaconda3/lib/python3.9/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
Based on these warnings, I looked at the distribution of inserted events and found some samples that had very low or zero inserted events while all other samples in the condition group had hundreds or thousands. Example of two samples in the same condition group for a failed test:
I removed the sample with almost no inserted events and re-ran spladder test. This time there were no "Mean of empty slice" errors and I did receive output. However, for many of my tests I still receive the following warning (it was also present before I removed the offending samples that caused the previous error):
users/kstankie/anaconda3/lib/python3.9/site-packages/spladder/spladder_test.py:742: RuntimeWarning: invalid value encountered in subtract
The run finishes and produces output that looks similar to the output of tests that do not show this RuntimeWarning, so I am not sure what causes it or whether it should raise alarm bells. As mentioned above, for this one dataset I run several tests with different groups of samples for different conditions, and only some of the tests produce this warning. I can't figure out why some tests produce it and others do not (it is not limited to the tests where I had to remove samples because of the previous "Mean of empty slice" issue). In each of my tests, each condition has 5-7 samples. I do not receive any warnings or errors in the previous steps (spladder build); these RuntimeWarnings occur only at spladder test.
Are these warnings a concern, or can they be ignored as long as the run finishes and produces output?
Thanks in advance for the help!
What I Did
# Here I use a job array to run each of the 83 samples in parallel
spladder build -o ${workdir}/array_spladder_out -a ${workdir}/annotation/genome.annotation.gff -b ${1} --merge-strat single --no-extract-ase --parallel 2 -v
# next merge the splice graphs
spladder build -o ${workdir}/array_spladder_out -a ${workdir}/annotation/genome.annotation.gff -b ${workdir}/bams.txt --merge-strat merge_graphs --no-extract-ase --parallel 40 -v
# next run quantification for each sample separately using a job array again
spladder build -o ${workdir}/array_spladder_out -a ${workdir}/annotation/genome.annotation.gff -b ${1} --merge-strat merge_graphs --no-extract-ase --quantify-graph --qmode single --parallel 2 -v
# aggregate them into a joint database
spladder build -o ${workdir}/array_spladder_out -a ${workdir}/annotation/genome.annotation.gff -b ${workdir}/bams.txt --merge-strat merge_graphs --no-extract-ase --quantify-graph --qmode collect --parallel 40 -v
# call events
spladder build -o ${workdir}/array_spladder_out -a ${workdir}/annotation/genome.annotation.gff -b ${workdir}/bams.txt --parallel 40 -v
# run 30 different tests comparing different groups of samples (again using a job array to automate running each comparison)
spladder test -o ${workdir}/array_spladder_out --out-tag Rem_Prob --conditionA ${workdir}/testing_contrasts/contrast_files/sym_${1}_${3}.txt --conditionB ${workdir}/testing_contrasts/contrast_files/sym_${2}_${3}.txt --labelA ${1}${3} --labelB ${2}${3} --diagnose-plots -v --parallel 5
Thanks for reporting this. It has been on my TODO list for a while to catch these warnings early and give more informative feedback to the user. Depending on where the warnings occur, they can indicate different things: for instance, that an event does not have a sufficient number of quantified events in a group, or that the gene expression or event fold-change contains NaNs. You can ignore these for now, but I will leave the ticket open as a reference (and reminder) for me to improve this.
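To illustrate the kinds of conditions described above, here is a minimal NumPy sketch that reproduces both warning messages from the report. It is not SplAdder's code; the arrays and the helper `collect_warnings` are hypothetical and only show how an empty group and a zero-valued mean can lead to nan values.

```python
import warnings
import numpy as np

def collect_warnings(func):
    """Run func and return (result, list of warning messages). Hypothetical helper."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func()
    return result, [str(w.message) for w in caught]

# Mean over a group with no usable values: NumPy warns and returns nan.
empty_mean, empty_msgs = collect_warnings(lambda: np.mean(np.array([])))

def log2_fold_change():
    # log2 of a zero mean is -inf (divide-by-zero warning silenced here).
    with np.errstate(divide="ignore"):
        a = np.log2(np.array([0.0, 8.0]))
        b = np.log2(np.array([0.0, 2.0]))
    # (-inf) - (-inf) is undefined, so the first entry becomes nan.
    return a - b

fc, fc_msgs = collect_warnings(log2_fold_change)

print(empty_mean, empty_msgs)
print(fc, fc_msgs)
```

In this sketch the second fold-change entry is still a valid number, which is consistent with a run that finishes and produces mostly usable output while emitting the warning.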
Thanks so much for the reply and explanation! I noticed that in the test_results_C3gene_unique.tsv files, the column 'log2FC_event_count' contains both 'nan' and 'inf' values for some events. Also, in the mere_graphsC3.confirmed.txt files, some samples show 'nan' for psi for some events.
In this case, can I still ignore these warnings for now, and perhaps simply exclude events that have NaN values from further analysis? (I was looking at #124.)
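One way to exclude such events downstream is sketched below, using only the Python standard library. It assumes the results file is tab-separated and that 'log2FC_event_count' is the column of interest. The `event_id` column and the three-row table are hypothetical stand-ins for a real results file, and whether dropping these events is appropriate depends on the analysis.

```python
import csv
import io
import math

def finite_rows(handle, column="log2FC_event_count"):
    """Yield rows of a tab-separated table whose `column` parses to a finite float."""
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            value = float(row[column])  # 'nan', 'inf' and '-inf' parse as floats
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric entry: treat as unusable
        if math.isfinite(value):
            yield row

# Hypothetical three-row table; real result files have many more columns.
example = "event_id\tlog2FC_event_count\nev1\t1.5\nev2\tnan\nev3\tinf\n"
kept = list(finite_rows(io.StringIO(example)))
print([row["event_id"] for row in kept])
```

For a real file, the same function could be given an open file handle instead of the `io.StringIO` object.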
Example of two samples in the same condition group for a failed test:

Sample with almost no inserted events:
Inserted:
cassette_exon: 0
intron_retention: 0
intron_in_exon: 0
alt_53_prime: 1
exon_skip: 0
gene_merge: 0
new_terminal_exon: 0

Another sample from the same group:
Inserted:
cassette_exon: 1357
intron_retention: 706
intron_in_exon: 3552
alt_53_prime: 12145
exon_skip: 17159
gene_merge: 0
new_terminal_exon: 41576
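
The manual check described in the post (comparing inserted event counts across samples) could be automated roughly as follows. This is a sketch under assumptions: it expects the "Inserted:" blocks in the plain-text form shown above, and the cutoff of 1% of the median total is an arbitrary, hypothetical threshold, not a SplAdder recommendation.

```python
import statistics

def parse_inserted(text):
    """Parse 'Inserted:' blocks into a list of {event_type: count} dicts."""
    blocks = []
    for line in text.splitlines():
        line = line.strip()
        if line == "Inserted:":
            blocks.append({})
        elif blocks and ":" in line:
            name, _, count = line.partition(":")
            blocks[-1][name.strip()] = int(count)
    return blocks

def low_count_samples(blocks, fraction=0.01):
    """Return indices of blocks whose total is below `fraction` of the median total."""
    totals = [sum(block.values()) for block in blocks]
    median = statistics.median(totals)
    return [i for i, total in enumerate(totals) if total < fraction * median]

# The two samples quoted in the post.
report = """Inserted:
cassette_exon: 0
intron_retention: 0
intron_in_exon: 0
alt_53_prime: 1
exon_skip: 0
gene_merge: 0
new_terminal_exon: 0
Inserted:
cassette_exon: 1357
intron_retention: 706
intron_in_exon: 3552
alt_53_prime: 12145
exon_skip: 17159
gene_merge: 0
new_terminal_exon: 41576
"""
blocks = parse_inserted(report)
flagged = low_count_samples(blocks)
print(flagged)
```

With a full condition group of 5-7 samples, the median is a more stable reference than it is for the two samples used here.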