Segmentation fault when running convertalis with -qaln -- alignment is walking off the end #863

Open
ifiddes opened this issue Jul 18, 2024 · 4 comments


ifiddes commented Jul 18, 2024

GDB shows that the segmentation fault occurs here:

    seq=0x7ffff789709c "TATTTTATTTTGTGTAGAGATGGGGTCTCACTAGGTTGCC\n",
    offset=39, bt=..., reverse=false, isReverseStrand=true,
    translateSequence=<optimized out>, translateNucl=...)

With offset = 39, seqPos = 40, and isReverseStrand = true, this line of code walks off the start of the 40 bp sequence.

This seems to be because the backtrace has a length of 41:

(gdb) print bt
$6 = (const std::__1::string &) @0x7fffffff2c70: {
  static __endian_factor = 2,
  __r_ = {<std::__1::__compressed_pair_elem<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__rep, 0, false>> = {__value_ = {{__s = {{__is_long_ = 1 '\001',
              __size_ = 24 '\030'}, __padding_ = 0x7fffffff2c71 "",
            __data_ = "\000\000\000\000\000\000\000)\000\000\000\000\000\000\000\340E\350VUU\000"}, __l = {{__is_long_ = 1, __cap_ = 24},
            __size_ = 41,
            __data_ = 0x555556e845e0 'M' <repeats 27 times>, "I", 'M' <repeats 12 times>, "D"}, __r = {__words = {49, 41,
              93825018643936}}}}}, <std::__1::__compressed_pair_elem<std::__1::allocator<char>, 1, true>> = {<std::__1::allocator<char>> = {<std::__1::__non_trivial_if<true, std::__1::allocator<char> >> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>},
  static npos = 18446744073709551615}
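
The arithmetic behind this can be sketched in a few lines of Python. This is a simplified model, not the MMseqs2 source: it assumes the query index is computed once per backtrace column as `offset - seqPos` on the reverse strand, and that `seqPos` advances on `M` and `I` columns only. The function name and the advancing rule are assumptions made for illustration.

```python
def query_indices(bt, offset, is_reverse_strand):
    """Return the query index computed at each backtrace column.

    Assumption (not taken from the MMseqs2 source): the index is
    computed for every column, and the cursor advances on 'M' and 'I'.
    """
    indices = []
    seq_pos = 0
    for op in bt:
        idx = offset - seq_pos if is_reverse_strand else offset + seq_pos
        indices.append(idx)
        if op in ("M", "I"):
            seq_pos += 1
    return indices


# Backtrace from the GDB session: 27 M, 1 I, 12 M, 1 D (41 columns).
bt = "M" * 27 + "I" + "M" * 12 + "D"
indices = query_indices(bt, offset=39, is_reverse_strand=True)

# Collect every column whose index falls outside the 40-base query.
out_of_range = [(col, idx) for col, idx in enumerate(indices) if not 0 <= idx < 40]
print(len(bt), out_of_range)  # prints: 41 [(40, -1)]
```

Under these assumptions only the final `D` column produces an invalid index (-1), which matches the reported state of offset = 39 and seqPos = 40.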

I have not yet been able to identify the target sequence needed for a minimal reproducible example, but I wanted to ask whether you have any ideas about what could cause this walking-off-the-edge behavior.

ifiddes (Author) commented Jul 18, 2024

The offending alignment is this:

113676 45 0.829 6.410E-05 39 0 40 527 566 585 39 0 0 584 27M1I12M1D

To me, this does appear to walk off the end.
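
One way to check the record is to tally the CIGAR operations against the coordinates. The sketch below assumes that fields 5 to 7 of the record are query start, query end, and query length, and that fields 8 and 9 are target start and end. It also assumes that `M` and `I` consume query positions while `M` and `D` consume target positions (the SAM convention; MMseqs2 may label `I` and `D` the other way round). The helper name is hypothetical.

```python
import re


def cigar_counts(cigar):
    """Sum run lengths per operation, e.g. '2M1I' -> {'M': 2, 'I': 1}."""
    counts = {}
    for length, op in re.findall(r"(\d+)([MID])", cigar):
        counts[op] = counts.get(op, 0) + int(length)
    return counts


record = "113676 45 0.829 6.410E-05 39 0 40 527 566 585 39 0 0 584 27M1I12M1D"
fields = record.split()

# Assumed column layout: qstart, qend, qlen, tstart, tend (0-based fields 4..8).
q_start, q_end, q_len = (int(x) for x in fields[4:7])
t_start, t_end = int(fields[7]), int(fields[8])
counts = cigar_counts(fields[-1])

columns = sum(counts.values())                       # alignment columns
query_span = abs(q_start - q_end) + 1                # inclusive query interval
target_span = abs(t_end - t_start) + 1               # inclusive target interval
query_consumed = counts["M"] + counts.get("I", 0)    # SAM convention (assumed)
target_consumed = counts["M"] + counts.get("D", 0)   # SAM convention (assumed)

print(columns, q_len, query_span, query_consumed, target_span, target_consumed)
# prints: 41 40 40 40 40 40
```

Under these assumptions the spans and consumed lengths all agree at 40, but the alignment has 41 columns, one more than the 40-base query. Any loop that indexes the query once per column would therefore step past it. Because the record has exactly one `I` and one `D`, swapping their meaning gives the same totals.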

ifiddes changed the title from "Segmentation fault when running convertalis with -taln -- alignment is walking off the end" to "Segmentation fault when running convertalis with -qaln -- alignment is walking off the end" on Jul 18, 2024

milot-mirdita (Member) commented

Could you please post the mmseqs command line and terminal output too? Ideally, also include the sequences needed to reproduce the crash.

ifiddes (Author) commented Jul 18, 2024

I am having a hard time creating a minimal reference sequence to reproduce the crash. If I reduce the target database down to only the aligned sequence, it doesn't happen.

The command line in question is

mmseqs convertalis querydb targetdb --format-output query,target,qstart,qend,tstart,tend,raw,cigar,qaln,taln,qlen --search-type 3

I will keep trying to make a minimal reproducible example. I did notice that adding an N to the start of my query sequence avoids the crash.

ifiddes (Author) commented Jul 18, 2024

I was unable to make a minimal reference, so I uploaded the full reference to Box. It is a human and mouse transcriptome. I had to split it into three parts; concatenate them to rebuild the file.

Here is the query:

>GRCh38_ENSG00000103042.3491.40
TATTTTATTTTGTGTAGAGATGGGGTCTCACTAGGTTGCC

You should be able to reproduce the crash with

mmseqs easy-search tmp.fasta full_ref.fa aln.out $TMPDIR --format-output query,target,qstart,qend,tstart,tend,raw,qaln,taln,qlen --search-type 3

https://app.box.com/s/bx5y7s5gpa7ybyc6xera4hujwojagphe
https://app.box.com/s/w86ynfly4gi2zt09wb0adqc3g05ox7ok
https://app.box.com/s/g50mq3skkaimb8ggunwlqwgbdz5psb6t
