MetaCache 2.4.2
Improved sequence id extraction from filenames and sequence headers.
The default setting is now smarter: it first tries to find NCBI-style accession or accession.version identifiers, then GenBank identifiers, and finally falls back to the filename (without path and extension).
The new command line option -sequence-id-format <type> allows the user to select a preferred method for sequence id extraction.
Available values for <type> are:
smart: (default) works as described above
ncbi: only use NCBI-style accession or accession.version identifiers
genbank: only use GenBank identifiers
filename: only use the filename (without path and extension)
leadingword: only use the first contiguous stretch of non-whitespace characters
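
The fallback order of the five modes can be illustrated with a small Python sketch. This is not MetaCache's implementation: the function names, the two regular expressions (a simplified NCBI accession pattern and a "gi|<number>" pattern for GenBank identifiers), searching only the header before falling back to the filename, and returning None when a single-method mode finds nothing are all assumptions made for illustration.

```python
import os
import re

# Assumed, simplified patterns; MetaCache's real matching rules may differ.
# NCBI-style accession: uppercase prefix, optional underscore, 5+ digits,
# optionally followed by ".version" (e.g. NC_000913.3).
ACCESSION = re.compile(r"\b([A-Z]{1,6}_?[0-9]{5,}(?:\.[0-9]+)?)\b")
# GenBank identifier, assumed here to be the numeric part of "gi|12345|".
GI = re.compile(r"\bgi\|([0-9]+)")


def find_ncbi(header):
    m = ACCESSION.search(header)
    return m.group(1) if m else None


def find_genbank(header):
    m = GI.search(header)
    return m.group(1) if m else None


def filename_stem(path):
    # Filename without directory and without the last extension.
    return os.path.splitext(os.path.basename(path))[0]


def leading_word(header):
    # First contiguous stretch of non-whitespace characters,
    # ignoring the FASTA '>' marker.
    parts = header.lstrip(">").split()
    return parts[0] if parts else None


def extract_id(header, path, fmt="smart"):
    """Return a sequence id for one record, or None if the mode finds nothing."""
    if fmt == "smart":
        # Try each method in the documented order; filename is the last resort.
        return find_ncbi(header) or find_genbank(header) or filename_stem(path)
    if fmt == "ncbi":
        return find_ncbi(header)
    if fmt == "genbank":
        return find_genbank(header)
    if fmt == "filename":
        return filename_stem(path)
    if fmt == "leadingword":
        return leading_word(header)
    raise ValueError(f"unknown sequence id format: {fmt}")
```

With the hypothetical header ">gi|556503834|ref|NC_000913.3| E. coli" in a file named data/ecoli.fna, this sketch yields NC_000913.3 for smart and ncbi, 556503834 for genbank, ecoli for filename, and the whole first token for leadingword.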