Releases: shenwei356/seqkit
SeqKit v2.9.0
Please cite:
- Wei Shen*, Botond Sipos, and Liuyang Zhao. 2024. SeqKit2: A Swiss Army Knife for Sequence and Alignment Processing. iMeta e191. doi:10.1002/imt2.191.
- Wei Shen, Shuai Le, Yan Li*, and Fuquan Hu*. SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLOS ONE. doi:10.1371/journal.pone.0163962.
Changes
- SeqKit v2.9.0 - 2024-11-01
    - seqkit:
        - Fix sequence ID parsing with the default regular expression (in this case, bytes.Index is actually used instead) for a rare case: "xxx\tyyy zzz" was wrongly parsed as "xxx\tyyy". #486
    - seqkit locate:
        - Fix -G/--non-greedy for tandem repeats, e.g., ATTCGATTCGATTCG (ATTCGx3).
    - seqkit grep/subseq:
        - Fix negative regions longer than sequence length. #479
    - seqkit stats:
        - Add an extra column sum_n to count the number of ambiguous characters. #490
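
To illustrate the tandem-repeat case mentioned for seqkit locate, here is a minimal Python sketch of overlapping versus non-overlapping motif search. It is not SeqKit's implementation: the function name find_all and the parameter non_greedy are hypothetical, and reading "non-greedy" as "resume the search after the end of the previous match" is an assumption made for this example.

```python
def find_all(seq, motif, non_greedy=False):
    """Return 1-based, inclusive (start, end) positions of motif in seq.

    non_greedy=False: resume one base after the previous match start,
                      so overlapping matches are reported.
    non_greedy=True:  resume after the previous match end (assumed meaning
                      of a non-greedy search), so matches never overlap.
    """
    hits = []
    if not motif:
        return hits
    pos = 0
    while True:
        i = seq.find(motif, pos)
        if i < 0:
            break
        hits.append((i + 1, i + len(motif)))
        pos = i + len(motif) if non_greedy else i + 1
    return hits


# The tandem repeat from the changelog entry: ATTCG x 3.
print(find_all("ATTCGATTCGATTCG", "ATTCG", non_greedy=True))
```

For a tandem repeat, the non-overlapping scan should still report every unit, because each unit starts exactly where the previous one ends.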
Links
OS | Arch | File, China mirror
---|---|---
Linux | 32-bit | seqkit_linux_386.tar.gz, China mirror
Linux | 64-bit | seqkit_linux_amd64.tar.gz, China mirror
Linux | arm64 | seqkit_linux_arm64.tar.gz, China mirror
macOS | 64-bit | seqkit_darwin_amd64.tar.gz, China mirror
macOS | arm64 | seqkit_darwin_arm64.tar.gz, China mirror
Windows | 32-bit | seqkit_windows_386.exe.tar.gz, China mirror
Windows | 64-bit | seqkit_windows_amd64.exe.tar.gz, China mirror
Notes
- Please open an issue to request binaries for other platforms.
- Run seqkit version to check for updates.
- Run seqkit genautocomplete to update the shell autocompletion script.
SeqKit v2.8.2
Changes
- SeqKit v2.8.2 - 2024-05-17
    - seqkit amplicon:
        - Fix a bug introduced in v2.7.0: when more than one pair of primers was given, only the last one was used. #457
    - seqkit translate:
        - Add option -e/--skip-translate-errors to skip translation errors and output an empty sequence. #458
    - seqkit split:
        - Add flag -I/--ignore-case for -i/--by-id. #462
SeqKit v2.8.1
Notice: I forgot to update the version number, so seqkit version will return 2.8.0.
Changelog
- SeqKit v2.8.1 - 2024-04-07
SeqKit v2.8.0
Changelog
- SeqKit v2.8.0 - 2024-03-11
    - seqkit stats:
        - Add column N50_num, an alias of L50. #15
    - seqkit seq/locate/fish/watch:
        - Remove the flag -V/--validate-seq-length. Now the whole sequence is checked if -v/--validate-seq is given.
    - seqkit amplicon:
        - Fix the speed problem introduced in v2.7.0. #439
        - Slightly faster by reusing objects.
    - seqkit seq:
        - Change the threshold sequence length for parallelizing complement sequence computation, 1kb->1Mb.
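
Since N50_num is described above as an alias of L50, a short sketch of how the two values relate may help. This uses the common definition (N50 is the length of the sequence at which the running total of lengths, sorted from longest to shortest, first reaches half of the total; L50 is how many sequences that takes). It is not taken from SeqKit's source, the function name n50_and_l50 is hypothetical, and tools can differ in how they handle ties at exactly half.

```python
def n50_and_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths.

    Sequences are visited from longest to shortest. N50 is the length of
    the sequence at which the cumulative length first reaches at least
    half of the total; L50 is the number of sequences visited so far.
    """
    if not lengths:
        return 0, 0
    total = sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running * 2 >= total:  # integer-safe "running >= total / 2"
            return length, count
    return 0, 0  # not reached for non-empty input


lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
n50, l50 = n50_and_l50(lengths)
print(n50, l50)
```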
SeqKit v2.7.0
Current Version
- SeqKit v2.7.0 - 2024-01-31
    - seqkit:
        - Group subcommands in the help message, which is more intuitive for beginners.
    - seqkit grep:
        - New flag: -D/--allow-duplicated-patterns for outputting records multiple times when duplicated patterns are given. #427
    - seqkit subseq:
        - Use the ID regular expression from the option --id-regexp to create the FASTA index file. This fixes the panic that occurred for sequences containing tabs in the headers. #432
    - seqkit split/sort/shuffle:
        - When using the two-pass mode (-2/--two-pass), replace possible tabs in the sequence header.
    - seqkit rmdup:
        - Write an empty file of duplicate numbers and lists of IDs even if there are no duplicates when using -D/--dup-num-file. #436
    - seqkit stats:
        - New flag -S/--skip-file-check to skip input file checking when given files or a file list. It's very useful if you run it with millions of files.
SeqKit v2.6.1
Changelog
- SeqKit v2.6.1 - 2023-11-18
    - seqkit:
        - fix a nil pointer panic introduced in v2.6.0, which happens when handling multiple input files and some of them have file sizes of zero.
    - seqkit seq:
        - fix panic (close of closed channel) when using -v to check sequences.
SeqKit v2.6.0
Changes
- SeqKit v2.6.0 - 2023-11-09
    - seqkit:
        - add the shortcut -X for the flag --infile-list.
    - seqkit common:
        - add a new flag -e/--check-embedded-seqs for detecting embedded sequences.
        - for matching by sequences: reduced the memory occupation and corrected numbers in the log. #416
    - seqkit stat:
        - add a new column AvgQual for average quality score. #411
    - seqkit split2:
        - fix the panic for invalid input.
    - seqkit subseq:
        - add a new flag -R/--region-coord for appending coordinates to sequence ID for -r/--region. #413
    - seqkit locate:
        - add a new flag -s/--max-len-to-show to show at most X characters for the search pattern or matched sequences.
    - seqkit seq:
        - change the nucleotide color theme. #412
SeqKit v2.5.1
Changes
- SeqKit v2.5.1 - 2023-08-09
SeqKit v2.5.0
Changes
- SeqKit v2.5.0 - 2023-07-16
    - new command seqkit merge-slides: merge sliding windows generated from seqkit sliding. #390
    - seqkit stats:
        - added a new flag -N/--N for appending other N50-like stats as new columns. #393
        - added a progress bar for > 1 input files.
        - write the result of each file immediately (no output buffer) when using -T/--tabular.
    - seqkit translate:
        - add options -s/--out-subseqs and -m/--min-len to write ORFs longer than x amino acids as individual records. #389
    - seqkit sum:
        - do not remove possible '*' by default and delete confusing warnings. Thanks to @photocyte. #399
        - added a progress bar for > 1 input files.
    - seqkit pair:
        - remove the restriction of requiring FASTQ format, i.e., FASTA files are also supported.
    - seqkit seq:
        - update help messages. #387
    - seqkit fxtab:
        - faster alphabet computation (-a/--alphabet) with a new data structure. Thanks to @elliotwutingfeng. #388
    - seqkit subseq:
        - accept reverse coordinates in BED/GTF. #392
SeqKit v2.4.0
Changes
- SeqKit v2.4.0 - 2023-03-17
    - seqkit:
    - seqkit locate:
        - do not remove embedded regions when searching with regular expressions. #368
    - seqkit amplicon:
        - fix BED coordinates for amplicons found in the minus strand. #367
    - seqkit split:
        - fix the missing file extension when using --two-pass. #332
    - seqkit stats:
        - fix computing Q1 and Q3 of sequence length for one record. #353
    - seqkit grep:
        - fix the match count (-C) when matching with mismatches (-m > 0). #370
    - seqkit replace:
        - add some flags to select a subset of records to edit; these flags are ported from seqkit grep. #348
    - seqkit faidx:
        - allow empty lines at the end of sequences.
    - seqkit faidx/sort/shuffle/split/subseq:
    - seqkit seq:
        - allow filtering sequences of length zero. Thanks to @penglbio.
    - seqkit rename:
        - new flag -s/--separator for setting the separator between the original ID/name and the counter (default "_"). #360
        - new flag -N/--start-num for setting the starting count number for duplicated IDs/names (default 2). #360
        - new flag -1/--rename-1st-rec for renaming the first record as well. #360
        - do not append a space if there's no description after the sequence ID.
    - seqkit sliding:
        - new flag -S/--suffix for changing the suffix added to the sequence ID (default: "_sliding").
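
The three seqkit rename options listed above (separator, starting number, renaming the first record) can be pictured with a small Python sketch. It only mirrors the behaviour described in the changelog and is not SeqKit's code: the function rename_duplicates and its parameters are hypothetical, and the numbering of first occurrences when the first record is renamed too (start number minus one) is an assumption.

```python
def rename_duplicates(ids, separator="_", start_num=2, rename_first=False):
    """Append a counter to duplicated IDs.

    separator    : text placed between the original ID and the counter.
    start_num    : counter given to the second occurrence of an ID.
    rename_first : also rename the first occurrence. Assumption: it then
                   receives start_num - 1.
    """
    counts = {}
    renamed = []
    for seq_id in ids:
        n = counts.get(seq_id, 0) + 1  # 1-based occurrence number
        counts[seq_id] = n
        if n == 1 and not rename_first:
            renamed.append(seq_id)
        else:
            # occurrence 2 -> start_num, occurrence 3 -> start_num + 1, ...
            renamed.append(f"{seq_id}{separator}{start_num + n - 2}")
    return renamed


print(rename_duplicates(["a", "b", "a", "a"]))
print(rename_duplicates(["a", "b", "a"], separator="-", rename_first=True))
```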