Frac-KMC is a tool that generates FracMinHash sketches from FASTA/FASTQ files. It is a modified version of the k-mer counting tool KMC (hence the name).
After a proper installation, the command fracKmcSketch should work. The command accepts the following arguments:
| Argument | Mandatory? | Meaning |
|---|---|---|
| `<infilename>` | Yes | The input file name (should be fasta or fastq) |
| `<outfilename>` | Yes | The output file name |
| `--ksize <int>` | No | kmer size (default: 21) |
| `--scaled <int>` | No | Scaled value (default: 1000) |
| `--seed <int>` | No | Random seed (default: 42) |
| `--fa` or `--fq` | Yes | Input file format (fasta or fastq) |
| `--n <int>` | No | Number of threads (default: 1) |
| `--a` | No | Write abundance (default: false) |
As of September 2024, frac-kmc supports fasta and fastq files, either gzipped or uncompressed.
fracKmcSketch <fasta> <sketch_name> --ksize 21 --scaled 1000 --seed 42 --n 32 --fa
This command will create a sketch from the fasta file using 21-mers, a scaled value of 1000, and 42 as the seed for the hash function. It will also use 32 parallel threads to compute the sketch. The resulting sketch should be compatible with a sketch computed using sourmash sketch dna input_filename -p k=21,scaled=1000 -o sketch_name.
Note that fracKmcSketch requires an explicit argument --fa, which means the input file is in fasta format. If the input file is a fastq file, the argument --fq should be provided.
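Because the format flag is mandatory, a wrapper script may want to derive it from the file name. The following is a minimal sketch, not part of frac-kmc: the helper name `frackmc_format_flag` and the list of recognized extensions are assumptions made for illustration.

```shell
# Hypothetical helper (not shipped with frac-kmc): print --fa or --fq
# based on the file extension, ignoring a trailing .gz.
frackmc_format_flag() {
    _name="${1%.gz}"    # strip a trailing .gz, if present
    case "$_name" in
        *.fa|*.fasta|*.fna) echo "--fa" ;;
        *.fq|*.fastq)       echo "--fq" ;;
        *) echo "unrecognized extension: $1" >&2; return 1 ;;
    esac
}

# Example use (requires fracKmcSketch on your PATH):
# fracKmcSketch reads.fastq.gz reads.sketch "$(frackmc_format_flag reads.fastq.gz)"
```

If your files use other extensions, pass --fa or --fq explicitly instead.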
After downloading the repository:
make
Make sure to add the bin directory to your PATH variable.
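One way to do this for the current shell session is shown below. This is a sketch that assumes the build places the executables in a bin directory at the repository root and that you run it from that root; adjust the path if your layout differs.

```shell
# Run from the repository root after `make`.
# Assumption: the executables are in ./bin relative to the repository root.
export PATH="$PWD/bin:$PATH"
```

To make the change permanent, add an equivalent line with the absolute path to your shell startup file.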
frac-kmc can be run using docker.
docker run --platform linux/amd64 mahmudhera/frackmc:x86-64
This will download the docker image of frac-kmc to your local machine. The image is about 550 MB. After downloading the image, docker will run frac-kmc, and you should see the following output:
Usage: /usr/src/app/bin/fracKmcSketch <infilename> <outfilename> [options]
Options:
--ksize <int> kmer size (default: 21)
--scaled <int> Scaled value (default: 1000)
--seed <int> Random seed (default: 42)
--fa Input file is in fasta format
--fq Input file is in fastq format
--a Write abundances
--n <int> Number of threads (default: 1)
You can mount a local directory when invoking the docker run command by using the -v flag.
For example, you can ask docker to mount a local directory to /data using the following command. This allows you to provide the input/output arguments in your local directory as if they are in /data in the docker container.
docker run -v <your_local_directory>:/data --platform linux/amd64 mahmudhera/frackmc:x86-64 /data/<input_filename> /data/<output_filename> <options>
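To see how the placeholders map to real values, the sketch below assembles the command and prints it without running docker. The helper name `frackmc_docker_cmd`, the directory, and the file names are hypothetical; the image name is taken from the docker run command shown earlier.

```shell
# Hypothetical helper: print (do not run) the docker command for a local directory.
# $1 = local directory, $2 = input file name, $3 = output file name,
# remaining arguments = fracKmcSketch options.
frackmc_docker_cmd() {
    _dir="$1"; _infile="$2"; _outfile="$3"
    shift 3
    echo docker run -v "$_dir:/data" --platform linux/amd64 \
        mahmudhera/frackmc:x86-64 "/data/$_infile" "/data/$_outfile" "$@"
}

# Example with made-up paths; inspect the printed command before running it.
frackmc_docker_cmd /home/user/genomes ecoli.fasta ecoli.sketch --fa --ksize 21
```

The output file is written to /data inside the container, so it appears in the mounted local directory.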
Please cite the following to credit use of frac-kmc:
Additionally, you may also cite original KMC: