General modules
Species detection
-m enterobacterales__species
This module attempts to identify the species of each input assembly. It uses Mash to compare the assembly against a curated set of Klebsiella and other Enterobacteriaceae assemblies from NCBI, and reports the species of the closest match.
Parameters
--enterobacterales__species_strong
Mash distance threshold for a strong species match (default: 0.02)
--enterobacterales__species_weak
Mash distance threshold for a weak species match (default: 0.04)
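As a rough illustration of how the two thresholds could partition Mash distances, here is a minimal sketch. It is not Kleborate's own code: the function name `classify_species_match`, the use of inclusive comparisons (`<=`), and the `"none"` label for distances beyond the weak threshold are all assumptions made for this example. Only the default values 0.02 and 0.04 come from the parameter descriptions above.

```python
def classify_species_match(mash_distance, strong=0.02, weak=0.04):
    """Label the closest-match Mash distance (hypothetical helper).

    `strong` and `weak` mirror the defaults of
    --enterobacterales__species_strong and --enterobacterales__species_weak.
    Inclusive comparisons and the "none" label are assumptions.
    """
    if not 0.0 <= mash_distance <= 1.0:
        raise ValueError("Mash distance must be between 0 and 1")
    if strong > weak:
        raise ValueError("strong threshold must not exceed weak threshold")
    if mash_distance <= strong:
        return "strong"
    if mash_distance <= weak:
        return "weak"
    return "none"


# Example: label a few distances using the default thresholds.
for distance in (0.01, 0.03, 0.10):
    print(distance, classify_species_match(distance))
```

Loosening either threshold makes the corresponding label easier to reach, so a more divergent closest match can still be reported as a strong or weak call.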
Outputs
The species typing module outputs the following columns:
species | Species name (scientific name)
species_match | Strength of the species call indicated as
The quality and completeness of Kleborate results depend on the quality of the input genome assemblies. In general, you can expect good results from draft genomes assembled with tools like SPAdes from high-depth (>50x) Illumina data. However, key genes subject to genotyping may still be split across contigs, which can make them difficult to detect and type accurately.
Contig stats
-m general__contig_stats
This module requires enterobacterales__species as a prerequisite. It generates basic assembly statistics to help users interpret their typing results in the context of assembly quality. We still recommend that users conduct more comprehensive QC themselves before typing genomes (e.g. screening for contamination).
The module reports a standard set of assembly quality metrics (see Outputs below).
It also adds a flag to the QC_warnings column if the assembly size falls outside the range specified in species_specification.txt in the module directory, if N50 is below 10 kbp, or if ambiguous bases (Ns) are detected in the sequence.
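To make these metrics concrete, the sketch below computes them from a FASTA file handle. It illustrates the general technique and is not Kleborate's implementation. The function names, the warning labels, the `yes (n)` formatting, and the `min_size`/`max_size` arguments (stand-ins for the per-species values read from species_specification.txt) are assumptions. N50 is taken here as the length of the contig at which the running total of descending contig lengths first reaches half of the assembly size.

```python
import io


def read_contigs(handle):
    """Yield each contig sequence from a FASTA file handle."""
    seq, started = [], False
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if started:
                yield "".join(seq)
            seq, started = [], True
        elif line:
            seq.append(line)
    if started:
        yield "".join(seq)


def contig_stats(contigs, min_size, max_size, min_n50=10_000):
    """Compute basic assembly metrics plus hypothetical QC warning labels."""
    contigs = [c.upper() for c in contigs]
    lengths = sorted((len(c) for c in contigs), reverse=True)
    total_size = sum(lengths)

    # N50: walk contigs from longest to shortest until half the assembly is covered.
    running, n50 = 0, 0
    for length in lengths:
        running += length
        if running * 2 >= total_size:
            n50 = length
            break

    # Only N characters are counted as ambiguous bases (an assumption).
    n_count = sum(c.count("N") for c in contigs)

    warnings = []  # label strings are illustrative, not Kleborate's exact output
    if not min_size <= total_size <= max_size:
        warnings.append("total_size")
    if n50 < min_n50:
        warnings.append("N50")
    if n_count > 0:
        warnings.append("ambiguous_bases")

    return {
        "contig_count": len(lengths),
        "N50": n50,
        "largest_contig": lengths[0] if lengths else 0,
        "total_size": total_size,
        "ambiguous_bases": f"yes ({n_count})" if n_count else "no",
        "QC_warnings": warnings,
    }


# Toy assembly with three contigs; the size limits are made up for the example.
example = io.StringIO(">a\nACGTACGTAC\n>b\nACGNN\n>c\nACG\n")
stats = contig_stats(read_contigs(example), min_size=10, max_size=100)
print(stats)
```

For a real assembly, the same functions can be given an open file handle instead of the `io.StringIO` object, with size limits appropriate to the species.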
Outputs
The contig stats module outputs the following columns:
contig_count | Number of contigs in the input assembly
N50 | N50 calculated from the contig sizes
largest_contig | Size of largest contig (in bp)
total_size | Total assembly size (in bp)
ambiguous_bases | Detection of ambiguous bases (yes or no). If yes, the number of ambiguous bases is also provided in brackets.
QC_warnings | List of QC issues detected, including: