In modern history (1817 to present), cholera, caused by Vibrio cholerae, has spread across the world in seven pandemics. The seventh pandemic started in 1961 and continues today. The global epidemiology and population structure of the seventh pandemic clone have recently been analysed using single nucleotide polymorphisms (SNPs) from genomic data. However, SNP comparisons communicate the relatedness of strains between studies poorly. In this study, a multilevel genome typing (MGT) scheme capable of classifying seventh pandemic strains was established. Fundamentally, the MGT is based on the multi-locus sequence typing (MLST) concept, which assigns a sequence type (ST) to each strain based on its combination of alleles. The MGT expands this concept to a series of MLST schemes that compare population structure at multiple resolutions.
The V. cholerae MGT scheme consisted of 3760 loci that were shared across all analysed seventh pandemic strains. These core loci were organised into nine MGT schemes, with the lowest, MGT1, composed of 17 loci and the highest, MGT9, consisting of all 3760 loci (the seventh pandemic core genome). The genetic relationships calculated by the smaller MGT schemes recapitulated previous findings on the large-scale transmission of the seventh pandemic across the globe. In contrast, the larger MGT schemes provided increased discriminatory power and were able to examine smaller-scale trends such as the Nepalese source of the 2010 Haiti outbreak. Additionally, classification of over 5000 seventh pandemic strains showed that the MGT is appropriate for analysing large datasets. The MGT identifiers describing seventh pandemic relationships are stable (not affected by subsequent analysis) and transferable (directly comparable between studies). The seventh pandemic MGT will allow tracking of new and existing clones and will be useful for controlling future outbreaks and the pandemic spread of cholera.
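
The multilevel idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the locus names, the three toy levels, and the `MGTTyper` class are all hypothetical stand-ins for the nine real schemes (17 to 3760 loci), and practical concerns such as missing or novel alleles are ignored. At each level, a strain's allele profile over that level's loci is looked up in a table, and a profile seen for the first time receives the next unused ST number.

```python
from typing import Dict, List, Tuple

# Hypothetical toy schemes, ordered from lowest to highest resolution.
# Each level lists the loci it uses (the real schemes are far larger).
SCHEMES: List[List[str]] = [
    ["locusA", "locusB"],
    ["locusA", "locusB", "locusC"],
    ["locusA", "locusB", "locusC", "locusD"],
]


class MGTTyper:
    """Assigns one sequence type (ST) per level to each strain."""

    def __init__(self, schemes: List[List[str]]) -> None:
        self.schemes = schemes
        # One lookup table per level: allele profile -> ST number.
        self.tables: List[Dict[Tuple[int, ...], int]] = [{} for _ in schemes]

    def assign(self, alleles: Dict[str, int]) -> Tuple[int, ...]:
        sts = []
        for loci, table in zip(self.schemes, self.tables):
            profile = tuple(alleles[locus] for locus in loci)
            # A new profile gets the next unused ST; a known profile keeps
            # the ST it was given earlier, so identifiers never change.
            sts.append(table.setdefault(profile, len(table) + 1))
        return tuple(sts)


typer = MGTTyper(SCHEMES)
strain_1 = {"locusA": 1, "locusB": 1, "locusC": 1, "locusD": 1}
strain_2 = {"locusA": 1, "locusB": 1, "locusC": 1, "locusD": 2}
strain_3 = {"locusA": 1, "locusB": 2, "locusC": 1, "locusD": 1}

id_1 = typer.assign(strain_1)  # (1, 1, 1)
id_2 = typer.assign(strain_2)  # (1, 1, 2): differs only at the top level
id_3 = typer.assign(strain_3)  # (2, 2, 3): differs at every level
```

In this toy run, strains 1 and 2 share an ST at the two lower levels and are separated only at the highest level, mirroring how smaller schemes group strains broadly while larger schemes discriminate between close relatives. Because previously assigned STs are never renumbered, typing strain 1 again after other strains have been added returns the same identifier.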