SPECIAL REPORT

Using multilocus sequence typing to study bacterial variation: prospects in the genomic era
Keith A Jolley*,1 & Martin CJ Maiden1

ABSTRACT: Multilocus sequence typing (MLST) indexes the sequence variation present in a small number (usually seven) of housekeeping gene fragments located around the bacterial genome. Unique alleles at these loci are assigned arbitrary integer identifiers, which effectively summarizes the variation present in several thousand base pairs of genome sequence information as a series of numbers. Comparing bacterial isolates using allele-based methods efficiently corrects for the effects of lateral gene transfer present in many bacterial populations and is computationally efficient. This ‘gene-by-gene’ approach can be applied to larger collections of loci, such as the ribosomal protein genes used in ribosomal MLST (rMLST), up to and including the complete set of coding sequences present in a genome, whole-genome MLST (wgMLST), providing scalable, efficient and readily interpreted genome analysis.

Multilocus sequence typing
Since its introduction in 1998, multilocus sequence typing (MLST) has proven to be an effective and widely used method for characterizing bacterial isolates. MLST indexes the diversity of nucleotide sequences of fragments of housekeeping genes (loci), with most bacterial MLST schemes employing seven loci of approximately 400–500 bp each, a length initially chosen as achievable with dideoxy sequencing technology [1]. Each novel sequence at each locus is assigned a number in order of discovery (adk-1, adk-2… and so on) and the numbers for all the loci characterized in a particular scheme are incorporated into an allelic profile (e.g., 2–3–4–3–8–4–6). Each profile, or unique combination of alleles, is assigned an arbitrary sequence type (ST; e.g., ST-11). By the simple expedient of having look-up tables which relate STs to allelic profiles and alleles to allele sequences, individual sequences of thousands of base pairs in length can be uniquely associated with a bacterial isolate using a single number [2].
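The look-up tables described above can be illustrated with a minimal Python sketch. The class names (`AlleleTable`, `SchemeTable`), the two-locus toy scheme and the short toy sequences are assumptions made for illustration only; in curated databases, allele and ST numbers are assigned by curators rather than by a local counter.

```python
from typing import Dict, Iterable, Tuple


class AlleleTable:
    """Look-up table for one locus: allele sequence -> allele number.

    Numbers are handed out in order of discovery (1, 2, 3, ...).
    """

    def __init__(self, locus: str) -> None:
        self.locus = locus
        self._number_by_sequence: Dict[str, int] = {}

    def assign(self, sequence: str) -> int:
        seq = sequence.upper()
        if seq not in self._number_by_sequence:
            # A novel sequence receives the next unused number.
            self._number_by_sequence[seq] = len(self._number_by_sequence) + 1
        return self._number_by_sequence[seq]


class SchemeTable:
    """Look-up table for a scheme: allelic profile -> sequence type (ST)."""

    def __init__(self, loci: Iterable[str]) -> None:
        self.loci = list(loci)
        self.alleles = {locus: AlleleTable(locus) for locus in self.loci}
        self._st_by_profile: Dict[Tuple[int, ...], int] = {}

    def type_isolate(self, sequences: Dict[str, str]) -> Tuple[Tuple[int, ...], int]:
        """Return (allelic profile, ST) for one isolate's locus sequences."""
        profile = tuple(self.alleles[locus].assign(sequences[locus]) for locus in self.loci)
        # A novel profile receives the next unused ST number.
        st = self._st_by_profile.setdefault(profile, len(self._st_by_profile) + 1)
        return profile, st


scheme = SchemeTable(["adk", "gdh"])  # toy scheme with two loci instead of seven
profile_1, st_1 = scheme.type_isolate({"adk": "ACGT", "gdh": "GGCC"})
profile_2, st_2 = scheme.type_isolate({"adk": "ACGA", "gdh": "GGCC"})
profile_3, st_3 = scheme.type_isolate({"adk": "acgt", "gdh": "GGCC"})  # same as isolate 1
```

The third isolate carries the same sequences as the first, so it resolves to the same profile and the same ST; the single integer is enough to recover every underlying sequence through the two tables.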

KEYWORDS
• clonal complex • gene-by-gene • MLST • ribosomal MLST • whole-genome MLST

1 Department of Zoology, University of Oxford, South Parks Road, Oxford, OX1 3PS, UK
*Author for correspondence: Tel.: +44 1865 281537; Fax: +44 1865 281275; [email protected]

10.2217/FMB.14.24 © Keith A. Jolley & Martin C. J. Maiden
Future Microbiol. (2014) 9(5), 623–630
ISSN 1746-0913

MLST & bacterial isolate characterization
One of the principal reasons for the success of MLST was that it was developed in the light of an improved appreciation of the population and evolutionary biology of the bacteria, specifically the role of lateral gene transfer and the consequences of this process [3]. The concept of examining the genome at multiple housekeeping genes, that is, core genome loci presumed to be under neutral or ‘nearly neutral’ selection pressures, was previously established with multilocus enzyme electrophoresis (MLEE) [4]. MLST introduced the innovation of indexing nucleotide sequence variation in these genes, providing dramatically better resolution than had been possible with MLEE, which inferred the presence of distinct alleles from differential migration of proteins during starch gel electrophoresis. A further advantage of using nucleotide sequences was their reproducibility and portability, with the availability of curated reference data sets via the internet [2,48,49].

The use of allele designations as units of analysis, rather than nucleotide sequences themselves, addresses many of the problems associated with employing phylogenetic methods to assess isolate relationships among recombining bacteria. These problems arise from the fact that individual recombination events, which are common, introduce multiple polymorphisms, while point mutations, which are often relatively rare, change only single nucleotides. In MLST, these changes (i.e., point mutations or recombination events that change many nucleotides in one event) are effectively weighted the same, as both represent an allelic change [5]; however, ST and allele designations can be used when required to retrieve the relevant nucleotide sequences from the look-up tables for sequence-based analysis.

For numerous bacteria, MLST schemes comprising as few as seven housekeeping loci have proved to be highly discriminatory [2]. A survey of the current publicly available MLST schemes reveals a staggering level of diversity among allelic profiles that represent only a fraction of the genome in question (usually less than 0.2%) [50]. This demonstrates the importance of having a straightforward and infinitely expandable means for summarizing and comparing data on bacterial diversity, such as the allele and ST number definitions. As the allele and ST designations are arbitrary, they can be grouped into different higher order groups as improved understanding of the biological structure of the diversity they catalog emerges, without the need to rewrite the fundamental nomenclature. MLST data are amenable to such analyses of population structure since the presence of shared alleles among MLST loci can be used to infer ancestral lineages by various clustering methods and Bayesian techniques such as BAPS [6], Structure [7] and ClonalFrame [8].
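The equal weighting of point mutations and recombination events under an allele-based comparison can be shown with a short Python sketch. The ten-base toy sequences and the function names are hypothetical and chosen only to make the counts easy to check by eye; real MLST fragments are roughly 400–500 bp.

```python
from typing import Dict


def nucleotide_differences(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b))


def total_nucleotide_differences(a: Dict[str, str], b: Dict[str, str]) -> int:
    """Sequence-based distance: differing nucleotides summed over all loci."""
    return sum(nucleotide_differences(a[locus], b[locus]) for locus in a)


def allelic_differences(a: Dict[str, str], b: Dict[str, str]) -> int:
    """Allele-based distance: number of loci whose sequences are not identical."""
    return sum(a[locus] != b[locus] for locus in a)


# Toy isolates typed at two loci (hypothetical sequences).
ancestor = {"abcZ": "ACGTACGTAC", "adk": "TTGACCGGTA"}
point_mutant = {"abcZ": "ACGTACGTAT", "adk": "TTGACCGGTA"}  # one base changed
recombinant = {"abcZ": "ATGTTCGAAT", "adk": "TTGACCGGTA"}   # imported fragment, four bases changed

seq_dist_point = total_nucleotide_differences(ancestor, point_mutant)
seq_dist_recomb = total_nucleotide_differences(ancestor, recombinant)
allele_dist_point = allelic_differences(ancestor, point_mutant)
allele_dist_recomb = allelic_differences(ancestor, recombinant)
```

The sequence-based distances differ fourfold (1 versus 4), whereas both descendants are exactly one allelic change from the ancestor, which is the behaviour the text attributes to allele-based comparison.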
The extent of genetic exchange in many bacteria is evident from the fact that the number of STs observed frequently exceeds the number of alleles observed per locus by more than an order of magnitude [9]. Nevertheless, even in highly recombining organisms certain genotypes, for which STs are effective surrogates, dominate and the analysis of seven-locus MLST data resolves numerous bacterial populations into ‘clonal complexes’. While the majority of STs in a given data set are rare and transitory, certain STs are both high frequency and stable over time and during geographic spread [2]. These STs are markers for persistent consensus genotypes, which are variously referred to as a ‘central genotype’ or, with less accuracy, as a ‘founder’ or ‘ancestor’: there is rarely, if ever, any evidence that central genotypes represent either [10]. A striking example of such persistence is the spread of carbapenem-resistant ST-235 Pseudomonas aeruginosa across Eastern Europe [11]. The linking of specific STs with consensus genotypes has led to their use as markers for ‘high-risk clones’ in some cases; for example, the aforementioned ST-235, along with ST-111 and ST-175 P. aeruginosa, are associated with extensive drug resistance in healthcare settings [12]. This association is clinically useful in the absence of specific information on the chromosomal resistance mechanisms involved.

The clonal complex
Clonal complexes can be detected in population data sets with various heuristic algorithms including split decomposition [13], NeighborNet [14,15], minimum spanning-trees, eBURST [16] and the related goeBURST [17,18] (Figure 1). The clonal complexes, conventionally named after the consensus ST (e.g., the meningococcal ST-11 clonal complex or CC11), are frequently associated with important phenotypes: in pathogenic bacteria, for example, they are often associated with properties such as the propensity to cause disease, the type of disease caused, particular vaccine antigens, antimicrobial resistance or host association. Consequently, the MLST-defined clonal complex has become a principal unit of analysis, which has facilitated functional investigations by enabling complex phenotypes, such as host association, pathogenicity or antimicrobial resistance, to be associated with bacterial genotype [2]. By condensing sequence information into a series of numbers, comparisons among isolates can be simplified to a matter of counting the number of loci that vary.
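Counting the loci that vary can be sketched in a few lines of Python. The four allelic profiles are taken from Table 1 (ST-32 complex); the function names and the nested-dictionary representation of the distance matrix are illustrative choices, not part of any published tool.

```python
from itertools import combinations
from typing import Dict, List, Tuple

LOCI = ("abcZ", "adk", "aroE", "fumC", "gdh", "pdhC", "pgm")

# Allelic profiles in LOCI order, copied from Table 1.
PROFILES: Dict[int, Tuple[int, ...]] = {
    32: (4, 10, 5, 4, 6, 3, 8),
    33: (8, 10, 5, 4, 6, 3, 8),
    34: (8, 10, 5, 4, 5, 3, 8),
    290: (8, 3, 5, 4, 1, 3, 8),
}


def differing_loci(a: Tuple[int, ...], b: Tuple[int, ...]) -> List[str]:
    """Names of the loci at which two profiles carry different alleles."""
    return [locus for locus, x, y in zip(LOCI, a, b) if x != y]


def locus_differences(a: Tuple[int, ...], b: Tuple[int, ...]) -> int:
    """Allelic distance: the number of loci that vary between two profiles."""
    return len(differing_loci(a, b))


def distance_matrix(profiles: Dict[int, Tuple[int, ...]]) -> Dict[int, Dict[int, int]]:
    """Symmetric matrix of pairwise locus differences, keyed by ST."""
    sts = sorted(profiles)
    matrix = {st: {st: 0} for st in sts}
    for a, b in combinations(sts, 2):
        distance = locus_differences(profiles[a], profiles[b])
        matrix[a][b] = distance
        matrix[b][a] = distance
    return matrix


matrix = distance_matrix(PROFILES)
changed = differing_loci(PROFILES[32], PROFILES[34])
```

Because each comparison is a tuple comparison over seven integers, the cost per pair is independent of the length of the underlying sequences.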
When performed for a collection of profiles, the resultant distance matrix can be graphically represented with various algorithms to identify relationships among isolates (see Box 1).

Figure 1. Graphical presentation of multilocus sequence typing data. Methods that display MLST data generally calculate a distance matrix, with distances representing the number of locus differences between every pair of samples. Input data are provided in Table 1 with output generated by (A) split decomposition using SplitsTree [15], (B) NeighborNet using SplitsTree [15], (C) Minimum-spanning tree (http://pubmlst.org/analysis/) and (D) goeBURST [18]. Methods (A–C) have been manually annotated for publication, whereas the output for method (D) is generated by PhyloViz [17]. The observed frequencies are represented by the size of the circles.

Limitations of MLST
The metabolic diversity of the bacterial domain has prevented the development of a universal MLST scheme for all bacteria based on metabolic housekeeping genes, as even these genes are either not widely shared or are too diverse. There is a paradox that any sequence-based typing scheme relies on variation for discrimination; however, as conventional MLST employs the amplification of these loci by PCR, their variability makes the design of reliable amplification and sequencing primers difficult [2]. In addition, the variability of the content of the core genome among different bacteria makes it impossible to use the same metabolic genes other than within quite closely related organisms. Consequently, even within genera, such as the genus Streptococcus, it is frequently necessary to have more than one scheme, each with different target loci [19–22].

At the other end of the diversity scale MLST on its own cannot provide discrimination among very closely related organisms – this includes the recently evolved single-clone asexual pathogens such as Bacillus anthracis [23] or isolates of more diverse pathogens that belong to the same clone. In these cases it has been necessary to use additional typing schemes that index rapidly evolving loci; for example, those encoding antigen genes [24,25] or variable number tandem repeats [26]. More recently, whole-genome single-nucleotide polymorphisms (SNPs) have been used for this type of analysis in such pathogens [27]. Notwithstanding these limitations at the whole domain and sub-strain typing levels, MLST has proven to be very successful in describing population diversity and structure for a wide range of bacteria.


Box 1. Anatomy of a clonal complex.
The Neisseria ST-32 complex (previously identified as ET-5 complex) has been responsible for epidemics of meningococcal disease in Europe [44,45] and the Americas [46,47]. To demonstrate how multilocus sequence typing (MLST) approaches can be used to analyze related strains at different levels of resolution the whole genomes of ST-32 complex isolates recovered from all cases of disease in England and Wales in two recent epidemiological years were investigated (44 isolates). ST-32 was the most frequently isolated genotype, followed by ST-34 and ST-33 (Table 1) – double- and single-locus variants of ST-32, respectively. A Neighbor-Net comparison at the MLST loci was performed using the BIGSdb Genome Comparator tool [34] (Figure 2A). The vertices of the network can be readily annotated to identify the locus changes represented, clearly showing that ST-32 itself has the largest number of variants that differ at a single locus, with ST-33 also possessing a smaller set of its own variants. Scaling the analysis up to use wgMLST (1548 variable loci) shows that while ST-32 isolates are largely clustered together, there are some present in other parts of the network (Figure 2B). Likewise, ST-33 isolates are dispersed in the network. This shows that while standard MLST is useful for comparisons at the level of the clonal complex, strains with identical ST numbers may not be genetically closer to each other than to other members of the complex.
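The single- and double-locus variant relationships described in Box 1 can be checked directly from the allelic profiles. The sketch below copies the profiles and frequencies from Table 1; the helper names and the rule used to pick a ‘central’ ST (most single-locus variants, ties broken by frequency) are simplifying assumptions for illustration and are not a reimplementation of eBURST or goeBURST.

```python
from typing import Dict, List, Tuple

LOCI = ("abcZ", "adk", "aroE", "fumC", "gdh", "pdhC", "pgm")

# ST -> (allelic profile in LOCI order, observed frequency), copied from Table 1.
TABLE_1: Dict[int, Tuple[Tuple[int, ...], int]] = {
    32: ((4, 10, 5, 4, 6, 3, 8), 11),
    34: ((8, 10, 5, 4, 5, 3, 8), 8),
    33: ((8, 10, 5, 4, 6, 3, 8), 5),
    749: ((8, 10, 77, 4, 6, 3, 8), 4),
    259: ((4, 10, 5, 40, 6, 3, 8), 3),
    290: ((8, 3, 5, 4, 1, 3, 8), 2),
    2931: ((4, 5, 5, 4, 6, 3, 8), 2),
    8049: ((8, 10, 5, 4, 6, 3, 15), 2),
    639: ((8, 10, 5, 9, 6, 3, 8), 1),
    1096: ((4, 10, 5, 26, 6, 3, 8), 1),
    6083: ((4, 10, 5, 4, 6, 416, 8), 1),
    7460: ((4, 10, 48, 4, 6, 3, 8), 1),
    9890: ((4, 10, 5, 630, 6, 3, 8), 1),
    10285: ((4, 10, 5, 24, 6, 3, 9), 1),
    10286: ((4, 10, 5, 8, 6, 3, 8), 1),
}


def locus_differences(a: Tuple[int, ...], b: Tuple[int, ...]) -> int:
    """Number of loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def variants(center: int, n_loci: int) -> List[int]:
    """STs that differ from `center` at exactly `n_loci` loci, sorted by ST."""
    center_profile = TABLE_1[center][0]
    return sorted(
        st
        for st, (profile, _freq) in TABLE_1.items()
        if st != center and locus_differences(center_profile, profile) == n_loci
    )


def central_st() -> int:
    """ST with the most single-locus variants; ties broken by frequency (assumption)."""
    return max(TABLE_1, key=lambda st: (len(variants(st, 1)), TABLE_1[st][1]))


slv_of_32 = variants(32, 1)
dlv_of_32 = variants(32, 2)
slv_of_33 = variants(33, 1)
total_isolates = sum(freq for _profile, freq in TABLE_1.values())
```

With these data, ST-32 has eight single-locus variants and ST-33 has five, which matches the description of ST-32 as the profile with the largest set of single-locus variants and ST-33 as having a smaller set of its own.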

Table 1. Sequence types and allelic profiles of the ST-32 clonal complex of Neisseria meningitidis causing meningococcal disease in England and Wales during the 2010–2011 and 2011–2012 epidemiological years.

ST      abcZ  adk  aroE  fumC  gdh  pdhC  pgm  Frequency
32      4     10   5     4     6    3     8    11
34      8     10   5     4     5    3     8    8
33      8     10   5     4     6    3     8    5
749     8     10   77    4     6    3     8    4
259     4     10   5     40    6    3     8    3
290     8     3    5     4     1    3     8    2
2931    4     5    5     4     6    3     8    2
8049    8     10   5     4     6    3     15   2
639     8     10   5     9     6    3     8    1
1096    4     10   5     26    6    3     8    1
6083    4     10   5     4     6    416   8    1
7460    4     10   48    4     6    3     8    1
9890    4     10   5     630   6    3     8    1
10285   4     10   5     24    6    3     9    1
10286   4     10   5     8     6    3     8    1

Data taken from the MRF genome library. ST: Sequence type.

Figure 2. Neighbor-Net representation of the variation seen within cases of meningococcal disease caused by ST-32 clonal complex Neisseria meningitidis isolates over two epidemiological years within England and Wales (see Table 1 for allelic profiles and frequencies). (A) MLST. Individual STs are represented by shaded circles, with the observed frequencies represented by the size of the circle. Vertices are annotated with the genetic change represented; for example, ST-32 and ST-34 differ at two loci: abcZ and gdh. Parallel vertices within the network represent the same genetic change with reticulations describing alternative pathways between related profiles. The clonal complex is named after ST-32, which is the most frequently reported ST with the greatest number of similar profiles varying at a single locus. (B) Whole-genome variation: variation within the same isolates when assessed by whole-genome comparison at 1548 variable loci using MC58 (a member of the ST-32 complex) as the source of comparator sequences. Dashed lines on the vertices indicate that more than one allelic change is represented. These are individually annotated. Genome sequences are from the MRF Meningococcal Genome Library (http://www.meningitis.org/research/genome) hosted on PubMLST.org.

Conclusion & future perspective
MLST has been highly successful as an approach to the description, archiving and unambiguous cataloging of the diversity of a broad range of bacteria and, in many cases, groups of related bacteria defined by MLST have been used as the basis for functional studies. However, MLST has not proved to be a complete solution to the characterization problem at two levels: there has been no single universal MLST scheme applicable to all bacteria; and seven-locus MLST lacks the very high resolution required for some applications. The advent of rapid and inexpensive sequencing has removed the practical constraints that have framed the design of MLST approaches [28]; however, while the torrent of genome sequence data now available can potentially overcome the shortcomings of MLST, it also threatens the ordered investigation of bacterial diversity by swamping the field with ‘too much information’, or at least too much data without organization or an understandable nomenclature framework: major reasons for the success of MLST [29]. This mirrors the multiple incompatible molecular typing methods, or YATMs (‘Yet Another Typing Method’) [30], developed in the 1990s, which to a large degree stimulated the proposal of MLST as a general approach [1,2].

The principles behind MLST can, however, be applied to whole-genome analysis, with schemes consisting of increasing numbers of loci, up to and including the entire complement of coding sequences within the genome (whole-genome MLST [wgMLST]). This approach has been termed ‘gene-by-gene’ genomic analysis [31–33] and is the philosophy behind the design of the Bacterial Isolate Genome Sequence Database (BIGSdb) platform [34] that is currently used to host most of the MLST, and increasingly now genome, databases on PubMLST [48]. One such MLST scheme that offers universal bacterial species identification and typing is ribosomal MLST (rMLST) [35,36]. This uses 53 ribosomal protein genes, the products of which come together to form the ribosome, the essential translation machinery of the cell, which is found throughout the bacterial domain [37]. Initial validation of rMLST with selected species indicates that it provides resolution higher than standard 7-locus MLST, and it has been used to resolve species groups within the Neisseria [38] and lineage structure of Campylobacter [39].

Even though whole-genome data are becoming ubiquitous, MLST is still relevant. It provides the overall clonal frame of the organism and allows genomic data to be related to legacy data sets collected over the past 15 years. Allele designations for MLST can be readily extracted from whole-genome data [40–43] and the costs of sequencing a genome and generating 7-locus MLST data by Sanger sequencing are comparable, while gene-by-gene methods, such as rMLST and wgMLST, provide scalable means of studying the sequence variation encoded in the genome.

Financial & competing interests disclosure
MCJM is a Wellcome Trust Senior Fellow in Basic Biomedical Sciences. This publication made use of the Meningitis Research Foundation Meningococcus Genome Library (http://www.meningitis.org/research/genome) developed by Public Health England, the Wellcome Trust Sanger Institute and the University of Oxford as a collaboration. The project is funded by Meningitis Research Foundation. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed. No writing assistance was utilized in the production of this manuscript.

Open access
This work is licensed under the Creative Commons Attribution-NonCommercial 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/

EXECUTIVE SUMMARY
Multilocus sequence typing methodology
• A molecular typing method that indexes the sequences of fragments of housekeeping genes at (usually) seven loci.
• Each unique sequence for a given locus is given an arbitrary allele number.
• Each unique combination of alleles (or allelic profile) is given an arbitrary sequence type (ST) number.
• Allele-based methods such as multilocus sequence typing (MLST) correct for the effects of lateral gene transfer.

Clonal complexes
• These are groups of related STs that may be associated with particular phenotypes.
• Can be defined by various methods that involve counting locus differences to a ‘central’ ST or to other members of the complex.

Ribosomal MLST
• A universal MLST scheme that uses the 53 ribosomal protein genes.
• Suitable for use with whole-genome data.
• Higher resolution than conventional (seven locus) MLST and works for all bacteria.

Gene-by-gene analysis: whole-genome MLST
• MLST-type approach to analyzing genomic variation.
• Alleles for all coding sequences in the genome are indexed.
• Highly scalable and computationally efficient analysis of whole genome data using the same methods as used for MLST.
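The gene-by-gene idea summarized above, in which allele designations are read from genome sequence and then compared locus by locus, can be sketched as follows. This is a toy under stated assumptions: the locus names, sequences and allele numbers are invented, matching is by exact substring on either strand, and loci without a call are simply skipped in comparisons. Production pipelines generally rely on alignment-based searches and must also handle novel, partial or duplicated alleles, none of which is attempted here.

```python
from typing import Dict, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (characters other than ACGT are kept)."""
    return seq.translate(_COMPLEMENT)[::-1]


def call_alleles(
    genome: str, allele_db: Dict[str, Dict[str, int]]
) -> Dict[str, Optional[int]]:
    """Assign an allele number per locus by exact match on either strand.

    `allele_db` maps locus -> {allele sequence: allele number}.
    A locus with no exact match is reported as None.
    """
    forward = genome.upper()
    # The "N" separator stops a match from spanning the join between strands.
    searchable = forward + "N" + reverse_complement(forward)
    calls: Dict[str, Optional[int]] = {}
    for locus, alleles in allele_db.items():
        calls[locus] = next(
            (number for seq, number in alleles.items() if seq in searchable), None
        )
    return calls


def compare_calls(
    a: Dict[str, Optional[int]], b: Dict[str, Optional[int]]
) -> Tuple[int, int]:
    """Return (loci that differ, loci compared), using only loci called in both."""
    shared = [locus for locus in a if a[locus] is not None and b.get(locus) is not None]
    differing = sum(a[locus] != b[locus] for locus in shared)
    return differing, len(shared)


# Hypothetical allele definitions and a hypothetical 20-base 'genome'.
ALLELE_DB = {
    "locusA": {"ACGTACG": 1, "ACGTTCG": 2},
    "locusB": {"TAATGGCC": 1},  # present on the reverse strand of the genome below
    "locusC": {"CCCCCCCC": 1},  # absent from the genome below
}
calls = call_alleles("TTTACGTACGGGCCATTAAA", ALLELE_DB)
result = compare_calls(
    {"l1": 1, "l2": 2, "l3": None},
    {"l1": 1, "l2": 3, "l3": 4},
)
```

Once every isolate is reduced to a dictionary of allele numbers, the same locus-counting comparison used for seven-locus MLST applies unchanged to hundreds or thousands of loci.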


References
Papers of special note have been highlighted as: • of interest; •• of considerable interest

1 Maiden MCJ, Bygraves JA, Feil E et al. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc. Natl Acad. Sci. USA 95(6), 3140–3145 (1998).
  Describes the development of the first multilocus sequence typing (MLST) scheme.
2 Maiden MC. Multilocus sequence typing of bacteria. Annu. Rev. Microbiol. 60, 561–588 (2006).
3 Maynard Smith J, Smith NH, O’Rourke M, Spratt BG. How clonal are bacteria? Proc. Natl Acad. Sci. USA 90(10), 4384–4388 (1993).
4 Selander RK, Caugant DA, Ochman H, Musser JM, Gilmour MN, Whittam TS. Methods of multilocus enzyme electrophoresis for bacterial population genetics and systematics. Appl. Environ. Microbiol. 51, 837–884 (1986).
5 Holmes EC, Urwin R, Maiden MC. The influence of recombination on the population structure and evolution of the human pathogen Neisseria meningitidis. Mol. Biol. Evol. 16(6), 741–749 (1999).
6 Cheng L, Connor TR, Aanensen DM, Spratt BG, Corander J. Bayesian semi-supervised classification of bacterial samples using MLST databases. BMC Bioinformatics 12, 302 (2011).
7 Hubisz MJ, Falush D, Stephens M, Pritchard JK. Inferring weak population structure with the assistance of sample group information. Mol. Ecol. Resour. 9(5), 1322–1332 (2009).
8 Didelot X, Falush D. Inference of bacterial microevolution using multilocus sequence data. Genetics 175(3), 1251–1266 (2007).
9 Jolley KA, Kalmusova J, Feil EJ et al. Carried meningococci in the Czech Republic: a diverse recombining population. J. Clin. Microbiol. 38(12), 4492–4498 (2000).
10 Feil EJ. Small change: keeping pace with microevolution. Nat. Rev. Microbiol. 2(6), 483–495 (2004).
11 Edelstein MV, Skleenova EN, Shevchenko OV et al. Spread of extensively resistant VIM-2-positive ST235 Pseudomonas aeruginosa in Belarus, Kazakhstan, and Russia: a longitudinal epidemiological and clinical study. Lancet Infect. Dis. 13(10), 867–876 (2013).
12 Cabot G, Ocampo-Sosa AA, Dominguez MA et al. Genetic markers of widespread extensively drug-resistant Pseudomonas aeruginosa high-risk clones. Antimicrob. Agents Chemother. 56(12), 6349–6357 (2012).
13 Bandelt HJ, Dress AW. Split decomposition: a new and useful approach to phylogenetic analysis of distance data. Mol. Phylogenet. Evol. 1(3), 242–252 (1992).
14 Bryant D, Moulton V. Neighbor-net: an agglomerative method for the construction of phylogenetic networks. Mol. Biol. Evol. 21(2), 255–265 (2004).
15 Huson DH, Bryant D. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23(2), 254–267 (2006).
16 Feil EJ, Li BC, Aanensen DM, Hanage WP, Spratt BG. eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data. J. Bacteriol. 186(5), 1518–1530 (2004).
17 Francisco AP, Vaz C, Monteiro PT, Melo-Cristino J, Ramirez M, Carrico JA. PHYLOViZ: phylogenetic inference and data visualization for sequence based typing methods. BMC Bioinformatics 13, (2012).
18 Francisco AP, Bugalho M, Ramirez M, Carrico JA. Global optimal eBURST analysis of multilocus typing data using a graphic matroid approach. BMC Bioinformatics 10, 152 (2009).
19 Do T, Jolley KA, Maiden MC et al. Population structure of Streptococcus oralis. Microbiology 155(Pt 8), 2593–2602 (2009).
20 Webb K, Jolley KA, Mitchell Z et al. Development of an unambiguous and discriminatory multilocus sequence typing scheme for the Streptococcus zooepidemicus group. Microbiology 154(Pt 10), 3016–3024 (2008).
21 Coffey TJ, Pullinger GD, Urwin R et al. First insights into the evolution of Streptococcus uberis: a multilocus sequence typing scheme that enables investigation of its population biology. Appl. Environ. Microbiol. 72(2), 1420–1428 (2006).
22 Enright MC, Spratt BG. A multilocus sequence typing scheme for Streptococcus pneumoniae: identification of clones associated with serious invasive disease. Microbiology 144(11), 3049–3060 (1998).
23 Priest FG, Barker M, Baillie LW, Holmes EC, Maiden MC. Population structure and evolution of the Bacillus cereus group. J. Bacteriol. 186(23), 7959–7970 (2004).
24 Jolley KA, Brehony C, Maiden MC. Molecular typing of meningococci: recommendations for target choice and nomenclature. FEMS Microbiol. Rev. 31(1), 89–96 (2007).
25 Dingle KE, McCarthy ND, Cody AJ, Peto TE, Maiden MC. Extended sequence typing of Campylobacter spp., United Kingdom. Emerg. Infect. Dis. 14(10), 1620–1622 (2008).
26 Adair DM, Worsham PL, Hill KK et al. Diversity in a variable-number tandem repeat from Yersinia pestis. J. Clin. Microbiol. 38(4), 1516–1519 (2000).
27 Pearson T, Okinaka RT, Foster JT, Keim P. Phylogenetic understanding of clonal populations in an era of whole genome sequencing. Infect. Genet. Evol. 9(5), 1010–1019 (2009).
28 Medini D, Serruto D, Parkhill J et al. Microbiology in the post-genomic era. Nat. Rev. Microbiol. 6(6), 419–430 (2008).
29 Perez-Losada M, Cabezas P, Castro-Nallar E, Crandall KA. Pathogen typing in the genomics era: MLST and the future of molecular epidemiology. Infect. Genet. Evol. 16, 38–53 (2013).
  Discusses the use of whole-genome sequencing for high-throughput typing with multilocus methods, along with a detailed comparison of allele-based and sequence-based methods of comparison.
30 Achtman M. A surfeit of YATMs? J. Clin. Microbiol. 34(7), 1870 (1996).
  • An historic reference that highlights the dangers of multiple incompatible typing methods. This is still pertinent today with the advent of whole-genome sequencing.
31 Sheppard SK, Jolley KA, Maiden MCJ. A gene-by-gene approach to bacterial population genomics: whole genome MLST of Campylobacter. Genes 3(2), 261–277 (2012).
32 Maiden MC, Van Rensburg MJ, Bray JE et al. MLST revisited: the gene-by-gene approach to bacterial genomics. Nat. Rev. Microbiol. 11(10), 728–736 (2013).
  •• A review of MLST in the genome age and how the same methodology can be applied to whole-genome analysis.
33 Bratcher HB, Bennett JS, Maiden MCJ. Evolutionary and genomic insights into meningococcal biology. Future Microbiol. 7(7), 873–885 (2012).
34 Jolley KA, Maiden MC. BIGSdb: scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics 11(1), 595 (2010).
35 Jolley KA, Bliss CM, Bennett JS et al. Ribosomal multi-locus sequence typing: universal characterization of bacteria from domain to strain. Microbiology 158, 1005–1015 (2012).
  • Describes a universal MLST scheme that can speciate and type any bacteria since it uses the sequences of genes that encode the ribosomal proteins that are universally present throughout the bacterial domain.
36 Ussery DW, Gordon SV. Two novel methods for using genome sequences to infer taxonomy. Microbiology 158(Pt 6), 1414 (2012).
37 Yutin N, Puigbo P, Koonin EV, Wolf YI. Phylogenomics of prokaryotic ribosomal proteins. PLoS ONE 7(5), (2012).
38 Bennett JS, Jolley KA, Earle SG et al. A genomic approach to bacterial taxonomy: an examination and proposed reclassification of species within the genus Neisseria. Microbiology 158(Pt 6), 1570–1580 (2012).
39 Read DS, Woodcock DJ, Strachan NJ et al. Evidence for phenotypic plasticity amongst multi-host Campylobacter jejuni and C. coli lineages using ribosomal MLST and Raman spectroscopy. Appl. Environ. Microbiol. 79(3), 965–973 (2013).
40 Inouye M, Conway TC, Zobel J, Holt KE. Short read sequence typing (SRST): multi-locus sequence types from short reads. BMC Genomics 13, 338 (2012).
41 Jolley KA, Maiden MC. Automated extraction of typing information for bacterial pathogens from whole genome sequence data: Neisseria meningitidis as an exemplar. Euro Surveill. 18(4), 20379 (2013).
42 Vogel U, Szczepanowski R, Claus H, Junemann S, Prior K, Harmsen D. Ion torrent personal genome machine sequencing for genomic typing of Neisseria meningitidis for rapid determination of multiple layers of typing information. J. Clin. Microbiol. 50(6), 1889–1894 (2012).
  A practical demonstration of the ease of extracting molecular typing information, such as MLST, from whole-genome data.
43 Larsen MV, Cosentino S, Rasmussen S et al. Multilocus sequence typing of total-genome-sequenced bacteria. J. Clin. Microbiol. 50(4), 1355–1361 (2012).
44 Bygraves JA, Urwin R, Fox AJ et al. Population genetic and evolutionary approaches to the analysis of Neisseria meningitidis isolates belonging to the ET-5 complex. J. Bacteriol. 181(18), 5551–5556 (1999).
45 Delbos V, Lemee L, Benichou J, Berthelot G, Taha MK, Caron F. Meningococcal carriage during a clonal meningococcal B outbreak in France. Eur. J. Clin. Microbiol. Infect. Dis. 32(11), 1451–1459 (2013).
46 Climent Y, Urwin R, Yero D et al. The genetic structure of Neisseria meningitidis populations in Cuba before and after the introduction of a serogroup BC vaccine. Infect. Genet. Evol. 10(4), 546–554 (2010).
47 De Filippis I, De Lemos APS, Hostetler JB et al. Molecular epidemiology of Neisseria meningitidis serogroup B in Brazil. PLoS ONE 7(3), e33016 (2012).
48 PubMLST website hosted at the University of Oxford, UK. http://pubmlst.org/
49 MLST website hosted at Imperial College, UK. http://www.mlst.net/
50 Comprehensive list of all MLST databases. http://pubmlst.org/databases.shtml