Infection, Genetics and Evolution 36 (2015) 307–314


Infection, Genetics and Evolution journal homepage: www.elsevier.com/locate/meegid

Genomic comparison between pathogenic Streptococcus agalactiae isolated from Nile tilapia in Thailand and fish-derived ST7 strains

Pattanapon Kayansamruaj a, Nopadon Pirarat b, Hidehiro Kondo c, Ikuo Hirono c, Channarong Rodkhum a,⁎

a Department of Veterinary Microbiology, Faculty of Veterinary Science, Chulalongkorn University, Bangkok 10330, Thailand
b Department of Veterinary Pathology, Faculty of Veterinary Science, Chulalongkorn University, Bangkok 10330, Thailand
c Laboratory of Genome Science, Tokyo University of Marine Science and Technology, Tokyo 108-8477, Japan

Article info

Article history:
Received 13 May 2015
Received in revised form 29 September 2015
Accepted 7 October 2015
Available online 9 October 2015

Keywords: Comparative genomics; Evolution; ST7; Streptococcus agalactiae; Fish

Abstract

Streptococcus agalactiae, or Group B streptococcus (GBS), is a highly virulent pathogen in aquatic animals, causing huge mortalities worldwide. In Thailand, the serotype Ia, β-hemolytic GBS, belonging to sequence type (ST) 7 of clonal complex (CC) 7, was found to be the major cause of streptococcosis outbreaks in fish farms. In this study, we performed an in silico genomic comparison to investigate the phylogenetic relationship between the pathogenic fish strains of Thai ST7 and other ST7 strains from different hosts and geographical origins. In general, the genomes of the Thai ST7 strains are closely related to those of other fish ST7s, as the shared core genome makes up 92–95% of any individual fish ST7 genome. Among the fish ST7 genomes, we observed only small dissimilarities, based on the analysis of clustered regularly interspaced short palindromic repeats (CRISPRs), surface protein markers, insertion sequence (IS) elements and putative virulence genes. The phylogenetic tree based on single nucleotide polymorphisms (SNPs) of the core genome sequences clearly categorized the ST7 strains according to their geographical and host origins, with the human ST7 being genetically distant from the fish ST7 strains. A pangenome analysis of ST7 strains detected a 48-kb gene island specifically in the Thai ST7 isolates. The orientations and predicted amino acid sequences of the genes in the island closely matched those of Tn5252, a streptococcal conjugative transposon, in GBS 2603V/R serotype V, Streptococcus pneumoniae and Streptococcus suis. Thus, it was presumed that Thai ST7 acquired this Tn5252 homologue from related streptococci. The close phylogenetic relationship between the fish ST7 strains suggests that these strains were derived from a common ancestor and have diverged in different geographical regions and in different hosts.

© 2015 Elsevier B.V. All rights reserved.

⁎ Corresponding author. E-mail address: [email protected] (C. Rodkhum).

http://dx.doi.org/10.1016/j.meegid.2015.10.009
1567-1348/© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Streptococcus agalactiae (Group B streptococcus; GBS) is a causative agent of meningoencephalitis in human infants, mastitis in dairy cows and septicemia in fish (Timoney, 2010). At present, streptococcosis outbreaks associated with GBS infection in commercial fish farms have been reported worldwide, causing losses of 250 million USD to the aquaculture industry annually (Klesius et al., 2008). In Thailand, massive mortalities stemming from GBS infection have occurred frequently in cage- and earthen pond-cultured tilapia farms (Kayansamruaj et al., 2014a; Suanyuk et al., 2008). Among the genotypes of GBS strains isolated from farmed Nile tilapia, the β-hemolytic, serotype Ia strain is the one most widely distributed in Thailand (Kayansamruaj et al., 2014a). A preliminary genomic characterization revealed that the Thai piscine GBS strains belong to sequence type (ST) 7 of clonal complex (CC) 7, the GBS lineage associated with lethal septicemia in humans and fish (Evans et al., 2008; Jones et al., 2003). To date, lethal outbreaks of ST7 strains have occurred mainly in Asia, including Kuwait, China and Thailand (Delannoy et al., 2013; Kayansamruaj et al., 2014b; Liu et al., 2013).

Because human and fish pathogenic ST7 strains are closely related, it is possible that the fish and human ST7s diverged from the same recent common ancestor (Liu et al., 2013). At present, public databases hold complete genome sequences of two ST7 isolates and whole genome shotgun sequences of two other ST7 isolates. However, there is little epidemiological information showing the genetic relationship between ST7 isolates. Therefore, this study aimed to investigate the epidemiological and evolutionary relationships between ST7 isolates from different hosts and different geographical regions, including those isolated from tilapia with streptococcosis in Thailand, using bioinformatic tools.

2. Materials and methods

2.1. Genome sequences

Three genome sequences of Thai piscine ST7 isolates were initially selected. The sources of isolation and the procedures for bacterial genome preparation, sequencing, assembly and identification of the multilocus sequence type are described in our previous publication (Kayansamruaj et al., 2014b). The whole genome shotgun (WGS) sequences of the Thai ST7 isolates, i.e., FNA07, FPrA02 and ENC06, have been deposited in GenBank under the accession numbers GCA_000715295, GCA_000715315 and GCA_000714695, respectively. For genomic comparisons, three genome sequences of other fish ST7 strains and one genome sequence of the human ST7 prototype strain were included in the analysis (Table 1). The genome sequences of the ST7 strains, both complete and draft, were annotated using the Rapid Annotation using Subsystem Technology (RAST) automated web service (Overbeek et al., 2014).

2.2. Prediction of genome elements

The prophage sequences contained within the ST7 genomes were examined using the PHAST search tool (http://phast.wishartlab.com/index.html) (Zhou et al., 2011). ResFinder with the default settings was used to identify antibiotic resistance genes (http://www.cbs.dtu.dk/services/ResFinder) (Zankari et al., 2012), and the online CRISPRfinder program was used to search for clustered regularly interspaced short palindromic repeats (http://crispr.u-psud.fr/) (Grissa et al., 2007). The nucleotide sequences of the spacers contained in the CRISPR/cas region of the ST7 genomes were then analyzed and assigned to the corresponding spacer groups as described by Lopez-Sanchez et al. (2012). The predicted amino acid sequences derived from RAST annotation were examined for the existence of surface protein markers (αC and βC protein) and putative virulence proteins of GBS deposited in the Virulence Factor of Pathogenic Bacteria database (VFDB; www.mgc.ac.cn/VFs/main.htm) using BLASTp with a cut-off value of 1e−5. Lastly, the presence of insertion sequence elements (ISs; IS1381, IS861, IS1548, ISSa1, ISSag2, ISSag3, ISSag4 and GBSi1) in the GBS genomes was examined using BLASTn against the query sequences of ISs described elsewhere (Kong et al., 2003).

2.3. Orthology reconstruction

The protein sequences obtained from RAST annotation were uploaded to OrthoMCL-DB version 5.0 in order to assign proteins to orthologous groups using the default settings (Li et al., 2003). The derived orthologous clusters were then used as a template in the OrthoMCL search strategies workspace in order to verify the common OrthoMCL groups among the ST7 isolates, comprising piscine isolates FNA07, FPrA02, ENC06 and GD201008-001 and human isolate A909.

2.4. Genomic comparison

The shared and unique genes among the seven ST7 isolates were identified using the sequence-based genomic comparison tool provided by the SEED viewer (Overbeek et al., 2014). Briefly, the deduced amino acid sequences of each ST7 isolate obtained from RAST annotation were compared against those of the other ST7 isolates by reciprocal sequence comparison with a protein identity cut-off of 80%. The “search against all” scheme for the seven ST7 genome sequences was run in all possible combinations. Subsequently, the unique genes of FNA07, FPrA02 and ENC06 were searched against NCBI's non-redundant protein sequence database using BLASTp. Additionally, the genome alignment of the ST7 isolates was visualized using the progressive Mauve algorithm in the Mauve program version 2.3.1 (Darling et al., 2010).

2.5. Phylogenetic reconstruction

The phylogenetic tree was generated based on the concatenated sequences of single nucleotide polymorphisms (SNPs) located in the core genome of the seven ST7 isolates. SNP calling, filtering, site validation, alignment of the concatenated SNP sequences and visualization of the phylogenetic tree were conducted using the automated CSI Phylogeny 1.0a web service with the default settings (Kaas et al., 2014).

3. Results

3.1. Gene annotation

The FNA07, FPrA02 and ENC06 strains yielded 2090, 2032 and 2083 predicted coding sequences (CDSs), respectively. The other ST7 isolates yielded an average of 2021 CDSs. Slightly more than half (54 to 57%) of the CDSs in each strain could be functionally categorized into 339 to 344 subsystems. Among the subsystems, carbohydrates, protein metabolism, and cell wall and capsule had the greatest numbers of genes. An overview of the genome characteristics of the ST7 isolates and their subsystem statistics is shown in Table 2.

3.2. Genome elements of ST7 isolates

3.2.1. Prophage

PHAST revealed two different phages (an intact phage and an incomplete phage) in the three Thai isolates (FNA07, FPrA02 and ENC06). The intact phage is 31 kb long, comprises 30 CDSs and is almost identical (99% similarity) to the LambdaSa04 prophage of human strain A909 and fish strains GD201008-001, ZQ0910 and CF01173. The Thai ST7-derived LambdaSa04 was more similar to the sequence in A909 than to the sequence in GD201008-001 (total length, 28 kb), since the 3′ flanking region of the Chinese ST7-derived LambdaSa04 has a large deletion. Additionally, the putative attL and attR sites, located in the internal and flanking regions of the Thai ST7 prophage, were also deleted in the GD201008-001 sequence.
The incomplete prophage was found in the genomes of the Thai ST7 and CF01173 strains. This incomplete prophage is 16 kb in length and contains 25 CDSs (Fig. S2). A BLASTn analysis of the 16-kb phage indicated a high similarity (95%) to the LambdaSa05 prophage derived from human strain A909.

3.2.2. CRISPRs

CRISPRfinder found an identical 235-bp CRISPR sequence in FNA07, FPrA02 and ENC06 containing 4 direct repeat sequences (37 bp per repeat), 3 spacers and the conserved CRISPR-associated gene csn1. The Thai ST7 isolates (but not human ST7 strain A909) had three spacers (104, 105 and 106) that were identical to the leader-end spacers of the Chinese piscine strains GD201008-001 and ZQ0910 (Fig. 1). Although no CRISPR-related sequence was found in fish ST7 strain CF01173, the result is likely to be a false negative since only the draft genome sequence of CF01173 was employed in the CRISPR analysis. Hence, the CRISPR analysis result of CF01173 was excluded from the comparison of CRISPR spacers among the ST7 isolates.

Fig. 1. Diversity in spacer profile of the CRISPR1 locus in six ST7 isolates. The spacers were identified and designated according to Lopez-Sanchez et al. (2012). The colored boxes represent identical spacers shared by different strains. GBS strains were grouped according to the alignment of the corresponding spacers. Of note, ST7 strain CF01173 was also analyzed using the same automated CRISPRfinder program, but no CRISPR-associated sequence was identified. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 1. Genome sequences of Streptococcus agalactiae ST7 isolates included in the genomic comparison.

| Strain | Host | Geographical origin | Date | Genome status | GenBank accession number | Reference |
|---|---|---|---|---|---|---|
| ENC06 | Water (tilapia farm) | Nakhon Pathom, Thailand | 2011 | Contigs | GCA_000714695 | Kayansamruaj et al. (2014b) |
| FPrA02 | Nile tilapia | Prachin Buri, Thailand | 2009 | Contigs | GCA_000715315 | Kayansamruaj et al. (2014b) |
| FNA07 | Nile tilapia | Central region of Thailand | 2008 | Contigs | GCA_000715295 | Kayansamruaj et al. (2014b) |
| GD201008-001 | Tilapia | China | – | Complete | CP003810 | Liu et al. (2013) |
| ZQ0910 | Tilapia | China | – | Scaffolds | AKAP00000000 | Liu et al. (2013) |
| CF01173 | Trout | UK | – | Contigs | CAQB00000000 | Rosinski-Chupin et al. (2013) |
| A909 | Human | Japan | – | Complete | CP000114 | Tettelin et al. (2005) |

Table 2. Characteristics of seven ST7 genomes and their subsystems.

| Strains | FNA07 | FPrA02 | ENC06 | GD201008-001 | ZQ0910 | CF01173 | A909 |
|---|---|---|---|---|---|---|---|
| Size (Mb) | 2.1 | 2.05 | 2.09 | 2.06 | 2.00 | 2.02 | 2.12 |
| %GC | 35.5 | 35.4 | 35.4 | 35.6 | 35.5 | 35.3 | 35.6 |
| Number of ORFs | 2090 | 2032 | 2083 | 2011 | 1993 | 1991 | 2089 |
| Number of RNAs | 49 | 44 | 49 | 97 | 35 | 16 | 101 |
| Subsystem coverage (%) | 54 | 56 | 54 | 57 | 57 | 57 | 55 |
| Number of subsystems | 343 | 341 | 342 | 343 | 342 | 339 | 344 |
| Subsystem feature counts | | | | | | | |
| Cofactors, vitamins, prosthetic groups, pigments | 95 | 95 | 95 | 100 | 100 | 100 | 100 |
| Cell wall and capsule | 140 | 140 | 139 | 138 | 137 | 141 | 142 |
| Virulence, disease and defense | 60 | 60 | 60 | 57 | 57 | 61 | 64 |
| Potassium metabolism | 15 | 15 | 15 | 15 | 15 | 15 | 15 |
| Photosynthesis | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Miscellaneous | 23 | 23 | 23 | 22 | 22 | 21 | 22 |
| Phages, prophages, transposable elements, plasmids | 12 | 12 | 12 | 13 | 13 | 13 | 28 |
| Membrane transport | 70 | 70 | 70 | 75 | 75 | 72 | 75 |
| Iron acquisition and metabolism | 20 | 20 | 20 | 18 | 18 | 21 | 21 |
| RNA metabolism | 107 | 108 | 107 | 120 | 112 | 104 | 103 |
| Nucleosides and nucleotides | 95 | 95 | 95 | 93 | 93 | 93 | 93 |
| Protein metabolism | 177 | 180 | 177 | 179 | 179 | 175 | 179 |
| Cell division and cell cycle | 29 | 30 | 29 | 34 | 34 | 34 | 34 |
| Motility and chemotaxis | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| Regulation and cell signaling | 30 | 30 | 30 | 33 | 33 | 34 | 34 |
| Secondary metabolism | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| DNA metabolism | 100 | 99 | 100 | 106 | 106 | 104 | 104 |
| Fatty acids, lipids, and isoprenoids | 75 | 75 | 75 | 77 | 77 | 78 | 77 |
| Nitrogen metabolism | 0 | 0 | 0 | 3 | 3 | 3 | 3 |
| Dormancy and sporulation | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Respiration | 15 | 15 | 15 | 15 | 15 | 16 | 15 |
| Stress response | 57 | 53 | 58 | 55 | 54 | 40 | 55 |
| Metabolism of aromatic compounds | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| Amino acids and derivatives | 137 | 139 | 137 | 146 | 146 | 147 | 146 |
| Sulfur metabolism | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| Phosphorus metabolism | 32 | 32 | 32 | 36 | 36 | 36 | 36 |
| Carbohydrates | 274 | 274 | 274 | 281 | 282 | 280 | 280 |

3.2.3. Antibiotic resistance genes

According to the ResFinder tool, 13 classes of antibiotics were included in the search for antibiotic resistance genes, namely,
beta-lactam, fluoroquinolone, fosfomycin, fusidic acid, macrolide/lincosamide/streptogramin B, nitroimidazole, oxazolidinone, phenicol, rifampicin, sulphonamide, tetracycline, trimethoprim and glycopeptide. With the default settings (minimum sequence identity of 98% with at least 60% coverage), no antibiotic resistance genes were found in the ST7 genomes, although one sequence from FPrA02 was identified as a putative antibiotic resistance gene. This 566-bp sequence shows 98% similarity to a partial sequence of the kanamycin resistance transposon Tn903 (accession number V00359). The FPrA02-derived sequence contained only an incomplete aminoglycoside phosphotransferase (aphA) domain compared to the complete aphA sequence present in Escherichia coli.

3.2.4. Surface protein markers, insertion sequence elements and putative virulence genes

The ST7 genomes have several genetic elements, including surface protein markers, insertion sequences and putative virulence genes (Table 3). Only A909 carried the complete sequences of both surface proteins, αC protein (bca) and βC protein (cba). In the fish ST7 isolates, cba was missing in ENC06, FNA07 and CF01173, while bca, with a partial deletion, was observed in all fish ST7 isolates except FPrA02 and CF01173, in which bca was not identified. Specifically, the repeat motifs within bca and cba were deleted from the fish ST7s. Further investigation using the Tandem Repeat Finder web tool (https://tandem.bu.edu/trf/trf.html) found nine 246-bp repeats in the bca of strain A909 but no repeats in the fish ST7s. Furthermore, FPrA02 had an insertion in a virulence gene: the CDS of fbsB, encoding fibrinogen-binding protein, was disrupted by a 1.1-kb putative mobile element containing two ORFs (Fig. 2). The Thai ST7 isolates and GD201008-001 have identical insertion sequence (IS) profiles. The ST7 isolates had numerous similar putative virulence genes, including genes concerned with adhesion (ssb1, ssp5, pavA, fbsA, fbsB, lmb, plr, PI-1, PI-2a and PI-2b), immune evasion (cps cluster), protease synthesis (cppA, scpB, htrA and ropA), toxin synthesis (cyl operon and cfb), metal transportation (psaA), extracellular enzymes (hylB and eno) and immune reactive proteins (bca, cba and sip). Of the putative virulence genes listed in Table 3, only fbsB and pilus island 2b were present in all seven strains. The complete virulence gene profile is given in Table S1.

Table 3. Surface protein markers, insertion sequence elements and putative virulence genes in the genomes of ST7 strains. The genetic determinants were identified using BLASTp against CDSs of each strain.

| Category | Gene determinant | FNA07 | FPrA02 | ENC06 | GD201008-001 | ZQ0910 | CF01173 | A909 |
|---|---|---|---|---|---|---|---|---|
| Surface proteins | αC protein (bca) | +a | – | +a | +a | +a | – | + |
| | βC protein (cba) | – | + | – | + | + | – | + |
| Insertion sequence elements | IS1381 | + | + | + | + | + | + | + |
| | IS861 | + | + | + | + | + | + | + |
| | IS1548 | – | – | – | – | – | + | – |
| | ISSag1 | – | – | – | – | – | + | + |
| | ISSag2 | + | + | + | + | + | + | + |
| | ISSa4 | – | – | – | – | – | – | – |
| | GBSi1 | + | + | + | + | – | + | – |
| Putative virulence genes | Laminin-binding protein (lmb) | – | – | – | – | – | + | + |
| | Agglutinin receptor (ssp5) | + | – | + | – | – | – | – |
| | Fibrinogen-binding protein A (fbsA) | – | + | – | + | + | – | + |
| | Fibrinogen-binding protein B (fbsB) | + | +b | + | + | + | + | + |
| | Pilus island 1 | + | + | + | – | – | + | + |
| | Pilus island 2a | – | – | – | – | – | – | – |
| | Pilus island 2b | + | + | + | + | + | + | + |
| | C5a peptidase (scpB) | – | – | – | – | – | + | + |

a Gene partially present due to sequence deletion.
b Gene was interrupted by the introduction of a mobile element.
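The statement that only fbsB and pilus island 2b occur in all seven strains can be checked directly against the presence/absence calls in Table 3. The following is a minimal sketch, not part of the original analysis: the calls are transcribed by hand from the putative virulence gene rows of Table 3, and the short gene labels and function names are illustrative choices.

```python
# Presence (+) / absence (-) calls transcribed from the putative virulence
# gene rows of Table 3. Strain order follows the table columns.
STRAINS = ["FNA07", "FPrA02", "ENC06", "GD201008-001", "ZQ0910", "CF01173", "A909"]

VIRULENCE_CALLS = {
    "lmb":   "-----++",
    "ssp5":  "+-+----",
    "fbsA":  "-+-++-+",
    "fbsB":  "+++++++",  # interrupted by a mobile element in FPrA02, still scored '+'
    "PI-1":  "+++--++",
    "PI-2a": "-------",
    "PI-2b": "+++++++",
    "scpB":  "-----++",
}


def strains_with(gene):
    """Return the strains in which `gene` is scored as present."""
    return [s for s, call in zip(STRAINS, VIRULENCE_CALLS[gene]) if call == "+"]


def shared_by_all():
    """Return the genes scored as present in every strain."""
    return [g for g, calls in VIRULENCE_CALLS.items() if set(calls) == {"+"}]


if __name__ == "__main__":
    print("Shared by all strains:", shared_by_all())
    print("Strains carrying fbsA:", strains_with("fbsA"))
```

Encoding each row as a string keeps the transcription easy to compare with the table by eye; a real workflow would more likely derive the calls from the BLASTp hits themselves.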

3.3. Core and pan-genome analysis

The genes of FNA07, FPrA02 and ENC06 were categorized into 1393, 1383 and 1392 OrthoMCL groups, respectively, by OrthoMCL-DB. Identification of common OrthoMCL groups revealed that 1374 were shared among the Thai ST7 isolates and 1346 were shared among the Thai, Chinese piscine and human ST7 isolates (FNA07, FPrA02, ENC06, GD201008-001 and A909), as shown in Fig. 3A. The unique OrthoMCL groups of the Thai ST7 isolates are listed in the Supplementary material (Table S2).
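Counting shared and unique ortholog groups of this kind reduces to set operations on the group identifiers assigned to each strain. The sketch below illustrates the idea only; it does not reproduce the OrthoMCL-DB workflow, and the group identifiers (`OG_1`, `OG_2`, ...) and function names are invented for the example.

```python
def core_groups(groups_by_strain):
    """Return the ortholog groups present in every strain (the core set)."""
    sets = [set(groups) for groups in groups_by_strain.values()]
    return set.intersection(*sets)


def unique_groups(groups_by_strain, strain):
    """Return the ortholog groups found in `strain` and in no other strain."""
    others = set()
    for name, groups in groups_by_strain.items():
        if name != strain:
            others |= set(groups)
    return set(groups_by_strain[strain]) - others


# Toy input: hypothetical group IDs, far fewer than the ~1390 groups
# reported per Thai isolate.
EXAMPLE = {
    "FNA07": {"OG_1", "OG_2", "OG_3", "OG_7"},
    "FPrA02": {"OG_1", "OG_2", "OG_4"},
    "ENC06": {"OG_1", "OG_2", "OG_3"},
    "GD201008-001": {"OG_1", "OG_2", "OG_5"},
    "A909": {"OG_1", "OG_2", "OG_6"},
}

if __name__ == "__main__":
    print("Core groups:", sorted(core_groups(EXAMPLE)))
    print("Unique to FNA07:", sorted(unique_groups(EXAMPLE, "FNA07")))
```

Restricting the dictionary to the three Thai strains before calling `core_groups` would give the Thai-only shared set, which is how a figure such as the 1374 shared groups would be obtained from real assignments.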

A “BLAST against all” search by the SEED viewer showed that the seven ST7 isolates shared 1920 genes while the Thai piscine ST7 isolates shared 2012 genes. The SEED viewer identified 65 genes that were specific to the Thai ST7 isolates. Of these, 7, 6 and 1 were unique to FNA07, FPrA02 and ENC06, respectively; 43 were shared only by FNA07 and ENC06 and 2 only by FPrA02 and ENC06; and 6 were shared by all three Thai ST7 isolates (Supplementary material; Table S3). An alignment of the ST7 genomes (Fig. 3B) showed that prophage LambdaSa03 and the lmb-scb locus were found in human ST7 A909 but not in the fish ST7s (except CF01173, in which the lmb-scb locus was observed), while prophage LambdaSa05 was found in all but the Chinese ST7 fish strains (GD201008-001 and ZQ0910). A MAUVE alignment (Fig. 3C) identified a 48-kb gene island that was specific to FNA07 and ENC06. The island contains 28 ORFs encoding Snf2 family protein, calcium-binding protein, agglutinin receptor, abortive infection proteins I & II, Tn5252 ORF proteins, C-5 cytosine specific DNA methylase and hypothetical proteins. The amino acid sequences and gene order of the island were very similar to those in the GBS strains STIR-CD-21 and 2603V/R (serotype V). It should be noted that a strain-specific 10-kb gene island that was previously reported in the fish GD201008-001 strain (Liu et al., 2013) was also observed in the Thai ST7 isolates.

Fig. 2. Insertion of transposable elements within the fibrinogen-binding protein-encoding gene (fbsB) in Thai ST7 strain FPrA02. (A) Chinese ST7 strain GD201008-001, which has no insertion. (B) Thai ST7 strain FPrA02, which has an insertion. Arrows indicate the orientations of the genes.

Fig. 3. Core and pan-genome analysis of human and fish ST7 strains. (A) Venn diagram representing the number of common and unique OrthoMCL groups among five ST7 strains. The detailed information on the unique OrthoMCL groups of the Thai piscine ST7 isolates (indicated by underlined text) is shown in Table S2. (B) Comparison of genomes of ST7 isolates. The RAST-annotated genomes of fish ST7 isolates (ENC06, FNA07, FPrA02, ZQ0910 and GD201008-001) were aligned using human ST7 strain A909 as a reference sequence. The genomes were aligned and the figure was created by the SEED Viewer (rast.nmpdr.org/seedviewer.cgi). Black bars outside of the outer circle indicate the locations of additional loci in A909. (C) Progressive MAUVE-based alignment of the genomes of three ST7 isolates, showing a gene island specifically in FNA07. The gene island is also present in ENC06. The genes in the gene island and their order are very similar to those in S. agalactiae strain 2603V/R.

3.4. Phylogenetic tree construction

The seven ST7 GBS genomes contained a total of 290 SNPs located within the core genome. In an unrooted phylogenetic tree based on the concatenated SNPs (Fig. 4A), the ST7 isolates clustered according to their hosts and geographical origins. The three Thai ST7 isolates were grouped into the same clade, distinctly separate from the branch containing the Chinese (GD201008-001 and ZQ0910) and UK (CF01173) isolates. The human ST7 strain A909 showed significant evolutionary diversification from the fish ST7s, as indicated by the tree's topology and the large differences in the number of SNPs (Fig. 4B).

Fig. 4. Relationships between 290 SNPs contained in the core genomes of human and fish ST7 isolates. The SNPs were called by CSI Phylogeny 1.0a (Kaas et al., 2014). (A) Unrooted phylogenetic tree based on the concatenated SNPs. The SNPs were concatenated and aligned, and the tree was constructed by CSI Phylogeny 1.0a. Scale bar represents the nucleotide substitutions per site. (B) Number of SNP counts between pairs of ST7 isolates.

4. Discussion

The current study aimed to investigate the phylogenetic relationship between Thai piscine ST7s and other ST7s which have emerged in different hosts and different geographical regions. The basic genome characteristics and subsystem statistics of Thai ST7 were similar to those of the other fish and human ST7 isolates (Table 2). The genome size of fish pathogenic CC7 GBS (ST6 and ST7) varies from 2.01 to 2.06 Mb, while two other lineages of fish pathogenic GBS serotype Ib, ST260–261 and 553, have smaller genomes (~1.8 Mb) (Liu et al., 2013; Pereira Ude et al., 2013). The reduced genome size has been attributed to the host specialization process (Liu et al., 2013; Rosinski-Chupin et al., 2013). Among piscine ST7 isolates, some noticeable differences could also be observed in the current study. The LambdaSa04 prophage was found in all ST7 isolates, with the sequence and the location within the genome being very similar. The LambdaSa04 prophage may have been inserted in the genome of the ancestor of the fish and human ST7s and vertically transmitted to the descendant strains during the evolutionary process. Nevertheless, the absence of the LambdaSa05 prophage from the Chinese piscine ST7 genomes (GD201008-001 and ZQ0910) suggests that the evolution of the Chinese genomes was distinct from that of the Thai piscine ST7s. The Thai and Chinese ST7 isolates shared 3 proximal CRISPR spacers (Fig. 1). The spacers in the CRISPR region chronologically correspond to the host's exposure to plasmids/phages, in which the leader-distal spacers represent ancient exposure and the leader-proximal spacers represent more recent exposure (Lopez-Sanchez et al., 2012; Sorek et al., 2008). Therefore, the sharing of the proximal spacers by the Thai and Chinese ST7 isolates suggests that they share a very recent common ancestor. However, the lack of distal spacers (63, 64, 67 and 107–108; Fig. 1) in the Thai ST7s, which might be associated with the deletion of older spacers (Lopez-Sanchez et al., 2012), is a further sign that these isolates have evolutionarily diverged from the Chinese ST7s.
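The spacer comparison behind this argument can be expressed as a comparison of ordered lists, read from the leader end. The sketch below is illustrative only: the spacer numbers come from the text, but the order of the distal spacers in the Chinese array is an assumption made for the example (the actual order is shown in Fig. 1), and the function and variable names are hypothetical.

```python
def shared_leader_proximal(array_a, array_b):
    """Return the run of identical spacers starting at the leader end."""
    shared = []
    for a, b in zip(array_a, array_b):
        if a != b:
            break
        shared.append(a)
    return shared


def missing_from(reference, query):
    """Return spacers in `reference` that are absent from `query`."""
    present = set(query)
    return [s for s in reference if s not in present]


# Spacer arrays listed from the leader end outward.
THAI_ST7 = [104, 105, 106]
# Assumed ordering of the distal spacers; for illustration only.
CHINESE_ST7_EXAMPLE = [104, 105, 106, 107, 108, 63, 64, 67]

if __name__ == "__main__":
    print("Shared proximal spacers:",
          shared_leader_proximal(THAI_ST7, CHINESE_ST7_EXAMPLE))
    print("Absent from Thai ST7:",
          missing_from(CHINESE_ST7_EXAMPLE, THAI_ST7))
```

Comparing from the leader end mirrors the reasoning in the text: a shared leader-proximal run points to shared recent history, while spacers missing further from the leader are consistent with later loss in one lineage.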
The only antibiotic resistance gene in the ST7 genomes was a putative kanamycin resistance gene in FPrA02. This gene significantly matched a partial sequence of the gene encoding aminoglycoside 3′-phosphotransferase (aphA) and was contained in a Tn903 transposable element. Thus, FPrA02 may have horizontally acquired this gene from related bacteria within the shared environment (Ouellette et al., 1987). aphA was found to be associated with the occurrence of a small population (10%) of high-level kanamycin-resistant strains in human GBS (Poyart et al., 2003). However, the kanamycin resistance gene of FPrA02 does not appear to be functional since the FPrA02-associated aphA contains an incomplete aminoglycoside phosphotransferase domain and the proton acceptor active site (197D199 in the aphA of E. coli) could not be found in our sequence.

Repetitive sequences in the αC protein-encoding gene (bca) were deleted in the Thai and Chinese piscine ST7 isolates, while FPrA02 lacked the whole bca gene. The deletion of the repetitive sequences was attributed to slipped-strand mispairing during DNA replication (Lindahl et al., 2005). Intriguingly, this has been hypothesized as a GBS mechanism to evade the host defensive responses since the reduction of repeat motifs could alter the immunological properties of the αC antigen (Lindahl et al., 2005). This suggests that the common ancestor of the Thai and Chinese piscine ST7 isolates was exposed to the same host defense responses, which could help explain the identical bca sequences in the descendant strains.

The variation in pilus island profiles that we observed between the Thai and Chinese piscine ST7s (Table 3) is consistent with the finding that fish GBS isolates carry either 1) PI-2b or 2) PI-1 plus PI-2b in equal proportions (Liu et al., 2013). The virulence gene profile also varied in strains FNA07 and ENC06, which lack the gene encoding fibrinogen-binding protein A (fbsA). To the best of our knowledge, deletion of fbsA in fish-derived GBS strains has not yet been reported. FbsA is a cell wall-associated, highly repetitive adhesin that promotes adherence of GBS to human epithelial cells (Schubert et al., 2004). However, it is unclear whether deletion of fbsA in Thai ST7 isolates affects their pathogenicity to fish. In FPrA02, another gene encoding a fibrinogen-binding protein, fbsB, is interrupted by a 1.1-kb transposable element (Fig. 2). Thus, the acquisition of transposable elements by GBS isolates may, in addition to driving genomic diversity (Hery-Arnaud et al., 2005), also suppress the expression of certain genes.
Analysis of the core genomes by the SEED viewer revealed 1920 genes that were shared by the ST7 isolates. Unsurprisingly, a comparison of the much more diverse genomes of 15 GBS strains representing different clonal complexes (CCs), including CC7 and 552 which experienced massive genome erosion, and different hosts revealed only 1202 shared orthologs (Liu et al., 2013). The finding that a 10-kb gene island in the Chinese piscine strains, encoding LPXTG cell wall surface


and DNA-binding proteins (Liu et al., 2013) was also present in the genomes of Thai ST7 isolates suggests that the gene island could be important in the virulence of GBS in a fish model. In the pan-genomic analysis, 65 genes were identified as Thai ST7-specific genes (Supplementary material; Table S3). About seven of these genes (10.7%) appeared to have incomplete CDSs due to partial deletions and the presence of premature stop-codons. At least 40 of the ST7-specific genes exhibited high amino acid sequence similarity (98 to 100%) to the tilapia STIR-CD-21 isolate (which also belongs to ST7). Unfortunately, the geographical origin of STIR-CD-21 is not publicly available. Intriguingly, two isolates of Thai ST7 (FNA07 and ENC06) specifically harbored a 48-kb gene island which is very similar to those in the genomes of STIR-CD-21 and the serotype V GBS strain 2603V/R. Serotype V emerged as the major cause of invasive GBS disease in non-pregnant adults (Skoff et al., 2009), but there is no evidence that it is linked with disease outbreaks in fish. The BLASTp analysis revealed that several genes in the 48-kb gene island (SNF2 family, C-5 cytosine methylase, Tn5252 ORFs 21, 23, 25, 26 and 28) were homologous to the conjugative transposon Tn5252 originally identified in a human pneumococcal disease-associated pathogen S. pneumoniae (Ayoubi et al., 1991). Notably, the 5′-upstream region of the FNA07-, ENC06- and 2603V/R-derived gene island contained three CDSs corresponding to Tn5252 terminal region-clustered genes, i.e. DNA relaxase, ORF9 and ORF10 (Fig. 3C). Since the discovery of Tn5252 in Streptococcus pneumoniae, Tn5252 homologues have been identified in numerous streptococcal species, such as Streptococcus suis and GBS, which suggests that it has a high capability of conjugal transfer (Alarcon-Chaidez et al., 1997). 
This clearly shows that the Tn5252associated region contained in the genomes of Thai ST7 strains FNA07 and ENC06 (and also STIR-CD-21) were acquired horizontally from the related streptococci. Whereas over 22,000 SNPs have been identified in the GBSs of various CCs and host origins (Delannoy et al., 2014), only 290 SNPs were found among the ST7 isolates in this study, which confirms the previously recognized close relationship among the ST7 strains (Delannoy et al., 2014; Liu et al., 2013; Rosinski-Chupin et al., 2013). Interestingly, despite the small numbers of SNPs, the branches of the SNP-based tree of the ST7 isolates could be separated according to their host and geographical origins (Fig. 4A). The tree's topology and the SNPs counted in this study suggest that the common ancestor of the Thai piscine ST7s is more recent than the ancestor of other fish ST7 strains. In summary, the genomic comparison of ST7 strains established significant similarities between Thai piscine ST7 and other fish originated ST7s, especially the Chinese piscine ST7 strains. It is plausible that the fish ST7 strains might have originated from a common ancestor. But, as time went on, it appears that the Thai ST7 strains diverged through the acquisition of extrachromosomal genetic contents, such as the Tn5252 region, from related streptococci. Further studies are needed to determine the serological responses to the homologous and heterologous ST7 strains to understand whether the genomic differences among fish ST7 strains affect their antigenic properties. Composition of prophage derived from GBS ST7 strains FNA07, FPrA02, ENC06, GD201008-001 and A909 (Fig. S1). Composition of 16-kb incomplete prophages derived from strains FNA07, FPrA02, and ENC06 (Fig. S2). Distribution of putative virulence genes among GBS genomes (Table S1). FNA07-, FPrA02- and ENC06-associated OrthoMCL groups identified by OrthoMCL-DB (Table S2). 
FNA07, FPrA02 and ENC06 unique proteins identified by SEED viewer (Table S3).
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.meegid.2015.10.009.

Acknowledgement

This work was supported by a grant from the In-depth Research Cluster Project of Chulalongkorn University's Strategy (Ratchadaphiseksomphot Endowment Fund; Grant number: CU-56-506-FW), the Rachadapisek Sompote Fund for Postdoctoral Fellowship of Chulalongkorn University and Chulalongkorn University's PhD Scholarship for Research Abroad (D-RSAB). The authors would like to thank Mr. Daniel Montefusco and Mr. Jim Raymond for their help in proofreading this paper.

References

Alarcon-Chaidez, F., Sampath, J., Srinivas, P., Vijayakumar, M.N., 1997. Tn5252: a model for complex streptococcal conjugative transposons. Adv. Exp. Med. Biol. 418, 1029–1032.
Ayoubi, P., Kilic, A.O., Vijayakumar, M.N., 1991. Tn5253, the pneumococcal omega (cat tet) BM6001 element, is a composite structure of two conjugative transposons, Tn5251 and Tn5252. J. Bacteriol. 173, 1617–1622.
Darling, A.E., Mau, B., Perna, N.T., 2010. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One 5, e11147.
Delannoy, C.M., Crumlish, M., Fontaine, M.C., Pollock, J., Foster, G., Dagleish, M.P., Turnbull, J.F., Zadoks, R.N., 2013. Human Streptococcus agalactiae strains in aquatic mammals and fish. BMC Microbiol. 13, 41.
Delannoy, C.M., Zadoks, R.N., Crumlish, M., Rodgers, D., Lainson, F.A., Ferguson, H.W., Turnbull, J., Fontaine, M.C., 2014. Genomic comparison of virulent and non-virulent Streptococcus agalactiae in fish. J. Fish Dis. http://dx.doi.org/10.1111/jfd.12319.
Evans, J.J., Bohnsack, J.F., Klesius, P.H., Whiting, A.A., Garcia, J.C., Shoemaker, C.A., Takahashi, S., 2008. Phylogenetic relationships among Streptococcus agalactiae isolated from piscine, dolphin, bovine and human sources: a dolphin and piscine lineage associated with a fish epidemic in Kuwait is also associated with human neonatal infections in Japan. J. Med. Microbiol. 57, 1369–1376.
Grissa, I., Vergnaud, G., Pourcel, C., 2007. CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res. 35, W52–W57.
Hery-Arnaud, G., Bruant, G., Lanotte, P., Brun, S., Rosenau, A., van der Mee-Marquet, N., Quentin, R., Mereghetti, L., 2005. Acquisition of insertion sequences and the GBSi1 intron by Streptococcus agalactiae isolates correlates with the evolution of the species. J. Bacteriol. 187, 6248–6252.
Jones, N., Bohnsack, J.F., Takahashi, S., Oliver, K.A., Chan, M.S., Kunst, F., Glaser, P., Rusniok, C., Crook, D.W., Harding, R.M., Bisharat, N., Spratt, B.G., 2003. Multilocus sequence typing system for group B streptococcus. J. Clin. Microbiol. 41, 2530–2536.
Kaas, R.S., Leekitcharoenphon, P., Aarestrup, F.M., Lund, O., 2014. Solving the problem of comparing whole bacterial genomes across different sequencing platforms. PLoS One 9, e104984.
Kayansamruaj, P., Pirarat, N., Katagiri, T., Hirono, I., Rodkhum, C., 2014a. Molecular characterization and virulence gene profiling of pathogenic Streptococcus agalactiae populations from tilapia (Oreochromis sp.) farms in Thailand. J. Vet. Diagn. Investig. 26, 488–495.
Kayansamruaj, P., Pirarat, N., Kondo, H., Hirono, I., Rodkhum, C., 2014b. Draft genome sequences of Streptococcus agalactiae strains isolated from Nile tilapia (Oreochromis niloticus) farms in Thailand. Genome Announc. 2, e01300–e01314.
Klesius, P.H., Shoemaker, C.A., Evans, J.J., 2008. Streptococcus: a worldwide fish health problem. 8th International Symposium on Tilapia in Aquaculture, Cairo, pp. 83–107.
Kong, F., Martin, D., James, G., Gilbert, G.L., 2003. Towards a genotyping system for Streptococcus agalactiae (group B streptococcus): use of mobile genetic elements in Australasian invasive isolates. J. Med. Microbiol. 52, 337–344.
Li, L., Stoeckert Jr., C.J., Roos, D.S., 2003. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13, 2178–2189.
Lindahl, G., Stalhammar-Carlemalm, M., Areschoug, T., 2005. Surface proteins of Streptococcus agalactiae and related proteins in other bacterial pathogens. Clin. Microbiol. Rev. 18, 102–127.
Liu, G., Zhang, W., Lu, C., 2013. Comparative genomics analysis of Streptococcus agalactiae reveals that isolates from cultured tilapia in China are closely related to the human strain A909. BMC Genomics 14, 775.
Lopez-Sanchez, M.J., Sauvage, E., Da Cunha, V., Clermont, D., Ratsima Hariniaina, E., Gonzalez-Zorn, B., Poyart, C., Rosinski-Chupin, I., Glaser, P., 2012. The highly dynamic CRISPR1 system of Streptococcus agalactiae controls the diversity of its mobilome. Mol. Microbiol. 85, 1057–1071.
Ouellette, M., Gerbaud, G., Lambert, T., Courvalin, P., 1987. Acquisition by a Campylobacter-like strain of aphA-1, a kanamycin resistance determinant from members of the family Enterobacteriaceae. Antimicrob. Agents Chemother. 31, 1021–1026.
Overbeek, R., Olson, R., Pusch, G.D., Olsen, G.J., Davis, J.J., Disz, T., Edwards, R.A., Gerdes, S., Parrello, B., Shukla, M., Vonstein, V., Wattam, A.R., Xia, F., Stevens, R., 2014. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 42, D206–D214.
Pereira Ude, P., Rodrigues Dos Santos, A., Hassan, S.S., Aburjaile, F.F., Soares Sde, C., Ramos, R.T., Carneiro, A.R., Guimaraes, L.C., Silva de Almeida, S., Diniz, C.A., Barbosa, M.S., Gomes de Sa, P., Ali, A., Bakhtiar, S.M., Dorella, F.A., Zerlotini, A., Araujo, F.M., Leite, L.R., Oliveira, G., Miyoshi, A., Silva, A., Azevedo, V., Figueiredo, H.C., 2013. Complete genome sequence of Streptococcus agalactiae strain SA20-06, a fish pathogen associated to meningoencephalitis outbreaks. Stand. Genomic Sci. 8, 188–197.
Poyart, C., Jardy, L., Quesne, G., Berche, P., Trieu-Cuot, P., 2003. Genetic basis of antibiotic resistance in Streptococcus agalactiae strains isolated in a French hospital. Antimicrob. Agents Chemother. 47, 794–797.
Rosinski-Chupin, I., Sauvage, E., Mairey, B., Mangenot, S., Ma, L., Da Cunha, V., Rusniok, C., Bouchier, C., Barbe, V., Glaser, P., 2013. Reductive evolution in Streptococcus agalactiae and the emergence of a host adapted lineage. BMC Genomics 14, 252.
Schubert, A., Zakikhany, K., Pietrocola, G., Meinke, A., Speziale, P., Eikmanns, B.J., Reinscheid, D.J., 2004. The fibrinogen receptor FbsA promotes adherence of Streptococcus agalactiae to human epithelial cells. Infect. Immun. 72, 6197–6205.


Skoff, T.H., Farley, M.M., Petit, S., Craig, A.S., Schaffner, W., Gershman, K., Harrison, L.H., Lynfield, R., Mohle-Boetani, J., Zansky, S., Albanese, B.A., Stefonek, K., Zell, E.R., Jackson, D., Thompson, T., Schrag, S.J., 2009. Increasing burden of invasive group B streptococcal disease in nonpregnant adults, 1990–2007. Clin. Infect. Dis. 49, 85–92.
Sorek, R., Kunin, V., Hugenholtz, P., 2008. CRISPR: a widespread system that provides acquired resistance against phages in bacteria and archaea. Nat. Rev. Microbiol. 6, 181–186.
Suanyuk, N., Kong, F., Ko, D., Gilbert, G.L., Supamattaya, K., 2008. Occurrence of rare genotypes of Streptococcus agalactiae in cultured red tilapia Oreochromis sp. and Nile tilapia O. niloticus in Thailand: relationship to human isolates? Aquaculture 284, 35–40.
Tettelin, H., Masignani, V., Cieslewicz, M.J., Donati, C., Medini, D., Ward, N.L., Angiuoli, S.V., Crabtree, J., Jones, A.L., Durkin, A.S., Deboy, R.T., Davidsen, T.M., Mora, M., Scarselli, M., Margarit y Ros, I., Peterson, J.D., Hauser, C.R., Sundaram, J.P., Nelson, W.C., Madupu, R., Brinkac, L.M., Dodson, R.J., Rosovitz, M.J., Sullivan, S.A., Daugherty, S.C., Haft, D.H., Selengut, J., Gwinn, M.L., Zhou, L., Zafar, N., Khouri, H., Radune, D., Dimitrov, G., Watkins, K., O'Connor, K.J., Smith, S., Utterback, T.R., White, O., Rubens, C.E., Grandi, G., Madoff, L.C., Kasper, D.L., Telford, J.L., Wessels, M.R., Rappuoli, R., Fraser, C.M., 2005. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial “pan-genome”. Proc. Natl. Acad. Sci. U. S. A. 102, 13950–13955.
Timoney, J.F., 2010. Streptococcus. In: Gyles, C.L., Prescott, J.F., Songer, G., Thoen, C.O. (Eds.), Pathogenesis of Bacterial Infections in Animals, 4th ed. Blackwell Publishing, Iowa, USA, pp. 51–74.
Zankari, E., Hasman, H., Cosentino, S., Vestergaard, M., Rasmussen, S., Lund, O., Aarestrup, F.M., Larsen, M.V., 2012. Identification of acquired antimicrobial resistance genes. J. Antimicrob. Chemother. 67, 2640–2644.
Zhou, Y., Liang, Y., Lynch, K.H., Dennis, J.J., Wishart, D.S., 2011. PHAST: a fast phage search tool. Nucleic Acids Res. 39, W347–W352.