
GENOME ANNOUNCEMENT

Draft Genome Sequence of Staphylococcus massiliensis Strain 5402776T

Véronique Roux, Catherine Robert, Grégory Gimenez, and Didier Raoult

Aix Marseille Université, URMITE, UM63 CNRS 7278, IRD 198, Inserm 1095, Faculté de Médecine, Marseille, France

A draft genome sequence of Staphylococcus massiliensis, a Gram-positive coccus isolated from a human brain abscess sample, is described here. One clustered regularly interspaced short palindromic repeat, three transposases, six putative transposases, and one potential provirus were characterized.

Staphylococcus species belong to the low-G+C-content Gram-positive group in the Firmicutes division of bacteria. Staphylococcus massiliensis was first isolated from a human brain abscess sample (1) and was subsequently proposed to be part of the human skin microflora (13). It is a coagulase-negative Staphylococcus, and its pathogenicity in humans is unknown. To date, seven full-length Staphylococcus species genomes have been deposited in the GenBank database.

Both shotgun sequencing and 3-kb paired-end (PE) sequencing were performed. The shotgun library was constructed with the GS Rapid Library Prep kit (Roche). The PE library was constructed according to the 454 GS FLX Titanium paired-end protocol. The run was analyzed on the cluster through the GS RunBrowser and Newbler assembler 2.5.3. A total of 286,045 passed-filter wells were obtained from both strategies and generated 96.9 Mb with an average length of 314 bp.

The draft genome of S. massiliensis (about 2.36 Mb) contained eight scaffolds and 82 contigs (>500 bp). The G+C content was 36.5%, which is significantly higher than the G+C content of most other sequenced staphylococci. The genome contains 58 tRNA genes with one tmRNA and has 2,252 coding sequences. Of these coding sequences, 92.8% could be assigned to clusters of orthologous groups (COG) families. SLEP identified 710 surface proteins: 110 exported proteins, 6 lipoproteins, 3 wall proteins, and 591 membrane proteins (4). The sequences of the rRNAs were found in four contigs, but their average coverage was 128×, compared with an average coverage of 34.3× for the entire genome (a ratio of about 3.7). It was therefore possible to conclude that several (between three and four) ribosomal operons were present in the genome.

Open reading frames (ORFs) were predicted by using Prodigal (http://prodigal.ornl.gov/) with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region.
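The gap-exclusion rule for predicted ORFs can be illustrated with a minimal sketch. It assumes that sequencing gaps appear in the scaffold sequence as runs of N and that ORF coordinates are 0-based, half-open intervals; neither detail is stated in the text. The function names (`find_gaps`, `drop_gap_spanning_orfs`) and the toy scaffold are hypothetical and are not part of Prodigal or of the authors' pipeline.

```python
import re
from typing import List, Tuple

# An interval is (start, end), 0-based and half-open: [start, end).
Interval = Tuple[int, int]


def find_gaps(scaffold: str) -> List[Interval]:
    """Locate gap regions, assumed here to be runs of N/n in the scaffold."""
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", scaffold)]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def drop_gap_spanning_orfs(orfs: List[Interval], gaps: List[Interval]) -> List[Interval]:
    """Keep only the ORFs that do not overlap any gap region."""
    return [orf for orf in orfs if not any(overlaps(orf, gap) for gap in gaps)]


# Toy scaffold (illustrative only): 15 bp of sequence, a 10-bp gap, 9 bp of sequence.
scaffold = "ATGAAATTTGGGTAA" + "N" * 10 + "ATGCCCTAG"

# Hypothetical ORF predictions: one before the gap, one across it, one after it.
orfs = [(0, 15), (9, 30), (25, 34)]

gaps = find_gaps(scaffold)
kept = drop_gap_spanning_orfs(orfs, gaps)
print(kept)
```

In this sketch an ORF that merely abuts a gap is kept, and only an ORF sharing at least one position with a run of N is dropped; a stricter rule (for example, requiring a minimum distance from the gap) would need a different overlap test.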
The predicted bacterial protein sequences were searched against the National Center for Biotechnology Information nonredundant (NR), UniProt (8), and CharProtDB (7) databases by using BLASTP and against the COG (10) database by using RPSBLAST (2). The ARAGORN tool (6) was used to find tRNA genes, whereas rRNAs were found by using RNAmmer (5) and BLASTn against the NR database. Proteins were also checked for domains by using HMM against the Pfam database; 1,852 proteins (82%) have at least one Pfam domain (9).

Reciprocal BLAST analysis with the available full-length Staphylococcus genomes indicated that S. massiliensis shares 1,469 orthologs with S. aureus, 1,531 with S. carnosus, 1,506 with S. haemolyticus, 1,517 with S. lugdunensis, 1,514 with S. pseudintermedius, 1,530 with S. saprophyticus, and 1,469 with S. epidermidis.

One clustered regularly interspaced short palindromic repeat (11), which included nine predicted spacer regions, was found by using the CRISPRFinder program online (http://crispr.u-psud.fr/Server/). The mobile genetic elements characterized were IS431, IS66 family, and ISL3 family elements and six putative transposases. The PHAST server (12) and Prophage Finder (3) were used to identify potential proviruses. One potential prophage was identified.

Nucleotide sequence accession numbers. The S. massiliensis strain 5402776T whole-genome shotgun project has been deposited at DDBJ/EMBL/GenBank under accession number AKGE00000000. The version described in this paper is the first version, AKGE01000000.

ACKNOWLEDGMENTS

We are grateful to Laetitia Pizzo for her technical assistance. This study was supported by the Fondation Méditerranée Infection.

REFERENCES

1. Al Masalma M, Raoult D, Roux V. 2010. Staphylococcus massiliensis sp. nov., isolated from a human brain abscess. Int. J. Syst. Evol. Microbiol. 60:1066–1072.
2. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.
3. Bose M, Barber RD. 2006. Prophage Finder: a prophage loci prediction tool for prokaryotic genome sequences. In Silico Biol. 6:223–227.
4. Giombini E, Orsini M, Carrabino D, Tramontano A. 2010. An automatic method for identifying surface proteins in bacteria: SLEP. BMC Bioinformatics 11:39. doi:10.1186/1471-2105-11-39.
5. Lagesen K, et al. 2007. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 35:3100–3108.
6. Laslett D, Canback B. 2004. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res. 32:11–16.
7. Madupu R, et al. 2012. CharProtDB: a database of experimentally characterized protein annotations. Nucleic Acids Res. 40(Database issue):D237–D241.
8. Magrane M, Consortium U. 2011. UniProt Knowledgebase: a hub of integrated protein data. Database (Oxford) 2011:bar009.
9. Punta M, et al. 2012. The Pfam protein families database. Nucleic Acids Res. 40:D290–D301.

Received 29 September 2012. Accepted 3 October 2012.
Address correspondence to Véronique Roux, [email protected].
Copyright © 2012, American Society for Microbiology. All Rights Reserved. doi:10.1128/JB.01909-12

Journal of Bacteriology, December 2012, Volume 194, Number 24, p. 6984–6985

10. Tatusov RL, Galperin MY, Natale DA, Koonin EV. 2000. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28:33–36.
11. Wiedenheft B, Sternberg SM, Doudna JA. 2012. RNA-guided genetic silencing systems in bacteria and archaea. Nature 482:331–338.


12. Zhou Y, Liang Y, Lynch KH, Dennis JJ, Wishart DS. 2011. PHAST: a fast phage search tool. Nucleic Acids Res. 39(Web Server issue):W347–W352.
13. Zong Z. 2012. The newly-recognized Staphylococcus massiliensis is likely to be part of the human skin microflora. Antonie Van Leeuwenhoek 101:449–451.
