SEQUENCING AND DE NOVO ASSEMBLY OF THE DIGESTIVE GLAND TRANSCRIPTOME IN Mytilus galloprovincialis AND ANALYSIS OF DIFFERENTIALLY EXPRESSED GENES IN RESPONSE TO DOMOIC ACID
Ventoso P. a, Martínez-Escauriaza R. a, Sánchez J.L. a, Pérez-Parallé M.L. a, Blanco J. b, Triviño J.C. c, Pazos A.J. a, §
a Departamento de Bioquímica y Biología Molecular, Instituto de Acuicultura, Universidad de Santiago de Compostela, Santiago de Compostela, 15782, Spain
b Centro de Investigacións Mariñas, Xunta de Galicia, Pedras de Corón s/n Apdo 13, Vilanova de Arousa, 36620, Spain
c Sistemas Genómicos, Ronda G. Marconi 6, Paterna (Valencia), 46980, Spain
INTRODUCTION The mussel Mytilus galloprovincialis is an important aquaculture resource in Spain. Bivalve molluscs are filter-feeding species that can accumulate biotoxins in their body tissues during harmful algal blooms. Amnesic Shellfish Poisoning (ASP) is caused by domoic acid, a toxin produced by species of the diatom genus Pseudo-nitzschia.
MATERIALS AND METHODS The M. galloprovincialis digestive gland transcriptome was assembled de novo from the sequencing of 12 cDNA libraries, six obtained from control mussels and six from mussels naturally exposed to domoic acid. The mussels were collected from rafts in the Ría de Vigo, Spain. The cDNA libraries were sequenced by paired-end sequencing (2 x 100 bp) on an Illumina HiSeq 2000 sequencer. The de novo assembly was performed using the Oases and Trinity programs. Digital gene expression analysis of the digestive gland of M. galloprovincialis following exposure to domoic acid was then performed using the DESeq2 algorithm. Genes were considered significantly differentially expressed if the absolute fold change was > 1.5 and the p-value was < 0.05. A functional enrichment study was performed using the Pfam functional information of the differentially expressed genes.
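The selection rule stated above (absolute fold change > 1.5 and p-value < 0.05) can be illustrated with a minimal Python sketch. It assumes, as DESeq2 output does, that fold changes are available on the log2 scale, so the cutoff becomes |log2 fold change| > log2(1.5). The record type `GeneResult`, the function names and the example values are hypothetical and are not taken from the study. The source does not say whether the p-value was adjusted for multiple testing, so the sketch simply takes whichever p-value is supplied.

```python
import math
from typing import Iterable, List, NamedTuple, Tuple


class GeneResult(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    gene_id: str
    log2_fold_change: float  # treated vs control, log2 scale (assumption)
    p_value: float


FOLD_CHANGE_CUTOFF = 1.5  # absolute fold change threshold from the text
P_VALUE_CUTOFF = 0.05     # p-value threshold from the text


def classify(result: GeneResult,
             fc_cutoff: float = FOLD_CHANGE_CUTOFF,
             p_cutoff: float = P_VALUE_CUTOFF) -> str:
    """Return 'up', 'down' or 'ns' (not significant) for one gene."""
    # A missing (NaN) or too-large p-value is never significant.
    if math.isnan(result.p_value) or result.p_value >= p_cutoff:
        return "ns"
    # |fold change| > 1.5 is equivalent to |log2 fold change| > log2(1.5).
    log2_cutoff = math.log2(fc_cutoff)
    if result.log2_fold_change > log2_cutoff:
        return "up"
    if result.log2_fold_change < -log2_cutoff:
        return "down"
    return "ns"


def split_by_direction(results: Iterable[GeneResult]) -> Tuple[List[str], List[str]]:
    """Return the ids of up-regulated and down-regulated genes."""
    up: List[str] = []
    down: List[str] = []
    for result in results:
        label = classify(result)
        if label == "up":
            up.append(result.gene_id)
        elif label == "down":
            down.append(result.gene_id)
    return up, down
```

Calling `split_by_direction` on a list of `GeneResult` rows returns two lists of gene ids, which is the kind of split reported as up-regulated and down-regulated unigenes.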
RESULTS AND DISCUSSION The statistics of the transcriptome sequencing and assembly are shown in Table 1. A total of 1,158 differentially expressed unigenes (Fig. 1, Table 2) were detected (686 up-regulated and 472 down-regulated). After functional enrichment of the differentially expressed genes, 66 Pfam families were found to be significantly (p < 0.05) enriched (Table 3). The enriched domains included the C1q domain, the sulfotransferase domain, the aldo/keto reductase family, the carboxylesterase family and the major facilitator superfamily. Some of these families contain genes involved in toxin metabolism and detoxification processes. In conclusion, this study provides a high-quality reference transcriptome of the M. galloprovincialis digestive gland and identifies potential genes involved in the response to domoic acid.
Table 1. Summary of Illumina transcriptome sequencing and assembly for M. galloprovincialis digestive gland.
Summary of raw reads data
  Throughput (Mb)                    64,919
  Number of reads                    705,558,488
  Filtered reads                     676,169,399
  Average read length (bp)           100
  Sequence quality ≥ Q30 (%)         95.04
  Mean quality score                 34.82
  GC (%)                             38
Summary of the assembled transcriptome
  Total number of filtered reads     676,169,399
  Number of assembled transcripts    94,727
  Number of assembled unigenes       69,294
  Contig N50 length (bp)             761
  Minimum contig length (bp)         450
  Maximum contig length (bp)         15,385
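Table 1 reports a contig N50 length. As a reminder of what that statistic measures, here is a minimal Python sketch of the usual definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembly length. The function name `n50` and the toy lengths in the usage note are illustrative only. The source does not say which tool computed the reported value.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative length first reaches at least
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")
```

For the toy lengths 2, 2, 2, 3, 3, 4, 8 and 8 (total 32), the two longest contigs already cover 16 bases, so `n50` returns 8.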
Acknowledgements: This work has been supported by the Spanish Ministry MINECO and the FEDER Funds of the EU under the project AGL2012-39972-C02.
Table 2. List of the 20 genes most up-regulated in the digestive gland of M. galloprovincialis.
Sequence description                                       Fold change  Sequence length  Blast hit ACC  E-value   Similarity
heavy metal-binding protein (C1q-like)                     18.31        978              EKC35153       7.73E-12  56
endothelin-converting enzyme 1-like                        11.56        560              XP_009049950   1.03E-40  56
c-binding protein alpha subunit                            10.15        532              EKC27057       5.11E-07  53
hypothetical protein BRAFLDRAFT_106563                     9.42         1164             ELU10480       2.94E-66  86
c1q domain containing protein 1q3                          9.42         880              XP_002599420   2.40E-67  56
von willebrand factor d and egf domain-containing protein  7.63         1351             CBX41652       1.08E-12  62
apextrin-like protein                                      7.50         637              EKC28789       2.89E-27  56
NA                                                         7.03         -                -              -         -
fibropellin-1, partial                                     6.56         649              EKC17841       6.53E-88  76
NA                                                         5.99         -                -              -         -
short-chain collagen c4                                    5.92         2009             XP_009066537   1.07E-56  49
sushi repeat-containing protein srpx2                      5.88         495              EKC33624       3.50E-13  62
2'-5'-oligoadenylate synthase 3                            5.78         547              EKC28669       2.39E-31  56
neuralized                                                 5.75         1368             EKC21335       5.41E-32  53
pats1                                                      5.53         2396             EKC29652       1.44E-20  50
hypothetical protein CGI_10028487                          5.39         764              EKC41739       1.41E-10  52
cathepsin d                                                5.28         787              XP_005098204   7.53E-16  87
NA                                                         5.14         -                -              -         -
pi-class glutathione s-transferase                         5.09         838              AAS60226       6.31E-54  66
low affinity immunoglobulin epsilon fc receptor            5.08         833              XP_004370856   3.39E-11  49
Table 1 (continued). Summary of the assembled transcriptome
  Average contig length (bp)         1,015
  Total length in contigs (bp)       70,321,015
Table 3. Pfam families significantly enriched (p