SEQUENCING AND DE NOVO ASSEMBLY OF THE DIGESTIVE GLAND TRANSCRIPTOME IN Mytilus galloprovincialis AND ANALYSIS OF DIFFERENTIALLY EXPRESSED GENES IN RESPONSE TO DOMOIC ACID

Ventoso P. a, Martínez-Escauriaza R. a, Sánchez J.L. a, Pérez-Parallé M.L. a, Blanco J. b, Triviño J.C. c, Pazos A.J. a, §

a Departamento de Bioquímica y Biología Molecular, Instituto de Acuicultura, Universidad de Santiago de Compostela, Santiago de Compostela, 15782, Spain
b Centro de Investigacións Mariñas, Xunta de Galicia, Pedras de Corón s/n Apdo 13, Vilanova de Arousa, 36620, Spain
c Sistemas Genómicos, Ronda G. Marconi 6, Paterna (Valencia), 46980, Spain

INTRODUCTION

The mussel Mytilus galloprovincialis is an important aquaculture resource in Spain. Bivalve molluscs are filter-feeding species that can accumulate biotoxins in their tissues during harmful algal blooms. Amnesic Shellfish Poisoning (ASP) is caused by domoic acid, a toxin produced by species of the diatom genus Pseudo-nitzschia.

MATERIALS AND METHODS

The M. galloprovincialis digestive gland transcriptome was assembled de novo from 12 sequenced cDNA libraries: six from control mussels and six from mussels naturally exposed to domoic acid. The mussels were collected from rafts in the Ría de Vigo, Spain. The cDNA libraries were sequenced by paired-end sequencing (2 × 100 bp) on an Illumina HiSeq 2000 sequencer. The de novo assembly was performed using the Oases and Trinity programs. Digital gene expression in the digestive gland following exposure to domoic acid was then analysed using the DESeq2 algorithm. Genes were considered significantly differentially expressed if the absolute fold change was > 1.5 and the p-value was < 0.05. A functional enrichment study of the differentially expressed genes was performed using their Pfam functional annotation.
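The selection rule above (absolute fold change > 1.5 and p-value < 0.05) can be illustrated with a minimal Python sketch. It assumes that fold changes arrive on the log2 scale, as DESeq2 reports them, and that "absolute fold change" means the change in either direction. The `GeneResult` type, the gene identifiers and the numbers are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene_id: str             # hypothetical identifier, not from the study
    log2_fold_change: float  # DESeq2 reports fold changes on a log2 scale
    p_value: float


def classify(result, min_fold=1.5, max_p=0.05):
    """Return 'up', 'down' or 'ns' (not significant) for one gene."""
    # Assumption: the fold-change cutoff is applied symmetrically,
    # so a 2-fold decrease counts the same as a 2-fold increase.
    fold = 2 ** abs(result.log2_fold_change)
    if fold > min_fold and result.p_value < max_p:
        return "up" if result.log2_fold_change > 0 else "down"
    return "ns"


# Made-up records for illustration only.
examples = [
    GeneResult("unigene_A", 2.0, 0.001),   # 4-fold up, small p-value
    GeneResult("unigene_B", -1.0, 0.01),   # 2-fold down, small p-value
    GeneResult("unigene_C", 0.3, 0.001),   # about 1.23-fold, below the cutoff
    GeneResult("unigene_D", 3.0, 0.20),    # 8-fold, but p-value too large
]
labels = {r.gene_id: classify(r) for r in examples}
```

Here `labels` maps each made-up unigene to its class; only the first two records pass both thresholds.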

Table 1. Summary of Illumina transcriptome sequencing and assembly for the M. galloprovincialis digestive gland.

Summary of raw reads data
Throughput (Mb)                   64,919
Number of reads                   705,558,488
Filtered reads                    676,169,399
Average read length (bp)          100
Sequence quality ≥ Q30 (%)        95.04
Mean quality score                34.82
GC%                               38

Summary of the assembled transcriptome
Total number of filtered reads    676,169,399
Number of assembled transcripts   94,727

Number of assembled unigenes      69,294

RESULTS AND DISCUSSION

The statistics of the transcriptome sequencing and assembly are shown in Table 1. A total of 1,158 differentially expressed unigenes were detected (Fig. 1, Table 2), of which 686 were up-regulated and 472 down-regulated. Functional enrichment analysis of the differentially expressed genes identified 66 significantly enriched Pfam families (p < 0.05; Table 3). The enriched domains included the C1q domain, the sulfotransferase domain, the aldo/keto reductase family, the carboxylesterase family and the major facilitator superfamily. Some of these families contain genes involved in toxin metabolism and detoxification processes. In conclusion, this study provides a high-quality reference transcriptome of the M. galloprovincialis digestive gland and identifies genes potentially involved in the response to domoic acid.
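The abstract does not state which statistical test was used for the Pfam enrichment. As a hedged illustration only, the sketch below uses a one-sided hypergeometric test, a common choice for this kind of analysis. The 69,294 unigenes and 1,158 differentially expressed unigenes come from the text; the family counts (40 members, 5 of them differentially expressed) are invented, and using all unigenes as the background is also an assumption.

```python
from math import comb


def hypergeom_enrichment_p(total, family_total, selected, family_selected):
    """One-sided p-value P(X >= family_selected).

    total           -- genes in the background set
    family_total    -- background genes annotated with the Pfam family
    selected        -- differentially expressed genes
    family_selected -- differentially expressed genes in the family
    """
    upper = min(family_total, selected)
    # Count the ways to draw at least `family_selected` family members
    # when sampling `selected` genes without replacement.
    favourable = sum(
        comb(family_total, k) * comb(total - family_total, selected - k)
        for k in range(family_selected, upper + 1)
    )
    return favourable / comb(total, selected)


# 69,294 and 1,158 are from the text; 40 and 5 are hypothetical counts.
p = hypergeom_enrichment_p(total=69294, family_total=40,
                           selected=1158, family_selected=5)
is_enriched = p < 0.05
```

With these invented counts fewer than one family member would be expected among the differentially expressed unigenes by chance, so observing five yields a small p-value. In practice a correction for testing many Pfam families would normally be applied as well.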

Table 1 (continued).
Contig N50 length (bp)            761
Minimum contig length (bp)        450
Maximum contig length (bp)        15,385

Acknowledgements: This work has been supported by the Spanish Ministry MINECO and the FEDER Funds of the EU under the project AGL2012-39972-C02.

Table 2. List of the 20 genes most up-regulated in the digestive gland of M. galloprovincialis.

Sequence description                                       Fold change  Sequence length  Blast hit ACC  E-value   Similarity
heavy metal-binding protein (C1q-like)                     18.31        978              EKC35153       7.73E-12  56
endothelin-converting enzyme 1-like                        11.56        560              XP_009049950   1.03E-40  56
c-binding protein alpha subunit                            10.15        532              EKC27057       5.11E-07  53
hypothetical protein BRAFLDRAFT_106563                     9.42         1164             ELU10480       2.94E-66  86
c1q domain containing protein 1q3                          9.42         880              XP_002599420   2.40E-67  56
von willebrand factor d and egf domain-containing protein  7.63         1351             CBX41652       1.08E-12  62
apextrin-like protein                                      7.50         637              EKC28789       2.89E-27  56
NA                                                         7.03         -                -              -         -
fibropellin-1- partial                                     6.56         649              EKC17841       6.53E-88  76
NA                                                         5.99         -                -              -         -
short-chain collagen c4                                    5.92         2009             XP_009066537   1.07E-56  49
sushi repeat-containing protein srpx2                      5.88         495              EKC33624       3.50E-13  62
2'-5'-oligoadenylate synthase 3                            5.78         547              EKC28669       2.39E-31  56
neuralized                                                 5.75         1368             EKC21335       5.41E-32  53
pats1                                                      5.53         2396             EKC29652       1.44E-20  50
hypothetical protein CGI_10028487                          5.39         764              EKC41739       1.41E-10  52
cathepsin d                                                5.28         787              XP_005098204   7.53E-16  87
NA                                                         5.14         -                -              -         -
pi-class glutathione s-transferase                         5.09         838              AAS60226       6.31E-54  66
low affinity immunoglobulin epsilon fc receptor            5.08         833              XP_004370856   3.39E-11  49

Table 1 (continued).
Average contig length (bp)        1,015
Total length in contigs (bp)      70,321,015

Table 3. Pfam families significantly enriched (p