Recent Trends in Fish Genomics

Chapter 8

N.S. NAGPURE, ARUNIMA KUMAR VERMA, S.P. SINGH AND U.K. SARKAR
National Bureau of Fish Genetic Resources, Canal Ring Road, Telibagh, Post: Dilkusha, Lucknow-226002, Uttar Pradesh, India

Corresponding author: Dr. S P Singh, Principal Scientist, Email: [email protected], [email protected]

196 / Applied Computational Biology and Statistics in Biotechnology and Bioinformatics

1. Introduction

The era of genome sequencing was transformed by the Human Genome Project (launched in 1990), coordinated by the U.S. Department of Energy and the National Institutes of Health. Its goals were to identify all of the approximately 20,000-25,000 human genes, determine the sequence of the 3 billion chemical base pairs of human DNA, store this information in databases, improve the tools for data analysis, transfer related technologies to the private sector, and address the ethical, legal, and social issues (ELSI) that may arise from the project. Genome sequencing itself began with Frederick Sanger’s chain-termination method in the 1970s. Gel preparation and running were time consuming, labor intensive and costly, which led to the conventional approach being replaced by “shotgun” sequencing. Deciphering entire genomes and comparing them between organisms was unimaginable until the rapid advancement of microchips and processors over the past two decades. The combination of sequencing technology and powerful bioinformatics thus initiated a plethora of ambitious sequencing proposals. With the precipitous drop in costs and increase in sequencing efficiency, the Genome 10K project was proposed in 2009 to assemble a ‘genomic zoo’ by whole-genome sequencing of 10,000 vertebrate species, approximately one for every vertebrate genus. The successful completion of the human genome sequence opened the gateway for sequencing of various model organisms, including some from the fisheries sector.

Fishes are an economically important group of vertebrates that supply much of the protein and oil needed to nourish the ever-increasing human population. Besides being an important source of livelihood, they form an ecologically important part of aquatic ecosystems and offer important advantages for defining the organism–environment interface. The number of finfish species in the world is estimated to be 30,700 (http://www.iucnredlist.org/doc/). The National Bureau of Fish Genetic Resources (NBFGR) has documented 2352 indigenous finfishes, of which 871 are from freshwater, 113 from brackish water and 1368 from marine environments. Apart from the finfish resources, nearly 2934 species of crustaceans, 5000 mollusks and 765 species of echinoderms also contribute to India’s rich aquatic germplasm resources. In India, the fisheries sector is recognized as a powerful generator of income and employment for over 14 million fishers and fish farmers. With an annual fish production of over 7.6 million tonnes, the sector accounts for a turnover of over Rs. 300 billion, contributing over 1 percent of total GDP and five percent of agricultural GDP.
Important production and performance traits (such as growth rate, feed conversion efficiency, body conformation and fillet yield) must be improved to make aquaculture more productive and profitable. Genomics and genetic enhancement of aquaculture species are therefore needed not only to meet the demand for fish production but also to ensure profitability. The analytical genetic technologies most relevant to aquaculture and capture fisheries include DNA markers, genomic mapping, microarrays, and sequencing. DNA marker technologies are the basis not only for genetic linkage mapping but also for analysis of genetic resources, strain differentiation, parentage identification, preservation of genetic diversity and conservation of genetic integrity.

The first model fish chosen for understanding vertebrate developmental biology, vertebrate evolution and human diseases was the zebrafish Danio rerio (Hamilton, 1822). Later, with the advent of genome sequencing and the concomitant development of improved annotation technology, genome analysis of Takifugu rubripes, a marine pufferfish, was accomplished (Sydney Brenner, 1993). Fish genomes are often compact, with high gene density, although genome size ranges from 0.35 pg in Tetraodon nigroviridis (Marion de Procé, 1822) to 132.83 pg in Protopterus aethiopicus (Owen, 1839). Fish genomes have played an important role in providing insight into vertebrate genome evolution, because whole genome duplication (WGD), a major evolutionary event that occurred 320–400 million years ago in the teleost ancestor, is considered a principal force in shaping present genome organization. Ohno (1970) argued in his book that “without duplicated genes, the emergence of metazoans, vertebrates, and mammals from unicellular organisms would have been impossible, this process required the creation of new loci with previously nonexistent functions, and at least one whole-genome duplication facilitated the evolution of vertebrates”. Evolutionary studies reveal that all vertebrates experienced two rounds of WGD early in their evolution and that teleosts experienced a subsequent third round (3R-WGD). This 3R-WGD was confirmed by synteny analysis and ancestral karyotype inference using the genome sequences of Tetraodon and medaka. Teleosts, with their 3R-WGD, thus serve as unique models for future studies on ecology and evolution.

Fishes are aquatic vertebrates that experience a highly diverse range of environmental conditions, such as temperature, oxygen levels, salinity, and sometimes toxic chemicals, to which their physiologies, body shapes, and lifestyles have adapted. The term Environmental Genomics was coined (Cossins and Crawford 2005) in reference to this intimate relationship between fishes and a wide range of environmental conditions. Examples such as thermotolerance and hypoxia have been used to study the interface between fishes and their environment using genomic approaches.
A transcript screening approach using microarrays was adopted to explore changes in carp under extreme environmental stress. A graded increase in expression of a core set of genes involved in RNA processing, translation initiation, mitochondrial metabolism, proteasomal function, and modification of higher-order structures of lipid membranes and chromosomes was identified in carp exposed to increasing levels of cold, from 30°C down to 10°C. Genome-wide expression profiling was performed in medaka (Oryzias latipes, Temminck & Schlegel, 1846) to understand the influence of hypoxia on global gene expression. Exposure to hypoxic conditions of the kind that occur, along with genotoxins, in coastal waters resulted in differential expression of 501 genes in the brain, 442 in the gill, and 715 in the liver, which were related to general metabolism, catabolism, RNA and protein metabolism, etc. Two biological pathways, ubiquitin-proteasome and phosphatidylinositol signaling, were significantly dysregulated in medaka upon hypoxia exposure.

Fish genomes are also being used to identify parts of eukaryotic genomes that do not code for proteins but are functional, such as noncoding RNA genes and regulatory control regions. Such regions can be detected by aligning genomic sequences of distantly related organisms and searching for regions that have remained similar during evolution, which suggests that their function has been preserved while the surrounding sequence accumulated mutations. Fish are advantageous in this context because of the long evolutionary distance (approximately 450 million years) separating them from mammals, over which non-functional sequence has become saturated with neutral mutations. An attempt to identify such conserved regions in the human genome was made by aligning it with that of the pufferfish T. nigroviridis using the tool ‘Exofish’. Comparative genome-wide discovery of such conserved regions and ultra-conserved regions (UCRs) of unknown function is of fundamental importance across vertebrate and mammalian genomes. With an increasing number of newly annotated fish and mammalian genome sequences, such comparative studies are likely to play an important role in identifying functional noncoding elements and in establishing the basis of genomic homology.
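The idea of scanning an alignment for regions that have stayed similar can be illustrated with a small sliding-window sketch. This is not how Exofish works; it is a generic illustration that assumes the two sequences have already been aligned (equal length, gaps written as "-"). The function name, window size and identity threshold are all hypothetical choices.

```python
def conserved_windows(seq_a, seq_b, window=20, min_identity=0.8):
    """Return merged (start, end) regions of a pairwise alignment in which
    every sliding window has at least `min_identity` identical, non-gap columns.

    seq_a and seq_b are assumed to be pre-aligned strings of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    regions = []
    for start in range(len(seq_a) - window + 1):
        columns = zip(seq_a[start:start + window], seq_b[start:start + window])
        matches = sum(1 for a, b in columns if a == b and a != "-")
        if matches / window < min_identity:
            continue
        end = start + window
        if regions and start <= regions[-1][1]:
            # Overlaps or touches the previous region: extend it.
            regions[-1] = (regions[-1][0], end)
        else:
            regions.append((start, end))
    return regions


# Toy alignment (invented): the first 10 columns are identical, the rest differ.
human_like = "ACGTACGTACTTTTTTTTTT"
fish_like = "ACGTACGTACGGGGGGGGGG"
print(conserved_windows(human_like, fish_like, window=5, min_identity=1.0))
# -> [(0, 10)]
```

Between distantly related genomes, a region that clears a high identity threshold is a candidate functional element, because neutral sequence would be expected to have diverged.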

2. Establishing Phylogeny on the Basis of Genomics

About 450 to 500 million years ago (Mya), the first vertebrates (animals with segmented backbones made of cartilage or bone) appeared in the early oceans. Their descendants split into two main groups: the ray-finned fishes (Actinopterygii, with thin, paper-like fins) and the lobe-finned fishes (Sarcopterygii, with fleshy, lobed, paddle-like appendages). Over millions of years, these lobed fins evolved into the limbs of all four-limbed animals (the tetrapods, including amphibians, reptiles, birds, and mammals), while the ray-finned fishes gave rise to most living fish species, including the teleosts. Teleosts, comprising more than 27,000 species (Nelson, 1994), are the largest and most diverse group of vertebrates and account for more than 99% of ray-finned fishes (actinopterygians), which diverged from lobe-finned fishes (sarcopterygians) about 420 Mya. The remaining ray-finned fishes, which are basal to teleosts, are represented by approximately 50 living species. Whole genome duplication (WGD) shaped the genomes of all vertebrates through two early rounds. The timing of the additional fish-specific WGD has been estimated at around 350 Mya using the molecular clock hypothesis and phylogenetic analysis. The sequencing of genes and gene families from teleost fishes unexpectedly revealed duplicate teleost genes for several human genes, which in turn led to the hypothesis that a WGD occurred in the ray-finned fish lineage before the diversification of teleost fishes (Lysaght et al., 2004).
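The molecular clock reasoning mentioned above can be sketched in a few lines. Under a strict clock, two sequences that separated T years ago and each accumulate substitutions at rate r per site per year differ by a distance K = 2rT, so T = K / (2r). The sketch below corrects the observed proportion of differing sites with the Jukes-Cantor model. The rate and the proportion of differences are hypothetical inputs, not values from the studies cited here.

```python
import math


def jukes_cantor_distance(p):
    """Convert an observed proportion of differing sites (p) into an
    estimated number of substitutions per site under the Jukes-Cantor model."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def divergence_time(distance, rate_per_site_per_year):
    """Strict molecular clock: both lineages accumulate changes, so T = K / (2r)."""
    return distance / (2 * rate_per_site_per_year)


# Hypothetical example: 30% of aligned sites differ between two paralogs,
# and the assumed rate is 1e-9 substitutions per site per year.
k = jukes_cantor_distance(0.30)
t = divergence_time(k, 1e-9)
print(f"distance = {k:.3f} substitutions/site, divergence = {t / 1e6:.0f} million years")
```

Published estimates rely on calibrated rates and on many gene families, so a single-pair calculation like this only illustrates the principle.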

[Figure: phylogenetic tree. Groups labelled: Tetrapods, Sarcopterygii, Actinopterygii, Teleost. Species shown: Latimeria menadoensis (Coelacanth), Acipenser sturio (Sturgeon), Lepisosteus platyrhincus (Gar), Cyprinus carpio (Carp), Danio rerio (Zebrafish), Salmo salar (Salmon), Oncorhynchus mykiss (Rainbow trout), Oryzias latipes (Medaka), Xiphophorus maculatus (Xiphophorus), Takifugu rubripes (Fugu), Tetraodon nigroviridis (Green spotted puffer).]

Fig 1.1: Phylogenetic relationship between fish species (Actinopterygii and Sarcopterygii) and tetrapods. The dark line represents the probable position of the whole genome duplication at the root of the teleosts. (Courtesy: Crollius et al., 2005).

Fishes such as the elephant shark (Callorhinchus milii, Bory de Saint-Vincent, 1823) and sturgeons underwent the 2R-WGD but were unaffected by the 3R-WGD. A third round of WGD then occurred, which divided the Actinopterygii into teleost and non-teleost lineages. Most tetrapods, on the other hand, have not experienced an additional WGD; however, they have experienced repeated chromosomal rearrangements throughout the genome. More recently, the sequencing and comparative analysis of whole-genome sequences of teleost fishes such as fugu (Aparicio et al., 2002), Tetraodon (Jaillon et al., 2004), and medaka (Kikugawa et al., 2004) have provided compelling evidence for the WGD event in the fish lineage. A comparison of the medaka genome with the zebrafish, Tetraodon, and human genomes revealed that eight major interchromosomal rearrangements occurred within a relatively short period of approximately 50 million years after the WGD in the fish lineage (Kasahara et al., 2007). Subsequently, while the medaka lineage experienced no major interchromosomal rearrangements, three major rearrangements occurred in the Tetraodon lineage. In contrast, the zebrafish lineage has experienced many interchromosomal rearrangements since it diverged from the medaka lineage. A comparison of gene order across large regions supports a higher rate of chromosomal rearrangement in teleost fishes than in other vertebrates (Venkatesh et al., 2008). At present the teleost lineage accounts for about half of all vertebrate species (including T. rubripes, T. nigroviridis, D. rerio and O. latipes), which have adapted to a variety of marine and freshwater habitats; their genome evolution and diversification are important subjects for understanding vertebrate evolution.
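One simple way to quantify differences in gene order between two species is to count breakpoints, meaning pairs of genes that are neighbours in one genome but not in the other. The sketch below is a minimal illustration of that idea for a single chromosomal segment. It is not the method used in the studies cited above, and the gene names are invented.

```python
def adjacencies(gene_order):
    """Set of unordered neighbouring gene pairs along one chromosome segment."""
    return {frozenset(pair) for pair in zip(gene_order, gene_order[1:])}


def count_breakpoints(order_a, order_b):
    """Number of neighbouring pairs in order_a that are not neighbours in order_b."""
    return len(adjacencies(order_a) - adjacencies(order_b))


# Hypothetical orthologous genes; species B carries an inversion of g3..g5.
species_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
species_b = ["g1", "g2", "g5", "g4", "g3", "g6"]
print(count_breakpoints(species_a, species_b))  # -> 2
```

A single inversion produces two breakpoints, so lineages with many rearrangements accumulate a higher breakpoint count relative to a common reference.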

3. Fish Genome

Fish genomes are very important for the study of evolution. Approximately 32,000 fish species have been identified and documented worldwide, a large number distributed across diverse environmental conditions. Besides their commercial use, fishes can play a prominent role in genomic research. Because many fish genomes are compact, with high gene density and small genome size, it is economical to sequence the genomes of a comparatively large number of fish species that are readily available from different ecological conditions. The genome sizes of some fish species of the lineages Actinopterygii and Sarcopterygii are given in Table 1.2, based on the Animal Genome Size Database.

3.1 Genome

The genomes of actinopterygians serve as excellent models for studying comparative genomics, evolutionary processes and the fates of duplicated genes. These genomes are typically small, with high gene density and a large number of duplicated genes. The sequenced genomes of zebrafish (D. rerio), Takifugu (T. rubripes), Tetraodon (T. nigroviridis), medaka (O. latipes), three-spined stickleback (Gasterosteus aculeatus) and other species at the frontier of fish genomics and fish biology shed light on evolutionary processes. Genomic sequencing of other fishes, such as the cartilaginous elephant shark, has also generated a great deal of information for understanding fish genomics.

3.1.1 Zebrafish Genome Sequencing

The zebrafish (D. rerio) is a common and useful model organism for studies of vertebrate development, oncology, toxicology, reproduction, teratology, genetics, neurobiology, environmental sciences, stem cell and regenerative medicine, and evolutionary theory. The zebrafish genome project was initiated as a collaboration between the Sanger Institute and the zebrafish community and was announced at the Sanger Institute Zebrafish Workshop in 2000. Zv9 is the latest integrated whole genome shotgun (WGS) assembly of the zebrafish genome at the time of writing and is maintained by the Genome Reference Consortium (GRC). The zebrafish genome is being sequenced, finished and analysed in its entirety at the Wellcome Trust Sanger Institute, England. Manual annotation is provided by the Human and Vertebrate Analysis and Annotation (HAVANA) group and is released at regular intervals into the Vertebrate Genome Annotation (Vega) database. The annotation is compiled in close collaboration with the Zebrafish Information Network (ZFIN), which provides an accurate, dynamic and distinct resource for the zebrafish community as a whole. According to Ensembl, the final gene set comprises 24,020 protein-coding genes and 38 genes identified as pseudogenes. The genome encodes the largest vertebrate repertoire of functional aquaporins, with dual paralogy and substrate specificities similar to those of mammals.

In India, whole genome sequencing of a wild-type strain of zebrafish (2009) was accomplished by the Institute of Genomics and Integrative Biology (IGIB), Delhi. A supercomputing facility and next-generation sequencing technology enabled the analysis of 89 gigabytes of sequence data, which is expected to help create disease-specific models and support drug screening relevant to humans. The zebrafish genome is about half the size of the human genome, containing about 1700 million base pairs, and is hence a good candidate for sequencing. The genomic data generated are expected to help decipher the genetic causes of various complex disorders, such as cardiovascular, metabolic and neurodegenerative disorders.
It has also been suggested that the genomic sequence may reveal the specific mutation in the gene responsible for the lighter skin colour of European populations compared with African populations.

3.1.2 Fugu Genome Sequencing

The Fugu Genome Project was initiated in 1989 by Sydney Brenner and his colleagues Greg Elgar, Sam Aparicio, and Byrappa Venkatesh. In 2000 the International Fugu Genome Consortium was formed, headed by the Joint Genome Institute (JGI), California, the Institute of Molecular and Cell Biology (IMCB), Singapore, and the HGMP, Cambridge; it sequenced and assembled the fugu genome (2001) in association with Celera Genomics using the whole genome shotgun method. T. rubripes has one of the smallest genomes of any vertebrate, about one eighth the size of the human genome. The current version, the fifth genome assembly, available at the IMCB browser, shows 1,136 protein-coding genes, 17,504 novel protein-coding genes, 593 RNA genes and 121 RNA pseudogenes. The fugu genome is highly useful, firstly because of its compactness and secondly because it lacks the enormous amount of so-called junk DNA. One reason for the compact genome size of fugu is that dispersed repetitive DNA accounts for less than one sixth of the sequence, compared with 40% in humans; another is that the genes have fewer, shorter introns and coding regions occupy over a third of the genome. Throughout evolution, most genomes have come to be dominated by the remains of ancient virus-like genomic infections that left hundreds of thousands of repetitive elements littered throughout the genome. The fugu genome is largely free of the enormous quantity of ‘junk’ or ‘selfish’ DNA found in other genomes, including those of mouse and human. These features make sequencing of fugu cost effective and rapid. In addition, fugu serves as a good model genome for the discovery of human genes because its gene repertoire is similar to that of the human genome. The power of a compact genome in comparative genomics has been amply demonstrated with the fugu genome, which is only about 0.4 Gb in size. About 1,000 novel genes in the human genome that were undetected by other approaches, as well as a large number of conserved regulatory elements, have been identified by comparing fugu and human genome sequences. The fugu genome contains the blueprint of the basic vertebrate body plan, and hence comparison of the fugu and human genomes highlights genes and regulatory sequences that have been preserved over 450 million years.

3.1.3 Medaka Genome Sequencing

The medaka (O. latipes) genome project commenced in late 2002 as a collaborative effort of three core laboratories led by H. Takeda (University of Tokyo, Japan), S. Morishita (University of Tokyo, Japan) and Y. Kohara (National Institute of Genetics (NIG), Japan).
A high-quality draft genome sequence of medaka was obtained by whole-genome sequencing. Medaka is a small egg-laying freshwater teleost native to East Asia and an excellent model system for a wide range of biology, including ecotoxicology, carcinogenesis, sex determination and developmental genetics. Medaka is regarded as the first successful sex-reversal candidate among vertebrates, and work on its genome identified the male-determining gene DMY, the first non-mammalian equivalent of SRY. The assembled genome (approximately 700 megabases) is less than half the size of the zebrafish genome and is predicted to contain 20,141 genes. The average single nucleotide polymorphism (SNP) rate is 3.42%, the highest SNP rate seen in any vertebrate species. Its small genome size and its ability to develop over a wide permissive temperature range (6-40°C) during embryonic development increase the suitability of medaka as a model organism. Whole genome sequencing of the medaka is expected to facilitate comparative genomic studies of both teleosts and other vertebrates.
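The notion of gene density used in this section can be made concrete with the figures quoted above for zebrafish (about 1700 million base pairs, 24,020 protein-coding genes) and medaka (approximately 700 megabases, 20,141 predicted genes). The sketch below only does the arithmetic. The dictionary layout and function name are illustrative, and the gene counts come from different annotation pipelines, so the comparison is rough.

```python
# Genome size (Mb) and gene count as quoted in the text above.
GENOMES = {
    "zebrafish": {"size_mb": 1700, "genes": 24020},
    "medaka": {"size_mb": 700, "genes": 20141},
}


def genes_per_mb(size_mb, genes):
    """Average number of annotated genes per megabase of genome."""
    return genes / size_mb


density = {name: genes_per_mb(**info) for name, info in GENOMES.items()}
for name, value in density.items():
    print(f"{name}: {value:.1f} genes per Mb")
```

On these numbers medaka carries roughly twice as many genes per megabase as zebrafish, which is the sense in which the smaller genome is described as more compact.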

3.1.4 Elephant Shark Genome Sequencing

Cartilaginous fishes (Chondrichthyes), represented by sharks, rays, skates and chimaeras, are phylogenetically the oldest group of living jawed vertebrates and are therefore important for understanding the origins of the complex developmental and physiological systems of jawed vertebrates. The elephant shark (C. milii), also known as the elephant fish or ghost shark, is a model cartilaginous fish with the smallest genome among the known cartilaginous fish genomes. Venkatesh et al. (2005) performed whole-genome sequencing and comparative analysis of the elephant shark genome and estimated it to be 910 Mb long (about one-third the size of the human genome). The genome contains a large number of ultraconserved elements, and its protein sequences evolve at a slower rate than in other vertebrates. Because of this slow rate of evolution, the elephant shark genome has changed relatively little over the last 450 million years. This finding indicates that the elephant shark has retained more features of the ancestral genome than other vertebrates and hence is a useful model for gaining insight into the ancestral genome, in which the human genome also has its roots. Repetitive sequences, represented mainly by short interspersed element (SINE)-like and long interspersed element (LINE)-like sequences, account for about 28% of the elephant shark genome. The 1.5x assembly contains approximately 61,000 exons or coding sequences that can be translated and aligned with the predicted protein products of 10,400 unique human genes. The genome provides examples of genes that have been lost differentially during the evolution of the tetrapod and teleost fish lineages. The degree of sequence conservation and of conserved gene order (“synteny”) between the human and elephant shark genomes is higher than that between human and teleost fish genomes.

The elephant shark has four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. Although regarded as a ‘primitive’ vertebrate, the elephant shark has the potential for color vision, a trait it acquired during evolution in parallel with humans. It has three cone pigments for color vision and, like humans, acquired them through gene duplication. Altogether, the elephant shark is an excellent candidate for genome sequencing, as it provides a reference genome for tetrapods while allowing the identification of genomic features that differentiate them from teleosts. C. milii has recently been proposed as a good model for studying the genome structure and gene content of a basal jawed vertebrate and as a common reference for tetrapods and ray-finned fishes.
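To give a feel for what a low-coverage assembly such as the 1.5x elephant shark assembly can be expected to contain, the sketch below applies the classical Lander-Waterman approximation: if reads fall uniformly at random, the expected fraction of the genome covered by at least one read at coverage c is 1 - e^(-c). This is an idealised assumption that ignores repeats, cloning bias and read length effects, and it is not an analysis taken from the elephant shark study.

```python
import math


def expected_fraction_covered(coverage):
    """Expected fraction of a genome covered by at least one read,
    assuming reads are placed uniformly at random (Lander-Waterman)."""
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


genome_size_mb = 910   # estimated elephant shark genome size quoted in the text
coverage = 1.5         # depth of the assembly quoted in the text

fraction = expected_fraction_covered(coverage)
print(f"expected fraction covered: {fraction:.2f}")
print(f"expected sequence covered: {fraction * genome_size_mb:.0f} Mb")
```

Under these assumptions a little over three quarters of the genome would be sampled at least once, which is why even a low-coverage survey can recover a large share of exons.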

3.1.5 Tetraodon Genome Sequencing

Tetraodon (T. nigroviridis) is a small tropical brackish-water fish whose genome was sequenced in a collaboration between Genoscope and the Broad Institute (MIT). The project was supported by the Consortium National de Recherche en Genomique and the National Human Genome Research Institute (NHGRI). The small, compact genome has high gene density, a tendency towards rapid molecular evolution and high chromosome stability. The organization of paralogous and orthologous genes in the genome was key to showing that a whole genome duplication occurred at the root of the teleost lineage, and at the same time provided a much clearer and improved fish gene catalogue, including key genes previously thought to be absent in fish. Comparison with other vertebrates and a urochordate indicates that fish proteins have diverged markedly faster than their mammalian homologues. Comparison with the human genome predicted approximately 900 previously unannotated human genes. The available sequence, comprising twelve chromosomes and 385 million base pairs, is eight times more compact than the human genome, mostly because intergenic and intronic sequences are reduced in size compared with other vertebrate genomes. The annotation also includes 87 manually curated structures of a number of Hox and cytokine genes. This species is separated by 20-30 million years from T. rubripes (Fugu rubripes), a marine pufferfish of the same family.

3.2 Sarcopterygii genome

Apart from the fish species mentioned above, sarcopterygian fishes occupy a unique phylogenetic position between ray-finned fishes and tetrapods and have very stable genomes that have changed little over long periods, evolving neutrally with few major rearrangements. The two taxa of Sarcopterygii, coelacanths and lungfishes, are of utmost importance in evolutionary studies. Lungfishes have very large genomes (>100 Gb), which makes them poor candidates for genome sequencing. Coelacanths are abundant in the fossil record, yet only two modern species are known, Latimeria chalumnae and Latimeria menadoensis; they are amenable to whole genome sequencing because of their smaller genome size (Danke et al. 2004). The question of whether lungfishes and coelacanths are more closely related to tetrapods or to other fishes was addressed by Brinkmann et al. (2004) and Zardoya et al. (1997), who concluded that these sarcopterygians are close relatives of tetrapods. Analysis of the available coelacanth genome sequence shows that it retains Hox14 genes, which are otherwise found only in cartilaginous fishes and amphioxus. Hox14 probably plays an important role in axial and appendicular patterning. It is tempting to speculate on the role that these genes may have had in the evolution and development of the tetrapod limb. The slow rate of evolution of the coelacanth genome may have contributed to the retention of ancestral genes that have been lost in the teleost and tetrapod lineages. The genome provides access to the phenotypic and genomic transitions leading to the emergence of tetrapods. In addition, comparison of coelacanth, teleost and tetrapod genomes identifies functional sequences unique to each lineage.

4. Conclusion

To date, fish genomics has been able to identify the similarities and differences between fish, other vertebrate and human genomes and so gain profound insights into the evolution of vertebrate genomes. Genome-wide comparative analysis of cartilaginous fishes and tetrapods helps to explain the transition of vertebrates from aquatic to terrestrial environments. Comparative analysis among other fish species also helps to establish cross-taxon relationships and greatly facilitates the study of the genetic and molecular mechanisms that underlie development and evolution. Humans and fishes share many developmental pathways, organ systems, and physiological mechanisms, and the advantages of zebrafish, medaka, Tetraodon and fugu have been well exploited using bioinformatics tools and molecular biology techniques to draw significant conclusions relevant to human biology. It is thus not surprising that several fish species have played important roles in recent years in informing us about human genes and, somewhat more unexpectedly, in providing leads for understanding the function of genes involved in human diseases. Use of advanced genomic and post-genomic technologies in diverse fields of fish biology will foster the coordinated development of screening technologies for gene function analysis and will yield extensive information about conserved domains in the human genome. In addition, the rapid expansion of available genome resources, including that of medaka, greatly facilitates comparative genome analysis of commercially important fish species. Fish genomics is also playing a key role in promoting breed cultivation, disease control, feed nutrition and food safety, and in establishing gene knockouts and transgenic fish. Such transgenic fish can be a boon for the fish industry, raising fish production and in turn benefiting a country's economy.

One more important application of fish genomics lies in the in silico translation of fish genomic resources into hypothetical proteins, which can generate baseline data for fish proteomics. Experimental validation of such proteins can help to clarify the intricate mechanisms of protein production and function, which can be further exploited for human welfare.
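In silico translation, as mentioned above, amounts to reading a coding sequence three bases at a time and looking up each codon in a genetic code table. The sketch below builds the standard genetic code and translates a single reading frame. It assumes the input is already an open reading frame on the forward strand; real pipelines also have to handle introns, alternative frames and species-specific codes. The example sequence is invented.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, stop_at_stop=True):
    """Translate a DNA coding sequence (frame 1) into a protein string.

    Codons containing ambiguous bases are reported as 'X'. Translation ends
    at the first stop codon unless stop_at_stop is False, in which case stop
    codons are written as '*'.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*" and stop_at_stop:
            break
        protein.append(aa)
    return "".join(protein)


# Invented example ORF: ATG GCC AAA TGA
print(translate("ATGGCCAAATGA"))  # -> MAK
```

Proteins predicted this way are hypothetical until validated experimentally, which is the point made in the paragraph above.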

Table 1.1: Various fish genome sequencing projects along with their genome browsers.

Fish Species | Genome Browser | Link
Zebrafish (D. rerio) | Wellcome Trust Sanger Institute (Zebrafish genome sequencing project) | http://www.sanger.ac.uk/Projects/D_rerio/
Zebrafish (D. rerio) | ZFIN (The Zebrafish Model Organism Database) | http://zfin.org/cgi-bin/webdriver?MIval=aa-ZDB_home.apg
Fugu (T. rubripes) | IMCB Fugu Genome Project | http://www.fugu-sg.org/
Elephant Shark (C. milii) | IMCB Elephant Shark Genome Project | http://esharkgenome.imcb.a-star.edu.sg/
Medaka (O. latipes) | NBRP Medaka | http://www.shigen.nig.ac.jp/medaka/
Tetraodon (T. nigroviridis) | Tetraodon nigroviridis Database (Broad MIT) | http://www.broadinstitute.org/annotation/tetraodon/
Tetraodon (T. nigroviridis) | Ensembl (Tetraodon) | http://www.ensembl.org/Tetraodon_nigroviridis/Info/Index

The full potential of fish genomics is about to be explored with the study of additional species such as carp, trout and tilapia. The National Human Genome Research Institute has approved draft assemblies of Latimeria chalumnae (Coelacanth), Eptatretus burgeri (Hagfish), Petromyzon marinus (Lamprey), Raja erinacea (Skate), Lepisosteus oculatus (Spotted Gar), Gasterosteus aculeatus (Stickleback) and Taeniopygia guttata (Zebra finch). A large number of fish species still need to be sequenced to generate the array of genomic data needed for exploring the hidden complexities of fish biology. The management of Indian fish genetic resources is of enormous importance and requires a dedicated approach to resolve problems and issues relating to their enhancement and conservation. There is also a need to generate genome information for most capture fisheries species so that genome technologies can help in assessing the status of wild fish species and in their conservation. Despite the pre-eminence of other model species such as mouse and Drosophila, fish genomes have stimulated evolutionary interest, opened new avenues in environmental sciences and developmental biology, and provided valuable research in comparative genetics and genomics, and fishes now hold a strong position as model organisms.

Table 1.2: Genome size information of various fish species (based on the Animal Genome Size Database)

Smallest genome

Group                             Genome size   Fish species
Fish                              0.35 pg       Tetraodon nigroviridis, Spotted green pufferfish
Skate/ray (cartilaginous fish)    2.46 pg       Rhinobatos schlegelii, Yellow guitarfish
Shark                             2.73 pg       Carcharhinus obscurus, Dusky shark
Teleost                           0.35 pg       Tetraodon nigroviridis, Spotted green pufferfish
Dipnoan                           40.08 pg      Protopterus aethiopicus congicus, Marbled lungfish

Largest genome

Group                             Genome size   Fish species
Fish                              132.83 pg     Protopterus aethiopicus, Marbled lungfish
Skate/ray (cartilaginous fish)    12.04 pg      Crassinarke dormit
Shark                             17.05 pg      Oxynotus centrina, Angular roughshark
Teleost                           4.90 pg       Salmo salar, Atlantic salmon
Dipnoan                           132.83 pg     Protopterus aethiopicus, Marbled lungfish
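The genome sizes in Table 1.2 are C-values in picograms (pg). To relate them to sequencing effort, it can help to express them in megabases. The sketch below is illustrative only: it assumes the commonly used approximation that 1 pg of DNA corresponds to about 978 Mb, which is not stated in this chapter, and the names MB_PER_PG, pg_to_mb and fold_range are hypothetical helpers introduced here. The C-values are copied from Table 1.2.

```python
# Assumption (not from the chapter): 1 pg of double-stranded DNA is
# approximately 978 Mb (0.978 x 10^9 base pairs).
MB_PER_PG = 978.0

# C-values in pg, copied from Table 1.2.
GENOME_SIZES_PG = {
    "Tetraodon nigroviridis": 0.35,
    "Rhinobatos schlegelii": 2.46,
    "Carcharhinus obscurus": 2.73,
    "Salmo salar": 4.90,
    "Oxynotus centrina": 17.05,
    "Protopterus aethiopicus congicus": 40.08,
    "Protopterus aethiopicus": 132.83,
}


def pg_to_mb(c_value_pg: float) -> float:
    """Convert a C-value in picograms to an approximate size in megabases."""
    return c_value_pg * MB_PER_PG


def fold_range(sizes_pg: dict) -> float:
    """Ratio of the largest to the smallest genome size in the mapping."""
    values = sizes_pg.values()
    return max(values) / min(values)


if __name__ == "__main__":
    # Print species from smallest to largest genome.
    for species, pg in sorted(GENOME_SIZES_PG.items(), key=lambda kv: kv[1]):
        print(f"{species:34s} {pg:8.2f} pg  ~{pg_to_mb(pg):9.0f} Mb")
    print(f"Largest/smallest ratio: {fold_range(GENOME_SIZES_PG):.1f}-fold")
```

Under this assumption, the difference between the pufferfish and the marbled lungfish spans more than two orders of magnitude, which is one reason compact genomes were attractive early sequencing targets.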


