REVIEW

Not so bad after all: retroviruses and long terminal repeat retrotransposons as a source of new genes in vertebrates

M. Naville1, I. A. Warren1, Z. Haftek-Terreau1, D. Chalopin1,2, F. Brunet1, P. Levin1, D. Galiana1 and J.-N. Volff1

1) Institut de Génomique Fonctionnelle de Lyon, Ecole Normale Supérieure de Lyon, CNRS UMR5242, Université Lyon 1, Lyon, France and 2) Department of Genetics, University of Georgia, Athens, GA, USA

Abstract

Viruses and transposable elements, once considered as purely junk and selfish sequences, have repeatedly been used as a source of novel protein-coding genes during the evolution of most eukaryotic lineages, a phenomenon called ‘molecular domestication’. This is exemplified perfectly in mammals and other vertebrates, where many genes derived from long terminal repeat (LTR) retroelements (retroviruses and LTR retrotransposons) have been identified through comparative genomics and functional analyses. In particular, genes derived from gag structural protein and envelope (env) genes, as well as from the integrase-coding and protease-coding sequences, have been identified in humans and other vertebrates. Retroelement-derived genes are involved in many important biological processes including placenta formation, cognitive functions in the brain and immunity against retroelements, as well as in cell proliferation, apoptosis and cancer. These observations support an important role of retroelement-derived genes in the evolution and diversification of the vertebrate lineage.

Clinical Microbiology and Infection © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.

Keywords: Exaptation, molecular domestication, neogene, retroelement, retrotransposon, retrovirus

Article published online: 17 February 2016

Corresponding author: J.-N. Volff, Institut de Génomique Fonctionnelle de Lyon, Ecole Normale Supérieure de Lyon, 46 allée d’Italie, F-69364 Lyon Cedex 07, France E-mail: [email protected]

Clin Microbiol Infect 2016; 22: 312–323 Clinical Microbiology and Infection © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved http://dx.doi.org/10.1016/j.cmi.2016.02.001

Introduction

Retroviruses and long terminal repeat (LTR) retrotransposons share a common evolutionary ancestry, and therefore show several structural similarities (Fig. 1). Both are flanked by LTRs in direct orientation that are necessary for transcription and integration into the genome. Both categories of elements encode a major structural protein called Gag, which is a fast-evolving protein essential for particle formation. In retroviruses, the Gag protein includes three distinct regions with different functions during particle assembly: the matrix domain, which is involved in targeting to cellular membranes; the capsid domain, which mediates protein–protein interactions during particle assembly; and the nucleocapsid domain, which generally carries zinc finger(s) allowing binding to the viral RNA genome. The gag open reading frame overlaps with a larger coding sequence called pol (polymerase). Pol is a polyprotein with domains corresponding to an aspartic protease (specific enzymatic cleavage of Pol to produce mature proteins), a reverse transcriptase (reverse transcription of the RNA intermediate into cDNA), an RNase H (degradation of RNA in the RNA/DNA heteroduplex after reverse transcription) and an integrase (integration of the new cDNA copy into the genome). A Gag-Pol fusion protein is generated through a translational frameshift occurring between gag and pol.

One of the main differences between retroviruses and LTR retrotransposons is whether they are infectious. Retroviruses are capable of moving between cells, whereas LTR retrotransposons can only insert new copies into the genome present within the same cell, and rely mostly on vertical transmission through generations. This difference is mediated by the presence of an envelope (env) gene. The env genes encode glycoproteins embedded in the lipid bilayer envelope of retroviruses that are necessary for the entry of the virus particles into cells. Therefore gain or loss of the env gene leads to an evolutionary switch between LTR retrotransposons and retroviruses [1,2].

FIG. 1. Vertebrate genes derived from LTR retrotransposons and retroviruses. The envelope sequence is present only in retroviruses, not in retrotransposons. CA, capsid; Env, envelope; LTR, long terminal repeat; MA, matrix; NC, nucleocapsid; Pol, polymerase; Pro, protease; RT, reverse transcriptase; Int, integrase.

Three superfamilies of LTR retrotransposons with this typical structure are present in vertebrates: Gypsy/Ty3, BEL/Pao and Ty1/Copia [3]. A fourth family called DIRS contains more divergent elements that encode a tyrosine recombinase instead of an integrase [4]. Active LTR retrotransposons are present in amphibians and fish but absent from the genomes of mammals [3,5,6]. Nine genera of retroviruses have been described so far in vertebrates [7]. Retroviruses have been introduced repeatedly in genomes through infection in the germ-line of mammals and other vertebrates [8]. Such integrated copies, which are called endogenous retroviruses (ERVs), are transmitted vertically from generation to generation, just like retrotransposons [9].

Once considered as solely junk and selfish sequences, with their only recognized roles in infectious and genetic diseases, LTR retroelements as well as other types of transposable elements (TEs, including non-LTR retrotransposons and DNA transposons) have been subsequently shown to be powerful drivers of gene and genome evolution [10–15]. In particular, such mobile DNA can serve as a source of both regulatory and coding sequences in the evolution of host gene function.

Considering regulatory sequences first, using mammals as an example, endogenous retroviruses and other types of TEs have played a major role in the rewiring of specific transcription factor regulatory networks [16,17]. In human embryonic stem cells, a recent study indicated that the great majority of human-specific regulatory loci binding the NANOG, POU5F1 and CTCF transcription factors are included within TEs [18]. Also in humans, more than one-third of the binding sites for the tumour suppressor factor p53 overlap with ERV elements.
These binding sites are primate-specific and not found in other mammals [19]. Moreover, ERVs and other TEs have probably played a decisive role in the emergence and diversification of the placenta in mammals, by putting previously non-co-regulated genes under common specific regulations [20–22].

Importantly, and in addition to these regulatory contributions, retroelements and TEs as a whole have given birth to new RNA and protein-coding genes in vertebrates and other organisms, leading to the emergence of new functions that have possibly been associated with evolutionary transitions [10,23–26]. This phenomenon has been called ‘molecular domestication’. Perhaps the best-studied examples of TE-derived protein-coding genes are the genes encoding the RAG1 and RAG2 recombinases, which are crucial to the vertebrate immune system. Both proteins cooperate to catalyse V(D)J somatic recombination, which generates the highly diverse repertoire of antibodies/immunoglobulins and T-cell receptors in vertebrates. RAG1 and RAG2, as well as the recombination signals they recognize for V(D)J recombination, were derived from a Transib DNA transposon over 500 million years ago (MYA) [27,28]. Another example of a DNA transposon-derived gene is CENP-B, which encodes a mammalian centromere-associated protein derived from a pogo-like transposase [29]. New genes can also arise through fusion between host and TE coding sequences, for example the primate SETMAR1 (Metnase) gene [30,31]. This gene was formed through the fusion of an N-terminal histone-lysine N-methyltransferase SET domain and a C-terminal transposase domain from a mariner-like Hsmar1 DNA transposon. SETMAR1 is involved in non-homologous end-joining repair [30].

Proteins encoded by retroviruses and LTR retrotransposons possess diverse properties including binding to, cutting, ligating and degrading nucleic acids, as well as interacting with other proteins, cutting them and promoting cell–cell fusion. These multiple activities might be useful for the biological pathways of the host [9,32–35].
We describe here that many vertebrate genes have been formed from retroviruses and LTR retrotransposons (Fig. 1; Table 1). Such ‘domesticated’ retroelement sequences have generally lost their ability to transpose through mutations, but show conservation of their open reading frame and genomic localization in different species. For some of these genes, functional analyses have been performed in vitro and in vivo, demonstrating the role of the presumed selfish/infectious LTR retrotransposons and retroviruses as sources of new coding sequences allowing genetic innovation.

Clinical Microbiology and Infection, Volume 22 Number 4, April 2016

Gag-derived genes

At least 85 genes derived from the gag gene of LTR retrotransposons have been identified in the human genome. These retrotransposons belong to diverse families, with a particular involvement of Ty3/Gypsy elements. Many of these domesticated genes are also conserved in other mammalian species [35,36]. According to their evolutionary origin, these genes have been grouped into different major families including the Mart, Pnma and SCAN gene families [35].

The Mart gene family

A well-studied example of a gag-derived gene family is the Mart family (for ‘MAmmalian RetroTransposon’, aka Sirh for ‘Sushi-Ichi Retrotransposon Homologue’). This family contains at least 11 genes in placental mammals [37,38], with some genes present in marsupials [39,40]. The Mart gene family is probably derived from a vertebrate Ty3/Gypsy retrotransposon family called Sushi, which has been lost in mammals but is still active in fish [41]. After an initial event of molecular domestication in an ancestor of mammals, the family has probably been expanded through serial segmental duplication events. Most copies of the gene are located on the X chromosome, suggesting that the formation of the ancestral Mart gene occurred on this chromosome [37,38]. After domestication/duplication, some Mart genes have acquired introns.

None of the Mart genes remain capable of transposition, and all have lost structural features such as LTRs, which are essential for mobilization. However, all Mart genes have retained a protein-coding potential, which is conserved between species for a particular gene but can vary between genes. In particular, the pol region is completely or partially deleted to differing degrees in different Mart genes. Strikingly, all Mart genes carry a protein-coding region that is derived from the gag region of the ancestral Sushi retrotransposon [37,38]. However, only four Mart proteins have conserved the nucleic acid-binding zinc finger from the original Gag nucleocapsid protein. Two Mart genes have a partial pol sequence with a protease domain; they still use the programmed ribosomal frameshift present in the ancestral retrotransposon to produce a Gag-Pol precursor polyprotein [42,43].
At least two autosomal Mart genes, Peg10 (Mart2) and Peg11/Rtl1 (Mart1), are subject to genomic imprinting and are located in imprinted clusters [44,45]. Both are exclusively expressed from the paternal allele, with their expression being under the control of differentially methylated regions. A maternally expressed microRNA (miRNA) processed from a Peg11/Rtl1 antisense transcript induces RNA interference [46,47]. Epigenetic regulation of retrotransposon-derived genes has been proposed to be a remnant of ancient defence mechanisms that controlled the activity of the ancestral TE [39].

Strikingly, as many as eight out of 11 Mart genes are expressed in the placenta [48], and at least three of these genes are clearly involved in its development, but with differing functions (for review see ref. [49]). Peg10/Mart2 knockout mice display defects in placenta formation leading to early embryonic lethality [50,51]. Placental expression of Peg10 has also been reported in human and pig, suggesting a conserved function [52,53]. Modification of Peg11/Rtl1/Mart1 expression (knockout and overexpression) causes late-foetal and/or neonatal lethality in mouse [54]. Expression of the gene is detected in the mouse labyrinth zone of the placenta (the exchange region of the placenta, analogous to the villous placenta in human). Peg11 is essential for the maintenance of foetal capillaries at the foeto–maternal interface during the late-foetal stage [54]. In human paternal uniparental disomy for chromosome 14, overexpression of Peg11, due to the presence of two paternal genes, is associated with an abnormally large placenta (placentomegaly) [55]. Maternal disomy of mouse chromosome 12, which carries Peg11, results in placental hypoplasia [56]. Peg11 expression is regulated by an miRNA, miR-127, which is processed from a maternally expressed antisense Peg11 transcript [57]. Finally, knockout mice for Ldoc1/Sirh7/Mart7 show abnormal placental cell differentiation and maturation, with an overproduction of placental progesterone and placental lactogen 1 from trophoblast giant cells [58]. Placental progesterone is a hormone with an important role in the preparation and maintenance of pregnancy and the timing of parturition. Pregnant Ldoc1 knockout mice display delayed parturition [58].

Besides the placenta, Mart genes including Peg10 and Peg11 are expressed in other embryonic and adult organs and tissues, with frequent expression in the brain, suggesting other functions in development and physiology [38]. Peg10 has also been proposed to work as a transcription factor binding to DNA to regulate the transcription of the myelin basic protein gene [59]. Furthermore, Peg10 might play a role in early stages of adipocyte differentiation [60]. Peg11 has been shown to contribute to muscular hypertrophy of callipyge sheep [61].
Interestingly, a recent analysis has shown that disruption of the Zcchc16/Mart4/Sirh11 gene in the mouse causes abnormal behaviours related to cognition, including attention, impulsivity and working memory. This indicates that Zcchc16 is involved in cognitive function in the brain, possibly via the noradrenergic system [62].

Several Mart genes show an aberrant expression in human cancers [63]. Peg10 is overexpressed in human hepatocellular carcinoma and other types of tumours and might itself promote cancer formation through the suppression of apoptosis and the stimulation of cell proliferation [64–68]. Accordingly, Peg10 can interact with other proteins including SIAH1, a mediator of apoptosis [64]. Overexpression of Peg11/Rtl1 in the liver of adult mice results in highly penetrant tumour formation, suggesting a role as a driver of hepatocarcinogenesis [69]. Ldoc1, considered as a tumour suppressor down-regulated in human pancreatic cancer cell lines, inhibits nuclear factor-κB activation and induces apoptosis [70,71]. Ldoc1 mRNA is differentially expressed in chronic lymphocytic leukaemia and predicts overall survival in untreated patients [72]. Loss of Ldoc1 expression occurs by promoter methylation in cervical cancer cells [73].

The Ma/Pnma gene family

The Ma/Pnma (for Paraneoplastic Ma antigens) gene family is a second mammalian gene family that originated from a Gypsy/Ty3 LTR retrotransposon. This family was formed independently from the Mart gene family through the domestication of a single Gypsy12_DR-related retrotransposon copy [74]. Subsequent duplication events expanded the family, with at least 15 Pnma genes present in human and 12 in mouse, most of them located on the X chromosome like the Mart genes [35,75,76]. Here again, the nature of conserved retrotransposon structures can vary between gene family members: only a few Pnma genes have kept the ribosomal gag/pol frameshift, and not all Pnma proteins have retained the zinc finger domain that was originally present in the Gag protein. Two Pnma genes have been identified in marsupials, with one of them probably corresponding to a pseudogene [77].

Pnma proteins are called ‘paraneoplastic Ma antigens’ because antibodies against some of them have been identified in serum from patients with paraneoplastic neurological disorders [78–80]. These are rare syndromes characterized by neurological dysfunctions affecting almost any part of the nervous system. Paraneoplastic neurological disorders are associated with lung and gynaecological tumours. Although the tumour itself is not directly responsible for the disease, it might express a protein antigen normally expressed in the nervous system. This can induce not only an anti-tumour immune response but also the progressive neurological damage that causes the disease [81]. Some Pnma proteins expressed in tumours of patients with paraneoplastic neurological disorders might be targeted by the autoimmune response, leading to neurological disorder.
As observed for the Mart genes, there is a link between Pnma gene functions and cell proliferation, apoptosis and cancer. Pnma4 (Ma4/Map1/Maop1) is a pro-apoptotic protein that can bind the pro-apoptotic Bax (Bcl2-associated X) protein and the pro-survival Bcl-2 and Bcl-X(L) proteins [82,83]. The stimulation of death receptors induces the formation of a complex between Pnma4 and the tumour suppressor protein RASSF1A, allowing the binding of Pnma4 to Bax. Bax conformational change and induction of apoptosis in response to death receptor stimulation require interaction between RASSF1A and Pnma4 [84]. A second Pnma protein, Pnma1/Ma1, is a pro-apoptotic protein in neurons: Pnma1 overexpression may contribute to neurodegenerative disorders [85]. Pnma1 also promotes cell growth in human pancreatic ductal adenocarcinoma [86]. Finally, Pnma10/SIZN1 is expressed in the mouse ventral embryonic forebrain. It contributes to Bone Morphogenetic Protein (BMP)-dependent cholinergic-neuron-specific gene expression, and is a candidate gene for X-linked mental retardation in human [87–89].

The SCAN domain gene family

The SCAN family of transcription factors includes C2H2 zinc finger proteins with a vertebrate-specific domain called SCAN in their N-terminal region [90,91]. This domain is a conserved 84-residue leucine-rich motif mediating interactions with other proteins; it presents structural homologies to the C-terminal domain of retroviral capsids. The SCAN gene family has probably been formed through the molecular domestication of a Gmr1-like LTR retrotransposon in an early tetrapod ancestor about 300 MYA [35,92,93]. Subsequent duplication events have led to the presence of around 70 and 40 genes in human and mouse, respectively. SCAN domain genes have also been identified in birds and reptiles [93].

SCAN domain transcription factors are involved in biological processes as diverse as haematopoiesis (Myeloid zinc finger 1 MZF1), regulation of pluripotency of embryonic stem cells (ZNF20688), control of lipid metabolism (ZNF202), regulation of hippocampal neuronal cholesterol biosynthesis (NRIF), chondrogenesis (ZNF449), as well as in muscle stem cell behaviour, core body temperature, body fat and maternal behaviour (PW1/Peg3) [91]. As observed in other Gag-related genes, SCAN domain genes play a role in the control of cell survival, proliferation and apoptosis, therefore having a possible involvement in cancer. NRIF is a mediator of neuronal apoptosis [94], ZNF307 suppresses the p53-p21 pathway possibly through p53 degradation [95] and Mzf1 is involved in the aetiology of major solid tumours such as lung, cervical, breast and colorectal cancers [96].
SCAN domain proteins are able to interact with many other proteins including the von Hippel–Lindau tumour suppressor protein (ZnF197), the E3 ubiquitin ligase Siah1a (Pw1/Peg3), the neurotrophin receptor p75 (NRIF), the peroxisome proliferator-activated receptors (SDP1 and PGC-2), the glucocorticoid receptor (Znf307), the transcriptional repressor Jumonji/Jarid2 (Zfp496), the NSD1 histone lysine methyltransferase (Nizp1) and the tumour necrosis factor mediator TRAF2 (Pw1/Peg3) [35].

Other gag-derived genes

Some Gag-derived proteins play a role in the defence of organisms against infections by retroviruses. This is exemplified in the mouse by the Friend-virus-susceptibility-1 (Fv1) gene, which protects against infection by the murine leukaemia virus and other types of retroviruses [97,98]. Fv1 was formed about 7 MYA from the gag gene of an endogenous retrovirus family, MERV-L [97]. This virus, which is not directly related to the murine leukaemia virus, is present in many copies in the mouse and human genomes. Fv1 blocks infection through interaction with the capsid protein of the restricted murine leukaemia virus [99]. Fv1 appears to control replication of the murine leukaemia virus after entry into the target cell but before integration and formation of the provirus. In mouse, Fv1 is able to prevent or delay spontaneous or experimentally induced viral tumours.

In sheep, two independent genes called enJS56A1 and enJSRV20 have been derived from enJSRV Jaagsiekte endogenous betaretroviruses. The proteins encoded by these genes can block the replication of related exogenous retroviruses [100–103].

The activity-regulated cytoskeleton-associated protein (Arc) gene (aka Arg3.1) is a neuron-specific gene that works at excitatory synapses and is required for learning and memory in the mammalian brain [104–107]. Arc was derived from the gag gene of a Gypsy-26-I_DR retrotransposon before the divergence between mammals and amphibians [35].

Finally, a family of endogenous retrovirus-derived sequences is specifically expressed in embryonic stem cells and early embryos in chicken [108]. Among them, a gag-like gene called Ens-1/Erni is involved in the timing of neural plate emergence during embryonic development through the control of the expression of the Sox2 transcription factor [109,110].

Integrase-derived genes

Compared with gag-derived genes, genes derived from integrases of retroviruses and LTR retrotransposons are not numerous in vertebrate genomes. This might appear surprising, as molecular domestication of the evolutionarily and structurally related DNA transposases is not rare [111]. Two genes related to integrases, called Gin-1 and Gin-2, have been identified in vertebrates [112–114]. Gin-2 is present in cartilaginous and ray-finned fish, amphibians, birds and reptiles, suggesting a molecular domestication event approximately 500 MYA. However, this gene has been lost in placental mammals. Gin-1 is more recent and was formed independently of Gin-2 in a common ancestor of birds/reptiles and mammals. Nothing is known of the functions of either gene, but the expression pattern of Gin-2 in zebrafish embryos suggests a role during gastrulation [114]. Additional analyses have demonstrated that Gin genes have been formed from particular DNA transposons that have themselves recruited an integrase from LTR retrotransposons to use as a transposase [113]. Hence, the Gin genes are only indirectly derived from LTR retroelement integrases.

In mammals, a gene called CGIN1 has been formed through the fusion of an endogenous retroviral sequence (integrase and RNase H) and a duplicated copy of the cellular gene KIAA0323 [115]. This domestication event took place c. 125–180 MYA in a common ancestor of marsupials and Eutherians. The CGIN1 integrase-like domain has been inactivated by mutations but might have retained the three-dimensional folding observed in retroviral integrases. A role in the resistance to retroviruses through regulation of viral protein ubiquitination has been proposed [115].

Protease-derived genes

Retroviral-like aspartic protease genes that are not embedded within endogenous retroviral elements have been identified in vertebrate genomes. One of them, SASPase (ASPRV1), which is possibly derived from a retroviral/retrotransposon aspartic protease, is expressed in human and mouse epidermis [116,117]. In mouse, SASPase is involved in wrinkle formation and is necessary for the maintenance of the texture and hydration of the stratum corneum, the outermost layer of the epidermis [117,118]. In addition to the protease domain, SASPase also shows homology with the capsid domain of Gag proteins from LTR retroelements [35].

In human, two genes called DDI1 and NIX1 encode predicted products with similarities to aspartyl proteases from LTR retroelements [119]. Both proteins are related to the yeast Ddi1p protein and have homologues in α- and γ-Proteobacteria, suggesting horizontal transfer between eukaryotes and Proteobacteria [119]. In the mouse, NIX1 is expressed only in specific neurons of the central nervous system; NIX1 binds ligand-activated/constitutively active nuclear receptors and down-regulates transcription [120].

Envelope-derived genes

One of the most intriguing examples of convergent exaptation of LTR retroelement genes is provided by the Syncytins, which are genes derived from endogenous retrovirus envelope genes. Syncytins have been recruited several times independently in different sublineages in mammals, and are involved in the formation of the placenta, which forms the nutritional and protective interface between the mother and the developing foetus [121–124]. Syncytins are expressed in the syncytiotrophoblast layer, a continuous structure with microvillar surfaces forming the outermost foetal component of the placenta. This layer is important for exchanges between mother and foetus. Syncytins promote the fusion of trophoblast cells, which leads to the formation of the syncytiotrophoblast. Some Syncytins, but not all, also possess immunosuppressive properties, which might help to protect foetal tissues from the maternal immune system [125].

The first Syncytin to be discovered was Syncytin-1 in human, which corresponds to an envelope-like protein encoded by a defective HERV-W provirus [126,127]. Besides placenta formation, Syncytin-1 might also be involved in osteoclast fusion and bone resorption [128] and may also regulate neuroinflammation in multiple sclerosis [129]. Subsequently, Syncytin-2, a second placental Syncytin with fusogenic properties, derived from a HERV-FRD endogenous retrovirus, was found in human [130]. Both Syncytin-1 and -2 genes are conserved in simians, where other potentially intact retroviral env genes have been detected, but with no hint of their functions. Genes for Syncytin-1 (apes) and Syncytin-2 (apes and monkeys) have been acquired through independent infection events.

Syncytin-1 cell–cell fusion requires interaction with the human sodium-dependent neutral amino acid transporter type 2 hASCT2 [131]. The Syncytin-2 receptor, MFSD2, is a presumptive carbohydrate transporter with 10–12 membrane-spanning domains. This receptor, which is placenta-specific and expressed at the level of the syncytiotrophoblast, is important for trophoblast fusion [132]. Another human placenta-specific HERV-derived protein called Suppressyn inhibits cell fusion. Through its binding to hASCT2, this protein might compete with Syncytin-1 to modulate its function in placenta [133].

Two Syncytin genes of independent origins, called Syncytin-A and Syncytin-B, have also been identified in the mouse and other Muroidea species [134,135]. Both exhibit placenta-specific expression and cell–cell fusogenic properties.
Homozygous Syncytin-A knockout mouse embryos die in utero between 11.5 and 13.5 days of gestation, demonstrating the critical function of this gene in development. In these embryos, absence of trophoblast cell fusion and a defect in the formation of one of the two syncytiotrophoblast layers are associated with decreased vascularization, inhibition of placental transport and foetal growth retardation [136]. Syncytin-B null embryos are viable, but Syncytin-A null embryos die prematurely when Syncytin-B is also inactivated [137].

In addition to simians and mouse-related rodents, Syncytin-like genes have also been identified in squirrel-related rodents [138], lagomorphs [139], guinea pig [140], ruminants [141], including Fematrin-1 [142–144], Afrotherian tenrecs [145] and Carnivora, the latter being the oldest Syncytin gene identified to date with an age of 60–85 million years [146]. Expression of an env-derived Syncytin gene with cell–cell fusogenic properties has also been found in the short-lived marsupial placenta of the South American marsupial opossum [147]. All of these Syncytin genes have been introduced in the germ line of their hosts through independent retrovirus infections, indicating recurrent convergent domestication of env sequences into Syncytin genes in mammals.

Other env-derived genes possess functions independent of placenta formation, including the restriction of infection by other viruses [148]. For instance, the Fv-4 (Friend virus susceptibility protein 4, aka Akvr1 for AKR virus restriction 1) locus in the mouse controls the susceptibility to infection by ecotropic murine leukaemia virus. This locus corresponds to a gene constituted by an entire murine leukaemia virus env gene flanked by a partial pol sequence and a 3′ murine leukaemia virus LTR. Expression of the Env protein confers resistance to virus infection [149]. In domestic chicken, several env-derived genes that originated from endogenous avian leucosis viruses confer protection against exogenous avian leucosis viruses by blocking their entry into the cell [150]. In domestic sheep, one copy of an endogenous Jaagsiekte sheep retrovirus expresses an env gene that restricts infection by exogenous Jaagsiekte sheep retroviruses [151]. Other examples of env-derived genes involved in resistance to infection are known from domestic mouse (Rmcf and Rmcf2) and cat (Refrex-1) (for review see ref. [148]).

Conclusions

In this review, we summarize evidence that retroviruses and LTR retrotransposons, which are mostly selfish and even infectious, have repeatedly contributed their coding sequences for the formation of new protein-coding genes in vertebrates. LTR retroelements have therefore played an important role in the evolution and diversification of the vertebrate lineage (Table 1). This is particularly true for the mammalian lineage, where many gag- and env-derived genes have been identified. In a survey of 24 TE-derived mammalian genes across 90 genomes, only a few domesticated genes were found outside Eutherians, with only ten in marsupials and three in monotremes [74]. Outside mammals, only two genes are present in reptiles (ARC and GIN1, with only ARC in amphibians; GIN2 was not included in this study), and none of the studied genes were detected in fish [74,114]. However, this study was mammalian-centred, and almost nothing is known about molecular domestication in non-mammalian vertebrate sublineages.

Some domestication events might have been crucial for the emergence and evolution of new structures leading to major evolutionary transitions within vertebrates, for example

Clinical Microbiology and Infection © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved, CMI, 22, 312–323


Clinical Microbiology and Infection, Volume 22 Number 4, April 2016

through the formation of new organs. For instance, many LTR retroelement-derived genes, including env-derived Syncytins and some gag-derived Mart genes, are necessary for the formation of the placenta, and therefore might have played a role in the evolution of viviparity and in the emergence of placental mammals. However, so far there is no evidence for an ancestral Syncytin present in a common ancestor of placental mammals, indicating either that the appearance of Syncytin was not at the origin of the emergence of the placental lineage, or that this ancestral Syncytin has been subsequently replaced by other sublineage-specific Syncytins.

In addition, LTR retroelement genes might have contributed to sublineage-specific diversification in vertebrates [12]. This might be the case again for Syncytins, which have been introduced many times independently in different vertebrate sublineages and may contribute to placenta diversity in mammals. The placenta is a rapidly evolving and diverse organ in mammals, perhaps due to the proposed conflict between mother and embryo, and a ready-made source of useful coding sequence will benefit its fast evolution [152]. Diversification of placenta in mammals by ERVs and other TEs has also probably occurred through regulatory gene network rewiring [20–22]. Other domesticated genes involved in cognitive functions like the gag-derived Zcchc16/Mart4/Sirh11 and Arc genes might also have mediated important brain-linked variations between vertebrate sublineages.

Some specific biological pathways seem to be more prone to the recruitment of LTR retroelement genes, including placenta formation, cognitive functions, control of cell proliferation and apoptosis (with implications in cancer biology) [63], and defence against retroelements. The latter property might be derived from mechanisms used by the ancestral retroelement to restrict activity or infection by competitors. For example, retrovirus envelopes of pre-infected cells can block further infections by viruses that use the same host receptor, a phenomenon called receptor interference.

There is no doubt that what we know about the function and evolutionary impact of retroviruses and retrotransposons is only the tip of the iceberg. Many retroelement- and other TE-derived genes are still to be discovered, particularly young genes not easily identified by comparative genomics, and old genes that have heavily diverged from their TE ancestors. Even genes with well-characterized functions might have additional properties useful to their hosts, as suggested by the Mart genes with placental functions, which are expressed in many other embryonic and adult tissues and organs [38]. The evolutionary route(s) by which an assumed genome parasite or infectious agent can become a cellular gene that is useful to the host also needs to be better understood. Which types of mutations are required for the new function, regulatory and/or coding changes? Are neo-functionalizing mutations really required, or might the ancestral TE already carry beneficial functions before fixation through the loss of sequences necessary for transposition? The upcoming wave of new genome sequences will help to answer these questions, allowing us to better understand how parasitic and/or infectious retroelements have contributed new genes for the emergence and diversification of the vertebrate lineage.

TABLE 1. Examples of protein-coding genes derived from LTR retrotransposons and retroviruses in vertebrates

Ancestral coding sequence: Gag

Gene/gene family: MART/Sirh family (Peg10/Mart2, Peg11/Rtl1/Mart1, Ldoc1/Sirh7/Mart, Zcchc16/Mart4/Sirh11, etc.)
Ancestral retroelement: Ty3/Gypsy retrotransposon (Sushi retrotransposon)
Species/lineage: Mammals
Functions/pathways: Placenta formation (Peg10, Peg11, Ldoc1); brain cognitive functions (Zcchc16); adipocyte differentiation (Peg10); cell proliferation and apoptosis (Peg10, Peg11, Ldoc1)
Associated pathologies: Placentomegaly in paternal uniparental disomy for chromosome 14 (Peg11); cancer (Peg10, Peg11, Ldoc1)
References: [37–73]

Gene/gene family: Ma/Pnma family (Pnma4/Ma4/Map1/Maop1, Pnma1/Ma1, Pnma10/SIZN1, etc.)
Ancestral retroelement: Ty3/Gypsy retrotransposon (Gypsy12_DR-related retrotransposon)
Species/lineage: Mammals
Functions/pathways: Cell proliferation and apoptosis (Pnma1, Pnma4)
Associated pathologies: Paraneoplastic neurological disorders; X-linked mental retardation? (Pnma10)
References: [74–89]

Gene/gene family: SCAN family (Mzf1, ZNF20688, ZNF202, NRIF, ZNF449, PW1/Peg3, etc.)
Ancestral retroelement: Ty3/Gypsy retrotransposon (Gmr1-like retrotransposon)
Species/lineage: Vertebrates
Functions/pathways: Haematopoiesis (Mzf1); regulation of pluripotency of embryonic stem cells (ZNF20688); control of lipid metabolism (ZNF202); regulation of hippocampal neuronal cholesterol biosynthesis (NRIF); chondrogenesis (ZNF449); muscle stem cell behaviour, core body temperature, body fat and maternal behaviour (PW1/Peg3); control of cell survival, proliferation and apoptosis, etc.
Associated pathologies: Cancers (Mzf1)
References: [90–96]

Gene/gene family: Fv1
Ancestral retroelement: Endogenous murine retrovirus (MERV-L)
Species/lineage: Mouse
Functions/pathways: Protection against infection by the murine leukaemia virus (MLV)
Associated pathologies: ?
References: [97–99]

Gene/gene family: enJS56A1, enJSRV-20
Ancestral retroelement: Jaagsiekte endogenous betaretrovirus (enJSRV)
Species/lineage: Sheep
Functions/pathways: Protection against infection by retroviruses
Associated pathologies: ?
References: [100–103]

Gene/gene family: Arc/Arg3.1
Ancestral retroelement: Ty3/Gypsy retrotransposon (Gypsy-26-I_DR-related retrotransposon)
Species/lineage: Mammals, amphibians
Functions/pathways: Learning and memory in mammalian brain
Associated pathologies: ?
References: [104–107]

Ancestral coding sequence: Integrase

Gene/gene family: Gin-1, Gin-2
Ancestral retroelement: GIN transposons
Species/lineage: Gin-2: fish, amphibians, birds and reptiles; Gin-1: mammals, birds and reptiles
Functions/pathways: Gastrulation in zebrafish (Gin-2)?
Associated pathologies: ?
References: [109–111]

Gene/gene family: CGIN1
Ancestral retroelement: Endogenous retrovirus (fusion with a gene)
Species/lineage: Mammals
Functions/pathways: Resistance to retroviruses?
Associated pathologies: ?
References: [112]

Ancestral coding sequence: Protease

Gene/gene family: SASPase/ASPRV1
Ancestral retroelement: ?
Species/lineage: Mammals
Functions/pathways: Maintenance of the epidermis
Associated pathologies: ?
References: [113–115]

Gene/gene family: DDI1, NIX1
Ancestral retroelement: ?
Species/lineage: Eukaryotes and Prokaryotes
Functions/pathways: Neuronal protein (NIX1)
Associated pathologies: ?
References: [116,117]

Ancestral coding sequence: Envelope

Gene/gene family: Syncytins/Fematrin-1
Ancestral retroelement: Endogenous retroviruses
Species/lineage: Mammals
Functions/pathways: Placenta formation
Associated pathologies: ?
References: [118–144]

Gene/gene family: Fv-4
Ancestral retroelement: Murine leukaemia virus (MuLV)
Species/lineage: Mouse
Functions/pathways: Restriction of infection by ecotropic MuLV
References: [145,146]

Transparency Declaration

The authors declare that they have no conflicts of interest.

Acknowledgements

This work is supported by a grant from the Emerging Projects programme of the Ecole Normale Supérieure de Lyon.

References

[1] Malik HS, Henikoff S, Eickbush TH. Poised for contagion: evolutionary origins of the infectious abilities of invertebrate retroviruses. Genome Res 2000;10:1307–18.
[2] Ribet D, Harper F, Dupressoir A, Dewannieux M, Pierron G, Heidmann T. An infectious progenitor for the murine IAP retrotransposon: emergence of an intracellular genetic parasite from an ancient retrovirus. Genome Res 2008;18:597–609.
[3] Chalopin D, Naville M, Plard F, Galiana D, Volff JN. Comparative analysis of transposable elements highlights mobilome diversity and evolution in vertebrates. Genome Biol Evol 2015;7:567–80.
[4] Poulter RT, Butler MI. Tyrosine recombinase retrotransposons and transposons. Microbiol Spectr 2015;3. MDNA3-0036-2014.
[5] Volff JN, Bouneau L, Ozouf-Costaz C, Fischer C. Diversity of retrotransposable elements in compact pufferfish genomes. Trends Genet 2003;19:674–8.
[6] de la Chaux N, Wagner A. BEL/Pao retrotransposons in metazoan genomes. BMC Evol Biol 2011;11:154.
[7] Hayward A, Cornwallis CK, Jern P. Pan-vertebrate comparative genomics unmasks retrovirus macroevolution. Proc Natl Acad Sci USA 2015;112:464–9.
[8] Herniou E, Martin J, Miller K, Cook J, Wilkinson M, Tristem M. Retroviral diversity and distribution in vertebrates. J Virol 1998;72:5955–66.
[9] Feschotte C, Gilbert C. Endogenous viruses: insights into viral evolution and impact on host biology. Nat Rev Genet 2012;13:283–96.
[10] Volff JN. Turning junk into gold: domestication of transposable elements and the creation of new genes in eukaryotes. Bioessays 2006;28:913–22.
[11] Biémont C, Vieira C. Genetics: junk DNA as an evolutionary force. Nature 2006;443:521–4.
[12] Böhne A, Brunet F, Galiana-Arnoux D, Schultheis C, Volff JN. Transposable elements as drivers of genomic and biological diversity in vertebrates. Chromosome Res 2008;16:203–15.
[13] Goodier JL, Kazazian Jr HH. Retrotransposons revisited: the restraint and rehabilitation of parasites. Cell 2008;135:23–35.
[14] Burns KH, Boeke JD. Human transposon tectonics. Cell 2012;149:740–52.
[15] Warren IA, Naville M, Chalopin D, Levin P, Berger CS, Galiana D, et al. Evolutionary impact of transposable elements on genomic diversity and lineage-specific innovation in vertebrates. Chromosome Res 2015;Sep 22 [Epub ahead of print].
[16] Feschotte C. Transposable elements and the evolution of regulatory networks. Nat Rev Genet 2008;9:397–405.
[17] Rebollo R, Romanish MT, Mager DL. Transposable elements: an abundant and natural source of regulatory sequences for host genes. Annu Rev Genet 2012;46:21–42.
[18] Glinsky GV. Transposable elements and DNA methylation create in embryonic stem cells Human-specific regulatory sequences associated with distal enhancers and noncoding RNAs. Genome Biol Evol 2015;7:1432–54.
[19] Wang T, Zeng J, Lowe CB, Sellers RG, Salama SR, Yang M, et al. Species-specific endogenous retroviruses shape the transcriptional network of the human tumor suppressor protein p53. Proc Natl Acad Sci USA 2007;104:18613–8.
[20] Lynch VJ, Leclerc RD, May G, Wagner GP. Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Nat Genet 2011;43:1154–9.
[21] Chuong EB, Rumi MA, Soares MJ, Baker JC. Endogenous retroviruses function as species-specific enhancer elements in the placenta. Nat Genet 2013;45:325–9.
[22] Lynch VJ, Nnamani MC, Kapusta A, Brayer K, Plaza SL, Mazur EC, et al. Ancient transposable elements transformed the uterine regulatory landscape and transcriptome during the evolution of mammalian pregnancy. Cell Rep 2015;10:551–61.
[23] Sinzelle L, Izsvák Z, Ivics Z. Molecular domestication of transposable elements: from detrimental parasites to useful host genes. Cell Mol Life Sci 2009;66:1073–93.
[24] Alzohairy AM, Gyulai G, Jansen RK, Bahieldin A. Transposable elements domesticated and neofunctionalized by eukaryotic genomes. Plasmid 2013;69:1–15.
[25] Hadjiargyrou M, Delihas N. The intertwining of transposable elements and non-coding RNAs. Int J Mol Sci 2013;14:13307–28.
[26] Hoen DR, Bureau TE. Discovery of novel genes derived from transposable elements using integrative genomic analysis. Mol Biol Evol 2015;32:1487–506.
[27] Kapitonov VV, Jurka J. RAG1 core and V(D)J recombination signal sequences were derived from Transib transposons. PLoS Biol 2005;3:e181.
[28] Kapitonov VV, Koonin EV. Evolution of the RAG1-RAG2 locus: both proteins came from the same transposon. Biol Direct 2015;10:1–8.


[29] Casola C, Hucks D, Feschotte C. Convergent domestication of pogo-like transposases into centromere-binding proteins in fission yeast and mammals. Mol Biol Evol 2008;25:29–41.
[30] Lee SH, Oshige M, Durant ST, Rasila KK, Williamson EA, Ramsey H, et al. The SET domain protein Metnase mediates foreign DNA integration and links integration to nonhomologous end-joining repair. Proc Natl Acad Sci USA 2005;102:18075–80.
[31] Cordaux R, Udit S, Batzer MA, Feschotte C. Birth of a chimeric primate gene by capture of the transposase gene from a mobile element. Proc Natl Acad Sci USA 2006;103:8101–6.
[32] Volff J, Körting C, Schartl M. Ty3/Gypsy retrotransposon fossils in mammalian genomes: did they evolve into new cellular functions? Mol Biol Evol 2001;18:266–70.
[33] Zdobnov EM, Campillos M, Harrington ED, Torrents D, Bork P. Protein coding potential of retroviruses and other transposable elements in vertebrate genomes. Nucleic Acids Res 2005;33:946–54.
[34] Volff JN, Brosius J. Modern genomes with retro-look: retrotransposed elements, retroposition and the origin of new genes. Genome Dyn 2007;3:175–90.
[35] Kaneko-Ishino T, Ishino F. The role of genes domesticated from LTR retrotransposons and retroviruses in mammals. Front Microbiol 2012;3:262.
[36] Campillos M, Doerks T, Shah PK, Bork P. Computational characterization of multiple Gag-like human proteins. Trends Genet 2006;22:585–9.
[37] Brandt J, Veith AM, Volff JN. A family of neofunctionalized Ty3/gypsy retrotransposon genes in mammalian genomes. Cytogenet Genome Res 2005;110:307–17.
[38] Brandt J, Schrauth S, Veith AM, Froschauer A, Haneke T, Schultheis C, et al. Transposable elements as a source of genetic innovation: expression and evolution of a family of retrotransposon-derived neogenes in mammals. Gene 2005;345:101–11.
[39] Suzuki S, Ono R, Narita T, Pask AJ, Shaw G, Wang C, et al. Retrotransposon silencing by DNA methylation can drive mammalian genomic imprinting. PLoS Genet 2007;3:e55.
[40] Ono R, Kuroki Y, Naruse M, Ishii M, Iwasaki S, Toyoda A, et al. Identification of tammar wallaby SIRH12, derived from a marsupial-specific retrotransposition event. DNA Res 2011;18:211–9.
[41] Butler M, Goodwin T, Simpson M, Singh M, Poulter R. Vertebrate LTR retrotransposons of the Tf1/sushi group. J Mol Evol 2001;52:260–74.
[42] Manktelow E, Shigemoto K, Brierley I. Characterization of the frameshift signal of Edr, a mammalian example of programmed -1 ribosomal frameshifting. Nucleic Acids Res 2005;33:1553–63.
[43] Clark MB, Jänicke M, Gottesbühren U, Kleffmann T, Legge M, Poole ES, et al. Mammalian gene PEG10 expresses two reading frames by high efficiency -1 frameshifting in embryonic-associated tissues. J Biol Chem 2007;282:37359–69.
[44] Ono R, Kobayashi S, Wagatsuma H, Aisaka K, Kohda T, Kaneko-Ishino T, et al. A retrotransposon-derived gene, PEG10, is a novel imprinted gene located on human chromosome 7q21. Genomics 2001;73:232–7.
[45] Charlier C, Segers K, Wagenaar D, Karim L, Berghmans S, Jaillon O, et al. Human-ovine comparative sequencing of a 250-kb imprinted domain encompassing the callipyge (clpg) locus and identification of six imprinted transcripts: DLK1, DAT, GTL2, PEG11, antiPEG11, and MEG8. Genome Res 2001;11:850–62.
[46] Seitz H, Youngson N, Lin SP, Dalbert S, Paulsen M, Bachellerie JP, et al. Imprinted microRNA genes transcribed antisense to a reciprocally imprinted retrotransposon-like gene. Nat Genet 2003;34:261–2.
[47] Davis E, Caiment F, Tordoir X, Cavaillé J, Ferguson-Smith A, Cockett N, et al. RNAi-mediated allelic trans-interaction at the imprinted Rtl1/Peg11 locus. Curr Biol 2005;15:743–9.


[48] Henke C, Strissel PL, Schubert MT, Mitchell M, Stolt CC, Faschingbauer F, et al. Selective expression of sense and antisense transcripts of the sushi-ichi-related retrotransposon-derived family during mouse placentogenesis. Retrovirology 2015;12:9.
[49] Emera D, Wagner GP. Transposable element recruitments in the mammalian placenta: impacts and mechanisms. Brief Funct Genomics 2012;11:267–76.
[50] Ono R, Nakamura K, Inoue K, Naruse M, Usami T, Wakisaka-Saito N, et al. Deletion of Peg10, an imprinted gene acquired from a retrotransposon, causes early embryonic lethality. Nat Genet 2006;38:101–6.
[51] Henke C, Ruebner M, Faschingbauer F, Stolt CC, Schaefer N, Lang N, et al. Regulation of murine placentogenesis by the retroviral genes Syncytin-A, Syncytin-B and Peg10. Differentiation 2013;85:150–60.
[52] Smallwood A, Papageorghiou A, Nicolaides K, Alley MK, Jim A, Nargund G, et al. Temporal regulation of the expression of syncytin (HERV-W), maternally imprinted PEG10, and SGCE in human placenta. Biol Reprod 2003;69:286–93.
[53] Zhou QY, Huang JN, Xiong YZ, Zhao SH. Imprinting analyses of the porcine GATM and PEG10 genes in placentas on days 75 and 90 of gestation. Genes Genet Syst 2007;82:265–9.
[54] Sekita Y, Wagatsuma H, Nakamura K, Ono R, Kagami M, Wakisaka N, et al. Role of retrotransposon-derived imprinted gene, Rtl1, in the feto-maternal interface of mouse placenta. Nat Genet 2008;40:243–8.
[55] Kagami M, Yamazawa K, Matsubara K, Matsuo N, Ogata T. Placentomegaly in paternal uniparental disomy for human chromosome 14. Placenta 2008;29:760–1.
[56] Georgiades P, Watkins M, Surani MA, Ferguson-Smith AC. Parental origin-specific developmental defects in mice with uniparental disomy for chromosome 12. Development 2000;127:4719–28.
[57] Ito M, Sferruzzi-Perri AN, Edwards CA, Adalsteinsson BT, Allen SE, Loo TH, et al. A trans-homologue interaction between reciprocally imprinted miR-127 and Rtl1 regulates placenta development. Development 2015;142:2425–30.
[58] Naruse M, Ono R, Irie M, Nakamura K, Furuse T, Hino T, et al. Sirh7/Ldoc1 knockout mice exhibit placental P4 overproduction and delayed parturition. Development 2014;141:4763–71.
[59] Steplewski A, Krynska B, Tretiakova A, Haas S, Khalili K, Amini S. MyEF-3, a developmentally controlled brain-derived nuclear protein which specifically interacts with myelin basic protein proximal regulatory sequences. Biochem Biophys Res Commun 1998;243:295–301.
[60] Hishida T, Naito K, Osada S, Nishizuka M, Imagawa M. Peg10, an imprinted gene, plays a crucial role in adipocyte differentiation. FEBS Lett 2007;581:4272–8.
[61] Xu X, Ectors F, Davis EE, Pirottin D, Cheng H, Farnir F, et al. Ectopic Expression of retrotransposon-derived PEG11/RTL1 contributes to the callipyge muscular hypertrophy. PLoS One 2015;10:e0140594.
[62] Irie M, Yoshikawa M, Ono R, Iwafune H, Furuse T, Yamada I, et al. Cognitive function related to the Sirh11/Zcchc16 gene acquired from an LTR Retrotransposon in eutherians. PLoS Genet 2015;11:e1005521.
[63] Riordan JD, Dupuy AJ. Domesticated transposable element gene products in human cancer. Mob Genet Elements 2013;3:e26693.
[64] Okabe H, Satoh S, Furukawa Y, Kato T, Hasegawa S, Nakajima Y, et al. Involvement of PEG10 in human hepatocellular carcinogenesis through interaction with SIAH1. Cancer Res 2003;63:3043–8.
[65] Li CM, Margolin AA, Salas M, Memeo L, Mansukhani M, Hibshoosh H, et al. PEG10 is a c-MYC target gene in cancer cells. Cancer Res 2006;66:665–72.
[66] Kainz B, Shehata M, Bilban M, Kienle D, Heintel D, Krömer-Holzinger E, et al. Overexpression of the paternally expressed gene 10 (PEG10) from the imprinted locus on chromosome 7q21 in high-risk B-cell chronic lymphocytic leukemia. Int J Cancer 2007;121:1984–93.
[67] Wang C, Xiao Y, Hu Z, Chen Y, Liu N, Hu G. PEG10 directly regulated by E2Fs might have a role in the development of hepatocellular carcinoma. FEBS Lett 2008;582:2793–8.
[68] Dong H, Ge X, Shen Y, Chen L, Kong Y, Zhang H, et al. Gene expression profile analysis of human hepatocellular carcinoma using SAGE and LongSAGE. BMC Med Genomics 2009;2:5.
[69] Riordan JD, Keng VW, Tschida BR, Scheetz TE, Bell JB, Podetz-Pedersen KM, et al. Identification of rtl1, a retrotransposon-derived imprinted gene, as a novel driver of hepatocarcinogenesis. PLoS Genet 2013;9:e1003441.
[70] Nagasaki K, Schem C, von Kaisenberg C, Biallek M, Rösel F, Jonat W, et al. Leucine-zipper protein, LDOC1, inhibits NF-κB activation and sensitizes pancreatic cancer cells to apoptosis. Int J Cancer 2003;105:454–8.
[71] Inoue M, Takahashi K, Niide O, Shibata M, Fukuzawa M, Ra C. LDOC1, a novel MZF-1-interacting protein, induces apoptosis. FEBS Lett 2005;579:604–8.
[72] Duzkale H, Schweighofer CD, Coombes KR, Barron LL, Ferrajoli A, O’Brien S, et al. LDOC1 mRNA is differentially expressed in chronic lymphocytic leukemia and predicts overall survival in untreated patients. Blood 2011;117:4076–84.
[73] Buchholtz ML, Jückstock J, Weber E, Mylonas I, Dian D, Brüning A. Loss of LDOC1 expression by promoter methylation in cervical cancer cells. Cancer Invest 2013;31:571–7.
[74] Kokosar J, Kordis D. Genesis and regulatory wiring of retroelement-derived domesticated genes: a phylogenomic perspective. Mol Biol Evol 2013;30:1015–31.
[75] Schüller M, Jenne D, Voltz R. The human PNMA family: novel neuronal proteins implicated in paraneoplastic neurological disease. J Neuroimmunol 2005;169:172–6.
[76] Wills NM, Moore B, Hammer A, Gesteland RF, Atkins JF. A functional -1 ribosomal frameshift signal in the human paraneoplastic Ma3 gene. J Biol Chem 2006;281:7082–8.
[77] Iwasaki S, Suzuki S, Pelekanos M, Clark H, Ono R, Shaw G, et al. Identification of a novel PNMA-MS1 gene in marsupials suggests the LTR retrotransposon-derived PNMA genes evolved differently in marsupials and eutherians. DNA Res 2013;20:425–36.
[78] Dalmau J, Gultekin SH, Voltz R, Hoard R, DesChamps T, Balmaceda C, et al. Ma1, a novel neuron- and testis-specific protein, is recognized by the serum of patients with paraneoplastic neurological disorders. Brain 1999;122:27–39.
[79] Voltz R, Gultekin SH, Rosenfeld MR, Gerstner E, Eichen J, Posner JB, et al. A serologic marker of paraneoplastic limbic and brain-stem encephalitis in patients with testicular cancer. N Engl J Med 1999;340:1788–95.
[80] Rosenfeld MR, Eichen JG, Wade DF, Posner JB, Dalmau J. Molecular and clinical diversity in paraneoplastic immunity to Ma proteins. Ann Neurol 2001;50:339–48.
[81] Darnell RB, Posner JB. Paraneoplastic syndromes affecting the nervous system. Semin Oncol 2006;33:270–98.
[82] Tan KO, Tan KM, Chan SL, Yee KS, Bevort M, Ang KC, et al. MAP-1, a novel proapoptotic protein containing a BH3-like motif that associates with Bax through its Bcl-2 homology domains. J Biol Chem 2001;276:2802–7.
[83] Tan KO, Fu NY, Sukumaran SK, Chan SL, Kang JH, Poon KL, et al. MAP-1 is a mitochondrial effector of Bax. Proc Natl Acad Sci USA 2005;102:14623–8.
[84] Baksh S, Tommasi S, Fenton S, Yu VC, Martins LM, Pfeifer GP, et al. The tumour suppressor RASSF1A and MAP-1 link death receptor signaling to Bax conformational change and cell death. Mol Cell 2005;18:637–50.
[85] Chen HL, D’Mello SR. Induction of neuronal cell death by paraneoplastic Ma1 antigen. J Neurosci Res 2010;88:3508–19.


[86] Jiang SH, He P, Ma MZ, Wang Y, Li RK, Fang F, et al. PNMA1 promotes cell growth in human pancreatic ductal adenocarcinoma. Int J Clin Exp Pathol 2014;7:3827–35.
[87] Cho G, Bhat SS, Gao J, Collins JS, Rogers RC, Simensen RJ, et al. Evidence that SIZN1 is a candidate X-linked mental retardation gene. Am J Med Genet A 2008;146A:2644–50.
[88] Cho G, Lim Y, Zand D, Golden JA. Sizn1 is a novel protein that functions as a transcriptional coactivator of bone morphogenic protein signaling. Mol Cell Biol 2008;28:1565–72.
[89] Cho G, Lim Y, Golden JA. XLMR candidate mouse gene, Zcchc12 (Sizn1) is a novel marker of Cajal-Retzius cells. Gene Expr Patterns 2011;11:216–20.
[90] Sander TL, Stringer KF, Maki JL, Szauter P, Stone JR, Collins T. The SCAN domain defines a large family of zinc finger transcription factors. Gene 2003;310:29–38.
[91] Edelstein LC, Collins T. The SCAN domain family of zinc finger transcription factors. Gene 2005;359:1–17.
[92] Ivanov D, Stone JR, Maki JL, Collins T, Wagner G. Mammalian SCAN domain dimer is a domain-swapped homolog of the HIV capsid C-terminal domain. Mol Cell 2005;17:137–43.
[93] Emerson RO, Thomas JH. Gypsy and the birth of the SCAN domain. J Virol 2011;85:12043–52.
[94] Linggi MS, Burke TL, Williams BB, Harrington A, Kraemer R, Hempstead BL, et al. Neurotrophin receptor interacting factor (NRIF) is an essential mediator of apoptotic signaling by the p75 neurotrophin receptor. J Biol Chem 2005;280:13801–8.
[95] Li J, Wang Y, Fan X, Mo X, Wang Z, Li Y, et al. ZNF307, a novel zinc finger gene suppresses p53 and p21 pathway. Biochem Biophys Res Commun 2007;363:895–900.
[96] Eguchi T, Prince T, Wegiel B, Calderwood SK. Role and regulation of myeloid zinc finger protein 1 in cancer. J Cell Biochem 2015;116:2146–54.
[97] Best S, Le Tissier P, Towers G, Stoye JP. Positional cloning of the mouse retrovirus restriction gene Fv1. Nature 1996;382:826–9.
[98] Yap MW, Colbeck E, Ellis SA, Stoye JP. Evolution of the retroviral restriction gene Fv1: inhibition of non-MLV retroviruses. PLoS Pathog 2014;10:e1003968.
[99] Nair S, Rein A. Antiretroviral restriction factors in mice. Virus Res 2014;193:130–4.
[100] Mura M, Murcia P, Caporale M, Spencer TE, Nagashima K, Rein A, et al. Late viral interference induced by transdominant Gag of an endogenous retrovirus. Proc Natl Acad Sci USA 2004;101:11117–22.
[101] Arnaud F, Caporale M, Varela M, Biek R, Chessa B, Alberti A, et al. A paradigm for virus–host coevolution: sequential counteradaptations between endogenous and exogenous retroviruses. PLoS Pathog 2007;3:e170.
[102] Arnaud F, Varela M, Spencer TE, Palmarini M. Coevolution of endogenous betaretroviruses of sheep and their host. Cell Mol Life Sci 2008;65:3422–32.
[103] Armezzani A, Varela M, Spencer TE, Palmarini M, Arnaud F. “Ménage à Trois”: the evolutionary interplay between JSRV, enJSRVs and domestic sheep. Viruses 2014;6:4926–45.
[104] Plath N, Ohana O, Dammermann B, Errington ML, Schmitz D, Gross C, et al. Arc/Arg3.1 is essential for the consolidation of synaptic plasticity and memories. Neuron 2006;52:437–44.
[105] Bramham CR, Alme MN, Bittins M, Kuipers SD, Nair RR, Pai B, et al. The Arc of synaptic memory. Exp Brain Res 2010;200:125–40.
[106] Day C, Shepherd JD. Arc: building a bridge from viruses to memory. Biochem J 2015;469:e1–3.
[107] Li Y, Pehrson AL, Waller JA, Dale E, Sanchez C, Gulinello M. A critical evaluation of the activity-regulated cytoskeleton-associated protein (Arc/Arg3.1)’s putative role in regulating dendritic plasticity, cognitive processes, and mood in animal models of depression. Front Neurosci 2015;9:279.


[108] Lerat E, Birot AM, Samarut J, Mey A. Maintenance in the chicken genome of the retroviral-like cENS gene family specifically expressed in early embryos. J Mol Evol 2007;65:215–27.
[109] Streit A, Berliner AJ, Papanayotou C, Sirulnik A, Stern CD. Initiation of neural induction by FGF signalling before gastrulation. Nature 2000;406:74–8.
[110] Papanayotou C, Mey A, Birot AM, Saka Y, Boast S, Smith JC, et al. A mechanism regulating the onset of Sox2 expression in the embryonic neural plate. PLoS Biol 2008;6:e2.
[111] Feschotte C, Pritham EJ. DNA transposons and the evolution of eukaryotic genomes. Annu Rev Genet 2007;41:331–68.
[112] Lloréns C, Marín I. A mammalian gene evolved from the integrase domain of an LTR retrotransposon. Mol Biol Evol 2001;18:1597–600.
[113] Marín I. GIN transposons: genetic elements linking retrotransposons and genes. Mol Biol Evol 2010;27:1903–11.
[114] Chalopin D, Galiana D, Volff JN. Genetic innovation in vertebrates: gypsy integrase genes and other genes derived from transposable elements. Int J Evol Biol 2012;2012:724519.
[115] Marco A, Marín I. CGIN1: a retroviral contribution to mammalian genomes. Mol Biol Evol 2009;26:2167–70.
[116] Bernard D, Méhul B, Thomas-Collignon A, Delattre C, Donovan M, Schmidt R. Identification and characterization of a novel retroviral-like aspartic protease specifically expressed in human epidermis. J Invest Dermatol 2005;125:278–87.
[117] Matsui T, Kinoshita-Ida Y, Hayashi-Kisumi F, Hata M, Matsubara K, et al. Mouse homologue of skin-specific retroviral-like aspartic protease involved in wrinkle formation. J Biol Chem 2006;281:27512–25.
[118] Matsui T, Kinoshita-Ida Y, Hayashi-Kisumi F, Hata M, Matsubara K, Chiba M, et al. SASPase regulates stratum corneum hydration through profilaggrin-to-filaggrin processing. EMBO Mol Med 2011;3:320–33.
[119] Krylov DM, Koonin EV. A novel family of predicted retroviral-like aspartyl proteases with a possible key role in eukaryotic cell cycle control. Curr Biol 2001;11:R584–7.
[120] Greiner EF, Kirfel J, Greschik H, Huang D, Becker P, Kapfhammer JP, et al. Differential ligand-dependent protein–protein interactions between nuclear receptors and a neuronal-specific cofactor. Proc Natl Acad Sci USA 2000;97:7160–5.
[121] Prudhomme S, Bonnaud B, Mallet F. Endogenous retroviruses and animal reproduction. Cytogenet Genome Res 2005;110:353–64.
[122] Dupressoir A, Lavialle C, Heidmann T. From ancestral infectious retroviruses to bona fide cellular genes: role of the captured syncytins in placentation. Placenta 2012;33:663–71.
[123] Lavialle C, Cornelis G, Dupressoir A, Esnault C, Heidmann O, Vernochet C, et al. Paleovirology of ‘syncytins’, retroviral env genes exapted for a role in placentation. Philos Trans R Soc Lond B Biol Sci 2013;368:20120507.
[124] Lokossou AG, Toudic C, Barbeau B. Implication of human endogenous retrovirus envelope proteins in placental functions. Viruses 2014;6:4609–27.
[125] Mangeney M, Renard M, Schlecht-Louf G, Bouallaga I, Heidmann O, Letzelter C, et al. Placental syncytins: genetic disjunction between the fusogenic and immunosuppressive activity of retroviral envelope proteins. Proc Natl Acad Sci USA 2007;104:20534–9.
[126] Mi S, Lee X, Li X, Veldman GM, Finnerty H, Racie L, et al. Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis. Nature 2000;403:785–9.
[127] Mallet F, Bouton O, Prudhomme S, Cheynet V, Oriol G, Bonnaud B, et al. The endogenous retroviral locus ERVWE1 is a bona fide gene involved in hominoid placental physiology. Proc Natl Acad Sci USA 2004;101:1731–6.
[128] Søe K, Andersen TL, Hobolt-Pedersen AS, Bjerregaard B, Larsson LI, Delaissé JM. Involvement of human endogenous retroviral syncytin-1 in human osteoclast fusion. Bone 2011;48:837–46.
[129] Antony JM, Ellestad KK, Hammond R, Imaizumi K, Mallet F, Warren KG, et al. The human endogenous retrovirus envelope glycoprotein, syncytin-1, regulates neuroinflammation and its receptor expression in multiple sclerosis: a role for endoplasmic reticulum chaperones in astrocytes. J Immunol 2007;179:1210–24.
[130] Blaise S, de Parseval N, Bénit L, Heidmann T. Genomewide screening for fusogenic human endogenous retrovirus envelopes identifies syncytin 2, a gene conserved on primate evolution. Proc Natl Acad Sci USA 2003;100:13013–8.
[131] Blond JL, Lavillette D, Cheynet V, Bouton O, Oriol G, Chapel-Fernandes S, et al. An envelope glycoprotein of the human endogenous retrovirus HERV-W is expressed in the human placenta and fuses cells expressing the type D mammalian retrovirus receptor. J Virol 2000;74:3321–9.
[132] Esnault C, Priet S, Ribet D, Vernochet C, Bruls T, Lavialle C, et al. A placenta-specific receptor for the fusogenic, endogenous retrovirus-derived, human syncytin-2. Proc Natl Acad Sci USA 2008;105:17532–7.
[133] Sugimoto J, Sugimoto M, Bernstein H, Jinno Y, Schust D. A novel human endogenous retroviral protein inhibits cell-cell fusion. Sci Rep 2013;3:1462.
[134] Dupressoir A, Marceau G, Vernochet C, Bénit L, Kanellopoulos C, Sapin V, et al. Syncytin-A and syncytin-B, two fusogenic placenta-specific murine envelope genes of retroviral origin conserved in Muridae. Proc Natl Acad Sci USA 2005;102:725–30.
[135] Vernochet C, Redelsperger F, Harper F, Souquere S, Catzeflis F, Pierron G, et al. The captured retroviral envelope syncytin-A and syncytin-B genes are conserved in the Spalacidae together with hemotrichorial placentation. Biol Reprod 2014;91:148.
[136] Dupressoir A, Vernochet C, Bawa O, Harper F, Pierron G, Opolon P, et al. Syncytin-A knockout mice demonstrate the critical role in placentation of a fusogenic, endogenous retrovirus-derived, envelope gene. Proc Natl Acad Sci USA 2009;106:12127–32.
[137] Dupressoir A, Vernochet C, Harper F, Guégan J, Dessen P, Pierron G, et al. A pair of co-opted retroviral envelope syncytin genes is required for formation of the two-layered murine placental syncytiotrophoblast. Proc Natl Acad Sci USA 2011;108:E1164–73.
[138] Redelsperger F, Cornelis G, Vernochet C, Tennant BC, Catzeflis F, Mulot B, et al. Capture of syncytin-Mar1, a fusogenic endogenous retroviral envelope gene involved in placentation in the Rodentia squirrel-related clade. J Virol 2014;88:7915–28.
[139] Heidmann O, Vernochet C, Dupressoir A, Heidmann T. Identification of an endogenous retroviral envelope gene with fusogenic activity and placenta-specific expression in the rabbit: a new “syncytin” in a third order of mammals. Retrovirology 2009;6:107.
[140] Vernochet C, Heidmann O, Dupressoir A, Cornelis G, Dessen P, Catzeflis F, et al. A syncytin-like endogenous retrovirus envelope gene of the guinea pig specifically expressed in the placenta junctional zone and conserved in Caviomorpha. Placenta 2011;32:885–92.
[141] Cornelis G, Heidmann O, Degrelle SA, Vernochet C, Lavialle C, Letzelter C, et al. Captured retroviral envelope syncytin gene associated with the unique placental structure of higher ruminants. Proc Natl Acad Sci USA 2013;110:E828–37.
[142] Nakaya Y, Koshi K, Nakagawa S, Hashizume K, Miyazawa T. Fematrin-1 is involved in fetomaternal cell-to-cell fusion in Bovinae placenta and has contributed to diversity of ruminant placentation. J Virol 2013;87:10563–72.
[143] Nakaya Y, Miyazawa T. The roles of Syncytin-like proteins in ruminant placentation. Viruses 2015;7:2928–42.
[144] Koshi K, Nakaya Y, Kizaki K, Ishiguro-Oonuma T, Miyazawa T, Spencer TE, et al. Induction of ovine trophoblast cell fusion by fematrin-1 in vitro. Anim Sci J 2015;Jul 24.
[145] Cornelis G, Vernochet C, Malicorne S, Souquere S, Tzika AC, Goodman SM, et al. Retroviral envelope syncytin capture in an ancestrally diverged mammalian clade for placentation in the primitive Afrotherian tenrecs. Proc Natl Acad Sci USA 2014;111:E4332–41.

Clinical Microbiology and Infection © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved, CMI, 22, 312–323

