Comparison of the GAGA factor genes of Drosophila ...

Dev Genes Evol (1998) 208:447–456

© Springer-Verlag 1998

ORIGINAL ARTICLE

Karl-Georg Lintermann · Günther E. Roth · Kirst King-Jones · Günter Korge · Michael Lehmann

Comparison of the GAGA factor genes of Drosophila melanogaster and Drosophila virilis reveals high conservation of GAGA factor structure beyond the BTB/POZ and DNA-binding domains

Received: 17 April 1998 / Accepted: 6 May 1998

Abstract As a member of the trithorax-group, the Trithorax-like (Trl) gene of Drosophila melanogaster contributes to the expression of homeotic genes and many other genes. Trl encodes different isoforms of the GAGA factor, which is thought to act as an “antirepressor” of transcription by remodelling chromatin structure and thereby rendering control regions accessible for transcriptional activators. A more global role of the GAGA factor in chromatin structure and function is suggested by various phenotypes of Trl mutations, such as modification of position effect variegation. To better define the molecular basis of these pleiotropic effects, we cloned cDNAs encoding the GAGA isoforms of D. melanogaster and a distantly related species, D. virilis. We also characterized the genomic organization of both the D. melanogaster and D. virilis genes, and analysed the expression patterns of isoform-specific mRNAs. The D. virilis GAGA isoforms show high similarity to their D. melanogaster counterparts, particularly within the BTB/POZ protein-interaction and the zinc finger DNA-binding domains. Interestingly, conservation clearly extends beyond the previously defined limits of these domains. Moreover, the comparison reveals a completely conserved block of amino acid residues located between the BTB/POZ and DNA-binding domains, and a high conservation of the C-terminus specific for one of the GAGA isoforms. Thus, sequences of as yet unknown functions are defined as rewarding targets for further mutational analyses. The high conservation of the GAGA proteins of the two species is in accord with the nearly identical genomic organization and expression patterns of the corresponding genes.

Key words Drosophila · Evolutionary conservation · GAGA factor · Trithorax-like

Edited by D. Tautz

K.-G. Lintermann · G. E. Roth · K. King-Jones · G. Korge · M. Lehmann (✉)
Institut für Genetik, Freie Universität Berlin, Arnimallee 7, D-14195 Berlin, Germany

Introduction

The transcription of many D. melanogaster genes, amongst them developmental genes like Ultrabithorax or Krüppel and the heat shock genes hsp70 and hsp26, depends on binding of the GAGA factor to GA-rich sites within the control regions of these genes (for reviews see Granok et al. 1995; Wilkins and Lis 1997). In in vitro transcription assays employing the control regions of the Krüppel or hsp70 genes, binding of the GAGA factor counteracts the repressive effects of nucleosomes and histone H1 on transcription (Croston et al. 1991; Tsukiyama et al. 1994). Relief of repression at the hsp70 promoter is achieved by concerted action of the GAGA factor and the ATP-dependent nucleosome remodelling factor NURF (Tsukiyama and Wu 1995; Tsukiyama et al. 1995). These and other studies (Kerrigan et al. 1991; Lu et al. 1992, 1993; Shopland et al. 1995; Wall et al. 1995) led to a general model in which the GAGA factor acts by remodelling the chromatin structure of transcriptional control regions, thereby rendering these regions accessible for other sequence-specific transcription factors. The GAGA protein thus has the properties of an ‘antirepressor’ rather than of a ‘true’ transcriptional activator. Consistent with this view, mutations of the Trithorax-like (Trl) gene, which encodes the GAGA factor, affect homeotic gene expression and are dominant enhancers of position effect variegation (PEV; Farkas et al. 1994), processes which are believed to be crucially dependent on changes in chromatin structure. PEV is observed when regions of heterochromatin and euchromatin are juxtaposed by a chromosomal rearrangement. Genes located within the euchromatin (e.g. the white gene) are then inactivated in a variegating pattern that is clonally inherited. A possible mechanism for this inactivation is the spreading of heterochromatin into the euchromatic sequences, a process which is enhanced or suppressed by modifiers of PEV such as the GAGA factor.
Its ability to modify PEV suggests that the GAGA factor generally has a function in altering heterochromatin structures. This assumption is supported by the finding that the


GAGA factor is associated with specific regions of the heterochromatin throughout the cell cycle (Raff et al. 1994) and by the observation of a variety of nuclear cleavage cycle defects in a Trl mutant (Bhat et al. 1996). Thus the GAGA factor not only operates as a transcriptional regulator, but also seems to play a role in maintaining other chromatin functions. The Trl gene encodes at least two isoforms, GAGA-519 and GAGA-581, of the GAGA protein (Soeller et al. 1993; Benyajati et al. 1997). These isoforms share a single C2H2 zinc finger motif and two adjacent basic regions, which together constitute the DNA-binding domain (Pedone et al. 1996). Moreover, they share a putative protein-protein interaction domain located at the N-terminus, the BTB/POZ domain, which is found in a variety of DNA-binding and non-DNA-binding proteins of different species (Godt et al. 1993; Bardwell and Treisman 1994; Zollman et al. 1994; Albagli et al. 1995). The GAGA BTB/POZ domain apparently inhibits DNA binding by the associated zinc finger domain, and it can specifically interact with the BTB/POZ domain of another Drosophila zinc finger protein, Tramtrack, in vitro (Bardwell and Treisman 1994). The two GAGA isoforms are distinguished by a third, glutamine-rich domain located at the C-terminus, which is encoded by isoform-specific exons. D. melanogaster is still the only species whose GAGA factor has been characterized at the molecular level. In view of the apparent diversity of GAGA functions, we wondered whether comparison with the GAGA factor of a distantly related species might help to discern new functional elements and to define more precisely the known functional domains of the different GAGA isoforms. We chose D. virilis for comparison, since this species is separated by about 60 million years of evolution from D. melanogaster (Beverley and Wilson 1984), enough time to expect conservation only of sequences which are governed by functional constraints.
In fact, mutation rates are high enough to allow complete divergence of those amino acid sequences between the two species which are not under strong negative selection (Schmid and Tautz 1997). Accordingly, D. virilis has been successfully used for analysis of functional conservation in a number of genes (Kassis et al. 1986; Colot et al. 1988; Treier et al. 1989; Neufeld et al. 1991; O’Neil and Belote 1992; Tominaga et al. 1992; Letsou et al. 1993; Hart et al. 1993; Bomze and López 1994; Zhou and Boulianne 1994; Webster et al. 1994; Curtis et al. 1995; Pepling and Gergen 1995; Poole 1995; Bopp et al. 1996). These studies revealed a high degree of variation in the extent of overall conservation of amino acid sequences, ranging from 36% (O’Neil and Belote 1992) to 98% (Tominaga et al. 1992). In the present study we show that Trl belongs to those genes which are highly conserved between the two Drosophila species. Our analysis of the genomic organization of the D. melanogaster and D. virilis genes reveals a high correspondence of their exon-intron structures. In addition, the expression patterns of the isoform-specific transcripts are very similar in both species. At the protein level, not only are the BTB/POZ and zinc finger domains highly conserved, but so are considerable portions of the flanking sequences, suggesting that these domains may have broader limits than previously suspected. Interestingly, additional regions of the GAGA proteins are also conserved between the two species, indicating that these regions possess an as yet unknown functional significance.

Materials and methods

Library screens

Standard procedures for manipulating DNA were as described by Sambrook et al. (1989). Hybridizations with homologous probes were carried out at 42°C in 5×SSPE containing 5×Denhardt’s, 50% formamide, 0.5% sodium dodecyl sulphate (SDS) and 100 µg/ml herring sperm DNA. Washes were twice in 2×saline sodium citrate (SSC), 0.1% SDS at room temperature and twice in 0.1×SSC, 0.1% SDS at 52°C. Hybridizations with heterologous probes were carried out in the same way, except that the formamide concentration was 30%. To isolate genomic DNA from the D. melanogaster Trl locus, we constructed a cosmid library in pWE15 (Stratagene) using DNA prepared from transformant line E5 (Hofmann et al. 1987) according to the procedure of Jowett (1986). The library was screened using a 1.1-kb EcoRI/XhoI fragment from the D. melanogaster GAGA cDNA Dr.1 (Fig. 1). Southern hybridizations of restriction digests of cosmid clone cos344 with the probe used for screening and with restriction fragments derived from the D. melanogaster cDNAs Dm1.67 and DmW20 revealed that this clone covers all known GAGA cDNA sequences. The cDNAs Dr.1 and Dm1.67 were isolated from a λgt11 cDNA library from D. melanogaster third instar larvae (a gift from Hartmut Bornschein), and cDNA DmW20 was isolated from a λ cDNA library from adult females (a gift from Rudi Grams). Dr.1 encodes a truncated form of the GAGA-519 isoform lacking the first 64 N-terminal amino acid residues, Dm1.67 contains the complete sequence encoding GAGA-519, and DmW20 the complete coding and trailer sequences specific for transcripts encoding the GAGA-581 isoform (Fig. 1). Genomic DNA from the D. virilis GAGA locus was isolated from a genomic library obtained from D. virilis third instar larvae (a gift from Wolfgang Lanio). Screening of the library, with the same 1.1-kb fragment that was used for screening of the D. melanogaster genomic library, yielded three overlapping genomic λ clones.
A 0.6-kb SalI/SalI (second Sal site from the polylinker of the vector) fragment from one of these clones, which contains a sequence present in all the λ clones, was used for control hybridizations (see Results) and screening of a λ cDNA library from D. virilis pupae (kindly provided by Andrés Jarrin Hentschel).

DNA sequencing and sequence alignments

DNA sequences were determined by the chain-termination method using the Sequenase version 2.0 and ∆Taq cycle DNA sequencing kits (USB/Amersham Life Science) or automated cycle sequencing (MWG-BIOTECH). Sequence alignments and calculation of sequence identities/similarities were performed using the FASTA program package (Pearson 1990) provided by the BCM Search Launcher (Human Genome Center, Baylor College of Medicine, Houston, Tex.). Nucleotide sequences have been deposited in the EMBL database under Accession Numbers AJ005174 (cDNA DvA) and AJ005175 (cDNA DvB).
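
To illustrate how identity and similarity percentages of the kind obtained from such alignments can be derived from a gapped pairwise alignment, a minimal Python sketch follows. It is not the FASTA implementation: the function names, the simple grouping of conservative residues and the toy sequences are assumptions made purely for illustration, and the denominator (alignment columns without gaps) is only one of several conventions in use.

```python
# Illustrative sketch only: this is NOT the FASTA algorithm used in the study.
# The residue groups below are one simple, assumed definition of
# "conservative" exchanges; FASTA relies on a scoring matrix instead.
CONSERVATIVE_GROUPS = ("ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG")


def _aligned_pairs(seq_a, seq_b):
    """Return residue pairs from alignment columns that contain no gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
            if a != "-" and b != "-"]


def percent_identity(seq_a, seq_b):
    """Percentage of ungapped columns with identical residues."""
    pairs = _aligned_pairs(seq_a, seq_b)
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def percent_similarity(seq_a, seq_b):
    """Percentage of ungapped columns with identical or same-group residues."""
    pairs = _aligned_pairs(seq_a, seq_b)
    if not pairs:
        return 0.0

    def similar(a, b):
        return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)

    return 100.0 * sum(similar(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy alignment (hypothetical, not GAGA sequence); "-" marks a gap.
    toy_a = "ACDESF-K"
    toy_b = "ACDDTFGK"
    print(round(percent_identity(toy_a, toy_b), 1))
    print(round(percent_similarity(toy_a, toy_b), 1))
```

Excluding gapped columns from the denominator keeps insertions, such as additional residues in a glutamine-rich tract, from lowering the identity value; other conventions count gaps as mismatches and give lower figures.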

Northern analyses and whole-mount RNA in situ hybridizations

Total RNA was extracted from embryos, larvae, pupae and adult flies using TRIzol (GIBCO), and poly (A)+ RNA was isolated from the RNA preparations using prepacked oligo (dT) columns (GIBCO) according to the instructions of the manufacturer. RNA underwent electrophoresis through 1% formaldehyde-agarose gels and was blotted onto a Nytran NY 13 nylon membrane (Schleicher and Schüll). Hybridizations with RNA probes were performed in 5×SSC, 50% formamide, 0.5% SDS, 20 mM sodium phosphate buffer (pH 7.0), 0.5×Denhardt’s reagent and 0.5 µg/ml herring sperm DNA overnight at 60°C. Washing conditions were 10 min in 2×SSC, 0.1% SDS and 2×20 min in 0.1×SSC, 0.1% SDS at 68–76°C. In order to generate RNA probes specific for the two transcript classes, 3′ fragments unique to the class A and B transcripts (see Fig. 1) were amplified by polymerase chain reaction (PCR) and subcloned into pBluescript (Stratagene). Antisense RNA probes from the resulting plasmids were obtained by in vitro transcription reactions using standard techniques (Sambrook et al. 1989). A DNA fragment specific for transcript class A was amplified using primers OVG27 (positions 1838–1857, cDNA DvA) and OVG28 (positions 2029–2010, ibid.). A class B-specific fragment was obtained using primers OVG25 (positions 863–882, cDNA DvB) and OVG26 (positions 1023–1004, ibid.). To obtain a fragment which allows simultaneous detection of transcripts of both classes, a short sequence from the BTB/POZ encoding region was amplified using primers OVG12 (positions 853–872 of cDNA DvA) and OVG2 (positions 1077–1097; ibid.). A D. virilis rp49 probe (kindly provided by Andrés Jarrin Hentschel) was used in control hybridizations. Whole-mount RNA in situ hybridizations with digoxigenin-labelled DNA probes were performed according to Tautz and Pfeifle (1989).
Transcript class-specific probes were generated by PCR amplification of short sequences from the 3′ regions specific for transcript classes A and B using the same primers as described above. For synthesis of the D. melanogaster GAGA-519 (class A)-specific probe, primers OMG6 (positions 1798–1779 of cDNA GAGA-519a; Benyajati et al. 1997) and OMG9 (positions 1709–1726, ibid.) were used. The D. melanogaster GAGA-581 (class B)-specific probe was synthesized using primers OMG4 (positions 2801–2782 of cDNA GAGA-581; Benyajati et al. 1997) and OMG10 (positions 2694–2713, ibid.).

In situ hybridization to polytene chromosomes

In situ hybridizations to polytene salivary gland chromosomes of D. virilis were performed as described (Lehmann and Korge 1995) using a biotin-labelled DNA probe generated by random priming from the 0.6 kb SalI/SalI fragment that was used for library screening (see above).
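
The primer coordinates given above for the class-specific and common probes imply the approximate lengths of the amplified fragments. The following Python sketch works this out. It assumes that the stated positions are 1-based and inclusive and that each product spans the outermost primer coordinates. The dictionary labels and the function name are illustrative and do not appear in the study.

```python
# Primer coordinates as stated in Materials and methods. Each primer is
# written as (first position, last position) exactly as given in the text,
# so reverse primers appear high-to-low. Labels are illustrative only.
PRIMER_PAIRS = {
    "D. virilis class A (OVG27/OVG28, cDNA DvA)": ((1838, 1857), (2029, 2010)),
    "D. virilis class B (OVG25/OVG26, cDNA DvB)": ((863, 882), (1023, 1004)),
    "D. virilis common BTB/POZ (OVG12/OVG2, cDNA DvA)": ((853, 872), (1077, 1097)),
    "D. melanogaster class A (OMG9/OMG6)": ((1709, 1726), (1798, 1779)),
    "D. melanogaster class B (OMG10/OMG4)": ((2694, 2713), (2801, 2782)),
}


def amplicon_length(primer_1, primer_2):
    """Length in bp of the fragment spanning the outermost primer positions.

    Assumes 1-based, inclusive coordinates on the same cDNA.
    """
    coords = primer_1 + primer_2  # concatenate the two coordinate tuples
    return max(coords) - min(coords) + 1


if __name__ == "__main__":
    for label, (p1, p2) in PRIMER_PAIRS.items():
        print(f"{label}: {amplicon_length(p1, p2)} bp")
```

Under these assumptions the D. virilis fragments come out at 192 bp (class A), 161 bp (class B) and 245 bp (common BTB/POZ probe), and the D. melanogaster fragments at 90 bp (class A) and 108 bp (class B).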

Results

Genomic structure of the D. melanogaster Trl gene

Several GAGA cDNAs have been isolated from D. melanogaster and described by different workers (Lu et al. 1993; Soeller et al. 1993; Raff et al. 1994; Farkas et al. 1994; Benyajati et al. 1997), but information about the genomic structure of the D. melanogaster Trl gene is still very limited. To compare the D. virilis and D. melanogaster GAGA-encoding genes also at the genomic level, we first isolated and characterized genomic DNA fragments containing the D. melanogaster Trl gene. Using restriction fragments from various D. melanogaster GAGA cDNAs (see Materials and methods), we identified a cosmid clone which contains all GAGA cDNA sequences described so far (Lu et al. 1993; Soeller et al. 1993; Raff et al. 1994; Farkas et al. 1994; Benyajati et al. 1997). By restriction mapping, sequence analysis and comparison with the GAGA cDNAs, we determined the exon-intron boundaries of the D. melanogaster Trl gene (Fig. 1). The Trl gene is transcribed into mRNAs which can be classified into at least two transcript classes, A and B, encoding two GAGA isoforms of a deduced sequence of 519 and 581 amino acid residues respectively (Benyajati et al. 1997; and this study). The protein coding sequences of class A transcripts are split between four exons which are separated by three introns of 2.2 kb, 118 bp and 160 bp. Class B transcripts are derived by use of an alternative splice site within exon IV. Transcripts of both classes thus share exons I to III and the 5′ portion of exon IV. The BTB/POZ domain is encoded, in about equal shares, by exons I and II. In addition to the C-terminal half of the BTB/POZ domain, exon II also encodes a putative nuclear localization signal (Dingwall and Laskey 1991). The minimal DNA-binding domain of the GAGA factor consists of a single C2H2 zinc finger and two regions of basic amino acids located immediately N-terminal of the zinc finger (Pedone et al. 1996), and is encoded by exons III and IV. The 3′ end of exon III contains basic region 1 and the rest of the binding domain is located in that part of exon IV which is common to both transcript classes. This part also contains a third region of basic amino acid residues located C-terminal of the zinc finger which seems to be dispensable for DNA binding (Pedone et al. 1996). The polypeptide encoded by the 3′ part of exon IV, which is specific for class A transcripts, is characterized by stretches of polyglutamine. Similar regions of high glutamine content are also found in the polypeptide encoded by the class B-specific exon V (Fig. 4).

Isolation of the D. virilis GAGA-factor-encoding gene

When a genomic Southern blot of EcoRI-digested D. virilis DNA is hybridized under moderate stringency with a 1.1-kb EcoRI/XhoI restriction fragment from the common 5′ end of the D. melanogaster GAGA cDNAs (see Fig. 1), a single fragment of 7.4 kb is detected, suggesting that a gene homologous to the D. melanogaster Trl gene is present in D. virilis (data not shown). In order to isolate the GAGA-factor-encoding gene of D. virilis, we screened a D. virilis genomic λ library using the same EcoRI/XhoI fragment as a probe. From this screen, we obtained three overlapping genomic λ clones. A 0.6-kb fragment (Fig. 1), which is contained in all of these clones, hybridized with the D. melanogaster probe as well as with the 7.4-kb EcoRI fragment detected by this probe on the D. virilis Southern blot, suggesting that the λ clones contain D. virilis GAGA-factor-encoding sequences. The 0.6-kb fragment was then used to screen a cDNA library from D. virilis prepupae. This screen yielded five GAGA cDNA clones which were characterized by Southern analysis, restriction mapping and sequence analysis (see below).


Fig. 1A, B Genomic organization and transcript classes of the Drosophila melanogaster (A) and D. virilis (B) GAGA-factor-encoding genes. Structures of class A transcripts and of the 3′ ends specific for class B transcripts, as well as the positions of the probes used for library screens and Northern hybridizations, are given beneath the genomic maps of the GAGA-factor-encoding loci of the two species. Positions of exons in the D. melanogaster transcripts were mapped by comparison of the genomic clones with cDNAs isolated by others (Soeller et al. 1993; Benyajati et al. 1997) and in this study (cDNA clone DmW20). Exon-intron boundaries of the D. virilis gene were determined by comparison of the genomic clones with cDNA clones DvA (class A transcripts) and DvB (class B transcripts). Since the 5′ ends of neither the D. melanogaster nor the D. virilis GAGA transcripts have been mapped, the indicated 5′ structures may be incomplete and even different for specific transcripts. Protein coding sequences are shown as boxes and untranslated regions as horizontal lines. The positions of sequences encoding the BTB/POZ domain, the putative nuclear localization signal (NLS) and the single zinc finger (ZF) are indicated by arrows. The arrowhead marks the alternative splice site used in class B transcripts. In view of the complex patterns of GAGA transcripts and proteins detected by Northern and Western analysis (Soeller et al. 1993; Bhat et al. 1996; Benyajati et al. 1997), it is expected that additional exons will be discovered. Therefore only exons containing protein coding sequences are numbered here to facilitate discussion in the text (B BamHI, E EcoRI, P PstI, R EcoRV, S SalI, X XhoI, Xb XbaI)

Chromosome location of the D. virilis GAGA factor gene

The chromosomal position of the D. virilis GAGA-factor-encoding gene was mapped by in situ hybridization of polytene salivary chromosomes with a biotin-labelled probe. A single hybridization signal was observed on the third chromosome at position 36D6 (Fig. 2). The chromosomal region around this position contains a number of genes which are also found in the region of chromosome arm 3L of D. melanogaster which harbours the Trl gene at position 70EF. From the similar arrangement of these genes, it has been inferred that they are localized within homologous chromosome segments (Neufeld et al. 1991; Kress 1993). This conclusion is supported by our finding that the physical location of the GAGA-factor-encoding gene within these segments is also conserved.

Fig. 2 Localization of the D. virilis GAGA factor gene by in situ hybridization to polytene chromosomes of D. virilis salivary glands. A biotin-labelled probe was generated from a 0.6-kb genomic DNA fragment of the GAGA-factor-encoding gene (see Materials and methods). Using this probe, a single hybridization signal can be detected on the third chromosome at 36D6, as indicated by arrows

Comparison of the D. virilis and D. melanogaster GAGA genes and transcripts

The genomic organization of the D. virilis and D. melanogaster GAGA-factor-encoding genes is very similar. The boundaries between the protein-coding exons and the introns are located at exactly the same positions, and the introns are of similar size. Not only the protein-coding regions but also the regions encoding the 5′ and 3′ untranslated regions (5′ and 3′ UTRs) are split up by introns. Whilst the D. melanogaster GAGA-519 cDNA (Benyajati et al. 1997) indicates the presence of introns in both the 5′ and 3′ regions, the D. virilis cDNA DvA (see below) is representative of an mRNA derived without removal of a 3′ intron. However, in view of the high conservation of the genomic organization of the GAGA genes and the small number of analysed cDNAs, it seems likely that mRNAs with different 3′ UTRs are obtained by alternative splicing in both species. As with D. melanogaster, the D. virilis cDNAs can be related to two different transcript classes, A and B, which differ by the presence of two alternative 3′ ends (Fig. 1). The D. virilis GAGA cDNAs DvA and DvB, representative of transcript class A and B respectively, were sequenced completely. Since multiple GAGA transcripts are detected on Northern blots (Soeller et al. 1993; Bhat et al. 1996; and this study, Fig. 3), these cDNAs probably represent only a subset of the GAGA mRNAs. Like all other cDNAs isolated in this study, the DvA and DvB cDNAs contain putative polyadenylation (AAUAAA) signals at their 3′ ends and poly (A) stretches located 19–40 bp further downstream, indicating that they contain complete 3′ sequences. The cDNA DvA also contains the complete coding region and 333 bp of the 5′ UTR until it passes into a sequence that is not matched by the genomic sequences (first 358 nucleotides of cDNA DvA) characterized in this study. However, when this sequence is hybridized in situ to D. virilis polytene chromosomes, a hybridization signal is observed at 36D6, suggesting that the utmost 5′ sequence of DvA is also part of the GAGA-factor-encoding gene. The size of about 2.5 kb of the major class A transcript (for correlation between transcripts and transcript classes, see below) detected on Northern blots (Fig. 3) suggests that the DvA cDNA lacks only a small portion of the 5′ sequences, if any. The 5′ end of cDNA DvB is located, like the 5′ ends of all other cDNAs representative of transcript class B, within exon II, and thereby only defines the unique 3′ sequence of the class B transcripts. The sizes of 3.4 and 3.9 kb of the major class B transcripts (Fig. 3) are in good agreement with the transcript size that would result from usage of the alternative 3′ end. The three major GAGA mRNAs in D. virilis embryos have sizes similar to the corresponding mRNAs of D. melanogaster (Fig. 3). Whilst a 2.5-kb transcript is observed in both species, the two larger transcripts are slightly smaller in D. melanogaster than in D. virilis (3.0 versus 3.4 kb and 3.6 versus 3.9 kb), suggesting that the lengths of the leader and/or trailer sequences of the main transcripts differ between the two species. The sizes of the D. melanogaster transcripts, detected with a D. virilis probe, correspond well with the transcript sizes previously determined by Soeller et al. (1993) and Bhat et al. (1996).

Fig. 3 Northern blot analysis of GAGA transcripts from D. melanogaster and D. virilis embryos. Poly (A)+ RNA was isolated from 0 to 24-h-old embryos of D. virilis (Dv) and D. melanogaster (Dm), fractionated by gel electrophoresis and blotted onto a nylon membrane. The membrane was hybridized with the same 0.6-kb genomic DNA fragment from D. virilis that was used for library screening (see Materials and methods). The transcript sizes in kilobases (kb) were determined by running RNA molecular weight markers (Biolabs) in an adjacent lane

Comparison of the D. virilis and D. melanogaster GAGA proteins

The D. virilis GAGA cDNA DvA contains an open reading frame which encodes a GAGA isoform specific for class A transcripts of 556 amino acid residues. This is somewhat larger than the size of the corresponding D. melanogaster isoform of 519 amino acids. The larger size of the D. virilis protein is essentially due to the presence of an additional 32 amino acid residues within a glutamine-rich region located in the C-terminus encoded by exon IV (Fig. 4).
Fig. 4A, B Comparison of the amino acid sequences of the D. virilis and D. melanogaster GAGA proteins. A The complete sequences of the D. melanogaster GAGA-519 isoform (Soeller et al. 1993) and the D. virilis class A isoform encoded by cDNA DvA. B The C-terminal sequences specific for the D. melanogaster GAGA-581 isoform (Benyajati et al. 1997) and the D. virilis class B isoform encoded by cDNA DvB. Arrowheads indicate positions of introns. Sequences with identical residues between D. melanogaster and D. virilis are boxed and the BTB/POZ domain, a sequence with homology to the nuclear localization signal (NLS), the zinc finger, and the basic regions BR1–3 are indicated by black boxes. Dashes indicate gaps introduced to maximize the alignment

Usage of the alternative 5′ splice site of exon IV in class B cDNA DvB leads to a protein of 590 amino acids. The class B-specific isoform of D. virilis is thus only slightly larger than the corresponding 581 amino acid isoform of D. melanogaster. The D. virilis and D. melanogaster GAGA proteins reveal a high degree of overall sequence conservation (76% identity for class A isoform, and 85% identity for class B isoform). The most striking conservation is found within the N-terminal region encoded by exons I and II which contains the BTB/POZ domain. Not only the BTB/POZ domain itself, but also the following 63 amino acids and the immediate N-terminus are completely conserved between the D. melanogaster and D. virilis proteins. The amino acid sequences encoded by exons I and II differ in only four positions (98% identity), and even these differences are all conservative amino acid exchanges. In the remaining part of the protein which is common to both isoforms (exon III and 5′ part of exon IV), conservation of the amino acid sequence is somewhat lower but still high (78% identity, 88% similarity). As expected, the DNA-binding domain shows highest conservation within this region. The zinc finger motif as well as the adjacent N-terminal sequence containing basic regions 1 and 2 are completely conserved, with the exception of a conservative amino acid exchange in basic region 1. An alanine residue at position 324 of the D. melanogaster sequence, which is probably not directly involved in contact formation with the DNA target site (Omichinski et al. 1997), is replaced by a threonine residue in the D. virilis protein. Interestingly, basic region 3, which is located C-terminal to the zinc finger and is dispensable for DNA binding in vitro (Pedone et al. 1996), is also perfectly conserved. It should be noted that another region of 37 amino acids encoded by exon III (positions 270 to 306 of the D. virilis sequence), to which no functional importance has been ascribed so far, shows particularly high conservation. The only conservative exchange in this region is not present when the D. melanogaster GAGA-519 protein described by Benyajati et al. (1997) is chosen for comparison instead of the GAGA-519 protein of Soeller et al. (1993) shown in Fig. 4. The C-terminal amino acid sequences encoded by exon V, which are unique to the class B proteins, have a sequence identity of 75% (85% similarity) and thus show a clearly higher degree of conservation than the C-terminal amino acid sequences specific for the class A proteins


Fig. 5A–C Developmental Northern analyses of GAGA mRNAs in D. virilis. Poly (A)+ RNA from staged embryos, first (L1), second (L2), and third (L3) instar larvae, pupae, and adults, was fractionated by gel electrophoresis, blotted, and hybridized sequentially to different radiolabelled RNA probes. In A the filter was hybridized with a probe specific for class A mRNAs, in B with a probe detecting class B mRNAs, and in C with a probe complementary to the BTB/POZ domain-encoding sequence which detects both class A and class B mRNAs (for position of the probes, see Fig. 1). Hybridization to rp49 mRNA was performed as a loading control. Transcripts detected by the various probes are marked by arrows and their approximate sizes in kb are indicated. The putative maternal GAGA mRNA is located immediately underneath the 2.5-kb mRNA detected in A and C. The band marked by an arrowhead is observed with the class B-specific probe but not with the common probe. It is unknown whether this band is derived from an alternatively spliced GAGA mRNA, which is not represented by our cDNAs, or from a cross-reacting RNA

(46% identity, 66% similarity). Both the class A- and class B-specific regions contain many glutamine residues which contribute, for the most part, to the formation of homopolymeric tracts. Strikingly, these regions are also characterized by a high content of serine and threonine residues, especially within the sequences preceding the poly-glutamine tracts, where these amino acids constitute up to 37% of the amino acid residues. Apart from these similarities in amino acid content, the class A- and class B-specific regions are apparently not related to each other in amino acid sequence.

GAGA expression patterns in D. virilis and D. melanogaster

In order to correlate the two transcript classes with the transcripts detected in Northern blots, and to analyse whether these transcripts show any developmental specificity in their expression, we hybridized a Northern blot containing poly (A)+ RNA isolated from different developmental stages of D. virilis with probes specific for transcript classes A and B (Fig. 5). Despite differences in RNA loading, several conclusions can be drawn from the Northern analyses. First, the class A-specific probe detects only the 2.5-kb transcript and a slightly smaller transcript (Fig. 5A), whilst the class B-specific probe detects the 3.4- and 3.9-kb transcripts of D. virilis (Fig. 5B). The smaller transcript detected by the class A-specific probe is restricted to adult females and early embryos, suggesting that it represents a maternal mRNA. Second, the pattern of class A and class B transcript expression changes strikingly during development. This becomes particularly clear when the Northern blot is hybridized with a probe from the BTB/POZ-domain-encoding region, which detects both class A and class B transcripts (Fig. 5C). Whilst the class A transcripts clearly dominate in early embryos, transcripts of the two classes are present in similar amounts at later stages of embryogenesis (Fig. 5A–C, lanes 1 and 2). In first and second instar larvae the 3.4-kb class B mRNA is the dominating GAGA mRNA (Fig. 5A–C, lanes 3 and 4). In third instar larvae the 2.5-kb class A transcript then increases to a level comparable with the level of the 3.4-kb class B transcript, but the 3.9-kb class B transcript is still underrepresented (Fig. 5A–C, lane 5). This situation changes in the pupa, where the ratios between the three transcripts are comparable to the ratios seen in late embryos (Fig. 5A–C, lanes 2 and 6). A striking difference in the expression of the two transcript classes is then again detectable in adult flies of the two sexes. Whilst the class A transcripts dominate in females, such transcripts are not detectable in males (not even after longer exposures; data not shown). Even though it is not clear, in view of the low amount of RNA loaded onto the male lane, whether the class A transcripts are completely missing, the 3.4-kb class B transcript is clearly the dominating species in males. The 3.9-kb class B transcript, finally, seems to be strongly underrepresented in both males and females. A similar sex-specific expression of the class A and class B transcripts was observed in D. melanogaster (data not shown). The changes in the relative abundances of the transcript classes during D. virilis development are in good agreement with the developmental profiles of similarly sized GAGA transcripts in D. melanogaster (Soeller et al. 1993; Bhat et al. 1996). A developmental Western analysis of GAGA expression using isoform-specific antibodies (Benyajati et al. 1997), which allows the correlation of these transcripts with specific isoforms, strongly suggests that they encode the same isoforms as the D. virilis transcripts of about equal size. Thus, the developmental specificity of GAGA isoform expression also seems to be conserved between D. virilis and D. melanogaster. To test whether the two transcript classes show differences in their spatial and temporal occurrence in embryos, we also performed whole-mount in situ hybridizations using digoxigenin-labelled probes (Tautz and Pfeifle 1989) which in Northerns specifically detect transcripts of the two classes. With embryos of D. virilis and D. melanogaster we obtained homogeneous stainings with these probes in all developmental stages examined, suggesting that class A and B transcripts are ubiquitously present during embryogenesis (data not shown).

Discussion

To help define important functional elements of the GAGA factor, we have compared the GAGA-factor-encoding genes of two distantly related species. The overall conservation of the genes is high (76% identity for class A isoform, 85% identity for class B isoform). A similarly high degree of conservation between the two species has been found for many other regulatory genes (Kassis et al. 1986; Treier et al. 1989; Hart et al. 1993; Poole 1995; Zhou and Boulianne 1994; Bomze and López 1994; Neufeld et al. 1991; Pepling and Gergen 1995; Bopp et al. 1996). However, there are also a number of regulatory genes (Colot et al. 1988; O’Neil and Belote 1992; Letsou et al. 1993; Webster et al. 1994; Curtis et al. 1995) which show a clearly lower degree of conservation (