Globin Genes - The Journal of Biological Chemistry

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1989 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 264, No. 1, Issue of January 5, pp. 68-79, 1989 Printed in U.S.A.

Tarsius δ- and β-Globin Genes: Conversions, Evolution, and Systematic Implications* (Received for publication, May 20, 1988)

Ben F. Koop‡, David Siemieniak§, Jerry L. Slightom§¶, Morris Goodman¶, Joan Dunbar‡, Patricia C. Wright‖**, and Elwyn L. Simons‖ From the Departments of ‡Molecular Biology and Genetics and ¶Anatomy and Cell Biology, Wayne State University School of Medicine, Detroit, Michigan 48201, §Division of Molecular Biology, The Upjohn Company, Kalamazoo, Michigan 49007, ‖Duke University Primate Center, Durham, North Carolina 27705, and **Division of Biology, California Institute of Technology, Pasadena, California 91125

Comparisons between duplicated genes have shown that gene conversions play an important role in the evolution of multigene families. Previous comparisons have documented, in the recently duplicated γ-fetal globin genes of catarrhine primates, over 15 separate conversions affecting extensive stretches of coding and noncoding sequences. In the present study, δ- and β-globin genes from a lower primate, Tarsius syrichta, and the δ-globin gene of the Asian great ape, Pongo pygmaeus, have been isolated and sequenced. Comparisons of these sequences with other primate δ and β sequences confirmed a previously reported conversion in an anthropoid ancestor and revealed additional conversions in basal primate, stem haplorhine, tarsier, and early lemur lineages. Conversions found between primate δ- and β-globin genes contrast with those found in the γ-genes in that δ-β conversions appear much less frequently and are more restricted to regions conserved by selection (i.e. coding and 5'-regulatory sequences). These differences indicate that soon after a duplication occurs, conversions can be quite frequent and encompass extensive portions of the duplicated region. With time, sequence differences accumulate, particularly in noncoding regions, and limit both the frequency and size of the conversions. Sequences conserved by selection accumulate differences more slowly and are therefore subject to gene conversions for a longer period of time. Both unconverted and converted sequences were consistent in supporting the placement of tarsier with anthropoids.

Studies of primate, lagomorph, rodent, artiodactyl, and marsupial β-globin genes (Hill et al., 1984; Jeffreys et al., 1982; Hardison, 1984; Hardies et al., 1984; Harris et al., 1984; Goodman et al., 1984; Koop and Goodman, 1988; Goodman et al., 1987) indicate that the ancestral β-globin cluster of eutherian mammals (65-85 million years ago) consisted of five linked genes (5'-ε-γ-η-δ-β-3'). In the eutherian stem, ε-, γ-, and η-genes arose from an embryonically expressed progenitor while the δ- and β-genes arose from a postnatally expressed progenitor (Koop and Goodman, 1988). In mammals, the β-gene codes for the major adult β-chain and the δ-gene is either not expressed or expressed at reduced levels. Previous studies have shown that the δ-locus has not evolved as an independent lineage but in concert with the major adult β-globin genes. In each of the mammalian orders examined thus far, the δ-locus has acquired characteristics of the β-locus through a nonreciprocal exchange of nucleotide sequences or nonallelic gene conversion mechanism (Hardies et al., 1984; Hardison and Margot, 1984; Martin et al., 1983). These gene conversions have centered over the coding regions where they have essentially removed evidence for the early mammalian origins of the δ-locus. Evidence of early origins remains only in flanking and intron 2 sequences. In lemur, an unequal cross-over resulted in a hybrid ψ-locus (Jeffreys et al., 1982), but in most mammals the δ-locus evolved in close association with the adult β-locus.

Previous human and Old World monkey sequence comparisons used molecular clock models to suggest that a conversion of δ by β occurred in an anthropoid ancestor approximately 40 million years ago (Jeffreys et al., 1982; Martin et al., 1983; Hardison and Margot, 1984). In the present study we have extended these comparisons to include the complete sequences of an orangutan δ-gene and the δ- and β-genes of tarsier plus the recently sequenced lemur (Harris et al., 1986) and spider monkey δ sequences (Spritz and Giebel, 1988). Comparison of these new sequences with orthologous sequences from humans and Old World monkeys confirms the previously reported conversion between the δ- and β-loci in an ancestral anthropoid approximately 40 million years ago (ancestral- or stem-anthropoid refers to the last common ancestor of hominoids, Old World monkeys, and New World monkeys). Our data further indicate that a more recent conversion between the δ- and β-loci occurred within the Tarsius lineage as late as 30 million years ago and that additional conversions occurred in very early primate δ and β evolution, most notably in ancestral haplorhines (anthropoid and tarsier common ancestor) and ancestral primates (haplorhine and lemuroid common ancestor). As in other mammalian lineages, these δ-β conversions are concentrated over the conserved coding and 5'-regulatory regions. Sequence relationships as determined from parsimony analyses of converted and nonconverted sequences clearly support the Haplorhini (anthropoids and Tarsius)/Strepsirhini (lorisoids and lemuroids) taxonomic division within Primates (Hill, 1955).

* This study was supported by National Institutes of Health Grant HL33940 (to M. G. and J. L. S.), National Science Foundation Grants BSR 83-07336 (to M. G.), BNS 81-20529 (to E. L. S. and P. C. W.), and BSR 87-17350 (to E. L. S. and P. C. W.), Alfred P. Sloan Foundation Award (to M. G.), and a grant from The Center for Molecular Biology at Wayne State University. This is Publication Number 436 (Duke University). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J04428 and J04429.


EXPERIMENTAL PROCEDURES

Materials-Restriction endonucleases AccI, AvaII, BamHI, BglII, BstEII, EcoRI, HindIII, MboI, NcoI, StuI, PstI, PvuII, and XbaI were obtained from New England Biolabs. T4 ligase, polynucleotide kinase, and bovine alkaline phosphatase were obtained from New England Biolabs or United States Biochemical Corp. Radioactive nucleotides were obtained from ICN. Chemicals for DNA sequencing were obtained from vendors recommended by Maxam and Gilbert (1980). X-ray film (XAR) was obtained from Kodak. Nylon filters for blots were obtained from Schleicher and Schuell and No. 3MM absorbent paper from Whatman.

DNA Isolation, Cloning, and Sequencing-High molecular weight DNA was prepared from Tarsius syrichta liver tissue (Blin and Stafford, 1976) obtained from the tarsier colony at the Duke University Primate Center. Genomic DNA libraries were constructed in Charon 35 (Loenen and Blattner, 1983) and 40 (Dunn and Blattner, 1987) bacteriophage vectors, propagated in K-12 strain ED8767 recA- host (Murray et al., 1977), and screened for β-globin genes by hybridization with labeled human ε-, γ-, and β-gene probes (Benton and Davis, 1977). Recombinant λ clones containing β-globin gene sequences were isolated and mapped with restriction enzymes BamHI, EcoRI, and HindIII. Genes were identified by comparison of cloned and genomic DNA restriction fragments that hybridized to human gene probes (Southern, 1975). Gene-containing fragments were subcloned into pUC19 plasmid vectors, transformed, and replicated in Escherichia coli K-12 strain JM109 hosts. Plasmid subclones (from Ch3214.2) containing the orangutan δ-gene (Koop et al., 1986b; Slightom et al., 1987) were also examined. Nucleotide sequences were obtained using the method of Maxam and Gilbert (1980) as modified by Slightom et al. (1987, 1988). Over 90% of transcribed sequences were sequenced on both strands, and all of the sequences were sequenced on at least two independently labeled DNA fragments.

Comparative Analysis-DNA sequences were aligned with other δ and β sequences by comparing pairwise alignments and combining the results by hand. The results of several alignment algorithms (using various parameters recommended by the authors) and dot matrix comparisons (13 or more matches of 20 positions) were used to obtain each pairwise alignment (Smith and Waterman, 1981; Wilbur and Lipman, 1983; Lipman and Pearson, 1985; Zweig, 1984). Our alignments were similar to alignments obtained by Spritz and Giebel (1988), Harris et al. (1986), and Savatier et al. (1987a, 1987b). In regions where alignments were not clear, care was taken to prevent biases from affecting sequence relationships. Tarsius and orangutan δ sequences were aligned with human, rhesus, colobus, baboon, spider monkey, lemur, and rabbit δ sequences (Martin et al., 1983; Spritz and Giebel, 1988; Kimura and Takagi, 1983; Spritz et al., 1980; Ponz et al., 1983; Hardison and Margot, 1984). The Tarsius β sequence was aligned with human, chimpanzee, gorilla, rhesus, colobus, lemur, Australian hare, European hare, and goat β sequences (Lacy and Maniatis, 1980; Lawn et al., 1980; Dierks et al., 1983; Schon et al., 1981; Savatier et al., 1985, 1987a, 1987b; Pauplin and Rech, 1987). Neither pairwise alignment comparisons (Smith and Waterman, 1981; Wilbur and Lipman, 1983; Lipman and Pearson, 1985) nor dot matrix comparisons (Zweig, 1984; 13 of 20 matches) permit the complete alignment of δ with β sequences. Only exon, intron 1, and 5'-regulatory sequences of δ and β could be aligned with one another. Intron 2 and flanking sequences were clearly too distantly related to be compared.

For sequence divergence analyses, noncoding sequence divergence was calculated directly, counting all substitutions, insertions, and gaps as single events. To account for the variable effects of selection on sequence divergence, coding sequence divergence was divided into two categories: synonymous divergence (mutations that do not change amino acid sequences) and nonsynonymous divergence (mutations that alter amino acid sequences) (Nei and Gojobori, 1986). One disadvantage of divergence analyses is that information present in each aligned position is averaged into a single value.

In parsimony analyses one examines each aligned nucleotide position and calculates the minimum number of mutations needed to explain a particular branching arrangement. The tree requiring the fewest total number of mutations (the most parsimonious tree) is judged by the principle of minimum evolution to be the best working hypothesis of genealogical relationships. The advantage of the parsimony method is that information at each variable position can be evaluated separately. The additional use of clearly defined outgroups further enables the direction of individual mutations to be estimated for each aligned position within the study group. For example, within the framework established by the parsimony criteria, a base shared between a reference species such as the rabbit or goat and one or more primate species is assumed to reflect a primitive condition. Any change from that primitive base is therefore a derived condition. Genealogical relationships are based only on shared derived features. Another feature of parsimony analyses is that ancestral sequences are estimated as part of the parsimony procedure. Thus if all primate δ-genes shared one base and all nonprimate δs and all mammalian β-genes shared another base, parsimony would place a single mutation in the ancestral primate δ-lineage that could be used to support single origins of all primate δ-genes.

Identification of Converted Regions-One of the most successful ways of identifying gene conversions in mammalian multigene families is by noting where sequence phylogenies for a specific region are inconsistent with phylogenies for adjacent sequences and well established species phylogenies. This requires an understanding of which species are more closely related than others. As there is only one true species phylogeny, the evolution of all orthologous sequences (sequences in different species descended from the same, most recent duplication product; Fitch, 1977) should reflect the same species phylogeny. If a particular sequence phylogeny does not reflect the same species phylogeny determined from other sequences or data sets, then the involvement of paralogous sequences should be considered (paralogous sequences being those sequences descended from different gene duplication products). When comparisons of adjacent sequences yield different genealogical branching patterns, there is reason to suspect the involvement of sequence rearrangements, duplications, or gene conversions. If the consequences of such changes further result in reconciling different sequence phylogenies to a single species phylogeny, then the existence of such events becomes a reasonable hypothesis. For example, previous comparisons of flanking and intron 2 sequences of δ- and β-genes from primates, lagomorphs, and rodents and flanking sequences from artiodactyls indicate that the δ- and β-loci arose from a duplication occurring prior to the separation of these four mammalian orders, but 5'-regulatory, coding, and intron 1 sequences suggest that the δ-locus was the product of recent β duplications occurring within each of the four mammalian orders. In this case the hypothesis of a gene conversion event in each of the four mammalian orders has the effect of reconciling two apparently incompatible sequence phylogenies and providing a valuable insight into the role of gene conversion in the evolution of multigene families (Jeffreys et al., 1982; Martin et al., 1983; Hardison and Margot, 1984; Hardison, 1984; Hill et al., 1984; Hardies et al., 1984). Using this same logic in comparing the two γ-genes of human, chimpanzee, gorilla, orangutan, and rhesus monkey, over 15 different gene conversions have been postulated to have occurred between the duplicated γ-genes of these catarrhine primates (Slightom et al., 1985, 1987, 1988) and in a study of six different human γ-gene alleles as many as 13 conversion events may have occurred just within the human lineage (Powers and Smithies, 1986). A more detailed description of the parsimony procedure and the use of parsimony analyses in locating sequences that have undergone gene conversions is given elsewhere (Goodman et al., 1979; Wiley, 1981; Slightom et al., 1987, 1988).

FIG. 1. Tarsius β-globin cluster organization and structure compared with that of the brown lemur and owl monkey. The dashed lines in the β-gene map of Tarsius indicate regions where linkage is suggested only from the hybridization of specific probes to genomic blots. The square brackets in the lemur β-gene map indicate where a deletion in the lemur lineage occurred. Restriction endonuclease maps of the six recombinant λ clone inserts are positioned with respect to the map of the Tarsius β-globin gene cluster. Restriction maps of the three plasmid clones used to sequence tarsier δ- and β-genes and the sequencing strategy are presented below the λ clones. Restriction endonucleases: B = BamHI; Bg = BglII; Bs = BstEII; E = EcoRI; H = HindIII; N = NcoI; P = PstI; S = StuI; X = XbaI.

RESULTS
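The conversion test described under "Experimental Procedures" compares, region by region, how many mutations each candidate branching arrangement requires. The following is a minimal sketch of that idea, assuming Fitch-style counting on rooted binary trees; it is not the authors' program. The four toy sequences and the names `fitch_site`, `parsimony_score`, `BY_LOCUS`, and `BY_SPECIES` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (rooted, binary).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, column: Dict[str, str]) -> Tuple[set, int]:
    """Return (candidate ancestral bases, minimum mutations) for one aligned position."""
    if isinstance(tree, str):
        return {column[tree]}, 0
    left_states, left_cost = fitch_site(tree[0], column)
    right_states, right_cost = fitch_site(tree[1], column)
    shared = left_states & right_states
    if shared:
        # The two subtrees can share an ancestral base: no extra mutation.
        return shared, left_cost + right_cost
    # No base in common: one mutation is required on this branch pair.
    return left_states | right_states, left_cost + right_cost + 1


def parsimony_score(tree: Tree, alignment: Dict[str, str]) -> int:
    """Sum the per-position minimum mutation counts over the whole alignment."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(length)
    )


# Two candidate arrangements: genes grouped by locus (orthology) or by species.
BY_LOCUS = (("human_delta", "rabbit_delta"), ("human_beta", "rabbit_beta"))
BY_SPECIES = (("human_delta", "human_beta"), ("rabbit_delta", "rabbit_beta"))

# Invented toy data, not real globin sequence.
FLANKING = {"human_delta": "AAGGT", "rabbit_delta": "AAGGC",
            "human_beta": "CCTTT", "rabbit_beta": "CCTTC"}
CODING = {"human_delta": "GGATA", "rabbit_delta": "TTCGA",
          "human_beta": "GGATC", "rabbit_beta": "TTCGC"}

for label, region in (("flanking", FLANKING), ("coding", CODING)):
    print(label,
          "by locus:", parsimony_score(BY_LOCUS, region),
          "by species:", parsimony_score(BY_SPECIES, region))
```

In this toy case the flanking region is explained with fewer mutations when genes are grouped by locus, whereas the coding region favors grouping by species. That disagreement between adjacent regions is the pattern the text treats as a signal of possible gene conversion.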

Tarsier β-Globin Cluster Structure and Organization-Two genomic libraries from T. syrichta liver tissue (Charon 35 vector, 1.5 × 10⁶; and Charon 40 vector, 5 × 10⁵ titers) were screened with human β-, ε-, and γ-gene probes, and six recombinant clones were isolated. These clones were mapped with five various restriction enzymes (Fig. 1) and found to contain separate globin genes. The five β-type globin genes were localized on maps by comparing hybridization of specific gene probes to blotted restriction digests of both cloned and genomic DNA (blots not shown). Gene identities were confirmed by partial sequencing of gene regions. Linkage between the η- and δ-loci, though not confirmed, is suggested by sizes of overlapping HindIII, EcoRI, and BamHI fragments detected in genomic and cloned DNAs. The general organization of the tarsier β-globin cluster closely follows that of owl monkey (Harris et al., 1986) and the proposed primitive eutherian mammal β-cluster (Bunn and Forget, 1986; Goodman et al., 1984, 1987; Koop and Goodman, 1988). It does not include the hybrid ψ-gene found in the brown lemur (a prosimian representative).

Tarsier δ- and β-Gene Sequences-Tarsier δ- and β-genes were completely sequenced along with two previously isolated EcoRI fragments containing the orangutan δ-gene. These sequences are aligned with other primate δ- and β-genes in Figs. 2 and 3. The predicted amino acid sequence of the T. syrichta β-chain agrees very closely with the known sequence of Tarsius bancanus, a closely related congener (Beard et al., 1976; Beard and Goodman, 1976). The β-chain amino acid sequence of the two Tarsius species differs at positions 56, 73, 96, and 141 where T. syrichta has Ser, Asp, Phe, and Phe, and T. bancanus has Gly, Glu, Leu, and Leu, respectively. The minor hemoglobin component of adult T. bancanus, representing 18% of the total (Barnicot and Hewett-Emmett, 1974), was shown to contain an altered β-chain that shared Ala, His, and His at positions 5, 116, and 117 with the major β-chain (Beard and Goodman, 1976). These amino acids also reside at corresponding positions of the deduced T. syrichta δ-chain. In addition, amino acid compositions of selected tryptic peptides from major and minor chains of T. bancanus (Beard and Goodman, 1976) indicated identity between these chains at positions 1-8 and 113-120. The amino acid sequences of T. syrichta δ- and β-chains as deduced from nucleotide sequences (Figs. 2 and 3) indicate only a single difference (position 6) over these same positions. Electrophoresis of T. syrichta hemoglobin proteins confirms the presence of a slower migrating (at pH = 8.6) minor hemoglobin component like that of T. bancanus (data not shown). This slower hemoglobin is consistent with the reduced number of negatively charged amino acids found in the δ-chain (see below). It is therefore very likely that the δ-locus of T. syrichta is expressed as in T. bancanus and that the δ-locus codes for the altered β-chain present in the minor hemoglobin component. T. syrichta δ- and β-chains differ by only six amino acids (at amino acid positions 6, 16, 19, 121, 126, and 139, δ has Asp, Ser, Asn, Gln, Leu, and Ala and β has Glu, Gly, Asp, Glu, Val, and Thr; Figs. 2 and 3).

FIG. 2. Aligned δ-nucleotide sequences. Human (Hsa, Homo sapiens), orangutan (Ppy, Pongo pygmaeus), rhesus macaque (Mmu, Macaca mulatta), colobus (Cpo, Colobus polykomos), baboon (Pan, Papio anubis), spider monkey (Age, Ateles geoffroyi), tarsier (Tsy, T. syrichta), lemur (Lfu, Lemur fulvus), and Australian rabbit (Ocu, Oryctolagus cuniculus). The three-letter designation is an abbreviated form of the binomial species name (e.g. Hsa, Homo sapiens). References for each of these sequences are given in the text. Gaps (**) were introduced to improve alignments. Only variable positions are given below the complete human δ sequence. The sequence of a tarsier Alu sequence is given below the aligned sequences and its position denoted in the alignment at position 1612 by an "alu." Exons and sequence patterns typical of promoter sequences (Myers et al., 1986; Maniatis et al., 1987) are noted above the human δ sequence. The nucleotide sequence of the orangutan δ-gene indicates that the orangutan δ-chain differs from the human δ-chain (the β-like chain of the minor (A2) hemoglobin component) by one amino acid at position 126 of the amino acid chain (Val in orangutan and Met in humans). The coding regions of human and orangutan δ-genes differ by only four synonymous changes and over noncoding regions human and orangutan differ by 3.3%. These values agree very closely with values obtained from ε-, γ-, and η-gene regions (Koop et al., 1986a, 1986b; Miyamoto et al., 1987; Slightom et al., 1987, 1988). [Sequence alignment not reproduced.]

FIG. 3. Aligned β-nucleotide sequences. Human (Hsa), chimpanzee (Ptr, Pan troglodytes), gorilla (Ggo, Gorilla gorilla), macaque (Mcy, Macaca cynomolgus), colobus (Cpo), tarsier (Tsy), lemur (Lfu), Australian rabbit (Ocu), European hare (Leu, Lepus europaeus), and goat (Chi, Capra hircus). Gaps were introduced to improve alignments and only variable positions are given below the complete human β sequence. An Alu sequence located at position 745 is given below the complete sequence. Exons and promoter sequences are noted above the human β sequence. Sequences of tarsier δ- and β-genes contain promoter sequences of the type and location found in other δ- or β-genes (Myers et al., 1986; Maniatis et al., 1987; Martin et al., 1983), as well as functional coding and splicing sequences (Breathnach et al., 1978). The δ- and β-globin genes differ with respect to 5'-promoter sequences. Specifically, δ-genes lack the CACCC sequences found in β-genes and have three possible CCAAT sequences located 120, 140, and 205 base pairs upstream of the initiation site (Fig. 2). Only the most 3' CCAAT variant appears to be homologously related to the CCAAT element of the β-gene. Other than the TATA sequence, conserved 5' sequences revealing positions of promoter sequences in β-genes are conspicuously absent in δ-genes. Differences in 5'-promoter regions do not appear to be the only factor in explaining reduced expression levels of the δ-gene. Undefined sequences within intron 2 of β also appear to increase transcription rates (Kosche et al., 1985). The longest stretch of conserved sequence in intron 2 lies 95 base pairs 3' of the end of exon 2. Although this sequence (GTTTAGAAT) does not resemble any known enhancer sequence and is not found in other globin genes, its conservation among the β-globin genes of different eutherian orders suggests an important role, perhaps in regulation. [Sequence alignment not reproduced.]

Divergence Analysis-Neither pairwise alignment comparisons (Smith and Waterman, 1981; Wilbur and Lipman, 1983; Lipman and Pearson, 1985) nor dot matrix comparisons (13 matches out of 20 in a 4-fold compression, Zweig, 1984) revealed any evidence of sequence homology between the flanking and intron 2 sequences of δ- and β-loci. This together with the fact that, for these regions, alignments between primate and lagomorph δ and primate and lagomorph β sequences are easily obtained (Figs. 2 and 3) clearly demonstrates that the δ- and β-loci originated prior to the separation of primate and rabbit lineages. Only in the region between the CCAAT promoter and the end of exon 2 (positions 291-885 in Fig. 2 and positions 1208-1800 in Fig. 3) and in exon 3 were alignments between the two loci possible. As the ability to detect conversions occurs only where common alignments between δ and β sequences can be obtained, these regions were analyzed separately from those regions in which δ and β sequences could not be aligned and were thus clearly part of two anciently separated paralogous lineages.

Divergence among primate flanking and intron 2 sequences of δ- and β-genes (Table I) indicates that cercopithecoid (Old World monkeys) and hominoid sequences are most closely related (divergence is about 5-7%) and ceboid (New World monkeys) sequences diverge about 10-11% from catarrhine (hominoid and cercopithecoid) sequences. Lemur, tarsier, and anthropoid (catarrhine and platyrrhine) sequences are all about equally divergent from each other (divergence is about 24-28%). As inferred from sequence divergence, hominoids are most closely related to cercopithecoids, followed by ceboids and then tarsioids and lemuroids, with anthropoid, tarsioid, and lemuroid relationships being unresolved. These same relationships are evident in both δ- and β-gene lineages and are compatible with conclusions based on a sizable body of morphological, amino acid, and nucleotide sequence, and DNA hybridization data (Koop et al., 1986a; Bonner et al., 1980; Schwartz, 1986; Miyamoto and Goodman, 1986).

TABLE I
Noncoding divergence values (% uncorrected) estimated from η-, δ-, and β-globin gene sequences and from DNA hybridization ΔT50H values

Divergence: β (%), δ (%), η (%), Hybridizationᵃ (%)
Human - cercoᵇ 6.5 5.5 8.0 6.9
Human - spider 9.6 11.6 11.2
Human - tarsier 25.0 25.0 24.9
Human - lemur 24.5 25.0 23.9 21.8
Human - rabbitᶜ 33.8 31.0 30.9 31.6
Cerco - spider 10.5 12.9 13.5ᵈ
Cerco - tarsier 26.8 26.9
Cerco - lemur 26.8 26.9 24.3 25.1
Cerco - rabbit 33.7 33.0 32.6
Spider - tarsier 27.6 26.4
Spider - lemur 26.6 25.5 24.5
Spider - rabbit 34.1 32.8
Tarsier - lemur 26.7 27.9 25.7
Tarsier - rabbit 35.9 38.2
Lemur - rabbit 35.5 34.8 27.4 29.3

ᵃ Bonner et al. (1980) except where noted. Reciprocal comparisons were averaged.
ᵇ Cerco refers to the cercopithecoid macaque monkeys except in the case of δ and DNA hybridization comparisons where it refers to baboons.
ᶜ In η sequence comparisons the rabbit is replaced by the goat, and in DNA hybridization comparisons the rabbit is replaced by tree shrew.
ᵈ From ΔT50R values obtained by Benveniste (1985).
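The sequence-based values in Table I are uncorrected percentages, and the "Experimental Procedures" state that substitutions, insertions, and gaps were each counted as single events. A minimal sketch of one way to compute such a value for two aligned sequences follows. Several details are assumptions rather than the authors' procedure: the function name `uncorrected_divergence`, the use of `*` as the gap symbol, and the choice of denominator (compared sites plus gap events), which the text does not specify.

```python
def uncorrected_divergence(seq_a: str, seq_b: str, gap: str = "*") -> float:
    """Percent divergence, counting each substitution and each gap run as one event."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    substitutions = 0
    gap_events = 0
    compared_sites = 0
    open_gap = None  # which sequence ("a" or "b") holds the currently open gap run

    for base_a, base_b in zip(seq_a, seq_b):
        if base_a == gap and base_b == gap:
            continue  # column gapped in both sequences carries no information
        if base_a == gap or base_b == gap:
            holder = "a" if base_a == gap else "b"
            if holder != open_gap:
                gap_events += 1  # a new insertion/deletion event starts here
                open_gap = holder
            continue
        open_gap = None
        compared_sites += 1
        if base_a != base_b:
            substitutions += 1

    # Assumed denominator: every compared site plus one unit per gap event.
    total_sites = compared_sites + gap_events
    if total_sites == 0:
        raise ValueError("no comparable positions")
    return 100.0 * (substitutions + gap_events) / total_sites


# Invented toy alignment: one substitution and one two-base gap over 8 shared sites.
print(round(uncorrected_divergence("ACGTACGTAC", "ACGA**GTAC"), 1))
```

Counting a multi-base gap as a single event keeps one long insertion or deletion from dominating the estimate, which appears to be the intent of the "single events" wording.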


The sequence divergences among δ and β coding, 5'-regulatory, and intron 1 regions from different groups of primates, however, are not indicative of two anciently separated gene lineages (Table II). This is particularly true in the noncoding and synonymous coding sequences. Because δ- and β-genes are expressed differently in separate primate lineages, selective pressures affecting the number of amino acid-changing substitutions may also be different. The δ-gene of catarrhines, for example, is completely silent whereas the δ-gene of tarsiers makes up about 18% of the total adult β-hemoglobin. The β-locus, however, is always the major adult hemoglobin. In noncoding and synonymous sequence divergence, anthropoid (hominoid, cercopithecoid, and ceboid) δ and β sequences are generally more similar to each other than anthropoid δ sequences are to tarsier or lemur δ sequences, or anthropoid β sequences are to tarsier or lemur β sequences. In fact anthropoid δ and β sequences are only slightly more divergent from each other than catarrhine (cercopithecoid and hominoid) δ sequences are from platyrrhine (New World monkey) δ sequences. These patterns of divergence support the conclusions of Hardison (1984), Hardison and Margot (1984), and Martin et al. (1983), who contend that δ was converted by β about 40 million years ago, and further suggest that the conversion occurred prior to the separation of New World and Old World monkeys. Furthermore, over the coding, intron 1, and 5'-regulatory regions, tarsier δ and β sequences are most similar to each other, and lemur β sequences are most similar to lemur δ sequences (Table II). These similarities are greater than would be expected from two anciently separated δ- and β-lineages and suggest that conversions between δ- and β-loci occurred independently in the lemur and tarsier lineages.

To estimate when the conversions in the tarsier and lemur lineages occurred, we compared synonymous and noncoding divergence estimates with divergence values corresponding to known divergence times (hominoids and cercopithecoids diverged 20-30 million years ago; catarrhines and platyrrhines diverged 35-45 million years ago; see references in Koop et al., 1986a). Because nonsynonymous divergence is affected by unequal selection pressures, it was not used to estimate divergence times. Synonymous and noncoding divergence between tarsier δ- and β-genes (18.3 and 7.8%, respectively) was less than that between catarrhines and platyrrhines (22.5 and 13%) but greater than that between human and baboon δ-genes (10.5 and 7.1%). Therefore we estimate that a conversion of δ by β occurred about 30 million years ago in the tarsier lineage. Synonymous divergence between lemur δ- and β-genes (16.1%) was also less than that of catarrhines and platyrrhines (24.8%) and more than that between humans and baboons (10.5%); therefore a conversion in lemurs may also have occurred about 30 million years ago.

In 5'-regulatory and intron 1 sequences an additional pattern of δ and β sequence divergence appears. In these noncoding regions, all of the primate δ and β sequences are more similar to each other than primate δ sequences are to rabbit δ sequences or primate β sequences are to rabbit β sequences. This suggests that a conversion may have also occurred between ancient primate δ and β sequences, prior to the separation of anthropoid, tarsioid, and lemuroid lineages (50-60 million years ago).
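The dating argument above places the tarsier δ/β divergence between two calibration points. A minimal sketch of that reasoning follows, assuming simple linear interpolation and the midpoints of the quoted time ranges (25 and 40 million years); the text does not state the exact procedure used, and the function names are hypothetical.

```python
from statistics import mean

# Calibration points quoted in the text: % divergence paired with the
# ASSUMED midpoint of each quoted divergence-time range (million years ago).
CALIBRATIONS = [
    {"label": "human vs baboon delta", "syn": 10.5, "noncoding": 7.1, "mya": 25.0},       # 20-30
    {"label": "catarrhine vs platyrrhine", "syn": 22.5, "noncoding": 13.0, "mya": 40.0},  # 35-45
]


def interpolate_age(divergence, kind):
    """Linearly interpolate an age (Mya) from one class of divergence (hypothetical helper)."""
    lo, hi = CALIBRATIONS
    fraction = (divergence - lo[kind]) / (hi[kind] - lo[kind])
    return lo["mya"] + fraction * (hi["mya"] - lo["mya"])


def conversion_age(syn, noncoding):
    """Average the ages implied by synonymous and noncoding divergence."""
    return mean([interpolate_age(syn, "syn"), interpolate_age(noncoding, "noncoding")])


# Tarsier delta vs beta: 18.3% synonymous, 7.8% noncoding (values from the text).
tarsier_age = conversion_age(syn=18.3, noncoding=7.8)
print(f"estimated tarsier conversion: about {tarsier_age:.0f} million years ago")
```

Under these assumptions the two divergence classes bracket the estimate from either side, and their average falls close to the figure of about 30 million years given in the text.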

TABLE II
δ and β sequence divergence matrix

Nonsynonymous and synonymous δ and β coding sequence divergences (% uncorrected; exons 1, 2, and 3) are presented above the null diagonal (each cell gives nonsynonymous/synonymous divergence), and sequence divergences (% uncorrected) of 5'-flanking and intron 1 regions are presented below the null diagonal. Pairwise sequence divergence values are obtained where the row number and the column number intersect.

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. Human β | - | 3.2/10.2 | 4.5/32.2 | 10.6/28.1 | 5.1/27.0 | 3.3/19.7 | 5.1/24.5 | 3.2/21.7 | 5.7/26.4 | 7.3/23.6 | 18.3/35.7 |
| 2. Macaque β | 3.5 | - | 6.8/31.3 | 10.4/27.4 | 7.2/25.0 | 3.6/19.4 | 6.1/22.9 | 4.1/17.5 | 7.5/23.2 | 10.2/25.0 | 16.8/33.2 |
| 3. Tarsier β | 17.7 | 17.7 | - | 11.0/30.1 | 6.3/33.3 | 6.3/28.8 | 7.3/31.3 | 4.7/30.6 | 1.8/18.3 | 6.6/26.5 | 17.0/32.3 |
| 4. Lemur β | 15.1 | 16.3 | 17.1 | - | 10.8/23.8 | 11.6/29.8 | 12.3/29.7 | 10.4/25.8 | 12.8/21.2 | 8.9/16.1 | 20.4/33.1 |
| 5. Rabbit β | 27.6 | 27.6 | 30.0 | 25.3 | - | 7.2/29.2 | 8.4/29.3 | 6.5/29.2 | 7.5/30.3 | 5.9/26.0 | 14.9/33.8 |
| 6. Human δ | 12.6 | 13.0 | 20.2 | 17.9 | 26.9 | - | 3.3/10.5 | 3.2/20.1 | 7.0/23.4 | 9.8/22.9 | 18.1/31.4 |
| 7. Baboon δ | 13.3 | 14.2 | 21.3 | 19.0 | 28.0 | 7.1 | - | 5.0/24.8 | 8.5/25.4 | 9.8/17.6 | 19.7/32.5 |
| 8. Spider monkey δ | 16.5 | 16.9 | 22.1 | 21.5 | 30.4 | 12.8 | 13.1 | - | 5.9/23.9 | 7.9/25.6 | 17.9/36.0 |
| 9. Tarsier δ | 20.5 | 20.5 | 7.8 | 18.0 | 28.2 | 20.2 | 21.8 | 23.2 | - | 8.6/15.1 | 17.9/32.0 |
| 10. Lemur δ (a) | | | | | | | | | | - | 17.7/32.3 |
| 11. Rabbit δ | 30.6 | 30.2 | 29.9 | 26.8 | 27.0 | 27.5 | 29.0 | 32.0 | 30.3 | | - |

(a) Divergence values based on partial sequence.
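The values in Table II are described as uncorrected percent divergences. As a rough illustration of what "uncorrected" means for the noncoding comparisons, the sketch below counts mismatches over aligned positions with no correction for multiple substitutions. The toy sequences are invented for the example, skipping gapped positions is an assumption, and the split into nonsynonymous and synonymous classes used for the coding regions is not covered here.

```python
def percent_divergence(seq_a, seq_b, gap="-"):
    """Uncorrected % divergence between two aligned sequences.

    Positions where either sequence has a gap are skipped (an assumption;
    the source does not say how gaps were scored).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Invented toy alignment (not globin data): 20 comparable sites, 2 mismatches.
toy_a = "ATGGTGCACC-GACTCCTGAG"
toy_b = "ATGGTGCATCTGACTCCAGAG"
print(f"{percent_divergence(toy_a, toy_b):.1f}% uncorrected divergence")
```

Because no correction is applied, values computed this way understate the true number of substitutions as sequences become more divergent, which is worth keeping in mind for the larger entries in the matrix.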


Parsimony Analysis-To further investigate conversions indicated by sequence divergences, we have examined each variable position of aligned δ and β sequences and determined, on the basis of parsimony, which positions supported conversions and which positions did not. At each variable position where positional paralogues shared a sequence identity that parsimony depicted as derived rather than primitive, the result was taken as evidence for a conversion (for definitions and a description of the methods see "Experimental Procedures" and Slightom et al., 1987, 1988). Where positional orthologues shared a derived nucleotide sequence, the result was taken as evidence against conversion in those taxa sharing the derived change. For example, when primate δ and β sequences are closer to each other than primate δ sequences are to rabbit δ sequences and primate β sequences are to rabbit and goat β sequences, a conversion event is possible. The results of all of the separate parsimonious solutions are presented in Fig. 4. A square box at a variable position indicates that positional paralogues are closer genealogically to each other than to positional orthologues, therefore supporting gene conversion. A triangle, on the other hand, indicates that positional orthologues are more closely related than paralogous sequences and hence does not support conversion. Stated in a more general way, triangles represent primate β sequences that are more similar to other β sequences than to δ sequences, and squares represent β sequences that are more similar to δ sequences than to other β sequences. The same can be stated for δ sequences. On the basis of this analysis, conversions, hypothesized only where two or more boxes are clustered, have affected the region between the CCAAT element and the end of exon 2 (region 2) in the stem primates, stem haplorhines, stem anthropoids, lemur, and tarsier lineages. Part of exon 3 (region 4) also appears to be converted in stem haplorhine and in the lemur lineage, although evidence in this small region is not strong. That sequences 5' of the CCAAT element and 3' of exon 3 and intron 2 (regions 1, 3, and 5) were not affected by these conversions is evidenced by our inability to find an alignment between δ- and β-genes in these regions.

Parsimony analysis of each of the five regions is presented in Fig. 4 below the gene map (phylogenetic patterns of regions 1, 3, and 5 were totally compatible and were therefore combined). Trees from regions 2 and 4 contrast with trees from regions 1, 3, and 5 in that they do not show δ and β sequences evolving as two separate lineages originating prior to the separation of primates and lagomorphs. In regions 2 and/or 4, anthropoid δ- and β-genes group together as do haplorhine, primate, tarsier, and lemur gene lineages. Phylogenetic trees for regions 2 and 4 are therefore compatible with conversions hypothesized from position analysis as well as those suggested by divergence analyses.

The branching arrangements indicated in Fig. 4 join hominoids to cercopithecoids followed by ceboids, tarsier, lemur, rabbit, and goat. As the phylogenetic position of tarsier has been the subject of considerable controversy (Pocock, 1918; Hill, 1955; Dene et al., 1976; Schwartz, 1986; Sarich and Cronin, 1976; Baba et al., 1982), we examined alternate placements of Tarsius. To place tarsiers with lemurs would require hypothesizing 12 additional events in regions 1, 3, and 5 and 2 additional events in region 4. Alternatively, to place Tarsius at a position ancestral to anthropoids and lemurs would require 21 additional events in regions 1, 3, and 5 and 5 additional events in region 4. In region 2 all alternatives are equally possible. Clearly these data support placing Tarsius δ and β sequences closer to orthologous anthropoid sequences than to prosimian sequences.

FIG. 4. Parsimony analysis of aligned δ- and β-globin genes and the identification of gene conversion regions. [Figure not reproduced; panels show the primate δ and β gene map divided into regions 1-5, and trees for region 2, for regions 1, 3, and 5 combined, and for region 4.] The aligned δ- and β-genes of human, Old World monkey (Cerco, includes colobus, macaques, and baboons), New World monkey (Ceboid, includes the spider monkey), tarsier, and lemur plus catarrhine (Catar), anthropoid (Anthr), haplorhine (Haplo), and primate ancestors were used to locate gene regions that may have been involved in conversion events. Above the organizational map of δ- and β-globin genes (exons 1, 2, and 3 are indicated by raised bars) are the results of a position by position parsimony analysis of regions where δ and β sequences could be aligned. Ancestral genes reconstructed by the parsimony procedure were also included in the analysis to determine when conversions occurred. Triangles indicate variable nucleotide sequence positions (positioned with respect to the gene map of δ and β) in which paralogous (separate δ- and β-gene) lineages do not support a conversion hypothesis. Boxes indicate variable positions in which β-genes show a closer affinity to δ-genes than to other β-genes (and vice versa) and therefore suggest a possible conversion event. Gene conversions are hypothesized where two or more adjacent squares are found. The lemur δ, because it has been fused with the η-gene (indicated by the dashed arrow), retains only a small part of δ exon 2. A single position in exon 2 plus two positions in exon 3 support a conversion event in lemur; therefore, a conversion event between lemur δ and β sequences seems likely. Below the δ- and β-gene maps is a regional analysis of δ and β sequences (H = human; C = chimp; G = gorilla; M = macaque; Co = colobus; S = spider monkey; O = orangutan; P = baboon; T = tarsier; L = lemur; R1 = European hare; R2 = Australian rabbit; Gt = goat). Only in regions 2 and 4 could δ and β sequences be aligned with one another. Regions 1, 3, and 5 of δ- and β-genes could not be aligned and therefore cannot participate in possible conversions occurring within the primates (that primate and rabbit orthologous sequences could be aligned over regions 1, 3, and 5 supports this conclusion). The lack of two distinct δ- and β-lineages dating prior to the separation of primates and rabbits in regions 2 and 4 contrasts with the branching pattern shown in regions 1, 3, and 5. Gene conversion offers a reasonable explanation that could resolve the conflicting gene phylogenies and still be compatible with a single species phylogeny. Although a ceboid β-gene sequence is not known, the absence of nucleotide positions grouping spider monkey δ sequences with hominoid and catarrhine β sequences indicates the absence of conversions between δ and β in the ceboid lineage. However, the spider monkey δ sequence in region 2 does not clearly group with higher primate δ or β sequences; therefore, either conversions may have occurred within the ceboid lineage or platyrrhine and catarrhine lineages separated soon after a conversion in their common anthropoid ancestor. Ceboid β sequences are needed to ultimately resolve this question.

DISCUSSION

Until recently there has been no direct evidence for the expression of the δ-locus in lower primates. Protein studies have shown only that two very similar β-like chains are expressed in tarsiers (βmaj = 82%, βmin = 18%; Beard et al., 1976) as well as galagos (β1 = 60%, β2 = 40%; Watanabe et al., 1985; Tagle et al., 1988). In this study we have demonstrated in Tarsius that the two β-chains are coded by a β-locus and a δ-locus that has undergone gene conversion by the β-locus. There is no evidence for a duplicated β-gene. Southern blots of genomic galago DNA confirm the presence of five loci (Koop and Goodman, 1988; Tagle et al., 1988), of which three have been identified as embryonic type ε-, γ-, and η-genes (Tagle et al., 1988); therefore the remaining two loci must code for the two adult β-hemoglobin chains. The very high amino acid sequence similarity between the two galago adult β-chains indicates either a recent duplication of the β-locus in addition to a deletion of the δ-locus or a single conversion of the δ-locus by β. Given the history of the δ-locus, we advocate the latter more parsimonious hypothesis.

We have also uncovered evidence for the involvement of conversions between the δ- and β-loci prior to the formation of a hybrid gene in lemurs. Both coding sequence divergence analysis (Table II) and parsimony analysis (Fig. 4, regions 2 and 4) suggest that a conversion affected at least part of exon 2 and exon 3 of β and δ. Harris et al. (1986) suggested that conversions between δ and β occurring prior to the fusion of η- and δ-genes might help explain why the lemur β-chain appears to have accumulated such an inordinate number of amino acid substitutions. This hypothesis gains considerable support from the present data, which indicate that such a conversion probably did occur. Additional conversions of δ by β in stem primates, stem haplorhines, and stem anthropoids also are supported by both sequence divergence (Table II) and parsimony analyses (Fig. 4). These events are incorporated into a general scheme depicting some of the major events occurring in the evolution of the primate β-globin cluster (Fig. 5).

FIG. 5. Evolution of the primate β-globin cluster showing up to six possible conversions of δ by β. [Figure not reproduced; legible labels include embryonic and adult gene groups, conversion of δ by β on several branches, recruitment of γ as a fetal gene, insertion of a Kpn element between ε and γ, γ duplication, and adult chain percentages: galago 60% β1, 40% β2; spider monkey 6% δ, 94% β; tarsier 18% δ, 82% β.] Additional features such as those conversions found in the evolution of primate γ-genes have recently been studied by Slightom et al. (1988) and Tagle et al. (1988). Other events are summarized from Collins and Weissman (1984) and Bunn and Forget (1986). Possible γ-gene inactivation in the tarsier lineage is indicated from preliminary sequences of the γ-gene (Koop, B. F., unpublished data). Above each β cluster, the approximate percentage of δ- and β-chains in adult blood is given; the platyrrhine (spider and owl monkey) value is from Boyer et al. (1971).

Within primates as many as five gene conversions involving δ- and β-genes have occurred. As the β-locus codes for the primary adult β-type hemoglobin chain in all mammals, we and others have assumed that conversions apparent in the evolutionary history of δ- and β-genes have involved δ being converted by β. We cannot, however, dismiss the possibility that some changes originally introduced into the δ-locus may be transferred to the β-gene as a result of these conversions. As suggested by Harris et al. (1986), sequence introgression from a less constrained δ-locus into a β-gene may result in introducing increased β variability. This type of mechanism might also explain a tendency noted by Spritz and Giebel (1988) for some synonymous divergence comparisons (Table II) to be more divergent than noncoding and flanking sequences (Table I). In general, however, it appears that the tighter selectional constraints acting on the β-locus determine, if not the direction of conversions themselves, at least which conversions survive and become fixed in populations.

Comparing the conversions between δ- and β-genes with those between catarrhine γ-genes reveals several differences that may simply reflect the much older origins of the δ- and β-genes (approximately 85-100 million years ago versus 25-35 million years ago; Slightom et al., 1980, 1985, 1987, 1988; Shen et al., 1981). Conversions between γ-genes are much more frequent (as many as 13 conversions have occurred in the human lineage alone, Powers and Smithies (1986), and over 12 within other catarrhine species, Slightom et al. (1988)) and are much more extensive (spanning extensive noncoding regions as well as coding sequences; Slightom et al., 1985, 1987, 1988; Scott et al., 1984). Conversions between primate δ- and β-genes, as evident in this paper, are relatively infrequent and are localized to regions maintaining, through selection, high levels of similarity (between the CCAAT promoter sequence and the end of exon 2 and exon 3). Conversions themselves contribute to higher levels of sequence matches but are not enough to maintain such similarity. That intron 2 lies between two areas of conversion but is not converted (Fig. 4) indicates an important role of selection in determining where conversions occur. It would appear then that soon after a gene duplication occurs, conversions can be frequent and extend over the entire duplicated region. Over time, differences accumulate, particularly in noncoding regions, and conversions become less frequent and limited to regions conserved by selection. These empirical observations are compatible with models constructed on a more theoretical basis (Walsh, 1987).

That conversions play a major role in the evolution of multigene families, such as the β-globin family, has major repercussions when inferring species phylogeny from sequence phylogeny. Inconsistent phylogenetic patterns between different regions within a set of homologous sequences permit the detection of converted regions. Once these conversions are incorporated into the phylogenetic history of a sequence, there should be consistency among the different regions with respect to species relationships. This is in fact what we have found. All of the δ and β regions examined were compatible with hominoids being more closely related to cercopithecoids, followed by ceboids, tarsiers, lemurs, rabbits, and goats, respectively. This branching arrangement is quite strongly supported by both δ and β sequences and leads us to strongly favor the placement of Tarsius with anthropoids in Haplorhini rather than with lemurs and lorises in Prosimii.

Acknowledgments-We would like to thank Danilo Tagle and David Fitch for their help in isolating clones and for their comments on this manuscript.

REFERENCES

Baba, M., Weiss, M. L., Goodman, M., and Czelusniak, J. (1982) Syst. Zool. 31, 89-102
Barnicot, N. A., and Hewett-Emmett, D. (1974) in Prosimian Biology (Martin, R. D., Doyle, G. A., and Walker, A. C., eds) pp. 891-902, Duckworth, London

Beard, J. M., and Goodman, M. (1976) in Molecular Anthropology (Goodman, M., and Tashian, R. E., eds) Plenum Publishing Co., New York
Beard, J. M., Barnicot, N. A., and Hewett-Emmett, D. (1976) Nature 259, 338-341
Benton, W. D., and Davis, R. W. (1977) Science 196, 180-182
Benveniste, R. E. (1985) in Molecular Evolutionary Genetics (MacIntyre, R., ed) Plenum Publishing Co., New York
Blin, N., and Stafford, D. W. (1976) Nucleic Acids Res. 3, 2303-2308
Bonner, T. I., Heinemann, R., and Todaro, G. J. (1980) Nature 286, 420-423
Boyer, S. H., Crosby, E. F., Noyes, A. N., Fuller, G. F., Leslie, S. E., Donaldson, L. J., Vrablik, G. R., Schaefer, E. W., and Thurmon, T. F. (1971) Biochem. Genet. 5, 405-448
Breathnach, R., Benoist, C., O'Hare, K., Gannon, F., and Chambon, P. (1978) Proc. Natl. Acad. Sci. U. S. A. 75, 4853-4857
Bunn, H. F., and Forget, B. G. (1986) Hemoglobin: Molecular, Genetic and Clinical Aspects, W. B. Saunders Co., Philadelphia
Collins, F., and Weissman, S. (1984) Prog. Nucleic Acid Res. Mol. Biol. 31, 315-462
Dene, H. T., Goodman, M., and Prychodko, W. (1976) in Molecular Anthropology: Genes and Proteins in the Evolutionary Ascent of the Primates (Goodman, M., Tashian, R. E., and Tashian, J. H., eds) Plenum Publishing Co., New York
Dierks, P., van Ooyen, A., Cohran, M. D., Dobkin, C., Reiser, J., and Weissman, C. (1983) Cell 32, 695-706
Dunn, I. S., and Blattner, F. R. (1987) Nucleic Acids Res. 15, 2677-2698
Fitch, W. M. (1977) in Major Patterns in Vertebrate Evolution (Hecht, M. K., Goody, P. C., and Hecht, B. M., eds) pp. 169-210, Plenum Publishing Co., New York
Goodman, M., Czelusniak, J., Moore, G. W., Romero-Herrara, and Matsuda, G. (1979) Syst. Zool. 28, 132-163
Goodman, M., Koop, B. F., Czelusniak, J., Weiss, M. L., and Slightom, J. L. (1984) J. Mol. Biol. 180, 803-823
Goodman, M., Czelusniak, J., Koop, B. F., Tagle, D. A., and Slightom, J. L. (1987) Cold Spring Harbor Symp. Quant. Biol. 52, 875-890
Hardies, S. C., Edgell, M. H., and Hutchison, C. A., III (1984) J. Biol. Chem. 259, 3748-3756
Hardison, R. C. (1984) Mol. Biol. Evol. 1, 390-410
Hardison, R. C., and Margot, J. B. (1984) Mol. Biol. Evol. 1, 302-316
Harris, S., Barrie, P. A., Weiss, M. L., and Jeffreys, A. J. (1984) J. Mol. Biol. 180, 785-801
Harris, S., Thackeray, J. R., Jeffreys, A. J., and Weiss, M. L. (1986) Mol. Biol. Evol. 3, 465-484
Hill, A., Hardies, S. C., Phillips, S. J., Davies, M. G., Hutchison, C. A., III, and Edgell, M. H. (1984) J. Biol. Chem. 259, 3739-3747
Hill, W. C. O. (1955) Primates, Comparative Anatomy and Taxonomy, Vol. II, The Edinburgh University Press, Edinburgh
Jeffreys, A. J., Barrie, P. A., Harris, S., Fawcett, D. H., Nugent, Z. J., and Boyd, A. C. (1982) J. Mol. Biol. 156, 487-503
Kimura, A., and Takagi, Y. (1983) Nucleic Acids Res. 11, 2541-2550
Koop, B. F., and Goodman, M. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 3893-3897
Koop, B. F., Goodman, M., Xu, P., Chan, K., and Slightom, J. L. (1986a) Nature 319, 234-238
Koop, B. F., Miyamoto, M. M., Embury, J. E., Goodman, M., Czelusniak, J., and Slightom, J. L. (1986b) J. Mol. Evol. 24, 94-102
Kosche, K. A., Dobkin, C., and Bank, A. (1985) Nucleic Acids Res. 13, 7781-7793
Lacy, E., and Maniatis, T. (1980) Cell 21, 545-553
Lawn, R. M., Efstratiadis, A., O'Connell, C., and Maniatis, T. (1980) Cell 21, 647-651
Lipman, D. J., and Pearson, D. T. (1985) Science 227, 435-438
Loenen, W. A. M., and Blattner, F. R. (1983) Gene (Amst.) 26, 171-179
Maniatis, T., Goodbourn, S., and Fischer, J. A. (1987) Science 236, 1237-1245
Martin, S. L., Vincent, K. A., and Wilson, A. C. (1983) J. Mol. Biol. 164, 513-528
Maxam, A., and Gilbert, W. (1980) Methods Enzymol. 65, 499-560
Miyamoto, M. M., and Goodman, M. (1986) Syst. Zool. 35, 230-240
Miyamoto, M. M., Slightom, J. L., and Goodman, M. (1987) Science 238, 369-373
Murray, N. E., Brammer, M. J., and Murray, K. (1977) Mol. Gen. Genet. 150, 53-61
Myers, R. M., Tilly, K., and Maniatis, T. (1986) Science 232, 613-618
Nei, M., and Gojobori, T. (1986) Mol. Biol. Evol. 3, 418-426
Pauplin, Y., and Rech, J. (1987) Nucleic Acids Res. 15, 5899
Pocock, R. I. (1918) Proc. Zool. Soc. Lond. 19-53
Poncz, M., Schwartz, E., Ballantine, M., and Surrey, S. (1983) J. Biol. Chem. 258, 11599-11609
Powers, P. A., and Smithies, O. (1986) Genetics 112, 343-358
Sarich, V. M., and Cronin, J. E. (1976) in Molecular Anthropology (Goodman, M., and Tashian, R. E., eds) Plenum Publishing Co., New York
Savatier, P., Trabuchet, G., Faure, C., Chebloune, Y., Gouy, M., Verdier, G., and Nigon, V. M. (1985) J. Mol. Biol. 182, 21-29
Savatier, P., Trabuchet, G., Chebloune, Y., Faure, C., Verdier, G., and Nigon, V. M. (1987a) J. Mol. Evol. 24, 297-308
Savatier, P., Trabuchet, G., Chebloune, Y., Faure, C., Verdier, G., and Nigon, V. M. (1987b) J. Mol. Evol. 24, 309-318
Schon, E. A., Cleary, M. L., Haynes, J. R., and Lingrel, J. B. (1981) Cell 27, 359-369
Schwartz, J. H. (1986) in Comparative Primate Biology (Swindler, D. R., and Erwin, J., eds) Vol. I, Alan R. Liss, Inc., New York
Scott, A. F., Heath, P., Trusko, S., Boyer, S. H., Prass, W., Goodman, M., Czelusniak, J., Chang, L.-Y. E., and Slightom, J. L. (1984) Mol. Biol. Evol. 1, 371-389
Shen, S., Slightom, J. L., and Smithies, O. (1981) Cell 26, 191-203
Slightom, J. L., Blechl, A. E., and Smithies, O. (1980) Cell 21, 627-638
Slightom, J. L., Chang, L.-Y. E., Koop, B. F., and Goodman, M. (1985) Mol. Biol. Evol. 2, 370-389
Slightom, J. L., Theisen, T. W., Koop, B. F., and Goodman, M. (1987) J. Biol. Chem. 262, 7472-7483
Slightom, J. L., Koop, B. F., Xu, P., and Goodman, M. (1988) J. Biol. Chem. 263, 12427-12438
Smith, T. F., and Waterman, M. S. (1981) J. Mol. Biol. 147, 195-197
Southern, E. M. (1975) J. Mol. Biol. 98, 503-517
Spritz, R. A., Deriel, J. K., Forget, B. G., and Weissman, S. H. (1980) Cell 21, 639-646
Spritz, R. A., and Giebel, L. B. (1988) Mol. Biol. Evol. 5, 21-29
Tagle, D. A., Koop, B. F., Goodman, M., Slightom, J. L., Hess, D. L., and Jones, R. T. (1988) J. Mol. Biol. 203, 439-455
Walsh, J. R. (1987) Genetics 117, 543-557
Watanabe, B., Fujii, T., Nakashima, Y., Maita, T., and Matsuda, G. (1985) Biol. Chem. Hoppe-Seyler 366, 265-269
Wilbur, M. J., and Lipman, D. J. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 726-730
Wiley, E. O. (1981) Phylogenetics: The Theory and Practice of Phylogenetic Systematics, John Wiley and Sons, New York
Zweig, S. E. (1984) Nucleic Acids Res. 12, 767-776