674 Nucleic Acids Research, Vol. 18, No. 3

The complete sequence of mag, a new retrotransposon in Bombyx mori

© Oxford University Press 1990

Jean-Jacques Michaille, S. Mathavan, Janine Gaillard and Annie Garel

Centre de Genetique Moleculaire et Cellulaire, CNRS UMR 106, Universite Lyon 1, 43 Boulevard du 11 Novembre 1918, F-69622 Villeurbanne cedex, France

EMBL accession no. X17219

Submitted December 21, 1989

Mag, a 4564 bp transposable element, was discovered in the large intron of a cloned allele of the Ser2 gene (1). A few copies (7 to 14) are dispersed in the genome of different strains of Bombyx mori. The element is flanked by a 5 bp repeat of the target sequence and is bordered by direct terminal repeats of 77 nucleotides. Two large open reading frames are organized like the gag and pol genes of retroviruses, on the non-coding strand of the Ser2 gene. ORF1 is 258 codons long and presents the characteristic features of two nucleic acid binding motifs (underlined, a). ORF2 (1195 codons) shows strong homologies with the retroviral protease (b), reverse transcriptase (c), RNase H (d) and endonuclease (e, f), in this order. The unusually short terminal repeat, which differs from the arrangement of the LTR of retroviruses, has been confirmed by sequencing other copies of this element selected from a Bombyx genomic library. From the phylogenetic tree established on the RT sequence (2 and personal communication), this retrotransposon can be positioned among the copia-like family elements of Drosophila.
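The features described above (open reading frames read on the strand opposite to the host gene, and identical direct repeats at both ends of the element) can be located computationally. The following is a minimal Python sketch, not the procedure used by the authors. The function names are hypothetical, the standard genetic code stop codons are assumed, and an open frame is taken here to be any stop-free run of codons, since the text does not state how the ORF boundaries were defined.

```python
from __future__ import annotations

# Assumption: standard genetic code stop codons, uppercase A/C/G/T input.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_open_frame(seq: str) -> tuple[int, int, int]:
    """Find the longest stop-free run of codons in the three forward frames.

    Returns (codon_count, frame, start_index). Stop codons are not counted.
    Call it on reverse_complement(seq) to scan the opposite strand.
    """
    best = (0, 0, 0)
    for frame in range(3):
        run_start = frame
        count = 0
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if count > best[0]:
                    best = (count, frame, run_start)
                count = 0
                run_start = i + 3
            else:
                count += 1
        # A run may extend to the end of the sequence without a stop codon.
        if count > best[0]:
            best = (count, frame, run_start)
    return best


def direct_terminal_repeat(seq: str) -> int:
    """Length of the longest non-overlapping prefix that equals the suffix."""
    for k in range(len(seq) // 2, 0, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


# Figures quoted in the text, used only for an arithmetic sanity check.
ELEMENT_LENGTH = 4564      # bp
ORF_CODONS = (258, 1195)   # ORF1, ORF2
TERMINAL_REPEAT = 77       # nucleotides, present at both ends

# Nucleotides spanned by both ORFs plus the two terminal repeats.
accounted = 3 * sum(ORF_CODONS) + 2 * TERMINAL_REPEAT
```

With the figures quoted in the text, `accounted` is 3 × (258 + 1195) + 2 × 77 = 4513 nucleotides, which fits within the 4564 bp element. This is only a consistency check on the reported lengths. It says nothing about whether the two frames overlap or how they are spaced. Applied to the full sequence (EMBL accession no. X17219), `direct_terminal_repeat` would be expected to report the 77-nucleotide repeat if the two copies are identical, and `longest_open_frame` would need to be run on the reverse complement of the Ser2 coding strand.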

[Figure: nucleotide sequence of mag with the deduced amino acid sequences of ORF1 and ORF2. The features a to f referred to in the text are marked on the sequence.]

REFERENCES

1. Michaille, J.J., Garel, A. and Prudhomme, J.C. (1990) Gene, in press.
2. Xiong, X. and Eickbush, T.H. (1988) Mol. Biol. Evol. 5, 675-690.
