Transgenic Research 1, 228-236 (1992)

Synthetic cryIIIA gene from Bacillus thuringiensis improved for high expression in plants

DENNIS W. SUTTON, PATTI K. HAVSTAD and JOHN D. KEMP*

Plant Genetic Engineering Laboratory, New Mexico State University, Box 3GL, Las Cruces, NM 88003, USA

Received 30 January 1992; revised 6 May 1992; accepted 11 May 1992

A 1794 bp synthetic gene was constructed from chemically synthesized oligonucleotides in order to improve expression of the cryIIIA gene from Bacillus thuringiensis var. tenebrionis in transgenic tobacco. The crystal toxin genes (cry) from B. thuringiensis are difficult to express in plants even when under the control of efficient plant regulatory sequences. We identified and eliminated five classes of sequence found throughout the cryIIIA gene that mimic eukaryotic processing signals and which may be responsible for the low levels of transcription and translation. Furthermore, the GC content of the gene was raised from 36% to 49% and the codon usage was changed to be more plant-like. When the synthetic gene was placed behind the cauliflower mosaic virus 35S promoter and the alfalfa mosaic virus translational enhancer, up to 0.6% of the total protein in transgenic tobacco plants was cryIIIA as measured by immunoblot analysis. Bioassay data using potato beetle larvae confirmed this estimate.

Keywords: Bacillus thuringiensis var. tenebrionis; cryIIIA; synthetic gene

Introduction

The study of foreign gene expression in plants was made possible by the development of techniques for the transfer of DNA from virtually any source into plants. Early successes of transgenic expression led many to believe that most genes could be expressed in a foreign background. Such optimism was questioned when genes such as the Bacillus thuringiensis (Bt) crystal protein toxin (cry) genes were found to be poorly expressed in plants even at the transcriptional level (Fischoff, 1992; M.J. Adang and G. Cardineau, personal communication). Failure of expression appeared to be due to the coding sequence itself, since the cry genes had been combined with a variety of high expressing promoters and proven poly(A) addition signals (Adang et al., 1987; Barton et al., 1987; Vaeck et al., 1987). A better understanding is needed of why some genes are not transcribed or translated in distantly related genetic backgrounds if we are to apply gene expression technology effectively.

We chose the coding sequence of the cryIIIA gene from Bt var. tenebrionis (Btt) (Krieg et al., 1983, 1984) for this investigation. This gene produces a protein that is lethal to several insects in the order Coleoptera, including some important crop pests (Herrnstadt et al., 1986). The cryIIIA gene has an open reading frame equivalent to a 73 kDa protein (Sekar et al., 1987). Such a protein accumulates early during Btt sporulation but later appears to be processed from the N-terminal end to a 67 kDa polypeptide (Carroll et al., 1989). Whether the 67 kDa protein is the result of processing or of secondary initiation at methionine-48 (met-48) is not clear. In any event, this polypeptide accumulates to high levels during late sporulation and forms the crystals within the spore. Since it was determined that the 67 kDa protein starting at met-48 was toxic (McPherson et al., 1988), we chose to begin our synthetic cryIIIA gene at this position.

An examination of the native cryIIIA coding sequence and a comparison with both monocotyledon and dicotyledon genes in GenBank reveal that the sequence contains a number of features different from typical plant coding sequences. These differences may help explain why the cry genes are poorly expressed in plants. The major differences are: the GC content of cryIIIA is 36%, compared with about 47% for an average plant gene; the codon usage of cryIIIA is very different from that found in plants; the cryIIIA gene contains three AATAAA sequences, which, as poly(A) addition signals, are usually limited to the 3' untranslated region of eukaryotic genes (Proudfoot and Brownlee, 1976); the sequence ATTTA, found to destabilize haemoglobin mRNA (Shaw and Kamen, 1986), is rare in plant coding sequences but is found 12 times in cryIIIA; finally, the cryIIIA gene contains many runs of As and Ts, which are common in plant introns but rare in exons. It is not known at this time which of these factors is the most important. Therefore, all of the factors were considered equally important and were used to determine the modified sequence.

After synthesis, the improved cryIIIA coding sequence was placed between the cauliflower mosaic virus (CaMV) 35S promoter (Odell et al., 1985) containing the alfalfa mosaic virus (AMV) coat protein untranslated leader sequence (Gallie et al., 1987) and the nopaline synthase 3' untranslated sequence (Rogers et al., 1987). This synthetic gene was transferred to tobacco and analysed for expression.

Recently, Monsanto researchers reported that high levels of expression had been achieved in tobacco and tomato with a modified cryIA gene from Bt var. kurstaki (Perlak et al., 1990, 1991). Their approach was much the same as ours in that they also changed the coding sequence of the wild type Bt gene to remove those sequences that are seldom found in eukaryotic exons. These included sequences such as potential polyadenylation signals, A and T rich areas and ATTTA runs. The work reported in this paper demonstrates that high levels of expression of the cryIIIA gene in plants are possible when the coding sequence is modified to be more plant-like.

*To whom correspondence should be addressed.
0962-8819 © Chapman & Hall

Materials and methods

CryIIIA gene construction

The synthetic cryIIIA gene was made in six sections (I-VI) of 250 to 360 base pairs, each double stranded section being defined by unique restriction sites as shown in Fig. 1. Each section was assembled from 13 to 24 oligonucleotides, 29 to 79 bases in length. The upper strand oligonucleotides were mainly 76 and 78 bases in length and were overlapped at each junction by 15 to 20 bases of a lower strand oligonucleotide. After synthesis (DNA Synthesizer, Applied Biosystems Inc., Foster City, CA, USA), the oligonucleotides were purified on oligonucleotide purification columns (OPC, Applied Biosystems Inc.) and then by polyacrylamide gel electrophoresis (Matthes et al., 1987).

Each of the six sections was assembled as follows: equimolar amounts, approximately 50 pmoles, of the oligonucleotides were combined and phosphorylated using T4 polynucleotide kinase essentially as described by Frank et al. (1987). Annealing and ligation of the oligonucleotides into a double stranded section were carried out simultaneously using the thermostable ligase enzyme, Ampligase (Epicentre Technologies, Madison, WI, USA). Ampligase is an NAD-requiring enzyme with different buffer requirements from the polynucleotide kinase enzyme. Therefore, the phosphorylated oligonucleotides were diluted five-fold with water before adding the 10× ligation buffer supplied by the manufacturer. The oligonucleotide solution was heated to 70 °C for 5 min, 25 units of Ampligase were added and the reaction was incubated for 30 min at 65 °C. The temperature was then lowered to 50 °C for 1 h, then 40 °C for 2 h and finally 30 °C for 2 h. The reaction was stopped by extraction with phenol-chloroform and the DNA was ethanol precipitated. Following restriction enzyme digestion, the fragments were electrophoretically purified from agarose (Sambrook et al., 1989). The resulting dsDNA fragments were ligated into pSP73 (Promega, Madison, WI, USA). Clones were selected for proper size and four of these were sequenced using the Sequenase sequencing kit (United States Biochemicals, Cleveland, OH, USA). The DNA clone that had the fewest errors (often all four clones had the same error) was ligated to the other sections to assemble the synthetic Btt cryIIIA gene.

Fig. 1. Assembly of the synthetic cryIIIA gene. [Map of sections I-VI and the BAM, SAC, BGL, MLU, CLA, ECO and KPN restriction sites; scale to 1800 bp.] After sequencing, all five synthesized DNA fragments were ligated together at unique restriction sites designed into the gene. Fragment III-IV was annealed and ligated in a single reaction from oligonucleotides. BAM = Bam HI, BGL = Bgl II, CLA = Cla I, ECO = Eco RI, KPN = Kpn I, MLU = Mlu I and SAC = Sac I.

In vitro mutagenesis

The base deletion errors found during sequencing (none of the base substitution errors changed the amino acid sequence) were repaired using in vitro mutagenesis (Kunkel, 1985).
Briefly, this procedure involved cloning the synthetic Bt gene fragment containing the errors into Bluescript phagemid (Stratagene, La Jolla, CA, USA), then transforming it into CJ236 (Bio-Rad, Richmond, CA, USA), an Escherichia coli strain which substitutes uracil in place of thymine in its DNA. Uracil-containing single-stranded DNA was rescued from CJ236 using a helper phage, R408 (Bluescript manual, Stratagene, La Jolla, CA, USA). The ssDNA was used as a template for an oligonucleotide that contained the repairs. Two oligonucleotides with up to three contiguous base changes were annealed simultaneously to repair deletions. Klenow fragment was used to polymerize in vitro the second strand, which contained only thymine. The double stranded DNA was then transformed into E. coli strain DH5α (GIBCO BRL, Gaithersburg, MD, USA), where the strand that contained uracil was destroyed, leaving only the strand that contained the corrections.


A simple method for screening colonies for repairs was based on calculating the difference in dissociation temperature (Tm) between a perfect match of template and oligonucleotide and one containing from one to three mismatches. For our particular oligonucleotide/template pairs, the Tms were lowered by 5 °C to 13 °C (see Sambrook et al., 1989 for the equation). Miniprep DNA (Birnboim and Doly, 1979) from putative repaired clones was bound to replicate nitrocellulose filters and hybridized to 32P labelled oligonucleotide. The washing stringency was then gradually increased in 5 °C increments and at each increment a filter was removed and placed on X-ray film. The oligonucleotide and template that still contained mismatched bases dissociated at the lower temperature. In general 50-75% of the selected clones contained a repaired DNA. Finally, the repair was confirmed by sequencing.
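The screening logic can be illustrated with a rough calculation. The sketch below is not the equation the authors took from Sambrook et al. (1989); it assumes the Wallace rule for short oligonucleotides (2 °C per A or T, 4 °C per G or C) and a rule-of-thumb drop of about 1 °C per 1% mismatch. The 20-base oligonucleotide and the function names are hypothetical.

```python
# Rough Tm estimate for a short oligonucleotide and its mismatched duplexes.
# Assumptions (not from the paper): Wallace rule, ~1 degC drop per 1% mismatch.


def wallace_tm(oligo: str) -> float:
    """Wallace rule: 2 degC per A/T plus 4 degC per G/C."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def mismatched_tm(oligo: str, mismatches: int, drop_per_percent: float = 1.0) -> float:
    """Estimated Tm of a duplex carrying the given number of mismatched bases."""
    percent_mismatch = 100.0 * mismatches / len(oligo)
    return wallace_tm(oligo) - drop_per_percent * percent_mismatch


if __name__ == "__main__":
    oligo = "ACGTACGTACGTACGTACGT"  # hypothetical 20-mer, 50% GC
    for n in range(4):
        print(n, "mismatches:", mismatched_tm(oligo, n), "degC")
```

For this hypothetical 20-mer, one to three mismatches lower the estimate by 5 to 15 °C, the same order as the 5 °C to 13 °C reported above, which is why washes stepped in 5 °C increments can separate repaired from unrepaired clones.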

Plant vectors

The synthetic cryIIIA gene was placed behind the CaMV 35S promoter of pMON 316 (Rogers et al., 1987), which we had modified (Fig. 2) to enhance translation through the addition of the AMV coat protein 5' untranslated leader sequence (Gallie et al., 1987). Triparental matings were used to co-integrate the pMON vector with pTiT37-SE (Rogers et al., 1987). Tobacco was transformed and regenerated by the leaf disc method (Horsch et al., 1985). Transformed plants were selected using kanamycin resistance and screened for nopaline production.

Fig. 2. Map of pGEL 260. The 1794 bp synthetic cryIIIA gene was placed behind the CaMV 35S promoter and an AMV translational enhancer element in the binary plant transformation vector pMON 316. The base sequence of the AMV coat protein untranslated leader and adjacent restriction sites is as follows:

GGATCCTTTTTATTTTTAATTTTCTTTCAAATACTTCCAGATCTACCATGGTACC

Restriction sites, in order: Bam HI (GGATCC), Bgl II (AGATCT), Nco I (CCATGG), Kpn I (GGTACC).

Bioassays

Bioassays involved feeding larvae of the potato beetle, Leptinotarsa decemlineata, potato leaves painted with extracts of tobacco leaves or with authentic cryIIIA protein. Leaves from transformed and non-transformed control tobacco plants were homogenized in a dual tissue grinder (Kontes, Vineland, NJ, USA) with four volumes of 0.1 M Na2CO3 buffer, pH 10.5. The extract was heated to 65 °C for 10 min and immediately centrifuged in a microfuge for 5 min to remove cell debris. The supernatant was then dialyzed overnight against two changes of pH 6.5 sodium phosphate buffer; the first buffer change was 100 mM and the second 10 mM phosphate. Aliquots of leaf extract were either applied directly to both sides of the potato leaves or diluted with H2O before application. The painted leaves were allowed to air dry and beetle larvae, usually neonate, were placed on the leaves in covered petri dishes. The potato beetle larvae were hatched from egg masses that were a gift from Dr Fred Gould (Dept of Entomology, North Carolina State University, Raleigh, NC, USA). Typical feeding experiments lasted 24 to 72 h. Results from the bioassay were recorded as weight gain and observations of leaf damage as seen in Fig. 3.

RNA analysis

Total RNA was extracted from leaf tissue by a LiCl method (deVries et al., 1982) and the poly(A) RNA was isolated on Poly(U) Sephadex (GIBCO BRL, Gaithersburg, MD, USA) (Murray et al., 1981). Approximately 30 µg of total RNA and 3 µg of poly(A) RNA were loaded onto formaldehyde denaturing 1% agarose gels for separation. Gels were blotted onto nitrocellulose and probed with 32P labelled synthetic cryIIIA DNA (Thomas, 1980). Prehybridization and hybridization were done at 43 °C in 50% formamide, 5× SSC, 5× Denhardt's solution, 5 mM sodium phosphate, pH 7.0, 0.1% SDS, 0.1 mg ml⁻¹ denatured calf thymus DNA and 0.04 mg ml⁻¹ poly(A). After hybridization the filters were washed three times in 2× SSC, 0.1% SDS and once in 0.2× SSC, 0.1% SDS at 42 °C, then exposed to Kodak XAR-5 film.

Fig. 3. Bioassay of transformed tobacco plants. Neonate potato beetle larvae were fed either tobacco leaf extracts from synthetic cryIIIA transformed plants or non-transformed controls (C). Extracts were applied to potato leaves and leaves were photographed after 48 h.

Protein analysis

Soluble proteins were extracted from young tobacco leaves in 100 mM Tris-Cl pH 7.8, 10 mM MgCl2, 6 mM β-mercaptoethanol and 1% insoluble polyvinylpolypyrrolidone, quantified using the Bradford assay (Bio-Rad, Richmond, CA, USA), precipitated with nine volumes of ethanol and separated by 10% polyacrylamide SDS-PAGE (Laemmli, 1970). SDS-PAGE was also used to analyse proteins from an aliquot of the leaf extracts prepared for the bioassays. Protein patterns were visualized by staining with Coomassie Blue, or cryIIIA was detected by immunoblot analysis (Towbin et al., 1979). Polyclonal antibodies raised in rabbits against cryIIIA were a gift from Agrigenetics Research Laboratory (Madison, WI, USA).

Results

Synthesis of a modified cryIIIA gene

Five classes of sequence that may affect eukaryotic gene expression, and so be responsible for the poor expression of the Bt cry genes in transgenic plants, are listed in Table 1. A search for these five classes revealed that they are infrequently found in the exons of plant genes and in those bacterial genes that have been successfully expressed in plants. These sequences are more common, however, in a random group of other bacterial genes and very common in cry genes of Bt (Table 1). These correlations led us to conclude that all five classes may affect foreign gene expression in plants. Other differences we identified as possibly being important for expression include GC content and codon usage. The GC content of most plant exons is 46-48% (computer analysis not shown) compared to 36% found in cry genes. Furthermore, highly expressed plant genes tend to show a codon bias that is nearly the opposite of cry gene bias.

A gene of 1794 bp was synthesized to code for a polypeptide with an amino acid sequence identical to amino acids 48 to 644 encoded by the native cryIIIA gene. Seventeen per cent of the DNA sequence was changed to eliminate the five classes of sequence listed in Table 1, the amino acid codons seldom used in plants, and runs of more than 4 As or Ts. The overall GC content was also increased from 36% to a more plant-like 49%. The sequences of the synthetic gene and the native cryIIIA gene are compared in Fig. 4. The synthetic gene was placed behind the CaMV 35S promoter with an AMV translational enhancer as shown in Fig. 2 and transformed into tobacco after cointegration into a disarmed A. tumefaciens Ti plasmid as described in Materials and methods.
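The kind of sequence survey described above can be sketched in a few lines. The example below is a minimal illustration, not the computer analysis used in this study: the function names are hypothetical, and the two 50-base strings are the first row of the native and synthetic sequences as read from Fig. 4, so they serve only as a small test case.

```python
import re

# First 50 bases of each gene as read from Fig. 4 (illustrative test case only).
NATIVE = "ATGACTGCAGATAATAATACGGAAGCACTAGATAGCTCTACAACAAAAGA"
SYNTHETIC = "ATGACTGCTGATAACAACACGGAGGCACTTGATAGCTCTACCACCAAGGA"


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def count_motif(seq: str, motif: str) -> int:
    """Count occurrences of a motif, including overlapping ones."""
    return len(re.findall("(?=" + re.escape(motif) + ")", seq.upper()))


def at_runs(seq: str, min_len: int = 5) -> list:
    """Return runs of at least min_len consecutive As or consecutive Ts."""
    pattern = "A{%d,}|T{%d,}" % (min_len, min_len)
    return re.findall(pattern, seq.upper())


def changed_bases(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


if __name__ == "__main__":
    for name, seq in (("native", NATIVE), ("synthetic", SYNTHETIC)):
        print(name, gc_percent(seq), count_motif(seq, "AATAAA"),
              count_motif(seq, "ATTTA"), at_runs(seq))
    print("changed bases:", changed_bases(NATIVE, SYNTHETIC))
```

On these 50 bases the GC content rises from 36% to 48% with 8 changed bases (16%), close to the whole-gene figures of 36% to 49% and 17% given above.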

Table 1. Classes of sequence with possible regulatory function in eukaryotic cells and their prevalence in selected coding sequences

Class         Sequence
Poly A        AATAAA [AATAA]
Killer        ATTTA
T-rich        TTTPu TPy TTTT TT
A-rich        AAAAAA
Downstream    CA TTG 1 G TG TG T2

                                          Average occurrence per coding sequence
Coding sequence                 No.(3)    Poly A       Killer   T-rich   A-rich   Downstream
Plant genes (expressed)(4)      12        0.1 [0.2]    0.1      0.8      0.0      0.8
Bacterial genes (expressed)(4)  6         0.2 [0.5]    0.2      0.9      0.6      0.7
Plant genes                     33        0.1 [0.7]    0.7      1.2      -        1.5
Bacterial genes                 5         2.0 [4.0]    2.6      3.5      1.5      1.2
Cry genes                       4         2.7 [8.0]    12.0     8.0      3.0      3.0

Five classes of sequence were identified from current literature that may cause reductions in transgenic expression. References are: poly(A), Proudfoot and Brownlee, 1976; Killer, Shaw and Kamen, 1986; T-rich, Gil and Proudfoot, 1987; downstream, Berget, 1984, and Taya et al., 1982. (3) Number of sequences reviewed. The number in brackets shows the frequency of the sequence which is in brackets. (4) Genes which have been expressed in transgenic plants.

Fig. 4. Comparison of the nucleotide sequence of the natural cryIIIA gene from Btt (upper line of each row) with that of the synthetic cryIIIA gene. A '*' symbol is below each base that was changed. [Figure: aligned natural and synthetic sequences, numbered in 50-base rows.]

Transcription of synthetic cryIIIA

The results of Northern analysis shown in Fig. 5A reveal a highly abundant transcript of approximately 2 kb that is present in cryIIIA transformed tobacco but absent in non-transformed control tobacco RNA. When compared to the ribulose bisphosphate carboxylase small subunit (RbcS) signal on the same filter, we estimate that up to 0.5% of the total RNA is cryIIIA mRNA (data not shown). The native cryIA gene but not the native cryIIIA gene was available to our laboratory at the time of this study; therefore, a direct comparison was not made between the synthetic and the natural cryIIIA genes. This does not appear to be a serious problem since others have shown that neither native cryIA nor native cryIIIA genes are effectively expressed in plants (Fischoff, 1992; M.J. Adang and G. Cardineau, personal communication).

Fig. 5. Northern blot analysis of tobacco mRNA transcripts. (A) Approximately 30 µg of total RNA from two different tobacco plants transformed with the synthetic cryIIIA gene (lanes 1 and 2) and a non-transformed control (lane 3) was hybridized to a 32P labelled 1.8 kbp synthetic cryIIIA DNA fragment as described in Materials and methods. X-ray film was exposed to the radioactive filter for 7 h. (B) Approximately 25 µg of poly(A) RNA from two tobacco plants transformed with the natural cryIA gene from Bt var. kurstaki (lanes 1 and 2) and a non-transformed control plant (lane 3) was hybridized to 32P labelled cryIA DNA. Exposure time was 7 days.

The present work demonstrates again that the cryIA gene from Bt var. kurstaki HD73 behind the CaMV 35S promoter in pMON 316 (Bailie, 1988) is poorly expressed (Fig. 5B). Messenger RNA in transgenic tobacco containing the synthetic cryIIIA gene accumulated at the predicted size (Fig. 5A; the higher molecular weight bands are not detectable if poly(A) RNA is probed), whereas most of the mRNA from plants containing the natural Bt gene was degraded and no mRNA of the predicted 3.8 kb was detected (Fig. 5B). Differences in abundance between the two mRNAs were even more striking. Taking into account the amount of RNA loaded, the specific activity of the probes, exposure times, and the relative intensity of the bands, we estimated that a cryIA signal 10³ to 10⁴ times less intense than the cryIIIA signal would be detectable.
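The order of magnitude of that sensitivity estimate can be reproduced with simple arithmetic. The sketch below uses the RNA loads and exposure times from the Fig. 5 legend; the poly(A) fraction of total RNA (taken here as 2%) and the equal probe specific activities are assumptions made for illustration, not values from this study, and the relative band intensity is left out.

```python
def relative_sensitivity(total_rna_ug: float, exposure_a_h: float,
                         polya_rna_ug: float, exposure_b_h: float,
                         polya_fraction: float = 0.02,
                         specific_activity_ratio: float = 1.0) -> float:
    """Fold gain in sensitivity of blot B (poly(A) RNA) over blot A (total RNA).

    polya_fraction and specific_activity_ratio are assumed values.
    """
    # Express the poly(A) load as the amount of total RNA it was purified from.
    total_equivalent_ug = polya_rna_ug / polya_fraction
    load_gain = total_equivalent_ug / total_rna_ug
    exposure_gain = exposure_b_h / exposure_a_h
    return load_gain * exposure_gain * specific_activity_ratio


if __name__ == "__main__":
    # Fig. 5A: 30 ug total RNA, 7 h exposure; Fig. 5B: 25 ug poly(A) RNA, 7 days.
    fold = relative_sensitivity(30.0, 7.0, 25.0, 7 * 24.0)
    print(round(fold))
```

Under these assumptions blot B is roughly 1000-fold more sensitive than blot A, the lower end of the 10³ to 10⁴ range; a smaller assumed poly(A) fraction, a probe of higher specific activity, or allowance for the strength of the cryIIIA band would raise the figure.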

CryIIIA protein expression

Tobacco leaf extracts and native cryIIIA protein were subjected to SDS-PAGE. One section of the gel was stained with Coomassie Blue and the other visualized by immunoblot analysis as shown in Fig. 6A,B. A 68 kDa polypeptide that reacted with polyclonal antibodies to cryIIIA protein and migrated the same distance as authentic cryIIIA was detected in all three transgenic tobacco plant extracts tested (Fig. 6A; lanes 3, 4 and 5) but not in the untransformed control plant (Fig. 6A; lane 2). Expression in the plant shown in lane 5 was very low and the signal was lost on photographic reproduction of the blot. Another band of approximately 60 kDa that cross reacts with the antibodies was found in both transformed and non-transformed tobacco. This cross reactive binding was not detected in leaves that were extracted by the procedure designed for bioassay experiments (Fig. 6C).

The cryIIIA plant extract shown in Fig. 6A (lane 3) appears to express the cryIIIA protein at a much higher level than two other independently transformed plants (Fig. 6A; lanes 4, 5). We estimate that cryIIIA represents 0.6% of the total solubilized protein in the leaves of the highest expressing transgenic plant. This estimate was made by comparing the intensity of the 67 kDa band in Fig. 6A (lane 3) to that of the authentic cryIIIA (Fig. 6A; lane 1). A new protein band can be seen in the Coomassie stained lane containing extract from the high expressing plant (Fig. 6B; lane 3). This new band migrated the same distance as the authentic cryIIIA protein. Furthermore, the intensity of the new protein band (Fig. 6B; lane 3) when compared to a known amount of authentic cryIIIA suggests that the high expressing tobacco plant contains 0.5-1.0% of its soluble leaf protein as cryIIIA. The other two transformed plants contain considerably less cryIIIA protein.
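The conversion from band intensity to a percentage of soluble protein is a one-line calculation. The sketch below assumes the loads given in the Fig. 6 legend (400 ng of authentic cryIIIA and about 100 µg of total soluble protein per plant lane); the band-intensity ratio of 1.5 is a hypothetical value chosen because it reproduces the 0.6% estimate, since the measured ratio is not reported here.

```python
def percent_of_total_protein(standard_ng: float, intensity_ratio: float,
                             total_protein_ug: float) -> float:
    """Toxin as a percentage of the total protein loaded in a lane.

    intensity_ratio is the sample band intensity divided by that of the
    standard; a linear response is assumed.
    """
    toxin_ng = standard_ng * intensity_ratio
    return 100.0 * toxin_ng / (total_protein_ug * 1000.0)


if __name__ == "__main__":
    # 400 ng standard, hypothetical 1.5-fold stronger sample band, 100 ug loaded.
    print(percent_of_total_protein(400.0, 1.5, 100.0))
```

A band equal in intensity to the 400 ng standard would correspond to 0.4% of the protein loaded, and one 1.5 times as intense to 0.6%.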

Bioassays

The Colorado potato beetle will not feed on tobacco. Therefore, a bioassay for toxic cryIIIA protein was developed using tobacco leaf extracts painted onto potato leaves (see Materials and methods). Neonate larvae seldom died within 72 h after ingesting even high levels (100 ng cm⁻²) of authentic cryIIIA protein. They did, however, reduce their feeding, and their weight gain was inversely related to the amount of cryIIIA on the leaf surface, as shown in Fig. 7. The amount of growth inhibition from transformed tobacco extracts applied to potato leaves was compared with that observed using known amounts of authentic cryIIIA protein. Diluted transformed tobacco extracts applied at a concentration of 3.6 µg of total protein per cm² of potato leaf surface caused a 90% reduction in growth when compared to controls over a 72 h period. By comparison, authentic cryIIIA protein exhibited this level of inhibition at 20 ng of toxin per cm² of leaf surface. Since 20 ng of toxin in 3.6 µg of total protein is about 0.56%, these data indicate that the cryIIIA concentration in the highest expressing tobacco plant is approximately 6 mg g⁻¹ of protein (0.6%), the same as that observed when comparing the Western blot data and protein staining data.

When third instar larvae were fed transformed leaf extracts applied to potato leaves at a concentration of approximately 15 µg total protein per cm² (estimated 100 ng cryIIIA protein) they failed to gain weight, as shown in Fig. 8. Unlike neonate larvae, however, they continued to eat as much as control larvae.

Fig. 6. Detection of cryIIIA protein in transformed tobacco leaves. (A) A 400 ng sample of authentic cryIIIA (lane 1) and approximately 100 µg of total soluble protein from a non-transformed control tobacco plant (lane 2) and three different transformed plants (lanes 3-5) were subjected to SDS-PAGE and immunoblot analysis. Transformed plants shown in lanes 3 and 4 are the same as those shown in Fig. 5A, lanes 1 and 2, respectively. It can be estimated from staining intensity that the amount of cryIIIA protein in lane 3 (68 kDa band) is approximately 0.5% of the total protein loaded. (B) Aliquots of the same samples used in panel A were run on another portion of the gel and stained with Coomassie Blue. An abundant protein band that migrated the same distance as the authentic cryIIIA (lane 1) can be seen in lane 3. This band cannot be seen in the non-transformed plant proteins in lane 2. (C) An immunoblot analysis was also done on transformed tobacco extracts that were processed for bioassays (see Materials and methods). Lane 1 contains protein from a transformed tobacco leaf and lane 2 a non-transformed control. Of interest is that the nonspecific 60 kDa band that cross reacted with cryIIIA antibodies in panel A is no longer present. This could be the result of proteolysis of the cross reacting material during processing.

Fig. 7. Bioassays of Btt cryIIIA protein. Neonate potato beetle larvae were fed various concentrations of authentic cryIIIA protein (small checks) or dilutions of an extract from transformed tobacco leaves (large checks) applied to potato leaves. An average weight of five larvae was recorded after 72 h. Zero points are larvae fed leaf extracts from non-transformed tobacco or potato leaves painted with buffer only.

Fig. 8. CryIIIA bioassay using 3rd instar potato beetle larvae. [Plot; x-axis: Time (hr).] When larger 3rd instar larvae were fed potato leaves painted with extracts from transformed tobacco leaves (■-■) at a concentration of 15 µg of total tobacco protein per cm² they failed to gain weight. Larvae fed potato leaves painted with an equal amount of non-transformed tobacco leaf protein more than doubled their weight in 30 h (+-+). Both groups consumed nearly the same amount of leaf material.

Discussion

This work clearly demonstrates that modification of nucleotide sequence makes it possible to obtain high levels of expression in transgenic plants of a coding sequence that in native form was virtually non-expressing. Since five classes of modifications were used to enhance expression, it is difficult to know the relative contribution of each class without further studies. It may be that the dramatic increase in expression was due to incremental increases contributed by all classes rather than any single change.

The observed expression level of any gene is ultimately dependent upon the steady-state level of full length mRNA as well as an efficient rate of translation. Certainly the size of a particular mRNA can be affected if its gene contains sequences within its coding region that are misinterpreted as processing and termination signals. Many eukaryotic mRNAs contain the sequence AAUAAA 10-30 bases upstream of the poly(A) site. In addition, both GU-rich and U-rich sequence elements 3' to the AAUAAA are required for efficient 3' end formation of rabbit β-globin mRNA (Gil and Proudfoot, 1987). Our search revealed that these processing sequences are rarely found in plant exons but frequently found in areas that are normally processed, e.g. introns and termini of 3' untranslated regions. These sequence elements may explain the observations that truncated populations of mRNA are found in plants transformed with native cry genes (Murray et al., 1991).

Another sequence element, AUUUA, is frequently found in the 3' untranslated region of transiently expressed animal genes but not in genes that produce stable mRNAs (Shaw and Kamen, 1986). Further, when this element is added to the 3' end of the stably expressed globin gene its mRNA becomes highly unstable (Shaw and Kamen, 1986). Recently, the element was shown to destabilize mRNAs of reporter genes transformed into plants, especially when located in AT rich areas near the 3' terminus of the gene (Ohme-Takagi et al., 1991). Again, our search showed that this element is rarely found in the exons of plant genes but is scattered throughout the cry genes. A partially modified cryIA gene in which 50% of the AUUUA and potential polyadenylation signals were removed gave a level of insect protection to tomato plants 10 times greater than the wild type gene but less than a fully modified version (Perlak et al., 1991). This result suggests that cry protein levels and possibly even mRNA levels were increased. However, no direct measurements of either were reported.
This finding supports the opinion that the effect of each sequence change may be incremental. The same researchers also found that modifications to the 5' half of the gene appeared to be much more important than changes to the 3' end. Murray et al. (1991) also found that most of the mRNA instability is contained in the first 579 bases of the cryIA gene. Whether this will hold for cry genes in general is under investigation with our synthetic cryIIIA gene.

Another area we considered when attempting to maximize expression of cryIIIA was codon bias. Computer searches revealed that a codon bias does exist for highly expressed plant genes but it is not as great as that for bacterial genes. In general, the frequently used codons in plants are the infrequently used codons in cry genes and vice versa. Furthermore, searches reveal that the NTA codon (where N is any nucleotide) is avoided in both dicotyledons and monocotyledons while the NCG codon is avoided in dicotyledons (Murray et al., 1989). In highly expressed genes such as RbcS the bias against these codons is even stronger (Murray et al., 1989). The cryIIIA gene uses NTA at a frequency of 10% and NCG at a frequency of 3.5%. For the improved synthetic gene we lowered these to 0.5% and 1.3%, respectively. In yeast, replacement of 25 of the more favoured codons with rarely used ones in a highly expressed gene decreased not only the protein level but also the steady-state level of its mRNA (Hoekema et al., 1987).

The utility of expressing cryIIIA at high levels awaits the transfer of this gene into crop plants such as potato, cotton and alfalfa. The potato beetle is very sensitive to the Btt toxin but questions remain as to whether other coleopteran pests are sensitive enough to be controlled in the field by the higher levels of cryIIIA in transgenic plants. Eventually, the high level of expression may be further targeted to tissues specifically damaged by the insects.

The ability to efficiently express any foreign coding sequence in plants may be generally limited by many of the factors we have demonstrated to be important for high expression of cryIIIA. Our computer search showed that bacterial genes other than the cry genes also contain many of the undesirable sequences and, furthermore, that bacterial genes that have been successfully expressed in plants without coding sequence modification contain fewer of these sequences. This correlation suggests that it may be important to examine any non-plant coding sequence before attempting to express it in plants.
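As an illustration of what examining a coding sequence could involve, the following is a minimal sketch of a screen for the features discussed above: the DNA forms of AAUAAA and AUUUA, the GC content, and the frequencies of NTA and NCG codons. It is not the search program used in this work. The function names (`find_motif`, `gc_content`, `codon_class_frequency`, `screen`) and the report layout are hypothetical, and the sketch assumes a non-empty sequence that starts in frame at its first base.

```python
# Illustrative screen of a coding sequence; all names here are hypothetical.

# DNA equivalents of the RNA elements AAUAAA and AUUUA discussed in the text.
MOTIFS = {"polyA_signal": "AATAAA", "instability": "ATTTA"}


def find_motif(seq, motif):
    """Return 0-based start positions of every match, including overlaps."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def gc_content(seq):
    """Fraction of bases that are G or C (assumes a non-empty sequence)."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_class_frequency(seq, second, third):
    """Fraction of in-frame codons with the given 2nd and 3rd bases.

    For example, second="T", third="A" counts NTA codons.
    Trailing bases that do not fill a whole codon are ignored.
    """
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    hits = sum(1 for c in codons if c[1] == second and c[2] == third)
    return hits / len(codons)


def screen(seq):
    """Summarize motif positions, GC content and NTA/NCG codon frequencies."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA spelling
    report = {name: find_motif(seq, motif) for name, motif in MOTIFS.items()}
    report["gc"] = gc_content(seq)
    report["NTA"] = codon_class_frequency(seq, "T", "A")
    report["NCG"] = codon_class_frequency(seq, "C", "G")
    return report
```

For the toy sequence ATGCTAAATAAAGCGATTTATTTA (not taken from cryIIIA), `screen` reports one AATAAA at position 6, ATTTA at positions 15 and 19, an NTA frequency of 0.25 and an NCG frequency of 0.125. A real screen would probably also need the GU-rich and U-rich downstream elements and a comparison against plant codon usage tables, which this sketch omits.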

Acknowledgement

This research has been supported, in part, by the State of New Mexico Centers of Technological Excellence Program (NM, USA).

References

Adang, M.J., Firoozabady, E., Klein, J., DeBoer, D., Sekar, V., Kemp, J.D., Murray, E.E., Rocheleau, T.A., Rashka, K., Staffeld, G., Stock, C., Sutton, D. and Merlo, D.J. (1987) Expression of a Bacillus thuringiensis insecticidal crystal protein gene in tobacco plants. In Arntzen, C.J. and Ryan, C., eds, Molecular Strategies for Crop Protection, UCLA Symposia on Molecular and Cellular Biology, New Series, pp. 345-53. New York: Alan R. Liss.
Bailie, S.E. (1988) Expression of the Bacillus thuringiensis var. Kurstaki HD-73 delta-endotoxin gene in transgenic tomato. M.Sc. Thesis, New Mexico State University, Las Cruces, NM, USA.
Barton, K.A., Whiteley, H.R. and Yang, N.-S. (1987) Bacillus thuringiensis δ-endotoxin expressed in transgenic Nicotiana tabacum provides resistance to lepidopteran insects. Plant Physiol. 85, 1103-9.
Berget, S.M. (1984) Are U4 small nuclear ribonucleoproteins involved in polyadenylation? Nature 309, 179-81.

Birnboim, H.C. and Doly, J. (1979) A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucl. Acids Res. 7, 1513-23.
Carroll, J., Li, J. and Ellar, D.J. (1989) Proteolytic processing of a coleopteran-specific δ-endotoxin produced by Bacillus thuringiensis var. tenebrionis. Biochem J. 261, 99-105.
deVries, S.C., Springer, J. and Wessels, J.G.H. (1982) Diversity of abundant mRNA sequences and patterns of protein synthesis in etiolated and greened pea seedlings. Planta 156, 129-35.
Fischoff, D.A. (1992) Field performance of insect resistant crops. J. Cell. Biochem. Suppl. 16F, 199.
Frank, R., Meyerhans, A., Schwellnus, K. and Blöcker, H. (1987) Simultaneous synthesis and biological applications of DNA fragments: an efficient and complete methodology. Methods Enzymol. 154, 221-49.
Gallie, D.R., Sleat, D.E., Watts, J.W., Turner, P.C. and Wilson, T.M.A. (1987) A comparison of eukaryotic viral 5'-leader sequences as enhancers of mRNA expression in vivo. Nucl. Acids Res. 15, 8693-711.
Gil, A. and Proudfoot, N.J. (1987) Position-dependent sequence elements downstream of AAUAAA are required for efficient rabbit β-globin mRNA 3' end formation. Cell 49, 399-406.
Herrnstadt, C., Soares, G.G., Wilcox, E.R. and Edwards, D.L. (1986) A new strain of Bacillus thuringiensis with activity against coleopteran insects. Bio/Technology 4, 305-14.
Hoekema, A., Kastelein, R.A., Vasser, M. and DeBoer, H.A. (1987) Codon replacement in the PGK1 gene of Saccharomyces cerevisiae: experimental approach to study the role of biased codon usage in gene expression. Mol. Cell Biol. 7, 2914-24.
Horsch, R.B., Fry, J.E., Hoffman, N.L., Eichholtz, D., Rogers, S.G. and Fraley, R.T. (1985) A simple and general method for transferring genes into plants. Science 227, 1229-31.
Krieg, A., Huger, A.M., Langenbruch, G.A. and Schnetter, W. (1983) Bacillus thuringiensis var. tenebrionis: ein neuer, gegenüber Larven von Coleopteren wirksamer Pathotyp. Z. Ang. Ent. 96, 500-8.
Krieg, A., Huger, A.M., Langenbruch, G.A. and Schnetter, W. (1984) Neue Ergebnisse über Bacillus thuringiensis var. tenebrionis unter besonderer Berücksichtigung seiner Wirkung auf den Kartoffelkäfer (Leptinotarsa decemlineata). Ang. Schädlingskde., Pflanzenschutz, Umweltschutz 57, 145-50.
Kunkle, T.A. (1985) Rapid and efficient site-specific mutagenesis without phenotypic selection. Proc. Natl Acad. Sci. USA 82, 488-92.
Laemmli, U.K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680-5.
Matthes, H.W.D., Staub, A. and Chambon, P. (1987) The segmented paper method: DNA synthesis and mutagenesis by rapid microscale 'shotgun gene synthesis'. Methods Enzymol. 154, 250-87.
McPherson, S.A., Perlak, F.J., Fuchs, R.L., Marrone, P.G., Lavrik, P.B. and Fischoff, D.A. (1988) Characterization of the coleopteran-specific protein gene of Bacillus thuringiensis var. tenebrionis. Bio/Technology 6, 61-6.
Murray, M.G., Peters, D.L. and Thompson, W.F. (1981) Ancient repeated sequences in the pea and mung bean genomes and implications for genome evolution. J. Mol. Evol. 17, 31-42.
Murray, E.E., Lotzer, J. and Eberle, M. (1989) Codon usage in plant genes. Nucl. Acids Res. 17, 477-98.
Murray, E.E., Rocheleau, T., Eberle, M., Stock, C., Sekar, V. and Adang, M. (1991) Analysis of unstable RNA transcripts of insecticidal crystal protein genes of Bacillus thuringiensis in transgenic plants and electroporated protoplasts. Plant Mol. Biol. 16, 1035-50.
Odell, J.T., Nagy, F. and Chua, N.-H. (1985) Identification of DNA sequences required for activity of the cauliflower mosaic virus 35S promoter. Nature 313, 810-2.
Ohme-Takagi, M., Newman, T.C., Taylor, C.B., Howard, C., Chiu, W.-L. and Green, P.J. (1991) Effect of AU-rich sequences on mRNA stability in stably transformed plant cells. Third International Congress of Plant Molecular Biology. Tucson, AZ, USA. Abstract 451.
Perlak, F.J., Deaton, R.W., Armstrong, T.A., Fuchs, R.L., Sims, S.R., Greenplate, J.T. and Fischoff, D.A. (1990) Insect resistant cotton plants. Bio/Technology 8, 939-43.
Perlak, F.J., Fuchs, R.L., Dean, D.A., McPherson, S.L. and Fischoff, D.A. (1991) Modification of the coding sequence enhances plant expression of insect control protein genes. Proc. Natl Acad. Sci. USA 88, 3324-8.
Proudfoot, N.J. and Brownlee, G.G. (1976) 3' non-coding region sequences in eukaryotic messenger RNA. Nature 263, 211-4.
Rogers, S.G., Klee, H.J., Horsch, R.B. and Fraley, R.T. (1987) Improved vectors for plant transformation: Expression cassette vectors and new selectable markers. Methods Enzymol. 153, 253-77.
Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.
Sekar, V., Thompson, D.V., Maroney, M.J., Bookland, R.G. and Adang, M.J. (1987) Molecular cloning and characterization of the insecticidal crystal protein gene of Bacillus thuringiensis var. tenebrionis. Proc. Natl Acad. Sci. USA 84, 7036-40.
Shaw, G. and Kamen, R. (1986) A conserved AU sequence from the 3' untranslated region of GM-CSF mRNA mediates selective mRNA degradation. Cell 46, 659-67.
Taya, Y., Devos, R., Tavernier, J., Cheroutre, H., Engler, G. and Fiers, W. (1982) Cloning and structure of the human immune interferon-γ chromosomal gene. EMBO J. 1, 953-8.
Thomas, P.S. (1980) Hybridization of denatured RNA and small DNA fragments transferred to nitrocellulose. Proc. Natl Acad. Sci. USA 77, 5201-5.
Towbin, H., Staehelin, T. and Gorden, J. (1979) Electrophoretic transfer of proteins from polyacrylamide gels to nitrocellulose sheets: Procedure and some applications. Proc. Natl Acad. Sci. USA 76, 4350-4.
Vaeck, M., Reynaerts, A., Höfte, H., Jansens, S., De Beuckeleer, M., Dean, C., Zabeau, M., Van Montagu, M. and Leemans, J. (1987) Transgenic plants protected from insect attack. Nature 328, 33-7.