INFECTION AND IMMUNITY, Nov. 1990, p. 3788-3795

Vol. 58, No. 11

0019-9567/90/113788-08$02.00/0

Copyright © 1990, American Society for Microbiology

Cloning and Nucleotide Sequence of the Gene Encoding Arginine Deiminase of Mycoplasma arginini

TAKESHI OHNO,1 OSAMU ANDO,2 KAZUHISA SUGIMURA,1* MADOKA TANIAI,2 MOTOYUKI SUZUKI,2 SHIGEHARU FUKUDA,2 YASUKAZU NAGASE,3 KOSHI YAMAMOTO,4 AND ICHIRO AZUMA1

Institute of Immunological Science, Hokkaido University, Sapporo,1 Hayashibara Biochemical Laboratories Inc., Okayama,2 Mochida Pharmaceutical Co., Ltd., Yotsuya 1-7, Tokyo,3 and National Institute of Animal Health, Ibaraki,4 Japan

Received 21 May 1990/Accepted 10 August 1990

The existence of a mycoplasmal arginine deiminase which catalyzes the conversion of L-arginine to L-citrulline has been postulated. Here we show the partial amino acid sequence of arginine deiminase of Mycoplasma arginini and the complete nucleotide sequence of the arginine deiminase gene of M. arginini. The open reading frame deduced from this sequence consists of 1,230 bp encoding 410 amino acids. The mature form of this enzyme contains 409 amino acids after the deletion of the first methionine. In this open reading frame, TGA nonsense codons are used as tryptophan codons; this usage was verified by determination of the amino acid sequence. The molecular weight of the enzyme calculated from the deduced amino acid sequence is 46,372. Recently, the nucleotide sequence of the arginine deiminase gene of M. arginini was reported by Kondo et al. (K. Kondo, H. Sone, H. Yoshida, T. Toida, K. Kanatani, Y.-M. Hong, N. Nishino, and J. Tanaka, Mol. Gen. Genet. 221:81-86, 1990). However, their sequence differed from ours in several places and especially at the C terminus.

Mycoplasma and viral infections often induce drastic changes in host cell metabolism (3). In particular, mycoplasmas or their products have been shown not only to inhibit the stimulation of lymphocytes by allogeneic cells or mitogens but also to serve as mitogens for T and B lymphocytes (3). Copperman and Morton first described the inhibition of mitosis in lymphocyte cultures by Mycoplasma hominis (4). Barile and Levinthal reported that the inhibitory effect of M. hominis on phytohemagglutinin-stimulated lymphocytes was due to the depletion of arginine in the medium (1). Schimke and Barile demonstrated that arginine was a major energy source for nonfermentative mycoplasma species, which converted prodigious amounts of arginine to citrulline and ornithine via the arginine dihydrolase pathway (11). Since then, it has been believed that the mycoplasmal suppressive effect is mainly due to the depletion of arginine from the nutritional source for the cells (3).

Recently, we identified a lymphocyte blastogenesis inhibitory factor (LBIF) in a Mycoplasma arginini-infected human histiocytic lymphoma, U937 (14, 15). LBIF was purified by fast protein liquid chromatography. The molecular weight of this factor was estimated to be approximately 45,000 by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). The cell growth inhibitory activity of LBIF was characterized with lectin-stimulated T lymphocytes (15) as well as various tumor cell lines (13). More recently, it was clarified that LBIF bears arginine deiminase activity (12). We report here the partial amino acid sequence of LBIF and the complete nucleotide sequence of the LBIF gene.

* Corresponding author.

MATERIALS AND METHODS

Purification of LBIF. LBIF was purified from the crude supernatant of M. arginini-infected U937 (human histiocytic lymphoma) as described previously (13, 15). In brief, cells were cultured at 10^6 cells per ml in serum-free RPMI 1640 medium. The supernatant was concentrated by an ultrafiltration membrane module (AIL-1010; Asahi Chemical Industry Co. Ltd., Tokyo, Japan). The crude concentrate was fractionated with a TSK gel DEAE-5PW column (21.5 mm by 15 cm; TOSHO, Tokyo, Japan) equilibrated with 20 mM Tris hydrochloride (pH 7.7). The separation was done with a linear gradient of 0 to 0.5 M NaCl. LBIF activity was tested as described below. The active fractions were subsequently fractionated with a Mono P chromatofocusing column (HR5/20; Pharmacia). A fraction of LBIF was further resolved with a Hi-Pore RP-304 reversed-phase column (4.6 mm by 25 cm; Bio-Rad) by high-performance liquid chromatography. The separation was done with a linear gradient of 0 to 90% acetonitrile-0.1% trifluoroacetic acid for 145 min at a flow rate of 0.5 ml/min at 40°C.

LBIF assay. LBIF showed strong antiproliferation activity against interleukin-1-stimulated murine thymocytes in vitro (14, 15). One unit of LBIF was defined as the amount of the LBIF preparation required to induce a half-maximum response in the LBIF assay. Approximately 10 ng of the LBIF sample corresponds to 1 U of LBIF.

Determination of the partial amino acid sequence of LBIF. LBIF dissolved in 200 µl of 1 M NaHCO3 was denatured at 100°C for 5 min. The sample was digested with N-tosyl-L-phenylalanyl chloromethyl ketone-trypsin (Sigma) at 37°C for 20 h. Seventeen polypeptide fragments were recovered with a Hi-Pore RP-318 reversed-phase column (4.6 mm by 25 cm; Bio-Rad). The separation was done with a linear gradient of 0 to 90% acetonitrile in trifluoroacetic acid for 200 min at a flow rate of 0.5 ml/min at 40°C. Amino acid sequence analyses of the N-terminal eight residues and tryptic peptides were performed with an Applied Biosystems model 470A gas-phase sequencer.

Preparation of synthetic oligonucleotide probes. Heptadecamer oligonucleotide probes were synthesized on the basis of the partial amino acid sequence Glu-Ile-Asp-Tyr-Ile-Thr with an automated synthesizer (model 381A; Applied Biosystems). These probes were separated into three pools, designated T22A, T22G, and T22T (see Fig. 2a).

FIG. 1. SDS-PAGE and tryptic peptide mapping of purified LBIF. (a) Coomassie brilliant blue-stained SDS-PAGE gel. Lanes: 1, purified LBIF; 2, size markers (Kd, kilodaltons; 67, 45, 30, 20.1, and 14.4). LBIF was purified with a TSK gel DEAE-5PW column, a Mono P chromatofocusing column (HR5/20), and a Hi-Pore RP-304 column (see Materials and Methods). SDS-PAGE (15% acrylamide gel) was done as described previously (13, 15). (b) Fractionation by reversed-phase high-performance liquid chromatography of tryptic peptides of LBIF (x axis, time in minutes). Peptides corresponding to peaks A to R were analyzed with an amino acid sequencer.

Preparation of mycoplasmal genomic DNA. M. arginini and Mycoplasma hyorhinis were identified in a human histiocytic lymphoma cell line, U937, by the metabolic inhibition test, and the isolates were cloned as described previously (12). High-molecular-weight DNA of M. arginini or M. hyorhinis was prepared by the standard method as described previously (8).

Construction of a genomic DNA library. High-molecular-weight DNA was digested to completion with EcoRI or BamHI (Takara, Kyoto, Japan). EcoRI-digested fragments or BamHI-digested fragments were subcloned into EcoRI-cleaved λgt10 or BamHI-cleaved EMBL-3, respectively, as described previously (8).

Plaque hybridization. Oligonucleotide probes were labeled with [γ-32P]ATP (Amersham) with a 5'-end-labeling kit (Boehringer). Plaque hybridization was performed by the standard method (8). In brief, plaques were transferred to a nylon membrane (Hybond-N; Amersham) and hybridized with 5'-end-labeled oligonucleotide probes in 5x SSPE (1x SSPE is 0.18 M NaCl, 0.01 M sodium phosphate, and 1 mM EDTA)-5x Denhardt solution-0.1% SDS-100 µg of herring sperm DNA per ml at 37°C overnight. To obtain the full-length LBIF gene, we searched a library constructed with BamHI-cleaved EMBL-3 by using a radiolabeled 725-bp EcoRI fragment. Nick translation was done with a nick translation kit (Amersham). The hybridization solution was a mixture of 5x SSPE, 5x Denhardt solution, 0.5% SDS, and 20 µg of herring sperm DNA per ml. Hybridization was carried out at 65°C overnight.

Southern blot hybridization. Restriction enzyme-digested DNA was electrophoresed in a 0.7% agarose gel and blotted onto a nylon membrane (Hybond-N) in accordance with the manufacturer's instructions. Hybridization was carried out with a radiolabeled probe by the same procedure as that described for plaque hybridization.

Nucleotide sequence analysis.
pUC18 or pUC19 was used for subcloning DNA fragments. The nucleotide sequence was determined by the dideoxy chain termination method (8). To sequence pU1.7#9 or pUEb4, we prepared various deletion mutants by using exonuclease III (Takara) and subcloned them into pUC19 or pUC18 by the standard method (8). The oligonucleotide primer (nucleotides 199 to 219; AACAACATTGATGTCGTTTGC) was also used to

verify the N-terminal nucleotide sequence by the primer extension method (8).

Computer analysis. Computer analysis was done with SDC-GENETIX genetic information processing software (Software Development Co. Ltd., Tokyo, Japan).

Nucleotide sequence accession number. The accession number for the arginine deiminase gene nucleotide sequence is X54312 (EMBL Data Library).

RESULTS

Purification and partial amino acid sequence analysis of LBIF. LBIF was purified to homogeneity from the culture supernatant of M. arginini-infected human histiocytic lymphoma U937 as described previously (12, 13, 15). The apparent molecular weight of this protein was determined to be 45,000 by SDS-PAGE (Fig. 1a). The purified preparation was partially digested with trypsin. The resulting peptides were fractionated by reversed-phase high-performance liquid chromatography (Fig. 1b). Seventeen polypeptide fragments were subjected to amino acid sequence analyses (Fig. 2c).

Cloning of the LBIF gene. Three oligonucleotide probes, designated T22A, T22G, and T22T, were synthesized on the basis of the partial amino acid sequence of tryptic peptide E (Fig. 2a). We screened a genomic λgt10 library constructed with EcoRI-digested DNA of M. arginini by using these probes. Only probe T22A hybridized, and it did so to 3 of 3 x 10^4 plaques. All three clones contained the same-sized inserts, estimated to be about 750 bp by agarose gel electrophoresis. The insert was subcloned into EcoRI-digested pUC19 (designated pUA3A-1), and the nucleotide sequence was determined. Amino acid sequences for 10 polypeptide fragments were found in a predicted open reading frame of this 725-bp EcoRI fragment (Fig. 2c). Thus, we concluded that this 725-bp fragment encoded a portion of the LBIF gene.
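The probe design described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: it assumes full degeneracy at every wobble position of Glu-Ile-Asp-Tyr-Ile-Thr (with Thr contributing only its first two bases, giving a 17-mer), and it assumes that the three pools differ at the probe base pairing with the third base of the first Ile codon, which is how Fig. 2a appears to read. The names `CODON_CHOICES`, `probe_pools`, and `GENE_17MER` are hypothetical; the 17-nucleotide gene segment is taken from the coding sequence for peptide E in Fig. 2c.

```python
from itertools import product

# Codon choices on the coding strand for Glu-Ile-Asp-Tyr-Ile-Thr.
# Thr contributes only "AC" (wobble base omitted), so the total is 17 nt.
CODON_CHOICES = [
    ["GAA", "GAG"],         # Glu
    ["ATT", "ATC", "ATA"],  # Ile (first)
    ["GAT", "GAC"],         # Asp
    ["TAT", "TAC"],         # Tyr
    ["ATT", "ATC", "ATA"],  # Ile (second)
    ["AC"],                 # Thr, first two bases only
]

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antisense strand written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_pools():
    """Group antisense 17-mers by the base that pairs with the first Ile wobble.

    Assumption: this single position is what separates T22A, T22G, and T22T.
    """
    pools = {}
    for codons in product(*CODON_CHOICES):
        probe = reverse_complement("".join(codons))
        wobble = codons[1][2]  # third base of the first Ile codon
        pool_name = "T22" + wobble.translate(COMPLEMENT)
        pools.setdefault(pool_name, set()).add(probe)
    return pools


pools = probe_pools()

# Coding-strand 17-mer for peptide E as read from Fig. 2c.
GENE_17MER = "GAAATTGACTATATTAC"
gene_probe = reverse_complement(GENE_17MER)
hit = [name for name, members in pools.items() if gene_probe in members]

for name in sorted(pools):
    print(name, len(pools[name]), "sequences")
print("Pool containing the exact complement of the gene:", hit)
```

Under these assumptions the gene uses ATT for the first Ile, so its exact complement falls in the pool with A at the distinguishing position, consistent with the observation that only T22A hybridized.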

To obtain a full-length LBIF gene, we constructed another genomic EMBL-3 library by using BamHI-digested DNA. With the 725-bp EcoRI fragment as a probe, 1 hybridizing plaque was obtained from 5 x 10^5 plaques. The clone, designated λLBIF 122#3, contained a 32-kb insert. Restriction site mapping of the region encoding LBIF was done by Southern blot hybridization with a 32P-labeled 725-bp

[Figure 2 not reproduced. Recoverable content: (a) tryptic peptide E, Glu-Ile-Asp-Tyr-Ile-Thr-Pro-Ala, and the complementary 17-mer probe pools T22A, T22G, and T22T; (b) restriction map of the LBIF-coding region in λLBIF 122#3 and its subclones; (c) nucleotide sequence and deduced 410-residue amino acid sequence of LBIF (EMBL accession number X54312).]

FIG. 2. Nucleotide sequence and corresponding amino acid sequence of LBIF. (a) Amino acid sequence of tryptic peptide E and synthetic oligodeoxynucleotide probe sequences corresponding to peptide E. The probe pools were designated T22A, T22G, and T22T. (b) Restriction map of the LBIF-coding region. The solid and open boxes indicate the LBIF-coding region and the 725-bp EcoRI fragment, respectively. B, BamHI; E, EcoRI; C, ClaI; S, SacI; P, PstI; X, XbaI; H, HindIII. (c) Nucleotide sequence and predicted amino acid sequence. The underlining indicates the amino acid sequences determined for the purified tryptic peptides (A to R) of LBIF. Thick underlining indicates the nucleotide sequence of the oligonucleotide probe. The inverted repeats are indicated by horizontal arrows. The large boxes show the tryptophan encoded by the TGA codon. The small boxes show the possible regions of the Shine-Dalgarno sequence, the -10 sequence, and the -35 sequence. Asterisks indicate termination codons.
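How the reading of TGA changes the deduced protein can be illustrated with a minimal Python sketch. It builds the standard codon table and optionally reassigns TGA to tryptophan, as described for this gene. The function names and the `tga_is_trp` switch are hypothetical. The two test fragments are the first 10 codons and the last 11 codons of the open reading frame as read from Fig. 2c.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def codon_table(tga_is_trp):
    """Return a codon-to-amino-acid map; optionally read TGA as Trp (W)."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    table = dict(zip(codons, STANDARD_AAS))
    if tga_is_trp:
        table["TGA"] = "W"
    return table


def translate(dna, tga_is_trp=True):
    """Translate in frame from the first base and stop at the first stop codon."""
    table = codon_table(tga_is_trp)
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Fragments read from Fig. 2c: start of the ORF, and its end (TGA then TAG).
N_TERM = "ATG TCT GTA TTT GAC AGT AAA TTT AAA GGA"
C_TERM = "ATG CCT TTA TCA CGT AAA GAT GTT AAG TGA TAG"

print(translate(N_TERM))                    # mycoplasma-style reading
print(translate(C_TERM))                    # TGA read as Trp, TAG terminates
print(translate(C_TERM, tga_is_trp=False))  # universal reading stops at TGA
```

With TGA read as tryptophan, the C-terminal fragment ends in Lys-Trp before the TAG stop; under the universal code, translation would end one residue earlier.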

EcoRI fragment (Fig. 2b). We subcloned six fragments in pUC19 for sequencing. The sequencing strategy is shown in Fig. 3. The nucleotide sequence of the full-length LBIF gene contained an open reading frame encoding 410 amino acid residues (Fig. 2c). The initiation codon of LBIF was the codon for methionine. However, on the basis of the N-terminal amino acid analysis of the mature LBIF protein, the first residue was serine. Therefore, it was concluded that the mature LBIF protein consisted of 409 amino acids after the

deletion of the first amino acid, methionine. The molecular weight predicted from the deduced amino acid sequence was 46,372, in close agreement with the apparent molecular weight of 45,000 determined by SDS-PAGE. Nucleotide and amino acid sequences. It was recently reported that TGA nonsense codons are read as tryptophan codons in Mycoplasma capricolum (16). In accordance with this result, we noted that there were five TGA codons in the open reading frame of the LBIF gene (Fig. 2c). We confirmed that three of the five TGA nonsense codons were used

as tryptophan codons by amino acid sequence analysis and determination of the amino acid composition of the C-terminal CNBr fragment (data not shown). In contrast, TAG nonsense codons were functional as termination signals. Examination of the 5'-flanking region revealed a possible ribosome-binding sequence (Shine-Dalgarno sequence), AGGA, 7 nucleotides upstream from the ATG initiation codon. The -10 and -35 sequences, which are known to be consensus sequences for bacterial promoters, were obscure, since the 5'-flanking region was AT-rich. At 15 bp upstream from the Shine-Dalgarno sequence, there was a -10 sequence-like sequence, TATACA (consensus sequence in Escherichia coli, TATACT), and there was a -35 sequence-like sequence, TTATCTA (consensus sequence in E. coli, TTGACA), at 18 bp upstream from the -10 sequence-like sequence. There were two inverted repeats in the 3'-flanking region; one of these may correspond to the transcriptional stop sequence (16).

The hydrophilicity and hydrophobicity of the LBIF amino acid sequence were calculated with the Kyte-Doolittle algorithm (SDC-GENETIX genetic information processing software). LBIF exhibited an alternating hydrophilic-hydrophobic character throughout the amino acid sequence (data not shown). A hydrophobic signal peptide region was not found in the N-terminal portion of LBIF. When LBIF was compared with arginine deiminase of Pseudomonas aeruginosa (2), the two showed 43% similarity in their nucleotide sequences and 27% similarity in their amino acid sequences (Fig. 4).

We analyzed the EcoRI-digested DNA derived from two mycoplasma species (M. arginini and M. hyorhinis) by Southern blot hybridization (Fig. 5). M. arginini but not M. hyorhinis is an arginine-degrading mycoplasma. The 725-bp EcoRI fragment was used as a probe. A single band of about 750 bp was detected in M. arginini but not in M. hyorhinis.

FIG. 3. Strategy for sequencing the LBIF gene. The nucleotide sequence was determined by the dideoxy chain termination method (8). Arrows indicate directions and ranges of the sequences read. b, Bases.

GC content and codon usage in M. arginini. On the basis of the nucleotide sequence analysis, the AT contents are about 65% in the LBIF gene and about 83% in the noncoding region. This phenomenon reflects mycoplasmal codon usage. In the case of the codons used for the M. capricolum ribosomal proteins S8 and L6, more than 90% of the codons have A or T at the third position (9, 16). The same tendency in codon usage can be seen in the LBIF gene (Table 1). Only 93 of the total of 410 codons have G or C at the third position (11 of these 93 are ATG [methionine] codons). Thus, A and T are used predominantly at the third position. However, more G and C than A and T were found at the third position in the codons of Asn, Asp, His, Tyr, and Phe (Table 1). This characteristic is distinct from that of M. capricolum.

DISCUSSION

In this study, we determined the partial amino acid sequence of arginine deiminase of M. arginini and the full-length nucleotide sequence of the gene. The enzyme catalyzes the direct conversion of L-arginine to L-citrulline and ammonia (12). Recently, the nucleotide sequence of the arginine deiminase gene of M. arginini was reported by Kondo et al. (7). However, their sequence differed from ours in several places and especially at the C terminus (Fig. 4 and 6). The single base changes observed in the N terminus may be attributable to the origins of the M. arginini strains. Kondo et al. used M. arginini KM101 from their stock

FIG. 4. Amino acid sequence homologies of LBIF and other arginine deiminases. (a) LBIF. (b) M. arginini arginine deiminase sequence determined by Kondo et al. (7). (c) P. aeruginosa arginine deiminase sequence. Dashes indicate amino acids identical to those shown above. Dots indicate gaps. [Alignment not reproduced.]

collection (7). We used an M. arginini strain isolated from the U937 cell line (12). From nucleotide 1125 to the 3' end, the nucleotide sequence of Kondo et al. is completely different from ours. Nucleotide 1125 is located within an XbaI site (Fig. 6). We determined the nucleotide sequence in both directions of an EcoRI-ClaI fragment (pUEb4) by preparing various deletion mutants (Fig. 3). This strategy avoids errors that could arise from subcloning at the XbaI site (nucleotides 1118 to 1124). Kondo et al. could not find a transcription terminator in the 3'-untranslated region of their sequence. In the LBIF gene, there are two inverted repeats in the 3'-untranslated region; one of these may correspond to the transcriptional stop sequence (Fig. 2c). We determined the amino acid sequence of tryptic peptide I (Fig. 2c). To

FIG. 5. Southern hybridization of genomic DNA. Lanes: 1, EcoRI-digested DNA of M. arginini; 2, EcoRI-digested DNA of M. hyorhinis. λ DNAs digested with HindIII were used as size markers (23.1, 6.5, 2.3, 2.0, and 0.5). The 725-bp EcoRI fragment was used as a probe. The arrowhead shows the 725-bp band. Numbers at left are in kilobases.

confirm that the last codon, TGA, of the open reading frame was read as a tryptophan codon, we determined the amino acid sequence, Pro-Leu-Ser-Arg-Lys-Asp-Val-Lys-Trp, by purifying the CNBr fragment CN3 (data not shown). The results completely matched our nucleotide sequencing results. Recently, gene cloning of arginine deiminase, a tumor growth inhibitory factor (TGIF), of Mycoplasma orale was attained (S. Satoh, I. Yoshioka, J. Kobayashi, M. Nogawa, and M. Otani, Tissue Culture Res. Commun. 9:63, 1990). The TGIF gene has an open reading frame of 1,230 bp encoding 410 amino acids (S. Satoh et al., Japanese patent Kokai Tokkyo Koho, JP, 90-53490, CI.C12N 15/55). All five tryptophan residues are encoded by TGA codons and located at the same positions as those in LBIF. The nucleotide sequences and amino acid sequences of TGIF and LBIF are 82 and 83% homologous, respectively. The molecular weights of the purified enzymes were estimated to be 46,000 by Kondo et al. (7) and 45,000 by us (12, 15). The molecular weight (46,372) deduced from the LBIF gene sequence showed good agreement with these values. We have clearly determined the arginine deiminase activity of highly purified LBIF samples. However, as there are five TGA codons in the open reading frame of the LBIF gene, we have not determined the arginine deiminase activity of the cloned gene product by using expression vectors in E. coli or eucaryotic systems. In this context, Satoh et al. have performed an expression experiment with the TGIF gene in E. coli. Only when all five TGA codons were converted to TGG by site-directed mutagenesis was a high level of arginine deiminase activity detected. When four of the TGA codons (all but the last one, TGA [nucleotides 1228 to 1230]) were converted to TGG codons, less than 30% of the arginine deiminase activity detected when the gene contained five

TABLE 1. Comparison of the codon usage of M. arginini and other species

No. of the indicated codons in:

Amino acid   Codon         M. arginini    M. capricolum(a)   Bacteria        Animals
                           (LBIF gene)                       (E. coli)(b)    (humans)(b)
Arg          CGA or CGT      7              28               34 (33)         12 (5)
             CGC or CGG      1               1               25 (25)         16 (14)
             AGA             9              69                5 (5)           8 (6)
             AGG             0               0                2 (2)          10 (16)
Leu          CTA or CTT     11              15               12 (10)         16 (19)
             CTC or CTG      0               0               55 (55)         74 (108)
             TTA            32              99                9 (7)           2 (1)
             TTG             0               3                7 (7)           9 (7)
Ser          TCA or TCT     21              72               25 (25)         25 (17)
             TCC or TCG      0               0               22 (22)         20 (29)
             AGC             1               4                9 (9)          21 (17)
             AGT             1              15                8 (7)          12 (7)
Thr          ACA or ACT     18             104               25 (26)         26 (22)
             ACC or ACG      0               0               30 (32)         34 (33)
Pro          CCA or CCT     19              51               12 (12)         24 (22)
             CCC or CCG      0               2               23 (23)         22 (19)
Ala          GCA or GCT     25             132               69 (68)         42 (28)
             GCC or GCG      0               0               45 (43)         44 (46)
Gly          GGA or GGT     21             129               35 (37)         38 (13)
             GGC or GGG      2               5               36 (35)         43 (47)
Val          GTA or GTT     34             133               46 (49)         14 (8)
             GTC or GTG      0               4               27 (26)         54 (55)
Lys          AAA            27             217               45 (46)         19 (14)
             AAG             2              17               17 (18)         49 (42)
Asn          AAC            17              15               26 (25)         28 (36)
             AAT             5              64               11 (10)          8 (5)
Gln          CAA             8              62               14 (13)         10 (11)
             CAG             0               0               29 (29)         28 (35)
His          CAC            11               9               10 (9)          21 (28)
             CAT             0              16               15 (16)         10 (9)
Glu          GAA            35             116               36 (37)         21 (20)
             GAG             0               8               17 (18)         34 (36)
Asp          GAC            15               5               28 (27)         24 (32)
             GAT            10              46               25 (25)         16 (13)
Tyr          TAC            10               7               13 (12)         23 (21)
             TAT             4              25               13 (14)         10 (19)
Cys          TGC             0               3                6 (6)          13 (24)
             TGT             2               4                5 (5)          10 (10)
Phe          TTC            11              10               17 (18)         28 (36)
             TTT             7              43               18 (18)         13 (17)
Ile          ATA or ATT     14             122               29 (29)         15 (12)
             ATC            14               9               31 (32)         24 (14)
Met          ATG            11              47               22 (22)         16 (17)
Trp          TGG             0               0               11 (12)         12 (10)
             TGA             5               8               (stop) (stop)   (stop) (stop)

(a) Data are from references 10 and 16.
(b) Data are from reference 6.
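The third-position bias discussed in the text can be checked directly against the M. arginini column of Table 1. The following Python sketch is illustrative only. The counts are transcribed by hand from that column, and the names `LBIF_CODON_COUNTS` and `ends_in_gc` are hypothetical. Because the tally depends on this transcription, it may differ slightly from the 93 G- or C-ending codons quoted in the text.

```python
# Counts transcribed from the M. arginini (LBIF gene) column of Table 1.
# Each entry: (amino acid, codon or codon pair, count).
LBIF_CODON_COUNTS = [
    ("Arg", "CGA/CGT", 7), ("Arg", "CGC/CGG", 1), ("Arg", "AGA", 9), ("Arg", "AGG", 0),
    ("Leu", "CTA/CTT", 11), ("Leu", "CTC/CTG", 0), ("Leu", "TTA", 32), ("Leu", "TTG", 0),
    ("Ser", "TCA/TCT", 21), ("Ser", "TCC/TCG", 0), ("Ser", "AGC", 1), ("Ser", "AGT", 1),
    ("Thr", "ACA/ACT", 18), ("Thr", "ACC/ACG", 0),
    ("Pro", "CCA/CCT", 19), ("Pro", "CCC/CCG", 0),
    ("Ala", "GCA/GCT", 25), ("Ala", "GCC/GCG", 0),
    ("Gly", "GGA/GGT", 21), ("Gly", "GGC/GGG", 2),
    ("Val", "GTA/GTT", 34), ("Val", "GTC/GTG", 0),
    ("Lys", "AAA", 27), ("Lys", "AAG", 2),
    ("Asn", "AAC", 17), ("Asn", "AAT", 5),
    ("Gln", "CAA", 8), ("Gln", "CAG", 0),
    ("His", "CAC", 11), ("His", "CAT", 0),
    ("Glu", "GAA", 35), ("Glu", "GAG", 0),
    ("Asp", "GAC", 15), ("Asp", "GAT", 10),
    ("Tyr", "TAC", 10), ("Tyr", "TAT", 4),
    ("Cys", "TGC", 0), ("Cys", "TGT", 2),
    ("Phe", "TTC", 11), ("Phe", "TTT", 7),
    ("Ile", "ATA/ATT", 14), ("Ile", "ATC", 14),
    ("Met", "ATG", 11),
    ("Trp", "TGG", 0), ("Trp", "TGA", 5),
]


def ends_in_gc(codon_group):
    """Both codons of a pair share a third-base class, so test the first one."""
    return codon_group.split("/")[0][2] in "GC"


total = sum(count for _, _, count in LBIF_CODON_COUNTS)
gc_third = sum(count for _, group, count in LBIF_CODON_COUNTS if ends_in_gc(group))
at_third_fraction = (total - gc_third) / total

# For the five two-codon amino acids named in the text, compare C- and T-ending codons.
single = {group: count for _, group, count in LBIF_CODON_COUNTS}
prefers_c = {
    aa: single[prefix + "C"] > single[prefix + "T"]
    for aa, prefix in [("Asn", "AA"), ("Asp", "GA"), ("His", "CA"), ("Tyr", "TA"), ("Phe", "TT")]
}

print("Total codons:", total)
print("Codons ending in G or C:", gc_third)
print("Fraction ending in A or T: %.2f" % at_third_fraction)
print("C-ending codon preferred:", prefers_c)
```

The total of the transcribed column provides a quick consistency check against the 410 codons of the open reading frame.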

FIG. 6. Nucleotide sequences of LBIF (a) and arginine deiminase determined by Kondo et al. (7) (b). Dashes indicate nucleotides identical to those shown above. Underlining indicates the XbaI site. The arrow indicates the boundary that distinguishes the differences between the nucleotide sequences determined by Kondo et al. and us. [Sequence comparison not reproduced.]

TGG codons in place of the five TGA codons was present (Satoh et al., Tissue Culture Res. Commun., 1990), suggesting that the tryptophan residue of the C terminus might be important for the expression of enzyme activity. These results may also support the idea that the LBIF gene encodes arginine deiminase.

Two important points regarding the codon usage of the arginine deiminase gene were noted. First, the GC content was low throughout the LBIF gene sequence (Table 1). This characteristic of mycoplasmal species was previously reported (5, 10, 16). It was suggested that this AT richness was caused by a putative evolutionary pressure, a force converting GC pairs to AT pairs. A and T nucleotides appeared more frequently than G and C nucleotides, especially at the third position of the codons of LBIF, as in other mycoplasmal proteins. However, among amino acids encoded by only two kinds of codons, the LBIF gene used more GC pairs than AT pairs at the third position of codons for Asn, Asp, His, Tyr, and Phe (Table 1). This feature may be characteristic of M. arginini and is not observed in proteins of M. capricolum (10, 16). Second, all tryptophans of LBIF appeared to be encoded by TGA, which serves as a termination (opal) codon in the universal genetic code of procaryotes and eucaryotes. This finding is consistent with previous findings that the TGA codon encoded tryptophan in ribosomal protein genes and tRNA genes of M. capricolum (9, 16). Thus, it is suggested that mycoplasma species, at least M. capricolum and M. arginini, predominantly use the TGA codon rather than the universal TGG codon for tryptophan. Mycoplasma genitalium and Mycoplasma pneumoniae appeared to use both TGA and TGG codons for tryptophan in the genes encoding adhesins (5).

ACKNOWLEDGMENTS

We thank Mihoko Sato for expert editorial assistance. This work was supported by a Grant-in-Aid for Cancer Research from the Japanese Ministry of Education, Science and Culture; a Grant-in-Aid for Scientific Research from the Japanese Ministry of Education, Science and Culture; a special Grant-in-Aid for promotion of Education and Science in Hokkaido University provided by the Japanese Ministry of Education, Science and Culture; a Grant-in-Aid from the Mochida Memorial Foundation for Medical and Pharmaceutical Research; and a Grant-in-Aid from the Akiyama Foundation.

LITERATURE CITED

1. Barile, M. F., and B. G. Levinthal. 1968. Possible mechanism for mycoplasma inhibition of lymphocyte transformation induced by phytohemagglutinin. Nature (London) 219:751-752.
2. Baur, H., E. Luethi, V. Stalon, A. Mercenier, and D. Haas. 1989. Sequence analysis and expression of the arginine-deiminase and carbamate-kinase genes of Pseudomonas aeruginosa. Eur. J. Biochem. 179:53-60.
3. Cole, B. C., Y. Naot, E. J. Stanbridge, and K. S. Wise. 1985. Interactions of mycoplasmas and their products with lymphoid cells in vitro, p. 203-257. In S. Razin and M. F. Barile (ed.), The mycoplasmas, vol. 4. Academic Press, Inc., New York.
4. Copperman, R., and H. E. Morton. 1966. Reversible inhibition of mitosis in lymphocyte cultures by non-viable mycoplasma. Proc. Soc. Exp. Biol. Med. 123:790-795.
5. Dallo, S. F., A. Chavoya, C.-J. Su, and J. B. Baseman. 1989. DNA and protein sequence homologies between the adhesins of Mycoplasma genitalium and Mycoplasma pneumoniae. Infect. Immun. 57:1059-1065.
6. Grantham, R. 1981. Codon catalog usage is a genome strategy modulated for gene expressivity. Nucleic Acids Res. 9:43-74.
7. Kondo, K., H. Sone, H. Yoshida, T. Toida, K. Kanatani, Y.-M. Hong, N. Nishino, and J. Tanaka. 1990. Cloning and sequence analysis of the arginine deiminase gene from Mycoplasma arginini. Mol. Gen. Genet. 221:81-86.
8. Maniatis, T., E. F. Fritsch, and J. Sambrook. 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
9. Muto, A., Y. Kawauchi, F. Yamao, and S. Osawa. 1984. Preferential use of A- and U-rich codons for Mycoplasma capricolum ribosomal proteins S8 and L6. Nucleic Acids Res. 12:8209-8217.
10. Sawada, M., A. Muto, M. Iwami, F. Yamao, and S. Osawa. 1984. Organization of ribosomal RNA genes in Mycoplasma capricolum. Mol. Gen. Genet. 196:311-316.
11. Schimke, R. T., and M. F. Barile. 1963. Arginine breakdown in mammalian cell culture contaminated with pleuropneumonia-like organisms (PPLO). Exp. Cell Res. 30:593-596.
12. Sugimura, K., S. Fukuda, Y. Wada, M. Taniai, M. Suzuki, T. Kimura, T. Ohno, K. Yamamoto, and I. Azuma. 1990. Identification and purification of arginine deiminase that originated from Mycoplasma arginini. Infect. Immun. 58:2510-2515.
13. Sugimura, K., T. Ohno, S. Fukuda, Y. Wada, T. Kimura, and I. Azuma. 1990. Tumor growth inhibitory activity of a lymphocyte blastogenesis inhibitory factor (LBIF). Cancer Res. 50:345-349.
14. Sugimura, K., K. Tsukahara, Y. Ueda, K. Takeda, Y. Habu, and I. Azuma. 1988. Fast protein liquid chromatography of lymphocyte blastogenesis inhibitory factor (LBIF) produced by a human macrophage-like cell line U937. J. Chromatog. 440:131-140.
15. Sugimura, K., Y. Ueda, K. Takeda, S. Fukuda, K. Tsukahara, Y. Habu, H. Fujiwara, and I. Azuma. 1989. A lymphocyte blastogenesis inhibitory factor (LBIF) arrests mitogen-stimulated T lymphocytes at early G1 phase with no influence on interleukin 2 production and interleukin 2 receptor light chain expression. Eur. J. Immunol. 19:1357-1364.
16. Yamao, F., A. Muto, Y. Kawauchi, M. Iwami, S. Iwagami, Y. Azumi, and S. Osawa. 1985. UGA is read as tryptophan in Mycoplasma capricolum. Proc. Natl. Acad. Sci. USA 82:2306-2309.