INFECTION AND IMMUNITY, Nov. 1990, p. 3788-3795
Vol. 58, No. 11
0019-9567/90/113788-08$02.00/0
Copyright © 1990, American Society for Microbiology
Cloning and Nucleotide Sequence of the Gene Encoding Arginine Deiminase of Mycoplasma arginini

TAKESHI OHNO,1 OSAMU ANDO,2 KAZUHISA SUGIMURA,1* MADOKA TANIAI,2 MOTOYUKI SUZUKI,2 SHIGEHARU FUKUDA,2 YASUKAZU NAGASE,3 KOSHI YAMAMOTO,4 AND ICHIRO AZUMA1

Institute of Immunological Science, Hokkaido University, Sapporo,1 Hayashibara Biochemical Laboratories Inc., Okayama,2 Mochida Pharmaceutical Co., Ltd., Yotsuya 1-7, Tokyo,3 and National Institute of Animal Health, Ibaraki,4 Japan

Received 21 May 1990/Accepted 10 August 1990

The existence of a mycoplasmal arginine deiminase which catalyzes the conversion of L-arginine to L-citrulline has been postulated. Here we show the partial amino acid sequence of arginine deiminase of Mycoplasma arginini and the complete nucleotide sequence of the arginine deiminase gene of M. arginini. The open reading frame deduced from this sequence consists of 1,230 bp encoding 410 amino acids. The mature form of this enzyme contains 409 amino acids after the deletion of the first methionine. In this open reading frame, TGA nonsense codons are used as tryptophan codons; this usage was verified by determination of the amino acid sequence. The molecular weight of the enzyme calculated from the deduced amino acid sequence is 46,372. Recently, the nucleotide sequence of the arginine deiminase gene of M. arginini was reported by Kondo et al. (K. Kondo, H. Sone, H. Yoshida, T. Toida, K. Kanatani, Y.-M. Hong, N. Nishino, and J. Tanaka, Mol. Gen. Genet. 221:81-86, 1990). However, their sequence differed from ours in several places and especially at the C terminus.
Mycoplasma and viral infections often induce drastic changes in host cell metabolism (3). Mycoplasmas or their products especially have been shown not only to inhibit the stimulation of lymphocytes by allogeneic cells or mitogens but also to serve as mitogens for T and B lymphocytes (3). Copperman and Morton first described the inhibition of mitosis in lymphocyte cultures by Mycoplasma hominis (4). Barile and Levinthal reported that the inhibitory effect of M. hominis on phytohemagglutinin-stimulated lymphocytes was due to the depletion of arginine in the medium (1). Schimke and Barile demonstrated that arginine was a major energy source for nonfermentative mycoplasma species, which converted prodigious amounts of arginine to citrulline and ornithine via the arginine dihydrolase pathway (11). Since then, it has been believed that the mycoplasmal suppressive effect is mainly due to the depletion of arginine from the nutritional source for the cells (3).

Recently, we identified a lymphocyte blastogenesis inhibitory factor (LBIF) in a Mycoplasma arginini-infected human histiocytic lymphoma, U937 (14, 15). LBIF was purified by fast protein liquid chromatography. The molecular weight of this factor was estimated to be approximately 45,000 by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). The cell growth inhibitory activity of LBIF was characterized by using lectin-stimulated T lymphocytes (15) as well as various tumor cell lines (13). More recently, it was clarified that LBIF bears arginine deiminase activity (12). We report here the partial amino acid sequence of LBIF and the complete nucleotide sequence of the LBIF gene.
* Corresponding author.

MATERIALS AND METHODS

Purification of LBIF. LBIF was purified from the crude supernatant of M. arginini-infected U937 (human histiocytic lymphoma) as described previously (13, 15). In brief, cells were cultured at 10^6 cells per ml in serum-free RPMI 1640 medium. The supernatant was concentrated by an ultrafiltration membrane module (AIL-1010; Asahi Chemical Industry Co. Ltd., Tokyo, Japan). The crude concentrate was fractionated with a TSK gel DEAE-5PW column (21.5 mm by 15 cm; TOSHO, Tokyo, Japan) equilibrated with 20 mM Tris hydrochloride (pH 7.7). The separation was done with a linear gradient of 0 to 0.5 M NaCl. LBIF activity was tested as described below. The active fractions were subsequently fractionated with a Mono P chromatofocusing column (HR5/20; Pharmacia). A fraction of LBIF was further resolved with a Hi-Pore RP-304 reversed-phase column (4.6 mm by 25 cm; Bio-Rad) by high-performance liquid chromatography. The separation was done with a linear gradient of 0 to 90% acetonitrile-0.1% trifluoroacetic acid for 145 min at a flow rate of 0.5 ml/min at 40°C.

LBIF assay. LBIF showed strong antiproliferation activity against interleukin-1-stimulated murine thymocytes in vitro (14, 15). One unit of LBIF was defined as the amount of the LBIF preparation required to induce a half-maximum response in the LBIF assay. Approximately 10 ng of the LBIF sample corresponds to 1 U of LBIF.

Determination of the partial amino acid sequence of LBIF. LBIF dissolved in 200 µl of 1 M NaHCO3 was denatured at 100°C for 5 min. The sample was digested with N-tosyl-L-phenylalanyl chloromethyl ketone-trypsin (Sigma) at 37°C for 20 h. Seventeen polypeptide fragments were recovered with a Hi-Pore RP-318 reversed-phase column (4.6 mm by 25 cm; Bio-Rad). The separation was done with a linear gradient of 0 to 90% acetonitrile in trifluoroacetic acid for 200 min at a flow rate of 0.5 ml/min at 40°C. Amino acid sequence analyses of the N-terminal eight residues and tryptic peptides were performed with an Applied Biosystems model 470A gas-phase sequencer.

Preparation of synthetic oligonucleotide probes. Heptadecamer oligonucleotide probes were synthesized on the basis of the partial amino acid sequence Glu-Ile-Asp-Tyr-Ile-Thr with an automated synthesizer (model 381A; Applied Biosystems). These probes were separated into three pools, designated T22A, T22G, and T22T (see Fig. 2a).

FIG. 1. SDS-PAGE and tryptic peptide mapping of purified LBIF. (a) Coomassie brilliant blue-stained SDS-PAGE gel. Lanes: 1, purified LBIF; 2, size markers (Kd, kilodaltons). LBIF was purified with a TSK gel DEAE-5PW column, a Mono P chromatofocusing column (HR5/20), and a Hi-Pore RP-304 column (see Materials and Methods). SDS-PAGE (15% acrylamide gel) was done as described previously (13, 15). (b) Fractionation by reversed-phase high-performance liquid chromatography of tryptic peptides of LBIF. Peptides corresponding to peaks A to R were analyzed with an amino acid sequencer.

Preparation of mycoplasmal genomic DNA. M. arginini and Mycoplasma hyorhinis were identified in a human histiocytic lymphoma cell line, U937, by the metabolic inhibition test, and the isolates were cloned as described previously (12). High-molecular-weight DNA of M. arginini or M. hyorhinis was prepared by the standard method as described previously (8).

Construction of a genomic DNA library. High-molecular-weight DNA was digested to completion with EcoRI or BamHI (Takara, Kyoto, Japan). EcoRI-digested fragments or BamHI-digested fragments were subcloned into EcoRI-cleaved λgt10 or BamHI-cleaved EMBL-3, respectively, as described previously (8).

Plaque hybridization. Oligonucleotide probes were labeled with [γ-32P]ATP (Amersham) with a 5'-end-labeling kit (Boehringer). Plaque hybridization was performed by the standard method (8). In brief, plaques were transferred to a nylon membrane (Hybond-N; Amersham) and hybridized with 5'-end-labeled oligonucleotide probes in 5× SSPE (1× SSPE is 0.18 M NaCl, 0.01 M sodium phosphate, and 1 mM EDTA)-5× Denhardt solution-0.1% SDS-100 µg of herring sperm DNA per ml at 37°C overnight. To obtain the full-length LBIF gene, we searched a library constructed with BamHI-cleaved EMBL-3 by using a radiolabeled 725-bp EcoRI fragment. Nick translation was done with a nick translation kit (Amersham). The hybridization solution was a mixture of 5× SSPE, 5× Denhardt solution, 0.5% SDS, and 20 µg of herring sperm DNA per ml. Hybridization was carried out at 65°C overnight.

Southern blot hybridization. Restriction enzyme-digested DNA was electrophoresed in a 0.7% agarose gel and blotted onto a nylon membrane (Hybond-N) in accordance with the manufacturer's instructions. Hybridization was carried out with a radiolabeled probe by the same procedure as that described for plaque hybridization.

Nucleotide sequence analysis.
pUC18 or pUC19 was used for subcloning DNA fragments. The nucleotide sequence was determined by the dideoxy chain termination method (8). To sequence pU1.7#9 or pUEb4, we prepared various deletion mutants by using exonuclease III (Takara) and subcloned them into pUC19 or pUC18 by the standard method (8). The oligonucleotide primer (nucleotides 199 to 219; AACAACATTGATGTCGTTTGC) was also used to
verify the N-terminal nucleotide sequence by the primer extension method (8).

Computer analysis. Computer analysis was done with SDC-GENETIX genetic information processing software (Software Development Co. Ltd., Tokyo, Japan).

Nucleotide sequence accession number. The accession number for the arginine deiminase gene nucleotide sequence is X54312 (EMBL Data Library).

RESULTS

Purification and partial amino acid sequence analysis of LBIF. LBIF was purified to homogeneity from the culture supernatant of M. arginini-infected human histiocytic lymphoma U937 as described previously (12, 13, 15). The apparent molecular weight of this protein was determined to be 45,000 by SDS-PAGE (Fig. 1a). The purified preparation was partially digested with trypsin. The resulting peptides were fractionated by reversed-phase high-performance liquid chromatography (Fig. 1b). Seventeen polypeptide fragments were subjected to amino acid sequence analyses (Fig. 2c).

Cloning of the LBIF gene. Three oligonucleotide probes, designated T22A, T22G, and T22T, were synthesized on the basis of the partial amino acid sequence of tryptic peptide E (Fig. 2a). We screened a genomic λgt10 library constructed with EcoRI-digested DNA of M. arginini by using these probes. Only probe T22A hybridized, and it did so to 3 of 3 × 10^4 plaques. All three clones contained inserts of the same size, estimated to be about 750 bp by agarose gel electrophoresis. The insert was subcloned into EcoRI-digested pUC19 (designated pUA3A-1), and the nucleotide sequence was determined. Amino acid sequences for 10 polypeptide fragments were found in a predicted open reading frame of this 725-bp EcoRI fragment (Fig. 2c). Thus, we concluded that this 725-bp fragment encoded a portion of the LBIF gene.
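The reasoning behind a degenerate heptadecamer probe can be illustrated with a short sketch. It is not the authors' procedure: it simply assumes the standard codon choices for Glu, Ile, Asp, and Tyr plus the invariant first two bases (AC) of the Thr codon, enumerates every possible 17-nucleotide coding sequence for Glu-Ile-Asp-Tyr-Ile-Th(r), and reports the antisense strands. The names `CODONS`, `antisense_17mers`, and `gene_segment` are hypothetical, and the way the authors divided the mixture into the T22A, T22G, and T22T pools is not modeled here.

```python
from itertools import product

# Codon choices (standard code) for the residues of peptide E used for the probe.
# Thr contributes only its first two bases, AC, which are shared by all Thr codons.
CODONS = {
    "Glu": ["GAA", "GAG"],
    "Ile": ["ATT", "ATC", "ATA"],
    "Asp": ["GAT", "GAC"],
    "Tyr": ["TAT", "TAC"],
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_17mers(peptide=("Glu", "Ile", "Asp", "Tyr", "Ile")):
    """Enumerate all antisense 17-mers that could pair with the coding strand."""
    coding = ("".join(codons) + "AC"
              for codons in product(*(CODONS[aa] for aa in peptide)))
    return sorted({reverse_complement(s) for s in coding})


probes = antisense_17mers()

# Coding-strand bases for Glu-Ile-Asp-Tyr-Ile-Th(r) as they appear in Fig. 2c.
gene_segment = "GAAATTGACTATATTAC"

print(len(probes))                                # size of the degenerate set
print(reverse_complement(gene_segment) in probes)  # is the true target covered?
```

Splitting such a mixture into smaller pools, as was done with T22A, T22G, and T22T, reduces the degeneracy within each hybridization.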
To obtain a full-length LBIF gene, we constructed another genomic EMBL-3 library by using BamHI-digested DNA. With the 725-bp EcoRI fragment as a probe, 1 hybridizing plaque was obtained from 5 × 10^5 plaques. The clone, designated λLBIF 122#3, contained a 32-kb insert. Restriction site mapping of the region encoding LBIF was done by Southern blot hybridization with a 32P-labeled 725-bp
[Fig. 2a and b. (a) Peptide E, Glu-Ile-Asp-Tyr-Ile-Thr-Pro-Ala, with the probe pools T22T (GTA ATA TAA TCT ATT TC), T22G (GTA ATA TAA TCC ATT TC), and T22A (GTA ATA TAA TCA ATT TC). (b) Restriction map of the λLBIF 122#3 insert, the LBIF coding region, and the subclones; scale bar, 500 bp.]
TAATTTTTAAAAATATCAAAAAACACATTTTTTCTTTTAAAAATAGGACTAAAAGTAGAAAAAATAAAA
-1 AAATAGTATAATATAACTGTATA-1AA AA_ TAAGGAACTAGCTTACATCATTTATTTGAAh3MAAACCCA
75 25
ATG TCT GTA TTT GAC AGT AAA TTT AAA GGA ATT CAC GTT TAT TCA GAA ATT GGT GAA TTA GAA TCA GTT CTA GTT Met Ser Val Phe Asp Ser Lys Phe Lys Gly Ile His Val Tyr Ser Glu Ile Gly Glu Leu Glu Ser Val Leu Val
P
A
150 50
CAC GAA CCA GGA CGC GAA ATT GAC TAT ATT ACA CCA GCT AGA CTA GAT GAA TTA TTA TTC TCA GCT ATC TTA GAA His Glu Pro Gly Arg Giu Ile Aso Tyr le Thr Pro Ala Arg Leu Asp Glu Leu Leu Phe Ser Ala Ile Leu Glu
E
AGC CAC GAT GCT AGA AAA GAA CAC AAA CAA TTC GTA GCA GAA TTA AAA GCA AAC GAC ATC AAT GTT GTT GAA TTA Ser His Asp Ala Arg Lys Glu His Lys Gln Phe Val Ala Glu Leu Lys Ala Asn Asp Ile Asn Val Val Glu Leu
225
GAA GAA TTT TTA GAA Glu Glu Phe Leu Glu
300 100
GAC TCA GAA CCA GTT CTA TCA GAA GAA CAC AAA GTA GTT GTA AGA AAC TTC TTA AAA GCT AAA AAA ACA TCA AGA Asp Ser Glu Pro Val Leu Ser Glu Glu His LYS Val Val Val Arg Asn Phe Leu Lys Ala Lys Lys Thr Ser Arg
125
GAT CAC GAA TTA ATC GTT Asp His Glu Leu Ile Val
450 150
GAC CCA ATG CCA AAC CTA TAC TTC ACA CGT GAC CCA TTT GCA TCA GTA GGT AAT GGT GTA ACA ATC CAC TAC ATG Asp Pro Met Pro Asn Leu Tyr Phe Thr Arg Asp Pro Phe Ala Ser Val Gly Asn Gly Val Thr Ile His Tyr Met
525 175
AAA CTA ATT AAC ACT Lys Leu Ile Asn Thr
600 200
AAC AAT GAC ACA TTA Asn Asn Asp Thr Leu
675 225
GAC TTA CAA ACA GTT ACT TTA TTA GCT AAA AAC ATT GTT GCT AAT AAA GAA Asp Leu Gln Thr Val Thr Leu Leu Ala Lys Asn Ile Val Ala Asn Lys Glu
750 250
F
ATT GAT TTA GTT GCT GAA ACA TAT GAT TTA GCA TCA CAA GAA GCT AAA GAT AAA TTA ATC Ile Asp Leu Val Ala Glu Thr Tyr Asp Leu Ala Ser Gln Glu Ala Lys As Lys Leu Ile
75
N-
375
B
AAA TTA GTA GAA ATC ATG ATG GCA GGG ATC ACA AAA TAC GAT TTA GGT ATC GAA GCA Lys Leu Val Glu Ile Met Met Ala Gly Ile Thr Lys Tyr Asp Leu Gly Ile Glu Ala
0
CGT TAC AAA GTT AGA CA.A CGT GAA ACA TTA TTC TCA AGA TTT GTA TTC TCA AAT CAC CCT Arg Tyr Lys Val Arg Gln Arg Glu Thr Leu Phe Ser ArT Phe Val Phe Ser Asn His Pro D TAG TAC GAC CCT TCA CTA AAA TTA TCA ATC GAA GGT GGA GACGTA TTT ATC TAC CCA Pro . Tyr Tyr Asp Pro Ser Leu Lys Leu Ser Ile Glu Gly Gly Asp Val Phe Ile Tyr GTA GTTCGGT GTT TCT GAA AGA ACT Val Val Gly Val Ser Glu Arg Thr
H
fG"A
CTA ACA TGT GAA TTC AAA CGT ATT GTT GCA ATT AAC GTT CCA AAA IFGKA AGA AAC TTA ATG CAC TTA GAC ACA Thr Asn Leu Met His Leu Asp Thr TrLeu Thr Cys Glu Phe Lys Arg Ile Val Ala Ile Asn Val Pro Lys T
825 275
CTA TAC TCA CCA ATC GCT AAC GAC GTA TTT AAA TTC TGA GAT TAT GAC TTA GTA Leu Tyr Ser Pro Ile Ala Asn Asp Val Phe Lys Ph. T Asp Tyr Asp Leu Val
900 300
CAA TCA ATC ATT AAC AAA Gln Ser Ile Ile Asn Lys
975 325
AAA CCA GTT TTA ATT CCT ATC GCA GGT GAA GGT GCT TCA CAA ATG GAA ATC GAA AGA GAA ACA CAC TTC GAT GGT Lys Pro Val Leu Ile Pro Ile Ala Gly Glu Gly Ala Ser Gln Met Glu Ile Glu Arg Glu Thr His Phe Asp Gly J ACA AAC TAC TTA GCA ATT AGA CCA GGT GTT GTA ATT GGT TAC TCA CGT AAC GAA AAA ACA AAC GCT GCT CTA GAA Thr Asn Tyr Leu Ala Ile Arg Pro Gly Val Val Ile Gly Tyr Ser Arg Asn Glu Lys Thr Asn Ala Ala Leu Glu
1050 350
CAA TTA TCA TTA GGT ATG GGT AAC GCT CGT TGT ATG TCA Gln Leu Ser Leu Gly Met Gly Asn Ala Arg Cys Met Ser
1200 400
G
ATG TTA GAC AAG GAC AAA TTC Met Leu Asp Lys Asp Lys Phe
M-
R-
AAC GGT GGA GCA GAA CCA CAA CCA GTT GAA AAC GGA TTA CCT CTA GAA GGA TTA TTA Gly Gly Ala Glu Pro Gln Pro Val Glu Asn Gly Leu Pro Leu Glu Gly Leu Leu
Asn
K
GCT GCA GGC ATT AAA GTT CTT CCA TTC CAC GGT AAC Ala Ala Gly Ile Lys Val Leu Pro Phe His Gly Asn ATG CCT TTA TCA CGT AAA GAT GTT AAG Met Pro Leu Ser Arg Lys Asp Val Lys
1125 375
1289 rTG;A TAGTAAATTACTATAAAATTATTTAATTTTATATATTAAAAATCTCCAGCACTTGCTGG g410 ******
TWI
GG ATATTTTTTAATATTATTTTAAAAAAT:AATAATGTATGTAAAATAATAAATAAAAGG AAAATGTATGG AAAATAAAAAAGGTAAATTATATAATGAA
CATCGAAAAGTTTTTTCAAATTATTTGGTCCAAATTTTGGGGTTCATTATAAAATACATTATATTTATTGAATTGATTAATAAAGTTAATAAATGATTAT TTTTATGATATAAAAAATTCTAAAGAACAGTCTTCTAAAAAATAAAATTATGTTAAAATATTAGAGTACTTTTTATATATAAAAAAGTATCGGATCGAT -3'
FIG. 2. Nucleotide sequence and corresponding amino acid sequence of LBIF. (a) Amino acid sequence of tryptic peptide E and synthetic oligodeoxynucleotide probe sequences corresponding to peptide E. The probe pools were designated T22A, T22G, and T22T. (b) Restriction map of the LBIF-coding region. The solid and open boxes indicate the LBIF-coding region and the 725-bp EcoRI fragment, respectively. B, BamHI; E, EcoRI; C, ClaI; S, SacI; P, PstI; X, XbaI; H, HindIII. (c) Nucleotide sequence and predicted amino acid sequence. The underlining indicates the amino acid sequences determined for the purified tryptic peptides (A to R) of LBIF. Thick underlining indicates the nucleotide sequence of the oligonucleotide probe. The inverted repeats are indicated by horizontal arrows. The large boxes show the tryptophan encoded by the TGA codon. The small boxes show the possible regions of the Shine-Dalgarno sequence, the -10 sequence, and the -35 sequence. Asterisks indicate termination codons.
EcoRI fragment (Fig. 2b). We subcloned six fragments in pUC19 for sequencing. The sequencing strategy is shown in Fig. 3. The nucleotide sequence of the full-length LBIF gene contained an open reading frame encoding 410 amino acid residues (Fig. 2c). The initiation codon of LBIF was the codon for methionine. However, on the basis of the N-terminal amino acid analysis of the mature LBIF protein, the first residue was serine. Therefore, it was concluded that the mature LBIF protein consisted of 409 amino acids after the
deletion of the first amino acid, methionine. The molecular weight predicted from the deduced amino acid sequence was 46,372, in close agreement with the apparent molecular weight of 45,000 determined by SDS-PAGE. Nucleotide and amino acid sequences. It was recently reported that TGA nonsense codons are read as tryptophan codons in Mycoplasma capricolum (16). In accordance with this result, we noted that there were five TGA codons in the open reading frame of the LBIF gene (Fig. 2c). We confirmed that three of the five TGA nonsense codons were used
as tryptophan codons by amino acid sequence analysis and determination of the amino acid composition of the C-terminal CNBr fragment (data not shown). In contrast, TAG nonsense codons were functional as termination signals.

Examination of the 5'-flanking region revealed a possible ribosome-binding sequence (Shine-Dalgarno sequence), AGGA, 7 nucleotides upstream from the ATG initiation codon. The -10 and -35 sequences, which are known to be consensus sequences for bacterial promoters, were obscure, since the 5'-flanking region was AT-rich. At 15 bp upstream from the Shine-Dalgarno sequence, there was a -10 sequence-like sequence, TATACA (consensus sequence in Escherichia coli, TATACT), and there was a -35 sequence-like sequence, TTATCTA (consensus sequence in E. coli, TTGACA), at 18 bp upstream from the -10 sequence-like sequence. There were two inverted repeats in the 3'-flanking region; one of these may correspond to the transcriptional stop sequence (16).

FIG. 3. Strategy for sequencing the LBIF gene. The nucleotide sequence was determined by the dideoxy chain termination method (8). Arrows indicate directions and ranges of the sequences read. b, Bases.

The hydrophilicity and hydrophobicity of the LBIF amino acid sequence were calculated with the Kyte-Doolittle algorithm (SDC-GENETIX genetic information processing software). LBIF exhibited an alternating hydrophilic-hydrophobic character throughout the amino acid sequence (data not shown). A hydrophobic signal peptide region was not found in the N-terminal portion of LBIF. When the sequence of LBIF was compared with that of arginine deiminase of Pseudomonas aeruginosa (2), these proteins showed 43% similarity in their nucleotide sequences and 27% similarity in their amino acid sequences (Fig. 4).

We analyzed the EcoRI-digested DNA derived from two mycoplasma species (M. arginini and M. hyorhinis) by Southern blot hybridization (Fig. 5). M. arginini but not M. hyorhinis is an arginine-degrading mycoplasma. The 725-bp EcoRI fragment was used as a probe. A single band of about 750 bp was detected in M. arginini but not in M. hyorhinis.

GC content and codon usage in M. arginini. On the basis of the nucleotide sequence analysis, the AT contents are about 65% in the LBIF gene and about 83% in the noncoding region. This phenomenon reflects mycoplasmal codon usage. In the case of the codons used for the M. capricolum ribosomal proteins S8 and L6, more than 90% of the codons have A or T at the third position (9, 16). The same tendency in codon usage can be seen in the LBIF gene (Table 1). Only 93 of the total of 410 codons have G or C at the third position (11 of these 93 are ATG [methionine] codons). Thus, A and T are used predominantly at the third position. However, more G and C than A and T were found at the third position in the codons of Asn, Asp, His, Tyr, and Phe (Table 1). This characteristic is distinct from that of M. capricolum.

DISCUSSION

In this study, we determined the partial amino acid sequence of arginine deiminase of M. arginini and the full-length nucleotide sequence of the gene. The enzyme catalyzes the direct conversion of L-arginine to L-citrulline and ammonia (12). Recently, the nucleotide sequence of the arginine deiminase gene of M. arginini was reported by Kondo et al. (7). However, their sequence differed from ours in several places and especially at the C terminus (Fig. 4 and 6). The single base changes observed in the N terminus may be attributable to the origins of the M. arginini strains. Kondo et al. used M. arginini KM101 from their stock
a MSVFDSKF[GIHVYSEIGELESVLVHEPGREIDYITPARLDELLFSAILESHDARKEHIQFVAELK 66
b ------------------------------------------------------------------ 66
c MSTEKT[LG-H--A-K-RK-M-CS--LAHQRL--SNC-----DDVIWVNQ-KRD-FD--TKMR 63
ANDINVVELIDLVAETYDLASQEAKDKLIEEFLEDSEPVLSEEHKVVVRNFLKAKKTSRKLVEIMM 132 ---------T-------------------------------------------S-----E-----ERG-D-L-MHN-LT--" IQNP--LKWILDRKITADSVG-GLTSE"L-SW-ES'LEP---A-YLI 124
AGIT[YDL............GIEADHELIVDPMPNLYFTRDPFASVGNGVTIHYMRYKVRQRETL 185 ---------------------------------
185
G-VAAD--PASEGANILKMYREYLGHSSFLLP-L--TQ----TTCWIYG---LNP-YWPA-RQ--- 190 FSRFVFSNHPKLINT"'PWYYDPSLKL..SIEGGDVFIYNNDTLVVGVSERTDLQTVTLLAKNI 245 -____________________ 245 LTTAIYKF--EFA-AEFEI--G--DKDHGSSTL----LMPIG-GVVLI-MG--SSR-AIGQV-QSL 256 ----------
VANIECEFKRIVAINVPKWTNLMHLDTWLTMLD[DIF LYSPIANDVFKFWDYDLVNGGAEPQPVE 310
------------------------------------- __ 310 F-KGAA- '-VIVAGL--SRAA-----VFSFC-R-LVTVFPEVVKEIVP-..'S-RPDPSS-YGMN 317
NGLPLEGLLQSIINIKPV LIPIAGEGASQMEIERETHFDGTNYLAIRPGVVIGYSRNEKTNAALE 375 -- --------------------------D---------------375
IRREE[TF-EVVAESLGLIKLRVVET-GNSFAA---QWD--N-VVCLE----V--D--TY--TL-R 383 AAGI[VLPFHGNQLSLGMGNARCMSMPLSRKDVKW KKDYLRPISI
410 385 419
--VE- ITISASE-GR-R-GGH--TC- IV-DPIDY

FIG. 4. Amino acid sequence homologies of LBIF and other arginine deiminases. (a) LBIF. (b) M. arginini arginine deiminase sequence determined by Kondo et al. (7). (c) P. aeruginosa arginine deiminase sequence. Dashes indicate amino acids identical to those shown above. Dots indicate gaps.
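The notation of Fig. 4 (a dash for a residue identical to the reference row, a dot for a gap) lends itself to a simple identity count. The sketch below is one possible way to score a row written in that notation. The function name `percent_identity` and the two short rows are made up for illustration and are not taken from the figure. Excluding gap columns from the denominator is an assumption, and the text does not say how the 27% similarity figure was computed.

```python
def percent_identity(reference, aligned_row):
    """Percent identity for a row written in the dash/dot notation of Fig. 4.

    reference   -- the top row, written out in full ('.' marks a gap)
    aligned_row -- a lower row: '-' means identical to the reference residue,
                   '.' means a gap, any other letter is a differing residue
    """
    if len(reference) != len(aligned_row):
        raise ValueError("rows must be aligned to the same length")
    identical = 0
    compared = 0
    for ref, other in zip(reference, aligned_row):
        if ref == "." or other == ".":
            continue  # gap column: excluded from the comparison (assumption)
        compared += 1
        if other == "-" or other == ref:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Made-up aligned rows, for illustration only.
example_reference = "MSVF.DSKF"
example_row = "--T-E..--"
print(round(percent_identity(example_reference, example_row), 1))
```

Other conventions, such as dividing by the length of the shorter sequence or counting gaps as mismatches, would give different percentages for the same alignment.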
collection (7). We used an M. arginini strain isolated from the U937 cell line (12). From nucleotide 1125 to the 3' end, the nucleotide sequence of Kondo et al. is completely different from ours. Nucleotide 1125 is located within an XbaI site (Fig. 6). We determined the nucleotide sequence in both directions of an EcoRI-ClaI fragment (pUEb4) by preparing various deletion mutants (Fig. 3). This strategy could avoid the mistake of subcloning at the XbaI site (nucleotides 1118 to 1124). Kondo et al. could not find a transcription terminator in the 3'-untranslated region of their sequence. In the LBIF gene, there are two inverted repeats in the 3'-untranslated region; one of these may correspond to the transcriptional stop sequence (Fig. 2c). We determined the amino acid sequence of tryptic peptide I (Fig. 2c). To
FIG. 5. Southern hybridization of genomic DNA. Lanes: 1, EcoRI-digested DNA of M. arginini; 2, EcoRI-digested DNA of M. hyorhinis. λ DNAs digested with HindIII were used as size markers. The 725-bp EcoRI fragment was used as a probe. The arrowhead shows the 725-bp band. Numbers at left are in kilobases.
confirm that the last codon, TGA, of the open reading frame was read as a tryptophan codon, we determined the amino acid sequence, Pro-Leu-Ser-Arg-Lys-Asp-Val-Lys-Trp, by purifying the CNBr fragment CN3 (data not shown). The results completely matched our nucleotide sequencing results. Recently, gene cloning of arginine deiminase, a tumor growth inhibitory factor (TGIF), of Mycoplasma orale was attained (S. Satoh, I. Yoshioka, J. Kobayashi, M. Nogawa, and M. Otani, Tissue Culture Res. Commun. 9:63, 1990). The TGIF gene has an open reading frame of 1,230 bp encoding 410 amino acids (S. Satoh et al., Japanese patent Kokai Tokkyo Koho, JP, 90-53490, CI.C12N 15/55). All five tryptophan residues are encoded by TGA codons and located at the same positions as those in LBIF. The nucleotide sequences and amino acid sequences of TGIF and LBIF are 82 and 83% homologous, respectively. The molecular weights of the purified enzymes were estimated to be 46,000 by Kondo et al. (7) and 45,000 by us (12, 15). The molecular weight (46,372) deduced from the LBIF gene sequence showed good agreement with these values. We have clearly determined the arginine deiminase activity of highly purified LBIF samples. However, as there are five TGA codons in the open reading frame of the LBIF gene, we have not determined the arginine deiminase activity of the cloned gene product by using expression vectors in E. coli or eucaryotic systems. In this context, Satoh et al. have performed an expression experiment with the TGIF gene in E. coli. Only when all five TGA codons were converted to TGG by site-directed mutagenesis was a high level of arginine deiminase activity detected. When four of the TGA codons (all but the last one, TGA [nucleotides 1228 to 1230]) were converted to TGG codons, less than 30% of the arginine deiminase activity detected when the gene contained five
TABLE 1. Comparison of the codon usage of M. arginini and other species

                            No. of the indicated codons in:
Amino  Codon         M. arginini   M. capri-   Bacteria     Animals
acid                 (LBIF gene)   columa      (E. coli)b   (humans)b

Arg    CGA or CGT     7             28         34 (33)      12 (5)
       CGC or CGG     1              1         25 (25)      16 (14)
       AGA            9             69          5 (5)        8 (6)
       AGG            0              0          2 (2)       10 (16)
Leu    CTA or CTT    11             15         12 (10)      16 (19)
       CTC or CTG     0              0         55 (55)      74 (108)
       TTA           32             99          9 (7)        2 (1)
       TTG            0              3          7 (7)        9 (7)
Ser    TCA or TCT    21             72         25 (25)      25 (17)
       TCC or TCG     0              0         22 (22)      20 (29)
       AGC            1              4          9 (9)       21 (17)
       AGT            1             15          8 (7)       12 (7)
Thr    ACA or ACT    18            104         25 (26)      26 (22)
       ACC or ACG     0              0         30 (32)      34 (33)
Pro    CCA or CCT    19             51         12 (12)      24 (22)
       CCC or CCG     0              2         23 (23)      22 (19)
Ala    GCA or GCT    25            132         69 (68)      42 (28)
       GCC or GCG     0              0         45 (43)      44 (46)
Gly    GGA or GGT    21            129         35 (37)      38 (13)
       GGC or GGG     2              5         36 (35)      43 (47)
Val    GTA or GTT    34            133         46 (49)      14 (8)
       GTC or GTG     0              4         27 (26)      54 (55)
Lys    AAA           27            217         45 (46)      19 (14)
       AAG            2             17         17 (18)      49 (42)
Asn    AAC           17             15         26 (25)      28 (36)
       AAT            5             64         11 (10)       8 (5)
Gln    CAA            8             62         14 (13)      10 (11)
       CAG            0              0         29 (29)      28 (35)
His    CAC           11              9         10 (9)       21 (28)
       CAT            0             16         15 (16)      10 (9)
Glu    GAA           35            116         36 (37)      21 (20)
       GAG            0              8         17 (18)      34 (36)
Asp    GAC           15              5         28 (27)      24 (32)
       GAT           10             46         25 (25)      16 (13)
Tyr    TAC           10              7         13 (12)      23 (21)
       TAT            4             25         13 (14)      10 (19)
Cys    TGC            0              3          6 (6)       13 (24)
       TGT            2              4          5 (5)       10 (10)
Phe    TTC           11             10         17 (18)      28 (36)
       TTT            7             43         18 (18)      13 (17)
Ile    ATA or ATT    14            122         29 (29)      15 (12)
       ATC           14              9         31 (32)      24 (14)
Met    ATG           11             47         22 (22)      16 (17)
Trp    TGG            0              0         11 (12)      12 (10)
       TGA            5              8         (stop) (stop)  (stop) (stop)

a Data are from references 10 and 16.
b Data are from reference 6.
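The third-position bias discussed alongside Table 1 can be checked directly on any in-frame coding sequence. The sketch below is an illustration rather than the authors' analysis: the helper name `third_position_gc` is hypothetical, and the input is only the first 25 codons of the open reading frame as printed in Fig. 2c. The last line restates the whole-gene figure given in the text (93 of 410 codons) as a percentage.

```python
def third_position_gc(coding_seq):
    """Count codons ending in G or C in an in-frame coding sequence.

    Returns (number of codons with G or C at position 3, total codons).
    Trailing bases that do not fill a complete codon are ignored.
    """
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq) - 2, 3)]
    gc_third = sum(1 for codon in codons if codon[2] in "GC")
    return gc_third, len(codons)


# First 25 codons of the LBIF open reading frame (nucleotides 1 to 75, Fig. 2c).
first_25_codons = "".join(
    "ATG TCT GTA TTT GAC AGT AAA TTT AAA GGA ATT CAC GTT TAT TCA "
    "GAA ATT GGT GAA TTA GAA TCA GTT CTA GTT".split()
)

gc_third, total = third_position_gc(first_25_codons)
print(gc_third, total)

# Whole-gene value stated in the text: 93 of 410 codons end in G or C.
print(round(100 * 93 / 410, 1))
```

A short window like this one is only a spot check, since the fraction varies along the gene.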
a
S' TAATrrTTAAAAATATCAAAAAACACATTTTTTCTTTTAAAAATAGGACTAAAAGTAGAAAAAATAAAA
-1 00
bA
-1I ATG TCT GTA TTT GAC AGT AAA Trr AAA GGA ATT CAC GrT TAT TCA GAA ATT GGT GAA TTA GAA TCA GTT CTA GCT
75
CAC GAA CCA GGA CGC GAA AlIr GAC TAT ATT ACA CCA GCT AGA CTA GAT GAA TTA TTA TTC TCA GCT ATC TTA GAA
1 50
AGC CAC GAT GCT AGA AAA GAA CAC AAA CAA rrC GTA GCA GAA TTA AAA GCA AAC GAC ATC AAT GTT GCT GAA TTA ______________- - -- - -- - -- - -- - -- --T - -- - -- - -- - -- -- -
225
ATT GAT TTA GTT GCT GAA ACA TAT GAT TTA GCA TCA CAA GAA GCr AAA GAT AAA TTA ATC GAA GAA TTT TTA GAA
300
_-
C-.CGAC TCA GAA CCA GTT CTA TCA GAA GAA CAC AAA GTA GTr GTA AGA AAC TTC TTA AAA GCT AAA AAA ACA TCA AGA --- --- --- --- --- --- T-- --- --- --- --- ---
375
AAA TTA GTA GM ATC ATG ATG GCA GGG ATC ACA AA,A TAC GAT TTA GGT ATC GAA GCA GAT CAC GAA TTA ATC GTT G-- --- --- --- --- --- --- ---
450
GAC CCA ATG CCA AAC CTA TAC TTC ACA CGT GAC CCA TTT GCA T'CA GTA GGT AAT GGT GTA ACA ATC CAC TAC ATG - ___ - - -- ---- - - -- ---T --- --- --- --- -
525
CGT TAC AAA GTT AGA CAA CGT GAA ACA TTA TTC TCA AGA TTT GTA TTC TCA AAT CAC CCT AAA CTA ATT AAC ACT
600
-
_C -
CCA TGA TAC TAC GAC CCT TCA CrA AAA TTA TCA ATC GAA GGT GGA GAC GTA TTT ATC TAC AAC AAT GAC ACA TTA ---- - -- - -- ---T - -- - -- - -- - -- - -- --__---___ ___ ___ ___ _
675
GAA AGA ACT GAC TTA CAA ACA GTT ACT TTA TTA GCT AAA AAC ATT GTT GCT AAT AAA GAA
750
TGT GM TTC AM CGT AT? GTT GCA ATT AAC Grr CCA AAA TGA ACA AAC TTA ATG CAC TTA GAC ACA TGA CTA ACA
825
ATG TTA GAC AAG CAC AAA TTC CTA TAC TCA CCA ATC GCT AAC GAC GTA TTT AAA TTC TGA GAT TAT GAC TTA GTA
900
AAC GGT GGA GCA GAA CCA CAA CCA GTT GAA AAC GGA TTA CCT CTA GAA GGA TTA TTA CAA TCA ATC ATT AAC AAA
975
TCA CAA ATG GAA ATC GAA AGA GAA ACA CAC TTC GAT GGT
1 050
ACA AAC TAC TTA GCA AT? AGA CCA GGT GT? GTA ATT GGT TAC TCA CGT AAC GAA AAA ACA AAC GCT GCT CTA GAA
11 25
GTA GTT GGT GT1' TC
AAA CCA GrT TTA Arr CCT ATIC GCA GCC
GAA GGT GCC
-
C
GCT GCA GGC AT? AAA GT? CIrr CCA TTC CAC GGT AAC CAA TTA TCA TTA GGT ATG GGT AAC GCT CGT TGT ATG TCA AAA AAG -AT TA- T?- AGG -C- AT- -CG ATT TAA TGG GCT GAG GTG AAT TAA TAA -T- TTA CAA AT- ATA --T -A-
1 200
ATG CCT TTA TCA CGT AAA GAT GC'T AAG TGA TAGTAAATTACrATAAAATTA'l'TTAATT?TATATATTAAAAATCTCCAGCACTTGCTGG --A ATA A-- AA- A-G -TT ATA T-A -TT -TG GT-Gl'T---TTC-AT?-CC--AAA--C-A-TAGA-AGT--G-AA-TAGAT-A-GTAAAA
1 289
GGATATTTT'rTAATATrATTTTAAAAAATAATAATGTATGTAAAATAATAAATAAAAGGAAAATGTATGGAAAATAAAAAAGGTAAATTATATAATGAA TCTA-AA-C--TT--CG--GA--CG-TCAC-A- -ATA-CT-T-TG-C-CTGCArr-TAAT-TTAA-GATT-T- --TTT-T-A----- AGA-ACAA-C
1 388
CATCA AAAGTTTTTTCAAATTAT'rTGGTCCAAAl"rTTGGGGT'rCATTATAA AATACATTAI'ATTTrATTGAATTrGATTAAT^AAAGTTAAT^AAATGA'I'TAT TCAATTTrCAA-A- --C-GGAC-AATA-A --- CAAGACA-AC-AC-G-G-TG-AT-T-G-AT- -ATCCTG- - -- T---T-AT- -AGAT?ATC-AT- -GT-
1 48 7
TTrA'GATATA AAAAArlCT^AAAGA ACA GTC'rTC'rAAA AA A''AAAArTATG'TAAAA'ArAGAGTACTTTTTATATATA AA AAAGTATCGGATCGAT 3' 1 586 -A--T--TATTAAAC-C-GA--TG-I'TTAG--AAT-AT-GA -CG ----T -?---T--CAT---'rA--A-TA---TA---TCTA--Tr-A-AAA --G-A
FIG. 6. Nucleotide sequences of LBIF (a) and arginine deiminase determined by Kondo et al. (7) (b). Dashes indicate nucleotides identical to those shown above. Underlining indicates the XbaI site. The arrow indicates the boundary that distinguishes the differences between the nucleotide sequences determined by Kondo et al. and us.
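The effect of reading TGA as tryptophan can be shown on the last codons of the open reading frame. The sketch below is illustrative only: it builds the standard codon table, derives a second table in which TGA is assigned to Trp (the usage reported for mycoplasmas), and translates the 3'-terminal codons of the gene as given in Fig. 2c. The names `STANDARD`, `MYCOPLASMA`, `translate`, and `c_terminal_codons` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
STANDARD = {"".join(codon): aa
            for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Same table, except that TGA is read as tryptophan instead of stop.
MYCOPLASMA = dict(STANDARD, TGA="W")


def translate(seq, table):
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = table[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Last 10 codons of the open reading frame plus the TAG terminator (Fig. 2c).
c_terminal_codons = "ATGCCTTTATCACGTAAAGATGTTAAGTGATAG"

print(translate(c_terminal_codons, MYCOPLASMA))  # TGA read as Trp
print(translate(c_terminal_codons, STANDARD))    # TGA read as stop
```

Under the modified table the translation ends in Trp, matching the C-terminal peptide Pro-Leu-Ser-Arg-Lys-Asp-Val-Lys-Trp reported in the text. Under the standard table the product is one residue shorter, which is relevant to expression of the unmodified gene in E. coli.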
TGG codons in place of the five TGA codons was present (Satoh et al., Tissue Culture Res. Commun., 1990), suggesting that the tryptophan residue at the C terminus might be important for the expression of enzyme activity. These results also support the idea that the LBIF gene encodes arginine deiminase.
Two important points regarding the codon usage of the arginine deiminase gene were noted. First, the GC content was low throughout the LBIF gene sequence (Table 1). This characteristic of mycoplasmal species has been reported previously (5, 10, 16). It has been suggested that this AT richness results from a putative evolutionary pressure that converts GC pairs to AT pairs. A and T nucleotides appeared more frequently than G and C nucleotides, especially at the third position of codons, in LBIF as well as in other mycoplasmal proteins. However, among the amino acids encoded by only two kinds of codons, the LBIF gene used more GC pairs than AT pairs at the third position of the codons for Asn, Asp, His, Tyr, and Phe (Table 1). This feature may be characteristic of M. arginini and is not observed in proteins of M. capricolum (10, 16).
Second, all tryptophans of LBIF appeared to be encoded by TGA, which is a termination (opal) codon in both procaryotes and eucaryotes. This finding is consistent with previous findings that the TGA codon encodes tryptophan in ribosomal protein genes and tRNA genes of M. capricolum (9, 16). Thus, it is suggested that mycoplasma species, at least M. capricolum and M. arginini, predominantly use the TGA codon rather than the universal TGG codon for tryptophan. Mycoplasma genitalium and Mycoplasma pneumoniae appear to use both TGA and TGG codons for tryptophan in the genes encoding adhesins (5).
ACKNOWLEDGMENTS
We thank Mihoko Sato for expert editorial assistance. This work was supported by a Grant-in-Aid for Cancer Research from the Japanese Ministry of Education, Science and Culture; a Grant-in-Aid for Scientific Research from the Japanese Ministry of Education, Science and Culture; a special Grant-in-Aid for promotion of Education and Science in Hokkaido University provided by the Japanese Ministry of Education, Science and Culture; a Grant-in-Aid from the Mochida Memorial Foundation for Medical and Pharmaceutical Research; and a Grant-in-Aid from the Akiyama Foundation.
LITERATURE CITED
1. Barile, M. F., and B. G. Levinthal. 1968. Possible mechanism for mycoplasma inhibition of lymphocyte transformation induced by phytohemagglutinin. Nature (London) 219:751-752.
2. Baur, H., E. Luethi, V. Stalon, A. Mercenier, and D. Haas. 1989. Sequence analysis and expression of the arginine-deiminase and carbamate-kinase genes of Pseudomonas aeruginosa. Eur. J. Biochem. 179:53-60.
3. Cole, B. C., Y. Naot, E. J. Stanbridge, and K. S. Wise. 1985. Interactions of mycoplasmas and their products with lymphoid cells in vitro, p. 203-257. In S. Razin and M. F. Barile (ed.), The mycoplasmas, vol. 4. Academic Press, Inc., New York.
4. Copperman, R., and H. E. Morton. 1966. Reversible inhibition of mitosis in lymphocyte cultures by non-viable mycoplasma. Proc. Soc. Exp. Biol. Med. 123:790-795.
5. Dallo, S. F., A. Chavoya, C.-J. Su, and J. B. Baseman. 1989. DNA and protein sequence homologies between the adhesins of Mycoplasma genitalium and Mycoplasma pneumoniae. Infect. Immun. 57:1059-1065.
6. Grantham, R. 1981. Codon catalog usage is a genome strategy modulated for gene expressivity. Nucleic Acids Res. 9:43-74.
7. Kondo, K., H. Sone, H. Yoshida, T. Toida, K. Kanatani, Y.-M. Hong, N. Nishino, and J. Tanaka. 1990. Cloning and sequence analysis of the arginine deiminase gene from Mycoplasma arginini. Mol. Gen. Genet. 221:81-86.
8. Maniatis, T., E. F. Fritsch, and J. Sambrook. 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
9. Muto, A., Y. Kawauchi, F. Yamao, and S. Osawa. 1984. Preferential use of A- and U-rich codons for Mycoplasma capricolum ribosomal proteins S8 and L6. Nucleic Acids Res. 12:8209-8217.
10. Sawada, M., A. Muto, M. Iwami, F. Yamao, and S. Osawa. 1984. Organization of ribosomal RNA genes in Mycoplasma capricolum. Mol. Gen. Genet. 196:311-316.
11. Schimke, R. T., and M. F. Barile. 1963. Arginine breakdown in mammalian cell culture contaminated with pleuropneumonia-like organisms (PPLO). Exp. Cell Res. 30:593-596.
12. Sugimura, K., S. Fukuda, Y. Wada, M. Taniai, M. Suzuki, T. Kimura, T. Ohno, K. Yamamoto, and I. Azuma. 1990. Identification and purification of arginine deiminase that originated from Mycoplasma arginini. Infect. Immun. 58:2510-2515.
13. Sugimura, K., T. Ohno, S. Fukuda, Y. Wada, T. Kimura, and I. Azuma. 1990. Tumor growth inhibitory activity of a lymphocyte blastogenesis inhibitory factor (LBIF). Cancer Res. 50:345-349.
14. Sugimura, K., K. Tsukahara, Y. Ueda, K. Takeda, Y. Habu, and I. Azuma. 1988. Fast protein liquid chromatography of lymphocyte blastogenesis inhibitory factor (LBIF) produced by a human macrophage-like cell line U937. J. Chromatog. 440:131-140.
15. Sugimura, K., Y. Ueda, K. Takeda, S. Fukuda, K. Tsukahara, Y. Habu, H. Fujiwara, and I. Azuma. 1989. A lymphocyte blastogenesis inhibitory factor (LBIF) arrests mitogen-stimulated T lymphocytes at early G1 phase with no influence on interleukin 2 production and interleukin 2 receptor light chain expression. Eur. J. Immunol. 19:1357-1364.
16. Yamao, F., A. Muto, Y. Kawauchi, M. Iwami, S. Iwagami, Y. Azumi, and S. Osawa. 1985. UGA is read as tryptophan in Mycoplasma capricolum. Proc. Natl. Acad. Sci. USA 82:2306-2309.
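Illustrative note on the codon usage discussed above. The two observations (TGA read as tryptophan rather than as a stop, and the base composition at the third codon position) can be made concrete with a minimal Python sketch. It is not part of the original analysis: the function names, the `MYCOPLASMA` table name, and the 18-nucleotide toy sequence are assumptions introduced for illustration, and the toy sequence is invented rather than taken from the LBIF gene. The sketch builds the standard codon table, derives a variant in which only TGA is reassigned to tryptophan, and reports the fraction of codons ending in G or C.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical name: the same table with TGA reassigned from stop to tryptophan.
MYCOPLASMA = dict(STANDARD, TGA="W")


def codons(seq):
    """Split a DNA string into complete in-frame codons (trailing bases dropped)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def translate(seq, table):
    """Translate until the first stop codon; raises KeyError on non-ACGT input."""
    protein = []
    for codon in codons(seq):
        aa = table[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def third_position_gc(seq):
    """Fraction of codons whose third base is G or C."""
    cs = codons(seq)
    return sum(c[2] in "GC" for c in cs) / len(cs)


# Invented toy sequence (not from the LBIF gene): ATG AAA TGA TTT GAT TAA
toy = "ATGAAATGATTTGATTAA"
print(translate(toy, STANDARD))    # translation stops at TGA
print(translate(toy, MYCOPLASMA))  # TGA is read as tryptophan
print(third_position_gc(toy))
```

With the standard table the toy reading frame terminates at the internal TGA, whereas the reassigned table reads through it, which is the behavior the amino acid sequencing in this study supports for M. arginini. Applying `third_position_gc` to a real open reading frame would give the kind of third-position composition summarized in Table 1.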