Journal of General Virology (1990), 71, 2747-2753.


Printed in Great Britain

Nucleotide sequence of feline panleukopenia virus: comparison with canine parvovirus identifies host-specific differences

John C. Martyn,1,2† Barrie E. Davidson2 and Michael J. Studdert1

1School of Veterinary Science and 2The Russell Grimwade School of Biochemistry, University of Melbourne, Parkville, Victoria, 3052 Australia

The nucleotide sequence of feline panleukopenia virus (FPV) strain 193 was determined and compared with the sequence of canine parvovirus (CPV) strain N and partial sequences of FPV strain Carl and CPV strain b. Base differences were identified at 115 positions in these 5.1 kb genomes and predicted amino acid differences occurred at 40 positions. The two overlapping capsid protein genes contained almost twice as many base differences as the single non-structural protein gene (49 compared to 26) and about the same ratio was calculated for predicted amino acid differences (27 compared to 13). The 27 variant amino acids in the capsid proteins were clustered at three sites in the primary sequence, whereas 10 of the 13 variant amino acids in the non-structural protein occurred in the 130 C-terminal amino acids. The two FPV strains differed consistently from the two CPV strains at 31 bases: 12 base changes in the capsid protein genes resulted in six amino acid changes, six base changes in the non-structural protein gene resulted in three amino acid changes, and 13 base changes occurred in the non-coding sequence.

Feline panleukopenia virus (FPV) has been endemic in feline populations for at least much of this century whereas canine parvovirus (CPV) emerged suddenly, presumably as a host-range mutant of FPV or a related virus, around 1977 and caused a major pandemic of a highly fatal, hitherto unrecognized disease of dogs. Parvovirus genomes consist of one single-stranded, linear DNA molecule of about 5 kb, with terminal hairpins (Cotmore & Tattersall, 1986; Siegl et al., 1985). The DNA typically encodes at least one non-structural protein (NS1) and two capsid proteins, VP1 and the smaller VP2 (Cotmore et al., 1983; Cotmore & Tattersall, 1986; Paradiso, 1984; Rhode & Paradiso, 1983). The extent of mutation in the non-structural protein gene(s) of FPV in the evolution to CPV has not been defined at the level of nucleotide sequence. In this report the nucleotide sequence of FPV strain 193 (FPV.193), a 1970 Australian isolate, is presented. Comparison with the complete nucleotide sequence of CPV.N (Reed et al., 1988) and partial sequences of FPV.Carl (Carlson et al., 1985) and CPV.b (Rhode, 1985) enabled identification of nucleotide and amino acid residues that vary consistently between strains of FPV and CPV, some of which may have permitted FPV to infect canines as a host-range mutant now called CPV.

FPV.193, isolated in 1970 (Studdert & Peterson, 1973), was passaged and plaque-purified (PP) in a feline embryo (FEmb) cell line (O'Reilly & Whitaker, 1969). A stock designated FPV.193 PP passage 18 had a titre of 10^8.4 p.f.u./ml. Confluent FEmb monolayer cell cultures (approx. 4 × 10^7 cells) grown in 150 cm² cell culture flasks (Flow Laboratories) were synchronized by incubation in serum-free growth medium at 37 °C for 24 h. Cells were

† Present address: CSIRO, Australian Animal Health Laboratory, Geelong, Victoria, 3220 Australia.

0000-9639 © 1990 SGM

Fig. 1. Strategy for sequencing FPV.193 RF DNA. The map is drawn against a scale of 0 to 100 map units. Inserts of plasmids pE, pPVU, pVP2 and pHIII were digested with restriction enzymes and shotgun-cloned into phage M13 vectors. Arrows indicate the segment of a particular strand sequenced. The cross-hatched region is the NS1 gene and the stippled region is the VP2 gene. Restriction sites: B, BssHII; A, AluI; U, Sau3AI; E, EcoRI; X, XbaI; N, HindII; D, HindIII; C, ScaI; P, PstI; G, BglII; V, PvuII; H, HaeIII; R, EcoRV; S, SspI.


Short communication

C T T TAGAACCAAC TGAC CAAGT T CACGTACGTATGAC GTGAT~ACGC GC G%~T/ICGC GCGC T G~C TACG~CAGT CACAC GTCATACG TACGC T CC T TGGTCAGT TGGT TC TAAAGAATGAT

120

A G G C G G T T T G T G T G T T T A A A C T T G G G C G G G A A A A G G T G G C G G G C TAAT TGTGGGCGTGGTTAAAGGTATAAAAGACAAACCATAGACCGTTACTGACATTCGCTTC TTGTCTCTGACAGA

240

NSI: M S G N Q Y T E E V M E G V N W L K K H A E N E A F S F V F K GTGAACCTCTC TTACTC TGACTAACCATGTC TGGCAACCAGTATACTGAGGAAGTTATGGAGGGAGTAAATTGGTTAAAGAAACATGCKGAAAATGAAGCATTT TCGTTTGTTTTTAAA

31 359

C D N V Q L N G K D V R W N N Y T K P I Q N E E L T S L I R G A Q T A M D Q T E TGTGACAACGTCC AACTAAATGGAAAGGATGTTCGCTGGAACAAC TATACCAAACCAATTCAAAATGAAGAGC TAACATC TT TAATTAGAGGAGCACAAACAGCAATGGATCAAACCGAA

71 479

E E E M D W E S E V D S GAAGAAGAAATGGACTGGGAATCGGAAGTTGATAGTC

L A K K Q V Q T F D A L I K K C L F E V F V S K N I E P TCGCCAAAAAGCAAGTACAAACTTTTGATGCATTAATTAAAAAATGTCT TTTTGAAGTC TTTGT T TC TAAAAATATAGAACCA

111 599

N E C V W F I Q H E W G K D Q G W H C H V L L H S K N L Q Q A T G K W L R R Q M AATGAATGTGT TTGGTT TATTCAACATGAATGGGGAAAAGATCAAGGCTGGCATTGTCATGTTTTACT TCATAGTAAGAACTTACAACAAGCAAC TGGTAAATGGC TACGCAGACAAATG

151 719

N M Y W S R W L V T L C S V N L T P T E K I K L R E I A E D S E W V T I L T Y R AATAT GTAT TGGAGTAGATGG T TGGTGAC TC T T T G T TCGGTAAAC T TAACACCAAC TGAAAAGAT TAAGC TCAGAGAAAT TGCAGAAGATAGTGAAT GGG TGAC TATAT TAACATACAGA

191 839

H K Q T K K D Y V K M V H F G N M I A Y Y F L T K K K I V H M T K E S G Y F L S CATAAGCAAACAAAAAAAGAC TATGT TAAA TGGT TCAT T T T GGAAATAT GATAGCATAT TACT T T T TAACAAAGAAAAAAAT TGTCCACATGACAAAAGAAAG TGGC TAT T T T TTAAGT

231 959

T D S G W K F N F M K Y Q D R H T V S T L Y T E Q M K P E T V E T T V T T A Q E AC TGAT TCTGGTT GGAAAT T TAAC T T TATGAAG TATCAAGACAGACATAC TG T CAGCACAC T T TACAC TGAACAAAT GAAACCAGAAACCGT TGAAACCACAGTGACGACAGCACAGGAA

271 1079

T K R G R I Q T K K E V S I K C T L R D L V S K R V T S P E D W M M L Q P D S Y 311 ACAAAGC GC GGGAGAAT T CAAAC TAAAAAGGAAGTGTCAATCAAATG TACT T TGC GGGAC T TGGT TAGTAAAAGAGTAACATCACC TGAAGAC TGGATGATGT TACAACCAGATAGT TAT 1199

I E M M A Q P G G E N L L K N T L E I C T L T L A R T K T A F E L I L E K A D N ATT GAAATGATGGCACAACCAGGAGGT GAAAATC T TT TAAAAAATACAC T TGAAAT T TGTAC T T T GAC T T TAGCAAGAACAAAAACAGCATT TGAAT TAATAC T TGAAAAAGCAGATAAT

351 1319

T K L T N F D L A N S R T C Q I F R M H G W N W I K V C H A I A C V L N R Q G G AC TAAAC TAAC TAAC TT TGATC T TGCAAAT TCTAGAACAT GTCAAAT T T T TAGAATGCAC GGATGGAAT T GGAT TAAAGT T TGTCACGC TATAGCATGTGTT T TAAATAGACAAGGTGGT

391 1439

K R N T V L F H G P A S T G K S I I A Q A I A Q A V G N V G C Y N A A N V N F P 431 AAAAGAAATACAGT TCT T TT TCATGGACCAGCAAGTACAGGAAAAT C TAT TAT TGC TCAAGC CATAGCACAAGC T GTGGGTAATGT TGGT TGT TATAATGCAGCAAATGTAAAT T T TCCA 1559 F N D C T N K N L I W I E E A G N F G Q Q TTTAATGACTGTACCAATAAAAATTTAATTTGGATTGAAGAAGCTGGTAACTTTGGTCAACAAGT

v

N Q F K A I C S G Q T I R I D Q K G TAATCAAT T TAAAGCAATTTGTTCTGGACAAACAATTAGAATTGATCAAAAAGGT

471 1579

K G S K Q I E p T p V I M T T N E N I T I V R I G C E E R P E H T Q P I R D R M AAAGGAAGTAAGCAAAT TGAACCAAC TCCAGTAAT TATGACAAC TAATGAAAATATAACAAT TGTAAGAAT TGGATG TGAAGAAAGAC C T G A A C A T A C A C A A C C A A T ~ G A G A C A G ~ T G

511 1799

L N I K L V C K L P G D F G L V D K E E W P L I C A W L V K H G Y Q S T M A N Y 551 T TGAACATTAAGT TAGTATGTAAGC T TCCAGGAGAC T T TGGT T TGGT TGATAAAGAAGAATGGCC T T TAATATG TGCATGGTTAGT TAAACATGGT TATCAATCAACCAT GGCTAAC TAT 1919 T H H W G K V p E W D E N W A E P K I Q E G I I S P G C K D L E T Q A A S N P Q 591 ACACATCAT TGGGGAAAAGTACCAGAATGGGACGAAAAC TGGGCGGAGCC TAAAATACAAGAAGGTATAAT TT CAC CAGGTTGCAAAGAC TTAGAGACACAAGC GGCAAGCAATCC TCAG 2039 S Q D H V L T P L T P D V V D L A L E P W S T P D T P I A E T A N Q Q S N Q L G 631 AGTCAAGAC C~CGT T C TAAC TCC TC TGAC TCC GGACGTAGTGGACCT TGCAC TGGAACCGTGGAG TAC TC CAGATACGCC TAT TGCAGAAAC TGCAAATCAACAATCAAACCAACT TGGC 2159 V T H K D V Q A S P T W S E I E A D L R A I FT S E Q L E E D F R D D L D %'P1.568 GT TAC TCACAAAGACGTGCAAGCGAGTCCGACATGGTC C GAAATAGAGGCAGACC TGAGAGCCAT T T T TACT TC TGAACAAT TGGAAGAAGATT T TCGAGACGAC T TGGAT TAAGGTACG 2279 M A P P A K R A R R AT GGCACCT C CGGCAAAGAGAGCCAGGAGAGGTAAGGGTGTGT

G L V P P G 16 TAGTAAAGTGGGGGGAGGGGAAAGAT T TAATAAC T TAAC TAAG TATG TGT T T TC T TATAGGAC T TGT GC C TCCAGGT 2399

y K y L G P G N S L D Q G E P T N P S D A TATAAATATCTTGGGCCTGGGAACAGTCTTGACCAAGGAGAACCAACTAACCCTTCTGACGCCGC

A

A K E H D E A Y A A Y L R S G K N P TGCAAAAGAACACGACGAAGC TTACGCTGC TTATCTTCGCTCTGGTAAAAACCCA

56 2519

y L y F S p A D Q R F I D Q T K D A K D W G G K I G H Y F F R A K K A I A P V L 96 TACT TATAT T TCTCGC CAGCAGATCAACGC T T TATAGATCAAAC TAAGGACGC TAAAGAT TGGGGGGGGAAAATAGGACAT TAT T T TT T TAGAGC TAAAAAA_GCAAT TGC TC CAGTAT TA 2639 T D T p D H P S T S R P T K P T K R S K P P P H I F I N L A K K K K A G A G q V ACTGATACACCAGATCATCCATCAACATCAAGACCAACAAAACCAACTAAAAGAAGTAAACCACCACC TC ATAT T T TCATCAATCT TGC GCCGGTGCAGGACAAGTA %1'2: K R D N L A P M S D G A V Q P D G G Q P A V R N E R A T G S G N G S G G G G G G AAAAGAGACAATC TTGCACCAATGAGTGATGGAGCAGTTCAACCAGACGGTGGTCAACC TGC TGT CAGAAATGAAAGAGC TACAGGATC T GGGAACGGGTC TGGAGGCGGGGGTGGTGGT

136 2759 175 2879

G S G G V G I S T G T F N N Q T E F K F L E N G W V E I T A N S $ R L V H L N M 216 GGT TCTGGGGGTGTGGGGATT TC TACGGGTAC T T TCAATAATCAGACGGAAT T TAAAT T T T T GGAAAACGGAT GGGT GGAAAT CACAGCAAAC TCAAGCAGACT TGTACAT T TAAATATG 2999 P E S E N Y K R V V V N N M D K T A V K G N M A L D D I H ~V Q I V T P W S L V D 256 C C A G A A A G T G A A A A T T A T A ~ K G A G TAGT TGTAAATAATATGGATAAAAC TGCAGT TA~_GGAAACATGGC T TTAGATGATATT CATGTACAAAT TGTAACACC T TGGTCAT TGGT TGAT 3119 A N A W G V W F N P G D W Q L I V N GCAAATGCTTGGGGAGTTTGGTTTAATCCAGGAGATTGGCAACTAATTGTTAATAC

T

M S E L H L V S F E Q E I F N V V L K T V TATGAGTGAGTTGCATTTAGTTAGTTTTGAACAAGAAATTTTTAATGTTGTTTTAAAGACTGTT

296 3239

S E S A T Q P P T K V Y N N D L T A S L M V A L D S N N T M P F T P A A M R S E 335 TCAGAATC TGC TAC TCAGCCACCAAC TAAAGT T TATAATAATGATT TAAC TGCATCAT TGATGGTTGCAT TAGATAGTAATAATAC TATGCCAT T TAC TCCAGCAGC TATGAGATC TGAG 3359


T L G F Y P W K P T I P T P W R Y Y F q W D R T L I P S H T G T S G T P T N V Y ACATTGGGTTTTTATCCATGGAAACC~CCATACC~CTCCATGGAGATATTATTTTC~TGGGATAG~CATT~TACCATCTcATACTGG~CTAGTGGCACACc~CAAATGTATAT

376 3479

H G T D P D D V Q F Y T I E N S V P V H L L R T G D E F A T G T F F F D C K P C CATGGTACAGATCCAGATGATGTTc~TTTTATACTATTGAAAATTCTGTGCcAGTACACTTACT~G~CAGGTGATG~TTTGCTAcAGG~CATTTTTTTTTGATTGTAAACCATGT

416 3599

R L T H T W q T N R A L G L P P F L N S L P Q S E G A T N F G D I G V q Q D K R AGACT~CACATA~TGGCAAACAAATAGAGCATTGGGCTTACCACCATTTCTAAATTCTTTGCcTC~TCTG~GGAGCTACT~CTTTGGTGATATAGGAGTTC~C~GATAAAAGA

456 3719

R G V T q M G N T ~ Y I T E A T I M R P A E V G Y S A P Y Y S F E A S T Q G P F CGTGGTGT~CTCA~ATGGGA~ATACAGACTATATTACTG~GCTACTATTATGAGACCAGCTGAGGTTGGTTATAGTGcACcATATTATTCTTTTG~GcGTCTACAC~GGGCCATTT

496 3839

K T P I A A G R G G A Q T D E N Q A A D G D P R Y A F G R Q H G q K T T T T G E AAAA~ACCTATT~AGCAGGACGGGGGGGAGCGCAAACAGATGAAAATC~GCAGCAGATGGTGATCc~GATATG~ATTTGGTAGAC~CATGGTCAAAAGACTACTA~AGGAG~

536 3959

T P E R F T Y I A H q D T G R Y P E G D W I Q N I N F N L P V T N D N V L L P T ACACCTGAGAGATTTACATATATA~ACATC~GATACAGG~GATATCCAG~GGAGATTGGATTCA~AATATT~CTTT~cCTTCCTGT~CAAATGAT~TGTATTGcTACC~CA

576 4079

D P I G G K T G I N Y T N I F N T Y G P L T A L N N V P P V Y P N G Q I W D K E GATCC~TTGGAGGTAAAACAGG~TT~CTATACT~TATATTT~TACTTATGGTCCTTT~CTGCATTA~AT~TGTACCACCAGTTTATCCAAATGGTCAAATTTGGGATAAAG~

616 4199

F D T D L K P R L H V N A P F V C Q N N C P G Q L F V K V A P N L T N E Y D P D TTTGATACTGACTTAAAACC~GACTTCATGTAAATGCACCATTTGTTTGTCA~AAT~TTGTCCTGGTC~TTATTTGTAAAAGTTGCGCCT~TTT~CG~TG~TATGATCCTGAT

656 4319

A S A N M S R I V T Y S D F W W K G K L V F K A K L R A S H T W N P I Q Q M S I GCATCTGCT~TATGTC~G~TTGT~CTTATTCAGATTTTTGGTGGAAAGGTAAATTAGTATTTAAAG~TAAA~T~GAGCAT~T~ATA~TTGG~TCC~TTC~CAAATGAGTATT

696 4439

N V D N Q F N Y V P ~ N I G ~ M K I V Y E K S Q L A P R K L Y ~TGTAGAT~C~TTT~CTATGTACCAAAT~TATTGGAGCTATGAAAATTGTATATGAAAAATCTC~CTAGCACCTAGAAAATTATATT~TATACTTACTATGTTTTATGTTTA

727 4559

TTACATATC~CTAGCACCTAGAAAATTATATT~TATACTTACTATGTTTTTATGTTTATTACATATTATTTT~GATT~TTAAATTACAGCATAGAAATATTGTACTTGTATTTGAT f i

4579

ATAG~TTTAG~GGTTTGTTATATGGTATAC~T~CTGT~GA~ATAG~G~CATTTAGAT~TAGTTAGTAGTTTGTTTTATAAAATGT~TGTAAACTATCAATGTATGTTGTT

4799

ATGGTGTGGGTGGTTGGTT~TTTGCCCTTAG~TATGTT~GGA~CAA~TC~TAAAAGACATTTAA~6TAAATG6T--~CGTATACTGTCTAT~GGTG~T~CCT~CC~

4919

~GTATC~TCTGTCTTT~GGGGGGGGTGGGTGGGAGATGCAC~TATCAGTAGACTGACTGG

4983

Fig. 2. The nucleotide sequence of the complementary strand of FPV.193 RF DNA. Deduced amino acid sequences of FPV proteins are indicated by the single letter code. The 3' palindromic sequence is overlined; dots indicate unpaired bases and -, = and arrowheads indicate folding (see Fig. 3). The partial 5' palindromic sequence is overlined. Promoters and the polyadenylation signal are in bold type and the stem and loop structure adjacent to the polyadenylation signal is overlined with dots indicating unpaired bases. The two copies of the 59 base repeat sequence are underlined with arrows showing the direction of the repeat. Five potential splice sites each of four bases are underlined. The 31 bases and nine amino acid differences that distinguish CPV from FPV are underlined.
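The deduced amino acid sequences in Fig. 2 follow from reading the complementary strand in codons. The sketch below is an illustration only: it assumes the standard genetic code, and the 48-base string is the start of the NS1 coding sequence as read from Fig. 2 with extraction spacing removed. The names `CODON_TABLE`, `translate` and `NS1_START` are hypothetical helpers, not part of the original analysis (which used the Staden programs).

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon; '*' marks a stop codon.

    Spaces are ignored and a trailing partial codon is dropped.
    Ambiguous or garbled bases raise KeyError rather than being guessed.
    """
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 16 codons of the NS1 reading frame, transcribed from Fig. 2.
NS1_START = "ATGTCTGGCAACCAGTATACTGAGGAAGTTATGGAGGGAGTAAATTGG"

print(translate(NS1_START))  # MSGNQYTEEVMEGVNW
```

The output agrees with the first 16 residues printed above the nucleotide sequence for NS1 in Fig. 2.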

split 1:3 in growth medium and dispensed into 150 cm² flasks. When the cells had attached to the plastic at approximately 6 h post-seeding, a 5 ml virus inoculum (0.5 ml of virus stock diluted in Eagle's basal medium containing 1% foetal calf serum) per culture flask (m.o.i. approximately 10) was adsorbed to cells for 1 h. Growth medium (45 ml) was added and the cells were incubated for 24 h, when the FPV replicative form (RF) DNA yield was expected to be maximal (Lenghaus et al., 1985). The DNA was extracted from FPV-infected cells by a variation of the Hirt method (Hirt, 1967). The monomer RF DNA band was purified from the DNA extract following electrophoresis. Four overlapping RF DNA fragments, containing 97% of the FPV genome, were purified from different restriction digests of a preparation of monomeric FPV RF DNA by electrophoresis onto DEAE-cellulose membranes (NA-45, Schleicher & Schüll). These fragments were eluted in 1 M-NaCl, 20 mM-Tris-HCl, 0.1 mM-EDTA pH 8.0 at 65 °C for 1 h and ligated into pUC12 (Fig. 1). Escherichia coli strain JM107 (Messing, 1983) transformed with recombinant plasmids was selected on minimal agar containing 50 µg/ml ampicillin, 0.1 mg/ml 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside and 1 mM-isopropyl-β-D-thiogalactopyranoside. Subclones in phage M13 vectors (Yanisch-Perron et al., 1985) were generated by restriction digestion and shotgun cloning of RF DNA inserts purified from recombinant plasmids (Maniatis et al., 1982). Forty-three subclones (Fig. 1) and the 17-mer universal primer were used to sequence both strands of the 5 kbp FPV RF DNA by the dideoxynucleotide chain termination method (Sanger et al., 1977). The computer programs developed by Staden (1982a, b), arranged into the MELBDBSYS suite by Dr A. Kyne, Walter and Eliza Hall Institute, Melbourne, Australia, were used to manipulate and analyse sequence data.

The nucleotide sequence of the complementary strand of FPV.193 RF DNA is presented in Fig. 2. It is 4983 bp from the 3' terminus [map unit (m.u.) 0] to the 5'-terminal HaeIII site (m.u. 97). A search for open reading frames (ORFs) in the viral (V or minus polarity) and complementary (C or plus polarity) strands of FPV.193 RF


DNA identified two large ORFs, both in frame 3 of the C strand. The left ORF covers 2013 bases [nucleotide (nt) 258 to nt 2271] and the right ORF covers 2175 bases (nt 2358 to nt 4533). Smaller ORFs include a 264 base segment (nt 1970 to nt 2234) in frame 2 of the C strand, and segments of 420 bases (nt 911 to nt 1331) and 306 bases (nt 2009 to nt 2315) in frame 2 of the V strand.

The consensus promoter sequence (TATAA) was present at eight sites in the FPV.193 genome, but only three of these were linked to the two upstream sequences proposed by Bensimhon et al. (1983) to be characteristic of eukaryotic promoters. These sequences were located at 4, 30 and 39 m.u. The FPV.193 genome has only two copies of the AATAAA sequence which is absolutely required but not sufficient for addition of polyadenylic acid to the 3' ends of eukaryotic mRNAs (Wickens & Stephenson, 1984). This sequence is located at m.u. 31 and m.u. 94. The poly(A) signal at m.u. 94 is followed immediately by a characteristic stem-loop structure of 23 nt (Fig. 2).

The amino acid sequences of the three FPV proteins were deduced from the FPV.193 nucleotide sequence (Fig. 2). The NS1 protein would contain 668 amino acids, the VP1 protein would contain 727 amino acids and the VP2 protein would contain 584 amino acids. These correspond to Mr values of 73K, 80K and 64K respectively.

There is a 59 base reiterated sequence in FPV.193 DNA which extends from nt 4509 to nt 4567 and nt 4568 to nt 4627. This sequence includes codons of the eight C-terminal amino acids of the capsid proteins. Another 61 base sequence located about 75 bases further downstream is sometimes repeated in other strains of FPV and CPV (Carlson et al., 1985; Reed et al., 1988).

The nucleotide sequence homology between FPV.193 and minute virus of mice (MVM) (data not shown) was estimated to be 68%.
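The predicted Mr values can be approximated from the residue counts alone. The text does not state how the Mr values were derived, so the sketch below is an assumption: it uses a rule-of-thumb mean residue mass of 110 Da rather than summing the masses of the actual residues. The constant and function names are hypothetical.

```python
# Assumption: rule-of-thumb mean mass of one amino acid residue, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110

# Residue counts deduced from the FPV.193 sequence (from the text).
PROTEIN_LENGTHS = {"NS1": 668, "VP1": 727, "VP2": 584}


def estimate_mr_kilo(n_residues: int) -> int:
    """Approximate Mr in thousands (K) from the number of residues."""
    return round(n_residues * AVERAGE_RESIDUE_MASS_DA / 1000)


for name, n_residues in PROTEIN_LENGTHS.items():
    print(f"{name}: {n_residues} aa -> about {estimate_mr_kilo(n_residues)}K")
```

Under this approximation the three proteins come out at about 73K, 80K and 64K, in line with the values quoted above; a composition-based calculation could differ slightly.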
Regions of these genomes which are highly conserved include about 1.2 kb (approximate nt positions 730 to 1910) in the region coding for NS1; a smaller conserved sequence of about 340 nucleotides (approximate nt positions 2270 to 2610) includes the region of the small intron and the coding sequence of the N terminus of VP1. There is also a high degree of conservation at the genomic termini (nt 1 to nt 114 and nt 4918 to nt 4983). The amino acid sequence homology between proteins of FPV.193 and MVM (data not shown) was estimated to be 72.9% for NS1, 56.9% for VP1 and 54% for VP2. A 310 amino acid segment (aa 260 to aa 570) of NS1 which contains a high proportion of hydrophobic amino acids is highly conserved. A comparison between the capsid proteins VP1 and VP2 indicated conservation of some amino acid sequence unique to the VP1 protein (aa 1 to aa 87), and a lesser degree of conservation of amino acid

sequence common to both capsid proteins (aa 169 to aa 496, aa 589 to aa 690). The N terminus of the VP1 protein contains a high proportion of basic amino acids and is predicted to interact with and stabilize the charges of the DNA molecule in the virion.

Comparison of the nucleotide sequences of FPV.193, FPV.Carl (Carlson et al., 1985) and two strains of CPV (Reed et al., 1988; Rhode, 1985) and their amino acid translations (data not shown) enabled identification of base and amino acid differences between the four virus strains (Table 1). The 3' non-coding nucleotide sequences differ at 10 bases. The NS1 nucleotide sequence differs at 26 bases which would result in 13 amino acid changes. The small intron contains two nucleotide differences. The VP1 nucleotide sequence differs at 49 bases which would result in 27 amino acid changes. The 5' non-coding nucleotide sequences differ at 28 bases. The total number of differences for the four virus comparisons is 115 bases and 40 amino acids. Only 31 of 4983 bases and nine amino acids are consistently different between strains of FPV and CPV.

The 68% nucleotide sequence homology between FPV.193 and MVM, and the location of FPV ORFs and regulatory sequence elements, indicate that the organization of the FPV genome resembles that of the MVM and H-1 parvovirus genomes. The left large ORF encodes a non-structural protein and the right large ORF encodes the two capsid proteins. None of the smaller ORFs is likely to encode an exon of a viral protein. Although the 264 base ORF in the C strand of FPV RF DNA is preceded by the consensus donor (nt 525 to nt 528) and acceptor (nt 1997 to nt 2000) splice site sequences used in the production of a second non-structural protein NS2 for the MVM parvovirus (Jongeneel et al., 1986), the codon usage curve (data not shown) suggests that this ORF in the FPV genome is non-coding sequence.
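The assignment of large and small ORFs rests on a search of all three reading frames of both strands. A minimal sketch of such a search follows, assuming an ORF is taken to be any run of codons uninterrupted by a stop codon; this is not the Staden/MELBDBSYS implementation used in the study, and the function names, the `min_len` threshold and the toy sequence are all hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def stop_free_segments(seq: str, frame: int, min_len: int):
    """Yield (start, end) 0-based half-open intervals of stop-free codon
    runs in one reading frame (frame = 0, 1 or 2) that span >= min_len bases."""
    seq = seq.upper()
    run_start = frame
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            if i - run_start >= min_len:
                yield (run_start, i)
            run_start = i + 3
    # Close the final run at the last complete codon of this frame.
    end = frame + ((len(seq) - frame) // 3) * 3
    if end - run_start >= min_len:
        yield (run_start, end)


def six_frame_orfs(c_strand: str, min_len: int = 240):
    """Scan frames 1-3 of the C strand and of its reverse complement (V strand).

    V-strand coordinates are reported on the reverse-complemented sequence.
    """
    results = []
    for strand, seq in (("C", c_strand), ("V", reverse_complement(c_strand))):
        for frame in range(3):
            for start, end in stop_free_segments(seq, frame, min_len):
                results.append((strand, frame + 1, start, end))
    return results


# Toy sequence (hypothetical, not FPV): frame 1 holds a 12-base stop-free run.
toy = "ATGAAACCCGGGTAAATGTTTTAG"
print(list(stop_free_segments(toy, 0, 9)))  # [(0, 12)]
```

A length threshold is what separates the two large ORFs from the shorter ones listed in the text; codon usage, as the authors note, is a separate criterion that a scan of this kind does not capture.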
Further evidence that the FPV genome does not encode NS2 is provided by immunoprecipitation studies using antiserum to bacterial fusion proteins containing MVM-specific amino acid sequences (Cotmore & Tattersall, 1986) which showed that an NS1 protein but not an NS2 protein could be precipitated from lysates of FPV-infected cells. The 420 and 306 base ORFs in the V strand are predicted to be non-coding sequences on the basis that only the V strand hybridizes to mRNA (Pintel et al., 1983).

Features of the FPV.193 genome include two promoters at m.u. 4 and m.u. 39, potential introns in both large ORFs, and a single polyadenylation signal at m.u. 94. By analogy with the functional promoters in rodent parvovirus genomes, which were identified by in vitro transcription of RF DNA fragments (Lebovitz & Roeder, 1986; Pintel et al., 1983), the third consensus promoter sequence in the FPV.193 genome at m.u. 30


Table 1. Nucleotide and amino acid differences between two feline and two canine parvoviruses

Section of genome  Base position*  FPV.193  FPV.Carl  CPV.N  CPV.b

3' Noncoding

NS1

Small intron VP1

VP2

- 3† 46† 48† 233† 262/3† 431† 1007† 1009† 1314 1490† 1724 1745† 1800 1886 1892 1899 1952 1970 1986 1990 2051† 2159 2174 2192 2216 2225 2247/8 2250 2260/1 2346 2376 2423 2575 2621† 2773/4 3014 3015 3018† 3026† 3059† 3080 3082 3088† 3133/4 3474 3479† 3626 3706 3718 3747†

ATT 0 0 T AACC A A T G C T G T T T G T T A A A

G G T G A G T C A A A

Amino acid variation

Section of Base genome position* FPV.193 FPV.Carl C P V . N CPV.b VP2

000~ G G C 0000 G T C G T T A T T T C C T A T C

-~/ A T T A C T T G T T G A C

C

C

C

T

NC

C A G T GA G AC G C C A A TT T T A A A T T T GA G T T A G G

C A A C GA G AC A T C C A AA T T A A A C C T GA A T C A G G

C G G C GA G AC G T C A G TT T T G G C T T C GA A C T A G A

G A G C AG A CA G T G A G TT G G G G C T T C AT A C T C A A

Asp→Glu NC NC NC Glu→Arg Asp→Asn Asp→Ala

3818† 3844 3845/6 3879 3903 3906/8 3936 3941 3947† 4012 4055 4078 4106 4283 4284, 6 4301 4352 4403† 4464 4471† 4483† 4518/9 5' Non-coding 4536† 4548/9 4552/3 4563 4566 4585/6 4607/8 4623 4648 4652 4659/60 4663/4 4694 4701/2

cont.

NC~ His→Gln Thr→Ile Asp→Asn NC NC NC NC NC His→Gln Gln→Glu NC Pro→Leu Ile→Val Ile→Asn His→Gln

Asn→Lys Lys→Thr NC Leu→Gln Asn→Lys Tyr→Asp Lys→Arg NC Lys→Asn NC Ile→Thr Val→Ala Gly→Asp Val→Ile NC NC Gln→Pro Arg→Lys Asp→Asn

4747 4758/9 4774† 4781/2 4783 4786 4809/10 4823 4839 4853/4 4862 4876 4882†

A C 000 G G CCA C G T A A C T A G, T G T A G A C CC T 0 0 C A 0 0 C T G AA TT G 0

A T 000 G G CCA C A T C A C C A G, T A C A C A C CC T T 0 C A 0 0 C T A AA TT G 0

G C 000 T G CCA C A C A G C T A G, T A C C G G G CC C G T C A A G C 0 G AA TT 0 +

G C CTA G A 000 A A C A A T T G T, G A C C G G G GG C T 0 G G 0 0 G T G TT 00 G 0

0

+~t

+

0

A 0 A 0 T C 0 T T 0 A A T

G 0 A 0 T T 0 T 0 0 0 T T

A 0 0 0 C T 0 T T 0 A A C

A T 0 A T T G 0 T A A A C

Amino acid variation NC Thr→Ile Leu Asp→Tyr Asp→Asn Pro Gln→Lys NC NC Glu→Ala NC Thr→Ile NC NC Val→Leu NC NC NC Val→Leu Asn→Ser Ala→Gly Pro→Gly

* Base position refers to the FPV.193 sequence (Fig. 2).

† Consistent differences between FPV strains and CPV strains.

Other symbols are: -, base not sequenced; 0, base absent; NC, no change; + , additional copy of sequence from nt 4702 to nt 4762.
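The distinction between a difference found anywhere among the four strains and a difference that is consistent between FPV and CPV can be stated as a simple rule: a position is consistent when both FPV strains agree with each other, both CPV strains agree with each other, and the two groups differ. The sketch below illustrates that rule under stated assumptions. The function name and the toy positions 101 to 104 are hypothetical and are not rows of Table 1; "-" (base not sequenced) is ignored and "0" (base absent) is treated as a state, following the table's symbols. The regional totals are those given in the text.

```python
def classify_site(fpv_bases, cpv_bases):
    """Classify one aligned position from the FPV and CPV strain bases."""
    fpv = {b for b in fpv_bases if b != "-"}  # drop unsequenced entries
    cpv = {b for b in cpv_bases if b != "-"}
    if not fpv or not cpv:
        return "undetermined"
    if len(fpv | cpv) == 1:
        return "identical"
    if len(fpv) == 1 and len(cpv) == 1:
        return "consistent"  # host-specific: FPV strains vs CPV strains
    return "variable"


# Hypothetical positions: ((FPV.193, FPV.Carl), (CPV.N, CPV.b)).
toy_alignment = {
    101: (("A", "A"), ("G", "G")),
    102: (("A", "G"), ("A", "A")),
    103: (("C", "C"), ("C", "C")),
    104: (("T", "-"), ("0", "0")),
}
for position, (fpv, cpv) in toy_alignment.items():
    print(position, classify_site(fpv, cpv))

# Base differences per genome section, as stated in the text.
REGION_BASE_DIFFERENCES = {
    "3' non-coding": 10,
    "NS1": 26,
    "small intron": 2,
    "VP1": 49,
    "5' non-coding": 28,
}
print(sum(REGION_BASE_DIFFERENCES.values()))  # 115
```

The regional counts sum to the 115 base differences reported for the four-strain comparison.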

would not function. Similarly there are consensus intron sequences in the FPV.193 genome at those sites confirmed as introns in the MVM genome (Jongeneel et al., 1986; Morgan & Ward, 1986). FPV.193 has a potential large intron between nt 527 and nt 1998, and is predicted to have two alternative forms of the small intron in all three transcripts by splicing either nt 2274 or

nt 2310 to nt 2383. The AAUAAA sequence at m.u. 94 is likely to be the polyadenylation signal for all mRNAs transcribed from FPV.193 RF DNA because it has an associated secondary structure and the AAUAAA sequence at m.u. 31 occurs in the middle of the left ORF. The observed Mr values of the FPV proteins NS1, VP1 and VP2 are 85K, 79K and 66K respectively (Mengeling


et al., 1988). Whereas the predicted and observed Mr estimates agree for FPV capsid proteins, the observed Mr estimate for the FPV non-structural protein is significantly greater than the predicted value because extensive phosphorylation increases the actual Mr (Paradiso, 1984).

Parvovirus genomes have characteristic tandem repeats in the 5' non-coding region. The two FPV and two CPV sequences analysed contained two copies of the first 59 nt repeat but variable copies of the second 61 nt repeat located about 75 nucleotides further downstream. One (this report; Rhode, 1985), two (Carlson et al., 1985) and three (Reed et al., 1988) copies of this sequence have been reported. The significance of these two sets of direct repeats is uncertain. For FPV and CPV the conservation of the first repeat sequence suggests that it is essential for replication, whereas the difference in copy number of the second repeat implies one copy is adequate for this purpose. Proposals offered for this variation include optimization of packaging by adjusting the DNA length (Rhode, 1985) and enhancement of transcription (Astell et al., 1986). A transcription effect seems doubtful, however, because no enhancer activity was detected in chloramphenicol acetyltransferase fusion plasmids containing this sequence (Rhode & Richard, 1987).

Analysis of recombinant genomes of FPV and its host range variant CPV (Parrish et al., 1988; Parrish & Carmichael, 1986) and of MVM(p) and its tissue-specific variant MVM(i) (Antonietti et al., 1988; Gardiner & Tattersall, 1988a, b) has implicated the capsid proteins in host cell specificity. The 12 nucleotide and six amino acid substitutions differentiating the capsid proteins of FPV strains and CPV strains were identified independently in this study and in that of Parrish et al. (1988). Part of the determinant of canine host range (m.u. 59 to m.u.
64 in the genome) detected by recombination mapping between FPV and CPV genomes (Parrish et al., 1988) contains five base mutations and four amino acid differences: Lys to Asn, Ile to Thr, Val to Ala and Gly to Asp. However, only the two conservative changes Lys to Asn and Val to Ala differentiate between FPV strains and CPV strains. The mechanism by which variants of FPV capsid proteins would permit infection of canine cells is speculative. Perhaps one or more of the six amino acids specific to CPV capsid proteins confers on them some affinity for a surface protein on canine cells. On the other hand, if FPV can bind to receptors on canine cells in a fashion analogous to MVM(p) binding to lymphocytes (Spalholz & Tattersall, 1983), then presumably these amino acid substitutions would permit interaction between capsid proteins and factors involved in uncoating or packaging of viral DNA within canine cells. The role of mutations in the non-structural protein is

Fig. 3. Structure of the 3' terminus of FPV.193 virion DNA predicted from the nucleotide sequence. Numbering is from the 3' terminus of virion DNA. One type of arrow indicates bases that are deleted in CPV.N and a second type indicates bases present in CPV.N but deleted in FPV.193. Free energy (ΔG°) is -25.7 kJ/mol.
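The hairpin stem depends on the terminal sequence being palindromic, that is, on one arm being the reverse complement of the other. The sketch below checks this for two 21-base stretches read from the start of the Fig. 2 sequence (bases 1 to 21 and a matching stretch roughly 100 bases further in). The strings are a transcription of the extracted figure with spacing removed, so they should be treated as an assumption rather than as verified coordinates, and the names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def paired_positions(arm_a: str, arm_b: str) -> int:
    """Count positions at which arm_a can base-pair with arm_b when the
    two arms are aligned antiparallel (Watson-Crick pairs only)."""
    return sum(x == y for x, y in zip(reverse_complement(arm_a), arm_b))


# Transcribed from Fig. 2 (assumption: extraction spacing removed).
LEFT_ARM = "CTTTAGAACCAACTGACCAAG"   # bases 1-21
RIGHT_ARM = "CTTGGTCAGTTGGTTCTAAAG"  # matching stretch near the end of the palindrome

print(reverse_complement(LEFT_ARM) == RIGHT_ARM)  # True
print(paired_positions(LEFT_ARM, RIGHT_ARM))      # 21
```

All 21 positions pair, which is consistent with these bases forming the base of the stem drawn in Fig. 3; the check says nothing about the unpaired bases or the free energy of the full structure.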

also unknown. The three amino acid substitutions found in CPV strains identified in this study (Table 1) map outside the segments considered to function as an ATPase and a purine nucleotide-binding site (Anton & Lane, 1986; Astell et al., 1987). The conservative substitutions His to Gln at aa 247 and Thr to Ile at aa 248 are located centrally in the amino acid sequence of the NS1 protein. The other conservative substitution His to Gln at aa 595 occurs at the hydrophilic C terminus of NS1, which contains about 77% of all substitutions in NS1 observed between strains of FPV and CPV. In contrast, the majority of amino acid differences in the NS1 proteins of MVM(p) and MVM(i) cluster at the N terminus (Astell et al., 1986; Sahli et al., 1985).

Factors in canine cells may recognize CPV-specific nucleotide sequences necessary for replication. Only five of the 13 non-coding base changes differentiating between FPV and CPV can be associated with a known viral function. These five base changes affect the stability of the 3'-terminal hairpin involved in replication of FPV DNA. The stem of the hairpin is more stable in CPV DNA, which has extended the palindrome of FPV DNA by 3 bp due to the extra nucleotides ATT (Fig. 3). The loop of the hairpin is more stable in FPV DNA, which has perfectly paired 'arms', unlike CPV DNA with two unpaired G bases in one 'arm'. Nucleotide sequences at the 5' termini could not be compared because the FPV sequence was not available and the authenticity of the CPV sequence is not established. Reed et al. (1988) conceded that the CPV sequence may be duplicated in this region due to rearrangements in the recombinant plasmid or because of the high passage number of the virus in canine cells. It is possible that a mutation affecting the origin of replication in the 5' terminus of FPV DNA may have permitted it to replicate in canine cells.
The authors wish to thank Nino Ficorilli for technical assistance, Ted Stephens and Bob Ivison for photography and Melinda Cairns and Joan Caelli for typing the manuscript. This work was supported by Project Grant No. D28315867 from the Australian Research Council. J.C.M. was the recipient of a Melbourne University Postgraduate Scholarship.


References

ANTON, I. A. & LANE, D. P. (1986). Nonstructural protein 1 of parvoviruses: homology to purine nucleotide using proteins and early proteins of papovaviruses. Nucleic Acids Research 14, 7813.
ANTONIETTI, J. P., SAHLI, R. P., BEARD, P. & HIRT, B. (1988). Characterization of the cell type-specific determinant in the genome of minute virus of mice. Journal of Virology 62, 552-557.
ASTELL, C. R., GARDINER, E. M. & TATTERSALL, P. (1986). DNA sequence of the lymphotropic variant of minute virus of mice, MVM(i), and comparison with the DNA sequence of the fibrotropic prototype strain. Journal of Virology 57, 656-669.
ASTELL, C. R., MOL, C. D. & ANDERSON, W. F. (1987). Structural and functional homology of parvovirus and papovavirus polypeptides. Journal of General Virology 68, 885-893.
BENSIMHON, M., GABARRO-ARPA, J., EHRLICH, R. & REISS, C. (1983). Physical characteristics in eucaryotic promoters. Nucleic Acids Research 11, 4521-4540.
CARLSON, J., RUSHLOW, K., MAXWELL, I., MAXWELL, F., WINSTON, S. & HAHN, W. (1985). Cloning and sequence of DNA encoding structural proteins of the autonomous parvovirus feline panleukopenia virus. Journal of Virology 55, 574-582.
COTMORE, S. F. & TATTERSALL, P. (1986). Organization of nonstructural genes of the autonomous parvovirus minute virus of mice. Journal of Virology 58, 724-732.
COTMORE, S. F., STURZENBECKER, L. J. & TATTERSALL, P. (1983). The autonomous parvovirus MVM encodes two nonstructural proteins in addition to its capsid polypeptides. Virology 129, 333-343.
GARDINER, E. M. & TATTERSALL, P. (1988a). Evidence that developmentally regulated control of gene expression by a parvoviral allotropic determinant is particle mediated. Journal of Virology 62, 1713-1722.
GARDINER, E. M. & TATTERSALL, P. (1988b). Mapping of the fibrotropic and lymphotropic host range determinants of the parvovirus minute virus of mice. Journal of Virology 62, 2605-2613.
HIRT, B. (1967). Selective extraction of polyoma DNA from infected cell cultures. Journal of Molecular Biology 26, 365-369.
JONGENEEL, C. V., SAHLI, R., MCMASTER, G. K. & HIRT, B. (1986). A precise map of splice junctions in the mRNAs of minute virus of mice, an autonomous parvovirus. Journal of Virology 59, 564-573.
LEBOVITZ, R. M. & ROEDER, R. G. (1986). Parvovirus H-1 expression: mapping of the abundant cytoplasmic transcripts and identification of promoter sites and overlapping transcription units. Journal of Virology 58, 271-280.
LENGHAUS, C., MUN, T. K. & STUDDERT, M. J. (1985). Feline panleukopenia virus replicates in cells in which cellular DNA synthesis is blocked. Journal of Virology 53, 345-349.
MANIATIS, T., FRITSCH, E. F. & SAMBROOK, J. (1982). Molecular Cloning: A Laboratory Manual. New York: Cold Spring Harbor Laboratory.
MENGELING, W. L., RIDPATH, J. F. & VORWALD, A. C. (1988). Size and antigenic comparisons among the structural proteins of selected autonomous parvoviruses. Journal of General Virology 69, 825-837.
MESSING, J. (1983). New M13 vectors for cloning. Methods in Enzymology 101, 20-78.
MORGAN, W. R. & WARD, D. C. (1986). Three splicing patterns are used to excise the small intron common to all minute virus of mice RNAs. Journal of Virology 60, 1170-1174.
O'REILLY, K. J. & WHITAKER, A. M. (1969). The development of feline cell lines for the growth of feline infectious enteritis (panleucopaenia) virus. Journal of Hygiene 67, 115-124.

2753

PARADISO, P. R. (1984). Identification of multiple forms of the noncapsid parvovirus protein NCVP1 in H-1 parvovirus-infected cells. Journal of Virology 52, 82-87. PARRISH, C. R. & CARMICHAEL, L. E. (1986). Characterization and recombination mapping of an antigenic and host range mutation of canine parvovirus. Virology 148, 121-132. PARRISH, C. R., AQUADRO, C. F. & CARMICHAEL,L. E. (1988). Canine host range and a specific epitope map along with variant sequences in the capsid protein gene of canine parvovirus and related feline, mink, and raccoon parvoviruses. Virology 166, 293-307. PINTEL, D., DADACHANJI,D., ASTELL,C. R. & WARt), D. C. (1983). The genome of minute virus of mice, an autonomous parvovirus, encodes two overlapping transcription units. Nucleic Acids Research 11,

1019-1038. REED, A. P., JONES, E. V. & MILLER, T. J. (1988). Nucleotide sequence and genome organization of canine parvovirus. Journal of Virology 62, 266-276. RHODE, S. L., III (1985). Nucleotide sequence of the coat protein gene of canine parvovirus. Journal of Virology 54, 630-633. RHODE, S. L., III & PARADISO, P. R. (1983). Parvovirus genome: nucleotide sequence of H-1 and mapping of its genes by hybridarrested translation. Journal of Virology 45, 173-184. RHODE, S. L., III & RICHARD, S. M. (1987). Characterization of the trans-activation-responsive element of the parvovirus H-1 P38 promoter. Journal of Virology 61, 2807-2815. SArILI, R., MCMASTER, G. K. & HIRT, B. (1985). DNA sequence comparison between two tissue-specific variants of the autonomous parvovirus, minute virus of mice. Nucleic Acids Research 13, 3617-3633. SANGER, F., NICKLEN, S. & COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, U.S.A. 74, 5463-5467. SIEGL, G., BATES, R. C., BERNS, K. I., CARTER, B. J., KELLY, D. C., KURSTAK, E. & TATTERSALL, P. (1985). Characteristics and taxonomy of Parvoviridae. Intervirology 23, 61-73. SPALHOLZ, B. A. & TATrERSALL, P. (1983). Interaction of minute virus of mice with differentiated ceils: strain-dependent target cell specificity is mediated by intracellular factors. Journalof Virology46, 937-943. STADEN, R. (1982a). An interactive graphics program for comparing and aligning nucleic acid and amino acid sequences. Nucleic Acids Research 10, 2951-2961. STADEN, R. (1982b). Automation of the computer handling of gel reading data produced by the shotgun method of DNA sequencing. Nucleic Acids Research 10, 4731-4751. STUDDERT, M. J. & PETERSON, J. E. (1973). Some properties of feline panleukopenia virus. Archiv J~r die gesamte Virusforschung 42, 346-354. WlCKENS, M. & STEVHENSON, P. (1984). 
Role of the conserved AAUAAA sequence: four AAUAAA point mutations prevent messenger RNA 3' end formation. Science 226, 1045-1051. YANISCH-PERRON, C., VIEIRA, J. & MESSING, J. (1985). Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mpl8 and pUC19 vectors. Gene 33, 103-119.

(Received 25 April 1990; Accepted 24 July 1990)