MOLECULAR AND CELLULAR BIOLOGY, Nov. 1987, p. 4075-4081
Vol. 7, No. 11

0270-7306/87/114075-07$02.00/0 Copyright © 1987, American Society for Microbiology

Simultaneous Expression of Two P-Glycoprotein Genes in Drug-Sensitive Chinese Hamster Ovary Cells

JANE A. ENDICOTT, PETER F. JURANKA, FARIDA SARANGI, JAMES H. GERLACH,† KATHRYN L. DEUCHARS, AND VICTOR LING*

The Ontario Cancer Institute, The Princess Margaret Hospital, and The Department of Medical Biophysics, University of Toronto, Toronto, Ontario M4X 1K9, Canada

Received 22 May 1987/Accepted 7 August 1987

Overexpression of P-glycoprotein is characteristic of multidrug-resistant cells. We analyzed four P-glycoprotein transcripts that are simultaneously expressed in a drug-sensitive Chinese hamster ovary cell line. We concluded that these transcripts are encoded by two distinct members of a P-glycoprotein multigene family, each of which has two alternative polyadenylation sites. A comparison of the two hamster sequences with the single reported human and mouse P-glycoprotein cDNA sequences demonstrates that P-glycoprotein is a highly conserved protein, that the hamster multigene family is undergoing concerted evolution, and that differences between gene family members are maintained across species. These conserved differences suggest that there may be functional differences between P-glycoprotein molecules.

The presence of multidrug-resistant cells in human tumors and their selection and proliferation during chemotherapy may be major obstacles to cancer treatment (9). Mammalian cell lines selected for resistance to a single cytotoxic agent often develop pleiotropic resistance to unrelated drugs (9). Since many of the drugs involved are commonly used in chemotherapy, such cell lines have proven useful as in vitro models for the study of multidrug resistance in human tumors. A consistent alteration in multidrug-resistant cells is the overexpression of P-glycoprotein, an integral membrane protein of 170 kilodaltons (9).

A P-glycoprotein cDNA clone, pCHP1 (19), has been isolated by using a P-glycoprotein-specific monoclonal antibody (14). This clone was used to isolate the cDNA clones discussed in this paper. Independently, an in-gel renaturation technique (20) has also been used to isolate amplified sequences from a drug-resistant hamster cell line (21). These sequences were used to isolate homologous human (2) and mouse (10) cDNA clones. The human cDNA sequence (mdr1) was reconstructed from several overlapping clones isolated from a multidrug-resistant cell line cDNA library. The mouse sequence was determined from a single full-length cDNA clone isolated from a drug-sensitive cell line. The human cDNA sequence (mdr1) encodes a protein detected by P-glycoprotein-specific monoclonal antibodies (26). We show that the human and mouse genes are homologs of different hamster P-glycoprotein (pgp) genes. We refer to the human and mouse mdr sequences as pgp genes in this communication to facilitate comparison of the P-glycoprotein genes across species.

P-glycoprotein contains a tandem duplication in structure, with each half consisting of six transmembrane domains and a cytoplasmic domain containing the consensus sequences for an ATP-binding site (Fig. 1) (2, 8, 10; J.A.E. and P.F.J., unpublished observations). Amino acid and structural homology extends throughout each half of P-glycoprotein to HlyB, a membrane transport protein that is required for the transport of alpha-hemolysin (a 107-kilodalton protein) from bacterial cells (8, 16). This homology is sufficiently striking that P-glycoprotein can be regarded as a tandem duplication of the HlyB molecule (8). P-glycoprotein is thought to be a membrane "pump" which actively transports drug molecules from the cell, thereby reducing intracellular drug concentrations in multidrug-resistant cells (2, 8, 10).

We present the partial cDNA and deduced amino acid sequences of four different P-glycoprotein transcripts that are simultaneously expressed in drug-sensitive Chinese hamster ovary (CHO) cells. We conclude that these transcripts are encoded by two distinct members of a P-glycoprotein multigene family (pgp1 and pgp2), each of which has two alternative polyadenylation sites. This is the first direct confirmation of the P-glycoprotein multigene family hypothesis (4, 19). The gene family members are distinguished by nucleotide sequence differences in both the protein-coding and 3'-untranslated regions. We also compare the two hamster P-glycoprotein cDNA sequences with the single P-glycoprotein cDNA sequences isolated from mouse and human cells (2, 10). This comparison shows that the human P-glycoprotein sequence (mdr1) is more homologous to hamster pgp1 and that the mouse sequence has greater homology to hamster pgp2.

* Corresponding author.
† Present address: The Ontario Cancer Treatment and Research Foundation, Kingston Regional Cancer Centre, Kingston, Ontario K7L 2V7, Canada.

FIG. 1. Schematic representation of P-glycoprotein. The proposed transmembrane regions are marked by numbered boxes. The two consensus sequences that form the potential ATP-binding fold common to P-glycoprotein, HlyB, and the bacterial periplasmic transport systems are marked by solid boxes lettered A and B. The positions of the four cDNA clones, pL34, pL28, pL20, and pL2, are also shown. bp, Base pairs.
[Schematic not reproduced. Clone length labels in the figure: 1968 bp, 1689 bp, 2321 bp, and 826 bp.]

MATERIALS AND METHODS

cDNA cloning. Clones were isolated from an Okayama and Berg pCD vector library prepared from the drug-sensitive CHO cell line E29Pro+ (7). This cell line exhibits no characteristics of pleiotropic drug resistance and does not overexpress P-glycoprotein. pCHP1 (19) was used as a P-glycoprotein-specific probe to screen approximately 10^6 clones in two independent screenings of the cDNA library. A total of 24 pCHP1-positive clones were isolated. Clones were initially identified by Maxam and Gilbert sequencing (17) that extended from the poly(A) tail. Four clones (pL2, pL20, pL28, and pL34) (Fig. 1) were subcloned into either M13mp8 or M13mp9 phage as PstI or BamHI fragments. Single-stranded DNA was sequenced by primer extension, using the dideoxy method of Sanger et al. (22). The entire nucleotide sequence was determined on both strands for two independent subclones. The nucleotide sequences were assembled by using the data base programs of Staden (24). Clones were classified as either pgp1 or pgp2 transcripts on the basis of sequence homologies in both the 3'-coding and 3'-untranslated regions characteristic of each gene.

Nucleic acid sequence analysis. The 3'-untranslated regions were analyzed by using the National Biomedical Research Foundation RELATE program (3) and the Staden DIAGON program (24). The nucleotide sequence divergence of the protein-coding region was analyzed by the method of Perler et al. (18).
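As a rough illustration of what a nucleotide divergence measurement involves, the sketch below computes the uncorrected proportion of differing positions between two aligned sequences. This is not the method of Perler et al. (18), which is not reproduced here; the function name, the gap convention, and the toy sequences are assumptions made only for illustration.

```python
def uncorrected_divergence(seq_a, seq_b):
    """Return the fraction of compared aligned positions that differ.

    Assumes the two sequences are already aligned and of equal length.
    Positions where either sequence has a gap ('-') are skipped.
    Hypothetical helper for illustration only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # gap columns are not counted as substitutions
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differences / compared


# Toy aligned sequences (not taken from the paper): one mismatch in eight positions.
print(uncorrected_divergence("ACGTACGT", "ACGTACGA"))
```

An uncorrected proportion like this tends to underestimate divergence between distant sequences, because repeated substitutions at one site are counted once at most.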

RESULTS

Two P-glycoprotein genes simultaneously expressed in drug-sensitive CHO cells. The partial nucleotide sequences of two P-glycoprotein genes from drug-sensitive CHO cells are shown in Fig. 2. pgp1 is constructed from the overlapping clones pL28 and pL34, and pgp2 is constructed from pL20 and pL2. P-glycoprotein has a tandem internal duplication in structure, with each half encoding six potential transmembrane domains and a cytoplasmic domain containing the consensus sequences for a putative ATP-binding site (Fig. 1). The two gene transcripts encode the six potential transmembrane regions and the cytoplasmic domain of the C-terminal half of P-glycoprotein. The highest concentrations of nucleotide differences in the coding region are seen at the 5' ends of the cDNAs in the sequences that encode the six potential transmembrane regions and the short linking peptides between them. The nucleotide sequence that encodes the cytoplasmic domain is highly conserved; only 36 changes are found between the two hamster transcripts in a total sequence length of 873 bases. To align the sequences at the 3' end, an insertion of six bases has to be made in the pgp2 sequence.

Transcripts from pgp1 and pgp2 show alternate polyadenylation, producing 3'-untranslated regions of four different lengths (Fig. 2). There are hexamer sequences 10 to 30 nucleotides 5' to each polyadenylation site that may act as signal sequences for polyadenylation (1). Downstream from the pL20 and pL34 polyadenylation sites, pL2 and pL28 have a G+T-rich sequence which is believed to be an important consensus sequence on the 3' side of a potential polyadenylation site (1). For pgp1, 14 clones were polyadenylated at site 1 (pL34-like) and 2 clones were polyadenylated at site 2 (pL28-like). For pgp2, 3 clones were polyadenylated at site 1 (pL20-like) and 5 clones were polyadenylated at site 2 (pL2-like).

Two screenings of the cDNA library failed to produce any clones that were not either pgp1 or pgp2. This finding suggests that these sequences are the major P-glycoprotein transcripts in this drug-sensitive cell line. It is possible that other multigene family members may be expressed at very low levels. The distribution of differences between the pgp1 and pgp2 sequences strongly suggests that they are transcripts from two different P-glycoprotein genes and do not result either from differential splicing of a primary RNA transcript or from genomic rearrangement of a single gene.

Nucleotide sequence homology with human and mouse P-glycoprotein gene transcripts. A comparison of the single P-glycoprotein cDNA sequences isolated from human and mouse cells with the two hamster P-glycoprotein gene transcripts shows that the P-glycoprotein multigene family is highly conserved across species. This homology can be demonstrated by comparing both the protein-coding and 3'-untranslated regions.

The 3'-untranslated regions of the human and mouse P-glycoprotein gene transcripts were compared with the two hamster P-glycoprotein cDNA sequences in the corresponding region by using the Staden DIAGON program (results not shown). The mouse 3'-untranslated region was shown to be more homologous to that of hamster pgp2, while the human 3'-untranslated region was more homologous to the hamster pgp1 sequence. On the basis of this analysis, the human sequence was tentatively identified as the human homolog of hamster pgp1, and the mouse sequence was identified as the mouse equivalent of hamster pgp2.

To quantify the homology between the four 3'-untranslated regions, the sequences were also analyzed by using the National Biomedical Research Foundation RELATE program (Table 1). The segment comparison scores were calculated by using the unitary matrix with a bias of +2, a fragment length of 15, and 500 random runs per comparison. The scores are expressed in units of standard deviation (SD), the larger values corresponding to greater homology.
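The idea of a score expressed in SD units relative to randomized sequences can be illustrated with a simplified sketch. The code below is not the RELATE program and makes no claim to reproduce its algorithm: it scores fixed-length segments with an identity (unitary) matrix plus a constant bias, averages the best score found for each segment of the first sequence, and expresses that average as the number of standard deviations above the same statistic for shuffled copies of the second sequence. The function names, the choice of statistic, and the shuffling scheme are all assumptions.

```python
import random
import statistics


def best_segment_scores(a, b, length, bias=0):
    """For each segment of `a`, return its best score against any segment of `b`.

    Scoring is identity-based: 1 for a match, 0 for a mismatch, plus `bias`
    added at every position (an assumed reading of "unitary matrix with a bias").
    """
    best = []
    for i in range(len(a) - length + 1):
        seg_a = a[i:i + length]
        top = max(
            sum((x == y) + bias for x, y in zip(seg_a, b[j:j + length]))
            for j in range(len(b) - length + 1)
        )
        best.append(top)
    return best


def segment_comparison_score(a, b, length=15, bias=2, runs=500, seed=0):
    """Return how many SDs the observed mean best-segment score lies above
    the mean obtained with shuffled versions of `b` (hypothetical statistic)."""
    rng = random.Random(seed)
    observed = statistics.mean(best_segment_scores(a, b, length, bias))
    random_means = []
    for _ in range(runs):
        shuffled = list(b)
        rng.shuffle(shuffled)  # preserves base composition, destroys order
        random_means.append(
            statistics.mean(best_segment_scores(a, "".join(shuffled), length, bias))
        )
    sd = statistics.stdev(random_means)
    if sd == 0:
        raise ValueError("random runs show no variation; score is undefined")
    return (observed - statistics.mean(random_means)) / sd
```

In this sketch the bias shifts every segment score by the same constant, so it cancels out of the final value; it is kept only to mirror the parameters named in the text. Pure Python is slow for sequences of several hundred bases with 500 runs, so a smaller `runs` value is advisable when experimenting.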
Four nucleotide sequences were used: hamster pgp1 (1716 to 1968), hamster pgp2 (1966 to 2321), human pgp1 (3841 to 4222), and mouse pgp2 (3829 to 4189). The DIAGON results were confirmed; the hamster pgp1 3'-untranslated region showed greater homology to the human 3'-untranslated region (score = 13.0 SD units), and the hamster pgp2 sequence showed strong homology to the mouse 3'-untranslated region (score = 17.1 SD units). The two hamster sequences have much less homology between their 3'-untranslated regions (score = 3.2 SD units). The probability of obtaining a score of >6 SD units is 10 SD units is