
The Transcriptome Profile of Human Embryonic Stem Cells as Defined by SAGE

MARK RICHARDS,a SIEW-PENG TAN,a JEE-HIAN TAN,b WOON-KHIONG CHAN,b* ARIFF BONGSOa*

aDepartment of Obstetrics and Gynecology, National University of Singapore, National University Hospital, Singapore; bDepartment of Biological Sciences, National University of Singapore, Singapore

*Both authors are corresponding authors.

Key Words. SAGE · Human embryonic stem cells · Transcriptome · POU5F1 · REX1 · SOX2 · NANOG

ABSTRACT

Human embryonic stem (ES) cell lines that have the ability to self-renew and differentiate into specific cell types have been established. The molecular mechanisms for self-renewal and differentiation, however, are poorly understood. We determined the transcriptome profiles for two proprietary human ES cell lines (HES3 and HES4, ES Cell International), and compared them with murine ES cells and other human tissues. Human and mouse ES cells appear to share a number of expressed gene products although there are numerous notable differences, including an inactive leukemia inhibitory factor pathway and the high preponderance of several important genes like POU5F1 and SOX2 in human ES cells. We have established a list of genes comprised of known ES-specific genes and new candidates that can serve as markers for human ES cells and may also contribute to the “stemness” phenotype. Of particular interest was the downregulation of DNMT3B and LIN28 mRNAs during ES cell differentiation. The overlapping similarities and differences in gene expression profiles of human and mouse ES cells provide a foundation for a detailed and concerted dissection of the molecular and cellular mechanisms governing their pluripotency, directed differentiation into specific cell types, and extended ability for self-renewal. Stem Cells 2004;22:51-64

Correspondence: Woon-Khiong Chan, Ph.D., 14 Science Drive 4, S117543, Singapore. Fax: 65-6779-2486; e-mail: [email protected]; Ariff Bongso, Ph.D., D.Sc., National University Hospital, S119074, Singapore. Fax: 65-6779-4752; e-mail: [email protected] Received July 27, 2003; accepted for publication September 15, 2003. ©AlphaMed Press 1066-5099/2004/$12.00/0

INTRODUCTION

Immortal human embryonic stem (ES) cells and their derivatives promise to revolutionize the future of reparative medicine through the development of stem cell-based therapies [1-5]. ES cells form teratomas when injected into severe combined immunodeficient (SCID) mice [3, 5] and can differentiate into a variety of cell types from all three primitive germ layers in vitro and in vivo [5, 6-9]; this distinguishes ES cells from other stem cells. Several lines of evidence suggest that human and mouse ES cells do not represent equivalent embryonic cell types [10]. In vitro differentiation of human ES cells leads to the expression of AFP and HCG, which are typically produced by trophoblast cells in the developing human embryo, while mouse ES cells are generally believed not to differentiate along this lineage. In addition, human ES cells express stage-specific embryonic antigen (SSEA)-3, SSEA-4, tumor rejection antigen (TRA)-1-60, and TRA-1-81 surface antigens prior to differentiation but only SSEA-1 upon differentiation, while mouse ES cells only express SSEA-1 prior to differentiation [3, 5, 11]. Human ES cell lines have heterogeneous genetic backgrounds and appear to behave differently in culture. For example, not all human ES cell lines are amenable to bulk and feeder-free culture protocols, doubling times differ considerably between different lines, and the degree of spontaneous differentiation in vitro also appears to show much variation [12, 13].

Several groups have used comparative data from microarray studies to propose a blueprint for the molecular basis of “stemness” in human and mouse stem cells [14-16]. They have also demonstrated that a large proportion of the transcripts expressed in stem cells are expressed sequence tags (ESTs) with indeterminate functions. Recent evidence has suggested that a small, unique network of transcription factors, including Nanog, Oct-4, and Sox-2, may be sufficient to establish self-renewal and/or suppress lineage differentiation in mouse ES cells [17-21]. Nevertheless, despite the proposed stemness molecular blueprint, many of the genes and molecular mechanisms involved in self-renewal, pluripotency, and differentiation in human ES cells are poorly understood. Moreover, considering the uniqueness of the human ES cell phenotype and the difficulty in obtaining embryonic tissues and preimplantation embryos for research due to ethical reasons, it is probable that many novel genes important for the stemness phenotype in human ES cells remain to be discovered.

We have shown previously that undifferentiated, pluripotent human ES cell lines can be derived from the inner cell masses (ICMs) of 5-day-old human embryos [5, 13]. Since human ES cell lines are capable of differentiating into all three germ layers despite the reported differences in their behavior in vitro, we hypothesized that a quantitative comparison of the transcriptome profiles of selected human ES cell lines might allow the determination of key regulators involved in the maintenance of the stemness property, as previously defined [15, 16], as well as help identify a basis for line-specific cellular and behavioral differences.
Serial analysis of gene expression (SAGE) allows quantitative characterization of gene expression and has the added advantage over microarray expression profiling of being able to identify novel splice variants, exons, and genes [22-24]. Since SAGE libraries comprise discrete data, they can be subjected to pairwise comparison to statistically analyze the differential expression of genes [25] and to generate a comparative digital gene expression profile [24]. We have used SAGE to obtain the transcriptome profiles of two human ES cell lines, HES3 and HES4, which have different gender and ethnic backgrounds. SAGE should identify genes that comprise a distinct molecular signature of human ES cells. To delineate genes that were differentially regulated in human ES cells, the human ES SAGE libraries were subjected to pairwise comparisons with 21 normal and cancer SAGE libraries. Finally, comparison with the mouse ES SAGE library [26] was conducted to determine differences between the SAGE molecular signatures of ES cells from these two mammalian species.

MATERIALS AND METHODS

Colony Selection

HES3 (46XX, Chinese; passage 40) and HES4 (46XY, Caucasian; passage 40) cell lines (proprietary cell lines of ES Cell International; Singapore; http://www.escellinternational.com) were cultured on murine embryonic fibroblast (MEF) feeders. Human ES cell colonies were serially cultured according to protocols established previously [5]. Six-day-old human ES cell colonies that appeared morphologically undifferentiated (> 90%) were microdissected under a microscope. They routinely tested negative for two early differentiation markers, NEUROD1 and AFP, by quantitative real-time reverse transcription-polymerase chain reaction (qRT-PCR). Also, >95% of human ES cells stained positive for TRA-1-60, indicating minimal contamination with MEF or differentiated human ES cells [27]. Using the TRA-1-85 antibody, which detects the pan-human antigen Ok(a) that is present on all human cells but absent on mouse cells, fluorescence-activated cell sorting analysis indicated that >98.9% of the cells harvested were human ES cells [28]. Microdissection was performed with a sterile 30-G needle within the perimeter of the colony to avoid selecting MEFs adjacent to the colony edge, and care was taken to avoid harvesting differentiated regions of the colony. Serial passaging of human ES cell colonies by microdissection resulted in the growth of larger human ES cell colonies that were, on average, 200-300 µm in diameter. This made the selection of morphologically undifferentiated human ES cells easier.

SAGE Library Construction, Clone Preparation, and Sequencing

Total RNA was extracted from ~800,000 undifferentiated human ES cells using TRIzol (Invitrogen; Carlsbad, CA; http://www.invitrogen.com), and Poly [A+] RNA was subsequently prepared with Oligo(dT)-conjugated magnetic beads (Dynal Biotech; Oslo, Norway; http://www.dynal.no). SAGE library construction was performed with the I-SAGE kit (Invitrogen) according to the manufacturer’s protocol. The anchoring enzyme was NlaIII and the tagging enzyme used was BsmFI. Concatemerized ditags were cloned into pZERO-1™ (Ecogen; Barcelona, Spain; http://www.ecogen.com). The ligated products were transformed into One Shot® Top 10 electrocompetent Escherichia coli (Invitrogen), and transformants were selected on low-salt LB/zeocin agar plates. Blue-white selection was used to enhance the efficiency of selecting clones with longer concatemerized inserts [29]. Plasmids were prepared with the Wizard SV 96 plasmid purification kit (Promega; Madison, WI; http://www.promega.com). DNA sequencing was performed using ABI Big Dye V3.0 or V3.1 sequencing kit (Applied BioSystems; Foster City, CA; http://www.appliedbiosystems.com) and analyzed on the ABI 3100 capillary DNA sequencer (Applied BioSystems). The Gene Expression Omnibus (GEO) accession numbers for the human HES3 and HES4 SAGE libraries are GSM9220 and GSM9221, respectively.

qRT-PCR

Predesigned Assays on Demand and Assay by Design TaqMan™ probes and primer pairs were obtained from Applied BioSystems. Total RNA was extracted from undifferentiated and differentiated human ES cells and reverse transcribed using SuperScript II (Invitrogen). Differentiated human ES cells were obtained by subjecting them to high-density culture conditions for an extended period of 20 days. qRT-PCR analysis was conducted using the ABI PRISM 7000 Sequence Detection System (Applied BioSystems). After an initial denaturation for 10 minutes at 95°C, qRT-PCR was carried out using 40 cycles of PCR (95°C for 15 seconds, 60°C for 1 minute). Changes in gene expression levels were calculated using the ∆∆Ct method after the data (in triplicates) were normalized to the 18S rRNA levels. qRT-PCR experiments were repeated at least once with reproducible results.

RT-PCR

Gene expression was also determined by semiquantitative RT-PCR. Initial denaturation was carried out at 94°C for 2 minutes, followed by 35 cycles of PCR (94°C for 30 seconds, 55°C for 30 seconds, 72°C for 1 minute).
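The ∆∆Ct fold-difference calculation used for the qRT-PCR data can be illustrated with a minimal sketch. It assumes perfect doubling per cycle (a factor of 2), which the paper does not state explicitly, and it reports decreases as negative fold values to mirror the sign convention of Table 4. The substitution of a Ct of 40 for undetected targets follows the Table 4 footnote. The function names and the Ct values in the example are hypothetical and are not data from the paper.

```python
from typing import Optional

ND_CT = 40.0  # Ct substituted when a target is not detected after 40 cycles


def delta_ct(target_ct: Optional[float], reference_ct: float) -> float:
    """Normalize a target Ct to the reference gene (18S rRNA in the paper)."""
    ct = ND_CT if target_ct is None else target_ct
    return ct - reference_ct


def signed_fold_difference(
    undiff_target: Optional[float],
    undiff_ref: float,
    diff_target: Optional[float],
    diff_ref: float,
) -> float:
    """Fold difference of differentiated relative to undifferentiated cells.

    Positive values mean higher expression after differentiation; negative
    values mean lower expression (an assumed convention, as in Table 4).
    """
    ddct = delta_ct(diff_target, diff_ref) - delta_ct(undiff_target, undiff_ref)
    ratio = 2.0 ** (-ddct)  # assumes 100% amplification efficiency
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Hypothetical Ct values: the target appears 3 cycles later after differentiation.
print(signed_fold_difference(24.0, 10.0, 27.0, 10.0))
# Hypothetical case with no signal in undifferentiated cells (None -> Ct 40).
print(signed_fold_difference(None, 10.0, 35.0, 10.0))
```

Because the reference Ct is subtracted separately for each sample, differences in 18S rRNA levels between the undifferentiated and differentiated preparations change the result, so fold differences cannot be recomputed from target Ct values alone.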
Primers used for semiquantitative RT-PCR were:

ACTB: product 400 bp, 5′-TGGCACCACACCTTTCTACAATGAGC-3′, 5′-GCACAGCTTCTCCTTAATGTCACGC-3′;
BTF3: product 281 bp, 5′-GAACTGCTCGCAGAAAGAAG-3′, 5′-ACTAGTCAGACTATCCGCAC-3′;
CKS1B: product 409 bp, 5′-ACATGTCATGCTGCCCAAGG-3′, 5′-ACACTCAGCTTAGGCTGTGG-3′;
CLDN6: product 373 bp, 5′-AGATGCAGTGCAAGGTGTAC-3′, 5′-CAAGTGCAGCACAGCAACC-3′;
DNMT3B: product 433 bp, 5′-CTCTTACCTTACCATCGACC-3′, 5′-CTCCAGAGCATGGTACATGG-3′;
ERH: product 495 bp, 5′-GAATGAATCCCAACAGTCCC-3′, 5′-TGGAACCAACATTAAGTGACG-3′;
FLJ10713: product 285 bp, 5′-CAGAGAAGTCGAGGGAAGAG-3′, 5′-GCTCAGCTTCAATTGTTGGC-3′;
FLJ21837: product 449 bp, 5′-GCAGCTTCTGAACATTTGGAC-3′, 5′-GCAGTAGTCTAGAACACACC-3′;
GJA1: product 492 bp, 5′-GGAGTTCAATCACTTGGCGTG-3′, 5′-CTTACCATGCTCTTCAATACCG-3′;
HESX1: product 309 bp, 5′-GGATTTCATTCCCTAGCGTGG-3′, 5′-GTGATTCTCTATGGGACCTTTTC-3′;
HMGA1: product 469 bp, 5′-GAAGTGCCAACACCTAAGAG-3′, 5′-AGTGGGATGTTAGCCTTGTC-3′;
LIN-28: product 420 bp, 5′-AGTAAGCTGCACATGGAAGG-3′, 5′-ATTGTGGCTCAATTCTGTGC-3′;
NANOG: product 493 bp, 5′-GGCAAACAACCCACTTCTGC-3′, 5′-TGTTCCAGGCCTGATTGTTC-3′;
NPM1: product 343 bp, 5′-TGGTGCAAAGGATGAGTTGC-3′, 5′-GTCATCATCTTCATCAGCAGC-3′;
POU5F1: product 247 bp, 5′-CGRGAAGCTGGAGAAGGAGAAGCTG-3′, 5′-CAAGGGCCGCAGCTTACACATGTTC-3′;
REX1: product 418 bp, 5′-TCTAGTAGTGCTCACAGTCC-3′, 5′-TCTTTAGGTATTCCAAGGACT-3′;
SOX2: product 370 bp, 5′-CCGCATGTACAACATGATGG-3′, 5′-CTTCTTCATGAGCGTCTTGG-3′; and
TNFRSF6: product 396 bp, 5′-AGAGTGACACACAGGTGTTC-3′, 5′-TGGCAGAATTGGCCATCATG-3′.

SAGE Data Analysis

Tag extraction and pairwise comparison were performed with the SAGE2000 software v.B (Invitrogen) and database construction and management with Microsoft Access and Excel. Tags with ambiguous bases, duplicate ditags, and ditags with abnormal length (< 22 or > 24 bp) were removed by SAGE2000. The SAGE tag to gene database based on UniGene Build #157 was used. Approximately 60% of all SAGE tags match more than one clustered UniGene entry [22, 30]. To partially overcome the problem of multiple ambiguous tag-to-gene assignments associated with the SAGE technique, we used two publicly available SAGE resources, the CGAP SAGEgenie (http://cgap.nci.nih.gov/SAGE/AnatomicViewer) [24] and the NCBI SAGEmap (http://www.ncbi.nih.gov/SAGE/) [31, 32] to assist in identifying the best SAGE tag for a particular gene. The assignment of molecular function of the proteins was based on the LocusLink database (http://www.ncbi.nih.gov/LocusLink/).

Statistical Treatment

The Z-test [33], based on the normal approximation of the binomial distribution, was used to determine p values for all pairwise library comparisons:

$$Z = \frac{p_1 - p_2}{\sqrt{p_0(1 - p_0)/N_1 + p_0(1 - p_0)/N_2}}$$
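The Z statistic above can be sketched in a few lines of Python. The paper does not define its symbols, so this sketch assumes the usual reading: p1 and p2 are the proportions of a given tag in the two libraries, N1 and N2 are the total tag counts of those libraries, and p0 is the pooled proportion across both. The function name is hypothetical; the tag counts and library sizes in the example are those reported for the first tag of Table 2 and for the HES3 and HES4 libraries.

```python
import math


def sage_z_test(count1: int, size1: int, count2: int, size2: int):
    """Return (Z, two-sided p value) for one tag counted in two SAGE libraries."""
    if count1 + count2 == 0:
        raise ValueError("tag must be observed in at least one library")
    p1 = count1 / size1
    p2 = count2 / size2
    # Pooled proportion under the null hypothesis (assumed definition of p0).
    p0 = (count1 + count2) / (size1 + size2)
    standard_error = math.sqrt(p0 * (1 - p0) / size1 + p0 * (1 - p0) / size2)
    z = (p1 - p2) / standard_error
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


HES3_TAGS = 67_807  # total tags sequenced from HES3
HES4_TAGS = 77_208  # total tags sequenced from HES4

# Tag CTGAGACGAA: 73 copies in HES3 and 1 copy in HES4 (Table 2).
z, p = sage_z_test(73, HES3_TAGS, 1, HES4_TAGS)
print(f"Z = {z:.2f}, two-sided p = {p:.1e}")
```

Under these assumptions the example gives Z = 8.95, the value listed for this tag in Table 2, which suggests the pooled-proportion reading of p0 matches the calculation used there.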

Since no a priori knowledge about the direction of the effects is available in SAGE experiments, all decision rules were formulated for a two-sided test of the null hypothesis [25]. The GEO accession numbers for the human SAGE libraries used were: GSM1498, GSM693, GSM765, GSM670, GSM671, GSM755, GSM731, GSM678, GSM686, GSM757, GSM761, GSM676, GSM728, GSM708, GSM668, GSM709, GSM785, GSM762, GSM719, GSM716, and GSM784. Excel analysis was used to determine the union/intersection of the 21 pairwise statistical tests. Monte Carlo simulation was also carried out to compare the HES3 and HES4 SAGE libraries using the SAGE2000 software.

RESULTS

HES3 and HES4 SAGE Libraries

Transcriptional profiling of mRNA isolated from undifferentiated human ES cells was performed using SAGE. Undifferentiated human ES colonies were carefully selected and individual SAGE libraries were constructed. A combined total of 145,015 SAGE tags were sequenced from HES3 (67,807) and HES4 (77,208) SAGE libraries. This translated into 31,852 distinct transcripts. Approximately 64.2% (20,447) of these distinct transcripts were found only once in the combined human ES SAGE library (HES3: 9,977; HES4: 10,470). This is probably indicative of the abundance of rare transcripts in human ES cells, although it is possible that some singletons might have resulted from sequencing errors or leaky transcription as a result of epigenetic dysregulation [34]. A large proportion of the singletons that could be reliably assigned to UniGene clusters matched ESTs or hypothetical genes (46.2%), although we have also noted the existence of distinct transcripts that matched to genes like FOXD3 and GBX2, which have been previously identified to be important to mouse ES cells or are expressed in the ICM of mouse blastocysts. We omitted these singletons from our analysis to provide a more accurate estimation of distinct transcripts.

Few early markers of differentiation were detected in our human ES SAGE libraries. Tags for early ectodermal markers of differentiation like SOX1, NESTIN, and βIII-TUBULIN; early endodermal markers like PDX1, MIXER, and SOX17b; and mesodermal markers like cardiac ACTIN and β-GLOBIN were not detected in either SAGE library, indicating that contamination of our starting material with differentiated cells was indeed very low.

The Overall Transcriptome Profiles of HES3 and HES4 Are Similar

The exclusion of singletons from the combined human ES SAGE dataset left us with 11,404 distinct transcripts.
Among these transcripts, 1.0% had more than 135 copies, 11.2% had between 14-135 copies, 14.1% had between 7-13 copies, and 73.7% had fewer than 6 copies. Altogether, 1,511 distinct transcripts (13.2%) could not be reliably assigned (orphan transcripts) to UniGene clusters. The remaining 9,893 distinct transcripts were matched to 12,721 UniGene clusters. Of these, 4,475 (37.6%) matched only ESTs or hypothetical genes, while 313 (2.5%) have unknown functions (Fig. 1A). A putative functional breakdown of the genes expressed in HES3 and HES4 revealed that a preponderance of the genes are involved in DNA repair, stress responses, apoptosis, cell cycle regulation, and development (Fig. 1A). Based on the presence of numerous distinct transcripts that could not be reliably assigned to UniGene clusters and the prevalence of hypothetical proteins and ESTs, we conclude that a large proportion of the mRNA species in human ES cells is likely to be novel and expressed only in ES cells or cells derived from the ICM. Of the 9,917 and 9,828 distinct transcripts that were identified in the HES3 and HES4 SAGE libraries, respectively, 8,341 were common to both (Fig. 1B). Moreover, most of the 3,063 transcripts that were not detected in both human ES cell lines are rare transcripts […]

[…] encode for the 73 ribosomal proteins were, on average, about 3.70-8.41 times more abundant than in normal tissues like brain cortex, cerebellum, colon, kidney, stomach, and liver. Only the pancreas has a higher proportion of SAGE tags that were derived from ribosomal genes. This indicates that human ES cells must devote a large proportion of their cellular resources to the synthesis of proteins, which is certainly not unexpected given the rapid cellular proliferation rate of human ES cells.

Genes Differentially Expressed in HES3 and HES4

Although the general transcriptome profiles of the two human ES cell lines we profiled were similar, a number of genes were found to be differentially represented. A pairwise comparison of the HES3 and HES4 SAGE libraries (Fig. 1C) using the Z-test statistical analysis (p ≤ 0.01) and fold differences revealed 175 differentially expressed transcripts. Monte Carlo simulation gave identical results (data not shown). A list of 25 differentially expressed HES3 and HES4 genes with the greatest fold difference is presented in Table 2. Most conspicuously, the transcript for REX1 was absent in the HES4 line. SNPs and splice variants/isoforms account for some of the differences in the HES3 and HES4 SAGE transcriptomes. For example, six differentially expressed genes were found to have two assigned SAGE tags. RPS27A, NDUFB1, and BTF3 were represented by two different SAGE tags containing an SNP within each tag sequence, while the second alternative tag for TPI1, FSCN1, and SLC2A3 resulted from the expression of a second isoform in HES3. Several transcription factors, REX1, BTF3, ZFX, and XBP1, were upregulated in HES3, but only CTBP1 was upregulated in HES4. In contrast to HES3, the upregulated genes in HES4 included mainly ribosomal proteins, cytoskeletal proteins, and enzymes involved in metabolic pathways, which probably reflect the higher metabolic and proliferation rates of HES4. Three genes, LECT1, TGFα, and IFRD1, which are associated with differentiation, were upregulated in HES3, perhaps indicative of a small subpopulation of differentiating cells. Some of the cell line-specific differential gene expression could be attributed to different gender backgrounds. For instance, the Y-linked RPS4 was found only in HES4, while all five X-linked genes were more highly expressed in HES3. About 8.7% of the differentially expressed transcripts were ESTs or hypothetical proteins and 9.1% were orphan SAGE tags.

[Figure 1C. Scatter plot of tag counts in the HES3 and HES4 (77,208 tags) SAGE libraries on logarithmic axes, with points marked by p value; labeled points include RPS4Y, NDUFB1, BTF3, and the tag TACCAATGAT.]

Genes Differentially Upregulated in Human ES Cells

To determine genes that were upregulated in ES cells, we compared the combined human ES SAGE dataset with 21 publicly available SAGE libraries from normal adult and

fetal peripheral tissues and cancer tissues. Upregulated transcripts were identified based on p values (p < 0.01) and fold differences (fold difference > 4) in 21 pairwise comparisons. The 192 upregulated transcripts included known […]

Table 1. The 30 most abundant transcripts expressed in human ES cells

Tag sequence | HES3 | HES4 | Total | UniGene | Description | Function
TACCATCAAT | 439 | 515 | 954 | 169476; 79877 | Glyceraldehyde-3-phosphate dehydrogenase; Myotubularin-related protein 6 | Metabolism; Signal transduction
TGTGTTGAGA | 501 | 403 | 904 | 181165 | Eukaryotic translation elongation factor 1 α1 | Protein synthesis
TGTACCTGTA | 343 | 388 | 731 | 334842 | Tubulin, alpha, ubiquitous | Cytoskeletal
TAGGTTGTCT | 243 | 423 | 666 | 401448 | Tumor protein, translationally controlled 1 | Apoptosis
CCTAGCTGGA | 222 | 383 | 605 | 401787 | Peptidylprolyl isomerase A (cyclophilin A) | Protein modification
GAAGCAGGAC | 217 | 339 | 556 | 180370 | Cofilin 1 (non-muscle) | Cytoskeletal
ACTTTTTCAA | 285 | 265 | 550 | 133430; 156814 | ESTs; KIAA0377 gene product | EST/Hypothetical; EST/Hypothetical
TGAAATAAAA | 204 | 180 | 384 | 355719; 48516 | Nucleophosmin (numatrin); ESTs | Nucleolus; EST/Hypothetical
TGTTCTGGAG | 197 | 173 | 370 | 74471 | Gap junction protein, α1 (connexin 43) | Gap junction
GGTCCAGTGT | 133 | 193 | 326 | 181013 | Phosphoglycerate mutase 1 (brain) | Metabolism
TCCCTATTAA | 147 | 179 | 326 | | No reliable UniGene match | No UniGene match
GCATTTAAAT | 127 | 186 | 313 | 421608 | Eukaryotic translation elongation factor 1 β2 | Protein synthesis
GGCTGGGGGC | 95 | 215 | 310 | 408943; 352407 | Profilin 1; Chromosome 1-amplified sequence 3 | Cytoskeletal; EST/Hypothetical
TGGGCAAAGC | 126 | 178 | 304 | 256184 | Eukaryotic translation elongation factor 1γ | Protein synthesis
TTGGAGATCT | 131 | 171 | 302 | 50098 | NADH dehydrogenase (ubiquinone) 1α subcomplex, 4, 9kDa | Metabolism
TTGGTGAAGG | 191 | 109 | 300 | 75968; 426138 | Thymosin, β4, X chromosome; Human promyelocytic leukemia cell mRNA, clones pHH58 and pHH81 | Cytoskeletal; EST/Hypothetical
TCCCCGTACA | 200 | 93 | 293 | | No reliable UniGene match | No UniGene match
AGCACCTCCA | 139 | 141 | 280 | 75309 | Eukaryotic translation elongation factor 2 | Protein synthesis
TGAGGGAATA | 76 | 184 | 260 | 83848 | Triosephosphate isomerase 1 | Metabolism
TTGGGGTTTC | 137 | 119 | 256 | 418650 | Ferritin, heavy polypeptide 1 | Transport; iron
CTAGCCTCAC | 110 | 140 | 250 | 14376; 356575 | Actin, γ1; ESTs | Cytoskeletal; EST/Hypothetical
TGTAATCAAT | 127 | 119 | 246 | 376844 | Heterogeneous nuclear ribonucleoprotein A1 | RNA processing
GCACAAGAAG | 98 | 136 | 234 | 289721 | Homo sapiens mRNA; cDNA DKFZp564D0164 | EST/Hypothetical
CTCCTCACCT | 81 | 132 | 213 | 93213; 389335 | BCL2-antagonist/killer 1; Ribosomal protein L13a | Apoptosis; Protein synthesis
ATTGTTTATG | 107 | 101 | 208 | 181163 | High-mobility group nucleosomal binding domain 2 | Chromatin regulation
GCCTTCCAAT | 78 | 129 | 207 | 76053 | DEAD/H box polypeptide 5 (RNA helicase) | RNA processing
GGAATGTACG | 83 | 124 | 207 | 429 | ATP synthase, H+ transporting, mitochondrial F0 complex, subunit c (subunit 9) isoform 3 | Transport; ion
ACTCCAAAAA | 75 | 131 | 206 | 75914 | Coated vesicle membrane protein | Transport; protein
CTGTTGATTG | 116 | 87 | 203 | 376844 | Heterogeneous nuclear ribonucleoprotein A1 | RNA processing
GAAACAAGAT | 118 | 76 | 194 | 78771 | Phosphoglycerate kinase 1 | Metabolism
(additional match among the last six tags; row not recoverable) | | | | 380159 | KIAA1393 protein | EST/Hypothetical

Orphan tags that cannot be assigned to any UniGene clusters are listed as “No reliable UniGene match.” Ribosomal genes, mitochondrial genes, and tags that match to more than three different UniGene entries have been omitted from this list.
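Raw tag counts such as those in Table 1 depend on how many tags were sequenced from each library, so SAGE values are often put on a common scale as tags per million (tpm), the unit used for the SAGE columns of Table 4. The sketch below shows that conversion using the library sizes reported in the Results; the function name is hypothetical, and the paper does not describe its own rounding.

```python
def tags_per_million(tag_count: int, library_size: int) -> float:
    """Scale a raw SAGE tag count to tags per million (tpm)."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return tag_count * 1_000_000 / library_size


HES3_TAGS = 67_807  # total tags sequenced from HES3
HES4_TAGS = 77_208  # total tags sequenced from HES4

# Most abundant tag in Table 1 (TACCATCAAT): 439 copies in HES3, 515 in HES4.
print(round(tags_per_million(439, HES3_TAGS)))
print(round(tags_per_million(515, HES4_TAGS)))

# A tag seen 11 times in HES3 corresponds to roughly 162 tpm.
print(round(tags_per_million(11, HES3_TAGS)))
```

The last example is consistent with the 162 tpm listed for REX1 in HES3 in Table 4, which corresponds to 11 tags in a library of 67,807.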

Table 2. The top 25 differentially expressed transcripts in HES3 or HES4 cells showing the greatest fold difference

Tag sequence | HES3 | HES4 | Total | UniGene | Description | Z value | Fold diff
CTGAGACGAA | 73 | 1 | 74 | 101025 | Basic transcription factor 3 | 8.95 | 83.1
CTGAGACAAA | 1 | 52 | 53 | 101025 | Basic transcription factor 3 | 6.55 | 45.7
TGATTTCACT | 120 | 5 | 125 | | Mitochondrial protein | 11.04 | 27.3
CTCTGTTGAT | 20 | 1 | 21 | 83383 | Peroxiredoxin 4 | 4.45 | 22.8
AAGAATTTGA | 16 | 1 | 17 | 183435 | NADH dehydrogenase (ubiquinone) 1β subcomplex, 1, 7kDa | 3.91 | 18.2
CACGCGCTCA | 15 | 1 | 16 | † | † | 3.77 | 17.1
GAATGAGGAC | 13 | 1 | 14 | † | † | 3.46 | 14.8
GAATCCAACT | 11 | 1 | 12 | † | † | 3.12 | 12.5
GGGAGTGTTG | 1 | 12 | 13 | † | † | 2.82 | 10.5
CATTGAAGGG | 9 | 1 | 10 | † | † | 2.74 | 10.3
TTTGTGACTG | 2 | 17 | 19 | † | † | 3.17 | 7.5
GTCACTCATA | 13 | 2 | 15 | † | † | 3.10 | 7.4
GTGCCCGTGC | 91 | 0 | 91 | 83848 | Triosephosphate isomerase 1 | 10.18 | 103.6
TACCAATGAT | 0 | 105 | 105 | | No reliable UniGene match | 9.61 | 92.2
AAAATTTACA | 29 | 0 | 29 | 97932 | Leukocyte cell-derived chemotaxin 1 | 5.75 | 33.0
AAGAATCTGA | 0 | 36 | 36 | 183435 | NADH dehydrogenase (ubiquinone) 1β subcomplex, 1, 7kDa | 5.62 | 31.6
TGCTCCGGGT | 26 | 0 | 26 | | No reliable UniGene match | 5.44 | 29.6
GCTGCTATTT | 0 | 20 | 20 | 23395 | Myosin IXA | 4.19 | 17.6
CGCACAATCA | 14 | 0 | 14 | 278959 | Galanin | 3.99 | 15.9
AAGAGGAGAC | 13 | 0 | 13 | 284216 | Hypothetical protein FLJ10283 | 3.85 | 14.8
TGAAGGATGC | 0 | 16 | 16 | 180911 | Ribosomal protein S4, Y-linked | 3.75 | 14.1
ATGTGACTGT | 12 | 0 | 12 | | No reliable UniGene match | 3.70 | 13.7
CATCTCACTC | 12 | 0 | 12 | 118400 | Fascin homolog 1, actin-bundling protein | 3.70 | 13.7
TCATAGCCCT | 11 | 0 | 11 | 335787 | Rex-1 | 3.54 | 12.5
TTTCTTAACA | 11 | 0 | 11 | | No reliable UniGene match | 3.54 | 12.5

† UniGene matches for these seven tags, in listed order (assignment to individual rows is not recoverable from the source layout): 24301, Polymerase (RNA) II polypeptide E, 25kDa; 101299, Cullin 5; 167791, Reticulocalbin 1; 125435, EST; 433328, Neuronal protein 17.3; 44143, Polybromo 1; 97848, EST; 79026, Myeloid leukemia factor 2; 343926, C-terminal binding protein 1; 285317, Hypothetical protein FLJ12891; 376146, Homo sapiens cDNA FLJ39106.

Orphan tags that cannot be assigned to any UniGene clusters are listed as “No reliable UniGene match.” A Z value of >3.30 corresponds to p value of […]
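The fold differences in Table 2 can be approximated from the raw counts with a short sketch. Two points are assumptions inferred from the tabulated values rather than stated in the Methods: counts are normalized to library size before taking the ratio, and a count of zero is replaced by one tag so that the ratio stays finite. The function name and the `zero_substitute` parameter are hypothetical.

```python
def fold_difference(
    count_a: int,
    size_a: int,
    count_b: int,
    size_b: int,
    zero_substitute: int = 1,
) -> float:
    """Ratio of the higher to the lower library-normalized tag frequency.

    A count of zero is replaced by `zero_substitute` (assumed to be 1 tag).
    """
    a = count_a if count_a > 0 else zero_substitute
    b = count_b if count_b > 0 else zero_substitute
    rate_a = a / size_a
    rate_b = b / size_b
    return max(rate_a, rate_b) / min(rate_a, rate_b)


HES3_TAGS = 67_807  # total tags sequenced from HES3
HES4_TAGS = 77_208  # total tags sequenced from HES4

# Counts (HES3, HES4) for three tags from Table 2.
for tag, hes3, hes4 in [
    ("CTGAGACGAA", 73, 1),
    ("GTGCCCGTGC", 91, 0),
    ("TACCAATGAT", 0, 105),
]:
    fold = fold_difference(hes3, HES3_TAGS, hes4, HES4_TAGS)
    print(f"{tag}: {fold:.1f}-fold")
```

With these assumptions the three examples come out at 83.1, 103.6, and 92.2, in agreement with the fold differences listed for those tags in Table 2.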

Table 4. Real-time RT-PCR gene expression between undifferentiated and differentiated human ES cells

SAGE values are in tpm; real-time RT-PCR values are mean CT (± variation as given in the source).

Gene | SAGE HES3 (tpm) | SAGE HES4 (tpm) | HES3 UnDiff* | HES3 Diff† | HES3 FD‡ | HES4 UnDiff* | HES4 Diff† | HES4 FD‡
POU5F1 | 959 | 945 | 24.7 (±0.01) | 27.4 (±0.10) | -13.9 | 24.3 (±0.06) | 27.4 (±0.10) | -7.5
SOX2 | 413 | 590 | 24.1 (±0.20) | 27.8 (±0.10) | -11.0 | 24.6 (±0.10) | 26.3 (±0.07) | -5.33
REX1 | 162 | 0 | 25.3 (±0.05) | 27.7 (±0.10) | -4.1 | nd | nd | –
HESX1 | 15 | 52 | 30.9 (±0.08) | 35.3 (±0.10) | -36.4 | 31.1 (±0.09) | 36.2 (±0.06) | -58.8
DNMT3B | 295 | 251 | 23.1 (±0.10) | 26.7 (±0.20) | -10.1 | 24.0 (±0.20) | 26.6 (±0.10) | -10.3
ERH | 428 | 973 | 25.9 (±0.04) | 26.8 (±0.05) | -1.5 | 26.1 (±0.03) | 26.3 (±0.05) | -1.0
STAT3 | 0 | 13 | 28.9 (±0.06) | 29.5 (±0.10) | -2.8 | 29.4 (±0.10) | 30.3 (±0.10) | -2.8
LIF | 0 | 0 | 32.7 (±0.20) | 32.0 (±0.02) | +1.1 | 32.7 (±0.04) | 33.3 (±0.15) | -1.9
LIFR | 0 | 0 | 37.5 (±0.30) | 32.0 (±0.07) | +12.9 | 37.5 (±0.40) | 33.0 (±0.06) | +19.0
IL6ST | 0 | 0 | 31.7 (±0.10) | 26.7 (±0.04) | +11.1 | 31.7 (±0.20) | 28.5 (±0.05) | +4.5
AFP | 15 | 13 | nd | 25.1 (±0.04) | +30,153 | 37.7 (±0.40) | 25.0 (±0.20) | +7000
BMP4 | 15 | 0 | 30.7 (±0.09) | 26.8 (±0.04) | +7.5 | 30.4 (±0.10) | 27.8 (±0.10) | +4.2
NEUROD1 | 0 | 0 | nd | 38.29 (±0.3) | +4.4 | nd | 33.4 (±0.10) | +96.7
FGF4 | 0 | 0 | nd | nd | – | nd | nd | –

* UnDiff = undifferentiated human ES cells.
† Diff = differentiated human ES cells from high-density cultures undergoing spontaneous differentiation in vitro.
‡ FD = fold difference in relative mRNA levels of the target gene in undifferentiated and differentiated human ES samples calculated by the ∆∆CT method using 18S rRNA as the normalized internal standard. For genes that were not detectable in undifferentiated cells, a CT of 40 was substituted to calculate fold differences in gene expression.
nd = not detected after 40 PCR cycles; tpm = tags per million.

Figure 2. Scatter plots showing the comparative distribution of distinct transcripts in four selected tissues. The combined human ES SAGE library was compared with (A) embryonic kidney, (B) adult kidney, (C) medulloblastoma (GSM 693), and (D) ovarian carcinoma (GSM 731). Tag frequencies were plotted on a logarithmic scale and p values calculated using the Z-test.

[…] (Table 4). De novo methylation of genomic DNA is a developmentally regulated process that is believed to play a pivotal role in development, genome imprinting, and gene silencing in mammals [38, 39]. LIN28, an RNA-binding and heterochronic gene, was downregulated during ES differentiation. LIN28 is a negative regulator controlling the embryonic development of a variety of somatic cell types in many organisms [40]. Downregulation of LIN28 expression has also been associated with progression to differentiation in embryonal carcinoma cells. Other genes, such as CLDN6, GJA1, CKS1B, ERH, and HMGA1, were expressed in some peripheral tissues, but the expression levels appeared to be much higher in human ES cells. However, no marked decline in the expression of these genes was detected during the onset of ES differentiation. Of the five transcription factors assayed by qRT-PCR (Table 4), HESX1 gene expression showed the most dramatic decline during ES differentiation. However, HESX1 was also expressed in several peripheral adult and fetal tissues.

Figure 3. Gene expression of candidate human ES-specific marker genes. Transcriptional analysis of the 19 genes and ACTB, which is included as loading control, was carried out by RT-PCR with total RNA prepared from fetal brain, fetal liver, adult brain, placenta, adult testis, adult kidney, adult lung, adult heart, undifferentiated (7D) HES3 and HES4 cells, and differentiated (20D) HES3 and HES4. Input RNA amounts were controlled for all first-strand RT reactions. Ten percent of the PCR product was loaded into each lane and analyzed on a 1.5% agarose gel. Genes shown: POU5F1, SOX2, NANOG, REX1, HESX1, FLJ14549, FLJ10713, FLJ21837, DNMT3B, LIN-28, NPM1, Otoconin 90, CLDN6, GJA1, CKS1B, ERH, HMGA1, TNFRSF6, BTF3, and ACTB.

DISCUSSION

We used SAGE to obtain the transcriptome profiles of the human ES cell lines HES3 and HES4. The profiles of these two human ES cell lines were largely similar. The most conspicuous difference between the two lines was the absence of REX1 expression in HES4. Taken together, we conclude that a generalized gene expression profile of the human ES cell lines can be reliably derived based on the combined HES3 and HES4 SAGE libraries. Additionally, we hypothesized that genes involved in the maintenance of pluripotency are restricted in their expression to ES cells, with low or nondetectable expression in somatic tissues. Pairwise comparisons between our human ES and publicly available SAGE data from peripheral adult tissues enabled us to identify a group of genes that were both restricted and upregulated in human ES cells. Subsequently, we used RT-PCR to confirm whether the expression of these genes declined in differentiated human ES cells and evaluated the expression of these genes in eight peripheral tissues. This allowed us to detect known ES-specific genes like POU5F1, REX1, and SOX2, as well as to identify several new human ES cell marker genes.

REX1 Is not Expressed in HES4 No SAGE tag for REX1 was detected in HES4, and this was confirmed by quantitative and semiquantitative RT-PCR. It is tempting to speculate that this could account for some of the differential gene expression between HES3 and HES4. The lack of REX1 expression in HES4 is surprising because it has been serially propagated for over 100 passages and is capable of forming teratomas in SCID mice. In the mouse, Rex1 is a direct downstream target of Pou5f1 [41] and its promoter is functional in human ES cells [42]. The exact involvement of Rex1 in the self-renewal of mouse ES cells is still unclear. However, F9 cells induced to differentiate along the visceral endoderm pathway showed increased Rex1 mRNA levels and F9 Rex1–/– cells; however, they do not form primitive and visceral endoderm upon retinoic acid-induced differentiation [43]. Taken together, these findings suggest that REX1 expression may not be essential for self-renewal in human ES cells. However, we cannot rule out if REX1 has a role in the establishment of the ICM or in specific differentiation pathways. The confirmation that HES4 carries a null allele of REX1 might have practical implications on its directed differentiation into specific cell types. It would also be prudent to determine REX1 expression in the other human ES cell lines, several of which share a similar ethnic background and source with HES4. Comparison of the Human and Mouse ES Transcriptomes by SAGE Some basic similarities in the SAGE profiles of human and mouse ES cells exist. Highly expressed genes in both of these mammalian ES cell types include metabolic enzymes, ribosomal proteins, and cytoskeletal proteins (TUBB2, TMSB10, PFN1). However, there are significant differences


between the mouse and human ES cell transcriptomes. Transcription factors with a defined role in the maintenance of pluripotency and whose expression is downregulated upon differentiation, including SOX2, HESX1, UTF1, and REX1 [18, 41, 44, 45], are consistently expressed at higher levels in human ES cells, with POU5F1 expression reaching ~10-fold higher. In contrast, members of the leukemia inhibitory factor (LIF) signaling pathway (STAT3, LIF, LIFR, and IL6ST), FGF4, and TDGF1 are very highly expressed in mouse ES cells only. Galanin and sialoadhesin, which are highly expressed in mouse ES cells [26], are expressed at lower levels in human ES cells (Table 5). Conversely, genes that are differentially expressed in human ES cells are expressed at much lower levels in mouse ES cells. The absolute difference in the expression levels of these ES-restricted transcription factors, coupled with an inactive LIF signaling pathway, indicates that there are fundamental differences in the regulatory networks that control pluripotency and self-renewal in human and mouse ES cells. Tight quantitative regulation of Pou5f1 gene expression is essential for the maintenance of mouse ES cell pluripotency [17]. While the high expression of POU5F1 is atypical of transcription factors, its expression does not decline precipitously in differentiated human ES cells, implying that it might regulate human ES cell pluripotency through a similar mechanism. An additional implication is that downstream targets of POU5F1 should also be upregulated in human ES cells. Indeed, this is the case for H2AFZ, SOX2, REX1, RPS7, and KPNB1 [46].

Stemness Phenotype of Human ES Cells

A list of candidate human ES cell marker genes responsible for stemness in human ES cells is presented in Table 6. All of these genes were present in our list of 192 upregulated transcripts. Five of them, POU5F1, SOX2, REX1, NANOG, and FLJ10713, have been previously identified in mouse ES cells [10, 15, 20, 21, 26], and eight of them, including TGIF, TDGF1, CHEK2, GDF3, GJA1, and FLJ21837, have been identified as upregulated in a recent microarray study of the human ES cell transcriptome [47]. None of the remaining genes has previously been implicated as important in human ES cells. These candidate human ES marker genes are either very highly expressed in human ES cells (GJA1, CLDN6, CKS1B, ERH, HMGA1) or show highly restricted expression patterns (LIN28, DNMT3B, FLJ14549, FLJ21837, TNFRSF6, NPM1, OC90). In addition, some of these new marker genes (LIN28, DNMT3B, FLJ14549, OC90, HESX1) were strongly downregulated during ES cell differentiation. Besides these known genes, we have also identified eight orphan SAGE tags that are both highly expressed and restricted to human ES cells. These genes should also prove to be extremely useful as markers for undifferentiated human ES cells.

Table 5. Comparison of human ES and mouse ES SAGE libraries

Gene      UniGene      SAGE Tag†   HES (tpm)‡   UniGene      SAGE Tag†   MES (tpm)‡

Transcription factors
POU5F1    Hs.2860      1           952          Mm.17031     2           94
SOX2      Hs.816       1           469          Mm.4541      3           51
HESX1     Hs.171980    2           48           Mm.4802      1           7
UTF1      Hs.158307    1           34           Mm.8         1           15
REX1      Hs.335787    1           76           Mm.3396      1           29
FOXD3     Hs.120204    1           7            Mm.4758      1           7
GBX2      Hs.184945    1           7            Mm.1306      1           15
NANOG     Hs.326290    1           28           Mm.6047      2           362

Signal transduction
STAT3     Hs.321677    1           14           Mm.196029    4           142
LIFR      Hs.2798      0           0            Mm.3174      1           131
LIF       Hs.2250      0           0            Mm.4964      2           14
IL6ST     Hs.82065     1           7            Mm.4364      1           7
TDGF1     Hs.75561     1           48           Mm.5090      3           276
FGF4      Hs.1755      0           0            Mm.4956      1           43

† The number of SAGE tags that were reliably assigned to each UniGene entry within the respective SAGE libraries.
‡ If a gene had more than one tag, the sum of all corresponding tag frequencies is listed.
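
As a rough illustration of how the values in Table 5 can be read, the following Python sketch computes the human-to-mouse ratio of tags per million (tpm) for each gene and labels the gene by the library in which it is more abundant. The tpm values are transcribed from Table 5; the function names and the 2-fold threshold are assumptions introduced here for illustration only and are not part of the analysis reported in this study. Ratios based on very low tag counts should be treated with caution.

```python
# Tags per million as (HES, MES), transcribed from Table 5.
TABLE5_TPM = {
    "POU5F1": (952, 94),
    "SOX2": (469, 51),
    "HESX1": (48, 7),
    "UTF1": (34, 15),
    "REX1": (76, 29),
    "FOXD3": (7, 7),
    "GBX2": (7, 15),
    "NANOG": (28, 362),
    "STAT3": (14, 142),
    "LIFR": (0, 131),
    "LIF": (0, 14),
    "IL6ST": (7, 7),
    "TDGF1": (48, 276),
    "FGF4": (0, 43),
}


def fold_ratio(hes_tpm, mes_tpm):
    """Return the HES/MES tpm ratio, or None when the mouse value is zero."""
    return hes_tpm / mes_tpm if mes_tpm else None


def classify(hes_tpm, mes_tpm, threshold=2.0):
    """Label a gene by the library showing at least `threshold`-fold more tags.

    The 2-fold default is an illustrative assumption, not a cutoff from the study.
    """
    if hes_tpm > 0 and hes_tpm >= threshold * mes_tpm:
        return "human-enriched"
    if mes_tpm > 0 and mes_tpm >= threshold * hes_tpm:
        return "mouse-enriched"
    return "similar"


if __name__ == "__main__":
    for gene, (hes, mes) in TABLE5_TPM.items():
        ratio = fold_ratio(hes, mes)
        ratio_text = "n/a" if ratio is None else f"{ratio:.1f}"
        print(f"{gene:8s} HES={hes:4d} MES={mes:4d} "
              f"ratio={ratio_text:>5s} {classify(hes, mes)}")
```

For POU5F1 the ratio is 952/94, or about 10, which agrees with the ~10-fold difference noted in the text, whereas the LIF pathway members LIFR and LIF and the growth factor FGF4 have no tags in the human libraries.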


Table 6. Candidate human ES marker genes

Symbol       Gene

POU5F1†      POU domain class 5, transcription factor 1
SOX2         Sox 2
NANOG        Nanog
HESX1        Homeobox expressed transcription factor in ES cells
REX1         Zinc finger protein 42
FLJ14549     Hypothetical protein FLJ14549
TGIF†        TGF-β-induced factor (TALE family homeobox)
DNMT3A/B     DNA (cytosine-5) methyltransferase 3α/β
LIN-28       RNA-binding protein LIN-28
NPM1         Nucleophosmin
TNFRSF6      TNF receptor superfamily member 6
TDGF1†       Teratocarcinoma-derived growth factor 1
GDF3†        Growth differentiation factor 3
FLJ21837†    Hypothetical protein FLJ21837
FLJ10713†    Hypothetical protein FLJ10713
HMGA1        High mobility group AT-hook 1
ERH          Enhancer of rudimentary homolog
CKS1B        CDC28 protein kinase regulatory subunit 1B
CHEK2†       CHK2 checkpoint homolog
CLDN6        Claudin 6
GJA1†        Connexin 43

† Indicates genes that were detected as upregulated in our study and in Sato et al. [47].

We have also identified components of the fibroblast growth factor (FGF), transforming growth factor (TGF)-β/bone morphogenetic protein-4, and WNT signaling pathways that are believed to be important in human ES cells. In particular, the downstream transcription factor of the WNT pathway, TCF3, the TGF-β-induced factor (TALE homeobox transcription factor), and LEFTB were highly expressed in human ES cells. Other genes believed to be important for the ES cell phenotype, such as CHEK2 and GDF3, were also detected at high levels in our SAGE data. Besides the identification of putative transcription factors and signaling pathways that are important for the maintenance of pluripotency and self-renewal in human ES cells, a large number of potentially important hypothetical genes, ESTs, and novel transcripts have been uncovered. The presence of many potentially novel transcripts has partially validated our decision to rely on SAGE for the profiling of the human ES cell transcriptome. Despite past failure to identify transcripts that are exclusively restricted to ES cells, some of these orphan SAGE tags have been detected for the first time, and only in human ES cells, indicating that ES-specific genes might exist. The next phase would be to convert these short 10-bp tags to longer cDNA sequences for gene identification purposes and the subsequent evaluation of these genes as key regulators of the stemness phenotype. The identification and cloning of the large number of rare human ES cell transcripts will remain a formidable challenge. The enrichment of human ES cells, by cell lineage marking or the erasure of differentiating human ES cells, in combination with single-cell transcript analysis or a micro-cDNA libraries-based approach, may help to quickly refine and identify important human ES-specific genes [48-50]. Single-cell gene expression profiling might be able to confirm whether there are functional subsets of human ES cells [51]. While our results have helped to confirm many of the essential attributes of stemness proposed previously [15], we have been unable to demonstrate the involvement of certain key signaling molecules such as FGF-4 and LIF, which are central to the concept of stemness in mouse ES cells. Since these studies [14, 15, 26] have employed LIF to suppress mouse ES cell differentiation, we are inclined to believe that some of these differences might be attributed to an active LIF pathway in mouse ES cells.

Nevertheless, the human ES genes that we have identified, in combination with what has been reported earlier for mouse ES cells and other adult stem cells, will remain extremely useful for the dissection of the key molecular pathways involved in the maintenance of pluripotency, self-renewal, and perhaps even the mechanism used by human ES cells to suppress differentiation.

ACKNOWLEDGMENTS

This study was supported by a grant from Embryonic Stem Cell International (ESI) Pte. Ltd.

REFERENCES

1 Bongso A, Fong CY, Ng SC et al. The growth of inner cell mass cells from human blastocysts (Abstract). Theriogenology 1994;41:161.
2 Bongso A, Fong CY, Ng SC et al. Isolation and culture of inner cell mass cells from human blastocysts. Hum Reprod 1994;9:2110-2117.
3 Thomson JA, Itskovitz-Eldor J, Shapiro SS et al. Embryonic stem cell lines derived from human blastocysts. Science 1998;282:1145-1147.
4 Shamblott MJ, Axelman J, Wang S et al. Human embryonic germ cell derivatives express a broad range of developmentally distinct markers and proliferate extensively in vitro. Proc Natl Acad Sci USA 2001;98:113-118.
5 Reubinoff BE, Pera MF, Fong CY et al. Embryonic stem cell lines from human blastocysts: somatic differentiation in vitro. Nat Biotechnol 2000;18:399-404.
6 Lebkowski JS, Gold J, Xu C et al. Human embryonic stem cells: culture, differentiation, and genetic modification for regenerative medicine applications. Cancer J Suppl 2001;2:S83-S93.
7 Zhang SC, Wernig M, Duncan ID et al. In vitro differentiation of transplantable neural precursors from human embryonic stem cells. Nat Biotechnol 2001;19:1129-1133.
8 Assady S, Maor G, Amit M et al. Insulin production by human embryonic stem cells. Diabetes 2001;50:1691-1697.
9 Levenberg S, Golub JS, Amit M et al. Endothelial cells derived from human embryonic stem cells. Proc Natl Acad Sci USA 2002;99:4391-4396.
10 Rizzino A. Embryonic stem cells provide a powerful and versatile model system. Vitam Horm 2002;64:1-42.
11 Henderson JK, Draper JS, Baillie HS et al. Preimplantation human embryos and embryonic stem cells show comparable expression of stage-specific embryonic antigens. STEM CELLS 2002;20:329-337.
12 Vogel G. Stem cells. Are any two cell lines the same? Science 2002;295:1820.
13 Richards M, Fong CY, Chan WK et al. Human feeders support prolonged undifferentiated growth of human inner cell masses and embryonic stem cells. Nat Biotechnol 2002;20:933-936.
14 Kelly DL, Rizzino A. DNA microarray analyses of genes regulated during the differentiation of embryonic stem cells. Mol Reprod Dev 2000;56:113-123.
15 Ramalho-Santos M, Yoon S, Matsuzaki Y et al. “Stemness”: transcriptional profiling of embryonic and adult stem cells. Science 2002;298:597-600.
16 Ivanova NB, Dimos JT, Schaniel C et al. A stem cell molecular signature. Science 2002;298:601-604.
17 Niwa H, Miyazaki J, Smith AG. Quantitative expression of Oct3/4 defines differentiation, dedifferentiation or self-renewal of ES cells. Nat Genet 2000;24:372-376.
18 Avilion AA, Nicolis SK, Pevny LH et al. Multipotent cell lineages in early mouse development depend on SOX2 function. Genes Dev 2003;17:126-140.
19 Nichols J, Zevnik B, Anastassiadis K et al. Formation of pluripotent stem cells in the mammalian embryo depends on the POU transcription factor Oct4. Cell 1998;95:379-391.
20 Chambers I, Colby D, Robertson M et al. Functional expression cloning of Nanog, a pluripotency sustaining factor in embryonic stem cells. Cell 2003;113:643-655.
21 Mitsui K, Tokuzawa Y, Itoh H et al. The homeoprotein Nanog is required for maintenance of pluripotency in mouse epiblast and ES cells. Cell 2003;113:631-642.
22 Velculescu VE, Zhang L, Vogelstein B et al. Serial analysis of gene expression. Science 1995;270:484-487.
23 Angelastro JM, Klimaschewski L, Tang S et al. Identification of diverse nerve growth factor-regulated genes by serial analysis of gene expression (SAGE) profiling. Proc Natl Acad Sci USA 2000;97:10424-10429.
24 Boon K, Osorio EC, Greenhut SF et al. An anatomy of normal and malignant gene expression. Proc Natl Acad Sci USA 2002;99:11287-11292.
25 Ruijter JM, Van Kampen VHC, Baas F. Statistical evaluation of SAGE libraries: consequences for experimental design. Physiol Genomics 2002;11:37-44.
26 Anisimov SV, Tarasov KV, Tweedie D et al. SAGE identification of gene transcripts with profiles unique to pluripotent mouse R1 embryonic stem cells. Genomics 2002;79:170-176.
27 Richards M, Tan S, Fong CY et al. Comparative evaluation of various human feeders for prolonged undifferentiated growth of human embryonic stem cells. STEM CELLS 2003;21:546-556.
28 Williams BP, Daniels GL, Pym B et al. Biochemical and genetic analysis of the OKa blood group antigen. Immunogenetics 1988;27:322-329.
29 Angelastro MM, Ryu EJ, Torocsik B et al. Blue-white selection step enhances the yield of SAGE concatemers. Biotechniques 2002;32:484, 486.
30 Chen J, Sun M, Lee S et al. Identifying novel transcripts and novel genes in the human genome by using novel SAGE tags. Proc Natl Acad Sci USA 2002;99:12257-12262.
31 Lash AE, Tolstoshev CM, Wagner L et al. SAGEmap: a public gene expression resource. Genome Res 2000;10:1051-1060.
32 Lal A, Lash AE, Altschul SF et al. A public database for gene expression in human cancers. Cancer Res 1999;59:5403-5407.
33 Kal AJ, van Zonneveld AJ, Benes V et al. Dynamics of gene expression revealed by comparison of serial analysis of gene expression transcript profiles from yeast grown on two different carbon sources. Mol Biol Cell 1999;10:1859-1872.
34 Stern MD, Anisimov SV, Boheler KR. Can transcriptome size be estimated from SAGE catalogs? Bioinformatics 2003;19:443-448.
35 Nichols J, Zevnik B, Anastassiadis K et al. Formation of pluripotent stem cells in the mammalian embryo depends on the POU transcription factor Oct4. Cell 1998;95:379-391.
36 Hansis C, Grifo JA, Krey LC. Oct-4 expression in inner cell mass and trophectoderm of human blastocysts. Mol Hum Reprod 2000;6:999-1004.
37 Okano M, Xie S, Li E. Cloning and characterization of a family of novel mammalian DNA (cytosine-5) methyltransferases. Nat Genet 1998;19:219-220.
38 Jahner D, Jaenisch R. In: Razin A, Cedar H, Riggs A, eds. DNA Methylation: Biochemistry and Biological Significance. New York: Springer-Verlag, 1984:189-219.
39 Laird PW, Jaenisch R. The role of DNA methylation in cancer genetics and epigenetics. Annu Rev Genet 1996;30:441-464.
40 Moss EG, Tang L. Conservation of the heterochronic regulator Lin-28, its developmental expression and microRNA complementary sites. Dev Biol 2003;258:432-442.
41 Hosler BA, Rogers MB, Kozak CA et al. An octamer motif contributes to the expression of the retinoic acid-regulated zinc finger gene Rex-1 (Zfp-42) in F9 teratocarcinoma cells. Mol Cell Biol 1993;13:2919-2928.
42 Eiges R, Schuldiner M, Drukker M et al. Establishment of human embryonic stem cell-transfected clones carrying a marker for undifferentiated cells. Curr Biol 2001;11:514-518.
43 Thompson JR, Gudas LJ. Retinoic acid induces parietal endoderm but not primitive endoderm and visceral endoderm differentiation in F9 teratocarcinoma stem cells with a targeted deletion of the Rex-1 (Zfp-42) gene. Mol Cell Endocrinol 2002;195:119-133.
44 Thomas PQ, Johnson BV, Rathjen J et al. Sequence, genomic organization, and expression of the novel homeobox gene Hesx1. J Biol Chem 1995;270:3869-3875.
45 Okuda A, Fukushima A, Nishimoto M et al. UTF1, a novel transcriptional coactivator expressed in pluripotent embryonic stem cells and extra-embryonic cells. EMBO J 1998;17:2019-2032.
46 Du Z, Cong H, Yao Z. Identification of putative downstream genes of Oct-4 by suppression-subtractive hybridization. Biochem Biophys Res Commun 2001;282:701-706.
47 Sato N, Sanjuan IM, Heke M et al. Molecular signature of human embryonic stem cells and its comparison with the mouse. Dev Biol 2003;260:404-413.
48 Saitou M, Barton SC, Surani MA. A molecular programme for the specification of germ cell fate in mice. Nature 2002;418:293-300.
49 Chiang MK, Melton DA. Single-cell transcript analysis of pancreas development. Dev Cell 2003;4:383-393.
50 Stappenbeck TS, Mills JC, Gordon JI. Molecular features of adult mouse small intestinal epithelial progenitors. Proc Natl Acad Sci USA 2003;100:1004-1009.
51 Levsky JM, Shenoy SM, Pezo RC et al. Single-cell gene expression profiling. Science 2002;297:836-840.