Functional Network Reconstruction Reveals Somatic Stemness ...

STEM CELL GENOMICS AND PROTEOMICS

Functional Network Reconstruction Reveals Somatic Stemness Genetic Maps and Dedifferentiation-Like Transcriptome Reprogramming Induced by GATA2

TSE-SHUN HUANG,a JUI-YU HSIEH,a YU-HSUAN WU,a CHIH-HUNG JEN,b YANG-HWEI TSUANG,c SHIH-HWA CHIOU,d,e JUKKA PARTANEN,f HEIDI ANDERSON,f TAINA JAATINEN,f YAU-HUA YU,e,g HSEI-WEI WANGa,b,h

Institutes of aMicrobiology and Immunology, gOral Biology, and dClinical Medicine and bVeteran General Hospital-Yang Ming Genome Center, National Yang-Ming University, Taipei, Taiwan; eDepartment of Medical Research and Education, Taipei Veterans General Hospital, Taipei, Taiwan; fFinnish Red Cross Blood Service, Helsinki, Finland; Departments of cOrthopedics and hEducation and Research, Taipei City Hospital, Taipei, Taiwan

Key Words. CD133+ stem cell • Somatic stem cell • Systems biology • GATA2 • Dedifferentiation

ABSTRACT

Somatic stem cell transplantation holds great promise in regenerative medicine. The best-characterized adult stem cells are mesenchymal stem cells (MSCs), neural stem cells (NSCs), and CD133+ hematopoietic stem cells (HSCs). The applications of HSCs are hampered because these cells are difficult to maintain in an undifferentiated state in vitro. Understanding the genes responsible for stem cell properties and their interactions will help address this issue. The construction of stem cell genetic networks will also help to develop rational strategies to revert somatic cells back to a stem-like state. We performed a systematic study on human CD133+ HSCs, NSCs, MSCs, and embryonic stem cells and two different progenies of CD133+ HSCs, microvascular endothelial cells (MVECs) and peripheral blood mononuclear cells. Genes abundant in each or in all three somatic stem cells were identified. We also observed complex genetic networks functioning in postnatal stem cells, in which several genes, such as PTPN11 and DHFR, acted as hubs to maintain the stability and connectivity of the whole genetic network. Eighty-seven HSC genes, including ANGPT1 and GATA2, were independently identified by comparing CD34+CD33−CD38− hematopoietic stem cells with CD34+ precursors and various matured progenies. Introducing GATA2 into MVECs resulted in dedifferentiation-like transcriptome reprogramming, with HSC genes (such as ANGPT1) being upregulated and endothelial genes (such as EPHB2) being downregulated. This study provides a foundation for a more detailed understanding of human somatic stem cells. Expressing the newly discovered stem cell genes in matured cells might lead to a global reversion of the somatic transcriptome to a stem-like status.

STEM CELLS 2008;26:1186–1201

Disclosure of potential conflicts of interest is found at the end of this article.

INTRODUCTION

Adult stem cells are required for the lifelong replacement of mature cells and hold great promise for future therapeutic applications. In vitro as well as in vivo studies have established that hematopoietic and endothelial cells develop from a common postnatal progenitor, the hemangioblast [1, 2]. The hematopoietic stem cell (HSC) is currently being used in clinical stem cell transplantation for the treatment of leukemia [3]. It is also used in the treatment of many nonhematological disorders, such as autoimmune diseases and metabolic disorders [3]. However, the application of HSCs is hindered by their limited expandability, whereas cell dose is a major determinant of survival after HSC transplantation [3]. Further characterization of hemangioblasts will be critical for a better understanding of the molecular events involved in stem cell properties, as well as for using this cell population for clinical applications.

Currently, hemangioblast or HSC isolation is performed by recognizing the CD133 antigen [4, 5]. In recent years, the gene expression pattern of human CD133+ HSCs has been characterized by microarray analyses in several studies, which revealed genes involved in self-renewal, differentiation, and lineage choice [5–7]. These expression analyses helped to uncover the genetic programs accompanying the cascade of hematopoiesis or angiogenesis. Transcription factors have drawn much attention because they very often play key roles in stemness maintenance or fate determination during differentiation. For example, GATA1 plays an essential role in the promotion of hematopoietic cell differentiation [8]. GATA2, on the other hand, strongly blocks hematopoietic cell differentiation and stimulates immature cell proliferation [9].

Mesenchymal stem cells (MSCs), found in many adult tissues, are also attractive somatic stem cell sources for the regeneration of damaged tissues. Currently, MSCs can be isolated from various human sources, including bone marrow, umbilical cord, cord blood, adipose tissue, and muscle [10–14]. MSCs

Correspondence: Hsei-Wei Wang, Ph.D., Institute of Microbiology and Immunology, National Yang-Ming University, No. 155, Sec 2, Li-Nong Street, Taipei 112, Taiwan. Telephone: 886-2-2826-7109; Fax: 886-2-2821-2880; e-mail: [email protected] Received October 3, 2007; accepted for publication February 14, 2008; first published online in STEM CELLS EXPRESS February 28, 2008. ©AlphaMed Press 1066-5099/2008/$30.00/0 doi: 10.1634/stemcells.2007-0821

STEM CELLS 2008;26:1186–1201 www.StemCells.com

from different sources are all able to self-renew with a high proliferative capacity, and all possess a mesodermal differentiation potential, including osteogenic, chondrogenic, and adipogenic differentiation [15]. The gene expression profiles of MSCs from different sources have been widely characterized by microarray analyses [16, 17]. A very recent report compared the gene expression pattern of cord blood CD133+ HSCs with that of bone marrow-derived MSCs in hypoxia and highlighted genes commonly expressed in both stem cell types [7]. More functional studies are still needed to characterize the roles of the identified MSC and HSC genes under different physiological or culture conditions.

The genetic signature of another somatic stem cell type, the neural stem cell (NSC), has also been studied. In the mouse, a portion of the genetic program of hematopoietic stem cells is shared with embryonic stem cells (ESCs) and NSCs. These common gene products (283 genes) represent a molecular signature of stem cells [18, 19]. ESCs and NSCs are largely similar at the transcriptional level [19]. Gene expression profiles of human NSCs during priming and differentiation, as well as those of neuronal precursors from human brain, have also been well studied [20, 21]. However, no study has yet systematically compared human ESCs and somatic stem cells. Gene signatures from such a study will provide a foundation for a deeper understanding of human stem cell biology.

Although gene expression profiling can reveal differentially expressed genes, the challenge remains to place these genes in the context of a complete biological system. There is increasing recognition that genes do not act as individuals but collaborate as modules, and that cellular functions are carried out by many modules in overlapping networks [22, 23]. Disrupted signaling crosstalk among biological modules can be a hallmark of cancer [24]. With the accumulation of gene functional annotations and molecular interaction data, dynamic mapping of gene expression data to a particular pathway or genetic network is possible. The value of such knowledge-based analysis in stem cells is partly supported by the finding that stem cells and cancer cells share many features and pathways [25]. Functional network analysis has been applied to the transcriptomes of embryonic stem cells and bone marrow MSCs [17, 26]. No similar research using systems biology tools has been conducted on CD133+ HSCs or NSCs. Moreover, the genetic networks revealed by previous studies are incomplete because of the limitations of human-curated, knowledge-based databases [17, 27]. The global genetic networks among stemness genes still need to be constructed via a data-driven approach.

In this study, we applied gene expression microarrays and systems biology tools to identify genes involved in stemness in cord blood CD133+ HSCs, NSCs, and bone marrow MSCs and to provide a global genetic network for each somatic stem cell type. Genes common to all three somatic stem cells, as well as those also abundant in ESCs, were revealed. Novel GATA2-regulated genes were identified by a further microarray experiment. Dedifferentiation-like transcriptome reprogramming was observed in GATA2-expressing endothelial cells, indicating the critical role of GATA2 in stem cell properties. Our data contribute new insights to a refined molecular picture and a better understanding of somatic stem cells. Manipulating the steady expression of stem cell genes may eventually facilitate the clinical use of CD133+ HSCs, NSCs, and MSCs. Introducing critical stemness factors into matured cells to convert their genetic networks to a stem cell state may eventually produce stem-like cells directly from somatic sources.


MATERIALS AND METHODS

CD133+ Stem Cells, MSCs, NSCs, and Primary Microvascular Endothelial Cells

Human CD133+ stem cells from healthy individuals were isolated from umbilical cord blood as published before [5]. Bone marrow MSCs and NSCs were isolated from healthy individuals (Poietics, Lonza Inc., Conshohocken, PA, http://www.lonzabioscience.com/Lonza_CatNav.asp?oid=867). After isolation, CD133+ HSCs were subjected directly to RNA extraction. MSCs were cultured in MesenCult medium (StemCell Technologies, http://www.stemcell.com) for fewer than five passages. NSCs were cultured in the Neural Progenitor Maintenance BulletKit (Poietics), and differentiation was induced by culturing them in the Neural Progenitor Differentiation BulletKit (CC-3229, Poietics). Human primary microvascular endothelial cells (MVECs; Clonetics, Lonza Inc.) and a human endothelial cell line, HMEC1, were cultured in EGM-2 MV BulletKit medium (Clonetics).

Array Probe Preparation, Data Analysis, Group Distance Calculation, and Function Network Analyses

Total RNA collection, cRNA probe preparation, array hybridization, and data analysis were done as previously described [28–30]. More details are given in the supplemental online Materials and Methods. The average linkage distance was used to assess similarity between two groups of gene expression profiles, as described previously [28]. The difference in distance between two groups of sample expression profiles and a third group was assessed via the comparison of corresponding average linkage distances (the mean of all pairwise distances [linkages] between members of the two groups concerned). The error on such a comparison was estimated by combining the SEs (the SD of pairwise linkages divided by the square root of the number of linkages) of the average linkage distances involved [28]. Classic multidimensional scaling (MDS) was performed using the standard function of the R program to provide a visual impression of how the various sample groups are related. Principal component analysis (PCA), a technique similar to MDS, was performed using Partek Genomics Suite software (Partek, Inc., St. Louis, http://www.partek.com). Gene annotation was performed by the ArrayFusion Web tool (http://microarray.ym.edu.tw/tools/arrayfusion/) [31].

Differential gene expression profiles were imported into the Ingenuity Pathways Analysis (IPA) software (Ingenuity Systems, Redwood City, CA, http://www.ingenuity.com) to obtain functional regulatory networks. The knowledge base behind IPA was built upon scientific evidence, manually curated from thousands of journal articles, textbooks, and other data sources. After a list of signature genes was uploaded, interactions among focus genes and interactions among interacting genes and molecules from the knowledge base were used to combine genes into networks according to their probability of having more focus genes than expected by chance.
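As an illustration of the group-distance comparison described in this section, the following Python sketch computes the average linkage distance between two groups of expression profiles together with its SE. The Euclidean metric, the use of the sample SD, the quadrature rule for combining SEs, all function names, and the toy profiles are assumptions made for illustration only; the text does not specify the distance metric or the exact way the SEs were combined.

```python
import math
from itertools import product
from statistics import mean, stdev


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(group_a, group_b, dist=euclidean):
    """Return (mean, SE) of all pairwise distances between two sample groups.

    SE = SD of the pairwise linkages / sqrt(number of linkages).
    """
    linkages = [dist(a, b) for a, b in product(group_a, group_b)]
    se = stdev(linkages) / math.sqrt(len(linkages))
    return mean(linkages), se


def distance_difference(group_a, group_b, reference):
    """Compare how far groups A and B lie from a third (reference) group.

    The two SEs are combined in quadrature, which is an assumed reading
    of "combining the SEs".
    """
    d_a, se_a = average_linkage(group_a, reference)
    d_b, se_b = average_linkage(group_b, reference)
    return d_a - d_b, math.sqrt(se_a ** 2 + se_b ** 2)


# Hypothetical toy profiles (three genes per sample), for illustration only.
stem = [[1.0, 2.0, 3.0], [1.0, 2.0, 4.0]]
mature = [[5.0, 2.0, 3.0], [5.0, 2.0, 4.0]]
reference = [[1.0, 2.0, 3.0], [1.0, 2.0, 5.0]]
diff, err = distance_difference(stem, mature, reference)
```

A negative `diff` here means the first group lies closer to the reference group than the second; `err` gives a rough scale against which to judge that difference.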
Networks are scored on the basis of the number of uploaded signature genes they contain. The network score is based on the hypergeometric distribution and is calculated with the right-tailed Fisher’s exact test. The score is the negative log of this p value. The higher the score, the lower the probability of finding the observed number of uploaded signature genes in a given network by random chance.
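The scoring rule above can be sketched with a hypergeometric tail sum, which is what the right-tailed Fisher's exact test evaluates for a 2x2 table. This is not IPA's code: the function and parameter names are hypothetical, the size of the reference gene set is something the user must supply, and the base-10 logarithm is an assumption, since the text says only "negative log".

```python
from math import comb, log10


def right_tailed_fisher_p(total, signature, network_size, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(total, signature, network_size).

    total        -- number of genes in the reference set (the "universe")
    signature    -- number of uploaded signature (focus) genes in that set
    network_size -- number of genes in the network being scored
    overlap      -- signature genes observed in the network
    """
    upper = min(signature, network_size)
    tail = sum(
        comb(signature, k) * comb(total - signature, network_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total, network_size)


def network_score(total, signature, network_size, overlap):
    """Negative log (base 10 assumed) of the right-tailed p value."""
    return -log10(right_tailed_fisher_p(total, signature, network_size, overlap))


# Hypothetical toy numbers: 10 genes in total, 3 signature genes,
# a 4-gene network containing 2 of the signature genes.
toy_score = network_score(10, 3, 4, 2)
```

A network that contains more signature genes than chance would predict gets a smaller p value and therefore a larger score.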

Microarray Expression Data Sets

Affymetrix U133 Plus 2.0 microarray data of human ESCs were from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/; GSE7896 and GSE6561) (Affymetrix, Santa Clara, CA, http://www.affymetrix.com). To extract HSC-enriched genes, we used 57 Affymetrix U133A array data sets produced by us [28] or obtained from publicly accessible array databases, including GEO, ArrayExpress (http://www.ebi.ac.uk/arrayexpress), the Medical University of South Carolina (MUSC) DNA Microarray Database (http://proteogenomics.musc.edu/pss/home.php), and The Genomics Institute of the Novartis Research Foundation (GNF) SymAtlas

Genetic Network and Cell Dedifferentiation


(http://symatlas.gnf.org/SymAtlas/) (Fig. 4A). The in-house data sets comprised samples from human umbilical cord vein endothelial cell (HUVEC), aortic or uterine smooth muscle cell, lymphatic endothelial cell, and blood vessel endothelial cell (ArrayExpress E-MEXP-66). The downloaded data sets comprised CD34+ HSCs (CD34+CD38−CD33−KIT+Rhohigh and CD34+CD38−CD33−Rholow [GSM51391 to GSM51408 from GEO GSE2666]), CD34+ precursor cells (including pre-B cells, pro-B cells [ArrayExpress E-MEXP-384], CD34+CD33+ myeloid cells, and CD34+CD71+ early erythroid cells [from GNF SymAtlas]), and a variety of matured progeny cells (including human endothelial cells of artery and vein origin [HUVEC, MUSC data set 010504, 040204, 070102, GEO GSE973] and hematopoietic cells [GEO GSE1140]). For Figure 4B and supplemental online Figure 1A and 1B, a total of 256 Affymetrix U133A array data points for normal human tissues were an extension of the 57-array data set and were collected from GNF SymAtlas and GEO (GSE1140, GSE2248, GSE2361, and GSE2666). All of the array data are available on our web site (http://infobio.ym.edu.tw/).

The gene expression profiles of all collected cell types were measured at least in triplicate using the Affymetrix HG-U133 Plus 2.0 whole-genome chip. Genes differentially expressed between cell types (the molecular signatures) were identified according to a statistical pipeline we used previously [28]. A gene expression heat map for these genes showed their distinct expression patterns in each cell type (Fig. 1D), and their discrimination ability was also assessed by MDS (Fig. 1E). Compared with PBMCs and MVECs, 1,572 probe sets were abundantly overexpressed in CD133+ cells (with a positive false discovery rate [pFDR] threshold of q < 0.0001), 1,456 probe sets were overexpressed in MSCs (q < 0.001), and 3,252 probe sets were overexpressed in NSCs (q < 0.001) (Fig. 1F). Abundant expression of some genes (including GATA2, MCM3, MLF1IP, MYB, KIT, and FLT3) in CD133+ cells was verified by quantitative PCR (qPCR) (Fig. 1G). qPCR results showed a high degree of correlation with microarray results (R > 0.95).
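To make the q-value thresholds above concrete, the sketch below selects probe sets by a false discovery rate cutoff. It is not the authors' pipeline, which is described in [28] and uses pFDR q values. As a stand-in, it uses Benjamini-Hochberg adjusted p values, which coincide with Storey-type q values when the estimated proportion of true nulls is fixed at 1. The function names and the toy p values are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so each adjusted value is the
    # minimum over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_signature(probe_p_values, threshold):
    """Return probe-set IDs whose adjusted value falls below the threshold."""
    ids = list(probe_p_values)
    adjusted = bh_adjust([probe_p_values[i] for i in ids])
    return [i for i, q in zip(ids, adjusted) if q < threshold]


# Hypothetical probe-set IDs and p values, for illustration only.
toy_p = {"a": 0.01, "b": 0.04, "c": 0.03, "d": 0.005}
selected = select_signature(toy_p, threshold=0.03)
```

Tightening `threshold` (for example from 0.001 to 0.0001) shrinks the selected list, which is why the probe-set counts reported above depend on the cutoff chosen for each cell type.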

Promoter Analysis

Transcription factor binding sites in a given promoter region (base pairs −3,000 to +1,000) were analyzed by the Patch program (http://www.gene-regulation.com/pub/programs.html#patch) from the BioBase Biological Databases (Beverly, MA) and the MatInspector program from Genomatix (Munich, Germany, http://www.genomatix.de/). BioTapestry software (http://www.biotapestry.org) was used to represent the relationships between stemness genes.

Molecular Signatures of Somatic Stem Cells

The top 100 genes most strongly expressed in CD133+ HSCs and the top 60 genes in MSCs and NSCs are listed in Tables 1–3. In HSCs, PROM1 (CD133) and CD34, two hematopoietic and endothelial precursor markers [4, 34], were among those 100 genes (Table 1, underlined). BAALC, another novel marker of hematopoietic progenitor cells, also appeared [35] (Table 1, underlined). Also shown in Table 1 are FLT3, KIT, HLF, ITGA9, LMO2, MLLT3, and MYB, which have all been shown to be enriched in pluripotent hematopoietic stem cells [5, 6, 9, 36]. These consistent findings support the reliability of our gene list. In MSCs, SOX9, SNAI2 (Slug), and TWIST1, which regulate epithelial-mesenchymal transition and cell migration in early neural crest development or in tumor metastasis [37, 38], were present (Table 2, underlined). In NSCs, known neuronal precursor markers (such as PTN and FOXG1) were present (Table 3, underlined). PTN mRNA is highly expressed in neural stem cells of mouse ventral mesencephalon, and PTN can promote the production of DAergic neurons from embryonic stem (ES) cell-derived nestin-positive cells [39]. FOXG1 is known to constitutively suppress the generation of the Cajal-Retzius cell, the earliest-born neuron [40]. Details for these genes and all other genes can be found in supplemental online Table 1. One hundred eighty-three probe sets were commonly expressed in all three somatic stem cells, indicating their critical roles in stem cell properties (Fig. 1F; Table 4). We also checked which of those 104 genes were also abundant in human ESCs. A gene expression heat map for genes differentially expressed among all stem cells (three somatic stem cells and ESCs) and matured cells (MVEC and PBMC) indicated 34 genes abundantly expressed in all stem cells (q < 10⁻⁶; Fig. 2A). PCA plots showing their discrimination ability are also shown (Fig. 2B). Details of those 34 genes are given in Table 4 (asterisks).
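The overlap among the three somatic stem cell signatures (the Venn diagram of Fig. 1F) amounts to set operations on probe-set identifiers. The sketch below is a minimal illustration with hypothetical function names and placeholder probe IDs; it does not use the study's actual signature lists.

```python
def shared_signature(*signatures):
    """Probe sets present in every signature passed in."""
    sets = [set(s) for s in signatures]
    return set.intersection(*sets) if sets else set()


def venn_regions(hsc, msc, nsc):
    """Sizes of the seven regions of a three-set Venn diagram."""
    hsc, msc, nsc = set(hsc), set(msc), set(nsc)
    return {
        "HSC only": len(hsc - msc - nsc),
        "MSC only": len(msc - hsc - nsc),
        "NSC only": len(nsc - hsc - msc),
        "HSC+MSC": len((hsc & msc) - nsc),
        "HSC+NSC": len((hsc & nsc) - msc),
        "MSC+NSC": len((msc & nsc) - hsc),
        "all three": len(hsc & msc & nsc),
    }


# Placeholder probe-set IDs, for illustration only.
hsc_sig = {"p1", "p2", "p3", "p4"}
msc_sig = {"p2", "p3", "p5"}
nsc_sig = {"p3", "p4", "p5", "p6"}
common = shared_signature(hsc_sig, msc_sig, nsc_sig)
regions = venn_regions(hsc_sig, msc_sig, nsc_sig)
```

The same `shared_signature` call extends to a fourth list (for example an ESC signature) when asking which common somatic stem cell genes are also abundant in embryonic stem cells.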
Pre-B-cell leukemia transcription factor 1 (PBX1), a homeodomain transcription factor that was originally identified as the product of a proto-oncogene in acute pre-B-cell leukemia, is a global regulator of embryonic and B cell development [41, 42]. The role of PBX1 in HSCs is also evidenced by the reduced numbers and impaired functions of committed hematopoietic progenitors in the fetal liver that result in inadequate maintenance of definitive hematopoiesis and severe anemia [43]. EPDR1, also known as MERP1, is expressed in a hematopoietic stem cell-enriched population but is downregulated with proliferation and differentiation [44]. The presence of those known stem cell genes strengthens the reliability of our gene list. Most of the 34 stem cell genes, as well as most of the somatic stem cell genes, are novel and worthy of being investigated further. The relative expression levels of these common stemness genes across the whole human body were examined, and most of them

Plasmid Construction and Lentivirus Transduction

Lentivirus production and infection were performed as described [32, 33]. Plasmid plenti4-GATA2, which was used to express FLAG-tagged GATA2, was constructed by polymerase chain reaction (PCR) using the following primer pair: 5′-VGAAcggtccgCTGCACCCAGACCCTGAG (one CpoI site lowercased) and 3′VGAActcgagGTCCTCGACGTCCATCTGTT (one XhoI site lowercased). The PCR products were TA cloned into the pGEMTeasy vector (Promega, Madison, WI, http://www.promega.com), sequence-verified, and then cloned in-frame into the FLAG-tagged pLenti4 lentiviral vector (a gift kindly provided by Dr. Su-Fang Lin, National Health Research Institute, Zhunan, Taiwan) by cutting at the CpoI and XhoI sites.

RESULTS

Isolation and Characterization of Somatic Stem Cells and Primary Cells

To identify molecular signature genes for somatic stem cells, a set of primary human somatic stem or matured cells, including cord blood CD133+ HSCs, NSCs, MSCs derived from bone marrow, dermal MVECs, and peripheral blood mononuclear cells (PBMCs), was collected. Of all the matured cell types, only PBMCs and MVECs were collected, because we focused more on CD133+ HSCs, the ancestor of these two somatic cell types. MVECs were positive for the endothelial cell (EC) markers CD31 and von Willebrand factor (Fig. 1A). MSCs were positive for CD44 and CD73 (both mesenchymal markers) and negative for CD34 and CD45 (both hematopoietic markers) (Fig. 1B). Isolated MSCs could also differentiate into cells of osteogenic, adipogenic, and chondrogenic lineages (not shown). NSCs could be maintained in an undifferentiated state as neurospheres (Fig. 1C, upper panel) and were positive for nestin, an NSC marker (not shown). Cultured NSCs could be induced into neurogenic and glial lineages, since differentiated cells expressed the neuron marker β-tubulin III or the glial cell marker glial fibrillary acidic protein (an intermediate filament protein that is found in glial cells, such as astrocytes) (Fig. 1C, lower panel, green and red, respectively).


Figure 1. Gene expression microarray analysis of three somatic stem cells, PBMCs, and primary MVECs. (A): Characteristics of isolated MVECs. All MVECs (passage 6) express CD31 and vWF. (B): Immunophenotype of MSCs by flow cytometric analysis. Representative histograms are shown, and their respective isotype controls are shown by filled blue areas. MSCs were positive for CD44 and CD73 and negative for CD34 and CD45. (C): Characteristics of NSCs. Top panel, undifferentiated NSCs formed neurospheres; middle and lower panels, NSCs were induced into differentiation for 3 and 12 d, respectively. At d 12 postinduction, differentiated cells were stained for β-tubulin III (a neuron marker; green), glial fibrillary acidic protein (for glial cells such as astrocytes; red), and nuclear DNA (Hoechst 33258; blue). (D): A heat map shows genes enriched in CD133+ stem cells, in MSCs, in NSCs, or in MVECs. Genes in red, increased expression; in blue, decreased. (E): An MDS plot shows the discrimination ability of the obtained molecular signatures of cell groups. Each spot represents a single array sample. Each cell group exhibited a significantly distinct global gene expression profile. (F): Venn diagram detailing shared and distinct gene expression among human HSCs, NSCs, and MSCs. (G): Validation of CD133+ HSC genes by real-time reverse transcription-polymerase chain reaction. Mean expression levels of target genes were compared with that of GAPDH control. Results are expressed as the mean ± SD. Abbreviations: d, days; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; HSC, hematopoietic stem cell; MDS, multidimensional scaling; MSC, mesenchymal stem cell; MVEC, microvascular endothelial cell; NSC, neural stem cell; PBMC, peripheral blood mononuclear cell; vWF, von Willebrand factor.

(such as ANGPT1 and NPR3) were unique in stem cells (supplemental online Fig. 1A, 1B). One of these genes is angiopoietin-1 (ANGPT1; Tables 1, 2, underlined). The role of ANGPT1 secreted by CD133+ HSCs and MSCs in angiogenesis is supported by the finding that defective vascular remodeling in RUNX1 homozygous mutant mice could be rescued by addition of HSCs or ANGPT1 [45]. ANGPT1 was abundant in several somatic stem cell types, including MSCs and long-term self-renewing (LT) and short-term


Table 1. Top 100 genes in CD133+ HSC

UniGene ID | Gene title | Gene symbol | Chromosomal location
Hs.226568 | ATP-binding cassette, sub-family A (ABC1), member 13 | ABCA13 | chr7p12.3
Hs.585129 | ATP/GTP binding protein-like 3 | AGBL3 | chr7q33
Hs.369675 | Angiopoietin 1 | ANGPT1 | chr8q22.3-q23
Hs.525163 | Ankyrin repeat domain 10 | ANKRD10 | chr13q34
Hs.335239 | Ankyrin repeat domain 28 | ANKRD28 | chr3p24.3
Hs.632601 | Amphiregulin (schwannoma-derived growth factor) | AREG | chr4q13-q21
Hs.511311 | ATPase, Class I, type 8B, member 4 | ATP8B4 | chr15q21.2
Hs.591063 | UDP-Gal:betaGlcNAc beta 1,4-galactosyltransferase, polypeptide 6 | B4GALT6 | chr18q11
Hs.533446 | Brain and acute leukemia, cytoplasmic | BAALC | chr8q22.3
Hs.388313 | BIC transcript | BIC | chr21q21.3
Hs.632677 | B-box and SPRY domain containing | BSPRY | chr9q32
Hs.119302 | C1q and tumor necrosis factor related protein 4 | C1QTNF4 | chr11q11
Hs.137359 | Chromosome 20 open reading frame 175 | C20orf175 | chr20q13.13
Hs.13528 | Chromosome 5 open reading frame 23 | C5orf23 | chr5p13.3
Hs.120591 | Coiled-coil domain containing 4 | CCDC4 | chr4p13
Hs.374990 | CD34 molecule | CD34 | chr1q32
Hs.470654 | Cell division cycle associated 7 | CDCA7 | chr2q31
Hs.119882 | Cyclin-dependent kinase 6 | CDK6 | chr7q21-q22
Hs.496587 | Chordin-like 1 | CHRDL1 | chrXq22.3
Hs.292375 | Carbohydrate (chondroitin 4) sulfotransferase 13 | CHST13 | chr3q21.3
Hs.576092 | Collagen, type XXIV, alpha 1 | COL24A1 | chr1p22.3
Hs.29341 | Carboxypeptidase X (M14 family), member 1 | CPXM1 | chr20p13-p12.3
Hs.115617 | Corticotropin releasing hormone binding protein | CRHBP | chr5q11.2-q13.3
Hs.436542 | Cysteine-rich secretory protein LCCL domain containing 1 | CRISPLD1 | chr8q21.11
Hs.6179 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 17 | DDX17 | chr22q13.1
Hs.112981 | DEP domain containing 6 | DEPDC6 | chr8q24.12
Hs.533717 | Delta-like 1 homolog (Drosophila) | DLK1 | chr14q32
Hs.570374 | DNA (cytosine-5-)-methyltransferase 3 beta | DNMT3B | chr20q11.2
Hs.317659 | Developmental pluripotency associated 4 | DPPA4 | chr3q13.13
Hs.533644 | dpy-19-like 2 (C. elegans) | DPY19L2 | chr12q14.2
Hs.412597 | Desmoglein 2 | DSG2 | chr18q12.1
Hs.504765 | Ets variant gene 6 (TEL oncogene) | ETV6 | chr12p13
Hs.638714 | Family with sequence similarity 30, member A | FAM30A | chr14q32.33
Hs.29725 | Hypothetical protein FLJ13197 | FLJ13197 | chr4p14
Hs.50802 | Hypothetical protein FLJ14712 | FLJ14712 | chr7p21.3
Hs.148768 | Hypothetical protein FLJ36166 | FLJ36166 | chr7q22.1
Hs.134090 | Hypothetical protein FLJ38379 | FLJ38379 | chr2q37.3
Hs.507590 | fms-related tyrosine kinase 3 | FLT3 | chr13q12
Hs.367725 | GATA binding protein 2 | GATA2 | chr3q21.3
Hs.73797 | Guanine nucleotide binding protein (G protein), alpha 15 (Gq class) | GNA15 | chr19p13.3
Hs.318894 | G protein-coupled receptor 126 | GPR126 | chr6q24.1
Hs.24258 | Guanylate cyclase 1, soluble, alpha 3 | GUCY1A3 | chr4q31.3-q33
Hs.620129 | Glucuronidase, beta pseudogene 1 | GUSBP1 | chr5q13.2
Hs.302145 | Hemoglobin, gamma G | HBG1 | chr11p15.5
Hs.196952 | Hepatic leukemia factor | HLF | chr17q22
Hs.110637 | Homeobox A9 | HOXA9 | chr7p15-p14
Hs.248136 | 5-hydroxytryptamine (serotonin) receptor 1F | HTR1F | chr3p12
Hs.348935 | Immunoglobulin lambda-like polypeptide 1 | IGLL1 | chr22q11.23
Hs.113157 | Integrin, alpha 9 | ITGA9 | chr3p21.3
Hs.632338 | KIAA0125 | KIAA0125 | chr14q32.33
Hs.479754 | v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog | KIT | chr4q11-q12
Hs.492314 | Lysosomal associated protein transmembrane 4 beta | LAPTM4B | chr8q22.1
Hs.34560 | LIM domain only 2 (rhombotin-like 1) | LMO2 | chr11p13
Hs.130714 | Similar to HSPC323 | LOC284422 | chr19p13.3
Hs.370475 | LUC7-like 2 (S. cerevisiae) | LUC7L2 | chr7q34
Hs.486548 | Microtubule-associated protein 7 | MAP7 | chr6q23.3
Hs.377830 | Membrane bound O-acyltransferase domain containing 1 | MBOAT1 | chr6p22.3
Hs.592017 | Multiple C2 domains, transmembrane 2 | MCTP2 | chr15q26.2
Hs.526754 | Meis1, myeloid ecotropic viral integration site 1 homolog (mouse) | MEIS1 | chr2p14-p13
Hs.270978 | Mesoderm specific transcript homolog (mouse) | MEST | chr7q32
Hs.168799 | Methyltransferase like 3 | METTL3 | chr14q11.1
Hs.591085 | Myeloid/lymphoid or mixed-lineage leukemia; translocated to, 3 | MLLT3 | chr9p22
Hs.82906 | Myeloproliferative leukemia virus oncogene | MPL | chr1p34
Hs.458272 | Myeloperoxidase | MPO | chr17q23.1
Hs.289795 | Metallophosphoesterase domain containing 2 | MPPED2 | chr11p13
Hs.585782 | Musashi homolog 2 (Drosophila) | MSI2 | chr17q22
Hs.339024 | Methionine sulfoxide reductase B3 | MSRB3 | chr12q14.3
Hs.531941 | v-myb myeloblastosis viral oncogene homolog (avian) | MYB | chr6q22-q23
Hs.25960 | v-myc myelocytomatosis viral related oncogene, neuroblastoma | MYCN | chr2p24.1
Hs.487036 | Myosin VC | MYO5C | chr15q21
Hs.21365 | Nucleosome assembly protein 1-like 3 | NAP1L3 | chrXq21.3-q22
Hs.237028 | Natriuretic peptide receptor C/guanylate cyclase C | NPR3 | chr5p14-p13
Hs.79881 | Purinergic receptor P2Y, G-protein coupled, 1 | P2RY1 | chr3q25.2
Hs.369984 | PAN3 polyA specific ribonuclease subunit homolog (S. cerevisiae) | PAN3 | chr13q12.2
Hs.191046 | Phosphodiesterase 1A, calmodulin-dependent | PDE1A | chr2q32.1
Hs.481819 | PDZ domain containing 2 | PDZD2 | chr5p13.3
Hs.128433 | Prostaglandin D2 synthase, hematopoietic | PGDS | chr4q22.3
Hs.435479 | Protein phosphatase 1H (PP2C domain containing) | PPM1H | 12q14.1-q14.2
Hs.479220 | Prominin 1 | PROM1 | chr4p15.32
Hs.446083 | Protein tyrosine phosphatase, receptor type, D | PTPRD | chr9p23-p24.3
Hs.113912 | Rap guanine nucleotide exchange factor (GEF) 2 | RAPGEF2 | chr4q32.1
Hs.485938 | Ras-related GTP binding D | RRAGD | chr6q15-q16
Hs.591445 | Sterile alpha motif domain containing 13 | SAMD13 | chr1p31.1
Hs.379191 | Stearoyl-CoA desaturase 5 | SCD5 | chr4q21.22
Hs.435274 | Sodium channel, voltage-gated, type III, alpha | SCN3A | chr2q24
Hs.520319 | Solute carrier family 22 (organic cation transporter), member 16 | SLC22A16 | chr6q22.1
Hs.40510 | Solute carrier family 25, member 27 | SLC25A27 | chr6p11.2-q12
Hs.530003 | Solute carrier family 2 (facilitated glucose/fructose transporter), 5 | SLC2A5 | chr1p36.2
Hs.272284 | SLIT and NTRK-like family, member 4 | SLITRK4 | chrXq27.3
Hs.485572 | Suppressor of cytokine signaling 2 | SOCS2 | chr12q
Hs.98243 | Serine peptidase inhibitor, Kazal type 2 (acrosin-trypsin inhibitor) | SPINK2 | chr4q12
Hs.591773 | Single-stranded DNA binding protein 2 | SSBP2 | chr5q14.1
Hs.122061 | START domain containing 9 | STARD9 | chr15q15.2
Hs.475812 | STT3, subunit of the oligosaccharyltransferase complex, homolog B | STT3B | chr3p23
Hs.300624 | TAR DNA binding protein | TARDBP | chr1p36.22
Hs.269722 | T-cell lymphoma breakpoint associated target 1 | TCBA1 | chr6q21
Hs.479226 | Tctex1 domain containing 1 | TCTEX1D1 | chr1p31.3
Hs.213762 | WD repeat domain 49 | WDR49 | chr3q26.1
Hs.326801 | Zinc finger protein 711 | ZNF711 | chrXq21.1-q21.2
Hs.427284 | Zinc and ring finger 1 | ZNRF1 | chr16q23.1

self-renewing (ST) HSCs (supplemental online Fig. 1A), and its high expression level in another somatic stem cell type, limbus stem cells, was revealed by qPCR (supplemental online Fig. 1C). ESCs, on the other hand, do not express high levels of ANGPT1 (supplemental online Fig. 1A).

Genetic Networks of CD133+ Stem Cells, MSCs, and NSCs

Increasing evidence shows that genes do not act as individuals but collaborate in genetic networks [24]. To better understand how genes enriched in somatic stem cells are related to each other, we performed genetic network analysis for signature genes. Signature probe sets were input into the IPA software to construct network modules. The knowledge base behind IPA summarizes known molecular interactions evidenced in the published literature (described in Materials and Methods). The term “network” in IPA does not denote a biological or canonical pathway with a distinct function (e.g., angiogenesis) but a reflection of all interactions of a given protein as defined in the literature. In CD133+ stem cells a major network consisting of 215 genes was identified (Fig. 3A). This network included most of the known stemness-related or pro-proliferating genes. Among those genes, STAT5A/B and ESR1 are novel markers for CD133+ HSCs, and HOXA9, GATA2, KIT, MPL, MYB, and MYCN can support self-renewal and keep HSCs in an undifferentiated state [9, 36]. CDK6 is a factor promoting the G1 phase of HSCs [46]. CD34 is a well-known marker for hematopoietic precursors, and ANGPT1 is crucial for angiogenesis [45]. Besides reproducing what is already known for CD133+ cells, the analysis also revealed novel genes that may play crucial roles in self-renewal and differentiation. Genes without previous implication in hematopoiesis or endothelial differentiation but with evidence in the development of other organs, including NCL, THRB, NR1I2 (PXR), and TRH [47, 48], were also in this main network (supplemental online Table 2).

This network also revealed genes with significant biological roles in CD133+ cells. Some genes, regarded as “hubs,” had higher connectivity to others or resided in a position among submodules in the major network (Fig. 3B). Dysregulation of hubs may eventually lead to the disruption of the genetic network and the malfunction of cells [24]. NFE2L2, GNAQ, and MYCN were hubs connecting different submodules in the major network component (Fig. 3B1, 3B2). Central to the network, there were significant hubs, including GATA2, HOXA9, SPP1, ESR1, IL1B, KIT, MPL, MYB, IGF1R, PTPN11, and STAT5A/B (Fig. 3B3). Of these hub genes, PTPN11, TIMP3, and DHFR were commonly expressed in CD133+ cells, NSCs, and MSCs (Table 4; Fig. 3C). We obtained similar results (DHFR, TIMP3, and especially PTPN11 showed higher connectivity) when conducting functional network analysis on MSC genes (supplemental online Fig. 2A). When a similar function network analysis was conducted on NSC-enriched genes, DHFR was once again a hub (supplemental online Fig. 2B), suggesting its critical role in maintaining the integrity of genetic networks and in stem cell properties.
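The notion of a hub as a gene with higher connectivity can be illustrated by counting distinct interaction partners (node degree) in an edge list. The sketch below is a simplified illustration, not the IPA procedure: the function names are hypothetical, the gene labels are placeholders rather than real interactions, and degree alone does not capture the second criterion mentioned above, namely a position bridging submodules.

```python
def node_degrees(edges):
    """Count distinct interaction partners per gene in an undirected network."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


def top_hubs(edges, n=3):
    """Return the n most connected genes, ties broken alphabetically."""
    degrees = node_degrees(edges)
    return sorted(degrees, key=lambda g: (-degrees[g], g))[:n]


# Placeholder gene labels and edges, for illustration only.
toy_edges = [
    ("g1", "g2"), ("g1", "g3"), ("g1", "g4"),
    ("g2", "g3"), ("g4", "g5"), ("g1", "g2"),  # duplicate edge is counted once
]
degrees = node_degrees(toy_edges)
hubs = top_hubs(toy_edges, n=2)
```

Running the same ranking on each cell type's network and intersecting the resulting hub lists is one simple way to look for hubs shared across stem cell types.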

Universal Genes of CD133+ HSCs

To further compare our gene signature with published ones, we collected a set of array data for HSCs (both LT-HSCs and ST-HSCs), CD34+ precursors, and terminally differentiated progeny cells of HSCs. These array data (57 arrays in total) were generated using the Affymetrix U133A chip, which contains ~22,200 probe sets. By comparing the gene expression profiles

Genetic Network and Cell Dedifferentiation


Table 2. Top 60 genes in MSC

| UniGene ID | Gene title | Gene symbol | Chromosomal location |
|---|---|---|---|
| Hs.593858 | ADAM metallopeptidase domain 12 (meltrin alpha) | ADAM12 | chr10q26.3 |
| Hs.534115 | ADAM metallopeptidase with thrombospondin type 1 motif, 1 | ADAMTS1 | chr21q21.2 |
| Hs.58324 | ADAM metallopeptidase with thrombospondin type 1 motif, 5 | ADAMTS5 | chr21q21.3 |
| Hs.459538 | Aldehyde dehydrogenase 1 family, member A3 | ALDH1A3 | chr15q26.3 |
| Hs.116471 | Cadherin 11, type 2, OB-cadherin (osteoblast) | CDH11 | chr16q22.1 |
| Hs.101302 | Collagen, type XII, alpha 1 | COL12A1 | chr6q12-q13 |
| Hs.172928 | Collagen, type I, alpha 1 | COL1A1 | chr17q21.33 |
| Hs.489142 | Collagen, type I, alpha 2 | COL1A2 | chr7q22.1 |
| Hs.443625 | Collagen, type III, alpha 1 | COL3A1 | chr2q31 |
| Hs.474053 | Collagen, type VI, alpha 1 | COL6A1 | chr21q22.3 |
| Hs.420269 | Collagen, type VI, alpha 2 | COL6A2 | chr21q22.3 |
| Hs.233240 | Collagen, type VI, alpha 3 | COL6A3 | chr2q37 |
| Hs.522891 | Chemokine (C-X-C motif) ligand 12 (stromal cell-derived factor 1) | CXCL12 | chr10q11.1 |
| Hs.154654 | Cytochrome P450, family 1, subfamily B, polypeptide 1 | CYP1B1 | chr2p21 |
| Hs.156316 | Decorin | DCN | chr12q21.33 |
| Hs.190977 | Ectonucleotide pyrophosphatase/phosphodiesterase 2 (autotaxin) | ENPP2 | chr8q24.1 |
| Hs.284244 | Fibroblast growth factor 2 (basic) | FGF2 | chr4q26-q27 |
| Hs.519385 | Forkhead box D1 | FOXD1 | chr5q12-q13 |
| Hs.9914 | Follistatin | FST | chr5q11.2 |
| Hs.269027 | D-galactosamine:polypeptide N-acetylgalactosaminyltransferase | GALNT5 | chr2q24.1 |
| Hs.514746 | GATA binding protein 6 | GATA6 | 18q11.1-q11.2 |
| Hs.631650 | Glycosyltransferase 8 domain containing 2 | GLT8D2 | chr12q |
| Hs.40098 | Gremlin 1, cysteine knot superfamily, homolog (Xenopus laevis) | GREM1 | chr15q13-q15 |
| Hs.57697 | Hyaluronan synthase 1 | HAS1 | chr19q13.4 |
| Hs.571528 | Hyaluronan synthase 2 | HAS2 | chr8q24.12 |
| Hs.58877 | Hemicentin 1 | HMCN1 | chr1q25.3-q31.1 |
| Hs.549040 | Homeobox C6 | HOXC6 | chr12q13.3 |
| Hs.450230 | Insulin-like growth factor binding protein 3 | IGFBP3 | chr7p13-p12 |
| Hs.557403 | Interleukin 1 receptor, type I | IL1R1 | chr2q12 |
| Hs.513022 | Immunoglobulin superfamily containing leucine-rich repeat | ISLR | chr15q23-q24 |
| Hs.591210 | Integrin, beta-like 1 (with EGF-like repeat domains) | ITGBL1 | chr13q33 |
| Hs.459088 | KIAA1199 | KIAA1199 | chr15q24 |
| Hs.534499 | Keratin associated protein 1–5 | KRTAP1–5 | chr17q12-q21 |
| Hs.65436 | Lysyl oxidase-like 1 | LOXL1 | chr15q24-q25 |
| Hs.406475 | Lumican | LUM | chr12q21.3-q22 |
| Hs.369840 | Nidogen 2 (osteonidogen) | NID2 | chr14q21-q22 |
| Hs.9315 | Olfactomedin-like 3 | OLFML3 | chr1p13.2 |
| Hs.494928 | Pregnancy-associated plasma protein A, pappalysin 1 | PAPPA | chr9q33.2 |
| Hs.74615 | Platelet-derived growth factor receptor, alpha polypeptide | PDGFRA | chr4q11-q13 |
| Hs.434900 | PDZ domain containing RING finger 3 | PDZRN3 | chr3p13 |
| Hs.339831 | Proenkephalin | PENK | chr8q23-q24 |
| Hs.92282 | Paired-like homeodomain transcription factor 2 | PITX2 | chr4q25-q27 |
| Hs.136348 | Periostin, osteoblast specific factor | POSTN | chr13q13.3 |
| Hs.405156 | Phosphatidic acid phosphatase type 2B | PPAP2B | chr1pter-p22.1 |
| Hs.431092 | Protein phosphatase 4, regulatory subunit 2 | PPP4R2 | chr3p13 |
| Hs.157461 | Proline rich 16 | PRR16 | chr5q23.1 |
| Hs.632475 | Paired related homeobox 1 | PRRX1 | chr1q24 |
| Hs.131269 | Retinoic acid receptor responder (tazarotene induced) 1 | RARRES1 | 3q25.32-q25.33 |
| Hs.351306 | Solute carrier family 16, member 4 (monocarboxylic acid transporter | SLC16A4 | chr1p13.3 |
| Hs.360174 | Snail homolog 2 (Drosophila) | SNAI2 | chr8q11 |
| Hs.592098 | SRY (sex determining region Y)-box 9 | SOX9 | 17q24.3-q25.1 |
| Hs.233160 | Stanniocalcin 2 | STC2 | chr5q35.2 |
| Hs.409602 | Sulfatase 1 | SULF1 | chr8q13.2-q13.3 |
| Hs.632099 | Transgelin | TAGLN | chr11q23.2 |
| Hs.371147 | Thrombospondin 2 | THBS2 | chr6q27 |
| Hs.143250 | Tenascin C (hexabrachion) | TNC | chr9q33 |
| Hs.199814 | Thyrotropin-releasing hormone degrading enzyme | TRHDE | chr12q15-q21 |
| Hs.66744 | Twist homolog 1 (Saethre-Chotzen syndrome) (Drosophila) | TWIST1 | chr7p21.2 |
| Hs.435013 | Vestigial like 3 (Drosophila) | VGLL3 | chr3p12.1 |
| Hs.561260 | Wingless-type MMTV integration site family, member 5A | WNT5A | chr3p21-p14 |

of HSCs and precursors/matured cells, we acquired another list of stem cell signature genes (q < 10⁻³). This list was then compared with the one obtained from the U133 Plus 2.0 chip analysis in Figure 1. As a result, a total of 87 genes were consistent in both signatures (Fig. 4A). These genes therefore represent the most likely stemness genes in the CD133+ HSC population. Several of them have been demonstrated or suggested to be stemness genes before, including ANGPT1, GATA2, HLF, ITGA9, KIT, NPR3, and PROM1 (CD133) (Fig. 4A) [5, 49, 50]. The relative expression levels of those 87 genes across the whole human body were examined by checking the genes' relative hybridization signals in another 256-chip microarray data set. Abundant expression of GATA2 in both LT- and ST-HSCs was observed (Fig. 4B), consistent with an essential role in HSCs.
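The overlap step described above can be sketched as a set intersection of two thresholded gene lists. This is a minimal illustration under stated assumptions: the function names are hypothetical, the q-values are placeholders rather than results from either data set, probe sets are assumed to be already mapped to gene symbols, and the same cutoff is applied to both lists for simplicity (the text states q < 10⁻³ only for the U133A-derived list).

```python
def signature(q_values, threshold=1e-3):
    """Return the set of genes whose q-value is below the threshold."""
    return {gene for gene, q in q_values.items() if q < threshold}


def consistent_genes(q_values_a, q_values_b, threshold=1e-3):
    """Genes passing the threshold in both data sets, sorted by symbol."""
    shared = signature(q_values_a, threshold) & signature(q_values_b, threshold)
    return sorted(shared)


# Placeholder q-values keyed by gene symbol (hypothetical, not real results).
plus2_q = {"geneA": 1e-5, "geneB": 2e-4, "geneC": 0.03}
u133a_q = {"geneA": 5e-4, "geneB": 0.2, "geneD": 1e-6}
print(consistent_genes(plus2_q, u133a_q))  # ['geneA']
```

Only genes that pass the cutoff on both platforms survive, which is why the overlap is much smaller than either signature alone.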

Huang, Hsieh, Wu et al.


Table 3. Top 60 genes in NSC

| UniGene ID | Gene title | Gene symbol | Chromosomal location |
|---|---|---|---|
| Hs.590919 | ADAM metallopeptidase with thrombospondin type 1 motif, 3 | ADAMTS3 | chr4q13.3 |
| Hs.620557 | Ankyrin 2, neuronal | ANK2 | chr4q25–q27 |
| Hs.315369 | Aquaporin 4 | AQP4 | chr18q11.2–q12.1 |
| Hs.300304 | Aristaless related homeobox | ARX | chrXp22.1–p21.3 |
| Hs.524672 | Achaete-scute complex-like 1 (Drosophila) | ASCL1 | chr12q22–q23 |
| Hs.34114 | ATPase, Na+/K+ transporting, alpha 2 (+) polypeptide | ATP1A2 | chr1q21–q23 |
| Hs.380027 | Chromosome 1 open reading frame 61 | C1orf61 | chr1q22 |
| Hs.75360 | Carboxypeptidase E | CPE | chr4q32.3 |
| Hs.169047 | Chondroitin sulfate proteoglycan 3 (neurocan) | CSPG3 | chr19p12 |
| Hs.45127 | Chondroitin sulfate proteoglycan 5 (neuroglycan C) | CSPG5 | chr3p21.3 |
| Hs.314543 | Catenin (cadherin-associated protein), delta 2 | CTNND2 | chr5p15.2 |
| Hs.507755 | Doublecortin and CaM kinase-like 1 | DCAMKL1 | chr13q13 |
| Hs.34780 | Doublecortex; lissencephaly, X-linked (doublecortin) | DCX | chrXq22.3–q23 |
| Hs.234074 | Delta-notch-like EGF repeat-containing transmembrane | DNER | chr2q36.3 |
| Hs.82002 | Endothelin receptor type B | EDNRB | chr13q22 |
| Hs.22634 | ets variant gene 1 | ETV1 | chr7p21.3 |
| Hs.26770 | Fatty acid binding protein 7, brain | FABP7 | chr6q22–q23 |
| Hs.506357 | Family with sequence similarity 107, member A | FAM107A | chr3p21.1 |
| Hs.528335 | Family with sequence similarity 123A | FAM123A | chr13q12.13 |
| Hs.491856 | Family with sequence similarity 77, member D | FAM77D | chr8q12.3 |
| Hs.98523 | FAT tumor suppressor homolog 3 (Drosophila) | FAT3 | chr11q14.3 |
| Hs.632336 | Forkhead box G1 | FOXG1 | chr14q12–q13 |
| Hs.420036 | Glutamate decarboxylase 1 (brain, 67kDa) | GAD1 | chr2q31 |
| Hs.134974 | Growth-associated protein 43 | GAP43 | chr3q13.1–q13.2 |
| Hs.65029 | Growth arrest-specific 1 | GAS1 | chr9q21.3–q22 |
| Hs.387995 | Glycosyltransferase 25 domain containing 2 | GLT25D2 | chr1q25 |
| Hs.75819 | Glycoprotein M6A | GPM6A | chr4q34 |
| Hs.495710 | Glycoprotein M6B | GPM6B | chrXp22.2 |
| Hs.32763 | Glutamate receptor, ionotropic, AMPA 2 | GRIA2 | chr4q32–q33 |
| Hs.385956 | Heparan sulfate 6-O-sulfotransferase 2 | HS6ST2 | chrXq26.2 |
| Hs.592171 | Integrin, beta 8 | ITGB8 | chr7p15.3 |
| Hs.445265 | LIM homeobox 2 | LHX2 | chr9q33–q34.1 |
| Hs.380048 | Hypothetical gene supported by AK091454 | LOC285382 | chr3q27.2 |
| Hs.12827 | Hypothetical protein LOC645323 | LOC645323 | chr5q14.3 |
| Hs.517868 | Leucine-rich repeat containing 3B | LRRC3B | chr3p24 |
| Hs.163244 | Leucine-rich repeat neuronal 1 | LRRN1 | chr3p26.2 |
| Hs.368281 | Microtubule-associated protein 2 | MAP2 | chr2q34–q35 |
| Hs.438709 | Multiple EGF-like-domains 10 | MEGF10 | chr5q33 |
| Hs.591101 | Likely ortholog of mouse neighbor of Punc E11 | NOPE | chr15q22.31 |
| Hs.468505 | Neurexin 1 | NRXN1 | chr2p16.3 |
| Hs.591993 | Paired box gene 6 (aniridia, keratitis) | PAX6 | chr11p13 |
| Hs.371249 | Pleiotrophin (neurite growth-promoting factor 1) | PTN | chr7q33–q34 |
| Hs.489824 | Protein tyrosine phosphatase, receptor-type, Z polypeptide 1 | PTPRZ1 | chr7q31.3 |
| Hs.388827 | Regulatory factor X, 4 (influences HLA class II expression) | RFX4 | chr12q24 |
| Hs.135787 | sal-like 1 (Drosophila) | SALL1 | chr16q12.1 |
| Hs.511265 | Sema domain, transmembrane domain (TM), cytoplasmic domain 6D | SEMA6D | chr15q21.1 |
| Hs.132591 | Solute carrier family 10 (sodium/bile acid cotransporter family), 4 | SLC10A4 | chr4p12 |
| Hs.502338 | Solute carrier family 1 (glial high affinity glutamate transporter), 2 | SLC1A2 | chr11p13–p12 |
| Hs.481918 | Solute carrier family 1 (glial high affinity glutamate transporter), 3 | SLC1A3 | chr5p13 |
| Hs.432638 | SRY (sex determining region Y)-box 11 | SOX11 | chr2p25 |
| Hs.518438 | SRY (sex determining region Y)-box 2 | SOX2 | chr3q26.3–q27 |
| Hs.592098 | SRY (sex determining region Y)-box 9 | SOX9 | chr17q24.3–q25.1 |
| Hs.195922 | Sp8 transcription factor | SP8 | chr7p21.2 |
| Hs.62886 | SPARC-like 1 (mast9, hevin) | SPARCL1 | chr4q22.1 |
| Hs.303609 | ST6-N-acetylgalactosaminide alpha-2,6-sialyltransferase 5 | ST6GALNAC5 | chr1p31.1 |
| Hs.56145 | Thymosin-like 8 | TMSL8 | chrXq21.33–q22.3 |
| Hs.143250 | Tenascin C (hexabrachion) | TNC | chr9q33 |
| Hs.460789 | Trinucleotide repeat containing 9 | TNRC9 | chr16q12.1 |
| Hs.300701 | Tubulin, beta 2B | TUBB2B | chr6p25 |
| Hs.535724 | Zic family member 1 (odd-paired homolog, Drosophila) | ZIC1 | chr3q24 |

Dedifferentiation-Like Transcriptome Reprogramming Induced by GATA2 in Human Endothelial Cells

Using the knowledge-based strategy that we applied, only a small fraction (215/1,572 = 13.7%) of CD133+ cell-enriched genes were involved in network formation (Fig. 3A). This may

be due to the fact that only a few molecular processes of stem cells have been unveiled. To improve our understanding of stem cell biology, we next aimed to identify novel interactions between signature genes of CD133+ HSCs. Since GATA2 is uniquely expressed in CD133+ cells (Fig. 4) and is essential for the proliferation/survival of early hematopoietic cells [51, 52], we


Table 4. Genes highly expressed in all 3 somatic stem cells

| UniGene ID | Gene title | Gene symbol | Chromosomal location |
|---|---|---|---|
| Hs.334707 | Aminoacylase 1 | ACY1 | chr3p21.1 |
| Hs.386684 | Abelson helper integration site 1 | AHI1 | chr6q23.3 |
| Hs.567501 | Androgen-induced 1 | *AIG1 | chr6q24.2 |
| Hs.500645 | Aldehyde dehydrogenase 18 family, member A1 | ALDH18A1 | chr10q24.3 |
| Hs.369675 | Angiopoietin 1 | ANGPT1 | chr8q22.3-q23 |
| Hs.525163 | Ankyrin repeat domain 10 | ANKRD10 | chr13q34 |
| Hs.584884 | ATPase, Ca++ transporting, type 2C, member 1 | ATP2C1 | chr3q22.1 |
| Hs.418062 | beta-1,3-N-acetylgalactosaminyltransferase 1 | B3GALNT1 | chr3q25 |
| Hs.533446 | Brain and acute leukemia, cytoplasmic | BAALC | chr8q22.3 |
| Hs.124366 | Bobby sox homolog (Drosophila) | BBX | chr3q13.1 |
| Hs.334370 | Brain expressed, X-linked 1 | *BEX1 | chrXq21-q23 |
| Hs.288809 | Basic, immunoglobulin-like variable motif containing | BIVM | chr13q32-q33.1 |
| Hs.143733 | Coiled-coil domain containing 34 | *CCDC34 | chr11p14.1 |
| Hs.79015 | CD200 molecule | *CD200 | chr3q12-q13 |
| Hs.531962 | Centrosomal protein 70kDa | *CEP70 | chr3q22-q23 |
| Hs.6693 | Coiled-coil-helix-coiled-coil-helix domain containing 3 | CHCHD3 | chr7q32.3-q33 |
| Hs.59159 | Chromodomain helicase DNA binding protein 9 | CHD9 | chr16q12.2 |
| Hs.86368 | Calmegin | *CLGN | chr4q28.3-q31.1 |
| Hs.592052 | CKLF-like MARVEL transmembrane domain containing 4 | CMTM4 | chr16q22.1 |
| Hs.436542 | Cysteine-rich secretory protein LCCL domain containing 1 | *CRISPLD1 | chr8q21.11 |
| Hs.520070 | cutA divalent cation tolerance homolog (E. coli) | CUTA | chr6pter-p21.31 |
| Hs.189119 | CXXC finger 5 | CXXC5 | chr5q31.3 |
| Hs.315167 | Defective in sister chromatid cohesion homolog 1 (S. cerevisiae) | DCC1 | chr8q24.12 |
| Hs.592364 | Dihydrofolate reductase | *DHFR | chr5q11.2-q13.2 |
| Hs.194392 | dpy-19-like 3 (C. elegans) | DPY19L3 | chr19q13.11 |
| Hs.527980 | dUTP pyrophosphatase | DUT | chr15q15-q21.1 |
| Hs.388116 | Dishevelled, dsh homolog 3 (Drosophila) | DVL3 | chr3q27 |
| Hs.134857 | EF-hand calcium binding domain 2 | EFCAB2 | chr1q44 |
| Hs.403594 | EF-hand domain family, member A2 | EFHA2 | chr8p22 |
| Hs.271667 | EH domain binding protein 1 | EHBP1 | chr2p15 |
| Hs.249718 | Eukaryotic translation initiation factor 4E | EIF4E | chr4q21-q25 |
| Hs.9295 | Elastin (supravalvular aortic stenosis, Williams-Beuren syndrome) | ELN | chr7q11.23 |
| Hs.563491 | Ependymin-related protein 1 (zebrafish) | *EPDR1 | chr7p14.1 |
| Hs.28020 | EPM2A (laforin) interacting protein 1 | EPM2AIP1 | chr3p22.1 |
| Hs.468140 | Family with sequence similarity 98, member A | FAM98A | chr2p22.3 |
| Hs.173859 | Frizzled homolog 7 (Drosophila) | FZD7 | chr2q33 |
| Hs.213389 | Golgi autoantigen, golgin subfamily b, macrogolgin, 1 | GOLGB1 | chr3q13 |
| Hs.191539 | Golgi associated PDZ and coiled-coil motif containing | GOPC | chr6q21 |
| Hs.42586 | Glycerol-3-phosphate acyltransferase, mitochondrial | GPAM | chr10q25.2 |
| Hs.99195 | G protein-coupled receptor 125 | *GPR125 | chr4p15.31 |
| Hs.485557 | Glutathione S-transferase A4 | GSTA4 | chr6p12.1 |
| Hs.301961 | Glutathione S-transferase M1 | GSTM1 | chr1p13.3 |
| Hs.202179 | General transcription factor IIH, polypeptide 2, 44kDa | GTF2H2 | chr5q12.2-q13.3 |
| Hs.463677 | Helicase, lymphoid-specific | HELLS | chr10q24.2 |
| Hs.595053 | Heat shock 60kDa protein 1 (chaperonin) | HSPD1 | chr2q33.1 |
| Hs.528382 | Intraflagellar transport 81 homolog (Chlamydomonas) | *IFT81 | chr12q24.13 |
| Hs.443650 | Jumonji, AT rich interactive domain 1B (RBP2-like) | *JARID1B | chr1q32.1 |
| Hs.492314 | Lysosomal associated protein transmembrane 4 beta | *LAPTM4B | chr8q22.1 |
| Hs.332795 | Leucine zipper, down-regulated in cancer 1-like | LDOC1L | chr22q13.31 |
| Hs.514535 | Lectin, galactoside-binding, soluble, 3 binding protein | LGALS3BP | chr17q25 |
| Hs.177926 | Exonuclease NEF-sp /// exonuclease NEF-sp | LOC81691 | chr16p12.2 |
| Hs.584775 | Low density lipoprotein receptor-related protein 6 | LRP6 | chr12p11-p13 |
| Hs.145481 | Leucine rich repeat containing 16 | *LRRC16 | chr6p22.2 |
| Hs.571729 | Melanoma antigen family D, 4 | *MAGED4 | chrXp11.22 |
| Hs.467634 | Membrane bound O-acyltransferase domain containing 2 | *MBOAT2 | chr2p25.1 |
| Hs.526754 | Meis1, myeloid ecotropic viral integration site 1 homolog (mouse) | MEIS1 | chr2p14-p13 |
| Hs.270978 | Mesoderm specific transcript homolog (mouse) | *MEST | chr7q32 |
| Hs.444483 | Meiosis-specific nuclear structural 1 | MNS1 | chr15q21.3 |
| Hs.485527 | Methylmalonyl Coenzyme A mutase | MUT | chr6p21 |
| Hs.21365 | Nucleosome assembly protein 1-like 3 | *NAP1L3 | chrXq21.3-q22 |
| Hs.12554 | Nucleosome assembly protein 1-like 5 | NAP1L5 | chr4q22.1 |
| Hs.50130 | Necdin homolog (mouse) | *NDN | chr15q11.2–q12 |
| Hs.481181 | NIMA (never in mitosis gene a)-related kinase 1 | NEK1 | chr4q33 |
| Hs.461787 | Neuron derived neurotrophic factor | NENF | chr1q32.3 |
| Hs.155017 | Nuclear receptor interacting protein 1 | NRIP1 | chr21q11.2 |
| Hs.632458 | Nuclear casein kinase and cyclin-dependent kinase substrate 1 | NUCKS1 | chr1q32.1 |
| Hs.526594 | Obscurin-like 1 | *OBSL1 | chr2q35 |
| Hs.518774 | Phosphoribosylaminoimidazole carboxylase synthetase | PAICS | chr4pter–q21 |
| Hs.368610 | 3'-Phosphoadenosine 5'-phosphosulfate synthase 1 | PAPSS1 | chr4q24 |
| Hs.493096 | Pre-B-cell leukemia transcription factor 1 | *PBX1 | chr1q23 |
| Hs.191046 | Phosphodiesterase 1A, calmodulin-dependent | PDE1A | chr2q32.1 |
| Hs.487296 | Phosphoglycerate dehydrogenase | *PHGDH | chr1p12 |
| Hs.498732 | Phytanoyl-CoA 2-hydroxylase | PHYH | chr10pter–p11.2 |
| Hs.468415 | Phosphatidylinositol glycan anchor biosynthesis, class F | PIGF | chr2p21–p16 |
| Hs.181272 | Polycystic kidney disease 2 (autosomal dominant) | PKD2 | chr4q21–q23 |
| Hs.407580 | Plakophilin 4 | PKP4 | chr2q23–q31 |
| Hs.188614 | Pleckstrin homology domain containing, family A member 5 | *PLEKHA5 | chr12p12 |
| Hs.409965 | Pinin, desmosome-associated protein | PNN | chr14q21.1 |
| Hs.632618 | Peptidylprolyl isomerase (cyclophilin)-like 4 | PPIL4 | chr6q24–q25 |
| Hs.99500 | PR domain containing 16 | PRDM16 | chr1p36.23–p33 |
| Hs.506852 | Protein tyrosine phosphatase, non-receptor type 11 | PTPN11 | chr12q24 |
| Hs.534612 | RAB7B, member RAS oncogene family | RAB7B | chr1q32 |
| Hs.193118 | Retinoic acid induced 17 | RAI17 | chr10q22.3 |
| Hs.513057 | RAN binding protein 5 | RANBP5 | chr13q32.2 |
| Hs.546282 | Retinoblastoma binding protein 8 | *RBBP8 | chr18q11.2 |
| Hs.479396 | Recombining binding protein suppressor of hairless (Drosophila) | RBPSUH | chr4p15.2 |
| Hs.445030 | Rho-related BTB domain containing 3 | RHOBTB3 | chr5q15 |
| Hs.550150 | Ring finger protein 12 | RNF12 | chrXq13–q21 |
| Hs.13640 | Roundabout, axon guidance receptor, homolog 1 (Drosophila) | ROBO1 | chr3p12 |
| Hs.301048 | SEH1-like (Saccharomyces cerevisiae) | SEH1L | chr18p11.21 |
| Hs.591753 | Selenoprotein P, plasma, 1 | *SEPP1 | chr5q31 |
| Hs.384598 | Serpin peptidase inhibitor, clade G (C1 inhibitor), member 1, | SERPING1 | chr11q12–q13.1 |
| Hs.305971 | Solute carrier family 2 (facilitated glucose transporter), member 10 | SLC2A10 | chr20q13.1 |
| Hs.167700 | SMAD, mothers against DPP homolog 5 (Drosophila) | SMAD5 | chr5q31 |
| Hs.463439 | Sperm-associated antigen 9 | SPAG9 | chr17q21.33 |
| Hs.308418 | Suppressor of hairy wing homolog 3 (Drosophila) | SUHW3 | chrXq25 |
| Hs.516153 | Synaptic vesicle glycoprotein 2A | *SV2A | chr1q21.2 |
| Hs.581171 | Transducin (beta)-like 1X-linked receptor 1 | TBL1XR1 | chr3q26.32 |
| Hs.297324 | TIMP metallopeptidase inhibitor 3 | TIMP3 | 22q12.1–q13.2 |
| Hs.216386 | Transmembrane protein 5 | TMEM5 | chr12q14.2 |
| Hs.181444 | Transmembrane protein 9 | *TMEM9 | chr1q32.1 |
| Hs.440968 | Tumor protein p53 binding protein, 1 | TP53BP1 | chr15q15–q21 |
| Hs.500812 | Translocated promoter region (to activated MET oncogene) | TPR | chr1q25 |
| Hs.434971 | Trophinin | *TRO | Xp11.22–p11.21 |
| Hs.507916 | TSC22 domain family, member 1 | TSC22D1 | chr13q14 |
| Hs.375468 | TSPY-like 3 (pseudogene) | TSPYL3 | chr20q11.21 |
| Hs.173094 | TSPY-like 5 | TSPYL5 | chr8q22.1 |
| Hs.295732 | UTP20, small subunit (SSU) processome component, homolog (yeast) | UTP20 | chr12q23 |
| Hs.113876 | Wolf-Hirschhorn syndrome candidate 1 | WHSC1 | chr4p16.3 |
| Hs.444451 | Sterile alpha motif and leucine zipper containing kinase AZK | ZAK | chr2q24.2 |
| Hs.269211 | Zinc finger, MYM-type 4 | ZMYM4 | chr1p32–p34 |
| Hs.85863 | Zinc finger protein 135 | ZNF135 | chr19q13.4 |
| Hs.306221 | Zinc finger protein 326 | ZNF326 | chr1p22.2 |
| Hs.584933 | Zinc finger protein 334 | ZNF334 | chr20q13.12 |
| Hs.270869 | Zinc finger protein 410 | ZNF410 | chr14q24.3 |
| Hs.29698 | Zinc finger protein 605 | ZNF605 | chr12q24.33 |
| Hs.433473 | Zinc finger protein 667 | ZNF667 | chr19q13.43 |
| Hs.382874 | Zinc finger protein 70 | ZNF70 | chr22q11.2 |
| Hs.127473 | Zinc finger family member 788 | ZNF788 | chr19p13.2 |
| Hs.37138 | Zinc finger protein 85 | ZNF85 | chr19p13.1–p12 |

q < 10⁻³. Asterisks indicate genes that are also expressed in embryonic stem cells.

explored the downstream target genes of this critical transcription factor. GATA2 was introduced into the endothelial cell line HMEC1 or primary MVECs by lentivirus transduction, and gene expression microarray analysis was performed to examine the effects of GATA2 overexpression. The expression of 3,013 probe sets was affected by GATA2 in HMEC1 (with a pFDR threshold of q < 0.2). The gene expression profile of GATA2-transduced ECs was then compared with that of vector-transduced cells. We found that GATA2 overexpression resulted in the upregulation of CD133+ genes (such as ANGPT1 and NPR3) but the downregulation of EC genes (such as EPHB2, a key receptor involved in angiogenesis [53]) (Fig. 5B, 5C, respectively). This scenario is similar to that of somatic cell dedifferentiation or EC dysfunction [54–56]. Among the 45 GATA2-regulated HSC genes, 13 genes (28.9%) (ANGPT1, BBX, DPY19L3, GSTM1, HELLS, MEIS1, NRIP1, OBSL1, PDE1A, PHYH, SMAD5, SPAG9, and TPR) were present in all three somatic stem cells, suggesting an upstream and critical role of GATA2 in stem cells. To provide more quantitative evidence, we calculated the average linkage distances between CD133+ cells and GATA2-transduced ECs and between CD133+ cells and vector-transduced ECs. We used an average linkage distance analysis to assess the similarity between


Figure 2. Supervised hierarchical analysis of human somatic and embryonic stem cells. (A): A heat map shows genes differentially expressed between stem cells and matured cells. Genes in red, increased expression; blue, decreased expression. (B): Principal component analysis using global transcriptome (upper panel) or genes in (A) (lower panel). Each spot represents a single array sample. x-, y-, and z-axes represent three major PCs. Stem cells formed one cluster, and matured cells (PBMCs and two different types of endothelial cells) formed another. Abbreviations: BEC, blood vessel endothelial cell; ESC, embryonic stem cell; HSC, hematopoietic stem cell; LEC, lymphatic endothelial cell; MSC, mesenchymal stem cell; NSC, neural stem cell; PBMC, peripheral blood mononuclear cell; PC, principal component.

Figure 3. Interaction network analysis as a framework for the interpretation of stem cell biology. (A): A functional genetic network composed of multiple CD133+ hematopoietic stem cell (HSC) genes. This network is displayed graphically as nodes (gene products) and edges (biological relationships between nodes) mapped by the Ingenuity Pathway Analysis tool. The intensity of the node color indicates the degree of upregulation. Nodes are displayed using various shapes that represent the functional class of the gene product (right panel). (B): Selected core regions of the HSC gene network, highlighting several hub genes. Their corresponding locations in the genetic network are indicated by numbers. Key hub genes are labeled in yellow. (C): Locations of stem cell-common genes in the CD133+ genetic network. Genes commonly expressed in CD133+ cells, neural stem cells, and mesenchymal stem cells (Fig. 2; Table 4) are shown in red in the right panels.

Figure 4. Narrowing down HSC genes by comparing CD34+ HSCs with CD34+ precursors and a variety of matured progeny cells of HSCs. (A): A heat map showing 87 stemness genes enriched in CD34+ HSCs. U133A array data for each cell type were collected from publicly accessible databases (described in Materials and Methods). This collection comprised 18 arrays for CD34+ HSCs (both CD34+CD38−CD33−KIT+Rho^high and CD34+CD38−CD33−Rho^low HSCs), 8 arrays for CD34+ precursor cells (including pre-B cells, pro-B cells, CD34+CD33+ myeloid cells, and CD34+CD71+ early erythroid cells), and 31 arrays for matured progeny cells (including endothelial cells of artery, vein, microvascular blood vessel, or lymphatic vessel origin; hematopoietic cells; and smooth muscle cells). Genes underlined are transcription factors. Genes in red, increased expression; green, decreased expression. (B): GATA2 gene expression distribution among various types of normal human tissues. A total of 256 Affymetrix U133A array data were normalized together (described in Materials and Methods). Abbreviations: BEC, blood vessel endothelial cell; ESC, embryonic stem cell; GI, gastrointestinal; HAEC, human artery endothelial cell; HBEC, human bronchial epithelial cell; HSC, hematopoietic stem cell; HUVEC, human umbilical vein endothelial cell; LEC, lymphatic endothelial cell; LT, long-term self-renewing; MSC, mesenchymal stem cell; PBMC, peripheral blood mononuclear cell; ST, short-term self-renewing.

two groups of gene expression profiles, as described previously [28]. As shown in Figure 5D, the distance between CD133+ HSCs and GATA2-transduced ECs was smaller than that between HSCs and vector-transduced ECs. The genetic profile of GATA2-expressing ECs was therefore closer to that of CD133+ cells, supporting a dedifferentiation-like transcriptome reprogramming induced by GATA2 overexpression. We also checked which GATA2-regulated CD133+ genes might be direct targets of GATA2. Promoter regions of GATA2-regulated genes were examined for putative GATA-binding sites, and such motifs were found in the promoter regions of 30 genes (Fig. 5E). One of them is ANGPT1, a cytokine commonly expressed in CD133+ HSCs, NSCs, and MSCs (Table 4; supplemental online Fig. 1). There were several putative GATA-binding motifs in the promoter region of ANGPT1 (supplemental online Fig. 3A). Semiquantitative PCR confirmed that the ANGPT1 level was increased after GATA2 overexpression (supplemental online Fig. 3B). The relationships between GATA2 and the genes it regulates, including ANGPT1, are summarized in Figure 5E.
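The transcriptome distance comparison can be sketched as follows, taking the average linkage distance between two groups to be the mean of all pairwise distances between a profile in one group and a profile in the other. This is an illustration under stated assumptions rather than the procedure of reference [28]: the Euclidean metric is an assumed choice, the function names are hypothetical, and the three-probe profiles are toy numbers, not array data.

```python
import math
from itertools import product


def euclidean(u, v):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def average_linkage_distance(group_a, group_b, metric=euclidean):
    """Mean of all pairwise distances between profiles of the two groups."""
    if not group_a or not group_b:
        raise ValueError("both groups need at least one profile")
    pairs = list(product(group_a, group_b))
    return sum(metric(u, v) for u, v in pairs) / len(pairs)


# Toy profiles over three probe sets (hypothetical values for illustration).
hsc = [[5.0, 1.0, 4.0], [5.0, 1.0, 4.0]]
gata2_ec = [[4.0, 1.0, 4.0]]
vector_ec = [[1.0, 1.0, 1.0]]

d_gata2 = average_linkage_distance(hsc, gata2_ec)
d_vector = average_linkage_distance(hsc, vector_ec)
print(d_gata2, d_vector, d_gata2 < d_vector)
```

In this toy case the GATA2-transduced profile lies closer to the stem cell group than the vector control does, which is the pattern reported for Figure 5D. Passing a different `metric` (for example, one minus a correlation coefficient) changes the scale but not the logic.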

DISCUSSION

Stem cells have recently drawn immense research interest because of their unique biological behaviors and potential clinical uses. An improved understanding of HSCs can aid their ex vivo expansion or the in vivo control of their differentiation, thereby furthering their potential in therapeutic applications. In this study we performed an extensive comparative transcriptome and gene network analysis of HSCs, NSCs, MSCs, ESCs, and the progeny populations of HSCs. Moreover, we developed systems biology methods to reveal the interactions between signature genes. Genetic networks of stem cells provide in-depth information (such as key hub genes and novel GATA2 targets) that could not be readily extracted from one-dimensional gene list analysis. Novel relationships between GATA2 and other stemness genes were also mapped and confirmed in this study (Fig. 5; supplemental online Fig. 3A, 3B). For the first time, we found that somatic stem cell genes, like some embryonic stemness genes, also hold the potential to dedifferentiate matured cells: GATA2 overexpression induced a dedifferentiation-like transcriptome reprogramming in endothelial cells. These findings might pave the way to unraveling the mysteries of somatic stem cells, especially HSCs, and, furthermore, contribute to cell-based therapy.

We narrowed down our stem cell genes by comparing our gene list with those of previous work. We performed a meta-analysis on several public microarray data sets, which comprised CD34+CD38−CD33− HSCs (both Rho^high LT-HSCs and Rho^low ST-HSCs), CD34+ precursor cells (including pre-B cells, pro-B cells, CD34+CD33+ myeloid cells, and CD34+CD71+ early erythroid cells), and a variety of matured progeny cells of HSCs (including endothelial cells of artery, vein, microvascular blood vessel, or lymphatic vessel; hematopoietic cells; and smooth muscle cells). For the identification of HSC-enriched genes, this collection was very comprehensive. By overlapping HSC signatures from two different sources (i.e., from our own Affymetrix U133 Plus 2.0 data set and the publicly accessible U133A array collection), a total of 87 genes were identified. Among those genes, there were known hemangioblast markers (ANGPT1, CRHBP, GATA2, HLF, KIT, MEIS1, NPR3, and PROM1 [5–7]), supporting the results generated using the proposed strategy. These 87 genes should serve as candidate targets for future research.

The gene expression profile of human CD133+ cells has been reported by a few groups [5–7]. Nevertheless, finding significant


Figure 5. Constitutive expression of GATA2 in matured endothelial cells induced dedifferentiation-like transcriptome reprogramming. (A): Expression of GATA2 protein in HMEC1. FLAG-tagged GATA2 was transduced into a human endothelial cell line HMEC1 by lentivirus infection at a multiplicity of infection of 100, four times. Two days after the fourth infection, cell lysates were collected and GATA2 protein was detected by western blotting with an anti-FLAG antibody. A 56-kDa GATA2 band was observed (indicated by an arrow). (B): A heat map showing the upregulated CD133+ genes by GATA2. The gene expression profiles between GATA2- or vector-transduced HMEC1 were compared to reveal the impact of GATA2 on endothelial cell (EC) transcriptome. (C): A heat map showing the downregulated EC genes by GATA2. Again, the gene expression profiles between GATA2- and vector-transduced HMEC1 were compared. (D): Transcriptome distance analysis for CD133+ hematopoietic stem cells (HSCs), GATA2-transduced ECs, and empty lentivirus-transduced ECs. Average linkage distances between transcriptomes were calculated as described [70] using 4,454 probe sets distinguishing CD133+ HSCs and HMEC1 (q < 10⁻⁴). GATA2-overexpressing cells exhibited a gene expression pattern closer to that of CD133+ stem cells than was the profile of vector-transduced ones. (E): Initial gene regulatory network relations for GATA2 and its downstream targets. All these genes contain at least 1 GATA motif in their promoter regions (base pairs −3,000 to +1,000). Genes in yellow, transcription factors or cofactors; blue, signaling-related genes according to Gene Ontology database; green, signaling-related genes (also acting as transcription factors or cofactors); in black, genes with other functions. Genes commonly expressed in CD133+ HSC, neural stem cells, and mesenchymal stem cells (according to Table 4) are indicated by asterisks. Abbreviation: EC, endothelial cell.
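The promoter criterion in panel E (at least one GATA motif between base pairs −3,000 and +1,000) could be checked with a scan along the following lines. This is a sketch under assumptions: the motif definition used in the study is not stated, so the commonly cited GATA consensus WGATAR (W = A or T, R = A or G) is assumed here, along with its reverse complement YTATCW for the opposite strand; `find_gata_sites` is a hypothetical helper and the sequence is a made-up toy string, not a real promoter.

```python
import re

# Assumed consensus: WGATAR on the sense strand; YTATCW is its reverse
# complement, i.e. a site located on the antisense strand.
# Lookaheads are used so that overlapping sites are all reported.
FORWARD = re.compile(r"(?=([AT]GATA[AG]))")
REVERSE = re.compile(r"(?=([CT]TATC[AT]))")


def find_gata_sites(promoter, upstream=3000):
    """Return sorted (offset, strand, site) tuples for putative GATA motifs.

    `promoter` is the sense-strand sequence whose first base lies
    `upstream` bp before the transcription start site, so the offset
    is the 0-based string index minus `upstream`.
    """
    seq = promoter.upper()
    hits = []
    for strand, pattern in (("+", FORWARD), ("-", REVERSE)):
        for match in pattern.finditer(seq):
            hits.append((match.start() - upstream, strand, match.group(1)))
    return sorted(hits)


# Toy sequence with 10 bp of "upstream" region (hypothetical).
toy_promoter = "CCAGATAAGGCTTATCTCC"
print(find_gata_sites(toy_promoter, upstream=10))
```

For the window in the caption, the input would be a 4,000-bp sequence with `upstream=3000`; a gene would qualify when the returned list is non-empty. A short consensus such as this matches frequently by chance, so hits are only putative sites.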

targets and related genes is always a daunting task. Analyzing gene signatures by dividing them into functional subgroups (e.g., by gene set enrichment analysis [57]) or network modules is an efficient way to provide more insights into gene lists [57]. A systemic approach is mandatory to view the overall molecular events as a biological system for a given biological process, where we can find important genes as controllers [22, 23]. Those key genes very often serve as hubs to maintain the stability of a genetic module or to connect modules within a major network. By applying systems

biology tools we identified a major functional network in CD133+ cells, as well as in two other somatic stem cell types, and hub genes such as DHFR were identified via this systemic approach. Several CD133+ hub genes, such as PTPN11 (protein tyrosine phosphatase, nonreceptor type 11) and DHFR (dihydrofolate reductase), were also hubs in MSCs (Fig. 3D). PTPN11, also known as Shp-2, is required for embryonic development, as mice homozygous for the mutant allele die in utero at midgestation [58]. Mice mutant for EGFR and PTPN11 have defective cardiac semilunar valvulogenesis [59], supporting the hypothesis that PTPN11/Shp-2 is required for EGFR signaling in vivo. In our array analysis, EGFR was abundantly expressed in both MSCs and NSCs (supplemental online Table 1). Of note, in the network maps (Fig. 3; supplemental online Fig. 2), the number of common stem cell genes varies between gene networks. This is because we applied a knowledge-based strategy to construct genetic networks, so only genes with known interactions were included.

An important task in systems biology is to identify novel interactions and networks for signature genes. Using a knowledge-based strategy, only 13.7% of CD133+ genes were involved in networks (Fig. 3A). Although several crucial hub genes were mapped, many more key genes, such as GATA2, are still waiting to be unveiled. We extended this network core with the help of further microarray experiments, promoter analysis, and wet-lab confirmation. By overexpressing GATA2 in MVECs, we expanded this known genetic network and our knowledge of stem cells by identifying novel GATA2 targets, including ANGPT1. The robustness of this strategy can be improved by incorporating more systems biology algorithms, such as coexpression correlation based on the Pearson correlation coefficient, or more dynamic algorithms, such as liquid association [22, 23, 26, 60]. Underlying the application of such analyses to stem cells is the notion of correlated expression patterns of genes with related functions or regulator-target relationships, a high-level self-organization in gene expression networks, and a scale-free topology of such networks in cells [23, 61].

We found that GATA2 could induce transcriptome reprogramming in human endothelial cells. Previous studies had shown that differentiated adult cells could be transformed, or dedifferentiated, into pluripotent cells when fused with ES cells or exposed to extracts of ES cells [54, 55, 62]. This suggested that factors found in ES cells might be essential for conferring pluripotency on other cells. Artificially induced pluripotent stem cells could be generated directly from human or mouse skin fibroblast cultures by introducing just four defined stemness factors: Oct3/4, Sox2, cMyc, and Klf4 [56, 63–67]. However, it is not yet clear whether genes from somatic stem cells have a similar ability. For HSCs, one of the critical factors might be GATA2. GATA2 is involved in the restriction of hematopoietic cell differentiation. Forced expression of GATA2 in erythroid precursors provoked their proliferation while blocking their differentiation [9, 52]. In adipose tissues, constitutive expression of GATA2 in brown adipocytes suppressed genes expressed in matured adipocytes, whereas disruption of a GATA2 allele resulted in significantly elevated differentiation of preadipocytes [68]. These data, together with our transcriptome distance measurement, indicate that key factors of somatic stem cells also hold the potential to reinduce multipotency. The combination of GATA2 with other critical somatic stemness genes may eventually reprogram the genetic network of endothelial cells back to a stem cell-like state in vitro. Manipulating the steady expression of GATA2, as well as other discovered stem cell genes, may also help the ex vivo expansion of CD133+ stem cells. We evaluated the impact of GATA2 overexpression in matured endothelial cells, but no significant phenotypic

REFERENCES

1199

changes could be observed (data not shown). This may due to the fact that it will take more than one gene to induce a significant stem cell-like phenotype [56, 63– 67]. GATA2 alone, therefore, may not be enough to induce a clear stemlike phenotype. However, a global transcriptome change did occur in GATA2-expressing ECs (Fig. 5), implying that they had started their journey back to the ancestor stem cell state. These data further imply that detecting gene expression changes is a more sensitive tactic than analyzing cellular function in studies such as somatic cell dedifferentiation or transformation. A recent report on the transformation of human MSCs supported this point: even though clear malignancy phenotypes (such as grow in soft agar or form tumors in nude mice) could be observed only after all five oncogenes (hTERT, E6, E7, small t Ag, and Ras) were introduced into primary MSCs, clear transcriptome changes could be detected whenever an oncogene was added [69].
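As an illustration of the coexpression analysis mentioned in this Discussion, the following minimal Python sketch links gene pairs whose expression profiles are strongly correlated across samples. It is not the pipeline used in this study: the function name `coexpression_edges`, the 0.9 threshold, the gene labels, and the toy expression values are all assumptions made for illustration only.

```python
import numpy as np

def coexpression_edges(expr, genes, threshold=0.9):
    """Return gene pairs whose absolute Pearson correlation meets the threshold.

    expr: 2-D array, one row per gene and one column per sample.
    threshold: hypothetical cutoff chosen for illustration.
    """
    corr = np.corrcoef(expr)  # pairwise Pearson correlation between rows
    edges = []
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            if abs(corr[i, j]) >= threshold:
                edges.append((genes[i], genes[j], float(corr[i, j])))
    return edges

# Toy log-scale expression values (hypothetical, not data from this study).
genes = ["GENE_A", "GENE_B", "GENE_C"]
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.1, 3.9, 6.2, 8.0],
    [4.0, 1.0, 3.5, 2.0],
])
edges = coexpression_edges(expr, genes)
for gene_1, gene_2, r in edges:
    print(f"{gene_1} - {gene_2}: r = {r:.3f}")
```

In this toy input only GENE_A and GENE_B track each other closely, so they form the single edge. A static correlation of this kind does not capture condition-dependent relationships, which is the motivation for more dynamic measures such as liquid association.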

CONCLUSION Stemness and differentiation are highly complex processes governed by the coordinated regulation of distinct genetic programs. The biological functions of genes enriched in CD133+ cells, NSCs, or MSCs remain mostly unknown. More comprehensive, integrated studies enabling the determination of all interactions will offer additional insights into how such a complex interaction map may contribute to unique stem cell behaviors. Thus, this study provides novel strategies for further genetic network studies on stem cells and for research on somatic cell dedifferentiation. We also provide a novel bioinformatics approach, based on calculating transcriptome distances, to study dedifferentiation induced by stem cell genes. Lastly, we believe that the approaches taken here can also serve as a model for other complex biological processes.
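The idea of a transcriptome distance can be sketched in a few lines of Python. The example below assumes, purely for illustration, that the distance is the Euclidean distance between expression profiles measured over the same ordered set of genes; the metric actually used in this study is defined in its Methods and may differ. The function name `transcriptome_distance` and all profile values are hypothetical.

```python
import numpy as np

def transcriptome_distance(profile_a, profile_b):
    """Euclidean distance between two expression profiles over the same genes.

    Assumed metric for illustration; not necessarily the one used in the study.
    """
    a = np.asarray(profile_a, dtype=float)
    b = np.asarray(profile_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("profiles must cover the same genes in the same order")
    return float(np.linalg.norm(a - b))

# Hypothetical log-scale profiles over four signature genes (not real data).
stem_cell = [8.0, 7.5, 2.0, 1.0]
control_cell = [3.0, 2.5, 7.0, 6.0]
treated_cell = [4.0, 3.5, 6.0, 5.0]

d_control = transcriptome_distance(control_cell, stem_cell)
d_treated = transcriptome_distance(treated_cell, stem_cell)
print(d_control, d_treated)
```

Under this assumed metric, a treated profile that lies closer to the stem cell profile than the control does would be read as a shift toward the stem cell state, even when no phenotypic change is yet visible.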

ACKNOWLEDGMENTS We thank Li-Li Li and Dr. Oscar Kuang-Sheng Lee for critical reading of the manuscript. We acknowledge the technical services provided by the Gene Expression Analysis Core Facility of the Veteran General Hospital-Yang Ming Genome Center (VYMGC), National Yang-Ming University. The Gene Expression Analysis Core Facility is supported by the National Research Program for Genomic Medicine (NRPGM), National Science Council. This work was supported by the National Science Council (NRPGM, NSC96-3112-B-010-009 and NSC96-2320-B-010-026), the Yen Tjing Lin Medical Foundation (CI-94-10 and CI-96-11), the Taipei City Hospital (95002-62-086), and a grant from the Ministry of Education Aim for the Top University Plan.

DISCLOSURE OF POTENTIAL CONFLICTS OF INTEREST The authors indicate no potential conflicts of interest.

REFERENCES

1 Brunstein CG, Baker KS, Wagner JE. Umbilical cord blood transplantation for myeloid malignancies. Curr Opin Hematol 2007;14:162–169.
2 Pelosi E, Valtieri M, Coppola S et al. Identification of the hemangioblast in postnatal life. Blood 2002;100:3203–3208.
3 Gläsker S, Li J, Xia JB et al. Hemangioblastomas share protein expression with embryonal hemangioblast progenitor cell. Cancer Res 2006;66:4167–4172.
4 Salven P, Mustjoki S, Alitalo R et al. VEGFR-3 and CD133 identify a population of CD34+ lymphatic/vascular endothelial precursor cells. Blood 2003;101:168–172.
5 Jaatinen T, Hemmoranta H, Hautaniemi S et al. Global gene expression profile of human cord blood-derived CD133+ cells. STEM CELLS 2006;24:631–641.
6 He X, Gonzalez V, Tsang A et al. Differential gene expression profiling of CD34+ CD133+ umbilical cord blood hematopoietic stem progenitor cells. Stem Cells Dev 2005;14:188–198.
7 Martin-Rendon E, Hale SJ, Ryan D et al. Transcriptional profiling of human cord blood CD133+ and cultured bone marrow mesenchymal stem cells in response to hypoxia. STEM CELLS 2007;25:1003–1012.
8 Kawabata H, Germain RS, Ikezoe T et al. Regulation of expression of murine transferrin receptor 2. Blood 2001;98:1949–1954.
9 Minegishi N, Suzuki N, Yokomizo T et al. Expression and domain-specific function of GATA-2 during differentiation of the hematopoietic precursor cells in midgestation mouse embryos. Blood 2003;102:896–905.
10 Mitchell KE, Weiss ML, Mitchell BM et al. Matrix cells from Wharton’s jelly form neurons and glia. STEM CELLS 2003;21:50–60.
11 Lee OK, Kuo TK, Chen WM et al. Isolation of multipotent mesenchymal stem cells from umbilical cord blood. Blood 2004;103:1669–1675.
12 Panepucci RA, Siufi JL, Silva WA Jr et al. Comparison of gene expression of umbilical cord vein and bone marrow-derived mesenchymal stem cells. STEM CELLS 2004;22:1263–1278.
13 Wang HS, Hung SC, Peng ST et al. Mesenchymal stem cells in the Wharton’s jelly of the human umbilical cord. STEM CELLS 2004;22:1330–1337.
14 Chang YJ, Shih DT, Tseng CP et al. Disparate mesenchyme-lineage tendencies in mesenchymal stem cells from human bone marrow and umbilical cord blood. STEM CELLS 2006;24:679–685.
15 Pittenger MF, Mackay AM, Beck SC et al. Multilineage potential of adult human mesenchymal stem cells. Science 1999;284:143–147.
16 Wagner W, Wein F, Seckinger A et al. Comparative characteristics of mesenchymal stem cells from human bone marrow, adipose tissue, and umbilical cord blood. Exp Hematol 2005;33:1402–1416.
17 Tsai MS, Hwang SM, Chen KD et al. Functional network analysis on the transcriptomes of mesenchymal stem cells derived from amniotic fluid, amniotic membrane, cord blood, and bone marrow. STEM CELLS 2007;25:2511–2523.
18 Ivanova NB, Dimos JT, Schaniel C et al. A stem cell molecular signature. Science 2002;298:601–604.
19 Ramalho-Santos M, Yoon S, Matsuzaki Y et al. “Stemness”: Transcriptional profiling of embryonic and adult stem cells. Science 2002;298:597–600.
20 Cai Y, Wu P, Ozen M et al. Gene expression profiling and analysis of signaling pathways involved in priming and differentiation of human neural stem cells. Neuroscience 2006;138:133–148.
21 Yu S, Zhang JZ, Xu Q. Genes associated with neuronal differentiation of precursors from human brain. Neuroscience 2006;141:817–825.
22 Barabási AL, Oltvai ZN. Network biology: Understanding the cell’s functional organization. Nat Rev Genet 2004;5:101–113.
23 Janes KA, Yaffe MB. Data-driven modelling of signal-transduction networks. Nat Rev Mol Cell Biol 2006;7:820–828.
24 Hanahan D, Weinberg RA. The hallmarks of cancer. Cell 2000;100:57–70.
25 Morrison SJ, Kimble J. Asymmetric and symmetric stem-cell divisions in development and cancer. Nature 2006;441:1068–1074.
26 Woolf PJ, Prudhomme W, Daheron L et al. Bayesian analysis of signaling networks governing embryonic stem cell fate decisions. Bioinformatics 2005;21:741–753.
27 Calvano SE, Xiao W, Richards DR et al. A network-based analysis of systemic inflammation in humans. Nature 2005;437:1032–1037.
28 Wang HW, Trotter MW, Lagos D et al. Kaposi sarcoma herpesvirus-induced cellular reprogramming contributes to the lymphatic endothelial gene expression in Kaposi sarcoma. Nat Genet 2004;36:687–693.
29 Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 2003;100:9440–9445.
30 Li C, Wong WH. Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection. Proc Natl Acad Sci U S A 2001;98:31–36.
31 Yang TP, Chang TY, Lin CH et al. ArrayFusion: A web application for multi-dimensional analysis of CGH, Snp and microarray data. Bioinformatics 2006;22:2697–2698.
32 Godfrey A, Anderson J, Papanastasiou A et al. Inhibiting primary effusion lymphoma by lentiviral vectors encoding short hairpin RNA. Blood 2005;105:2510–2518.
33 Lagos D, Trotter MW, Vart RJ et al. Kaposi sarcoma herpesvirus-encoded vFLIP and vIRF1 regulate antigen presentation in lymphatic endothelial cells. Blood 2007;109:1550–1558.
34 Phillips RL, Ernst RE, Brunk B et al. The genetic program of hematopoietic stem cells. Science 2000;288:1635–1640.
35 Baldus CD, Tanner SM, Kusewitt DF et al. BAALC, a novel marker of human hematopoietic progenitor cells. Exp Hematol 2003;31:1051–1056.
36 Ogawa M, Kizumoto M, Nishikawa S et al. Expression of alpha4-integrin defines the earliest precursor of hematopoietic cell lineage diverged from endothelial cells. Blood 1999;93:1168–1177.
37 Sakai D, Suzuki T, Osumi N et al. Cooperative action of Sox9, Snail2 and Pka signaling in early neural crest development. Development 2006;133:1323–1333.
38 Yang J, Mani SA, Donaher JL et al. Twist, a master regulator of morphogenesis, plays an essential role in tumor metastasis. Cell 2004;117:927–939.
39 Jung CG, Hida H, Nakahira K et al. Pleiotrophin mRNA is highly expressed in neural stem (progenitor) cells of mouse ventral mesencephalon and the product promotes production of dopaminergic neurons from embryonic stem cell-derived nestin-positive cells. FASEB J 2004;18:1237–1239.
40 Hanashima C, Li SC, Shen L et al. Foxg1 suppresses early cortical cell fate. Science 2004;303:56–59.
41 Kim SK, Selleri L, Lee JS et al. Pbx1 inactivation disrupts pancreas development and in Ipf1-deficient mice promotes diabetes mellitus. Nat Genet 2002;30:430–435.
42 Sanyal M, Tung JW, Karsunky H et al. B-cell development fails in the absence of the Pbx1 proto-oncogene. Blood 2007;109:4191–4199.
43 DiMartino JF, Selleri L, Traver D et al. The Hox cofactor and proto-oncogene Pbx1 is required for maintenance of definitive hematopoiesis in the fetal liver. Blood 2001;98:618–626.
44 Gregorio-King CC, McLeod JL, Collier FM et al. MERP1: A mammalian ependymin-related protein gene differentially expressed in hematopoietic cells. Gene 2002;286:249–257.
45 Takakura N, Watanabe T, Suenobu S et al. A role for hematopoietic stem cells in promoting angiogenesis. Cell 2000;102:199–209.
46 Leibundgut K, Schmitz NM, Hirt A. Catalytic activities of G1 cyclin-dependent kinases and phosphorylation of retinoblastoma protein in mobilized peripheral blood CD34+ hematopoietic progenitor cells. STEM CELLS 2005;23:1002–1011.
47 Forrest D, Erway LC, Ng L et al. Thyroid hormone receptor beta is essential for development of auditory function. Nat Genet 1996;13:354–357.
48 White R, Leonardsson G, Rosewell I et al. The nuclear receptor corepressor nrip1 (RIP140) is essential for female fertility. Nat Med 2000;6:1368–1374.
49 Loges S, Fehse B, Brockmann MA et al. Identification of the adult human hemangioblast. Stem Cells Dev 2004;13:229–242.
50 Suzuki N, Ohneda O, Minegishi N et al. Combinatorial Gata2 and Sca1 expression defines hematopoietic stem cells in the bone marrow niche. Proc Natl Acad Sci U S A 2006;103:2202–2207.
51 Tsai FY, Orkin SH. Transcription factor GATA-2 is required for proliferation/survival of early hematopoietic cells and mast cell formation, but not for erythroid and myeloid terminal differentiation. Blood 1997;89:3636–3643.
52 Ikonomi P, Rivera CE, Riordan M et al. Overexpression of GATA-2 inhibits erythroid and promotes megakaryocyte differentiation. Exp Hematol 2000;28:1423–1431.
53 Zou JX, Wang B, Kalo MS et al. An Eph receptor regulates integrin activity through R-Ras. Proc Natl Acad Sci U S A 1999;96:13813–13818.
54 Taranger CK, Noer A, Sorensen AL et al. Induction of dedifferentiation, genomewide transcriptional programming, and epigenetic reprogramming by extracts of carcinoma and embryonic stem cells. Mol Biol Cell 2005;16:5719–5735.
55 Grinnell KL, Yang B, Eckert RL et al. De-differentiation of mouse interfollicular keratinocytes by the embryonic transcription factor Oct-4. J Invest Dermatol 2007;127:372–380.
56 Takahashi K, Yamanaka S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 2006;126:663–676.
57 Subramanian A, Tamayo P, Mootha VK et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 2005;102:15545–15550.
58 Saxton TM, Henkemeyer M, Gasca S et al. Abnormal mesoderm patterning in mouse embryos mutant for the SH2 tyrosine phosphatase Shp-2. EMBO J 1997;16:2352–2364.
59 Chen B, Bronson RT, Klaman LD et al. Mice mutant for Egfr and Shp2 have defective cardiac semilunar valvulogenesis. Nat Genet 2000;24:296–299.
60 Li KC, Liu CT, Sun W et al. A system for enhancing genome-wide coexpression dynamics study. Proc Natl Acad Sci U S A 2004;101:15561–15566.
61 Yook SH, Oltvai ZN, Barabasi AL. Functional and topological characterization of protein interaction networks. Proteomics 2004;4:928–942.
62 Tada M, Takahama Y, Abe K et al. Nuclear reprogramming of somatic cells by in vitro hybridization with ES cells. Curr Biol 2001;11:1553–1558.
63 Okita K, Ichisaka T, Yamanaka S. Generation of germline-competent induced pluripotent stem cells. Nature 2007;448:313–317.
64 Wernig M, Meissner A, Foreman R et al. In vitro reprogramming of fibroblasts into a pluripotent ES-cell-like state. Nature 2007;448:318–324.
65 Park IH, Zhao R, West JA et al. Reprogramming of human somatic cells to pluripotency with defined factors. Nature 2008;451:141–146.
66 Takahashi K, Tanabe K, Ohnuki M et al. Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell 2007;131:861–872.
67 Yu J, Vodyanik MA, Smuga-Otto K et al. Induced pluripotent stem cell lines derived from human somatic cells. Science 2007;318:1917–1920.
68 Tsai J, Tong Q, Tan G et al. The transcription factor GATA2 regulates differentiation of brown adipocytes. EMBO Rep 2005;6:879–884.
69 Funes JM, Quintero M, Henderson S et al. Transformation of human mesenchymal stem cells increases their dependency on oxidative phosphorylation for energy production. Proc Natl Acad Sci U S A 2007;104:6223–6228.
70 Wang L, Wakisaka N, Tomlinson CC et al. The Kaposi’s sarcoma-associated herpesvirus (KSHV/HHV-8) K1 protein induces expression of angiogenic and invasion factors. Cancer Res 2004;64:2774–2781.

See www.StemCells.com for supplemental material available online.
