
RESEARCH ARTICLE

Systematic Prediction of Scaffold Proteins Reveals New Design Principles in Scaffold-Mediated Signal Transduction

Jianfei Hu1, Johnathan Neiswinger2, Jin Zhang2,3,4, Heng Zhu2,3,5, Jiang Qian1,3*


1 Department of Ophthalmology, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America, 2 Department of Pharmacology and Molecular Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America, 3 The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America, 4 Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America, 5 Center for High-Throughput Biology, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America * [email protected]

OPEN ACCESS
Citation: Hu J, Neiswinger J, Zhang J, Zhu H, Qian J (2015) Systematic Prediction of Scaffold Proteins Reveals New Design Principles in Scaffold-Mediated Signal Transduction. PLoS Comput Biol 11(9): e1004508. doi:10.1371/journal.pcbi.1004508
Editor: Andrey Rzhetsky, University of Chicago, UNITED STATES
Received: March 30, 2015
Accepted: August 3, 2015
Published: September 22, 2015
Copyright: © 2015 Hu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability Statement: All relevant data are within the paper and its Supporting Information files.
Funding: This work was supported in part by the NIH grants (RR020839 to JQ; GM076102, CA160036, HG006434, GM111514, and CEIRS to HZ; DK073368 and CA174423 to JZ). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.

Abstract

Scaffold proteins play a crucial role in facilitating signal transduction in eukaryotes by bringing together multiple signaling components. In this study, we performed a systematic analysis of scaffold proteins in signal transduction by integrating protein-protein interaction and kinase-substrate relationship networks. We predicted 212 scaffold proteins that are involved in 605 distinct signaling pathways. The computational prediction was validated using a protein microarray-based approach. The predicted scaffold proteins showed several interesting characteristics, as expected from the functionality of scaffold proteins. We found that the scaffold proteins are likely to interact with each other, which is consistent with the previous finding that scaffold proteins tend to form homodimers and heterodimers. Interestingly, a single scaffold protein can be involved in multiple signaling pathways by interacting with other scaffold protein partners. Furthermore, we propose two possible regulatory mechanisms by which the activity of scaffold proteins is coordinated with their associated pathways through phosphorylation.

Author Summary
Despite their importance in signal transduction, there has been no systematic effort to identify and characterize scaffold proteins in humans. In this work, we predicted scaffold proteins by integrating the available protein-protein interactions and kinase-substrate relationships. The predicted scaffold proteins showed characteristics of known scaffold proteins, suggesting the fidelity of our prediction. More importantly, the systematic prediction of scaffold proteins provides biological insights into scaffold-mediated signal transduction. We found that scaffold proteins are likely to form complexes, suggesting that scaffold proteins could participate in diverse signaling pathways through combinatorial interactions among scaffold proteins. Furthermore, the regulation of scaffold proteins’ activities has not been extensively studied. Our bioinformatics analysis suggests that scaffold proteins themselves might be regulated through phosphorylation.

PLOS Computational Biology | DOI:10.1371/journal.pcbi.1004508 September 22, 2015

Introduction
Protein phosphorylation and dephosphorylation are an important means of protein regulation that occurs in both prokaryotic and eukaryotic organisms [1–5]. Phosphorylation of a protein may result in a conformational change in its structure, recruitment of binding partners, or a change of localization, leading to its activation or deactivation [6,7]. In the context of a signaling pathway, a relay of phosphorylation events could allow the transmission of extracellular signals to intracellular targets. One well-known example is the RAS-ERK pathway, in which the small G-protein RAS activates the MAP3K RAF, which then phosphorylates and activates the MAP2K MEK1 (MAPKK1). MEK1 then phosphorylates and activates the MAPK ERK1/2 [8].

Biological systems contain a large number of phosphorylation-related signaling pathways. Many of these signaling pathways share common signaling components and are subject to extensive cross-regulation. The emergence of complex signaling networks prompts the question of specificity, and understanding how individual signals are transduced to arrive at specific outputs is of great importance to the biological community. It is believed that the answer may partially lie in the existence of scaffold proteins. Scaffold proteins act as “molecular glue”, linking multiple components in a phosphorylation-dependent signaling pathway together to facilitate signal transduction, and as such play a crucial role in the regulation of signaling cascades [8–13]. Scaffold proteins exert their effects through simple tethering of signaling proteins, proper orientation of target proteins, or allosteric assembly of pathway components. They can enhance signaling specificity by sequestering proteins, preventing unwanted cross-influence between proteins in different signaling pathways. They can also increase signaling efficiency by increasing the local concentration of each signaling component. Thus, knowledge of scaffold proteins can help improve our understanding of the regulation of subcellular signal transduction [14].

The traditional biochemical approach to identifying scaffold proteins requires multiple steps [15,16], including 1) selection of a scaffold protein candidate and the corresponding signaling pathway; 2) testing the protein-protein interactions between the scaffold candidate and the protein members of the selected pathway; and 3) assessment of the enhanced signaling readout of the signaling pathway in the presence of the scaffold candidate [12]. To date, there is no report of a systematic effort to comprehensively identify scaffold proteins. In this work, by taking advantage of the existing extensive datasets of protein-protein interactions (PPIs) and kinase-substrate relationships (KSRs), we developed a statistical approach to predict scaffold proteins. We predicted a large number of potential scaffold proteins, which share many characteristics with known scaffold proteins. Interestingly, we discovered that these predicted scaffold proteins are likely to form scaffold complexes and contain more phosphorylation sites than other proteins in the human proteome, suggesting that the functionality of the scaffold proteins might be regulated by phosphorylation.

Results
Protein mediators are widespread in signaling networks
We first constructed a composite network, which includes 55,048 protein-protein interactions (PPIs) and 1,103 kinase-substrate relationships (KSRs) in human [3,5,17]. For a given protein


pair, we calculated the shortest distance connecting the two proteins in the PPI network (see Methods). A distance of 1 indicates that two proteins directly interact with each other, while a distance of 2 indicates that they do not interact directly with each other, but both interact with a third protein (Fig 1A). Among the 1,103 protein pairs with known KSRs, 24.9% have a distance of 2 in the PPI network, suggesting that these signaling proteins are likely to interact with a shared protein mediator. In contrast, of the 6.4 × 10^7 human protein pairs in the PPI network, only 2.7% have a distance of 2 (Fig 1A). The shortest distance analysis suggests that protein mediators might be widespread among signaling proteins in the phosphorylation networks. We next examined the network motifs in the composite network, which represent the basic building blocks of a network [18]. The network motif relevant to scaffold proteins is the single-input module (SIM), where a single regulator regulates a set of proteins [19]. Here, the single regulator corresponds to a scaffold protein, while the set of proteins corresponds to the protein members of a signaling pathway. In our analysis, a SIM is identified if one protein shows PPIs with a set of proteins and the set of proteins forms a linear cascade through KSRs (Fig 1B). We observed that the SIMs are significantly enriched compared with their expected occurrences in networks in which the PPIs were randomly permuted (Fig 1B). For example, the SIM motif with a cascade length of 5 occurs 47 times, whereas only 2 occurrences are expected in a randomized network (Fig 1B). Both the shortest distance and network motif analyses suggest that a scaffold mediator is likely a widely used mechanism in phosphorylation signaling cascades.
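The shortest-distance analysis described above can be sketched with a plain breadth-first search. This is a minimal illustration, not the authors' implementation: the function names (`build_adjacency`, `ppi_distance`) and the toy protein labels are hypothetical, and the PPI network is assumed to be undirected and unweighted.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def ppi_distance(adj, source, target):
    """Shortest path length between two proteins, or None if unreachable."""
    if source == target:
        return 0
    if source not in adj or target not in adj:
        return None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adj[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Toy data (hypothetical): S binds both K1 and K2, so the KSR pair
# (K1, K2) has PPI distance 2 via the shared mediator S.
ppi_edges = [("S", "K1"), ("S", "K2"), ("K2", "K3"), ("K3", "K4")]
ksr_pairs = [("K1", "K2"), ("K2", "K3"), ("K1", "K4")]

adj = build_adjacency(ppi_edges)
distances = {pair: ppi_distance(adj, *pair) for pair in ksr_pairs}
fraction_at_2 = sum(d == 2 for d in distances.values()) / len(ksr_pairs)
```

On real data, the same fraction would be computed once for the KSR pairs and once for all protein pairs in the PPI network, and the two values compared.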

Prediction of scaffold proteins
In order to predict potential scaffold proteins in phosphorylation signaling cascades, we searched the composite network for proteins that show protein-protein interactions with multiple components in KSR networks (Fig 2). Note that in this work we do not distinguish between scaffold proteins and adaptors, which are smaller proteins binding only two signaling proteins [20]. The scaffold proteins in this work are simply defined as the protein hubs that interact with multiple members of a signaling pathway. We imposed a stringent requirement in predicting potential scaffold proteins: a given candidate must interact with all components of a particular pathway. Here, a pathway is defined as a set of proteins with linear KSRs. For example, if Kinase A phosphorylates Kinase B, and Kinase B phosphorylates Protein C, we constructed a pathway of A → B → C. Some proteins might interact with only a subset of the proteins in the pathway, such as proteins A and B (or proteins B and C). Contiguous sub-paths within a long pathway are therefore also considered as separate pathways (such as A → B and B → C). Note that pathways defined in this way are not necessarily the same as the biological pathways defined in other databases (e.g., the KEGG database) [21]. To assess the statistical significance of the predicted scaffold proteins, simulations were performed by permuting the PPIs while keeping the interaction degree (i.e., the number of interacting partners) of each protein unchanged. For a protein with a PPI degree of n and a targeted signaling pathway of length l, we calculated in the permuted networks the chance that a protein with the same PPI degree is predicted as a scaffold protein. Using 1,000 randomized PPI networks to calculate the false discovery rate, with a cutoff of 0.01, 212 proteins were predicted as scaffold proteins, which are associated with 605 non-redundant phosphorylation pathways.
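The pathway definition and the "interacts with all components" criterion can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' code: `enumerate_pathways`, `candidate_scaffolds`, and the `max_len` cap are hypothetical, pathways are taken to be simple directed paths in the KSR graph, and a candidate is assumed not to be a member of the pathway it scaffolds. The significance filtering against permuted networks is not included here.

```python
def enumerate_pathways(ksr_edges, max_len=5):
    """Enumerate linear KSR cascades as tuples of 2..max_len proteins.

    Every contiguous sub-path is produced as its own pathway, because
    each node is used as a possible starting point.
    """
    successors = {}
    for kinase, substrate in ksr_edges:
        successors.setdefault(kinase, set()).add(substrate)

    pathways = set()

    def extend(path):
        if len(path) >= 2:
            pathways.add(tuple(path))
        if len(path) == max_len:
            return
        for nxt in successors.get(path[-1], ()):
            if nxt not in path:  # keep the cascade free of repeated proteins
                extend(path + [nxt])

    for start in successors:
        extend([start])
    return pathways


def candidate_scaffolds(ppi_edges, pathways):
    """Map each protein to the pathways whose members it all interacts with."""
    partners = {}
    for a, b in ppi_edges:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)

    hits = {}
    for protein, neighbors in partners.items():
        for path in pathways:
            # Assumption: the candidate itself is not part of the pathway.
            if protein not in path and set(path) <= neighbors:
                hits.setdefault(protein, []).append(path)
    return hits


# Toy data (hypothetical): A phosphorylates B, B phosphorylates C.
ksr_edges = [("A", "B"), ("B", "C")]
ppi_edges = [("S", "A"), ("S", "B"), ("S", "C"), ("T", "A"), ("T", "B")]

pathways = enumerate_pathways(ksr_edges)
hits = candidate_scaffolds(ppi_edges, pathways)
```

In this toy example, S binds every member of A → B → C and of both sub-paths, while T binds only the members of A → B, which is the situation that motivates treating sub-paths as separate pathways.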
Among the 1,103 known KSRs, 359 of them (33%) are associated with at least one predicted scaffold protein. The resulting network is shown in S1 Fig. The predicted scaffold proteins and their associated pathways are listed in S1 Table. We then examined whether these scaffold proteins are chosen simply because of their high interaction degrees. Based on the PPI degree distribution, we found that the peak of the


Fig 1. Scaffold proteins are widespread in signaling networks. (A) PPI distance of KSR pairs and all human protein pairs. The PPI distance of a protein pair is defined as the shortest distance of the two proteins in PPI network. KSR pairs are significantly enriched in PPI distance = 2. In fact, 24.9% of KSR pairs have PPI distance of 2, while only 2.7% of all human protein pairs have the same PPI distance. (B) Network motifs in which one protein interacts with a series of proteins and these proteins form a cascade via KSRs. These network motifs are enriched, suggesting that scaffold proteins are widespread in signaling pathways. doi:10.1371/journal.pcbi.1004508.g001

distribution is located around 10 (S2 Fig). This distribution is similar to that of known scaffold proteins. This result indicates that the prediction of scaffold proteins is unlikely to be an artifact of high PPI degrees, although we did observe that proteins with high PPI degrees are more likely to be scaffold proteins (S3 Fig). We collected 78 known scaffold proteins for kinase signaling pathways through literature curation (S2 Table). Our prediction recovered 18 of them, yielding a sensitivity of 23%. In


Fig 2. Strategy to predict scaffold proteins. For each potential scaffold protein, we corrected for the effect of the interaction degree of the protein and the length of the associated pathways. We used randomized PPI networks to assess the significance of a predicted scaffold protein. Each randomized PPI network keeps the same PPI degree for every protein; it is generated by randomly selecting two PPI pairs and exchanging their partners. doi:10.1371/journal.pcbi.1004508.g002
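The randomization described in the Fig 2 caption (select two PPI pairs and exchange their partners) corresponds to a standard degree-preserving edge swap. The sketch below is one possible reading, not the authors' code: the function name, the number of swaps, the retry cap, and the rejection of swaps that would create self-interactions or duplicate edges are all assumptions, and the input is assumed to contain no self-interactions or duplicate pairs.

```python
import random
from collections import Counter


def degree_preserving_shuffle(edges, n_swaps, seed=None):
    """Randomize an undirected edge list while keeping every node's degree."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    done, attempts = 0, 0
    max_attempts = 100 * n_swaps  # assumed cap so the loop always terminates
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # pick one of the two possible rewirings
            c, d = d, c
        # Proposed rewiring: (a, b), (c, d) -> (a, d), (c, b)
        if a == d or c == b:
            continue  # would create a self-interaction
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 == new2 or new1 in edge_set or new2 in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degrees(edges):
    """Count the number of interaction partners per protein."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


# Toy network (hypothetical protein labels).
toy_edges = [("A", "B"), ("C", "D"), ("E", "F"),
             ("A", "C"), ("B", "E"), ("D", "F")]
shuffled = degree_preserving_shuffle(toy_edges, n_swaps=20, seed=0)
```

Each accepted swap removes one partner from each of the four proteins involved and gives each of them one new partner, so the degree of every protein is unchanged. Repeating the shuffle many times (1,000 in the paper) and rerunning the scaffold search on each randomized network yields the null distribution used for the false discovery rate.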

contrast, when 212 proteins are selected randomly from the whole human proteome (~24,000 proteins), only 0.69 known scaffold proteins are expected to be recovered. Therefore, our prediction of scaffold proteins shows a > 26-fold enrichment (p