
REVIEW

Evaluation of online miRNA resources for biomedical applications

Neil H. Tan Gana, Ann F. B. Victoriano and Takashi Okamoto*

Department of Molecular and Cell Biology, Nagoya City University Graduate School of Medical Sciences, 1-Kawasumi, Mizuho-cho, Mizuho-ku, Nagoya City 467-8601, Japan

MicroRNAs (miRNAs) are endogenous, single-stranded RNAs of approximately 22 nucleotides (nt) that pair with complementary mRNA sequences to initiate post-transcriptional regulation. This review updates and evaluates the public-domain resources available for miRNA identification and target prediction, with a view to their use in biomedical research. We first discuss the basic principles of computational miRNA studies, which follow from the nature and mechanism of action of miRNAs. We then survey fifty-nine current online miRNA tools, grouped into three classes: (i) miRNA identification; (ii) miRNA target prediction; and (iii) specialized miRNA tools.

Communicated by: Mitsuhiro Yanagida
*Correspondence: [email protected]

Introduction

MicroRNAs (miRNAs), a class of small non-coding RNAs of approximately 22 nucleotides (nt), endogenously regulate gene expression at the post-transcriptional level, and their study continues to establish itself as a significant research area in the biomedical sciences (Bartel 2009; Eggleston 2009). With numerous potential applications in oncology (Cho 2010), hematology (Zhao et al. 2010), neurology (Junn & Mouradian 2010), cardiology (da Costa Martins et al. 2010), metabolic diseases (Alvarez-Garcia & Miska 2005), immunity (Bagasra & Prilliman 2004) and infection (Yeung et al. 2007), miRNA research is widely expected to yield new strategies for combating disease in the near future. miRBase, a Web-based miRNA repository, currently holds 1424 putative human miRNA precursors and 1902 mature miRNAs (Kozomara & Griffiths-Jones 2011). These numbers are only a small fraction of the estimated total number of miRNAs awaiting discovery. Moreover, each miRNA must undergo experimental validation of its functional association with biological processes. miRNAs are estimated to constitute more than 3% of human genes and potentially regulate about 30% of the proteins (Lim et al. 2005). These values represent the human genome alone. Thus, there is an immense amount of work to be carried out, and computational techniques provide essential support for data generation, analysis, organization and curation.

The aims of this review are, first, to assess the current online tools for miRNA studies and, second, to present a general perspective on how these tools can be used to determine the roles of miRNAs in various diseases and disorders. The review is written from a non-programmer's point of view, for readers who are new to miRNA research. It may nevertheless be useful to experienced bioinformaticians as a source of insights for developing next-generation miRNA tools that are more user-friendly. Singling out the biomedical niche of miRNA research is timely because it is currently one of the most rapidly advancing research directions.

DOI: 10.1111/j.1365-2443.2011.01564.x
© 2011 The Authors. Journal compilation © 2011 by the Molecular Biology Society of Japan/Blackwell Publishing Ltd.
Genes to Cells (2012) 17, 11–27

Making sense out of miRNAs

The key to successful computational analysis of miRNAs is a comprehensive understanding of their core biological and molecular properties, the biophysical interactions in which they participate and, finally, their functional integration as components of a biological system that may itself be related to other biological systems. It is therefore necessary to discuss some salient principles of miRNA biogenesis, mechanisms and evolution. This simplified discussion may be supplemented with the highly detailed articles by others (Newman & Hammond 2010; Siomi & Siomi 2010).

miRNA biogenesis, mechanisms, and evolution

miRNAs may originate from almost any part of the genome. A precursor miRNA can be located intergenically or intragenically, although most reports indicate that the majority lie within the 3′ untranslated regions (UTRs) (Leung & Sharp 2006). The review of Wang (2009) thoroughly describes how miRNAs from various genomic locations are processed into their mature form. Regardless of genomic origin, the processing of mature miRNAs follows a broadly similar mechanism. The process generally involves conversion of the primary miRNA (pri-miRNA) into a precursor miRNA (pre-miRNA) that eventually becomes the mature miRNA (Zeng & Cullen 2006). The pre-miRNAs, which form hairpin-loop secondary structures, are cleaved from the parental pri-miRNA by the endonuclease activity of a protein complex called the Microprocessor (Kurihara & Watanabe 2010). These pre-miRNAs associate with RanGTP and exportins, which facilitate their translocation from the nucleus to the cytoplasm (Bohnsack et al. 2004). In the cytoplasm, dissociation of the pre-miRNA from the transporter proteins is initiated by hydrolysis of RanGTP (Kim 2004). The pre-miRNA is then free for DICER action, which forms the mature miRNA duplex of approximately 22 nt (Carmell & Hannon 2004; Cullen 2004; Hammond 2005; Harvey et al. 2008; Flores-Jasso et al. 2009). One of the complementary strands of this duplex, called the guide strand, associates with the RNA-induced silencing complex (RISC) assembly (Kawamata & Tomari 2010). The association of the guide strand with RISC induces conformational changes that form grooves, which in turn enhance the site for miRNA–mRNA target interaction (Brodersen & Voinnet 2009). Within this site, highly specific base complementation of the miRNA with its target mRNA sequence takes place (Ajay et al. 2010).
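The base complementation between a guide strand and its target can be tabulated position by position. The following is a minimal sketch rather than the method of any surveyed tool: it assumes a gap-free, antiparallel alignment of equal-length strands, which real duplexes (with bulges and loops) do not guarantee. The function names are hypothetical; the two sequences are the hsa-miR-29a strand and the HIV-1 target stretch shown in Figure 2, with the miRNA rewritten 5′ to 3′.

```python
# Watson-Crick and G:U wobble pairs over the RNA alphabet.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}
COMPLEMENT = str.maketrans("AUGC", "UACG")


def reverse_complement(rna):
    """Return the perfectly complementary strand, written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def pairing_profile(mirna, site):
    """Describe pairing at each miRNA position (5'->3').

    `site` is the target stretch written 5'->3'. Assumption of this sketch:
    no gaps, and the site is exactly as long as the miRNA.
    Returns '|' for Watson-Crick, ':' for G:U wobble, 'x' for a mismatch.
    """
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have equal length")
    partner = site[::-1]  # antiparallel: miRNA position 1 faces the site's 3' end
    symbols = []
    for m, t in zip(mirna, partner):
        if (m, t) in WATSON_CRICK:
            symbols.append("|")
        elif (m, t) in WOBBLE:
            symbols.append(":")
        else:
            symbols.append("x")
    return "".join(symbols)


def paired_fraction(profile):
    """Fraction of positions that are Watson-Crick paired."""
    return profile.count("|") / len(profile)


MIR_29A = "UAGCACCAUCUGAAAUCGGUUA"   # miRNA strand of Figure 2, written 5'->3'
HIV_SITE = "CACUGACCUUUGGAUGGUGCUA"  # target stretch of Figure 2, panel A.3

if __name__ == "__main__":
    print(pairing_profile(MIR_29A, reverse_complement(MIR_29A)))
    print(pairing_profile(MIR_29A, HIV_SITE))
```

Reading the profile from the left gives the 5′ end of the miRNA first, so the leading characters show how well the 5′ region is paired compared with the 3′ region.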
Overall complementarity of the binding sites is the single most important factor influencing miRNA targeting behavior, and it gives rise to a variety of gene regulatory effects (Long et al. 2008). A specific motif called the seed region, located at nucleotide positions 2–8 from the 5′ end of the miRNA, should match the mRNA target perfectly to initiate binding interactions (Bail et al. 2010; Heikham & Shankar 2010). Extensive evidence for the importance of these recognition sites has been reviewed by others (Bartel 2009). Variations outside the seed region may also influence miRNA activity. The efficiency of target mRNA degradation increases with the degree of complementarity within accessible binding sites. Perfect complementation between a miRNA and its target mRNA induces gene silencing by Argonaute proteins via endonucleolytic cleavage. A single miRNA can also bind to multiple sites within the 3′ UTR when the required degree of complementation is relaxed. Imperfect complementation between a miRNA and its mRNA target is reported to lead to translational inhibition, regardless of the involvement of the seed region (Jackson & Standart 2007; Seitz 2009; Pan et al. 2010). Although imperfect pairing may weaken degradation, miRNAs can increase their effectiveness through multiple binding sites on the same target. Concerted interactions of the different binding sites within the miRNA–RISC complex enhance degradation of the target mRNAs, which are deadenylated and decapped (Wu et al. 2006; Eulalio et al. 2009; Tomari 2009). Interestingly, another study linked the A–U-rich elements within miRNA target interaction sites to the enhancement of translation (O'Toole et al. 2006). Figure 1 illustrates an example of a miRNA and details the regions described here.

Furthermore, evidence implies that miRNAs follow a highly conserved evolutionary pattern, with significant similarities among the seed regions of related species (Zhang et al. 2009). Extending these observations, highly similar seed regions target the same sites among related species (Takane et al. 2010).
In contrast, there are also indications of divergence among miRNA species and their targets, which point to species-specific miRNA interactions (Ha et al. 2008; Okamura et al. 2008; Liang & Li 2009). These opposing properties, conservation and divergence, should be thoroughly characterized when designing miRNA discovery and targeting programs. From this account of miRNA biogenesis, mechanism and evolutionary properties, we can readily explain why computational approaches are necessary for miRNA studies. First, miRNAs are derived from highly stringent nucleotide sequences. Second, miRNAs are remarkably conserved, following a common path of evolutionary conservation together with a distinct pattern of evolutionary divergence.

[Figure 1: stem-loop diagram of the hsa-mir-29a hairpin with nucleotide positions marked from 1 to 65, panels (a)–(f), and the interior loop and terminal loop labeled. Sequences labeled in the panels: nt 1–12, AUGACUGAUUUC; nt 4–25, ACUGAUUUCUUUUGGUGUUCAG; nt 16–22, UGGUGUU. Panel (f) annotation string: HHHHEDDDDDDDDDLLLDDDDDDDXDDDELLLLL.]

Figure 1 Anatomy of a miRNA hairpin. (a) The stem-loop structure of the human miRNA hsa-mir-29a (MI0000087) hairpin, showing the interior loop or bulges (arrow '1') and the terminal loop (arrow '2'); (b) mature miRNA minor sequence, nt positions 4–25 (MIMAT0004503), showing the first stem (nt 1–12, in gray) and the inner stem (nt 16–22, in gray), which hybridizes with the mature miRNA major sequence into a mature miRNA duplex; (c) 5′ stem sequences forming extensions that cover the variable-length regions between loops, particularly nt positions 1–12 (first stem) and nt 16–22 (interior stem); (d) 3′ stem sequence overhang (3′-AUUG-5′, in gray), which is typical in pri-miRNAs; (e) mature miRNA major sequence, nt positions 42–63 (MIMAT0000086), which hybridizes with the mature miRNA minor sequence into a mature miRNA duplex (hybridization shown); (f) hypothetical interpretation by an algorithm identifying distinct areas of a miRNA in relation to a target, where H, overhang; D, stem; E, end nucleotide; X, mismatch; and L, loop.

miRNA biological functions and related information

miRNA studies, or microRNomics, cover extensive information, hence the need for computational analyses (Zhang 2008). The expanse of miRNA information extends from identification to functional assignment in systems biology. Information is first generated from miRNA biogenesis: nucleotide sequences and their corresponding structures identify potential miRNAs. As miRNAs are identified, their targets are predicted from complementation rules, which creates another layer of data. Later, miRNA functions are defined in the spatio-temporal states of regulatory processes. In the end, miRNA information clearly goes beyond identification and targeting. One example is whole-genome deep sequencing, whose results are likely to define the future of miRNA computational biology (discussed in succeeding sections).

miRNA data-mining tools

This review cites fifty-nine online human and biomedical-related miRNA resources that are publicly available for data mining. Such a large number of tools reflects a diversity of methodologies, which need further organization if they are to support logical problem-solving approaches. The tools can be classified in several ways, but usually as tools either for miRNA prediction or for the identification of miRNA targets. Yue et al. (2009) extensively categorized miRNA prediction tools within a statistical framework with two categories: rule-based and data-driven approaches. Further classification criteria may include the algorithms used to define miRNA functionality, the extent of genome coverage, the types of miRNA experimental data and disease coverage (Mendes et al. 2009). For this study, we classify the online tools into three categories. The first category, miRNA prediction, includes tools for the identification of putative miRNAs. The second comprises tools for miRNA target identification, which compute miRNA targets within mRNA sequences. The third comprises integrated sites whose modules carry out multiple tasks for miRNA identification and target prediction. The third category may also cover post-processing of data, such as assignment of biological functions and organization of the literature. Supporting online tools from manufacturers are sometimes included as additional links.

miRNA prediction

With information on the biophysical attributes of miRNAs, computational biologists can design programs that predict miRNAs from a given sequence. Initially, software identified hypothetical miRNA genes from the primary sequences and secondary structures of previously characterized miRNAs. However, this approach limited the discovery of novel miRNAs because its application is restricted to evolutionarily conserved miRNA species. Thus, computational tools should predict miRNAs beyond the information provided by comparative genomics. This limitation of comparative genomics-based prediction tools, which identify only highly conserved putative miRNA candidates, is now addressed by machine-learning approaches. Machine-learning methodologies have improved miRNA prediction by extending the analysis beyond sequence and structural properties to predict unknown miRNAs. The technique can define the thermodynamic properties of standard miRNAs, which through 'statistical inference' may define a putative miRNA candidate. Machine-learning algorithms allow computer programs to 'learn' from information collected from previously verified miRNAs, which serve as positive miRNA standards. Examples of these algorithms are neural networks, hidden Markov models (HMM), the k-nearest neighbor algorithm, Naïve Bayes (NB) and support vector machines (SVM) (Mendes et al. 2009; Saito & Saetrom 2010). Neural networks are general-purpose models, loosely inspired by processes in the human brain, that are used for data analysis. These processes may involve pattern recognition, detection, prediction and decision making (Yang 2010). HMMs allow pattern recognition in data sets, specifically nucleotide sequences (Agarwal et al. 2010). The Naïve Bayes algorithm builds a classification model from a training data set and assigns a candidate to a class according to the computed probability that it belongs to that class (Yousef et al. 2007). SVM, in contrast, encodes the numerical features of a miRNA as a single vector in an N-dimensional space. The algorithm compares the vectors of the positive class with those of the negative class and determines the hyperplane that gives the best separation margin between the two classes (Ben-Hur & Weston 2010).
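The idea of 'learning' from verified miRNAs can be illustrated with the simplest algorithm in the list above, the k-nearest neighbor rule. This is a sketch under stated assumptions and not the classifier of any surveyed tool (those rely mainly on SVM, HMM, NB or RF). The two features (GC fraction and a scaled hairpin stability score) and every number in the training set are invented for illustration only.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to `query`.

    `train` is a list of (feature_vector, label) pairs; distances are Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (GC fraction, scaled stability score) -> class.
# These values are made up for the example; they are not measurements.
TRAIN = [
    ((0.45, 0.90), "miRNA"),
    ((0.50, 0.85), "miRNA"),
    ((0.40, 0.95), "miRNA"),
    ((0.60, 0.30), "other"),
    ((0.55, 0.20), "other"),
    ((0.65, 0.35), "other"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, (0.48, 0.80)))  # candidate close to the positive set
    print(knn_predict(TRAIN, (0.58, 0.25)))  # candidate close to the negative set
```

The quality of such a classifier depends entirely on the positive and negative examples it is given, which is why prediction programs remain dependent on previously verified miRNAs.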
In the end, accumulated information on novel miRNA species should further strengthen the ability of bioinformatics to predict unidentified miRNAs. Because miRNAs are transiently expressed, exceptionally small molecules that must be distinguished from other ncRNAs, it is almost impossible to characterize a miRNA candidate fully from computational data alone. Thus, these data must eventually be confirmed with biochemical techniques. Although laboratory detection is itself occasionally challenging, any predicted miRNA must pass the rigorous test of experimental validation. To date, the information needed to verify a miRNA must include detection of the approximately 22-nt mature miRNA transcript, either by traditional hybridization-based techniques (Northern blot hybridization) or by amplification-based techniques (cloning, quantitative real-time PCR, etc.). Expression data should be further confirmed by evidence that the candidate follows the standard biogenesis pathway for miRNAs (as mentioned earlier). These criteria set the core standards for miRNA discovery (Lee & Ambros 2001). Recent technological advances, such as high-throughput screening and next-generation sequencing, have improved the detection of miRNA expression patterns. These breakthroughs have generated large data sets and are bound to modify the concepts of miRNA computational analysis (discussed later). Wang et al. (2010a,b) recently published a book with a comprehensive list of these techniques.

Hence, it is clear that miRNA prediction programs are highly dependent on previously discovered miRNAs, which become the training data for predictive algorithms. miRNA predictive tools are also supposed to address the limits of biomolecular techniques. In conclusion, however, computational data and biomolecular evidence must support each other before a putative miRNA can be considered valid. Table 1 lists the ten miRNA prediction tools surveyed in this paper. The list shows that most recently updated tools use machine-learning approaches, for the reasons discussed above. However, even if two prediction tools use the same classifier, they are unlikely to be identical, which complicates direct comparison of their performance. For an end user, the most productive way to use miRNA predictive tools is to run several algorithms and determine which predicted miRNAs are reported by all of them.

miRNA targeting

miRNA-directed gene regulation corresponds to complementary targeting of a gene transcript defined in a spatio-temporal context. Thus, miRNAs and their corresponding mRNA targets can be defined by their functional correlation. As our survey indicates, there are more miRNA targeting tools than miRNA prediction tools. The proliferation of target analysis tools may be attributed to the development of cutting-edge genomics technologies that generate large amounts of experimental data on miRNA–mRNA target interactions.

Table 1 General description of algorithms and software for miRNA discovery

Resource name     Algorithm  Sensitivity     Specificity     Organism  Reference
BayesMiRNA Find   NB         74.33% (1a)     95.33% (1b)     h         Yousef et al. (2006)
HHMMiR            HMM        93.20% (2a)     89.00% (2b)     all       Kadri et al. (2009)
MatureBayes       NB         na (3a)         na (3b)         all       Gkirtzou et al. (2010)
miRScan           RF         70.00% (4a)     50.00% (4b)     f, h      Lim et al. (2003a,b)
ProMir            HMM        73.00% (5a)     96.00% (5b)     h         Nam et al. (2006)
Triplet-SVM       SVM        93.30% (6a)     88.10% (6b)     h         Xue et al. (2005)
Mir-abela         SVM        71.00% (7a)     97.00% (7b)     h         Sewer et al. (2005)
miPred            RF         98.21% (8a)     95.05% (8b)     h         Jiang et al. (2007)
Micro-pro SVM     SVM        95.00% (9a)     90.00% (9b)     h         Helvik et al. (2007)
miRFinder         SVM        99.60% (10a)    70.00% (10b)    h         Huang et al. (2007)

Training parameters (listed in table order; the extracted column has nine entries for ten tools, so the row assignment is not shown): Pri-miRNA structures; Sequence and secondary structures of pre-miRNA; Scoring filter based on validated miRNA hairpins; Sequence alignments; Hairpin structure; Global hairpin structure; Hairpin structure; Sequence patterns; Secondary structures.

Algorithm (NB, Naïve Bayes; SVM, Support Vector Machine; HMM, Hidden Markov Model; RF, Random Forest). Sensitivity (in %); Specificity (in %). The sensitivity and the specificity of each program are mostly as reported by the authors; 1a,1b average of fivefold cross validation of positive and negative examples from human, worm and mouse; 2a,2b average of 10-fold cross validation across vertebrate, invertebrate and plant data sets; 3a,3b, ; 4a,4b based on real and pseudo human miRNA precursors; 5a,5b,7a,7b,9a,9b,10a,10b summarized in the report by Li et al. (2010). Organism compatibility: [f (fly), w (worm), h (human), m (mouse), r (rat), c (chicken), z (zebrafish), d (dog), v (vertebrates), all (all types of organisms)]. The software is available on the Web, but source codes are not published. This list of miRNA prediction tools is restricted to the ones evaluated by the authors based on the following criteria: (i) tools that are potentially useful for biomedical-related work, (ii) tools that are current and regularly updated within the last 5 years, and (iii) tools that evolved as 'standard controls' for miRNA prediction as per peer citation and preference.
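For readers comparing the Sensitivity and Specificity columns of Table 1, the two quantities can be computed from the counts of a validation run as sketched below. The formulas are the standard definitions; the function names and the counts in the usage lines are hypothetical and do not come from any of the cited evaluations.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real miRNAs that the tool recovers: TP / (TP + FN)."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive examples in the validation set")
    return true_pos / total


def specificity(true_neg, false_pos):
    """Fraction of non-miRNAs (e.g. pseudo hairpins) correctly rejected: TN / (TN + FP)."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no negative examples in the validation set")
    return true_neg / total


if __name__ == "__main__":
    # Hypothetical validation run: 100 real precursors and 100 pseudo hairpins.
    print(f"sensitivity = {sensitivity(93, 7):.2%}")
    print(f"specificity = {specificity(89, 11):.2%}")
```

A tool with very high sensitivity but modest specificity recovers most real miRNAs at the price of more false positives, which is one reason the two columns should be read together.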

Table 2 lists the 21 currently available target prediction tools surveyed in this study. There are no outright standards for comparing these tools, but a few guidelines apply. First, it is more rational to use tools that integrate machine-learning programs with filter-based algorithms, to maximize the chance of finding putative miRNA targets, including non-canonical ones, without a high false-positive rate. Second, downloadable tools are more flexible and useful because they allow user inputs and further data manipulation. Third, for advanced users, open-source code opens many possibilities. Lastly, software that integrates multi-algorithm analyses and so allows performance comparisons is better than a site featuring a single algorithm. Many authors have cited TargetScan (Friedman et al. 2010) as the best performer in this category, supported by actual proteomics data (Baek et al. 2008). Thus, TargetScan may be used as a gold standard for comparison with new software. However, caution is needed, because TargetScan is a highly stringent tool and therefore bypasses non-conventional targets. Other relevant information for target prediction analyses may include miRNA sequences, mRNA sequences and experimental evidence confirming miRNA–mRNA associations.
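The last guideline, combining several algorithms, can be approximated by hand when a site offers only one algorithm: collect the target lists from each tool and count how many tools support each target. The sketch below assumes the predictions have already been exported as sets of gene symbols. The tool labels and gene names are placeholders, not real output of any program in Table 2.

```python
from collections import Counter


def consensus_targets(predictions, min_tools=2):
    """Return {target: number of supporting tools} for targets meeting the threshold.

    `predictions` maps a tool label to the collection of targets it predicts
    for one miRNA.
    """
    support = Counter()
    for targets in predictions.values():
        support.update(set(targets))  # each tool votes at most once per target
    return {gene: n for gene, n in support.items() if n >= min_tools}


# Placeholder data for illustration only.
PREDICTIONS = {
    "tool_A": {"GENE1", "GENE2"},
    "tool_B": {"GENE2", "GENE3"},
    "tool_C": {"GENE1", "GENE2"},
}

if __name__ == "__main__":
    print(consensus_targets(PREDICTIONS, min_tools=2))
    print(consensus_targets(PREDICTIONS, min_tools=3))
```

Raising `min_tools` trades sensitivity for specificity, in the same way that a stringent single tool bypasses non-conventional targets.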

Four properties describing the miRNA–mRNA target interactions

The molecular modalities of interaction between miRNA and mRNA are fourfold:

1 Extensive characterization of associations between miRNAs and their mRNA targets has identified the base-pairing pattern as the most salient factor in miRNA target recognition. The 'seed region', nucleotide positions +2 to +8 from the 5′ end of the miRNA, is the crucial component and functions as the nucleation site of the RISC (Long et al. 2008). Studies have confirmed that these miRNA–target site associations, which allow perfect (stringent) and non-perfect (non-stringent) complementation, correlate with the degree of regulation (Gaidatzis et al. 2007). The RISC distinguishes offset sites at nucleotide positions +3 to +10, which can follow several patterns of seed matching (Bartel 2009).

Table 2 General description of algorithms and software for miRNA target prediction

Columns 1.A–1.F, tool features; columns 2.A–2.E, resource access.

Resource name   1.A  1.B  1.C  1.D  1.E  1.F  2.A         2.D  2.E  Reference
Diana miRExtra  RB   nc   dcf  mfe  nc   nc   -           -    +    Alexiou et al. (2010)
MiRanda         RB   nc   dcf  mfe  psc  nc   h,m,r       +    +    Betel et al. (2008)
PicTar          HMM  nc   dcf  mfe  msc  nc   v,f,w       -    +    Ruan et al. (2008)
Target Miner    SVM  nc   dcf  mfe  nc   ep   h           -    +    Bandyopadhyay & Mitra (2009)
TargetScan      RB   c    cfo  au   psc  nc   v           +    +    Lewis et al. (2003)
GenMiR++        NB   nc   dcf  nc   nc   ep   h           +    -    Huang et al. (2008)
HocTar          HMM  nc   dcf  mfe  nc   ep   h           +    +    Gennarino et al. (2009)
HumiTar         HMM  nc   nc   nc   nc   nc   h           -    -    Ruan et al. (2008)
MicroCosm       SVM  nc   dcf  mfe  psc  nc   met         -    +    Griffiths-Jones et al. (2008)
miRDB           SVM  nc   dcf  mfe  psc  nc   h,m,r,d,c   -    +    Wang (2008)
mirRIM          HMM  c    cfo  mfe  psc  nc   h,f         -    -    Terai et al. (2007)
MirTif          SVM  nc   nc   nc   nc   nc   -           -    +    Yang et al. (2008)
MirWIP          RB   nc   dcf  mfe  psc  ep   w           -    +    Hammell et al. (2008)
MiTarget        SVM  c    cfo  mfe  nc   nc   h,m,r,d,c   -    +    Kim et al. (2006)
MovingTarget    RB   nc   dcf  mfe  psc  nc   f           -    -    Burgler & Macdonald (2005)
MTar            ANN  c    cfo  mfe  msc  ep   h           -    -    Chandra et al. (2010)
NbMirTAR        NB   nc   dcf  mfe  nc   nc   h           -    +    Yousef et al. (2007)
PITA            RB   nc   cfo  mfe  psc  nc   h,m,f,w     +    +    Kertesz et al. (2007)
RepTar          HMM  c    cfo  all  msc  nc   h,m,vir     +    +    Elefant et al. (2011)
TargetRank      RB   nc   cfo  au   nc   nc   h,m         +    +    Nielsen et al. (2007)
TargetSpy       RB   c    dcf  mfe  psc  nc   h,m,c,f     +    +    Sturm et al. (2010)

Columns 2.B and 2.C (values as extracted, in four groups of ten; with 21 tools listed, their assignment to rows is not shown): os os os os os ds ds ds ds ds; os ds ds ds os ds ds ds ds ds; os ds ds ds ds os os os ds os; os ds ds ds ds os os os ds os.

Tool features: 1.A Algorithm (ANN, artificial neural network; HMM, Hidden Markov Model; M, multiple algorithm; NB, Naïve Bayes; RB, Restricted Bayesian; SVM, support vector machines); 1.B [target positions considered (c), target positions not considered (nc)]; 1.C [conservation filter option (cfo), default conservation filter (dcf), no conservation filter (nc)]; 1.D [minimum free energy site accessibility (mfe), A–U-rich flanking regional accessibility (AU), site accessibility not considered (nc)]; 1.E [multiple sites considered (msc), putative sites considered (psc)]; 1.F [expression profile considered (ep), expression profile not considered (nc)]. Resource access: 2.A [f (fly), w (worm), h (human), m (mouse), r (rat), c (chicken), z (zebrafish), d (dog), v (vertebrates), vir (virus)]; 2.B [optional own miRNA sequence (os), default miRNA sequences (ds)]; 2.C [optional own mRNA sequence (os), default mRNA sequences (ds)]; 2.D [source code: available (+), not available (-)]; 2.E [Web access: available (+), not available (-)]. This list of miRNA target tools is restricted to the ones evaluated by the authors based on the following criteria: (i) tools that are potentially useful for biomedical-related work, (ii) tools that are current and regularly updated within the last 5 years, and (iii) tools that evolved as 'standard controls' for miRNA targeting as per peer citation and preference.

Figure 2 Categories of seed matches in miRNA target sites, illustrated by the human miRNA hsa-mir-29a (MI0000087) targeting the HIV-1 genome (3′ UTR regions). (A) Canonical sites: (A.1) 7mer-A1 site: an exact match to positions 2–7 of the mature miRNA (the seed) followed by an 'A'; (A.2) 7mer-m8 site: an exact match to positions 2–8 of the mature miRNA (the seed + position 8); (A.3) 8mer site: an exact match to positions 2–8 of the mature miRNA (the seed + position 8) followed by an 'A'. All of these have been confirmed in various 'wet' experiments in which hsa-mir-29a targets HIV-1 3′ UTRs (Hariharan et al. 2005; Ahluwalia et al. 2008; Nathans et al. 2009). (B.1–2) Marginal sites and (C.1–2) atypical sites are shown with hypothetical nucleotide targets, as these are not normally considered preferred types in actual miRNA targeting studies. (C.2) Categories of mismatched seed regions seen in atypical compensatory sites are further classified as: (C.2.1) GUM contains a G:U wobble and a U on the seed site of the miRNA; (C.2.2) GUT has a U present anywhere on the mRNA target; (C.2.3) BM contains a single bulge and mismatch at the seed site; (C.2.4) BT contains a mismatch present at the target site; (C.2.5) LP contains a single loop (Bartel 2009). miRNA binding to the target is directed solely by the seed region at the 5′ end, which strictly requires perfect complementation, whereas the 3′ end allows imperfect complementation. Where there is weak 5′ seed binding and a wobbled or mismatched 3′ end, binding is enhanced by a long, continuous set of complementing nucleotides in between (Bail et al. 2010).

Type of miRNA target site / seed match categories, with representative predicted hsa-miR-29a pairings demonstrating several types of miRNA target sites (TargetScan). In each example the target region is on top and the miRNA is on the bottom.

A. Canonical sites

A.1 7mer-A1 site (seed match + A at position 1). Position 3856–3862 of TET3 3′ UTR
5′ ...AUCCUGUCUUGCUGAGGUGCUAU... Poly(A)
                     |||||||
3′ ...AUUGGCUAAAGUCUACCACGAU...5′ miRNA
                    87654321 (seed positions)

A.2 7mer-m8 site (seed match + match at position 8). Position 53–59 of TET3 3′ UTR
5′ ...CCCGAGCUGUCUCUGUGGUGCUU... Poly(A)
                     |||||||
3′ ....AUUGGCUAAAGUCUACCACGAU...5′ miRNA
                     87654321 (seed positions)

A.3 8mer site (seed match + A1 and m8). Position 8749–8870 of HIV-1 3′ UTR
5′ ...CACUGACCUUUGGAUGGUGCUA... Poly(A)
                    ||||||||
3′ ...UUGGCU_AAAGUCUACCACGAU...5′ miRNA
                    87654321 (seed positions)

B. Marginal sites (schematic, hypothetical nucleotides N)

B.1 6mer site: seed match only.
B.2 Offset 6mer site: seed match shifted by one position.

C. Atypical sites (schematic, hypothetical nucleotides N)

C.1 3′-Supplementary site: seed match plus supplementary pairing (~3–4 pairs) to the 3′ portion of the miRNA, separated by a loop of 1–5 nt.
C.2 3′-Compensatory site: an imperfect match to the seed (seed mismatch), together with compensatory pairing (~4–5 pairs) to the 3′ portion of the miRNA, separated by a loop of 1–5 nt, that can compensate for the single-nucleotide bulge or mismatch.

C.2 Categories of seed region mismatches

C.2.1 GUM: has a G:U wobble. Target 5′ ...AGGGUUCA... Poly(A); miRNA 3′ ...UUCCAACU... 5′
C.2.2 GUT: U present at any part of the mRNA target. Target 5′ ...AUUUUUCA... Poly(A); miRNA 3′ ...UAGAAAGU... 5′
C.2.3 BM: bulge or mismatch at the seed site.
C.2.4 BT: mismatch at the target site.
C.2.5 LP: looping at the seed site.
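The canonical and 6mer categories of Figure 2 can be assigned mechanically once the seed match is located. The sketch below follows the definitions given in the figure (positions 2–7 as the core match, a match at position 8, and an 'A' opposite position 1); it is not TargetScan's implementation, it ignores offset 6mer and atypical sites, and the function names are hypothetical. The miRNA and the three target stretches are those shown in panels A.1–A.3.

```python
COMPLEMENT = str.maketrans("AUGC", "UACG")


def revcomp(rna):
    """Reverse complement of an RNA string, written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def classify_sites(mirna, utr):
    """Return [(index, site_type)] for every match to miRNA positions 2-7 in `utr`.

    `index` is the 0-based start of the 6-nt core match in the target.
    Site types: '8mer', '7mer-m8', '7mer-A1', '6mer'.
    """
    core = revcomp(mirna[1:7])   # target motif pairing with positions 2-7
    m8_base = revcomp(mirna[7])  # target base pairing with position 8
    sites = []
    start = utr.find(core)
    while start != -1:
        end = start + len(core)
        has_m8 = start > 0 and utr[start - 1] == m8_base  # 5' of the core match
        has_a1 = end < len(utr) and utr[end] == "A"       # 3' of the core match
        if has_m8 and has_a1:
            kind = "8mer"
        elif has_m8:
            kind = "7mer-m8"
        elif has_a1:
            kind = "7mer-A1"
        else:
            kind = "6mer"
        sites.append((start, kind))
        start = utr.find(core, start + 1)
    return sites


MIR_29A = "UAGCACCAUCUGAAAUCGGUUA"  # miRNA strand of Figure 2, written 5'->3'

if __name__ == "__main__":
    for target in ("AUCCUGUCUUGCUGAGGUGCUAU",   # panel A.1
                   "CCCGAGCUGUCUCUGUGGUGCUU",   # panel A.2
                   "CACUGACCUUUGGAUGGUGCUA"):   # panel A.3
        print(target, classify_sites(MIR_29A, target))
```

Run on the three stretches from the figure, the classifier reproduces the labels of panels A.1, A.2 and A.3 respectively.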

base pairing on the 3′ part of the miRNA also increases efficacy in the site recognition of the miRNA target with seed pairing. Thus, binding sites are grouped into 5′ dominant or stringent (perfect seed base pairing and almost perfect 3′ complementarity), 5′ dominant canonical (perfect base pairing at the 5′ seed region and extensive complementarity at the 3′ region), 3′ supplementary (covering an extensive 3′ complementarity pattern), and 3′ compensatory (imperfect 5′ seed matching but strong complementarity in the 3′ region); these groupings are the fundamental criteria of most predictive programs (Brennecke et al. 2005; Grimson et al. 2007). The stringent, or perfect, sites are further divided into several categories, namely 8mer, 7mer-m8, 7mer-A1, and 6mer (Saetrom et al. 2007; Bartel 2009). This classification rests on the combination of the crucial nucleotides at the first and the last positions of the site, where an adenine (A) increases efficiency. Figure 2 illustrates these detailed miRNA seed patterns using hsa-miR-29a, a human miRNA well documented to target HIV-1, as a model (Hariharan et al. 2005; Ahluwalia et al. 2008; Nathans et al. 2009). Microarray and conservation enrichment information have made it possible to rank the various matches and seeds in order of efficacy for target prediction. Listed from most to least effective, the matches and seed types under these criteria are: stringent seed > stringent seed in offset >> moderate-stringent seed > moderate-stringent seed in offset; 8mer > 7mer-m8 > 7mer-A1 > 6mer among the stringent-seed types; and bulge > G:U wobble > loop among the moderate-stringent seed types (Grimson et al. 2007). Multiple site types confer higher efficiency than a single site type (Saetrom et al. 2007).

2 The secondary structure of the miRNA–mRNA hybrid influences its thermodynamic stability, which is a well-studied aspect of miRNA target prospecting.
This is significant because thermodynamically stable interactions between a miRNA and its corresponding target allow ample time for RISC to carry out its enzymatic activity. A favorable miRNA/mRNA target interaction forms a more stable duplex, expressed as a low value of the minimum free energy (MFE) or Gibbs free energy (ΔG) of hybridization, and depends on the content of A–U nucleotide pairs. A–U-rich sequence within 30 bp upstream or downstream of the seed match has been identified as a feature of effective target sites. These A–U-rich regions, as later shown, are typically


located in the 3′ UTRs (Jacobsen et al. 2010b). In miRNA target prediction, the specificity of the complementarity should always be weighed against sensitivity, which is defined by the thresholds set on the binding properties. Defining the thermodynamic stability and accessibility of mRNA secondary structures requires thorough scrutiny of the mRNA folding configurations. Existing programs such as RNA package (Washietl 2010), RNAfold, and Mfold (Krol et al. 2004) can be used to calculate secondary structures and interaction energies between miRNAs and their mRNA targets. Although this information may provide preliminary clues, a low free energy of miRNA–mRNA duplex formation does not always indicate an accurate and favorable molecular interaction and binding (Maiti et al. 2010), because of thermal dynamics. Therefore, energy thresholds must be supplemented with additional data analyses to increase the accuracy of target gene predictions.

3 Initial observations showed that miRNAs bound only to the 3′ UTR regions of target genes, making miRNA activity highly localized within the genome. Later analyses of the miRNA–RISC complex showed possible interactions outside the 3′ UTR regions. One example is the work of Das (2009), which showed through phylogenetic reconstruction studies that the human hsa-miR-650 genes overlap with the exons of the immunoglobulin lambda variable region (IGVL). The hsa-mir-650, a duplication or deletion product, uses the same gene promoter as IGVL and is a clear example of genomic association between protein-coding genes and a miRNA. This implies that target prediction programs should assess full-length genomic sequences for comprehensive miRNA target prediction. Saito & Saetrom (2010) thoroughly explained why miRNA targeting occurs mainly at the 3′ UTR. The 3′ UTR region provides an environment with limited competition for miRNA–RISC binding.
This is in contrast to other areas of the genome, which are binding sites of other protein complexes such as translational factors and ribosomes. The 3′ UTRs are also able to elicit different regulatory responses through variations in length. The UTRs of most specialized genes are relatively long, whereas ubiquitous genes have shorter UTRs to avoid miRNA regulation. Different post-processing activities that vary the 3′ UTR structure, like alternative splicing and polyadenylation site modifications,


can also influence miRNA targeting effects. Recently, a report proposed an emerging group of miRNA targets termed 'miBridge', wherein a single miRNA can interact synergistically at different positions: the seed pairing site in the 3′ UTR and the 3′ pairing site in the 5′ UTR (Ajay et al. 2010). This development poses the challenge of designing programs that detect these phenomena.

4 The two principal mechanisms of miRNA-mediated regulation of gene expression, namely translational inhibition and repression of mRNA expression, would likely affect the target genes in two ways (Tomari 2009). miRNAs were initially, and are most often, reported to down-regulate gene expression (Pillai et al. 2007). However, recent evidence showed that miRNAs can also up-regulate target mRNAs (Baek et al. 2008). Moreover, several findings reported repression of expression through mRNA events, specifically mRNA deadenylation, mRNA degradation, and mRNA sequestration (Cai et al. 2009; Carthew & Sontheimer 2009). Translational repression also occurs as an independent event in RNA regulatory pathways (Newman & Hammond 2010). As miRNAs are usually short sequences, a single miRNA can potentially regulate a number of distinct genes. With miRNA expression levels correlated to the mRNA expression profile, miRNAs function as a molecular switch in a tissue-specific manner, as suggested (Brameier & Wiuf 2007).
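The stringent seed-match categories and the flanking A–U context discussed in this section can be illustrated with a short Python sketch. It is not taken from any of the surveyed programs. The function names (`classify_seed_sites`, `flank_au_fraction`) are hypothetical, and the site definitions follow the commonly used convention in which a 6mer pairs with miRNA nucleotides 2–7, a 7mer-m8 adds a match at nucleotide 8, a 7mer-A1 adds an adenine opposite nucleotide 1, and an 8mer has both. The hsa-miR-29a sequence is assumed here and should be checked against miRBase before any real use.

```python
# Minimal sketch: classify canonical seed matches in a 3' UTR (RNA, 5'->3').
# All names are illustrative; MIR29A is an assumed sequence to be verified.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
MIR29A = "UAGCACCAUCUGAAAUCGGUUA"  # assumed mature hsa-miR-29a, 5'->3'


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (Watson-Crick only)."""
    return "".join(COMPLEMENT[nt] for nt in reversed(rna))


def classify_seed_sites(mirna, utr):
    """Return (position, site_type) for every 6mer-or-better seed match."""
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    core = reverse_complement(mirna[1:7])  # pairs with miRNA nucleotides 2-7
    m8_partner = COMPLEMENT[mirna[7]]      # pairs with miRNA nucleotide 8
    sites = []
    start = utr.find(core)
    while start != -1:
        end = start + 6
        has_m8 = start > 0 and utr[start - 1] == m8_partner
        has_a1 = end < len(utr) and utr[end] == "A"
        if has_m8 and has_a1:
            kind = "8mer"
        elif has_m8:
            kind = "7mer-m8"
        elif has_a1:
            kind = "7mer-A1"
        else:
            kind = "6mer"
        sites.append((start, kind))
        start = utr.find(core, start + 1)  # allow overlapping matches
    return sites


def flank_au_fraction(utr, start, site_len=6, flank=30):
    """Fraction of A/U in the `flank` nt on each side of a seed match."""
    utr = utr.upper().replace("T", "U")
    context = utr[max(0, start - flank):start] + utr[start + site_len:start + site_len + flank]
    if not context:
        return 0.0
    return sum(nt in "AU" for nt in context) / len(context)


if __name__ == "__main__":
    toy_utr = "AAUUAUGGUGCUAUUAAU"  # toy sequence, not a real transcript
    for pos, kind in classify_seed_sites(MIR29A, toy_utr):
        print(pos, kind, round(flank_au_fraction(toy_utr, pos), 2))
```

A scan of this kind only reports sequence-level candidates. As the text notes, duplex free energy and site accessibility would still have to be evaluated with a folding program before a site is ranked.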

Integrated tools and other support sites

The exponential growth of miRNA information, including predicted and experimental evidence, is driving current miRNA tools to be integrated into one body of information. As a result, these tools have dual or multiple functions, serving as databases for computational results and experimental validation and as hubs for executable programs for miRNA analyses. Table 3 lists examples of the surveyed resources in this category. The tasks integrated into the software may include describing miRNA interactions that involve transcription factors, disease states and tissues, and other applications in which miRNAs participate. Examples such as Bi-targeting (Veksler-Lublinsky et al. 2010) and RepTar (Elefant et al. 2011) use novel algorithms to simulate multi-dimensional interactions among miRNAs and targets. These programs are used to describe host–virus interactions of the miRNA pathways. In general, the surveyed tools attempt to describe miRNAs as an integral component of systems biology. As mentioned earlier, significant advances in the detection of miRNAs and their targets have generated additional data that need further processing by computational analyses. Particular examples include genome-wide analyses of miRNA expression using microarrays and next-generation sequencing approaches (details are reviewed elsewhere). This scenario in particular opened opportunities to combine primary and secondary miRNA targets as evidenced by co-expression patterns. Thus, a combinatorial mapping of miRNA regulatory networks is possible. Data enrichment correlates primary and secondary miRNA data with GO terms or biological pathways, allowing a multidimensional association of miRNAs with biological systems. Although other sites may not specifically identify miRNAs as a feature, they are more or less equally indispensable in carrying out miRNA analyses. These sites include repositories for genome sequences, experimental evidence, literature, and other relevant information. Table S1 (Supporting Information) lists these sites. Recently, companies engaged in miRNA discovery have begun to offer computational support services as part of their product promotions.
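The data enrichment step mentioned above, in which a set of miRNA targets is correlated with a GO term or pathway, is commonly assessed with a one-sided hypergeometric test. The sketch below shows that calculation using only the Python standard library. It is an illustration under stated assumptions, not the procedure of any surveyed resource: the function names are hypothetical and the gene identifiers in the usage example are invented placeholders.

```python
# Minimal sketch of over-representation analysis for predicted miRNA targets.
# Function names and gene identifiers are hypothetical.
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) when `sample` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(population, sample)


def term_enrichment(targets, term_genes, background):
    """Return (overlap, p_value) for one term, restricted to the background."""
    background = set(background)
    targets = set(targets) & background
    term_genes = set(term_genes) & background
    overlap = len(targets & term_genes)
    p_value = hypergeom_enrichment_p(
        len(background), len(term_genes), len(targets), overlap
    )
    return overlap, p_value


if __name__ == "__main__":
    background = {f"GENE{i}" for i in range(1, 101)}       # placeholder universe
    term_genes = {f"GENE{i}" for i in range(1, 11)}        # placeholder GO term
    targets = {"GENE1", "GENE2", "GENE3", "GENE50", "GENE60"}  # placeholder targets
    print(term_enrichment(targets, term_genes, background))
```

When many terms are tested for the same target list, the resulting p-values would normally need a multiple-testing correction, which is omitted here for brevity.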

Beyond prediction and targeting: conclusive statements and prospects

After the extensive discussion of miRNA prediction and targeting tools, it becomes clear that miRNA computational tools are heading toward integration. The integration process involves consolidating the tools into a database and into a hub for executable programs. Ultimately, such integrative approaches aim to make miRNA information comprehensive and reliable. The next question is: 'Are the current platforms comprehensive and reliable enough?' The answer to this question is rather complicated. There are still flaws in current platforms that must be addressed immediately. First, the current predictive tools generate high false-positive rates. Reducing false-positive rates remains a challenge to bioinformaticians. Integrating computational tools to produce comparative analyses can minimize false-positive data. Second, the tools yield incoherent results when subjected to comparative analyses. Integration of such bioinformatic tools should therefore provide safeguard mechanisms, which could be based on accumulated empirical evidence. Third, online programs would truly benefit the end users if the software developers would cater to


Table 3 Resources for integrated and specialized miRNA tools with emphasis on biomedical research and applications Integrated tools and features


Resource name

Prediction software

Bi-Targeting

Targeting software

Third party software

Data features

References

NA

Bi-targeting support vector machines

GO Disease Atlas

Veksler-Lublinsky et al. (2010)

miR2Disease

NA

NA

GO

Exprtarget

NA

LCL HapMaps

FANTOM4 EdgeExpressDB

NA

Tarbase miRanda PicTar NA

Software for bi-targeting host and viral miRNA genes modeled after EBV and human host genes Database for miRNA deregulation in human diseases Integrative human miRNA prediction

HMDD

NA

NA

TransmiR

NA

NA

S-MED

NA

NA

NA

PuTmiR: microRNAs

NA

NA

NA

miRGator

NA

miRanda TargetScanS PicTar

miRNApath

NA

NA

PhenomiR

NA

NA

miRBase Pathways GEO UCSC Genome DB GO KEGG miRBase Entrez Gene miRBase OMIM Pathways

MAGIA

NA

PITA miRanda TargetScan


CAGE DB EDGE Express DB

MISIM (miRNA network annotation tool) TAM (miRNA annotation tools) PubMed HMDD

miRBase DAVID Ensembl miR2Disease EbiMed

Jiang et al. (2009)

Gamazon et al. (2010)

Extensive database for human regulatory network in myeloid leukemia cell line (THP-1); there is a current move to develop FANTOM5 Human miRNA disease database associations

Severin et al. (2009)

Human transcription factor-miRNA regulation database Sarcoma miRNA expression database Neighborhood extraction of transcription factors of human miRNA functional annotation tools

Wang et al. (2010b)

Associates miRNAs, target genes and metabolic pathways A knowledgebase for miRNA disease phenotypes and biological processes Integrated miRNA target prediction and gene analyses by multiple programs

Lu et al. (2008)

Sarver et al. (2010) Bandyopadhyay & Bhattacharyya (2010) Nam et al. (2008)

Chiromatzo et al. (2007)

Ruepp et al. (2010)

Sales et al. (2010)


Table 3 (Continued) Integrated tools and features Resource name Prediction software Targeting software

Third party software Data features

MapMI

MapMI

NA

miRBase Ensembl RepBase Bowtie RNAFold PhyML PhyloWidget MUSCLE RNALogo miRBAse

mimiRNA

NA

TargetScan miRanda

miRMaid

NA

NA

miRecords

NA

miRonTop

NA

mirRor Suite

NA

miRDeep

NA

Diana-microT MicroInspector miRAnda mirTarget miTarget NBmiRTAR PicTar PITA RNA22 RNAHybrid TargetScan Target Scan microcosm miRanda PITA Seed-search miRror psi-miRror (PITA, PicTar, TargetScan, TargetRank, microcosm, miRAnda, Diana-microT, EIMMO-MirZ,miRDB, RNA22, MAMI, Map2) NA

miRSel

NA

NA

NA

miRTRAP,

NA

miRSel TarBase miRecords miR2Disease

NA

miRBase PhenomiR mimiRNA Landgraf cloning data NA

References

Comprehensive Web-server program to locate and miRNA precursors; also involved in classification and phylogenetic mapping

Guerra-Assuncao & Enright (2010)

Ritchie et al. Profiler and classification (2010) resource for functional correlation of miRNAs expression and targets Jacobsen et al. An advanced programming (2010a) interface to facilitate data management and processing of the miRBase data by user plugins Xiao et al. (2009) Integrated resource for predicted and validated animal miRNA targets by eleven miRNA targeting programs

Le Brigand et al. (2010)

GO

miRNA target mining from cross-species genome expression studies

miRrorNet Annotation tools (Pandora, DAVID, Reactome, STRING)

Friedman et al. A smart suite for miRNA prediction and target analyse (2010) across programs and species

NA

miRNA analyses from deep sequencing data Extractor of functional associations of miRNAs biomedical literature Disease-related miRNA functional analyses from high throughput sequencing data


Friedlander et al. (2008) Naeem et al. (2010) Hendrix et al. (2010)


Table 3 (Continued) Integrated tools and features

Resource name

Prediction software

MISIM MirWalk

NA NA

Pathways PubMed

MirZ

Multi-species (human, mouse, rat) predicted and validated targets miR-abela

repTAR

RepTar HMM

NA

TAM

NA

NA

VirMiR

RNAhybrid

NA

Targeting software

Third party software

Data features

References

NA MirWalk RNA22 miRanda miRDB TargetScan RNAhybrid PITA PicTar DianaMicroT http://www.ma.uni-heidelberg.de/apps/zmf/mirwalk/documentation.html

DAG Chromosome Genes OMIM

Disease-based human miRNA functional categorization and network association

Wang et al. (2010a)

EIMMo

smiRNA DB Oligo Map NA

Integrator site for miRNA atlas and miRNA targets HMM-based algorithm for prediction of cellular targets in host and viral miRNAs; inverse targets miRNA categorization based on enrichment and depletion analyses Predictive multi-viral miRNA targets across species

Hausser et al. (2009) Elefant et al. (2011)

miRBase HMDD PubMed NA

Lu et al. (2010)

Li et al. (2008)

NA, not applicable. Except for MiRZ, repTAR, VirMiR, and MapMI, the resources are mostly offshoots of miRNA targeting tools and have no miRNA predictive function/software.

non-programmers. This follows from the observation that most pieces of information are discontinuous and often require complicated data post-processing. In the authors' experience, data unification becomes a challenge once the predictive and targeted mRNA results have been generated. Despite this uncertainty and disarray, scientists remain dedicated to developing more efficient programs that systematically predict miRNAs and their respective targets. As miRNA studies involve an immense amount of information, this research endeavor would be difficult to direct without the aid of computational tools. In fact, almost all miRNA experimental validation studies started off with computational analyses.
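The suggestion that comparative analysis across tools can reduce false positives is often implemented as a simple voting scheme: a miRNA–target pair is retained only when a minimum number of independent programs predict it. The following Python sketch illustrates the idea. The function name `consensus_targets`, the `min_tools` threshold, and all pairs in the usage example are assumptions made for illustration, and the toy lists are not real output from the named programs.

```python
# Minimal sketch of consensus filtering across target prediction tools.
# The predictions below are invented placeholders, not real program output.
from collections import Counter


def consensus_targets(predictions, min_tools=2):
    """Keep (miRNA, gene) pairs predicted by at least `min_tools` tools.

    `predictions` maps a tool name to an iterable of (miRNA, gene) pairs.
    Returns a dict mapping each retained pair to its number of supporting tools.
    """
    votes = Counter()
    for pairs in predictions.values():
        votes.update(set(pairs))  # a tool votes at most once per pair
    return {pair: n for pair, n in votes.items() if n >= min_tools}


if __name__ == "__main__":
    toy_predictions = {
        "TargetScan": [("miR-x", "GENE1"), ("miR-x", "GENE2")],
        "miRanda": [("miR-x", "GENE1"), ("miR-x", "GENE3")],
        "PITA": [("miR-x", "GENE1"), ("miR-x", "GENE2")],
    }
    for pair, support in sorted(consensus_targets(toy_predictions).items()):
        print(pair, support)
```

Raising `min_tools` trades sensitivity for specificity, so the threshold is best calibrated against experimentally validated interactions where these are available.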


The next-generation miRNA tools should be designed to perform tasks beyond prediction and targeting. The trend is to migrate towards the integration of evidence-based miRNA information. The concern here is how to effectively manage, curate and reconcile the computationally generated and experimentally obtained data. In particular, the following five concerns must be addressed. First, future algorithms should consider the spatiotemporal relationships of miRNAs with chromosomally linked evidence, tissue localization, and miRNA signatures of diseases. As differential expression data continue to accumulate for various diseases such as cancers and leukemia, it is now possible to group miRNAs into clusters or arrays to further clarify


human disease with regard to gene expression profiles such as mRNA and miRNA. Second, the chromosomal linkage data lead to calculations of the optimal distances between multiple miRNA sites or alternative sites within the CDS and 5′ UTRs. This is necessary to clarify the spatiotemporal relationships among mRNAs and coregulated miRNAs. Third, algorithms need to be developed to predict and categorize miRNAs as either down-regulators or up-regulators. The resulting data should help define a systems biology perspective. Fourth, while most current methods identify a miRNA and its direct target, or cis-regulatory mechanisms, they must extend their analyses to the relationships between a miRNA and its secondary targets, or trans-regulatory mechanisms (Dai & Zhou 2010). Fifth, miRNAs, like any other component of the genome, are prone to polymorphism. Several reports have documented miRNA sequence variations that influence targeting ability. As a corollary, the mRNA targets also exhibit polymorphism. These events can modulate each other in delivering effective regulation. The items above are a few of the many data sets potentially applicable to accelerated miRNA studies. Thus, future programs must extend their scope to establish and strengthen the linkages among the sets of information mentioned in this review article.
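The fifth concern, polymorphism in mRNA targets, can be made concrete with a small Python sketch that asks whether a single-nucleotide variant in a 3′ UTR creates or destroys a 6mer seed match. This is a simplified illustration under stated assumptions: the function names are hypothetical, only perfect Watson–Crick matches to miRNA nucleotides 2–7 are considered, and the miRNA sequence used in the example is assumed and should be verified against miRBase.

```python
# Minimal sketch: effect of a single-nucleotide variant on 6mer seed matches.
# Names are illustrative; the miRNA sequence is an assumption to be verified.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match_core(mirna):
    """Target-side 6mer (5'->3') that pairs with miRNA nucleotides 2-7."""
    seed = mirna.upper().replace("T", "U")[1:7]
    return "".join(COMPLEMENT[nt] for nt in reversed(seed))


def find_all(text, pattern):
    """Return start indices of all (possibly overlapping) occurrences."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def snp_effect_on_sites(mirna, utr, pos, alt):
    """Compare seed matches before and after substituting `alt` at index `pos`."""
    utr = utr.upper().replace("T", "U")
    alt = alt.upper().replace("T", "U")
    variant = utr[:pos] + alt + utr[pos + 1:]
    core = seed_match_core(mirna)
    before = set(find_all(utr, core))
    after = set(find_all(variant, core))
    return {"lost": sorted(before - after), "gained": sorted(after - before)}


if __name__ == "__main__":
    mir29a = "UAGCACCAUCUGAAAUCGGUUA"  # assumed hsa-miR-29a sequence
    toy_utr = "CCCUGGUGCUACCC"          # toy sequence, not a real transcript
    print(snp_effect_on_sites(mir29a, toy_utr, pos=6, alt="C"))
```

The same comparison could be run in the opposite direction, with the variant placed in the miRNA seed rather than in the target, to examine polymorphic miRNAs.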

Acknowledgements

This work was partially supported by the Ministries of Education, Culture, Sports, Science and Technology, and Health, Labor and Welfare of Japan. We also sincerely express our gratitude for the invaluable support of Drs. Kaori Asamitsu, Satoshi Kanazawa and Hiroaki Uranishi of the Cell and Molecular Biology Laboratory, Nagoya City University Graduate School of Medical Sciences.

References

Agarwal, S., Vaz, C., Bhattacharya, A. & Srinivasan, A. (2010) Prediction of novel precursor miRNAs using a context-sensitive hidden Markov model (CSHMM). BMC Bioinformatics 11(Suppl. 1), S29. Ahluwalia, J.K., Khan, S.Z., Soni, K., Rawat, P., Gupta, A., Hariharan, M., Scaria, V., Lalwani, M., Pillai, B., Mitra, D. & Brahmachari, S.K. (2008) Human cellular microRNA hsa-miR-29a interferes with viral nef protein expression and HIV-1 replication. Retrovirology 5, 117. Ajay, S.S., Athey, B.D. & Lee, I. (2010) Unified translation repression mechanism for microRNAs and upstream AUGs. BMC Genomics 11, 155.

Alexiou, P., Maragkakis, M., Papadopoulos, G.L., Simmosis, V.A., Zhang, L. & Hatzigeorgiou, A.G. (2010) The DIANA-mirExTra web server: from gene expression data to microRNA function. PLoS ONE 5, e9171. Alvarez-Garcia, I. & Miska, E.A. (2005) MicroRNA functions in animal development and human disease. Development 132, 4653–4662. Baek, D., Villen, J., Shin, C., Camargo, F.D., Gygi, S.P. & Bartel, D.P. (2008) The impact of microRNAs on protein output. Nature 455, 64–71. Bagasra, O. & Prilliman, K.R. (2004) RNA interference: the molecular immune system. J. Mol. Histol. 35, 545–553. Bail, S., Swerdel, M., Liu, H., Jiao, X., Goff, L.A., Hart, R.P. & Kiledjian, M. (2010) Differential regulation of microRNA stability. RNA 16, 1032–1039. Bandyopadhyay, S. & Bhattacharyya, M. (2010) PuTmiR: a database for extracting neighboring transcription factors of human microRNAs. BMC Bioinformatics 11, 190. Bandyopadhyay, S. & Mitra, R. (2009) TargetMiner: microRNA target prediction with systematic identification of tissue-specific negative examples. Bioinformatics 25, 2625– 2631. Bartel, D.P. (2009) MicroRNAs: target recognition and regulatory functions. Cell 136, 215–233. Ben-Hur, A. & Weston, J. (2010) A user’s guide to support vector machines. Methods Mol. Biol. 609, 223–239. Betel, D., Wilson, M., Gabow, A., Marks, D.S. & Sander, C. (2008) The microRNA.org resource: targets and expression. Nucleic Acids Res. 36, D149–D153. Bohnsack, M.T., Czaplinski, K. & Gorlich, D. (2004) Exportin 5 is a RanGTP-dependent dsRNA-binding protein that mediates nuclear export of pre-miRNAs. RNA 10, 185–191. Brameier, M. & Wiuf, C. (2007) Ab initio identification of human microRNAs based on structure motifs. BMC Bioinformatics 8, 478. Brennecke, J., Stark, A., Russell, R.B. & Cohen, S.M. (2005) Principles of microRNA-target recognition. PLoS Biol. 3, e85. Brodersen, P. & Voinnet, O. (2009) Revisiting the principles of microRNA target recognition and mode of action. Nat. Rev. Mol. Cell Biol. 
10, 141–148. Burgler, C. & Macdonald, P.M. (2005) Prediction and verification of microRNA targets by MovingTargets, a highly adaptable prediction method. BMC Genomics 6, 88. Cai, Y., Yu, X., Hu, S. & Yu, J. (2009) A brief review on the mechanisms of miRNA regulation. Genomics Proteomics Bioinformatics 7, 147–154. Carmell, M.A. & Hannon, G.J. (2004) RNase III enzymes and the initiation of gene silencing. Nat. Struct. Mol. Biol. 11, 214–218. Carthew, R.W. & Sontheimer, E.J. (2009) Origins and Mechanisms of miRNAs and siRNAs. Cell 136, 642–655. Chandra, V., Girijadevi, R., Nair, A.S., Pillai, S.S. & Pillai, R.M. (2010) MTar: a computational microRNA target prediction architecture for human transcriptome. BMC Bioinformatics 11(Suppl. 1), S2.


Chiromatzo, A.O., Oliveira, T.Y., Pereira, G., et al. (2007) miRNApath: a database of miRNAs, target genes and metabolic pathways. Genet. Mol. Res. 6, 859–865. Cho, W.C. (2010) MicroRNAs in cancer - from research to therapy. Biochim. Biophys. Acta 1805, 209–217. da Costa Martins, P.A., Leptidis, S., Salic, K. & De Windt, L.J. (2010) microRNA regulation in cardiovascular disease. Curr. Drug Targets 11, 900–906. Cullen, B.R. (2004) Derivation and function of small interfering RNAs and microRNAs. Virus Res. 102, 3–9. Dai, Y. & Zhou, X. (2010) Computational methods for the identification of microRNA targets. Open Access Bioinformatics 2, 29–39. Das, S. (2009) Evolutionary origin and genomic organization of micro-RNA genes in immunoglobulin lambda variable region gene family. Mol. Biol. Evol. 26, 1179–1189. Eggleston, A.K. (2009) RNA silencing. Nature 457, 395. Elefant, N., Berger, A., Shein, H., Hofree, M., Margalit, H. & Altuvia, Y. (2011) RepTar: a database of predicted cellular targets of host and viral miRNAs. Nucleic Acids Res. 39, D188–D194. Eulalio, A., Huntzinger, E., Nishihara, T., Rehwinkel, J., Fauser, M. & Izaurralde, E. (2009) Deadenylation is a widespread effect of miRNA regulation. RNA 15, 21–32. Flores-Jasso, C.F., Arenas-Huertero, C., Reyes, J.L., Contreras-Cubas, C., Covarrubias, A. & Vaca, L. (2009) First step in pre-miRNAs processing by human Dicer. Acta Pharmacol. Sin. 30, 1177–1185. Friedlander, M.R., Chen, W., Adamidi, C., Maaskola, J., Einspanier, R., Knespel, S. & Rajewsky, N. (2008) Discovering microRNAs from deep sequencing data using miRDeep. Nat. Biotechnol. 26, 407–415. Friedman, Y., Naamati, G. & Linial, M. (2010) MiRror: a combinatorial analysis web tool for ensembles of microRNAs and their targets. Bioinformatics 26, 1920–1921. Gaidatzis, D., van Nimwegen, E., Hausser, J. & Zavolan, M. (2007) Inference of miRNA targets using evolutionary conservation and pathway analysis. BMC Bioinformatics 8, 69.
Gamazon, E.R., Im, H.-K., Duan, S., Lussier, Y.A., Cox, N.J., Dolan, M.E. & Zhang, W. (2010) ExprTarget: an Integrative Approach to Predicting Human MicroRNA Targets. PLoS ONE 5, e13534. Gennarino, V.A., Sardiello, M., Avellino, R., Meola, N., Maselli, V., Anand, S., Cutillo, L., Ballabio, A. & Banfi, S. (2009) MicroRNA target prediction by expression analysis of host genes. Genome Res. 19, 481–490. Gkirtzou, K., Tsamardinos, I., Tsakalides, P. & Poirazi, P. (2010) MatureBayes: a Probabilistic Algorithm for Identifying the Mature miRNA within Novel Precursors. PLoS ONE 5, e11843. Griffiths-Jones, S., Saini, H.K., van Dongen, S. & Enright, A.J. (2008) miRBase: tools for microRNA genomics. Nucleic Acids Res. 36, D154–D158. Grimson, A., Farh, K.K., Johnston, W.K., Garrett-Engele, P., Lim, L.P. & Bartel, D.P. (2007) MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Mol. Cell 27, 91–105.
Guerra-Assuncao, J.A. & Enright, A.J. (2010) MapMi: automated mapping of microRNA loci. BMC Bioinformatics 11, 133. Ha, M., Pang, M., Agarwal, V. & Chen, Z.J. (2008) Interspecies regulation of microRNAs and their targets. Biochim. Biophys. Acta 1779, 735–742. Hammell, M., Long, D., Zhang, L., Lee, A., Carmack, C.S., Han, M., Ding, Y. & Ambros, V. (2008) mirWIP: microRNA target prediction based on microRNA-containing ribonucleoprotein-enriched transcripts. Nat. Methods 5, 813–819. Hammond, S.M. (2005) Dicing and slicing: the core machinery of the RNA interference pathway. FEBS Lett. 579, 5822–5829. Hariharan, M., Scaria, V., Pillai, B. & Brahmachari, S.K. (2005) Targets for human encoded microRNAs in HIV genes. Biochem. Biophys. Res. Commun. 337, 1214–1218. Harvey, S.J., Jarad, G., Cunningham, J., Goldberg, S., Schermer, B., Harfe, B.D., McManus, M.T., Benzing, T. & Miner, J.H. (2008) Podocyte-specific deletion of dicer alters cytoskeletal dynamics and causes glomerular disease. J. Am. Soc. Nephrol. 19, 2150–2158. Hausser, J., Berninger, P., Rodak, C., Jantscher, Y., Wirth, S. & Zavolan, M. (2009) MirZ: an integrated microRNA expression atlas and target prediction resource. Nucleic Acids Res. 37, W266–W272. Heikham, R. & Shankar, R. (2010) Flanking region sequence information to refine microRNA target predictions. J. Biosci. 35, 105–118. Helvik, S.A., Snove, O. Jr & Saetrom, P. (2007) Reliable prediction of Drosha processing sites improves microRNA gene prediction. Bioinformatics 23, 142–149. Hendrix, D., Levine, M. & Shi, W. (2010) miRTRAP, a computational method for the systematic identification of miRNAs from high throughput sequencing data. Genome Biol. 11, R39. Huang, J.C., Frey, B.J. & Morris, Q.D. (2008) Comparing sequence and expression for predicting microRNA targets using GenMiR3. Pac. Symp. Biocomput. http://psb.stanford.edu/psb-online/proceedings/psb08/abstracts/2008_p52.html.
Huang, J.C., Morris, Q.D. & Frey, B.J. (2007) Bayesian inference of MicroRNA targets from sequence and expression data. J. Comput. Biol. 14, 550–563. Jackson, R.J. & Standart, N. (2007) How do microRNAs regulate gene expression? Sci. STKE 2007, re1. Jacobsen, A., Krogh, A., Kauppinen, S. & Lindow, M. (2010a) miRMaid: a unified programming interface for microRNA data resources. BMC Bioinformatics 11, 29. Jacobsen, A., Wen, J., Marks, D. & Krogh, A. (2010b) Signatures of RNA binding proteins globally coupled to effective microRNA target sites. Genome Res. 20, 1010–1019. Jiang, P., Wu, H., Wang, W., Ma, W., Sun, X. & Lu, Z. (2007) MiPred: classification of real and pseudo microRNA precursors using random forest prediction model with combined features. Nucleic Acids Res. 35, W339–W344.


Jiang, Q., Wang, Y., Hao, Y., Juan, L., Teng, M., Zhang, X., Li, M., Wang, G. & Liu, Y. (2009) miR2Disease: a manually curated database for microRNA deregulation in human disease. Nucleic Acids Res. 37, D98–D104. Junn, E. & Mouradian, M.M. (2010) MicroRNAs in neurodegenerative disorders. Cell Cycle 9, 1717–1721. Kadri, S., Hinman, V. & Benos, P.V. (2009) HHMMiR: efficient de novo prediction of microRNAs using hierarchical hidden Markov models. BMC Bioinformatics 10(Suppl. 1), S35. Kawamata, T. & Tomari, Y. (2010) Making RISC. Trends Biochem. Sci. 35, 368–376. Kertesz, M., Iovino, N., Unnerstall, U., Gaul, U. & Segal, E. (2007) The role of site accessibility in microRNA target recognition. Nat. Genet. 39, 1278–1284. Kim, S.K., Nam, J.W., Rhee, J.K., Lee, W.J. & Zhang, B.T. (2006) miTarget: microRNA target gene prediction using a support vector machine. BMC Bioinformatics 7, 411. Kim, V.N. (2004) MicroRNA precursors in motion: exportin-5 mediates their nuclear export. Trends Cell Biol. 14, 156–159. Kozomara, A. & Griffiths-Jones, S. (2011) miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 39, D152–D157. Krol, J., Sobczak, K., Wilczynska, U., Drath, M., Jasinska, A., Kaczynska, D. & Krzyzosiak, W.J. (2004) Structural features of microRNA (miRNA) precursors and their relevance to miRNA biogenesis and small interfering RNA/short hairpin RNA design. J. Biol. Chem. 279, 42230–42239. Kurihara, Y. & Watanabe, Y. (2010) Processing of miRNA precursors. Methods Mol. Biol. 592, 231–241. Le Brigand, K., Robbe-Sermesant, K., Mari, B. & Barbry, P. (2010) MiRonTop: mining microRNAs targets across large scale gene expression studies. Bioinformatics 26, 3131–3132. Lee, R.C. & Ambros, V. (2001) An extensive class of small RNAs in Caenorhabditis elegans. Science 294, 862–864. Leung, A.K. & Sharp, P.A. (2006) Function and localization of microRNAs in mammalian cells. Cold Spring Harb. Symp. Quant. Biol. 71, 29–38.
Lewis, B.P., Shih, I.H., Jones-Rhoades, M.W., Bartel, D.P. & Burge, C.B. (2003) Prediction of mammalian microRNA targets. Cell 115, 787–798. Li, S.C., Shiau, C.K. & Lin, W.C. (2008) Vir-Mir db: prediction of viral microRNA candidate hairpins. Nucleic Acids Res. 36, D184–D189. Li, S.C., Chan, W.C., Hu, L.Y., Lai, C.H., Hsu, C.N. & Lin, W.C. (2010) Identification of homologous microRNAs in 56 animal genomes. Genomics 96, 1–9. Liang, H. & Li, W.H. (2009) Lowly expressed human microRNA genes evolve rapidly. Mol. Biol. Evol. 26, 1195–1198. Lim, L.P., Glasner, M.E., Yekta, S., Burge, C.B. & Bartel, D.P. (2003a) Vertebrate microRNA genes. Science 299, 1540. Lim, L.P., Lau, N.C., Garrett-Engele, P., Grimson, A., Schelter, J.M., Castle, J., Bartel, D.P., Linsley, P.S. & Johnson, J.M. (2005) Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433, 769–773.
Lim, L.P., Lau, N.C., Weinstein, E.G., Abdelhakim, A., Yekta, S., Rhoades, M.W., Burge, C.B. & Bartel, D.P. (2003b) The microRNAs of Caenorhabditis elegans. Genes Dev. 17, 991–1008. Long, D., Chan, C.Y. & Ding, Y. (2008) Analysis of microRNA-target interactions by a target structure based hybridization model. Pac. Symp. Biocomput. http://psb.stanford.edu/psb-online/proceedings/psb08/abstracts/2008_p52.html. Lu, M., Shi, B., Wang, J., Cao, Q. & Cui, Q. (2010) TAM: a method for enrichment and depletion analysis of a microRNA category in a list of microRNAs. BMC Bioinformatics 11, 419. Lu, M., Zhang, Q., Deng, M., Miao, J., Guo, Y., Gao, W. & Cui, Q. (2008) An analysis of human microRNA and disease associations. PLoS ONE 3, e3420. Maiti, M., Nauwelaerts, K., Lescrinier, E., Schuit, F.C. & Herdewijn, P. (2010) Self-complementary sequence context in mature miRNAs. Biochem. Biophys. Res. Commun. 392, 572–576. Mendes, N.D., Freitas, A.T. & Sagot, M.F. (2009) Current tools for the identification of miRNA genes and their targets. Nucleic Acids Res. 37, 2419–2433. Naeem, H., Kuffner, R., Csaba, G. & Zimmer, R. (2010) miRSel: automated extraction of associations between microRNAs and genes from the biomedical literature. BMC Bioinformatics 11, 135. Nam, J.W., Kim, J., Kim, S.K. & Zhang, B.T. (2006) ProMiR II: a web server for the probabilistic prediction of clustered, nonclustered, conserved and nonconserved microRNAs. Nucleic Acids Res. 34, W455–W458. Nam, S., Kim, B., Shin, S. & Lee, S. (2008) miRGator: an integrated system for functional annotation of microRNAs. Nucleic Acids Res. 36, D159–D164. Nathans, R., Chu, C.Y., Serquina, A.K., Lu, C.C., Cao, H. & Rana, T.M. (2009) Cellular microRNA and P bodies modulate host-HIV-1 interactions. Mol. Cell 34, 696–709. Newman, M.A. & Hammond, S.M. (2010) Emerging paradigms of regulated microRNA processing. Genes Dev. 24, 1086–1092.
Nielsen, C.B., Shomron, N., Sandberg, R., Hornstein, E., Kitzman, J. & Burge, C.B. (2007) Determinants of targeting by endogenous and exogenous microRNAs and siRNAs. RNA 13, 1894–1910. Okamura, K., Phillips, M.D., Tyler, D.M., Duan, H., Chou, Y.T. & Lai, E.C. (2008) The regulatory activity of microRNA* species has substantial influence on microRNA and 3¢ UTR evolution. Nat. Struct. Mol. Biol. 15, 354–363. O’Toole, A.S., Miller, S., Haines, N., Zink, M.C. & Serra, M.J. (2006) Comprehensive thermodynamic analysis of 3¢ double-nucleotide overhangs neighboring Watson-Crick terminal base pairs. Nucleic Acids Res. 34, 3338–3344. Pan, W., Xin, P. & Clawson, G.A. (2010) MicroRNAs align with accessible sites in target mRNAs. J. Cell. Biochem. 109, 509–518.


Pillai, R.S., Bhattacharyya, S.N. & Filipowicz, W. (2007) Repression of protein synthesis by miRNAs: how many mechanisms? Trends Cell Biol. 17, 118–126. Ritchie, W., Flamant, S. & Rasko, J.E. (2010) mimiRNA: a microRNA expression profiler and classification resource designed to identify functional correlations between microRNAs and their targets. Bioinformatics 26, 223–227. Ruan, J., Chen, H., Kurgan, L., Chen, K., Kang, C. & Pu, P. (2008) HuMiTar: a sequence-based method for prediction of human microRNA targets. Algorithms Mol. Biol. 3, 16. Ruepp, A., Kowarsch, A., Schmidl, D., Buggenthin, F., Brauner, B., Dunger, I., Fobo, G., Frishman, G., Montrone, C. & Theis, F.J. (2010) PhenomiR: a knowledgebase for microRNA expression in diseases and biological processes. Genome Biol. 11, R6. Saetrom, P., Heale, B.S., Snove, O. Jr, Aagaard, L., Alluin, J. & Rossi, J.J. (2007) Distance constraints between microRNA target sites dictate efficacy and cooperativity. Nucleic Acids Res. 35, 2333–2342. Saito, T. & Saetrom, P. (2010) MicroRNAs - targeting and target prediction. N. Biotechnol. 27, 243–249. Sales, G., Coppe, A., Bisognin, A., Biasiolo, M., Bortoluzzi, S. & Romualdi, C. (2010) MAGIA, a web-based tool for miRNA and Genes Integrated Analysis. Nucleic Acids Res. 38, W352–W359. Sarver, A.L., Phalak, R., Thayanithy, V. & Subramanian, S. (2010) S-MED: sarcoma microRNA expression database. Lab. Invest. 90, 753–761. Seitz, H. (2009) Redefining microRNA targets. Curr. Biol. 19, 870–873. Severin, J., Waterhouse, A.M., Kawaji, H., Lassmann, T., van Nimwegen, E., Balwierz, P.J., de Hoon, M.J., Hume, D.A., Carninci, P., Hayashizaki, Y., Suzuki, H., Daub, C.O. & Forrest, A.R. (2009) FANTOM4 EdgeExpressDB: an integrated database of promoters, genes, microRNAs, expression dynamics and regulatory interactions. Genome Biol. 10, R39. Sewer, A., Paul, N., Landgraf, P., Aravin, A., Pfeffer, S., Brownstein, M.J., Tuschl, T., van Nimwegen, E. & Zavolan, M.
(2005) Identification of clustered microRNAs using an ab initio prediction method. BMC Bioinformatics 6, 267. Siomi, H. & Siomi, M.C. (2010) Posttranscriptional regulation of microRNA biogenesis in animals. Mol. Cell 38, 323–332. Sturm, M., Hackenberg, M., Langenberger, D. & Frishman, D. (2010) TargetSpy: a supervised machine learning approach for microRNA target prediction. BMC Bioinformatics 11, 292. Takane, K., Fujishima, K., Watanabe, Y., Sato, A., Saito, N., Tomita, M. & Kanai, A. (2010) Computational prediction and experimental validation of evolutionarily conserved microRNA target genes in bilaterian animals. BMC Genomics 11, 101.



Terai, G., Komori, T., Asai, K. & Kin, T. (2007) miRRim: a novel system to find conserved miRNAs with high sensitivity and specificity. RNA 13, 2081–2090.
Tomari, Y. (2009) Biochemical dissection of RISC assembly and function. Nucleic Acids Symp. Ser. (Oxf) http://nass.oxfordjournals.org/content/53/1/15.long
Veksler-Lublinsky, I., Shemer-Avni, Y., Kedem, K. & Ziv-Ukelson, M. (2010) Gene bi-targeting by viral and human miRNAs. BMC Bioinformatics 11, 249.
Wang, D., Wang, J., Lu, M., Song, F. & Cui, Q. (2010a) Inferring the human microRNA functional similarity and functional network based on microRNA-associated diseases. Bioinformatics 26, 1644–1650.
Wang, J., Lu, M., Qiu, C. & Cui, Q. (2010b) TransmiR: a transcription factor-microRNA regulation database. Nucleic Acids Res. 38, D119–D122.
Wang, X. (2008) miRDB: a microRNA target prediction and functional annotation database with a wiki interface. RNA 14, 1012–1017.
Wang, Z. & Yang, B. (2010) Detection, profiling, and quantification of miRNA expression: MicroRNA Expression Detection Methods (pp. 3–64). Springer, Berlin.
Washietl, S. (2010) Sequence and structure analysis of noncoding RNAs. Methods Mol. Biol. 609, 285–306.
Wu, L., Fan, J. & Belasco, J.G. (2006) MicroRNAs direct rapid deadenylation of mRNA. Proc. Natl Acad. Sci. USA 103, 4034–4039.
Xiao, F., Zuo, Z., Cai, G., Kang, S., Gao, X. & Li, T. (2009) miRecords: an integrated resource for microRNA-target interactions. Nucleic Acids Res. 37, D105–D110.
Xue, C., Li, F., He, T., Liu, G.P., Li, Y. & Zhang, X. (2005) Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine. BMC Bioinformatics 6, 310.
Yang, Y., Wang, Y.P. & Li, K.B. (2008) MiRTif: a support vector machine-based microRNA target interaction filter. BMC Bioinformatics 9(Suppl. 12), S4.
Yang, Z.R. (2010) Neural networks. Methods Mol. Biol. 609, 197–222.
Yeung, M.L., Bennasser, Y. & Jeang, K.T. (2007) miRNAs in the biology of cancers and viral infections. Curr. Med. Chem. 14, 191–197.
Yousef, M., Jung, S., Kossenkov, A.V., Showe, L.C. & Showe, M.K. (2007) Naive Bayes for microRNA target predictions – machine learning for microRNA targets. Bioinformatics 23, 2987–2992.
Yousef, M., Nebozhyn, M., Shatkay, H., Kanterakis, S., Showe, L.C. & Showe, M.K. (2006) Combining multi-species genomic data for microRNA identification using a Naive Bayes classifier. Bioinformatics 22, 1325–1334.
Yue, D., Liu, H. & Huang, Y. (2009) Survey of Computational Algorithms for MicroRNA Target Prediction. Curr. Genomics 10, 478–492.
Zeng, Y. & Cullen, B.R. (2006) Recognition and cleavage of primary microRNA transcripts. Methods Mol. Biol. 342, 49–56.

© 2011 The Authors. Journal compilation © 2011 by the Molecular Biology Society of Japan/Blackwell Publishing Ltd.

Zhang, C. (2008) MicroRNomics: a newly emerging approach for disease biology. Physiol. Genomics 33, 139–147.
Zhang, Y., Zhang, R. & Su, B. (2009) Diversity and evolution of MicroRNA gene clusters. Sci. China, C, Life Sci. 52, 261–266.
Zhao, H., Wang, D., Du, W., Gu, D. & Yang, R. (2010) MicroRNA and leukemia: tiny molecule, great function. Crit. Rev. Oncol. Hematol. 74, 149–155.

Received: 18 February 2011
Accepted: 23 August 2011

Supporting Information / Supplementary material

The following Supporting Information can be found in the online version of the article:

Table S1 List of additional Web resources

Additional Supporting Information may be found in the online version of this article.

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.


