Coeliac disease-associated polymorphisms influence thymic ... - Nature

9 downloads 0 Views 356KB Size Report
May 29, 2014 - Significant associations between coeliac disease (CD) and single nucleotide ... disease risk by influencing gene expression and regulation.
Genes and Immunity (2014) 15, 355–360 & 2014 Macmillan Publishers Limited All rights reserved 1466-4879/14 www.nature.com/gene

ORIGINAL ARTICLE

Coeliac disease-associated polymorphisms influence thymic gene expression SS Amundsen1, MK Viken2,3, LM Sollid1 and BA Lie2,3 Significant associations between coeliac disease (CD) and single nucleotide polymorphisms (SNPs) distributed over 40 genetic regions have been established. The majority of these SNPs are non-coding and 20 SNPs were, by expression quantitative trait loci (eQTL) analysis, found to harbour cis regulatory potential in peripheral blood mononuclear cells (PBMC). Almost all regions contain genes with an immunological relevant function, of which many act in the same biological pathways. One such pathway is T-cell development in the thymus, a pathway previously not explored in CD pathogenesis. The aim of our study was to explore the regulatory potential of the CD-associated SNPs (n ¼ 50) by eQTL analysis in thymic tissue from 42 subjects. In total, 43 nominal significant (Po0.05) eQTLs were found within 24 CD-associated chromosomal regions, corresponding to 27 expression-altering SNPs (eSNPs) and 40 probes (eProbes) that represents 39 unique genes (eGenes). Nine significant probe–SNP pairs (corresponding to 8 eSNPs and 7 eGenes) overlapped with previous findings in PBMC (rs12727642-PARK7, rs296547-DDX59, rs917997-IL18RAP, rs842647-AHSA2, rs13003464-AHSA2, rs6974491-ELMO1, rs2074404-NSF (two independent probes) and rs2298428-UBE2L3). When compared across more tissues, we found that 14 eQTLs could represent potentially novel thymus-specific eQTLs. This implies that CD risk polymorphisms could affect gene regulation in thymus. Genes and Immunity (2014) 15, 355–360; doi:10.1038/gene.2014.26; published online 29 May 2014

INTRODUCTION Coeliac disease (CD) is a chronic inflammatory disease of the small intestine that develops in genetically susceptible individuals as a result of an inappropriate immune response against dietary wheat gluten, as well as related proteins from rye and barley. There is an unambiguous genetic component in CD. Specific DQA1 and DQB1 alleles within the human leukocyte antigen (HLA) complex are strongly associated with CD; 90–95% of the patients carry the alleles DQA1*05:01/DQB1*02:01 (encoding the HLA-DQ2.5 molecule), whereas almost all the other patients either carry the alleles DQA1*03:01/DQB1*03:02 (encoding the HLA-DQ8 molecule) or DQA1*02:01/DQB1*02:02 (encoding the HLA-DQ2.2 molecule).1 These specific DQA1 and DQB1 alleles are needed, but not sufficient for the disease to develop, nor to explain the whole genetic predisposition. During the last couple of years, convincing evidence for association between CD and single nucleotide polymorphisms (SNPs) within 39 loci outside of the HLA complex has been revealed by genome-wide association (GWA) as well as follow-up studies.2–4 The vast majority of the CD-associated SNPs are non-coding variants (that is, intronic and intergenic SNPs). Such diseaseassociated non-coding risk variants are proposed to contribute to disease risk by influencing gene expression and regulation. Genetic variants that affect expression of a transcript mapping within the same locus (that is, cis associations) have been extensively reported.5,6 Thus, to address the regulatory potential of the CD-associated SNPs, cis expression–genotype correlation studies (so called expression quantitative trait loci (eQTL) analyses) have been performed in peripheral blood mononuclear cells (PBMCs). These analyses revealed that the CD-associated regions were significantly enriched for eQTLs as the number of SNPs

displaying eQTLs within these regions was significantly higher than expected by chance.2 Hence, these observations strongly indicate that several of the CD risk variants contribute to disease risk through mechanisms altering gene expression and regulation. As the above-mentioned eQTL studies were performed using PBMCs, the available expression data come from a limited number of cell types. Although whole blood constitutes a good surrogate tissue for gene expression analysis,2 genes with expression specific to other cell types are most likely also relevant for CD pathogenesis (for example, intestinal epithelial cells or thymic tissue), and would be missed when studying eQTLs only in PBMCs. Consequently, expression data from additional disease-relevant tissues are warranted in order to evaluate the relevance of such eQTL data in CD pathogenesis. Almost all the novel CD risk variants map near genes with immunologic function clearly pointing towards the involvement of the immune system in CD pathogenesis. Many of the genes also act together in the same biological pathways. T-cell development in the thymus was one pathway highlighted in these studies,2 a pathway that has not yet been explored in the context of CD development. Owing to the fact that CD is a T-cell-mediated disease characterised by T-cell tissue destruction, as well as the fact that the T-cell repertoire is shaped in the thymus, we considered thymus to be a highly relevant organ to address the regulatory role of genetic risk variants for CD. Therefore, based on the preceding knowledge, our study aimed to explore the regulatory potential for the novel CD-associated SNPs within thymic tissue by cis expression–genotype correlation (that is, eQTL) analysis.

1 Centre for Immune Regulation, Institute of Immunology, University of Oslo, Oslo University Hospital, Rikshospitalet, Oslo, Norway; 2Department of Medical Genetics, University of Oslo, Oslo University Hospital, Ullevål, Oslo, Norway and 3Department of Immunology, Oslo University Hospital, Rikshospitalet, Oslo, Norway. Correspondence: Dr SS Amundsen, Centre for Immune Regulation, Institute of Immunology, University of Oslo, Oslo University Hospital, Rikshospitalet, 0027 Oslo, Norway. E-mail: [email protected] Received 11 January 2014; revised 21 April 2014; accepted 22 April 2014; published online 29 May 2014

Coeliac disease-associated eQTLs in thymus SS Amundsen et al

356 RESULTS Genotype quality control All SNPs (N ¼ 50) included in this study displayed a genotype success rate above 92% (that is, maximum three missing genotypes) when genotyped with the mass array technology. To validate the Sequenom mass array results, two randomly selected SNPs were also genotyped by TaqMan, and complete genotype correlation was seen for all samples successfully genotyped with both methods. Gene expression in thymus Expression data for 549 probes, covering 337 of the total 424 non-HLA genes located within the 39 candidate regions, were available and hence included in our analysis (Supplementary Table 1). Over all, the probe intensity signals in this data set were in the range 5–14 with the majority (66%) of probes in the lower end of the scale (that is, intensity of 5–7). This is in line with what we would expect, because all genes are assumed to be expressed in at least low amounts in thymic tissue. Few probes displayed signals in the upper end of the scale, implying that few of the genes within the risk loci were strongly expressed in the thymus (Figure 1). In order to validate the array data, we also measured expression of nine randomly selected genes using TaqMan gene expression assays (see Materials and Methods). Good correlation was seen between the gene expression levels measured by the two methods, that is, genes with low array signals amplified with high Ct values (CtX32), whereas genes with signals in the upper range amplified with low Ct values (Ct ¼ 32-25) (data not shown). From these data, we concluded that genes with array signals as low as 5–5.5 were either not expressed or expressed at very low amounts in thymic tissue, whereas genes with signals ranging from 6–14 showed expression in thymus. Expressed quantitative trait loci (eQTL) associated with CD An on-going challenge in genetic research of complex human traits is the assignment of function to the detected disease-associated non-coding genetic variants. As eQTL analysis is one way to suggest causal genes within defined risk regions, we investigated the 39 non-HLA risk loci (50 SNPs) for correlation with

15 14 13

Intensity

12 11 10 9 8 7 6 5 Probes

Figure 1. Distribution of the signal intensity for the 549 probes covering genes located within the 39 non-HLA CD-associated regions. Each dot represents one probe and the intensity value is the mean intensity of the 42 thymic samples. The 20 most strongly expressed genes (array signals 410, captured by 25 probes) within the CD-associated regions were TNFRSF14, EST1, AIRE, SRRM1, ARL6IP5, TM2D1, SYF2, PSMG2, BCL11A, ZFP36L1, CD247, CSTB, XPO1, LMOD3, VIL2, COX17, PARK7, ERP29, TMSB4X and TRIM13. Genes and Immunity (2014) 355 – 360

cis gene expression (excluding the HLA region on chromosome 6 because the major causal gene within this region is known and also because high degree of exonic polymorphisms influence the binding efficiency of probes to various HLA alleles). In total, 43 nominal significant probe–SNP pairs (Po0.05) distributed over 24 CD-associated chromosomal regions were detected (Table 1 and Supplementary Figure 1). The 43 eQTLs involved 27 expression-altering SNPs (eSNPs) and 40 probes (eProbes) representing 39 unique genes (eGenes). The most significant eQTLs (that is, P-valueo0.01) were rs2298428-UBE2L3 (P-value ¼ 9.0  10  4), rs12727642-PARK7 (P-value ¼ 1.4  10  3), rs3748816-HES5 (P-value ¼ 1.7  10  3), rs917997-IL18RAP 3 (P-value ¼ 1.8  10 ), rs13003464-XPO1 (P-value ¼ 5.8  10  3), rs4675374-RAPH1 (P-value ¼ 1.4  10  3), rs6441961-CCR1 3 (P-value ¼ 1.7  10 ), rs1738074-ILMN_1902604 (P-value ¼ 3.2  10  3) and rs13003464-C2orf74 (P-value ¼ 1.1  10  3) (Figure 2). The distance between the eSNP and the eGene ranged from 0 to 728 kb, with an average distance of 223 kb. Seven of the 27 eSNPs were intergenic, while 16 were located within known genes in the respective regions: exons (n ¼ 4), UTRs (n ¼ 2) and introns (n ¼ 14). Two SNPs were located within the gene of which they were associated with differential expression, that is, rs17035378 in PLEK and rs6974491 in ELMO1. Both SNPs were located in introns, not close to exons. Thymic eQTLs compared with PBMC Of the 27 putative thymic eSNPs, 16 had previously been annotated as eSNPs in PMBC by Dubois et al.2 Eight of these eSNPs influenced expression of the same gene in both tissues with the same direction of allelic expression (rs12727642-PARK7, rs296547-DDX59, rs917997-IL18RAP, rs842647-AHSA2, rs13003464AHSA2, rs6974491-ELMO1, rs2074404-NSF (two independent probes) and rs2298428-UBE2L3). The remaining eight eSNPs showed tissue-specific regulation and influenced expression of different genes in the two tissues (that is, discordant regulation; Table 1). Furthermore, 11 eSNPs were previously not annotated as eSNPs in PMBC and represent potentially thymus-specific eSNPs. The association between SNP genotype and gene expression for these 27 eSNPs is shown in Supplementary Figure 1. eQTL comparisons across tissues Encouraged by these findings, we wanted to explore the regulatory potential of the 27 identified eSNPs in more tissues using the Genevar software, particularly aiming to evaluate the 11 potentially thymus-specific eSNPs. The Genevar software contains eQTL results generated from five distinct tissues (B-lymphoblastoid cell lines, T cells, fibroblast, skin and adipose tissue), thus providing a tool for comparison of SNPs regulatory potential over several tissues. Using this software, all 27 eSNPs appeared to be expression associated SNPs also in other tissues and cell types. However, in most cases, the genes under regulation, that is, the eGenes, differed between different tissues (Supplementary Table 2). Hence, none of the eSNPs were found to exclusively influence gene expression in thymus. On the other hand, expression of 13 genes were only associated with eSNPs in thymic tissue, and not found to be differentially expressed due to the CD-risk SNPs either in PMBC2 or in any of the other five tissues available in Genevar (Supplementary Table 3). This implies that expression of these 13 genes could be exclusively regulated by the CD-associated SNPs in thymus, and hence represented by putative thymus-specific eQTLs. Notably, of these 13 genes, probes for 7 genes were not tested across all five tissues (that is, RGS13, INADL, PANK4, TSP50, ERP29, FAM109 and ZFP36L1). Therefore, only six eQTLs could be perceived to be thymus-specific based on current information; that is, rs864537POU2F1 (P-value ¼ 1.3  10  2), rs13098911-LRRC2 (P-value ¼ 3.7  10  2), rs196432-GRHL3 (P-value ¼ 1.4  10  2), rs196432& 2014 Macmillan Publishers Limited

Coeliac disease-associated eQTLs in thymus SS Amundsen et al

357 Table 1.

Overview of CD-associated SNPs being eQTLs in thymic tissue

Region

eSNP

1 2

rs12727642 rs2816316 rs2816316 rs296547 rs296547 rs296547 rs6691768 rs864537 rs3748816 rs3748816

3 4 5 6

7 8 9 10

11 12 13

14 15 16 17 18 19

20 21

22 23 24

Location

eGene

IlluminaArrayID

Intensity

P-value

Distance (kb)

eQTL in blooda

1 1 1 1 1 1 1 1 1

Intergenic Intergenic Intergenic Intergenic Intergenic Intergenic Intron (NFIA) Intron (CD247) Exon (MMEL1)

PARK7 RGS13 TROVE2 GPR25 DDX59 KIF21B INADL POU2F1 HES5

ILMN_1744713 ILMN_2407775 ILMN_1738909 ILMN_1761766 ILMN_1748077 ILMN_1800670 ILMN_1773312 ILMN_1794333 ILMN_1794742

10.67 5.33 5.71 5.15 8.23 5.24 5.36 6.05 5.52

1.4  10  3 b 2.7  10  2 4.3  10  2 3.8  10  2 3.3  10  2 3.3  10  2 4.8  10  2 1.3  10  2 1.7  10  3

1.3 68 492 49 253 46 416 15 65

1

Exon (MMEL1)

PANK4

ILMN_1743910

6.28

2.3  10  2

68

2

PARK7 — — DDX59 DDX59 DDX59 — CD247 PLCH2, TNFRSF14, C1orf93, MMEL1 PLCH2, TNFRSF14, C1orf93, MMEL1 — — — — — IL18RAP IL18RAP AHSA2 AHSA2 AHSA2 AHSA2 AHSA2 — — CCR3 CCR3 CCR3, CXCR6 CCR3, CXCR6 — — TAGAP ELMO1 ZMIZ1 SH2B3, ALDH2, TMEM116

Chr

GRHL3 IL28RA IL22RA1 PLEK PPP3R1 IL18RAP TMEM182 AHSA2 AHSA2 C2orf74 C2orf74 XPO1 RAPH1 FBXL2 TSP50 CCR1 CCR1 LRRC2 ARHGAP31 NUDT6 EZR-AS1 ELMO1 PPIF FAM109A

ILMN_1782141 ILMN_1680805 ILMN_1666175 ILMN_1795762 ILMN_1796962 ILMN_1721762 ILMN_1806801 ILMN_1798308 ILMN_1798308 ILMN_1754501 ILMN_1754501 ILMN_1725121 ILMN_1783846 ILMN_1688639 ILMN_1775615 ILMN_1678833 ILMN_1678833 ILMN_1748530 ILMN_1915328 ILMN_2366972 ILMN_1902604 ILMN_1784320 ILMN_1809607 ILMN_1743316

5.59 5.34 5.23 8.94 7.71 6.12 5.36 7.80 7.80 6.67 6.67 10.10 6.60 5.57 5.23 5.82 5.82 5.38 5.12 6.21 5.41 7.53 5.59 5.37

1.4  10 1.7  10  2 2.5  10  2 4.2  10  2 1.4  10  2 1.8  10  3 3.2  10  2 4.6  10  2 3.1  10  2 b 1.1  10  3 2.7  10  2 5.8  10  3 1.4  10  3 b 2.8  10  2 3.5  10  2 1.7  10  3 6.5  10  3 b 3.7  10  2 b 4.0  10  2 4.1  10  2 3.2  10  3 1.3  10  2 4.7  10  2 1.2  10  2

170 348 392 — 112 1,5 308 285 217 185 252 518 402 303 401 106 10 322 20 728 224 — 48 83/206

ERP29

ILMN_2323048

11.29

4.6  10  2

443/566

SH2B3, ALDH2, TMEM116

14 17

Exon (RCAN3) Exon (RCAN3) Exon (RCAN3) Intron (PLEK) Intron (PLEK) Intergenic Intergenic Intron (REL) Intron (PUS10) Intron (PUS10) Intron (REL) Intron (PUS10) Intron (ICOS) Intergenic Intergenic Intergenic Intron (CCR3) Intergenic (CCR3) Intron (CDGAP) Intron (KIAA1109) UTR (TAGAP) Intron (ELMO1) Intron (ZMIZ1) EXON (SH2B3) and Intron (ATXN2) EXON (SH2B3) and Intron (ATXN2) Intergenic Intron (WNT3)

ZFP36L1 NSF

ILMN_1675448 ILMN_1680353

10.16 9.63

1.2  10  2 2.4  10  2

18 31

rs2074404

17

Intron (WNT3)

NSF

ILMN_2251784

6.16

3.0  10  2

31

rs1893217* rs4819388 rs2298428 rs2298428 rs2298428

18 21 22 22 22

Intron (PTPN2) UTR (ICOSLG) Exon (YDJC) Exon (YDJC) Exon (YDJC)

SLMO1 PDXK UBE2L3 TOP3B HIC2

ILMN_2232157 ILMN_1672504 ILMN_1677877 ILMN_1912077 ILMN_1652762

6.06 8.80 7.90 5.37 7.60

8.4  10  3 3.9  10  2 9.0  10  4 2.7  10  2 3.7  10  3

377 465 4.6 328 177

— NSF, LRRC37A, WNT3, LOC388397 NSF, LRRC37A, WNT3, LOC388397 — RRP1 UBE2L3 UBE2L3 UBE2L3

rs196432 rs196432 rs196432 rs17035378 rs17035378 rs917997 rs917997 rs842647 rs13003464 rs13003464 rs842647 rs13003464 rs4675374 rs13314993 rs6441961 rs6441961 rs13098911 rs13098911 rs11712165 rs13151961 rs1738074 rs6974491 rs1250552 rs3184504 and rs653178* rs3184504 and rs653178* rs4899260 rs2074404

1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 4 6 7 10 12 12

Abbreviations: Chr, chromosome; ns, not significant. These results represent Kruskal–Wallis statistics. aReference Dubois et al.2 eSNPs marked in bold do not overlap with eSNPs in PBMC. Intensity refers to the mean signal intensity of the probe. As rs3184504 and rs653178* are in complete LD, the results for these two SNPs were identical. Distance refers to the distance from the SNP to the gene under regulation. When performing the Mann–Whitney U-test, homozygous and heterozygote carriers for the risk alleles were grouped and compared with homozygous non-risk individuals. The Mann–Whitney U-test was performed only in the cases when a clear trend towards carrier status was seen. In a few cases, only two SNP genotype groups were available (due to low minor allele frequency), and in this case the Mann–Whitney U-test was performed. bMann–Whitney U-test.

IL22RA1 (P-value ¼ 2.5  10  2), rs917997-TMEM182 (P-value ¼ 3.2  10  2) and rs13314993-FBXL2 (P-value ¼ 2.8  10  2) (Supplementary Table 3). As the six putatively thymus-specific regulated genes showed low expression (with array intensity in the range 5–6; Table 1 and Supplementary Figure 1) we further confirmed the expression of these genes with TaqMan analysis. All & 2014 Macmillan Publishers Limited

genes amplified with Ct values that correlated well with the low array signals (Ct values in the range 30–34) except for IL22RA1 where most samples were non-detectable (Ct values435). Taken together, these data suggest that 27 of the 50 CDassociated SNPs investigated in this study are eSNPs in thymic tissue, and six of the identified eQTLs were exclusively found in Genes and Immunity (2014) 355 – 360

Coeliac disease-associated eQTLs in thymus SS Amundsen et al

358 P-value = 2.0x10-4

8.0

5.8

7.5

5.6

7.0

Intensity

5.4

6.5 6.0

5.2

5.5

5.0

rs 91 79 97 -

T

Genotypes

Genotypes UBE2L3 (ILMN_1677877) 9.0 8.5

7.5 Intensity

Intensity

C2orf74 (ILMN_1754501) P-value = 8.0x10-4

8.0

P-value = 2.0x10-4

rs 91 79 97 -T T

C

C

C

rs 37 48 81 6_ A A

rs 37 48 81 6_ G G

rs 37 48 81 6_ AG

5.0 rs 91 79 97 -

Intensity

IL18RAP (ILMN_1721762)

HES5 (ILMN_1794742) P-value = 2.0x10-4

6.0

8.0

7.0 6.5

7.5 6.0 7.0

T

T

_T 64 rs 1

30 rs 1

30

34

34

64

_C 64 34 30 rs 1

Genotypes

Genotypes

PARK7 (ILMN_1744713)

ILMN_1902604

XPO1 (ILMN_1725121) P-value = 5.8x10-3

10.6

6.0

P-value = 1.4x10-3

5.8

Genotypes

C C 38 rs 17 C C 1_ 91 98

30 98 91 1_ rs 1

1_ 91 98 30 rs 1

C

TT

T 61 19 rs 64 4

61 19 rs 64 4

_T

_C

C _C 61 19 rs 64 4

53 rs 46 7

53 67 rs 4

74

74

_C C

_C T

_T 74

Genotypes

30

5.0

rs 1

5.0 T

5.5

T

6.0 5.5

53

38

38 rs 17

00 rs 13

6.0 5.5

67

C T

4_

64 34

34 00 rs 13

6.5

6.0

rs 4

CCR1 (ILMN_1678833) P-value = 6.5x10-3

7.0

Intensity

Intensity

6.5

T

Intensity

Genotypes

6.5

7.0

Figure 2.

_C

_C 64

64 34 00 rs 13

Genotypes CCR1 (ILMN_1678833) P-value = 6.5x10-3

P-value = 1.4x10-3

7.5

C

T

T _T

C 2_

Genotypes RAPH1 (ILMN_1783846)

4_

5.0

07

9.6

TT

5.2

C

9.8

64 27 27 rs 1

rs 1

rs 1

27

27

27

27

64

64

2_

2_

A

AC

A

10.0

5.4

4_

10.0

5.6

07

10.2

07

Intensity

Intensity

10.5

Intensity

10.4 11.0

P-value = 3.2x10-3

rs 17

11.5

_C

C

G _G 28 84 29 rs 2

rs 2

rs 2

29

29

84

84

28

28

_A

_A G

A

5.5

Genotypes

eQTLs with strongest statistical significance in thymus.

the thymus, indicating that expression of these six genes could be uniquely regulated in thymus by the CD-associated SNPs. DISCUSSION More than half of the CD-associated regions investigated, 24 out of 39 loci, were in this study suggested to be expressed quantitative trait loci in thymus. The associations did not stand correction for multiple testing, which is not surprising given limited thymus sample size; hence, the results should be confirmed in future studies. Nevertheless, 16 of the 27 potential Genes and Immunity (2014) 355 – 360

thymic eSNPs had previously also been identified as eSNPs in PBMC, and all showed association with the same allelic direction. Eight of these eSNPs showed concordant regulation of the same genes in PBMC and thymus, while the remaining eight SNPs appeared to regulate another nearby gene. In addition, at least 6 thymus-specific eQTLs were indicated, and 13 when including those that were not explored across the entire set of tissues. The expression levels of some of these genes were low, thus their in vivo relevance is unclear. The vast majority of cells in thymus are thymocytes. Hence, it is most likely that the expression profiles and differences observed in this study are due to expression & 2014 Macmillan Publishers Limited

Coeliac disease-associated eQTLs in thymus SS Amundsen et al

359 differences in this dominant cell type. However, as the expression analysis in this study was performed on total RNA, consisting of a pool of mixed cell types, we are unable to draw conclusions about which subpopulations of cells the eQTLs are likely to be important. So, the observation of low expression of many genes could be explained by the fact that they are expressed in only subpopulation of cells in thymus. Further investigations are needed to conclude whether and how the observed eQTLs specifically contribute to the aetiology of CD. At any rate, identification of eSNPs using blood as surrogate tissue seems to be good but not sufficient for tissue- and disease-specific eQTL mapping. This is noteworthy given that SNPs associated with complex traits have been found to more often exert a tissue-dependent effect on gene expression.7 SNPs located in transcriptional regulatory elements (including SNPs at 30 and 50 regions as well as non-synonymous-coding SNPs) have been shown to be enriched for tissue-dependent regulation.7 It has been suggested that the majority of eQTLs display concordant association across tissues and that tissue-dependent eQTLs in most cases can be explained by different causal variants (that is, from more and independently associated eSNPs).7 One of the potential thymus-specific eSNPs was non-synonymous (rs196432), whereas the rest were intronic and intergenic SNPs. This may relate to the fact that the majority of the 50 CDassociated SNPs included in this study were intergenic (n ¼ 26) and intronic (n ¼ 14). Alternatively, it could imply that these eSNPs are markers for the true regulatory variant giving significant eQTL signals through linkage disequilibrium with the causal regulatory variant yet to be discovered. As thymus is an immunologically relevant organ involved in autoimmunity in general, one could envisage that thymus-specific eQTLs would be shared between autoimmune diseases. Of the putative thymus-specific eQTLs identified in this study, all eSNPs, except rs13314993, are located within regions found to associate with at least one additional disease that is related to the immune system. For instance, the coding SNP rs3184504 (locating within the gene SH2B3) is one such common autoimmune risk variant. This exon SNP is positioned within a histone mark H3K27ac site that is a genomic site known to spot active enhancers. The same is true for the nearby intronic SNP rs653178. Both SNPs were found to regulate FAM109 in a putative thymus-specific manner. Another regulatory SNP position within a histone mark H3K27ac site was rs2298428. In terms of statistical significance, cis regulation of UBE2L3 by rs2298428 was among the most significant (P-value ¼ 9.0  10  4). This gene was also strongly expressed in thymus (array signal of 7.9). Concordant regulation of UBE2L3 by rs2298428 was reported in PBMC,2 and rs2298428 locates within a genomic region shown to confer risk to various immune-related diseases (for example, systemic lupus erythematosus and rheumatoid arthritis). More than half of the genes (53%) included in this study appeared to be either weakly or not expressed in thymus (array signal o6). This may reflect the reality, but it could also be due to poor probe coverage as the majority of the genes within these regions were covered by only one probe on the microarray. More probes covering a gene would increase the likelihood to detect the transcripts expressed by a particular gene (exemplified with the gene UBE2E3 where good expression was only seen for one of in total seven probes covering this gene). Of the 39 unique eGenes in this study, 19 displayed low expression (array intensity in the range 5–6) and only 7 genes were strongly expressed, that is, PARK7, PLEK, XPO1, PDXK, NSF, ERP29 and ZFP36L1, of which ERP29 and ZFP36L1 were among the genes displaying putative thymusspecific regulation. The gene ZFP36L1 (Zinc finger protein 36, C3H type-like 1) is a RNA-binding protein that regulates mRNA turnover by binding to AU-rich elements in the 30 -untranslated region of mRNAs (UUAUUUAU being the most optimal), and thereby triggering their degradation. In mice, post-transcriptional & 2014 Macmillan Publishers Limited

regulation mediated by ZFP36L1 was found to affect thymocyte development as it contributed to aberrant b-selection.8 Studies of ZFP36L1 in mice have also shown that modulation of mRNA stability represents a promising therapeutic approach in cancer therapy.9 Our finding that ZFP36L1 is expressed in a genotypedependent manner in thymus, influenced by the CD-risk SNP rs4899260, is therefore interesting. Specific gene expression in thymus has been implicated in the pathogenesis of autoimmune diseases. In type 1 diabetes, diseaseassociated variants upstream of the insulin gene lead to altered insulin expression in thymus, and this was hypothesised to altered T-cell tolerance for insulin as a self-protein.10 The importance of thymic T-cell regulation and development in CD has so far not been investigated. Hence, to our knowledge this is the first study showing that CD-associated SNPs act as cis regulatory variants also in the thymus. In conclusion, some of the CD-associated SNPs appear to be cis regulatory variants in thymus, indicating that regulation of gene expression in thymus could be of importance in CD aetiology. Hence, our results add to the knowledge of the CD-associated SNPs being regulatory SNPs and further underline the need for investigating eQTLs in multiple disease-relevant tissues. MATERIALS AND METHODS Sample material Thymic tissue samples from 42 Norwegian children undergoing heart surgery (22 girls and 20 boys) were used in this study. All children were less than 13 years (26 samples were collected from children less than 1-year old). This project was approved by the regional ethical committee and written informed consent was given by all parents. All tissue samples were kept anonymous.

SNP selection We included in our study all 40 SNPs displaying genome-wide (P-value o5  10  8) as well as suggestive association with CD (in the study by Dubois et al.,2 defined as 5  10  6 o P-value 45  10  8 and/or P-valueGWAS o10  4 and P-valuefollow-up o0.01). In addition, we included the second SNP that display independent and strong association with CD from the six loci showing evidence of multiple independent associations (in logistic regression analysis). The four non-synonymous SNPs (rs196432RUNX3, rs3184504-SH2B3, rs3816281-PLEK and rs3748816-MMEL1) with evidence for association with CD (P-valueGWASo10  4) within these regions were also included. In total, 50 SNPs were investigated (Supplementary Table 1).

DNA and SNP genotyping DNA was extracted from thymic tissue samples and whole-genomeamplified using GenomiPhi Amplification kit (GE Healthcare, Little Chalfont, UK). The whole-genome-amplified DNA samples were genotyped using the Sequenom MassArray technology (Sequenom, San Diego, CA, USA). Primer design and multiplexing were performed using the SEQUENOM software MassArray Assay Design (version 2.0.0.17). Size separation was performed in a MALDI-TOF mass spectrometer (integrated in the Sequenom equipment), and genotypes were called using the software SPECTRO ACQUIRE (version 3.0.1.14). As an additional quality control and method validation, we genotyped two randomly selected SNPs (rs802734-THEMIS, and rs10903122-RUNX3) using the TaqMan technology. The selected TaqMan assays (Applied Biosystems, Foster City, CA, USA) were run in accordance with the manufacturer’s recommended programme in a StepOne Plus RT-PCR instrument (Applied Biosystems) and in a total reaction volume of 5 ml.

RNA isolation and cDNA synthesis The thymic tissue samples were collected as thin slices in tubes containing RNAlater solution (Ambion, Austin, TX, USA) immediately after surgical removal. Small pieces of B50 mg were used for the extraction of DNA and total RNA using TRIzol Reagent (Invitrogen, Carlsbad, CA, USA). Hence, cDNA synthesis was performed with random hexamers and SuperScrip III Reverse Transcriptase (Invitrogen) after treatment with RNase-free DNase I Genes and Immunity (2014) 355 – 360

Coeliac disease-associated eQTLs in thymus SS Amundsen et al

360 (New England Biolabs, Ipswich, UK). We performed the cDNA synthesis according to the manufacturer’s recommendation using B1000 ng of total RNA as input.

female twins of the MuTHER resource.14 eQTLs with a P-valueo0.05 within any of the above-mentioned sample sets were considered significant and thus selected for comparison.

RNA quality control

Statistics

All RNA samples were run on a Bioanalyzer 2100 instrument (Agilent Technologies, Santa Clara, CA, USA) in order to measure the purity and RNA integrity (degradation). None of the samples showed signs of DNA contamination. The RNA concentrations were quantified using NanoDrop spectrophotometer (Thermo Fischer Scientific Inc., Waltham, MA, USA).

For the eQTL analysis, we performed a nonparametric one-way ANOVA (Kruskal–Wallis) test. For a few SNPs, none of the individuals were homozygous for the minor allele, and we performed a two-tailed Mann– Whitney U-test. A P-value o0.05 was considered to be statistically significant.

Determination of gene expression levels by microarray A whole-genome expression array was performed utilising the Illumina HumanWG-6 v3 array (Illumina, San Diego, CA, USA) and raw probe intensity was extracted using BeadStudio software (Illumina). From the whole-genome expression data, only genes mapping within one mega base (1 Mb) surrounding the CD-associated SNP were included in our study (Supplementary Table 1). Two thymic tissue samples were removed before further analysis due to low RNA integrity number, leaving 40 cDNA samples for further analysis. Quantile-normalised data were log2-transformed by using J-express software.11 After log2 transformation, the probe intensity values ranged from 5 to 14.

CONFLICT OF INTEREST The authors declare no conflict of interest.

ACKNOWLEDGEMENTS We thank Harald Lindberg at Rikshospitalet for collecting the thymus tissues, Siri Flåm and Hege D Sollid for preparing the DNA and RNA samples, and Haleh Saeedi for TaqMan gene expression analyses. This study was supported by grants from the South-Eastern Norway Regional Health Authorities, the Norwegian Diabetes Association and Novo Nordisk.

TaqMan real-time gene expression In order to validate the microarray data, we measured gene expression for nine randomly selected genes (selected among the genes within the CDassociated regions) with predesigned TaqMan gene expression assays (Applied Biosystems). We selected assays designed to capture all or at least as many splice variants as possible: MMEL1 (Hs00364353_m1), TNFRSF14 (Hs00998600_g1), REL (Hs00231279_m1), CTLA4 (Hs03044418_m1), OLIG3 (Hs00703087_s1), TNFAIP3 (Hs01568119_m1), IL2 (Hs00174114_m1), IL21 (Hs00222327_m1) and CLEC16A (Hs00389799_m1). In addition, we quantified the expression of the six putative thymus-specific differentially expressed genes: GRHL3 (Hs00297962_m1), IL22RA1 (Hs00222035_m1), TMEM182 (Hs00288464_m1), FBXL2 (Hs00247211_m1), LRRC2 (Hs00225885_m1) and POU2F1 (Hs01552835_m1_M). Gene expression was performed by real-time PCR (RT-PCR). All samples were run in triplicates in an ABI7900HT instrument under standard conditions and analysed by the SDS v2.3 software (Applied Biosystems) after removal of samples with standard deviation above 0.167. All reactions were performed in a total volume of 10 ml: 4 ml cDNA (1 ng ml  1), 0.5 ml Taqman gene expression assay (20  ), 5 ml Taqman gene expression master mix (10  ) and 0.5 ml sterile water.

eQTL analysis SNP genotypes were correlated with expression data for all genes locating within a window of 1 Mb surrounding the respective SNP (that is, 500 kb on each side of the SNP) within the 39 non-HLA CD-associated regions. The Illumina chip contained expression data for 337 genes (covered by 549 probes) of the total 424 genes located within these 39 regions (that is, 87 genes were not covered by probes on the array; Supplementary Table 1). The number of genes tested for each region ranged from 2 to 23, with an average of 8.6 (Supplementary Table 1).

The Genevar software The gene expression variation (Genevar) software was used in order to compare the eSNPs regulatory potential across the cell types and tissues available in this database (http://www.sanger.ac.uk/resources/software/ genevar/). eQTL data were available from (1) HapMap lymphoblastoid cell lines from several populations (CEU (n ¼ 109), CHB (n ¼ 80), GIH (n ¼ 82), JPT (n ¼ 82), LWK (n ¼ 82), MEX (n ¼ 45), MKK (n ¼ 138) and YRI (n ¼ 108));12 (2) T-cells, lymphoblastoid cell lines and fibroblast derived from umbilical cords of 75 Geneva GenCord individuals13 and (3) adipose (n ¼ 166), skin (n ¼ 160) and lymphoblastoid cell lines (n ¼ 156) derived from healthy

REFERENCES 1 Sollid LM. Coeliac disease: dissecting a complex inflammatory disorder. Nat Rev Immunol 2002; 2: 647–655. 2 Dubois PC, Trynka G, Franke L, Hunt KA, Romanos J, Curtotti A et al. Multiple common variants for celiac disease influencing immune gene expression. Nat Genet 2010; 42: 295–302. 3 Hunt KA, Zhernakova A, Turner G, Heap GA, Franke L, Bruinenberg M et al. Newly identified genetic risk variants for celiac disease related to the immune response. Nat Genet 2008; 40: 395–402. 4 van Heel DA, Franke L, Hunt KA, Gwilliam R, Zhernakova A, Inouye M et al. A genome-wide association study for celiac disease identifies risk variants in the region harboring IL2 and IL21. Nat Genet 2007; 39: 827–829. 5 Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J et al. Genetics of gene expression and its effect on disease. Nature 2008; 452: 423–428. 6 Stranger BE, Forrest MS, Clark AG, Minichiello MJ, Deutsch S, Lyle R et al. Genomewide associations of gene expression variation in humans. PLoS Genet 2005; 1: e78. 7 Fu J, Wolfs MG, Deelen P, Westra HJ, Fehrmann RS, Te Meerman GJ et al. Unraveling the regulatory mechanisms underlying tissue-dependent genetic variation of gene expression. PLoS Genet 2012; 8: e1002431. 8 Hodson DJ, Janas ML, Galloway A, Bell SE, Andrews S, Li CM et al. Deletion of the RNA-binding proteins ZFP36L1 and ZFP36L2 leads to perturbed thymic development and T lymphoblastic leukemia. Nat Immunol 2010; 11: 717–724. 9 Planel S, Salomon A, Jalinot P, Feige JJ, Cherradi N. A novel concept in antiangiogenic and antitumoral therapy: multitarget destabilization of short-lived mRNAs by the zinc finger protein ZFP36L1. Oncogene 2010; 29: 5989–6003. 10 Vafiadis P, Bennett ST, Todd JA, Nadeau J, Grabs R, Goodyer CG et al. Insulin expression in human thymus is modulated by INS VNTR alleles at the IDDM2 locus. Nat Genet 1997; 15: 289–292. 11 Dysvik B, Jonassen I. J-Express: exploring gene expression data using Java. Bioinformatics 2001; 17: 369–370. 12 Stranger BE, Montgomery SB, Dimas AS, Parts L, Stegle O, Ingle CE et al. Patterns of cis regulatory variation in diverse human populations. PLoS Genet 2012; 8: e1002639. 13 Dimas AS, Deutsch S, Stranger BE, Montgomery SB, Borel C, Attar-Cohen H et al. Common regulatory variation impacts gene expression in a cell type-dependent manner. Science 2009; 325: 1246–1250. 14 Nica AC, Parts L, Glass D, Nisbet J, Barrett A, Sekowska M et al. The architecture of gene regulatory variation across multiple human tissues: the MuTHER study. PLoS Genet 2011; 7: e1002003.

Supplementary Information accompanies this paper on Genes and Immunity website (http://www.nature.com/gene)

Genes and Immunity (2014) 355 – 360

& 2014 Macmillan Publishers Limited