Impact of the Interaction between 3′-UTR SNPs and microRNA on the Expression of Human Xenobiotic Metabolism Enzyme and Transporter Genes

Genetic variation in the expression of human xenobiotic metabolism enzymes and transporters (XMETs) leads to inter-individual variability in metabolism of therapeutic agents as well as differed susceptibility to various diseases. Recent expression quantitative traits loci (eQTL) mapping in a few human cells/tissues have identified a number of single nucleotide polymorphisms (SNPs) significantly associated with mRNA expression of many XMET genes. These eQTLs are therefore important candidate markers for pharmacogenetic studies. However, questions remain about whether these SNPs are causative and in what mechanism these SNPs may function. Given the important role of microRNAs (miRs) in gene transcription regulation, we hypothesize that those eQTLs or their proxies in strong linkage disequilibrium (LD) altering miR targeting are likely causative SNPs affecting gene expression. The aim of this study is to identify eQTLs potentially regulating major XMETs via interference with miR targeting. To this end, we performed a genome-wide screening for eQTLs for 409 genes encoding major drug metabolism enzymes, transporters and transcription factors, in publically available eQTL datasets generated from the HapMap lymphoblastoid cell lines and human liver and brain tissue. As a result, 308 eQTLs significantly (p < 10−5) associated with mRNA expression of 101 genes were identified. We further identified 7,869 SNPs in strong LD (r2 ≥ 0.8) with these eQTLs using the 1,000 Genome SNP data. Among these 8,177 SNPs, 27 are located in the 3′-UTR of 14 genes. Using two algorithms predicting miR-SNP interaction, we found that almost all these SNPs (26 out of 27) were predicted to create, abolish, or change the target site for miRs in both algorithms. Many of these miRs were also expressed in the same tissue that the eQTL were identified. Our study provides a strong rationale for continued investigation for the functions of these eQTLs in pharmacogenetic settings.


INTRODUCTION
Xenobiotic metabolizing enzymes and transporters (XMETs) are involved in biotransformation and detoxification of carcinogens, environmental toxins, and therapeutic drugs (Carlsten et al., 2008;Korkina et al., 2009). In humans, the process of biotransformation and detoxification of xenobiotics by XMETs can be divided into three phases: modification (phase I) primarily by enzymes of the cytochromes P450 superfamily; conjugation (phase II), e.g., glucuronidation by UDP-glucuronosyl transferase; and excretion (phase III) mainly by membrane transporters. XMETs are expressed in almost all tissue types, centrally and locally protecting the entire body against the damages caused by various natural and synthetic compounds. XMETs are highly expressed in digestive tract and especially in the liver, the most important organ for central metabolism (Conde-Vancells et al., 2010).
Variations in the expression and activity of these XMETs lead to significant inter-individual difference in the disposition of exogenous chemicals including absorption, distribution, metabolism, and excretion (ADME) of pharmaceutical drugs. On the other hand, many XMETs are also found to be very abundant in non-digestive tract tissues/cells, e.g., brain, lung, bladder, and blood (Pavek and Dvorak, 2008). These XMETs could affect the local response to certain drugs at the site of action. Meanwhile, due to the crucial role of XMETs in detoxification of carcinogens and toxins, genetic variation in XMETs function in specific tissues/organs is also an important mechanism underlying genetic susceptibility to certain diseases, e.g., those XMETs expressed in lung and bladder may modify cancer risk. Recent genome-wide association studies have identified polymorphisms at the UGT1A locus strongly associated with urinary bladder www.frontiersin.org cancer risk (Selinski et al., 2012). XMETs are sensitively regulated by various nuclear receptors (NRs) and transcription factors (TFs). These trans-acting regulators play a pivotal role in mediating cellular response to exposure to xenobiotics by modulating the transcription of XMETs, thus significantly contributing to the variability in the function of XMETs (Bourgine et al., 2012).
Identifying the DNA polymorphisms leading to the variations in XMET function is a major area of interest in pharmacogenetic and genomic research. To date, numerous studies focused on individual XMET genes have discovered a large number of sequence variations, many of which alter protein coding sequence and consequently affecting the activity of XMETs (Adjei et al., 2003;Hildebrandt et al., 2004;Ji et al., 2005;Moyer et al., 2007;Mrozikiewicz et al., 2011). Meanwhile, even more variants were suggested to quantitatively modulate gene transcription (Pavek and Dvorak, 2008). Recently, genome-wide mapping for gene expression quantitative trait loci (eQTLs) in a few human tissues/cells offered unprecedented opportunities to identify the most influential single nucleotide polymorphisms (SNPs) determining gene expression level of XMETs (Gamazon et al., 2010). However, unlike the variants located in the protein coding sequences for which the causality for altered enzyme activity can be more easily understood, how eQTLs affect gene transcription is largely unknown. Understanding the underlying mechanisms will lead to identification of novel causative DNA variants for XMET function as well as reliable pharmacogenetic markers.
MicroRNAs (miRs) are single stranded, about 22-nucleotides (nt) long, evolutionarily conserved, and function as important posttranscriptional regulators of mRNA expression by binding to the 3 -UTR of target mRNAs (Ambros, 2004;Bartel, 2004). MiRs are involved in various developmental and physiological processes by negatively regulating gene expression . Over 30% of all protein-coding genes were estimated to be regulated by miRs (Brennecke et al., 2005;Krek et al., 2005;Lewis et al., 2005;Lim et al., 2005). Due to the conservation of the miR target site, SNPs located in 3 -UTR sequences may abolish or create a miR target, thus significantly affecting the mRNA expression (Saunders et al., 2007). Previous studies have suggested that many XMETs are regulated by miRs (Tsuchiya et al., 2006;Takagi et al., 2010;Patron et al., 2012). Several studies also demonstrated that SNPs in XMET gene 3 -UTRs led to different levels of enzyme activity (Saunders et al., 2007;Chin et al., 2008). Hence, we hypothesized that it may be an important mechanism that common SNPs or their linkage disequilibrium (LD) proxies located in the XMET gene 3 -UTR sequences alter mRNA expression via interference with miR targeting. In order to identify these candidate SNPs that may significantly modulate XMET expression, in this study we used multiple published human eQTL datasets to perform an in silico screening for SNPs that highly correlated with mRNA level of 409 major XMET genes. The significant SNPs and/or their LD proxies located in the gene 3 -UTRs were selected to predict a potential interference with miRs. We found that 27 SNPs located in the 3 -UTR of 14 XMET genes are likely associated with gene expression via altering miR binding.

SELECTION OF eQTLs
The general strategy for the data analysis was presented in Figure 1. We used the published eQTLs datasets generated from the HapMap lymphoblastoid cell lines (LCLs; Montgomery et al., 2010), human liver (Schadt et al., 2008), and human brain (Gibbs et al., 2010). Although additional eQTL datasets in human LCLs are also available, we chose to use the one by Montgomery et al. (2010) which utilized high-throughput sequencing for the quantification of gene expression, as this technology has been suggested to produce more accurate gene expression data. To our knowledge, all datasets were collected from tissue/cells derived from individuals of Caucasian in origin. We used the online tool 1 to search statistically significant eQTLs. As our study was focused on cis-acting eQTLs, we used a cut-off of p = 10 −5 for significance, considering the window for genomic region (500 kb) of each gene and the potential number of SNPs (1 in every 100-1,000 bp).

SEARCH FOR SNPs IN LD WITH eQTLs
To search SNPs in LD with significant eQTLs, we used the SNAP 2 program to screen the 1,000 Genome SNP data within 500 kb range of the eQTLs of interest in the CEU population with a LD level cut-off of R 2 = 0.8. Annotation for the location of eQTLs and their proxies relative to the gene structure was also collected with 1 http://www.ncbi.nlm.nih.gov/gtex/GTEX2/gtex.cgi 2 http://www.broadinstitute.org/mpg/snap/ldsearch.php Frontiers in Genetics | Pharmacogenetics and Pharmacogenomics the program. Only SNPs and/or their proxies located within the 3 -UTR of the studied genes of interest were retained for further analyses.

PREDICTION OF SNP-miR INTERACTION
In order to predict the potential SNP-miR interaction, two programs, MicroSNiPer 3 and PolymiRTS 4 were used. The major difference between the two programs is the algorithm used to predict the target site of miRs. The PolymiRTS program used the Tar-getScan 5 ; Lewis et al., 2005;Friedman et al., 2009) algorithm (Bao et al., 2007). In contrast, the MicroSNiPer program used the FASTA (Pearson and Lipman, 1988) alignment program to determine if a change in a nucleotide in 3 -UTR sequence would change the miR binding capability, based on the requirement of perfect Watson-Crick match to the seed 2-7 nt of miRs (Lewis et al., 2005). To be conservative, we used 7-mers match as the cut-off value for a positive prediction.

GENOME-WIDE eQTL ANALYSIS OF XMETs
Expression quantitative traits loci were screened for all 409 major XMET genes, including 144 phase I, 85 phase II and 111 phase III genes, 48 NRs, and transcription factor genes as well as another 21 genes related to drug ADME (Table A1 in Appendix). As a result, a total of 308 significant (p < 10 −5 ) eQTLs were identified from 101 XMET genes. These include nine in LCL, 83 in liver, and 221 in brain tissues. Five SNPs were found as eQTLs shared in two tissue types: rs1023252 in both LCL and brain tissues, rs11101992, rs156697, rs2071474, and rs241440 in both liver and brain tissues (Figure 2). Among the total of 308 eQTLs, 20 SNPs were found to be located in the 3 -UTR region; 3 SNPs were in the 5 -UTRs; FIGURE 2 | Significant eQTLs in different tissues. A total of 308 significant eQTLs were identified, including 9 eQTLs in LCL, 83 in liver, and 221 in brain tissues. Five eQTLs were shared in two tissue types. 171 SNPs were intronic; 8 and 6 SNPs were synonymous and nonsynonymous coding variants, respectively; and 12 and 15 SNPs were located in the upstream and downstream flanking region of the genes, respectively. The remaining 73 SNPs were located in intergenic regions.

eQTLs AND THEIR LD PROXIES
We chose to screen the 1,000 Genome SNP dataset as this would produce the most comprehensive coverage for the SNPs that may be in LD with a given eQTL. A total of 7,869 SNPs with significant LD with 260 eQTLs were identified. Combined with the remaining 48 eQTLs which had no reliable proxies in the 1,000 Genome dataset, a total of 8,177 SNPs (308 eQTLs and 7,869 proxy SNPs) were included in the subsequent analyses.

PREDICTION OF miR-SNPs INTERACTION
Of the 112 eQTLs and proxies located in the 3 -UTR sequences, 27 SNPs were found in the 3 -UTR of 14 genes of interest. The remaining SNPs were located in nearby genes thus were excluded from the subsequent analysis. These SNPs were all common SNPs with their minor allele frequency (MAF) ≥0.067. Among the 27 SNPs, 12 were found in liver, and 15 were identified in brain tissue. More detailed information for these SNPs was listed in Table A2 in Appendix.
We focused our study on the association between miRs and these 27 SNPs in the 14 genes. After screened with the two algorithms, MicroSNiPer (Barenboim et al., 2010) and PolymiRTs (Gong et al., 2012), all the 27 SNPs apart from rs11807 (which is not predicted to be in a target site in PolymiRTs database) were found to potentially create, abolish, or alter the target site for miRs in both algorithms. Notably, 34 miRs were predicted by both algorithms to interact with 19 of these SNPs ( Table A2 in Appendix). Of these 34 overlap miRs, except for rs2480256 of CYP2E1 which is not located in the seed sequence of hsa-miR-570-3p, all the remaining SNPs were found to be located in the seed sequence of miR targets.
To further validate the interaction between miRs and SNPs, we investigated whether the identified miRs were expressed in the same tissue as the identified eQTL. We used the GEO datasets (GSE21279 and GSE26545) to screen miR expression in liver and brain tissues, respectively (Hou et al., 2011;. Since many predicted miRs were new and not probed by the published platforms, we thus only concentrate on the list of miRs probed in the platforms. Overall, over 74% (20 out of 27) of the identified miR-SNPs were found to have at least one predicted miR co-expressed with the gene of interest in the same tissue.
We further aimed to investigate whether these 27 SNPs are more likely to be targeted by miRs especially by the co-expressed miR in liver and brain tissues, compared to random-selected 3 -UTR SNPs with similar MAF. No statistical significance were found, possibly due to the limited power caused by the small number (n = 27) of SNPs involved (data not shown).

DISCUSSION
Although a large number of DNA variants affecting the function of XMETs have been identified, and many of them have been well linked with clinical response to pharmacotherapy or disease susceptibility (Motsinger-Reif et al., 2010), genetic variations in the www.frontiersin.org activity of most XMETs remain incompletely explained. Recent studies continue to discover novel functional variants in XMET genes (Ramsey et al., 2012). Meanwhile, genome-wide association studies have found a number of XMET SNPs without previously known function significantly associated with different phenotypes in humans (Teichert et al., 2009;Estrada et al., 2012). These studies consistently suggested that additional sequence variants with fundamental role in XMET function have not been identified. Recent eQTL mapping in human tissues provided an opportunity to discover functional XMET polymorphisms at the genome-wide level. However, questions remain whether the identified eQTLs are causal for the altered gene expression and via what mechanism. Our study provides a comprehensive evaluation for this question in major human XMET genes, and generated a list of candidate SNPs that may modulate XMET genes via interference with miR targeting in multiple human tissue types.
Single nucleotide polymorphisms located in the gene 3 -UTRs could have great impact on miR targeting. It has been demonstrated that the entire 3 -UTR sequence could play important roles in miR function in addition to miR target sites (Hu and Bruno, 2011). In particular, negative selection in humans is stronger on computationally predicted conserved miR binding sites than on other conserved sequence motifs in 3 -UTRs, and polymorphisms in predicted miR binding sites are highly likely to be deleterious (Chen and Rajewsky, 2006). Gong et al. (2012) mapped SNPs to the 3 -UTRs of all human protein coding genes. Their results showed that among the 225,759 SNPs identified in 3 -UTRs, over 25% of SNPs potentially abolished 90,784 original miR target sites, while another 25% created a similar number of putative miRNA target sites. Besides these in silico studies, a number of SNPs altering miR targeting have been experimentally demonstrated to be associated with multiple diseases as well as drug metabolism and environmental procarcinogen detoxification (Abelson et al., 2005;Tan et al., 2007;Yu et al., 2007;Yokoi and Nakajima, 2011). Although the seed sequences for miR binding are critical and highly conserved, recent studies have also suggested that 3 -UTR sequences outside of the seed sequences, e.g., flanking sequences may be equally important for miR targeting by controlling the accessibility of the miR or local RNA structure (Grimson et al., 2007). For example, a SNP (829C > T) located 14 bp downstream of a miR-24 binding site in the 3 -UTR of human dihydrofolate reductase gene (DHFR) was demonstrated to affect DHFR expression by interfering with miR-24 function, resulting in DHFR over expression and methotrexate resistance (Mishra et al., 2007). By using two algorithms predicting potential SNP-miR interaction, we suggested that 27 eQTLs or their proxies in high LD for 14XMET genes may function through interference with one or more miRs, with most of the SNPs located in the seed sequences. Meanwhile, the majority (20 out of 27) of the identified miR-SNPs were found to have predicted miR co-expressed with the gene of interest in the same tissue. Although no statistically significant enrichment of miR targeting for these SNPs, the strong trends observed here warrants further experimental validations.
Our findings may also provide useful information in addition to the previous observations on the function of these SNPs. Previous studies demonstrated that SNP rs2480256 in the CYP2E1 gene was significantly associated with systemic lupus erythematosus (Liao et al., 2011). Another study showed that cyclosporine A concentration in serum was significantly correlated with the genotype of the CYP3A5 rs15524 polymorphism (Onizuka et al., 2011). In addition, a GSTM3 haplotype including rs1537236 was significantly associated with a decreased growth for maximum mid-expiratory flow rate (MMEF) in a large population-based lung function study (Breton et al., 2009). SNP rs11807 in the 3 region of GSTM5 was found to be associated with hypertension (Delles et al., 2008). Our results thus may help further elucidate the mechanism(s) by which the SNPs are involved in the susceptibility to these specific phenotypes.
In conclusion, our study summarized the potentially interacting SNP-miRs that may affect the expression of major XMET gene, which may ultimately facilitate to elucidate the mechanism how these genes are regulated as well as how they are involved in the genetic variations in drug metabolism and disease pathogenesis. Further investigations are necessary to corroborate the hypotheses generated in this study.  Epidemiol. 167, 759-774. Chen, K., and Rajewsky, N. (2006). Natural selection on human microRNA binding sites inferred from SNP data. Nat. Genet. 38, 1452-1456. Chin, L. J., Ratner, E., Leng, S., Zhai, R., Nallur, S., Babar, I., et al. (2008). A SNP in a let-7 microRNA complementary site in the KRAS 3 untranslated region increases nonsmall cell lung cancer risk. Cancer Res. 68, 8535-8540.       www.frontiersin.org