Genome-Wide Identification of RNA Modifications for Spontaneous Coronary Aortic Dissection

RNA modification plays important roles in many biological processes such as gene expression control. Genetic variants that affect RNA modification may have functional roles in aortic dissection. The aim of this study was to identify RNA modifications related to spontaneous coronary artery dissection (SCAD). We examined the association of RNA modification-associated single-nucleotide polymorphisms (RNAm-SNPs) with SCAD in summary data from a genome-wide association study (GWAS) of European descent (270 SCAD cases and 5,263 controls). Furthermore, we performed expression quantitative loci (eQTL) and protein quantitative loci (pQTL) analyses for the RNAm-SNPs using publicly available data. Functional enrichment and protein–protein interaction analyses were performed for the identified proteins. We found 11,464 unique RNAm-SNPs in the SCAD GWAS dataset, and 519 were nominally associated with SCAD. Nine RNAm-SNPs were associated with SCAD at p < 0.001, and among them, seven were N6-methyladenosine (m6A) methylation-related SNPs, one (rs113664950 in HLA-DQB1) was m7G-associated SNP, and one [rs580060 in the 3′-UTR of Mitochondrial Ribosomal Protein S21 (MRPS21)] was A-to-I modification SNP. The genome-wide significant SNP rs3818978 (SCAD association p = 5.74 × 10–10) in the 5′-UTR of MRPS21 was related to m6A modification. These nine SNPs all showed eQTL effects, and six of them were associated with circulating protein or metabolite levels. The related protein-coding genes were enriched in specific Gene Ontology (GO) terms such as extracellular space, extracellular region, defense response, lymphocyte migration, receptor binding and cytokine receptor binding, and so on. The present study found the associations between RNAm-SNPs and SCAD. The findings suggested that RNA modification may play functional roles in SCAD.


INTRODUCTION
Aortic dissections (ADs) are complex disorders affecting the aorta. Typically, ADs begin with a tear in the aortic intima, and then blood will intrude into the aortic media and separate the mid-membrane, therefore separating a true from a false lumen. AD is a leading cause of morbidity and mortality. Recently, the Global Burden Disease 2010 project showed that the death rate of AD was increased to 2.78 per 100,000 in 2010 (Sampson et al., 2014a,b). The disease burden increases with age and is more common in men than in women (Sampson et al., 2014a).
Aortic dissection can be inherited, and genetic factors were involved. AD is a heterogeneous disease that can occur in association with syndromes (e.g., Marfan syndrome, Loeys-Dietz aortic aneurysm syndrome) genetically predisposed to AD (Finkbohner et al., 1995;van Karnebeek et al., 2001). Previous studies focused on the syndromes due to variations of TGFBR2 gene have confirmed the abnormality of the transforming growth factor-beta (TGF-β) pathway regulation as a mechanism contributing to aneurysm formation. Aortic disease in Marfan syndrome is due to the defects in the fibrillin-1 gene at chromosome 15q15-31 (Lee et al., 1991). Genetic studies also have demonstrated that ADs occur in familial aggregates and have identified many genetic loci for this disease (LeMaire et al., 2011;Guo et al., 2016;van 't Hof et al., 2016;Turley et al., 2020). Identification of functional variants in these genetic loci for AD will increase our understanding of the pathological mechanism of AD.
To date, more than 170 types of chemical modifications present in RNA molecules have been reported. The RNA modifications are modifiable and are involved in regulations of diverse biological processes in living cells (Malbec et al., 2019). As sufficiently sensitive high-resolution transcriptomewide techniques developed, several RNA modifications have been widely studied, including N 6 -methyladenosine (m 6 A), m 6 Am, m 1 A, 2 -O-Me, m 5 C, m 5 U, m 7 G, A-to-I, and pseudouridine. Among these known RNA modification types, m 6 A has been discovered as the first example. m 6 A is a type of reversible and widely conserved RNA methylation among eukaryotes. It is known to us because it is important in the regulation of gene expression (Meyer and Jaffrey, 2014) and mRNA stability (Wang et al., 2014) and homeostasis (Edupuganti et al., 2017). It also has been shown to play an important role in the etiology of various diseases (Visvanathan and Somasundaram, 2017). In recent years, researchers have shown that genetic variants would have impacts on all types of RNA modifications by changing the nucleotides at which the modifications occur or RNA sequences around the target sites (Zheng et al., 2018). Therefore, the RNA modificationassociated single-nucleotide polymorphisms (RNAm-SNPs) may have impacts on gene expression regulation by influencing RNA modifications and may be important functional variants for SCAD. Currently, the relationship between RNAm-SNPs and SCAD is still unknown. Annotating the functional impacts of RNAm-SNPs on SCAD may help to decipher the pathogenicity mechanisms. This study will present the first effort of evaluating the impacts of the RNAm-SNPs on SCAD genome-wide.

Determination of RNA Modification-Associated Single-Nucleotide Polymorphisms for Spontaneous Coronary Artery Dissection
In this study, we made use of the novel RNA modification annotations to obtain more functional interpretation of results from the SCAD GWAS (Saw et al., 2020). The GWAS comprised 270 SCAD cases and 5,263 controls of European descent. Summary statistics from the initial GWAS (Saw et al., 2020), which included association p-values of 607,778 genotyped variants with SCAD, were downloaded and used as "original data" in the analysis of the present study. The SCAD GWAS summary datasets were publicly available at NHGRI-EBI GWAS Catalog (accession numbers GCST90000582 and GCST90000583).
To identify the RNAm-SNPs in the large amount of SNPs from the GWAS, we obtained a set of RNAm-SNPs in the RMVar database 1 . The RMVar database was an updated version of the m6AVar database. Currently, it contains 1,678,126 RNAm-SNPs for the nine types of RNA modifications (m 6 A, m 6 Am, m 1 A, 2 -O-Me, m 5 C, m 5 U, m 7 G, A-to-I, and pseudouridine), which is four times larger than m6Avar. There are three confidence levels for RNAm-SNPs in the RMVar database, ranging from low confidence to high confidence. RNAm-SNPs derived from single base resolution experiments were classified into high confidence level. Those with medium confidence levels were obtained from MeRIP-Seq experiments. The lowconfidence level m 6 A-associated variants were predicted by a CNN model. Based on the annotation of the RNAm-SNP sets, we annotated the GWAS-tested SNPs with RNA modifications in the GWAS summary dataset. Then, RNAm-SNPs that were associated with SCAD were picked out (p-values < 0.05 were considered).

Expression Quantitative Loci Analysis for the Spontaneous Coronary Artery Dissection-Associated RNA Modification-Associated Single-Nucleotide Polymorphisms
The SCAD-associated RNAm-SNPs may affect SCAD risk via disturbing RNA modification. Because RNA modification is critical in gene expression regulation, the RNAm-SNPs may act through the regulation of gene expression and therefore may be associated with mRNA expression levels. We therefore performed gene expression quantitative loci (eQTL) analysis to find associations between RNAm-SNPs and mRNA expression levels in different types of cells and tissues. We focused on the association between RNAm-SNPs and the gene in which they are located (cis-acting eQTL). Data in the HaploReg browser 2 were searched (Ward and Kellis, 2012). After RNAm-SNPs with cis-acting eQTL effects were found, we tried to identify their functionalities. Transcription regulation functions such as altering protein binding were searched in HaploReg and RegulomeDB 3 . We also looked for eQTL signals in 34 human tissues using data from the GTEx project (v7 release) 4 .

Protein Quantitative Loci Analysis for the Spontaneous Coronary Artery Dissection-Associated RNA Modification-Associated Single-Nucleotide Polymorphisms
We carried out protein quantitative loci (pQTL) analysis in peripheral blood for the identified RNAm-SNPs to find proteins related to SCAD. The associations between RNAm-SNPs and plasma protein levels were searched in three pQTL studies. The data included, first, the pQTL study that examined the associations between 509,946 SNPs and 1,124 proteins. The summary data were available in the pGWAS Server database 5 (Suhre et al., 2017). The second pQTL study was the INTERVAL study containing 3,301 individuals of European descent. This study examined the associations between 10.6 million imputed autosomal variants and plasma levels of 2,994 proteins 6 (Sun et al., 2018). The third pQTL study analyzed 83 proteins measured in 3,394 individuals 7 (Folkersen et al., 2017). Functional enrichment analyses were performed in the DAVID database, and protein-protein interactions were found in the STRING database.
In addition, we looked for associations between RNAm-SNPs and concentrations of cytokines, lipids, and metabolites. The cytokine pQTL study examined the genome-wide associations for circulating levels of 41 cytokines in 8,293 Finns 8 (Ahola-Olli et al., 2017). The summary data from a study that examined the associations between genome-wide SNPs and 123 metabolites in blood samples of 24,925 individuals were obtained from http://www.computationalmedicine.fi/data#NMR_GWAS (Kettunen et al., 2016).

Spontaneous Coronary Artery Dissection-Associated RNA Modification-Associated Single-Nucleotide Polymorphisms
We first picked out RNAm-SNPs from the SCAD GWAS datasets based on the annotation of the nine types of RNAm-SNPs in the RMVar database (Table 1). For m 6 A modification, we found 11,464 unique RNAm-SNPs in coding and non-coding genes in the SCAD GWAS dataset. Among them, 519 (4.5%) were found to be nominally (p < 0.05) associated with SCAD ( Table 1). Among these 519 SNPs, 213 were high-confidence RNAm-SNPs, 103 were medium-confidence RNAm-SNPs, and 203 were low-confidence RNAm-SNPs. The SNPs mainly located in exon (75), UTR (88), and intron (156). Specifically, nine RNAm-SNPs were associated with SCAD at p < 0.001, and rs3818978 in the 5 -UTR of MRPS21 was significantly associated with SCAD (p < 5.0 × 10 −8 ) ( Table 2). Seven of the nine identified RNAm-SNPs were m 6 A-associated SNPs (Figure 1). The genome-wide significant SCAD m 6 A-associated SNP rs3818978 (p = 5.74 × 10 −10 ) belongs to the medium confidence category.
We identified 171 unique m 7 G modification SNPs from the SCAD GWAS dataset, and 14 of them were nominally associated with SCAD ( Table 1). The functional loss m 7 Gassociated SNP rs113664950 in an exon of HLA-DQB1 was associated with SCAD (p = 1.83 × 10 −4 ) ( Table 2). For A-to-I modification, we found 377 unique RNAm-SNPs from the SCAD GWAS dataset, and 14 of them were nominally associated with SCAD ( Table 1). The functional loss A-to-I-associated SNP rs580060 in the 3 -UTR of MRPS21 was associated with SCAD (p = 2.97 × 10 −6 ) ( Table 2). Besides, we found one m 6 Am-associated SNP, 24 m 1 A-associated SNPs, and one m 5 C-associated SNP that were nominally associated with SCAD (Table 1).

Gene Expression Analysis
We further tried to identify gene expressions that were associated with the nine identified SCAD-associated RNAm-SNPs ( Table 2).
All of these RNAm-SNPs showed eQTL effects on protein-coding genes in various types of cells or tissues from GTEx project (Figure 2). The number of eQTL signals for the nine SNPs varied (from 29 to 540). A total of 540 eQTL signals were found for rs3818978 in MRPS21, which was significantly associated with SCAD (p = 5.74 × 10 −10 ). According to the HaploReg database, rs3818978 overlapped promoter histone marks in 24 tissues related to 33 bound proteins and altered four regulatory motifs. This SNP was associated with gene expression of MRPS21 in aorta artery (p = 5.30 × 10 −7 ) and appendage atrial tissues (p = 6.12 × 10 −11 ) and associated with gene expression of ADAMTSL4 in aorta artery (p = 8.31 × 10 −7 ) and tibial artery (p = 1.51 × 10 −6 ). The m 6 A-associated SNP rs12758270 in the 3 -UTR of RPRD2 overlapped promoter histone marks in one tissue and enhancer histone marks in six tissues and altered two regulatory motifs. It was associated with gene expression of ECM1 in fibroblast cells (p = 2.79 × 10 −5 ) and esophagus mucosa (p = 3.69 × 10 −7 ) and ADAMTSL4 in esophagus mucosa (p = 9.84 × 10 −12 ).

DISCUSSION
This study showed that SCAD-associated SNPs identified in GWAS were related to RNA modification. These SNPs showed cis-acting eQTL effects in various types of cells and tissues, and some of them were found to be associated with circulating protein levels. Moreover, the associated proteins were enriched in specific functional annotation terms.
Increasing evidence has shown that m 6 A methylation is critical in gene expression regulation, as it modulates RNA processing such as nuclear export, translatability, splicing, and stability (Li et al., 2018;Visvanathan and Somasundaram, 2018). Variations in the methylation target sites can interrupt the modification and then disturb gene expression regulations (Zheng et al., 2018). Therefore, identification of RNAm-SNPs in diseases is a way to ascertain causal variants and therefore helpful for the interpretation of the GWAS findings. In the present study, we identified many SCAD-associated RNAm-SNPs and showed that these kinds of SNPs may have impacts on gene expression, including mRNA levels and protein levels. The findings suggested that RNA modification may play a role in SCAD.
One of the genome-wide significant RNAm-SNPs, rs3818978, has not been discussed in the SCAD GWAS paper. This m 6 Aassociated SNP was close to the reported SCAD-associated SNP rs12740679 (4 Kb upstream of gene MRPS21). RNAm-SNP rs3818978 locates in the 5 -UTR of MRPS21. The function of MRPS21 gene has been less known. According our analysis, rs3818978 was associated with the expression of MRPS21 and ADAMTSL4 in the artery aorta and associated with the plasma level of several proteins with specific functions. In MRPS21, there was another SNP identified in this study, the A-to-I modification SNP rs580060, which locates in the 3 -UTR of  MRPS21. This SNP was also associated with the expression of ADAMTSL4 in the artery aorta and associated with the plasma level of Cystatin F. ADAMTSL4 codes an extracellular matrix protein. The extracellular matrix protein binds to fibrillin-1 to form microfibrils in the matrix (Hubmacher and Apte, 2015). The original SCAD GWAS has demonstrated that ADAMTSL4 gene was expressed in the interarterial layer and vascular smooth muscle cells. It was suggested that ADAMTSL4 deficiency was involved in arterial fragility (Saw et al., 2020). The present study identified two RNA modification SNPs that may affect ADAMTSL4 expression as potential functional variants for SCAD.
In addition to the genome-wide significant signals, we found several RNAm-SNPs that did not reach the genome-wide significance threshold of 5.0 × 10 −8 for SCAD. For example, the m 6 A-associated SNP rs12758270 that locates in the 3 -UTR of RPRD2 did not reach the genome-wide significant level for SCAD. Then, it has been ignored in the original GWAS. In the present study, we found that rs12758270 was associated with gene expression of ECM1 and ADAMTSL4 and associated with plasma levels of ADAMTS4, DDC, B2M, CCL20, CRISP3, ECM1, and TSHB, which have specific biological functions. Therefore, these RNAm-SNPs may also be potential functional candidates for SCAD.
The RNAm-SNPs could disturb RNA modification, and then the mRNA expression levels were changed and consequently affect SCAD risk. But further evidence is needed to prove that gene expression affected by these RNAm-SNPs was associated with SCAD risk. According to the result of a transcriptome sequencing study (GSE153434), DSP (fold change = 0.327, p = 2.30 × 10 −4 ), MKX (fold change = 0.391, p = 3.83 × 10 −8 ), and SERPINF1 (fold change = 0.066, p = 2.26 × 10 −15 ) were differentially expressed between Stanford type A aortic dissection patients and controls in ascending aortic tissues (Zhou et al., 2020). SERPINF1 expression level was associated with rs11062 (m 6 A-associated SNP locates in the 3 -UTR of SMYD4) in transformed fibroblast cells, left ventricle, subcutaneous adipose, atrial appendage, and esophagus mucosa tissues. In addition, our analysis showed that RNAm-SNPs affected genes involved in specific biological functions, which were highly related to SCAD. SCAD is mediated by inflammasome activation, which exacerbates the secretion of pro-inflammatory cytokines, chemokines, and matrix metalloproteinases. The dysregulation of extracellular matrix results in progressive smooth muscle cell depletion and inflammation, leading to SCAD (Mayer et al., 2020;Shen et al., 2020). Therefore, gene expressions affected by the RNAm-SNPs play important roles in SCAD. This evidence may suggest that the identified RNAm-SNPs may participate in the pathogenesis of SCAD by altering RNA modification.
This study has some potential limitations. First, the sample size of the SCAD GWAS was still relatively small. Second, the m 6 A SNP set was large, but data for other types of RNA modification were still very rare. Third, although we identified associations between SCAD-associated SNPs and gene expression and plasma protein levels, the associations between the gene expression and plasma protein levels and SCAD have not been examined. Finally, the functions of these SNPs and the effects of modification of the detected genes on SCAD have not been examined experimentally. More evidence is needed to prove that these SNPs may indeed participate in the pathogenesis of SCAD by altering RNA modification. Further experiments determining the functions in related cells (e.g., vascular smooth muscle cells) are needed.

CONCLUSION
This study found SCAD-associated RNAm-SNPs and therefore suggested that RNA modification may play important roles in SCAD. The findings increased our understanding on the associations identified in the SCAD GWAS. These kinds of SNPs in susceptibility genes of SCAD may work by regulating gene expression, including mRNA levels and protein levels.
No previous study has shown the relationship between RNA modification and SCAD. Therefore, this study may add valuable clues for further understanding of functional mechanisms underlying the development of SCAD.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
TC, MT, and LC contributed to the conceptualization. TC contributed to data curation and writing of the original draft. MT contributed to the formal analysis. TC, MT, XY, ZQ, and XL contributed to the investigation. TC, MT, and XL contributed to the methodology. TC and MT contributed to the visualization. MT and LC contributed to the writing, review, and editing. All authors contributed to the article and approved the submitted version.