Development of an miRNA-Array-Based Diagnostic Signature for Periodontitis

Periodontitis progression is accompanied by irreversible alveolar bone absorption and leads to tooth loss. Early diagnosis is important for tooth stability and periodontal tissue preservation. However, there is no recognized miRNA diagnostic signature with convincing sensitivity and specificity for periodontitis. In this study, we obtained miRNA array expression profiles of periodontitis from the Gene Expression Omnibus (GEO) database. After screening for differentially expressed miRNAs, the least absolute shrinkage and selection operator (LASSO) method was performed to identify and construct a 17-miRNA-based diagnostic signature (hsa-miR-3917, hsa-mir-4271, hsa-miR-3156, hsa-miR-3141, hsa-miR-1246, hsa-miR-125a-5p, hsa-miR-671-5p, hcmv-mir-UL70, hsa-miR-650, hsa-miR-497-3p, hsa-miR-145-3p, hsa-miR-141-3p, hsa-miR-210-3p, hsa-miR-204-3p, hsa-miR-203a-5p, hsa-miR-99a-3p, and hsa-miR-30a-3p). Periodontal tissue samples with higher risk scores were more likely to show symptoms of periodontitis. Then, the receiver operating characteristic (ROC) curves were used to assess the diagnostic value of the miRNA signature, which indicated that the optimum cutoff value in periodontitis diagnosis was 0.5056 with an area under the ROC curve (AUC) of 0.996, a sensitivity of 97.3%, a specificity of 100.0% in the training cohort; in the testing cohort, the corresponding values were as follows: an AUC of 0.998, a sensitivity of 97.9%, and a specificity of 91.7%. We next evaluated the efficacy of the signature in differentiating disease subtype and affected range. Furthermore, we conducted functional enrichment analysis of the 17 miRNA-targeted mRNAs, including the regulation of mTOR activity and cell autophagy, Th1/Th2 cell balance and immunoregulation, cell apoptosis, and so on. In summary, our study identified and validated a 17-miRNA diagnostic signature with convincing AUC, sensitivity, and specificity for periodontitis.

Periodontitis progression is accompanied by irreversible alveolar bone absorption and leads to tooth loss. Early diagnosis is important for tooth stability and periodontal tissue preservation. However, there is no recognized miRNA diagnostic signature with convincing sensitivity and specificity for periodontitis. In this study, we obtained miRNA array expression profiles of periodontitis from the Gene Expression Omnibus (GEO) database. After screening for differentially expressed miRNAs, the least absolute shrinkage and selection operator (LASSO) method was performed to identify and construct a 17-miRNA-based diagnostic signature (hsa-miR-3917, hsa-mir-4271, hsa-miR-3156, hsa-miR-3141, hsa-miR-1246, hsa-miR-125a-5p, hsa-miR-671-5p, hcmv-mir-UL70, hsa-miR-650, hsa-miR-497-3p, hsa-miR-145-3p, hsa-miR-141-3p, hsa-miR-210-3p, hsa-miR-204-3p, hsa-miR-203a-5p, hsa-miR-99a-3p, and hsa-miR-30a-3p). Periodontal tissue samples with higher risk scores were more likely to show symptoms of periodontitis. Then, the receiver operating characteristic (ROC) curves were used to assess the diagnostic value of the miRNA signature, which indicated that the optimum cutoff value in periodontitis diagnosis was 0.5056 with an area under the INTRODUCTION Periodontitis, reported as one of the most prevalent diseases worldwide (Kassebaum et al., 2014), begins with damage to periodontal tissue, with typical symptoms, including gingival recession, attachment loss, and alveolar bone resorption, and eventually leads to tooth loss (Gross et al., 2017;Papapanou and Susin, 2017). Tooth loss from aggressive periodontitis (AP) is 0.05-0.12 per patientyear (Nibali et al., 2013;Clark et al., 2017). Periodontitis identified at an early stage carries a much-improved prognosis compared to advanced-stage disease because of the irreversible alveolar bone absorption that occurs during progression (Hienz et al., 2015). Thus, the early diagnosis of periodontitis is crucial for alveolar bone preservation and tooth stability.
A variety of miRNAs are dysregulated in periodontitis tissue compared to healthy tissue, such as upregulated miR-15a, miR-29b, miR-125a, miR-146a, miR-148/148a, and miR-223 and downregulated miR-92 (Luan et al., 2018). However, there is no recognized miRNA diagnostic signature for periodontitis, with convincing sensitivity and specificity, until now (Arias-Bujanda et al., 2019). In this study, we compared the miRNA profiles between disease and control samples and identified an miRNA diagnostic signature with high sensitivity and specificity for periodontitis.

Search Strategy
The Gene Expression Omnibus (GEO 1 ) database was searched to find datasets related to periodontitis, using "Homo sapiens [porgn: txid9606] and periodontitis"; 14 datasets met the criteria. Only five datasets collected miRNA profiles. To reduce random error, we limited the target to datasets with a sample size greater than 100, and eventually GSE54710 was screened out. 1 https://www.ncbi.nlm.nih.gov/geo/

Sample Clinical Characteristics and Data Processing
The clinical characteristics and miRNA profiling data of periodontitis samples (GSE54710), obtained with whole-genome microarray analysis (U-133 Plus 2.0; Affymetrix, Santa Clara, CA, United States), were downloaded from the GEO database.
The data contained 159 disease samples (gingival samples showing periodontitis, with attachment level ≥3 mm) and 41 control samples (healthy gingival samples, with attachment level ≤2 mm) (Stoecklin-Wasmer et al., 2012). The miRNA microarray data were annotated with GPL15159. All miRNAs differentially expressed between the disease and control samples were analyzed using the Linear Models for Microarray data (Limma, version 3.30.0) package in R. MicroRNAs with the parameters absolute log2 fold change >0.45 and P value < 0.05  were regarded as differently expressed miRNAs (DE-miRNAs). The DE-miRNA expression value was log2 transformed for next data processing.

Construction and Validation of the miRNA-Array-Based Diagnostic Model
The outcome measure was periodontitis/control classification. Two hundred samples were randomly divided into a training cohort (112 disease and 29 control samples) and a testing cohort (47 disease and 12 control samples) at a ratio of 7:3.
Afterward, we used least absolute shrinkage and selection operator (LASSO) regression analysis for DE-miRNA selection and reduction using the glmnet package (version 3.0) (Gao et al., 2010) in R. This LASSO method shrinks coefficients toward zero, and eliminates unimportant terms entirely, thus reducing prediction error and minimizing overfitting. The sensitivity and specificity were evaluated for each DE-miRNA and were validated within the internal validation samples. Afterward, diagnosisassociated DE-miRNAs with nonzero coefficients were selected to build a diagnostic miRNA signature. We obtained diagnostic models according to a variety of evaluation variables, all with accuracy ≥ 0.95, sensitivity ≥ 0.95, specificity ≥ 0.9, and recall ≥ 0.9. DE-miRNA-based receiver operating characteristic (ROC) curves were constructed to assess the optimal signature model in the training set with the pROC (version 1.15.3) package. The ROC figure was plotted using the ggplot2 and ggfortify packages. Area under the ROC curve (AUC) was calculated to quantify the diagnostic accuracy (Sauerbrei et al., 2007) of the signature. Then, the cutoff value was identified according to the ROC curve, using the local maxima method of the pROC package. Thereafter, the diagnostic value of the signature was further validated in the testing set (Xu et al., 2017;Liu et al., 2019).

The Diagnostic Signature Diagnosed Clinical Outcomes
To further explore the prognostic function of the signature, we utilized the miRNA signature for assessing the disease subtype and affected range. According to the clinical features of different subtypes, all samples were divided into AP and chronic periodontitis (CP). AP is a subtype of periodontitis with a prevalence of 0.6-2.2% (Ababneh et al., 2012;Catunda et al., 2019) and is characterized by rapid periodontal attachment loss and bone resorption (Susin et al., 2014). According to the number of affected teeth, the samples were grouped into localized periodontitis with gingival attachment loss and bone resorption sites ≤30% of all teeth, and generalized periodontitis with affected tooth sites >30%. The risk score was calculated for each sample and used to classify each sample into a high expression group (with risk scores above the cutoff value) and a low expression group (with risk scores below the cutoff value). The associations between risk scores and various clinical characteristics were analyzed with the chi-square test (Yao et al., 2018).

Functional Enrichment of the DE-miRNA Signature
The functions of the DE-miRNAs of the diagnostic signature were explored using the competing endogenous RNA (ceRNA; lncRNA-miRNA-mRNA) network. The sequences of the identified DE-miRNAs were obtained from Ensembl and put into miRcode 2 and starBase v 2.0 3 to predict their lncRNA targets. To improve the data reliability, lncRNAs that have not been annotated by GENCODE 4 were omitted. Furthermore, the mRNAs shown to be targeted by miRNAs with experimental support were then identified using miRTarBase 5 . Only miRNAs with target RNAs in these four databases were used to construct the ceRNA network. We built the ceRNA network based on the abovementioned data and used the ggalluvial (version 0.9.1) package to visualize the network (Jin et al., 2019).

Gene Ontology and Kyoto Encyclopedia of Genes and Genomes Analysis
To gain further insights into the biological pathways involved in the miRNA diagnostic signature, we conducted Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. An original list of the DE-mRNAs of the corresponding DE-miRNAs was predicted through miRTarBase. In addition, we obtained the differentially expressed mRNAs for periodontitis by analyzing the GSE16134 in our previous work (Jin et al., 2019). In order to appropriately restrict the number of DE-mRNAs for periodontitis, the original list of DE-mRNAs was overlapped with the differentially expressed mRNAs from GSE16134; the overlapping mRNAs between these two cohorts are shown in Supplementary Table 3. We then annotated the overlapping mRNAs using GO and KEGG pathway enrichment analysis with the clusterProfiler package (Ma et al., 2019) in R.

Statistical Analysis
The sensitivity, specificity, and AUC values of the diagnostic signature were quantified based on the ROC curve. The association between the miRNA signature risk scores and categorical clinical characteristics was analyzed with the chisquare test. All statistical tests were two-sided, and P < 0.05

Differential Expression of miRNAs
We initially compared the miRNA profiles of 159 disease samples with 41 control samples (Figure 1). A total of 72 miRNAs, including 47 upregulated and 25 downregulated, were identified as significantly differentially expressed between the disease and control samples (Supplementary Table 1). The relative expression values of all the DE-miRNAs (Figure 2A) and the 72 significantly DE-miRNAs ( Figure 2B) are illustrated, demonstrating substantial variation. The 200 samples were randomly divided into the training set (112 disease and 29 control samples) and testing set (47 disease and 12 control samples). A LASSO regression analysis was used to select and reduce variables within the 72 DE-miRNAs in the training cohort. Seventeen miRNAs were identified as variables.
In the training set, the AUC of the signature for periodontitis diagnosis was 0.996 (95% CI: 0.990-1.000; Figure 3A). The ROC curve was used to determine the best cutoff value in periodontitis diagnosis, which was 0.5056. Applying the miRNA signature yielded a sensitivity of 97.3% and a specificity of 100.0% in the training dataset ( Figure 3B). To validate the diagnostic value of the signature in the testing cohort, we used the equation above to compute risk scores with the clinical outcome and miRNA   profile data. The distribution of risk scores for the testing set was very similar to that of the training set. In the testing set, the area under the ROC curve for the signature was 0.998 (95% CI: 0.993-1.000; Figure 3C). Applying the miRNA signature yielded a sensitivity of 97.9% and a specificity of 91.7% ( Figure 3D). The 17 miRNAs in the training set ( Figure 3E) and testing set ( Figure 3F) were able to distinguish periodontally diseased gingiva from normal controls.

The Diagnostic Signature Diagnosed Clinical Outcomes
We next assessed the efficacy of the miRNA signature in differentiating control vs. disease samples, CP vs. AP, and localized vs. generalized range. The risk scores were calculated for each sample. Then, the cutoff value was used to divide 200 samples into two groups: 156 cases showed high risk scores, and 44 showed low risk scores ( Table 2). The diagnostic signature could differentiate disease samples from control samples (P = 1.508 × 10 −36 , Figure 4A), as well as the affected range of periodontitis (P = 0.015, Figure 4C). In general, periodontal tissue samples with high risk scores were more likely to show symptoms of periodontitis. In contrast, the disease subtype (P = 0.138, Figure 4B) was not significantly associated with the signature scores.

Functional Enrichment of the DE-miRNA Signature
The 17 DE-miRNAs of the signature were put into the miRcode and starBase v 2.0 databases to predict their lncRNA targets.
Then, the corresponding coding genes of the DE-miRNAs were predicted through miRTarBase. After removing results that did not meet the screening criteria, 5 miRNAs, 13 lncRNAs, and 32 mRNAs were used to establish a ceRNA network ( Figure 5A).
The detailed data of all 81 ceRNA network components are shown in Supplementary Table 2. Furthermore, the degree of connection of each gene by topology was calculated to illustrate its importance in the ceRNA network ( Figure 5A).

GO and KEGG Analysis
There were 1,107 overlapping mRNAs between the previously mentioned two cohorts (Supplementary Table 3). Functional enrichment analysis identified 441 GO terms in the biological    Table 4). The top 12 GO results and top 20 KEGG results are shown in Figure 5. Regarding BP, the DE-mRNAs were involved in cell-substrate adhesion, the intrinsic apoptotic signaling pathway, and regulation of cell morphogenesis ( Figure 5C). In terms of MF, the DE-mRNAs were mainly associated with DNA-binding transcription activator activity, bacterial-type RNA polymerase transcriptional activator activity, and bacterial-type RNA polymerase transcriptional repressor activity, among other associations. The CCs of the DE-mRNAs were transcription factor complex, nuclear envelope, intrinsic component of organelle membrane, and so on. In addition, the KEGG results showed enrichment in the rap 1 signaling pathway, phospholipase D signaling pathway, autophagy animal, and Th1 and Th2 cell differentiation, and so on.
Previous studies have highlighted a single miRNA biomarker of periodontitis, such as miR-143-3p (P ≤ 0.05) (Nisha et al., 2019) or miR-1226 (P < 0.05) (Mico-Martinez et al., 2018). However, the analysis of single miRNAs cannot provide a complete picture of the existing physiological and pathological state. Multi-miRNA signatures may have higher AUC, sensitivity, and specificity values. Therefore, for the first time, we developed and validated a multiple miRNA-based diagnostic signature for periodontitis, obtaining an AUC of 0.996 in the training set and an AUC of 0.998 in the testing set, which is very convincing for periodontitis biomarkers, to the best of our knowledge.
These results demonstrate that the signature can effectively assess disease vs. control. The signature also can differentiate affected range, which may be because these two diagnoses have similarities. The main criterion of affected range is affected tooth number. Thus, the more tooth positions show symptoms of periodontitis, the higher the risk score will be. However, the signature cannot assess AP vs. CP, which may be because both AP and CP belong to disease samples, and hence cannot be distinguished effectively.
The signaling pathway enrichment analysis indicated that the 17 miRNA-targeted mRNAs play important roles in the rap 1 signaling pathway, phospholipase D signaling pathway, and autophagy animals, which are closely related to the regulation of mTOR activity and cell autophagy (Liu et al., 2013;Zhao et al., 2015;Chen et al., 2018). The other enriched pathways include Th1 and Th2 cell differentiation and Th17 cell differentiation; these pathways are directly connected to Th1/Th2 balance and immunoregulation (Yun et al., 2003;Orozco et al., 2007). In addition, the Fc epsilon RI signaling pathway induces NF-kappa B activation in mast cells and mast cell degranulation (Klemm and Ruland, 2006), which is related to cell apoptosis. Their involvement with pathways in cancer confirmed an association between various types of cancer and clinical signs of periodontitis, combined with poor oral health (Hoare et al., 2019).
There are limitations to our research. Although the AUC, sensitivity, and specificity of the diagnostic signature are convincing for non-coding RNA-based signatures, the samples came from periodontal gingival tissue, which is difficult to collect. Further research should focus on miRNA-based diagnostic signatures of gingival crevicular fluid for periodontitis. Besides, available data are not enough for external verification, which needs future work to be perfected.

CONCLUSION
Our study identified a 17-miRNA-based diagnostic signature for periodontitis, and it warrants large-scale clinical validation in the future.

DATA AVAILABILITY STATEMENT
The dataset presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/ Supplementary Material.

AUTHOR CONTRIBUTIONS
S-HJ drafted the manuscript. J-GZ analyzed the data. X-YG and G-HB interpreted the data. J-GL revised the work critically. L-WC drafted the manuscript and gave final approval of the version to be published. All authors had made substantial contributions to conception and design of the study.