Plasma MicroRNA Pair Panels as Novel Biomarkers for Detection of Early Stage Breast Cancer

Introduction: Breast cancer is the second leading cause of cancer death among females. We sought to identify microRNA (miRNA) markers in breast cancer and to determine whether miRNA expression is predictive of early stage breast cancer. The paired panel of microRNAs is promising. Methods: Global miRNA expression profiling was performed on three pooled plasma samples (breast cancer, benign lesion, and normal) using next generation sequencing technology. Thirteen microRNAs (hsa-miR-21-3p, hsa-miR-192-5p, hsa-miR-221-3p, hsa-miR-451a, hsa-miR-574-5p, hsa-miR-1273g-3p, hsa-miR-152, hsa-miR-22-3p, hsa-miR-222-3p, hsa-miR-30a-5p, hsa-miR-30e-5p, hsa-miR-324-3p, and hsa-miR-382-5p) were subsequently validated using real-time quantitative reverse transcription-polymerase chain reaction (RT-qPCR) in a cohort of 53 breast cancer, 40 benign lesion, and 38 normal cases. The pairwise miRNA ratios were calculated as biomarkers to classify breast cancer. Results: In the model used to distinguish breast cancer from benign lesions, a panel of five miRNA pairs had high diagnostic power, with an AUC of 0.942. The sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of this model after 10-fold cross validation were 0.881, 0.775, 0.827, and 0.756, respectively. In addition, the other panels of miRNA pairs distinguishing breast cancer from normal and non-cancer patients performed well. Conclusion: Certain miRNA pairs were identified as effective in breast cancer screening, especially for distinguishing cancer from benign lesions.


INTRODUCTION
According to the American Cancer Society report of 2017, an estimated 252,710 cases of invasive breast cancer and an additional 63,420 new cases of in situ lesions of the breast were diagnosed in women. It was further estimated that 40,610 women would die from breast cancer (American Cancer Society, 2017). Breast cancer is the second leading cause of cancer death in women, after lung cancer. Sometimes breast cancer is found after symptoms appear, but in many women breast cancer presents with no symptoms. Thus, early diagnosis plays a critical role in the prognosis of breast cancer. Mammograms are currently the best test for breast cancer screening; however, the false positive rate is high. On average, 10% of women will be recalled from screening examinations for further testing, such as expensive magnetic resonance imaging (MRI) and/or invasive biopsy, while only 5% of these women will actually have cancer (Rosenberg et al., 2006). According to one US study, over the course of 10 screening examinations, about one half of women will experience a false positive and about 19% will undergo biopsy (Elmore et al., 1998). It was estimated that up to 30% of all breast cancers diagnosed in 2008 were over-diagnosed by mammography (Jorgensen et al., 2009; Nelson et al., 2009; Puliti et al., 2009; Duffy et al., 2010; Bleyer and Welch, 2012; Gotzsche and Jorgensen, 2013; Marmot et al., 2013). Due to their low sensitivity, the known serum-based markers such as CA15.3 and BR27.29 are not used for breast cancer screening (Molina et al., 2005). Thus, there is a need for novel, minimally invasive biomarkers to improve the early diagnosis of malignant breast lesions.
MicroRNAs (miRNAs), a class of small non-coding RNAs (ncRNAs) containing ∼22 nucleotides, regulate gene expression at the post-transcriptional level. They function in numerous cancer related processes such as cell proliferation, differentiation and apoptosis (Jansson and Lund, 2012). Clinical trials using circulating miRNAs as cancer biomarkers are being carried out in the United States and other countries (clinicaltrials.gov). In recent years, with the advent of gene expression profiling technologies, an increasing number of studies have revealed associations between miRNAs and cancer, including colorectal cancer (Schetter and Harris, 2009), lung cancer (Hu et al., 2016) and breast cancer (Takahashi et al., 2015; Kurozumi et al., 2017). However, very few studies have compared the expression profiles of miRNAs between benign lesions and breast cancer. Although several reports have used circulating miRNA markers for breast cancer detection, their findings were quite inconsistent (Roth et al., 2010; Cookson et al., 2012; Chan et al., 2013; Cuk et al., 2013; Guo et al., 2013; Freres et al., 2016).
Due to the low concentration of circulating ncRNAs in peripheral blood, data normalization in plasma or serum ncRNA experiments using real-time quantitative reverse transcription-polymerase chain reaction (RT-qPCR) is challenging. Current normalization strategies use endogenous controls that display stable expression across all samples, such as the reference miRNAs miR-16 (Van Schooneveld et al., 2012) and miR-39 (Chen et al., 2016). Some researchers have also sought suitable endogenous control miRNAs (ECMs), but no universal ECMs have been established for blood miRNA quantification in humans (Davoren et al., 2008; Hackenberg et al., 2011; Langmead and Salzberg, 2012).
Since there is currently no consensus normalization method for miRNAs, some studies have analyzed plasma miRNA values using the reciprocal ratio of miRNAs to bypass the normalization issue, an approach that has proven to be more informative for disease status than the absolute levels of individual miRNAs (Dou et al., 2018).
In this study, we propose a ratio based method for breast cancer detection. We first used the Illumina platform to sequence miRNAs in pooled samples, and the selected miRNAs were further evaluated by RT-qPCR. RT-qPCR was performed in a cohort of breast cancer, benign lesion, and normal patients, and the pairwise ratio of any two miRNAs in the same sample was then calculated. A diagnostic test based on the miRNA ratios was then constructed. This study focused on distinguishing cancer patients from patients with benign lesions.

Patients, Plasma Sample Collection and Preparation
Each pooled sample contained 30 individual plasma samples from breast cancer, benign lesion, and normal patients, respectively, matched for age and race (Supplementary Table S1). The cohort included 53 breast cancer, 40 benign, and 38 normal patients from the Rush Breast Cancer Repository (ORA number: 15021301-IRB01-CR02). The patients were selected according to the following criteria: (1) all patients were female; (2) all patients were diagnosed and confirmed by pathology; (3) patients with breast cancer were at an early stage (0, I, or II) according to the clinical staging method; (4) none of the patients underwent preoperative adjuvant chemotherapy or radiotherapy; and (5) patients had no other cancer or disease that might affect the miRNA profiling. Benign lesions were defined as hyperplasia, fibroadenomas, cysts, and some unspecified findings in the breast. Normal blood samples were collected from healthy women with no history of malignant disease and no inflammatory conditions. All plasma samples were collected using EDTA-anticoagulant tubes and centrifuged at 4,000 RPM for 10 min, followed by a 15 min high-speed centrifugation at 12,000 RPM to completely remove cell debris. The supernatant plasma was stored at -80 °C until analysis.

RNA Isolation and Illumina Next-Generation Sequencing
Total RNA was extracted from 200 µl of plasma using the Qiagen miRNeasy Mini kit (Qiagen, Valencia, CA) according to the manufacturer's protocol. In brief, the plasma was mixed with QIAzol Lysis Reagent and chloroform. After centrifugation, the aqueous phase was transferred into another tube, and 1.5 volumes of absolute ethanol were added. The mixture was then applied to miRNeasy Mini kit columns, followed by washing with RWT and RPE buffers. The RNA was finally eluted in 40 µl of RNase-free water. Sequencing was performed on a HiSeq 2500 (Illumina). The sequencing adapters were removed from the FASTQ files by local alignment of the adapter to the sequenced reads. All sequences shorter than 15 bp after adapter removal were discarded. The reads in each library were collapsed into tags in a quantified FASTA format. The FASTA reads were then mapped to the genome under consideration with Bowtie. To eliminate ambiguous mapping hits, only the uniquely mapping loci with the fewest alignment mismatches were reported, allowing for a maximum of two mismatches. The clean reads were then re-mapped to human small ncRNAs using Bowtie, the small ncRNA abundance (count) was determined, and the annotation for each mapped locus was derived from ncRNA databases such as miRBase (Supplementary Table S2).
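The read preprocessing described above (adapter removal, the 15 bp length filter, and collapsing reads into counted tags) can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: it swaps the local alignment of the adapter for an exact substring match, and the adapter sequence, the reads, and the function names `trim_adapter` and `collapse_reads` are hypothetical.

```python
from collections import Counter

MIN_LENGTH = 15  # reads shorter than this after trimming are discarded


def trim_adapter(read, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any.

    Assumption: exact matching stands in for the local alignment
    described in the text.
    """
    position = read.find(adapter)
    return read if position == -1 else read[:position]


def collapse_reads(reads, adapter):
    """Trim, length-filter, and count identical sequences (tags)."""
    trimmed = (trim_adapter(r, adapter) for r in reads)
    return Counter(seq for seq in trimmed if len(seq) >= MIN_LENGTH)


# Hypothetical adapter and reads, for illustration only.
ADAPTER = "TGGAATTCTC"
reads = [
    "TAGCTTATCAGACTGATGTTGA" + ADAPTER,         # 22 nt insert, kept
    "TAGCTTATCAGACTGATGTTGA" + ADAPTER + "GG",  # same insert, counted again
    "ACGTACGTAC" + ADAPTER,                     # 10 nt insert, discarded
]

tags = collapse_reads(reads, ADAPTER)
print(tags)
```

Collapsing identical sequences before mapping means each distinct tag is aligned once and its count is carried along, which is one plausible reading of the "quantified FASTA format" mentioned above.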
The abundance (count) data were normalized by DESeq normalization. The top miRNAs with a fold change ≥ 5 in any comparison among the pooled samples were selected for further PCR validation.
[...] kits (Applied Biosystems, Foster City, CA, United States) on an Eppendorf iplex 4 system (Eppendorf, Hamburg, Germany). The relative expression levels were expressed as cycle threshold (CT) values. The ratio strategy described below in the statistical analysis section was used to reduce experimental variation instead of normalizing by an endogenous control.
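Why a ratio can stand in for an endogenous control can be illustrated with a minimal sketch. It assumes, for illustration only, that experimental variation acts as a roughly constant CT shift across all miRNAs measured in one sample; the CT values and the function name `pairwise_ct_differences` are hypothetical and not taken from the study.

```python
from itertools import combinations


def pairwise_ct_differences(ct_by_mirna):
    """Return CT(a) - CT(b) for every unordered miRNA pair in one sample."""
    names = sorted(ct_by_mirna)
    return {
        (a, b): ct_by_mirna[a] - ct_by_mirna[b]
        for a, b in combinations(names, 2)
    }


# Hypothetical CT values for one plasma sample (illustrative numbers only).
sample = {"miR-21-3p": 27.0, "miR-192-5p": 30.5, "miR-451a": 22.0}

# Same sample with an assumed uniform shift of +2 cycles, e.g. lower RNA input.
shifted = {name: ct + 2.0 for name, ct in sample.items()}

diffs = pairwise_ct_differences(sample)
diffs_shifted = pairwise_ct_differences(shifted)

# The uniform shift cancels in every pairwise difference.
print(diffs == diffs_shifted)
print(diffs[("miR-192-5p", "miR-21-3p")])
```

Under this assumption, any factor that moves all CT values of a sample by the same amount drops out of the pairwise differences, so no reference miRNA is needed.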

Statistical Analysis
A CT value in PCR is a log (base 2) measure of the observed count. Consequently, the log (base 2) ratio of two miRNAs is the difference between the CT values of the two miRNAs, which makes the calculation easier and more convenient for clinical practice based on RT-qPCR data.
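The relation between the ratio and the CT values can be written out explicitly. The following is a sketch under the common assumption of perfect amplification efficiency, so that the starting amount N of a miRNA is proportional to 2^(-CT); the symbols N_A, N_B, CT_A and CT_B are introduced here for illustration and are not notation from the study.

```latex
\log_2\!\left(\frac{N_A}{N_B}\right)
  = \log_2 N_A - \log_2 N_B
  = \mathrm{CT}_B - \mathrm{CT}_A
```

Under this assumption the log ratio is a CT difference up to sign, so a larger value of CT_A - CT_B corresponds to a lower level of miRNA A relative to miRNA B. For example, CT_A = 30 and CT_B = 27 give N_A/N_B = 2^(-3) = 1/8.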
The difference in miRNA ratios between breast cancer and non-cancer patients (normal, benign, or normal and benign patients) was examined by two sample t-tests. The fold change and regulation direction were then reported. The p-values were corrected using the Benjamini and Hochberg false discovery rate (FDR) procedure. The association between the outcome variable (benign lesion or breast cancer) and each of the miRNA ratios was then evaluated by logistic regression. Performance parameters such as sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were summarized, and the area under the receiver operating characteristic (ROC) curve (AUC) was calculated to assess the discrimination power of each ratio. To avoid over-fitting, 10-fold cross validation was conducted. All analyses were performed in SAS 9.4, and a p-value < 0.05 was considered statistically significant. The miRNA pathway was analyzed using DIANA tools (Vlachos et al., 2015).
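Two of the steps above, the FDR correction and the performance summary, can be sketched in plain Python. The study itself used SAS 9.4, so this is only an illustration of the standard definitions; the p-values and confusion-matrix counts are hypothetical, and the function names are not from the source.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical raw p-values for four miRNA ratios (illustrative only).
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]

# Hypothetical counts (illustrative only, not the study's results).
metrics = diagnostic_metrics(tp=40, fp=10, tn=30, fn=10)
print(adj, significant, metrics)
```

In this toy input only the smallest p-value stays below 0.05 after adjustment, which shows how the correction can remove ratios that look significant before it.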

Significantly Differentiated miRNA Pairs Among Normal, Benign Lesions, and Breast Cancer Patients
The pairwise miRNA ratios were compared between breast cancer and three non-cancer groups (Cancer vs. Benign, Cancer vs. Normal, and Cancer vs. Non-cancer including Benign and Normal). MicroRNA ratios with fold change > 1.5 and FDR < 0.05 are listed in Tables 2-4. The discriminative power of individual ratios ranged from 60 to 85%.
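Assuming the discriminative power of a single ratio is expressed as an AUC, as the Statistical Analysis section suggests, it can be computed directly from the two groups of values. The sketch below uses the rank-based (Mann-Whitney) formulation; the CT differences and the function name `auc_from_scores` are hypothetical.

```python
def auc_from_scores(cancer_scores, control_scores):
    """AUC as the probability that a cancer score exceeds a control score.

    Ties count as one half (the Mann-Whitney U formulation).
    """
    wins = 0.0
    for c in cancer_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cancer_scores) * len(control_scores))


# Hypothetical CT differences for one miRNA pair (illustrative only).
cancer = [3.1, 2.8, 3.6, 2.2]
benign = [2.0, 2.5, 1.7, 2.9]

auc = auc_from_scores(cancer, benign)
print(auc)
```

An AUC of 0.5 corresponds to no separation between the groups and 1.0 to perfect separation, so values in the 0.60 to 0.85 range indicate moderate power for individual ratios.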

Identification of miRNA Ratios as Biomarkers for the Detection of Breast Cancer
A subset of ratios was chosen based on the rank of their correlation with the classification (cancer or not). Table 5 summarizes the qPCR evaluation results for each comparison.
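The ranking step can be sketched as follows. The source does not state which correlation measure was used; Pearson correlation with a 0/1 class label (the point-biserial correlation) is an assumption here, and the pair names, values, and function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_ratios(ratio_table, labels, top_k):
    """Return the top_k ratio names ordered by |correlation| with the labels."""
    scored = {name: abs(pearson(values, labels))
              for name, values in ratio_table.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]


# Hypothetical data: 1 = cancer, 0 = benign; pair names are placeholders.
labels = [1, 1, 1, 0, 0, 0]
ratio_table = {
    "pair_A": [3.0, 3.2, 2.9, 1.0, 1.1, 0.9],  # strongly separated
    "pair_B": [2.0, 1.0, 3.0, 2.0, 1.0, 3.0],  # unrelated to the label
    "pair_C": [1.0, 1.4, 2.6, 2.4, 1.9, 3.0],  # weakly, inversely related
}

selected = rank_ratios(ratio_table, labels, top_k=2)
print(selected)
```

Ranking on the absolute value keeps ratios that are either higher or lower in cancer, since both directions carry diagnostic information.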

DISCUSSION
In clinical practice, mammographic screening is the most common method used for early stage breast cancer diagnosis. However, the high false positive rate of mammography leads to further investigation with expensive breast imaging and invasive biopsy, which can expose women to harmful anti-cancer therapy and affect their quality of life. Therefore, a more sensitive approach for early breast cancer diagnosis, particularly for distinguishing cancer from benign lesions, is needed to supplement and/or complement existing detection methods. Our goal was to determine pairs of miRNAs as alternative biomarkers that can be used to differentiate breast cancer from benign lesions or non-cancer patients. As far as we know, this is the first study on miRNA ratios in distinguishing early stage breast cancer from benign lesions. Our current data showed that the miRNA ratios are likely to perform well in distinguishing breast cancer from benign lesions. Using miRNA ratios rather than single miRNAs provided more candidates for the diagnosis of early stage breast cancer. In addition, the combination of the selected miRNA ratios had a high diagnostic value for breast cancer prediction. We have identified five miRNA ratios that can differentiate breast cancer from benign lesions with an AUC above 0.90. The ratio based normalization method, which is completely independent of spike-in or internal controls, has a good chance of producing more reliable and reproducible biomarkers in common types of cancer, and it provides more candidate biomarkers.
The interpretation of the miRNA ratios is more complicated than that of individual miRNAs. Based on the equation in the methods section and Figures 1-3, up-regulation of a miRNA ratio in the cancer group indicates a higher level of the miRNA in the denominator of the ratio and a lower level of the miRNA in the numerator, and vice versa. For example, miR-192 was identified as the numerator in two ratios that distinguished breast cancer from benign lesions. This indicated that its concentration was lower in the cancer group, i.e., it was down-regulated. Most of the individual miRNAs in the ratios examined in our study have been identified in other studies. miR-192 was found to be down-regulated in breast cancer compared with non-cancerous tissue, indicating that miR-192 may act as a tumor suppressor in the development of breast cancer (Hu et al., 2013). Yang et al. (2017) reviewed the versatile functions of miR-30 family members in breast cancer. In particular, miR-30a suppressed tumor growth, proliferation, migration and invasion of breast cancer. Another study found that miR-221 was over-expressed in breast cancer tissue compared to non-cancerous tissue and concluded that miR-221 was a potential biomarker for predicting the survival of breast cancer patients (Eissa et al., 2015). However, plasma miR-21 expression was not observed to differ significantly between breast cancer patients and benign patients or normal controls (Chen et al., 2016). Interestingly, our study shows that ratios combining miR-21 with other miRNAs differed significantly, indicating that miR-21 could serve as a long-term follow-up biomarker in the detection of cancer. Ho et al. found that miR-382-5p was up-regulated in breast cancer compared to benign breast disease and functioned as an independent oncomiR associated with higher incidence and poorer prognosis of breast cancer (Ho et al., 2017).
MiR-574-3p was first reported by Krishnan et al. as a promising prognostic marker for breast cancer (Krishnan et al., 2015). So far, no study has reported a relationship between miR-574-5p and breast cancer. Our study is a starting point for further research.
We recognize that the sample sizes of our groups, including breast cancer patients, patients with benign lesions, and normal controls, are small, limiting the evaluation of miRNAs as predictive biomarkers in the early detection of cancer. Another limitation of this study is the lack of a validation patient cohort. We believe that further studies investigating more powerful and specific miRNA biomarkers to discriminate early cancer from pre-cancerous lesions are needed.

CONCLUSION
The expression profile of plasma miRNA ratios can serve as novel non-invasive biomarkers for the early detection of breast cancer. The strategy of using next generation sequencing followed by RT-qPCR validation provides a successful approach to identifying plasma miRNA profiles as biomarkers for the diagnosis of common types of cancer.

AVAILABILITY OF DATA AND MATERIAL
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

ETHICS STATEMENT
The study was approved by the institutional review board of the Rush University Medical Center (ORA No. 15021301-IRB01-CR02). All participants provided written informed consent.

AUTHOR CONTRIBUTIONS
RF, YD, YZ, LH, and JA analyzed the data. RF, VSK, and YD interpreted the results and drafted the manuscript. LH, DJ, and BJ helped in manuscript revision. YD, YZ, LH, HZ, and XH designed the study and experiments. All authors have read and approved the final manuscript.

FUNDING
This project was supported by an NIH grant (1R21CA164764), the Bears Care Foundation and the Hawai'i Community Foundation to YD. This work was also supported by the NIH grants 5P30GM114737, P20GM103466, U54MD007584, and 2U54MD007601.

ACKNOWLEDGMENTS
We thank Yan Li from Rush University for help with the algorithm development.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2018.01879/full#supplementary-material
FIGURE S1 | Function analysis heat map showing the hierarchical clustering of miRNAs and pathways based on the levels of their interactions.