Untargeted GC-MS-Based Metabolomics for Early Detection of Colorectal Cancer

Background Colorectal cancer (CRC) is one of the most common malignant gastrointestinal cancers in the world, with a 5-year survival rate of approximately 68%. Although many studies have accumulated, its pathogenesis remains unclear. Detecting and removing malignant polyps promptly is the most effective method of CRC prevention; the analysis and disposal of malignant polyps is therefore conducive to preventing CRC. Methods In this study, metabolic profiles and diagnostic biomarkers for CRC were investigated using untargeted GC-MS-based metabolomics to explore intervention approaches. To better characterize the variations in tissue and serum metabolic profiles, orthogonal partial least-squares discriminant analysis (OPLS-DA) was carried out to identify significant features. The key differences in tR–m/z pairs were screened using the S-plot and VIP values from OPLS-DA. The identified potential biomarkers were mapped in KEGG to find interactions, which show the relationships among the signaling pathways. Results Finally, 17 tissue and 13 serum candidate ions were selected based on their corresponding retention time, p-value, m/z, and VIP value. The most influential pathways contributing to CRC were inositol phosphate metabolism, primary bile acid biosynthesis, phosphatidylinositol signaling system, and linoleic acid metabolism. Conclusions The preliminary results suggest that the GC-MS-based method, coupled with pattern recognition and an understanding of these cancer-specific alterations, could make it possible to detect CRC early and aid in the development of additional treatments for the disease, leading to improvements in CRC patients' quality of life.


INTRODUCTION
Colorectal cancer (CRC) is a severe health problem and ranks as the third leading cause of cancer-related death in Europe and the USA (1). Among many contributing factors, including the environment, alcohol consumption and smoking are believed to increase the incidence of CRC (1). The pathogenetic progression of colorectal cancer is closely related to polyps (2). Most CRCs arise from adenomas, beginning as polyps on the inner wall of the colon or rectum and subsequently intravasating into lymph vessels or blood vessels, which increases the chance of dissemination to other organs (3). Detecting and removing these malignant polyps promptly is at present the most effective method of CRC prevention. Therefore, the analysis and disposal of malignant polyps is conducive to preventing CRCs.
Currently, several methods are available to detect CRC, such as colonoscopy (4), computed tomography colonography (5), fecal occult blood testing (6), and multitarget stool DNA testing (7). However, these methods have disadvantages, including bleeding risk, inconvenience, poor cost-effectiveness, and limited sensitivity and specificity. Unfortunately, most CRC patients are diagnosed in the late stages of the disease, with metastasis, making it harder to achieve complete remission. Reliable and predictive biomarkers would be a critical tool for identifying individuals with evolving CRC or early disease. However, there is still no tissue or serum biomarker that can be used for satisfactory CRC diagnosis. New screening methods that are sensitive, specific, convenient, and non-invasive are urgently needed for the early diagnosis of CRC.
Metabolomics is a high-throughput tool for exploring small-molecule metabolites (<1,800 Da), typically detected by mass spectrometry. Small variations in these metabolites can indicate early biological changes in the host due to perturbations in metabolic pathways. Therefore, the metabolome can be regarded as the amplified output of a biological system (8,9). Monitoring fluctuations of certain metabolite levels in body fluids has become an important way to detect the early stages of CRC. High-throughput analytical technologies for metabolomics, such as nuclear magnetic resonance (NMR) and mass spectrometry (MS), are essential for untargeted analysis (10). Recent technological advances allow systematic, holistic methods to be established with relatively short analysis times. In addition, chromatographic methods, such as LC and GC, offer a significant advantage to MS detection by allowing metabolites to be identified based on their chemical properties. Compared to conventional liquid chromatography-mass spectrometry (LC-MS) instruments, gas chromatography-mass spectrometry (GC-MS) detection has gained popularity for its higher chromatographic resolution, reproducibility, and robustness, which have allowed the establishment of a comprehensive database of identified peaks (11).
Herein, GC-MS-based tissue metabolomics was applied to analyze the differences between cancer tissue and paracarcinoma tissue in CRC patients. In parallel, GC-MS-based serum metabolomics comparing preoperative and postoperative (2 weeks) CRC patients served as a significant complement for explaining the transformation of metabolic pathways and establishing a panel of biomarkers that would be of diagnostic significance for CRC (Figure 1).

Study Population and Sample Collection
All experiments on human specimens were conducted in accordance with the ethical code and recommendations issued by the Ethics Committee of Human Experimentation and Chinese Animal Community and with the Helsinki Declaration (approval number: KY2020081). The tissue and serum samples were collected from 48 patients diagnosed with CRC who underwent colorectal resection at Nanjing Hospital of Chinese Medicine Affiliated to Nanjing University of Chinese Medicine. Clinical information on these CRC patients, including patient samples, tumor size, and clinical staging, is shown in Table 1 and Supplemental Information Table S1. The enrolled patients came from various localities and had various living environments and personal habits. Histopathologic examination and immunohistochemistry were utilized to confirm the diagnosis of CRC. The inclusion criteria for the CRC patients in this research were as follows: (1) age between 35 and 85 years; (2) conforming to the "Diagnosis, management, and treatment of colorectal cancer" (2015) standard; (3) clear and definite preoperative diagnosis; (4) patient agreement to take part in clinical trials. Exclusion criteria were as follows: (1) pregnancy or lactation; (2) malignant hematopathy, including all types of leukemia and anemia; (3) significant metabolic abnormalities; (4) primary tumors in other parts of the body; (5) allergic constitution; (6) use of specific drugs during the last 3 weeks, such as antibiotics, hormones, and nonsteroidal anti-inflammatory drugs.

Sample Preparations
The tissue (cancer tissue and paracarcinoma tissue) and serum (preoperative and 2 weeks postoperative) samples were obtained from 48 CRC volunteers at Nanjing Hospital of Chinese Medicine Affiliated to Nanjing University of Chinese Medicine.
After resection, tumor tissue was collected immediately from the central area of the solid tumor, and paracancerous tissue was collected from an area about 5 cm away from the solid tumor. A 100-mg sample was placed in a 1.5-ml EP tube, mixed with 10 µl of internal standard solution (heptadecanoic acid in methanol, 1 mg/ml), extracted with 0.4 ml of extraction liquid (V (methanol): V (chloroform) = 3:1), and vortexed for 30 s at room temperature. Following centrifugation at 13,000 rpm for 10 min at 4°C, 300 µl of the resulting supernatant was transferred into a sample vial for vacuum drying at room temperature. The residue was redissolved in 40 µl of a methoxyamine solution (15 mg/ml in pyridine) and vortexed for 1 min. An oximation reaction was performed at 37°C for 1.5 h. An 80-µl aliquot of BSTFA containing 1% trimethylchlorosilane (TMCS) was then added to the solution, followed by vortexing for 30 s. The obtained samples were then kept at 70°C in an oven for 1 h before weighing, and a 40-µl aliquot of acetonitrile was added. Samples were then centrifuged at 13,200 rpm for 10 min at 4°C. The supernatant was transferred to an autosampler vial for GC-MS analysis.
Ten microliters of internal standard solution (heptadecanoic acid in methanol, 1 mg/ml) and 600 µl of extraction liquid (V (methanol): V (chloroform) = 3:1) were added to the serum samples. The mixture was vortexed for 30 s and stored at 37°C for 10 min. The resulting 600-µl supernatant was transferred into a sample vial for vacuum drying at room temperature. The remaining steps were as described above.

Gas Chromatography-Mass Spectrometry Conditions
The samples were analyzed using an Agilent 7890 chromatograph coupled with a 5977B MS system (Agilent Technologies, USA) with an EI source. A DB-5 ms capillary column coated with 95% dimethyl/5% diphenyl polysiloxane (30 m × 0.25 mm i.d., 0.25-µm film thickness, CA, USA) was utilized for separation. The column temperature program was as follows: the GC oven temperature was initially maintained at 60°C for 1 min, then raised to 325°C at a rate of 10°C/min and maintained at 325°C for 10 min. The inlet and ion source were both maintained at 250°C. The injection volume was 1 µl. Helium was utilized as the carrier gas at a constant flow rate of 0.87 ml/min. The MS system was operated with electron impact ionization at 70 eV and a scanning range of m/z 50-650 (full-scan mode).
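As a quick arithmetic check on the oven program described above, the following minimal Python sketch computes the total run time of a single-ramp program from the stated hold times, temperatures, and ramp rate. The function name `oven_program_minutes` is illustrative and not part of any instrument software; the calculation ignores any solvent delay or post-run cool-down, which the text does not specify.

```python
def oven_program_minutes(initial_hold_min, start_c, end_c, ramp_c_per_min, final_hold_min):
    """Total duration (min) of a single-ramp GC oven program.

    Illustrative helper only: hold at start_c, ramp linearly to end_c,
    then hold at end_c.
    """
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values taken from the text: 60 °C for 1 min, 10 °C/min to 325 °C, hold 10 min.
total = oven_program_minutes(1, 60, 325, 10, 10)
print(total)  # 37.5
```

Under these assumptions, each injection occupies the oven for 37.5 min: 1 min initial hold, 26.5 min ramp, and 10 min final hold.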

Data Processing and Statistical Analysis
QC samples were analyzed five times at the beginning of the run and injected once after every 10 injections of the randomly sequenced samples. The raw data obtained from the GC-MS run were converted to the mzData format using MassHunter Workstation Software (Version B.06.00, Agilent Technologies). Data pretreatment, including non-linear retention time alignment, peak discrimination, filtering, alignment, matching, and identification, was performed using the XCMS Online platform (https://xcmsonline.scripps.edu). The three-dimensional data generated by XCMS were subjected to principal component analysis (PCA) and orthogonal partial least-squares discriminant analysis (OPLS-DA) using SIMCA-P software (version 14.0, Umetrics, Sweden). The t-test was utilized to assess the significance of differences between cancer tissue and paracarcinoma tissue for parametric variables. For each statistical analysis, a p-value less than 0.05 was considered significant. Metabolites with a variable importance in the projection (VIP) value greater than 1.0 and a p-value less than 0.05 were regarded as differential metabolites.

Biomarker Identification and Pathway Enrichment Analysis
Only the variables meeting the criteria of "p-value <0.05 in ANOVA and VIP values ≥1.0" were screened as potential biomarkers and carried forward for research into the molecular mechanism. The elemental formula and fragmentation patterns were obtained using MassHunter Workstation Software. Simultaneously, differential metabolites were tentatively identified by library search (

Confirmatory Studies by Histopathologic Examination and Immunohistochemistry
The CRC samples utilized for immunohistochemistry staining were formalin-fixed, paraffin-embedded tissues and included CRC tumor tissue from each patient. Fifty cases of CRC were collected and spotted in duplicate. Diagnostic paraffin blocks were selected on the basis of the availability of suitable formalin-fixed, paraffin-embedded tissue. Histological confirmation of CRC was achieved in all cases by a central review using standard tissue sections, and the most tumor-rich areas were marked in the paraffin blocks. Finally, CRC was confirmed by hematoxylin and eosin (H&E) staining (Figure S1).

Result of Multivariate Statistical Analysis
Typical GC-MS base peak intensity (BPI) chromatograms of tissue samples (cancer and paracarcinoma) and serum samples (preoperative and postoperative CRC) were obtained (Figure 2). First, we compared the tissue and serum samples to achieve an unambiguous classification. Unsupervised PCA was conducted on the samples to visualize general clustering, trends, or outliers among the observations. As shown in Figure 3A, the cancer and paracarcinoma tissues clustered separately, implying a significant difference in chemical composition between the two groups. The PCA results for serum are shown in Figure 4A. To better characterize the variations in tissue and serum metabolic profiles, orthogonal partial least-squares discriminant analysis (OPLS-DA) was carried out to identify significant features/ions. The key differences in tR–m/z pairs were screened by the S-plot and VIP value (Figures 3C and 4C) from OPLS-DA. In the OPLS-DA S-plot (Figures 3B and 4B), each point represents a tR–m/z pair; the X-axis represents variable contribution, where the further a point lies from the origin, the greater its contribution to the separation of groups; the Y-axis represents variable confidence, where the further a point lies from the origin, the higher the confidence level of the tR–m/z pair in the separation of groups (12). The OPLS-DA model was further validated by cross-validation and a permutation test (Figures 3D and 4D). The ions with variable importance in the projection (VIP) values >1.0 and p-value <0.05 in ANOVA that lay at the corners of the S-plot were the variables contributing most to the differences.
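To make the two S-plot axes concrete, the following minimal Python sketch computes, for each variable, its covariance with a score vector (the contribution axis) and its correlation with that score vector (the confidence axis), which is how S-plot coordinates are commonly defined. It is an illustration under stated assumptions rather than a reproduction of the SIMCA-P computation: the score vector `t` is supplied directly instead of being derived from a fitted OPLS-DA model, no scaling is applied, and the function name `s_plot_coordinates` and the demo matrix are hypothetical.

```python
import math


def s_plot_coordinates(X, t):
    """Return (covariance, correlation) with score vector t for each column of X.

    X: list of rows (samples), each a list of variable intensities.
    t: one predictive score per sample (assumed to come from an OPLS-DA model).
    """
    n = len(t)
    t_mean = sum(t) / n
    tc = [v - t_mean for v in t]
    t_sd = math.sqrt(sum(v * v for v in tc) / (n - 1))
    coords = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        col_mean = sum(col) / n
        xc = [v - col_mean for v in col]
        cov = sum(a * b for a, b in zip(tc, xc)) / (n - 1)
        x_sd = math.sqrt(sum(v * v for v in xc) / (n - 1))
        corr = cov / (t_sd * x_sd) if t_sd > 0 and x_sd > 0 else 0.0
        coords.append((cov, corr))
    return coords


# Made-up demo: variable 0 tracks the score exactly, variable 1 barely varies.
X_demo = [[1.0, 5.0], [2.0, 5.1], [3.0, 4.9], [4.0, 5.0]]
t_demo = [-1.5, -0.5, 0.5, 1.5]
for j, (cov, corr) in enumerate(s_plot_coordinates(X_demo, t_demo)):
    print(f"variable {j}: contribution={cov:.3f}, confidence={corr:.3f}")
```

In this toy example, the first variable lies far from the origin on both axes, which corresponds to the corner region of the S-plot, whereas the second variable stays near the origin on the contribution axis.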

Identification of Potential Biomarkers and Pathway Analysis of CRC
The ions with VIP ≥1.0 and p < 0.05 obtained from the S-plot were considered candidate putative biomarkers for identification using NIST, METLIN, MetaboAnalyst, and the Human Metabolome Database. Finally, 17 tissue and 13 serum candidate ions were selected; their corresponding retention time, p-value, m/z, and VIP value are summarized in Table 2, and heat maps of the tissue and serum samples were generated (Figure 5). The receiver-operating characteristic (ROC) curve was utilized to evaluate the potential biomarkers (Figure S2). The results show that stearic acid and cholesterol followed consistent trends and were the most promising biomarkers. More detailed analyses of the most relevant pathways and networks for CRC were performed with MetaboAnalyst 5.0, a free web-based tool that provides pathway enrichment analysis for the conditions under study. Metabolic pathway analysis revealed (Table 3 and Figure 6) that 12 pathways contributed to CRC at the tissue level, including inositol phosphate metabolism, primary bile acid biosynthesis, steroid biosynthesis, and phosphatidylinositol signaling system. Likewise, 12 pathways (Table 3 and Figure 6) contributed to CRC at the serum level, including linoleic acid metabolism, primary bile acid biosynthesis, and steroid biosynthesis. Ultimately, the most influential pathways contributing to CRC were inositol phosphate metabolism, primary bile acid biosynthesis, phosphatidylinositol signaling system, and linoleic acid metabolism.
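The ROC evaluation mentioned above is often summarized by the area under the curve (AUC). The minimal Python sketch below computes the AUC for a single candidate marker as the fraction of case-control pairs in which the case has the higher value, with ties counted as one half. The function name `roc_auc` and the intensity values are hypothetical illustrations, not results from this study.

```python
def roc_auc(case_values, control_values):
    """AUC as the probability that a random case exceeds a random control.

    Ties contribute 0.5. Equivalent to the normalized Mann-Whitney U statistic.
    """
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in case_values:
        for k in control_values:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Made-up intensities for one candidate metabolite, for illustration only.
cases = [2.1, 3.4, 2.9, 4.0]
controls = [1.0, 2.5, 1.8, 3.0]
print(roc_auc(cases, controls))  # 0.8125
```

An AUC near 0.5 indicates no discrimination. For a marker that is lower in cases than in controls, the value falls below 0.5, and the groups can be swapped (or 1 minus the AUC taken) to express its discriminating ability.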

Signaling Networks
The identified potential biomarkers were mapped in KEGG (http://www.kegg.jp/) to find interactions, which show the relationships among these signaling pathways. The networks were primarily related to inositol phosphate metabolism, primary bile acid biosynthesis, steroid biosynthesis, phosphatidylinositol signaling system, and linoleic acid metabolism. According to the pathway flow analysis, the primary bile acid biosynthesis pathway was deemed to be the upstream signaling network. Compared with the paracarcinoma tissue/preoperative serum, the levels of 2,3-butanediol, cholesterol, muco-inositol, oleic acid, and stearic acid were significantly downregulated. Conversely, L-lactic acid, allose, cholestane-3,5-diol, and glucose were markedly upregulated (Figure 7).

DISCUSSION
Colorectal cancer is the third leading cause of cancer-related deaths, and late-stage diagnosis is a major cause of CRC morbidity and mortality, greatly threatening human health (13,14). The incidence of CRC has been rising continuously in recent decades, resulting in about 900,000 deaths per year globally (15,16). There is thus an urgent need to identify accurate and non-invasive biomarkers to assist the early diagnosis and clinical management of CRC (15,16). Metabolomics is usually utilized to discriminate potential biomarkers; however, given the complexity and size of metabolic networks, pathological states in animals and humans are often characterized on the basis of a limited number of single pathways (17). In this study, the potential mechanisms of CRC were assessed through gas chromatography-mass spectrometry and multivariate statistical analysis. A metabolic profiling method based on gas chromatography-mass spectrometry coupled with multivariate statistical analysis, including principal component analysis and orthogonal partial least-squares discriminant analysis, was employed to discriminate the groups, screen differential metabolites, identify the significant metabolites, and illustrate the mechanisms of disease. We present an illustrative case showing that metabolomics is an innovative method for exploring disease biomarkers or intervention-related perturbed metabolic pathways (18).
In this study, untargeted GC-MS-based tissue (cancer tissue and paracarcinoma tissue) and serum (preoperative and 2 weeks postoperative) metabolomics was applied to investigate the metabolic state of 50 human subjects. The itemization of CRC consensus molecular subtypes was performed in an effort to explain the clinical heterogeneity through the data analysis results (19). Additionally, an eight-biomarker panel (L-lactic acid, 2,3-butanediol, cholesterol, allose, malic acid, muco-inositol, oleic acid, stearic acid) can differentiate well between cancer tissue and paracarcinoma tissue. A six-biomarker panel (cholesterol, oleic acid, stearic acid, cholestane-3,5-diol, glucose, 3-hydroxybutyric acid) can differentiate well between the preoperative and 2-week postoperative sera. As a consequence of the above, the levels of 2,3-butanediol, cholesterol, muco-inositol, oleic acid, and stearic acid were significantly upregulated. Conversely, L-lactic acid, allose, cholestane-3,5-diol, and glucose were markedly downregulated. CRC is closely related to metabolic dysfunction in the pathways of inositol phosphate metabolism, primary bile acid biosynthesis, steroid biosynthesis, phosphatidylinositol signaling system, and linoleic acid metabolism.
Glucose metabolism is utilized by cancer cells to provide sufficient metabolite precursors and energy to sustain fast cell growth (20,21). Increased glucose uptake and enhanced glycolysis have been identified as a hallmark of cancer cells, with upregulated levels of transporters and enzymes involved in glucose metabolism. In our study, we found that glucose in preoperative CRC serum was markedly lower than that in 2-week postoperative serum. The decrease in glucose indicates that glycolysis is a metabolic alteration that appears early in the development of CRC to provide energy for cancer cell growth (22). Inducing a condition of "hunger" in the cancer cells could be an effective method to improve the disease and possibly even treat it (22).
As the molecular skeleton of inositol hexaphosphate (IP6), a carbohydrate, and the precursor of phosphorylated compounds (23), inositol is primarily utilized to treat CRC (24) and other diseases. Inositol exhibits anticancer activity, and it synergistically reinforces the inhibitory pesticide effects of IP6 on the growth of colon and mammary cancers (24). It was shown that inositol decreased to about 42% in tumor tissues compared to non-tumor tissues. With the reduction of inositol, an inhibitor of cancer, the cancer cells grow faster. Therefore, inositol might be a useful tool to inhibit the development and progression of CRC (25). Although cholesterol is essential in the body, its levels are associated with increased CRC risk, and statin treatment could decrease CRC risk in older adults under 75 years of age (26). For example, cholesterol accumulation is observed in tumors from gastrointestinal cancer patients through increased low-density lipoprotein receptor (LDLR) and decreased ATP-binding cassette transporter (ABCA1) expression (27). Researchers interpreted that the CRC liver metastasis-specific cholesterol metabolic pathway is established for colonization of metastatic CRC cells (27). Inhibiting this CRC liver metastasis-specific cholesterol metabolic pathway could suppress CRC liver metastasis. Finally, it was confirmed that targeting the cholesterol biosynthesis pathway may be a promising therapy for liver metastasis of CRC (28). Cholesterol is an important metabolite participating in bile acid biosynthesis, and its levels had increased in CRC. Bile acids are cholesterol derivatives with detergent properties that are normally found in the intestine and have been suggested since 1939 to be tumor-promoting agents (29).

CONCLUSION
Our research focused on colorectal cancer patients and was based on untargeted GC-MS-based tissue (cancer tissue and paracarcinoma tissue) and serum (preoperative and 2 weeks postoperative) metabolomics methods. Finally, 17 tissue and 13 serum candidate ions were selected based on their corresponding retention time, p-value, m/z, and VIP value. The results show that stearic acid and cholesterol followed consistent trends and were the most promising biomarkers. This also means that these metabolites might have important clinical significance for the detection of CRC. Metabolic pathway analysis revealed that 12 pathways contributed to CRC at the tissue level, including inositol phosphate metabolism, primary bile acid biosynthesis, steroid biosynthesis, and phosphatidylinositol signaling system. Likewise, 12 pathways contributed to CRC at the serum level, including linoleic acid metabolism, primary bile acid biosynthesis, and steroid biosynthesis. Ultimately, the most influential pathways contributing to CRC were inositol phosphate metabolism, primary bile acid biosynthesis, phosphatidylinositol signaling system, and linoleic acid metabolism. Further research will be conducted to determine whether these biomarkers can be fully integrated into applications for the early diagnosis of CRC.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Nanjing Hospital of Chinese Medicine Affiliated to Nanjing University of Chinese Medicine. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
GZ, DK, and ZF designed the research. WW, YW, and GZ performed the experiments. BP and FS collected the samples. GZ and WW analyzed the data. GZ, YW, and YZ wrote the manuscript. All authors contributed to the article and approved the submitted version.