Differential Metabolites in Chinese Autistic Children: A Multi-Center Study Based on Urinary 1H-NMR Metabolomics Analysis

Background: Autism spectrum disorder (ASD) is a group of early-onset neurodevelopmental disorders. However, there is no valuable biomarker for the early diagnosis of ASD. Our large-scale and multi-center study aims to identify metabolic variations between ASD and healthy children and to investigate differential metabolites and associated pathogenic mechanisms. Methods: One hundred and seventeen autistic children and 119 healthy children were recruited from research centers of 7 cities. Urine samples were assayed by 1H-NMR metabolomics analysis to detect metabolic variations. Multivariate statistical analysis, including principal component analysis (PCA), and orthogonal projection to latent structure discriminant analysis (OPLS-DA), as well as univariate analysis were used to assess differential metabolites between the ASD and control groups. The differential metabolites were further analyzed by receiver operating characteristics (ROC) curve analysis and metabolic pathways analysis. Results: Compared with the control group, the ASD group showed higher levels of glycine, guanidinoacetic acid, creatine, hydroxyphenylacetylglycine, phenylacetylglycine, and formate and lower levels of 3-aminoisobutanoic acid, alanine, taurine, creatinine, hypoxanthine, and N-methylnicotinamide. ROC curve showed relatively significant diagnostic values for hypoxanthine [area under the curve (AUC) = 0.657, 95% CI 0.588 to 0.726], creatinine (AUC = 0.639, 95% CI 0.569 to 0.709), creatine (AUC = 0.623, 95% CI 0.552 to 0.694), N-methylnicotinamide (AUC = 0.595, 95% CI 0.523 to 0.668), and guanidinoacetic acid (AUC = 0.574, 95% CI 0.501 to 0.647) in the ASD group. Combining the metabolites creatine, creatinine and hypoxanthine, the AUC of the ROC curve reached 0.720 (95% CI 0.659 to 0.777). Significantly altered metabolite pathways associated with differential metabolites were glycine, serine and threonine metabolism, arginine and proline metabolism, and taurine and hypotaurine metabolism. Conclusions: Urinary amino acid metabolites were significantly altered in children with ASD. Amino acid metabolic pathways might play important roles in the pathogenic mechanisms of ASD.


INTRODUCTION
Autism spectrum disorder (ASD) is a group of early-onset neurodevelopmental disorders characterized by social communication difficulties, narrow interests, and repetitive stereotyped behaviors (1). In the United States, the prevalence of ASD has increased in recent years, ranging from 1.57% in 2009 to 1.85% in 2020 (2,3). The prevalence of ASD among children aged 6-12 years is ∼0.70% in China (4). The large cost associated, mainly consisting of special education services and parental productivity loss, has caused a heavy burden to society and families (5). Despite the lack of effective drug treatments, several studies highlight the potential benefits of early diagnosis and parent-mediated interventions, which have to some extent improved children's social and communicative abilities (6,7). The detection rate of genetic etiology of ASD is about 10-15% (8)(9)(10). However, early diagnosis remains a challenge for nongenetic ASD, which is mainly based on combining clinician observation with caregiver reports (11). Currently, there is an urgent need to find valuable biomarkers for the early diagnosis of ASD.
Some metabolomic studies have indicated the presence of elevated biomarkers in blood and urine samples from ASD patients, and these biomarkers include pyruvate, lactate, and mitochondrial-related enzymes (12)(13)(14). Major analytical techniques for metabolomics are nuclear magnetic resonance spectroscopy (NMR) and chromatography, including gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). The advantages of 1 H-NMR are that the sample preprocessing is simple and non-destructive and that the detection of metabolites is comprehensive (15,16). In 2010, Yap et al. first used urinary 1 H-NMR analysis to detect potential biomarkers for ASD (17). Over the last few years, with urinary 1 H-NMR analysis, some discriminating metabolites have been identified between ASD and healthy people. Studies indicated that ASD group showed higher levels of hippurate, glycine, tryptophan and D-threitol and lower levels of glutamate, creatine, valine, betaine and 3methylhistidine. Further analyses indicated possible pathogenic mechanisms involving gut microbial metabolism, oxidative stress conditions and amino acid metabolism (18,19). Overall, 1 H-NMR analysis shows great potential for the identification of biochemical signatures and investigation of the disease mechanisms of ASD. However, previous 1 H-NMR analyses lack large-scale sample sizes to confirm the significance and connection between metabolites and ASD. We aimed to conduct a large-scale and multi-center study to identify metabolic variations between ASD and control groups through urinary 1 H-NMR metabolomics analysis and to investigate potential biological mechanisms related to differential metabolites.

Participants
The study was conducted from January 2014 to December 2016. All the participants were recruited from research centers of 7 cities (Shanghai, Guangzhou, Changsha, Chongqing, Chengdu, Wenzhou, and Beijing). Participants were drawn from ASD and control group.
Autistic children from both hospitals and local autism rehabilitation of each research center were enrolled in the ASD group. The inclusion criteria for ASD group were: (a) children aged 2-18 years; (b) no limitation on the gender; (c) the diagnosis of ASD was based on the Diagnostic and Statistical Manual of Mental Disorders 5 (DSM-5) criteria (1) and confirmed with the Autism Diagnostic Observation Schedule (ADOS) and the Autism Diagnostic Interview-Revised (ADI-R) criteria by trained clinical psychiatrists from each research center; (d) urine sample was available. Exclusion criteria for ASD group were: (a) symptomatic autism (such as Rett syndrome and fragile X syndrome); (b) other mental illness (such as attention-deficit hyperactivity disorder); (c) other neurological disorders (such as epilepsy and central nervous system infections); (d) inherited metabolic diseases; (e) history of brain injury; (f) taking nonessential drug or dietary supplement before (72 h) and during sample collection.
Healthy children from the health examination center of each research center were enrolled in the control group. The inclusion criteria for control group were: (a) children with no abnormality in health examination and typical development; (b) age-and sex-matching with ASD group; (c) urine sample was available. Exclusion criteria for control group were: (a) siblings of ASD group; (b) clinical evidence of ASD diagnosis; (c) mental illness (such as attention-deficit hyperactivity disorder); (d) neurological disorders (such as epilepsy and central nervous system infections); (e) inherited metabolic diseases; (f) history of brain injury; (g) taking non-essential drug or dietary supplement before (72 h) and during sample collection.
After all eligible participants and their parents provided informed consent, they were invited to participate in the study. The study was approved by the institutional ethics committee at the Children's Hospital of Fudan University.

Sample Collection
First morning urine specimens were collected from all participants during the research period. Samples were collected in 15 mL urine collection tubes without preservative. Each sample was centrifuged and aliquoted into 1.5 mL EP tubes. Afterwards, samples were numbered ("1" represents the ASD group, and "2" represents the control group) and stored at −80 • C immediately until 1 H-NMR analysis.

Sample Preparation
A 500 µL urine sample was added to a 1.5 mL EP tube prefilled with 14 µL KF (5 M) solution. After vortexing, the sample was allowed to rest for 10 min, followed by centrifugation (12,000 rpm, 4 • C) for 10 min. A total of 450 µL liquid supernatant was added to an NMR tube preloaded with 10 µL EDTA-d12 (0.1 M). The NMR tube cap was covered and mixed by hand. Finally, 45 µL Na + /K + buffer (1.5 M, pH = 7.40) was added to the NMR tube, which was mixed by hand and then placed in the NMR spectrometer for data collection. The sample preparation process for 1 H-NMR analysis is shown in Supplementary Figure 1. 1

H-NMR Spectroscopy Experiments
All 1 H-NMR spectroscopy experiments were performed at 298 K using a Bruker AVIII 600 MHz NMR spectrometer (Bruker BioSpin, Germany) with a proton resonance frequency of 600.13 MHz.

Data Processing
Spectra were processed using MestReNova software (MestReNova 8.1, Spain). The 1 H-NMR FID signals were multiplied by an exponential function equivalent to a line broadening of 1 Hz before performing an automatic Fourier transformation. The phase distortion and baseline of each spectrum were manually adjusted. The internal standard trimethylsilyl propanoic acid (TSP, δ = 0 ppm) was used as the baseline to calibrate the chemical shifts. The concentration of TSP was 0.261 mmol/L, and the spectrum of TSP represented 12 1 H. The regions of the 1 H-NMR spectra (δ 0.3-9.5 ppm) were divided into consecutive integrated spectral regions of equal width (δ 0.004 ppm). The spectral region of the water (δ 4.71-5.055 ppm) and urea (δ 5.6-6.12 ppm) peaks were removed from each spectrum to minimize variations caused by the presaturation of the residual water and urea resonances. Mnova software was used to correct the spectra with obvious chemical shifts after the integration. Metabolites were assigned by referencing the values for chemical shifts in J-resolved (JRES), COZY, TOCSY, HSQC, and HMBC spectra and literature reports. A series of 2D NMR spectra were acquired for selected samples.
To eliminate the instrument differences of sensitivity and stability and to reduce the analysis errors caused by the concentration differences of the samples, two normalization methods were performed. Creatinine normalization: the 1 H-NMR spectra were normalized by using the creatinine methylene resonance (δ = 4.05 ppm) as a reference. Total area normalization (20): the integrated area in each bucket was normalized by the total sum of peak intensities to eliminate the effects of variable concentration among different samples.

Data and Statistical Analysis
Clinical Characteristics of Participants Difference in age between the ASD and control groups was evaluated by Student's t-test when the distribution was normal or the Mann-Whitney U test when it was skewed. Difference in sex was investigated using chi-square test. Statistical analyses were performed by using the SPSS statistical package program (version 20, SPSS Inc., Chicago, IL, USA), and P < 0.05 was considered statistically significant.

Multivariate Analysis
The normalized data were imported into the SIMCA-P+ software package (version 13.0, Umetrics, Sweden) for multivariate statistical analysis, including principal component analysis (PCA) and orthogonal projection to latent structure discriminant analysis (OPLS-DA) (21)(22)(23)(24). PCA was first used to observe the overall distribution among samples and the stability of the whole analysis process. Abnormal data, which related to sample contamination or improper sampling, were removed according to the overall aggregation trend in all the samples. After which OPLS-DA was used to distinguish the overall difference in the metabolic profile and to find differential metabolites between the groups.
To prevent model overfitting, an internal validation method was used to verify the validity of the model. The OPLS-DA model was validated by a 7-fold cross-validation (CV) (25). R 2 and Q 2 were two parameters to assess the quality of the model. OPLS-DA was further validated by variance analysis of the cross-validated residuals (CV-ANOVA) (26), and the model was considered valid at P < 0.05.
After multiplying the loading value of each variable with its standard deviation, backtracking conversion of the data was performed. Then, the data were assessed by multivariate analysis and imported into mapping software based on MATLAB (version 7.1, USA) to plot the loading diagram of the correlation  coefficient. The Pearson correlation coefficient represented the linear correlation between the variable and the first principal component score value of the OPLS-DA model and was used to determine whether the variation contributed significantly to the intergroup differentiation. The significance was evaluated by the threshold value of the absolute correlation coefficient, which was determined according to the confidence interval of the sample size. If the absolute value of the Pearson correlation coefficient of the variable was higher than the threshold value (P < 0.05), the content of the variable was considered significantly different between groups.

Univariate Analysis
Differential metabolites between the two groups were selected for univariate analysis. Data from the total area normalized peak area are expressed as the mean ± SD. An independent sample t-test was used for comparing the two groups when the distribution was normal. The non-normal distribution data were evaluated by Mann-Whitney U test. P < 0.05 was considered statistically significant. The false discovery rate (FDR) was used to correct multiple hypothesis testing. FDR P < 0.1 was considered statistically significant. We applied correlation analysis to detect the relationship between differential metabolite levels and age.

Receiver Operating Characteristics (ROC) Curve Analysis
The sensitivity and specificity of metabolites with significant differences between the two groups in the diagnosis of ASD were evaluated using a ROC curve analysis. ROC curves were generated by using MedCalc statistical software (version 19.1.7, Belgium). The area under the curve (AUC) was used to measure the overall degree of identification power. An AUC > 0.7 was considered acceptable. Optimal cut-off points were determined by maximizing the Youden's J index (J = sensitivity + specificity -1). The sensitivity, specificity, false-negative rate (FNR), falsepositive rate (FPR), positive predictive value (PPV), negative predictive value (NPV), likelihood ratio positive (LR+), and likelihood ratio negative (LR-) were calculated to compare the diagnostic accuracy of the metabolites. Further analysis was performed after each group was stratified by sex and age. Logistic regression was used to analyse the combined metabolites.

Metabolic Pathways and Network Analysis
MetaboAnalyst (version 4.0, https://www.metaboanalyst.ca/) was used for pathway and network analyses (27). All differential metabolites were imported into the pathway analysis module to obtain matched pathways according to the P-values from the pathway enrichment analysis and pathway impact values from the pathway topology analysis. Significantly affected metabolic pathways associated with differential metabolites were identified by metabolite set enrichment analysis (MSEA). A metabolitegene-disease interaction network was established to detect the connections of differential metabolites and associated pathways.

Characteristics of the ASD and Control Groups
A total of 117 children with ASD were enrolled in the ASD group, and 119 healthy children were enrolled in the control group. The mean ages of the ASD and control groups were 10.12 ± 2.60 and 9.90 ± 1.73, respectively, with no significant difference between them (P = 0.468). The male to female ratios were 4.82:1 in the ASD group and 3.58:1 in the control group, with no significant difference between them (P = 0.388).  Unknown −0.296 * The negative value indicates that the metabolite content in the urine samples of the control group is significantly lower than that of the ASD group. Conversely, the positive value indicates that the metabolite of the control group is significantly higher than that of the ASD group.

Total Area Normalization Analysis
The 1 H-NMR spectra of the two groups were analyzed by PCA after total area normalization. The parameters of the PCA score plot ( Figure 3A) were R 2 X = 0.366 and Q 2 = 0.0656. The OPLS-DA score plot ( Figure 3B) showed R 2 X = 0.274, Q 2 = 0.0565 and CV-ANOVA P = 0.009. These results suggested that the model was valid.

Differential Metabolites Between the ASD and Control Groups
To distinguish the metabolic differences between the two groups, a polychromatic correlation coefficient loading plot (Figure 4) was drawn. The color of the polychromatic loading plot was encoded by the absolute value of the correlation coefficient. The warmer the color is, the higher the absolute value of the correlation coefficient and the greater the contribution to the intergroup differentiation. The threshold of the absolute value of the Pearson's correlation coefficient was determined to be 0.182. The variables corresponding to the correlation coefficient with an absolute value > 0.182 contributed significantly to the intergroup differentiation (P < 0.05). Differential metabolites identified by the Pearson's correlation coefficient were showed in Table 1. Compared with the control group, the ASD group showed higher levels of glycine, guanidinoacetic acid, creatine, hydroxyphenylacetylglycine, phenylacetylglycine and formate and lower levels of 3-aminoisobutanoic acid, alanine, taurine, creatinine, hypoxanthine and N-methylnicotinamide. The fold changes of differential metabolites are summarized in Figure 5. A total of 6 metabolites, including 3-aminoisobutanoic acid (P = 0.0425), creatine (P = 0.0009), creatinine (P < 0.0001), hypoxanthine (P < 0.0001), formate (P = 0.0267), and N-methylnicotinamide (P = 0.0149), showed significant differences in the normalized peak areas between the two groups. The differences of above 6 metabolites remained significant after correction for multiple hypothesis testing (FDR P < 0.1) FIGURE 5 | Comparation of the fold changes of the differential metabolites between the ASD and control groups. The fold change represents the ratio of average peak area between ASD group and control group.

Differential Metabolites and Potential Biological Mechanisms
The Sensitivity and Specificity of Metabolites in the Diagnosis of ASD The diagnostic accuracies of differential metabolites in the two groups were evaluated by ROC curve analysis ( Table 3, Supplementary Figure 2). The ROC curve showed relatively significant diagnostic values of hypoxanthine (AUC = 0.657, 95% CI 0.588 to 0.726), creatinine (AUC = 0.639, 95% CI 0.569 to 0.709), creatine (AUC = 0.623, 95% CI 0.552 to 0.694), N-methylnicotinamide (AUC = 0.595, 95% CI 0.523 to 0.668) and guanidinoacetic acid (AUC = 0.574, 95% CI 0.501 to 0.647) for ASD (Supplementary Figure 3). The AUC of ROC analysis of the creatine/creatinine ratio was 0.6480 (95% CI 0.579 to 0.718). For each metabolite, there was no significant difference in AUCs between males and females (Supplementary Table 2). Compared with age stratification of 7-9 years old, the metabolites guanidinoacetic acid and creatine showed significantly higher diagnostic accuracy for ASD in the age stratification of 13-15 years old (AUC of guanidinoacetic acid = 0.802, 95% CI 0.566 to 0.944, P = 0.0282; AUC of creatine = 0.823, 95% CI 0.589 to 0.955, P = 0.0344. Results shown in Supplementary Table 3). By combining the metabolites of creatine, creatinine and hypoxanthine, the AUC of the ROC curve reached 0.720 (95% CI 0.659 to 0.777), with a sensitivity of 80.34% and specificity of 52.94%.

Correlated Metabolic Pathways and Networks
The main metabolic pathways associated with differential metabolites are shown in Supplementary Figure 4. According to the bubble plot of the metabolic pathway impact, there were significant metabolite changes in the glycine, serine and threonine metabolism, glyoxylate and dicarboxylate metabolism and taurine and hypotaurine metabolism pathways (Figure 6A). The plot of metabolite set enrichment analysis (MSEA) listed the significant enrichment pathways of differential metabolites ( Figure 6B). Glycine, serine and threonine metabolism, primary bile acid biosynthesis and aminoacyl-tRNA biosynthesis were the pathways with the most significant enrichment. The metabolitegene-disease interaction network provides a global view of the connection of the differential metabolites and the potential functional relationships among metabolites, connected genes, and target diseases (Supplementary Figure 5).

DISCUSSION
Complex etiologies and atypical symptoms pose very large challenges to the early diagnosis of ASD. Urinary 1 H-NMR analysis provides a fast and comprehensive assessment to detect potential biomarkers of ASD (28). Our study was based on the 1 H-NMR analysis of urine samples from ASD and control groups to identify candidate metabolites and associated pathogenic mechanisms. Compared with other studies of urinary 1 H-NMR metabolomics analysis of ASD, our study sample was large size, multi-center, and representative. Besides, our study used two normalization methods, creatinine normalization and total area normalization. The analysis using creatinine normalization showed no significant differences between the two groups, though the total area normalization did detect a difference. The variation of creatinine was confirmed by the total area normalization. Whiteley et al. also found that excretion of urinary creatinine in the group of pervasive developmental disorders, which included ASD, was significantly lower than controls (29). The abnormal creatinine metabolism might be caused by rigidity in food choice and various exclusion diets associated with ASD (30). Therefore, total area normalization was more suitable for our study. The results obtained from the 1 H-NMR analysis revealed that the levels of glycine, guanidinoacetic acid, creatine, hydroxyphenylacetylglycine, phenylacetylglycine and formate were higher in the ASD group than those in the control group. Moreover, the levels of 3-aminoisobutanoic acid, alanine, taurine, creatinine, hypoxanthine and N-methylnicotinamide were lower in the ASD group than those in the control group. The normalized peak areas of 3-aminoisobutanoic acid, creatine, creatinine, hypoxanthine, formate and Nmethylnicotinamide differed significantly between the two groups. The metabolite levels of glycine, taurine, creatine and creatinine were consistent with those from previous reports (19,31). Our study is the first to detect variations in the metabolites of hydroxyphenylacetylglycine in urine samples from children with ASD. The results of ROC analysis indicated that creatine, creatinine and hypoxanthine have the potential to be biomarkers for the diagnosis of ASD. The creatine/creatinine ratio slightly improved the diagnostic accuracy. Recent research has reported significant female-related alterations of creatine and creatinine, and the creatinine/creatine ratio might be a good predictor of Frontiers in Psychiatry | www.frontiersin.org ASD in female subjects (32). We further found that combining the metabolites creatine, creatinine and hypoxanthine as a potential diagnostic indicator can largely improve the diagnostic accuracy for ASD.
The extension study of the metabolic pathway analysis demonstrated a possible imbalance of amino acid metabolism in ASD children. Differential metabolites between the ASD and control groups involved glycine, serine and threonine metabolism, arginine and proline metabolism, taurine and hypotaurine metabolism, and glutathione metabolism pathways (Supplementary Figure 6). In accordance with previous observations, amino acid metabolism disorder plays an important role in the pathogenesis mechanism of ASD (33)(34)(35)(36)(37). Creatine and creatinine, which show significant metabolism alterations in ASD, play an essential role in maintaining a high level of energy supply for the brain (38). Studies have indicated that creatine deficiency occurs in some ASD cases, and creatine may be a therapeutic target for ASD (38)(39)(40)(41). Abnormalities in creatinine might closely linked to abnormalities in creatine. Creatine is biosynthesized from glycine and arginine with an intermediate metabolite of guanidinoacetic acid. Glycine acts as an excitatory neurotransmitter in the early developmental stage. As the nervous system matures, it transforms to the major inhibitory neurotransmitter. If the transformation does not occur, an abnormal level of glycine may result in neural disorders, including ASD (42)(43)(44). The taurine and hypotaurine metabolism pathway also differed significantly between the two groups. Oxidative stress imbalance is considered to be important components of the pathophysiology of ASD, and antioxidant therapy may improve the prognosis of ASD (45,46). Park et al. reported that taurine, as an antioxidant and regulator of inflammation, might be a valid biomarker for ASD (47). Combined with vitamin D3, taurine showed benefits in the treatment of ASD (48,49). All metabolic pathways interact with each other and constitute a complex network with related genes and diseases (Supplementary Figure 5).
Many studies have reported that ASD is associated with abnormal gut microbial metabolism (50)(51)(52). Gut microbiota metabolites, including phenylalanine, tyrosine, hippurate and tryptophan, have been reported to be factors in the development of ASD (53)(54)(55)(56). In our study, phenylacetylglycine, a gut microbial co-metabolite, had a slight variation between the two groups. Phenylacetylglycine is the end product of the phenylalanine metabolism pathway (57). However, there are no studies on the relationship between phenylacetylglycine and the pathogenesis of ASD.

LIMITATIONS
There are some limitations in our study. At the cellular level, biochemical processes such as oxidative phosphorylation, redox reaction, and oxidative stress are regulated by circadian rhythms (58). About 50% of metabolites are thought to be rhythmic, involving metabolic pathways such as nucleotides, energy, oxidation, and carbohydrates (59). Studies indicated that disrupted circadian rhythm is closely related to ASD (60)(61)(62). Thus, in our study, inconsistent urine collection times may have affected the quality of metabolic analyses due to the restriction of research centers. A few small-scale sample studies have set healthy sibling group as control to remove the effects of confounding factors, such as heredity and environment (17,19). However, our study lacks of healthy siblings as controls.
Though 1 H-NMR analysis is the regular NMR method, some studies have used 2D HSQC-NMR to improve urinary screening in ASD. Compared to 1 H-NMR analysis, 1 H-13 C HSQC-NMR analysis shows the advantage of improving the metabolite detection accuracy and the discrimination ability (18,31). Moreover, it is necessary to compare metabolite levels that vary with ASD severity to better clarify the pathogenesis of ASD.

CONCLUSIONS
In our study, 1 H-NMR metabolomic analysis was used to investigate urinary metabolism patterns in the ASD and control groups. We revealed that urinary amino acid metabolites were significantly altered in children with ASD. A series of variations in amino acid metabolism pathways, including glycine, serine and threonine metabolism, arginine and proline metabolism, and taurine and hypotaurine metabolism might play important roles in the pathogenic mechanisms of ASD. Further studies of differential metabolites are needed to improve the understanding of ASD pathogenesis.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the institutional ethics committee at the Children's Hospital of Fudan University. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
YW conceived of the study. YM and HZ contributed to the analysis, synthesis and interpretation of the results, and wrote the manuscript. CL, LZ, and TW contributed to the sample collection. XZ, XL, LW, TL, XC, MM, YH, and EL contributed to the diagnosis of ASD and sample collection from each research center. YA, XX, WY, and YJ provided guidance for the study. All of the authors contributed to the preparation of the manuscript.

ACKNOWLEDGMENTS
We sincerely thank all the patients, controls and parents who provided written consent to participate in this study, as well as the clinicians who contributed to the recruitment of the participants.