Integration of the Human Gut Microbiome and Serum Metabolome Reveals Novel Biological Factors Involved in the Regulation of Bone Mineral Density

While the gut microbiome has been reported to play a role in bone metabolism, the individual species and underlying functional mechanisms have not yet been characterized. We conducted a systematic multi-omics analysis using paired metagenomic and untargeted serum metabolomic profiles from a large sample of 499 peri- and early post-menopausal women to identify the potential crosstalk between these biological factors which may be involved in the regulation of bone mineral density (BMD). Single omics association analyses identified 22 bacteria species and 17 serum metabolites for putative association with BMD. Among the identified bacteria, Bacteroidetes and Fusobacteria were negatively associated, while Firmicutes were positively associated. Several of the identified serum metabolites including 3-phenylpropanoic acid, mainly derived from dietary polyphenols, and glycolithocholic acid, a secondary bile acid, are metabolic byproducts of the microbiota. We further conducted a supervised integrative feature selection with respect to BMD and constructed the inter-omics partial correlation network. Although still requiring replication and validation in future studies, the findings from this exploratory analysis provide novel insights into the interrelationships between the gut microbiome and serum metabolome that may potentially play a role in skeletal remodeling processes.


INTRODUCTION
Osteoporosis is a progressive age-related condition associated with reduced bone mineral density (BMD) and increased susceptibility to low trauma fractures, which are the clinical endpoint of the disease. It represents the most prevalent metabolic bone disorder affecting >200 million people worldwide (Reginster and Burlet, 2006), and the burden is particularly large among postmenopausal women, which is mainly attributed to the reduced production of estrogen and other hormonal/metabolic changes that occur during menopause. It is estimated that at least one in three postmenopausal women have osteoporosis, and nearly half of those women will experience fragility fractures in their remaining lifetime (Melton et al., 1992). Dual-energy X-ray absorptiometry (DXA) derived BMD measurements of the hip and spine are the most frequently used metric for clinically diagnosing osteoporosis, as well as the most powerful known risk factor for predicting fracture risk (Kanis et al., 2005).
The gut microbiome, composed of the bacteria residing in the human gastrointestinal tract, is involved in a variety of diverse functions that are important for physiological wellbeing. There are several potential mechanisms through which the microbiome may impact bone metabolism, as previously reviewed (Hernandez et al., 2016;Chen et al., 2017). The microbiota can influence the intestinal absorption of essential minerals (e.g., calcium) that are important for maintaining skeletal homeostasis (Weaver, 2015), elicit immune responses which may alter the levels of inflammatory cytokines (e.g., TNF-a) that are important for bone health (Sjogren et al., 2012), produce metabolic byproducts (e.g., short chain fatty acids) which regulate critical cell signaling factors for bone remodeling processes (Lucas et al., 2018), and modulate the levels of hormones and neurotransmitters through the gut-brain axis (Cryan et al., 2019), including some (e.g., serotonin) that have been shown to interact with bone cells (Bliziotes, 2010). Although experimental animal models have provided compelling evidence that the gut microbiome may play a role in the regulation of bone mass (Sjogren et al., 2012), only a few limited studies have explored this relationship in humans Das et al., 2019;Xu et al., 2020). While these early efforts reported significant differences in the microbial diversity between osteoporosis cases and healthy controls, they were generally limited by small sample sizes and the inability to reveal specific trait-associated bacteria or functional mechanisms.
Metabolomics enables the comprehensive profiling of the intermediate and end products of cellular metabolism. Since metabolites represent the downstream expression of genomic, transcriptomic, and proteomic factors, small changes in other omics may be amplified at the metabolomic level, enabling the detection of critical biomarkers or corresponding therapeutic target pathways closely related to disease risk (Johnson et al., 2016). However, the application of metabolomics for osteoporosis is rather limited. At present, most efforts are confined to animal experiments (Lv et al., 2016), although several early studies in humans have identified novel osteoporosis biomarkers involved in the metabolism of tryptophan, phenylalanine, lipids, and energy (Miyamoto et al., 2018;Moayyeri et al., 2018;Zhao Q et al., 2018;Gong et al., 2021). Notably, some studies demonstrated that the effects of the novel metabolites identified were more significant than classical bone turnover markers (Qi et al., 2016), supporting the crucial functions of small molecule metabolites in BMD regulation and osteoporosis prediction.
It is well established that changes in diet may be accompanied by shifts in the composition of the microbiome , but perhaps even more important is the resulting effect on the human metabolome. The diet contains many compounds that cannot be broken down by human digestive enzymes, and therefore pass to the gut where they are catabolized by the microbiota (Lamichhane et al., 2018). Some of the metabolic byproducts generated during these processes may then be absorbed into the circulating blood, where they can potentially impact human health. For instance, the gut metabolite Trimethylamine N-oxide (TMAO), regulated by dietary phosphatidylcholine intake, has been shown to promote the development of atherosclerosis (Wang et al., 2011;Tang et al., 2013). Based on these findings, a novel therapeutic approach was established to inhibit microbial production of TMAO . We hypothesized that there could be similar undiscovered mechanisms which contribute to the osteoporosis susceptibility.
Multi-omics integration analyses of microbiome and metabolite profiles collected from the same individuals are very much needed to elucidate the full range of interactions between these biological factors with respect to bone phenotypes. We integrated the paired gut microbiome and untargeted serum metabolite profiles from a large sample of peri-and early postmenopausal Chinese women to explore the crosstalk which may contribute to BMD variation at the femoral neck, the most common site for hip fracture, which is one of the most devastating types of osteoporotic fractures (LeBlanc et al., 2014). An overview of the study workflow is provided in Figure 1.

Sample Recruitment
We randomly recruited 499 peri-and early post-menopausal Chinese women (aged 40 -65) living in Guangzhou City, China. Perimenopausal refers to the menopause transition phase, characterized by irregular menstrual cycles, while postmenopausal is defined by the cessation of menstrual periods for >1 year (Lumsden, 2016). Women who had taken antibiotics or estrogens within three months of enrollment were excluded. We also excluded women with preexisting conditions relevant to bone mass development such as serious residual effects from cerebral vascular disease, diabetes mellitus, chronic renal failure, chronic liver failure, chronic lung disease, alcohol abuse, corticosteroid therapy for more than 6 months duration, evidence of other metabolic or inherited bone disease, rheumatoid arthritis, collagen disorders, and chronic gastrointestinal diseases. Each subject signed an informed consent, and the study protocol was approved by the Medical Ethics Committee of Southern Medical University.
BMD of the hip and spine were measured with DXA (Lunar, GE Healthcare, Madison, WI, USA) by trained and certified research staff. The machine was calibrated daily using a phantom scan for quality assurance, and the accuracy of BMD measurement was assessed by the coefficient of variation for repeated measurements, which was 0.89% for spine BMD. To minimize information loss from artificially dichotomizing individuals into low/high BMD groups, BMD was considered as a quantitative trait. BMD measurements were standardized to have a mean of zero and standard deviation of one, and the normalized values were used as the phenotype.
Each subject provided stool and blood samples for metagenomic and metabolomics analyses, respectively. Stool samples were frozen at -80°C after sample procurement until DNA extraction. To avoid variation due to circadian rhythm, which is known to affect the metabolome (Sahar and Sassone-Corsi, 2012), 10 ml of blood was drawn from each subject after >8 hours of overnight fasting. Serum was extracted from the blood samples according to the protein precipitation protocol (Bruce et al., 2009) developed for metabolomics analysis, aliquoted, and stored at -80°C until used for further analysis. The subjects also completed a questionnaire to collect relevant covariate information (e.g., demographic and lifestyle factors). Since sex hormones are involved in metabolism in general (Guarner-Lans et al., 2011), and bone metabolism more specifically (Drake et al., 2013), the serum levels of follicle stimulating hormone (FSH) and estradiol were measured using routine enzyme linked immunoassay ELISA kits (Immunodiagnostic Systems, Gaithersburg, MD, USA).

Metagenomic Sequencing
DNA was extracted from 200 mg of stool sample using the E.Z.N.A. ® Stool DNA Kit (Omega, Norcross, GA, USA) following the manufacturer's protocol. The total DNA was eluted in 50 ml of elution buffer (QIAGEN, Hilden, Germany) and stored at -80°C until metagenomic sequencing (LC-BIO Technologies Co. LTD., Hang Zhou, China). We constructed a fecal DNA library, and used HiSeq 4000 (Illumina, San Diego, CA, USA) with the paired end 150 bp strategy to conduct sequencing. Fecal DNA was fragmented using dsDNA Fragmentase (New England BioLabs, Ipswich, MA, USA) by incubating at 37°C for 30 min, and the DNA library was constructed by TruSeq Nano DNA LT Library Preparation Kit (Illumina, San Diego, CA, USA). Blunt-end DNA fragments were generated using a combination of fill-in reactions and exonuclease activity, and size selection was performed with the provided sample purification beads. An Abase was added to the blunt ends of the strands, preparing them for ligation to the indexed adapters. Each adapter contained a Tbase overhang for ligating the adapter to the A-tailed fragmented DNA. The adapters were ligated to the fragments and the ligated products were amplified with PCR by the following conditions: initial denaturation at 95°C for 3 min, 8 cycles of denaturation at 98°C for 15 sec, annealing at 60°C for 15 sec, extension at 72°C for 30 sec, and then final extension at 72°C for 5 min.
The raw sequencing reads were then processed to obtain valid reads for further analysis by removing sequencing adapters with cutadapt v1.9 (Martin, 2011), trimming low quality reads using fqtrim v0.94 (Pertea, 2015), and aligning reads to the human reference genome (GRCh38/hg38) to remove host contamination with Bowtie2 v2.2.0 (Langmead and Salzberg, 2012). The quality filtered reads were de novo assembled to construct the metagenome for each sample using SPAdes v3.10.0 (Bankevich et al., 2012). All coding regions of metagenomic contigs were predicted using MetaGeneMark v3.26 (Zhu et al., 2010), and the coding sequences of all samples were clustered to obtain UniGenes with CD-HIT v4.6.1 (Fu et al., 2012). The UniGene abundances for FIGURE 1 | Overview of study workflow. 499 peri-and early post-menopausal women provided stool and blood samples for shotgun metagenomic sequencing and untargeted serum metabolomics profiling. Single omics association analyses were first conducted to identify microbes and metabolites that are associated with BMD. The paired microbiome and metabolite profiles were then integrated by performing a supervised feature selection with respect to BMD. The selected features were used to conduct inter-omics network analysis to explore the crosstalk between these biological factors. a given sample were estimated by transcripts per million (TPM) based on the number of aligned reads, G k = r k L k * 1 S n i=1 G i L i * 10 6 where k refers to the k th UniGene, r is the number of UniGene reads, and L is the UniGene length. The DIAMOND+MEGAN approach was then applied for taxonomic annotation. The UniGenes were aligned against the NCBI non-redundant protein database with DIAMOND v0.9.20 (Buchfink et al., 2015). The quality of the alignments was determined based on the bit score, which represents the required size of a sequence database in which the current match could be found by chance, and E-value, which denotes the likelihood that a given sequence match is due purely to chance. The resulting alignments were then used as input for taxonomic binning using the lowest common ancestor (LCA) algorithm in MEGAN v6.12.3 (Huson et al., 2007), which places a read on the lowest taxonomic node in the NCBI taxonomy that lies above all taxa to which the read has a significant alignment. We note that while the limitations of this local sequence alignment approach have been documented (Koski and Golding, 2001), it is a standard protocol for taxonomic profiling (Bagci et al., 2021).
The microbiome data are relative abundances since the total number of read counts per sample is highly variable and constrained by the maximum number of reads the sequencer can provide (Gloor et al., 2017), and the data are considered compositional because the relative abundances of all bacteria species within each sample are proportions which have a unit sum. We eliminated the rare species with an average relative abundance <0.01% to reduce the extreme sparsity of the data and remove sequencing artifacts (Cao et al., 2020). The relative abundances were then normalized by the centered log-ratio (CLR) transformation, which has been shown to be effective in transforming the compositional data to be approximately multivariate normal (Gloor et al., 2017).
A high-resolution tandem mass spectrometer Triple TOF5600plus (SCIEX, UK) was used to detect the metabolites eluted from the column in both positive and negative ion modes. The curtain gas was set to 30 PSI, ion source gas one and two were both set to 60 PSI, and the interface heater temperature was set to 650°C. The Ionspray voltage floating was 5000 V for positive ion mode and -4500 V for negative ion mode. The mass spectrometry data were acquired in IDA mode, and the TOF mass range was from 60 to 1200 Da. The survey scans were acquired in 150 ms, and as many as 12 product ion scans were collected if exceeding a threshold of 100 counts per second and with a 1+ charge-state. The total cycle time was fixed to 0.56 s. Four different time bins were summarized for each scan at a pulser frequency value of 11 kHz through monitoring of the 40 GHz multi-channel TDC detector with four-anode/channel detection. Dynamic exclusion was set for 4 s. During the acquisition, the mass accuracy was calibrated every 20 samples. To evaluate the stability of the LC-MS during the whole acquisition, a pooled quality control sample was acquired after every 10 samples.
The acquired MS data pretreatments including peak picking, peak grouping, retention time correction, and second peak grouping were performed using XCMS v3.16.1 (Smith et al., 2006). CAMERA v1.50 (Kuhl et al., 2012) was used to annotate the identified features with related isotopic peaks and adducts. Each ion was characterized by retention time and mass-to-charge ratios (m/z), and the intensities of each peak were recorded. Metabolite identification and data processing were performed using metaX v1.0.3 (Wen et al., 2017). The Human Metabolome Database (HMDB) and Kyoto Encyclopedia of Genes and Genomes (KEGG) were used to annotate metabolites by performing a mass-based search with a weight tolerance of 10 ppm. To provide more confident and reproducible study findings, we retained metabolites with annotations that were validated using an in-house fragment spectrum library.
Metabolite features detected in <50% of quality control samples or <80% of biological samples were removed, and the remaining peaks with missing values were imputed with the knearest neighbor algorithm. Probabilistic quotient normalization was applied to minimize technical artifacts, and robust spline correction was used for the post-acquisition correction of batch effects. In addition, the relative standard deviations of the metabolite features were calculated across all quality control samples, and those >30% were removed. The remaining metabolite features were log transformed and scaled to have zero mean and unit variance, which is a common normalization technique (Zhao Q et al., 2018;Gong et al., 2021). The log transformation converts skewed data to symmetric, while scaling makes all metabolites of equal importance and enables comparison based on correlations (Li et al., 2016).

Microbiome Association Analysis
Individual microbes were tested for association with BMD using a constrained elastic net regression model, which is a commonly used feature selection approach with compositional covariates (Lin et al., 2014). The model imposes a sparsity penalty along with a constraint that the regression coefficients of the CLRtransformed relative abundances sum to zero,b = arg min ( The elastic net regularization is a combination of both ridge and lasso penalty functions, where ridge results in a nonzero coefficient for every feature and lasso only assigns nonzero coefficients to the most strongly associated features. Since the penalized regression model does not provide conventional association pvalues, partial Spearman correlation analysis was used to individually test each microbe selected in the initial feature screening.

Functional Profiling of Microbiota
The abundances of metabolic pathways in the microbiome community were profiled using the Human Microbiome Project Unified Metabolic Analysis Network (HUMAnN2) pipeline (Franzosa et al., 2018). HUMAnN2 first maps metagenomic reads to the pangenomes (Huang et al., 2014) of species identified by taxonomic profiling. The protein-coding sequences in these pangenomes have been pre-annotated to their respective UniRef90 families (Suzek et al., 2015), which serve as a non-redundant protein sequence database. Metagenomic reads that do not align to a known pangenome are subjected to a translated search against the full UniRef90 database. All hits are weighted by quality and sequence length to estimate the gene abundances. These genes are then annotated to metabolic enzymes and further analyzed to quantify the abundances of complete metabolic pathways obtained from MetaCyc (Caspi et al., 2016). HUMAnN2 assigns a coverage and abundance score for each pathway in each sample based on the detection of all its constituent genes. The coverage and abundance scores represent the number and abundance of complete copies of the pathway in each sample. Partial Spearman correlation analyses were used to test the associations between the pathway abundances and BMD.

Fecal Metabolite Imputation
The Model-based Genomically Informed High-dimensional Predictor of Microbial Community Metabolic Profiles (MelonnPan) approach (Mallick et al., 2019) was applied to predict the abundances of fecal metabolites from the microbiome gene abundances estimated by HUMAnN2. Elastic net prediction models were trained to select a sparse set of microbiome genes that are predictive for each fecal metabolite based on an independent set of 155 reference subjects for which both metagenomic and metabolomic profiling of the stool samples were both available . The fecal metabolite concentrations were then imputed as a linear combination of the microbiota gene abundances with weights learned from the training set. We retained the well predicted fecal metabolites, which had at least a moderate correlation (Spearman r >0.3) between the observed and imputed metabolite abundances in the training sample, as previously detailed (Mallick et al., 2019).

Metabolite Association Analysis
Partial least squares regression (PLS) is a multivariate approach which combines aspects of principal component analysis (PCA) and linear regression (Rohart et al., 2017). The principle is to extract a set of orthogonal components that have large covariance with the phenotype. PLS is well suited for the metabolomics analysis due to the high degree of correlation between functionally related metabolites (i.e., metabolites involved in the same metabolic pathways). A variable importance in projection (VIP) score is used to summarize the contribution of each feature to the model, which is computed as a weighted sum of the squared correlations between the PLS components and phenotype. Metabolites with VIP ≥2.0 were considered important for the phenotype. As a complementary approach, all metabolites were also individually tested using linear regression.

Coinertia Analysis
The global similarity between the gut microbiome and serum metabolome was investigated using coinertia analysis, which identifies successive axes of covariance between two datasets measured on a single group of subjects (Dray and Dufour, 2007). Principal coordinate analysis (PCoA) with Bray Curtis distance and PCA were applied to the microbiome and metabolite profiles, respectively, and the ordinations were used as input for the coinertia analysis. The coinertia analysis produces an RV coefficient, which is a multivariate extension of the squared Pearson correlation coefficient computed as

Supervised Multi-Omics Feature Selection
Canonical correlation analysis (CCA) has previously been proposed as a promising approach for performing integration analysis (Parkhomenko et al., 2009). Assuming two different data modalities measured on the same subjects, CCA seeks weighted linear combinations of the features from each dataset that have large correlation. However, the conventional CCA model assigns nonzero weights to every feature, which can result in overfitting for high dimensional data, and CCA is traditionally unsupervised since it does not take the phenotype information into consideration. The overfitting issue can be addressed by introducing a sparsity penalty into the CCA model, which allows for the incorporation of feature selection. The sparse CCA model can then be further extended to be supervised (sCCA), such that the selected features are correlated across omics modalities with importance for a quantitative phenotype (Parkhomenko et al., 2009). The sCCA model is expressed as, u T X T Yv subject to ‖ u ‖ 2 ≤ 1, ‖ v ‖ 2 ≤ 1, P 1 (u) ≤ c 1 , P 2 (v) ≤ c 2 , u j = 0 ∀ j ∉ Q 1 , v j = 0 ∀ j ∉ Q 2 . X and Y denote the paired multi-omics datasets, u and v are the canonical vectors containing the weights for each feature, and Xu and Yv, taken to be the weighted linear combinations of features within each subject, are the canonical scores. The P 1 and P 2 represent lasso penalty functions on the canonical variates, and the resulting u and v are sparse for c 1 and c 2 sufficiently small. Q 1 and Q 2 denote subsets of features in X and Y that have large univariate correlation with the phenotype, and features that are not strongly associated with the phenotype are automatically assigned zero weights. The optimal tuning parameters for the model were selected by 10-fold cross validation.

Inter-Omics Network Analysis
The sCCA selected microbes and metabolites were used as input to construct the inter-omics Gaussian graphical model (GGM), where the edges represent partial correlations between features. The optimal GGM was selected by minimizing the extended Bayesian information criterion (EBIC) of unregularized GGM models (Foygel and Drton, 2010). We first selected the top 100 models by estimating a sparse inverse covariance matrix along a path of regularization parameters using the graph lasso penalty to select the significant edges. Each of these models was refit without regularization, and the model with the smallest EBIC was chosen as the optimal network.

Sample Characteristics
The sample consisted of 499 peri-and early post-menopausal Chinese women that provided both stool and blood samples for metagenomic and metabolomic profiling ( Table 1). 84% of these women were classified as postmenopausal (i.e., >1 year since final menstrual period), while 16% were still in the perimenopause transition period. The average time since menopause was 2.0 years (SD = 1.0), corresponding to the life stage when women typically begin to experience rapid bone loss (Melton et al., 1992). On average, the subjects were 53.0 years old (SD = 2.9) with body mass index (BMI) of 23.0 kg/m 2 (SD = 2.9) and reported exercising approximately once per week (SD = 0.8). 62% of the women had undetectable estradiol levels (<18.35 pmol/L), an indicator of menopause, and the average level of FSH was 76.2 mIU/ml (SD = 32.2). BMI, exercise, time since menopause, estradiol, and FSH had significant bivariate associations with BMD (p-values <0.05).

Microbiome Association Analyses
After shotgun metagenomic sequencing of the stool DNA samples, we obtained approximately 7.35 giga base pairs of sequencing data per subject. Among >10,000 microbial features, there were 672 species with an average relative abundance >0.01%, which accounted for approximately 96% of the total microbiome across all subjects. 59.2% of these taxa belong to the Firmicutes phylum, 31.5% to Bacteroidetes, 6.1% to Proteobacteria, 2.2% to Actinobacteria, 0.5% to Fusobacteria, and 0.5% to Verrucomicrobia. On average, the Bacteroidetes and Firmicutes phyla accounted for 50% and 45% of the microbiome composition, respectively.
The SparCC approach (Friedman and Alm, 2012) was applied to explore the strength of relationships between the microbiota. SparCC accounts for the compositional nature of the data by approximating the correlations between the log-ratio transformed relative abundances. We observed that 8,697 pairs of taxa had strong positive correlations (r's > 0.5 and p-values < 0.001), while 898 had strong negative correlations (r's < -0.5 and p-values < 0.001). These relationships were visualized by the microbiome co-occurrence network (Figure 2). The underlying causes of these microbial interactions are complex. While mutualistic and phylogenetically related bacteria may sometimes co-occur, this is not always the case. Similarly, microbes with antagonistic relationships, such as those competing for the same niche, may sometimes have inverse associations, while in other circumstances they may actually cooccur due to variation in their shared environment (Levy and Borenstein, 2013). Co-exclusions can also arise due to incompatible abiotic factors in the microbiome community (Weiss et al., 2016). There were 44 taxa identified for potential association with BMD in the initial feature selection by the constrained elastic net regression model, including 22 which also had p-values <0.05 when tested individually using partial Spearman correlation analysis adjusted for relevant covariates ( Table 2). Among the putative BMD associated microbes, 9 had FDR <0.05, and the remaining 13 had FDR <0.1. Several of the identified species including Bacteroides vulgatus, Bacteroides uniformis, Bacteroides fragilis, and Bacteroides massiliensis, all of which were negatively associated with BMD, were among the most abundant species in the microbiome with average relative abundances >1%. On the other hand, Firmicutes microbes, such as Clostridium leptum and Ruminococcus lactaris, were observed to be positively associated with BMD.
Functional profiling of the microbiome yielded pathway abundances for 516 metabolic pathways in the microbial community, all of which involve the microbiota producing metabolic byproducts from the catabolism of dietary components. Partial Spearman correlation analyses identified 22 pathways for putative association with BMD at a threshold of p-value <0.05 (Table 3). However, due to the number of pathways tested and the modest effect sizes, none of the pathway associations remained significant after multiple testing correction (FDR >0.2).
Predictive metabolomic profiling was performed to impute the fecal metabolite profiles based on the gene abundances in the microbiome communities. Among 80 predicted intestinal metabolites, 3 were identified for potential association with BMD based on VIP ≥2.0 in PLS, and 17 had p-values <0.05 with FDR of 0.2 when individually tested using linear regression ( Table 4). Several of these compounds including butyrate, propionate, and valeric acid are short chain fatty acids (SCFAs), a special class of microbial byproducts that play an important role in gut and metabolic health (Blaak et al., 2020).

Serum Metabolite Association Analyses
Based on LC-MS untargeted metabolomics profiling of the serum samples, 3,202 unique metabolite features were identified in positive ion mode and 2,674 were detected in negative ion mode. Among the unique metabolite features, 381 had putatively confirmed identities. There were 12 serum metabolites identified for potential association with BMD based on VIP ≥2.0 in PLS, and 13 which had p-values <0.05 when tested individually by linear regression ( Table 5). 8 serum metabolites were detected by both approaches, but none of the identified metabolites remained significant after the multiple testing correction (FDR >0.2). Notably, several putative BMD associated serum metabolites including 3-phenylpropanoic acid, which is primarily derived from the degradation of plant polyphenols (Trost et al., 2018), and glycolithocholic acid (Taylor and Green, 2018), a secondary bile acid, are intricately linked with the microbiota. While both these compounds were imputed in the fecal metabolite analysis, no significant associations were observed for the predicted fecal abundances.
The metabolite and microbiome canonical scores, taken to be the weighted linear combination of features within each subject, were correlated with each other (r = 0.45, p-value < 0.001) and with BMD (b adj = 0.09 and 0.03 with p-values = 0.002 and 0.03, respectively), demonstrating the effectiveness of the supervised integrative feature selection. Based on the magnitude of the canonical loadings, which represent the contributions of each feature to the inter-omics relationship, the most important metabolites in the inter-omics relationship with respect to BMD were 3-phenylpropanoic acid and glycolithocholic acid, while the bacteria with the largest loadings were Fusobacterium ulcerans and Bacteroides fragilis, each of which were also detected in the single omics association analyses. The relationships between the abundances of these features and BMD after adjustment for covariates were visualized by added variable partial regression plots (Figure 4).
The GGM ( Figure 5) had an edge density of 0.22, which represents the ratio of the number of edges and the number of possible edges, and a transitivity of 0.46, which is defined as the probability that adjacent nodes of a given node are connected. Fusobacterium ulcerans was positively connected to deoxycholic acid and negatively connected to both 3-phenylpropanoic acid and glycolithocholic acid. Bacteroides fragilis was negatively connected to glycolithocholic acid, while Bacteroides ovatus was negatively connected to 3-phenylpropanoic acid. Alpha-D- glucose and deoxycholic acid were positively and negatively connected to BMD, respectively.

DISCUSSION
In this systematic multi-omics analysis of a relatively large sample of peri-and early postmenopausal Chinese women, we characterized the microbiota, serum metabolites, and possible crosstalk between these biological factors that may influence bone physiology. To our knowledge, this is one of the first reports to integrate paired metagenomic and metabolomic profiles to provide novel insights into the molecular mechanisms of skeletal remodeling. The findings, although biologically plausible, still require replication and functional validation in future studies. Many of the putative BMD associated microbes belong to the Bacteroides genus, which were inversely associated with bone mass. Bacteroides vulgatus and Bacteroides fragilis, which were identified in both the single omics and integrative analyses, have previously been reported to induce activation of the proinflammatory NF-kb signaling pathway, which is associated with bone loss (Kim et al., 2002). Additionally, Bacteroides  vulgatus was recently shown to increase serum levels of the bone resorption marker CTX-1 and decrease serum levels of the bone formation marker osteocalcin in vivo in an ovariectomized (OVX) mouse model (Lin et al., 2020). On the other hand, we observed a positive effect for the Firmicutes microbe Clostridium leptum, a probiotic species that is known to be an important producer of beneficial metabolic byproducts such as butyrate (Canani et al., 2012). We further observed a negative effect of Fusobacterium ulcerans, which also played an important role in the multi-omics integration analysis. Fusobacteria have been shown to promote M1 macrophage production via AKT2 signaling (Liu et al., 2019), which induces inflammation and has been associated with the development of osteoporosis (Yang and Yang, 2019).
To investigate the potential mechanisms by which the microbiota may influence BMD, we profiled the abundances of metabolic pathways in the microbial community and assessed their associations with BMD variation. Although the pathway associations were not significant at a stringent threshold accounting for multiple testing, several were still interesting due to their known roles in bone metabolism. We observed a positive association between BMD and several glycolytic pathways, such as glycogen biosynthesis/degradation, which are essential for cellular energy (Adeva-Andany et al., 2016). We further identified a negative association for urate biosynthesis, and it has been reported that uric acid induces intracellular oxidative stress and inflammatory cytokines that stimulate bone resorption by osteoclasts and inhibit bone formation by osteoblasts (Austermann et al., 2019). Lastly, we detected a positive association for L-methionine biosynthesis, an amino acid which has been shown to down-regulate NF-kb signaling in osteoclast precursors to reduce bone loss (Vijayan et al., 2014).
Since microbial metabolites contribute to host-microbiome interactions (Rooks and Garrett, 2016), we performed metabolomic imputation to predict the intestinal metabolite profiles based on the observed genes in the microbiome community. Similar techniques are frequently used in genetic association studies to impute the gene expression levels based on genotype information (Gamazon et al., 2015). We identified positive associations between BMD and several SCFAs, which are exclusively produced by the microbiota through the breakdown of non-digestible dietary fiber. The SCFAs are potent signaling molecules that modulate host gene expression by interacting with various epigenetic factors such as DNA methylation and histone acetylation (Alenghat and Artis, 2014). Butyrate and propionate have previously been reported to induce metabolic alterations of osteoclasts that lead to downregulation of crucial genes such as TRAF6 and NFATc1 (Lucas et al., 2018). Butyrate has also been shown to stimulate bone formation through regulatory T cell mediated regulation of Wnt10b expression (Tyagi et al., 2018). Furthermore, valeric acid was recently demonstrated to promote/inhibit osteoblast/ osteoclast differentiation in vitro (Lin et al., 2020). We note that the precision of fecal metabolite imputation by MelonnPan has room for improvement (Yin et al., 2020). Additionally, the prediction models were trained in a different study population (mixed sexes, North American, some with irritable bowel disease), and therefore the accuracy of the imputation in this sample is unknown.
While the fecal metabolites are the most representative of the direct metabolic output of the gut, many of those compounds are excreted from the body without ever having any influence on human health. The serum metabolome, which includes both host and microbiota derived metabolites, provides a window into which gut metabolic byproducts are absorbed into the circulating blood to potentially impact host physiology. We observed that metabolites involved in energy metabolism, such as alpha-Dglucose, were positively associated with BMD. Energy metabolism is critical for bone remodeling, and it has also been demonstrated that there is a feedback loop where bone can act as an endocrine gland by secreting bone specific proteins such as osteocalcin and osteoprotegerin, which can regulate insulin function and glucose metabolism (Faienza et al., 2015). Additionally, we detected the amino acid histidinyl glycine, a conjugation of histidine and glycine. Targeted deletion of histidine decarboxylase in mice, which converts histidine to histamine, was found to increase bone formation and protect against bone loss (Fitzpatrick et al., 2003). Glycine is reported to improve bone health by increasing the production of collagen, which is a major building block of bone (Jennings et al., 2016). Several of the BMD associated serum metabolites are involved in lipid metabolism, and accumulating evidence has demonstrated that alterations in lipid levels are associated with changes in bone metabolism (Tian and Yu, 2015). Lipid metabolism is regulated PPARg, which inhibits the differentiation of osteoblasts and promotes the formation of adipocytes (Wan, 2010). Dodecanoic acid was recently reported to have a causal effect on BMD, which was validated using bone cells cultured in vitro as well as in vivo in an OVX mouse model (Gong et al., 2021). Palmitic acid is a saturated fatty acid which has been shown to increase bone loss by promoting osteoclast survival (Oh et al., 2010). The increase of LysoPC (18:0), a lysophoshpatidylcholine, is indicative of oxidative stress, and LysoPCs have been detected at elevated levels in the serum of osteoporotic mice .
Most notably, the serum metabolite analysis identified several microbiota-linked compounds for association with BMD. 3phenylpropanoic acid, also known as hydrocinnamic acid, is mainly produced by the microbial catabolism of dietary polyphenols, which are acquired from plant-based food sources such as leafy greens, tea/coffee, wheat, berries, fruits, and other vegetables (Trost et al., 2018). Dietary polyphenols have been reported to reduce the risk of various age-related diseases and FIGURE 3 | Correlation heatmap of sCCA selected microbes (rows) and metabolites (columns) that are correlated with each other and with BMD. Canonical loadings are provided for each feature, which represent the contributions to the inter-omics relationship. Positive correlations are represented by blue, negative correlations are shown in red, and intensity of the color represents the strength of association.
have a lot of promise for protecting against bone loss due to their antioxidant properties (Sato et al., 2011). One study found that phenolic acids in the serum of rats fed a blueberry diet stimulated osteoblast differentiation, resulting in elevated bone mass (Chen et al., 2010). Additionally, in vitro experiments with phenolic acids using bone marrow stroma cells demonstrated stimulation of osteoblast differentiation and inhibition of adipogenesis (Chen and Anderson, 2014). Plant polyphenols are also an established source of Hippuric acid, which was recently observed to inhibit osteoclast formation in vitro (Zhao et al., 2020). On the other hand, Hippuric acid can also be derived from aromatic organic acids such as phenylalanine and tryptophan (Wikoff et al., 2009).
Glycolithocholic acid and deoxycholic acid are secondary bile acids that are formed when primary bile acids produced by the liver, such as cholic acid and chenodeoxycholate, enter into the intestine via the bile duct and are acted on by the microbiota (Taylor and Green, 2018). It has previously been reported that bile acids are essential for the intestinal absorption of lipids and lipid-soluble compounds such as vitamin D (Nehring et al., 2007), and abnormal bile acid turnover has been linked with osteoporosis in postmenopausal women (Hanly et al., 2013). Lithocholic acid and deoxycholic acid have been shown to enhance and reduce calcium absorption, respectively (Marchionatti et al., 2018). Additionally, a growing body of evidence suggests that bile acids may regulate skeletal remodeling processes through direct interactions with osteoblasts and osteoclasts (Cho et al., 2013).
The network analysis revealed several inter-omics relationships that could potentially play a role in the regulation of BMD. First, we observed a positive connection between Fusobacterium ulcerans and deoxycholic acid. Sulfate esterification of bile acids in the liver enhances their excretion (Alnouti, 2009), and Fusobacteria have been reported to be involved in desulfatation, which keeps them in circulation (Jia et al., 2018). Second, we observed a negative connection between Fusobacterium ulcerans and 3-phenylpropanoic acid. It has previously been reported that polyphenol compounds have antimicrobial properties that inhibit the growth and virulence of Fusobacteria (Ben Lagha et al., 2017). Third, we observed a negative relationship between Bacteroides fragilis and glycolithocholic acid. Bacteroides are reported to promote deconjugation of bile acids, and individuals with higher abundance of Bacteroides have been shown to have lower plasma levels of secondary bile acid metabolites (Gu et al., 2017). Lastly, we observed a negative connection between 3-phenylpropanoic acid and Bacteroides ovatus. Previous studies The plots illustrate the relationships between the abundance of a given microbe/metabolite and BMD after adjustment for age, BMI, exercise, years since menopause, FSH, and estradiol. The x-axes correspond to the residuals from regressing the microbe/metabolite on the covariates. The y-axes correspond to the residuals from regressing the phenotype on the covariates. The blue line represents the line of best fit from linear regression, and the corresponding confidence interval is shown in gray.
have demonstrated that dietary polyphenols can inhibit the growth of Bacteroides microbes (Cardona et al., 2013). However, further analyses are needed to determine the precise mechanisms of these relationships and how they may be relevant for bone physiology.
Despite the novelty of this study in the bone field, there are several limitations that should be taken into consideration. First, among thousands of unique metabolite features, we were only able to produce high confidence annotations for a relatively small number, and there could be important compounds in the serum, especially gut metabolites, that were ignored. Second, we have only considered linear relationships between the molecular features. In the future, this could be addressed through nonlinear extensions of the applied methods, which may capture complex statistical dependencies that conventional correlation approaches fail to detect (Szekely et al., 2007). Third, we can only speculate about the directionality of the inter-omics relationships observed in the network based on previously reported findings. Lastly, it is unclear if the findings can be generalized to other populations. Different ethnicities have vastly different diets, and metabolomic patterns are influenced by various factors including age, genetics, and menopause (Auro et al., 2014).
In summary, we conducted a comprehensive multi-omics integration analysis and provided novel insights into the interactions between the gut microbiome and serum FIGURE 5 | Inter-omics Gaussian graphical model for sCCA selected features. The edges represent partial correlations, and significant edges were selected by the graph lasso penalty. Blue/red nodes correspond to microbe/metabolite features, BMD is shown in purple, and green/red edges correspond to positive/negative partial correlations. The edge width indicates the strength of association. metabolome that may be relevant for the regulation of BMD. We hope that these findings will stimulate future studies to further explore the relationships between the microbiome and host omics factors that may be involved in bone health.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ebi.ac. uk/metagenomics/, PRJEB50761.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Medical Ethics Committee of Southern Medical University. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
JG conducted the data analysis and prepared the manuscript. XL, K-JS, and RG contributed to the data analysis plan. HS contributed to the manuscript revision. JS and H-MX managed and directed the study components conducted within their respective institutions. H-WD conceived, designed, and directed the whole project. All authors contributed to the article and approved the submitted version.