Novel insight into the genetic signatures of altitude adaptation related body composition in Tibetans

Background: The Tibetan population residing in high-altitude (HA) regions has adapted to extreme hypoxic environments. However, the genetic basis of body composition in Tibetans adapted to HA remains poorly understood. Methods: We performed a genome-wide association study (GWAS) to identify genetic variants associated with HA and HA-related body composition traits. A total of 755,731 single nucleotide polymorphisms (SNPs) were genotyped in 996 Tibetan college students using the precision medicine diversity array. T-tests and Pearson correlation analysis were used to estimate the association between body composition and altitude. Mixed linear regression was used to identify SNPs significantly associated with HA and HA-related body composition traits. LASSO regression was used to screen for important SNPs for HA and body composition. Results: Lean body mass (LBM), muscle mass (MM), total body water (TBW), standard weight (SBW), basal metabolic rate (BMR), total protein (TP), and total inorganic salt (Is) differed significantly across altitude strata. We identified three SNPs in EPAS1 (rs1562453, rs7589621 and rs7583392) that were significantly associated with HA (p < 5 × 10⁻⁷). In the GWAS of the 7 HA-related body composition traits, we identified 14 SNPs for LBM, 11 SNPs for TBW, 15 SNPs for MM, 16 SNPs for SBW, 9 SNPs for BMR, 12 SNPs for TP, and 26 SNPs for Is (p < 5.0 × 10⁻⁵). Conclusion: These findings provide insight into the genetic basis of body composition in Tibetan college students adapted to HA, and lay the foundation for further investigation into the molecular mechanisms underlying HA adaptation.


Introduction
High altitude (HA), defined as more than 2,500 m (approximately 8,200 feet) above sea level (mASL), is typically characterized by low atmospheric pressure, low oxygen levels, and low temperatures. Short-term exposure to hypoxic HA environments can lead to acute altitude sickness, with symptoms including high altitude pulmonary edema (HAPE) and cerebral edema (1). Long-term exposure to these environments can also increase the risk of pregnancy complications, such as preeclampsia (2). However, genetic adaptation to a new environment is a fundamental process in species survival. Populations that have resided at HA for generations have experienced selective pressures and undergone physiological and genetic adaptations to thrive in hypoxic environments. Under hypoxic exposure, an animal's homeostatic system responds to changes in oxygen concentration, a response that is crucial for survival (3). Tibetans have lived at very high altitudes for thousands of years, and they possess distinctive physiological traits that enable them to adapt to the HA environment (4).
Body composition refers to the components of the human body, such as water, muscle, fat, and inorganic salts, and their percentages of total body mass. Body composition is a crucial factor in human health and has gained increasing attention in recent years. Systematic reviews and meta-analyses have reported significant reductions in body weight, fat mass (FM), fat free mass (FFM), and lean body mass (LBM) in individuals exposed to HA (5,6). In healthy indigenous populations living on the Qinghai-Tibet Plateau, protein mass, bone mass (BM), FM and body water values decrease with increasing altitude (7). Sympathetic activity is reduced during prolonged exposure to HA, resulting in a decrease in basal metabolic rate (BMR) (6). Additionally, exposure to the hypoxic HA environment can lead to serious loss of muscle mass (MM), which results in skeletal muscle atrophy (8). Individuals living at sea level tend to be taller and heavier, and to have a higher body mass index (BMI) and waist circumference (WC), than those living at HA (9). Previous studies have shown that growth indicators, such as height, weight, chest circumference, and WC, differ among Chinese Tibetan adolescents at different altitudes (10). However, there is a lack of systematic research on differences in body composition among Tibetan college students at different altitudes. Understanding these differences can therefore provide valuable insights into the physiological mechanisms of HA adaptation.
Over the past decade, numerous studies, including more recent genomic association analyses, have provided evidence for the genetic basis of these physiological changes (11,12). A large-scale genome-wide association study (GWAS) of 3,008 Tibetans and 7,287 non-Tibetan individuals of Eastern Asian ancestry identified genetic signals of HA adaptation at nine genomic loci, seven of which were reported as unique (13). Additionally, many GWASs in various populations have examined body composition traits such as energy expenditure, LBM, WC, waist-hip ratio (WHR), height, BMI, and body fat (14–19). However, there have been no reports of systematic GWASs of body composition in Tibetan college students.
In this study, we aim to conduct a GWAS on 996 Tibetan college students from different HA areas in Tibet to identify genetic variants associated with HA and HA-related body composition indicators, including body fat ratio (BFR), body fat, LBM, total body water (TBW), MM, BMI, obesity, standard weight (SBW), WHR, BMR, total energy expenditure (TEE), impedance (IM), total protein (TP), and total inorganic salt (Is). These findings will contribute to a deeper understanding of the genetic basis of body composition in Tibetan college students and help reveal the molecular mechanisms of HA adaptation.

Participants
This study recruited a total of 996 students (545 females and 451 males, aged 16-25 years) from among Tibetan freshmen in the classes of 2019 and 2020 at Xizang Minzu University. All participants were healthy Tibetan college students from different altitudes in Tibet, and all had lived in the region for at least three generations (Figure 1). Demographic information, including gender, age, population, and residential history, was collected through a questionnaire survey. We then used websites to look up the altitude, air pressure, latitude, and longitude of each participant's habitual residence. Participants whose habitual residence was below 1,500 m or above 5,500 m above sea level were excluded. A peripheral venous blood sample (5 mL) was collected from each subject in an EDTA anticoagulant tube and stored at −80°C for future use. This study was approved by the Ethics Committee of the Medical College of Xizang Minzu University (No. 20180-18) and was conducted in accordance with the Declaration of Helsinki. All participants signed written informed consent.

Body composition traits detection
All subjects' weight, height, blood pressure and WC were measured on an empty stomach in the morning. The InBody720 body composition analyzer was used to measure the body composition indicators, including BMI, BFR, body fat, LBM, TBW, MM, obesity, SBW, WHR, BMR, TEE, IM, TP, and Is.

Genotyping and quality control
DNA was extracted from peripheral blood using the GoldMag DNA Extraction Kit (GoldMag Co. Ltd.). The concentration and purity of the DNA were then determined using a NanoDrop 2000 spectrophotometer (Thermo Scientific). Genomic DNA samples were genotyped on the Precision Medicine Diversity Array (PMDA) high-throughput chip, which was processed using GeneTitan multichannel instruments. All genotyping data were analyzed using Applied Biosystems™ Axiom™ software. A total of 755,731 single nucleotide polymorphisms (SNPs) were genotyped in

LASSO regression screening of HA-associated SNPs
The least absolute shrinkage and selection operator (LASSO) is a regression-based approach that adds a penalty term when estimating regression coefficients, which it does by maximizing the penalized log-likelihood function. The main idea behind LASSO is to minimize the residual sum of squares subject to a constraint that the sum of the absolute values of the regression coefficients is less than a constant. This constraint shrinks some regression coefficients to exactly zero, so that the method also performs variable selection. In our study, all SNP variables were treated as independent variables, while HA and body composition traits served as dependent variables. To determine the appropriate tuning parameter (λ) for LASSO logistic regression, we used internal validation by 10-fold cross-validation, with the minimum criterion and the 1-SE of the minimum criterion.
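The shrinkage behaviour described above can be illustrated with a minimal sketch, which is not the authors' pipeline. It assumes a continuous trait, centred predictors, and the squared-error form of the LASSO objective, (1/2n)·RSS + λ·Σ|β_j|, solved by cyclic coordinate descent. The function names and the simulated genotypes (coded 0/1/2, with only the first SNP affecting the trait) are hypothetical.

```python
import random


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; anything within [-gamma, gamma] becomes exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n) * RSS + lam * sum(|beta_j|) by cyclic coordinate descent.

    X: list of n rows, each a list of p centred predictors.
    y: list of n centred responses (so no intercept is fitted).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant predictor carries no information
            # Covariance of predictor j with the residual once beta_j is removed
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


def simulate_genotypes(n=200, p=6, seed=1):
    """Hypothetical data: p SNPs coded 0/1/2; only SNP 0 affects the trait."""
    rng = random.Random(seed)
    G = [[rng.choice([0, 1, 2]) for _ in range(p)] for _ in range(n)]
    y = [1.5 * row[0] + rng.gauss(0.0, 1.0) for row in G]
    col_means = [sum(row[j] for row in G) / n for j in range(p)]
    y_mean = sum(y) / n
    X = [[row[j] - col_means[j] for j in range(p)] for row in G]
    return X, [v - y_mean for v in y]


X, y = simulate_genotypes()
for lam in (0.0, 0.3):
    beta = lasso_coordinate_descent(X, y, lam)
    print(lam, [round(b, 3) for b in beta])
```

With λ = 0 the fit reduces to ordinary least squares, whereas a positive λ sets to exactly zero every coefficient whose covariance with the residual stays below the penalty. This is the mechanism by which LASSO screens SNPs.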

Statistical analysis
After the data were organized in Microsoft Excel, statistical processing was performed using SPSS 25.0. The measurement data were normally distributed, and the results are expressed as mean ± standard deviation (X ± S). One-way analysis of variance (ANOVA) was used to evaluate the statistical significance of differences among multiple groups. Pearson correlation analysis was employed to assess the strength and direction of the linear association between body composition and altitude. The correlation coefficient (r) is a real number ranging from −1 to +1. A coefficient between −1 and 0 with a p-value < 0.05 indicates a negative correlation between variables. Conversely, a coefficient between 0 and 1 with a p-value < 0.05 indicates a positive correlation. If p > 0.05, there is no statistically significant correlation between the variables. We used mixed linear regression under an additive genetic model in Golden Helix SNP & Variation Suite software (version 8.7) to identify SNPs significantly associated with HA and body composition traits. The threshold for significance was set at p < 5.0 × 10⁻⁵. Manhattan plots were constructed to visualize the genome-wide association results for altitude and body composition traits. Quantile-quantile (Q-Q) plots were used to assess whether the observed p-values departed from the distribution expected under the null hypothesis. Additionally, the genomic inflation factor (λ) was calculated to compare the distribution of the test statistics across the genome with the expected null distribution. Regional plots for top SNPs were created using LocusZoom (http://locuszoom.org/).

Figure caption: Scatter plot of sample distribution.

Results

Descriptive characteristics of the subjects
A total of 996 Tibetan college students were recruited. Of these, 2 were excluded because they resided at elevations below 1,500 m or above 5,500 m. The included participants consisted of 449 male students (45.17%) and 545 female students (54.83%). The number of samples collected for each body composition indicator can be found in Supplementary Table S1. Table 1 presents the mean ± standard deviation of the 14 body composition indicators across altitude strata. Figure 2 displays the t-test results comparing the altitude groups. The results indicated significant differences (p < 0.05) in LBM, TBW, MM, SBW, BMR, TP, and Is between the altitude strata of 2,500-3,500 m and 4,500-5,500 m. Notably, these seven indicators were highest at 2,500-3,500 m above sea level and tended to decrease with increasing altitude.

SNPs associated with HA
The Q-Q plot of the GWAS results for HA showed a genomic inflation factor (λ) of 1.007 (Figure 3A), indicating no significant systematic bias in the association results. Furthermore, the Manhattan plot (Figure 3B) showed that SNPs located in the EPAS1 gene (a member of the HIF gene family) on chromosome 2p21 exhibited the strongest association with HA. In the GWAS of HA, we identified 39 SNPs that were significantly associated with HA (p < 5.0 × 10⁻⁵), as presented in Table 2. Notably, a consistent region on chromosome 2, including six significant SNPs in the EPAS1 gene (rs4953342, rs1562453, rs7589621, rs1992846, rs12467821, and rs7583392), was found to be associated with HA. Of particular interest, rs7583392 in EPAS1 (p = 2.07 × 10⁻⁸) and rs72949528 in TENM4 (p = 1.43 × 10⁻⁸) surpassed the genome-wide significance threshold of 5.0 × 10⁻⁸ (Figure 3C).
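For readers unfamiliar with the inflation statistic reported above, the following is a minimal sketch of one conventional way to compute it from GWAS p-values: the median observed 1-df chi-square statistic divided by its expected median under the null. This definition is an assumption, and the software used in the study may compute it differently. The function name and the example p-values are hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation_factor(p_values):
    """Median observed 1-df chi-square statistic divided by its expected null median.

    p_values: iterable of two-sided association p-values, each in (0, 1].
    """
    std_normal = NormalDist()
    # A two-sided p-value corresponds to a 1-df chi-square statistic z**2,
    # where z is the standard normal quantile at p / 2.
    chi2_stats = [std_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    expected_median = std_normal.inv_cdf(0.75) ** 2  # roughly 0.455
    return median(chi2_stats) / expected_median


# Hypothetical inputs: evenly spaced p-values mimic a well-calibrated null,
# while squaring them mimics systematically inflated test statistics.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
inflated = [p * p for p in null_like]
print(round(genomic_inflation_factor(null_like), 3))
print(round(genomic_inflation_factor(inflated), 3))
```

A value close to 1, such as the 1.007 reported here, is usually read as little evidence of inflation from population stratification or cryptic relatedness.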

Further SNP screening by LASSO
LASSO was employed to further screen the most significant loci associated with HA and the 7 HA-related body composition indicators. The optimal lambda (λ) parameters in the LASSO regression model were selected through 10-fold cross-validation. The LASSO coefficient profiles of the SNPs with non-zero coefficients were determined at the optimal λ (Supplementary Figure S2). The cross-validation diagram contains two dashed lines: one marks the λ with the minimum mean squared error, and the other marks the λ at one standard error of the minimum (the 1-SE criterion). We took the geometric mean of the two as the λ value. As shown in Figure 5, LASSO regression reduced the 39 SNPs associated with HA to 29 when λ = 12.65. When λ = 10.04, the 9 SNPs associated with BMR were reduced to 6. When λ = 0.018, the number of SNPs associated with Is decreased from 26 to 22. When λ = 0.31, the number of SNPs associated with LBM decreased from 14 to 13. When λ = 0.36, the number of SNPs associated with MM decreased from 15 to 11. When λ = 0.22, the number of SNPs associated with SBW decreased from 16 to 10. When λ = 0.40, the number of SNPs associated with TBW decreased from 11 to 8 (Supplementary Table S2).
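The λ selection described above can be sketched as follows, under the assumption that the two dashed lines correspond to the minimum-error λ and the 1-SE λ named in the Methods. The function name and the cross-validation curve are hypothetical and are not taken from the study's data.

```python
import math


def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se, geometric mean) from a cross-validation curve.

    lambdas: candidate penalty values.
    cv_mean: mean cross-validated error at each candidate.
    cv_se:   standard error of that mean at each candidate.
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    lambda_min = lambdas[i_min]
    limit = cv_mean[i_min] + cv_se[i_min]
    # 1-SE rule: the largest penalty whose error stays within one SE of the minimum
    lambda_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= limit)
    return lambda_min, lambda_1se, math.sqrt(lambda_min * lambda_1se)


# Hypothetical 10-fold cross-validation curve
lambdas = [0.01, 0.05, 0.1, 0.2, 0.4, 0.8]
cv_mean = [1.30, 1.10, 1.00, 1.04, 1.12, 1.50]
cv_se = [0.06] * 6

lam_min, lam_1se, lam_geo = select_lambdas(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se, round(lam_geo, 4))
```

The geometric mean lies between the two candidates on the logarithmic scale on which λ grids are usually spaced, so it gives a penalty stricter than the minimum-error choice but more permissive than the 1-SE choice.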

Discussion
To adapt to the extreme hypoxic environment of the plateau, the Indigenous people of Tibet have developed a markedly distinct set of physiological characteristics. Prolonged hypoxia can affect body composition, for example by reducing body weight, fat free mass (FFM), MM, and TBW. This study aims to provide a preliminary basis for understanding the role of genetic factors in HA-related body composition changes in Tibetan populations adapted to the HA environment. Among the 279,608 imputed SNPs and 14 body composition phenotypes investigated, we found that 39 SNPs were significantly associated with HA and 103 SNPs were significantly associated with 7 HA-related body composition phenotypes (LBM, TBW, MM, SBW, BMR, TP, and Is) (p < 5 × 10⁻⁵). Of these, 14 SNPs were located in genes with known functions, helping to explain the genetic and physiological mechanisms that lead to changes in body composition in HA populations.

Figure caption: Manhattan plots showing the association of all SNPs with body composition traits. SNPs are plotted on the x-axis according to their position on each chromosome, against their association with each trait on the y-axis (shown as −log10 p-value). The red dashed line shows the genome-wide significance threshold at a p-value of 5E-05. BFR, body fat ratio; BFT, body fat; LBM, lean body mass; TBW, total body water; BMI, body mass index; SBW, standard weight; WHR, waist to hip ratio; BMR, basal metabolic rate; TEE, total energy expenditure; IM, impedance; TP, protein; Is, inorganic salt.

In this study, we found a negative correlation between 9 body composition indicators (BFT, LBM, TBW, MM, SBW, BMR, TEE, TP, and Is) and HA, and 7 indicators (LBM, TBW, MM, SBW, BMR, TP, and Is) differed significantly across altitude strata in the Tibetan college students. Similarly, these indicators have previously been reported to decrease with increasing altitude (7,8,20). In addition, an inverse association between obesity and altitude has previously been reported (21). Although the associations of BMI, WHR, and obesity with HA were not significant in this study, these indicators showed a negative correlation trend with altitude.
We identified 39 SNPs related to HA, 6 of which (rs4953342, rs1562453, rs7589621, rs1992846, rs12467821, and rs7583392) were located in the EPAS1 gene on chromosome 2. The EPAS1 gene encodes hypoxia inducible factor 2α (HIF-2α), a transcription factor involved in the induction of oxygen-regulated genes when oxygen levels decline. As one of the major genes in the HIF pathway, EPAS1 has been reported as the most important candidate gene for HA adaptation (22,23). Adaptive mutations in EPAS1 may serve as an adaptive strategy in HA indigenous peoples. Bhandari et al. (24) showed that individuals carrying the derived alleles of rs12467821 in EPAS1 had lower hemoglobin levels than wild-type allele carriers among Tibetans and Sherpas. The SNP rs1562453 has been confirmed to be associated with susceptibility to high altitude pulmonary hypertension (HAPH) in the Chinese Han population (25). For rs7589621, a linear-by-linear association test revealed a significant increasing trend in the frequencies of the major G allele and the GG genotype with increasing altitude among native Tibetans (26). In our study, we found a possible involvement of a novel SNP in EPAS1 (rs7583392) that was associated with HA.
Earlier in this study, we observed a significant decrease in LBM as altitude increased. GWAS analysis revealed that rs77267056 in RXRA was associated with LBM. RXRA acts as a transcription factor for various nuclear receptors, including PPARα, and is known to play a crucial role in fatty acid metabolism. Previous research has shown that under hypoxic conditions, the activity of the PPARα/RXRA complex is reduced, leading to a suppression of fatty acid metabolism (27). Prolonged hypoxia can alter DNA methylation patterns. Studies have demonstrated that CpG island methylation in the promoter region of RXRA is lower at HA than at low altitudes, potentially resulting in increased expression (28). RXRA, a member of the retinoic acid receptor family, is essential for normal hematopoietic function during development, and its methylation levels are positively correlated with hemoglobin levels (29). These findings suggest that variants of RXRA may be involved in HA adaptation.
In summary, we have identified several chromosomal regions associated with HA-related body composition, some of which are consistent with previously reported SNPs. It is worth noting that some trait-related SNPs are found in intergenic regions between functional coding genes (30). Studies have shown that a large number of disease-related SNPs are found in poorly conserved non-coding RNAs known as lincRNAs, which have become an area of interest (31). For instance, polymorphisms of LincPINT and Linc00599 have been found to be associated with HAPE susceptibility in the Chinese population (32). However, there are limited reports on SNPs in lincRNAs associated with HA-related body composition. In our study, we discovered that rs10801160 and rs10921285, located 22 kb downstream of lncRNA RP11-139E24.1, are associated with HA-related traits such as LBM, MM, and TP. Currently, there are no reported studies on the adaptation of lncRNA RP11 to hypoxia at HA. Most studies have focused on the role of the lncRNA RP11 gene in cancer. However, it has been discovered that hypoxia-induced lncRNA RP11-367G18.1 regulates hypoxia-induced target genes by regulating the H4K16Ac histone mark. This regulation leads to epithelial-mesenchymal transition, metastasis, and tumorigenicity (33). In our study, the HA-related traits LBM, MM, and TP were significantly associated with two specific variants, rs10801160 and rs10921285, in the lncRNA RP11-139E24.1 gene. Furthermore, previous studies have observed that rs645040 near MSL2 is significantly associated with lipid traits, such as triglycerides (34) and high-density lipoprotein (HDL) cholesterol (35). Moreover, the SNP rs7621025 (STAG1) was identified as a pleiotropic variant for HDL-cholesterol (36). In this study, we identified two loci (rs645040 and rs7621025) associated with BFT levels in the Tibetan college students. However, the loci identified here for the other body composition indicators have not previously been reported, so the results of this study need further validation.

Figure caption: SNP selection using the least absolute shrinkage and selection operator.

Furthermore, the results of this study may have positive implications for public health, especially for people living in high-altitude areas. Firstly, by analyzing genetic variations and body composition indicators related to high-altitude adaptability, we can better understand the physiological adaptation of Tibetan college students to high-altitude environments. This can help in developing personalized health management and prevention measures to improve their quality of life in high-altitude environments. Secondly, the results can provide a scientific basis for public health policies in high-altitude areas. By understanding how different body components behave in high-altitude adaptation, governments and health institutions can develop more effective health policies to meet the special health needs of residents in these areas. Lastly, our results also provide a reference for similar research in other high-altitude areas, supporting the development of global high-altitude health research.
This study has several limitations. Firstly, it only included Tibetan college students from HA areas, which may limit representativeness and restrict the generalizability of the findings to the entire Tibetan population or to populations in other regions. Secondly, genetic heterogeneity between Tibetan populations in HA areas and populations in other regions could affect the relationship between genes and body composition. Thirdly, variations in environmental conditions between HA areas and other regions may influence body composition and introduce confounding factors that complicate the association between genes and body composition. Lastly, GWAS can only identify associations between genes and body composition, and cannot establish specific functional mechanisms. Therefore, further experimental research is warranted to elucidate these associations.
In conclusion, several candidate loci associated with HA and HA-related body composition indicators, such as LBM, TBW, MM, BMR, SBW, TP and Is, were identified. Additionally, some loci or genes were common across these traits, suggesting a shared genetic basis. These loci suggest a strong early genetic adaptation to life at high altitude, followed by the spread of the adapted populations. Further functional studies are required to gain a deeper understanding of the underlying mechanisms.

FIGURE 3
Quantile-quantile plot and Manhattan plot of the association analysis of high altitude. (A) Quantile-quantile plot. (B) Manhattan plot. The red line represents the genome-wide significance threshold p = 5E-05, and the blue line represents the genome-wide suggestiveness threshold p = 5E-06. (C) Regional plots of SNPs with threshold p < 5E-08 for high altitude. The plots were generated using LocusZoom.

TABLE 1
Description of human body composition indicators in different altitude stratification.
BFR, body fat ratio; BFT, body fat; LBM, lean body mass; TBW, total body water; BMI, body mass index; SBW, standard weight; WHR, waist to hip ratio; BMR, basal metabolic rate; TEE, total energy expenditure; IM, impedance; TP, total protein; Is, inorganic salt. p-values were calculated from Student's t-test (two-sided). Bold font and p < 0.05 indicate statistical significance.

TABLE 2
Association between SNPs and altitude from GWAS analysis.

TABLE 3
Results of GWAS analysis of seven altitude-related body composition indicators.

TABLE 4
Results of GWAS analysis of other seven body composition indicators.