A Potential Endurance Algorithm Prediction in the Field of Sports Performance

Sport performance is influenced by several factors, including genetic susceptibility. In recent years, specific single nucleotide polymorphisms have been associated with sport performance; however, because these traits follow polygenic inheritance, their effects should be considered in multivariable prediction systems. The aim of this study was to design a genetic endurance prediction score (GES) and analyze its association with anthropometric, nutritional and sport efficiency variables in a cross-sectional study of fifteen male cyclists. A statistically significant positive relationship was observed between the GES and VO2 maximum (P = 0.033), VO2 at VT1 (P = 0.049) and VO2 at VT2 (P < 0.001). Moreover, additional notable associations between genotype and the anthropometric, nutritional and sport performance variables were found. In addition, an interesting link between the habit of consuming caffeinated beverages and the GES was observed. The outcomes of the present study indicate a potential use of this genetic prediction algorithm in the sports field, which may facilitate the identification of genetically talented athletes, improve their training and food habits, and help improve the physical condition of amateurs.


INTRODUCTION
Athletic performance can be influenced by several factors. These can include extrinsic factors such as the quality of training, the dietary habits, the technology used or the weather conditions (Sacha and Quinn, 2011). Besides, intrinsic factors such as individual genetic profile also play an important role (Peveler and Green, 2010). In fact, the study of the genetic influence on sports performance has become a leading field of research.
Single nucleotide polymorphisms (SNPs) are among the genetic variants implicated in the phenotypic differences that can influence individual physical abilities (Santos et al., 2016). To date, various links between several genetic variants and favorable phenotypes for certain sports have been established (Santos et al., 2016). This suggests that the presence of specific genotypes can predispose an individual to competitive advantages in a particular sport (Zilberman-Schapira et al., 2012).
For example, a meta-analysis has shown that the II genotype of the angiotensin converting enzyme ACE insertion/deletion (I/D) rs4340 is significantly associated with endurance disciplines as compared with the ID and DD genotypes (Ma et al., 2013). Specifically, it is suggested that the I allele implies higher blood flow and sugar utilization by muscles, which facilitates endurance performance (Woods et al., 2000). Additionally, considering the reported difficulties in genotyping ACE I/D, the ACE rs4343 A and G alleles are accepted as equivalents of the ACE rs4340 I and D alleles, respectively, and genotyping rs4343 is considered an alternative method for genotyping the ACE I/D polymorphism (Glenn et al., 2009).
Moreover, the different genotypes of the peroxisome proliferator activated receptor alpha PPARA rs4253778 have also been associated with athletic performance. Specifically, in a meta-analysis carried out by Lopez-Leon et al. (2016), the GG genotype and the G allele were significantly more common in endurance athletes than in controls.
In addition, α-actinin 3 (ACTN3) is a sarcomeric actin-binding protein specifically expressed in fast-twitch myofibers of the skeletal muscle and required for explosive muscle contraction (Wilson et al., 2012). However, the gene encoding this protein also seems to be associated with endurance capacities. In particular, TT rs1815739 carriers were found to be more common among endurance athletes than among sedentary individuals in a sample of 395 Israeli individuals (Eynon et al., 2009). They were also reported to obtain better results in an endurance test (Pasqua et al., 2016), to exhibit a higher proportion of endurance-associated type I myofibers and to prefer to skate long- rather than short-distance races (Ahmetov et al., 2011).
Another gene with a potential association with endurance capacities is aquaporin 1 (AQP1): for rs1049305, C allele carriers were faster in running performances than carriers of the GG genotype (Martinez et al., 2009; Rivera et al., 2011; Saunders et al., 2015).
Moreover, the CC genotype of the peroxisome proliferator activated receptor gamma, coactivator 1 alpha PPARGC1A rs8192678 has been described to be associated with high values of aerobic performance (Akhmetov et al., 2007; Stefan et al., 2007; Eynon et al., 2010b).
Thus, it is clear that there is a genetic influence, but the scientific evidence for most of the reported associations is still weak. Moreover, most studies that associate genetics with physical capacities have focused on individual genes. However, because inheritance is polygenic, various genes can contribute to the overall outcome. For these reasons, we consider it necessary to create more complex prediction algorithms that combine different genetic factors. Therefore, in the present study, we have developed a predictive algorithm of endurance performance including 11 genes. In this report, we present the associations of the predictive algorithm and of each of the SNPs with anthropometric, nutritional and sport performance variables in a group of 15 semi-professional cyclists.

Subjects
Fifteen healthy male members of the Spanish Cycling Federation, aged 40.7 ± 7.0 years, with a body mass index (BMI) of 22.3 ± 2.5 kg/m2 and at least 5 years of experience in national-level competitions, were recruited by the Sports Medicine University Center (Complutense University of Madrid, Madrid, Spain) to participate in the present study. All participants were nonsmokers. This research was conducted according to the guidelines laid down in the Declaration of Helsinki and all procedures involving human subjects were approved by the Research Ethics Committee of the IMDEA Food Foundation (PI-0031). Written informed consent to participate in the study was obtained from all subjects.

Design
This was a cross-sectional clinical study in which volunteers attended two different centers:
1) The Sports Medicine University Center, to carry out a maximal incremental exercise test.
2) The Research Institute on Food and Health Sciences "IMDEA Food" (Madrid, Spain), for anthropometric measurements, body composition analysis, dietary records, DNA collection and genotyping.

Cardiopulmonary Exercise Test
All participants carried out an incremental (30 W/min) exercise test on a cycle ergometer (Cardgirus, Barcelona, Spain) after 2 h of fasting. During the test, the heart rate (HR) was measured using a 12-lead wireless electrocardiograph (Norav, Wiesbaden, Germany). Oxygen uptake (VO2), carbon dioxide output (VCO2) and minute ventilation (VE) were assessed using the Jaeger Oxycon-Pro analyzer (Hoechberg, Germany). The respiratory exchange ratio (RER) was calculated as VCO2/VO2, and the VO2 pulse as VO2/HR (Bergh et al., 2000). Every 2 min, cyclists had to estimate their feelings of exertion and pain using the Borg Rating of Perceived Exertion (RPE) Scale (Borg et al., 1985).
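The two derived quantities above are simple ratios. As a minimal sketch (the function names and the sample values are illustrative assumptions, not taken from the study), they could be computed as follows:

```python
def respiratory_exchange_ratio(vco2: float, vo2: float) -> float:
    """RER = VCO2 / VO2; both arguments must share the same unit (e.g. L/min)."""
    if vo2 <= 0:
        raise ValueError("VO2 must be positive")
    return vco2 / vo2


def oxygen_pulse(vo2_ml_min: float, hr_bpm: float) -> float:
    """VO2 pulse = VO2 / HR, in mL of oxygen per heartbeat."""
    if hr_bpm <= 0:
        raise ValueError("heart rate must be positive")
    return vo2_ml_min / hr_bpm


# Hypothetical sample values, for illustration only.
rer = respiratory_exchange_ratio(vco2=3.3, vo2=3.0)    # about 1.1
pulse = oxygen_pulse(vo2_ml_min=3000.0, hr_bpm=150.0)  # 20.0 mL/beat
```

Keeping the units explicit in the parameter names helps avoid mixing L/min and mL/min when the analyzer exports both.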
The aerobic ventilatory threshold (VT1) was estimated using the criterion of the ventilatory equivalent for VO2 (VE/VO2 ratio), corresponding to the first break in the linearity of the increase in VE. Finally, the anaerobic ventilatory threshold (VT2) was determined from the increase in the ventilatory equivalent for VCO2 (VE/VCO2), corresponding to the second break in the linearity of the increase in VE (Beaver et al., 1986).

Anthropometry and Lifestyle Parameters
Anthropometric measurements were taken while subjects were wearing light clothing and no shoes. Height was assessed to the nearest 0.1 cm using a stadiometer (Leicester-Biological Medical Technology SL, Barcelona). Body weight, fat mass and muscle mass percentages were evaluated using a BF511 Body Composition Monitor (BF511-OMRON Healthcare UK, LT, Kyoto, Japan). Brachial, contracted arm, waist, hip and leg circumferences were measured with an inextensible tape (KaWe Kirchner & Wilhelm GmbH, Asperg, Germany; range 0-150 cm, 1 mm precision). A caliper (Holtain Ltd., Crymych, United Kingdom; 10 g/mm2 constant pressure; range 0-39 mm, 0.1 mm precision) was used for the biceps, triceps, subscapular, abdominal, supraspinal, front thigh and medial calf skinfold measurements. Moreover, the diameters of the femur and humerus were assessed using a small bone caliper (Nonio sliding Bicondyleo, Holtain Ltd., United Kingdom). Systolic and diastolic blood pressures were evaluated using an automatic digital blood pressure monitor Model M3 (OMRON Healthcare UK, LT, Kyoto, Japan) in the right arm, with the participant seated and relaxed. Measurements were taken three times after a 5-min resting period, following World Health Organization (WHO) criteria (Whitworth and Chalmers, 2004). Finally, from the anthropometric data, the BMI was calculated as the body weight divided by the squared height (kg/m2), and the somatotype values (endomorphic, mesomorphic, and ectomorphic) were calculated according to the Heath-Carter method (Carter and Heath, 1990).
The food habits of each participant were recorded using a validated 3-day dietary food record and a food frequency questionnaire (Aguirre-Jaime et al., 2008). Subsequently, the composition of the dietary records was analyzed using the DIAL software (version 2.16, Alce Ingeniería, Madrid, Spain). For the calculation of the Healthy Eating Index score (Guenther et al., 2013), the DIAL program gives a value ranging from 0 to 100 that considers the daily servings of cereals, vegetables, fruits, dairy products, and meat; the percentage of energy provided by total and saturated fats; the amount of cholesterol and sodium per day; and the number of different foods consumed. The final score is classified into five categories: an "excellent diet" (>80 points), a "very good diet" (71-80 points), a "good diet" (61-70 points), an "acceptable diet" (51-60 points), or an "inadequate diet" (0-50 points).
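The five Healthy Eating Index categories listed above amount to a threshold lookup. The sketch below is illustrative only: the function name is hypothetical, and the treatment of non-integer scores (for example, assigning 70.5 to the "very good" band) is an assumption, since the text gives integer ranges.

```python
def classify_hei(score: float) -> str:
    """Map a Healthy Eating Index score (0-100) to its diet category.

    Assumption: bands are treated as half-open intervals so that
    fractional scores fall into exactly one category.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score > 80:
        return "excellent diet"
    if score > 70:
        return "very good diet"
    if score > 60:
        return "good diet"
    if score > 50:
        return "acceptable diet"
    return "inadequate diet"


print(classify_hei(65))  # good diet
```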

Genotyping of the Population
A sample of 500 µL of peripheral capillary blood was drawn from each volunteer for DNA extraction. Genomic DNA was extracted from the cellular fraction using the QIAamp DNA Blood Mini Kit (QIAGEN, Spain). The samples were genotyped with TaqMan Assays on the high-performance QuantStudio Real-Time PCR system (Applied Biosystems, United States).

Statistical Analysis
Data were analyzed using the R Statistical Software, Version 3.4.1. Qualitative data were described as absolute frequencies and percentages, and quantitative data as mean and standard deviation. The Mann-Whitney U test was used to check for significant differences in the continuous variables (not always normally distributed) between the different genotypes. The Spearman correlation coefficient was used to assess the association between the algorithm and the other variables. The Bonferroni correction was also applied to control for type I errors in multiple tests. All the statistical tests were two-tailed. Statistical significance was assumed when P < 0.05.
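The analysis itself was run in R; purely as an illustration of two of the procedures named above, the following standard-library Python sketch computes a Spearman rank correlation coefficient (as the Pearson correlation of tie-averaged ranks) and Bonferroni-adjusted P values. The function names are hypothetical, no study data are used, and P values for the correlation are not computed here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman coefficient as the Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With only fifteen participants, rank-based methods such as these avoid assuming normality, which is consistent with the remark above that the variables were not always normally distributed.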

Selection of SNPs and Design of the GES
An exhaustive literature review of the scientific databases (PubMed, Medline, Web of Science) was carried out to identify all studies that analyzed the relationship between one or more SNPs and sports performance. A selection of 11 SNPs was made considering the European frequencies of each SNP according to the Ensembl database, the scientific evidence for each association and the availability of a TaqMan SNP Genotyping Assay. Among them, nine SNPs were associated with endurance capacities and two SNPs were mainly related to power abilities.
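The exact scoring rule of the GES is not described in this section, so the following is only a hedged sketch of one common way to combine several SNPs into a single score: each genotype contributes 0, 1 or 2 points according to the number of endurance-favorable alleles it carries, and the total is rescaled to 0-100. The additive rule, the equal weighting, the function names and the reduced five-SNP panel are all assumptions made for illustration; the favorable alleles are those described in the Introduction.

```python
# Illustrative panel only (five of the SNPs discussed in the Introduction).
# The additive 0/1/2 scoring below is an assumption, not the published GES rule.
FAVORABLE_ALLELE = {
    "ACE rs4343": "A",
    "PPARA rs4253778": "G",
    "ACTN3 rs1815739": "T",
    "AQP1 rs1049305": "C",
    "PPARGC1A rs8192678": "C",
}


def genotype_points(genotype: str, favorable: str) -> int:
    """Count favorable alleles (0, 1 or 2) in a two-letter genotype such as 'AG'."""
    genotype = genotype.upper()
    if len(genotype) != 2 or not genotype.isalpha():
        raise ValueError(f"unexpected genotype: {genotype!r}")
    return genotype.count(favorable)


def endurance_score(genotypes: dict) -> float:
    """Sum the points over the panel and rescale to a 0-100 range."""
    total = sum(
        genotype_points(genotypes[snp], allele)
        for snp, allele in FAVORABLE_ALLELE.items()
    )
    return 100.0 * total / (2 * len(FAVORABLE_ALLELE))


# Hypothetical athlete, for illustration only.
example = {
    "ACE rs4343": "AG",
    "PPARA rs4253778": "GG",
    "ACTN3 rs1815739": "CT",
    "AQP1 rs1049305": "GG",
    "PPARGC1A rs8192678": "CC",
}
print(endurance_score(example))  # 60.0
```

A weighted variant, in which SNPs with stronger evidence count more, would only require replacing the fixed 0/1/2 points with per-SNP weights.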

Descriptive Analysis
Mean values of anthropometric, body composition, somatotype and blood pressure of all cyclists are shown in Table 2.
As expected, the mean BMI and fat mass percentage values were in the "normal weight" range [18.5-24.9 BMI, 12-20% fat mass percentage according to the Spanish Society for the Study of Obesity (Salas-Salvado et al., 2007)]. The three somatotype components were around the moderate rating of 3-5 (Carter and Heath, 1990). Moreover, the results of the analysis of the 3-day dietary records are shown in Table 3. The energy requirements of the cyclists were slightly higher than their energy intakes, the latter showing high variability between participants. Moreover, the mean Healthy Eating Index corresponded to a "good diet" which, as explained in the methodology, is considered an intermediate value (61-70 points). This table also shows the average number of servings consumed per food group according to the validated food frequency questionnaire, where vegetables and fruits were the most consumed.
Finally, Table 4 describes the mean results of the cardiopulmonary exercise test carried out at the Sports Medicine University Center.

The GES Correlates With the Cardiovascular Exercise Test
Once the prediction algorithm was established (Table 1), it was related to the results obtained from the functional cardiovascular exercise test (Table 2). The association analysis of the GES and the variables obtained in the cardiovascular exercise test revealed a statistically significant positive relationship between the GES results and the VO2 max (P = 0.033, Figure 1), VO2 at VT1 (P = 0.049, ρ = 0.516) and VO2 at VT2 (P < 0.001, ρ = 0.813) values. As expected, SNPs included in the GES were also individually associated with specific parameters related to individual performance. The analysis of the association between the individual genotypes and the cardiopulmonary exercise test showed that GG genotypes for AQP1 had lower levels of VO2 at VT1 (32.1 ± 4.6 mL/kg/min vs 42.3 ± 7.4 mL/kg/min) and lower values of %VO2 at VT1 relative to VO2 max (58.2 ± 7.0% vs 70.1 ± 7.7%) than CC + CG (P = 0.020 and P = 0.030, respectively) (Figures 2A,B). On the other hand, GG genotypes for PPARα presented higher values of VO2 at VT2 than CC + CG (52.6 ± 4.6 mL/kg/min vs 42.8 ± 5.9 mL/kg/min, P = 0.030), as shown in Figure 2C.
Regarding GABPB1, AA genotypes had a significantly lower maximal HR (Figure 2D) and lower HR at VT1 and VT2 than AC genotypes (167.8 ± 16.5 bpm vs 185.9 ± 4.5 bpm, 120.8 ± 11.0 bpm vs 144.1 ± 7.8 bpm and 149.0 ± 13.3 bpm vs 167.6 ± 6.3 bpm; P < 0.001, P-adjusted for Bonferroni = 0.040 and P = 0.001, respectively). There were no CC genotypes in the studied sample. Moreover, a statistically significant difference in maximum VE was found between AA genotypes for the ACE gene and AG + GG genotypes (200 ± 16 L/min vs 162 ± 19 L/min, P = 0.040, Figure 2E).

HIF1A Genetic Variant Is Associated With Somatotype
When we studied the relationship between the different SNPs and the somatotype results, we found a statistically significant association between HIF1A genotypes and the mesomorphic component. Specifically, among the CC cyclists, 8.3% presented a low mesomorphic value, 91.7% a moderate value and 0% a high value, while among the CT individuals, 33.3% presented a moderate value and the remaining 66.7% a high value (P = 0.029). There were no TT individuals in the studied sample.

Genes Associated With Dietary Records
When analyzing the association between the different genotypes of each SNP and the dietary variables, we found several interesting associations.
On the other hand, with regard to the food frequency questionnaire, an inverse correlation between the number of coffee and tea servings consumed per day and the GES was observed (P = 0.004), so that the higher the consumption, the lower the GES (Figure 4).

DISCUSSION
The present research provides new information regarding the link between genetics and sport performance from different angles.
The most important result was the potential validation of a prediction algorithm of genetic susceptibility to endurance abilities. At present, the cardiovascular exercise test is considered the gold standard assessment of endurance performance (Hausen et al., 2018). We observed that subjects with a higher GES presented significantly better results in the cardiovascular exercise test according to VO2 at VT1, VO2 at VT2 and, most importantly, VO2 max, which is internationally considered the key measure of aerobic capacity (Hausen et al., 2018). VO2 max obtained in this kind of test represents the highest capacity of oxygen consumption during maximal exercise (Shete et al., 2014). Additionally, VT1 is the point at which aerobic metabolism is no longer the major energy source and anaerobic metabolism begins to be used. The exercise intensity at which VT1 occurs is considered the highest submaximal level tolerated by an individual for long time periods (Herdy et al., 2016). Hence, we hypothesized that higher values of VO2 at VT1 may predict higher VO2 max, which leads to better endurance conditions. VT2, in turn, is the point where the demand for oxygen by the muscles exceeds the mitochondrial supply and, consequently, energy production becomes predominantly anaerobic. It begins when lactate starts to accumulate in blood, which usually occurs at 60-70% of VO2 max (Albouaini et al., 2007). The VO2 at VT2 is considered a submaximal index of exercise capacity and endurance performance, so that the higher the VO2 at VT2, the better the endurance results (Coyle et al., 1988; Herdy et al., 2016). These findings may complement the association previously reported between a genetic score based on a GWAS and the improvement of VO2 max in a sedentary population (Bouchard et al., 2011).
When we analyzed the association between the individual genotypes and the cardiovascular test results, we also found some interesting correlations. In this regard, individuals presenting the favorable genotypes for endurance sports according to AQP1 (Martinez et al., 2009; Saunders et al., 2015) showed better results in the cardiovascular exercise test (higher levels of VO2 at VT1 and of %VO2 at VT1 relative to VO2 max) than GG homozygotes. The physiological explanation for this association may lie in the role of the membrane protein encoded by AQP1. The aquaporin 1 water channel is implicated in the transport of water, maintaining the osmotic balance between the blood and the cells (Frigeri et al., 2004). During prolonged exercise, body temperature is regulated by controlling the distribution of body fluid; water passes from the intracellular to the extracellular spaces and evaporates by sweating. An inadequate loss of sweat during exercise, especially in hot climates, will negatively affect athletic performance (Kenny, 2014). Thus, we hypothesize that the CC genotype is associated with an efficient state of cellular hydration and body temperature regulation, which leads to better endurance performance.
Moreover, favorable genotypes for endurance performance of PPARα (Lopez-Leon et al., 2016) presented significantly higher values of VO2 at VT2 than CC + CG. This gene regulates the expression of other genes involved in the metabolism of energy, fats and sugars in the skeletal muscle, among other tissues (Duval et al., 2004). Given its involvement in these processes, it is hypothesized that it is activated during endurance exercise (Lopez-Leon et al., 2016). Another suggested explanation for the association of PPARα genotypes with endurance performance has to do with the type of fibers in the skeletal muscle. In this sense, in a cohort of 786 Russian athletes it was observed that GG homozygotes presented a significantly higher percentage of type I muscle fibers than the other genotypes (Ahmetov et al., 2006). Skeletal muscle fibers are classified into two types: type I or slow-twitch fibers and type II or fast-twitch fibers. Type I fibers have greater resistance to fatigue and predominate in endurance athletes, while type II fibers are adapted to strong and explosive muscle contractions and predominate in athletes who perform power sports (Cartee et al., 2016).
Regarding GABPB1, the individuals presenting the AA genotype showed a lower maximal HR and lower HR at VT1 and VT2 than AC genotypes. This result is in accordance with other studies that also consider this genotype favorable for endurance sports (Eynon et al., 2013), as it has been described that endurance athletes present lower HR in maximum aerobic traits (Zaniqueli et al., 2014). The GABPB1 gene encodes the GA-binding protein transcription factor, which is implicated in the regulation of the mitochondrial function generating ATP energy (Dinkova-Kostova and Abramov, 2015), which might explain the implication of this gene in individual endurance capacity.
Additionally, AA genotypes for the ACE gene presented higher maximum VE than the AG + GG genotypes. The AA genotype of rs4343 is considered equivalent to the II genotype of the ACE I/D rs4340 (Glenn et al., 2009), which in turn is the most studied favorable genotype for endurance performance (Ma et al., 2013). Maximal VE has been reported to be directly correlated with VO2 max (Keramidas et al., 2010; Malekmohammad et al., 2012). Accordingly, it is suggested that the muscles of endurance athletes require higher values of VO2, so that their VE during exercise is higher.
Hypoxia-inducible factor-1 (HIF1) regulates oxygen homeostasis in mammalian cells and, in particular, seems to have a role during high intensity exercise, helping the skeletal muscle to adapt to low oxygen concentrations (Freyssenet, 2007). Regarding the association between somatotype variables and the different genotypes, individuals carrying the T allele of the HIF1A SNP presented moderate or high mesomorphic values, while none of the CC homozygotes had a high mesomorphic component. Notably, the T allele is associated with power-oriented athletes (Drozdovska et al., 2013). This makes sense, as highly mesomorphic individuals are characterized by high skeletal muscle mass, which is needed for anaerobic power exercises (Gutnik et al., 2015). Besides, as cycling is more an endurance-oriented sport than a power sport, it also makes sense that the mesomorphic component had the lowest value in the sample.
When we analyzed the association between the different genotypes and the dietary records, we also found several significant associations. The cyclists with a favorable genotype for endurance sports regarding AQP1 presented a higher intake of carbohydrates, which we hypothesize reflects the need to maintain the glycogen stores required for long distance exercise (Alghannam et al., 2018). Similarly, individuals with an unfavorable genotype for power sports according to CKM consumed higher amounts of sugars (Chen et al., 2017). It is commonly considered that genotypes unfavorable for power are favorable for endurance sports and vice versa, which might explain this association. To our knowledge, this is the first time that AQP1 and CKM genotypes have been associated with dietary intake.
Moreover, individuals presenting a favorable genotype for endurance sports according to the ACE gene presented a higher mean glycemic index intake than the other genotypes. Although it seems clear that carbohydrate consumption is needed to maintain glycogen stores, whether these carbohydrates should preferably be complex or have a high glycemic index appears to be controversial, as some studies point out that moderate glycemic index diets improve exercise performance (Durkalec-Michalski et al., 2017) while others support a low glycemic index (Durkalec-Michalski et al., 2018). A possible explanation for this controversy may lie in the fact that the unfavorable ACE genotype for endurance performance has also been linked to lower glucose tolerance (Schuler et al., 2017). Thus, we hypothesize that ACE AA individuals might gain an advantage in endurance sports by consuming a moderate glycemic index diet, as they can metabolize glucose more efficiently. However, we are aware that more studies in this field are needed to confirm this hypothesis.
Finally, an interesting association between the habit of consuming caffeinated beverages and the GES was observed. A possible explanation for this inverse correlation might be that athletes who are less genetically predisposed to endurance sports, where energy is a key factor, are more likely to use ergogenic aids such as caffeine. However, we have to consider that the food frequency questionnaire used gives average data for the last year, and we do not have data on dietary intakes in specific time frames such as competitions.

CONCLUSION
The outcomes of the present study confirm a positive relationship between an endurance prediction algorithm and the results of a cardiopulmonary exercise test. Moreover, the AQP1, PPARα, GABPB1, and ACE genes were individually related to endurance performance. In addition, HIF1A showed an association with the somatotype, and the AQP1, CKM, and ACE genes were associated with the athletes' dietary intake. Finally, an inverse association between the habit of consuming caffeinated beverages and the GES was observed.
This information may support the design of larger studies aimed at predicting sports capacities, which may facilitate the identification of genetically talented athletes, improve their training and dietary habits, and help improve the physical condition of amateur athletes.

DATA AVAILABILITY STATEMENT
The datasets presented in this article are not readily available because they are part of the GENYAL Platform for clinical trials in nutrition and health (https://www.food.imdea.org/services/Platform-Clinical-Trials-Nutrition-and-Health) database. This database is currently registered as a collection under Spanish rules and, by policy of the Center, will be made public once the data of the entire expected population have been gathered. Requests to access the datasets should be directed to ana.ramirez@imdea.org.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Research Ethics Committee of the IMDEA Food Foundation. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
AR, GR, JS, and VL-K proposed, funded, and designed the research. FL-S, JR-A, and JS recruited the sample and monitored the cardiopulmonary exercise test. RI, IE-S, EB-R, HM-P, and EA-A performed the anthropometric measurements, body composition analysis, dietary records, and DNA collection and genotyping. GC performed the statistical analysis. RI wrote the first draft of the manuscript. IE-S, FL-S, and JR-A wrote sections of the manuscript. All authors contributed to manuscript revision, and read and approved the submitted version.