Complex Networks Models and Spectral Decomposition in the Analysis of Swimming Athletes’ Performance at Olympic Games

This study presents complex network models for analyzing professional swimmers in Olympic 50-m freestyle competitions, comparing characteristics and variables that are considered performance determinants. The comparison contrasts the behavior of Olympic medalists with that of non-medalists. Using data from 40 athletes with a mean age, weight, and height of 26 ± 2.9 years, 87 ± 5.59 kg, and 193 ± 3.85 cm, respectively, at the Olympics of 2000, 2004, 2008, 2012, and 2016 (16-year interval), we built two types of complex networks (graphs) for each edition, using mathematical correlations, network metrics, and spectral decomposition analysis. The results show that complex network metrics behave differently between medalists and non-medalists. The spectral radius (SR) proved to be an important form of evaluation, since in all 5 editions it was higher among medalists (SR results: 3.75, 3.5, 3.39, 2.91, and 3.66) than among non-medalists (2.18, 2.51, 2.23, 2.07, and 2.04), with significant differences between groups. This study introduces a tool for evaluating the performance of groups of swimming athletes by complex networks, and is relevant to athletes, coaches, and even amateurs, regarding how individual variables relate to competition results and are reflected in the SR for the best performance. In addition, this is a general method and may, in the future, be extended to the analysis of other competitive sports.


INTRODUCTION
Swimming is a sport that involves agile body mechanics, including the action and reaction of Newton's third law (Cureton, 1930), in which the human body must combine physiology with engineering concepts to work as a whole (Alexander, 1992). Although it is one of the oldest physical activities, the first swimming competition (Porter, 2017) was only hosted in 1837, at London's six artificial pools. The Olympic era started in Athens in 1896, with a male-only swimming competition (Oppenheim, 1970). The Summer Olympic Games have been attracting the attention of both general audiences and researchers. However, the Social Sciences have generated the most Olympic papers, at 1,155, while Exact Sciences such as Engineering have produced only 510 (Skibb et al., 2016). Regarding research on swimming, until 2013 performance increased each year for both women and men in the World Championships and the Olympics (Wild et al., 2014), which points to the need to evaluate the parameters behind this growing performance. Studies on this issue have made valuable contributions; however, there is still no consensus on the most appropriate way of studying the performance of athletes.
It is also important to consider that, in sport investigations, the multiple-variable approach to assessing cardiorespiratory coordination and its effects on performance helps in the evaluation of different health and fitness training interventions (Garcia-Retortillo et al., 2019). In addition, there are interesting works in the literature that identified and created the first physiological complex models representing body changes and interactions among several organ systems. One of the first physiological networks was created for the analysis of sleep stages, where each network node represented a type of body change, with the nodes interacting with each other (Bashan et al., 2012). Some authors were also able to relate different aspects of physiological regulation using non-linear dynamics methods (Bartsch and Ivanov, 2014). The changes in brain, cardiac, and respiratory systems presented an important and strong relationship between network connectivity, links' weights, and physiologic functions. An interesting investigation of the dynamic interactions between organs developed the concept of Time Delay Stability, using complex hierarchical reorganization in network models. The authors proposed transitions across physiological states, and the network models represented various interactions, for example, between the brain and different organs. Statistical tools have also played an important role, since they were able to capture key elements throughout dynamic physiological interactions (Lin et al., 2016).
With the aim of contributing to the analysis of swimming performance, we provide here an investigation based on the methodology of complex networks. In simple terms, these networks are mathematical graphs, which usually represent athletic performance parameters as nodes (vertices) and their interactions as links (edges). Due to their flexibility, complex networks have been applied in a variety of scientific contexts, although there is still a lack of studies involving sports evaluation (Lewis, 2009). In one study of complex social networks, the players were treated as nodes and the number of interactions between them as links (Passos et al., 2011). That investigation identified a higher number of interactions among efficient players. Another paper, inspired by the research on network physiology mentioned above, investigated the occurrence of fatigue during tethered running (Pereira et al., 2015). The network metrics assisted in understanding performance and in avoiding fatigue. Variables such as Power, Velocity, and Lactate time were highlighted. Additionally, for sprinters running in track field tests, the complex network models revealed that aerobic and anthropometric measures are meaningful in mathematical models and emphasized the importance of understanding the entire complex context for an optimal performance output (Pereira et al., 2018). However, swimming has not yet been explored in the literature through complex network analysis, despite the possibilities that such an approach provides.
As in the complex models mentioned above, the network approach offers an overview of the athletes' parameters in a competition scenario for swimming analysis. In short, a complex network model can be represented by a mathematical graph, which has an adjacency matrix. It is also possible to determine the decomposition of this matrix, which is the basis of spectral decomposition analysis. From the matrix, an eigenvalue can be calculated for each network node, and the largest eigenvalue of the network is called the spectral radius (SR). By comparing the medalist and non-medalist groups, the spectral decomposition proposed here makes it possible to analyze the robustness of the interactions. All of this assists in understanding performance. Representing physiological and related variables as nodes in the networks is one of the first steps toward a higher level of swimming performance interpretation. Compared with isolated cause-and-effect studies, the interactions among variables can more closely represent different levels of change during an activity or physical exercise.
In this article, inspired by the complex network approaches to physiology and sports analysis mentioned above, we developed a complex network study of the most recent 50 m freestyle swimming performance at the Olympic Games, Rio 2016, compared to London 2012, Beijing 2008, Athens 2004, and Sydney 2000. This 16-year range is useful for understanding the behavior of medalists (winners) versus non-medalists. Specifically, topological properties have been used to summarize the impact of topology on behavior (Lewis, 2009), but not yet in sports like swimming in the Olympics, as introduced in this research.

MATERIALS AND METHODS
Considering swimming as a kind of exercise that depends on muscle contractility, strength, and speed through previously applied maximal or submaximal loads on the muscle system (Cuenca-Fernández et al., 2018) and on metabolic responses (Hellard et al., 2018), publicly available data were selected from the last 5 editions of the Games, specifically for the Men's 50 m freestyle swimming, in which 8 athletes reach the final each year, totaling 40 athletes. Using data from swimmers with a mean age, weight, and height of 26 ± 2.9 years, 87 ± 5.59 kg, and 193 ± 3.85 cm, respectively, it was possible to organize the following data available for each athlete in the event: Birth year (YYYY), Age (years), Weight (kg), Height [...]. Thereafter, it was necessary to organize the data of the 8 finalists of each edition, leading to the calculation of the correlations and the construction of the models. The 40 finalists of the Men's 50 m freestyle swimming competition, over a range of 16 years, were selected from the largest swimming competition, which occurs every 4 years. These athletes were filtered through pre-Olympic qualifiers and then through each stage of the event up to the final, where only 8 can compete, representing the swimming elite, with a small variance. A further filter is that these athletes are specialized in high-intensity, short-duration exercise, since the event consists of crossing a 50 m Olympic pool in about 20 s. It was also possible to work with actual competition data from athletes representing the world's greatest sprint swimmers over 5 competitions, in contrast to other groups of volunteers, usually from the same country, who, even if good swimmers, would have a different kind of representativeness. It is believed, therefore, that the 2000, 2004, 2008, 2012, and 2016 Olympic finalists are high-level representatives of outstanding swimming performance.
The datasets were compared using Pearson's correlation. We first built two complex networks for each of the 5 editions of the Games: one from the medalists' dataset and one from the non-medalists' dataset. The correlation values, expressed as percentages, were used as the values of the network links. In order to set up a complex model that adequately represents an analysis linked to individual performance, publicly available data were chosen. In this way, 6 variables were selected and transformed into nodes of the complex models: Age (years), Weight (kg), Height (cm), Body Mass Index (BMI, kg/m²), Number of Olympic Medals, and Reaction Time (s). The parameters that may directly predict victory, such as Velocity and Time, were not transformed into nodes, in order to avoid biased results. The variables were selected considering those related to performance and individualizing each athlete, according to the data available for each competition. Variables considered unrelated to performance, such as team size and lane, were not included. In addition, anthropometric data have been identified as highly important in runners (Pereira et al., 2018). In accordance with the concept of creating complex networks, it is of fundamental importance to include variables such as height, weight, age, BMI, etc. The idea is to understand the relationships (information flow) among them and other variables. Such relationships are represented by links and the variables under analysis by nodes, which together form the final complex structure under analysis.
Five correlations (5 links) were calculated for each of the 6 variables analyzed (6 nodes), for a total of 30 links in each of the 10 network models. In total, the calculation involved the creation of 10 complex networks, 60 nodes, and 300 weighted links (correlations).
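The node and link counts above can be reproduced with a few lines of arithmetic. The following is a minimal Python sketch (it is not the authors' Java code, and the names are illustrative). It also makes explicit that the 30 links per network count each pair of variables from both of its end nodes, which corresponds to 15 distinct correlations.

```python
from itertools import combinations, permutations

# The six variables used as nodes in each network.
VARIABLES = ["Age", "Weight", "Height", "BMI", "Olympic medals", "Reaction time"]
N_NETWORKS = 10  # 5 editions x 2 groups (medalists, non-medalists)

# Ordered pairs: each node is linked to each of the other 5 nodes.
directed_links = list(permutations(VARIABLES, 2))
# Unordered pairs: the distinct correlations behind those links.
distinct_pairs = list(combinations(VARIABLES, 2))

links_per_node = len(VARIABLES) - 1
total_nodes = N_NETWORKS * len(VARIABLES)
total_links = N_NETWORKS * len(directed_links)

print(links_per_node, len(directed_links), len(distinct_pairs))
print(total_nodes, total_links)
```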
Any network can be represented by a graph. Any graph can be represented by its adjacency matrix, from which other matrices, such as the Laplacian, are derived. Linear algebra associates with each matrix a collection of eigenvalues and their respective eigenvectors. The term Eigen has a German origin and means what is inherent, a characteristic or fundamental property. Therefore, knowing that each graph is represented by its matrix, it is natural to investigate its "Eigen system", since it characterizes the graph (Van Mieghem, 2014). Other topological graph characteristics are used to characterize network connectivity, for example, in financial market fluctuations (Spelta, 2017). Topological metrics can be classified into metrics that are based on graph distance, connectivity, and spectrum (Van Mieghem, 2010; Jovanovic et al., 2017). The nature of a complex system's pattern can be determined by decomposing the system's response to a stimulus into a set of fundamental modes or basis vectors, also called orthonormal vectors. This mathematical process is called spectral decomposition. The process of finding the basic vibrational modes (harmonics) and expressing them in terms of constants is called spectral analysis.
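As a compact reference, and assuming an undirected network whose weighted adjacency matrix $A$ is symmetric, the spectral decomposition described above can be written as follows. The symbols $n$ (number of nodes), $\lambda_i$ (eigenvalues), $v_i$ (orthonormal eigenvectors), $D$ (diagonal matrix of summed link weights), and $L$ (Laplacian) are notation introduced here for illustration.

```latex
A = \sum_{i=1}^{n} \lambda_i \, v_i v_i^{\top},
\qquad A v_i = \lambda_i v_i,
\qquad v_i^{\top} v_j = \delta_{ij},
\qquad L = D - A, \quad D_{ii} = \sum_{j=1}^{n} A_{ij}.
```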
This kind of complex network analysis refers to the analysis of a mathematical graph. The degree of a node (a parameter under analysis) of a complex network (graph) is the total number of edges (relations between nodes) incident to that node. Nodes with a higher number of incident edges are called hubs. Node degree alone may not adequately reflect the importance of these nodes in the complex model. As an alternative metric, the eigenvalues can be calculated for each node of the resulting network and ranked for the General Winners network. Each metric, such as the eigenvalue calculation, was computed by an algorithm written in the Java programming language using the Eclipse IDE. The eigenvalue calculation is crucial for the understanding of this approach, and it is explained in the introductory section. The largest eigenvalue of a graph is also known as its SR or index. The basic information about the largest eigenvalue of a (possibly directed) graph is provided by Perron-Frobenius theory (Smyth, 2002).
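To illustrate why degree alone may be uninformative, note that in a fully connected network every node has the same degree (5 in a 6-node model), so only the link weights can distinguish the nodes. The sketch below, written in Python rather than the authors' Java, computes the degree and the weighted degree (often called strength) for a hypothetical 4-node weight matrix; the values are invented for illustration.

```python
def node_degrees(weights):
    """Number of non-zero links incident to each node."""
    return [sum(1 for w in row if w != 0) for row in weights]


def node_strengths(weights):
    """Weighted degree: sum of the weights of the links incident to each node."""
    return [sum(row) for row in weights]


# Hypothetical symmetric weight matrix (not data from the study).
W = [
    [0.0, 0.9, 0.8, 0.7],
    [0.9, 0.0, 0.1, 0.0],
    [0.8, 0.1, 0.0, 0.0],
    [0.7, 0.0, 0.0, 0.0],
]

degrees = node_degrees(W)
strengths = node_strengths(W)
# The hub is the node with the largest number of incident links.
hub = max(range(len(W)), key=lambda i: degrees[i])
print(degrees, strengths, hub)
```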
Each graph G has a real eigenvalue $\theta_0$ with a non-negative real corresponding eigenvector, such that for each eigenvalue $\theta$ we have $|\theta| \le \theta_0$. The value $\theta_0(G)$ does not increase when vertices or edges are removed from G (Brouwer and Haemers, 2011).
Under the assumption that G is strongly connected:
(i) $\theta_0$ has multiplicity 1.
(ii) If G is primitive (strongly connected, and such that not all cycles have a length that is a multiple of some integer d > 1), then $|\theta| < \theta_0$ for all eigenvalues $\theta$ different from $\theta_0$.
(iii) The value $\theta_0(G)$ decreases when vertices or edges are removed from G.
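The effect of removing an edge on the largest eigenvalue can be checked numerically. The sketch below is a minimal Python illustration, assuming non-negative link weights; it is not the authors' implementation. It estimates the largest eigenvalue by power iteration on A + I (the identity shift is a choice made here so that the iteration also converges for bipartite graphs) and then subtracts 1. The two small graphs are hypothetical examples.

```python
def spectral_radius(matrix, iterations=1000):
    """Estimate the largest eigenvalue of a non-negative matrix.

    Power iteration is applied to (A + I); the shift of 1 is removed at the end.
    """
    n = len(matrix)
    v = [1.0] * n  # strictly positive start vector
    estimate = 1.0
    for _ in range(iterations):
        # w = (A + I) v
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        estimate = max(abs(x) for x in w)  # max-norm of w; v has max-norm 1
        v = [x / estimate for x in w]
    return estimate - 1.0


# Triangle: three nodes, all linked with unit weight.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
# Same graph after removing the link between nodes 0 and 2.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

sr_triangle = spectral_radius(triangle)
sr_path = spectral_radius(path)
print(sr_triangle, sr_path)
```

In this example the triangle has a largest eigenvalue of 2, and removing one link lowers it to the square root of 2, in line with property (iii).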
There are essentially two types of information related to the spectrum. The largest eigenvalues (and their eigenspaces) give some information on global graph properties. The typical eigenvalues give information on local graph properties, such as degree, partition function, etc. Here the focus is on the SR, that is, the maximum eigenvalue of the complex network (graph). As mentioned in the explanation of its calculation, the eigenvalue is a final value assigned to each node, considering not only the number of edges (links) of this node but also the weight of those links, together with the node's location in the complex network topology. A node with a high eigenvalue, compared to the others in the same graph, is interpreted as an important node because of the number and weight of its edges, and because it connects to neighboring nodes that, in turn, also have a greater number of weighted links, and so on. In this new proposal of spectral decomposition, considering that the correlation matrix in the context of the networks is sensitive to the weight value attributed to each link, we used the correlations as link weights so as not to discard connections that have their importance in the context of the complex model and the calculation of the SR. The idea was to consider the set of interactions and their outcomes. That is the reason why complex network analysis includes such a measure, and why the maximum eigenvalue (SR) is an adequate measure of the convergence tendency of the complex system as a whole. The data that support the analysis and conclusions of this article are publicly available on the website Sports Reference (Evans et al., 2016) and are in accordance with all the Publishing Ethics of this journal.
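Two standard bounds can help in reading SR values. They assume a symmetric weight matrix $C$ with non-negative entries $c_{ij}$ and a zero diagonal, which is an assumption made here and is not stated in the text; $s_i$ denotes the summed link weight of node $i$ and $n$ the number of nodes.

```latex
\frac{1}{n}\sum_{i=1}^{n} s_i \;\le\; \theta_0(C) \;\le\; \max_{1 \le i \le n} s_i,
\qquad s_i = \sum_{j \ne i} c_{ij}.
```

Under these assumptions, a 6-node network with link weights between 0 and 1 has an SR of at most 5, and a network in which every link has the same weight $w$ has an SR of exactly $5w$.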

RESULTS
The public data available for every edition of the Games were analyzed. The measures of central tendency for each Olympic edition are shown in Table 1.
Pearson's correlations among sets of data were defined as the weights of the network. Such correlations were a useful choice since they give the same result in both directions of the model set, as shown in Figure 1.
A similar method was used in previous research by the authors (Pereira et al., 2015). An algorithm was built in the Java programming language, which received the data sets as vectors in a main function and passed them to a function that calculated the correlations over the gathered data. Every correlation value was considered and transformed into a link. For example, if the result of the correlation between Age and BMI was 0.56, a connection (link) was added between these nodes with a weight of 56%. The 6 nodes of each Olympic edition were included with the same magnitude, since one of the main goals of the model is to identify the resulting dynamics of the nodes through complex metrics. The resulting weighted network had bidirectional links, which means that the link influence flows in both directions. This was necessary since it is not possible to state that a node like Age, for example, has a cause-and-effect weight on BMI. Instead, it has a correlation inside the dynamic network. It is interesting to note that in most real-world networks, links' weights may mean capacity, flow, or intensity. In this way, a complex model is an analyzable simplification of reality with a mathematical groundwork. The link weights in every complex model directly determined the network structure and the main complex metric used in the results: the SR. The complex network models were built as shown in Figure 2.
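The pipeline described above (data vectors in, correlations out, each correlation turned into a bidirectional link) can be sketched as follows. This is a minimal Python illustration, not the authors' Java algorithm: the function names and the three-athlete data are hypothetical, and the sign of a negative correlation is simply kept, since the text does not say how negative values were handled.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equally long vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def link_weight_percent(r):
    """Express a correlation as a percentage link weight, e.g. 0.56 -> 56."""
    return round(100 * r)


def build_network(variables):
    """Return a dict mapping (node_a, node_b) to the link weight in %."""
    names = list(variables)
    links = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            weight = link_weight_percent(pearson(variables[a], variables[b]))
            links[(a, b)] = weight  # bidirectional link: same weight
            links[(b, a)] = weight  # in both directions
    return links


# Invented values for three athletes (not data from the study).
group = {
    "Age": [24, 27, 30],
    "Weight": [80, 88, 90],
    "Height": [190, 193, 199],
}
network = build_network(group)
print(network)
```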
Focusing on which data to analyze together, from the point of view of complex network construction and of the spectral decomposition, which can help to identify trends in the profile of the winners, the complex networks were built in a computational interface. Then, the network SRs were determined. The SR is computed by finding the largest eigenvalue of the weighted connection matrix C, where each element of C is equal to the weight assigned to the link between the corresponding pair of nodes. Matrix C is symmetric because the links are bidirectional.
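Because C is symmetric, all of its eigenvalues are real and can be obtained with the classical Jacobi rotation method. The sketch below is a minimal Python illustration under stated assumptions, not the authors' Java code: the diagonal of C is set to zero, weights are expressed as fractions rather than percentages, and the uniform weight of 0.5 is a hypothetical value chosen so that the result can be checked by hand (a 6-node network with every link at weight w has a largest eigenvalue of 5w).

```python
import math


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-22):
    """All eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]
    for _ in range(sweeps):
        # Stop once the off-diagonal part is numerically zero.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes a[p][q].
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                # Rotate columns p and q, then rows p and q.
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical 6-node network in which every link has weight 0.5.
C = [[0.0 if i == j else 0.5 for j in range(6)] for i in range(6)]
eigenvalues = symmetric_eigenvalues(C)
sr = max(abs(e) for e in eigenvalues)  # spectral radius
print(eigenvalues, sr)
```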
Considering the 6 nodes and the 10 proposed models, 2 networks were constructed for each edition (2 networks each for Sydney, Athens, Beijing, London, and Rio). The core idea was to compare medalists' and non-medalists' behavior via complex models. The Winners (Medalists) network has the correlations of the 3 medalists of that edition. The non-medalists network was created from the data of the 5 other finalists, who did not win medals. For comparative analysis, within these 10 new complex models we found the eigenvalues of each node and the value of the SR (largest eigenvalue) for each network. In fact, the SR of the winners' network is larger than the SR of the non-medalists' network in every edition, as shown in Figures 3 and 4. The SR proved to be an important form of evaluation, since in all 5 editions it was higher among medalists (SR results: 3.75; 3.5; 3.39; 2.91; and 3.66) than among non-medalists (2.18; 2.51; 2.23; 2.07; and 2.04), with significant differences between groups (One-way ANOVA followed by Tukey HSD post hoc test, p < 0.05).
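As a plausibility check on the group comparison, the one-way ANOVA F statistic can be recomputed from the ten SR values reported above. This is a minimal Python sketch that treats the five editions as independent observations in each group, which is an assumption made here, since the exact design of the authors' test is not described. The critical value of about 5.32 for F(1, 8) at the 0.05 level is a standard table value.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# SR values as reported in the text, one per Olympic edition.
medalists = [3.75, 3.5, 3.39, 2.91, 3.66]
non_medalists = [2.18, 2.51, 2.23, 2.07, 2.04]

f_value, df_between, df_within = one_way_anova_f(medalists, non_medalists)
F_CRITICAL_1_8 = 5.32  # table value for alpha = 0.05
print(f_value, df_between, df_within, f_value > F_CRITICAL_1_8)
```

With these inputs the F statistic is roughly 53, far above the critical value, which agrees with the significance reported in the text.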

DISCUSSION
With this new study, which uses quantitative data to construct complex network models, it is interesting to note that success and winning in sports are reached through decisions guided by data and models. Sports analytics is a process of strategically modeling the available data to transform it into a source of competitive advantage (Trewin et al., 2004; Miller, 2015). Players, managers, owners, and fans are interested in such strategies in the context of data science, where sports analytics is a blend of business savvy, information technology, and modeling techniques.
Our network approach moves toward the recent understanding of the human body as a collection of physiologically interacting systems, according to the interdisciplinary concept of network physiology. Networks are representations of these interacting physiological systems, which include organ changes and metabolites, among others, and provide interacting feedback and feedforward data that can be reflected in different performances. Computational modeling helps in this understanding, allowing the calculation of complex metrics such as the SR. The concept of network physiology in different sports is still not fully explored but is promising, and may in the future also involve specific interactions, such as those found in different brain regions and their effects on physiological states during sleep. As a potential limitation of this work, the data were not obtained as successive time series, as could be done in a hypothetical competition scenario. On the other hand, it was possible to work with actual competition data for athletes from different countries representing the world swimming elite over 5 distinct competitions (16-year range).
This research approach brings a novel way of identifying data sources and of gathering, organizing, and preparing data for the complex analysis. Furthermore, the data selection for this study appears to agree with research on the need to evaluate anthropometrics and athletes' variables. For example, the evaluation of performance and anthropometric parameters, such as body height, has drawn attention for both female swimmers (Jagomägi and Jürimäe, 2005) and male swimmers in 100-m events (Sammoud et al., 2018). In addition, age, height, and hand grip strength were the best predictors in short-distance events (Zampagni et al., 2008). A balanced diet allows swimmers to maintain a stable body weight for athletic performance (Ciosek et al., 2015), and anaerobic qualities are important in regard to age in other competitions (Fairbrother, 2007). These factors are seen at the Olympics over the years, with the important contribution of modeling the body as interactions of a complex network (Herman et al., 2009; Ivanov et al., 2016).
The complex networks make it possible to study the dynamical interactions among professional athletes. This article took into account 5 different Olympic editions and allowed the calculation of the SR, a measure that reflects the robustness of each complex model. Since multiple editions were analyzed, it is possible to track the development of the winners' SR values and their similarities. In the models of the medalists, there is communication of greater weight among the variables; that is, for the winners, the intensity of communication between variables at the proposed levels was reflected in a higher SR. For non-medalists, the lower level of communication among the variables may have been decisive in the position they reached.
In the analysis of SR values, the winners' networks always have the highest SR values. This also holds when considering the mean eigenvalues. Winners may be better at combining all factors, represented here as nodes. A well-balanced athlete may be a winner, and complex networks with SR analysis are a noteworthy way of identifying the best-fit athlete. It is interesting to note that the variables, analyzed together via complex models, may indicate the best use of the set of factors by the winners. Thus, complex networks in association with complex metrics such as the SR may, in the future, allow SR values to be calculated for different groups of athletes in training for a given test, according to the specific profile and performance of the analyzed athletes. These network models and their metrics can assist in verifying which groups of athletes would present a greater chance of victory when compared to each other.
The higher SR value among medalists should reflect the more efficient communication of the variables analyzed within the model. The combined communication between physiological basis and previous experience results in a higher SR for medalists. The application of these physiological complex network models should be taken into consideration when focusing on new training strategies, assisting coaches, athletes, and amateurs. The complex models in conjunction with the spectral analysis proposed by this study showed consistency with the profile of the winners. Such analysis can be applied in future work to women's swimming events and also to other sports categories, such as athletics. The methodology presented here can also be applied to other types of tests and even other sports, in order to identify the profiles of possible medalist groups, which may help in practice and inspire new research applications.
FIGURE 3 | The Spectral Radius (SR) of each network (y-axis) by Olympic edition, comparing the medalists' result (white) and the non-medalists' result (black). The Medalists (Winners) network had the higher SR value in every edition analyzed, with significant differences between groups (One-way ANOVA followed by Tukey HSD post hoc test, p < 0.05).
FIGURE 4 | The SR results (y-axis) by Olympic edition versus mean eigenvalues (x-axis), represented by the triangle with the coordinates (x; y). It is important to note that the medalists' SR results were never below 2.91 and the non-medalists' results were never above 2.51. This points to a common behavior among Medalists (Winners), with the highest values at all 5 editions of the Games analyzed and significant differences (One-way ANOVA followed by Tukey HSD post hoc test, p < 0.05).

DATA AVAILABILITY
All datasets generated for this study are included in the manuscript and/or the supplementary files.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
VP-F, TL, LF, and LD proposed models ideas, interpreted the data, and created the models. VP-F, LF, and LD wrote the main manuscript text and prepared the figures. TL and VP-F developed the network software, designed and built complex models, and made metrics. All authors reviewed the manuscript.