Investigating the searching behavior of Sino-U.S. relations in China based on complex network

The Sino-U.S. relationship is one of the most important bilateral relationships in global geopolitics. Its economic, cultural, political, educational, and diplomatic dimensions influence not only the two countries but also other nations in the relevant regions. In this paper, we apply the visibility graph method to convert the Baidu search index of the keyword “Sino-U.S. relations” from 2011 to 2021 into an undirected, unweighted network and study its dynamic characteristics. The analysis shows that the relationship between data points is closer when the network has more edges, greater degrees, and greater clustering coefficients. Likewise, the shorter the average path length, the closer the relationship between the data. The results provide a new perspective for analyzing the time series characteristics of Sino-U.S. relations.


Introduction
In recent years, the increasing instability of Sino-U.S. relations has led to conflicts of interest between the two countries, and various neighboring countries have introduced new trade, tariff, and migration policies in response. These developments have aroused widespread concern among researchers in geopolitics and other fields. For instance, Ref. [1] studied the dynamic relationship between political conflict and bilateral trade between China and the U.S. The results showed that, as the global economy declines, the Chinese and U.S. governments should strengthen trade cooperation and seek consensus on economic interests in order to improve bilateral relations. Ref. [2] studied how Sino-U.S. relations are discussed in news coverage of the Sino-U.S. trade conflict and found that they can be divided into antagonistic, negotiatory, and cooperative relations. In these relationships, China is constructed as a victim of the trade conflict, a defender of free trade, a facilitator of negotiations, and a beneficiary of cooperation. These identities and relationships are constructed through constructive, disruptive, justified, and transformative strategies. The study further found that the diversity of China's ethnic identity and of Sino-U.S. relations in Chinese news reports was the result of political, social, and economic differences between the two countries. Ref. [3] considered the impact of the artificial intelligence (AI) industry on Sino-U.S. relations and argued that AI would become a more focused area of competition between China and the U.S., especially in the context of the increasingly fierce and complex strategic competition between the two countries in recent years, and that it can serve as a microcosm of the changes between them. Refs. [4,5] also argue that AI could profoundly affect Sino-U.S. relations.
On the one hand, competition between China and the U.S. in the AI area may exacerbate strategic suspicion, accelerate the arms race in the AI industry, and undermine the strategic stability of the two countries. On the other hand, the rapid development of the AI industry has the potential to open new areas of Sino-U.S. cooperation. As AI introduces new forms of uncertainty into the international geopolitical arena, China and the U.S., as two global powers, need to increase trust and resolve disputes. Ref. [6] analyzed the impact of the latest shift in Chinese naval strategy on Sino-U.S. maritime relations. Ref. [7] further studied the Sino-U.S. relationship from 2008 to 2021. The main purpose of this research is to reveal the connections within the Sino-U.S. relationship through a network. Building on previous studies in international relations and geopolitics, this paper introduces visibility graph theory into this topic for the first time. The results provide a novel research methodology for scholars analyzing the Sino-U.S. relationship.
The rapid development of information technology and the popularity of the Internet have changed people's lives. The Internet has become a hub for receiving and transmitting information, and search engines provide Internet users with the most convenient way to acquire the information they need. Because of the large amount of data stored on their servers, search engines can now show the search trend for a given keyword within a time frame. Internet giants such as Google and Baidu, whose search engines are often considered the largest in English and Chinese respectively, have released their own products for this purpose: Google Trends and Baidu Index. These products give analysts a source for exploring the behavior of Internet users, analyzing market trends, and monitoring public sentiment. Given the complexity of the relationships among Internet users and their information, the entire Internet is a complex network.
Since the birth of complex networks at the end of the last century, this discipline has become an important way for scholars to reveal hidden connections between multiple systems and their components [26]. Over the decades, the study of complex networks has flourished in many directions [9], and visibility-graph-based network analysis is one of them. Meanwhile, network science has been used in many fields, including epidemic spreading processes in complex networks [10-12], locating multiple sources in social networks [13], and social network problems in multiplex networks [14]. In 2008, Lacasa et al. [15] proposed a simple and fast computational method called the visibility graph algorithm. This method converts time series data into a visibility graph, which opens a new direction for the development of complex networks and provides a new way of characterizing time series. Visibility graph analysis is based on the principle of time series visualization; hence, various networks can be derived, such as transportation networks, worldwide networks, and interpersonal networks. After years of rapid development, this technique has been widely used in various studies [10,16-19]. Inspired by this, we adopt the visibility graph to explore a typical time series, i.e., the Baidu search index of the keyword "Sino-U.S. relations" (in Chinese pinyin, "zhong mei guan xi"), so that the dynamic characteristics of Sino-U.S. relations can be analyzed explicitly. From the perspective of complex networks, we hope to reveal more information about the time series data. Based on previous studies, this research method has been applied in many academic contexts.
Ref. [20] studied time series by constructing complex networks from pseudo-periodic time series and examined the relationship between the constructed networks and the original time series. Ref. [21] mapped time series to nearest neighbor networks. Ref. [22] introduced the concept of recurrence networks. Ref. [23] found that the visibility graph algorithm preserves the characteristics of the time series. Ref. [24] also indicated that, with the visibility graph algorithm, it is possible to determine whether the studied system is deterministic or random from the characteristics of the resulting complex network. Ref. [25] applied the visibility graph algorithm to analyze the similarity and heterogeneity of price time series in seven carbon pilot markets in China. At present, time series analysis based on the visibility graph algorithm has been applied in multiple fields. Ref. [26] summarized the models for converting time series into complex networks. Ref. [27] analyzed the rate changes of past years based on the visibility graph in order to explore new features. From the perspective of time series, this paper converts the Baidu search index for the keyword "Sino-U.S. relations" into a complex network using the visibility graph algorithm. We then analyze the dynamic characteristics and influence mechanism of the time series through various evaluation metrics, including the degree distribution, clustering coefficient, network diameter, and other indicators. The results show that Chinese search behavior is a typical social behavior and that the degree distribution of the search index follows a power law, which indicates that the network is scale-free. Public attention in China to Sino-U.S. relations also exhibits long-term memory. Furthermore, essential time nodes can be identified through the centrality of the network.
Frontiers in Physics frontiersin.org
The remainder of the paper is organized as follows: Section 2 introduces the adopted model and the corresponding research methods; Section 3 presents the data analyzed in this paper; and Section 4 draws the conclusions and outlines directions for further research.

Model and method

Research description
When discussing relations between countries, scholars usually focus on international relations, linguistics, diplomacy, geopolitics, and other fields of study. However, the relationships among all kinds of data can also form a complex network. We therefore construct such a network using the time series visibility graph (VG) method proposed by Lacasa et al. [15,28], which allows a thorough analysis of Sino-U.S. relations. The theory of complex networks is a branch of statistical physics and network science that describes the characteristics of complex systems and their dynamics. Time series data can be mapped into a complex network by multiple algorithms. By studying the topologies of the resulting networks, it is possible to discover their dynamic characteristics, which in turn reveals hidden connections on the Internet. According to Lacasa et al., different types of time series are converted into different types of networks: periodic, random, and fractal time series map to regular, random, and scale-free networks, respectively.

Complex networks
Because the time series data will be mapped into a complex network, we briefly describe the concept here. A graph consists of a node set and an edge set, is usually thought of as a network, and is often written as G = (V, E), where N = |V| is the number of nodes and M = |E| is the number of edges. If the network is weighted, an edge can also be described as (u, v, w), where u and v are distinct nodes (u ≠ v) and w is the weight of the edge. Complex networks are an effective analytical tool and are widely used for social, management science, computer science, and statistical problems, among others.
To discover the features of the chosen dataset, we apply various evaluation metrics to the derived networks, including the degree distribution, clustering coefficient, average clustering coefficient, network diameter, and other indicators [27].

Degree and degree distribution
Degree is one of the most fundamental properties in the analysis of complex networks. For a single node v, its degree d(v) is defined as the number of edges incident to v. For a directed graph, two types of degree are defined according to the start and end points of the edges: out-degree and in-degree. In a complex network, a node with a larger degree is usually considered more important. Once the degrees of all nodes are known, the average degree of the network is calculated as:

$$\langle k \rangle = \frac{1}{N} \sum_{i=1}^{N} k_i \tag{1}$$

where $k_i$ is the degree of node $v_i$ and N is the total number of nodes in the network. The distribution of node degrees is described by the distribution function P(k), which gives the proportion of nodes in the network whose degree is exactly k. Common degree distributions include delta distributions for regular networks, Poisson distributions for completely random networks, and power-law distributions for most real-world networks.
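As a minimal sketch of these quantities, the following Python snippet computes the node degrees, the average degree, and the empirical degree distribution P(k) of an undirected network given as an edge list. The function name `degree_stats` and the four-node example graph are illustrative assumptions and are not taken from the study's data.

```python
from collections import Counter


def degree_stats(edges, n_nodes):
    """Return (degrees, average degree, P(k)) for an undirected graph.

    edges   -- iterable of (u, v) pairs with 0 <= u, v < n_nodes and u != v
    n_nodes -- total number of nodes N
    """
    degrees = [0] * n_nodes
    # Normalise each pair so that (u, v) and (v, u) count as one edge.
    for u, v in set(tuple(sorted(e)) for e in edges):
        degrees[u] += 1
        degrees[v] += 1
    avg_degree = sum(degrees) / n_nodes  # average degree: (1/N) * sum of k_i
    counts = Counter(degrees)
    # P(k): fraction of nodes whose degree is exactly k.
    p_k = {k: c / n_nodes for k, c in sorted(counts.items())}
    return degrees, avg_degree, p_k


# Hypothetical example: the path 0-1-2-3 plus the chord 0-2.
degrees, avg_k, p_k = degree_stats([(0, 1), (1, 2), (2, 3), (0, 2)], 4)
print(degrees, avg_k, p_k)
```

For a directed graph, the same loop would increment separate out-degree and in-degree counters instead of a single list.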

Cluster coefficient
The clustering coefficient is another important static geometric feature of complex networks. A node that is directly connected to $v_i$ is called a neighbor of $v_i$. The clustering coefficient measures how densely the neighbors of a node are connected to one another. For node $v_i$, the clustering coefficient $C_i$ is defined as the ratio between the actual number of edges $E_i$ among the neighbors of $v_i$ and the total number of possible edges among them, i.e., $\binom{k_i}{2}$:

$$C_i = \frac{E_i}{\binom{k_i}{2}} = \frac{2E_i}{k_i(k_i - 1)} \tag{2}$$

where $k_i$ is the total number of neighbors of $v_i$. The average clustering coefficient C is the average of the clustering coefficients of all nodes in the network:

$$C = \frac{1}{N} \sum_{i=1}^{N} C_i \tag{3}$$

where N is the total number of nodes. The value of C lies between 0 and 1. When all nodes are isolated, C = 0; when any two nodes in the network are directly connected, i.e., the network is globally coupled, C = 1. For a completely random network, $C = O(N^{-1})$ when N is large enough. In practice, many real complex networks show an obvious clustering effect: as the network size increases, the clustering coefficient tends to a non-zero constant, that is, C = O(1) as N → ∞, and similar nodes tend to cluster together.
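The definition above can be illustrated with a short Python sketch that counts, for each node, the edges among its neighbors and divides by the number of possible neighbor pairs. The adjacency-dictionary representation, the function name `clustering`, and the convention that nodes with fewer than two neighbors get a coefficient of 0 are assumptions made for this example.

```python
from itertools import combinations


def clustering(adj):
    """Local and average clustering coefficients of an undirected graph.

    adj -- dict mapping each node to the set of its neighbours
    Returns (dict of C_i per node, average C over all nodes).
    """
    c = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            c[v] = 0.0  # assumed convention: no neighbour pair, so C_i = 0
            continue
        # E_i: number of edges that actually exist among the neighbours of v.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        c[v] = 2.0 * links / (k * (k - 1))
    return c, sum(c.values()) / len(c)


# Hypothetical example: triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
local_c, average_c = clustering(adj)
print(local_c, average_c)
```

In this example nodes 0 and 1 sit in a closed triangle and have coefficient 1, while node 2 has three neighbors of which only one pair is linked.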

Network diameter and average path length
For any two nodes i and j in the network, let $d_{ij}$ denote the shortest distance between them. The network diameter D is the largest of these shortest distances:

$$D = \max_{i,j} d_{ij} \tag{4}$$

The average path length L describes the average number of edges that must be traversed to get from one node to another. Equivalently, it is the average of the shortest path lengths over all node pairs in the visibility graph. It is defined as:

$$L = \frac{2}{N(N-1)} \sum_{i<j} d_{ij} \tag{5}$$
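A minimal sketch of how both quantities can be computed for an unweighted, connected network is given below, using breadth-first search from every node. The function names and the small example graph are assumptions for illustration; the sketch averages over all unordered node pairs, which is the usual convention but is assumed here rather than confirmed by the text.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter_and_avg_path_length(adj):
    """Return (diameter D, average path length L) of a connected graph."""
    n = len(adj)
    total, longest = 0, 0
    for s in adj:
        dist = bfs_distances(adj, s)
        if len(dist) != n:
            raise ValueError("graph is not connected")
        total += sum(dist.values())
        longest = max(longest, max(dist.values()))
    # Each unordered pair was counted twice, once from each endpoint.
    return longest, total / (n * (n - 1))


# Hypothetical example: triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(diameter_and_avg_path_length(adj))
```

A visibility graph is always connected because consecutive time points can see each other, so the connectivity check is only a safeguard for other inputs.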

Graph density
Graph density measures the completeness of a network, that is, how tightly its nodes are connected. The more connections exist among the nodes, the larger the graph density. In a complete graph every pair of nodes is connected, so the graph density is 1. For a network G = (V, E), the graph density is calculated as:

$$\rho = \frac{2M}{N(N-1)} \tag{6}$$

where N is the total number of nodes and M = |E| is the total number of edges.
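As a small worked illustration, the sketch below computes the density of an undirected network as the ratio of existing edges to possible node pairs, and inverts the relation to estimate an edge count from a reported density. Both function names are hypothetical helpers introduced for this example.

```python
def graph_density(n_nodes, n_edges):
    """Density of an undirected simple graph: edges / possible node pairs."""
    if n_nodes < 2:
        raise ValueError("density needs at least two nodes")
    return 2.0 * n_edges / (n_nodes * (n_nodes - 1))


def edges_from_density(n_nodes, density):
    """Invert the density formula to estimate the number of edges."""
    return density * n_nodes * (n_nodes - 1) / 2.0


# A complete graph on 4 nodes has 6 edges and therefore density 1.
print(graph_density(4, 6))

# Rough estimate from the figures quoted in the text for 2020
# (366 nodes, density 0.047); the density is rounded, so this is approximate.
print(edges_from_density(366, 0.047))
```

With the 2020 values quoted in the text, the inversion gives roughly 3,139 edges. This is only a back-of-the-envelope estimate because the reported density is rounded to three decimals.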

Motif
A motif can be regarded as a subgraph of a network, i.e., a "small system" within a complex system, and is one of the basic topological structures of networks. It is a small-scale connection pattern that occurs much more frequently in a real network than in random networks with the same number of nodes and connections [16,18]. In most networks, the common motifs are composed of a small, fixed number of nodes, such as three or four. Here, we illustrate the 3-motif and 4-motif in Figures 1A and 1B, respectively.
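To make the idea of counting 4-node motifs concrete, the sketch below enumerates every set of four nodes in a small undirected network and classifies the induced subgraph by its sorted degree sequence, which is enough to tell the six connected 4-node patterns apart. The pattern names, the function name `count_4_motifs`, and the example graph are assumptions for illustration; they do not necessarily correspond to the motif types or numbering used in the paper's figures, and brute-force enumeration is practical only for small networks.

```python
from collections import Counter
from itertools import combinations

# Sorted degree sequence inside the 4-node subgraph -> pattern name.
# Each of the six connected 4-node undirected graphs has a unique sequence.
MOTIF_BY_DEGREES = {
    (1, 1, 2, 2): "path",
    (1, 1, 1, 3): "star",
    (2, 2, 2, 2): "cycle",
    (1, 2, 2, 3): "tailed triangle",
    (2, 2, 3, 3): "diamond",
    (3, 3, 3, 3): "clique",
}


def count_4_motifs(adj):
    """Count induced connected 4-node subgraphs by brute-force enumeration.

    adj -- dict mapping each node to the set of its neighbours
    """
    counts = Counter()
    for quad in combinations(sorted(adj), 4):
        members = set(quad)
        # Degree of each node counting only edges inside the quadruple.
        degs = tuple(sorted(len(adj[v] & members) for v in quad))
        name = MOTIF_BY_DEGREES.get(degs)
        if name is not None:  # disconnected quadruples are skipped
            counts[name] += 1
    return counts


# Hypothetical example: triangle 0-1-2 followed by the chain 2-3-4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
print(dict(count_4_motifs(adj)))
```

Comparing such counts with those of randomized networks that have the same number of nodes and edges is what distinguishes a motif from an ordinary subgraph.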

Visibility graph algorithm
In this paper, the VG model is employed to convert time series data into a complex network. The VG criterion is defined as follows: two points $(t_i, y_i)$ and $(t_j, y_j)$ of a time series y(t) are visible to each other, and the corresponding nodes are connected, if every intermediate point $(t_k, y_k)$ with $t_i < t_k < t_j$ satisfies:

$$y_k < y_i + (y_i - y_j)\,\frac{t_i - t_k}{t_j - t_i} \tag{7}$$

Figure 2 illustrates the definition in Eq. 7. In Figure 2A, the height of each vertical bar represents the data value at that time step. If the tops of two bars can see each other, i.e., the straight line joining them is not blocked by any bar in between, the two corresponding points are connected, as shown in Figure 2B.
Specifically, Figure 2 illustrates the conversion of time series data into a network. As shown in Figure 2, the values of nodes 1, 2, and 3 are 4, 6, and 9, respectively. According to Eq. 7, the value of node 2 satisfies 6 < 4 + (4 - 9)(1 - 2)/(3 - 1) = 6.5, so node 2 does not block the line of sight between node 1 and node 3. Node 1 and node 3 are therefore connected.
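The visibility criterion of Eq. 7 can be sketched in a few lines of Python. The version below is a straightforward, unoptimized implementation that assumes equally spaced observations (the time of each point is simply its index) and returns an undirected, unweighted adjacency dictionary; the function name `visibility_graph` is an assumption for this example, and the code is not the authors' implementation.

```python
def visibility_graph(series):
    """Build the natural visibility graph of a time series.

    series -- sequence of numeric values; point i has time t_i = i
    Returns a dict mapping each index to the set of indices visible from it.
    """
    n = len(series)
    adj = {i: set() for i in range(n)}
    for i in range(n - 1):
        for j in range(i + 1, n):
            y_i, y_j = series[i], series[j]
            # Every intermediate point k must lie strictly below the
            # straight line joining (i, y_i) and (j, y_j).
            visible = all(
                series[k] < y_i + (y_i - y_j) * (i - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                adj[i].add(j)
                adj[j].add(i)
    return adj


# The worked example from the text: values 4, 6, 9 at times 1, 2, 3
# (indices 0, 1, 2 here). Since 6 < 6.5, the first and third points connect.
print(visibility_graph([4, 6, 9]))

# Raising the middle value to 7 blocks the line of sight (7 is not < 6.5).
print(visibility_graph([4, 7, 9]))
```

Neighboring points are always connected because there is no intermediate point to block them. The triple loop is cubic in the worst case, which is acceptable for yearly series of 365 or 366 points.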

Data
Because a time series can in principle be extended indefinitely, the time range explored in this paper is restricted to the most recent 11 years of data indexed by Baidu, the largest Chinese search engine. We define each day of the year as a node, so there are 365 nodes per year, except for the leap years 2012, 2016, and 2020, which have 366 nodes.
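The yearly node counts can be checked with a short sketch that splits a daily series into one list per calendar year. The record layout (pairs of date and index value), the helper names, and the constant placeholder values are assumptions for illustration only; no real Baidu Index data is used.

```python
import calendar
from collections import defaultdict
from datetime import date, timedelta


def expected_nodes(year):
    """Number of daily nodes in a calendar year."""
    return 366 if calendar.isleap(year) else 365


def split_by_year(records):
    """Group (date, value) records into {year: [values in date order]}."""
    by_year = defaultdict(list)
    for day, value in sorted(records):
        by_year[day.year].append(value)
    return dict(by_year)


# Placeholder data: a constant value for every day from 2011 to 2021.
start, end = date(2011, 1, 1), date(2021, 12, 31)
records = [(start + timedelta(days=d), 100)
           for d in range((end - start).days + 1)]

yearly = split_by_year(records)
for year, values in yearly.items():
    print(year, len(values), len(values) == expected_nodes(year))
```

Each yearly list can then be passed to a visibility graph construction to obtain one network per year.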

Visibility network
Based on the Baidu search index data on "Sino-U.S. relations" for each year, we obtain a complex network representing Sino-U.S. relations by applying VG theory. The characteristics of the derived networks are provided in Table 1. In this table, the number of nodes is the number of days in each year, and the number of edges reaches its peak in 2020. The density also increases and reaches its maximum of 0.047 in that year, with a diameter between 4 and 5, indicating that more information about Sino-U.S. relations was searched. Before 2020, the density increased gradually, except for slight declines in 2012, 2014, and 2018.
Similarly, the average path length decreased from 3.979 to 3.07, and the average clustering coefficient ranged between 0.7 and 0.8, with an overall decreasing trend. This indicates that the connections in the network are becoming denser. Chinese citizens no longer concentrate on a single topic in the affairs between the two countries but have gradually developed a multi-directional pattern of attention. As China and the United States interact in more areas, more topics concerning Sino-U.S. relations are discussed on the Internet.
As illustrated, the years in which the relationship between the data is closer are those with greater degrees and clustering coefficients, and the paths between data points are also shorter in those years. Furthermore, the derived complex network of Sino-U.S. relations can be visualized as in Figure 3.
We also present the search trend of "Sino-U.S. relations" from 2011 to 2021. The results are provided in Figure 4, where the red line represents the total search index, consisting of PC and mobile searches, and the green line shows that the search index of mobile devices dominates the total. In 2010, China's GDP rose to second place among all countries in the world. Then, starting from 2011, the United States introduced a series of measures to suppress the development of China, for instance, "returning to the Asia-Pacific region," implementing "Asia-Pacific rebalancing," and forming the TPP (Trans-Pacific Partnership) while excluding China. In 2020, the U.S. government under Trump emphasized "America first," focused on promoting "decoupling" from China, and vigorously suppressed Chinese technology companies such as Huawei and TikTok. As a result of this all-round suppression of China by the United States, frictions between the world's two largest economies in politics, economy, diplomacy, and people-to-people exchanges have been deepening, and Sino-U.S. relations have continued to deteriorate. Because the United States is the world's largest power, its economic and political measures have a profound impact on the development of China and even the world, so citizens also attach great importance to Sino-U.S. relations.
Figure 5 shows the visualization and degree distribution of the complex network of the Sino-U.S. relations search index in 2011, while Figure 6 shows the degree distributions for 2012-2021. The visibility network of Sino-U.S. relations is a scale-free network: most time nodes have few connected edges, only a small proportion of nodes have large degree values, and the number of nodes decreases as the degree increases. This reflects the volatility of the original time series.
The degree distribution shows that the Baidu search index of Sino-U.S. relations is a fractal time series with long-term correlation. On the one hand, the original time series is statistically self-similar across different time scales, such as days, weeks, and months. On the other hand, because of the long-term correlation, future changes in the Baidu search index of Sino-U.S. relations may resemble those of some past period.
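One simple way to examine whether a degree distribution is compatible with a power law P(k) ~ c k^(-γ) is to fit a straight line to log P(k) against log k. The sketch below does this with ordinary least squares; the text does not state how the power law was assessed, so this is an assumed and fairly crude approach (maximum-likelihood estimators are generally more reliable), and the function name `fit_power_law` and the synthetic degrees are illustrative.

```python
import math
from collections import Counter


def fit_power_law(degrees):
    """Least-squares fit of log P(k) = log c - gamma * log k.

    degrees -- list of node degrees (zeros are ignored)
    Returns (gamma, c).
    """
    n = len(degrees)
    counts = Counter(k for k in degrees if k > 0)
    if len(counts) < 2:
        raise ValueError("need at least two distinct positive degrees")
    xs = [math.log(k) for k in counts]
    ys = [math.log(c / n) for c in counts.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    gamma = -slope
    c = math.exp(mean_y - slope * mean_x)
    return gamma, c


# Synthetic degrees constructed so that P(k) is exactly proportional to k^-2.
degrees = [1] * 16 + [2] * 4 + [4]
gamma, c = fit_power_law(degrees)
print(gamma, c)
```

On real yearly networks the fit would be applied to the degree list of each visibility graph, and a roughly linear log-log plot over a wide range of k would support the scale-free interpretation.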
Furthermore, we investigate the distribution of motifs in the networks derived from the Sino-U.S. relations datasets, focusing mainly on the 4-motif. The results are provided in Figure 7. We find that the number of motifs of the second type equals that of the third type, and likewise the number of the fourth type equals that of the fifth type. This shows that Chinese searches for Sino-U.S. relations are highly correlated in time.

Conclusion
This paper transforms the Baidu search index of the keyword "Sino-U.S. relations" from 2011 to 2021 into visibility graphs and studies the dynamic characteristics of the resulting networks. The results provide a new perspective for analyzing the time series characteristics of Sino-U.S. relations. However, the network constructed with the visibility graph algorithm in this paper is undirected and unweighted, so some information in the original time series is lost during the conversion, which limits its ability to capture how the time series changes. Other methods for analyzing the changing characteristics of the time series could therefore be adopted in future work to obtain better results.
Regarding the analysis method, choosing the keyword for the search engine is quite tricky. On the one hand, the keyword cannot be too narrow or obscure, so specialized terminology is not appropriate; on the other hand, keywords that are too broad are not representative. Therefore, to obtain the most valuable data for analysis, it is important to choose keywords precisely. Based on the keyword "Sino-U.S. relations," we conclude that the two countries can avoid conflicts only when China and the United States deeply understand each other's differences and promote exchanges and cooperation. This study also provides a historical diagnosis of the development of Sino-U.S. relations. Further research on this topic can explore other aspects based on more specific time series data.

Data availability statement
The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

Author contributions
JC proposed the question and method; WW collected the data and produced all figures; LW finished the manuscript. All authors confirmed the final version of the paper.

Funding
This work was supported in part by the Research Funds for Philosophy and Social Science Disciplinary Subject under Grant 21GH031111, Northwestern Polytechnical University.