Association, Correlation, and Causation Among Transport Variables of PM2.5

The issue of P M 2.5 pollution has received significant attention in the literature as it has social, economic, and political implications. Big data sets have been collected by pollution monitoring stations (i.e., nodes) throughout the world, and this has made it possible to quantitatively characterize the dependence of P M 2.5 pollution in different regions. Here we divide the dependency relationship into three types: association, correlation, and causation. This study conducted such relationships using three approaches: the random matrix theory (RMT), cross-correlation, and convergent cross-mapping (CCM). The aim of this study is to determine the above three relationships between pollution data from different nodes. A random matrix analysis revealed that pollutant time series are not completely random, but are associated. Further analysis showed that P M 2.5 sequences had clear short-range correlations, yet the long-range correlations were blurred. Moreover, at the collect level, there were no clear causalities among pollutant concentrations from different geographical regions, regardless of distance and direction. These results indicate that the dependence of P M 2.5 pollution between different sites is complex. Nonetheless, this comprehensive analysis based on big data provided insights into critical issues of general interest, including pollution-induced climate change, and pollution abatement.


INTRODUCTION
Air quality is a significant environmental concern around the world. Air pollution and aerosols have significant impacts on human health, climates, meteorological phenomena, and the environment, and many studies have focused on these effects [1][2][3][4]. PM 2.5 , as an airborne particulate, is the deadliest form of air pollution due to its ability to penetrate deep into the lungs and bloodstream, unfiltered [4]. This allows it to cause permanent DNA mutations, heart attacks, and premature deaths, and lead to the deaths of three to seven million people every year [5,6]. Moreover, the local nature of air pollution means that the particle can significantly impact temperature, precipitation, and extreme events at a regional level, e.g., aerosols affect regional climates and ocean-atmosphere feedback [2,7]. Thus, as Booth et al. have demonstrated, anthropogenic aerosol emissions influence historical climate events, such as peaks in hurricane activity and Sahel droughts [8]. Thus, scientists have developed new ways to understand the different factors that contribute to poor air quality, and this research has been used to develop observational systems using models and data and to assist decision makers with air quality assessments.
To investigate these influences, Rohde et al. developed a technique for mapping air pollution concentrations and sources using data from monitoring stations; after studying PM 2.5 pollution in China for about four months, a shortdistance effect was found [9]. Dai et al. analyzed six pollutants in 350 Chinese cities and found both long-term correlations and a relationship between spatial correlations and provincial administrative divisions [10]. Additionally, teleconnections were found to indicate relationships between climate anomalies at significant distances (i.e., thousands of km) [11,12]. Moreover, Yu et al. conducted mineral analyses to demonstrate that the long-range transport of soil particles contributed significantly to high concentrations of PM 2.5 during "dust days" [13]. Kaneyasu et al. focused on the impacts of long-range PM 2.5 transports in Kyushu area and noticed that the PM 2.5 concentration is primarily dominated by the inflow of long-range transported aerosols [14]. Perrone et al. demonstrated that Mediterranean sites may be affected by long-range transported pollution, and its pollution depends on the airflow [15]. Zhang et al. found that the strongest correlation between winter and the PM 2.5 concentration in the North China Plain, which is mainly caused by the transport of PM 2.5 [16]. Recently, numerous researchers have used network methods to study climate and environmental issues, where nodes in the network represent geographic coordinate sites, and crosscorrelation and mutual information about the time series of two nodes are used to represent connected edges. And they found that climate networks had very strong links that were caused by a proximity (i.e., distance) effect. Namely, pairs of sites close to each other (less than 2,000 km apart) were often strongly and positively correlated [17][18][19][20][21][22][23][24][25][26].
Technically speaking, the association is different from the correlation. Association means that one variable provides information about another, but correlation means that two variables show an increasing or decreasing trend. Correlation means an association, but not causation. On the contrary, causality means an association, not correlation [27]. Despite this research, little attention was given to PM 2.5 in regards to different temporalspatial scales and causal-effect among different sites. It was generally accepted that climate change causality detection and human crizes had important roles in future research on climate and environmental policies [28][29][30].
In order to explore the association, correlation and causality between PM 2.5 time series in different regions with the changes in the direction and distance of the monitoring sites, we use the RMT method to detect the association, the cross-correlation method to measure the correlation, and finally the CCM to understand the causality. The latter was recently developed as a non-linear dynamics based method for ascertaining and quantifying causal relationships between time series [30,31]. The results of the study are as follows. First, the RMT analysis revealed that the time series of PM 2.5 at different sites is not completely random, and there is a certain association. Secondly, a conventional correlation analysis indicated that there is a clear short-distance correlation between the time series from different sites, but there is no concise and clear correlation in the longdistance range. Finally, the analysis results of the causality detection algorithm CCM demonstrated that at the collect level, the causality between the PM 2.5 time series of different nodes is not obvious, and there is no distinct relationship with the distance and direction between the sites.

DATA COLLECTION
Empirical big data sets were analyzed for this study. They were obtained from global, Chinese, and American monitoring stations. Time resolutions were given in hours, which allowed the researchers to capture time evolutions in relation to PM 2.5 . Moreover, in order to meet calculation requirements for correlation and causality algorithms, the data was cleaned: empty data segments were removed, it was ensured that the length of each original time series was more than 8,760 h (about 1 y: 24 p 365), and all the time series were adjusted to have common starting and ending points. As a result, the length, L h , of each time series (after the cleaning) was less than 8,760 h. The basic statistical properties of the filtered data sets can be seen in Table 1, where N is the number of nodes (i.e., cities, counties, or regions) and L h is the length of each time series in an hourly resolution. Detailed descriptions of the different public data sets are given below.

Global Stations
The global data was collected from [32]. It comprised names, longitudes and latitudes, recorded times (years, months, and hours), and PM 2.5 values. The data was obtained from December 2016 to December 2017 through a monitoring network that operated in 632 regions (or cities) across the world.

Chinese Stations
The Chinese data was collected from [33]. It comprised names, longitudes and latitudes, recorded times (years, months, and hours), and PM 2.5 time series. The data was obtained from January 2015 to June 2017 through a monitoring network that operated in 365 cities across China.

United States Stations
The American data was collected from [34]. It comprised names, longitudes and latitudes, and PM 2.5 time stamps. The PM 2.5 data was divided into two categories, firm and non-firm, and the nonfirm data was used for the analyses. The data was obtained from January 2016 to December 2016 through a monitoring network that operated in 137 regions (or counties) across the United States (USA).

METHODS
This chapter discusses the two correlation calculation methods, i.e., cross correlation and the RMT, which were used in this study to determine correlations between nodes. Moreover, this chapter discusses the causality detection algorithm, CCM, which was used to measure causalities between the nodes. Finally, the azimuth (α) polar coordinate system and distance (d) are discussed.

Cross Correlation
Previous studies used cross-correlation analyses to measure correlations between node distances [17][18][19][20][21]. For this study, a cross-correlation method [18][19][20] was used to calculate the W max and hour resolution (L h of each time series; (see Table 1). Giveñ T d (h), where d was a day and h was an hour (from zero to 23), each filtered record was defined as where L d was the total number of days, as shown in Table 1. For each pair of nodes, i.e., i and j, a cross-correlation in time series was calculated as follows.
where σ T d i (h) was the standard deviation of T d i (h), τ was the time lag with a max value of 30 days and C d ij (τ) C d ji (−τ). Maximum time lag was defined as τ max , with which C d ij (τ max ) is maximum. Then, positive link weights (W max ) were defined as follows.
where the average was a mean and standard deviation was denoted by "std."

Random Matrix Theory
A challenge could arise when interpreting correlations involving the PM 2.5 time series in that the exact natures of interactions were unknown. However, the RMT was a significant theory in data analysis often used to extract underlying information in a time series. Therefore, with minimum assumptions about random Hamiltonian statistics and a real symmetric matrix with independent random elements, the RMT was implemented to address significant amounts of spectroscopic data in regards to the energy levels of complex quantum systems [35,36]. The simplest way to determine correlations between the different time series was to use the equal time cross correlation matrix C, which had elements of one in that τ 0 [37,38]. After this was completed, the statistical properties of matrix C were determined by employing the RMT's processes. Following this RMT procedure, C was first diagonalized and the eigenvalue λ was obtained. Next, p(λ) was defined as an eigenvalue density as follows.
where N is the number of nodes (as shown in Table 1), and n(λ) was the number of eigenvalues for C that were less than λ.
Following previous studies, it was determined that Q ≡ L d /N and L d indicated the length of a time series in a day resolution. Then, p(λ) was computed as follows.
where λ min ≤ λ ≤ λ max and σ 2 were equal to 1 with this normalization. Additionally, λ max and λ min were calculated as follows.
It should be noted that although the RMT was a powerful method for identifying clues to the underlying interactions of the systems, its parameter choices differed slightly from the datasets. The basic parameters of the RMT used in this study are listed in Table 1, where λ max and λ min are the maximum and minimum eigenvalues from Eq. 4 and λ real max is the maximum eigenvalue from the real data.

Convergent Cross-Mapping
Causality has been investigated in many studies, such as social, economic, climatology, and gene perturbation experiments [39][40][41]. Indeed, identifying causality in complex systems can be difficult but exciting in nature, and determining causal relationships is pertinent to many disciplines with broad applications. Traditionally, Granger causality analyses can be used as paradigmatic frameworks to determine such relations [39,42]. However, Granger causality is linear and multivariate in nature and involves statistical regression, so various methods derived from such causality are required for extensive data [31]. Entropy based methods result in similar difficulties [43], but CCM [31] is based on non-linear time series analyses [44] and was developed to overcome these challenges. That is, CCM is powerful for detecting and quantifying causations between pairs of dynamic variables based on time series [31].
For this study, a phase space was developed for each variable based on a delay-coordinate embedding method [44]. For example, for time series where τ was a delay time and E x was an embedding dimension. The same could be done for time series y(t) of variable y to yield a reconstructed vector in dimensional space E y . The basic principle was to compare the predictions in each subspace. Consider the pair of vectors [X(t), Y(t)] at time t, one vector from each subspace. In subspace Y, one could find a set of neighboring vectors for Y(t) and identify the corresponding set in subspace X based on which one could be used to predict the value of X(t). The difference between X(t) and its predicted value characterized the accuracy of the prediction. Similarly, based on neighboring vectors in subspace X, a prediction in subspace Y could be made. Comparing prediction accuracies regarding the two subspaces could determine the causal relationship between X and Y. The principle underlying this method was asymmetry in regards to directional predictability. Suppose one wished to detect a causal interaction between two subsystems with the state variables X(t) and Y(t), respectively. Using X(t), the value of Y(t) could be predicted, such as Y(t), and correlation ρ YX (t) could be predicted between Y(t) and Y(t). Similarly, using Y(t), a prediction could be obtained for X(t), such as X(t), and the correlation ρ XY (t) between X(t) and X(t) could be calculated. If no causal relationship existed between X(t) and Y(t), the predictions in both directions were even, so statistically, the correlations ρ YX (t) and ρ XY (t) could not be distinguished from each other. That is, However, if X(t) was more a cause of Y(t) than the opposite, the prediction X(t), obtained from Y(t), was better than that of Y(t) [from X(t)]. This was because information about X(t) was contained in Y(t). Thus, it was determined that ρ XY (t) was greater than ρ YX (t), or Δ was greater than 0. Moreover, statistically positive Δ values could be considered heuristic criteria for determining that the direction of the causal interaction was from X(t) to Y(t). Likewise, if Δ was less than 0, it indicated that Y(t) was more a cause of X(t) than the opposite [31].

Polar Coordinate System
Generally, the polar system is a two-dimensional coordinate method in which each point on a plane is determined by a distance from a reference point and an angle from a reference direction. The reference point (analogous to the origin of a Cartesian coordinate system) was called a pole, and a ray from the pole in the reference direction was the polar axis. The distance from the pole was called a radial coordinate, radial distance, or simply radius, and the angle was called an angular coordinate, polar angle, or azimuth.

Distance
Orthodromic distance, the shortest distance between two points on the surface of a sphere, was measured along each surface. In particular, for any two i and j points specified by ϕ i , η i and ϕ j , η j , where ϕ was a geographical latitude and η was a geographical longitude, Δϕ and Δη were absolute differences. The spherical law of cosines was then used for the central angle Δσ between i and j as follows.
Δσ arccos sin ϕ i · sin ϕ j + cos ϕ i · cos ϕ j · cos Δη . (6) The distance was obtained using d rΔσ, where r was the radius of the sphere.

Azimuth
Azimuth, denoted as α, was defined as a horizontal angle measured clockwise from a north base line or meridian. For example, for reference point i with the latitude ϕ i and the longitude η i , the azimuth of point j (ϕ j , η j ) was determined using the following equation [45].
As Eq. 7 returned a value in the range (−180 + ,180 + ), the result was normalized to a compass bearing in the range (0 + , 360 + ). The transformed formula was as follows: where % is (floating point) modulo. Then, moving clockwise in a circle, the east, south, and west directions had azimuths 90 + , 180 + , and 270 + , respectively.

RESULTS
This section introduces the analysis results of the RMT algorithm, cross-correlation algorithm, and CCM algorithm. First, we statistically analyze the distribution of the direction and distance of the monitoring stations. Secondly, the RMT algorithm is used to calculate the association between the monitoring stations. Then, the correlation between the stations is analyzed using the crosscorrelation algorithm. Finally, it shows the changes of CCM causality between different sites in different directions and distances.

Distribution of Azimuth α
As aforementioned, one of the main aims of this work was to study correlations and causalities between PM 2.5 monitoring station data sequences in different directions. Therefore, the distribution of PM 2.5 monitoring site directions was significant. This section discusses the distribution of the azimuth α. As can be seen in Figures 1A-C, the azimuth α distribution for the three different data sets (Global, China and the United States) had peaked at different values of α. Figure 1A reports the four main peaks for α in the global data, which was the largest dataset at around 45 + . Most neighbors were located in the northwest section of the region. Similarly, many neighbors were located in the east (≈ 90 + ), northeast (≈ 60 + ), and southwest (≈ −110 + ) sections. Additionally, Figure 1B shows that cities neighboring each other in China were generally distributed in the northeast and southwest directions of the city. Further, Figure 1C demonstrates that the neighboring cities in the USA United States were generally located to the east and west. These results were in agreement with the distributions of urban belts globally, in China, and in the United States. It should, however, be noted that most of the PM 2.5 detection sites were located in urban, i.e., densely populated areas; PM 2.5 monitoring in non-urban or sparsely populated areas was needed. Nonetheless, this study's examination of α distribution diversity provided an understanding of the influence of α on the associations, correlations and causalities of PM 2.5 sequences. These results help to understand the distribution of the direction as a whole and avoid the deviation of the conclusion caused by the statistical differences of the direction in the subsequent data analysis.

Distribution of Distance d
To study trends regarding correlations and causalities between PM 2.5 sites and the distances between the monitored locations, the distance distributions of the sites were needed. Figures 1D different values of d. The distance distribution for the global data set had two peaks, while the distance distributions for the Chinese and American data sets had only one peak each. The two peaks noted in the global site distribution ( Figure 1D) suggested that the monitoring sites were distributed throughout two relatively concentrated places. In contrast, the single-peak distributions of China and the United States ( Figures 1E,F) showed that the monitoring sites were relatively concentrated and closely connected. Nevertheless, these distributions were not perfectly bimodal or normal. One reason for this is that the distributions of the monitoring sites conformed to non-uniform population distributions. These results help to understand the distance distribution of the monitoring stations, and at the same time avoid the deviation of the conclusion caused by the difference in distance distribution in the subsequent data analysis.

Random Matrix Theory
The RMT was helpful for comparing the properties of a null hypothesis purely random matrix (a strictly independent and identically distributed random time series) to those of the empirical correlation matrix C. Deviations from the purely random matrix could suggest the presence of underlying interactions [37,38]. The RMT method was thus used to study the statistical properties of C in regards to the cross correlations of PM 2.5 changes. Initially, the elements of C were from Eq. 1 when τ 0; then, they degenerated into Pearson's correlation coefficient ρ from a two-time series. Figures  2A-C demonstrates the distribution of ρ for the global, China and United States data sets, respectively. The means, ρs, and standard deviations, σs, of the three distributions were as follows: ρ ≈ 0.04 and σ 0.14 for global sites; ρ ≈ 0.11 and σ 0.11 for the Chinese sites; and ρ ≈ 0.04 and σ 0.08 for the American sites. Thus, a clear deviation could be seen between the distribution of ρ and the curve fit Frontiers in Physics | www.frontiersin.org July 2021 | Volume 9 | Article 684104 by the normal distribution. These results indicated that the PM 2.5 time series associations for the data were not completely random [37,38]. Future research should investigate the causes of these non-random associations, such as whether they were affected by climatic conditions. These non-normal distributions suggested the existence of non-trivial relationships between detection sites. As aforementioned, C was diagonalized to obtain λ eigenvalues. Finally, the distributions of the eigenvalues from the empirical time series were considered in regards to the finite strictly independent and identically distributed random time series. Figures 2D-F represent the distributions of the eigenvalues of real (green bar) and random (red bar) time series globally (d), for China (e), and for the United States (f) with the length L d (see Table 1). As can be seen, there were dramatic differences between the random series and the real PM 2.5 time series. The results are qualitatively similar to those of earlier studies about the global crude oil market and the global stock market, in which they observed that the largest eigenvalue reflects the collective effect of the global market, the second to fifth largest eigenvalues can distinguish six clusters, and the smaller eigenvalues portray the time series pair with the largest correlation coefficient [46,47].
Moreover, the RMT predictions indicated that the distributions of the eigenvalues should follow the black dash line, which shows a distinct deviation from the real PM 2.5 time series but it is in good agreement with mimic random series. This result indicates that there are deviations in the eigenvalue distributions of the correlation matrix from the empirical data from purely random time series, implying that PM 2.5 time series are not purely random but with a finite amount of association. These findings were consistent with results obtained by examining the distribution of ρ, as shown in Figures 2A-C. However, this work represented only a preliminary attempt to identify the associations of real PM 2.5 time series. The actual relationship may be more complex, and the underlying mechanism of the associations was not within the scope of the RMT. This indicates that there is a nonrandom association between PM 2.5 sites, suggesting that there is some association between our PM 2.5 sequences, for example, the correlation changes with distance [17][18][19][20][21]. Furthermore, future research should focus on the time-lag cross-correlations RMT, because this method focuses on the magnitudes of the sequence, so that the method can quantitatively mine the long-range collective movements hidden behind the short-range correlation features [48].

Illustration of Cross-Correlation
To characterize the transport dynamics of PM 2.5 , this study aimed to determine the various hidden relationships between the distinct timeseries measurement nodes. A straightforward method was used: crosscorrelation. It had been used in research on pollution transports to detect Rossby Waves, teleconnection paths, and El Niño impacts [18][19][20]. Relationships depended on distance, and with the growth of distance, relationship values followed some specific features (e.g., monotonic decreases or increases to relation values, distances that showed concentration values, and more). To understand the relationships between distances (radial) and angles (azimuth αs) from a site (i.e., a pole or reference point), the polar coordinate system was employed (see Methods). For example, given the PM 2.5 time series recorded from a large number of monitoring stations (nodes) in a given geographical region, a distance and azimuthal angle could be calculated for each nodal pair. The value of the correlation could then be represented in terms of color in the polar coordinates for distance and angle. Initially, it was demonstrated that cross-correlation was associated with distance, as shown in panels (a), (b) and (c) of Figure 3. Further, Figure 3A was a representative case in which correlation decreased with distance, regardless of direction. Finally, Figure 3B showed correlations that were random in regards to both distance and direction, and could have been a result of a completely random time series. In contrast, Figure 3C revealed that as the distance increased, the correlations between the stations became increasingly stronger but did not suggest differences in regards to direction. These figures thus indicated the different types of relationships between distance and crosscorrelation.

Pearson's Correlation Coefficient ρ in Polar Coordinates
As aforementioned, one aim of this study was to detect variations in correlations between different sites with distance d and direction α (see Methods). Pearson's correlation coefficient was a classic method of measuring a correlation between two sequences. Using the method, a distribution was thus obtained for Pearson's correlation coefficients (ρ) for the three different data sets in the polar coordinate system (d, α). According to Figure 3D, the global data had a clear anticorrelation between Pearson's correlation coefficient ρ and distance d; that is, the shorter the distance, the stronger the correlation. However, Pearson's correlation coefficient ρ did not have significant differences in regards to the different directions, α. Similar results were observed for both China and the United States, as shown in Figures 3E,F. However, the anti-correlation characteristics of ρ and d for the United States and China data were not as apparent as those in the global data, which may have been due to a relatively small number of observation stations in the United States and China. In the future, more sites and further detailed data are needed to study the relationships between ρ, α, and d. This result, on a larger scale and more source data, validates the inverse correlation between PM 2.5 sequences and distances found in previous studies [9].

Cross-Correlation of W max in Polar Coordinates
In recent climate network models articulated to study the spatiotemporal behavior of the climate system, nodes denote geographical coordinate sites and a link between a pair of nodes is defined by the cross-correlation and mutual information between the time series from the two nodes [17][18][19][20]. Following this approach, one related type of crosscorrelation, denoted as W max (see Methods) was calculated: so-called positive link weights between PM 2.5 recordings and pairs of nodes for the L h time series. Detailed information on the (L h ) data is shown in Table 1 and the Methods section.
As aforementioned, one goal of this study was to determine whether there were long-range correlations between the PM 2.5 recordings from the different sites. To accomplish this goal, a polar representation of cross-correlation was used. Specifically, for any nodal pair, the distance d and the azimuthal angle α could be defined (e.g., the zero angle 0 + meant that one node was exactly north of another node, and 90 + indicated that one node was east of another node). The correlation between the nodal pair was then color coded and represented in the polar coordinates (d, α).
In Figure 3G and in regards to worldwide PM 2.5 s, W max correlation values were represented by color-coded dots in the polar coordinates. Larger values were noted for the positive link weight W max at short distances (fewer than hundreds of km). Small W max values were distributed approximately uniformly at larger distances and in other directions. This phenomenon was in agreement with climate network results and was called the proximity (distance) effect [18][19][20]. Namely, pairs of sites close to each other (fewer than 2,000 km apart) were often strongly positively correlated [18,21].
However, there were no long-range correlations for the PM 2.5 time series. Similar results were obtained for the China and American PM 2.5 s, as shown in Figures 3H,I, respectively. While this was not ideal, a few points had the large positive link weight W max distributed across a long-distance range. The reasons for these outliers were not the focus of this paper, but future research should investigate these anomalies with more detailed data. The phenomenon implied that PM 2.5 could not transmit at long ranges. Nonetheless, these results did not seem to depend on the length of the time series, insofar as it was reasonable, as shown in Figures 3G-I for the respective data sets. Moreover, the results indicated only short-range correlations and a lack of long-range correlations, which suggested that PM 2.5 could transport across only short distances (fewer than hundreds of km) [9]. This result was in sharp contrast, for example, to climate phenomena in teleconnections [11,12] and temperature [18][19][20]. These results indicate that the PM 2.5 sequence and the temperature etc. meteorological sequences are different, and there is no teleconnections. Furthermore, our results only display the lack of long-range correlation between different sequences in the spatial distance range, but a large number of previous studies including detrended fluctions analysis (DFA), Detrended Cross-Correlation Analysis (DCCA) and multifractal detrended fluctuation analysis (MFDFA) have observed that there is a long-range correlation in the time dimension [49][50][51]. Therefore, future research should pay attention to the correlation of PM 2.5 sequence at the time dimension, so as to be able to deeply understand the trend of PM 2.5 . Moreover, future research should focus more on the deeper causes of different transmission phenomena, for example, the difference in transmission media [52][53][54][55].

The Distribution of Causation Δ
Although previous research expressed non-trivial correlations between monitoring stations in regards to distance, the detection of causality between stations had always been a problem of great theoretical significance and practical value [39][40][41][42]. In general, causation and correlation are not equivalent to each other: two time series can be highly correlated but without any causal relation Frontiers in Physics | www.frontiersin.org July 2021 | Volume 9 | Article 684104 8 [27]. However, this study applied CCM to the PM 2.5 data to determine the existence of causal relationships between the time series and the different geographical locations [31]. This method is suitable for nonlinear systems in the presence of noise [31,56].
In particular, for a nodal pair, non-zero Δ value could indicate a causal relationship between the PM 2.5 time series. First, the relative strength of causation Δ was computed (see Methods) for the three datasets; their representative distributions can be seen in Figures 4A-C. These distributions showed that the causation Δ distribution fit the data well in regards to normal distributions of the global, Chinese, and American data, respectively. It was found that the means, Δs, and standard deviations, σs, of the three distributions were as follows. Δ ≈ 0.00 and σ 0.02 for the global data; Δ ≈ 0.00 and σ 0.02 for the Chinese data; and Δ ≈ 0.00 and σ 0.02 for the American data. These perfect normal distributions suggested that there may be no significant causality between the PM 2.5 sequences at the collective level. Although no clear causality was observed at the collective level, this is a useful exploration of the causal-effect in the PM 2.5 sequence data. Causality detection would have important roles in future research on climate and environmental policies [28][29][30].

Causation Δ in Comparison to Azimuth α
In order to deeply understand the relationship between causation Δ and direction α between PM 2.5 monitoring sites, we examined the trend of causality with the directions of the sites. It is apparent that in all three data, causality is basically symmetrically distributed in different directions. In particular, Figure 4D shows that the mean value of Δ followed the straight line Δ 0. This result indicated that at the collective level, the causalities and directions between the different PM 2.5 stations were irrelevant. The dense distribution in some directions, e.g. α ≈ 45 in Figure 4D was consistent with the distribution characteristics in Figure 1A. Additionally, Figure 4E,F indicated similar relationships concerning China and the United States. It was thus concluded that there is no apparent causal-effect trend among the different sites in the different directions. It is apparent that one cannot simply judge the causality of PM 2.5 observation sites based on their direction. Frontiers in Physics | www.frontiersin.org July 2021 | Volume 9 | Article 684104 9 4.4.3 Causation Δ in Comparison to Distance d As aforementioned, a non-trivial correlation was present between the monitoring sites as distance d changed. This study thus investigated causality trends between the monitoring sites and distance d. Figure 4G revealed that as distance d changed, causation Δ was almost evenly distributed on both sides of the central Δ 0 axis. This result clearly indicated that at the collective level, there were no relations between causality and distance d. A similar result was observed in both cases from China ( Figure 4H) and the United States ( Figure 4I). Although no consistent causality was observed at the collect level, a relative causality seemed likely at the individual level. Future research should therefore investigate this at the individual level, particularly the impact of PM 2.5 sequence length. This result indicates that there is no definite conclusion about the causality and distance between PM 2.5 monitoring stations, and they should be studied separately according to the different situations of PM 2.5 stations.

CONCLUSION
With the development of data collection technology, more and more studies focus on PM 2.5 sequence, some studies focus on the impact of PM 2.5 sequence on climate and meteorology in some places and periods, and others focus on the correlation of PM 2.5 sequence and distance. However, these studies rarely focus on the various relationships of PM 2.5 in a large-scale spatiotemporal range, not to mention the causal-effect between PM 2.5 sequences. We have conducted a study of the relations of PM 2.5 time series based purely on massive data. Our statistical and nonlinear analysis of worldwide PM 2.5 time series over the period of one year leads to a number of findings. Firstly, a random matrix based analysis indicates that the spatiotemporal PM 2.5 data are not purely random, but with associations. Secondly, correlation among the PM 2.5 time series exists over a short distance, which is consistent with the first finding. However, there is lack of consistent Long-range cross correlation, suggesting that transport of PM 2.5 over long distance is complex and changeable. Thirdly, since correlation does not imply causation in general, a causality analysis is necessary to assess the likelihood of Long-range transport, which we carry out by employing the nonlinear dynamics based CCM method. The analysis reveals an unequivocal absence of consistent indication of transport of PM 2.5 over long distance (e.g., over 1,000 km). The simultaneous absence of consistent long-range correlation and statistical causation leads to the conclusion that transport of PM 2.5 over long distance is sophisticated and varied.
It should be noted that these analyses were only statistical. However, for numerous pairs across a significant distance, a lack of consistent and definitive casual relations was found (as shown in Figures 4D-I. That is, over the given distance, the direction of causal interaction appeared completely random, and this implied a lack of causation in general. These findings extended prior work about PM 2.5 series to a large spatial-temporal scale with causality analysis [30]. Moreover, an absence of consistent long-range transport was found for three different data sets, both with cross-correlation analyses and causality detection methods. This finding is promising, however, it should be explored using other association and causal analyses as well as different data sets. Generally, PM 2.5 pollution affects only a short distance (no more than several hundred km); policies should be changed to address this. However, while this study was an empirical analysis of direct PM 2.5 time series collected from various monitoring stations, it was only a preliminary attempt to identify the associations, correlations and causal effects of real PM 2.5 time series. Actual relationships may be more complex, and the underlying mechanisms that cause these relationships were not within the scope of this study. Our results demonstrate that the random matrix theory can also play an important role in PM 2.5 sequences. Future research should pay more attention to the specific meaning of eigenvalues and their corresponding eigenvectors [46,47]. Meanwhile, the research on magnitude based on the time-lag cross-correlations RMT is also a very promising direction [48]. In addition, future research should focus on the correlation of PM 2.5 in the time dimension [49][50][51]. Moreover, this analysis did not investigate other pollutants, such as CO X , NO X , and SO X , etc [10], nor did it consider meteorological variables, such as temperature, relative humidity, precipitation, cloud cover, wind speed, and wind direction [57]. Additionally, it did not conduct mineralogical composition analyses [52][53][54][55], Total Ozone Mapping Spectrometers [54], or climate models and biogeochemical interactions [8]. Further research should investigate these topics and examine the role of PM 2.5 in the spread of disease, especially concerning the recent impact the coronavirus (COVID-19) has had on the world [58]. Nonetheless, this research is significant as a basis for these future researches.