Spatio-Temporal Representativeness of Air Quality Monitoring Stations in Mexico City: Implications for Public Health

Assessment of the air quality in metropolitan areas is a major challenge in environmental sciences. Related issues include the distribution of monitoring stations, their spatial range, and missing information. In Mexico City, stations have been located throughout the entire Metropolitan zone for pollutants such as CO, NO2, O3, SO2, PM2.5, PM10, NO, NOx, and PMCO. A fundamental question is whether the number and location of such stations are adequate to optimally cover the city. By analyzing spatio-temporal correlations for pollutant measurements, we evaluated the distribution and performance of monitoring stations in Mexico City from 2009 to 2018. Based on our analysis, air quality evaluation of those contaminants is adequate to cover the 16 boroughs of Mexico City, with the exception of SO2, since its spatial range is shorter than the one needed to cover the whole surface of the city. We observed that NO and NOx concentrations must be taken into account since their long-range dispersion may have relevant consequences for public health. With this approach, we may be able to propose policy based on systematic criteria to locate new monitoring stations.


INTRODUCTION
As population density, mobility, and industrial activity keep growing at an accelerated rate, air pollution has gained the attention of policy makers in urban and metropolitan areas. There is a common concern in highly polluted cities regarding the increasing mortality associated with chronic and acute diseases whose effects may be aggravated due to exposure to air contaminants (1)(2)(3).
It is well-known that different diseases or health-related effects depend on both the exposure time and concentration levels (1). As evidence suggests, not only can long periods of exposure be damaging, but exposure to high levels over short periods, even a few hours, may also have an immediate negative impact (4,5).
Mexico City (Figure 1), like many other metropolises worldwide, has implemented strategies for urban planning, transportation, and regulation of industrial activity to reduce contaminant emissions (6). As an example, the Metrobus transport system started operating in 2005 as an emission reduction strategy. By comparing CO, NOx, PM10, and SO2 measurements before and after the Metrobus began operating, a reduction ranging from 5 to 9% for different contaminants in city areas was observed (7). Another example is the driving restriction policy in Mexico City, which was originally set only for weekdays. In an attempt to improve results, the program extended this restriction to Saturdays, without achieving the expected emission reduction of almost 15% (8).
So far, these efforts have not been as successful as planned. Pollution levels have not decreased, as is evident from the recurring environmental alerts over the years. Some contaminants, such as particulate matter PM10, vary seasonally; moreover, regulations that are effective for one type of pollutant may not be useful for others (9).
In this regard, public policies will only be effective if they rely on the proper identification of pollution sources, the understanding of the dispersion dynamics, and the adequate measurement of relevant variables.

The Relevance of Pollution Monitoring and Assessment
Determining the number and distribution of air quality monitoring stations depends on the area to be covered, traffic, spatial variability due to land use, influence of meteorological variables (temperature, wind speed, and ultraviolet radiation), and dispersion dynamics of each pollutant (10)(11)(12).
Environmental policy planning needs reliable methods to assess the risk level associated with exposure to chemical and other noxious agents. Such assessment can be made through direct and indirect measurements of pollutants with epidemiological and toxicological dimensions.
Direct approaches require the estimation of the incidence of undesired effects by considering individual exposures to contaminants. Environmental hazard assessment of this kind often relies on the analysis of spatial data collected by environmental surveys (13).
Monitoring networks have two main purposes. First, by measuring spatial and temporal trends of pollutant concentrations, they provide air quality estimations to determine whether the population is exposed to dangerous levels. In addition, combined with sociodemographic variables, land-use variables, and meteorological data, simulation models can guide better decision-making procedures.
Second, once public policies and regulations are implemented, their effectiveness can be evaluated by analyzing the changes in pollution levels that are attributable solely to the imposed regulations. Thus, the development of a monitoring system is a critical component of public health policy making to decrease toxic emissions and eventually protect the population from adverse contaminant effects.
A relevant emerging concept is environmental health surveillance. For this concept, the quality and completeness of information has been found to vary depending on individual hazards or exposures, even in well-developed public health surveillance systems, such as the one in Canada (13).
The Mexico City Ministry of Environment (Secretaría del Medio Ambiente, SEDEMA) is responsible for the establishment of measuring procedures, data gathering, and reporting air quality levels (14). The estimations are based on the measurement of carbon monoxide (CO), nitrogen dioxide (NO2), ground-level ozone (O3), sulfur dioxide (SO2), particulate matter (PM2.5 and PM10), nitric oxide (NO), nitrogen oxides (NOx), and coarse particulate matter (PMCO).
Although Mexico City's monitoring network meets international standards, it lacks complete records for all the contaminants. Some continuous monitoring stations stopped functioning temporarily for technical reasons or maintenance, others stopped operating altogether, and in some cases measurements were not registered even though the station was still active.
The appropriate functioning of such monitoring stations is extremely relevant to public health issues (15). It is known, for instance, that ozone and particulate matter (PM2.5) levels have been closely associated with a number of adverse health effects that may lead to premature mortality (16). Such effects are particularly relevant in the context of urban environments (17).

Evaluation of Health Impact and Monitoring Stations
The progressive incorporation of information sampling and retrieval technologies and the use of geographic information systems (GIS) to analyze the data have become a central tenet of Health Impact Assessment (HIA) programs. The way the data are analyzed, however, is shifting from mere transaction reports to the use of advanced analytics, such as those used in business intelligence and data science.
Latin American countries have developed specialized programs to make use of GIS and computational intelligence to improve their HIA programs. Initiatives such as ALBA, GeoSur, or, in the case of Mexico, the Global Environmental Outlook (GEO) aim in this direction (13).
GEO has indeed developed its own strategy within the "geotext" framework in order to use spatial analysis to provide policy makers (and even the public) with enhanced information resources; however, these resources are only as good as the information they are based on (18,19).
In the case of air pollution monitoring stations, the WHO has actually advanced some guidelines as to what standards are desirable for the data sources to be useful in the context of HIA programs (20).
Mexico City partially meets these standards; however, our results show that some aspects need improvement, in particular when the size and urban characteristics of the metropolitan area of Mexico City are taken into account, as large urban areas pose particular environmental challenges (21).
It has been discussed that increased risks created by urban development include unhealthy conditions, which may arise from unplanned settlements or rapidly growing urban environments, environmental pollution by over-concentration of waste and other pollutants, and overcrowding, among others (22).

The Question of Spatial Representativeness
Determination of the spatial representativeness of background monitoring stations from concentration measurements of air pollutants has been a matter of intense research (23)(24)(25)(26)(27). It has been shown that the size and shape of representative areas differ between pollutants and measured locations, and representative areas may range from 220 to 4,500 km2 (24).
To improve the assessment of coverage estimation in the case of a limited number of stations, detailed pollutant concentration maps at pedestrian level have been used (27). In this example, for Pamplona, Spain, the authors found that ∼18% of the entire area is well-represented, as most of the residential areas are included. This result suggests that it is possible to assess the area covered by air quality networks made up of a limited number of stations for a small city (23 km2).
The most complete study on the spatial representativeness of monitoring sites is the JRC Technical Report developed by the Forum for Air Quality Modeling in Europe (FAIRMODE) (28,29). The aim was to perform an inter-comparison of 25 assessment methods from 14 different countries based on a literature review of scientific journals and technical documentation.
The outcomes of the above-mentioned study were established to define spatial representativeness and to propose standard methodological procedures for European country members. The different methodologies can be categorized according to their assessment criteria, such as modeling, measurements, proxies, station classifications, and annual concentrations. The outputs from these studies are presented as delimited areas or size parameters.
In order to have an adequate assessment of the effectiveness of those monitoring networks, the city's spatial heterogeneities should be taken into account.
To estimate the concentrations at unmeasured locations, interpolation methods, such as land-use regression (LUR), inverse distance weighting (IDW), or kriging, use historical data from monitoring stations and other monitoring procedures (30,31). These estimations are mainly used for health risk assessment. Prediction of high values and trends helps to guide decisions both locally and citywide.
Recently, the kriging geostatistical approach has been used to analyze spatial representativeness from preliminary NO2 concentrations in urban areas (26,32). The kriging methodology for spatio-temporal interpolation is based on the covariance structure of the data at the spatial or spatio-temporal level. To that end, the empirical semivariogram is modeled with a parsimonious covariance structure, through the use of different kernel functions, to determine the spatial and temporal correlation range.
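As an illustration of the simplest of the interpolators mentioned above, the following Python sketch implements inverse distance weighting. It is only a minimal example: the planar station coordinates, the concentration values, and the power parameter of 2 are hypothetical and are not taken from the studies cited here.

```python
import math

def idw(stations, target, power=2.0):
    """Inverse distance weighting estimate at `target` = (x, y).

    `stations` is a list of ((x, y), value) pairs in planar coordinates.
    """
    num = 0.0
    den = 0.0
    for (x, y), value in stations:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # the target coincides with a station
        w = 1.0 / d ** power  # closer stations receive larger weights
        num += w * value
        den += w
    return num / den

# Hypothetical stations (coordinates in km) and concentrations (arbitrary units)
stations = [((0.0, 0.0), 30.0), ((10.0, 0.0), 50.0)]
midpoint = idw(stations, (5.0, 0.0))
```

At a point equidistant from both stations the estimate is their plain average; unlike kriging, the weights depend only on distance and not on the correlation structure of the data.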

Intervention Policy and Assessment
The development of analytical approaches to determine and assess environmental pollution data with the best spatiotemporal granularity is key in the design and implementation of proper intervention policy, for example, regarding urbanization process, over-population, personal monitors, indoor environments, vehicle fleet, peak hours, and green areas in the city, among others (33). The PAHO Regional Plan on Urban Air Quality and Health 2000-2009 has proposed efficient systems for air pollution health impact monitoring. These must include periodic surveillance of morbidity and mortality associated with air pollution, risk assessment, effective information systems, and reliable estimation of social costs related to air pollution.
In this regard, research designs such as the one advanced here will help to address some of the main concerns included in the PAHO plan and also allow us to comply with the agreements on other initiatives, such as the Air Management Information Systems (AMIS).
To this end, Mexico (as a country) has developed a nationwide air quality monitoring program (the Sistema Nacional de Información de la Calidad del Aire, SINAICA, https://sinaica.inecc.gob.mx/). It is worth noting that the flagship implementation of SINAICA has indeed been the metropolitan area of Mexico City.
The information derived from the SINAICA program (in particular the one constituted in the PROAIRE initiative) has already allowed the country to develop general policies to improve the air quality (the PROAIRE strategies for emission reduction).
Research efforts along these lines, although admittedly far from complete, have made it possible to implement public health policy to lower the negative health impact of air pollution. Take, for instance, the case of ozone, whose high levels are known to affect human health, in particular that of vulnerable or overexposed groups, such as athletes, outdoor workers, asthmatics and people with respiratory illnesses, and children.
It has been reported that by implementing some of the recommendations in the PROAIRE initiative, average ozone levels in the Metropolitan area of Mexico City diminished from almost 0.18 parts per million (ppm) in 1991 to around 0.1 ppm in 2007. Levels have since remained below that value (34). It is expected that such a decrease in ozone levels would also decrease respiratory illness incidence.
It is, however, complex to determine the real impact of such measures, although HIA programs have pointed out that by implementing appropriate policies up to 33,084 ozone-related deaths may have been prevented in Mexico City during the period of 2000-2020 (35).
Another study in three of the largest cities in the Americas (Mexico City, São Paulo and New York) reported similar results. Cifuentes et al. (36) estimated that during the period of 2000-2020, up to 64,000 premature deaths could be prevented just by reducing the levels of ozone and particulate matter by around 10%.

Scope and Outline of This Work
In this work, we present a novel methodology based on modeling spatial and temporal variogram ranges to estimate the representativeness of monitoring stations. This methodology does not require estimation of pollutant concentrations (a full interpolation procedure).
This work aims to show the temporal evolution of the spatial representativeness of monitoring stations in Mexico City, one of the most complex networks and metropolitan areas worldwide. Additionally, temporal representativeness is shown, which most comparable studies do not address. We explore the spatial coverage and temporal dependence of measurements for all pollutants currently reported in Mexico City.
In brief, two main questions are addressed here:
• What is the spatial and temporal representativeness of the air quality monitoring network in Mexico City?
• What is the space/time range within which sample point measurements are correlated with measured values at monitoring stations?
We also discuss the public health implications of these questions and how this information can be used to provide feedback to health and public policy makers.
Hereafter, the geo-spatial data granularity was kept at the 16 available boroughs (municipalities) (37).

Air Pollution Database
The Mexico City Air Quality Monitoring System public database is available from the Aire CDMX website (38). For this study, the required data were accessed using the R package aire.zmvm. Naturally, not all monitoring stations measure every type of contaminant. Additionally, there are missing records due to service maintenance or other incidents. The time period used in this work is from 2009 to 2018, when possible. The location of the monitoring stations for each pollutant can be observed in Figure 2. Complete monitoring station data can be found in Supplementary Table 1.

Spatio-Temporal Statistics
The collected data were explored to get a clear picture of the representativeness of the pollutant monitoring stations in Mexico City. Hourly contaminant data were summarized as weekly averages, provided that more than 5 days were available, each containing 17 hourly records or more. Data were plotted for each pollutant. Figure 3 shows data for nitrogen dioxide (NO2). The data for the rest of the pollutants can be found in Supplementary Figures 1-8.
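The completeness rule for the weekly summary can be sketched in Python as follows. This is only one reading of the rule, under stated assumptions: a day is taken as valid when it has at least 17 hourly records, a week is kept when it has at least 6 valid days ("more than 5"), weeks follow the ISO calendar, and the weekly value is the mean of the hourly records from valid days. The function name and the demo data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

MIN_HOURS_PER_DAY = 17  # assumed: a day counts if it has >= 17 hourly records
MIN_DAYS_PER_WEEK = 6   # assumed: "more than 5 days" read as >= 6 valid days

def weekly_means(records):
    """records: iterable of (datetime, value or None) hourly measurements.

    Returns {(iso_year, iso_week): mean} for weeks that pass the rule.
    """
    by_day = defaultdict(list)
    for ts, value in records:
        if value is not None:  # missing records are skipped
            by_day[ts.date()].append(value)

    by_week = defaultdict(list)
    for day, values in by_day.items():
        if len(values) >= MIN_HOURS_PER_DAY:
            iso = day.isocalendar()
            by_week[(iso[0], iso[1])].append(values)

    result = {}
    for week, days in by_week.items():
        if len(days) >= MIN_DAYS_PER_WEEK:
            hourly = [v for day_values in days for v in day_values]
            result[week] = sum(hourly) / len(hourly)
    return result

# Synthetic demo: one complete week, then a week with only 5 days of data
start = datetime(2018, 1, 1)  # a Monday
demo = [(start + timedelta(hours=h), 10.0) for h in range(7 * 24)]
demo += [(start + timedelta(days=7, hours=h), 20.0) for h in range(5 * 24)]
weekly = weekly_means(demo)
```

In the demo, only the first week is retained; the second week is dropped because it has five valid days.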
In this work, we used the semivariogram to estimate the degree of spatial and temporal dependence between measurements for the different air pollutants. In brief, the semivariogram describes how measurements vary across distance, time, or both, as it measures the degree of correlation of a random variable $Z(x)$, $Z(t)$, or $Z(x, t)$, respectively. In particular, for the unidimensional spatial component $Z(x)$, the experimental semivariogram $\hat{\gamma}(h)$ that varies with distance $h$ is written in equation (1):

$$\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_i) - Z(x_i + h) \right]^2 \quad (1)$$

where $Z(x_i)$ is the observed value for the $i$th location at coordinate $x_i$, $Z(x_i + h)$ the observed value at location $x_i + h$, and $N(h)$ the number of pairs of measured points separated by a distance $h$. An equivalent representation applies for the temporal component. The expression in equation (1) has no closed form; hence, the task here is to find the formulation that best explains the data among the following: exponential [equation (2)], spherical [equation (3)], or Gaussian [equation (4)], as described in Chilès and Delfiner (40).
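A minimal Python sketch of the experimental semivariogram for the unidimensional case is given below. It assumes the classical estimator (half the mean squared difference over all pairs falling in a lag bin); the lag binning, the function name, and the toy data are illustrative assumptions and do not reproduce the gstat implementation used in this work.

```python
def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Experimental semivariogram on 1D coordinates.

    Returns a list of (mean lag, gamma, number of pairs), one per
    non-empty lag bin [k * lag_width, (k + 1) * lag_width).
    """
    sq_sums = [0.0] * n_lags
    dist_sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            h = abs(coords[i] - coords[j])
            k = int(h // lag_width)
            if k < n_lags:
                sq_sums[k] += (values[i] - values[j]) ** 2
                dist_sums[k] += h
                counts[k] += 1
    out = []
    for k in range(n_lags):
        if counts[k] > 0:
            gamma = sq_sums[k] / (2.0 * counts[k])  # half the mean squared difference
            out.append((dist_sums[k] / counts[k], gamma, counts[k]))
    return out

# Toy data: four equally spaced locations
coords = [0.0, 1.0, 2.0, 3.0]
values = [1.0, 2.0, 4.0, 7.0]
sample_variogram = empirical_semivariogram(coords, values, lag_width=1.0, n_lags=3)
```

The number of pairs returned for each bin can later serve as a weight when a model is fitted to the sample variogram.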
where $n$ is the nugget, which can be interpreted as the residual spatial dependence, as it is defined as $n = \lim_{h \to 0^{+}} \hat{\gamma}(h)$, i.e., the y-intercept of the semivariogram $\hat{\gamma}(h)$, where a data point is assumed to be correlated only with itself as the distance $h$ approaches zero; $s$ is the sill, defined as $s = \lim_{h \to \infty} \hat{\gamma}(h)$, which can be interpreted as the limit of the variogram as the distance tends to infinity; and $r$ is the range, the distance that satisfies $\hat{\gamma}(r) - \hat{\gamma}(r + h) \to 0$ for all $h > 0$, i.e., the distance at which the variogram levels off at the sill and spatial correlation is lost. For a model with a fixed sill, the range is the distance at which the sill is first reached, whereas for an asymptotic sill it is conventionally the distance at which the semivariogram first reaches 95% of the sill; $a = 1/3$ as defined in Chilès and Delfiner (40), and $H_A(h)$ is the unit-step Heaviside function, which is 1 if $h \in A$ and 0 otherwise.
Moving on to spatio-temporal correlations, the unidimensional concepts for space ($h$) and time ($u$) are combined through a covariance structure with its associated semivariogram $\gamma(h, u)$. The forms considered are those implemented in the gstat R package, as described in Pebesma (41), Gräler et al. (42), and Baca-Lopez et al. (43): separable [equation (5)], productSum [equations (6) and (7)], metric [equation (8)], sumMetric [equation (9)], and simpleSumMetric [equation (10)].
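To make the roles of the nugget, sill, and range concrete, the following Python sketch evaluates the three model families under one common "practical range" parameterization. This parameterization is an assumption made for illustration and may differ in detail from equations (2)-(4): n is the nugget, s the total sill, r the range, and a = 1/3 scales the exponential and Gaussian models so that they reach about 95% of the partial sill (s - n) at h = r.

```python
import math

A = 1.0 / 3.0  # scaling constant a = 1/3 quoted in the text

def exponential(h, n, s, r):
    """Asymptotic model; about 95% of the partial sill is reached at h = r."""
    if h <= 0:
        return 0.0
    return n + (s - n) * (1.0 - math.exp(-h / (A * r)))

def spherical(h, n, s, r):
    """Fixed-sill model; the sill is reached exactly at h = r."""
    if h <= 0:
        return 0.0
    if h >= r:
        return s
    ratio = h / r
    return n + (s - n) * (1.5 * ratio - 0.5 * ratio ** 3)

def gaussian(h, n, s, r):
    """Asymptotic model with parabolic behavior near the origin (assumed form)."""
    if h <= 0:
        return 0.0
    return n + (s - n) * (1.0 - math.exp(-(h ** 2) / (A * r ** 2)))

# Hypothetical parameters: nugget 1, sill 5, range 10 (arbitrary units)
at_range = {
    "exponential": exponential(10.0, 1.0, 5.0, 10.0),
    "spherical": spherical(10.0, 1.0, 5.0, 10.0),
    "gaussian": gaussian(10.0, 1.0, 5.0, 10.0),
}
```

With these hypothetical parameters, the spherical model equals the sill at the range, whereas the two asymptotic models sit slightly below it, which is the distinction between fixed and asymptotic sills drawn in the text.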
$$\mathrm{sill}_{st} = k \times \mathrm{sill}_s \times \mathrm{sill}_t + \mathrm{sill}_s + \mathrm{sill}_t \quad (7)$$

$$\gamma_{\text{simple sum metric}}(h, u) = n \times H_{h>0 \vee u>0} + \gamma_s(h) + \gamma_t(u)$$

where, for the spatial $s$ and time $t$ domains, the corresponding variables are $h$ and $u$, respectively; the sill parameter has been written explicitly as $\mathrm{sill}$ to avoid confusion with the space variable $s$; $\gamma_s$ and $\gamma_t$ are the spatial and temporal semivariograms, with their respective standardized versions $\bar{\gamma}_s$ and $\bar{\gamma}_t$ with separate nugget effects and (joint) sill of 1; $k$ is a positive parameter, i.e., $k > 0$, that satisfies equation (7); $\kappa$ is the spatio-temporal anisotropy (stAni) correction; $n$ is the nugget parameter; and $H_A(h)$ is the unit-step Heaviside function, which is 1 if $h \in A$ and 0 otherwise.
The initial semivariogram parameter values were obtained from the empirical spatio-temporal pollutant measurements using the gstat R package implementation, as described in Pebesma (41), Gräler et al. (42), and Baca-Lopez et al. (43). Here, the initial values are computed as follows:
• Nugget: the median of the first three row/column means of the empirical variogram matrix, for the spatial or temporal initial guess.
• Sill: the median of the last five row/column means of the empirical variogram matrix, for the spatial or temporal initial guess.
• Range: the spatial range is one-third of the maximum spatial lag; for the temporal case, it corresponds to the maximum lag value.
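The three rules for the initial values can be sketched in Python as follows. The matrix layout (rows indexed by time lag, columns by space lag), the function name, and the toy matrix are assumptions made for illustration; the actual values in this work were derived from the gstat sample variogram.

```python
from statistics import mean, median

def initial_guesses(gamma, space_lags, time_lags):
    """Initial nugget, sill, and range for the spatial and temporal models.

    gamma[i][j] is the empirical semivariance at time lag i and space
    lag j (assumed layout of the empirical variogram matrix).
    """
    # Mean over time lags for each space lag, and vice versa
    col_means = [mean(row[j] for row in gamma) for j in range(len(space_lags))]
    row_means = [mean(row) for row in gamma]
    return {
        "space": {
            "nugget": median(col_means[:3]),   # first three means
            "sill": median(col_means[-5:]),    # last five means
            "range": max(space_lags) / 3.0,    # one-third of the maximum lag
        },
        "time": {
            "nugget": median(row_means[:3]),
            "sill": median(row_means[-5:]),
            "range": max(time_lags),           # maximum temporal lag
        },
    }

# Toy 6 x 6 empirical variogram matrix that grows with both lags
space_lags = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0]
time_lags = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
gamma = [[float(i + j) for j in range(6)] for i in range(6)]
guess = initial_guesses(gamma, space_lags, time_lags)
```

Using medians of a few leading and trailing means, rather than single cells, makes the starting values less sensitive to noisy bins of the sample variogram.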
In addition, the spatial and temporal anisotropy was estimated using a linear model as specified in the gstat implementation. Finally, the best parsimonious model was found for each contaminant. Briefly, using the initial variogram parameters, different spatial, temporal, or joint semivariogram structures were tested to find the one that best described the correlation in the data, according to the available implementations in gstat (metric, separable, productSum, sumMetric, and simpleSumMetric) (41,42). In this context, the best parameter combination was found by testing all possible single, double, or triple semivariogram combinations (exponential, Gaussian, and spherical), where the model selection criterion was to minimize the weighted mean squared error (see Supplementary Table 3). Finally, the winning spatio-temporal semivariogram structure was used to extract the final semivariogram parameters (see Supplementary Table 4).
Last but not least, the monitoring representativeness plots for each contaminant were integrated with the final spatio-temporal variogram range parameters to obtain a clear picture of the representativeness radius of each pollutant in Mexico City as monitoring station activity evolved over time. The spatial correlation range for each pollutant in a particular year was used as the radius of a circle centered on each monitoring station. Thus, the union of the circles from all monitoring stations within the network constitutes the covered area for a specific pollutant. This procedure was applied to all years of study to show how the covered area has changed over time.
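The exhaustive search over model combinations can be illustrated with the simplified Python sketch below. It makes several assumptions that depart from the actual procedure: only a sum-metric-like structure is scored, nuggets are omitted, the component parameters are held fixed instead of being re-fitted for every combination, and the sample variogram is synthetic. All names and values are hypothetical.

```python
import itertools
import math

def exp_model(d, sill, rng):
    return sill * (1.0 - math.exp(-3.0 * d / rng)) if d > 0 else 0.0

def gau_model(d, sill, rng):
    return sill * (1.0 - math.exp(-3.0 * (d / rng) ** 2)) if d > 0 else 0.0

def sph_model(d, sill, rng):
    if d <= 0:
        return 0.0
    if d >= rng:
        return sill
    return sill * (1.5 * d / rng - 0.5 * (d / rng) ** 3)

MODELS = {"Exp": exp_model, "Gau": gau_model, "Sph": sph_model}

def sum_metric(h, u, kinds, params, kappa):
    """Spatial + temporal + joint component on the metric distance (no nuggets)."""
    ks, kt, kj = kinds
    joint_d = math.sqrt(h ** 2 + (kappa * u) ** 2)
    return (MODELS[ks](h, *params["space"])
            + MODELS[kt](u, *params["time"])
            + MODELS[kj](joint_d, *params["joint"]))

def weighted_mse(sample, predict):
    """sample: list of (h, u, gamma, weight), e.g., weight = number of pairs."""
    total_w = sum(w for _, _, _, w in sample)
    return sum(w * (g - predict(h, u)) ** 2 for h, u, g, w in sample) / total_w

def best_combination(sample, params, kappa):
    """Score every (spatial, temporal, joint) model triple; return the best."""
    scores = {}
    for kinds in itertools.product(MODELS, repeat=3):
        scores[kinds] = weighted_mse(
            sample, lambda h, u, k=kinds: sum_metric(h, u, k, params, kappa))
    return min(scores, key=scores.get), scores

# Synthetic sample variogram generated from a known combination
true_kinds = ("Sph", "Exp", "Gau")
params = {"space": (1.0, 15.0), "time": (2.0, 8.0), "joint": (0.5, 25.0)}
kappa = 2.0
sample = [(h, u, sum_metric(h, u, true_kinds, params, kappa), 1.0 + h)
          for h in (0.0, 5.0, 10.0, 20.0) for u in (0.0, 2.0, 5.0, 10.0)]
best, scores = best_combination(sample, params, kappa)
```

With three model families and three components, the grid has 27 candidates, which is small enough to evaluate exhaustively.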
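The union-of-circles construction can be approximated numerically, as in the Python sketch below, which estimates the fraction of a region lying within the spatial range of at least one station. A rectangular bounding box stands in for the city boundary and the two station positions are hypothetical; only the two radii (5.92 km for SO2 and 43.37 km for NO2) are taken from the results reported in this work.

```python
import math

def covered_fraction(stations, radius, bbox, step=0.5):
    """Fraction of grid points in bbox within `radius` of any station.

    stations: list of (x, y) in km; bbox = (xmin, ymin, xmax, ymax) in km.
    """
    xmin, ymin, xmax, ymax = bbox
    nx = int(round((xmax - xmin) / step)) + 1
    ny = int(round((ymax - ymin) / step)) + 1
    inside = 0
    for i in range(nx):
        x = xmin + i * step
        for j in range(ny):
            y = ymin + j * step
            # A grid point is covered if it falls in at least one circle
            if any(math.hypot(x - sx, y - sy) <= radius for sx, sy in stations):
                inside += 1
    return inside / (nx * ny)

# Hypothetical layout: two stations in a 40 km x 40 km box
stations = [(10.0, 10.0), (30.0, 30.0)]
bbox = (0.0, 0.0, 40.0, 40.0)
coverage_short = covered_fraction(stations, 5.92, bbox)   # SO2-like range
coverage_long = covered_fraction(stations, 43.37, bbox)   # NO2-like range
```

With the same two stations, the short range leaves most of the box uncovered while the long range covers it entirely, which mirrors how the covered area depends more on the correlation range than on the number of stations.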

Spatio-Temporal Data Exploration
The geo-localization of the monitoring stations across the 16 boroughs of Mexico City is depicted in Figure 7. At first sight, the global picture makes it clear that not all contaminants are measured at every available station. Second, it seems that for all contaminants the south of Mexico City (borough numbers 8, 11, and 12) is not as well-represented as its northern counterpart.
This concern is related to the rural and urban distribution, as the most densely populated urban boroughs are located in the north of the city. Indeed, some stations are located outside Mexico City. Pollutants such as PMCO and PM2.5 are almost exclusively measured inside Mexico City, whereas the rest have monitoring stations outside the city.
To further explore data completeness, the monitoring station representativeness plots were generated. In Figure 3, the case of NO2 is presented for the available stations from the beginning of 2009 until late 2018.
It is clear that there are blocks of missing data (in white), some of which can be traced back to the inauguration of the monitoring station (CUA, MON, CAM, NEZ, and so on) in mid-2011, or as late as 2015 for the AJM and MGH stations.
As a matter of fact, every station lost track of NO2 for at least some hours (red to yellow cells), and some stopped working for gaps of days, weeks, or months. The latter case can be seen for TLI, VIF, ATI, and ACO, to name some stations, in the time window that includes the beginning of 2016.
The last time pattern corresponds to a complete station shutdown, as depicted by the block of LAG, CES, AZC, TAC, and TAX, which stopped working at the end of 2010. Fortunately, these stations are neither located in the same borough nor close enough to each other to leave non-measured areas (see Figure 2). However, this level of data description does not represent the extent covered by each monitoring station.

Spatio-Temporal Variogram Estimation
Using the available data, sample spatio-temporal variograms were computed. These variograms were used to generate the initial guess values (Supplementary Table 2). Depending on the contaminant, the initial guesses are different for the nugget, range, sill, and stAni. In this context, the nugget is the model intercept attributable to measurement errors, to spatial sources of variation at distances smaller than the sampling interval, or to both.
Interestingly, these sources of variation are negligible for CO, in contrast to the wide range of nugget values (0.03 to 178.67). In addition, the extent of correlation between measurements, also known as the range, is almost the same for all contaminants: it lasts about 12 years and reaches as far as 21.4 km. The sill, i.e., the value of the variogram when the distance reaches the range, is close to the nugget only for CO; for the other contaminants it departs from the nugget by at most double its value.
The weighted mean squared error of the final covariance model for all the tested variogram permutations can be found in Supplementary Table 3. It is worth mentioning that the lowest error among the different covariance structures was obtained with sumMetric for CO, NO2, O3, NO, NOx, PM10, and PMCO and with simpleSumMetric for SO2 and PM2.5. Within these covariance models, there was no apparent pattern in the winning variogram model permutation (temporal + spatial + joint). This is a data-driven approach that required exploring the complete permutation grid in order to reach a parsimonious spatio-temporal correlation model. A visual comparison of each winning covariance model and the sample variogram can be found in Figure 4 for NO2. The variograms for the other pollutants can be found in Supplementary Figures 9-16.
Regarding the ranges from the winning models, the case of spatial correlation is presented in Figure 5A. The spatial range, measured in kilometers, is interpreted as the separation distance between two measured locations, i.e., monitoring stations, beyond which measurements are no longer correlated. That is, measured values at a given monitoring station will be correlated with those at stations located within this range. For instance, NOx measurements between monitoring locations have the longest spatial correlation, 84.44 km, followed by NO with 77.15 km. On the contrary, PM2.5, PM10, and SO2 have the shortest ranges: 13, 12.98, and 5.92 km, respectively (see Supplementary Table 4).
In the case of temporal correlation, ranges are shown in Figure 5B. PM10 has the longest value of ∼6 months (178.35 days), followed by NOx and NO with ranges of 175.60 and 133.49 days, respectively. The temporal correlation of these three contaminants differs greatly from that of CO, PMCO, NO2, PM2.5, SO2, and O3, whose ranges lie between 12 and 73 days overall (see Supplementary Table 4). This wider time correlation window also has implications for environmental control policies, in particular under the scenario of extraordinary events. For instance, abnormal pollution levels may correlate with records several days apart, hence hindering emergency decision-making and action.
To graphically show the spatial representativeness of each pollutant, we used spatial ranges reported in Figure 4 that were obtained as final parameter values of the variogram models shown in Supplementary Table 4.
For example, for the case of nitrogen dioxide (NO2), a spatial range of 43.37 km was obtained from the empirical semivariogram and covariance structure modeling (see Figure 4). Thus, a circular area with a radius of 43.37 km was centered on the location of each monitoring station that measured NO2.
To generate a buffer area that represents the spatial influence of the NO2 measurements, the circles traced at each location were joined to define a single border area. This process was performed for NO2 for the years 2009, 2012, 2015, and 2018.
As seen in Figure 3, for each year, there is a different number of active monitoring sites. Specifically, in 2009, only 18 sites collected hourly concentrations for NO2. In 2012, six additional monitoring stations started collecting data for this contaminant. For the years 2015 and 2018, the active sites were 26 and 25, respectively. In general, an increasing number of active sites can be seen for all pollutants, starting with the year when measurements began (see Supplementary Figures 1-8).
Analogously, using the calculated spatial ranges for CO, NO2, O3, SO2, PM2.5, PM10, NO, NOx, and PMCO, the temporal evolution of the representativeness areas is shown in Figure 6.
The different contaminants are color coded and displayed as columns, while rows are assigned to selected years. Looking at the 2009 panel, it is noticeable that the covered areas differ widely between contaminants. The largest difference in range can be seen between SO2 and NOx, with the smallest and largest ranges, respectively (5.92 and 84.44 km).
The representativeness area for SO2 mostly covers the northern part of the city but not the south; similar patterns were obtained for the years 2012, 2015, and 2018.
As expected, although the number of monitored locations increased, there is no significant increase in the covered area over the years due to the small range of measurement correlation. This small range depends on the intrinsic physicochemical properties of SO2 and, consequently, on its diffusion in the atmosphere, as well as on the complex traits of urban environments.
However, regardless of land use, traffic, population density, and other variables involved for a selected year, it can be seen that, for the same locations and relatively different monitoring networks, the patterns of the representativeness area are pollutant dependent.
FIGURE 4 | Tested spatio-temporal semivariograms for NO2. The 2D-sample semivariogram was obtained from the collected data. In addition, the winning fitted covariance structure models (metric, separable, product sum, sum metric, and simple sum metric) are also presented. In this case, the sum metric structure is the one that outperforms its competitors.
Similar patterns of an increasing covered area that extends from north to south are observed for CO, PM2.5, PM10, and PMCO. It can be seen in the timeline that for these pollutants in the year 2009 (except for PMCO, which was not registered at the time), the southern area was not included in the network, but in the following years the covered area was extended to cover almost the entire city.
For NO2, O3, NO, and NOx, regardless of the number of monitoring stations in 2009, because of the large correlation of measurements (ranges), the representativeness area of the network accounts for the whole city and a considerable percentage of the VMMA. Thus, although monitoring stations were added to the network, no significant change is observed for the successive years.
The complete spatio-temporal evolution for these contaminants is depicted in Supplementary Figure 17. In Figure 6, we present the case of CO and NO2. The case of well-represented monitoring networks is shown for NO and NOx in Figure 7 as an example of pollutants with long spatial correlation, 77.15 and 84.44 km, respectively (see Figure 5). In other words, the number and selected locations of the monitors that make up the network can be considered effective. The network has even improved over the years (see Figure 6).
Another aspect of the NO and NOx networks is that their representative area clearly exceeds the city's territory, which is beneficial for both Mexico City and the VMMA. A relevant consequence of this extended coverage is that monitoring stations installed in one of the 16 boroughs of Mexico City are able to capture the influence of pollution from sources outside the city, as these pollutants can eventually move toward the city due to dispersion phenomena.
Additionally, these results allow us to establish neighborhood limits for interpolation purposes. To select the proper number of monitoring stations required to estimate concentration values at unmeasured locations, we can refer to the spatial and temporal ranges of correlation to determine which stations have to be included in the analysis. An interesting comparison between a well-represented network and one that lacks representativeness is displayed in Figure 8. The full time evolution of the network coverage for SO2 and PM2.5 is shown in Figure 6, and their status in 2018 can be seen in Figure 8.
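As an illustration of how a correlation range can define an interpolation neighborhood, the following minimal sketch keeps only the stations that lie within a pollutant's spatial range of a target location. It is not the procedure used in this study: the station names and coordinates are hypothetical, the great-circle (haversine) distance is an assumed choice of metric, and only the two spatial ranges quoted in the text (NO and NOx) are included. A temporal window could be filtered in the same way using the temporal range.

```python
import math

# Spatial correlation ranges (km) quoted in the text for NO and NOx.
SPATIAL_RANGE_KM = {"NO": 77.15, "NOx": 84.44}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def stations_in_neighborhood(target, stations, spatial_range_km):
    """Names of stations within `spatial_range_km` of `target`, nearest first."""
    lat0, lon0 = target
    hits = []
    for name, (lat, lon) in stations.items():
        d = haversine_km(lat0, lon0, lat, lon)
        if d <= spatial_range_km:
            hits.append((d, name))
    return [name for _, name in sorted(hits)]


# Hypothetical stations (name -> (lat, lon)); not real network coordinates.
stations = {"A": (19.40, -99.15), "B": (19.60, -99.10), "C": (20.60, -99.15)}
target = (19.43, -99.13)  # hypothetical unmeasured location

neighbors_no = stations_in_neighborhood(target, stations, SPATIAL_RANGE_KM["NO"])
print(neighbors_no)
```

With a long range such as that of NO, distant stations still enter the neighborhood; with a short range, only nearby stations would be used for the estimate.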
In the case of SO2, the central and northern parts of the city are almost covered, which is not the case from the center to the south. An additional 15 stations located outside the city, i.e., in the VMMA, are partially connected to the network in the north, and four are disconnected.
The current status of the PM2.5 representation area shows an improvement compared with previous years, as seen in Figure 6.
For the year 2018, as presented in Figure 8, there is complete coverage of Mexico City's surface. The areas of all individual stations are connected and, in contrast to SO2, the covered area includes the south.
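The contrast between a short-range and a long-range pollutant can be sketched numerically by treating each station's representativeness area as a disc whose radius equals the spatial range and estimating the fraction of a region that falls inside at least one disc. This is only a simplified sketch under stated assumptions: the rectangular region, the projected station coordinates, and the two radii are hypothetical, and the function `coverage_fraction` is introduced here for illustration rather than taken from the study.

```python
import math


def coverage_fraction(stations_km, range_km, bbox_km, step_km=1.0):
    """Fraction of grid points in a rectangle lying within `range_km` of any station.

    stations_km: list of (x, y) station positions in km (projected coordinates).
    bbox_km: (xmin, ymin, xmax, ymax) of the rectangle, in km.
    """
    xmin, ymin, xmax, ymax = bbox_km
    nx = int(round((xmax - xmin) / step_km)) + 1
    ny = int(round((ymax - ymin) / step_km)) + 1
    covered = 0
    for i in range(nx):
        for j in range(ny):
            x = xmin + i * step_km
            y = ymin + j * step_km
            # A grid point counts as covered if any station disc contains it.
            if any(math.hypot(x - sx, y - sy) <= range_km for sx, sy in stations_km):
                covered += 1
    return covered / (nx * ny)


# Hypothetical 40 km x 60 km region with two stations in its northern part.
bbox = (0.0, 0.0, 40.0, 60.0)
northern_stations = [(20.0, 50.0), (15.0, 40.0)]

short_range = coverage_fraction(northern_stations, 10.0, bbox)  # hypothetical short range
long_range = coverage_fraction(northern_stations, 80.0, bbox)   # hypothetical long range
print(short_range, long_range)
```

In this toy configuration the long-range case covers the whole rectangle from northern stations alone, whereas the short-range case leaves the south uncovered, which mirrors the qualitative difference described for the networks above.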

General Discussion
Air quality assessment is essential for public health and for the wellness of individuals and the general population. People's quality of life is strongly determined by the levels of contamination. Hence, delivering reliable measurements of air pollutants in time and space is of utmost importance. In this work, based on spatio-temporal correlations of monitoring stations for nine air pollutants in Mexico City, we have been able to evaluate whether the location of those stations is adequate to cover the surface of the city.
The analysis showed that the distribution and number of monitoring stations are sufficient to evaluate the majority of pollutants, with the exception of sulfur dioxide (SO2). The spatial range of the monitoring stations for SO2 is shorter than that for the other pollutants. The problem does not occur in all zones of the city; it is confined to the southern part, as shown in Figures 2, 6, 8. The southern part of Mexico City has a large rural region and, concomitantly, a low population density. This is possibly the reason why there is a limited number of stations in that zone.
SO2 is mainly generated by industrial activity, which is carried out in the northern side of Mexico City, and SO2 levels in the south are likely due to the dispersion of this pollutant.
With respect to NO, NOx, and PM2.5, whose high levels are crucial in terms of individual and public health, the first two pollutants are well measured and estimated; however, this is not the case for PM2.5 and SO2, as the spatial ranges of the monitoring stations for these pollutants are short.
Nevertheless, in the case of PM2.5, the AJU station (43), which is the southernmost one, is able to measure this pollutant, and as a result the PM2.5 monitoring stations cover almost the entire surface of the city. In terms of public policy, an economical and readily available option to increase the area covered by SO2 stations is to equip the AJU station with SO2 measurement capacity. It is worth mentioning that metropolitan areas are not isolated; contamination can arrive from external places. For instance, during May 2019, Mexico City experienced an unusual environmental challenge due to elevated concentrations of PM2.5 and O3 (44). Several forest fires had taken place in the Tepozteco National Park and Ajusco National Park, both located on the southern border of Mexico City. With this in mind, we state that it is also necessary to have monitoring stations in the periphery of the city in order to establish, based on spatio-temporal criteria, models to predict contamination indexes and to better plan for reducing the occurrence of these episodes.
To establish an appropriate methodology to measure air pollutants in time and space, several factors must be taken into account. The correlation between pollutant concentration and health should be carefully evaluated in order to avoid misinterpretations. In what follows, we will discuss some relevant elements that need to be observed.
Depending on the type of pollutant, the residence time in the atmosphere may vary from minutes to weeks. For example, ultrafine PM (< 0.1µm) remains suspended in the air in the range of minutes to hours. Conversely, PM10 may remain suspended from minutes to hours [Air quality criteria for particulate matter, Washington, DC, US Environmental Protection Agency, 2004 (http://cfpub.epa.gov/ncea/cfm/partmatt.cfm)].
Additionally, photochemical transformations due to sunlight radiation, meteorology, or several other factors should be taken into account in the assessment and the concomitant establishment of public policies.
With this approach, we not only provide a protocol to measure and estimate areas of representativeness for several pollutants, but also offer suggestions for public policies that are neither expensive nor logistically complicated. These suggestions may have an impact on the evaluation of air quality in Mexico City, and hence help to increase the quality of life of the people living there.

Extended Dispersion of Emissions: The Case of NO and NOx
Due to the particular environmental and infrastructural conditions of Mexico City, we have registered a phenomenon of overdispersion (evidenced by wider spatial correlation lengths) of certain pollutants, in particular NO and NOx, as shown in Figure 7. Large amounts of nitrogen oxides have been directly related to industrial activity based on fossil fuels (45).
Due to the specific features of NO and NOx in terms of relatively small particle sizes, low aggregation and cluster formations, and other intrinsic physicochemical characteristics, nitric oxide emissions may become over-dispersive under certain atmospheric conditions. This has relevant implications because NOx emissions lead to the formation of secondary pollutants, contributing, for instance, to high concentrations of atmospheric ozone (46). Aside from these issues, NOx emissions may also contribute to the deposition of NO3, creating environmental problems such as ecosystem acidification.
NO and NOx overdispersion also poses additional challenges to regulation and inspection policy. This is because, in large metropolitan areas such as Mexico City, extended spatial presence means that attributing emissions to specific chemical plants and industries may require deeper inspections and effective scheduling of them (47).
Aside from the direct effects of nitrogen oxides, ozone and particulate matter contribute to significant morbidity and mortality. NO, NOx, and their secondary pollutants may constitute, via widespread exposure, relevant risk-increasing factors for conditions such as preeclampsia, systemic inflammation, increased oxidative stress, and cardiovascular events (48,49). In the case of nitrogen oxides, even causal relationships have been established (50)(51)(52). These and other associations with public health concerns will be further discussed in the next subsection.

Public Health Implications
The results just discussed may have important implications for the development of successful HIA programs. HIA programs are aimed at the identification, mitigation, and optimization of the impacts that non-health sector policies may have on public health (53).
Monitoring stations (colored dots) report hourly concentrations of these and other pollutants. A different coverage pattern that depends on the spatial range of each pollutant can be seen for SO2 and PM2.5. In the case of SO2, some monitoring stations seem isolated outside the city limits, while for PM2.5 most of the city surface is well-represented.
Risk quantification used to be mainly based on toxicological or biomedical studies, but more recently the scope of HIAs has broadened to incorporate more general social determinants of health (54).
As shown here using large-scale empirical data from the monitoring network itself, some of the actual challenges have to do with the fact that the radii of coverage differ among the various pollutants (see Figure 6).
It is worth mentioning that the regions in that figure correspond to the empirical distribution of air pollutants as given by the characteristic environmental conditions of the metropolitan area of Mexico City and the particular monitoring technology available there.
These facts are indeed a matter of current interest, since air pollution in large Latin American cities has become a source of special concern in recent times. According to a report from the Pan American Health Organization (PAHO) (34), the leading causes of urban air pollution in the Americas are fossil fuels in industry and transportation. The report states that in the case of the Mexico City metropolitan area, transportation alone is responsible for some 12% of PM10 particles, 30% of PM2.5, 5.06% of SO2, a staggering 98% of CO, 79% of nitrogen oxides (NOx), as well as 31% of the volatile organic compounds (34).
As discussed, even if Mexico City has implemented some regulatory systems to reduce air particle concentrations, the results have not been enough to comply with national and international standards. This may be due to the fact that programs approved by policy makers have relied on inadequate air quality measurements.
In these terms, PAHO has stated that "... there is a clear need for better monitoring systems to analyze trends using more exhaustive, continuous, reliable and complex data and methodologies that are comparable between countries, so that better intervention measures could be adopted to control air pollution ..." (34). Our analysis, as presented here, aims to diagnose some aspects of what is missing and what can be improved in terms of the spatio-temporal representativeness of the air quality monitoring stations in the metropolitan area of Mexico City. Improving our HIA programs and policy is extremely relevant, particularly considering that the steady decline in, say, PM10 particulate levels that had been observed from the early 2000s in Mexico City was overturned by a dramatic increase during 2008 and 2009. Even if another decrease has been observed since then, we are still far from reaching the WHO recommended levels. Mexico was, in fact, the country with the most deaths due to outdoor air pollution in the Americas (20,496 in 2012) according to a recent survey (55).

Implications for Intervention Policy
Recalling Figure 6, it is noticeable that intervention policy has indeed improved the quality of monitoring stations for most (but not all) of the pollutants considered.
The metropolitan area of Mexico City has been well covered for ozone monitoring since 2009. This, however, was not the case for PM2.5, which was poorly covered in the south side of the city in 2009 and by 2018 had almost complete coverage. A similar case is that of CO monitoring, which was almost uncovered in 2009 and has been well covered since 2018. Other cases are worse, most strikingly that of SO2, which was poorly covered in the south side of the city in 2009 and remains uncovered there to date. This is not to be disregarded, since the south side of Mexico City consists mostly of residential areas, with the industrial zones more widely present in the north and east sides of the metropolitan area.
By looking at Figure 6 and Supplementary Figure 1, one can observe that SO2 monitoring station facilities have indeed improved in number and effectiveness. However, due to the different coverage features of the stations for the different pollutants, these efforts have been insufficient to date. This is why spatio-temporal representation studies such as the present one are relevant for public policy making.
All the aforementioned facts highlight the relevance of data-driven efforts to improve health impact assessments. Aside from air quality monitoring, there are other data-centered measures that must be implemented, such as exposure evaluation, which is essential to calculate risk levels.
A number of epidemiological methodologies have been developed to assess population exposure to air pollutants. Most of them are based on the radial distance from stations within the local monitoring networks, used as a proxy for the proximity of the subjects within the study groups (56). It should be noted that, besides monitoring data, modeling methods based on air quality models are also used in health impact assessment.
Since pollutant concentrations in urban environments may vary widely, geostatistical approaches to environmental epidemiology have gained even more relevance (57,58). The present study aims to present a practical approach to this problem based on the information already gathered in the existing monitoring stations in the Mexico City metropolitan area.
It is expected that data-centered studies, such as this one, will motivate public policy makers to strengthen the monitoring, data gathering and data analysis strategies in large urban environments, such as the metropolitan area of Mexico City.
We are aware of the many challenges that effective environmental assessment faces, for economic, logistic, and political reasons, but also for technical and analytical ones. However, we also believe that there are good reasons to be confident that this kind of study will be relevant for the construction of new, more efficient models of policy making.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
KB-L, CF, JE-E, and EH-L made substantial contributions to conceptualization, methodology, investigation, and validation. KB-L, CF, JE-E, MM-G, MC-L, MF-M, and EH-L were involved in the formal analysis and agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved, and were also involved in drafting the manuscript or revising it critically for important intellectual content. All authors read and approved the final manuscript.