Additive Interaction between Heterogeneous Environmental Quality Domains (Air, Water, Land, Sociodemographic, and Built Environment) on Preterm Birth

Background: Environmental exposures often occur in tandem; however, epidemiological research often focuses on singular exposures. Statistical interactions among broad, well-characterized environmental domains have not yet been evaluated in association with health. We address this gap by conducting a county-level cross-sectional analysis of interactions between Environmental Quality Index (EQI) domain indices on preterm birth in the United States from 2000 to 2005.
Methods: The EQI, a county-level index for the 2000–2005 time period, was constructed from five domain-specific indices (air, water, land, built, and sociodemographic) using principal component analyses. County-level preterm birth rates (n = 3141) were estimated using live births from the National Center for Health Statistics. Linear regression was used to estimate prevalence differences (PDs) and 95% confidence intervals (CIs) comparing worse environmental quality to better quality for each model for (a) each individual domain main effect, (b) the interaction contrast, and (c) the two main effects plus interaction effect (i.e., the “net effect”) to show departure from additivity for all U.S. counties. Analyses were also performed for subgroupings by four urban/rural strata.
Results: We found the suggestion of antagonistic interactions but no synergism, along with several purely additive (i.e., no interaction) associations. In the non-stratified model, we observed antagonistic interactions between the sociodemographic/air domains [net effect (i.e., the association, including main effects and interaction effects) PD: −0.004 (95% CI: −0.007, 0.000), interaction contrast: −0.013 (95% CI: −0.020, −0.007)] and built/air domains [net effect PD: 0.008 (95% CI: 0.004, 0.011), interaction contrast: −0.008 (95% CI: −0.015, −0.002)]. Most interactions were between the air domain and other respective domains. Interactions differed by urbanicity, with more interactions observed in non-metropolitan regions.
Conclusion: Observed antagonistic associations may indicate that those living in areas with multiple detrimental domains may have other interfering factors reducing the burden of environmental exposure. This study is the first to explore interactions across different environmental domains and demonstrates the utility of the EQI to examine the relationship between environmental domain interactions and human health. While we did observe some departures from additivity, many observed effects were additive. This study demonstrated that interactions between environmental domains should be considered in future analyses.

Keywords: environmental quality, interaction, air, water, sociodemographic, built environment, land, preterm birth

Introduction
Environmental exposures such as pollutants, social factors, and built environment likely have a collective influence on health; however, epidemiological research often focuses on singular exposures. This may be in part due to the complexity of measuring multiple environmental factors in tandem (1,2). Some research, including air pollution studies, has used indices or decomposition methods such as principal component analysis (PCA) to assess the simultaneous impact of different pollutants (3). These methods are also seen in built and social epidemiological research in the assessment of neighborhood effects or total social constructs (4-6). While indices are becoming more widely used across epidemiology, very few studies have combined variables across separate environmental domains (e.g., air, water, and built environment). The assessment of cumulative environmental exposure is currently being targeted as a need in epidemiological research (7).
Interaction is the concept that the effect of one exposure may depend partly on the presence, absence, or level of another exposure (8). One might expect that two detrimental exposures occurring simultaneously would potentially lead to a more detrimental effect than a single exposure on a given outcome; however, this is not always the case. Interaction can be defined by a departure from multiplicativity (on the log or logit scales such as those used in risk or odds ratio estimation) or from additivity (on the linear scale used to estimate difference measures) (9). Interaction on the additive scale is often thought of as more relevant to public health because the additive scale reflects absolute numbers of persons rather than relative risks or odds (as in the multiplicative scale), which may only translate to a change in few individuals (10).
One exposure may enhance the effects of another, creating a synergistic interaction, or may diminish the effects of another, creating an antagonistic interaction. In the case of two detrimental environmental exposures, a synergistic interaction would indicate a much worse health effect overall, while antagonistic interaction would result in estimates that are closer to or even across the null value, appearing as beneficial effects. For two exposures in which both have an independently beneficial effect, antagonism would result in a less beneficial association than expected (i.e., less than a strictly additive effect) and synergism would create an even more beneficial association than expected. If two exposures have effects in opposite directions (i.e., one is beneficial and the other is detrimental), then antagonistic effects would be less than expected, which could be closer to or away from the null, depending on which exposure is considered the "main effect." Interaction is a manifestation of the complex processes giving rise to illness and health. Exploration of interactions may lead to better understanding of these processes.
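As an illustration of departure from additivity, the interaction contrast for two binary exposures can be written as below. The notation is an assumption introduced here and is not taken from the original text: $P_{ij}$ denotes outcome prevalence with the first exposure at level $i$ and the second at level $j$, where 1 is the worse level and 0 is the referent.

```latex
\begin{align*}
\text{net effect} &= P_{11} - P_{00} \\
\text{expected under additivity} &= (P_{10} - P_{00}) + (P_{01} - P_{00}) \\
\text{IC} &= (P_{11} - P_{00}) - \big[(P_{10} - P_{00}) + (P_{01} - P_{00})\big] \\
          &= P_{11} - P_{10} - P_{01} + P_{00}
\end{align*}
```

An IC of zero corresponds to strict additivity. When both main effects are detrimental (positive), a negative IC pulls the net effect toward the null, which matches the antagonism described above. Using the built/air estimates reported in the abstract as a worked case, a net effect PD of 0.008 with an interaction contrast of −0.008 implies that the two main effects sum to roughly 0.008 − (−0.008) = 0.016, subject to rounding in the published values.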
Though it is important to study the potential effects of interaction, such studies are often challenging to undertake. Analyses of interaction inherently have less power than those of main effects. Interactions do not always follow a simple form, i.e., effects of exposure may differ in a non-linear fashion across levels of another factor or may differ only at certain levels of the factor, making interpretation difficult. While statistical interaction is what one can estimate, connecting statistical to biological or public health interaction requires substantive knowledge of how biological systems can be influenced by exposures in the presence of one another. While many recent epidemiological studies have explored gene-environment interactions, none has assessed interactions across environmental domains (11). Currently, only one metric, the Environmental Quality Index (EQI), has been developed to assess both domain-specific and cumulative environmental exposure. The EQI is a publicly available county-level measure of cumulative environmental exposures for the U.S. for the period 2000-2005 (2, 12). The EQI includes variables representing five environmental domains: air, water, land, built, and sociodemographic. The index provides a cumulative total measure of the ambient environment and domain-specific indices.
The EQI has been used as an exposure to assess associations between environmental quality and several health outcomes. In a study of preterm birth (PTB), the authors observed positive associations between the air domain and PTB across all rural/urban strata and for the sociodemographic domain in the most urban stratum, and null or negative associations between the other domains and PTB, and between the overall EQI and PTB (13). A study of mortality generally found worse environmental quality to be positively associated with mortality (14). One study applied the EQI to control for environmental confounding in the association between hurricanes and reproductive health outcomes (15). While the EQI allows researchers to consider multiple environmental constructs simultaneously, no published studies have examined potential interactions between environmental domains in the assessment of human health.
We address this current gap in the literature by conducting a county-level cross-sectional analysis of interactions between EQI domain indices on PTB in the United States from 2000 to 2005. We used PTB as the motivating example in this analysis because it is a marker of fetal underdevelopment and a risk factor for further poor health outcomes (16-19), because it can be used as an indicator of national health (20,21), and because it allows us to further develop the previously published analysis (13). In addition, we stratified by urban/rural status as it has been shown that the association with PTB and interactions can vary greatly by urbanicity (22).

Materials and Methods
Study Population
The study population for this analysis has been previously described in Rappazzo et al. (13). In brief, the study population included live births from the National Center for Health Statistics (NCHS) for the entire United States for the years 2000-2005 for all 3141 counties. The study population was restricted to singleton, non-anomalous births, with county identifiers, recorded gestational age, and residence within the same state as birth occurrence (n = 22,705,068). County-level PTB prevalence was estimated as PTBs/total births for all 3141 U.S. counties. PTB is defined as birth occurring between 20 and 36 weeks completed gestation (inclusive). Ten counties were excluded because fewer than 10 total births or no PTBs occurred over the study period, leaving a final population of 3131 counties. The study protocol was reviewed by the EPA Human Research Protocol Office and deemed non-human subject research as per EPA Regulation 40 CFR 26 (Protection of Human Subjects) Section 26.102 (f).

Environmental Quality Index
Domain-specific EQIs were used to represent environmental exposure at the county-level for the entire U.S. over the 2000-2005 time period. The EQI includes variables representing five environmental domains: air, water, land, built, and sociodemographic (2). The domain-specific indices include both beneficial and detrimental environmental factors. The air domain includes 87 variables representing criteria and hazardous air pollutants. The water domain includes 80 variables representing overall water quality, general water contamination, recreational water quality, drinking water quality, atmospheric deposition, drought, and chemical contamination. The land domain includes 26 variables representing agriculture, pesticides, contaminants, facilities, and radon. The built domain includes 14 variables representing roads, highway/road safety, public transit behavior, business environment, and subsidized housing environment. The sociodemographic environment includes 12 variables representing socioeconomics and crime.
Briefly, the EQI was constructed using a separate PCA for each of the five environmental domains, and the primary component was retained to represent that domain's index. The primary component for each domain was then used in a subsequent PCA to create the overall EQI for total environmental quality. The EQI and domain-specific index construction was also stratified by condensed rural/urban continuum codes from the United States Department of Agriculture Economic Research Service (23) as has been described in prior research (24-26): metropolitan urbanized, non-metropolitan urbanized, less urbanized, and thinly populated. Full methods of the EQI's construction are described in Ref. (2), while data description can be found in Ref. (12).
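The two-stage construction described above can be sketched as follows. This is a minimal illustration on simulated data, not the EQI's actual code: the function name `first_component_index`, the toy variables, and the choice to standardize each variable before PCA are all assumptions made for the example.

```python
import numpy as np

def first_component_index(X):
    """Score each row (county) on the first principal component of X."""
    X = np.asarray(X, dtype=float)
    # Standardize each variable so that no variable dominates by its units (assumed step).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of Vt are the component loadings.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[0]
    scores = Z @ loadings
    explained = s[0] ** 2 / np.sum(s ** 2)  # share of variance on the first component
    return scores, loadings, explained

# Hypothetical toy data: 200 "counties", two domains with 6 variables each.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
air_vars = latent + 0.5 * rng.normal(size=(200, 6))
water_vars = 0.3 * latent + rng.normal(size=(200, 6))

# Stage 1: one PCA per domain, keeping the first component as the domain index.
air_idx, air_loadings, air_explained = first_component_index(air_vars)
water_idx, _, _ = first_component_index(water_vars)

# Stage 2: a second PCA on the domain indices gives an overall index.
overall_idx, overall_loadings, _ = first_component_index(
    np.column_stack([air_idx, water_idx])
)
```

The sign of a principal component is arbitrary, so in practice the orientation of each index (whether higher means worse) has to be fixed by inspecting the loadings. For the stratified indices, the same steps would be repeated within each urban/rural stratum.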

Data Analyses
County prevalence of PTB was defined as the proportion of PTBs among all live births in each county for 2000-2005. The exposure variables for the analyses were urban/rural-stratified domain-specific indices. Index values were linked to PTB prevalences by maternal county of residence. Domain indices were categorized by tertiles where lower values indicated better environmental quality, midrange values indicated average, and higher values indicated worse environmental quality. Linear regression was used to estimate prevalence differences (PDs) and 95% confidence intervals (CIs) for the average (second) and worse (third) tertiles of environmental quality compared to the better (first) tertile as the referent group.
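The outcome and exposure coding described above can be sketched in a few lines. The function names and toy inputs below are hypothetical, and the tertile cut points use NumPy's default quantile interpolation, which may differ from the software used in the original analysis.

```python
import numpy as np

def county_prevalence(ptb_counts, birth_counts):
    """PTB prevalence per county, flagging counties that meet the inclusion rule."""
    ptb = np.asarray(ptb_counts, dtype=float)
    births = np.asarray(birth_counts, dtype=float)
    # Exclude counties with fewer than 10 total births or no PTBs (as in the study).
    keep = (births >= 10) & (ptb > 0)
    prevalence = np.full(ptb.shape, np.nan)
    prevalence[keep] = ptb[keep] / births[keep]
    return prevalence, keep

def tertile_codes(index_values):
    """Code an index as 0 (lowest third, better), 1 (middle), 2 (highest third, worse)."""
    v = np.asarray(index_values, dtype=float)
    cuts = np.quantile(v, [1 / 3, 2 / 3])
    # Values at or below the first cut get 0, above the second cut get 2.
    return np.searchsorted(cuts, v, side="left")

# Toy usage with made-up counts and index values.
prev, keep = county_prevalence([12, 0, 3, 30], [100, 50, 8, 200])
codes = tertile_codes(range(1, 10))
```

Coding the better tertile as 0 makes it the natural referent when indicator variables for the other two tertiles enter the regression.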
Analyses compared worse environments to better environments. Therefore, positive PDs indicate an increase in PTBs with worse environmental quality, while negative PDs indicate a decrease in PTBs with worse quality, with a null value at 0. Comparisons were conducted in this manner to align with the previously published paper (13). Domain main effects and interaction effects were included for each tertile combination using two domains at a time. An example statistical model, presented in Figure 1 using the air and land domains, displays how each domain would have three levels, with the lowest level acting as the referent [shown by (0)], and terms for interactions between domains at each level. As we estimated PDs (as opposed to odds or risk ratios), the interaction is assessed on the additive scale. A conservative interaction p-value of <0.05 was used to assess departure from additivity. Models were also simultaneously adjusted for the EQI domains which were not included in the interaction (e.g., if the interaction was between water and air, we adjusted for land, sociodemographic, and built environment). Models included a percent minority covariate to account for the association between race and preterm delivery, as well as county-level confounding due to the non-random distribution of environmental disamenities (27). This schema included 10 models for each urban/rural category to include all pairwise interactions between the 5 domains.
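A model of the kind described above can be sketched with ordinary least squares on simulated data. Everything here is illustrative and assumed rather than taken from the original analysis: the function names, the simulated coefficients, the single `pct_minority` covariate (the adjustment for the remaining domains is omitted for brevity), and the normal-approximation 95% CI.

```python
import numpy as np
from itertools import combinations

def fit_pair_model(y, a, b, covars):
    """OLS with tertile indicators for domains A and B, their products, and covariates."""
    y, a, b = np.asarray(y, dtype=float), np.asarray(a), np.asarray(b)
    cols, names = [np.ones_like(y)], ["intercept"]
    for lvl in (1, 2):  # tertile 0 (better quality) is the referent
        cols.append((a == lvl).astype(float)); names.append(f"A{lvl}")
    for lvl in (1, 2):
        cols.append((b == lvl).astype(float)); names.append(f"B{lvl}")
    for la in (1, 2):
        for lb in (1, 2):
            cols.append(((a == la) & (b == lb)).astype(float))
            names.append(f"A{la}xB{lb}")
    covars = np.asarray(covars, dtype=float).reshape(len(y), -1)
    X = np.column_stack(cols + [covars])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)  # classical OLS covariance of beta
    return names, beta, cov

def contrast(names, beta, cov, terms):
    """Sum of the named coefficients with a normal-approximation 95% CI."""
    c = np.zeros(len(beta))
    for t in terms:
        c[names.index(t)] = 1.0
    est, se = c @ beta, np.sqrt(c @ cov @ c)
    return est, est - 1.96 * se, est + 1.96 * se

# Simulated counties: 100 per tertile combination, made-up effect sizes.
rng = np.random.default_rng(1)
a = np.repeat([0, 1, 2], 300)
b = np.tile(np.repeat([0, 1, 2], 100), 3)
pct_minority = rng.uniform(0, 1, size=900)
y = (0.10 + 0.020 * (a == 2) - 0.010 * (b == 2) - 0.008 * ((a == 2) & (b == 2))
     + 0.03 * pct_minority + rng.normal(0, 0.005, size=900))

names, beta, cov = fit_pair_model(y, a, b, pct_minority)
interaction = contrast(names, beta, cov, ["A2xB2"])            # interaction contrast
net_effect = contrast(names, beta, cov, ["A2", "B2", "A2xB2"])  # worse/worse vs. referent

# Five domains taken two at a time give the 10 models per urban/rural category.
domains = ["air", "water", "land", "built", "sociodemographic"]
pairs = list(combinations(domains, 2))
```

In this sketch the net effect is the worse/worse cell compared with the better/better referent, and its CI uses the full covariance matrix because the three coefficients being summed are correlated.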
Results are presented as PD and 95% CI comparing worse environmental quality to better quality for each model for (a) each individual domain main effect, (b) the interaction contrast, and (c) the two main effects plus interaction effect (i.e., the "net effect") to show departure from additivity. This net effect is the cumulative association between the two interacting domains on PTB prevalence.

Results
Three thousand one hundred and thirty-one U.S. counties were included in the analyses. Distributions of PTB are shown in Table 1. For all counties, prevalence of PTB ranged from 1.85 to 60.56%, and percent minority population ranged from 0.69 to 97.39%. Patterns of the domain-specific tertiles varied spatially (Figures 2-6). Visually, particular domains exhibit similarity in spatial patterns of worse, average, and better environmental quality; for example, the water and land domains have distributions that resemble one another, while the air, built, and sociodemographic domains appear to be similar to each other. Tables 2-6 describe the main effects for each domain (PD and 95% CI), the interaction contrast term (e.g., air × land only), and the net effect (e.g., PD of air + land + air × land) from the overall non-stratified (entire U.S.) and the four urban/rural-stratified models. The majority of main effects for the non-urban/rural-stratified models were positive (indicating an increase in PTB prevalence with decreasing environmental quality) for the air domain index and negative for the other domain indices (Table 2). Note that since sociodemographic and land main effects were negative (sociodemographic PD: −0.016 and land PD: −0.027) and the interaction contrast was positive, the interaction is antagonistic, i.e., the net effect is closer to the null than one would expect when adding the two main effects (expected strictly additive interaction: [...] Table 3).
Antagonistic interaction was found in non-metro urbanized counties between the air domain and the water, land, and sociodemographic domains (Table 4). In the less urbanized counties, interactions were found between the air/built, air/sociodemographic, sociodemographic/water, and sociodemographic/land domains (Table 5). In the thinly populated counties, antagonism was between the air domain and the land, built, and sociodemographic domains (Table 6). Table 7 displays the summary of the interactions found across all domains and urban/rural-stratified models. Across all of the 50 total models, 15 models suggested interactions which were a departure from strict additivity. These interaction contrasts from Tables 2-6 all indicated antagonistic interaction.

Discussion
We used PTB as a motivating example for the analysis of interaction effects. The main effects observed herein were mostly positive across the air domain and negative across the other domains. This is similar to previously published work, in which worsening air quality was associated with increased PTB, while worsening quality in other environmental domains was not associated with PTB or was associated with decreased PTB; in that work, associations did vary by urban/rural strata, with worsening sociodemographic quality associated with increased PTB only in the metropolitan urbanized counties (13). The majority of models did not have significant interactions, indicating primarily additive relationships between any two domains. We found some evidence for antagonistic relationships between environmental domains, many of which yielded net effects nearer to the null than one would expect from a strictly additive effect. Variations in the number of interactions and interacting domains were observed across urban/rural strata. More interactions were found in the less dense counties (less urbanized and thinly populated) as compared to more urban strata.
The number and combinations of domain interactions differed across urban/rural strata, although all were antagonistic. In metropolitan urbanized counties, interaction was observed between the sociodemographic/land domains only. We may see fewer interactions in highly urban regions because of potentially strong heterogeneity between areas of good and bad environments. In non-metro urbanized counties, interaction was observed between the air domain and the water, land, and sociodemographic domains. In the less urbanized counties, interactions were found between the air/built, air/sociodemographic, sociodemographic/water, and sociodemographic/land domains. In the thinly populated counties, interaction was found between the air domain and the land, built, and sociodemographic domains.
Antagonistic relationships were observed more often in the less dense counties (less urbanized and thinly populated); it is possible that with respect to their underlying populations and environments, extremely urban areas are more similar to one another, while most rural areas may be more heterogeneous. Interactions in the sociodemographic and built domains were seen across urban/rural status. These differences in urbanicity may be partially explained by the variables included in the sociodemographic domain. These variables (e.g., percent of housing with more than 10 units, percent earning greater than high school education, etc.) may better characterize urban poverty/sociodemographic status because factors such as education may be less meaningful in terms of economic success in a primarily farming community. In the three less urban strata, the air domain frequently has antagonistic associations with the other domains. This could be due in part to the strength of the associations of the air domain with PTB and that the air domain had positive associations with PTB, while the other domains were negatively associated.
In addition, differences in urban/rural strata may be explained through differing contributions of variable loadings. In PCA, variable loadings represent the strength of a variable's contribution to the index value, which vary by urbanicity in the EQI. For example, in the sociodemographic domain index, one factor included in the index, the percent of people at or below poverty level, loaded highly positive in the metropolitan urban stratum and highly negative for all other strata (2). The previously published studies on EQI and PTB discuss the variation in variable loadings across urban/rural status, particularly in the sociodemographic domain (13). Urban environments were more likely to have complete spatial and temporal data within the original variables of the EQI (2). Estimates in the urban strata may be less likely to be biased compared to rural strata. These details might also help explain why we saw null total effects even when domain main effects indicated relationships with PTB.
Observed antagonistic interactions could have several possible explanations. PCA ranks each county based on variables used to represent each domain; since we do expect that counties will rank similarly across domains, looking at interactions between indices may be less informative than combining quantitative metrics (e.g., individual pollutants) of environmental exposure into a single metric, as in the overall EQI. As each index includes both beneficial and detrimental aspects of the environment, county rankings might be slightly unbalanced due to the even weighting of all factors when including multiple indices. For instance, an index which includes 10 beneficial and 10 detrimental aspects of environment will give a more balanced total picture of quality than an index with only detrimental aspects, especially when the two are compared side by side. They both create an environmental metric, but one would estimate total environment and the other would better estimate bad environment, creating potentially unexpected interaction relationships. The eigenvalues from the PCA used to construct the domains were equally scaled and centered, which should balance some of these differences. If in truth some factors are more influential in health than others, then this magnitude of effect is lost, especially in comparing domains to each other. The overall EQI, which uses a secondary PCA decomposition, might better account for the variation in measurement across domains.
While this study explores a complex area of environmental interaction, there are potential limitations. PTB is a useful indicator of national health; however, modes of action/mechanisms from environmental quality leading to PTB have not been well established, though some possibilities have been put forth (28). A clearer understanding of the pathways acting from environmental quality to PTB could be utilized to identify potential biologic interactions along with observed statistical interactions. The nature of the EQI and the underlying data leads to potential smoothing of exposure. In other words, some exposures may change over a smaller spatial or temporal scale than the county level and 5-year period used, which would be homogenized by using the EQI. However, the cumulative nature of the EQI does account for simultaneous exposures.
To our knowledge, no other study has examined interaction between different environmental domains, and we were additionally able to examine this at the county level across the entire United States. As the EQI is a publicly available resource in the assessment of environmental quality, other studies may consider using the EQI to control for environmental factors (15) or to assess interaction between exposures and environmental quality. We assessed additive interaction. While we did observe some departures from additivity, many observed effects were strictly additive. In terms of public health importance, this strict additivity indicates that those living in areas influenced by multiple poor environmental factors may be at higher risk of PTB than those in an area where a single environmental domain is of poor quality, as one might expect. However, the antagonistic relationships indicate that this may not always hold true in rural areas.
When evaluating environmental exposures on health outcomes, vigilance in testing both traditional confounding and interaction should be exercised, as exposures naturally occur simultaneously. This study was the first to explore interactions across different environmental domains and demonstrates the utility of the EQI to examine the relationship between environmental domain interactions and human health. Our results demonstrated the existence of interaction between EQI domains, and this application should be considered in future environmental analyses.

Author Contributions
SG contributed to the study conception, performed statistical analyses, and drafted the manuscript. KR contributed to the construction of the EQI, performed statistical analyses, and helped to draft and edit the manuscript. CG contributed to the interpretation of the data and provided critical revisions to the manuscript. JJ contributed to the construction of the EQI and provided critical revisions to the manuscript. YJ contributed to statistical analyses and provided critical revisions to the manuscript. LM constructed the EQI and provided critical revisions to the manuscript. DL conceived of the EQI, oversaw its design and coordination, oversaw coordination of this study, and contributed to manuscript revisions. All authors read and approved the final manuscript and agree to be accountable for all aspects of this work.