Challenges in Epidemiological and Statistical Evaluations of Effect Modifiers and Confounders

In multiple adjusted regression models, researchers sometimes do not know when the interaction occurs and how to interpret the exposure effect estimate while adjusting for the interaction term, resulting in a misinterpretation of the results; this issue has been raised in previous epidemiologic studies. In addition, when the positions of exposure and outcome are switched in the multiple regression, interpreting covariates is challenging. 
 
Here, we present the epidemiological and statistical challenges in evaluating the effect modifier and confounding factor.


WHEN DOES THE INTERACTION OCCUR?
To illustrate how interaction is evaluated, consider the following example. We want to determine whether radon exposure (the third factor) is an effect modifier of the relationship between smoking (exposure) and lung cancer (outcome). For a dichotomous potential modifier [radon exposure (yes/no)], interaction occurs when the effect of the exposure (smoking) on the outcome (lung cancer) is not homogeneous across the strata formed by the third variable (radon exposure) (1-3). The effect can be measured either by the attributable risk (in the additive model) or by the relative risk (in the multiplicative model); both models share the same conceptual basis for evaluating the interaction (1, 2).
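The idea of heterogeneity across strata can be made concrete with a small calculation. The following is a minimal sketch, assuming a hypothetical cohort whose counts are invented purely for illustration (they are not data from any study); the names `counts` and `stratum_effects` are likewise illustrative. It computes the risk difference (additive scale) and the risk ratio (multiplicative scale) for smoking within each radon stratum.

```python
# Hypothetical cohort counts (illustrative only, not real data):
# (lung cancer cases, persons at risk) for each radon x smoking combination.
counts = {
    ("no_radon", "nonsmoker"): (10, 1000),
    ("no_radon", "smoker"): (50, 1000),
    ("radon", "nonsmoker"): (20, 1000),
    ("radon", "smoker"): (200, 1000),
}


def stratum_effects(stratum):
    """Effect of smoking on lung cancer risk within one radon stratum."""
    cases0, total0 = counts[(stratum, "nonsmoker")]
    cases1, total1 = counts[(stratum, "smoker")]
    risk0 = cases0 / total0  # risk among nonsmokers
    risk1 = cases1 / total1  # risk among smokers
    return {
        "risk_difference": risk1 - risk0,  # additive scale
        "risk_ratio": risk1 / risk0,       # multiplicative scale
    }


effects = {s: stratum_effects(s) for s in ("no_radon", "radon")}
for stratum, e in effects.items():
    print(stratum, round(e["risk_difference"], 3), round(e["risk_ratio"], 2))
```

With these invented counts, the risk ratio for smoking is 5 without radon exposure and 10 with it, and the risk difference also differs between strata, so the effect of smoking is not homogeneous on either scale.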
To measure the interaction between smoking and radon exposure in relation to lung cancer, we start from a simple regression model of lung cancer on smoking and add a third variable, radon exposure, along with an interaction term between radon exposure and smoking (i.e., radon * smoking).
The regression model is shown below.
$$Y_{\text{lung cancer}} = B_s X_{\text{smoking}} + B_r X_{\text{radon exposure}} + B_{sr} X_{\text{smoking} \times \text{radon exposure}}$$
where $B_s$ is the effect of $X_{\text{smoking}}$, $B_r$ is the effect of $X_{\text{radon exposure}}$, and $B_{sr}$ is the effect of $X_{\text{smoking} \times \text{radon exposure}}$. The most important issue when using this model to determine whether radon exposure is an effect modifier is the interpretation of the effect size of the interaction term (i.e., $B_{sr}$); that is, rather than focusing only on the p-value, researchers should concentrate on the magnitude of the interaction term. In addition, when the outcome variable is continuous and its values are numerically small (e.g., 0.01, 0.02, and 0.03), the estimated interaction effect can also be small, giving a shallow slope in a graph with exposure on the x-axis and outcome on the y-axis. This is a consequence of the tiny intervals between the outcome values; it does not indicate that there is no interaction. In this case, we can evaluate the interaction graphically by plotting the mean of the outcome for each category of exposure, separately for the strata defined by the effect modifier. Non-parallel lines suggest the presence of interaction. In both statistical and graphical evaluations, the focus should be on the effect size of the interaction term rather than only its p-value.
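The point about small outcome scales can be checked numerically. The sketch below is an illustration under stated assumptions, not the analysis of any real study: the data are simulated, the coefficient values are invented, and an intercept is added to the model (the formula in the text omits one). It fits the interaction model by ordinary least squares with NumPy, refits it after multiplying the outcome by 100, and computes the slopes of the two lines that an interaction plot of cell means would show.

```python
import numpy as np

# Hypothetical balanced design: each smoking/radon combination repeated 25 times.
smoking = np.array([0, 1, 0, 1] * 25, dtype=float)
radon = np.array([0, 0, 1, 1] * 25, dtype=float)

# Assumed coefficients on a numerically small outcome scale (illustrative values).
b0, b_s, b_r, b_sr = 0.01, 0.01, 0.005, 0.02
rng = np.random.default_rng(0)
y = (b0 + b_s * smoking + b_r * radon + b_sr * smoking * radon
     + rng.normal(0.0, 0.001, smoking.size))


def fit(outcome):
    """OLS fit of outcome on intercept, smoking, radon, and smoking * radon."""
    X = np.column_stack([np.ones_like(smoking), smoking, radon, smoking * radon])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return coef


coef_small = fit(y)           # coefficients on the original (small) scale
coef_rescaled = fit(y * 100)  # same data, outcome expressed in larger units


def slope(radon_status):
    """Difference in mean outcome, smokers minus nonsmokers, in one radon stratum."""
    in_stratum = radon == radon_status
    return y[in_stratum & (smoking == 1)].mean() - y[in_stratum & (smoking == 0)].mean()


slope_no_radon, slope_radon = slope(0), slope(1)

print("interaction coefficient (small scale):", coef_small[3])
print("interaction coefficient (rescaled):   ", coef_rescaled[3])
print("ratio of interaction to smoking coefficient:", coef_small[3] / coef_small[1])
print("slopes of the two lines:", slope_no_radon, slope_radon)
```

The interaction coefficient looks tiny in absolute terms only because of the outcome's units: rescaling the outcome rescales every coefficient by the same factor, while the size of the interaction relative to the smoking coefficient is unchanged. The difference between the two stratum slopes equals the fitted interaction coefficient, which is what non-parallel lines in the plot represent.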

IN THE REGRESSION MODEL THAT INCLUDES EFFECT MODIFICATION, WHY DOES THE EFFECT OF THE EXPOSURE VARIABLE ($B_s$) NOT REPRESENT THE RELATIONSHIP BETWEEN EXPOSURE AND OUTCOME?
When radon exposure is evaluated as an effect modifier of the relationship between exposure (smoking) and outcome (lung cancer), it should not be treated as a confounder in the analysis; instead, the interaction effect ($B_{sr}$) should be considered. In addition, in a regression model that includes the interaction term ($B_{sr}$), the coefficient of smoking ($B_s$) does not by itself represent the association between exposure and outcome, because that association depends on the level of the effect modifier (4). The regression model that accounts for the effect modification is:
$$Y_{\text{lung cancer}} = B_s X_{\text{smoking}} + B_r X_{\text{radon exposure}} + B_{sr} X_{\text{smoking} \times \text{radon exposure}}$$
When the radon exposure status is no ($X_{\text{radon exposure}} = 0$), the effect of smoking is $B_s$.
When the radon exposure status is yes ($X_{\text{radon exposure}} = 1$), the effect of smoking is $B_s + B_{sr}$.
These two cases of the dichotomous effect modifier show that the effect of smoking is not $B_s$ alone; it depends on the interaction effect ($B_{sr}$). Thus, when there is an interaction between smoking and radon exposure in relation to lung cancer (i.e., $B_{sr}$ is of noticeable magnitude), the effect of smoking must be measured in separate analyses, stratified by radon exposure status, to obtain the effect of smoking within each radon exposure group ($B_s$; $B_s + B_{sr}$).
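The two stratum-specific effects can be verified with a short calculation. This is a minimal sketch on invented outcome values, assuming 0/1 coding for both variables and an added intercept; the helper names `smoking_effect` and `stratified_slope` are illustrative. It fits the interaction model once, then fits a separate simple regression within each radon stratum, and compares the results.

```python
import numpy as np

# Hypothetical data: 0/1 indicators and invented outcome values (illustrative only).
smoking = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
radon = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
y = np.array([1.0, 3.0, 2.0, 7.0, 1.2, 2.8, 1.8, 7.2])

# Full model: intercept + smoking + radon + smoking * radon.
X = np.column_stack([np.ones_like(y), smoking, radon, smoking * radon])
_, b_s, b_r, b_sr = np.linalg.lstsq(X, y, rcond=None)[0]


def smoking_effect(radon_status):
    """Effect of smoking implied by the interaction model at a given radon status."""
    return b_s + b_sr * radon_status


def stratified_slope(radon_status):
    """Smoking coefficient from a separate regression within one radon stratum."""
    mask = radon == radon_status
    Xs = np.column_stack([np.ones(mask.sum()), smoking[mask]])
    return np.linalg.lstsq(Xs, y[mask], rcond=None)[0][1]


for status in (0, 1):
    print(status, smoking_effect(status), stratified_slope(status))
```

In this example the smoking coefficient from the stratified analysis matches $B_s$ in the unexposed stratum and $B_s + B_{sr}$ in the exposed stratum, so reporting $B_s$ alone would describe only the group without radon exposure.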

INTERPRETING COVARIATES (CONFOUNDERS OR MEDIATORS) IN THE REGRESSION MODEL WHERE THE POSITIONS BETWEEN EXPOSURE (X) AND OUTCOME (Y) ARE SWITCHED IS CHALLENGING
Researchers sometimes face a situation in which the positions of exposure and outcome are switched. For example, in a multiple regression model of the relationship between physical activity (exposure) and breast cancer (outcome) (see the formula below and Figure 1), we evaluate the effect of physical activity on breast cancer risk after accounting for third factors such as diet (as a confounder) and obesity (as a mediator):
$$Y_{\text{breast cancer (outcome)}} = B_p X_{\text{physical activity (exposure)}} + B_d X_{\text{diet}} + B_o X_{\text{obesity}}$$
The diagram in Figure 1 depicts the relationships between the three Xs and the breast cancer outcome.
If researchers switch the positions of exposure and outcome, the model becomes:
$$Y_{\text{physical activity (outcome)}} = B_b X_{\text{breast cancer (exposure)}} + B_d X_{\text{diet}} + B_o X_{\text{obesity}}$$
The new model does not appear to be mathematically problematic. However, the diet variable ($X_{\text{diet}}$) is no longer a confounder, because $X_{\text{diet}}$ has no causal direction toward the new outcome, physical activity. Additionally, the causal relation between exposure and outcome is reversed. For the same reason, the obesity variable ($X_{\text{obesity}}$) is no longer a mediator, because it was assumed to lie on the pathway from physical activity to breast cancer, not on a pathway from breast cancer to physical activity. Therefore, in this new formula, $B_b$ does not reflect the effect of breast cancer adjusted for the confounder and mediator. Furthermore, when researchers set up a multiple regression model, they should first establish whether each third variable is included as a confounder, a mediator, or an effect modifier.
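A small simulation can illustrate why the coefficient in the switched model should not be read as an adjusted effect of breast cancer. The sketch below rests on several simplifying assumptions that are not in the original text: all variables are continuous scores, all relationships are linear, the effect sizes are invented, an intercept is added, and the causal structure is one chosen for illustration in which breast cancer risk has no effect on physical activity at all.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000

# Assumed causal structure (hypothetical effect sizes, continuous scores):
#   diet -> physical activity, diet -> breast cancer risk,
#   physical activity -> obesity -> breast cancer risk,
#   physical activity -> breast cancer risk (direct effect of -0.3).
# By construction, breast cancer risk has NO effect on physical activity.
diet = rng.normal(size=n)
activity = 0.5 * diet + rng.normal(size=n)
obesity = -0.6 * activity + rng.normal(size=n)
cancer = -0.3 * activity + 0.4 * obesity - 0.2 * diet + rng.normal(size=n)


def ols(outcome, *predictors):
    """OLS slopes (intercept dropped) of outcome on the given predictors."""
    X = np.column_stack([np.ones(len(outcome)), *predictors])
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1:]


# Original model: breast cancer as outcome, physical activity as exposure.
b_p, b_d, b_o = ols(cancer, activity, diet, obesity)

# Switched model: physical activity as outcome, breast cancer as "exposure".
b_b, b_d_switched, b_o_switched = ols(activity, cancer, diet, obesity)

print("B_p (original model):", b_p)
print("B_b (switched model):", b_b)
```

In the original model, the coefficient of physical activity is close to the direct effect built into the simulation. In the switched model, the coefficient of breast cancer is clearly different from zero even though breast cancer has no effect on physical activity here, which shows that the switched formula can be fitted without difficulty while its coefficient lacks the intended causal interpretation.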
Overall, when exposure and outcome are switched in the formula and third variables are added without considering their roles in the given multiple regression analysis, the results cannot be interpreted correctly as the effect of the exposure after adjusting for the third variables.
In conclusion, previous epidemiological studies that evaluated effect modification using multiple regression models may have overlooked an important issue in interpreting the effect modifier estimate by focusing only on the p-value rather than the effect size. Additionally, they may have misinterpreted the exposure effect in regression analyses that included the effect modification. Moreover, when the positions of exposure and outcome are switched in the multiple regression, the third variables in the formula do not have the same epidemiological roles as they had before the switch; their roles as confounders, mediators, or effect modifiers need to be reconsidered carefully.