Structural Equation Modeling as a Route to Inform Sustainable Policies: The Case of Private Transportation

The availability of big data allows a wide range of predictive analyses that could inform policies for promoting sustainable behaviors. While providing great predictive power, adopted models fall short in explaining the underlying mechanisms of behavior. However, predictive analyses can be enhanced by complementary theory-based inferential analyses, guiding tailored policy design to focus on relevant response mechanisms. This paper illustrates the complementary value of multidisciplinary inferential models in informing large predictive models. We focus on Structural Equation Modeling, an approach suitable for a holistic examination of different pathways and hypotheses from multiple disciplines. Drawing on an interdisciplinary theoretical framework we develop an empirically tractable model and apply it to a sample of household data from Switzerland. The model focuses on the relationships that delineate the underlying mechanisms for energy consumption behaviors in the case of private transportation. The results are discussed in light of possible contributions to policies aiming at the promotion of sustainable travel behavior as well as data requirements for analyses relying on big data.


INTRODUCTION
Widespread digitisation in various sectors of advanced economies brings about a fundamental change in the availability of consumption data. Smart meters, connected appliances and electric vehicles, to mention a few, allow unprecedented access to individual- and household-level data. Such data is increasingly used to predict various consumption behaviors and to tailor marketing messages to specific segments of the population. However, so far, the potential arising from big data is not often harnessed for promoting sustainable consumption and/or pro-social behavior. This lag can be partly explained by the distinctive methodologies used by experts in separate fields. On the one hand, focusing on predictive analyses, marketing experts rely on models with a great number of variables in order to identify patterns of behavior across different groups. On the other hand, focusing on inferential analyses, social scientists usually rely on parsimonious models to identify underlying mechanisms and to explain behavior. While the latter use theoretical premises to impose a relatively rigid structure on their empirical models, the former use models with less theoretical structure, allowing predictions to be mainly data-driven. Both types of analyses are essential and complementary. We need to understand behavioral mechanisms, in particular for promoting sustainable consumption, while at the same time utilizing the predictive power of the emerging big data. There is, however, a methodological tension hindering mutual feedback, as revealed by some large-scale studies (e.g., O-Power study, Allcott and Rogers, 2014) where reductions in energy consumption were achieved, but the underlying mechanisms for change remained unclear, hence impeding widespread usage beyond a specific context.
In order to tap into the emerging data potentials for promoting sustainability, we need to identify adoption tendencies. In addition, we need to understand the barriers and drivers for different groups in the population. While predictive models are sufficient for the former objective, the latter requires testing specific hypotheses derived from theories. However, theories usually originate from different disciplines, and fail to provide a holistic picture of the consumption behavior of interest. Moreover, by focusing on a single aspect of behavior, usually dictated by a rigorous causality analysis, the analyst inevitably leaves out many variables that might be relevant for a comprehensive analysis.
Inference and prediction can complement each other. Inferential analysis, using statistical models, provides a basis for a sound and theory-driven interpretation, whereas predictive models, based for example on machine learning, are less interpretable but provide a powerful framework for data-driven predictions. Recognizing the complementarity of predictive and inferential analyses in large data sets, we put forward the notion that comprehensive structural models can be used to bridge the chasm between the two types of analyses. Instead of zooming into a single aspect of behavior, comprehensive models include a multitude of variables integrated into a relatively rigid structure. The adopted structure can be based on a comprehensive framework rooted in several disciplinary theories. Such an empirical analysis can be conducted with structural equation models (SEMs), which offer several advantages. First, compared to other statistical models such as linear regression, SEMs have greater flexibility to accommodate a multitude of pathways for a given outcome. Regression models could also be used for causality analysis, but their focus on a specific aspect restricts their ability to consider multiple hypotheses from various disciplines. Second, SEMs not only provide a holistic picture with a relatively large number of variables, they can also be used to assess the relative importance of various causal pathways. Finally, as opposed to predictive models based on data mining and machine learning methods, SEMs can provide an overall picture of behavior, which can be used to generate relevant hypotheses to be tested with further regression models. Therefore, SEM can be used as a "prime language for causal analysis", as put by Pearl (2012), to provide a conceptual structure to purely data-driven predictions.
Our empirical analysis is based on a broad dataset that contains the "distance driven by each household (HH)" as well as additional information relevant to HH decisions, such as socioeconomic and demographic characteristics, norms and values. The objective is primarily to show how SEMs, including multiple pathways, can play a complementary role to predictive models and disciplinary theory-based analyses. Our empirical illustration in the field of private transportation further provides insights into the challenges in translating a comprehensive framework into empirically applicable models and thus also highlights data requirements. The insights gained into behavioral mechanisms driving HH transportation decisions do not constitute the main focus of this paper. They are used to exemplify the added value of applying SEM for modeling different pathways. Our contribution is to highlight the interplay of determinants behind consumer decisions, and the extent to which SEMs based on an interdisciplinary framework can play a complementary role to predictive and inferential models. As such, it is primarily directed to inform a sustainametrics conversation, i.e., a discussion on how increasingly available data can support transitions to sustainable societies and on the limitations to such a role, rather than a transportation behavior and policy discussion.
This paper is structured as follows. In Section Background we provide rationales for choosing yearly distance traveled by private car as a relevant issue, and SEM as our method. A brief description of the integrated framework proposed by Burger et al. (2015) is presented in Section The Integrated HH Energy Consumption Framework. It is followed by a mapping of the underlying relationships between the factors listed in the framework (Sections Implementation of the IHECF, Relationships of Social Opportunity Space and Individual Opportunity Space Factors, and Relationships of Decision-Making Factors and Choices/Routines). The paper then proceeds to lay out the data used for the empirical analysis (Section Method), followed by a presentation of the results (Section Results), discussion (Section Discussion) and conclusions (Section Conclusion).

BACKGROUND
Achieving low-carbon energy goals heavily depends on shifting demand (over time) to match supply (Shove, 2021). Many studies have pointed out behavioral barriers hampering policy interventions in reducing HH energy consumption. These obstacles range from undesirable consequences of public policies (e.g., Alberini et al., 2018) to a number of barriers operating at an individual and HH level (Cattaneo, 2019; del Mar Solà et al., 2021), such as rebound effects (De Borger et al., 2016; Stapleton et al., 2016), missing price incentives or imperfect knowledge (Allcott and Greenstone, 2012; Pothitou et al., 2016), overestimated technical promises (Fowlie et al., 2018) as well as fixed routines or habits determining daily life (Kurz et al., 2015; Kent, 2021). Most of these barriers operate through behavioral mechanisms, for example driven by cognitive heuristics (Kahneman, 2003), emotions (Brosch and Sander, 2014), values and norms (Ababio-Donkor et al., 2020; Bouman et al., 2021).
In this context, the availability of large amounts of information on HH behavior, energy consumption in particular, could provide a window into the functioning of existing interventions and the potential of unexplored solutions. For instance, taking advantage of the availability of hourly data on Swedish residential electricity usage, Brännlund and Vesterberg (2021) have explored whether there is a potential for shifting load between peak and off-peak hours. If possible, this load shifting would be a game changer as it would allow the covering of expected increases in demand without substantial infrastructure adjustments. Their analysis, while indicating a limited potential, does not, however, provide much insight into how such a shift could be achieved through interventions or policies. Indeed, many empirical analyses of rich datasets are unable to shed light on the behavioral mechanisms needed to design policy interventions (e.g., Karimu et al., 2022). These studies often focus on tangible factors and omit potentially important characteristics such as attitudes, emotions, and values. A structural model could be helpful to investigate the behavioral links between consumption shifting and policy-relevant characteristics such as individual preferences and attitudes.
Notably, knowledge about mechanisms and barriers to behavior change is rooted in disciplinary frameworks which do not commonly consider the interplay of multiple factors determining energy demand. While interdisciplinary work is on the rise (e.g., De Witte et al., 2013; Van Acker et al., 2014; Stephenson et al., 2015; Stephenson, 2018; Koszowski et al., 2019), disciplinary divides, such as those between economics, psychology, sociology and geography, often prevent integrated analyses of determinants and the formulation of comprehensive and tailored intervention strategies (Burger et al., 2015; Hess et al., 2018).
In most empirical studies, the focus is either on data-driven predictive models or theory-driven single-equation regressions. However, neither of the two approaches is able to model multiple pathways of energy consumption behaviors. In this paper, we develop a comprehensive model of HH energy consumption and show how such models can be implemented with a SEM to provide structural framing of predictive analyses such as Moro and Holzer (2020). To this end, we build on Burger et al. (2015), who put forward an interdisciplinary model of HH energy consumption based on major empirical and disciplinary findings of research from the fields of psychology, sociology, geography, consumer behavior science, and economics. The reason for choosing the Integrated HH Energy Consumption Framework (IHECF) by Burger et al. (2015) is precisely that it is an interdisciplinary, multi-theory-based framework [e.g., Rational Choice Theory (RCT), Theory of Planned Behavior (TPB), Value Belief Norm Theory (VBN), Norm Activation Model (NAM), Consumer Theory (CT), Behavioral Decision Theory (BDT), Social Practice Theory (SPT)] synthesizing the established main drivers of HH energy consumption. Although there are other multidisciplinary frameworks (e.g., De Witte et al., 2013; Giulio et al., 2014; Götschi et al., 2017; Koszowski et al., 2019) and theories, to the best of our knowledge there is none as comprehensive as the IHECF.
In the empirical analysis, we build a SEM explaining a major domain of HH energy consumption. Specifically, we focus on (self-reported) yearly distance traveled by Swiss HHs using their private car, a behavior that public policies can target to yield potentially large energy savings. In fact, the transport sector accounts for a third of total energy consumption in the European Union (Eurostat, 2021) and in Switzerland (SFOE, 2020). We develop the model in an illustrative manner, with the intention of highlighting challenges in translating such comprehensive models into empirically applicable models and of outlining data requirements.

The Integrated HH Energy Consumption Framework
The IHECF (Figure 1), proposed by Burger et al. (2015), distinguishes two types of energy consumption behaviors (ECBs): material-specific behaviors (e.g., buying a car), and action-specific behaviors (e.g., driving the car). These behaviors are influenced by a multitude of individual and socio-economic factors, embedding individuals with their choices and routines in a broad environment.
This broad environment is characterized by factors related to social opportunity spaces (SOS) and individual opportunity spaces (IOS). IOS factors provide individual boundaries framed by SOS factors, that is, the external societal circumstances in which the individual is embedded. SOS factors include characteristics of available technology and facilities, economic factors (including prices), institutional norms and policies, geographic and climatic factors as well as demographic and cultural differences. IOS factors include aspects describing the individuals' social environment (i.e., social context, milieu and lifestyle), their socio-economic setup (i.e., personal appliances and facilities, place of dwelling, HH size) and their sociodemographics, such as income, age, gender and knowledge.
Individuals base their decisions on a complex interplay of internal decision-making factors and IOS/SOS factors. Internal decision-making factors are attitudes, control, norms, values, heuristics and biases, as well as emotions, which influence choices and routines, and ultimately ECB. Choices can either be habitual, i.e., embedded in routines (e.g., always driving to shops instead of taking other transport modes), or deliberate (e.g., purchasing an electric vehicle).
In addition to describing different types of ECB and their determinants, the IHECF makes suggestions regarding governance factors (i.e., instruments, institutions and actors) that could be activated for (re-)shaping ECBs. It is therefore designed to guide interdisciplinary research on energy consumption and offers a way of organizing findings and viewpoints from different disciplines. Due to its interdisciplinary nature and comprehensiveness, the IHECF is not based on a single theory; thus, causal claims must be established through empirical research.

Implementation of the IHECF
Utilizing the IHECF as foundation, we develop an empirically estimable model by linking the factors in the framework and drawing the relationships that delineate the underlying ECB mechanisms. A complete overview of the relationships can be seen in Figure 2. Aiming at an empirically tractable model, we focus on forward flows, that is, relationships going from the broad end of the triangle (SOS) to the tip (choices and routines). The model does not, however, specify relationships in a purely sequential manner. Instead, while some factors influence other factors from an adjacent level, certain factors from all categories (SOS, IOS, decision-making and choices/routines) are modeled as direct predictors of behavior, as can be found in the literature on ECB. To this end, we draw upon empirical research and theories (e.g., TPB, VBN, CT, RCT, and BDT) 1 from psychology, sociology, geography, consumer-behavior science, and economics, focusing on the most robust findings. The generic model specified in Figure 2 represents a tractable option of the IHECF. Without repeating the discussion in Burger et al. (2015), we present some major findings in the ECB literature in the following subsections to underpin the relevance of the four dimensions in our generic model.

Relationships of Social Opportunity Space and Individual Opportunity Space Factors
Starting with the SOS and IOS factors located at the base of the IHECF triangle, we outline the relationships of several determinants traditionally considered in economics. From an economic point of view, utility-maximizing individuals decide on their level of energy consumption while considering the unit price of energy and available income (e.g., Borenstein, 2015). This has led to an abundant literature on price and income elasticities of energy demand (e.g., Havranek and Kokes, 2015; Labandeira et al., 2017). However, as energy is not consumed per se but used as an input in the HH production function, the effect of energy prices should be mediated by the characteristics of appliances (e.g., vehicles or electronic devices), which in turn depend on the available technology and the facilities available to the HH.
Technology must be included in the model, as technological progress results in greater efficiency, hence lowering the price of energy service and stimulating demand (cf. also Fowlie et al., 2018 on relevance of available technology). For instance, when a HH purchases a more efficient car, driving becomes cheaper, possibly resulting in a greater usage. This "rebound" effect could offset part of the expected energy consumption reductions. While the existence of rebound effect is widely accepted, its magnitude remains contentious, with empirical estimates for private car travel ranging from negligible to almost 100% (e.g., Azevedo, 2014).
Other SOS factors include geography and climate, taken in a generic way to refer to weather-related factors (e.g., temperature and precipitation), topographic characteristics of the HH's location (e.g., elevation and slope), and atmospheric conditions such as greenhouse gas concentration. All these conditions are also structural determinants of energy usage usually labeled as "demand shifters" (e.g., Kavousian et al., 2013; Winkler et al., 2014).
Demographic factors at the societal level, such as population size, age structure, urbanization, and population density, as well as at the HH level, such as HH size, gender, and age, have been found to significantly impact energy consumption and travel (Brounen et al., 2012; Liddle, 2014; Karatasou et al., 2018; Buylova, 2020). HH size is a structural determinant as, for instance, a 5-person HH naturally consumes more energy than a 2-person HH. However, economies of scale result in a less than proportional increase in energy usage for every additional member. Moreover, the composition of the HH (e.g., size, presence of children) and the type of accommodation increase the availability and use of appliances (e.g., number of cars and intensity of usage, cf. De Witte et al., 2013). Also, ECB appears to be associated with occupants' age, gender, education and ethnicity, partly because of differences in activities (e.g., Brounen et al., 2012; McLoughlin et al., 2012; Tweed et al., 2015; Karatasou et al., 2018; Buylova, 2020).
In addition, and given existing evidence, socio-cultural characteristics and how they relate to potentially mediating factors such as values or norms should be integrated in the model. Culture has been defined as "the integrated pattern of meanings, beliefs, norms, symbols and values that individuals hold within a society, with values representing perhaps the most central cultural feature" (Oreg and Katz-Gerro, 2006, p. 466). Accordingly, culture provides the broader social context through which individuals learn what is valued and acceptable in their society. Moreover, institutions provide a broader context in which social norms are perceived (Allcott, 2011; Ostrom, 2014), leading to social norms as mediator between institutions and ECB. For example, Stephenson et al. (2015) and Stephenson (2018)  Between the broad socio-cultural context and the narrower context of social groups, lifestyle and milieu groups also play a role and have been observed to behave differently in terms of ECB (Spaargaren, 2003; Sütterlin et al., 2011; Schubert et al., 2020). Some groups, for example, perceive more social pressure through their social context to engage in ECB than others (Sütterlin et al., 2011; Schubert et al., 2021). A social milieu encompasses people, the physical and social conditions underlying traditions and values, which are relatively stable and resistant to societal changes (Mochmann and El-Menouar, 2005). Lifestyles, on the other hand, express a person's social position through their behavior and consumption patterns (Van Acker et al., 2014; Schubert et al., 2020). Whereas social context mediates the influence of culture and lifestyle/milieu on values and social norms, lifestyle and milieu, in turn, mediate other IOS factors such as the place of dwelling. Finally, environmental and energy-related knowledge can affect ECB via attitudes (e.g., Nayum and Klöckner, 2014; Pothitou et al., 2016).

Relationships of Decision-Making Factors and Choices/Routines
Psychological theories [e.g., TPB, NAM, VBN, and Social Cognitive Theory (SCT)] and empirical research (cf. the meta-analysis by Klöckner, 2013, itself based on the main psychological theories) have identified intention, control, habits/routines, heuristics/biases, attitudes, norms, values, as well as emotions, as important predictors of environmental behavior, including ECB.
Intentions, habits/routines, and control are direct predictors of ECB. Even though intention is not explicitly listed as a decision-making factor in the original IHECF (Figure 1), it constitutes a major predictor of ECB and is seen as a gauge of people's willingness to adopt environmentally friendly ECBs (e.g., Tan et al., 2017; Sun et al., 2018). We therefore include intention in our model (Figure 2), mediating the relationship between ECB and attitudes, norms, emotions, and control (Klöckner, 2013; Hiratsuka et al., 2018; Brosch, 2021).
Control constitutes a further major predictor of ECB, known as perceived behavioral control (PBC) in the TPB (Ajzen, 1991, 2006) and self-efficacy in the SCT (Bandura, 2001). Control describes how able a person feels, based on circumstances and skills, to perform certain behaviors, and it is influenced by social norms (e.g., Klöckner, 2013; Fu, 2021).
A large share of our daily behavior is deemed habitual, involving very little deliberation (Marien et al., 2018); such behaviors are also referred to as routines. Similar to habitual choices (e.g., using a car for commuting), one-off decisions (e.g., what car to buy) could have habitual aspects such as brand loyalty (Nayum and Klöckner, 2014). Habits are also related to heuristics and biases (Verplanken and Aarts, 1999; Klöckner, 2013) and can be considered as mechanisms for focusing on certain aspects of a complex decision while ignoring others (Tversky and Kahneman, 1974).
For simplicity, we assume that heuristics and biases are included in habits and do not consider separate relationships. Indeed, to indicate the overlap with routines and to highlight the habitual element in choices, we place habits in-between decision-making factors, choices, and routines. Habits have, in fact, been found to be a main predictor of behavior, mediating the relationship of ECB with intentions, personal norms and control (e.g., Nayum and Klöckner, 2014). Naturally, the available facilities and appliances, for instance access to a specific car, have an impact on the formation and persistence of habits (Klöckner and Matthies, 2009; Klöckner and Blöbaum, 2010; Hess and Schubert, 2019; Punzo et al., 2021). In addition, personal norms, that is, people's personal and moral considerations, mediate the relationship of control, social norms and values with intentions and habits (Stern, 2000; Klöckner, 2013).

METHODS
To empirically illustrate our framework-based approach, we carried out a SEM analysis. SEMs are a collection of different integrated analytical techniques. These include, for example, path analysis (regression analysis) and factor analysis. Factor analysis can be utilized to estimate latent factors from observed variables. Path analysis, on the other hand, offers the opportunity to estimate the effect of one or more variables on others and hence allows the investigation of various hypotheses in a single model. SEMs can fit data from experimental, non-experimental and observational studies. They are able to simultaneously estimate multiple interrelated relationships of endogenous and exogenous variables, and to account for measurement errors. In addition, SEMs provide fit statistics to evaluate the implications of theoretical assumptions or relationships (Bollen and Pearl, 2013).
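The combination of path analysis and factor analysis described above is often written in LISREL-style notation. The following is the generic textbook formulation, given here as an assumption for orientation only and not necessarily the exact parameterization estimated in this paper: $\eta$ and $\xi$ denote latent endogenous and exogenous variables, $y$ and $x$ their observed indicators, and $\zeta$, $\varepsilon$, $\delta$ error terms.

```latex
\begin{align}
\eta &= B\,\eta + \Gamma\,\xi + \zeta
  && \text{(structural part: path analysis)} \\
y &= \Lambda_y\,\eta + \varepsilon
  && \text{(measurement part: endogenous indicators)} \\
x &= \Lambda_x\,\xi + \delta
  && \text{(measurement part: exogenous indicators)}
\end{align}
```

In this notation, $B$ and $\Gamma$ collect the path coefficients, while $\Lambda_y$ and $\Lambda_x$ collect the factor loadings; an observed variable without measurement error corresponds to a loading of one and a zero error variance.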
The SEM developed in this paper is specified as displayed in Figure 3. It is estimated using Stata 16 with all variables demeaned (i.e., each variable's mean is subtracted from all of its values so that the resulting variable is centered at zero). Covariances are allowed only between exogenous variables and not between measurement errors. All structural paths are grounded in theory, as discussed above, and have been successfully tested in previous empirical studies (cf. Sections Relationships of Social Opportunity Space and Individual Opportunity Space Factors and Relationships of Decision-Making Factors and Choices/Routines).
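As a minimal sketch of the demeaning step, written in Python rather than the Stata 16 used for the actual estimation, the snippet below centers a variable at zero. The function name `demean` and the mileage values are hypothetical, and ignoring missing values when computing the mean is an assumption made for illustration.

```python
import numpy as np

def demean(values):
    """Center a variable at zero by subtracting its mean.

    Missing values (NaN) are ignored when computing the mean and
    stay missing in the result (an assumption for this sketch).
    """
    x = np.asarray(values, dtype=float)
    return x - np.nanmean(x)

# Hypothetical annual mileage values in km, with one missing response
km = [5000.0, 10000.0, 15000.0, np.nan, 30000.0]
km_centered = demean(km)

print(km_centered)
```

Centering leaves slope coefficients unchanged; it only shifts intercepts, which is why it is a common preprocessing step before estimation.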
For the estimation, we rely on Full Information Maximum Likelihood (FIML), which uses all available data, including observations with missing values. As a robustness check, we also applied the (standard) Maximum Likelihood (ML) approach to the subsample composed only of observations without missing values. We conclude that there are no systematic disparities between the FIML and ML samples 2 .
For the estimated model we report both unstandardized and standardized coefficients (Appendix Tables). The two sets of coefficients are complementary: unstandardized coefficients provide quantitative impacts of the covariates on the endogenous variables, thus these path coefficients are in the same unit as the endogenous variable of that path. Standardized coefficients reveal the relative importance of each covariate, thus these coefficients are unit-free and therefore make it possible to compare variables of different magnitudes.
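The relationship between the two sets of coefficients can be sketched for a single path. Assuming the textbook conversion, a standardized coefficient equals the unstandardized one multiplied by the ratio of the standard deviations of predictor and outcome. The data and variable names below are simulated and hypothetical, and SEM software may use model-implied rather than sample standard deviations.

```python
import numpy as np

def standardized_coefficient(b, x, y):
    """Rescale an unstandardized slope b (units of y per unit of x)
    into a unit-free standardized coefficient."""
    return b * np.std(x, ddof=1) / np.std(y, ddof=1)

# Simulated, purely illustrative data: income and annual km driven
rng = np.random.default_rng(0)
income = rng.normal(6000.0, 2000.0, size=500)
km = 2.0 * income + rng.normal(0.0, 5000.0, size=500)

# Unstandardized slope of km on income (ordinary least squares)
b = np.cov(income, km, ddof=1)[0, 1] / np.var(income, ddof=1)
beta = standardized_coefficient(b, income, km)

print(f"unstandardized: {b:.3f} km per unit of income")
print(f"standardized:   {beta:.3f} (unit-free)")
```

With a single predictor, the standardized coefficient coincides with the correlation between the two variables, which offers a quick sanity check on the conversion.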

Data
Data analyzed in this paper were collected in April and May 2017, as part of the second wave of the Swiss Household Energy Demand Survey (SHEDS) 3 . SHEDS respondents are representative of the Swiss population 4 according to age, gender, region and home ownership. Respondents self-report their equipment and usage in several energy consumption domains (heating, electricity, and mobility), as well as socio-demographic, psychological (e.g., environmental attitudes, values, etc.) and sociological characteristics (e.g., life events, etc.). Our illustrative empirical analysis (using FIML) 5 focuses on 3,362 car owners, a subsample of the entire SHEDS sample (see Table A1). This includes respondents who own a car running on gasoline or diesel, that is, about 73% of the entire sample of 5,015 HHs 6 . Table 1 provides an overview of all variables considered in our IHECF framework-informed model. Annual mileage is obtained as the answer to the question "On average, how many kilometers do you drive per year?", which is asked only of car owners. Further details of the psychological constructs are provided in Table A2. In order to adapt to the available data, we exclude a number of variables from the final empirical SEM, as depicted in Figure 3 7 .

3 Weber et al. (2017) provide a detailed description of SHEDS, which was based on information of the IHECF.
4 SHEDS collects data from all parts of Switzerland except Ticino, the Italian-speaking canton representing less than 5% of the Swiss population.
5 The ML analysis includes 922 observations. Despite their important size difference, the two samples do not show statistical differences in the main variables. An exception is respondent age, which is on average lower in the ML sample.
6 The focus is on cars running on gasoline or diesel. We excluded 152 observations corresponding to cars with other engine types (e.g., electric or hybrid cars) and 151 outlier observations with evident reporting errors, in particular suspiciously large reported values for fuel consumption.

Model Fit
Basing our decisions on Hu and Bentler's (1999) criteria, we find that our model 8 satisfies the suggested RMSEA 9 fit statistic (0.04, cut off < 0.06). Other fit statistics are slightly outside the recommended ranges, such as CFI (0.83, cut off 0.90 or higher) and the χ²/DF ratio (χ² = 7,394.15, DF = 993, χ²/DF = 7.45, cut off < 2-5, p < 0.001); however, research has shown that the optimal threshold depends on numerous features of the model, including estimation method, sample size, number of degrees of freedom, and the extent to which assumptions of multivariate normality are met (Hu and Bentler, 1999; Marsh et al., 2004; Tomarken and Waller, 2005) 10 . Noting that other large SEMs in the literature show similar goodness-of-fit statistics (Bouscasse and Bonnel, 2016), we deem the fit statistics for our model acceptable. Overall, the model reports an explained variance (R²) of 17% for the main endogenous variable (km driven/year), and the R² values for the fifteen further endogenous variables range from 1 to 82% (see Table A5).

7 More precisely, the reasons for excluding a variable are as follows: (i) Data availability (i.e., attitudes, social context, institutional norms or policies) for 2017; (ii) Mediating factor missing (e.g., for knowledge); (iii) Multicollinearity and heavily unbalanced distribution of respondents across categories (for lifestyle categories).
8 The final measurement model (of the latent psychological variables) shows appropriate fit statistics (FIML: χ² = 1,005.251, DF = 202, p < 0.001, CFI = 0.969, RMSEA = 0.034). Drawing on modification indices, confirmatory factor analysis, composite reliability, discriminant and convergent validity as well as model fit statistics, we exclude five items showing low loadings on the latent constructs (details in Tables A3, A4).
9 Often presented together with the SRMR fit statistic, which was only available for the ML model and was also acceptable (SRMR = 0.053, ML only; cut off < 0.08).
10 As robustness checks, we estimated reduced versions of our model (available upon request), which show improved fit statistics.

Table 1. Model factors and variables included in SEM.
Model factors: Variables included in SEM
Behavior: Kilometers driven per year by private car, scale: "up to 5,000 km" to "more than 50,000 km", steps of 5,000 km, with an "I don't know" option
SOS variables
Geography
The decision-making variables are further described in Table A2.

Detailed results are reported in Tables A5, A6. Each factor in the model may directly or indirectly affect distance traveled (the final outcome in our model). Direct predictors (for instance gender and age) are connected to distance traveled without mediating factors, whereas indirect predictors (for instance social and personal norms) affect distance only through a mediating factor (habits in the case of social and personal norms). There can be several mediating factors between an indirect predictor and the final dependent variable. Predictors may also affect distance traveled both directly and indirectly, in which case the total effect is given by the sum of both. Thus, overall, our analysis depicts a number of different possible pathways which can explain a given behavior. Table A6 reports all direct, indirect and total effects. To facilitate the interpretation of results, here we provide different figures summarizing the standardized coefficients of the different investigated pathways.
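The fit statistics reported in the Model Fit section can be cross-checked from the reported χ², degrees of freedom and sample size. The sketch below assumes the common point-estimate formula RMSEA = sqrt(max(χ² − DF, 0) / (DF · (N − 1))) and assumes that N = 3,362 (the FIML sample) applies to both the full model and the measurement model; the exact formula used by the estimation software may differ slightly.

```python
import math

def chi2_df_ratio(chi2, df):
    """Relative chi-square: chi-square divided by degrees of freedom."""
    return chi2 / df

def rmsea(chi2, df, n):
    """RMSEA point estimate (assumed formula, using N - 1)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

N = 3362  # assumed: FIML sample of car owners

# Full structural model, values as reported in the text
ratio_full = chi2_df_ratio(7394.15, 993)
rmsea_full = rmsea(7394.15, 993, N)

# Measurement model of the latent psychological variables (footnote 8)
rmsea_meas = rmsea(1005.251, 202, N)

print(f"chi2/DF (full model): {ratio_full:.2f}")
print(f"RMSEA (full model):   {rmsea_full:.3f}")
print(f"RMSEA (measurement):  {rmsea_meas:.3f}")
```

Under these assumptions the computed values agree with the reported χ²/DF of 7.45 and with RMSEA values of 0.04 and 0.034 after rounding.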

Direct Pathways Explaining Km-Driven/Year
Figure 4 depicts the direct relationships or pathways between the main endogenous variable, annual km driven by private car, and different decision-making/routine, IOS and SOS factors. Findings show that among the direct decision-making/routine factors, only choices related to habitual transportation mode are significant. Unsurprisingly, people who routinely use public transport and soft modes (walking, cycling, etc.) for commuting or leisure travel drive fewer km/year by car. Compared to the reference group (i.e., people using their private car as their main travel mode), individuals who habitually use public transport or soft modes for commuting drive about 3,000 km less per year on average, and those using these modes of transport for leisure purposes drive about 2,000 km less per year on average (Table A6, total effects).
Several direct IOS factors are significant. For instance, subscription to a general or a regional transport pass is associated with lower private car usage, by about 5,000 and 3,000 km/year on average, respectively. On the other hand, people who own a diesel or automatic car drive about 3,000 and 700 km/year more on average, respectively, than those with gasoline engines or manual transmissions. Additionally, owners of new cars (defined as cars registered up to 1 year before implementation of our survey) drive around 800 km/year more on average than those with older cars. Looking at demographics, we find that higher income is associated with greater car usage, with an increase of 130 km/year on average for a 10% increase in income; female respondents drive about 1,200 km/year less on average; and younger people (18-34 years of age) drive more, with an average difference of 900 km/year relative to middle-aged respondents (35-54 years) and 1,700 km/year relative to older respondents (55 years or more).
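The income association is expressed per percentage change in income, which can be read as a semi-elasticity. The following minimal sketch assumes that income enters the distance equation in natural logarithms; this is an assumption made for illustration only, since the specification is not stated in this section. Under that assumption, the change in annual distance for a relative income increase r is β·ln(1 + r), and β can be backed out from the reported 130 km/year per 10%.

```python
import math

# Reported association: about +130 km/year for a 10% increase in income.
REPORTED_KM_CHANGE = 130.0
REPORTED_INCOME_INCREASE = 0.10

# ASSUMPTION (not stated in the text): income enters in natural logs, so
# delta_km = beta * ln(1 + relative_increase).
beta_log_income = REPORTED_KM_CHANGE / math.log(1.0 + REPORTED_INCOME_INCREASE)


def km_change(relative_increase, beta=beta_log_income):
    """Implied change in km/year for a relative income change (0.10 = +10%)."""
    return beta * math.log(1.0 + relative_increase)


for pct in (5, 10, 20, 50):
    print(f"+{pct}% income -> {km_change(pct / 100):.0f} km/year")
```

Under this assumed functional form the association is concave: doubling the percentage increase in income less than doubles the implied change in distance.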
Several SOS factors, in particular geographical aspects, also explain annual mileage by private car. People living in the French-speaking region of Switzerland (Romandy) drive about 700 km/year more on average than those in the German-speaking region. In line with commuting transport habits, distance from home to work is also related to annual mileage: each 1 km increase in commuting distance is associated with a 40 km increase in the annual distance driven.

[Figure legend (see Table A5): ns = not significant at 5%; PT = public transport; soft = soft mobility; fuel consum. = fuel consumption.]
Some direct relationships of decision-making, IOS and SOS predictors of behavior are non-significant. These are relationships of behavior with control (DM), intention (DM), HH size and owning more than 1 car in the HH (IOS), rural and suburban dwelling (vs. city; SOS).
Overall, the strongest predictors are commuting habits, structural factors related to the type of engine (diesel vs. gasoline) and public transport passes (general and regional) (see Table A6, standardized total effects).
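Ranking predictors by standardized effects matters because raw effects in km/year depend on the unit of each predictor. A minimal sketch of the usual conversion, β_std = b × SD(x) / SD(y), follows. The raw effects are the approximate values quoted in this section, whereas all standard deviations are hypothetical placeholders rather than values from the data, so the resulting ranking is illustrative only.

```python
# Approximate raw (unstandardized) effects on km driven per year, as quoted above.
raw_effects = {
    "commuting_distance_km": 40.0,   # per additional km of commuting distance
    "female": -1200.0,               # female respondents vs. the reference group
    "diesel_car": 3000.0,            # diesel vs. gasoline engine
}

# HYPOTHETICAL standard deviations, for illustration only (not from SHEDS).
sd_x = {"commuting_distance_km": 20.0, "female": 0.5, "diesel_car": 0.4}
SD_KM_PER_YEAR = 8000.0


def standardize(b, sd_predictor, sd_outcome):
    """Convert a raw coefficient into SDs of outcome per SD of predictor."""
    return b * sd_predictor / sd_outcome


standardized = {
    name: standardize(b, sd_x[name], SD_KM_PER_YEAR)
    for name, b in raw_effects.items()
}

# Rank predictors by the absolute size of their standardized effect.
ranking = sorted(standardized.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, beta in ranking:
    print(f"{name}: raw={raw_effects[name]:+.0f} km/year, standardized={beta:+.3f}")
```

The sketch shows why a raw effect of 40 km per commuting kilometer can outrank a raw effect of 1,200 km for a binary predictor once both are put on a common scale.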

Indirect Pathways Explaining Km-Driven/Year via Habitual Transport Choices
In addition to distinguishing between direct and total effects for the main dependent variable (km driven/year), one strength of SEMs is that they can help identify different pathways of underlying mechanisms for developing interventions. In Figure 5, we display a number of such pathways, focusing on the habitual transport choice for commuting and leisure and their own explanatory variables.
People with a general or regional travel pass are more likely to use public transport for commuting and leisure. Likewise, higher personal norms, control and intentions to reduce car use/carbon footprint explain public transport choice for leisure.
Habitual use of soft modes (walking/cycling) for commuting and leisure transport is explained by a similar group of direct predictors, namely intentions, control and partly personal norms (only for commuting). Interestingly, we observe that intentions to reduce car use/carbon footprint are negatively related to habitual use of soft modes, whereas they are positively related to habitual use of public transport.
Exploring significant explanatory factors of habitual commuting and leisure transport choices further, we find that higher intentions are related to higher control and positive emotions, but lower social and personal norms.
Control, the feeling of being able to change one's behavior toward more environmentally friendly alternatives, is positively related to social norms (normative information from friends and family who behave and expect others to behave in a pro-environmental way), time spent at 2nd home and property ownership (vs. renting). There is a negative relationship between control and having a diesel car, fuel consumption and having more than one car in the HH. There is no significant relationship between control and having a new car or driving an automatic car, ownership of general and regional transport passes, distance to amenities, or renting a house (vs. renting a flat).

FIGURE 5 | Indirect pathways explaining km-driven/year via habits for commuting and leisure by public transport (PT) and soft mobility (e.g., bike, walking). Reported are standardized direct coefficients with p < 5% (complete results are provided in Tables A5, A6). Numbers 1-4 refer to the different types of habitual behavior: 1: habitual use of public transport (PT) for commuting, 2: habitual use of soft transport measures for commuting, 3: habitual use of PT for leisure purposes, 4: habitual use of soft mobility measures for leisure purposes. na = not applicable, meaning this path was not tested; ns = not significant at 5%.
High personal norms (moral personal standards to behave pro-environmentally) are related to a perception of control, positive social norms and positive biospheric (nature-focused) values. Personal norms are negatively related to egoistic (self-focused) and altruistic (other-focused) values. There is no significant relationship between personal norms and hedonic (pleasure-focused) values.

Indirect Pathways Explaining Km-Driven/Year via Relevant IOS and SOS Factors
Investigating the underlying mechanisms of IOS and SOS factors allows us to further understand possible pathways for behavior change (Figure 6). For example, we observe a higher incidence of diesel (vs. gasoline) engines in rural areas and in German-speaking regions. Larger commuting distances are also related to having diesel cars or newer cars, hence higher fuel efficiency. On the other hand, individuals with higher incomes are more likely to have cars with lower fuel efficiency. Living in rural and suburban areas is related to lower numbers of general and regional transport passes. Finally, having a longer commuting distance and living in the French-speaking region are related to larger numbers of general transport passes in the HH.
There are a number of non-significant relationships between the IOS and SOS variables. For instance, income and dwelling location (rural and suburb vs. city dwelling; German- vs. French-speaking Swiss regions) do not appear to be related to owning a new car. Furthermore, having a new car, commuting distance and dwelling location do not seem to be related to fuel consumption.

DISCUSSION
In this paper, we illustrate how a SEM informed by a multidisciplinary framework can bridge the divide between predictive big data models and single-equation disciplinary models. Specifically, we estimate an interdisciplinary model of energy consumption behavior using data on annual mileage by private vehicle. The bridge can be seen to be established if we can reveal different underlying mechanisms and identify their relative importance, here to reduce private vehicle usage, thus informing the specification of big data analyses on the one hand, and of disciplinary theory-based models on the other. We indeed demonstrate that framework-informed SEMs can draw out different suitable pathways to behavior change, and thus we illustrate a complementary way to direct the analysis of big data. In the following, we discuss the empirical findings on annual mileage by private vehicles to point out the related achievements.

[Figure legend (see Table A5): ns = not significant at 5%; bold variables = explained variable.]

Understanding Different Direct and Indirect Pathways to Change Driving Behavior
A strength of using SEMs is that several possible intervention pathways can be simultaneously investigated. Our results show that lower annual mileage by private car is to a large extent explained by habitual use of alternative modes for commuting and leisure, such as taking public transport and walking/cycling. Additionally, higher annual mileage by car is related to owning a diesel, automatic or new car, and lower annual mileage to holding transport passes. Our results point to a strong influence of habit and structural aspects on mode choices, supporting previous empirical findings (e.g., Klöckner and Matthies, 2009; Hess and Schubert, 2019; Punzo et al., 2021). These findings suggest that interventions, rather than focusing solely and separately on, for instance, taxes or information campaigns, should address different mechanisms and be designed to complement each other, as suggested by others (Bornemann et al., 2018; Urbanek, 2021). For instance, the findings point to a promising hypothesis that could be tested regarding a combined intervention consisting of: (i) breaking unsustainable habits and forming new sustainable transport habits, (ii) measures to discourage car usage, and (iii) structural changes to facilitate commuting with other travel modes.
Furthermore, we find that habitual use of public transport is largely explained by subscription to the right travel means or "equipment" (general or regional passes), as well as positive intentions to change behavior. Facilitating the purchase of alternative travel means or free public transport may therefore be a necessary step to increase usage of public transport (Dai et al., 2021). Possible interventions could look at factors related to differences between urban, suburban and rural living and possible region-specific cultural differences (Punzo et al., 2021). Cultural differences between French-speaking and German-speaking parts of Switzerland identified in our analysis are in line with previous findings and show that drivers in the French-speaking region drive longer distances and have a stronger preference for fuel efficient cars (Filippini and Wekhof, 2021). The uptake of regional passes is less well explained by the model, but structural factors indicating a link to rural and suburban access are significant predictors. Regional differences are also observed regarding public transport passes, with lower subscription rates in the French-speaking region, in line with results from the Swiss Mobility and Transport Microcensus, which show that HHs in the French-speaking region own more cars, but fewer bikes and far fewer public transport passes than their German-speaking counterparts (SFSO, 2012, 2017). It therefore appears that HHs in the different linguistic regions behave differently regarding their transportation means and their mobility in general. Factors related to structural preferences, such as owning a diesel car, should simultaneously be addressed in policy interventions. Diesel car ownership is related to structural factors (rural living) and regional differences (i.e., German- vs. French-speaking). However, the proposed model fell short in explaining the underlying determinants of diesel car ownership, and further work is needed here.
Our results also indicate that some of the main disciplinary determinants of driving behavior, previously suggested as intervention or trigger points to change behavior, are either not significant or only indirectly related to private car usage. This finding suggests that some studies may overstate the relevance of the factors conventionally studied within each discipline. This could be the case for some psychological determinants of ECB such as intentions and control (e.g., Klöckner, 2013), although the latter findings are in line with Fu's (2021) differentiated findings regarding control. Furthermore, this may also be the case for economic determinants such as fuel consumption (Linn, 2013), and HH size, previously documented as important in the transport literature (De Witte et al., 2013). Various reasons could explain the non-significance of a direct relationship between these factors and annual mileage by car. The non-significance of intentions to behave in an environmentally friendly way may be due to a well-documented phenomenon referred to as the intention-behavior gap (e.g., Hassan et al., 2016; Zhang et al., 2019). Similar non-significance has been found in interdisciplinary research modeling car use (Klöckner and Friedrichsmeier, 2011). Ideally, intentions should be collected prior to the behavior: if intentions (e.g., about reducing car usage) are collected at the same time as the behavior (car usage), as is done in SHEDS, the intentions may not have been implemented yet, unless they were formulated some time prior to the data collection. Longitudinal analysis could help overcome this limitation, and future research should investigate whether the intention-behavior gap remains when estimating models on panel data.
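The timing issue can be made concrete with a small sketch of how panel data could be arranged so that intentions precede behavior. The record layout, field names and values are hypothetical and do not reflect the actual SHEDS data structure; the sketch also assumes annual survey waves.

```python
# HYPOTHETICAL panel records: (household_id, wave_year, intention_score, km_driven).
records = [
    (1, 2016, 4, 12000), (1, 2017, 5, 10000), (1, 2018, 5, 9000),
    (2, 2016, 2, 20000), (2, 2017, 2, 21000),
    (3, 2017, 3, 15000),  # observed in one wave only, so no lagged pair
]


def lagged_pairs(rows):
    """Pair each household's intention in wave t with its km driven in wave t + 1."""
    by_key = {(hh, year): (intention, km) for hh, year, intention, km in rows}
    pairs = []
    for (hh, year), (intention, _) in sorted(by_key.items()):
        following = by_key.get((hh, year + 1))  # same household, next annual wave
        if following is not None:
            pairs.append({
                "hh": hh,
                "intention_year": year,
                "intention": intention,
                "km_next_year": following[1],
            })
    return pairs


pairs = lagged_pairs(records)
for pair in pairs:
    print(pair)
```

A model estimated on such pairs would relate intentions stated in one wave to the mileage reported in the following wave, rather than to mileage reported at the same time.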
While the positive effect of fuel efficiency on annual mileage by car (the so-called rebound effect) is a much-debated topic in the economics literature, our estimates yield no statistically significant evidence of such an effect. This can be explained by the strong heterogeneity of individuals in their rebound responses (as outlined in another context by Hediger et al., 2018), but also by the theoretical structure imposed by the model. While recommending caution against interpreting this result as the non-existence of rebound, we consider that the results point to the importance of moderating effects (here, psychological factors) in understanding rebound behavior. In fact, while many factors might influence both efficiency and the distance driven in a positive manner, factors such as social norms and personal values might lead to higher efficiency and lower usage at the same time.
While recognizing the data limitation regarding respondents' knowledge/information, we observe that the model strongly favors policies targeting habit-formation mechanisms through setting intentions, increasing control, and strengthening social and personal norms. Furthermore, we find effects from IOS and SOS factors that differ from those previously reported in the transport literature. For example, unlike De Witte et al. (2013), we do not observe a relationship between HH size and distance traveled. The non-significance of HH size may be due to the relative magnitude of the effect, meaning that the impact of this factor is relatively small, especially when compared to other factors.
We can summarize our main findings regarding different direct and indirect pathways in four points: 1) The habits/routine pathway shows the most significant impact and stands out as the main mechanism for almost all statistically significant IOS factors. 2) Diesel car ownership, a main IOS factor, also relates significantly to driving behavior. 3) The SOS factors mostly show the lowest direct effects, apart from commuting distance, suggesting that mediating factors are important and could change (even reverse) the expected effects. 4) The intention pathway is not of significant importance, as shown by the lack of a direct effect of intention on behavior.
Overall, our findings highlight the usefulness of applying SEM to understand complex phenomena and to draw out which pathways would be most suitable for interventions. Our findings also show the importance of interdisciplinary models to provide a solid structure to analyse the complexity of factors (here in shaping driving behavior) and to shed light on the explanatory strengths of the factors and their interplay.

Data Limitations
The results from our illustrative case indicate that the application of SEM may help understand complex phenomena and bridge the gap between predictive big data models and less flexible regression models. While consistent with its theoretical model counterpart, the proposed empirical model is reduced, hence more tractable in certain dimensions. This has proved inevitable for models applied to survey data, mainly because data availability does not always match the model's requirements. We concede that the gap between the theoretical model and its empirical counterpart expresses the tension between an ideal model with maximum explanatory power and its empirical applicability. Developing an ideal model, even if datasets can be expected to be suboptimal in most cases, is a relevant task, as it can also set data requirements for future research. In our context, the ideal dataset would include all factors on all levels (SOS, IOS, decision-making and routines/choices). Our current dataset misses some individual-level factors, such as attitudes and social context, which could be collected through HH surveys. Importantly, SOS data were also missing, such as higher-order data on institutional policies and norms, weather and geographical information, technology and economy. Collecting SOS data could be time-consuming, and the difficulty would lie in abstracting from the individual HH level. In order to merge individual factors (at the HH level) with social factors (available from various other sources), it is extremely important to collect information about the location of HHs and their workplaces. Finally, although theoretically possible, it might be practically impossible to gather an ideal dataset via surveys due to financial and time limitations, let alone participants' willingness to fill in very long and detailed surveys, which might also lead to an increase in errors and answer biases.
Thus, we suggest that a fruitful avenue of research is the exploration of strategies that rely on new communication technologies (e.g., apps and sensors) and can be linked to revealed preferences data.

CONCLUSION
Our empirical exercise highlights the usefulness of applying SEM to understand complex phenomena such as energy consumption behavior (or, more precisely, annual mileage by private vehicle in this case) and to identify suitable pathways to change behavior. It also further highlights the importance of conducting interdisciplinary research with models considering a broad range of potential predictors as opposed to models rooted in a single discipline. In our estimation, a number of otherwise significant factors have become non-significant, sometimes transgressing disciplinary context matters. The exercise of fitting data to such an interdisciplinary model focuses attention onto what the ideal data would look like, and on potential issues with collecting large sets of survey data. Nevertheless, despite data shortcomings and deviations from the ideal interdisciplinary model, our empirical model delivers relevant insights on determinants of annual mileage by Swiss HHs using private cars.
Our estimated model points to a number of mechanisms that can be targeted for reducing private car usage and increasing use of alternative modes of transport. Our policy relevant conclusions point to the importance of: 1. Promoting "habitual" alternative mode choice use for commuting and leisure. 2. Supporting suitable personal infrastructure changes such as public transport passes. 3. Discouraging the purchase of diesel cars.
Depending on the relative importance of each pathway, we can identify which mechanism should be prioritized for the greatest impact. This is an empirical question that can be addressed by holistic models such as this IHECF framework-informed SEM.
The illustration presented in the paper shows that SEM can be used to effectively assess the relative importance of different direct and indirect pathways.
Findings in this study illustrate how SEM studies can be brought into a sustainametrics conversation. While this conversation tends to focus on how increasingly available data can support transitions to sustainable societies, our study directs attention to limitations inherent to predictive models based on big data and their role in supporting sustainability transitions. Furthermore, our study illustrates how big data analysis can be complemented by SEM analysis on available data (not ideal data, but available data). In particular, we argue that SEM analyses can support fine-tuning of policy interventions informed by predictive analysis relying on big data. While predictive methods relying on big data can be used to estimate the impact of interventions and identify the most reactive segments of the population (the "low-hanging fruits"), SEM analysis can inform the design of intervention policies by focusing on specific and multiple mechanisms. For instance, certain machine learning frameworks can be used to predict sustainability-relevant individual behaviors based on readily available HH characteristics, thus identifying relevant target groups for policy interventions. However, they cannot help define a mechanism to prioritize various alternatives. A SEM analysis such as the one proposed in this paper can provide information about the relative importance of specific pathways based on multidisciplinary models. These pathways can be used to identify the mechanisms that should be targeted by policy interventions and to design targeted policy interventions on specific responses in relevant segments of the population. In a certain sense, SEMs stand between regression and predictive analyses and could hence be used to bridge the gap between the two types of analyses.

DATA AVAILABILITY STATEMENT
The SHEDS data is available to academic researchers one year after the launch of the respective wave. Researchers have to sign a confidentiality agreement and submit a short research proposal to indicate the intended use of the data. Data use for commercial purposes is not allowed. For more information on the SHEDS survey see https://www.sccer-crest.ch/research/swiss-household-energy-demand-survey-sheds/.

ETHICS STATEMENT
Ethical review and approval was not required for this type of study with human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
All authors were jointly responsible for the conceptualization and the writing of the paper, with IS taking the lead. IS and SW were jointly responsible for the empirical analysis, which they interpreted with the collaboration of ALMC and MF. PB, MF, SW, and IS are responsible for the implementation of the Swiss Household Energy Demand Survey (SHEDS), with SW taking the lead. All authors contributed to the article and approved the submitted version.

FUNDING
This research project was part of the Swiss Competence Center for Energy Research SCCER CREST which was financially supported by the Swiss Innovation Agency Innosuisse under Grant No. KTI. 1155000154.