Approaches to Cognitive Modeling in Dynamic Systems Control

Much of human decision making occurs in dynamic situations where decision makers have to control a number of interrelated elements (dynamic systems control). Although in recent years progress has been made toward assessing individual differences in control performance, the cognitive processes underlying exploration and control of dynamic systems are not yet well understood. In this perspectives article we examine the contribution of different approaches to modeling cognition in dynamic systems control, including instance-based learning, heuristic models, complex knowledge-based models and models of causal learning. We conclude that each approach has particular strengths in modeling certain aspects of cognition in dynamic systems control. In particular, Bayesian models of causal learning and hybrid models combining heuristic strategies with reinforcement learning appear to be promising avenues for further work in this field.


INTRODUCTION
Handling dynamic systems is a common requirement in everyday life, ranging from operating a novel technical device to managing a company or understanding the dynamics of social relationships. Considerable progress has been made toward assessing dynamic systems control (DSC) as a cognitive skill, particularly in educational contexts (OECD, 2014; Schoppek and Fischer, 2015). However, the development of cognitive theories which describe and explain the mental processes underlying this skill seems to lag behind. In this perspectives article we therefore examine what computational models of cognition can contribute toward an improved understanding of different cognitive processes involved in DSC. For this purpose, we briefly review several approaches to cognitive modeling in DSC, summarize their relative strengths and weaknesses, and conclude with what we perceive as promising routes for future research.
Dynamic systems control can be defined as a form of dynamic decision making that requires (1) a series of interrelated decisions (2) in interaction with a dynamic system inducing states of subjective uncertainty (3) with the aim of attaining (and maintaining) a goal state and/or exploring the system and possible courses of action (also see Edwards, 1962; Osman, 2010). Subjective uncertainty may be caused by random fluctuations of the system but also by limited knowledge of the system's structure and its dynamics (Osman, 2010). In cognitive research DSC is typically investigated using computer-simulated microworlds, which are the focus of the present paper (but see Klein et al., 1993, for a different approach). Microworlds emulate cognitively relevant features of DSC situations (e.g., limited information, delayed feedback, time pressure) framed in a semantically plausible setting such as managing a company or fighting a forest fire (Brehmer and Dörner, 1993; Gonzalez et al., 2005).
In this article, the terms cognitive model or computational model (of cognition) refer to any mechanistic account of cognitive processes that is sufficiently specified to allow a computer-based implementation yielding quantitative predictions of behavior, cognitive processing steps, or neural activity (Lewandowsky and Farrell, 2011). Computational models enforce conceptual completeness, as all functionally relevant properties of a theory have to be made explicit for its computational implementation. Furthermore, computational models generate precise predictions of data patterns that can be empirically tested, which most verbally expressed theories are unable to do with the same level of precision. Models implemented as computer programs can also be used for simulation-based exploration to investigate how varying parameter settings, assumptions about cognitive processes, or simulated task demands affect model behavior. For a comprehensive treatment of computational modeling in cognition, including its challenges and problems, please refer to Lewandowsky and Farrell (2011).
We will now consider the contribution of different approaches to modeling cognition in DSC with selected examples. Our brief review begins with comparatively simple and knowledge-lean modeling paradigms, moving toward approaches involving more complex strategies and knowledge structures (see Table 1 for an overview).

INSTANCE-BASED LEARNING
Instance-based learning (IBL) models have arguably been among the most successful approaches to cognitive modeling in DSC. They are based on a simple principle analogous to reinforcement learning: control actions that lead to a successful outcome become reinforced in memory and will therefore more likely be remembered and enacted when a similar situation is encountered in the future (Gonzalez et al., 2003). The application of this learning principle in DSC can be traced back to Berry and Broadbent's (1984) seminal work on knowledge acquisition in dynamic systems. In their deceptively simple task, the production of a simulated sugar factory had to be kept in a target range by adjusting the number of workers. Surprisingly, most participants were unable to verbally describe the system's structure despite being able to control it above chance level. As an explanation, Broadbent et al. (1986) suggested that people might store instances of situation-action combinations they have experienced in memory (e.g., hiring X workers when the current number of workers is Y and the current production is Z). Subsequent decision making in turn is based on retrieving instance memories similar to the situation encountered. Those instances repeatedly associated with successful outcomes (e.g., reaching the target production) become reinforced in memory and gradually start to dominate behavior, although no verbalizable representation of the system's structure has been formed.
A computational IBL model of the sugar factory task has been implemented by Taatgen and Wallach (2002; also see Dienes and Fahey, 1995) using the ACT-R cognitive architecture (Anderson et al., 2004). Each instance was modeled as a unit in declarative memory encoding current state, action and outcome. On encountering a given system state, the model searches for instance memories that are similar to the current state and have led to the target outcome in the past, taking into account how often the instance memory has been retrieved before. The model requires only two production rules and closely fits human behavior. A model of the sugar factory relying on a similar associative learning mechanism was implemented by Gibson et al. (1997) using an artificial neural network. This illustrates that the basic learning principle is independent of any specific modeling architecture. IBL has also been applied to modeling more complex tasks such as controlling an array of pumps in a simulated water purification plant where the system state changes in real time (Gonzalez et al., 2003). The model included a blending mechanism to interpolate information across related instances and relied on a simple decision heuristic as fallback when instance memories were insufficient. A generic implementation of the IBL framework has been made available by Dutt and Gonzalez (2012) to make IBL modeling accessible to non-expert modelers.
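The instance-based principle described above can be illustrated with a minimal sketch. It is not the ACT-R model of Taatgen and Wallach (2002): there is no activation, decay, or blending, and retrieval uses exact state matches instead of graded similarity. The task dynamics (production = 2 x workers - previous production, plus noise of -1, 0, or +1, clamped to 1..12) are an assumption based on how the sugar factory task is commonly described and are not stated in this article. The target value of 9, the tolerance of 1, and all function and class names are hypothetical.

```python
import random

def sugar_factory(workers, prev_production, rng):
    """Assumed task dynamics: P_t = 2*W_t - P_(t-1) + noise, clamped to 1..12."""
    noise = rng.choice([-1, 0, 1])
    return max(1, min(12, 2 * workers - prev_production + noise))

class InstanceMemory:
    """Stores (state, action, outcome) instances; no decay or similarity gradient."""

    def __init__(self):
        self.instances = []

    def store(self, state, action, outcome):
        self.instances.append((state, action, outcome))

    def retrieve_action(self, state, target, rng):
        # Instances from the same state whose outcome was within 1 of the target.
        hits = [a for (s, a, o) in self.instances
                if s == state and abs(o - target) <= 1]
        if hits:
            # Actions that succeeded more often appear more often in `hits`,
            # so they are proportionally more likely to be retrieved.
            return rng.choice(hits)
        return rng.randint(1, 12)  # no useful instance yet: act at random

def run(trials, seed=0, target=9):
    """Simulate one learner; returns a list of booleans (on target per trial)."""
    rng = random.Random(seed)
    memory = InstanceMemory()
    production = 6
    on_target = []
    for _ in range(trials):
        workers = memory.retrieve_action(production, target, rng)
        new_production = sugar_factory(workers, production, rng)
        memory.store(production, workers, new_production)
        on_target.append(abs(new_production - target) <= 1)
        production = new_production
    return on_target
```

Note that the sketch never represents the rule that generates production values; control improves only because successful state-action pairs accumulate in memory, which is the point of the instance-based account.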
An approach similar to IBL was used by Glass and Osman (2017; also Osman et al., 2015) to model learning in a simple dynamic system with continuous input and output variables. Instead of relying on a cognitive architecture, the authors adapted a general-purpose reinforcement learning algorithm to this task (Sutton and Barto, 1998). The model updated the reinforcement history of input variables after each trial depending on how much the last action reduced goal distance. This results in model behavior broadly similar to IBL. Glass and Osman (2017) specifically focused on modeling group differences between young and old adults in terms of exploration vs. exploitation behavior. They mapped this behavioral preference onto the noise parameter affecting the choice of input values. Reinforcement learning has also been used to model conflicts between short- and long-term goals and how unreliable information affects learning in dynamic control (Gureckis and Love, 2009).
In sum, IBL and reinforcement learning models have been successfully used to explain different aspects of exploration and control in DSC. The basic mechanism is simple, cognitively plausible and requires only a few task-specific assumptions. However, IBL models critically depend on the availability of immediate outcome feedback and the frequent repetition of similar decision situations to facilitate learning, which limits the type of task they can be applied to (see Table 1). Furthermore, IBL models cannot easily explain how people acquire explicit knowledge of the causal structure of a system, which is a central element of some DSC tasks (e.g., Kluge, 2008; Wüstenberg et al., 2012).
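How a single noise parameter can express a preference for exploration or exploitation may be easier to see in code. The sketch below uses a softmax choice rule with a temperature parameter and a simple incremental value update. Both are standard reinforcement learning components chosen here for illustration; they are assumptions and not the algorithm or parameterization used by Glass and Osman (2017).

```python
import math

def choice_probabilities(values, temperature):
    """Softmax over learned action values.

    A high temperature flattens the distribution (more exploration);
    a low temperature concentrates choice on the best-valued action.
    """
    best = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - best) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def update_value(value, reward, learning_rate=0.2):
    """Move the stored value a fraction of the way toward the obtained reward."""
    return value + learning_rate * (reward - value)
```

With identical learned values, two simulated groups that differ only in temperature would show different amounts of exploratory input variation, which is the kind of group difference the noise parameter is meant to capture.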

HEURISTIC MODELS
Heuristics-based approaches to DSC assume that people rely on simple rule-of-thumb-type decision strategies for controlling dynamic systems. These strategies do not guarantee an optimal result, but they achieve reasonable outcomes across a range of conditions with limited cognitive effort (Brehmer, 2005; Shah and Oppenheimer, 2008). Characteristically, heuristic strategies involve neither complex reasoning nor a detailed mental representation of the problem structure. Empirical research has shown that heuristics can explain adaptive behavior in many decision making situations as well as common errors and biases (Gilovich et al., 2002; Gigerenzer and Brighton, 2009). Furthermore, due to their simplicity heuristics are relatively easy to implement as cognitive models (e.g., Marewski and Mehlhorn, 2011).
The use of heuristics has also been proposed as an explanation for decision making behavior in DSC (e.g., Brehmer and Elg, 2005; Cronin et al., 2009). One of the best known examples of a computational heuristic model in DSC is Sterman's (1989) model of decision making in a supply chain management task. The model is based on the anchoring-and-adjustment heuristic (Tversky and Kahneman, 1974), which involves substituting an unknown quantity (supplies ordered) with a related known quantity (sales forecast), adjusting for further influences (current stock level). Data simulated using this heuristic closely match human behavior and reproduce the characteristic oscillation between over- and undersupply that arises from ignoring system delays (Sterman, 1989). The cold store temperature regulation task (Reichert and Dörner, 1988) poses a similar challenge to participants, as the system responds with delay to changed inputs. Reichert and Dörner's (1988) model captures how participants gradually learn to control the system by applying incremental changes to a proportional-control heuristic after unsuccessful control attempts. A conceptually related adaptive heuristic strategy is directional learning: if increasing an input improves the outcome then continue to increase it, otherwise decrease it. Computational models of directional learning have for example been used to model behavior in dynamic economic games, such as the multiple-round prisoner's dilemma or the ultimatum game (Selten and Stoecker, 1986; Grosskopf, 2003).
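The link between an anchoring-and-adjustment ordering rule and oscillating stock levels can be made concrete with a small simulation. The following is a sketch under stated assumptions, not Sterman's (1989) fitted model: the anchor is simply the last observed demand, demand steps up once, and all parameter values and names (alpha, beta, delay, target) are hypothetical. Setting beta to 0 makes the simulated decision maker ignore orders that are already in transit; setting it to 1 takes them fully into account.

```python
def simulate_stock(beta, periods=40, delay=3, alpha=0.5, target=20.0):
    """Stock management with delayed deliveries and an anchor-and-adjust order rule.

    beta = 0.0: orders already in the pipeline are ignored.
    beta = 1.0: orders already in the pipeline are fully taken into account.
    Returns the stock level after each period.
    """
    stock = target
    pipeline = [4.0] * delay          # orders in transit, oldest first
    stock_history = []
    for t in range(periods):
        demand = 4.0 if t < 5 else 8.0      # single step increase in demand
        stock += pipeline.pop(0) - demand   # oldest order arrives, demand is served
        anchor = demand                     # anchor: naive forecast of demand
        stock_gap = target - stock
        pipeline_gap = len(pipeline) * anchor - sum(pipeline)
        # Adjustment: correct a fraction alpha of the perceived shortfall.
        order = max(0.0, anchor + alpha * (stock_gap + beta * pipeline_gap))
        pipeline.append(order)
        stock_history.append(stock)
    return stock_history
```

Comparing `max(simulate_stock(0.0))` with `max(simulate_stock(1.0))` shows the effect of the delay: when the pipeline is ignored, the stock first drops and then overshoots the target, whereas accounting for the pipeline lets the stock return to the target without overshooting.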
Heuristic models can also be combined with reinforcement learning to simulate how people learn to choose among competing heuristic strategies. The probability of selecting a strategy depends on the outcomes that this strategy has produced in the past (Erev and Barron, 2005). For example, Gonzalez et al. (2009) used this approach to model response times in a dynamic radar detection and decision making task. The model fitted human data about as well as an alternative IBL model, although it transferred less well to changed task conditions. In contrast, Fum and Stocco (2003) found that a strategy-based learning model of the sugar factory task performed better under changed conditions than the corresponding IBL model of Taatgen and Wallach (2002). It appears that the transfer across situations depends on the details of the task, the type of training, and the strategies implemented. Strategy-based learning can also be applied to highly complex tasks such as fighting a simulated forest fire (De Obeso Orendain and Wood, 2012). In this model four high-level heuristic strategies competed (e.g., dropping water on the fire or creating a barrier to contain the fire), which were modeled in great detail. The model successfully reproduced how varying the conditions in the training phase affected preferences for particular strategies in later transfer.
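The idea that selection probabilities follow past outcomes can be sketched with a propensity-based rule. This is a generic illustration and should not be read as the mechanism of any of the cited models; the forgetting parameter, the strategy names, and the restriction to non-negative payoffs are assumptions made for the example.

```python
def selection_probabilities(propensities):
    """Each strategy is chosen in proportion to its accumulated propensity."""
    total = sum(propensities.values())
    return {name: p / total for name, p in propensities.items()}

def reinforce(propensities, strategy, payoff, forgetting=0.1):
    """Decay all propensities slightly, then credit the strategy that was used.

    Assumes payoff >= 0 so that propensities stay positive.
    """
    updated = {name: (1.0 - forgetting) * p for name, p in propensities.items()}
    updated[strategy] += payoff
    return updated
```

Starting from equal propensities for two hypothetical strategies, for instance `{"drop_water": 1.0, "build_barrier": 1.0}`, a single successful use of `"build_barrier"` with payoff 2.0 shifts the selection probabilities clearly toward that strategy, while the forgetting term keeps earlier experience from dominating indefinitely.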
In sum, there are several good examples of heuristics-based and hybrid models that address pertinent theoretical questions in DSC (e.g., handling delays, transfer of learning). Although incorporating more task-specific knowledge than pure learning models, models of this type typically still have a relatively simple basic structure. When heuristic models are extended with more complex strategies and abstract knowledge structures they gradually transform into complex knowledge-based models, which we will consider next.

TABLE 1 (only the entries recoverable from this text are shown).

Heuristic models
Examples: Reichert and Dörner, 1988; Sterman, 1989.

Complex knowledge-based models: Complex cognitive strategies and abstract mental representations.
Task requirements: Typically requires considerable task-specific prior knowledge.
Strengths: Modeling of complex knowledge structures and reasoning strategies.
Weaknesses: Can be very complex. Often highly task-specific, limited transfer.
Examples: Schunn and Anderson, 1998; Schoppek, 2002.

Causal learning: Bayesian induction of causal relations from observing system behavior.
Task requirements: Task sufficiently simple to allow causal attribution. Prior knowledge can be minimal.
Strengths: Comprehensive formalism for representing causal knowledge, uncertainty and knowledge updating.
Weaknesses: Have not yet been directly applied to system control tasks.
Examples: Steyvers et al., 2003; Meder et al., 2010.

COMPLEX KNOWLEDGE-BASED MODELS
Complex knowledge-based models involve the creation and transformation of abstract knowledge structures combined with complex cognitive strategies. This corresponds to the notion of mental models guiding reasoning and actions in DSC (e.g., Brehmer, 1992). Models of this type are often implemented as production systems, i.e., sets of if-then-rules that act on knowledge objects or initiate external behaviors (Newell and Simon, 1972). This flexible mechanism can express a wide range of strategies up to expert performance, but often at the expense of very complex and task-specific models. Anzai (1984) presented one of the first models of this type in the context of navigating a simulated large ship which responded with considerable delay. Based on the analysis of verbal protocols, Anzai (1984) designed a production system model that qualitatively captured the acquisition of control knowledge in novices and experts. Later studies showed how production system approaches can be extended to model performance even in complex real-time decision making tasks such as that of a radar operator or flying a commercial jet plane (Schoelles and Gray, 2000; Schoppek and Boehm-Davis, 2004). In modeling real-time control tasks the emphasis usually lies on modeling the time course of control performance and typical errors.
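For readers unfamiliar with production systems, the following sketch shows the basic recognize-act cycle: rules are condition-action pairs that inspect and modify a working memory, and on each cycle the first matching rule fires. The three steering rules are invented for illustration and are not taken from Anzai (1984); in particular, the "observe" rule is a stand-in for perceiving the delayed response of the ship and simply reduces the heading error.

```python
def run_production_system(rules, memory, max_cycles=20):
    """Fire the first matching rule on each cycle until no rule matches.

    rules: list of (name, condition, action); condition and action take `memory`.
    Returns the names of the fired rules in order.
    """
    trace = []
    for _ in range(max_cycles):
        for name, condition, action in rules:
            if condition(memory):
                action(memory)
                trace.append(name)
                break          # one rule per cycle
        else:
            break              # no rule matched: stop
    return trace

# Hypothetical rules for steering a slowly responding ship.
steering_rules = [
    ("turn-toward-target",
     lambda m: m["error"] > 1 and m["rudder"] == "centered",
     lambda m: m.update(rudder="turned")),
    ("counter-steer-early",   # center the rudder before the error reaches zero
     lambda m: m["rudder"] == "turned" and m["error"] <= 1,
     lambda m: m.update(rudder="centered")),
    ("observe",               # stand-in for perceiving the ship's response
     lambda m: m["rudder"] == "turned" and m["error"] > 1,
     lambda m: m.update(error=m["error"] - 1)),
]
```

Calling `run_production_system(steering_rules, {"error": 3, "rudder": "centered"})` fires the turn rule once, the observe rule twice, and then the early counter-steering rule. Differences between novices and experts could be expressed in such a system by adding, removing, or reordering rules, which also illustrates why models of this kind tend to become task-specific.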
A different class of DSC task that requires scientific reasoning to explore the causal structure of unknown systems has been particularly relevant in the recent educationally oriented wave of DSC research (Herde et al., 2016). Schoppek (2002) proposed a fine-grained cognitive model that was able to systematically explore and control a small dynamic system based on linear equations. The model incorporated an explicit mental representation of the system's structure and mental calculation steps to derive input values. This strategy is sufficiently general to control any simple dynamic system based on linear equations (Funke, 2001). The model was able to simulate the effects of different degrees of system knowledge and strategic sophistication and compared favorably to results from human data (Schoppek, 2002). Similarly, Schunn and Anderson (1998) applied a production system approach to model the task of designing scientific experiments (given a restricted set of design options) and drawing conclusions about causal relations from the simulated results. This model was able to successfully capture performance differences between experts and novices by modeling their respective domain knowledge and exploration strategies.
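The combination of an explicit structural representation with calculation steps for deriving inputs can be illustrated for a system with two inputs and two outputs. The sketch assumes a linear system of the form next state = A x state + B x inputs and that the learner already knows both coefficient matrices; it solves for the inputs that reach a goal state in one step. This is a deliberately idealized calculation and not a reimplementation of Schoppek's (2002) model, which simulates the individual mental steps and different degrees of knowledge.

```python
def next_state(state, inputs, A, B):
    """One step of the assumed linear system: y' = A*y + B*u."""
    n = len(state)
    return [
        sum(A[i][j] * state[j] for j in range(n))
        + sum(B[i][k] * inputs[k] for k in range(len(inputs)))
        for i in range(n)
    ]

def derive_inputs(state, goal, A, B):
    """Inputs that move a 2x2 system from `state` to `goal` in one step.

    First compute what the system does on its own (A*y), then solve
    B*u = goal - A*y for u using Cramer's rule.
    """
    residual = [goal[i] - sum(A[i][j] * state[j] for j in range(2))
                for i in range(2)]
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    if det == 0:
        raise ValueError("inputs cannot control both outputs independently")
    u0 = (residual[0] * B[1][1] - B[0][1] * residual[1]) / det
    u1 = (B[0][0] * residual[1] - residual[0] * B[1][0]) / det
    return [u0, u1]
```

As a usage example with hypothetical coefficients, take `A = [[1.0, 0.0], [0.0, 0.9]]` (the second output decays by itself) and `B = [[2.0, 0.0], [0.5, 1.0]]` (the first input also has a side effect on the second output). From state `[10.0, 20.0]`, the goal `[16.0, 25.0]` is reached with inputs of 3 and 5.5. Incomplete or incorrect entries in A or B would yield systematically wrong inputs, which is one way to simulate limited system knowledge.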
An apparent advantage of complex knowledge-based models is their ability to explain how causal system knowledge combined with reasoning strategies informs the actions that people take. It seems difficult to imagine how some forms of DSC could be explained without recourse to reasoning and abstract knowledge representation, for example, extrapolating system behavior in new situations or hypothesis testing and rule-deduction in discovery learning. However, models of this type are often neither simple nor elegant and require the inclusion of considerable task-specific knowledge (Taatgen and Anderson, 2010).

BAYESIAN CAUSAL LEARNING
Another modeling approach with a focus on structural knowledge is the use of Bayesian networks to model causal learning (Meder et al., 2010; Osman, 2017). Bayesian networks represent a formalism to express the strength of belief in causal hypotheses and provide a principled mechanism based on Bayesian inference for updating beliefs as new evidence becomes available (Holyoak and Cheng, 2011). For instance, Steyvers et al. (2003) used Bayesian networks to model human causal learning either by passive observation of a causal system or through direct interaction with it. This addresses a central aspect of DSC, system exploration and the formation of structural knowledge, although the approach has yet to be applied to DSC tasks requiring goal-directed control.
From a DSC perspective, the strength of Bayesian models of causal learning lies in the nuanced representation of causal structures and probabilistic dependencies combined with a mechanism for updating this knowledge from experience. This makes them a strong contender for explaining structural knowledge acquisition in DSC tasks with an exploration focus (e.g., Kluge, 2008; Wüstenberg et al., 2012). A Bayesian approach provides a formal account of the causal environment from which it is possible to deduce a suitable course of action, given the state of knowledge (including the level of uncertainty) a person has of the world at that time (Osman, 2010, 2017).
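The belief updating mechanism can be illustrated with the simplest possible case: two candidate causal structures for a pair of binary variables, compared on the basis of interventions on A and observations of B. This is far simpler than a full Bayesian network and is not the model of Steyvers et al. (2003); the hypotheses, the likelihood values (0.8, 0.2, 0.5) and the equal priors are assumptions chosen for the example.

```python
def likelihood_link(a, b):
    """Hypothesis 'A causes B': B is likely when A is set, unlikely otherwise."""
    p_b = 0.8 if a == 1 else 0.2
    return p_b if b == 1 else 1.0 - p_b

def likelihood_no_link(a, b):
    """Hypothesis 'no link': B occurs with probability 0.5 regardless of A."""
    return 0.5

def update_beliefs(prior, likelihoods, observation):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    a, b = observation
    unnormalized = {h: prior[h] * likelihoods[h](a, b) for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

def learn(observations):
    """Update equal prior beliefs over both hypotheses, one observation at a time."""
    beliefs = {"A causes B": 0.5, "no link": 0.5}
    likelihoods = {"A causes B": likelihood_link, "no link": likelihood_no_link}
    for observation in observations:
        beliefs = update_beliefs(beliefs, likelihoods, observation)
    return beliefs
```

Each observation is a pair (value A was set to, observed value of B). After three observations consistent with a causal link, `learn([(1, 1), (0, 0), (1, 1)])`, belief in "A causes B" rises to about 0.80, and it falls below the prior when B varies against A. The remaining uncertainty is represented explicitly, which is what distinguishes this approach from models that store only a single assumed structure.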

SYNTHESIS AND CONCLUDING REMARKS
To answer the question of what cognitive modeling can contribute to DSC research, we have considered several approaches (see Table 1). In terms of knowledge-lean modeling approaches, IBL strikes a good balance between simplicity, cognitive plausibility and explanatory power for a range of DSC tasks. On the downside, IBL has strict task requirements (e.g., availability of feedback, repeated decisions) and cannot easily explain the acquisition of causal knowledge or extrapolation to unfamiliar conditions. Heuristic models have no universal task requirements and can be combined with learning mechanisms to achieve adaptivity similar to that of IBL models. However, since effective heuristics rely on exploiting the structure of the environment, finding suitable candidate heuristics for a given task can be a considerable challenge and any specific heuristics-based model is only applicable to a particular niche (Marewski and Schooler, 2011).
Complex knowledge-based models are probably the most domain-specific type of model. They require strong assumptions about knowledge structures and cognitive procedures used by decision makers. If this information is available, it is possible to model skilled expert performance, elaborate reasoning strategies, and the acquisition of explicit structural knowledge, e.g., through active hypothesis testing. With respect to modeling causal knowledge, Bayesian models of causal learning provide an interesting alternative. They offer an integrated account for representing and updating causal knowledge in a coherent framework, including the representation of epistemic uncertainty. However, these models have so far not been directly applied to model control in microworld DSC tasks.
As our discussion shows, each modeling approach has its particular strengths and weaknesses, which render it suitable for particular modeling tasks. For researchers it therefore seems important to select a modeling approach that matches the research question and that suits the task to be modeled. For example, modeling the process of acquiring explicit causal knowledge in simple dynamic systems through hypothesis testing (e.g., Wüstenberg et al., 2012) naturally maps onto complex knowledge-based models or Bayesian models of causal learning but is likely to run into difficulties when approached with an IBL framework.
In general, we think that the rapidly advancing theories in causal learning and hybrid models combining heuristic strategies with reinforcement learning offer considerable untapped potential for cognitive modeling in DSC. Causal learning directly addresses the core issue of DSC tasks focusing on causal exploration (e.g., Steyvers et al., 2003). However, in order to simulate full task performance, models of this type would need to be extended by including interaction with the task. Hybrid models in turn may be most suitable to model behavior in complex decision making tasks (e.g., Danner et al., 2011), where (heuristically guided) information reduction and gradual strategy adaptation are central for task performance.
We furthermore propose that computational models based on cognitively plausible process assumptions (e.g., reinforcement learning, use of simple heuristics, Bayesian knowledge updating) could be used as a yardstick for evaluating human performance in DSC (see Brehmer, 2005). This stands in contrast to using mathematical optimization or optimal rational strategies as a benchmark for performance (e.g., Sager et al., 2011). Defining rationally optimal strategies in DSC does have its place, for example when designing decision support systems. However, from a behavioral perspective the question of "what is maximally possible" is often less relevant than "what is humanly possible," given the realities of incomplete information and limited cognitive capacity (Klein, 2002).
In conclusion, computational models of cognition appear to offer a promising path for advancing research and theory development in DSC. Computational approaches have successfully been used to model a range of cognitive phenomena in different domains of DSC. Promising starting points for further developments include, for example, recent advances in causal learning and hybrid models which combine simple heuristics with reinforcement learning mechanisms. Computational modeling of cognitive processes in DSC remains a constructive challenge that probes, and ideally enhances, our understanding of human behavior in complex dynamic environments.

AUTHOR CONTRIBUTIONS
Both authors contributed to writing the manuscript and approved it for publication.