Model predictive control of cancer cellular dynamics: a new strategy for therapy design

Recent advancements in cybergenetics have led to the development of new computational and experimental platforms that enable us to robustly steer cellular dynamics by applying external feedback control. Such technologies have never been applied to regulate intracellular dynamics of cancer cells. Here, we show in silico that adaptive model predictive control (MPC) can effectively be used to steer the simulated signalling dynamics of Non-Small Cell Lung Cancer (NSCLC) cells to resemble those of wild type cells. Our optimisation-based control algorithm enables tailoring the cost function to force the controller to alternate different drugs and/or reduce drug exposure, minimising both drug-induced toxicity and resistance to treatment. Our results pave the way for new cybergenetics experiments in cancer cells, and, longer term, can support the design of improved drug combination therapies in biomedical applications.


Introduction
Cybergenetics is a recent field of synthetic biology, which refers to the forward engineering of complex phenotypes in living cells applying principles and techniques from control engineering (Del Vecchio et al., 2016;Khammash et al., 2019).
Here, we propose to apply cybergenetics, in particular external feedback control, to predict combinations of drugs (i.e., control inputs) which can bring deregulated cellular variables (i.e., gene expression, control output of the system) within tightly controlled ranges in cancer cells. We take Non-Small Cell Lung Cancer (NSCLC) as an example; using a previously proposed differential equations mathematical model which describes the dynamics of the EGFR and IGF1R pathways, we show in silico that external feedback controllers can effectively steer intracellular gene expression dynamics in cancer cells to resemble those of wild type cells.
The use of feedback control is advantageous as it enables coping with changes in both steady-state levels and temporal dynamics of genes involved in deregulated signalling cascades. The control action is implemented by means of adaptive model predictive control (MPC), using either a full-order and physicsbased non-linear model (linearised at each time step to be, at best, locally accurate), or a simpler, data-based reduced order model. In the latter case, our controller does not require an exact model of the system; this is particularly advantageous in biological applications, where the derivation of detailed models can be time-consuming and troublesome (Marucci et al., 2011;Marucci, 2017;Browning et al., 2020). Experimentally, the controller would not have access to measurements of all the internal states of the model, as just a few variables would be measured (for example, by means of fluorescent proteins in time-lapses); therefore, we included a Kalman filter to estimate the internal states.
If the results proposed here were applied experimentally (e.g., to steer signalling and/or proliferation dynamics in living cells, or in patient-cell derived organoids), they could predict combination therapies which target different nodes in signalling cascades. In this regard, our optimisation-based control algorithm also enables tailoring the cost function to force the controller to alternate different drugs and/or reduce drug exposure. The controller should also be able to cope with the crosstalk of signalling pathways, which might be one of the mechanisms causing drug resistance (Vasan et al., 2019).
In what follows, we demonstrate via simulations that adaptive MPC can be used to effectively steer the concentration of proteins involved in cancer, whilst reducing the dose of each drug provided as an input. Our results pave the way for extending the scope of synthetic biology cybergenetic applications for the direct and automatic control of cancer cell intracellular dynamics.

Control scheme used in external feedback
We applied a feedback controller to regulate the concentrations of two downstream genes (ERK and Akt) of the mTOR and MAPK pathways, as modelled in (Bianconi et al., 2012), and as shown in Figure 1. Figure 2 shows the response of y1 (ERK) and y2 (Akt) in a wild type (−) and in a NSCLC (−) cell to a phosphorylation of EGFR and IGF1R as modelled by varying the system's initial conditions as in (Bianconi et al., 2012), referred to as an activation. A wild type cell's activation is modelled using an active concentration of 8,000 μM for EGFR and 800 μM for IGF1R, while an activation in NSCLC cells is triggered by an active concentration of 800,000 and 400,000 μM for EGFR and IGF1R, respectively. The term 'free' refers to an open loop response (i.e. if no feedback control is applied) in NSCLC cells. It can be seen that y1 (ERK) and y2 (Akt) activation dynamics are different in cancer vs. wild type cells in Figure 2; of note, the activation of y1 (ERK) occurs over a timescale of an order of magnitude faster than the activation of y2 (Akt). The notation (−) shows the colour of the plot in the related figure that is being discussed.
The two pathways are both kinase activated cascades, meaning that an activation at the receptors at the cell membrane causes a cascade of phosphorylation in downstream genes. Therefore it is difficult to robustly control the system as, once an error is measured in the outputs, it can be too late to have a significant effect by acting on the internal states higher up the cascade.
Embedding a model of the system in the controller helps predict the difference between a small change in the output due to an oncoming activation and a small change which is just due to a disturbance in the states, because the controller has knowledge of other states within the cascade. On the other hand, model-free feedback strategies with large gains can have undesirable consequences, such as poor dynamic performance. A Proportional controller can be tuned to decrease the control error, but gains have to be carefully chosen, and the user cannot impose desired constraints on the input (Supplementary Section S8).
Adaptive MPC is used as the model-based control scheme here ( Figure 1). The success of MPC relies on the quality of the model used to predict the future behaviour of the system, and on the cost function the controller uses to calculate the optimal Frontiers in Control Engineering frontiersin.org 02 inputs to be fed to the process. The novelty of the controller used here lies in the choice of adaptive model and in the cost function used. The MPC implementation is presented in Supplementary Section S4.

MPC's linearised model
The NSCLC model (Bianconi et al., 2012) contains significant non-linear terms and a large number of internal states. An adaptive MPC controller computes a linear approximation of the NSCLC model at each time step, predicts the future of the internal states and calculates the optimal input profile. The controller then applies the first input of its calculated optimal input profile to the actual system. At the next time step, the controller calculates a new linear model by linearising the NSCLC model. The use of a linear system results in a convex optimisation problem which can be solved quickly. Adaptive MPC is used in all simulations unless stated otherwise.
Alternatively, non-linear MPC could be used; however, it would introduce additional complexity in the search for the optimal input as local minima will likely be introduced in the cost function. Non-linear MPC is also computationally expensive, and the time needed to compute the next input FIGURE 1 A control scheme including three inputs (u(t) [I 1 , I 2 , I 3 ] T ) that interact with the mTOR and MAPK pathways. Two observable protein concentrations, ERK and Akt, are used as control outputs for the two pathways (y(t) = [ERK,Akt] T ). The regulator used throughout this project is an adaptive MPC program which attempts to steer the concentrations of the outputs to the transient response of the wild type cell, set as the control reference, as shown in Figure 2.

Frontiers in Control Engineering
frontiersin.org could be longer than the sampling/actuation temporal intervals (see Supplementary Section S7).

Improving the traditional MPC cost function
The cost function used by the adaptive MPC algorithm to find the optimal input depends on the current internal state error, e 0 , and on the inputs, u(t). The error, e(t), is the difference between the reference, r(t), and the internal states of the NSCLC system, x(t), as shown in Figure 1. The standard cost function used for linear MPC controllers (Rawlings et al., 2020) focuses on how readily the inputs, u(t), are used and on reducing the proportion error in the states, e(t).
In order to include both the magnitude and duration of the error, the integral of the state error, e(t) dt, is added to the standard cost function. It has been shown that it is beneficial to integrate the state error (Hornberg et al., 2005), meaning that the controller acts due to these longer, smaller errors in the outputs caused by states higher in the cascade. Moreover, to avoid rapid fluctuations in the control input, u(t), a differential cost of the inputs is also added to cost function. The derivation of the cost function and a description of the weights of its terms (α, β, γ, η, θ) can be found in Supplementary Section S4.

MPC simulation parameters
The MPC simulations are reproducible thanks to the deterministic nature of the model and controller, as long as the cost function coefficient weights and other MPC related parameters are kept constant. Table 1 gives a summary of these parameters. Several key parameters are added to the figures' captions.

Indexes used to quantify control performance
To assess quantitatively the performance of our controller, we define an Error Index, EI. It is the sum of the squared error between the output and the reference for the outputs, as used in (Fiore et al., 2015). (1) C is the output matrix of the linearised NSCLC model. A small EI indicates a good performance of the controller.
To quantify the controller effort needed to achieve a certain output, we assess the dose of input drug(s) using a Dose Index, DI i . It is the integral of the input signal, where u(t) = [I 1 (t), The inputs can never be negative as they are physical concentrations, therefore there is no need to square the input signal.

Reduced-order models
For a given control scheme (as shown in Figure 1) it is possible to measure various input(u)/output(y) data sets, which can then be used to identify a model of the system. We derived two simplified models (a three state linear and non-linear grey box model); this identification process is discussed in Supplementary Section S9 and will be referred to as the 'reduced order model', used in Section 3.5. A Kalman filter was used at each step of the controller to estimate the internal

Parameter
Description Value

Results
Non-Small Cell Lung Cancer (NSCLC) accounts for 80% of lung cancer cases and is characterised by various mutations which usually lead to an overexpression of the EGF and IGF1R receptors. These receptors trigger several cascades including the mTOR and MAPK pathways; their downstream genes FOXO1 and C-FOS regulate cell apoptosis and proliferation. The differential equations-based mathematical model for NSCLC signalling developed in (Bianconi et al., 2012) includes the mTOR and MAPK pathways along with some of the reactions between the two pathways, as shown in Figure 1; the model enables comparing gene expression dynamics in wild type vs. cancer cells. The model we used and its parameters are based on two previous mathematical models of these signalling pathways, which were fitted and validated on existing and new experimental data (Brown et al., 2004;Orton et al., 2009). We chose the downstream genes ERK and Akt (noted as y1 and y2, respectively) as the control outputs for the external feedback loop (Figure 1); if experiments were performed, it should be possible to measure the dynamics of those proteins using fluorescent reporters. The two outputs can be tuned by varying three inputs (I 1 , I 2 , I 3 ), which inhibit three specific proteins within the mTOR and MAPK pathways. The pathways can influence each others' reactions, creating internal feedback loops (crosstalk).
The code used to implement an adaptive MPC program on this NSCLC model used in these simulations is available on GitHub: https://github.com/Ben-Smart/Adaptive_MPC_on_ NSCLC.git.

Assessing the cost function
Firstly, Single-Input Single-Output (SISO) simulations were performed. The controller tries to steer the dynamics of either y1 (ERK) or y2 (Akt) by varying the concentrations of a drug that acts directly on one of the two signalling cascades (I 3 for y1 (ERK) and either I 1 or I 2 for y2 (Akt)). Figure 3 uses I 2 to regulate y2 (Akt), and shows the effect of different cost function terms on the performance of the controller. Figure 3A shows that using integral terms within the cost function reduces the error in y2 (Akt), as compared to the proportional terms (EI → 1.945 (−) < 10.923(−)). However, it can be seen in Figure 3C that the controller using integral terms (−) can cause fluctuations in the input. Such fluctuations are reduced when using a differential cost, which also has a lower Error Index, but higher Dose Index (EI → 1.690 (−) < 1.945(−) < 10.923(−), DI 2 → 603 (−) > 546(−) > 129(−)).
The cost functions used in the following simulations include β = 0 (i.e. no output error cost), θ = 0 (i.e. no differential input cost) and η = 1 (i.e. a non-zero integral output error cost), as in Frontiers in Control Engineering frontiersin.org 05 Figure 3C, apart from the weight associated with each input, γ, that is varied (as indicated in figures' captions). Figure 4 shows SISO adaptive MPC simulations for I 1 , I 2 and I 3 ; each drug is used to control the downstream molecule in the cascade it acts on. It can be seen that plots C) and D) of Figure 4 are identical to (−) in plot A) and plot C) of Figure 3 as these are both SISO responses of I 2 using the chosen cost function (costing the integral and input terms). The SISO controller moves the NSCLC response towards the wild type one (−) using a lower dose than just a step of each input at the maximum allowed dose (1 μM), thus decreasing the Dose Index. The step input response can be found in Supplementary Section S3. This demonstrates the benefits of using an external feedback loop compared to an open loop response with a static step input.

Multi-input multi-output control
Adaptive MPC can also be used to steer both outputs using all three inputs in Multi-Input Multi-Output (MIMO) simulations, as shown in Figure 5.
However, due to the fast dynamics of the MAPK pathway, the output y1 (ERK) fails to adequately follow the reference activation curve. Figure 6 shows that if the time step is adequately reduced (for instance, to T s = 0.02 min), the controller can handle the faster dynamics of the pathway and effectively control both outputs (EI → 0.001 < 0.791) whilst using a lower dosage of all the inputs (DI 1 → 545 < 570, DI 2 → 297 < 418, DI 3 → 4.35 < 6.87). Figure 6 shows that the controller can   Frontiers in Control Engineering frontiersin.org 07 overcome the effect of crosstalk, balancing the error in y1 (ERk) and the use of I 1 and I 2 .
To emulate capabilities of current microfluidics devices, the time step will be kept at T s = 1 min. Therefore, in what follows, we investigate only Multi-Input Single-Output (MISO) simulations where y2 (Akt) is controlled only with I 1 and I 2 . In this way, we remove the effect of the crosstalk (through the inherent negative feedback loops shown in Figure 1), which would restrict the MPC's use of I 1 and I 2 caused by the high error in y1 (ERK) due to the faster timescale of this output at the controller's current time step (T s = 1 min).

Combination therapies using I 1 and I 2
If cells are exposed to drugs for an extended period of time, side effects and resistance might become an issue (Salgia and Kulkarni, 2018). The controller could be used to find potential drug combinations that can achieve a low Error Index (EI) whilst reducing the dose of the inputs (DI i ). The weight associated with using each input, γ, within the cost function can be varied for this aim, as shown in Figure 7. The Bliss Independence (BI) formula (Demidenkoid and Miller, 2019;Vakil and Trappe, 2019) has been used here as a normalised Dose Index to summarise the combined effect of multiple drugs (see Supplementary Section S5). Figure 7 focuses on the control of y2 (Akt) using I 1 and I 2 as control inputs (MISO control). The weights in the cost function associated with each input can be varied as a ratio of R γ 2 γ 1 , ranging from low R values (where a high weight is associated with I 1 (γ 1 ), thus producing a SISO-like simulation only using I 2 ), all the way through to a high R value (where γ 2 is relatively large and the controller will only use I 1 ). Figure 7 compares the normalised Error Index, EI (−), and the Bliss Independence BI (−) to the weight ratio (R). It shows that there is a range of R which can significantly reduce both EI (−) and BI (−). Therefore, the control performance of the MISO controller is better than any SISO simulation while keeping drug concentrations low. For the purpose of designing combination therapies, here the optimal input is associated to the minimum value of the EI (−).

Drug holidays
If using an adaptive MPC, the user can set specific time intervals in which the controller does not give specific drugs (for example, to avoid toxicity induced by long exposure). These drug holidays can be achieved by the controller by varying the weights associated with each input, online, during a single simulation.
As an example, Supplementary Figure S3 shows that the controller can retain a low Error Index whilst swapping inputs after 600 min (EI → 1.989 ≈ 1.950 <2.75, the SISO EI of Figure 4). Therefore, a programmed change of cost function weights during the simulation can decide which input to stop using.

Frontiers in Control Engineering
frontiersin.org 08 inputs). The step size T s used makes a significant difference to the response. When T s = 1min the drugs can switch ON or OFF every minute, leading to rapidly fluctuating inputs (shown in Supplementary Figure S4). Supplementary Figure S5 shows the discrete simulation with a larger time step (T s = 30min). It can be seen that there is a better performance when compared to the SISO simulations ( Figure 4) as EI → 1.6878 < 1.95 < 2.75, whilst still using less of each input (DI 1 → 630 < 1068, DI 2 →  Figure 7 (B,C) Inputs I 1 and I 2 used in the two SISO simulations. (D,E) Inputs I 1 and I 2 , respectively, used in the MISO simulation. Parameters: T s = 1min, N = 10, α = 0, β = 0, θ = 0 and η = 1

FIGURE 9
Two MISO adaptive MPC simulation based of a three state model identified from input/output data including a Kalman filter to estimate the internal states from the measured output, comparing the adaptive and the fixed models. I 1 and I 2 are used to control y2 (Akt) (A) The response of Akt to the inputs in (B) and (C) for the fixed response and (D,E) for the adaptive response. Parameters: T s = 1min, N = 10, α = 0, β = 0, γ = [1, 1e5, − ], θ = 10 8 and η = 1.

MPC based on a reduced order model and an estimator
If adaptive MPC was to be used experimentally, then the controller would not have access to all of the internal states. We tested the effectiveness of two reduced order models (three states) within the MPC program, run alongside a Kalman filter to estimate the other two internal states from the measurements of Akt only.
The performance of a standard MPC using a linear reduced order model (−) is compared to an adaptive MPC based on a non-linear reduced order model (−), labelled 'Fixed' and 'Adaptive' respectively in Figure 9. The adaptive MPC simulation has a lower EI than the fixed MPC simulation (EI → 0.319 (−) < 0.477 (−)), showing that, at least with this identified models, updating the linear model at each iteration improves the performance of the controller. The reduced order model simulations with the same cost function used in Figure 8 shows rapid fluctuations of the inputs (Supplementary Figure S9), therefore we set the simulation in Figure 9 with the same cost function as in Figure 8 except for the increase of the differential cost term, θ. It can be seen that the MISO response of the reduced order model in Figure 9 can achieve a similar EI to that of the optimal full order model shown in Figure 8, whilst using a similar combined input dose. Therefore the reduced order model manages to achieve a similar performance to the full order model, whilst not representing all the dynamics of the full order NSCLC model and only using the measured data of one output.

Discussion
Computational approaches have been extensively proposed in the search for effective cancer treatments. Initially, mathematical models were used to investigate the pathways within the system, design the inputs themselves, and observe how the system reacts (Konstorum et al., 2017;Cova et al., 2019). Optimisation algorithms have supported the identification of open loop optimal input profiles for target phenotypes Kassara, 2006;Piccoli and Castiglione, 2006;Castiglione and Piccoli, 2007;Ledzewicz and Schattler, 2007;Itik et al., 2009;Ghaffari and Naserifar, 2010;Oke et al., 2018). Open loop schemes cannot compensate for model inaccuracies or adapt to observations. Therefore, negative feedback controllers were developed, for example to control cancer tissues with model free controllers (Chareyron and Alamir, 2009;Bratus et al., 2013;Goharrizi et al., 2013;Tang et al., 2016;Ledzewicz et al., 2019) and model based controllers (Sápi et al., 2012;Alamir, 2014;Teles and Lemos, 2019). This is, to the best of our knowledge, the first attempt of simulating feedback controllers to regulate intracellular dynamics of cancer cells by using an adaptive control scheme.
We showed that an adaptive MPC program can be used to inform treatments for NSCLC cells, steering the dynamics of several key signalling pathways, whilst offering a tunable cost function that allows the user to adjust the characteristics of an optimal input. Indeed, the controller can be tuned to choose different drug profiles that will achieve a similar control performance whilst reducing exposure to one or more drugs. We also mimicked future in vitro experiments through the use of a simple model (identified through input/output data) alongside a Kalman filter within the adaptive MPC program.
Other control strategies, like PID controllers, cannot be directly tuned to affect the desired input and only act on the observed output. The use of a linear model within the MPC makes the control algorithm running time short enough for it to be used, in the future, in external feedback control experiments. The implementation of those would require some practical aspects to be considered, which we did not account for. Firstly, there might be delays in cell responses to drugs/ actuation, which the model used by the controller should consider. Also, the sampling/actuation time might need to be fast enough, if aiming at controlling genes with fast dynamics. This issue might be overcome using experimental optogeneticsbased platforms instead of microfluidics-based ones, as they can reduce delays in the actuation.
We foresee a growing interest in applying cybergenetics approaches, and in particular feedback controllers, to steer mammalian cells dynamics. If we realise our ambition to implement the experiments proposed here on living cells and, longer term, on patient-derived organoids, feedback control might be a valuable tool for the design of personalised optimal treatments for a range of conditions.

Conclusion
It has been demonstrated, through simulations, that adaptive MPC can be used to inform drug combinations which can steer signalling dynamics in NSCLC cells, offering a tunable cost function to modify ad hoc both the rapidity of inputs' changes and the ways inputs are chosen. The weights can also be changed to choose different drug profiles that will achieve a similar control performance whilst giving the cell a break from individual drugs. In the future, we hope to test the controller in living cells using microfluidics/ microscopy platforms.

Frontiers in Control Engineering
frontiersin.org

Data availability statement
Publicly available datasets were analyzed in this study. This data can be found here: https://github.com/Ben-Smart/ Adaptive_MPC_on_NSCLC.

Author contributions
BS designed and carried out the simulations in Matlab2021b and wrote the manuscript with inputs from all the authors. IC supported the project. LR and LM conceived the project, contributed to manuscript writing and supervised the entire work.