Simulating CXCR5 Dynamics in Complex Tissue Microenvironments

To effectively navigate complex tissue microenvironments, immune cells sense molecular concentration gradients using G-protein coupled receptors. However, due to the complexity of receptor activity, and the multimodal nature of chemokine gradients in vivo, chemokine receptor activity in situ is poorly understood. To address this issue, we apply a modelling and simulation approach that permits analysis of the spatiotemporal dynamics of CXCR5 expression within an in silico B-follicle with single-cell resolution. Using this approach, we show that that in silico B-cell scanning is robust to changes in receptor numbers and changes in individual kinetic rates of receptor activity, but sensitive to global perturbations where multiple parameters are altered simultaneously. Through multi-objective optimization analysis we find that the rapid modulation of CXCR5 activity through receptor binding, desensitization and recycling is required for optimal antigen scanning rates. From these analyses we predict that chemokine receptor signaling dynamics regulate migration in complex tissue microenvironments to a greater extent than the total numbers of receptors on the cell surface.


Section 1. Overview of the Development Methodology
To ensure a principled and transparent design process we employ the CoSMoS (Complex System Modelling and Simulation) process, a framework to guide the modelling and analysis of complex systems (Figure 1) 1,2 . Initial stages of model design involve the development of a domain model, a non-executable, conceptual model focusing purely on current biological understanding, disregarding any consideration of how to implement and simulate the conceptual model. The domain model specifies the states, relationships and methods of interaction (the rule-set) for the biological entities being captured. The platform model is akin to a software specification, as is standard in software engineering, and details how the biological processes specified in the domain model are to be implemented. The simulation platform is an executable piece of software, which implements the underlying conceptual model. The results model provides a structure to interpret data obtained from the Simulation Platform. A specification is created that documents the output obtained from the simulation, what domain knowledge this is compared to, and the statistical methods used to assess this result. In this approach the biological system of interest is referred to as the domain. Initial design phases lead to the development of a domain model, a nonexecutable specification focusing on current understanding with respect to the research context. The platform model represents a software blueprint while the simulation platform is an executable piece of software that implements the conceptual model. The results model summarizes the understanding generated from experimentation conducted using the simulator 1

Section 2. Domain Model Development
In this section we detail the Domain Model. The research context of the simulator is summarised using an expected behaviours diagram. This diagram specifies the research question and the observed emergent properties of the system, as well as the biological entities and mechanisms hypothesised to give rise to these properties. This approach helps to scope the research context, highlighting key model entities and time points of interest.
Following this step low level behaviors are modelled using an adaptation of the Unified Modelling Language (UML) 3 . The Unified Modelling Language (UML) is a general-purpose visual notation used to model the design of a system and is commonly used in software engineering. Specifically, we use two key diagrams: (i) A State Machine Diagram where for each model entity, states (a set of attributes and behaviours associated with a model entity at a specific moment in time) that the entity can exist in and the interactions that much take place for the state to change are examined and documented (Figure 2). (ii) An Activity Diagram where we specify a sequence of activities associated with model entities. For each entity, it details the workflow from an initial state to a finish point, detailing the decision paths and interactions with other entities that occur (Figure 3).
To argue that the simulator fulfils appropriately addresses the research context, acceptance tests, key design decisions, and information used to inform the design, development and validation of the model and simulation are presented as arguments over evidence using a visual notation derived from goal structuring notation (GSN) and can be opened using the GSN visualisation tool ARTOO 4 . concurrency -seperate states that can be held simultaneously Figure 3. Key for Adapted Unified Modelling Language: Activity Diagram. This describes the syntax used to describe biological processes as activity diagrams in both the domain and platform models. Rectangles represent actions or activities; diamonds represent decisions; horizontal bars represent the start (split) or end (join) of concurrent activities; a black circle represents the start (initial state) of the workflow while an encircled black circle represents the end of the activity (final state). As B cells scan the follicle they respond to ligands for the receptors CXCR5 and EBI2; this promotes contact with FDCs and cells located around the follicle perimeter, including sinus associated macrophages, MRCs and DCs [10][11][12][13] . Noncognate interactions with antigen via complement receptors CR1 and CR2 facilitate the mass transport of opsonized antigens from the exposed follicle perimeter to the protected centre for long-term display on FDCs, where they are trapped in the form of iccosomes by complement receptors present on the dense network of FDC processes 8 .

C
If lymphocytes fail to recognize cognate antigen within a few hours to days, they return to the circulation in a sphingosine-1 phosphate receptor 1 (S1PR1) dependent manner through efferent lymph vessels and the thoracic duct 14,15 . Notably, naïve B cells are resident in the LN longer than either CD4 + or CD8 + T cells 16

Characterising Current Understanding Through Visual Notations
The aim of this Domain Model is to summarise current understanding of the pathway with respect to the following question: How does CXCL13 regulate the positioning of cognate B cells within primary lymph node follicles? We define the research context using an expected behaviours diagram (Figure 4), and detail the behaviours of model entities using state-machine and activity diagrams (Figures 5-9) 3 .

Figure 4. Expected Behaviours Diagram for CXCL13Sim.
In this diagram we present the research context within which the model is developed. This diagram specifies the key emergent properties of the system as well as the low level mechanisms reported to give rise to these properties. Lastly, it defines key entities in the model and the relationships between them.

Observable Phenomena
How does CXCL13 regulate the positioning of cognate B-cells within primary lymph node follicles? Research Context    These diagrams detail the activities performed by each model entity and the conditions required for a change of activity to occur. At the beginning of the activity FDCs exist in an immature state but mature if sufficient LTβR signal is received. CXCL13 is secreted at a rate dependent on maturation status. Antigen enters the system via the conduits or may be presented by SCS macrophages if greater than 70kDa. If antigen is part of an immune complex it may become FDC bound. RCs secrete CXCL13 at a fixed rate.

Section 3. Platform Model
The Domain Model detailed in Section 2 describes a set of biological processes occurring on molecular, cellular and tissue levels of organization. To provide a fit for purpose representation of the domain model while also promoting model parsimony and efficiency we hybridise different modelling techniques into a multiscale platform, with a modular architecture to facilitate model development and validation. The design of this system is provided in the following section.

Overview of the Platform Model
An overview of the scheme ( suggests that a 3D model is required to adequately capture the dynamics of T-cell output from the lymph node during infection. 2D models were shown to underestimate lymph node output because the distance between antigen-presenting cells is overestimated in 2D with respect to 3D 25 . As such 3D was deemed most appropriate to model the efficacy of B cell scanning.

Module 1: Stroma
In silico stromal networks are generated using an adaptation of the algorithm developed by Kislitsyn et al. (2015). The algorithm stochastically builds a network from an initial node, picking the nearest non-expanded node and generates a set of vectors that will eventually become edges to new nodes. Each vector has a direction and length randomly chosen, but conforming to values derived from experimental data, and the directions are chosen such that the sum of all the vectors in the set is approximately zero. A new node is then created at the end of each vector, and an edge connects them. This process is repeated to generate a graph.
However, this algorithm has some limitations in that it can only describe one stromal subset. To ensure an accurate reconstruction we consider the morphology of different stromal subsets, follicular dendritic cells and reticular cells. In addition, our datasets also show that a number of cell protrusions directly connect to one another, a property which can affect network topology. To account for this, branches between edges are added by creating a vector connecting the midpoints of each edge, subject to subtypespecific constraints on the maximum edge length and the local density such that the degree centrality matches our in vivo datasets. This approach is used to generate both RC and FDC networks within the follicle while RCs located just below the subcapsular sinus are stochastically seeded by random sampling of X and Y values subject to density constraints to ensure consistency between in vivo and in silico edge lengths and degree centralities. To represent the heterogeneity of network structures observed in vivo and to ensure no biases were introduced through a specific stromal architecture, the algorithm generates a unique network at the beginning of each simulation run. A suite of automated tests were developed to assess whether edge lengths and degree centralities are within expected bounds and that there are no overlapping nodes or edges.
Using this approach we generate a unique network at the start of each simulation run.
Running 250 simulations we find that this approach yields networks with median sigma and omega values of 12.00 and -0.097 respectively, confirming that the network has small world properties (Figure 11). The discrepancy between in vivo and in silico sigma values was expected as sigma scales with network size and the in silico follicle is approximately 4 times the volume of the tissue section used to perform the topological mapping. Comparison of the median values for both edge lengths and degree centralities for in silico and in vivo networks shows no statistically significant differences (Figure 12-13).

Module 2: Chemokine
Many different techniques exist to model molecules in silico, each with associated advantages and limitations. A common approach to model diffusion is through functions that relate molecular concentration to distance from a source 1 , or by PDEs to simulate the dynamics of chemokine field formation where molecules simultaneously undergo production, diffusion, decay, binding and scavenging, key mechanisms required to shape functional chemotactic gradients 28 . In cases where high model granularity is required at the molecular level, soluble factors can be represented as floating point values on discretised grids (Figure 14). The scheme we implement is a discretised form of the heat equation 22 . This mathematical construct is capable of isotropic diffusion 1 , can diffuse to an arbitrary number of neighbours and is applicable to linear, planar, spatial and n-dimensional implementations. These attributes make it well suited to studies of molecular components of the immune system ( Figure 14). In this scheme, chemokine molecules diffuses through a discrete 3D environment where the number of moles of chemokine molecules in each grid space (x,y,z) is denoted (x,y,z). The change in the spatial distribution of molecules is then subject to the following simultaneously occurring processes (i) production (ii) diffusion (iii) decay and (iv) consumption. As we are using agents to model individual cells, terms (i) and (iv) emerge from the simulation. Chemokine is secreted by each stromal cell at a fixed rate and is removed from the grid at a rate that is proportional to its current value . where chemokine in grid space x, ( ), is being diffused to grid space y, = > is the distance squared between x and y, D is the diffusion constant, t is the time step, and A is a normalizing constant that ensures the total amount being diffused is less than or equal to the amount that exists:

Lymphocyte Migration
Chemotaxis, and chemokine receptor internalisation and recycling are key mechanisms governing the fine-tuning of responses to chemokines in vivo 23,28,29 . In addition, the follicle is a highly dense structure and so it is important to account for interactions between cells. To model these phenomena in silico, we adapt the scheme developed by Lin et al 23,24 . In this scheme an agent samples local chemokine concentrations using 6 sampling pseudopodia (Figure 16) In the presence of chemotactic gradients actin flows polarize at the leading edge of the cell thus the relative weighting between HHHHH⃗ and M HHHHHHHH⃗ is scaled by a constant to represent the persistence of the cell; the value of is dependent on the chemotactic state of the agent. As a universal coupling exists between actin flows and cell speed 32 and relate the increase in velocity v* observed during chemokinesis to cell persistence using the following expression: * = ( ) The value of the scalar term was determined empirically, by fitting the model to WT migration data and verifying against CXCR5 -/migration patterns 9 . A number of automated tests were developed to ensure that total receptor values are conserved over time, and that agents move towards high concentrations of chemokines when expressing CXCR5.     31 . The influence of a parameter on model outputs is quantified using a partial rank correlation coefficient. PRCCs for Ordinary Differential Equations L K r K on K i K off K des

Lymphocyte Interactions
To account for dense lymph node environment lymphocyte migration must take into account interactions with other cell types. As time proceeds in fixed discrete intervals we treat both the movement vector of lymphocytes and the edge of the stromal cell as lines (Figure 19). To determine if the two agents are interacting we calculate whether the closest point between the two lines is less than the sum of their diameters. To determine the closest point, we define the lines L1 and L2 as follows:

v(s,t) = L 1 (s) -L 2 (t)
The closest point between the two lines is obtained when the vector is perpendicular to both lines (Figure 19) i.e. when the dot product of the two vectors is equal to zero or d1 * v(s, t) = 0 d2 * v(s, t) = 0 Using Cramer's rule we can then solve this system of equations to determine which values s and t where L1(s) and L2(t) correspond to the closest points on the lines.
Additionally, cells may interact with each other, however cell structure is dynamic and cells are observed to slide over one another. To account for this, once an agent has determined the direction in which to move, the probability that the cell can move towards the target location is determined as ‚B , where is the number of cells in the target location. Automated tests were developed to ensure that lymphocytes completely caged within a tight network of stromal cell protrusions cannot pass through due to interacting with the network, and that the number of agents per gridspace does not exceed a threshold value.

Integration of Model Subunits
System architecture is modelled using an adaptation of the UML as per the domain model. This specification defines how model subunits interface and details the flow of information through the system (Figure 20 -24). On the diagrams, the modules are specified using: M1, M2 and M3 to represent stroma, chemokine and lymphocytes respectively. Key decisions and abstractions made during the development process are presented as arguments over evidence 2 using an adaptation of goal structuring notation.  RCs are resident in the system at the start of the simulation and are generated using the algorithm described in M1. They secrete chemokine at a fixed rate and once secreted, chemokine diffuses as described in M2.

Model Outputs
Once a simulation run is complete, a summary .csv file is produced with the metrics detailed in Table 1 for each cognate B-cell agent. These metrics facilitate comparison with experimental measures of migration and are used to assess the influence of parameter perturbations on the emergent cellular behaviours.

Total Displacement
Record the steps taken by cells and calculate displacement over a fixed time period using vector addition.

Net Displacement
Euclidean distance between the first and last position of the cell

Section 4. Simulation Platform
The simulation platform was implemented using Java and the MASON ABM library version 19 33 in an iterative process of implementation, validation and refactoring using Acceptance Test-Driven Development (ATDD) 34 . Tests are continually assessed and refined as the project progresses and are incorporated into an automated regression framework using the java library JUnit (available from http://junit.org/junit4/) to ensure that new code does not disrupt existing functionality, expediting the development process. To quantify sources of uncertainty in the our simulator we used the R software package SPARTAN 35 . This package contains a suite of statistical techniques (described in more detail in the following sections) specifically designed to help understand the relationship between the simulator and the physical system it describes.

Model Calibration and Validation
Within the simulation platform each module was developed using Java and the multi agent simulation library MASON 33

Mitigation of Aleatory Uncertainty
CXCL13Sim is non-deterministic and therefore, repeat experiments using the same parameter set can lead to differing results (Figure 25). This variation is termed aleatory uncertainty and because of this effect multiple simulation executions must be performed to obtain a representative result. To determine how many runs are required to give a representative output for a given parameter set we perform an aleatory analysis (detailed further above) 35,46 (Figure 27).
In this approach, 20 distributions were generated and contrasted for each sample size.

Local Sensitivity Analysis
To quantify parameter uncertainty in CXCL13Sim we first perform a local parameter SA using the SPARTAN statistical package as follows: each parameter is adjusted within the ranges specified in Table 2, with all other parameters remaining at their calibrated value with 100 replicates used for each parameter set to mitigate aleatory uncertainty. The Vargha-Delaney A-Test described above is employed to determine if changing the parameter value has led to significant difference in comparison with baseline behaviours (Vargha and Delaney, 2000).

A−Test Scores when adjusting parameter Ka
Parameter Value

Global Sensitivity Analysis
To assess combinatorial effects of parameter perturbations we performed a global sensitivity analysis using latin hypercube sampling that partitions the distribution of each input parameter into intervals of equal probability, selecting one sample from each interval (Figure 31-32). LHC sampling generated 1000 parameter sets, 100 executions per parameter set were performed on a high-performance cluster and the influence of each parameter was quantified using a PRCC (detailed in Chapter 2.2.9).
The parameters polarity, travel distance, and total number of CXCR5 receptors were key determinants of cell migration and scanning rates (

Appendix 1. Arguing that the Simulation is a Fit for Purpose Representation of the Biological System
The design and implementation decisions made when constructing a simulator are influenced by the overarching scientific objectives of the work, with simulation results interpreted in this context 4,48 . To argue that the simulator fulfils its remit, acceptance tests, key design decisions, and information used to inform the design, development and validation of the model and simulation are presented as arguments over evidence using a visual notation derived from goal structuring notation ( Figure   33) and can be opened using the ARTOO tool 3 4 . This diagrammatic tool facilitates transparency of model design and analysis, capturing the reasoning behind the inclusion or exclusion of each biological feature and recording assumptions, as well as pointing to evidence supporting model-derived conclusions. Figure 33. Key for Argumentation Notation: (i) claim the purpose of the argument we are seeking to support; (ii) Evidence that supports the argument made in the attached claim; (iii) strategy the steps that will be taken to argue that the claim is supported; (iv) context defines the purpose of the argument and key terms; (v) justification provides a reasoning behind the selection of a strategy or claim and (vi) assumption provides an explicit statement of any assumptions made in place of biological understanding. Figure reproduced