A Primer on the oxDNA Model of DNA: When to Use it, How to Simulate it and How to Interpret the Results

Sengar, A.; Ouldridge, T. E.; Henrich, O.; Rovigatti, L.; Šulc, P.

doi:10.3389/fmolb.2021.693710

REVIEW article

Front. Mol. Biosci., 17 June 2021

Sec. Biological Modeling and Simulation

Volume 8 - 2021 | https://doi.org/10.3389/fmolb.2021.693710

This article is part of the Research Topic: Combining Simulations, Theory, and Experiments into Multiscale Models of Biological Events

A. Sengar¹

T. E. Ouldridge¹*

O. Henrich²

L. Rovigatti³,⁴

P. Šulc⁵,⁶

¹Centre for Synthetic Biology, Department of Bioengineering, Imperial College London, London, United Kingdom
²Department of Physics, SUPA, University of Strathclyde, Glasgow, United Kingdom
³Department of Physics, Sapienza University of Rome, Rome, Italy
⁴CNR Institute of Complex Systems, Sapienza University of Rome, Rome, Italy
⁵Center for Molecular Design and Biomimetics, The Biodesign Institute, Arizona State University, Tempe, AZ, United States
⁶School of Molecular Sciences, Arizona State University, Tempe, AZ, United States

The oxDNA model of deoxyribonucleic acid (DNA) has been applied widely to systems in biology, biophysics and nanotechnology. It is currently available via two independent open-source packages. Here we present a set of clearly documented exemplar simulations that simultaneously provide both an introduction to simulating the model and a review of the model’s fundamental properties. We outline how simulation results can be interpreted in terms of, and feed into our understanding of, less detailed models that operate at larger length scales, and provide guidance on whether simulating a system with oxDNA is worthwhile.

1 Introduction

Deoxyribonucleic acid (DNA) is a macromolecule that acts as a storage medium for genetic information for all living organisms (Alberts et al., 2002). In nature, the molecule is most often found as a double helix of two strands. Each strand comprises a backbone of covalently linked sugar and phosphate groups. Each sugar is further attached to a base moiety: adenine (A), guanine (G), cytosine (C) or thymine (T). Certain intra- and intermolecular interactions between these bases drive the formation of the aforementioned double helical structure.

Crucially, the base pairing that holds these duplexes together is highly specific; to a first approximation, A will only bind to T and C will only bind to G, and vice versa. Matching—or complementary—sequences therefore bind to each other much more strongly than to non-complementary sequences. The different base identities, along with the rules of complementarity, allow information to be encoded into the single strands and copied from generation to generation (Watson and Crick, 1953).

The DNA double helix has a diameter of about 2 nm, and a helical pitch of about 3.4–3.6 nm. Double strands are relatively stiff, with large bending disfavoured on length scales below around 40–50 nm (Seeman, 2003). By contrast, single strands are very flexible (Murphy et al., 2004; Chen et al., 2012), forming loops and kinks with only a handful of bases or fewer.

These thermodynamic, mechanical and structural properties influence DNA’s biological role, but also make it an ideal material for nanoscale engineering. The simplicity of interactions between strands, and the predictability of the structural and mechanical properties of the product, have enabled the rational design of a host of synthetic structures (Fu and Seeman, 1993; Goodman et al., 2005; Rothemund, 2006; Douglas et al., 2009; Ke et al., 2012; Zhang et al., 2015; Tikhomirov et al., 2017; Wagenbauer et al., 2017), computing architectures (Adleman, 1994; Rothemund et al., 2004; Qian et al., 2011; Cherry and Qian, 2018; Woods et al., 2019) and dynamic systems (Yurke et al., 2000; Shin and Pierce, 2004; Muscat et al., 2011; Zhang and Seelig, 2011; Wickham et al., 2012; Srinivas et al., 2017; Tomov et al., 2017).

DNA’s importance to biology, nanotechnology and simply as a canonical model biopolymer for biophysicists means that modelling its behaviour is a key challenge. Unsurprisingly, therefore, models spanning an enormous range of complexity have been proposed to analyse and rationalize the behaviour of DNA. In this pedagogical review, we will first discuss this range of models and their interplay, before focusing on a particular coarse-grained model, oxDNA.

The oxDNA model, first published in 2010 (Ouldridge et al., 2010a) (and with a slightly updated potential in 2011 (Ouldridge et al., 2011)), has now been extensively applied to problems in nanotechnology (Ouldridge et al., 2013a; Doye et al., 2013; Srinivas et al., 2013; Machinek et al., 2014; Snodin et al., 2016, 2019; Henning-Knechtel et al., 2017; Hong et al., 2018), soft matter (De Michele et al., 2012; Rovigatti et al., 2014; Procyk et al., 2020; Stoev et al., 2020), biophysics (Matek et al., 2012; Romano et al., 2013; Matek et al., 2015; Mosayebi et al., 2015; Harrison et al., 2019; Nomidis et al., 2019) and biology (Lee et al., 2015; Wang et al., 2015; Craggs et al., 2019). Numerous tools exist to generate and visualize systems with oxDNA (Henrich et al., 2018; Suma et al., 2019), alongside two independent, publicly-available code bases for actually running simulations with at least three qualitatively distinct algorithms for simulating the model (Ouldridge et al., 2011; Snodin et al., 2015). One of these code bases has recently been incorporated into a webserver (Poppleton et al., 2020).

Despite this uptake, however, there is insufficient clarity on how the basic properties of the oxDNA model make it well- or poorly-suited to studying certain systems. Moreover, many interesting phenomena require non-trivial simulation techniques if they are to be probed with oxDNA. Although those techniques have been widely applied, and software implementing them with oxDNA is available, documentation supporting their use is limited. Equally, there is very little help with the intuition required to use these techniques successfully. Finally, a major aspect to interpreting the results from oxDNA is rationalizing its predictions in terms of less detailed models. Unfortunately, however, there are many subtleties in doing so.

In this pedagogical review we implement a series of exemplar simulations that allow us to address these shortcomings. These simulations will establish a well-documented set of examples for a series of approaches that can be adapted by users, and this review will provide some of the intuition for how to use these approaches successfully. Simultaneously, we will use these examples to illustrate key aspects of the oxDNA model that determine its usefulness, and will explore how to interpret the results in terms of DNA models at different scales.

2 DNA Models Across Length Scales

At the smallest and most fundamental scale, quantum chemistry calculations can be used to estimate the nucleotide properties from first principles (Hobza and Šponer, 1999; Pérez et al., 2004; Šponer et al., 2004; Šponer et al., 2008). However, these calculations are computationally extremely expensive and are unable to capture the collective behaviour of whole strands in solution. Nonetheless, insight from this field has been incorporated into classical atomistic force fields such as AMBER (Cornell et al., 1996) and CHARMM (Brooks et al., 1983), which model interactions between atoms using empirical potentials. These force fields are iteratively parameterised using both comparison to experimental data and information from lower-level quantum mechanical descriptions. In recent years, advances in computational resources have allowed these models to simulate large systems, such as DNA origami, for long enough timescales to analyse their equilibrium properties. Given long simulations, these atomistic models are able to sample the conformation of large structures (Nguyen et al., 2014; Rocklin et al., 2017) and the breaking and formation of base pairs (Brown et al., 2015). However, at the time of writing, a systematic study of DNA duplex formation thermodynamics, as represented by atomistic models, has not been performed. As such, it is unknown how well these atomistic models represent DNA thermodynamics; historically, the force fields have required adjustment as new systems and longer time scales are studied (Pérez et al., 2007; Yoo and Aksimentiev, 2012). This fact, alongside the heavy computational load in simulating large systems or significant structural changes, means that atomistic approaches are currently limited to a fraction of the systems of interest in DNA-based biophysics, biology, soft matter and nanotechnology.

In an effort to access longer timescales, a number of “coarse-grained” or “mesoscale” models have been introduced (Savelyev and Papoian, 2009; Ouldridge et al., 2011; Hinckley et al., 2013; Korolev et al., 2014; Maciejczyk et al., 2014; Maffeo et al., 2014; Machado and Pantano, 2015; Uusitalo et al., 2015; Dans et al., 2016; Ivani et al., 2016; Chakraborty et al., 2018; Maffeo and Aksimentiev, 2020). These models represent DNA with a much-reduced set of degrees of freedom relative to atomistic approaches. In particular, solvent (and solvated ions) are usually treated implicitly, and groups of atoms in the DNA are replaced by a single site with effective interactions. As a result, these models can access longer length and time scales than atomistic descriptions.

The procedure for coarse-graining ranges from “bottom-up” approaches that seek to formally map the statistical behaviour of a more detailed model into a coarse-grained description (Savelyev and Papoian, 2009; Maciejczyk et al., 2014; Maffeo et al., 2014), to “top-down” approaches such as oxDNA that are more ad hoc, instead seeking to reproduce as many experimentally relevant properties as possible (Ouldridge et al., 2009; Hinckley et al., 2013; Machado and Pantano, 2015; Uusitalo et al., 2015; Chakraborty et al., 2018; Maffeo and Aksimentiev, 2020). Bottom-up approaches have been most successfully used to study fluctuations within the duplex state, where the atomistic models on which they are built are best parameterised. Top-down approaches, by contrast, have found their application in the analysis of processes that involve DNA outside of its canonical B-form, including duplex hybridization (Ouldridge et al., 2013b), strand displacement (Srinivas et al., 2013; Irmisch et al., 2020), stress-induced structural transitions (Romano et al., 2013; Wang and Pettitt, 2014; Sutthibutpong et al., 2016) and the properties of nanostructures with branched helices and single-stranded sections (Rovigatti et al., 2014; Engel et al., 2020).

Although highly-simplified, all of the coarse-grained models cited above attempt to represent the discrete, three-dimensional structure of DNA explicitly. An important role in our understanding of DNA is played by even simpler models. In thermodynamic terms, two classes of model have received particular attention. Firstly, the Peyrard-Bishop-Dauxois model and its variants have been used to probe the statistical properties of the duplex denaturation transition in the thermodynamic limit (Dauxois et al., 1993; Cocco and Monasson, 1999; Nisoli and Bishop, 2011). These models represent DNA through two or three continuous degrees of freedom per base pair.

A second approach dispenses with continuous degrees of freedom altogether, taking an Ising-like approach in which base pairs are either present or absent. Originally introduced by Poland and Scheraga to probe the duplex denaturation phase transition (Poland and Scheraga, 1966), the approach was adapted and carefully parameterized (SantaLucia, 1998; SantaLucia and Hicks, 2004; Huguet et al., 2010; Bae et al., 2020) as the nearest-neighbour model, which describes binding equilibria for strands of moderate length (oligonucleotides). It is difficult to overstate just how influential the nearest-neighbour model has been, particularly in the development of nucleic acid nanotechnology, as it allows rational design of an ensemble of strands to produce the desired thermodynamics. The NUPACK software suite automates this process of system analysis and thermodynamics-based design by implementing the nearest-neighbour model (SantaLucia and Hicks, 2004). A number of attempts have been made to augment this thermodynamic model with realistic kinetics (Flamm et al., 2000; Xayaphoummine et al., 2005; Srinivas et al., 2013; Schaeffer et al., 2015).

At its simplest, the nearest-neighbour model allows a two-state approximation to the binding of A and B, in which the strands are either fully bound or fully dissociated. In this limit, the concentration of the product $[A B]$ can be estimated using the equation

\frac{[A B]}{[A] [B]} = \exp \left( - \frac{Δ H_{A B} - T Δ S_{A B}}{k T} \right) . (1)

Here $Δ H_{A B}$ and $Δ S_{A B}$ are computed by summing contributions from each nearest-neighbour set of two base pairs, together with terms for helix initiation and various structural features, all of which are assumed to be temperature independent; $k$ is Boltzmann’s constant and $T$ is the absolute temperature.
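As a rough illustration of Eq. 1, the following Python sketch evaluates the two-state equilibrium constant and the resulting fraction of strands bound when A and B are present at equal total concentration. The values of ΔH and ΔS in the example are arbitrary placeholders rather than fitted nearest-neighbour parameters, the function names are hypothetical, and molar units are assumed (so $k$ is replaced by the gas constant $R$ and concentrations are measured relative to 1 M).

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K); molar analogue of k in Eq. 1


def two_state_K(dH, dS, T):
    """Equilibrium constant [AB]/([A][B]) in 1/M for dH (kcal/mol), dS (kcal/(mol K)), T (K)."""
    return math.exp(-(dH - T * dS) / (R_KCAL * T))


def fraction_bound(K, c_total):
    """Fraction of A bound to B when both strands have total concentration c_total (M).

    Solves K = x / (c_total - x)**2 for x = [AB], using the root with x <= c_total
    written in a form that avoids subtracting nearly equal numbers.
    """
    a = 2.0 * K * c_total
    return a / (a + 1.0 + math.sqrt(2.0 * a + 1.0))


if __name__ == "__main__":
    dH, dS = -60.0, -0.17  # placeholder values, NOT nearest-neighbour parameters
    for T in (300.0, 320.0, 340.0):
        K = two_state_K(dH, dS, T)
        print(T, K, fraction_bound(K, 1e-6))
```

In this two-state picture, half of the strands are bound when $K c = 2$, where $c$ is the total concentration of each strand; this condition is one common definition of the melting temperature.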

Another class of models ignores thermodynamics entirely, instead providing a continuum-level description of DNA mechanics. Most notably, DNA is frequently modelled as a semi-flexible polymer (or worm-like chain, WLC) characterised by a bending modulus (Kratky and Porod, 1949). This model can be augmented with an extensional modulus (Odijk, 1995) and a representation of twist with associated twist modulus (Yamakawa, 1977). It is also possible to consider coupling between the modes of deformation (Gore et al., 2006; Nomidis et al., 2019). As with the nearest-neighbour model of DNA, the influence of these approaches is enormous, particularly within the biophysics community. The elastic rod is the starting point for understanding the geometry of DNA, and the null model against which results are compared and interpreted.
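For orientation, two standard results for the worm-like chain can be evaluated directly: the exponential decay of tangent-tangent correlations along the contour, and the mean-square end-to-end distance of a chain of contour length $L$ and persistence length $l_p$. The sketch below is a minimal illustration; the function names are hypothetical, and the persistence length of 50 nm is an assumed value within the range quoted in the Introduction, not an oxDNA prediction.

```python
import math


def wlc_tangent_correlation(s, lp):
    """<t(0).t(s)> = exp(-s/lp) for a worm-like chain in three dimensions."""
    return math.exp(-s / lp)


def wlc_mean_square_end_to_end(L, lp):
    """<R^2> = 2 lp L [1 - (lp/L)(1 - exp(-L/lp))] for contour length L."""
    if L == 0.0:
        return 0.0
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is small
    return 2.0 * lp * L * (1.0 - (lp / L) * (-math.expm1(-L / lp)))


if __name__ == "__main__":
    lp = 50.0  # nm, assumed illustrative value
    for L in (5.0, 50.0, 500.0, 5000.0):
        r_rms = math.sqrt(wlc_mean_square_end_to_end(L, lp))
        print(L, r_rms, wlc_tangent_correlation(L, lp))
```

The two limits are a useful check: for $L \ll l_p$ the chain behaves as a rigid rod with $〈 R^{2} 〉 \approx L^{2}$, whereas for $L \gg l_p$ it behaves as a random coil with $〈 R^{2} 〉 \approx 2 l_p L$.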

There is a large gap in complexity between mesoscopic models such as oxDNA and the continuum WLC models or the nearest-neighbour model of thermodynamics. The time required to analyse the same system with these methods differs by many orders of magnitude. It is intriguing that, to our knowledge, there are few approaches that come close to bridging this gap. Fundamentally, it is not easy to combine the mechanics of semiflexible DNA as captured by the WLC, the geometry and topology of DNA structures, and the thermodynamics of DNA duplex formation as described in the nearest-neighbour model, in a representation that is simultaneously quantitatively useful and substantially simpler than the existing mesoscale models. Approaches such as Benham’s description of melting in circular, negatively supercoiled DNA (Fye and Benham, 1999) achieve this marriage in specific contexts. The variety of possible behaviour, however, and the sensitive interplay of topology, structure, mechanics and thermodynamics in many systems of interest, make the development of such models extremely hard and currently necessitate the application of coarse-grained models such as oxDNA.

In the rest of this pedagogical review, we first provide a high-level description of the basics of the oxDNA model and simulation techniques. We then present prototypical simulations to demonstrate key properties of oxDNA, and discuss how results from these simulations can be interpreted in terms of simpler DNA models at different length scales. While doing so, we discuss specific challenges in obtaining meaningful data from oxDNA simulations, and discuss where oxDNA provides added value. Initialisation files, processing scripts and supporting instructions are provided for all simulations presented here at (Sengar, 2021). This review should then serve as an introductory tutorial to applying oxDNA.

One drawback of this format is that the examples are presented as a fait accompli; just re-running the code will provide a limited experience of the real process of simulating oxDNA. We strongly encourage readers using this document as a tutorial to attempt to construct as much as possible of the simulations for themselves, and then to compare to the results obtained here. Alternatively, users may try to construct variants to simulate similar systems. Additional guidance on the nuts and bolts of running simulations can be found in the online documentation (oxDNA wiki, 2015; LAMMPS Documentation, 2021), where instructions on visualizing the output can also be found. In general, we have found that checking one or two snapshots of a simulation can avoid many wasted hours simulating and studying faulty systems.

3 The oxDNA Model

The oxDNA model was originally developed to study the self-assembly, structure and mechanical properties of DNA nanostructures, and the action of DNA nanodevices—although it has since been applied more broadly. To describe such systems, a model needs to capture the structural, mechanical and thermodynamic properties of single-stranded DNA, double-stranded DNA, and the transition between the two states. It must also be feasible to simulate large enough systems for long enough to sample the key phenomena. As discussed in Section 2, mesoscopic models in which multiple atoms are represented by a single interaction site are the appropriate resolution for these goals.

We will now outline the key features of the oxDNA model, the specific mesoscale model that is the focus of this review. While doing so, we note that there are effectively three versions of the oxDNA potential that are publicly available. The original model, oxDNA1.0 (Ouldridge et al., 2011), lacks sequence-specific interaction strengths, electrostatic effects and major/minor grooving. oxDNA1.5 adds sequence-dependent interaction strengths to oxDNA1.0 (Šulc et al., 2012), and oxDNA2.0 (Snodin et al., 2015) also includes a more accurate structural model, alongside an explicit term in the potential for screened electrostatic interactions between negatively charged sites on the nucleic acid backbone. In addition to these three versions of the DNA model, an RNA parameterisation “oxRNA” has also been introduced (Šulc et al., 2014).

It is worth noting that the three versions of oxDNA are very similar; most of the changes involve small adjustments of the geometry and strength of interactions. Structurally, the most significant change is the addition of a screened electrostatic interaction in oxDNA 2.0, which is typically small unless low salt concentrations are used. Moreover, subsequent versions of the model have been explicitly designed to preserve aspects of earlier versions that performed well. oxDNA 1.5 and oxDNA 1.0 are therefore very similar, except that oxDNA 1.5 predicts sequence-dependent thermodynamic effects that are absent in oxDNA 1.0. oxDNA 2.0 is designed to preserve the thermodynamic and mechanical properties of oxDNA 1.5 at high salt as far as possible, but improves the structural description of duplexes (and structures built from duplexes) and allows for accurate thermodynamics at lower salt concentrations.

As a result, the discussion provided here for simulation of one version of the model largely applies to all. Moreover, even given the improved accuracy of oxDNA 2.0 in certain contexts, simulations of earlier versions of the model are still potentially valuable. oxDNA 2.0 comes into its own when it is essential to incorporate longer-range electrostatics at low salt concentrations, or when the detailed geometry of the helices is particularly important. A good example would be when simulating densely packed helices connected by crossover junctions in DNA origami (see Section 8). In other contexts, the reduced complexity of oxDNA 1.5 and hence its improved computational efficiency (along with slightly greater focus on basic thermodynamics and mechanics) may be beneficial. Further, it is often helpful to also use oxDNA 1.0 in these contexts, as the comparison of versions 1.0 and 1.5 can help to distinguish sequence-dependent and generic effects.

In all three parameterisations, oxDNA represents each nucleotide as a rigid body with several interaction sites, namely the backbone, base repulsion, stacking and hydrogen-bonding sites, as shown in Figure 1. In oxDNA1.0 and oxDNA1.5, these sites are co-linear; the more realistic geometry of oxDNA2.0 offsets the backbone to allow for major and minor grooving.

FIGURE 1. Structure and interactions of the oxDNA model (adapted from Snodin et al., 2015; Doye et al., 2020). (A) Three strands forming a nicked duplex as represented by oxDNA2.0, with the central section of the complex illustrating key interactions from Eq. 2 highlighted. Individual nucleotides have an orientation described by a vector normal to the plane of the base (labelled n), and a vector indicating the direction of the hydrogen bonding interface (labelled b). (B) Comparison of structure in oxDNA1.0 and oxDNA1.5 vs oxDNA2.0. In the earlier versions of the model, all interaction sites are co-linear; in oxDNA2.0, offsetting the backbone site allows for major and minor grooving.

Interactions between nucleotides depend on the orientation of the nucleotides as a whole, rather than just the position of the interaction sites. In particular, there is a vector that is perpendicular to the notional plane of the base, and a vector that indicates the direction of the hydrogen bonding interface. These vectors are used to modulate the orientational dependence of the interactions, which allows the model to represent the coplanar base stacking, the linearity of hydrogen bonding and the edge-to-edge character of the Watson–Crick base pairing. Furthermore, this representation allows the encoding of more detailed structural features of DNA, for example, the right-handed character of the double helix and the anti-parallel nature of the strands in the helix.

The potential energy of the system is calculated as:

V_{0} = \sum_{\langle i j \rangle} \left( V_{b.b.} + V_{stack} + V_{exc}^{'} \right) + \sum_{i, j \notin \langle i j \rangle} \left( V_{HB} + V_{cr.st.} + V_{exc} + V_{coax} \right), (2)

with an additional screened electrostatic repulsion term for oxDNA 2.0. In Eq. 2, the first sum is taken over all pairs of nucleotides that are nearest neighbours on the same strand and the second sum comprises all remaining pairs. The terms represent backbone connectivity ($V_{b.b.}$), excluded volume ($V_{exc}$ and $V_{exc}^{'}$), hydrogen bonding between complementary bases ($V_{HB}$), stacking between adjacent bases on a strand ($V_{stack}$), cross-stacking ($V_{cr.st.}$) across the duplex axis and coaxial stacking ($V_{coax}$) across a nicked backbone. The excluded volume and backbone interactions are a function of the distance between repulsion sites. The backbone potential is a spring potential mimicking the covalent bonds along the strand. All other interactions depend on the relative orientations of the nucleotides and the distance between the hydrogen-bonding and stacking interaction sites.
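The structure of the two sums in Eq. 2 can be made concrete with a short sketch that partitions all nucleotide pairs into bonded pairs (nearest neighbours on the same strand) and all remaining pairs. This is only an illustration of the bookkeeping: it assumes linear (non-circular) strands, and `v_bonded` and `v_nonbonded` are hypothetical placeholder callables standing in for the two bracketed groups of terms, not the oxDNA functional forms.

```python
from itertools import combinations


def split_pairs(strand_lengths):
    """Return (bonded, nonbonded) lists of index pairs (i, j) with i < j.

    Nucleotides are numbered consecutively, strand by strand.
    """
    bonded = set()
    start = 0
    for n in strand_lengths:
        # consecutive nucleotides on the same strand share a backbone bond
        for i in range(start, start + n - 1):
            bonded.add((i, i + 1))
        start += n
    total = start
    nonbonded = [p for p in combinations(range(total), 2) if p not in bonded]
    return sorted(bonded), nonbonded


def total_energy(strand_lengths, v_bonded, v_nonbonded):
    """Sum placeholder pair energies with the same structure as Eq. 2."""
    bonded, nonbonded = split_pairs(strand_lengths)
    e_bonded = sum(v_bonded(i, j) for i, j in bonded)
    e_nonbonded = sum(v_nonbonded(i, j) for i, j in nonbonded)
    return e_bonded + e_nonbonded


if __name__ == "__main__":
    bonded, nonbonded = split_pairs([3, 2])  # one 3-mer and one 2-mer
    print(bonded)
    print(len(nonbonded))
```

Looping over every remaining pair scales quadratically with system size and is shown here only for clarity; since the interactions are short-ranged, a practical implementation would restrict the second sum to nearby nucleotides.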

A crucial feature of the oxDNA model is that the double helical structure is driven by the interplay between the hydrogen-bonding, stacking and backbone connectivity bonds. The stacking interaction tends to encourage the nucleotides to form co-planar stacks; the fact that this stacking distance is shorter than the backbone bond length results in a tendency to form helical stacked structures. In the single-stranded state, these stacks can easily break, allowing the single strands to be flexible. The geometry of base pairing with a complementary strand locks the nucleotides into a much more stable double helical structure.

The model was deliberately constructed with all interactions pairwise (i.e., only involving two nucleotides, which are taken as rigid bodies). This pairwise character allows us to make effective use of cluster-move Monte Carlo (MC) algorithms, which provide efficient equilibrium sampling (see Section 8).

It is convenient to use reduced units to describe lengths, energies and times in the system. A summary of the conversion of these “oxDNA units” to SI units is provided in Supplementary Appendix A.

4 Simulating the Model: Molecular Dynamics (MD) vs Virtual-Move Monte Carlo (VMMC)

The oxDNA model is far too complicated to approach analytically. Publicly released code to simulate oxDNA is available as a standalone package (oxDNA wiki, 2015), or as a module (LAMMPS Documentation, 2021) for the popular LAMMPS simulation software. We note that in the process of preparing this manuscript, a small error was identified in the potential as implemented in LAMMPS (see Supplementary Appendix B). This error is only really noticeable when simulating unpaired DNA bases. Nonetheless, we recommend that potential users of the LAMMPS implementation wait for the stable LAMMPS release in Summer 2021. The fix will be verified through an erratum attached to the original publication Henrich et al. (2018).

There are two broad types of simulation technique that can be applied to probe the model: molecular dynamics (MD) and Monte Carlo (MC). Molecular dynamics (Frenkel and Smit, 2002) algorithms evolve their constituent molecules according to Newton’s laws of motion, and so are a natural choice for simulating particle systems. For coarse-grained models such as oxDNA, in which the solvent is implicit, it is necessary to include a thermostat to both set the temperature and ensure diffusive rather than ballistic dynamics. The default MD algorithm for the standalone version of oxDNA is an Andersen-like algorithm (Russo et al., 2009), in which particle velocities and angular velocities are resampled from a Boltzmann distribution with a frequency that sets the effective diffusion coefficient. In the LAMMPS implementation, the model utilises a Langevin thermostat for rigid bodies (Davidchack et al., 2015), which applies small friction- and noise-based updates to the momentum and angular momentum at each step. The relative size of these contributions sets the temperature.
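The velocity-resampling idea behind an Andersen-like thermostat can be sketched in a few lines. The function below is a generic illustration rather than the algorithm implemented in the standalone oxDNA code: the name `andersen_refresh` and its parameters are hypothetical, angular velocities are omitted, and the mapping between the refresh probability and an effective diffusion coefficient is not shown.

```python
import math
import random


def andersen_refresh(velocities, kT, mass, p_refresh, rng):
    """Return new velocities after one thermostat application.

    Each particle's velocity vector is, with probability p_refresh, replaced by a
    fresh draw from the Maxwell-Boltzmann distribution (independent Gaussian
    components with variance kT/mass); otherwise it is left unchanged.
    """
    sigma = math.sqrt(kT / mass)
    new_velocities = []
    for v in velocities:
        if rng.random() < p_refresh:
            new_velocities.append([rng.gauss(0.0, sigma) for _ in v])
        else:
            new_velocities.append(list(v))
    return new_velocities


if __name__ == "__main__":
    rng = random.Random(0)
    vel = [[1.0, 0.0, 0.0] for _ in range(4)]
    print(andersen_refresh(vel, kT=0.1, mass=1.0, p_refresh=0.5, rng=rng))
```

In a full MD loop, a call of this kind would be interleaved with blocks of ordinary Newtonian integration steps, so that the resampling frequency controls how quickly particle motion decorrelates.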

A challenge of MD simulations is that when strong, short-ranged interactions are present, as in oxDNA, they place a limit on the maximum integration time step that can be used while preserving numerical stability. Interestingly, both the Andersen-like and Langevin thermostats act to stabilise the simulations, allowing larger time steps to be used than if the equations of motion were integrated without noise or drag to generate energy-conserving, ballistic motion.

Both MD algorithms generate dynamical trajectories that can be used to probe system kinetics (more on this in Section 7). However, it is also common to use MD to take equilibrium averages over the configurations of a particular system. In the limit of small time steps, both Andersen-like and Langevin algorithms will converge on a steady state in which they sample configurations $x$ from the Boltzmann distribution $p_{eq} (x) \propto \exp (- V_{0} (x) / k_{B} T)$, where $V_{0} (x)$ is the potential energy of the model. How small the step size needs to be depends on a number of details, such as the strength of coupling to the thermostat. For parameters that have become an unofficial default for oxDNA, we illustrate the accuracy of the algorithm as a function of step size in Supplementary Appendix B.

Monte Carlo (MC) (Metropolis and Ulam, 1949) simulations are an alternative approach for sampling from the same Boltzmann distribution that avoids the drawbacks of a finite time step altogether. In the standard MC approach (Frenkel and Smit, 2002), configurational moves $x \to y$ are proposed randomly, with a symmetric probability distribution that satisfies $p_{gen} (x \to y | x) = p_{gen} (y \to x | y)$. If these proposed moves are accepted with a probability $p_{acc} (x \to y) = \min (1, \exp (- (V_{0} (y) - V_{0} (x)) / k_{B} T))$, then the Boltzmann distribution $p_{eq} (x) \propto \exp (- V_{0} (x) / k_{B} T)$ is the stationary distribution of the simulation and a long simulation will sample from that distribution, assuming ergodicity.
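A minimal sketch of the standard Metropolis scheme may help to fix ideas. It samples a single coordinate in a harmonic potential, which is a toy stand-in for $V_{0}$ chosen because the exact answer is known; the function names, the spring constant and the maximum step size are all assumptions made for illustration.

```python
import math
import random


def metropolis_accept_prob(dV, kT):
    """Acceptance probability min(1, exp(-dV/kT)) for an energy change dV."""
    return 1.0 if dV <= 0.0 else math.exp(-dV / kT)


def sample_harmonic(n_steps, kT=1.0, k_spring=1.0, max_step=2.0, seed=0):
    """Metropolis sampling of x in the toy potential V(x) = k_spring * x**2 / 2."""
    rng = random.Random(seed)

    def potential(x):
        return 0.5 * k_spring * x * x

    x = 0.0
    samples = []
    for _ in range(n_steps):
        # symmetric proposal: a displacement of size d is as likely as one of -d
        y = x + rng.uniform(-max_step, max_step)
        if rng.random() < metropolis_accept_prob(potential(y) - potential(x), kT):
            x = y
        # the current state is recorded whether or not the move was accepted
        samples.append(x)
    return samples


if __name__ == "__main__":
    xs = sample_harmonic(100_000)
    print(sum(x * x for x in xs) / len(xs))  # should approach kT / k_spring
```

For this potential the Boltzmann distribution is a Gaussian with variance $k_{B} T / k_{spring}$, so comparing the sampled variance with that value gives a simple check that the acceptance rule has been implemented correctly.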

In principle, the moves $x \to y$ can be arbitrarily large without leading to errors, since it is not necessary to integrate the derivative of the potential, only to calculate its values at the endpoints. However, standard MC techniques use sequential updates of individual particles as the moves $x \to y$. For a model of a strongly-attractive system such as oxDNA, these moves must be extremely small or the acceptance probability will always be small. The result is painfully slow equilibration, particularly if large-scale movements of strands are required.

Virtual-move Monte Carlo (VMMC) (Whitelam and Geissler, 2007; Whitelam et al., 2009) is an alternative that circumvents these drawbacks of single-particle MC algorithms. VMMC first proposes a single particle move, then generates a co-moving cluster of particles based on which interactions are best preserved by moving the particles in unison while ensuring that the correct, detailed-balanced stationary distribution is retained. The cluster building process is based on assessing the change in pairwise interactions, and so VMMC is especially suited to oxDNA, which has exclusively pairwise interactions. We have implemented the variant from the appendix of Ref. (Whitelam et al., 2009) in the standalone code.

For those with limited experience of simulating oxDNA, it is not obvious whether VMMC or MD is the optimal approach to sampling a given system. We illustrate the relative efficiencies of the two algorithms when simulating ssDNA of length 20, 100 and 1,000 bases, in terms of the computational time required to reach states representative of equilibrium from the same unrepresentative starting condition.

We simulate poly (dT) molecules with (a) 20 bases, (b) 100 bases and (c) 1,000 bases, using oxDNA1.0 (Ouldridge et al., 2011). Simulations are performed at $T = 27^{\circ}$C, in a periodic box of 20, 100 and 1,000 simulation units for 20, 100 and 1,000 bases, respectively. For the MD simulations, we simulate for 60,000 simulation units of time (a nominal 182 ns), with a time step of $d t = 0.003$ (see Supplementary Appendix B); each simulation therefore has $2 \times 10^{7}$ steps in total. For VMMC, we attempted 60,000 VMMC steps per particle. The proposed moves are: rotation about a random axis, through an angle up to 0.22 radians; and translation through a distance of up to 0.22 units. These choices produce a good balance of cluster sizes, ranging from individual nucleotides to entire strands.
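The step count and nominal duration quoted for the MD runs follow from simple arithmetic, reproduced here as a check. The conversion factor of 3.03 ps per reduced time unit is the value given in the caption of Figure 3; the function name is hypothetical.

```python
def md_run_summary(total_time, dt, time_unit_ps):
    """Return (number of MD steps, nominal duration in ns) for a run.

    total_time and dt are in reduced simulation units; time_unit_ps is the
    nominal length of one reduced time unit in picoseconds.
    """
    n_steps = round(total_time / dt)
    nominal_ns = total_time * time_unit_ps / 1000.0
    return n_steps, nominal_ns


if __name__ == "__main__":
    n_steps, nominal_ns = md_run_summary(total_time=60_000, dt=0.003, time_unit_ps=3.03)
    print(n_steps, nominal_ns)
```

The duration is described as nominal because the mapping between time in a coarse-grained model and physical time is not straightforward.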

Strands are initialized in a fully stacked, helical conformation as illustrated in Figures 2A,C, and relax to more-representative, partially-stacked conformations (Figures 2B,D) as the simulation is run. The relaxation of the strands is associated with an increase in the potential $V_{0}$ , and so we illustrate equilibration by plotting that potential averaged over 20 independent simulations as a function of simulation progress in Figure 3. The average value of the potential in equilibrium, ${〈 V_{0} 〉}_{eq}$ , can be approximated by the average over the data collected in the second half of the simulations. We then estimate the equilibration time scale as the time required for $〈 V_{0} (t) 〉 - {〈 V_{0} 〉}_{eq}$ to reach $1 / e$ of its initial value for the first time.
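The equilibration-time estimate described above can be sketched as follows. The function takes a time series of the potential that has already been averaged over independent runs; its name and interface are hypothetical, and the synthetic exponential relaxation in the usage example is made-up data used only to exercise the code.

```python
import math


def equilibration_time(times, v_mean):
    """First time at which <V0(t)> - <V0>_eq has fallen to 1/e of its initial value.

    <V0>_eq is approximated by the average over the second half of the series.
    Returns None if the threshold is never reached.
    """
    half = len(v_mean) // 2
    v_eq = sum(v_mean[half:]) / (len(v_mean) - half)
    d0 = v_mean[0] - v_eq
    if d0 == 0.0:
        raise ValueError("series starts at its equilibrium average")
    for t, v in zip(times, v_mean):
        # dividing by d0 makes the test independent of the sign of the relaxation
        if (v - v_eq) / d0 <= 1.0 / math.e:
            return t
    return None


if __name__ == "__main__":
    times = list(range(200))
    synthetic = [-math.exp(-t / 4.9) for t in times]  # made-up relaxation curve
    print(equilibration_time(times, synthetic))
```

Because the estimate relies on a single threshold crossing, it is sensitive to noise in the averaged curve, which is one reason to average over many independent runs before applying it.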

FIGURE 2. Snapshots of poly (dT) molecules used in the equilibration time tests. Non-representative initial states of poly (dT) molecules (left), and representative configurations obtained post-equilibration (right). (A) and (B): poly (dT) with 20 nucleotides; (C) and (D) poly (dT) with 100 nucleotides.

FIGURE 3. Equilibration plots for the poly (dT) molecules with (A) 20 bases, (B) 100 bases, (C) 1,000 bases, obtained as averages over 20 independent simulations. For both MD and VMMC, the potential $V_{0}$ is plotted as a function of simulation progress, in units of reduced time (3.03 ps) for MD and attempted steps per particle for VMMC.

The “simulation progress” axes in Figure 3 are not directly comparable for MD and VMMC; one measures simulation time, the other attempted VMMC steps per particle. The most relevant quantity is the actual computational time required to equilibrate the system on a given architecture; in simple contexts, this time is also indicative of the speed with which the algorithm samples the equilibrium ensemble. Table 1 shows the total runtime of the simulations and the equilibration time as a fraction of that runtime. In computational time, the VMMC algorithm equilibrates the poly (dT) molecules more quickly than the MD algorithm in all three cases. VMMC is around 15 times as fast for the 20-nucleotide strand, dropping to around 4 times as fast for the 1,000-nucleotide strand. The large moves available to VMMC, and the lack of a requirement to differentiate potentials, provide this benefit. Note, however, that the ratio of the equilibration times for the MD and VMMC algorithms decreases as the system size increases; as can be seen in Table 1, the equilibration time for VMMC simulations increases super-linearly with system size. This super-linear increase arises because more steps must be taken to equilibrate larger systems and because larger clusters tend to be built for larger systems, resulting in each step taking longer on average.

TABLE 1. Computational time for equilibration of poly (dT) molecules of various lengths. Simulations were performed using a single core Intel(R) Core(TM) i5-4300U CPU @ 1.90GHz.

The relative efficiency of VMMC and MD approaches will depend to some degree on the choice of damping parameters and seed moves; we have not carefully optimised our choices for either technique, but have used values that generally work well. The relative efficiency will also depend on the particular system: VMMC lends itself to systems in which large movements are important. Our metric for efficiency (the speed with which systems reach a state that is representative of equilibrium) is also fairly crude. Nonetheless, the general rule of thumb that VMMC is more efficient for smaller systems (particularly those with significantly fewer than 1,000 nucleotides, such as the sampling of duplex formation in oligonucleotides) is a helpful one. It is also particularly easy to enhance VMMC using umbrella sampling, as explained in Section 6.1.1.

For sufficiently large systems, such as DNA origami, MD should equilibrate faster, and therefore provide improved sampling. Another major advantage of the MD approach is that it is much easier to parallelise when simulating large systems. The standalone code allows for parallel simulation on GPUs (graphics processing units), and the LAMMPS module allows for parallel simulations across multiple CPUs (central processing units) using MPI. These approaches are demonstrated in Section 8.

5 Mechanical Properties of DNA

The mechanical properties of DNA are central to its role across nanotechnological, biophysical and biological contexts. DNA’s flexibility and response to applied stress determine the conformation and accessibility of the genome inside cells (Lewis et al., 1996; Nikolov et al., 1996; Widom, 2001; Richmond and Davey, 2003). Moreover, not only is the stiffness of dsDNA important in maintaining the conformation of DNA nanostructures, but the relative flexibility of ssDNA crucially allows for joints and flexible hinges. These properties are widely-studied in bulk and single-molecule experiments in vitro (Crothers et al., 1992; Smith et al., 1996; Strick et al., 1996; Wang et al., 1997; Rivetti et al., 1998; Mills et al., 1999; Podtelezhnikov et al., 2000; Dessinges et al., 2002; Bryant et al., 2003; Seol et al., 2004; Fujimoto et al., 2006; Gore et al., 2006; Lionnet et al., 2006; Seol et al., 2007; Du et al., 2008; Forth et al., 2008; Mosconi et al., 2009; Demurtas et al., 2009; Brutzer et al., 2010; Gross et al., 2011; Salerno et al., 2012; Tempestini et al., 2013; Fields et al., 2013; Le and Kim, 2014; Kim et al., 2015).

It is therefore essential that a coarse-grained model provides a reasonable representation of these properties. In this section, we both discuss the mechanical properties of oxDNA, and show how to construct simulations that can probe these properties.

5.1 Stiffness of Duplex and Single-Stranded DNA

The most common metric used to quantify the stiffness of DNA is the persistence length, defined in the textbook of Cantor and Schimmel as (Cantor and Schimmel, 1980)

$$L_{ps} = \frac{\langle L \cdot l_{0} \rangle}{\langle l_{0} \rangle}. \tag{3}$$

Here, $L$ is the end-to-end vector of the polymer and $l_{0}$ represents the vector between the first two monomer units. dsDNA is most commonly thought of as a semi-flexible polymer or wormlike chain (Kratky and Porod, 1949). In this picture, the discrete series of inter-base-pair vectors is approximated as a continuous, differentiable polymer axis with a quadratic free energy of curvature. For an infinitely long, semi-flexible polymer, correlations in the alignment of the polymer axis decay exponentially with separation, with a decay rate determined by $L_{ps}$ . When translated back to the language of inter-base-pair vectors, we obtain

$$\frac{\langle l_{n} \cdot l_{0} \rangle}{\langle l_{0} \rangle^{2}} = \exp \left( - n \langle l_{0} \rangle / L_{ps} \right), \tag{4}$$

where $l_{n}$ is the vector between base pair $n - 1$ and base pair $n$.

It is relatively straightforward to both assess whether the wormlike chain model is a good model for oxDNA, and to extract $L_{ps}$ . We simply simulate a duplex system for long enough to sample a representative set of configurations, calculate the correlation between inter-base-pair vectors as a function of separation, and fit the results to the exponential decay of Eq. 4.

In Figure 4, we plot the results of such a procedure. To obtain these data, we simulate a DNA duplex of length 500 base pairs at 27°C using oxDNA1.5. We perform 20 VMMC simulations with $2 \times 10^{7}$ attempted steps per particle. The first $10^{5}$ moves are treated as an initialization period during which no data are collected. Additionally, the base pairs at the ends of the duplex are more flexible than those well within the bulk; to obtain properties representative of bulk DNA, we therefore do not include the five base pairs at either end in our analysis. Correlations are calculated from the default configurational outputs of the model, using the code provided in (Sengar, 2021). In order to obtain a good sample, it is helpful to output these configurations with a high frequency; we set the output interval so that energies and configurations are written after every VMMC move per particle.
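As an illustration of the fitting procedure, the sketch below computes the normalised correlation of Eq. 4 and extracts $L_{ps}$ from a linear fit to its logarithm. It is not the code of (Sengar, 2021). The function name `persistence_length` and the input layout (an array of inter-base-pair vectors already extracted from configuration files) are assumptions, as is the choice to average over all pairs of segments with the same separation rather than only those starting from the first retained segment.

```python
import numpy as np

def persistence_length(axis_vectors, trim=5):
    """Fit Eq. 4 to inter-base-pair vector correlations.

    axis_vectors : array, shape (n_configs, n_segments, 3); entry [c, i] is the
                   vector between consecutive base pairs i and i + 1 in
                   configuration c (hypothetical, pre-extracted input)
    trim         : number of segments discarded at each end of the duplex
    Returns (L_ps, mean_l0) in the length units of the input.
    """
    v = np.asarray(axis_vectors, dtype=float)
    v = v[:, trim:v.shape[1] - trim, :]            # drop the flexible ends
    n_seg = v.shape[1]
    mean_l0 = np.linalg.norm(v, axis=2).mean()     # <l_0>

    # <l_n . l_0> / <l_0>^2, averaged over configurations and segment origins.
    seps = np.arange(n_seg // 2)
    corr = np.array([
        np.einsum("cij,cij->ci", v[:, :n_seg - n], v[:, n:]).mean()
        for n in seps
    ]) / mean_l0**2

    # Linear fit of ln(correlation) against n; Eq. 4 gives slope = -<l_0> / L_ps.
    usable = corr > 0
    slope, _intercept = np.polyfit(seps[usable], np.log(corr[usable]), 1)
    return -mean_l0 / slope, mean_l0
```

Restricting the fit to separations of up to half the retained length is a pragmatic choice that keeps a reasonable number of segment pairs in each average; noisy tails can otherwise dominate an unweighted fit.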

FIGURE 4. Plot of correlation of inter-segment vectors vs distance (number of base pairs along the DNA) for a 500-base-pair dsDNA (blue curve) and a 100-nucleotide ssDNA (red curve). The dsDNA data are fitted with an exponential decay with a decay constant of 0.0076134 per base pair.

As is evident from Figure 5, the correlation of the duplex axis indeed follows an exponential fall-off, to within sampling error. Fitting the logarithm of Eq. 4 to $\ln \left( \frac{\langle l_{n} \cdot l_{0} \rangle}{\langle l_{0} \rangle^{2}} \right)$ gives $L_{ps} \approx 131 \langle l_{0} \rangle = 131 \times 0.4118 \approx 53.95$ simulation units $= 45.91$ nm, consistent with experimental estimates of 40–50 nm (120–150 base pairs) at high Na+ concentrations (Savelyev, 2012; Herrero-Galán et al., 2013).

FIGURE 5. Log plot of correlation of inter-segment vectors vs distance (number of base pairs along the DNA) for a 500-base-pair dsDNA (blue curve) and a 100-nucleotide ssDNA (red curve). The dsDNA data are fitted with an exponential decay with a decay constant of 0.0076134 per base pair.

Indeed, more generally, the mechanical properties of double-stranded oxDNA are well-described by a semiflexible polymer model, and its torsional and extensional moduli have been analysed elsewhere (Ouldridge et al., 2011; Matek et al., 2015). Significant deviations from this behaviour, such as sharp kinks facilitated by broken base pairs, are generally only observed when large stresses are applied to the molecule (Romano et al., 2013; Matek et al., 2015), in agreement with experiment.

ssDNA behaves very differently in oxDNA. In Figure 4, we plot the correlations of backbone-site-to-backbone-site vectors for a 100-base poly (dT) ssDNA, obtained from simulations with oxDNA1.5 at 27°C. We perform 20 VMMC simulations with $2 \times 10^{7}$ attempted steps per particle. The first $10^{5}$ moves are treated as an initialization period during which no data are collected. For these simulations, we have set the stacking strength between the nucleotides to zero (the consequences of non-zero stacking strength will be addressed in Section 5.2). From Figure 4 and Figure 5, it is apparent that the correlation drops very rapidly, meaning that unstacked ssDNA is very flexible in oxDNA, as it should be; adjacent backbone-to-backbone vectors can bend through a large angle. Importantly, however, the drop in correlation between vectors with separation along the polymer cannot be well described by an exponential as in Eq. 4. The convexity of $\ln \left( \frac{\langle l_{n} \cdot l_{0} \rangle}{\langle l_{0} \rangle^{2}} \right)$ is indicative of more distant backbone-to-backbone vectors being aligned more strongly than would be expected from the alignment of two adjacent backbone-to-backbone vectors.

The reason for this behaviour is that it is the excluded volume of nucleotides that gives unstacked ssDNA its “stiffness” in oxDNA. The excluded volume of nucleotides discourages ssDNA from folding back on itself, but importantly it leads to very different polymer properties than assumed in common polymer models such as the freely-jointed chain and the wormlike chain. For these classic polymer models, the statistical properties are entirely determined by interactions between parts of the polymer that are adjacent along the backbone, whereas the curvature of $\ln \left( \frac{\langle l_{n} \cdot l_{0} \rangle}{\langle l_{0} \rangle^{2}} \right)$ in Figure 5 is indicative of interactions between more distant points along the polymer contour playing a role.

As a result, using a wormlike chain with a given $L_{ps}$ (or a freely jointed chain with a given Kuhn length) to understand ssDNA in oxDNA is misleading. The overall tendency of the polymer to swell to fill a large volume, due to its excluded volume, would suggest a far greater degree of local stiffness than is actually present. This effect is retained even when stacking between adjacent nucleotides is included.

Importantly, these complexities also apply to physical ssDNA, as well as the oxDNA model. Single nucleotides have linear dimensions on the order of 1 nm, and experimental estimates of the mechanical properties of ssDNA (usually reported as persistence lengths) are of a similar order of magnitude (Smith et al., 1996; Rivetti et al., 1998; Mills et al., 1999; Murphy et al., 2004). Describing ssDNA in this way is not self-consistent; any polymer with this cross-section and flexibility would be strongly affected by excluded volume, so these models cannot be accurate. The result has been that experiments on large scale properties of relaxed ssDNA (Rivetti et al., 1998; Murphy et al., 2004), which are sensitive to excluded volume effects, tend to produce larger estimates for quantities like $L_{ps}$ than experiments on shorter sections of ssDNA, or ssDNA under high tension (Smith et al., 1996; Rivetti et al., 1998).

Low salt concentrations, which lead to weaker screening of electrostatic interactions between non-adjacent nucleotides, make the above effect stronger (Smith et al., 1996; Dessinges et al., 2002). Base-pairing interactions in non-homopolymeric ssDNA have a confounding effect; the formation of secondary structure tends to condense the strand, making it appear more flexible when its statistics are modelled with a wormlike chain or a freely-jointed chain (Smith et al., 1996). Overall, as for oxDNA, simple descriptions of the mechanical properties of physical ssDNA should be treated with caution.

5.2 Response of ssDNA to Tension

A common mechanism for probing the mechanical properties of DNA is to apply force, whether torsional (Strick et al., 1996; Bryant et al., 2003; Forth et al., 2008; Mosconi et al., 2009; Brutzer et al., 2010; Salerno et al., 2012; Tempestini et al., 2013), extensional (Smith et al., 1996; Strick et al., 1996; Wang et al., 1997; Dessinges et al., 2002; Seol et al., 2004; Gore et al., 2006; Seol et al., 2007; Huguet et al., 2010; Gross et al., 2011) or shearing (Forth et al., 2008; Hatch et al., 2008; Mosconi et al., 2009; van Mameren et al., 2009; Brutzer et al., 2010; Salerno et al., 2012; Tempestini et al., 2013; Wang and Ha, 2013).

Experiments have focused on both the elastic properties associated with small deformations (Smith et al., 1996; Strick et al., 1996; Wang et al., 1997; Bryant et al., 2003), and large scale structural transitions (Smith et al., 1996; Strick et al., 1996; Dessinges et al., 2002; Bryant et al., 2003; Seol et al., 2004; Gore et al., 2006; Seol et al., 2007; Hatch et al., 2008; Forth et al., 2008; Mosconi et al., 2009; van Mameren et al., 2009; Brutzer et al., 2010; Gross et al., 2011; Salerno et al., 2012; Wang and Ha, 2013; Tempestini et al., 2013).

Applying external tension is relatively straightforward in molecular simulation; there are more subtleties associated with applying boundary conditions for external torsion (Matek et al., 2012; Matek et al., 2015), but it is also possible. In the case of oxDNA, small external stresses have been used to help parameterise and characterise the model (Ouldridge et al., 2011; Matek et al., 2015; Skoruppa et al., 2017; Nomidis et al., 2019); larger stresses have been applied to provide insight into experiments on structural transitions (Matek et al., 2012; Romano et al., 2013; Wang and Pettitt, 2014; Matek et al., 2015; Mosayebi et al., 2015; Wang et al., 2015; Engel et al., 2018; Desai et al., 2020).

Systems with internally-induced stress, where the drive to form base pairs in one part of an assembly applies stress to another part, have also been studied (Harrison et al., 2015; Sutthibutpong et al., 2016; Wang and Pettitt, 2016; Wang et al., 2017; Tee and Wang, 2018; Caraglio et al., 2019; Harrison et al., 2019; Engel et al., 2020; Fosado et al., 2021; Park et al., 2021).

As an example, in this section we demonstrate the force-extension properties of ssDNA as represented by oxDNA. Optical tweezer experiments with ssDNA have a long history (Smith et al., 1996). These original experiments with naturally-occurring DNA exhibited formation and stabilization of secondary structure in high salt conditions and at low to moderate force, although this was not explicitly modelled at the time. The presence of this secondary structure makes simulation of DNA heteropolymers hard; it is challenging to equilibrate a long strand with many competing base-pairing configurations (we have had some success using methods based on parallel tempering (Romano et al., 2013)). Instead, therefore, we simulate 100-nucleotide-long homopolymeric poly (dA) using oxDNA1.0 and oxDNA1.5.

The helicity in oxDNA is driven by stacking interactions between adjacent nucleotides. As is evident from Figure 2, this stacking has a residual effect on the structure of ssDNA strands, which are partially stacked in equilibrium. We will use force-extension simulations to probe the consequences of single-stranded stacking in oxDNA.

For these simulations, which are similar to original results in (Šulc et al., 2012), we use three versions of the model: one with no stacking interaction, one with a sequence-averaged stacking interaction (oxDNA1.0), and one with a sequence-specific stacking interaction (oxDNA1.5), for which poly (dA) has the strongest interaction of all sequences. All simulations are performed at $T = 27^{\circ}$C, in a periodic box of length 100 simulation units, with each simulation running for $4 \times 10^{8}$ steps with $dt = 0.005$, which corresponds to a total run time of $2 \times 10^{6}$ simulation units. Four sets of simulations are performed for 12 different values of force.

Figure 6A shows that extensive stacking has only a moderate effect on the force-extension properties of the ssDNA at low force. In the sequence-dependent model, poly (dA) is close to 90% stacked at 27°C (see Figure 6B). However, the increased stiffness due to the tendency to form stacked single helices (akin to the initial state in Figure 2) is counteracted by the shorter end-to-end distance of the backbone when it is forced to wind around the helix.

FIGURE 6. The response of ssDNA to tension, and the role of stacking therein. (A) Force-extension plots for 100-nucleotide poly (dA) using three models: no stacking (black); average stacking strength (oxDNA1.0, blue) and sequence-dependent stacking (oxDNA1.5, red). Stronger stacking leads to an increased force at larger extensions, and extremely strong stacking results in a plateau-like feature as stacking is disrupted. (B) Stacking probability for average stacking strength (oxDNA1.0, blue) and sequence-dependent stacking (oxDNA1.5, red) as a function of applied force. Adjacent nucleotides are defined as stacked if the stacking energy between the pair is more negative than -0.1 units.

At larger forces, however, we clearly see a signal of stronger stacking. Larger force is required to extend the strands with stronger stacking, and a plateau-like feature is evident in the system with the strongest stacking. A similar plateau was observed by Seol et al. for RNA stretching (poly(A) and poly(C)) (Seol et al., 2004; Seol et al., 2007) but was absent for poly(U). Those authors hypothesised that the plateau arises as the shorter end-to-end distance in helical stacked conformations becomes prohibitive; additional force is then required to disrupt the stacking interaction to allow further extension, causing an increase in the gradient of the force-extension curve. After the bases have unstacked, the gradient becomes less steep again.

Broadly speaking, this explanation is borne out by oxDNA. Notably, however, Seol et al. (Seol et al., 2004; Seol et al., 2007) concluded that a relatively low stacking probability should give a pronounced plateau. By contrast, in Figure 6B (obtained by probing configuration output files to assess the degree of stacking (Sengar, 2021)), we see that strands with an initial stacking probability of 78% show only a hint of the plateau. This discrepancy arises because, in the minimal model of Seol et al. (Seol et al., 2004; Seol et al., 2007), even a single pair of stacked nucleotides has a much shorter end-to-end distance along the ssDNA backbone than an unstacked pair. The explicit representation of 3D structure in oxDNA, however, captures the fact that the shortening of the end-to-end distance along the DNA backbone is only significant when several bases in a row are stacked into a helix, so that the backbone really has to wrap back round upon itself. As a result, extension can occur while disrupting only a fraction of the stacking interactions, and ssDNA in oxDNA remains significantly stacked even at high force (Figure 6B).
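The stacking criterion quoted in the caption of Figure 6 is simple to apply once per-pair stacking energies are available. The sketch below is illustrative only: it assumes that the stacking energies between adjacent nucleotides have already been extracted into an array (how they are obtained from the configuration files is not shown, and the function name `stacking_probability` is hypothetical).

```python
import numpy as np

STACKING_CUTOFF = -0.1  # simulation energy units, as in the Figure 6 caption

def stacking_probability(stacking_energies, cutoff=STACKING_CUTOFF):
    """Fraction of adjacent nucleotide pairs classified as stacked.

    stacking_energies : array, shape (n_configs, n_nucleotides - 1), holding the
                        stacking energy of each adjacent pair in each sampled
                        configuration (hypothetical, pre-extracted input)
    """
    e = np.asarray(stacking_energies, dtype=float)
    # A pair counts as stacked when its (attractive, negative) stacking energy
    # is more negative than the cutoff.
    return float((e < cutoff).mean())
```

Evaluating this fraction separately for each applied force gives a curve of the kind shown in Figure 6B.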

While oxDNA’s representation of the polynucleotide backbone is simplistic, these geometrical arguments also apply to physical DNA, suggesting that even weak plateau-like behaviour in ssDNA force-extension curves is evidence of strong stacking, and that the absence of a plateau is not proof of an absence of stacking. More generally, this system is indicative of the value that oxDNA can provide. The system involves an interplay between basic structure, mechanics and thermodynamics of ssDNA. When applied, oxDNA reveals subtleties that are not directly apparent from a more minimal model. Indeed, it is quite common to construct very simple models to interpret biophysical experiments on the mechanical properties of DNA (Hatch et al., 2008; Qu and Zocchi, 2011; Salerno et al., 2012; Vafabakhsh and Ha, 2012; Fields et al., 2013; Tempestini et al., 2013; Wang and Ha, 2013; Meng et al., 2014); simulations with oxDNA often reveal physically reasonable relaxation mechanisms that aren’t factored into these simpler models (Harrison et al., 2015; Matek et al., 2015; Mosayebi et al., 2015; Skoruppa et al., 2017; Harrison et al., 2019). At this stage, it is also worth highlighting a general virtue of coarse-grained models that is apparent in these simulations. It is very simple to switch off interactions (such as the stacking here) to isolate the effect those interactions have on the system. Doing so can be very helpful in interpreting the physical cause of experimental signals.

6 Thermodynamic Simulations With oxDNA

6.1 Duplex Formation Thermodynamics

As well as representing the structure and mechanical properties of ssDNA and dsDNA, oxDNA is also designed to capture the thermodynamics of the hybridization transition from ssDNA to dsDNA. Accurately capturing the thermodynamics of this transition is essential for any model hoping to describe biological and nanotechnological processes involving the formation and disruption of base pairs.

To assess the thermodynamics of a simple duplex, it is typical to simulate an isolated pair of strands in a periodic cell that is large enough to prohibit self-interactions (a unit cell size of $\gtrsim 2 n$ oxDNA length units, where $n$ is the duplex length, is generally sufficient). Given a sufficiently long VMMC or MD simulation, the fraction of time spent in the bound state can be estimated and used to infer quantities such as melting temperatures, as outlined below.

However, particularly for longer strands, simulating this process can be prohibitively slow. For two short strands in solution, the vast majority of configurations have well-separated strands and no base-pairing interactions. Enthalpically favourable base-pairing provides a compensatory advantage to configurations with many well-formed base pairs (fully-formed duplexes). To obtain a good estimate of the fraction of strands bound in equilibrium, it is necessary to pass between these two sub-ensembles (completely unbound and fully bound) many times; as a rule of thumb, we have found that around 10 interconversions will start to provide meaningful statistics.

Unfortunately, interconversion requires the system to transition through states with only one or two base pairs that benefit neither from the large ensemble of configurations accessible to dissociated strands, nor the favourable interactions of fully-bound strands. These configurations with $Q = 1,2$ base pairs are rare in the equilibrium ensemble, and have a relatively high free energy

$$F (Q) = - k T \ln \left( p_{eq} (Q) \right) + C, \tag{5}$$

where $C$ is a $Q$-independent constant, and $p_{eq} (Q)$ is the probability of observing $Q$ base pairs in equilibrium. The high free energy of these intermediate states makes dissociation and association rare-event processes that are challenging to sample directly.
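Eq. 5 translates directly into a histogram calculation. The following sketch assumes that the number of base pairs $Q$ has been recorded for each sampled configuration; the function name `free_energy_profile` and the optional `weights` argument (for samples that carry statistical weights) are assumptions made for illustration, and the constant $C$ is fixed by the arbitrary convention $F (0) = 0$.

```python
import numpy as np

def free_energy_profile(q_samples, n_max, weights=None):
    """Free energy F(Q) in units of kT from sampled base-pair counts (Eq. 5).

    q_samples : sequence of integer base-pair counts, one per configuration
    n_max     : largest possible number of base pairs
    weights   : optional statistical weight of each sample (hypothetical
                argument; omit it for an unbiased simulation)
    """
    q = np.asarray(q_samples, dtype=int)
    hist = np.bincount(q, weights=weights, minlength=n_max + 1).astype(float)
    p_eq = hist / hist.sum()                 # estimate of p_eq(Q)
    with np.errstate(divide="ignore"):
        f = -np.log(p_eq)                    # unvisited states receive +inf
    # Choose C so that F(0) = 0; this assumes the Q = 0 state was visited.
    return f - f[0]
```

A state that is never visited is assigned an infinite free energy, which is a reminder that rare intermediate states need either very long runs or biased sampling.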

6.1.1 Umbrella Sampling

To overcome this difficulty, simulations can be augmented with umbrella sampling (Torrie and Valleau, 1977). For a system with coordinates x (in our case, nucleotide positions and orientations), umbrella sampling involves identifying a collective order parameter $λ (x)$ for the transition of interest, and then applying a bias $W (λ (x))$ to force the system to occupy otherwise undesirable values of $λ (x)$ that lie along the transition path more frequently. Unbiased statistical averages can be extracted from these biased samples using

$$\langle A (x) \rangle_{eq} = \left\langle \frac{A (x)}{W (\lambda (x))} \right\rangle_{biased}, \tag{6}$$

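where $A (x)$ is a quantity of interest. Essentially, the contribution of each sampled configuration to the average is reduced by a factor of the bias applied; in practice, the resulting weighted average is normalised by the sum of the weights $1 / W (λ (x))$.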
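A minimal sketch of this unbiasing step is given below. It is independent of the standalone code's own output handling; the function name `unbiased_average` and the representation of the bias as a dictionary mapping each order-parameter value to its weight $W$ are assumptions for the example. The weighted sum is explicitly normalised by the sum of the weights.

```python
import numpy as np

def unbiased_average(a_values, order_params, bias):
    """Recover an equilibrium average from umbrella-sampled data.

    a_values     : value of the observable A(x) for each sampled configuration
    order_params : value of the order parameter lambda(x) for each configuration
    bias         : dict mapping each order-parameter value to its weight W
                   (hypothetical representation of the biasing potential)
    """
    a = np.asarray(a_values, dtype=float)
    # Each configuration is down-weighted by the bias under which it was sampled.
    inv_w = np.array([1.0 / bias[lam] for lam in order_params])
    # Normalise by the total weight so that a constant observable is unchanged.
    return float(np.sum(a * inv_w) / np.sum(inv_w))
```

For instance, if a state with weight 10 is visited as often as a state with weight 1 in the biased run, the unbiased probability of the first state is 1/11.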

A common approach with umbrella sampling is to perform a series of separate simulations with very strong biases tightly centred on distinct values of $λ (x)$ . Simulations centred on adjacent values of $λ (x)$ can then be knitted together using procedures such as the Weighted Histogram Analysis Method (WHAM), allowing the calculation of the free energy difference between the start and end point (Kumar et al., 1992).

Generally, however, we have found that this sophisticated approach is not necessary for oxDNA, and a particularly straightforward umbrella sampling method is built into the standalone oxDNA code. When using VMMC, it is possible to specify discrete order parameters $λ (x)$ based on the number of base pairs between user-defined groups of nucleotides. For duplex formation, it is fairly straightforward to iteratively identify a biasing potential that facilitates both the sampling of all states and the rapid transition between fully bound and completely detached configurations.

This biasing potential doesn’t need to be fine-tuned so that all values of $λ (x)$ are equally probable in the biased sample; it just needs to be good enough to facilitate multiple transitions backwards and forwards. Typical examples for 5-base and 8-base duplexes are given in (Sengar, 2021). For more complex systems, more sophisticated $λ (x)$ and the use of multiple sampling windows are sometimes necessary; we refer the reader to (Ouldridge et al., 2010a; Ouldridge et al., 2013a; Machinek et al., 2014; Harrison et al., 2019). Even in these cases, however, the principles are similar to those outlined here.

We perform umbrella sampling simulations on an 8-nucleotide duplex at 312 K in a simulation volume of side length 15 units, using the oxDNA1.5 version of the model. Five independent simulations are performed for $7.7 \times 10^{8}$ VMMC steps per particle. The quantity $p_{eq} (Q)$ obtained from simulations is used to calculate a free energy $F (Q)$ according to Eq. 5 and plotted in Figure 7. The shape of this graph is typical for duplex formation, showing the expected large jump in free energy from 0 to 1 base pairs. From 1 to 6 base pairs there is a steady drop in the free energy as configurations are stabilised by additional base-pairing interactions that are favoured once the strands are in close proximity. The final base pairs are less favourable, as base pairs at the end of a duplex are prone to fraying (SantaLucia and Hicks, 2004; Ouldridge et al., 2011).

FIGURE 7. Free-energy profile of an 8-base-pair duplex (3′-ACTGACGT-5′ and 3′-ACGTCAGT-5′) at 312 K in a simulation volume of side length 15 units.

The shape of $F (Q)$ gives a good guide to constructing first estimates of umbrella biases $W (λ (x))$ for duplex formation in general. Ignoring the dissociated state, $W (λ (x))$ should increase roughly exponentially with the number of base pairs broken, since it must counteract $\exp (- F (Q) / k T)$ . The slope of $F (Q)$ , and hence the required rate of exponential growth in $W (λ (x))$ , is determined by the temperature; as a crude rule of thumb, a bias of a factor of 10–15 is required per base pair broken at 300 K; this required bias falls to a factor of 3–4 by 330 K.

The initial jump in free energy from 0 to 1 base pairs in Figure 7 is largely determined by the simulation volume; for simulation cells similar in size to this one, a factor of 3,000–10,000 is a reasonable first guess for the required weight of the 1-base-pair state relative to the 0-base-pair state.
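These rules of thumb can be turned into a first-guess table of weights, to be refined iteratively. The sketch below is only one crude way of doing so: the function name `initial_bias_guess`, the default values (chosen from within the ranges quoted above for roughly 300 K and a cell of similar size) and the normalisation $W = 1$ for the fully formed duplex are all assumptions, and the weaker binding of the final, fraying base pairs is ignored.

```python
def initial_bias_guess(n_bp, per_bp_factor=12.0, binding_jump=5000.0):
    """First-guess umbrella weights W(Q) for Q = 0..n_bp base pairs.

    n_bp          : number of base pairs in the fully formed duplex
    per_bp_factor : assumed growth of W per base pair broken (illustrative value)
    binding_jump  : assumed weight of the 1-base-pair state relative to the
                    0-base-pair state (illustrative value)
    """
    # Exponential growth with the number of base pairs broken; W(n_bp) = 1.
    weights = {q: per_bp_factor ** (n_bp - q) for q in range(1, n_bp + 1)}
    # The unbound state needs a much smaller weight than the 1-base-pair state.
    weights[0] = weights[1] / binding_jump
    return weights
```

A short trial run with such weights shows which values of the order parameter are over- or under-sampled, and the weights can then be adjusted by hand until transitions in both directions occur readily.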

In addition to biasing by the number of base pairs formed, it is sometimes helpful to also use a distance-based contribution to the order parameter. Built in to the standalone oxDNA code is the ability to define additional dimensions of $λ (x)$ that depend on the minimum separation between sets of nucleotides, rather than the number of base pairs. We have found that a simple division of the 0-base-pair state into configurations in which the strands are close (less than 4 units apart) and far apart (4 or more units apart) can reduce the amount of time spent sampling the independent diffusion of strands around the simulation volume. For simulation volumes similar to this one, the close state should be weighted by around 5–10 relative to the distant state, which dominates the unbound ensemble.

6.1.2 Melting Temperature Curves

Although free-energy profiles for a single pair of strands are informative, they aren’t directly comparable to the majority of experiments. Indeed, the thermodynamics of the oxDNA model was parameterised to reproduce the nearest neighbour model (SantaLucia, 1998), which in turn was fitted to (and predicts) experimental melting curves in bulk conditions. The SantaLucia parameterisation of the nearest neighbour model (SantaLucia and Hicks, 2004) assumes that DNA duplex formation is essentially a two-state transition between a well-formed duplex and separated single strands. The free-energy profiles produced by oxDNA, such as Figure 7, are consistent with this picture; the ensemble is dominated by configurations with either zero base pairs or a large number. In this limit, the melting behaviour can be well-characterised by the fraction of strands that are expected to have base pairing with another strand at a temperature $T$ in a bulk system, $f_{\infty} (T)$ .

To calculate $f_{\infty} (T)$ in this two-state description, it is first necessary to obtain data at a range of temperatures. In principle, these data can be obtained through separate simulations. However, we have found that a technique called single histogram reweighting (Ferrenberg and Swendsen, 1988) is sufficient to infer $f_{\infty} (T)$ accurately over a large enough range of temperatures to describe the melting transition. The basic idea is to treat a simulation at a temperature T as a biased sample of the ensemble at another temperature $T^{'}$ ; this bias can be corrected in the same way as the bias applied during umbrella sampling:

$$\langle A (x) \rangle_{T'} = \left\langle A (x) \exp \left( V_{0} (x, T) / k T - V_{0} (x, T') / k T' \right) \right\rangle_{T}. \tag{7}$$

Here $V_{0} (x, T)$ is the value of the potential in the original simulation at temperature $T$ ( $V_{0} (x, T^{'})$ is slightly different due to a $T$-dependent term in the potential (Ouldridge et al., 2011)). As with any weighted average, the result is normalised by the mean of the weights. Extrapolation to nearby temperatures using single histogram reweighting is built into the oxDNA standalone code. It is important to note that if umbrella sampling, and particularly temperature reweighting, are applied, then it is especially important to allow an adequate equilibration period before results are collected. Normally, any initial unrepresentative states will be swiftly overwhelmed within an average taken over the whole course of the simulation. The unbiasing factors in Eq. 6 and Eq. 7, however, can cause unrepresentative initial states to be assigned enormous weights in the ensemble average that are effectively insurmountable, rendering the simulation results meaningless.
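The reweighting of Eq. 7 can be sketched as follows. This is a generic illustration rather than the implementation in the standalone code: the function name `reweight_to_temperature` is hypothetical, and the potential of each configuration is assumed to have been evaluated at both temperatures beforehand. The largest log-weight is subtracted before exponentiating, which leaves the normalised average unchanged but avoids numerical overflow.

```python
import numpy as np

def reweight_to_temperature(a_values, v_at_T, v_at_Tprime, kT, kT_prime):
    """Estimate <A> at temperature T' from samples generated at T (Eq. 7).

    a_values    : observable A(x) for each sampled configuration
    v_at_T      : potential V_0(x, T) of each configuration at the simulated T
    v_at_Tprime : potential V_0(x, T') of the same configurations at T'
    kT, kT_prime: thermal energies at T and T' in the same units as V_0
    """
    a = np.asarray(a_values, dtype=float)
    log_w = (np.asarray(v_at_T, dtype=float) / kT
             - np.asarray(v_at_Tprime, dtype=float) / kT_prime)
    log_w -= log_w.max()          # numerical stability; cancels on normalising
    w = np.exp(log_w)
    return float(np.sum(a * w) / np.sum(w))
```

When the sampled configurations carry umbrella weights as well, the two correction factors multiply; extrapolating far from the simulated temperature leaves only a handful of configurations with appreciable weight, so the estimate degrades.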

Given well-sampled data of the formation of a single duplex in a simulation volume, it is tempting to assume that the fractional yield of states with more than one base pair in a single duplex simulation, $f_{1} (T)$ , is equal to the bulk yield of duplexes $f_{\infty} (T)$ in a system with the same total concentration of strands. Unfortunately this is not the case; simulations of only a single target duplex neglect concentration fluctuations within unit cells that have large effects on the yield of products (Ouldridge et al., 2010b; Ouldridge, 2012). Quantitative comparison to experimental data is therefore impossible unless extrapolations to bulk conditions can be performed. Assuming ideal behaviour of solutes, extrapolation is possible. For dimerisation between non-self-complementary strands (Ouldridge et al., 2010b)

$$f_{\infty} (T) = \left( 1 + \frac{1}{2 \Phi (T)} \right) - \sqrt{\left( 1 + \frac{1}{2 \Phi (T)} \right)^{2} - 1}, \tag{8}$$

where $Φ (T) = f_{1} (T) / (1 - f_{1} (T))$ . A similar result holds for self-complementary duplexes (Ouldridge et al., 2010b), and algorithms exist to extrapolate to bulk for more complex assemblies (Ouldridge, 2012).
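Eq. 8 is straightforward to evaluate numerically. In the sketch below (the function name `bulk_yield` is an assumption), the expression is rewritten using the identity $a - \sqrt{a^{2} - 1} = 1 / (a + \sqrt{a^{2} - 1})$, with $a = 1 + 1 / (2 Φ)$, which is algebraically equivalent to Eq. 8 but avoids the loss of precision that occurs when two nearly equal numbers are subtracted at low yields.

```python
import math

def bulk_yield(f1):
    """Bulk duplex yield f_infinity from the single-duplex yield f_1 (Eq. 8).

    Valid for dimerisation of non-self-complementary strands, assuming ideal
    solution behaviour as stated in the text.
    """
    if not 0.0 <= f1 <= 1.0:
        raise ValueError("f1 must lie between 0 and 1")
    if f1 == 0.0 or f1 == 1.0:
        return f1                      # limiting cases of Eq. 8
    phi = f1 / (1.0 - f1)
    a = 1.0 + 1.0 / (2.0 * phi)
    # Equivalent to a - sqrt(a^2 - 1), written in a numerically stable form.
    return 1.0 / (a + math.sqrt(a * a - 1.0))
```

As a check, a single-duplex yield of $f_{1} = 2 / 3$ gives $Φ = 2$, $a = 1.25$ and hence $f_{\infty} = 0.5$; the bulk system therefore reaches half yield at a temperature where the single-duplex simulation is still two-thirds bound.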

Melting curves obtained for 5-base and 8-base duplexes, using umbrella sampling, temperature reweighting and extrapolation to bulk, are reported in Figure 8. The melting temperatures $T_{m}$ for these duplexes (defined, in the two-state model, as the temperature at which $f_{\infty} (T)$ is 0.5) are close to the values predicted by the nearest neighbour model at the same conditions (17.8°C and 56.1°C) (SantaLucia and Hicks, 2004). This agreement is, of course, due to the model being fitted to these data. However, it is worth noting that although duplexes are often described as having a single “melting temperature”, the temperature at which $f_{\infty} (T)$ is 0.5 depends on the concentration of the individual strands, $[C]$ , with (Ouldridge et al., 2011)

\frac{d T_{m}}{d [C]} \sim \frac{Δ T}{[C]} . (9)

FIGURE 8. Melting transition of oligonucleotides. Fractional yield of 5mer (3′-AGTCT-5′/3′-AGACT-5′) and 8mer (3′-ACTGACGT-5′/3′-ACGTCAGT-5′) duplexes in bulk ( $f_{\infty} (T)$ ) as predicted by oxDNA1.5 at a total concentration of $7.96 \times 10^{- 4}$ M for each strand.

Here, $Δ T$ is the width of the transition over which $f_{\infty} (T)$ goes from largely bound to largely unbound. To match nearest neighbour predictions for melting temperatures over a range of concentrations, therefore, it is necessary that transition widths are also comparable; achieving a good match was a major part of oxDNA’s parameterisation.
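The scaling in Eq. 9 can be checked numerically in a simple two-state picture. The sketch below is an illustration under stated assumptions, not the oxDNA parameterisation: it assumes a temperature-independent enthalpy and entropy of hybridisation for A + B ⇌ AB, and the values of `dH` and `dS` are hypothetical. In that picture $d T_{m} / d \ln [C] = R T_{m}^{2} / | Δ H |$, and the same combination sets the slope of $f_{\infty} (T)$ at $T_{m}$, which is why the shift in $T_{m}$ scales with the transition width.

```python
import math

R = 1.987e-3  # gas constant in kcal / (mol K)

def melting_temperature(dH, dS, c):
    """T_m in K for A + B <-> AB with each strand at total concentration c (M).

    Two-state model: f / (1 - f)^2 = K c, with K = exp(-(dH - T dS) / (R T)).
    Setting f = 1/2 gives K c = 2, hence T_m = dH / (dS + R ln(c / 2)).
    """
    return dH / (dS + R * math.log(c / 2.0))

# Hypothetical two-state parameters (not fitted to oxDNA or to experiment).
dH = -60.0    # kcal / mol
dS = -0.170   # kcal / (mol K)
c = 7.96e-4   # M per strand, the concentration quoted for Figure 8

tm = melting_temperature(dH, dS, c)

# Central finite-difference estimate of dT_m / d ln(c)
eps = 1e-4
slope = (melting_temperature(dH, dS, c * math.exp(eps))
         - melting_temperature(dH, dS, c * math.exp(-eps))) / (2.0 * eps)
analytic = R * tm ** 2 / abs(dH)

print(f"T_m = {tm:.1f} K")
print(f"dT_m/dln(c): numerical {slope:.3f} K, analytic {analytic:.3f} K")
```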

6.2 Thermodynamics of More Complex Structures

Although accurately simulating basic duplex formation was necessary for the parameterisation of oxDNA, little new information is to be gained from performing these simulations again. The model is trained to reproduce the thermodynamics of the nearest neighbour model, so simulating the thermodynamics of duplex formation is an expensive way to obtain an approximation to that nearest neighbour model. Where oxDNA can add value is if duplex formation occurs as part of some more complex system, possibly one in which internally or externally applied stresses, or topological constraints, are relevant (Romano et al., 2012; Šulc et al., 2012; Ouldridge et al., 2013a; Mosayebi et al., 2015; Kočar et al., 2016; Fonseca et al., 2018; Tee and Wang, 2018; Harrison et al., 2019). As an example, we simulate the formation of a small pseudoknotted structure (Figure 9) leveraging the intuition and techniques discussed in Section 6. Here, the two sequences 3′-AGCTTCCATG-5′ and 3′-AAGCTCATGG-5′ cannot form a single continuous duplex, but can form two 5-bp duplex sections if both strands bend back on themselves. The stability of this structure cannot be inferred from the nearest neighbour model, but it can easily be simulated with oxDNA. Applying umbrella sampling, we simulate the system at a temperature of 308 K in a periodic cell of side-length 20 oxDNA units using oxDNA1.5.

FIGURE 9. Pseudoknot unbound (A) and bound (B) states, for sequences 3′-AGCTTCCATG-5′/3′-AAGCTCATGG-5′.

The resulting free-energy profile, Figure 10, shows that, at the temperatures of interest, forming two arms is less favourable than forming only one. The advantage obtained by bringing the strands into close proximity via the binding of the first duplex is not enough to overcome the internal stress generated by the structure. This internal stress is evidenced by the much shallower slope of the free energy profile for forming base pairs 6–10 than 1–5.

FIGURE 10. Free energy vs number of base pairs formed for the complex 3′-AGCTTCCATG-5′/3′-AAGCTCATGG-5′ in a simulation volume of side length 20 oxDNA units.
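A free-energy profile of the kind shown in Figure 10 can be built from an umbrella-sampled histogram of the number of base pairs. The following is a minimal sketch, assuming that the bias enters as a multiplicative weight $W (n)$ on each value $n$ of the order parameter, so that the unbiased probability is proportional to the biased count divided by $W (n)$. The function name, the choice of reference state and the numbers are hypothetical and are not simulation output.

```python
import math

def free_energy_profile(biased_counts, umbrella_weights, reference=0):
    """Free energy F(n) / kT relative to the state with index `reference`.

    biased_counts[n]    -- sampled configurations with n base pairs
    umbrella_weights[n] -- bias W(n) applied to state n during sampling
    Unbiased probability: p(n) ~ biased_counts[n] / umbrella_weights[n].
    Free energy:          F(n) / kT = ln(p(reference) / p(n)).
    """
    unbiased = [c / w for c, w in zip(biased_counts, umbrella_weights)]
    ref = unbiased[reference]
    # States that were never visited get an infinite free energy
    return [math.log(ref / p) if p > 0 else math.inf for p in unbiased]

# Hypothetical numbers for illustration only.
counts = [1200, 900, 1100, 1000]
weights = [1.0, 50.0, 20.0, 5.0]
profile = free_energy_profile(counts, weights)
for n, f in enumerate(profile):
    print(f"{n} base pairs: F/kT = {f:.2f}")
```

A roughly flat biased histogram, as in this toy input, is the sign that the weights were well chosen; the structure of the profile then comes almost entirely from the weights themselves.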

7 Dynamical Simulations

The simulations described hitherto probe static quantities obtained in the equilibrium ensemble. However, the dynamics of DNA-based systems can be equally important. In particular, the time required for reactions to happen is crucial when constructing complex self-assembling systems or functional circuits, particularly those that are intended to remain out of equilibrium, or exhibit an extremely slow relaxation to equilibrium (Dunn et al., 2015; Srinivas et al., 2017; Fern and Schulman, 2018; Cabello-Garcia et al., 2021).

Unlike the thermodynamic and structural properties, oxDNA has not been carefully parameterised to the dynamics of physical DNA. Coarse-graining is generally known to speed up timescales by smoothing free-energy landscapes (Murtola et al., 2009). Moreover, the explicitly dynamical algorithms (particularly the Andersen-like thermostat) give a fairly crude approximation to the dynamics expected from small molecules in solution. Neither the Andersen-like nor the Langevin thermostat incorporates cooperative hydrodynamics (an updated version of the Langevin thermostat developed to describe hydrodynamic effects (Davidchack et al., 2017) has not yet been implemented in LAMMPS), and both are typically run with large effective diffusion coefficients to enhance sampling (see Supplementary Appendix A).

Nonetheless, the dynamics of oxDNA is fundamentally constrained by the combination of its free-energy landscape and its embedding of that free-energy landscape in an explicit geometrical description. For comparison, it is surprisingly difficult to generate meaningful dynamics based on just the free-energy landscape predicted by the nearest neighbour model without an explicit geometrical representation (Srinivas et al., 2013; Schaeffer et al., 2015).

As a result, dynamical simulations of oxDNA can provide useful insight into dynamical properties of physical DNA; the model has been particularly successful in describing toehold-mediated strand displacement (Srinivas et al., 2013; Machinek et al., 2014; Haley et al., 2020; Irmisch et al., 2020), one of the fundamental reactions of DNA nanotechnology. Importantly, the focus should always be on comparing the relative dynamics of two similar systems – for example, the dependence of strand displacement rates on toehold lengths. Unlike the thermodynamic and mechanical properties of oxDNA, absolute values of dynamical properties are largely irrelevant.

As an example, we simulate the dissociation kinetics of duplexes of length 4 (3′-ATAT-5′/3′-ATAT-5′), 5 (3′-ATATA-5′/3′-ATATA-5′) and 6 (3′-ATATAT-5′/3′-ATATAT-5′) at 320 K using the Andersen-like thermostat applied to oxDNA1.5, Figure 11. Example trajectories, showing the energy of the system per nucleotide, illustrate the two-state nature of the system discussed earlier in Section 6. The strands spend a substantial amount of time in states with an energy of approximately -1.0 in oxDNA units (duplex configurations), before suddenly transitioning to states with an energy around -0.2 (single-stranded states). As hinted at by these examples, longer strands take exponentially longer to dissociate (the simulation steps taken to reach an energy of -0.2 per nucleotide, averaged over 10 simulations for each length, are: $9117 \pm 2883$ , $64032 \pm 20249$ , $279744 \pm 88463$ ). This exponential suppression of the dissociation rate with strand length is consistent with dissociation being a rare event that requires the crossing of a free energy barrier whose height grows linearly with duplex length at fixed temperature (here, 320 K), as suggested by the free-energy profile in Figure 7.
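The exponential dependence on length can be made quantitative with a log-linear fit to the three mean dissociation times quoted above. This is a minimal sketch using an unweighted least-squares fit; with only three points and statistical errors of roughly 30% on each, the result should be read as a rough estimate.

```python
import math

lengths = [4, 5, 6]
mean_steps = [9117, 64032, 279744]  # mean steps to dissociate, from the text

# Ordinary least-squares fit of ln(steps) = intercept + slope * length
logs = [math.log(s) for s in mean_steps]
n = len(lengths)
x_mean = sum(lengths) / n
y_mean = sum(logs) / n
slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(lengths, logs))
         / sum((x - x_mean) ** 2 for x in lengths))
intercept = y_mean - slope * x_mean

factor_per_bp = math.exp(slope)
print(f"slope = {slope:.2f} per base pair")
print(f"dissociation time grows by a factor of ~{factor_per_bp:.1f} per base pair")
```

The fitted slope corresponds to an increase in dissociation time of between five- and six-fold for each additional base pair under these simulation conditions.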

FIGURE 11. Energy per nucleotide vs simulation time step for (a) 4mer, (b) 5mer, (c) 6mer duplexes at 320 K. Sudden transitions from low to high energy are indicative of rare-event melting.

In this case, all systems studied showed the required behaviour on relatively short time scales. Frequently, it is necessary to simulate much slower processes. We have found that the forward flux sampling (FFS) technique (Allen et al., 2009) is an effective tool for simulating dynamical processes with a longer timescale. However, FFS is trickier to implement than umbrella sampling, and is not yet built into the released code in an optimal way.

8 Simulation of Large Structures

Another significant application area for oxDNA has been the simulation of large structures to assess their conformation, stability and flexibility (Fernandez-Castanon et al., 2016; Schreck et al., 2016; Sharma et al., 2017; Shi et al., 2017; Benson et al., 2018; Choi et al., 2018; Coronel et al., 2018; Berengut et al., 2019; Brady et al., 2019; Hoffecker et al., 2019; Snodin et al., 2019; Berengut et al., 2020; Chhabra et al., 2020; Poppleton et al., 2020; Tortora et al., 2020; Yao et al., 2020).

In this context, oxDNA represents an alternative to the CanDo model and simulation package (Castro et al., 2011). The added complexity of oxDNA has a computational cost, but means that it is better able to handle irregular systems. For such simulations, use of oxDNA2.0 is strongly recommended given its better representation of structure, particularly in the context of DNA origami. A more detailed primer on setting up these simulations can be found in Doye et al. (2020); here we focus only on technical aspects of the simulations.

As briefly mentioned in Section 4, MD algorithms can facilitate the simulation of really large systems by allowing parallelisation across GPU threads or multiple CPUs. The oxDNA standalone code is GPU-enabled via the CUDA C API and supports runs on single CPUs and single GPUs, whereas the LAMMPS version of oxDNA uses the Message Passing Interface (MPI) and is optimised for parallel runs on multi-core CPUs and distributed memory architectures.

To provide benchmarks and exemplar codes, we have performed large-scale simulations with both implementations on two different compute architectures, namely an NVIDIA V100 PCIe GPU with 5,120 CUDA cores at Arizona State University’s High Performance Computing Facility, and the ARCHIE-WeSt HPC facility at the University of Strathclyde consisting of 64 Intel Xeon Gold 6138 (Skylake) processors @2.0GHz with 40 cores per node and 2,560 cores in total. The GPU and single-core CPU runs were performed with the oxDNA standalone code SVN version 6989. The GPU runs all used mixed precision and an edge-based approach (Rovigatti et al., 2015). The LAMMPS stable version from March 3, 2020 was used for the multi-core CPU runs (the small error in the LAMMPS implementation highlighted in Supplementary Appendix B is irrelevant for the purposes of these efficiency comparisons). All runs were performed with the oxDNA2.0 model featuring sequence-dependent stacking and hydrogen-bonding interactions.

Two different benchmarks were studied to analyse the performance of both implementations. The first one consisted of a varying number of double-stranded octamer duplexes and investigated the performance at different system sizes, ranging from 8 octamers with 128 nucleotides in total to 262,144 octamers with 4,194,304 nucleotides in total. The concentration of octamers was kept constant at one octamer per 20³ oxDNA length units, whereas the temperature and salt concentration were set to $T = 293 K$ and $[{Na}^{+}] = 500$ mM, respectively.
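The sizes quoted for the first benchmark follow from simple arithmetic, sketched below. The cubic simulation box is an assumption made here for illustration, and the function name is hypothetical.

```python
NUCLEOTIDES_PER_DUPLEX = 16     # two 8-nucleotide strands per octamer duplex
VOLUME_PER_DUPLEX = 20.0 ** 3   # one duplex per 20^3 oxDNA length units

def benchmark_system(n_duplexes):
    """Return (number of nucleotides, side of a cubic box in oxDNA units)."""
    nucleotides = n_duplexes * NUCLEOTIDES_PER_DUPLEX
    box_side = (n_duplexes * VOLUME_PER_DUPLEX) ** (1.0 / 3.0)
    return nucleotides, box_side

for n in (8, 262_144):
    nucleotides, side = benchmark_system(n)
    print(f"{n} duplexes: {nucleotides} nucleotides, box side ~ {side:.0f}")
```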

The second benchmark consisted of a DNA origami “pointer” structure (Bai et al., 2012) (15,238 nucleotides) and tested the performance at different salt concentrations between $[{Na}^{+}] = 100$ mM and 1 M. The salt concentration is another performance-critical aspect in the simulation of nucleic acids that is often neglected. The reason is that the salt concentration affects the Debye screening length, which is proportional to the inverse square root of the salt concentration. The temperature of this second benchmark was fixed at $T = 293 K$ . The initial configuration was converted from the cadnano format using the TacoxDNA server (Suma et al., 2019), then relaxed using oxdna.org (Poppleton et al., 2021), implementing the protocol from (Doye et al., 2020) followed by a simulation at the respective salt concentration.

It is worth emphasising that origami structures such as the pointer are a setting in which the improved structural model of oxDNA2.0 is essential. Unless an accurate model is used, relatively small discrepancies can contribute strain that builds up across the structure, resulting in large scale distortion.

Figure 12 shows the results of the oligomer benchmark, which are expressed as time per integration time step in milliseconds. On a single Intel Xeon Gold CPU the standalone code implements a single timestep slightly faster than the LAMMPS implementation. Note, however, that the actual efficiency will depend on the choice of coefficients of coupling to the thermostats (see Supplementary Appendix B).

FIGURE 12. Performance of the oligomer benchmark as time per time step for various system sizes: Shown are results of the oxDNA standalone code on a single CPU and NVIDIA V100 GPU and of the LAMMPS implementation of the oxDNA2 model at different CPU counts.

When deployed in parallel on more CPUs, the LAMMPS implementation offsets this disadvantage almost immediately. Its performance at the larger side of system sizes is more or less ideal as evidenced through the linear increase of time per integration step with system size. For smaller system sizes, and depending on how many CPUs were used, the performance levels off due to a build-up of MPI communication overheads. However, there is still a noticeable speed-up e.g. for 8,192 nucleotides on 320 MPI-tasks or 65,536 nucleotides on 2,560 MPI-tasks, which comes down to a very low 25 nucleotides per MPI-task. This unusually good performance of a parallel molecular dynamics code has been reported before (Henrich et al., 2018) and is owed to the rather complex oxDNA force field as the code spends a good deal of time carrying out the force calculation.

The GPU-implementation of the standalone code retains a significant advantage over the LAMMPS implementation for all but the largest benchmark sizes and runs on the full ARCHIE-WeSt system (2,560 MPI-tasks). Its performance levels off only when the GPU becomes under-subscribed with threads at smaller system sizes. We can conclude that the LAMMPS implementation of oxDNA, besides its capability to run on a variety of CPU architectures, is very suitable for studying small and intermediate system sizes, whereas the GPU-implementation clearly has the edge for large-scale simulations.

Figure 13 shows the performance with the pointer benchmark, again expressed as time per time step in milliseconds. This time the LAMMPS implementation is marginally faster than the oxDNA standalone code on a single CPU. Again, the GPU-runs of the standalone code feature significantly shorter run times on all but the largest core counts and lowest salt concentrations. It appears that the increase in run time between high and low salt concentration is slightly larger for the GPU-implementation of the standalone code. This could be due to a slightly better handling of neighbour lists in LAMMPS.

FIGURE 13. Performance of the pointer benchmark as time per time step at various salt concentrations: Shown are results of the oxDNA standalone code on a single CPU and NVIDIA V100 GPU and of the LAMMPS implementation of the oxDNA2.0 model at different core counts.

Most importantly, however, an increase in runtime by a factor of 8–9 can be seen at all core counts when moving from high to moderate salt concentrations. This slowdown is in line with the increase in Debye length by about a factor of 3 and reflects the longer cutoff radii and neighbour lists of the pair interactions. This large performance difference should be taken into account when choosing simulation parameters: for instance, it is nearly always more convenient to perform relaxation runs to create well-initialized configurations at high salt concentrations (e.g. $[{Na}^{+}] = 1$ M). Indeed, unless the response of the system to decreased salt concentration is of specific interest, we would generally recommend using high monovalent salt concentration such as $[{Na}^{+}] = 1$ M for the actual data collection.
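The change in Debye length between the two ends of the salt range follows directly from its inverse square-root dependence on salt concentration. The short sketch below uses only that proportionality, so it assumes the same temperature and dielectric constant at both concentrations; the function name is hypothetical.

```python
import math

def debye_length_ratio(salt_low, salt_high):
    """Ratio of Debye lengths at two monovalent salt concentrations.

    Uses only lambda_D ~ 1 / sqrt([Na+]); both concentrations must be given
    in the same units.
    """
    return math.sqrt(salt_high / salt_low)

ratio = debye_length_ratio(0.1, 1.0)  # 100 mM versus 1 M
print(f"Debye length at 100 mM relative to 1 M: {ratio:.2f}")
```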

9 Conclusion

We have reviewed the properties of, and simulation methods available for, the oxDNA model. In the process we have created a well-documented library of exemplar simulations available from (Sengar, 2021). Equally importantly, however, we have attempted to provide the necessary intuition both for successfully running oxDNA-based simulations, and also for identifying which systems would actually benefit from those simulations in the first place.

Having explored the model’s strengths in some detail, it is worth noting a few natural directions for improvements. Although the model has well-parameterised sequence-dependent thermodynamics, and a good representation of average mechanical properties, it lacks sequence-dependent structure and mechanics. Incorporating this feature would be useful in and of itself, but would also be a useful first step towards building a model that could interface with other molecules such as proteins (Procyk et al., 2020).

Author Contributions

AS: Ran simulations for Figures 1–10 and partially wrote analysis and discussions for the corresponding sections. TO: Wrote introduction, analysis and discussion sections of the paper and ran simulations for Figure 11. OH: Ran pointer and oligomer benchmark simulations for the LAMMPS version of oxDNA for Figures 12 and 13. LR: Ran oligomer benchmark simulations for the standalone version of oxDNA for Figure 12. PŠ: Ran pointer benchmark simulations for the standalone version of oxDNA for Figure 13.

Funding

This work is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant agreement No. 851910). TO is supported by a Royal Society University Fellowship. OH acknowledges support from the EPSRC Early Career Research Software Engineer Fellowship Scheme (Grant No. EP/N019180/2). This work used the ARCHIE-WeSt High-Performance Computer (www.archie-west.ac.uk) based at the University of Strathclyde. PŠ acknowledges the use of the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number TG-BIO210009.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmolb.2021.693710/full#supplementary-material

References

Adleman, L. (1994). Molecular Computation of Solutions to Combinatorial Problems. Science 266, 1021–1024. doi:10.1126/science.7973651

Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., and Walter, P. (2002). Molecular Biology of the Cell. 4th ed. (New York: Garland Science).

Allen, R. J., Valeriani, C., and ten Wolde, P. R. (2009). Forward Flux Sampling for Rare Event Simulations. J. Phys. Condens. Matter 21, 463102. doi:10.1088/0953-8984/21/46/463102

Bae, J. H., Fang, J. Z., and Zhang, D. Y. (2020). High-throughput Methods for Measuring DNA Thermodynamics. Nucleic Acids Res. 48, e89. doi:10.1093/nar/gkaa521

Bai, X.-c., Martin, T. G., Scheres, S. H., and Dietz, H. (2012). Cryo-EM Structure of a 3D DNA-Origami Object. Proc. Natl. Acad. Sci. 109, 20012–20017. doi:10.1073/pnas.1215713109

Benson, E., Mohammed, A., Rayneau-Kirkhope, D., Gådin, A., Orponen, P., and Högberg, B. (2018). Effects of Design Choices on the Stiffness of Wireframe DNA Origami Structures. ACS nano 12, 9291–9299. doi:10.1021/acsnano.8b04148

Berengut, J. F., Berengut, J. C., Doye, J. P., Prešern, D., Kawamoto, A., Ruan, J., et al. (2019). Design and Synthesis of Pleated DNA Origami Nanotubes with Adjustable Diameters. Nucleic Acids Res. 47, 11963–11975. doi:10.1093/nar/gkz1056

Berengut, J. F., Wong, C. K., Berengut, J. C., Doye, J. P., Ouldridge, T. E., and Lee, L. K. (2020). Self-limiting Polymerization of DNA Origami Subunits with Strain Accumulation. ACS nano 14, 17428–17441. doi:10.1021/acsnano.0c07696

Brady, R. A., Kaufhold, W. T., Brooks, N. J., Foderà, V., and Di Michele, L. (2019). Flexibility Defines Structure in Crystals of Amphiphilic DNA Nanostars. J. Phys. Condensed Matter 31, 074003. doi:10.1088/1361-648x/aaf4a1