Bayesian Inference of Synaptic Quantal Parameters from Correlated Vesicle Release

Synaptic transmission is both history-dependent and stochastic, resulting in varying responses to presentations of the same presynaptic stimulus. This complicates attempts to infer synaptic parameters and has led to the proposal of a number of different strategies for their quantification. Recently Bayesian approaches have been applied to make more efficient use of the data collected in paired intracellular recordings. Methods have been developed that either provide a complete model of the distribution of amplitudes for isolated responses or approximate the amplitude distributions of a train of post-synaptic potentials, with correct short-term synaptic dynamics but neglecting correlations. In both cases the methods provided significantly improved inference of model parameters as compared to existing mean-variance fitting approaches. However, for synapses with high release probability, low vesicle number or relatively low restock rate and for data in which only one or few repeats of the same pattern are available, correlations between serial events can allow for the extraction of significantly more information from experiment: a more complete Bayesian approach would take this into account also. This has not been possible previously because of the technical difficulty in calculating the likelihood of amplitudes seen in correlated post-synaptic potential trains; however, recent theoretical advances have now rendered the likelihood calculation tractable for a broad class of synaptic dynamics models. Here we present a compact mathematical form for the likelihood in terms of a matrix product and demonstrate how marginals of the posterior provide information on covariance of parameter distributions. The associated computer code for Bayesian parameter inference for a variety of models of synaptic dynamics is provided in the Supplementary Material allowing for quantal and dynamical parameters to be readily inferred from experimental data sets.


INTRODUCTION
The statistics and dynamics of stochastic synaptic filtering determine how information is communicated between neurons. Synapses act as activity-dependent filters on the transfer of neuronal signals, suppressing or amplifying trains of inputs to the postsynaptic cell relative to isolated stimuli, in a phenomenon known as short-term plasticity or synaptic dynamics (Zucker and Regehr, 2002;Abbott and Regehr, 2004;Mongillo et al., 2008). An action potential in the presynaptic cell triggers an influx of Ca 2+ into synaptic terminals, causing a probabilistic all-or-none release of neurotransmitter at each active vesicle docking site on the presynaptic membrane. The neurotransmitter binds to channels on the postsynaptic cell resulting in, for example, an excitatory post-synaptic potential (EPSP) "built up statistically of the all-or-none events that are similar in size and distribution to spontaneous miniature" postsynaptic potentials (del Castillo and Katz, 1954). Depletion of vesicles available at active sites can cause an activity-dependent reduction in synaptic efficacy (Eccles et al., 1941) whereas a build-up of Ca 2+ in the presynaptic terminal can increase the probability of neurotransmitter release (Dudel and Kuffler, 1961). Synaptic transmission is thus both fundamentally stochastic (del Castillo and Katz, 1954;Fatt and Katz, 1952;Stein, 1965) and history dependent (Furukawa et al., 1982;Abbott, 1997;Tsodyks and Markram, 1997).
Initial analyses of paired-cell data used the amplitude distribution of isolated EPSPs to identify quantal peaks corresponding to sums of similar mini amplitudes (Boyd and Martin, 1956;Liley, 1956;Kuno, 1964;Kuno and Weakly, 1972;Bennett and Florin, 1974;Bekkers, 1994); for a review see Bennett and Kearns (2000). While this was an effective approach for extracting the properties of neuromuscular synapses (del Castillo and Katz, 1954) the greater variation in mini amplitudes at central synapses (Hanse and Gustafsson, 2001;Franks et al., 2003;Hardingham et al., 2010) necessitated different techniques to recover robust results in the central nervous system. Meanvariance analysis was developed to obtain estimates of the maximum number of vesicles that can be released by a single stimulus (Silver et al., 1998;Clements, 2003;Silver, 2003). Initial applications relied on conducting experiments under a variety of conditions, in particular varying the extracellular Ca 2+ concentration to alter the vesicle release probability (Foster and Regehr, 2004;Birò et al., 2005). Brémaud et al. (2007) and Loebel et al. (2009) increased the practicality of the method by using short-term vesicle depletion to vary the effective release probability under a single experimental condition. Their analyses showed that multiquantal release underlies the wide range of EPSP amplitudes observed (Song et al., 2005;Lefort et al., 2009) and that, in general, it is not the case that the number of distinct anatomical contacts equals the maximum number of readily-releasable vesicles as was put forward by the single-vesicle hypothesis (Kuno, 1971;Korn et al., 1981).
More recent approaches have introduced a principled Bayesian approach to infer synaptic parameters. Bayesian inference determines the extent to which experimental evidence supports a given set of model parameters. This relies on the fact that the probability of a certain model being correct given observed data is proportional to the probability of observing that data given that the model is correct. As such it makes maximal use of data, including every observation rather than extracting moments as in previous approaches. This framework was first applied to neurophysiological synaptic data by Turner and West (1993) to extract the number of components in a unitary EPSP. More recently, McGuinness et al. (2010) used Bayesian analysis to measure presynaptic Ca 2+ concentrations and Bhumbra and Beato (2013) used an exact Bayesian approach to extract quantal parameters from measurements of isolated EPSPs, demonstrating that accurate parameter estimates could be obtained from less data than with existing mean-variance methods.
Inference on isolated EPSPs, however, does not allow recovery of synaptic parameters associated with short-term plasticity. Costa et al. (2013) addressed this issue in a Bayesian framework using the Tsodyks-Markram model of short-term plasticity (Tsodyks et al., 1998) with a likelihood that approximated synaptic amplitude distributions during patterned input as uncorrelated Gaussians around the mean amplitudes. Though this approach does not account for correlations between closelytimed synaptic events, the method nevertheless allowed for accurate inference of a number of synaptic parameters. However, correlations between successive PSPs, which can be significant even at stimulation rates below 10 Hz, (del Castillo and Katz, 1954;Thomson et al., 1993;Fuhrmann et al., 2002) can provide a useful source of additional information for inferring model parameters. This is particularly the case for data sets that feature only a few repreated stimulations or only one series of patterned PSPs such as would be the case for spontaneous in-vivo recordings.
The main barrier to extending the Bayesian approach to a model that allows simultaneous recovery of both quantal and dynamic properties is the calculation of the likelihood of seeing a particular train of amplitudes in response to a certain pattern of presynaptic stimuli. This probability is dependent on the correlated vesicle releases during previous events and the number of possibilities therefore grows exponentially with the number of PSPs. Naively, this would appear to make the problem intractable. However, two independent studies (Barri et al., 2016;Bird, 2016) recently provided a solution to this problem by exploiting the underlying Markovian nature of the problem thereby allowing for the computation of the exact probability of a given set of observed amplitudes with a complexity that grows only linearly with PSP number. Here we develop the method, originally presented in Bird (2016), to show how the likelihood may be written in a compact mathematical form as a matrix product. This allows for efficient calculation of the posterior distribution from which, for example, the covariance of the inferred parameters can be analyzed. Our complete Bayesian method may be thought of as combining the method for inferring quantal parameters for isolated PSPs developed by Bhumbra and Beato (2013) with the method for inferring mean synaptic dynamics (without including correlations) developed by Costa et al. (2013). As well as describing the mathematical solution we additionally provide the software code to perform Bayesian inference for a variety of models of synaptic dynamics as part of this publication.

METHODS
In this section we define the general class of synaptic models our inference procedure applies to before specifying a commonly used depression-facilitation model of neurotransmitter release that will be used for illustrative purposes. The coupling of the presynaptic model to the post-synaptic voltage response is then defined.

The Class of Synaptic Dynamics Models
The method presented here is applicable to a broad class of synaptic models. The synapses this method can be applied to are assumed to have a number n of vesicle release sites to which neurotransmitter vesicles can dock. On arrival of the mth presynaptic spike at time t m neurotransmitter is released independently from each docked vesicle with probability u m . The binary occupancy variable x(t) for single release site obeys where t r are restock events (which occur at a rate that may be dependent on the presynaptic action potential times) and δ m is a binary random variable signifying release of neurotransmitter that is equal to 1 with probability u m and 0 otherwise. The stochasticity in t r and δ m is considered to be statistically independent across the n vesicle release sites. Note also that in this formulation any dynamic quantity (such as x(t)) multiplying a Dirac-delta function is evaluated just before the arrival of the impulse. The expected change in occupancy before and after a presynaptic action potential can be straightforwardly derived to give where x ⊖ m is the probability that a release site is occupied just before and x ⊕ m just after the mth spike. Similarly, the probability of occupancy just before (m + 1)th AP can be related to the occupancy just after the mth AP as where g m is the restock probability. For certain models g m can depend on the history of the presynaptic APs. Together the recursion relations (2) and (3) give the occupancy probability for an arbitrary train of presynaptic action potentials. The initial condition is typically taken as being x ⊖ 1 = 1, where all release sites are stocked. These dynamics cover a range of models such as vesicle depression (Tsodyks and Markram, 1997), depression with facilitation (Varela et al., 1997;Tsodyks et al., 1998;Fuhrmann et al., 2002), frequency-dependent recovery (Fuhrmann et al., 2004) and augmented recovery (Wang and Kaczmarek, 1998;Hosoi et al., 2007). For an in-depth discussion, see Appendix A.

Illustrative Synaptic Model with Depression and Facilitation
To provide an example of the method we use a commonly used model that combines a depression mechanism caused by vesicle release and a constant restock rate with a facilitation mechanism that models the effect of increased release probability due to transient increases in calcium concentrations in the presynaptic terminal (Varela et al., 1997;Tsodyks et al., 1998;Fuhrmann et al., 2002). The restock process is Poissonian and has constant rate 1/τ D , where τ D is commonly referred to as the depression time constant; therefore the restock probability required for Equation (3) is simply where T m = t m+1 − t m is the time between the mth and (m + 1)th APs. Let p 0 be the baseline value of the probability of release, and p 1 be the facilitated release probability immediately after an isolated spike. Let u(t) be the time-dependent release probability.
In the absence of stimulus, u(t) decays back to p 0 with timescale τ F . The dynamics of u(t) therefore obeys where the (1 − u) prefactor of the Delta functions prevents the probability going above unity. In this setup u = p 0 if the previous spike was a long time ago, then on the arrival of a spike it jumps to u = p 1 . Because it is a facilitation model we have p 0 < p 1 < 1. Note that this formulation of parameters allows the facilitated release probability p 1 to be fixed independently of the initial release probability p 0 and maps directly to the original quantal facilitating and depressing synaptic model of Fuhrmann et al. (2002) with p 0 = U SE and p 1 = U SE + (1 − U SE )U 1 using that paper's notation. The values of u(t) just after the mth and before the (m + 1)th action potentials (u ⊕ m and u m+1 respectively) are defined by the following recursion relations where the initial conditions are that u 1 = p 0 and u ⊕ 1 = p 1 . This gives the release probability before each presynaptic spike required for Equation (2). The dynamics of the restock probability g are unaffected and are given by Equation (4). A special case of this model that has one less free parameter is when the release probability doubles after an isolated spike and so p 1 = 2p 0 (Tsodyks et al., 1998).

EPSP Amplitude Distribution
The post-synaptic amplitude statistics for single vesicle release of neurotransmitter is modeled by a gamma distribution with mean µ a and standard deviation σ a (µ a is assumed to be greater than σ a to ensure that a zero amplitude is not the most likely). This is preferred over a normal distribution on empirical grounds and ensures that amplitudes are always positive (Robinson, 1976;Hanse and Gustafsson, 2001;Bhumbra and Beato, 2013). However, it is reasonable to assume that background noise is normal with standard deviation σ b and is independent of EPSP amplitude. Note that this choice of amplitude generation is identical to that described for isolated EPSPs in (Bhumbra and Beato, 2013). With this choice, if k vesicles release neurotransmitter from among the n possible release sites, the observed EPSP amplitude A is written A = ψ + φ where ψ is the release-dependent component and φ the independent Gaussian noise. Because ψ is the sum of k individual quantal amplitudes, each of which are independently identically gamma distributed, its distribution is also gamma-distributed with The distribution for the measured EPSP amplitude A, given k release events, is therefore a convolution between the gamma and normally distributed components of the noise An approach for numerically calculating this integral efficiently is provided in Appendix B.

Computational Methods and Code
An exhaustive grid-based derivation of the likelihood function for the depression-only model (see Appendix) is just within the capabilities of easily accessible computers at the time of writing. However, for more involved models with a greater number of parameters this becomes impracticable and a Markov Chain Monte Carlo (MCMC) approach was used instead. Here priors are taken to be flat (uninformative) for all parameters for illustrative reasons: more informative priors can be included as required. For the MCMC implementation, parameter space is discretised into a grid and the sampler is initialized at a random point consistent with any restrictions on the model parameters. Moves are proposed to each adjacent grid point with equal probability and accepted or rejected based on the log-likelihood ratio of the current and proposed points. Convergence of the sampler was examined by comparing the distributions resulting from chains initiated in different locations. It is straightforward to extend this transparent implementation in our code to include more sophisticated methods such as slice sampling. We provide MATLAB and JULIA code for the Bayesian inference of synaptic parameters from measurements of synaptic amplitudes using the Metropolis-Hastings sampling method (Metropolis et al., 1953;Hastings, 1970) described above as part of the Supplementary Material. The code covers the major synaptic dynamics models including: depression only, depression-facilitation, release-independent depression and frequency-dependent recovery. The models are described in the Appendix.

Synthetic and Experimental Data
To test the model we used both artificial and experimental data sets. Synthetic data with known parameters was generated from the synaptic-dynamics models and consisted of a series of stimulation times and stochastically determined EPSP amplitudes. For experimental data sets the data analyzed consisted of EPSP amplitudes combined with their arrival times. The data, comprising paired whole-cell patch-clamp recordings of layer-5 pyramidal neurons, was taken from a previous study (Kerr et al., 2013). Here data obtained in control conditions and in the presence of 100 µM bath-applied adenosine was used. Presynaptic cells were stimulated with square-pulse currents of 5 ms duration and magnitude sufficient to reliably induce a single action potential without causing bursting. Stimulation consisted of 10 spikes at 20 − 50 Hz with 10 s between traces ensuring sufficient time for full recovery and statistical independence for the next sweep. For each presentation of the same presynaptic stimulus the amplitudes of the individual EPSPs were extracted from the postsynaptic voltage trace using the voltage deconvolution method (Richardson and Silberberg, 2008) providing a vector of 10 EPSP amplitudes.

RESULTS
In this section we first summarize the broad class of synaptic models our methodology applies to. We then describe the nature of the computational problem involved in calculating exact correlated likelihoods. We go on to show how the probability of observing a set of numbered release events for a chain of presynaptic action potentials can be calculated using a Markovian property. By coupling this result to the miniature PSP distribution, the full likelihood for an observed PSP amplitude train is then derived in the form of a matrix multiplication. Finally, we demonstrate the method on both synthetic and experimental data, recovering the shift in synaptic dynamics caused by the neuromodulator adenosine.

Synaptic Models
We consider synaptic models that are quantal, stochastic and exhibit short-term plasticity. The synaptic-dynamics models feature n sites where a vesicle can be present for release. If a vesicle is present just before the mth pulse then it is released with probability u m . Between the mth and (m + 1)th pulses an empty vesicle site can be restocked with probability g m . Both release (given a presynaptic AP) and restock events are independent between release sites. The probabilities themselves are deterministic in that they depend on the model parameters only and can be calculated in advance if the times of the actionpotentials t m are known. This formulation encompasses a very broad range of models of short-term plasticity. When a vesicle is released, the size of the mini PSP it produces in the postsynaptic cell is modeled by a gamma-distributed random variable (see Methods). The mini PSPs induced by different vesicles are assumed to be independently identically distributed. The mean quantal amplitude is µ a and the standard deviation is σ a . In addition there is a normally-distributed background noise with standard deviation σ b that is uncorrelated with EPSP amplitude.
For illustrative purposes we focus on a model of synaptic dynamics that features depression and facilitation (Tsodyks and Markram, 1997;Fuhrmann et al., 2002), though other models for which computer code is also provided are described in Appendix A. Activity reduces synaptic efficacy through vesicle depletion; however, the build-up of Ca 2+ in a presynaptic terminal means that the probability of release given that a vesicle is present u is increased by presynaptic activity. Thus, the response to sustained activity can involve larger individual PSPs than the response to isolated spikes. Here, the model has a probability p 0 of release for an isolated pulse; immediately after an isolated presynaptic action potential the release probability increases to p 1 . The release probability u returns to its initial value p 0 with a timescale τ F . Empty release sites are restocked on a timescale of τ D . The model is fully defined in Methods and its parameters are summarized in Table 1.

The Nature of the Computational Problem
We now discuss the aim of Bayesian inference and the difficulties correlations cause in calculating the necessary quantities. We consider that the data is in the form of a set of presynaptic actionpotential times t 1 , t 2 , · · · , t M and post-synaptic amplitudes A 1 , A 2 · · · A M . The aim of the inference procedure is to calculate the probability densities of the parameters of the model θ = {n, p 0 , p 1 , τ D · · · } given the observed presynaptic action potential times {t 1 , · · · , t M } and postsynaptic amplitudes {A 1 , · · · A M }.
Bayesian inference utilizes the fact that the probability of a particular set of parameters being true, given some observed data, is proportional to the probability of observing that data given that those parameters are correct: The term L on the right-hand side is referred to as the likelihood function. A-priori calculating the likelihood appears computationally infeasible as naively it might be expected to grow exponentially with the number of observed amplitudes M. For example, consider a case with n possible release sites and a pair (M = 2 of presynaptic spikes. Then the likelihood L is given by where k m is the number of vesicles released by the mth spike. Because of the nested sums there are (n + 1) 2 additive terms in this expansion, and more generally the number of terms in the expansion grows exponentially with the number of presynaptic action potentials ∼ (n + 1) M . Written in this form it is clear that the problem becomes computationally prohibitive for long trains of presynaptic spikes and this is what makes calculation of the likelihood difficult for the complete model. The complexity arises from the quantal part of the likelihood P[k 2 , k 1 ]; the individual amplitudes A m are dependent only on the number of vesicles k m released by each action potential. Note that if correlations are ignored and the approximation P(k 2 , k 1 ) ≃ P(k 2 )P(k 1 ) made, then the likelihood factorizes and reduces to a product form that is much more computationally tractable in that only 2(n + 1) terms are required. This approach was taken by Costa et al. (2013) and combined with an additional approximation that neglected quantal synaptic components to focus on the mean effects of short-term plasticity. For the full probability density in which correlations are retained, it is not possible to factorize the likelihood into a scalar product in this way. However, we will show in the following sections that it is possible to use a Markovian property of this likelihood to factorize the calculation into a matrix product.

Joint Probability for Serial Release Events
The quantal component of the likelihood is most problematic; to illustrate the method of tractably calculating the full likelihood we will first consider the joint probability of paired release events P(k 2 , k 1 ). The generalization to a train of many presynaptic action potentials is straightforward. Note that knowing the number of release events at a particular action potential does not specify the state of the system; however, knowing the number of occupied release sites before a spike does fully specify the state of system. This is the Markovian property that makes likelihood calculation possible. We call y m the number of available vesicles present just before the mth action potential. Note that the expected value of y m , E[y m ] = n x ⊖ m , where x ⊖ m obeys Equation (2). Using this notation we can write the paired release probability in a more verbose form P(k 2 , k 1 ) = n y 2 =0 n y 1 =0 P(k 2 , y 2 , k 1 , y 1 ).
It is now possible to factorize the probability on the right-handside of the above equation. First we use the product rule to expand as follows P(k 2 , y 2 , k 1 , y 1 ) = P(k 2 , y 2 , k 1 |y 1 )P(y 1 ) (13) = P(k 2 , y 2 |k 1 , y 1 )P(k 1 |y 1 )P(y 1 ) (14) = P(k 2 |y 2 , k 1 , y 1 )P(y 2 |k 1 , y 1 ) P(k 1 |y 1 )P(y 1 ) (15) = P(k 2 |y 2 )P(y 2 |k 1 , y 1 )P(k 1 |y 1 )P(y 1 ) (16) where in the last step we have used the Markovian property of the occupancy variable. Note also that this is an iterative procedure, in which we can factorize the joint probability starting with the first action potential and then the second, that can be continued for joint probabilities that are comprised of an arbitrary number of spikes. For example, for the case of three action potentials it is only necessary to multiply the two-spike case by P(k 3 |y 3 )P(y 3 |k 2 , y 2 ) with the generalization to higher numbers of spike trains obvious. Inserting the final result in Equation (16) of this factorization into Equation (12) results in the following form for the two-spike case P(k 2 , k 1 ) = n y 2 =0 n y 1 =0 P(k 2 |y 2 ) P(y 2 |k 1 , y 1 )P(k 1 |y 1 ) P(y 1 ) where the square parentheses have been used to isolate components depending on k 2 or k 1 or neither. This form looks like an inner product and can be written in matrix-vector form (using bra-ket notation) as P(k 2 , k 1 ) = l 2 |q 1 |r 0 where l 2 | is a row vector dependent on k 2 , q 1 is an (n + 1) by (n + 1) matrix dependent on k 1 and |r 0 is a column vector that comprises the initial conditions. Typically P(y 1 ) = δ y 1 ,n so that |r 0 has one non-zero entry to indicate that the synapse is initially fully stocked with vesicles. Note also that the case of three action potentials is straightforward P(k 3 , k 2 , k 1 ) = l 3 |q 2 q 1 |r 0 with obvious generalization to higher numbers of spikes. The joint release probability can therefore be reduced to matrix multiplication. The entries of the left row vector and matrices generally comprise two forms. The first form is simply the number of release events k m chosen from the occupancy y m , using the current probability of release u m and is therefore binomial The second form gives the occupancy y m+1 given k m releases from an occupancy y m at the previous action potential. This implies that there were n − y m + k m empty release sites just after the mth pulse. We require there to be n − y m+1 empty sites just before the (m + 1)th pulse which means that y m+1 − y m + k m sites were restocked. Let g m be the restock probability of a single empty release site between time t m and t m+1 P(y m+1 |k m , y m ) = n − y m + k m y m+1 − y m + k m g y m+1 −y m +k m m (1 − g m ) n−y m+1 (21) where this quantity depends on the time between spikes for the synaptic-dynamics model (and all other common synaptic models).

Joint Probability for Serial EPSP Amplitudes
We can now use the factorized form for the serial quantal release events to calculate the full likelihood, which is the joint probability density of seeing amplitudes A 1 and A 2 given the parameter set.
The probabilities P[A 1 |k 1 ] and P[A 2 |k 2 ] for the observed amplitudes given that a certain number of vesicles were released are defined by Equation (8). The form of Equation (22) can again be interpreted as an inner product which can be written in bra-ket notation where L 2 | is a row vector dependent on A 2 , Q 1 is a matrix dependent on A 1 and |R 0 is a column vector with the initial configuration before the first action potential. This quantity is relatively straightforward to compute and, importantly, does not grow exponentially in computational complexity for higher numbers of action potentials. For example, for three spikes we have with the generalization to higher numbers of presynaptic spikes straightforward.

Inferring Quantal Parameters from Synthetic Data
The methodology just described is first applied to synthetic data to test how well the correlated likelihood function can recover quantal and dynamic parameters (Figure 1). Here the synaptic-dynamics model is used to generate sweeps of synthetic amplitude trains. For this model, the eight parameters to infer are the release site number n, initial release probability p 0 , facilitated release probability after an isolated spike p 1 , depression timescale τ D , facilitation timescale τ F , mean quantal amplitude µ a , standard deviation in quantal amplitude σ a , and standard deviation of background noise σ b . Figure 1 shows marginal posterior distributions of these eight parameters given five simulated sweeps, each of 30 regular spikes at 30 Hz. The posterior distributions reflect the true parameters well for all synaptic parameters with the exception of the facilitation timescale τ F and quantal amplitude standard deviation σ a . These parameters have been observed to be hard to estimate in previous studies, with Costa et al. (2013) finding broad distributions for τ F , and Bhumbra and Beato (2013) and Barri et al. (2016) noting similar uncertainties in their estimates of quantal variability. The correlated Bayesian method does not qualitatively change these results, but makes the best use of available data to accurately estimate the uncertainty. The posterior distributions narrow with more data, but it is also possible to change experimental protocols to improve estimates. Costa et al. (2013) note that when the stimulation process is Poisson, rather than periodic, estimates of the time constants τ D and τ F using their method are improved due to the broader range of interspike intervals. This is equally true of the correlated Bayesian method. Estimates of σ a could be improved by a very high stimulation rate that typically causes either 0 or 1 vesicles to release with each spike. Note that with typical delays between sweeps of 15 s, collecting this dataset required just over a minute of experimental time, giving a relatively sparse dataset that nevertheless still allows good estimates of the underlying synaptic parameters.
A major advantage of the Bayesian method over a maximum likelihood approach is that it can recover the full distribution of parameters. This allows determination of the covariances between different parameters. Figure 2 plots the joint posterior distributions of certain pairs of parameters (in total there are 28 possible pairs for the synaptic-dynamics model considered here). Figure 2A shows the relationship between release site number n, depression timescale τ D , initial release probability p 0 , and mean quantal amplitude µ a . The inverse relationship between estimates of n and µ a can be anticipated beause the mean EPSP size will always depend on the product of these two quantities. Note in particular that the relationship between release probability and both n and τ D has a characteristic curved shape that is not apparent from looking at the individual marginal distributions. This is even more apparent (Figure 2B) for larger values of n that can be seen in some central synapses (Loebel et al., 2009(Loebel et al., , 2013.

Experiment: Changing Synaptic Dynamics under Adenosine Application
The neuromodulator adenosine is implicated (Kerr et al., 2013) in the developmental shift from dominant depression at juvenile synapses to weak facilitation at mature synapses (Reyes and Sakmann, 1999). Adenosine acts via A1 receptors to ultimately reduce the probability of vesicle release (Dunwiddie and Fredholm, 1989). Measurement of synaptic dynamics under control conditions and then during bath-application of adenosine therefore provides a convenient experimental protocol to test the inference method. For the control case an initially depressing juvenile connection was stimulated 40 times with nine periodic presynaptic spikes at 40 and 20 Hz (see Figure 3A) followed by a recovery spike, with the postsynaptic response recorded. Adenosine (100 µM) was then bath-applied to the slice (see Methods) and the stimulus protocol repeated. Figure 3A plots individual postsynaptic voltage traces before and after the application of adenosine; Figure 3B shows the change in average EPSP size. The marginal maximumlikelihood estimates for the depression timescale τ D and mean quantal amplitude µ a are similar between the control and adenosine datasets ( Figure 3C). However, the suppressive effect of adenosine on synaptic transmission is clearly visible in the effective number of release sites n and the initial release probability p 0 that drives the shift from predominantly depressing to weakly facilitating synapses. It is also possible to examine the changes in covariance between pairs of parameters inferred from the experimental data ( Figure 3D). Considering active release sites n and initial release probability p 0 together makes particularly apparent the shift in synaptic transmission.

Comparison with Methods that Neglect Serial Correlations
Previous Bayesian inference methods have demonstrated that an uncorrelated likelihood function can accurately infer the quantal (Bhumbra and Beato, 2013) and mean dynamic (Costa et al., 2013) parameters of a synapse. It can therefore be asked under what conditions does the exact likelihood function, which accounts for correlations, provide an improvement over existing methods. Synapses with low numbers of release sites n, high release probabilities u, or long depression timescales τ D have the strongest correlations between EPSPs. High release probabilities u can arise either at strongly depressing synapses, with a high value of p 0 , or facilitating synapses where the stimulation protocol causes large values of u(t) to arise. In addition to these, at least partly, physiological factors, the correlated likelihood function is superior in conditions of sparse data. When only a few PSPs are available per sweep or, more importantly, only a few sweeps are available correlations within a spike train are relatively more important. To quantify this, we compared the full likelihood function described above with an approximated likelihood calculated by ignoring correlations (calculated using forms like Equation 11). The approximate likelihood did not account for the observed previous PSP amplitudes within a sweep, only their distribution of probabilities given by the model parameters and previous spike times. As expected, the uncorrelated likelihood function gave broader posterior distributions ( Figure 4A) with this effect diminishing as more data is added, either in the form of more EPSPs per sweep or more independent sweeps ( Figure 4B). Overall, the exact likelihood function that accounts for correlations provides superior inference on synaptic parameters. It is possible to obtain accurate constraints on synaptic parameters with only a few sweeps, meaning that experiments could capture a snapshot of synaptic properties in a small time window during protocols that change synaptic properties on timescales of tens of seconds rather than tens of minutes.

DISCUSSION
We have presented a method for exactly and efficiently calculating the probability of a given train of PSP amplitudes for dynamical synapses with the utility and robustness of the method demonstrated on synthetic and experimental data. This method, presented earlier in Bird (2016) is equivalent to that simultaneously and independently discovered by Barri et al. (2016) in their expectation-maximization approach, and represents a combination and extension of the recent work of Bhumbra and Beato (2013) on the exact likehood of isolated events and Costa et al. (2013) on the approximated likelihood of serial events. By considering quantal and dynamic properties together, the method described accounts for information that is necessarily neglected when each component is examined in isolation. The advance renders the calculation of the likelihood required for Bayesian inference practical for a variety of models of short-term synaptic plasticity. Moreover, unlike approaches that have relied on mean-variance analysis, it is applicable to single-sweep experiments and so is suitable for in-vivo scenarios where presynaptic firing is uncontrolled, but can be monitored.
The likelihood calculation that makes this inference possible is flexible and can be extended to a number of common synaptic models, allowing for examination of augmented recovery (Wang and Kaczmarek, 1998;Hosoi et al., 2007), releaseindependent depression with frequency-dependent recovery (Fuhrmann et al., 2004), and receptor desensitization (Jones and Westbrook, 1996;Otis et al., 1996). Four such models are described in Appendix A with associated computer code in the MATLAB and JULIA environments to be found in the Supplementary Material. Another natural and straightforward extension of the methodology presented here is to not assume that all sites are initially occupied but have the initial state of the system as a parameter to be inferred. This scenario is relevant for in-vivo experiments where there is no natural break in the presynaptic activity: in this case the release site occupancy and state of the dynamic release probability would be unknown.

AUTHOR CONTRIBUTIONS
AB and MR wrote the paper, AB and MR derived the equations, AB made the figures, AB wrote the MATLAB code and MR wrote the JULIA code, MW supervised experimental data collection.

FUNDING
We acknowledge funding through a Warwick Systems Biology Doctoral Training Centre fellowship to AB, funded by the UK BBSRC funding agency (BBSRC Grant No. BB/G530233/1), and funding to MR under BBSRC Grant No. BB/J015369/1.

ACKNOWLEDGMENTS
Data under adenosine was gathered by Dr. Michael Kerr. Thanks to Dr. Adam Newton (Warwick) and Dr. Peter Jedlicka (Frankfurt) for helpful comments on the manuscript. MR thanks Dr. Christophe Ladroue for useful discussions.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fncom. 2016.00116/full#supplementary-material Supplementary Material comprises MATLAB and JULIA code to run Bayesian inference using the depression (DEP), depression and facilitation (FAD), release-independent depression (RID) and release-independent depression with frequency-dependent recovery (FDR) synaptic models described, in the Methods and Appendix A.
where u ⊕ m = u m − u m p 0 −p 1 p 0 is the release probability immediately following the mth spike.

APPENDIX B THE LIKELIHOOD CONVOLUTION INTEGRAL
The convolution integral for the amplitude distribution Equation (8) is computed a large number of times. There are two difficulties in doing this efficiently: (i) evaluating a gamma function for large shape parameters and (ii) finding reasonable bounds for the range of integration.

Evaluating Gamma Functions
When µ a ≪ σ a the argument of the gamma function in the denominator of Equation (8) can grow very large in order to normalize the distribution. To avoid issues with this, we note that Stirling's approximation allows evaluation of the gamma function with large arguments For small values of n the function κ(n) can be evaluated exactly, whereas for larger arguments the second, approximate form is used.

Bounding the Range of Integration
The second difficulty involves finding bounds for the range of integration in Equation (8). Introducing γ = β − 1 and y * = γ /λ, and substituting κ(γ ) for the gamma function, Equation (8) can be rewritten as which is in a form that is straightforward to integrate numerically, once reasonable integration bounds are identified.
To this end we note that by completing the squares in the exponential and introducing z = y/σ b the integrand (ignoring constant prefactors) can be written as where t = (A/σ b − λσ b )/2. The integrand has a single positive maximum z * = t + t 2 + γ (note that γ = β − 1 > 0 because the assumption is made that µ a > σ a , see Methods). Our method then is to perform the y integral over an equivalent range of z for which f (z) > f (z * )e −x where x is some sufficiently large power (typically in the range 10-15) with the boundaries of the z range efficiently found using a Newton-Raphson method.