Sensorless wavefront correction in two-photon microscopy across different turbidity scales

Adaptive optics (AO) is a powerful tool to increase the imaging depth of multiphoton scanning microscopes. For highly scattering tissues, sensorless wavefront correction techniques exhibit robust performance and present a straight-forward implementation of AO. However, for many applications such as live-tissue imaging, the speed of aberration correction remains a critical bottleneck. Dynamic Adaptive Scattering compensation Holography (DASH) -- a fast-converging sensorless AO technique introduced recently for scatter compensation in nonlinear scanning microscopy -- addresses this issue. DASH has been targeted at highly turbid media, but to-date it has remained an open question how it performs for mild turbidity, where limitations imposed by phase-only wavefront shaping are expected to impede its convergence. In this work, we study the performance of DASH across different turbidity regimes, in simulation as well as experiments. We further provide a direct comparison between DASH and a novel, modified version of the Continuous Sequential Algorithm (CSA) which we call Amplified CSA (a-CSA).


I. INTRODUCTION
Dynamic wavefront correction is a powerful approach for extending the imaging depth of nonlinear microscopy in scattering tissues such as the mouse brain. However, to make the benefits of this technology accessible to an even broader range of applications, some remaining limitations need to be overcome. A particular problem is posed by the speed at which an aberration-compensating pattern can be retrieved, as imaging into live specimens requires to outpace the decorrelation time imposed by the constantly changing tissue. To this end, we have recently introduced Dynamic Adaptive Scattering compensation Holography (DASH), a fastconverging indirect wavefront sensing algorithm for scatter correction in nonlinear scanning microscopy [1]. DASH employs a programmable phase-diffractive element to shape two beams simultaneously: a test beam, whose wavefront is varied in each step to explore possible signal improvements, and a corrected beam, whose wavefront is continuously improved using the information gained from interference with the test beam. DASH employs phase-only wavefront shaping, which bears the advantage of power efficiency. On the other hand, discarding amplitude information introduces errors in the resulting wavefronts. In this work we investigate the impact of these errors on the correction performance of DASH for different regimes of turbidity. Furthermore, we compare DASH to an alternative approach, which does not suffer from the "phase-only" restriction: a modified version of the Continuous Sequential Algorithm (CSA) [2], which follows a pixel-by-pixel testing approach. Our modification to CSA consists in amplifying the intensity of the tested pixel relative to all other pixels, which increases * Correspondence should be addressed to maximilian.sohmen@i-med.ac.at signal contrast. This amplification allows the application of CSA in situations with low signal-to-noise ratio (SNR), as often encountered in practice. We refer to our modified CSA algorithm as Amplified Continuous Sequential Algorithm, or a-CSA. This work is structured as follows: In Section II we discuss how turbidity can be quantified, how a scattering medium of specific turbidity can be modelled numerically, and how in a two-photon imaging experiment a scattering volume of 'tunable' mean free path can be emulated by a phase-mask displayed on a spatial light modulator (SLM). In Section III, the principles behind DASH and a-CSA are explained. In Section IV, we detail the implementation of a-CSA and DASH in our numerical simulations (Section IV B), our experiment (Sections IV C), and present a comparison of algorithm performance for two-photon excited fluorescence (TPEF) imaging of a homogeneous 'dye-slide' sample in different regimes of turbidity (Section IV D) as well as mouse brain (Section IV E). In Section IV F, implications of a low SNR on algorithm performance are discussed.

II. QUANTIFYING TURBIDITY
It is essential for the present work to define what we mean when speaking of "low" or "high" turbidity. The scattering properties of materials and tissues are often quantified using the scattering mean free path l s , i.e., the expectation value of a photon's free travelling path before it is scattered. This is mirrored in the Beer-Lambert law, |U 0 (L)| 2 = |U 0 (0)| 2 e −L/ls , where |U 0 (L)| 2 represents the intensity of the unscattered ('ballistic') light after travelling (under free-space propagation) to distance L, and |U 0 (0)| 2 the incident light intensity. The transport mean free path l t takes scattering anisotropy into account: l t = l s /(1 − g), where g = ⟨cos θ⟩ is the expectation value of the cosine of the scattering angle θ. For instance, in a material which predominantly scatters into the forward arXiv:2202.12727v5 [physics.optics] 4 Aug 2023 direction (causing small scattering angles), l t is much larger than l s . Conversely, in an isotropic scatterer l t = l s . Typical values of l s for brain tissue range between a few tens to hundreds of micrometers [3][4][5].
Our goal is to model the effect of a (in general threedimensional, 3D) scattering medium on a light field propagating in positive z-direction by a two-dimensional (2D) phase mask, located at axial position z = z scat , with transmission function exp (iΦ(ρ)). Here, Φ(ρ) denotes the scattering-related phase shifts experienced by a field point at the 2D lateral coordinate ρ. The field after the phase mask is denoted by U scat (ρ). Note that this is the full field, not just a 'scattered field' amplitude. Assuming that the phase mask is suitably chosen to describe a medium with predominantly forward scattering and without absorption, we choose the normalisation The ballistic contribution at depth L inside the medium -emulated by the phase mask on the SLM -can be calculated using the overlap integral (OI) i.e., the 'projection' of the field with imprinted phase mask onto the unscattered (incident) field. This equality (3) is most intuitive if the integral is evaluated in the plane of the 2D scattering mask, but for freely propagated fields the OI in fact stays constant in all transverse planes at z ≥ z scat . Using the Lambert-Beer law, the OI can also be written as l s appears here, since every single scattering event reduces the ballistic contribution. Note that this relation (4) implicitly assumes that cases of successive scattering events which exactly compensate each other (thus, re-populating the forward-directed incident field, i.e., contributing to the OI and -erroneously -to the estimated ballistic part) are statistically unlikely and can be ignored. Combining Eqs. 1-2 we can quantify a computed phase mask in terms of the corresponding "thickness" expressed in units of the scattering mean free path l s : 1 1 We note that the relation (5) is consistent with the considerations made in Ref. [6] (see Eq. 4 therein), which lead to the derivation of the scattering-phase theorem.
For the case of dominant forward scattering and negligible absorption, this relation allows us to compute a 2D phase mask Φ(ρ) that leads to a speckle pattern in the object plane which is in many ways similar to that of a voluminous 3D scatter medium of the same scattering mean free path l s . In the experiments described later in this work, we will exploit this fact to simulate different regimes of turbidity by displaying computed 2D scatter masks of specific l s on an SLM. Of course the equivalence between a 3D and a 2D scatterer -even if they exhibit the same l s -does not encompass all physical properties; for instance, the isoplanatic patch (i.e., the 'corrected field of view') obtained through an AO wavefront correction will be smaller for a 3D than for a 2D scatterer. However, concerning the aspects studied in this work (e.g., the algorithm convergence at a single field point), a 3D and a 2D scatterer of same l s can be regarded as equivalent.
We denote the RMS value of a scattering phase mask by a scat (see Algorithm 4, Supplementary Material). If the phase values of the mask are normal-distributed or, for any distribution, if a scat is sufficiently small [6], the relation between the scatterer thickness and a scat is simply L/l s = √ a scat .

III. APPROACHES FOR SENSORLESS AO IN NONLINEAR MICROSCOPY
Most indirect (or sensorless) AO schemes construct an aberration compensation phase mask from measurements of the TPEF signal for many different test patterns or "modes" M n displayed on an SLM, n ∈ {1, . . . , N } denoting the mode index. These test patterns can be predefined, for instance as a set of Zernike [7] or Hadamard modes [8,9], or directly derived from the signal of previous test patterns, such as in genetic algorithms [10]. The phase patterns are commonly imaged into the back focal plane (BFP) of the objective lens, where -by the Fourier transform property of the lens -the test patterns have a homogeneous effect on the point-spread function (PSF) over the entire focal plane. However, there exist also sample-conjugated schemes, where the SLM is imaged directly into the sample space [11][12][13][14].
A particular case is the "traditional adaptive optics" regime, where the given phase aberration Φ(ρ) can be approximated by a weighted sum of the available test modes, Φ(ρ) = N n=1 a n M n (ρ), and each a n has a rather small phase magnitude (up to about 1 rad RMS over the pupil). Here, ρ is the 2D coordinate vector in the BFP. In this case, the TPEF signal S generated in the focus has a smooth dependence on the mode magnitudes a n , allowing to approximate it by a simple function such as a multi-dimensional Lorentzian or parabola, e.g., S ∝ 1− n m α nm a n a m . In many cases, the cross-talk matrix α can be diagonalized by choice of an adequate mode basis [15]. Then, N + 1 measurements can be sufficient to characterize the phase aberration [16], although usually 2N + 1 or even 3N measurements are taken. It is important to note that this particular case does not necessarily coincide with low turbidity (i.e., a small value of L/l s ), since a large number of modes, even if their individual magnitudes are small, can still sum up to a large total aberration. Outside the traditional regime, in what is often called the "scattering" regime, it is usually required to take many more measurements to find a suitable corrective phase mask. Typically, in the scattering regime the complexity of the aberration is beyond the capabilities of the correction device, meaning that the number of scattering modes is larger than the number of correctable ones. Additional reasons which can hinder full aberration compensation include an SNR too small to measure the contributions of less significant modes, or simply lack of time to measure all of the (many) scattering modes. Nevertheless, it has been shown that even correcting only a limited number of scattering modes can suffice to restore a diffractionlimited focus, which in this context can be understood as an intensity-enhanced speckle [17]. There exist several approaches for sensorless AO which can operate in the scattering regime [1,2,10,[18][19][20][21]. Two of them, which we have identified as fast converging in numerical simulations, are studied in this work and outlined in Fig. 1.
The first approach, known as continuous sequential algorithm (CSA) [2], operates on a single-pixel basis: sequentially, each pixel's phase is adjusted to maximize the signal. Although a basis of single pixels appears advanta-geous due to its intrinsic orthogonality, the approach is usually impractical because the interference contrast from single-pixel phase modulation is typically buried in noise. This inspired alternative approaches, such as hybrid methods using larger "superpixels", each of them featuring an internal spatial phase pattern [20,22]. However, we point out that the SNR problem of CSA can also be remedied by simply amplifying the intensity in the test superpixel, as this increases the interference contrast. This variant, which we call a-CSA ("amplified CSA"), requires the laser power on each superpixel to be controllable. Then, as we show in Section IV C, a-CSA can work in experiments where only a comparably small number of modes needs to be corrected.
The mode basis of the second approach, DASH, is a set of plane waves which are, in a sense, Fourier-related to the single-pixel basis. Importantly, for this approach the SNR is uncritical, as the power fraction contained in each plane wave can be adapted directly via the SLM hologram. This makes this method more practical than CSA. However, to shape test beam and corrected beam without introducing artefacts, the SLM would be required to manipulate both, the spatial phase and amplitude distribution of the incident laser beam, just as for a-CSA. In DASH, the amplitude part is disregarded and only the phase part is included in the hologram, ensuring high power efficiency but inevitably leading to errors in the generated wavefronts and undesired diffraction orders. An example of the effects of phase-only shaping is given in  spots with random phases we wish to create in the focal plane using an SLM in the Fourier plane. Discarding the amplitude part of the synthetic hologram (a superposition of the corresponding 256 plane waves) leads to the image on the right. The standard deviation of the phases at the target sites is about 0.25 rad, corresponding to about 4% of the wavelength, and the relative amplitude error at the target sites is about 20%. Additionally, we see that weak "ghost spots" appear outside the target square.
In line with our earlier works [1,14], we will in the following for both a-CSA and DASH denote by f the fraction of the total pupil intensity which is contained in the test mode.

IV. COMPARISON OF ALGORITHM PERFORMANCE
In this section we compare the performance of DASH and a-CSA over different scales of turbidity in numerical simulations and experiment.

A. General recipe
The procedure for our systematic comparison between DASH and a-CSA is as follows.
1. Both algorithms are initialised with an empty correction mask.
2. We calculate a scattering phase mask that emulates a certain degree of turbidity and is kept constant throughout each algorithm run. In the simulations, this scattering mask is simply added to the BFPconjugate wavefront during image propagation; in the experiment, it is displayed on our SLM.
3. We imprint a test mode on the excitation beam and carry out a phase-stepping procedure, i.e., measure the TPEF signal generated in the object plane for P different global phase offsets ϕ p = 2π p/P (with p = 1 . . . P ) applied to the test mode. 2 From the phasestepping procedure, we extract the parameters for which the test mode maximizes the TPEF signal.
4. We directly update our correction mask by including the tested mode with the retrieved optimal parameters. In the simulations, this correction mask is added to the wavefront incident on the scatterer; in the dye-slide experiment, it is displayed on our SLM, on top of the scattering mask. 5. We now go back to step 3, test the next mode, update the correction mask, and so forth, until the correction mask contains the full number of test modes with their optimal contributions.
6. Once the full set of modes has been tested, we can start again from the first test mode for another correction run. We typically run 3 full iterations, which has shown to ensure algorithm convergence in most cases.
Detailed algorithm descriptions are provided in the Supplementary Material. 3

B. Numerical model
In our numerical model, we consider aberrations defined on a square grid of 64×64 pixels accounting for N 2 scat = 4096 scattering modes. The aberration phase masks are calculated by adding cosines of all spatial frequencies supported by the grid size with uniformly distributed random phases. The amplitudes of the cosines follow a Gaussian weighting with standard deviation σ, such that cosines with higher spatial frequencies are increasingly attenuated, as expected for most scattering scenarios found in nature. By varying σ we can hence tune the spatial frequency content of the model scatterer. For details on our specific implementation of calculating the scatter mask, the reader is referred to our Supplementary Material (Algorithm 4).
The simulated SLM for (square) correction patterns features 32×32 pixels. Both the scattering mask as well as the simulated SLM are located in the BFP, in Fourier relation to the object plane, and have a side length measuring 2k 0 NA, where k 0 is the vacuum wavevector of light and NA = 0.7 the numerical aperture. Each set of modes is tested repeatedly for 3 iterations. In our experiments, the SLM illumination does not feature an ideal, flat-top intensity profile, but exhibits a (weak) Gaussian shape of e −1/2 distance approximately equal to the pupil radius.
To reflect the experimental situation, also in our simulations we assume a light intensity distribution across the BFP which is symmetrically Gaussian and of corresponding width. The fluorescent sample is assumed to be a homogeneous, absorption-free fluorescent layer of d s = 10 µm thickness. 4 We model this sample volume by grid points on N s distinct 2D planes with axial interspaces of d s /(N s − 1), and the nominal focus plane located at its center. Normally, we set N s = 6 layers (interspaced by 2 µm), since a larger number of planes does not notably improve the accuracy of results and we have checked that the exact choice of N s is uncritical for our conclusions. To simulate the PMT counts, we simply propagate the light field into the N s planes, calculate the squared intensity on each grid point, and take the sum over all points.
In Section IV D, we show simulation results at high-SNR conditions, for which photon shot noise can be neglected. For scenarios with low SNR, as discussed in Section IV F, we simulate shot noise by varying the summed PMT counts according to Poisson statistics.

C. Experimental procedure
We compare DASH and a-CSA experimentally in a home-built TPEF microscope featuring a phase-only SLM (Meadowlark HSP1920-500-1200-HSP8, 1920×1152 pixels of side length 9.2 µm) located in a BFP-conjugate plane of the excitation path. This SLM serves two purposes in parallel: First, by displaying a 'scattering' phase mask of defined scattering mean free path l s (see Section II) it allows to emulate the effect of a scattering medium in the light path. Second, by running our sensorless wavefront correction schemes and displaying the retrieved phase compensation patterns on top of the given scattering mask, we can test and compare the algorithm performances. A sketch of the setup is shown in Fig. 3. For excitation, we use a mode-locked Ti:sapphire laser (MaiTai DeepSee, Spectra Physics) with emission maximum set to 800 nm wavelength. The epi-TPEF is collected by a water-dipping objective (Olympus XLUMPLFLN20XW, NA = 1) and directed towards a photomultiplier tube (PMT, Hamamatsu H10769A-40) using a dichroic beam splitter (AHF Analysentechnik, HC665LP) and an additional emission filter in front of the PMT (AHF, ET680SP-2P8). In our measurements, to operate with a more homogeneous pupil illumination, we artificially reduce the NA to about 0.7 using an aperture in a pupil-conjugate plane (not shown in Fig. 3). 4 NB that a thin 3D fluorescent sample layer as assumed here leads to smaller signal enhancements and slower algorithm convergence than an (infinitely thin) 2D layer. In the limit of a homogeneous 3D sample volume with thickness much larger than the Rayleigh range of the focused beam, the second-order nonlinearity of TPEF imaging is insufficient for providing any signal enhancement at all [23].
For our systematic comparison, the sample consists of a thin layer of rhodamine solution sandwiched between a glass slide and a coverslip of 1 mm and 170 µm thickness, respectively. Before each algorithm run, we perform an initial precorrection run to compensate for optical-system wavefront distortions such as the spherical aberrations introduced by the coverslip. Starting from this precorrection, the wavefront correction algorithms afterwards only have to correct for the artificial scatterer displayed on our SLM, ensuring compatibility with the numerically simulated correction runs.
For DASH, we typically set the intensity in the test mode to 30% of the total intensity (i.e., f = 0.3, see Supplementary Material). For a-CSA, we amplify the relative power in the test-superpixel by superposing all other a-CSA-superpixels (consisting of many physical SLM pixels) with a blazed grating of defined modulation depth [24]. Thus, for all except the test-superpixel, power is diffracted away from the optical axis and cut by an aperture as indicated in Fig. 3. It is important that the mean phase of the blazed grating is nought to prevent an effect on the zero-order beam, which carries the phase mask. 5 Our blazed gratings have a period of 4 SLMpixels and a modulation depth of approximately π rad, resulting in (1−β) ∼ 50% of the laser power being dumped (measured value) and a power fraction f ≈ 1/(βN 2 ) in the case of N ×N superpixels. Naturally, dumping power on the iris also decreases the total TPEF signal, wherefore we have to compensate by increasing the total laser power. Since in practice the available power is necessarily finite, this approach for amplitude modulation is only applicable in regimes of turbidity which do not require too many (i.e., too small) a-CSA-superpixels.
For compatibility, we conduct our experiments with the same number of test modes as in the simulations for the different regimes, i.e. 16, 64, and 256 modes for low, medium, and high turbidity, respectively. Our SLM holograms measure 560×560 SLM-pixels and entirely fill the reduced objective pupil. For a-CSA this means that we form 4×4 square superpixels (each 140×140 SLM-pixels) for the low turbidity, 8×8 superpixels (each 70×70 SLMpixels) for the medium turbidity, and 16×16 superpixels (each 35×35 SLM-pixels) for the high turbidity scenario. Analogous to the simulation, DASH and a-CSA are executed for three full iterations. The results are summarized in Fig. 4.

D. Results systematic comparison
For our systematic comparison, we study three different scenarios (A, B, C) concerned with increasing levels of turbidity. In Scenario A we study low turbidity, with an effective scatterer thickness of a single scattering mean free path, L/l s = 1, and a spatial frequency distribution of the scatterer chosen accordingly narrow, σ = 1. An example of such a scatter mask is shown on the top left of Fig. 4 (A). In the focal plane, such a mild scatterer typically leads to an intensity distribution which is still spatially confined and only moderately deviates from the aberration-free focus, as shown in Fig. 4 (A), bottom left. Here, the intensity scale is normalized to the peak intensity of the aberration-free focus, such that the maximum value (typically around 0.4 for Scenario A) equals the respective Strehl ratio. In this low-turbidity case it is expected that only a small number of modes is needed for adequate compensation, wherefore we correct 4×4 = 16 modes. The plots in the center and right column of Fig. 4 (A) show the TPEF signal enhancement simulated numerically (center) and measured experimentally (right), respectively, for a-CSA (blue) and DASH (orange). Specifically, we plot the TPEF signal measured when the established compensation pattern is applied at the respective measurement number. 6 In the numerical simulations [ Fig. 4 (A), center], we observe that for a-CSA the signal initially increases fast, but then flattens off from the second iteration onwards. DASH, in contrast, exhibits a less monotonic signal evolution than a-CSA, first increasing more slowly, but ultimately surpassing the final signal level of a-CSA. We attribute the signal flattening and lesser total performance of a-CSA to the fact that its compensation patterns are inherently displayed at the native resolution of the test-mode basis (i.e., 4×4 superpixels in Scenario A) and therfore do not exploit the full resolution supported by the SLM (32×32 pixels in the simulation). This coarser spatial discretization leads to increased diffraction losses compared to DASH (which always exploits the full SLM resolution, regardless of the number of tested modes) during the course of the optimisation. Additionally, we observe that for DASH the steepest changes in signal usually occur at the beginning of each iteration. This is due to the fact that we intentionally sort our test modes in ascending order with regard to the angle between their propagation direction and the optical axis. Since small scattering angles are typically more dominant in fluorescence imaging settings, this tends to speed up the algorithm convergence. Both methods achieve comparable final Strehl ratios around 0.75 and 0.85 for a-CSA and DASH, respectively (see Supplemental Material). Our experimental measurements [ Fig. 4 (A), right] show the same general trends. When comparing absolute values of signal enhancement, it is important to keep in mind the critical dependence on the sample thickness: in our simulations, e.g., we observe for increasing the thickness as 0 → 4 µm → 10 µm (at constant fluorophore density) a tendency of decreasing final enhancements of 2.14 → 1.32 → 1.29 in case of a-CSA and 2.9 → 1.8 → 1.5 in case of DASH. Given that in our experiments the sample layer thickness is controllable and measurable only with limited accuracy, it seems fair to claim good agreement between the observed values in simulation and experiment.
In Scenario B, we assume medium turbidity with L/l s = 3 and an intermediate contribution of modes of higher spatial frequency, σ = 3. We increase the number of correctable modes to 64 to account for this greater spatial frequency content. Before correction, the typical Strehl ratios in this scenario are on the order of 5% [ Fig. 4 (B), bottom left]. The plots in the center and right column show how the TPEF signal improves during correction in simulation (center) and experiment (right). Here, solid curves represents the mean value over 5 repeated runs in the simulation or experiment, each initialised with a different random scatterer, and the ribbons represent the respective standard error of the mean. Kinks in the experimental a-CSA curves (blue asterisks) are caused by the circular, underfilled pupil (see main text). Additional information, such as retrieved SLM compensation masks and resulting focal spots, is provided in the Supplementary Material. Fig. 4 (B), center] indicate that again, a-CSA initially improves faster than DASH, but levels off at a lower final signal. The total signal enhancement achievable by both correction algorithms is higher than for the low-turbidity case; the final Strehl ratios are around 0.4 for a-CSA and 0.6 for DASH. We suspect that the initial delay of DASH in comparison to a-CSA is related to the appearance of higher diffraction orders ("ghost foci" as shown in Fig. 2) stemming from the phase-only field shaping of DASH. This is one of several reasons which lead us to believe that the use of a complex field-shaping technique might bear great potential for improvement of algorithm performance in the future. In our experimental measurements [ Fig. 4 (B), right], we observe similar behaviour. Initially, a-CSA improves faster than DASH, but ultimately DASH delivers higher signal enhancement. For a-CSA, we observe a kink in the signal enhancement at the transition between two iterations (marked by the blue asterisk in Fig. 4), which is not present in the simulations. We attribute this kink to the circular pupil in our experiment which cuts power from superpixels located near the corners of the square SLM pattern. Since we step through the a-CSA superpixels in order of their distance to the pupil center, this effect is most dominant towards the end of each iteration.

Our numerical simulations [
In Scenario C, we assume high turbidity, with L/l s = 5 and σ = 5, where without correction typical Strehl ratios are on the order of 1%. In this scenario, we correct 256 modes.
In the numerical simulations [ Fig. 4 (C), center], we observe that the initial speed advantage of a-CSA compared to DASH decreases, but the final enhancements achievable with both algorithms are even higher than in Scenario B. Comparing between the two algorithms, our simulations again deliver a better performance for DASH than for a-CSA. Final Strehl ratios are around 0.35 for a-CSA and 0.45 for DASH. These trends are well supported by our experimental data [ Fig. 4 (C), right].
Graphical animations of our simulated correction runs are provided as GIF files (see Supplementary Material). These animations show the phase patterns displayed on the SLM during the correction algorithm as well as the evolution of the focal plane intensity distribution.

E. Experimental comparison for a biological sample
Both DASH and a-CSA can offer striking quality improvements for imaging through turbid media, such as in microscopy of layers deep inside tissue. To demonstrate this, we use our TPEF scanning microscope to image microglia expressing green fluorescent protein (CX 3 Cr-1 GFP , cf. Ref. [1]), located 200 µm deep inside the corpus striatum of a mouse brain slice fixed via perfusion. For this, we adjust our wavelength to the excitation maximum at 900 nm, retrieve a precorrection for optical system aberrations by focussing on microglia directly below the coverslip, and subsequently move the objective focus mechanically 200 µm down into the brain tissue. Figure 5 (A) shows an example image of a microglia cell in the deep layer, recorded with only the precorrection applied. Light scattering in the brain tissue above leads to rather low contrast between structures in Fig. 5 (A), as illustrated by the plot of signal intensity along the black dashed line, shown in the inset. Figures 5 (B, C, D) show the same microglia cell after running one out of three AO correction algorithms, respectively, each for 3 iterations of 256 modes. Figure 5 (B) shows the cell after application of the correction mask shown the upper left inset, which has been obtained by performing a-CSA with power-overhead in the test mode (f ∼ 2/256) on top of the precorrection. The improvement in signal is on the order of a factor of 2 to 3 across the cell body, and processes extending from the cell into its surrounding are starting to become visible, especially when a logarithmic color map is applied (upper right inset). In Fig. 5 (C), the cell is shown after performing regular CSA (i.e., without power overhead in the test mode, f = 1/256) on top of the precorrection. The measured modulation SNR was insufficent for algorithm convergence; as is apparent, the signal quality did not improve compared to the precorrection alone [ Fig. 5 (A)], or even became slightly worse. Figure 5 (D) shows the cell after performing DASH (3 iterations of 256 modes, f = 0.3) on top of the precorrection. The signal intensity across the cell body is increased by a factor of about 5; the contrast between structures is clearly enhanced, allowing to distinguish processes extending from the cell body into the surrounding tissue.
Note that for taking the images of Fig. 5 we have started from the precorrection of Fig. 5 (A) simply for illustration purposes, to disentangle the correction of (mild) optical system aberrations from the correction of scattering inside the brain tissue, and for compatibility with the dye-slide simulations and experiments. This, however, is not an experimental necessity; typically, images of a similar quality as shown in Fig. 5 (B, D) can also be obtained by performing DASH or a-CSA without any precorrection.

F. Low-SNR scenarios
The practicality of every sensorless wavefront correction scheme depends on its robustness with respect to noise. It has already been shown that DASH performs well in this regard in comparison to alternative methods [1]. Let us now compare performances at low-SNR conditions. The key for all such methods is being able to discern the TPEF signal modulation -caused by phase-stepping of the test beam -on top of the noise floor, since this modulation carries the relevant information. For this, the test mode must contain a significant percentage of the total light power.
In the following, we use numerical simulations to compare which SNR each of the three methods DASH, 7 a-CSA, and regular CSA 8 requires to operate successfully. To this aim, we simulate shot noise in the PMT readout and then repeatedly execute the methods starting from a decreasing initial signal level. We stop repeating once the signal enhancement η achieved by a method (after 3 full iterations) drops below a certain threshold η th . We set this threshold to η th = 0.75 (η max − 1) + 1, where η max is the enhancement for the noise-free case (i.e., simulations as in Section IV D). The exact choice of the threshold prefactor (0.75 in our case) is largely arbitrary and uncritical for our conclusions. We assume the same sample and investigate the same three turbidity scenarios A, B, C as discussed above.
Our simulations show that for weak scattering (A) and correction of N 2 = 16 modes, DASH signal enhancement crosses threshold when the photon level (before correction) exceeds around 100 photons. a-CSA delivers comparable performance when the test-superpixel contains about 10 times more laser power than any of the other superpixels, corresponding to around 40% of the total power. Regular CSA only crosses threshold for signal levels from around 1000 detected photons onwards, i.e., about 10 times higher than required for DASH. For medium turbidity (B) and N 2 = 64 modes, DASH crosses threshold at about 100 photons collected per measurement (before correction). Comparable performance can be achieved using a-CSA if the test-superpixel amplification factor is around 50, again representing roughly 40% of the total laser power. Regular CSA crosses threshold at around 3 k photons. Finally, for high turbidity (C) and N 2 = 256 modes, DASH crosses threshold at around 400 counted photons (before correction). Again, comparable performance can be achieved using a-CSA with a test-superpixel amplification factor around 50. Regular CSA, in contrast, requires about 15 k photons.
These results are summarized in Table I. Of course, in practice the required signal levels will also depend on the nature of the sample, wherefore these numbers can only serve as a rough orientation. For example, increasing the sample thickness will make it increasingly difficult to obtain a successful correction.
Nevertheless, from our results we can draw three main conclusions. First, as expected, regular CSA requires a much higher SNR than DASH to operate successfully. Achieving a higher SNR requires sending more optical power into the sample volume, making regular CSA unfavourable, e.g., for imaging of fragile specimens. Second, it is possible to successfully operate a-CSA at the same SNR conditions as DASH, if a sufficient amplification of the test-superpixel can be provided. However, it needs to be stressed that if the amplification is realized using a superpixel method (as in Section IV C), discarding light from other pixels, this high performance of a-CSA comes at the price of wasting much optical power. For instance, if the hologram features N ×N superpixels and the testsuperpixel is supposed to contain a fraction f of the total pupil intensity, the incident laser power must be increased by a factor of (N 2 − 1)f /(1 − f ) to keep the total optical power in the objective pupil constant. Even for a moderate number of modes of 64, i.e., N = 8, achieving f = 0.3 requires a total power increase by a factor of 27. This steep scaling with N in practice quickly becomes unfavorable for application of a-CSA in deep tissue imaging, where typically many correctable modes are needed. Third, for DASH, in contrast, such problems do not arise, since the power in the test mode is independent of N and can be tuned conveniently through the depth of the corresponding grating on the SLM.

V. DISCUSSION AND SUMMARY
In this work, we have devised a method to compute a 2D phase mask whose effect on a light field is in many ways equivalent to that of a voluminous scatter medium of corresponding scattering mean free path. This enables software-controlled systematic studies where the degree of scattering is controlled precisely, tunable over a large range, and manual exchange of physical scatter materials is not needed.
We have used this method to investigate the performance of DASH for different scales of turbidity, ranging from the "traditional AO" regime to the regime of multiply-scattered photons, and compared it to an alternative approach which we call Amplified Continuous Sequential Algorithm (a-CSA). a-CSA operates on a square-superpixel mode basis and is shown to work also in low-SNR situations (unlike regular CSA) due to an increased relative power in the test mode. In practice, this amplification of the test superpixel can be achieved by attenuating all other superpixels, e.g., by adding phase gratings of defined modulation amplitude. While easy to implement, this approach works only for a limited number of correctable modes due to its inefficient use of light power, and only for a-CSA superpixels consisting of a sufficient number of physical SLM pixels.
We found in both numerical simulations and experiments that for low turbidity DASH initially improves more slowly than a-CSA, pointing to effects of the phase-only light shaping principle behind DASH. This initial-speed disadvantage of DASH disappears for increasing turbidity. Furthermore, we observed that independent of the degree of turbidity, from the end of the second iteration onwards, DASH outperforms a-CSA. We attribute this performance advantage to the fact that DASH can always exploit the full resolution of the SLM, hence minimizes diffraction losses compared to superpixel methods such as a-CSA. As an overall tendency, we observed that the signal enhancement achievable using DASH or a-CSA grows for increasing degrees of turbidity, highlighting the potential benefits these algorithms promise for nonlinear imaging in highly scattering environments.
We have illustrated the practical improvements these methods can yield for two-photon microscopy by the example of GFP-microglia 200 µm deep inside mouse brain tissue. Furthermore, we emphasize that a-CSA and DASH were performed using identical hardware on the same experimental setup. Therefore, depending on the task, it is possible to execute either (or a combination) of both routines on a pure software level.
A critical challenge for wavefront correction in highly scattering samples remains the size of the corrected field of view ("isoplanatic patch", IP). For fixed brain tissue, as shown in Fig. 5, IP diameters are on the order of 20 to 30 µm, whereas live brain tissue tends to scatter photons at larger angles, decreasing the IP size to just a few µm [14]. Several strategies have been proposed to increase IPs, including multi-conjugate AO [25][26][27] or the application of individual corrections for many sample points [13,14,21]. Whether a-CSA and DASH may benefit from such strategies in terms of the IP size is an interesting question for future studies.

Algorithms for indirect wavefront sensing
In this Section, we provide the algorithms for DASH and a-CSA in pseudo-code. Algorithm 1 describes general steps that both approaches have in common. The function wavefront sensing takes as inputs a set of test modes M and a correction pattern C, initialized by arrays of ones. The algorithm loops I times through the entire set of N modes. For each mode M n , the phase stepping procedure phase step is executed, which returns an array of complex numbers U SLM which carries the amplitude and phase pattern for a particular phase step of the excitation beam. measure signal measures a single TPEF signal S p for the pattern U SLM displayed on the SLM corresponding to a particular phase step p = 1, . . . , P . From the set of P TPEF measurements per mode the complex-valued coefficient a n is calculated, containing magnitude and phase estimates of the mode M n . Finally, the correction beam C is updated by the function update using the new information (a n , M n ), improving the wavefront correction.
DASH and a-CSA differ in how U SLM and the corrected beam part C are updated. The respective sub-functions are outlined in algorithm 2 for a-CSA and in algorithm 3 for DASH.
The algorithms contain a few variables that are not specifically declared as inputs and can be viewed as "global", for instance the number of iterations (I), modes (N ) or phase steps (P ). A particular variable that needs explanation is the scalar value f , which ranges between 0 and 1 and determines the power fraction contained in the test mode.
Compared to our earlier implementation of DASH [1], the Algorithm 3 outlined below features a minor modification: the normalization of the corrected beam (|C i,n | in Eq. 1 of Ref. [1]) which erases the amplitude information from the corrected field C i,n is replaced by a different scalar normalization factor, which can lead to slightly better performance.

Construction of scatter mask in the numerical simulation and experiment
Algorithm 4 describes the procedure of constructing the scattering phase mask displayed at the SLM. We denote the RMS value of a scattering phase mask by a scat . If the phase values of the mask are normal-distributed or, for any distribution, if a scat is sufficiently small [6], the relation between the scatterer thickness and a scat is simply L/l s = √ a scat .
The function make scatterer takes the following inputs: the desired spatial frequency content characterized by σ, the side length of the phase mask N scat , and the desired RMS phase magnitude of the scatterer a scat . The pixel indices are given by x and y. The function rand ALG. 1: General algorithm Inputs: M , a list of N 2D real-valued input modes of size NSLM×NSLM; C, a 2D array of size NSLM×NSLM, initialized with ones.
procedure wavefront sensing(M , C) for i = 1 to I do for n = 1 to N do for p = 1 to P do USLM ← phase step(C, Mn, p) Sp ← measure signal(USLM) end for an ← 1 P P p=1 Sp exp(−i 2π P p) C ← update(C, Mn, an) end for end for return C end procedure