Deep learning-based quantitative phase microscopy

Quantitative phase microscopy (QPM) is a powerful tool for label-free and noninvasive imaging of transparent specimens. In this paper, we propose a novel QPM approach that uses deep learning to accurately reconstruct the phase image of a transparent specimen from a defocused bright-field image. A U-net based model is used to learn the mapping from the defocused intensity image to the phase distribution of a sample. Off-axis holograms and defocused bright-field images are recorded in pairs for thousands of virtual samples generated with a spatial light modulator. After the network is trained with this data set, it can quickly and accurately reconstruct the phase information from a single defocused bright-field intensity image. We envisage that this method will be widely applied in life science and industrial detection.


Introduction
Fluorescence microscopy (FM) is one of the approaches that image transparent samples with high contrast and high specificity. However, fluorescence labeling inevitably causes irreversible effects on samples, including phototoxicity and photobleaching, which prevents FM from being used for long-term imaging of live samples [1]. Being a label-free and noninvasive imaging technique, quantitative phase microscopy (QPM) can visualize transparent samples with high contrast and in a quantitative manner by recovering the phase of the light after it passes through a sample [2,3]. In the past few decades, QPM has developed rapidly in both structure design and algorithm optimization. QPM approaches can be classified into three types. The first type is wavefront sensing, such as the Shack-Hartmann sensor [4,5]. This type of QPM technique has a simple structure and fast wavefront measurement speed, but its spatial resolution is limited by the diameter of the micro-lenses. The second type is interferometric microscopy, which records interference patterns between an object wave and a reference wave and has played an important role in many fields due to its high phase accuracy [6][7][8]. However, interferometric approaches are susceptible to environmental disturbances. The third type is diffraction-based QPM, such as the transport-of-intensity equation (TIE) [9], Fourier ptychographic microscopy (FPM) [10], and the beam-propagation-based phase retrieval approach [11], which feature a simple structure and low cost. The quantitative phase image is obtained by recording multiple diffraction patterns under different constraints and processing the data with a physical model. These methods are non-interferometric and hence immune to environmental disturbances. However, a sophisticated algorithm is often required to reconstruct the phase from the recorded diffraction patterns, and the reconstruction is time consuming, taking from seconds to minutes.
In recent years, deep learning (DL) has been demonstrated as a powerful tool for solving various inverse problems by training a network with a large quantity of paired images. Once sufficient training data are collected in an environment that reproduces real experimental conditions, the trained model can not only solve the inverse problem, but also outperform physics-model-based approaches in some respects (e.g., computing speed, parameter adaptivity, algorithm complexity, and the number of raw images required) [12]. The price for these merits is the laborious capture of massive datasets comprising thousands of raw-image/ground-truth pairs. Recently, Ulyanov et al. designed a Deep Image Prior (DIP) framework that uses an untrained network as a constraint to solve common inverse problems, considering that a well-designed network architecture has an implicit bias toward natural images [13]. The key advantage of DIP is that it does not need pre-training with a large amount of labeled data. DL has been applied in different fields, including scattering image restoration [14], wavefront sensing [15], super-resolution imaging [16][17][18][19][20][21][22], and image denoising [23].
Notably, deep learning has been introduced into digital holographic microscopy (DHM) to address phase recovery and aberration compensation problems [24][25][26][27][28]. To cite a few examples, deep learning was used to retrieve complex-amplitude images (including amplitude and phase images) from the holograms of inline DHM [29] and off-axis DHM [30,31], eliminating twin-image artifacts and other phase errors. Ren et al. [32] utilized an end-to-end network to refocus different types of samples from a defocused hologram. Li et al. proposed a deep-learning-assisted variational Hilbert quantitative phase imaging approach, which can recover a high-accuracy, artifact-free phase image from a low-carrier-frequency hologram [33]. In addition, deep learning algorithms are well suited for converting images between modalities. For instance, a U-net based DL approach was used to convert differential interference contrast (DIC) images to fluorescence images, with which the volume of a cell can be measured in a label-free manner [34]. In general, DL can enhance existing imaging approaches by extending their performance, translating images between modalities, and reducing experimental time and cost.
In bright-field microscopy, once a sample is imaged in a defocused manner, the phase information is encoded in the intensity pattern via diffraction. Conventional physics-model-based approaches require at least three defocused intensity images to reconstruct the encoded phase. In this paper, we propose a deep-learning-based QPM approach that predicts the phase image of a sample from a single defocused bright-field intensity image. For this purpose, a U-net is trained with phase-intensity image pairs, in which the phase images are obtained using an off-axis DHM configuration and a phase-type spatial light modulator (SLM) generates a series of phase samples. The experimental results show that the proposed DL-based approach can accurately obtain the phase information from a single defocused bright-field intensity image by using the trained network.

Principle of DLQPM
The schematic diagram of the DLQPM system is shown in Figure 1A. A 633 nm He-Ne laser is used as the coherent illumination source for both off-axis DHM and bright-field imaging (the latter once the reference wave is blocked). For DHM imaging, the illumination beam is divided into two identical copies by a 1:1 beam splitter: one is used for the coherent bright-field imaging (object wave), and the other is used as the reference wave. In the object wave arm, a magnification unit composed of a microscope objective (MO) and a tube lens (L2) magnifies the sample. A phase-type SLM (HRSLM84R, Shanghai, UPOLabs, China) is placed at the image plane of the sample to generate pure phase objects for the network training. The SLM is further imaged onto a CCD camera through a 4f system consisting of lenses L3 and L4. The object wave and the reference wave are recombined by a beam splitter BS2 and interfere with each other on the CCD plane. The CCD camera records the resulting off-axis hologram I_holo, from which both the amplitude and phase images of the sample can be reconstructed by using a standard reconstruction algorithm.
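The "standard reconstruction algorithm" is not spelled out in the text. A minimal sketch of one common choice, Fourier-domain filtering of the +1 diffraction order, is given below. It rests on assumptions: the function name `reconstruct_off_axis`, the integer-pixel carrier offset, and the circular filter are all hypothetical choices for illustration, and the authors' actual procedure may differ.

```python
import numpy as np

def reconstruct_off_axis(hologram, carrier, radius):
    """Recover the complex object wave from an off-axis hologram.

    carrier: (row, col) offset of the +1 order from the spectrum centre,
             in pixels (assumed known and integer-valued in this sketch).
    radius:  radius of the circular band-pass filter, in pixels.
    """
    rows, cols = hologram.shape
    # Centred 2D spectrum of the hologram.
    spectrum = np.fft.fftshift(np.fft.fft2(hologram))
    # Circular mask around the +1 order.
    yy, xx = np.indices((rows, cols))
    cy, cx = rows // 2 + carrier[0], cols // 2 + carrier[1]
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    # Keep only the +1 order and move it to the centre, which removes
    # the carrier fringe from the reconstructed wave.
    filtered = np.roll(spectrum * mask, (-carrier[0], -carrier[1]), axis=(0, 1))
    return np.fft.ifft2(np.fft.ifftshift(filtered))
```

The phase image then follows from `np.angle(...)` of the returned field. In practice the carrier position would be located from the spectrum, and phase unwrapping may be needed for samples exceeding 2π.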
For DLQPM imaging, defocused bright-field intensity images of pure phase samples are recorded by the same CCD camera in the absence of the reference wave. The nonlinear relation between the defocused intensity and the phase information [35] can be simply expressed as

I_d(x, y) = H{φ(x, y)} + ε(x, y)

Here, (x, y) are the lateral coordinates on the sample plane, I_d(x, y) is the defocused intensity image, φ(x, y) is the phase information of the sample, ε(x, y) is the noise, and H{·} represents the nonlinear operator that links I_d(x, y) and φ(x, y). In DLQPM, the relation H{·} is represented with a U-net (as shown in Figure 1A, right). Specifically, a U-net [36], as shown in Figure 1B, is used to implement this relation. The network consists of two parts: the left part is the encoder and the right part is the decoder. The encoder (feature extraction part) consists of five repetitions. Each repetition consists of two 3 × 3 convolution layers followed by batch normalization and a ReLU function, with a residual block in the middle, and a 2 × 2 max pooling layer for down-sampling. The number of feature channels doubles after each encoding module, and the size of the feature map is halved. In contrast, the right part (decoder) is the up-sampling part. Each decoding module consists of a transposed convolution and two 3 × 3 convolutions, a batch normalization, a ReLU, and a residual block between the two convolutions. After the U-net is trained with thousands of I_d-φ data pairs, phase images φ(x, y) can be predicted from I_d(x, y), as shown in Figure 1B. Although some existing DL-based phase retrieval approaches can recover the phase from one or multiple defocused intensity images [37], acquiring the training data pairs by imaging hundreds of different samples is laborious. In this study, the training data pairs are acquired by precisely positioning an SLM at the image plane of the sample, so that the training samples can be generated numerically and the virtual samples are equivalent or analogous to microscopic samples.
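The channel-doubling and size-halving rule of the encoder can be checked with a few lines of bookkeeping. The sketch below is illustrative only: the 256 × 256 input matches the crop size reported for the training data, but `base_channels=64` is a hypothetical value, since the text does not state the width of the first stage.

```python
def encoder_shapes(input_size=256, base_channels=64, stages=5):
    """Return (channels, feature-map size) after each encoder stage.

    base_channels is an assumed value; the source does not specify it.
    """
    shapes = []
    size, channels = input_size, base_channels
    for _ in range(stages):
        size //= 2          # 2 x 2 max pooling halves the feature map
        shapes.append((channels, size))
        channels *= 2       # the next stage uses twice as many channels
    return shapes

for channels, size in encoder_shapes():
    print(f"{channels:5d} channels, {size} x {size}")
```

Under these assumptions, five stages reduce a 256 × 256 input to an 8 × 8 map at the bottleneck, which the decoder then up-samples back to the input size.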

Network training for DLQPM
The U-net is trained with a set of I_d-φ data pairs (see Figure 1A), for which φ is obtained using off-axis DHM and a numerical focusing procedure, and I_d(x, y) is recorded by blocking the reference wave. To generate the training data pairs, the images from the MNIST dataset (Modified National Institute of Standards and Technology database [38]) were loaded onto the SLM (1,280 × 1,024 pixels, pixel size: 18 × 18 μm) one by one. The defocused bright-field intensity images I_d(x, y) and off-axis holograms I_h(x, y) were captured by a CCD camera (960 × 1,280 pixels, pixel size: 3.75 × 3.75 μm). The in-focus phase images φ(x, y) are calculated with the traditional DHM reconstruction algorithm [6] and digitally propagated to the image plane using an angular-spectrum-based algorithm [39]. The acquired I_d(x, y) and φ(x, y) are then cropped to 256 × 256 pixels, and the phase images serve as the ground truth for the neural network. The network model is implemented in the PyTorch framework with Python 3.6.1. Network training and testing are performed on a PC with an Intel Core i7-9700 CPU and an NVIDIA GeForce GTX2060 GPU.
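The numerical focusing step can be illustrated with a minimal angular spectrum propagator. This is a sketch under stated assumptions rather than the authors' code: the function name `angular_spectrum` and its argument names are hypothetical, the field is assumed to be sampled on a uniform square-pixel grid, and evanescent components are simply discarded.

```python
import numpy as np

def angular_spectrum(field, distance, wavelength, pixel_size):
    """Propagate a complex field by `distance` (metres) in free space.

    A negative distance back-propagates the field, which is how a
    defocused reconstruction can be brought back into focus.
    """
    rows, cols = field.shape
    # Spatial-frequency grids in cycles per metre.
    fy = np.fft.fftfreq(rows, d=pixel_size)
    fx = np.fft.fftfreq(cols, d=pixel_size)
    fxx, fyy = np.meshgrid(fx, fy)
    # Longitudinal wavenumber; non-positive arguments are evanescent.
    arg = 1.0 / wavelength ** 2 - fxx ** 2 - fyy ** 2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    transfer = np.where(arg > 0, np.exp(1j * kz * distance), 0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)
```

The same function also gives a simple numerical picture of the defocused bright-field image of a pure phase object: `np.abs(angular_spectrum(np.exp(1j * phi), d, 633e-9, 3.75e-6)) ** 2`, where `phi` and `d` stand for a phase map and a defocus distance chosen by the user.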

Phase accuracy verification of the proposed neural network
First, handwritten numerals were loaded onto the SLM as virtual samples to test the DLQPM network, following a protocol similar to that described in Ref. [31]. A total of 3,656 images of numerals were used in the experiment, of which 3,500 images formed the training set and 156 images the test set. Both the off-axis hologram and the defocused intensity image were recorded for each sample. Figure 2A shows the defocused bright-field intensity image of the numeral 6, from which the phase image can be reconstructed by using the trained network, as shown in Figure 2B. The reconstructed phase is similar to the ground-truth phase image reconstructed by the off-axis DHM (Figure 2C). To evaluate the accuracy of the network quantitatively, we used the structural similarity index measure (SSIM) as an evaluation indicator. The SSIM value between the network output and the ground truth is 0.965, further verifying the feasibility and accuracy of our method. Then, a series of random phase patterns generated by the SLM were used to further test the DLQPM reconstruction. Figures 2D, E show the phase images reconstructed by DLQPM and the off-axis DHM, respectively. Further, Figure 2F shows the difference between the phase images reconstructed by the two approaches. The phase difference (error) has a standard deviation of 0.15 rad for samples with a peak-to-valley (P-V) value of around 1.5 rad, which means that the trained network can reconstruct a phase image with high phase accuracy. The error may be caused by the speckle noise of the coherent illumination and the phase fluctuation induced by environmental disturbances during DHM imaging. It can be concluded from the above experiments that our neural network can accurately recover the phase information of a sample from a defocused bright-field intensity image.
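The two evaluation quantities used above can be sketched as follows. Note the simplification: SSIM is usually computed over local sliding windows and then averaged, whereas `global_ssim` below evaluates the SSIM formula once over the whole image. The function names, the single-window form, and the `data_range` argument are assumptions made for illustration, so values from this sketch will not exactly match the figures reported in the text.

```python
import numpy as np

def global_ssim(x, y, data_range):
    """Single-window SSIM over the whole image (simplified variant)."""
    # Stabilising constants with the commonly used factors 0.01 and 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

def phase_error_std(predicted, reference):
    """Standard deviation of the phase difference map, in radians."""
    return float(np.std(predicted - reference))
```

Because the standard deviation ignores a constant offset between the two phase maps, `phase_error_std` measures the spatially varying part of the error only.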

Effect of defocusing distance on the imaging performance of DLQPM
As mentioned before, the proposed neural network can predict a phase image from a defocused bright-field intensity image because the defocused image carries the phase information of the sample. It is therefore important to investigate the effect of the defocusing distance on the phase prediction performance of the network. We again used the SLM-generated random phase patterns as training and testing data, but with a series of defocusing distances varying from 0 to 6 cm during the I_d-I_h recording. For each defocusing distance, we recorded 3,621 pairs of data, of which 3,500 were used as the training data set and the remaining 121 as the testing data set. Each raw data pair was augmented to six pairs for network training through flipping and reversing operations. Figures 3A, B show the defocused intensity images and the recovered phase images at different defocus positions, respectively. Figure 3C shows the phase error maps between the phase images reconstructed by the network and the ground-truth phase image (Figure 3D) obtained by the off-axis DHM. Further, Figure 3E shows that the averaged phase error of the 121 test data decreases as the defocusing distance increases. This is because the influence of the phase information on the intensity image grows with the defocusing distance. In addition, we randomly selected ten testing data and calculated the SSIM of the network outputs with respect to their ground-truth phase images. The circles in Figure 3F present the SSIMs calculated for the individual test data, and the red solid boxes show the averaged SSIM at different defocusing distances. The results show that the quality of the network output increases with the defocusing distance before it saturates at a defocusing distance of 3.5 cm, meaning that the proposed method can recover appropriate phase information once the defocusing distance is larger than 3.5 cm.
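The sixfold augmentation can be sketched as below. The text only says "flipping and reversing operations", so the particular set of six transforms chosen here (identity, two flips, 180° rotation, and two transpositions) is an assumption, as is the function name `augment_pair`. The essential point is that the same geometric transform is applied to the intensity image and to its phase label, so each pair stays consistent.

```python
import numpy as np

def augment_pair(intensity, phase):
    """Return six (intensity, phase) pairs from one raw pair.

    The choice of transforms is hypothetical; square crops are assumed
    so that transposition preserves the image size.
    """
    transforms = [
        lambda a: a,                          # original
        np.flipud,                            # vertical flip
        np.fliplr,                            # horizontal flip
        lambda a: np.flipud(np.fliplr(a)),    # both flips (180 degree rotation)
        lambda a: a.T,                        # transpose
        lambda a: np.flipud(np.fliplr(a)).T,  # anti-transpose
    ]
    # Apply each transform identically to the input and its label.
    return [(t(intensity).copy(), t(phase).copy()) for t in transforms]
```

With 3,500 raw training pairs per defocusing distance, a sixfold augmentation of this kind yields 21,000 training pairs.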

Imaging of blood cells by DLQPM
To further verify the feasibility of DLQPM for imaging biological samples, human red blood cells (RBCs) were used as the sample. A total of 17,200 I_d-I_h data pairs were acquired, of which 15,000 and 2,200 pairs were used for training and testing, respectively. Figure 4A shows the hologram I_h of the RBCs captured by the off-axis DHM. Figures 4B, C show the three-dimensional and two-dimensional phase distributions of the RBCs reconstructed from the off-axis hologram. Figure 4D shows the captured defocused bright-field intensity image of the RBCs, from which the three-dimensional and two-dimensional phase distributions of the RBCs were reconstructed by using the proposed DLQPM, as shown in Figures 4E, F, respectively. The comparison between Figures 4C, F shows that the proposed method can accurately reconstruct the phase of the RBCs. It is worth noting that the DLQPM image has lower speckle noise and fewer artifacts than that obtained by the off-axis DHM. This is because DLQPM avoids the interference between the object wave and the reference wave, which would otherwise induce additional speckle noise and artifacts.

Conclusion
In this paper, we proposed a novel deep-learning-based QPM phase reconstruction approach (termed DLQPM), which can reconstruct a phase image from a defocused bright-field intensity image (I_d) using a U-net. In the implementation, the network is trained beforehand with data pairs obtained by inserting an SLM into an off-axis DHM. Sufficient training data are therefore collected in an environment that reproduces the actual experimental conditions. Compared with conventional physics-model-based approaches, which need at least three defocused intensity images to reconstruct the phase, the proposed DL-based approach can accurately obtain the phase information from a single defocused bright-field intensity image.
We quantitatively analyzed the effect of the defocusing distance on the phase recovery performance of DLQPM. The results show that the phase reconstruction quality increases with the defocusing distance and saturates beyond a defocusing distance of about 3.5 cm. Notably, the proposed method can also be used with bright-field microscopes equipped with partially coherent or incoherent illumination, in which case the training data pairs can be obtained with single-beam phase retrieval approaches. We envisage that this method will be useful for life science research and industrial detection.

FIGURE 1 Workflow of the DLQPM network. (A) Network-training flow diagram. L, lens; M, mirror; MO, micro-objective; BS, beam splitter; SLM, spatial light modulator. The orange and blue blocks are the data acquisition and network training units, respectively. (B) Phase reconstruction of DLQPM using the trained network. The scale bar in (B) represents 0.2 mm.

FIGURE 2 Verification of DLQPM for quantitative phase imaging. (A) The defocused bright-field intensity image of the numeral 6. (B) Phase image reconstructed by the off-axis DHM (ground truth). (C) Phase image obtained by DLQPM. For the random-pattern samples, the phase images were reconstructed by DLQPM (D) and the off-axis DHM (E) when a series of random phase patterns were loaded onto the SLM for testing. (F) Difference between the phase images in (D,E). The scale bar in (A) represents 0.2 mm.

FIGURE 3 Effect of defocusing distance on the phase reconstruction of DLQPM. (A) Intensity images recorded at different defocusing distances varying from 0 to 6 cm. (B) The phase images recovered by the trained network model. (C) The error maps between the network output phase images and the phase images reconstructed by the off-axis DHM (D). (E) Phase error versus defocusing distance for the test image in (D). (F) SSIM versus the defocusing distance. In (F), the circles represent the SSIM values for ten randomly selected testing data, while the solid rectangles represent the averaged SSIM of the ten testing data. The curves in (E,F) are fourth-order polynomial fits. The scale bar in (A) represents 0.2 mm.

FIGURE 4 Quantitative phase imaging of blood cells by the proposed DLQPM. (A) Hologram of the human red blood cells recorded by the off-axis DHM. (B,C) 3D and 2D displays of the phase distribution reconstructed by the off-axis DHM. (D) The defocused bright-field intensity image with a defocusing distance d = 120 μm. (E) 3D and (F) 2D displays of the phase image output by the neural network. The scale bar in (A) represents 20 μm.