Live Cancer Cell Classification Based on Quantitative Phase Spatial Fluctuations and Deep Learning With a Small Training Set

We present an analysis method that automatically classifies live cancer cells from cell lines based on a small set of quantitative phase imaging data, without cell staining. The method includes spatial image analysis that extracts the cell phase spatial fluctuation map from the quantitative phase map of the cell, which is measured without cell labeling and thus without prior knowledge of the biomarker. The spatial fluctuations are indicative of the cell stiffness, and cancer cells change their stiffness as cancer progresses. In this paper, the quantitative phase spatial fluctuations are used as the basis for a deep-learning classifier for evaluating the cell metastatic potential. Performing the spatial fluctuation analysis on the quantitative phase profiles before inputting them to the neural network was shown to improve the classification results in comparison to inputting the quantitative phase profiles directly, as done so far. We classified primary and metastatic cancer cells and obtained 92.5% accuracy, in spite of using a small training set, demonstrating the method's potential for objective automatic clinical diagnosis of cancer cells in vitro.


INTRODUCTION
Much effort has been invested in studying the relationship between biological cell properties and cancer. Detection and monitoring of cell physiological changes by isolating circulating tumor cells from liquid biopsies could be a breakthrough in disease diagnosis and treatment. Conventional cancer cell analysis and sorting techniques, such as fluorescence-based measurements, require specific cell labeling, with prior knowledge of the labeling agent [1][2][3][4][5]. Alternatively, the biophysical properties of cancer cells might be used as a clinical diagnosis tool [6][7][8][9], such as an increased dependence on glucose [10]. Cancer cell stiffness has been reported to correlate well with the disease invasiveness, due to cellular stiffness variations in tumors following changes in the cell cytoskeleton and membrane microviscosity [6,8][11][12][13]. Metastatic cells have elastic features that allow them to detach from the primary tumor, penetrate the walls of lymphatic or blood vessels, and create secondary or metastatic tumors [14][15][16][17]. Thus, cancer cell stiffness and its associated properties could form a diagnosis tool, by classifying cancer cell types for early detection, monitoring, and development of specific cancer treatment [18].
The common way to characterize cell stiffness is atomic force microscopy (AFM) [19]. However, this modality is complicated and hard to perform in clinical settings. Additional methods for cell stiffness measurements, such as optical tweezers and magnetic tweezers [20,21], pipette aspiration [22], microfluidic optical stretcher [23], and mechanical microplate stretcher [24,25], are invasive and cause deformations that may lead to cell damage.
Quantitative phase imaging (QPI) clinical modules can record, without using cell staining, the fluctuation maps of live biological cells based on their quantitative phase map, which is proportional to the optical path delay (OPD) map of the cell [26][27][28][29][30]. Reflection phase microscopy with coherence gating has been used successfully to measure cell membrane temporal fluctuations [31]. This method requires manual adjustments to the cell surface, which limits the possibility of producing an automatic test for a large number of cells. Quantitative phase temporal fluctuations can be measured directly from the entire cell thickness for red blood cells (RBC), as described in Refs. [32,33], and for cancer cells, as described in Ref. [34]. The latter analysis can be used as a diagnostic tool for discriminating between healthy cells and cancer cells of different metastatic potential. However, since this method measures temporal fluctuations, it requires high temporal stability of the optical system and good cell-surface attachment. Also, Ref. [34] presented only statistical results, not a classifier. Another approach was presented in Ref. [35]. It uses the stationary quantitative phase map of the cell to capture spatial differences. By itself, this map gives only small statistical differences between groups of cancer cells. Therefore, this map was instead transformed into the disorder-strength map, which is better linked to the cell shear stiffness. The method was demonstrated using stiffness-manipulated cancer cell lines, rather than cells originating from in vivo stages of cancer. Here too, statistical data were given, rather than a classifier that can differentiate between cancer cells on an individual-cell basis.
In recent years, deep-learning techniques have advanced significantly, owing to the rapid growth of computational resources. Conventional machine-learning techniques extract hand-crafted features from the cell quantitative phase map [36], so hidden features in the image might be missed. Deep-learning techniques, on the other hand, also take hidden features into account, since the input to the classifier is the entire OPD map, rather than the hand-crafted features. A recent paper presented a deep-learning technique, called TOP-GAN [37], which can classify cancer cells based on their quantitative phase maps when only a small training set is available, but many unclassified maps of other cell types are available. None of the methods in Refs. [35][36][37] used the cell spatial fluctuations as a means to discriminate between cancer cells of different metastatic potential.
In the present study, we developed a deep-learning method to automatically classify stain-free primary cancer cells and metastatic cancer cells originating from an in vivo source, based on the cell quantitative phase maps and the spatial fluctuations of the cell. We compared two types of live cells that were taken from the same organ of the same donor: SW480 cells, which are colorectal adenocarcinoma cells from colon tissue, and SW620 cells, which are metastatic cells that originated from a lymph node from colon tissue. These are established cell lines taken from the same donor and are commercially available. We show that in spite of the small training set used, we can still use deep learning and obtain very good classification performance, provided that a spatial fluctuation analysis is performed on the quantitative phase profiles before inputting them into the deep network. This preliminary spatial image analysis extracts essential features and accelerates the convergence of the network, while achieving high accuracy in cancer cell classification.

MATERIALS AND METHODS
We acquired quantitative phase maps of primary and metastatic cancer cells, as described in Cells preparation, Optical system, and Phase retrieval. We then implemented two deep-learning classifiers, as elaborated in Classification, each with a different type of input. First, we used the stationary phase maps directly. Second, we applied spatial fluctuation analysis, as elaborated in Spatial fluctuation analysis, and only then fed the resulting maps to the network, validating the superiority of this method in the case of a small training set.

Cells Preparation
SW480 and SW620 cell lines were purchased from ATCC. Both cell lines were isolated from the same donor and originate from a primary adenocarcinoma of the colon. SW480 was established from the primary adenocarcinoma of the colon, while SW620 is a metastatic cell line established from a lymph node metastasis. Cells were grown in a flask in Dulbecco's Modified Eagle Medium (DMEM) at 37°C until 80% confluence. Then, the cells were suspended using trypsin. Cells were seeded on a coverslip, covered with ECL-cell attachment matrix, and put inside a Petri dish overnight to enable cell attachment to the coverslip. For imaging live cells, we used stickers that can be placed on top of the coverslip, forming wells containing ∼10 μl medium to keep the cells alive. The coverslip with the cells was then imaged by the optical system.

Optical System
To acquire dynamic quantitative phase maps of cancer cells of different metastatic potentials, we built a diffraction phase microscopy (DPM) system [38]. The cell quantitative phase profiles were later used to generate the inputs to deep neural network classifiers. DPM can create off-axis holograms and enable single-exposure wavefront sensing. It has high spatial and temporal sensitivities due to using a broadband source with a common-path interferometric geometry. The DPM system was added as an external module to the output of a commercial inverted microscope (IX83, Olympus, United States). As shown in Figure 1, the microscope was illuminated by a supercontinuum laser source (SuperK Extreme, NKT, Denmark), coupled to an acousto-optic tunable filter, AOTF (SuperK SELECT, NKT, Denmark), which emits a wavelength bandwidth of 633 ± 2.5 nm and is spatially coherent. Inside the microscope, the beam passes through the sample S, and is magnified by microscope objective MO (Olympus UPLFLN, 40×, 0.75 NA, United States). Then, it is reflected toward tube lens TL (f = 200 mm), which projects the beam onto the output image plane of the microscope. There, we place the DPM module. In this module, an amplitude diffraction grating (100 lines/mm) generates multiple diffraction orders. The zeroth-order beam is spatially low-pass filtered by a 75-μm pinhole, so that only the DC component of the central diffraction order remains. Note that this pinhole size is small enough to create a clear reference beam under the partially coherent illumination used, as shown in the imaging results in the next section. The first diffraction order is used as the sample beam, and the central diffraction order is used as the reference beam. L2 (f = 180 mm) creates a 4f imaging system with L1, followed by another 4f imaging system, L3 (f = 50 mm) and L4 (f = 250 mm), so that, in total, L1-L4 create an additional 6× magnification of the image from the output of the microscope.
The imaged field of view was 50 µm × 50 µm. The optical resolution limit was 0.422 µm. Both beams interfere at a small off-axis angle on the camera and generate a spatially modulated off-axis interference image, which is then captured by a fast camera (FASTCAM Mini AX200, 512 × 512 pixels, 20 µm each, Photron, United States) at 500 frames per second.
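The relation between the quoted optical parameters can be checked with a few lines of arithmetic. The sketch below assumes that the resolution limit is computed as λ/(2·NA), which is not stated in the text but reproduces the quoted 0.422 µm for 633 nm and 0.75 NA, and that the total magnification is the product of the 40× objective and the additional 6× relay. The function names are hypothetical.

```python
def diffraction_limit_um(wavelength_um, numerical_aperture):
    """Resolution limit, ASSUMING the lambda / (2 NA) criterion."""
    return wavelength_um / (2.0 * numerical_aperture)


def sample_pixel_um(camera_pixel_um, total_magnification):
    """Size of one camera pixel projected back onto the sample plane."""
    return camera_pixel_um / total_magnification


# Values quoted in the text: 633 nm, 0.75 NA, 20 um pixels, 40x objective, 6x relay.
spot_um = diffraction_limit_um(0.633, 0.75)
pixel_um = sample_pixel_um(20.0, 40 * 6)

# Number of camera pixels spanning one diffraction-limited spot (about 5 here).
pixels_per_spot = spot_um / pixel_um
print(f"spot = {spot_um:.3f} um, pixel = {pixel_um:.4f} um, "
      f"pixels per spot = {pixels_per_spot:.2f}")
```

Under these assumptions each diffraction-limited spot is sampled by several pixels, which matters later when a window is defined in units of diffraction-limited spots.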

Phase Retrieval
Off-axis holography captures the complex wavefront of the sample by inducing a small angle between the sample and reference beams. The interference pattern recorded by the digital camera is defined as follows:

$$I = |E_r + E_s|^2 = |E_r|^2 + |E_s|^2 + 2|E_r||E_s|\cos(\varphi')$$

where E_r represents the reference beam, E_s represents the sample beam, and φ′ is the phase difference between the beams. After subtracting the phase induced by the off-axis angle and the phase of the reference beam, we get φ, the phase of the sample, which is proportional to the OPD of the sample, as follows:

$$\varphi(x, y) = \frac{2\pi}{\lambda}\,\mathrm{OPD}(x, y) = \frac{2\pi}{\lambda}\, d(x, y)\,\Delta n(x, y)$$

where λ is the illumination wavelength. The OPD value obtained at each point is equal to the product of the sample thickness d(x, y) at that point and the integral refractive-index difference Δn(x, y) at this point. Δn(x, y) refers to the difference between the integral refractive index of the cell and the refractive index of the surrounding medium, where the latter typically equals 1.33 for cells in a watery medium. After the acquisition of the off-axis hologram, we Fourier transform it. To cancel the effect of the off-axis angle, we isolate one of the cross-correlation terms, which contains the complex wavefront of the sample [39]. An inverse Fourier transform of this cross-correlation term results in the complex wavefront of the sample, and its phase argument is the wrapped phase of the sample. Next, we apply a digital 2D phase unwrapping algorithm to avoid 2π ambiguities [40].
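The Fourier-domain steps described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the processing code used in this work: the position of the cross-correlation term (`carrier`) and the filter radius are assumed to be known, the filter is a simple circular mask, and the 2D unwrapping algorithm of Ref. [40] is replaced by a naive row-then-column `np.unwrap`, which is adequate only for smooth, low-noise phase maps. The function names are hypothetical.

```python
import numpy as np


def retrieve_phase(hologram, carrier, radius):
    """Recover the unwrapped phase from a single off-axis hologram.

    carrier: (row, col) offset in pixels of the cross-correlation peak from
             the center of the shifted spectrum (assumed known here).
    radius:  radius in pixels of the circular filter (an assumption).
    """
    rows, cols = hologram.shape
    spectrum = np.fft.fftshift(np.fft.fft2(hologram))

    # Keep only one cross-correlation term with a circular mask.
    rr, cc = np.ogrid[:rows, :cols]
    r0, c0 = rows // 2 + carrier[0], cols // 2 + carrier[1]
    mask = (rr - r0) ** 2 + (cc - c0) ** 2 <= radius ** 2
    filtered = spectrum * mask

    # Move the term to the spectrum center, which removes the linear
    # phase ramp introduced by the off-axis angle.
    centered = np.roll(filtered, (-carrier[0], -carrier[1]), axis=(0, 1))

    # Back to the image domain: complex wavefront, then wrapped phase.
    field = np.fft.ifft2(np.fft.ifftshift(centered))
    wrapped = np.angle(field)

    # Naive stand-in for a true 2D unwrapping algorithm.
    return np.unwrap(np.unwrap(wrapped, axis=0), axis=1)


def phase_to_opd(phase, wavelength_nm=633.0):
    """Convert phase in radians to OPD in nm: OPD = phase * lambda / (2 pi)."""
    return phase * wavelength_nm / (2.0 * np.pi)
```

For experimental holograms, the carrier position would typically be located from the spectrum itself, and a noise-robust 2D unwrapping routine would replace the last step.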

Spatial Fluctuation Analysis
We applied a spatial analysis to each of the quantitative phase maps of the cells to obtain the spatial fluctuation metric [35]. The phase variance, $\langle \Delta\varphi(x, y)^2 \rangle$, is calculated from the phase image in a window of 3 × 3 diffraction-limited spots surrounding each pixel (x, y), and is directly proportional to the variance of the refractive-index difference, $\langle \Delta\varphi(x, y)^2 \rangle \propto \langle \Delta n(x, y)^2 \rangle$.
To calculate the phase spatial fluctuation metric regardless of the sample thickness, the phase variance is normalized by the mean square phase in the corresponding window of 3 × 3 diffraction-limited spots, $\varphi^2(x, y)$, to create the phase fluctuation map, $\langle \Delta\varphi(x, y)^2 \rangle / \varphi^2(x, y)$. By multiplying by the illumination wavelength and dividing by 2π, the OPD fluctuation map is obtained. In the following analysis, we omitted the cell edges, which are expected to have a significantly higher fluctuation amplitude, and thresholded out noisy points. Finally, all the values were normalized. Figure 2 presents an OPD map of one of the SW620 cells analyzed and the resulting spatial fluctuation map. To ensure that the spatial fluctuations are not buried in the noise, we measured the system sensitivity: the spatial standard deviation of the OPD profile was 3.254 nm, and the temporal standard deviation was 0.282 nm.
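The windowed variance-over-mean-square computation described above can be sketched as follows. This is an illustration under assumptions rather than the analysis code of this work: the window width in pixels (`window`) is a hypothetical parameter that should correspond to 3 diffraction-limited spots for the actual sampling, "mean square phase" is interpreted as the local mean of φ², image borders are handled by reflection, and the edge exclusion, thresholding, and final normalization steps are omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_mean(image, window):
    """Mean over a window x window neighborhood of every pixel (odd window)."""
    pad = window // 2
    padded = np.pad(image, pad, mode="reflect")
    return sliding_window_view(padded, (window, window)).mean(axis=(-2, -1))


def fluctuation_map(phase, window=15, eps=1e-12):
    """Local phase variance normalized by the local mean square phase.

    window: width in pixels (hypothetical default; must be odd).
    eps:    small constant that avoids division by zero in empty regions.
    """
    mean = local_mean(phase, window)
    mean_sq = local_mean(phase ** 2, window)
    # Var = <phi^2> - <phi>^2, clipped to suppress tiny negative round-off.
    variance = np.clip(mean_sq - mean ** 2, 0.0, None)
    return variance / (mean_sq + eps)
```

Because both the numerator and the denominator scale with the square of the phase, the resulting map is unchanged when the whole phase map is multiplied by a constant, which is the sense in which the normalization removes the dependence on sample thickness.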

Classification
We used a bottleneck-features network for binary classification, implemented in Keras and based on the VGG16 architecture. This network has previously been proven to be an effective feature-extraction classification network [41], and it converged rapidly on the given dataset. We fine-tuned the fully connected layers for binary classification between primary cancer cells and metastatic cancer cells, both isolated from the same individual. The network architecture is shown in Figure 3. We used 830 images for each class, obtained after 5× augmentation of the acquired data: rotations of 90, 180, and 270°, and an additional frame acquired one second later. The augmented data was divided 80, 10, and 10% into training, testing, and validation sets, respectively. To verify that the results do not depend on this selection, the division was done 5 times randomly, each followed by new training, testing, and validation processes. Each time, we trained the network from scratch to ensure that the division between the groups was stable and that the network maintained a similar average accuracy. We trained the same network structure for each of the two classifiers. We then compared the results of the network for classification based on the direct phase analysis and on the spatial fluctuation analysis separately. The weights were frozen after 40 epochs, and the batch size was 10. The hyperparameters were selected to achieve convergence in the shortest time without causing overfitting. As verified, changing the hyperparameters did not result in better performance. The network converged after a few seconds when running on a Google Colab GPU. Figure 4 presents the holograms, the quantitative phase maps, and the spatial fluctuation maps for each cell type, where the latter two are separately used as the inputs to the two networks.
As can be seen when comparing Figures 4C,F, one of the most prominent distinctions between the SW480 and SW620 cells is that more 'hot' areas appear in the SW620 cell after the spatial analysis.
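The 5× augmentation and the repeated random 80/10/10 division described in this section can be sketched in NumPy as follows. This is a minimal illustration, not the code used in this work: the function names are hypothetical, the frames are assumed to be square arrays so that rotated copies can be stacked, and the rounding rule used to size the subsets is an assumption.

```python
import numpy as np


def augment_cell(frame, later_frame):
    """Return 5 images per cell: the original, its 90/180/270 degree
    rotations, and one additional frame acquired at a later time point."""
    return np.stack([
        frame,
        np.rot90(frame, 1),
        np.rot90(frame, 2),
        np.rot90(frame, 3),
        later_frame,
    ])


def split_indices(n_samples, seed, fractions=(0.8, 0.1, 0.1)):
    """Random division into training, testing, and validation indices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_test = int(round(fractions[1] * n_samples))
    return (order[:n_train],
            order[n_train:n_train + n_test],
            order[n_train + n_test:])


# Repeat the random division 5 times; a network would be trained from
# scratch on each division (training itself is not shown here).
divisions = [split_indices(830, seed=s) for s in range(5)]
```

Using a different seed for each repetition gives independent divisions while keeping each one reproducible.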

RESULTS AND DISCUSSION
We examined the network performance on direct phase images and on the further-processed spatial analysis images. Figure 5 shows the accuracy and loss versus epochs for both the training and the validation stages for the two analyses. The overall deep-learning classification performance results are also summarized in Table 1. Here SW620 is defined as "positive" and SW480 as "negative". The best result was observed when using the spatial analysis, with an accuracy of 92.5% and a loss of 0.24. The direct phase analysis yielded a worse result, with an accuracy of 78.26% and a loss of 0.45. As shown in Table 1, the spatially processed data yielded significantly higher accuracy, lower loss, higher sensitivity, higher specificity, and higher AUC, compared to those obtained when classifying the quantitative phase profiles directly. This confirms that the spatial analysis of the quantitative phase is beneficial as a preliminary step even for an automatic classifier that can theoretically find the best features for classification by itself.
[Figure caption: The OPD fluctuation at each point is calculated as the OPD variance divided by the mean squared OPD in a window of 3 × 3 diffraction-limited spots [35]. The examined area, excluding the cell edges, is marked with a dashed black line.]
In comparison, in Ref. [36] a simple machine-learning classifier, based on a support vector machine and principal component analysis, was used on hand-crafted features extracted from the quantitative phase maps of the cell lines. This previous study achieved limited results for classifying SW480 and SW620 cell lines based on the spatial morphological information. In this case, hidden features in the image might be missed. In principle, deep-learning techniques can classify better by finding the best features automatically as the network is trained. However, they require a large training set. In our case, we had only 133 images, 5× augmented, from each cell type in the training set, which resulted in low performance (78.26% accuracy) when applying the network directly on the quantitative phase profiles. The TOP-GAN technique [37] is one of the techniques that can cope with the problem of a small training set, provided that unclassified examples (e.g., unclassified quantitative phase maps of other cell types) are available. This requires the effort of acquiring and processing many quantitative phase maps of another cell type. Here, we use another approach to cope with the small training set problem: we first decrease the differences between the quantitative phase maps of the two groups that do not contribute to the classification itself. Thus, we focused on classification based on the cell spatial fluctuations, rather than the cell morphology, which provided good classification results even though only a small training set was available.
In general, it was previously shown that preprocessing of the data, such as centering, scaling, and decorrelating (known as data whitening) [42], and spatial transformation [43], can help in speeding up the training, reducing nuisances and redundancies, and improving the classification performance [44]. Thus, the preprocessing step standardizes the dataset and improves its quality for the subsequent deep neural-network training.
In this paper, the chosen preprocessing is the spatial analysis that extracts the fluctuation map from quantitative phase images. As shown in Table 1, this indeed resulted in better accuracy (92.5% instead of 78.26%), better sensitivity (88.88% instead of 82.71%), and better specificity (96.25% instead of 73.75%) compared to the direct phase analysis.
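For reference, the three figures of merit compared above follow directly from the confusion matrix, with SW620 as the positive class. The sketch below shows the definitions. The confusion-matrix counts in the usage lines are not reported in the text; they are hypothetical values, chosen only because they are consistent with the percentages quoted above.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: positive-class (SW620) cells classified correctly/incorrectly.
    tn/fp: negative-class (SW480) cells classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# HYPOTHETICAL counts, not taken from the paper.
spatial = binary_metrics(tp=72, fn=9, tn=77, fp=3)
direct = binary_metrics(tp=67, fn=14, tn=59, fp=21)

for name, (sens, spec, acc) in (("spatial", spatial), ("direct", direct)):
    print(f"{name}: sensitivity={sens:.4f}, specificity={spec:.4f}, "
          f"accuracy={acc:.4f}")
```

Note that accuracy is the count-weighted combination of sensitivity and specificity, so it lies between the two whenever both classes are present in the test set.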

CONCLUSION
We presented an automatic deep-learning approach for the classification of live cancer cells by interferometric phase microscopy without staining. We examined two analyses, for which the resulting images are used as the inputs to the deep-learning network. The goal was classification between two types of live cells that originated from the same organ of the same donor: SW480 cells, which are colorectal adenocarcinoma cells from colon tissue, and SW620 cells, which are metastatic cells from a lymph node from colon tissue. The first deep-learning classifier worked on the quantitative phase maps of the cell directly, and the second one used spatial analysis, which produced cell phase fluctuation maps representing the variance of the refractive-index difference. In total, 332 off-axis holograms were acquired, 166 from each cell type, with 133 images from each cell type in the training set. This is considered a small training set for deep learning. For network training, data augmentation was applied to enlarge the training set to a total of 665 images for each class. We trained the same VGG16 neural network structure with the same number of epochs on each of the analyses. The pre-processed phase profiles, which yield the spatial fluctuation maps, gave significantly better results, with the highest accuracy (92.5%) and the lowest loss. On the other hand, the direct phase analysis gave worse results, with an accuracy of 78.26% and a higher loss. This demonstrates that using the phase fluctuation maps resulting from the spatial transformation analysis as the network input reduces nuisances and makes the input less redundant, helping to obtain better classification results when a small training set is used in deep learning.
The present study is expected to lead to automatic, non-subjective cancer-cell classification, correlating the cancer-cell refractive-index distribution, as measured by stain-free interferometric phase imaging, with the cell metastatic potential.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
NR-N designed and built the setup, conducted experiments, designed and implemented the algorithms, processed the data, and wrote the paper. NTS conceived the idea, designed the setup and algorithms, wrote the paper, and supervised the project.

ACKNOWLEDGMENTS
We thank Dr. Raja Giryes from Tel Aviv University for useful discussions.