Measurement for the Area of Red Blood Cells From Microscopic Images Based on Image Processing Technology and Its Applications in Aplastic Anemia, Megaloblastic Anemia, and Myelodysplastic Syndrome

Background Aplastic anemia (AA), megaloblastic anemia (MA), and myelodysplastic syndrome (MDS) are common anemic diseases that can be difficult to distinguish from one another. Methods We proposed a method for measuring the area of red blood cells (RBCs) from microscopic images based on image processing technology and analyzed the differences in area among 25 patients with AA, 64 patients with MA, and 68 patients with MDS. Results The area of RBCs was 44.19 ± 3.88, 42.09 ± 5.35, 52.87 ± 7.68, and 45.75 ± 8.07 μm² in normal subjects and in patients with AA, MA, and MDS, respectively. The coefficients of variation in these groups were 8.78%, 10.05%, 14.53%, and 14.00%, respectively. The area of RBCs in patients with MA was significantly higher than in normal subjects (p < 0.001). It was also significantly higher than in patients with AA and MDS (p < 0.001). Correlation analysis showed no significant correlations between the area of RBCs and mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), MCH concentration (MCHC), or red cell distribution width (p > 0.05). The areas under the receiver operating characteristic (ROC) curves (AUC) for RBC area were 0.421, 0.580, and 0.850 in patients with AA (p = 0.337), MDS (p = 0.237), and MA (p < 0.001), respectively. Conclusion Identifying the area of RBCs in peripheral blood smears based on image processing technology could provide rapid and efficient diagnostic support for patients with MDS and MA, especially for patients with MA and in combination with MCV. However, a larger sample study is needed to find the cutoff area values.


INTRODUCTION
Myelodysplastic syndrome (MDS) is a group of heterogeneous clonal diseases originating from hematopoietic stem cells, characterized by ineffective hematopoiesis, refractory cytopenia, and a high risk of transformation to acute myeloid leukemia (1). Aplastic anemia (AA) is a group of diseases in which blood cell counts decrease because of acquired bone marrow failure (2,3). Megaloblastic anemia (MA) is an anemic disease caused by disordered DNA synthesis in blood cells and is characterized by megaloblastic change in red blood cells (RBCs) and myeloid cells. Vitamin B12 and/or folate deficiency is the most common cause of MA (4). It is sometimes difficult to make a differential diagnosis among these three diseases from clinical manifestations and blood examinations because they share similar findings. Anemia, bleeding, and infections due to cytopenia in one or more lineages can be seen in all of them. Dysplasia in peripheral blood lineages is seen not only in MDS but also in MA. In particular, megaloblastic change of erythrocytes in the myelogram of patients with MDS needs to be differentiated from MA. Some patients with MDS do not display prominent dysplasia and are difficult to differentiate from patients with AA. It can be difficult to distinguish patients with MA from those with MDS of the refractory anemia (RA) type (MDS-RA) (5) or the multilineage dysplasia type (MDS-MLD), and to distinguish patients with AA from those with hypoplastic MDS. The detection of folate and vitamin B12 is helpful for diagnosing MA. Finding clonal evidence is helpful for diagnosing MDS. About 52% of patients had one or more clonal chromosome abnormalities (6). Acquired molecular mutations were found in 80-90% of patients with MDS (7,8).
However, the limitations of testing cost and laboratory test conditions in many low-resource settings make detections of abnormal chromosomes, molecular mutations, and concentrations of vitamin B12 and folate not available for some patients suspected of MDS, AA, and MA.
In this study, we describe a measurement method based on image processing technology, which was developed to localize and extract RBCs from microscopic images and, for the first time, to calculate the area of RBCs. It can support the diagnosis and differentiation of MDS, AA, and MA because the hematological analyzer can release the result within minutes.

Patients and Diagnosis
Myelodysplastic syndrome was diagnosed according to the 2016 WHO classification (9). The diagnosis of AA was made according to the standard published by the British Society for Standards in Haematology in 2016 (10). Patients with vitamin B12 values below 150 pmol/l were diagnosed with MA. This study and all the procedures used were approved by the Institutional Review Board of the Wuhan University, China. All the patients were from Zhongnan Hospital of Wuhan University. There were 25 normal subjects, 25 patients with AA, 64 patients with MA, and 68 patients with MDS.

Measurement Method
We used a machine learning method based on image processing technology to obtain the area of RBCs in normal subjects or patients with anemia. The specific process included preparation of blood smears, magnified images of blood smears, and area calculation of RBCs (shown in Figure 1).

Preparation of Blood Smears
Blood smears were acquired from all patients and subjects at the time of initial diagnosis or on entering the study. The smears were prepared as follows. First, 5-7 µl of ethylenediaminetetraacetic acid (EDTA) anticoagulated peripheral blood, or one drop of peripheral blood collected directly, was placed on the slide about 1 cm from one end or at the 3/4 position along the slide. The cover slide was then brought close to the blood drop and gently touched to it so that the blood spread across the width of the slide, and the smear was made. Finally, the smear was stained with Wright-Giemsa mixed stain.

Magnified Images of Blood Smears
The magnified images of peripheral blood smears were obtained by microscope (magnified 1,000 times). The magnified images were input into the computer system as Red Green Blue (RGB) images or Hue Saturation Value (HSV) images.

Segmentation of Cell Region and Background
The first step was converting the images from RGB space to HSV space, whose components are hue, saturation, and value. Hue contains the hue information. Saturation contains the color purity information. Value contains the lightness information. The next step was image segmentation, which included threshold segmentation and edge-based segmentation. The Otsu threshold segmentation method was mainly used for image segmentation.
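The color conversion and Otsu thresholding described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used in the study: the source does not say which HSV channel was thresholded, so the sketch assumes the saturation channel (stained cells are typically more saturated than the pale background), and the function names `saturation_channel`, `otsu_threshold`, and `segment_cells` are hypothetical.

```python
import colorsys

def saturation_channel(rgb_image):
    """Map each (R, G, B) pixel (0-255) to its HSV saturation scaled to 0-255."""
    return [[round(colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] * 255)
             for (r, g, b) in row] for row in rgb_image]

def otsu_threshold(values, levels=256):
    """Return the level t maximizing between-class variance (class 0: values <= t)."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]            # pixel count in class 0
        sum0 += t * hist[t]      # intensity sum in class 0
        w1 = total - w0          # pixel count in class 1
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def segment_cells(rgb_image):
    """Assumption: pixels more saturated than the Otsu level belong to cells."""
    sat = saturation_channel(rgb_image)
    t = otsu_threshold([v for row in sat for v in row])
    return [[v > t for v in row] for row in sat]
```

Calling `segment_cells` on a nested list of RGB tuples returns a Boolean mask of the same shape, which would then feed the contour extraction step.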

Location of the Single Cell in the Image
The outer contours in the cell region of the image were extracted by the findCounter method. After traversing all outer contours, any outer contour with an area less than the threshold (0.5-10 µm²) was marked as a pending contour. S_mean was the average area of all outer contours except the pending contours.
FIGURE 1 | The specific process of area calculation of red blood cells. The specific process included preparation of blood smears, magnified images of blood smears, and area calculation of red blood cells.
The convex hull algorithm was used to identify the effective contours. A cell whose contour had a convex edge shape was selected as a single cell. After traversing all outer contours except the pending contours, any outer contour whose area fell within the range (a × S_mean, b × S_mean) was an effective contour. The value of a was less than 1, and the value of b was greater than 1. After many tests, the accuracy was best when a was 0.3 and b was 5. When the area of an outer contour was larger than b × S_mean, it was judged to be multiple merged cells.
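The area-based filtering rule above can be illustrated as follows. This is a minimal sketch under assumptions: contour areas are taken as already measured in µm², the pending threshold is fixed at 10 µm² (the upper end of the stated 0.5-10 µm² range), the convexity check is omitted, and the function name `classify_contours` is hypothetical.

```python
def classify_contours(areas, pending_max_area=10.0, a=0.3, b=5.0):
    """Split contour indices into pending, effective, and merged groups.

    areas: contour areas in square micrometers.
    pending_max_area: assumed cutoff below which a contour is 'pending'.
    a, b: range factors from the text (0.3 and 5 gave the best accuracy).
    """
    pending = [i for i, s in enumerate(areas) if s < pending_max_area]
    kept = [i for i, s in enumerate(areas) if s >= pending_max_area]
    if not kept:
        return {"s_mean": None, "pending": pending, "effective": [], "merged": []}
    # S_mean: average area of all outer contours except the pending ones
    s_mean = sum(areas[i] for i in kept) / len(kept)
    effective = [i for i in kept if a * s_mean < areas[i] < b * s_mean]
    merged = [i for i in kept if areas[i] > b * s_mean]
    return {"s_mean": s_mean, "pending": pending,
            "effective": effective, "merged": merged}
```

For example, with one 2 µm² speck, nine contours of 44 µm², and one 500 µm² clump, the speck is pending, the nine contours are effective, and the clump is flagged as merged cells.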

Distinguishment of RBCs From Nucleated Cells
Single cells include nucleated cells and RBCs, so the single cells located above still had to be divided into these two classes. The mean gray value over the images of all located single cells was calculated and recorded as Mg. The gray value of each single cell image was also calculated and recorded as G. If the ratio of G to Mg was greater than 0.8, the single cell was classified as an RBC. If the ratio of G to Mg was less than 0.8, the single cell was classified as a nucleated cell.
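The gray-value rule above amounts to a one-line comparison per cell. The sketch below assumes each cell has already been reduced to a single gray value G; the function name is hypothetical, and because the text does not say how a ratio of exactly 0.8 is handled, the sketch assigns that case to nucleated cells as an assumption.

```python
def split_rbc_nucleated(cell_gray_values, ratio_threshold=0.8):
    """Label each cell 'RBC' or 'nucleated' from its gray value G relative to Mg."""
    # Mg: mean gray value over all located single cells
    mg = sum(cell_gray_values) / len(cell_gray_values)
    labels = []
    for g in cell_gray_values:
        # Assumption: a ratio exactly equal to the threshold counts as nucleated
        labels.append("RBC" if g / mg > ratio_threshold else "nucleated")
    return mg, labels
```

Darkly stained nucleated cells have a lower gray value than the mean, so their ratio falls below the threshold, while the paler RBCs stay above it.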

Area Calculation of RBCs
If the ratio of the distance from each contour point to the center of the contour (r_i) to the radius (r) of the minimum enclosing circle was above 0.85, and the area of the minimum enclosing circle was within the specified range (a × S_mean, b × S_mean), the cell was labeled as a recall RBC. The average ratio of r_i to r was named P_mean.
The area of a selected RBC was labeled S_i, and the area of a recall RBC was labeled S_r. The area of a recall RBC was calculated based on the idea of calculus and according to the following formula. Finally, we calculated the average area of all RBCs in the blood smear for every subject and patient.
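The roundness criterion for recall RBCs can be sketched as below. Several assumptions are made and should be read as such: the contour center is taken as the centroid of the contour points, the radius of the minimum enclosing circle is approximated by the largest centroid-to-point distance (a true minimum enclosing circle, for instance from OpenCV's `minEnclosingCircle`, may differ), and the 0.85 condition is applied to every point. The function names are hypothetical, and the area formula for S_r is not reproduced here because it is not given in the text.

```python
import math

def roundness_ratios(points):
    """Return (ratios r_i / r, r) for contour points given as (x, y) in micrometers."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    r = max(dists)  # assumed stand-in for the minimum enclosing circle radius
    return [d / r for d in dists], r

def is_recall_rbc(points, s_mean, a=0.3, b=5.0, min_ratio=0.85):
    """Apply the roundness and enclosing-circle area conditions; also return P_mean."""
    ratios, r = roundness_ratios(points)
    p_mean = sum(ratios) / len(ratios)
    circle_area = math.pi * r * r
    round_enough = all(q > min_ratio for q in ratios)
    in_range = a * s_mean < circle_area < b * s_mean
    return round_enough and in_range, p_mean
```

A nearly circular contour gives ratios close to 1 and passes, whereas an elongated contour has points far inside the enclosing circle and fails the 0.85 condition.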

Statistical Analysis
The normality of the area values was assessed by the Shapiro-Wilk test. The area was expressed as "mean ± SD" or "median ± quartile range." If the area values satisfied normal distribution and homogeneous variance, they were analyzed by an independent sample t-test. If the distribution was normal but the variance was not uniform, the corrected t-test was used. If the normal distribution could not be satisfied, the rank-sum test was used for analysis. The coefficient of variation was calculated as the ratio of the standard deviation to the mean. Pearson or Spearman correlation analysis between the area and the standard RBC complete blood count (CBC) indices was used according to the normality of the data. The ROC curves and the values of the area under the curve (AUC) were used to compare the area of RBCs with MCV.
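As a quick arithmetic check of the coefficient of variation, the sketch below computes it as standard deviation divided by mean, expressed as a percentage, using the mean ± SD values reported for the two normally distributed groups (normal subjects and patients with MA). The function name is hypothetical.

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation in percent: 100 * SD / mean."""
    return 100.0 * sd / mean

# Reported values: normal subjects 44.19 ± 3.88, MA 52.87 ± 7.68 (square micrometers)
cv_normal = round(coefficient_of_variation(44.19, 3.88), 2)  # 8.78
cv_ma = round(coefficient_of_variation(52.87, 7.68), 2)      # 14.53
```

Both values agree with the reported coefficients of variation of 8.78% and 14.53%, which is consistent with the SD-over-mean definition.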

Area of RBCs in Normal Subjects and Patients With AA, MDS, and MA
In 25 normal subjects, the mean area was 44.19 µm 2 , and the standard deviation was 3.88 µm 2 ( Table 2). The image result of the area calculation in one normal subject is shown in Figure 2A.
The area values of RBCs in normal subjects satisfied normal distribution (Figure 3A). The coefficient of variation (CV) was 8.78%. In 25 patients with AA, the median area was 42.09 µm² and the quartile range was 5.35 µm². In 68 patients with MDS, the median area was 45.75 µm² and the quartile range was 8.07 µm². In 64 patients with MA, the mean area was 52.87 µm² and the standard deviation was 7.68 µm² (Table 2). The image results of the area calculation in one patient each with AA, MA, and MDS are shown in Figures 2B-D, respectively. The area values of RBCs satisfied normal distribution in patients with MA but not in patients with AA or MDS (Figures 3B-D; Table 3).

Comparison of the Area of RBCs With MCV
We compared the AUC results of ROC curves of RBCs area, MCV, and predicted probability of the two indicators in patients with AA, MA, and MDS. In patients with AA, the AUC results of RBCs area, MCV, and predicted probability were 0.421, 0.585, and 0.581, respectively. We found that MCV and RBCs area had little diagnostic significance in patients with AA. In patients with MDS, the AUC results of RBCs area, MCV, and predicted probability were 0.580, 0.763, and 0.784, respectively. The area of RBCs did not show obvious advantages, but the diagnostic value of the combination of the two indexes increased (p = 0.048). In patients with MA, the AUC results of RBCs area, MCV, and predicted probability were 0.850, 0.984, and 0.991, respectively. The area of RBCs showed a very good diagnostic value and the combined diagnostic value of the two indexes increased significantly (p < 0.001). These results are shown in Figure 4 and Table 4.
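For readers unfamiliar with the AUC, it can be computed directly as the probability that a randomly chosen patient has a larger value than a randomly chosen control, with ties counted as one half. The sketch below shows this rank-based calculation; the function name and the sample values are made up for illustration and are not data from this study.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical RBC areas (square micrometers), for illustration only
patients = [52.0, 55.0, 48.0, 60.0]
controls = [44.0, 46.0, 50.0, 42.0]
auc = roc_auc(patients, controls)  # 15 of 16 pairs are ordered correctly
```

An AUC near 0.5, as reported for RBC area in AA, means the indicator barely separates the groups, while values near 1, as reported for MA, mean almost every patient-control pair is ordered correctly.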

DISCUSSION
Myelodysplastic syndrome, AA, and MA are three common hematologic diseases that cause pancytopenia. MDS and MA may share similar characteristics in bone marrow cell morphology, such as abnormal nuclear division and nuclear morphology of the erythroid lineage, which makes it difficult for doctors to distinguish them. The diagnostic distinction between AA and hypocellular MDS is also difficult because of shared clinical features such as bone marrow hypocellularity (11). Dysplasia in one or more hematopoietic cell lineages is a prerequisite for the diagnosis of MDS. However, dysplasia is not specific for MDS. A small number of patients with MDS display no dysplasia in the early stage of the disease (12,13). The proper diagnostic distinction of these diseases with pancytopenia is challenging. In bone marrow morphology, the number of blasts, the pseudo-Pelger-Huet anomaly, and micromegakaryocytes are of great diagnostic value for MDS (14,15). In bone marrow biopsy specimens, reticular fibers, increased CD34+ cells, and more residual hematopoietic area are helpful for the diagnosis of MDS (16,17). Abnormal localization of immature precursors in the medullary cord can help confirm the diagnosis of MDS. In recent years, understanding of the molecular pathogenesis of MDS has been greatly enriched by the systematic application of next-generation sequencing. Deep next-generation sequencing panel assays for the detection of somatic mutations are now routinely available and are helpful in distinguishing AA from hypocellular MDS (18). Advances in novel sequencing techniques have led to the discovery of one or more gene mutations in more than 90% of all cases, and more than 60 genes are involved (7,8). These molecular changes can help to diagnose MDS. Karyotypic abnormalities that can help doctors confirm an MDS diagnosis are present in about 50% of all cases (19). Therefore, cytogenetic and molecular examinations may have no diagnostic value in some patients.
In addition, early MDS may present no dysplasia, and the clinical symptoms may not be typical. MA and MDS were both formerly classified as macrocytic anemia, and their similar manifestations of dysplasia in bone marrow increase the difficulty of differential diagnosis. A rapid method for assisting the diagnosis and differentiation of MDS, AA, and MA is therefore important.
Red blood cells are the most commonly and intensively studied type of blood cell in cell biology. At present, many hospitals and research institutes conduct RBC examinations using conventional techniques, which include the CBC and microscopic examination. The CBC, as the current standard technique for measuring RBC properties, contains several important diagnostic parameters, such as the MCV, MCH, MCHC, and RBC distribution width. Microscopic examination classifies and identifies RBCs by manual observation of cell morphology under a microscope. Although the conventional measurement techniques give a general view of RBCs and help identify shape, roundness, and other information, they have difficulty extracting high-dimensional information at the individual cell level (20). It has been challenging to establish new systems for morphological and classification analysis of erythrocytes based on the properties of individual RBCs.
In recent years, recognition of RBCs from microscopic images using image processing technology has been proposed. In 2016, Elsalamony presented algorithms capable of counting and detecting sickle and microcytic RBCs on a smear based on the circular Hough transform. A neural network was applied to the extracted data to evaluate the algorithm (21, 22). He also proposed an algorithm for assigning and counting normal RBCs, sickle RBCs, and elliptocytes (23). Other studies have screened for diseases and syndromes with computer systems by extracting information from RBCs. Kim et al. (24) demonstrated a rapid and label-free method that combines quantitative phase imaging-based single-RBC profiling with machine learning to screen for iron-deficiency anemia, reticulocytosis, and hereditary spherocytosis (>98% accuracy). Delgado-Font et al. (25) presented a high-accuracy neural network classifier to support the diagnosis of sickle cell anemia by classifying RBC shape in peripheral blood images using basic shape analysis descriptors, which included the circular shape factor and the elliptical shape factor. Normal RBCs have a biconcave-disk shape rather than a spherical shape, with an average diameter of 7.2 µm and a thickness of 2 µm. This shape allows the RBC to maximize the uptake of oxygen from its surroundings and the release of carbon dioxide produced by the body. Therefore, the surface area of the RBC, rather than its volume, is more reflective of its capacity for oxygen transport. The above-mentioned studies have made progress in extracting morphological, chemical, and mechanical properties of individual RBCs with computer systems and have even offered diagnostic support for some anemic diseases, but none of them tried to measure the area of RBCs, which may help improve the accuracy of RBC detection.
We carried out this study on the area of RBCs in peripheral blood smears of patients with MDS, AA, and MA and of normal subjects by conducting RBC segmentation and morphological analysis based on image processing technology. The results showed that the mean or median RBC area was 44.19, 42.09, 45.75, and 52.87 µm² in normal subjects and in patients with AA, MDS, and MA, respectively. Compared with the normal subjects, the RBC area in patients with MA was significantly higher. Compared with patients with AA and MDS, the RBC area in patients with MA was also significantly higher. There were also significant differences between patients with AA and MDS. Therefore, we preliminarily verified the differences in the area of RBCs among patients with AA, MDS, and MA. We found no significant correlations between the area of RBCs and MCV, MCH, MCHC, or RDW. Therefore, the results of this study can assist the diagnosis of patients with AA, MDS, and MA, possibly independently of the standard RBC CBC indices. After comparing the diagnostic values of MCV and RBC area in these three diseases, we found that the RBC area, like MCV, showed very good diagnostic value in patients with MA. Moreover, the combined diagnostic value of MCV and RBC area increased significantly, with an AUC close to 1.
This study has several advantages. First, blood smears are easy to obtain. Second, the area of RBCs and the differences in area among patients with AA, MDS, and MA were preliminarily obtained. This study also has several limitations, which can be addressed in future work. First, the number of samples is not large enough. Second, clustered cells were not taken into account. Finally, we did not find the cutoff area values for screening patients with AA, MDS, and MA. If cutoff area values that distinguish the three diseases are obtained and the methodology becomes mature and standardized, the cost and speed of detection will be improved.
In conclusion, identifying the area of RBCs in peripheral blood smears based on the image processing technology could achieve rapid and efficient diagnostic support for patients with MDS and MA, especially for patients with MA and in combination with MCV. However, a larger sample study is needed to find the cutoff area values.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Review Board of the Wuhan University, China. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
YZ and XW wrote the manuscript and analyzed the results. TH and HS prepared blood smears and microscopic examination. QC processed the data. BX designed the project, provided professional guidance, and revised the manuscript. All authors contributed to the article and approved the submitted version.