Breast Cancer Histopathological Images Recognition Based on Low Dimensional Three-Channel Features

Breast cancer (BC) is the primary threat to women’s health, and early diagnosis of breast cancer is imperative. Although there are many ways to diagnose breast cancer, the gold standard is still pathological examination. In this paper, a breast cancer histopathological image recognition method based on low dimensional three-channel features is proposed to achieve fast and accurate benign/malignant recognition. Three-channel features were extracted for 10 descriptors: gray level co-occurrence matrix in one direction (GLCM1), gray level co-occurrence matrix in four directions (GLCM4), average pixel value of each channel (APVEC), Hu invariant moments (HIM), wavelet features, Tamura, completed local binary pattern (CLBP), local binary pattern (LBP), Gabor, and histogram of oriented gradient (Hog). A support vector machine (SVM) was then used to assess their performance. Experiments on the BreaKHis dataset show that GLCM1, GLCM4 and APVEC achieved recognition accuracies of 90.2%-94.97% at the image level and 89.18%-94.24% at the patient level, which is better than many state-of-the-art methods, including many deep learning frameworks. The experimental results show that breast cancer recognition based on high dimensional features increases recognition time without greatly improving accuracy, while three-channel features enhance the recognizability of the image and thus achieve higher recognition accuracy than gray-level features.


INTRODUCTION
Cancer has become one of the major public health problems that seriously threaten people's health. The incidence and mortality of breast cancer have been rising continuously in recent years, and early, accurate diagnosis is the key to improving patients' survival rate. Mammography is the first step of early diagnosis, but it is difficult to detect cancer in the dense breasts of younger women, and the X-ray radiation poses a threat to the health of patients and radiologists. Computed tomography (CT) is a localized examination, and the abnormalities it reveals are not sufficient on their own to establish a breast cancer diagnosis. The gold standard for breast cancer diagnosis is still pathological examination. Pathological examination usually obtains tumor specimens through puncture, excision, etc., which are then stained with hematoxylin and eosin (H&E). Hematoxylin binds deoxyribonucleic acid (DNA) and highlights the nuclei, while eosin binds proteins and highlights other structures. Accurate diagnosis of breast cancer requires experienced histopathologists, and it takes a lot of time and effort to complete this task. In addition, the diagnoses of different histopathologists often disagree, as they depend strongly on each histopathologist's prior knowledge. This results in low diagnostic consistency, and the average diagnostic accuracy is only 75% (1).
Currently, breast cancer diagnosis based on histopathological images faces three major challenges. Firstly, there is a shortage of experienced histopathologists around the world, especially in underdeveloped areas and small hospitals. Secondly, a histopathologist's diagnosis is subjective and has no objective evaluation basis; whether the diagnosis is correct depends entirely on the histopathologist's prior knowledge. Thirdly, diagnosing breast cancer from histopathological images is complicated, time-consuming and labor-intensive, which is inefficient in the era of big data. In the face of these problems, an efficient and objective breast cancer diagnosis method is urgently needed to alleviate the workload of histopathologists.
With the rapid development of computer-aided diagnosis (CAD), it has gradually been applied in the clinical field. A CAD system cannot completely replace the doctor, but it can be used as a "second reader" to assist doctors in diagnosing diseases. However, computers detect many false positive areas, and doctors must spend considerable time re-evaluating the results suggested by the computer, which reduces accuracy and efficiency. Therefore, improving the sensitivity of computer-aided tumor detection while greatly reducing the false positive rate, and thereby improving overall detection performance, remains an open research problem.
In recent years, machine learning has been successfully applied to image recognition, object recognition, and text classification. With the advancement of computer-aided diagnosis technology, machine learning has also been successfully applied to breast cancer diagnosis (2)(3)(4)(5)(6)(7)(8). There are two common approaches: histopathological image classification based on hand-crafted feature extraction and traditional machine learning, and histopathological image classification based on deep learning. The former requires manual design of features, but it does not need high-performance hardware and has an advantage in computing time. In contrast, classification based on deep learning, especially convolutional neural networks (CNNs), often requires a large number of labeled training samples, while labeled data are difficult to obtain. Labeling lesions is time-consuming and laborious work, which takes a lot of time even for very experienced histopathologists.
The key to traditional histopathological image classification is feature extraction. Common features include color features, morphological features, texture features, statistical features, etc. Spanhol et al. (9) introduced a publicly available breast cancer histopathology dataset (BreaKHis); they extracted LBP, CLBP, gray level co-occurrence matrix (GLCM), local phase quantization (LPQ), parameter-free threshold adjacency statistics (PFTAS) and the ORB keypoint descriptor, and used 1-nearest neighbor (1-NN), quadratic discriminant analysis (QDA), support vector machines (SVMs), and random forests (RF) to assess these features, with accuracies ranging from 80% to 85%. Pendar et al. (10) introduced a representation learning-based unsupervised domain adaptation on the basis of (9) and compared it with the results of a CNN. Anuranjeeta et al. (11) proposed a breast cancer recognition method based on morphological features; 16 morphological features were extracted and 8 classifiers were used for recognition, achieving an accuracy of about 80%. The authors of (12)(13)(14) proposed breast cancer recognition methods based on texture features. In particular, Carvalho et al. (14) used phylogenetic diversity indexes to characterize the types of breast cancer. Sudharshan et al. (15) compared 12 multi-instance learning methods based on PFTAS and verified that multi-instance learning is more effective than single-instance learning. However, none of these works considered the color channels of the image. Fang et al. (16) proposed a framework called Local Receptive Field based Extreme Learning Machine with Three Channels (3C-LRF-ELM), which can automatically extract histopathological features to diagnose whether there is inflammation. In addition, in order to reduce recognition time and algorithmic complexity, this paper is committed to achieving high recognition accuracy with low dimensional features.
Deep learning methods, especially CNNs, can achieve more accurate cancer recognition (17)(18)(19)(20)(21)(22)(23)(24)(25) because of their ability to extract powerful high-level features compared with traditional image recognition methods. For example, Spanhol et al. (17) used the existing AlexNet to test the BreaKHis dataset, and its recognition accuracy was significantly higher than in their previous work (9). The authors of (18)(19)(20)(21)(25) used different CNN frameworks and obtained recognition accuracies of more than 90% on the two-class problem of the BreaKHis dataset. Benhammou et al. (22) comprehensively surveyed the research based on the BreaKHis dataset from four perspectives (magnification-specific binary, magnification-independent binary, magnification-specific multi-category and magnification-independent multi-category) and proposed a magnification-independent multi-category method based on a CNN, which had rarely been considered in previous studies. The works (23)(24)(25)(26) also achieved good performance on the Bioimaging 2015 dataset. Both BreaKHis and Bioimaging 2015 are challenging datasets for breast cancer detection. Because of the difficulty of training models from scratch, most studies were based on models pre-trained on other datasets and then validated on histopathological images; few researchers have trained a complete model on histopathological images because of the lack of labeled data.
In order to reduce the workload of histopathologists and allow them to spend more time on the diagnosis of more complex diseases, efficient and fast computer-aided diagnosis methods are urgently needed. This paper proposes a breast cancer histopathological image recognition method based on low dimensional three-channel features. The features of the three channels of the image are extracted separately, and the three-channel features are then fused to achieve better recognition of breast cancer histopathological images at both the image level and the patient level. The framework is shown in Figure 1.
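The per-channel extraction and fusion described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `channel_descriptor` is a hypothetical toy descriptor (mean, standard deviation, and an 8-bin histogram) standing in for GLCM, APVEC, etc.; the fusion itself is simple concatenation of the three channel vectors.

```python
import numpy as np

def channel_descriptor(channel):
    # Toy per-channel descriptor: mean, std, and an 8-bin intensity histogram.
    # A stand-in for the real descriptors (GLCM, APVEC, HIM, ...).
    hist, _ = np.histogram(channel, bins=8, range=(0, 256), density=True)
    return np.concatenate([[channel.mean(), channel.std()], hist])

def three_channel_features(rgb):
    # Extract the descriptor from each color channel and fuse by concatenation.
    return np.concatenate([channel_descriptor(rgb[..., k]) for k in range(3)])

rgb = np.random.default_rng(0).integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
feats = three_channel_features(rgb)  # 3 channels x 10-D = 30-D fused vector
```

Swapping in any of the descriptors below for `channel_descriptor` yields the corresponding three-channel feature.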
The contributions of this paper are as follows: 1) a histopathological image recognition method based on three-channel features is proposed; 2) the method relies on low dimensional features; 3) it achieves high accuracy with fast recognition speed; 4) it is easy to implement. The rest of the paper is organized as follows: Section 2 introduces the feature extraction methods, Section 3 presents the experiments and results analysis, and Section 4 concludes the work.

Gray Level Co-Occurrence Matrix
Gray level co-occurrence matrix is a common method to describe the texture of an image by studying its spatial correlation characteristics.

Average Pixel Value of Each Channel
The average value reflects the central tendency of the data and is an important amplitude feature of images. For an M×N image, the average pixel value of each color channel is expressed as

APVEC = (1 / (M × N)) Σ_{x_c=1}^{M} Σ_{y_c=1}^{N} f(x_c, y_c),

where f(x_c, y_c) represents the pixel value at (x_c, y_c).
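APVEC is the simplest descriptor here: one mean per color channel, giving a 3-D feature vector per image. A direct sketch:

```python
import numpy as np

def apvec(rgb):
    # Average pixel value of each channel: one scalar per color channel.
    return rgb.reshape(-1, rgb.shape[-1]).mean(axis=0)

# BreaKHis images are 700x460 RGB; a random stand-in is used here.
rgb = np.random.default_rng(1).integers(0, 256, size=(460, 700, 3), dtype=np.uint8)
feat = apvec(rgb)  # 3-D feature vector
```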

Hu Invariant Moment
Geometric moments were proposed by Hu (30) in 1962. He constructed seven invariant moments from second-order and third-order normalized central moments and proved that they are invariant to rotation, scaling and translation. Hu invariant moments are region-based image shape descriptors. In their construction, the central moment eliminates the influence of image translation, normalization eliminates the influence of image scaling, and the polynomial combinations realize rotation invariance. Moments of different orders reflect different characteristics: the low orders reflect the basic shape of the target, and the high orders reflect details and complexity.
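The chain described above (central moments for translation, normalization for scale, polynomial combinations for rotation) maps directly onto scikit-image's moment functions; a sketch for one channel:

```python
import numpy as np
from skimage.measure import moments_central, moments_normalized, moments_hu

rng = np.random.default_rng(2)
channel = rng.random((64, 64))      # one channel of the image

mu = moments_central(channel)       # central moments: translation invariant
nu = moments_normalized(mu)         # normalized moments: adds scale invariance
hu = moments_hu(nu)                 # 7 Hu moments: adds rotation invariance
```

Applying this to each channel yields a 21-D (3 × 7) three-channel HIM vector.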

Wavelet Features
The result of two-dimensional wavelet decomposition reflects the frequency changes in different directions and the texture characteristics of the image. Since the detail subgraph is the high-frequency component of the original image and contains the main texture information, the energy of the individual detail subgraph is taken as the texture feature, which reflects the energy distribution along the frequency axis with respect to the scale and direction. In this paper, 5-layer wavelet decomposition was carried out, and the energy of high-frequency components in each layer was taken as the feature vector.
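The 5-level decomposition with subband energies can be sketched with PyWavelets. The coif5 wavelet (chosen in the Protocol section) has a long support, so a small test image would not admit 5 levels; a large random stand-in is used here.

```python
import numpy as np
import pywt

rng = np.random.default_rng(3)
channel = rng.random((1024, 1024))  # large enough for 5 coif5 levels

# 5-level 2-D wavelet decomposition with the coif5 wavelet
coeffs = pywt.wavedec2(channel, 'coif5', level=5)

# energy of each high-frequency (detail) subband:
# 3 orientations (horizontal, vertical, diagonal) x 5 levels = 15 values
energies = np.array([np.sum(np.square(d)) for level in coeffs[1:] for d in level])
```

The 15 subband energies form the wavelet feature vector for one channel.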

Tamura
Tamura et al. (31) proposed a texture feature description method based on psychological research into the visual perception of texture, and defined six characteristics to describe texture: coarseness, contrast, directionality, line likeness, regularity, and roughness. Coarseness reflects the granularity of the image gray levels; the larger the texture granularity, the coarser the texture image. Contrast reflects the range between the lightest and darkest gray levels in a gray image; the range of differences determines the contrast. Directionality reflects the intensity with which the image texture concentrates along a certain direction. Line likeness reflects whether the image texture has a linear structure. Regularity reflects the consistency of texture features between a local region and the whole image. Roughness is defined as the sum of coarseness and contrast.
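Of the six characteristics, contrast has the most compact definition and illustrates the flavor of the Tamura features: σ / α₄^(1/4), where σ is the gray-level standard deviation and α₄ = μ₄/σ⁴ is the kurtosis. A sketch under that standard formula (the other five characteristics are omitted):

```python
import numpy as np

def tamura_contrast(channel):
    # Tamura contrast: sigma / kurtosis**0.25, with kurtosis = mu4 / sigma^4.
    g = channel.astype(float)
    sigma2 = g.var()
    if sigma2 == 0:
        return 0.0
    kurtosis = np.mean((g - g.mean()) ** 4) / sigma2 ** 2
    return np.sqrt(sigma2) / kurtosis ** 0.25

checker = np.indices((32, 32)).sum(axis=0) % 2 * 255   # high-contrast pattern
flat = np.full((32, 32), 128) + np.arange(32) % 2      # nearly flat image
```

As expected, the checkerboard scores far higher than the nearly flat image.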

Local Binary Pattern
Local Binary Pattern (32) is an operator used to describe the local texture features of an image. It has significant advantages such as rotation invariance and gray level invariance. The original LBP operator is defined by comparing the gray values of the eight adjacent pixels with a threshold, namely the center pixel, in a 3×3 window. If the value of an adjacent pixel is greater than or equal to the value of the center pixel, that position is marked as 1, otherwise as 0. That is, for a pixel (x_c, y_c) on the image,

LBP_{P,R}(x_c, y_c) = Σ_{p=0}^{P−1} s(g_p − g_c) · 2^p,   s(x) = 1 if x ≥ 0, and 0 otherwise,

where P is the number of sampling points in the neighborhood of the center pixel, R is the radius of the neighborhood, g_c is the gray value of the center pixel, and g_p is the gray value of a pixel adjacent to the center pixel.
In this way, comparing the 8 points in the neighborhood generates an 8-bit binary number with 256 possible values; this is the LBP value of the center pixel of the 3×3 window, and it is used to reflect the texture information of the region.
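The original (P = 8, R = 1) operator and its 256-bin histogram can be sketched with scikit-image:

```python
import numpy as np
from skimage.feature import local_binary_pattern

rng = np.random.default_rng(4)
channel = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# original LBP: P = 8 neighbors, radius R = 1, codes in [0, 255]
lbp = local_binary_pattern(channel, P=8, R=1, method='default')

# histogram of the 256 possible codes as the texture feature vector
hist, _ = np.histogram(lbp, bins=256, range=(0, 256), density=True)
```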

Completed Local Binary Pattern
Completed local binary pattern (33) is a variant of LBP. In the CLBP operator, a local region is represented by its center pixel and a local difference sign-magnitude transform. The center pixel is globally thresholded and coded as a binary digit, CLBP_Center (CLBP_C). Meanwhile, the local difference sign-magnitude transform is decomposed into two complementary components: the difference signs, CLBP-Sign (CLBP_S), and the difference magnitudes, CLBP-Magnitude (CLBP_M). For a pixel (x_c, y_c) on the image, the components are expressed as:

CLBP_S_{P,R}(x_c, y_c) = Σ_{p=0}^{P−1} s(g_p − g_c) · 2^p,   s(x) = 1 if x ≥ 0, and 0 otherwise,
CLBP_M_{P,R}(x_c, y_c) = Σ_{p=0}^{P−1} t(|g_p − g_c|, m) · 2^p,   t(x, m) = 1 if x ≥ m, and 0 otherwise,
CLBP_C_{P,R}(x_c, y_c) = t(g_c, ḡ),

where m is the mean of |g_p − g_c| over the whole image and ḡ = (1/N) Σ_{n=0}^{N−1} g_n is the mean gray value of the image, with N the number of pixels. CLBP_S_{P,R}(x_c, y_c) is equivalent to the traditional LBP operator and describes the difference sign characteristics of the local window. CLBP_M_{P,R}(x_c, y_c) describes the difference magnitude characteristics of the local window. CLBP_C_{P,R}(x_c, y_c) reflects the gray level information of the center pixel. In our experiments, we worked with rotation-invariant uniform patterns, with the standard values P = 8, R = 1, yielding a 20-D feature vector for each channel.
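A minimal NumPy sketch of the three CLBP components for P = 8, R = 1, computed on interior pixels only; the rotation-invariant uniform mapping and the final histogramming used in the paper are omitted for brevity.

```python
import numpy as np

def clbp(channel):
    # Minimal CLBP (P=8, R=1) on interior pixels; returns the S, M, C code maps.
    g = channel.astype(float)
    c = g[1:-1, 1:-1]                               # center pixels g_c
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    # signed differences g_p - g_c for the 8 neighbors
    diffs = np.stack([g[1 + dy:g.shape[0] - 1 + dy,
                        1 + dx:g.shape[1] - 1 + dx] - c
                      for dy, dx in offsets])
    mags = np.abs(diffs)
    m = mags.mean()                                 # global mean magnitude m
    weights = (2 ** np.arange(8)).reshape(8, 1, 1)
    s_code = ((diffs >= 0) * weights).sum(axis=0)   # CLBP_S: sign component
    m_code = ((mags >= m) * weights).sum(axis=0)    # CLBP_M: magnitude component
    c_code = (c >= g.mean()).astype(int)            # CLBP_C: center component
    return s_code, m_code, c_code

rng = np.random.default_rng(5)
s_code, m_code, c_code = clbp(rng.integers(0, 256, size=(32, 32), dtype=np.uint8))
```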

Gabor
Gabor features are a kind of feature that can describe the texture information of an image. The frequency and orientation selectivity of Gabor filters are similar to the human visual system, making them particularly suitable for texture representation and discrimination. Gabor features rely on Gabor kernels to window the signal in the frequency domain, thereby describing its local frequency information. Different textures generally have different center frequencies and bandwidths; according to these frequencies and bandwidths, a set of Gabor filters can be designed to filter texture images. Each Gabor filter only allows the texture corresponding to its frequency to pass, while the energy of other textures is suppressed. Texture features are then extracted from the output of each filter for subsequent classification tasks. We used Gabor filters with five scales and eight orientations; the filter size is 39×39 and the block size is 46×70, yielding a 4000-D feature vector for each channel.
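A simplified sketch of a 5-scale × 8-orientation Gabor bank with scikit-image. Unlike the paper's block-wise 4000-D descriptor, this sketch keeps one mean response energy per filter (40-D); the five frequencies are an assumed illustrative choice.

```python
import numpy as np
from skimage.filters import gabor

rng = np.random.default_rng(6)
channel = rng.random((64, 64))

feats = []
for freq in [0.05, 0.1, 0.2, 0.3, 0.4]:                 # 5 scales (frequencies)
    for k in range(8):                                   # 8 orientations
        real, imag = gabor(channel, frequency=freq, theta=k * np.pi / 8)
        feats.append(np.mean(real ** 2 + imag ** 2))     # mean response energy
feats = np.array(feats)                                  # 40-D energy vector
```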

Histogram of Oriented Gradient
Histogram of Oriented Gradient (34) is a feature descriptor used for object detection in computer vision and image processing. It constructs features by computing and accumulating histograms of gradient orientations over local areas of the image. Gradient information reflects the edge information of the target well, and the local appearance and shape of the image can be characterized by the distribution of local gradients. It is generally used in pedestrian detection, face recognition and other fields, but it does not perform well on images with complex texture information; it is introduced here as a comparison.
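A scikit-image sketch of HOG extraction on one channel; the cell and block sizes below are common defaults, assumed here since the paper does not state its HOG parameters.

```python
import numpy as np
from skimage.feature import hog

rng = np.random.default_rng(7)
channel = rng.random((64, 64))

# 9 orientation bins, 8x8-pixel cells, 2x2-cell blocks
feat = hog(channel, orientations=9, pixels_per_cell=(8, 8),
           cells_per_block=(2, 2), feature_vector=True)
# 64x64 image -> 8x8 cells -> 7x7 blocks, each 2*2*9 = 36-D: 1764-D total
```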

Protocol
All experiments were conducted on a platform with an Intel Core i7-5820K CPU and 16 GB of memory. The BreaKHis dataset was randomly divided into a training set (70%, 56 patients) and a testing set (30%, 26 patients); we guarantee that patients used to build the training set are not used for the testing set. The results presented in this work are the average of five trials. All images were used without any preprocessing before feature extraction. For the SVM, we chose the RBF kernel; the best penalty factor c=2 and kernel parameter g=1 were obtained by cross validation. For the wavelet features, we selected the coif5 wavelet, which has better symmetry than dbN, the same support length as db3N and sym3N, and the same number of vanishing moments as db2N and sym2N.
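The patient-disjoint 70/30 split and the RBF-SVM with c=2, g=1 can be sketched with scikit-learn. The feature vectors, labels and patient assignments below are synthetic stand-ins; `GroupShuffleSplit` enforces that no patient contributes images to both sets.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit
from sklearn.svm import SVC

rng = np.random.default_rng(8)
n_images, n_patients = 400, 82                  # BreaKHis has 82 patients
X = rng.normal(size=(n_images, 6))              # stand-in feature vectors
patients = rng.integers(0, n_patients, size=n_images)
y = rng.integers(0, 2, size=n_images)           # toy benign/malignant labels

# 70/30 split at the patient level: no patient appears in both sets
gss = GroupShuffleSplit(n_splits=1, train_size=0.7, random_state=0)
train_idx, test_idx = next(gss.split(X, y, groups=patients))

clf = SVC(kernel='rbf', C=2, gamma=1).fit(X[train_idx], y[train_idx])
acc = clf.score(X[test_idx], y[test_idx])
```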
Here, we report the recognition accuracy at both the image level and the patient level. For the image level, let N_rec_I be the number of correctly classified test images and N_all the total number of test images; the image level recognition accuracy is defined as

Image Accuracy = N_rec_I / N_all.

For the patient level, we followed the definition of (9). Let N_P be the number of images of patient P, let S be the total number of patients, and suppose N_rec_P images of patient P were correctly classified. The patient score is defined as

Patient Score = N_rec_P / N_P,

and the recognition accuracy at the patient level as

Patient Accuracy = (Σ Patient Scores) / S.

To further assess the performance of the proposed framework, sensitivity (Se), precision (Pr) and F1-score metrics were used:

Se = TP / (TP + FN),   Pr = TP / (TP + FP),   F1 = 2 · Pr · Se / (Pr + Se),

where TP, FP and FN denote the numbers of true positives, false positives and false negatives.

Table 3 reports the performance of all descriptors we assessed: the image level recognition accuracy, the patient level recognition accuracy, sensitivity, precision and F1-score of 10 different three-channel descriptors under 4 magnifications. The descriptors are GLCM1, GLCM4, APVEC, HIM, wavelet features, Tamura, and CLBP; to show the effectiveness of low dimensional features, LBP, Gabor, and Hog were introduced for comparison. For images at 40X magnification, GLCM1 achieved the highest recognition accuracy, 94.12 ± 2.19% at the image level and 93.48 ± 2.7% at the patient level, as well as the highest precision and F1-score. The second was GLCM4, with an image level accuracy of 93.4 ± 3.54% and a patient level accuracy of 92.95 ± 4.02%, followed by APVEC with an image level accuracy of 92.12 ± 1.09% and a patient level accuracy of 90.55 ± 0.84%. The same conclusion was drawn for 100X: the image level and patient level recognition accuracies of GLCM1, GLCM4, and APVEC were 92.65 ± 3.08% and 91.74 ± 3.89%, 91.98 ± 3.79% and 91.16 ± 3.88%, and 90.2 ± 2.33% and 89.18 ± 3.45%, respectively. However, for 200X, APVEC achieved the highest image level recognition accuracy of 94.97 ± 1.35%, followed by GLCM1 and GLCM4, while GLCM1 performed best at the patient level with an accuracy of 94.24 ± 2.86%, which is 0.3% higher than APVEC.
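The patient-level metric defined above (mean of per-patient scores N_rec_P / N_P) is easy to get wrong by averaging over images instead of patients; a direct implementation:

```python
from collections import defaultdict

def patient_level_accuracy(patient_ids, correct):
    # Patient score = N_rec_P / N_P; patient-level accuracy = mean patient score.
    per_patient = defaultdict(lambda: [0, 0])
    for pid, ok in zip(patient_ids, correct):
        per_patient[pid][0] += int(ok)   # correctly classified images of patient
        per_patient[pid][1] += 1         # total images of patient
    scores = [n_rec / n_total for n_rec, n_total in per_patient.values()]
    return sum(scores) / len(scores)

# patient A: 1 of 2 images correct (score 0.5); patient B: 1 of 1 (score 1.0)
acc = patient_level_accuracy(['A', 'A', 'B'], [True, False, True])  # -> 0.75
```

Note that the image-level accuracy of the same toy example is 2/3, illustrating how the two metrics diverge when patients contribute different numbers of images.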
As for 400X, APVEC performed best at both the image level (92.78 ± 3.14%) and the patient level (93.3 ± 3.25%), followed by GLCM1 and GLCM4. On the whole, GLCM1, GLCM4 and APVEC performed well at both the image level and the patient level, followed by HIM. These four descriptors all achieved their highest recognition accuracy at 200X, and all descriptors except Gabor and Hog performed worst at 400X, which is the same as the conclusion of (18, 35). Although the recognition accuracy of LBP and Gabor is above 82%, which is also acceptable, they need more recognition time because of their high feature dimensions, as shown in Table 4. Tamura and Hog performed slightly worse than the other descriptors.

Experiment Results
The reason for the above results is that the distributions of features extracted by different descriptors differ. High dispersion of the feature distribution increases the difficulty of image recognition, while features with a more concentrated distribution achieve better recognition performance. Figure 3 illustrates this; Figure 3(B) shows the feature distribution at 400X. It can be seen from Figure 3 that for 40X, 100X, and 200X, GLCM1, GLCM4, APVEC, and HIM have far fewer outliers than the other feature descriptors, indicating that the distributions of these four features are relatively concentrated, which is beneficial for breast cancer identification. In addition, comparing the feature distributions of benign and malignant samples under different magnifications, the distributions of benign and malignant samples for Hog are very similar, indicating a weak ability to discriminate between benign and malignant, which also explains its poor performance. GLCM1 and GLCM4 have markedly more outliers at 400X than at the other magnifications, and the benign and malignant feature distributions of all descriptors are relatively similar at 400X, resulting in the poor performance at this magnification.
Compared with RGB images, grayscale images retain only the brightness information and lose the chroma and saturation information. Three-channel features can make up for the information lost by single-channel features, increasing the discriminative power of the features and thus achieving better recognition performance. To further illustrate the advantages of three-channel features, Table 5 shows the performance of the same descriptors computed on gray-level images.
Comparing Table 3 and Table 5, it can be seen that the performance of the three-channel features is much better than that of the gray-level features, especially for GLCM1, GLCM4, APVEC, HIM and Gabor; for most of them, the accuracy increased by more than 10% at both the image level and the patient level. Figure 4 shows the average recognition accuracy of three-channel and single-channel features. The B channel contains more information about the precise lesion locations, which are usually presented through the nuclei and appear blue-purple.
Different descriptors extract different features, and a single method often cannot capture all of the effective information in an image. Different methods may be complementary, but combining them can also add redundant information. In this paper, GLCM1, which has the best recognition performance, is combined with the 8 other methods except GLCM4; the different features are fused by concatenation. The results are shown in Table 7. Table 7 shows that after combining GLCM1 and APVEC, the recognition accuracy at 40X and 100X is better than either single method at both the image level and the patient level, while the accuracy at 200X and 400X is slightly lower than that of APVEC alone. The combination of GLCM1 and HIM improves the image level accuracy, while at the patient level the accuracy at 40X and 100X is slightly lower than that of GLCM1. This indicates a complementary relationship between GLCM1 and APVEC or HIM. The performance of the combinations of GLCM1 with the other methods is lower than that of GLCM1 alone, which shows that fusing different texture features increases redundancy and reduces discriminability. The recognition accuracy of GLCM1, GLCM4, APVEC, and HIM based on three-channel features is better than in many existing studies, and in particular better than the performance of some deep learning models. Table 8 shows that the method proposed in this paper is superior to many state-of-the-art methods in benign and malignant tumor recognition, at both the image level and the patient level. It is worth mentioning that the works (35-43) did not split the training and test sets according to the protocol of (9), the works (44, 45) adopted the existing protocol, and the works (46, 47) randomly divided a training set (70%) and a test set (30%) but did not mention whether this followed the same protocol.
Although the recognition accuracy of the works (37, 39, 41-43, 46, 47) is significantly higher than that of our method, they all use deep learning models, which require a large number of labeled training samples and consume more training time. In addition, among these works, all except (42) calculated only the image level recognition accuracy, and George et al. tested their method only on the 200X data.

CONCLUSION
In this paper, a breast cancer histopathological image recognition method based on low dimensional three-channel features is proposed. There have been many related studies, but in traditional methods most scholars did not consider the color channels of the image, so the extracted features lost part of the effective information. This paper compares the performance of 10 different feature descriptors in the recognition of breast cancer histopathological images. We extracted the three-channel features of the different descriptors, fused the features of each channel, and then used an SVM to assess their performance. The experimental results show that the recognition accuracy of GLCM1, GLCM4 and APVEC can reach more than 90% at both the image level and the patient level, and that performance based on three-channel features is much better than that based on gray-level features, especially for GLCM1 and GLCM4. We also showed that the R channel has a greater impact on the classification results at 40X, 100X, and 200X, while 400X depends more on the B channel. In addition, high dimensional features consume more recognition time; this paper is dedicated to achieving accurate recognition based on low dimensional features. The experimental results verify that the high dimensional features extracted by LBP, Hog, and Gabor require more recognition time without greatly improving accuracy. Our method builds on existing traditional methods and is easy to implement without
complex image preprocessing. Experimental results and comparison with other methods confirm that our method requires less training time than deep learning methods, which cannot be ignored in practical applications.
In the future work, we will continue to propose more efficient and rapid methods for breast cancer recognition. The target is to realize multi-class recognition of breast cancer based on the research of benign and malignant tumor recognition. In addition to improving the recognition accuracy, we also hope to extract more effective information about cancer, which can help doctors find the lesion faster and reduce the workload on doctors.