Brain tumor classification: a novel approach integrating GLCM, LBP and composite features

Dheepak, G.; J., Anita Christaline; Vaishali, D.

doi:10.3389/fonc.2023.1248452

METHODS article

Front. Oncol., 30 January 2024

Sec. Cancer Imaging and Image-directed Interventions

Volume 13 - 2023 | https://doi.org/10.3389/fonc.2023.1248452

This article is part of the Research Topic Recent Advances in Diagnosis and Treatment of Brain Tumors: From Pediatrics to Adults View all 16 articles

Brain tumor classification: a novel approach integrating GLCM, LBP and composite features

G. Dheepak^*

Anita Christaline J.

D. Vaishali

Department of Electronics & Communication Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Vadapalani Campus, Chennai, TN, India

Identifying and classifying tumors are critical in-patient care and treatment planning within the medical domain. Nevertheless, the conventional approach of manually examining tumor images is characterized by its lengthy duration and subjective nature. In response to this challenge, a novel method is proposed that integrates the capabilities of Gray-Level Co-Occurrence Matrix (GLCM) features and Local Binary Pattern (LBP) features to conduct a quantitative analysis of tumor images (Glioma, Meningioma, Pituitary Tumor). The key contribution of this study pertains to the development of interaction features, which are obtained through the outer product of the GLCM and LBP feature vectors. The utilization of this approach greatly enhances the discriminative capability of the extracted features. Furthermore, the methodology incorporates aggregated, statistical, and non-linear features in addition to the interaction features. The GLCM feature vectors are utilized to compute these values, encompassing a range of statistical characteristics and effectively modifying the feature space. The effectiveness of this methodology has been demonstrated on image datasets that include tumors. Integrating GLCM (Gray-Level Co-occurrence Matrix) and LBP (Local Binary Patterns) features offers a comprehensive representation of texture characteristics, enhancing tumor detection and classification precision. The introduced interaction features, a distinctive element of this methodology, provide enhanced discriminative capability, resulting in improved performance. Incorporating aggregated, statistical, and non-linear features enables a more precise representation of crucial tumor image characteristics. When utilized with a linear support vector machine classifier, the approach showcases a better accuracy rate of 99.84%, highlighting its efficacy and promising prospects. The proposed improvement in feature extraction techniques for brain tumor classification has the potential to enhance the precision of medical image processing significantly. The methodology exhibits substantial potential in facilitating clinicians to provide more accurate diagnoses and treatments for brain tumors in forthcoming times.

1 Introduction

A tumor is an abnormal growth of cells which occurs in any portion of the human body. Over two hundred various kinds of cancer, such as lung, blood, breast, heart, lymphoma, etc. have been reported (1). According to World Health Organization (WHO) fact sheet 2022, cancer has been the leading cause of death with 10 million deaths reported (2). Among the various types of tumors, brain tumors have been the primary reason for death in various age and gender groups and are also challenging to treat. A tumor in the human brain is a collection of malignant cells which develops when brain tissues suddenly and abnormally extend. There are different types of brain tumors, some are non-cancerous (benign) and some are cancerous (3). The human brain acts as the body’s control hub. It coordinates the actions of vast numbers of neurons and their many connections. Tumor in the brain disrupts normal brain activities and the nervous system processes. The need to overcome the disadvantages of manual tumor image analysis, which is both time-consuming and vulnerable to human subjectivity, motivated machine learning based techniques of classifying tumors.

As discussed by Abdusalomov et al. (4) Glioma, Meningioma, Pitutary seem to be the common types of brain tumors that look like non-cancerous, but may be. Hence this research intends to study these brain tumors and classify them by incorporating advanced features such as GLCM and LBP, as well as interaction features and statistical analysis. This method has the potential to significantly improve the precision of medical image processing for more precise brain tumor identification and treatment planning.

The primary contributions of this present investigation are

● This research introduces a novel methodology for comprehensive texture analysis of tumor images. The approach integrates Gray-Level Co-occurrence Matrix (GLCM) and Local Binary Pattern (LBP) features. GLCM features capture spatial pixel intensity relationships, while LBP features identify local texture patterns. Together, these features provide a detailed insight into tumor textures, facilitating improved understanding and tumor classification.

● A novel feature generation technique has been proposed, which generates interaction features by multiplying GLCM and LBP feature vectors. This technique generates a new set of features that enhance the discriminative ability of the model, thereby enhancing its capacity to differentiate between various tumor morphologies. The incorporation of these interaction features enables the acquisition of a broader spectrum of texture data, resulting in a greater comprehension of tumor characteristics.

● In addition, the methodology incorporates the computation of aggregated characteristics derived from GLCM properties. These aggregated features, which consist of the sum, mean, and median of the GLCM features, provide a more comprehensive view of the overall characteristics of tumor textures.

● For Tumor classification Support Vector Machine (SVM) is implemented with extracted features which enhances the performance of tumor classification.

The next section discusses the various research works related to tumors and their classification.

2 Related literature

In order to categorize common brain tumor types, Kaplan et al. (5) have used nLBP and αLBP feature extraction approaches. Using the K-Nearest Neighbour (Knn) model and the nLBPd = 1 method, they achieved a high 95.56% success rate of identifying tumors. Another study by Abdusalomov et al. (4) used YOLOv7 and transfer learning to improve brain tumor diagnosis in MRI scans, they report an outstanding 99.5% accuracy for identifying the most common types of brain tumors Glioma, Meningioma, Pitutary. However, they also acknowledge the need for additional research, particularly for minor tumor identification (4).

Research by Kaya et al. (6) used a novel feature extraction technique based on co-occurrence matrices from vibration data to address the problem of accurate bearing issue identification. Effective success rates were obtained by utilizing 1D-LBP and machine learning: 87.50% for dataset 1 (various speeds), 96.5% for dataset 2 (fault size in mm), and 99.30% for dataset 3 (fault type - inner ring, outer ring, ball). Study by Solani et al. (7) examines the difficulties in diagnosing brain tumors and provides information on the potential of MR imaging. They adopt statistical and machine learning techniques to detect brain tumor for a chosen dataset. Yildirim et al. (8) have studied the most accepted forms of brain tumors include gliomas, meningiomas, and pituitary. They say that the optimal course of action for treating these tumors may differ reliant on the type. Brain tumors can be challenging to classify, even for experts, due to heterogeneous imaging findings (8).

According to Shinde et al. (9), while progress has been made in classifying anomalies in medical imaging, there are still challenges to overcome. These include, but are not limited to, model selection, data description, error detection, data sufficiency, and result reliability. As a result, there is no one highest benchmark for categorizing medical images. So, it is quite problematic in computer vision and machine learning domains. The algorithms mentioned generally are developed using soft computing and model-based methodologies, and their results are reliable (9). With the help of mobile sensor inputs, work by Kuncan et al. (10) presents a unique feature extraction approach called DS-1D-LBP for human activity recognition (HAR). They have successfully classified with the Extreme Learning Machine (ELM) with a high success rate of 96.87%. Research by Shil et al. (11) rely heavily on features that have been manually constructed and then provided to a classifier, such as Support Vector Machine, Decision Tree, or k-Nearest Neighbour (KNN). Khalid et al. state that the extensive nature of the dataset can cause delays in feature engineering, thereby increasing the likelihood of errors and highlighting the significance of domain expertise (12).

Processing and analyzing MRI images of brain tumors is one of the most challenging and promising new areas of study. An MRI, which employs magnetic fields and radio waves to generate overall images of internal body structures, is essential for determining the optimal course of treatment for a tumor and its progression. Texture features based on GLCM were first introduced by Haralick et al. (13) in 1979. In biomedical field, advantage of the textural properties of images aids in image classification.

There are numerous methods available to derive the relevant data from imaging modalities for region-based segmentation, including artificial neural network (ANN), fuzzy clustering means (FCM), support vector machine (SVM), knowledge-based techniques, and the expectation-maximization (EM) algorithm technique. Image analysis using SVM and BWT methods was suggested by Bahadure et al. (14) for detecting and classifying brain tumors using MRI. Skull stripping, in which non-brain tissues are removed, allowed for a 95% detection rate utilizing this method. Joseph et al. (15) introduced a technique for segmenting MRI brain images for tumor diagnosis that combines the K-means clustering algorithm and morphological filtering technique. Alfonse and Salem (16) suggested a method for automatically classifying MRI scans for brain tumors using a support vector machine. The researchers employed Fast Fourier Transform (FFT) to extract features to enhance the classifier’s precision. Additionally, they utilized technology that exhibited minimal redundancy and maximum relevance to reduce the number of features.

In order to better segregate brain tumors, Shree et al. (17) pre-processed the images using multiple noise removal methods. Their research employed DWT and GLCM-based characteristics of brain tumors. Any residual noise after segmentation was filtered out using morphological filtering procedures. The suggested model was trained and evaluated using the probabilistic neural network classifier for pinpointing tumor locations in brain MRI scans. Yao et al. (18) provided a method that includes extracting texture characteristics using the wavelet transform and classifier as SVM with an accuracy of 83% to process and address protocols of diverse images and non-linearity of actual data to classify improved MRI images related to contrast. Principal component analysis (PCA) and a radial basis function kernel with SVM were proposed by Kumar and Vijayakumar (19) for classifying and segmenting brain tumors. They were able to achieve 94% success rate using this strategy.

Saleck et al. (20) developed a Fuzzy C-Mean (FCM) way of figuring out the size of a patient’s brain tumor. They could figure out how many groups were there in the FCM by looking at the intensity of each pixel. This approach uses GLCM texture feature extraction to forecast the threshold value. The generic performance of a model is established by how sensitive, specific, and accurate it is.

Considering the research going on in this field, it seems evident that choosing appropriate features of tumor images and adopting appropriate machine learning classifiers will lead to better identification of tumors.

Based on the literature survey, this research intends to study three types of brain tumors (Glioma, Meningioma, Pitutary) and classify them by incorporating novel and advanced features such as GLCM and LBP, as well as interaction features and statistical analysis. This research work introduces novel contributions in the following aspects:

● Interaction Features: The extraction of interaction features involves the computation of the outer product between the Gray-Level Co-occurrence Matrix (GLCM) and Local Binary Pattern (LBP) feature vectors. The aforementioned process results in the formation of a matrix that effectively encompasses the interplay between spatial and local texture data. The matrix is subsequently converted into a unidimensional array in order to constitute the collection of interaction features.

● Non-linear Features: Another significant contribution is applying a logarithmic transformation to the GLCM features. This transformation generates non-linear features, allowing for the capturing of complex relationships within tumor images. By incorporating these non-linear features, the research improves the discriminative power of the feature set and enables a more sophisticated analysis and interpretation of the tumor image properties.

3 Proposed methodology

Publicly available databases Figshare Dataset (13) has been used in this study since it is one of the most common datasets used by many other researchers. This brain tumor dataset contains 3064 T1-weighted contrast enhanced images from 233 patients for three kinds of brain tumor: meningioma (708 slices), glioma (1426 slices), and pituitary tumor (930 slices) (21). The conversion to grayscale from RGB is performed to enrich the images further. The three brain tumor images used are Glioma, Meningioma, and pituitary tumor. As proposed by Demirhan et al. (22) converting the input images to grayscale gives a simplified representation is obtained that effectively captures the overall brightness information while eliminating the complexities associated with color. The parameters include boosting the signal-to-noise ratio, making MR images look better, removing the background of unwanted sections, smoothing the inner parts and keeping the essential edges intact. The proposed work is depicted in Figure 1.

Figure 1

Figure 1 Overall proposed model architecture.

4 Feature extraction

To extract significant features from brain MRI images, six types of feature extraction techniques have been implemented in this study, as listed in Table 1. Using these techniques, important aspects of the images could be identified and analyzed.

Table 1

Table 1 Feature etxtraction Techniques used in this research work.

4.1 GLCM feature extraction

Texture analysis facilitates the differentiation between healthy and unhealthy tissues for visual perception and ML algorithm. In addition, it reveals differences between malignant tumors and normal tissues that might not be observable by the naked eye. By selecting efficient statistical features for early diagnosis, the accuracy can be improved. Second-order statistical texture features can be extracted using GLCM. Creating a GLCM matrix and then deriving statistical metrics from this matrix measures the frequency with which pairs of pixels with specific values and a specific spatial relationship appear in an image. GLCM, or Gray-level spatial dependence matrix (GLSDM), has been used in this research used to extract the statistical features. GLCM was first proposed by Haralick et al. (23) for describing the geographical relationship between pixels with different levels of Gray-level.

To analyze an image’s texture statistically, the GLCM counts how often pairs of pixels with the same value and the same relative position appear in the image. By determining the frequency with which pairs of pixels with a given weight and in a given spatial relationship occur in an image, the GLCM functions can characterize an image’s texture through the extraction of statistical measurements. The Gray-Level Co-occurrence Matrix (GLCM) is a two-dimensional histogram where each pair of ‘p’ and ‘q’ represents the frequency with which the events ‘p’ and ‘q’ occur. As a function of distance S = 1, angle (0 degrees horizontal, 45 degrees positive diagonal, 90 degrees vertical, and 135 degrees negative diagonal), and gray scales ‘p’ and ‘q’, it determines the frequency with which a pixel of intensity ‘p’ occurs in proximity to a pixel of intensity ‘q’ at a given distance ‘S’ and orientation.

After computing GLCM, five different statistical features are extracted from GLCM. These extracted features include,

Contrast: Determines the local variances of the grey-level co-occurrence matrix.

Homogeneity: Determines the proximity of the GLCM element distribution to the GLCM diagonal.

Dissimilarity: Quantifies the range of grayscale intensity.

Energy: It will calculate the pixel’s uniformity.

Correlation: Calculates the average degree to which each pixel in the image is connected with its neighbors.

The formulas used to calculate the above characteristic features are shown in Table 2.

Table 2

Table 2 Features of GLCM.

4.2 LBP features

The Local Binary Patterns (LBP) technique is a widely employed texture descriptor in image processing and computer vision. Local texture representation is a straightforward yet efficient method used to depict the texture characteristics of an image. This technique has been extensively applied in various domains, such as object recognition, face detection, and image segmentation. The step wise LBP Feature Extraction is, each pixel is compared with its neighboring pixels. Consider the Gray value of the center pixel as g_c, and the gray value of the neighboring pixel as g_p. Comparison is carried out with using Equation 1.

\begin{array}{l} S (ɡ_{p}, ɡ_{c}) = i f ɡ_{p} \geq = ɡ_{c} e l s e 0 & (1) \end{array}

A circle of radius R is centered around the center pixel and the function S is applied to P evenly spaced pixels.

4.3 Interaction features

A novel approach implemented in this research is the extraction of image features for tumor classification, wherein a strategy for generating interaction features is employed with the aim of potentially improving the performance of the model. This approach entails the integration of two sets of important attributes: the GLCM attributes and LBP attributes, both of which effectively capture fundamental properties of the images being analyzed.

The GLCM features provides insights into the spatial interdependence of pixel intensities within an image, effectively capturing and representing texture details. In contrast, the features of LBP provide a quantification of the local spatial patterns of pixel luminance, thereby providing supplementary information regarding texture.

The computation of interaction features involves the outer product of the feature vectors obtained from GLCM and LBP. In mathematical terms, the outer product of the GLCM features vector (g) and the LBP features vector (l) yields a matrix (M). Each element M_i,j of this matrix represents the product of the i^th GLCM feature and the j^th LBP feature. The aforementioned statement aptly describes the correlation between the respective GLCM and LBP features.

Subsequently, the interaction matrix is transformed into a one-dimensional vector, thereby generating a novel set of features that capture the interplay between the initial feature sets. The inclusion of this extended feature set, in combination with the existing GLCM and LBP features, offers a more exhaustive and refined depiction of the image. Consequently, it has the potential to improve the machine learning model’s capacity to differentiate between various tumor classifications.

The steps involved in interaction features are,

Defining GLCM and LBP features vector: GLCM feature vector is defined as

ɡ = ɡ₁, ɡ₂, ɡ₃,…ɡ_m and the LBP feature vectors as l = l₁,1₂,l₃,…l_n in which ‘m’ is the number of GLCM features and ‘n’ is number of LBP features.

Calculating the outer product: The matrix ‘M’ is obtained by computing the outer product of the two given vectors. The matrix is provided as follows:

\begin{array}{l} M = ɡ \otimes l = [\begin{matrix} ɡ_{1} l_{1} & \dots & ɡ_{1} l_{n} \\ ⋮ & ⋱ & ⋮ \\ ɡ_{m} l_{1} & \dots & ɡ_{m} l_{n} \end{matrix}] & (2) \end{array}

Each element of this matrix M_ij = ɡ_il_j is a new feature created which captures the interaction between i^th GLCM feature as well as with j^th LBP feature.

Combining interaction features and original features: The original GLCM and LBP features are combined with the interaction features to create the final feature vector for each image. If the GLCM feature vector ‘g’ has a size of m and the LBP feature vector ‘l’ has a size of n, then the final feature vector will have a size of m + n + mXn.

These procedures describe how interaction features are generated from GLCM and LBP features. This augmented feature set has the potential to provide a more comprehensive and nuanced image representation, thereby enhancing the performance of the image classification model.

Here GLCM calculates at 4 different angles such as (0, 45, 90, and 135 degrees). The GLCM algorithm proceeds by computing five distinct properties, namely ‘contrast’, ‘dissimilarity’, ‘homogeneity’, ‘energy’, and ‘correlation’, for each angle. These properties are then used to generate a vector of GLCM features. Assuming the vector representing the GLCM features for an image is [10, 20, 30, 40, 50]. The aforementioned values indicate that the contrast is 10, the dissimilarity is 20, the homogeneity is 30, the energy is 40, and the correlation is 50.

The LBP algorithm is employed to calculate the LBP values using 8 sampling points positioned evenly along a circle with a radius of 1. Subsequently, the histogram of these LBP patterns is computed. Suppose the Local Binary Patterns (LBP) features for a given image are represented by a histogram consisting of 256 bins. The values within this histogram range from 0.01 to 0.01, with each bin containing a distinct value.

The interaction features are generated through the computation of the outer product between the feature vectors of the Grey Level Co-occurrence Matrix (GLCM) and the Local Binary Patterns (LBP).

For the purpose of explanation, consider a simplified scenario where only first five bins of LBP features. These bins are represented by the values [0.01, 0.02, 0.03, 0.04, 0.05].

The 5x5 matrix is obtained by computing the outer product of the GLCM features [10, 20, 30, 40, 50] and the first 5 LBP features [0.01, 0.02, 0.03, 0.04, 0.05]. The value of each element in this matrix is obtained by multiplying a feature from the Grey Level Co-occurrence Matrix (GLCM) with a feature from the Local Binary Patterns (LBP) feature.

The 5x5 matrix is obtained by taking the outer product of the GLCM features [10, 20, 30, 40, 50] and the first 5 LBP features [0.01, 0.02, 0.03, 0.04, 0.05]. The value of each element in this matrix is obtained by multiplying a feature derived from GLCM with a feature derived from the LBP algorithm as represented in Equation 3.

\begin{array}{l} [\begin{matrix} 10 * 0.01 & \dots & 10 * 0.05 \\ ⋮ & ⋱ & ⋮ \\ 50 * 0.01 & \dots & 50 * 0.05 \end{matrix}] & (3) \end{array}

The interaction features offer way to capture the potentially significant connections among various elements of an image’s texture. The precise dynamics of these interactions are dependent upon the data and the attributes of the images under examination. In the context of tumor images, these interactions may potentially expose complicated patterns that are pivotal in distinguishing between various tumor types.

4.4 Aggregated features

The aggregated features represent the additional statistical features of GLCM features. The GLCM features can be consolidated into more straightforward statistical measures. The mean, median, and total of the GLCM characteristics are these statistical measurements. The steps involved in aggregated features calculation is shown in Algorithm 1.

Algorithm 1

Algorithm 1 Calculation of aggregated features.

The features extracted from aggregated features are as follows,

Sum: The total number of GLCM characteristics is determined. This gives us a single number that can be taken as a measure of the “amount” of GLCM features in the image. If the GLCM characteristics are represented by the vector g = [ɡ1, ɡ2,…ɡn] then

ɡ_m = ɡ1 + ɡ2 + … + ɡn gives the total.

Mean: The average GLCM characteristics are determined. This gives us a single number that stands in for the “typical” value of the GLCM feature in the image.

$G_{M e a n} = \frac{(ɡ 1 + ɡ 2 + \dots ɡ n)}{n}$ gives the mean, where ‘n’ defines the GLCM features total numbers.

Median: GLCM features median are computed. When the GLCM features are ordered numerically, this yields a single value that characterizes the “middle” value. The median is the midpoint if ‘n’ is an odd number. If ‘n’ is divisible by 2, then the median is the midpoint between those two values.

Following these computations, the aggregated GLCM features take the form of a vector with three elements, which are denoted by the notations [ɡ_sum, ɡ_mean, and ɡ_median].

The aggregated features (sum, mean, and median) provide a summary of the GLCM features, capturing various aspects of their level and distribution. In addition to the GLCM features, LBP features, interaction features, statistical features, and non-linear features, these characteristics are added to the final feature vector for each image.

Pattern recognition, machine learning, and image classification employ aggregated features such as summation, average, and median. The raw features are condensed, effectively representing the central tendency and overall pattern of the data, thereby offering a simplified yet meaningful perspective of the dataset. The aforementioned properties exhibit a lower degree of variation compared to individual data points, thereby enhancing the robustness of models against the presence of noisy or outlier data. The utilization of these techniques results in a reduction of data dimensionality, thereby enhancing the efficiency of algorithms. The inclusion of aggregated features in a model has been observed to enhance its predictive performance by uncovering latent data patterns. The code provided utilizes the GLCM attributes to generate aggregated features that summarize the textural characteristics of the image. This has the potential to enhance the tumor classification model.

4.5 Statistical GLCM features

A set of statistical measures derived from GLCM of an image constitutes the statistical GLCM features. The GLCM matrix depicts the spatial relationship among image pixel pairs. Each element of the GLCM indicates the probability that two pixels with a particular grey level will occur at a particular distance apart. GLCM statistical traits are used to describe an image’s texture (24). Texture is how the pixels in an image are placed in space. It can be used to tell the difference between different kinds of images, like images of nature, medical images, and images of factories, etc.

The following features are derived under statistical GLCM features,

Variance: The variance quantifies the degree to which GLCM features deviate from their mean value. When the variance of the GLCM features is high, the values they take on span a wide range, whereas when it’s low, the features tend to cluster tightly around the mean. Equation 2 represent the variance formulation. Provided that GLCM feature vector ɡ = [ɡ₁, ɡ₂…ɡ_m]. Where ‘m’ is the number of GLCM features.

\begin{array}{l} v a r i a n c e = = \frac{1}{m} \sum_{i = 1}^{m} {(ɡ_{i} - m e a n)}^{2} & (4) \end{array}

Where, average of GLCM features is mean.

Skewness: The concept of skewness pertains to the degree of asymmetry exhibited by the distribution of GLCM features in relation to their mean value. Skewness is computed as Equation 5,

\begin{array}{l} s k e w n e s s = \frac{1}{m} \sum_{i = 1}^{m} {(\frac{ɡ_{i} - m e a n}{s t d})}^{3} & (5) \end{array}

Where ‘std’ is the standard deviation of GLCM features.

Kurtosis: Kurtosis assesses the tailedness of GLCM feature distributions. High kurtosis shows thick tails and a strong peak, indicating many outliers. Low kurtosis shows light tails and a flat peak, indicating no outliers. Mathematically Kurtosis is expressed in Equation 6.

\begin{array}{l} k u r t o s i s = \frac{1}{m} \sum_{i = 1}^{m} {(\frac{ɡ_{i} - m e a n}{s t d})}^{4} - 3 & (6) \end{array}

From Equations 3, 4 ‘m’ is the number of GLCM features and ɡ_i refers to the i^th element of GLCM feature vector.

These statistical GLCM characteristics can shed light on how the GLCM features are typically distributed. Statistical GLCM features exhibit information on the texture’s variability, asymmetry, and outliers, while GLCM features themselves capture the texture details within the image.

The incorporation of statistical GLCM features, such as variance, skewness, and kurtosis, enhances the predictive model by encompassing supplementary distributional information pertaining to the GLCM features. The statistical measures employed in this study shed light on subtle texture variations that may not be easily distinguishable solely from the raw GLCM features, as they effectively capture the spread, asymmetry, and tailedness of these features. The model’s exceptional performance metrics, such as its nearly perfect accuracy, precision, recall, and F1 score, are likely enhanced by the incorporation of statistical GLCM features, although other factors may also contribute to these outcomes. The inclusion of these features enhances the diversity of the 1547-feature set, thereby augmenting the model’s capacity to accurately classify the tumor images. Statistical GLCM features play a crucial role in enhancing the predictive performance of the model by providing detailed information regarding the texture characteristics of the image.

4.6 Non-linear features

Another set of novel features of this research work, in terms of non-linear features, is computed from the GLCM feature vectors by applying a logarithmic transformation. This process generates non-linear features from the GLCM feature vectors. These GLCM feature vectors, obtained from grayscale image analysis, contain numerical values representing various statistical measures. The application of the logarithmic transformation on these GLCM feature vectors gives rise to the non-linear features. These non-linear features capture intricate patterns and relationships in the data, which may enhance the performance of machine learning models.

The procedure for obtaining non-linear features from the features derived from the Grey Level Co-occurrence Matrix (GLCM) is as follows,

GLCM feature computation: The initial stage entails the computation of the Gray-Level Co-occurrence Matrix (GLCM) features, which serve to extract texture information from the image. The aforementioned features consist of contrast, dissimilarity, homogeneity, energy, and correlation. The computations pertain to the manipulation of the Grey Level Co-Occurrence Matrix (GLCM), a matrix that denotes the occurrence frequency of various combinations of pixel intensities within the image.

The GLCM features can be denoted as C(contrast), D(dissimilarity), H(homogeneity),

E (energy), and Corr(correlation).

Applying Non-linear Transformation: Once the GLCM features have been computed, a non-linear transformation is applied to these features by utilizing the natural logarithm function. The transformation is represented by,

\begin{array}{l} n q . l o g q (m) = log (m + 1) & (7) \end{array}

From Equation 7 where, ‘m’ as the scalar value or the array of the input for which the natural log is to be determined and the input array is being incremented by ‘1’ which is referred as ‘m+1’ in Equation 3. Later, the modified array log function is ‘(m+1)’ which is being computed for each element present in GLCM vector. The resultant of this log function is the non-linear features.

The computation of the non-linear features corresponding to each GLCM feature is performed in the following manner.

\begin{array}{l} C^{'} = log (1 + C) & (8) \end{array}

\begin{array}{l} D^{'} = log (1 + D) & (9) \end{array}

\begin{array}{l} H^{'} = log (1 + H) & (10) \end{array}

\begin{array}{l} E^{'} = log (1 + E) & (11) \end{array}

\begin{array}{l} C o r r^{'} = log (1 + C o r r) & (12) \end{array}

Where Equations 8-12 represent the transformed GLCM features.

In the given transformation, it is crucial to add one to the original feature value, denoted as C, before applying the natural logarithm function, represented as C’ = log(1+C). The need for this adjustment arises in cases where the correlation coefficient ‘C’ is equal to zero, as the natural logarithm of zero is undefined. This lack of definition can result in computational challenges. The inclusion of a constant term enables the logarithmic transformation to be applicable to all conceivable values of ‘C’, encompassing the value of zero. This step is crucial in ensuring the integrity and precision of the feature engineering process. The identical process is employed for the remaining GLCM features, ensuring the integrity of all modified features.

The primary objective of employing the logarithmic transformation is to effectively capture and accentuate non-linear relationships and variations present within the GLCM features. Through the utilization of the logarithm function, the feature values undergo a transformation, resulting in their representation on a logarithmic scale. The utilization of this technique can facilitate the identification of patterns, intensify the differentiation between elements, and enhance the accuracy of the representation of the characteristics of the GLCM.

Non-linear transformations have the potential to effectively capture complex structures within the data, which may not be easy to identify through the original features. Consequently, the utilization of such transformations has the capacity to enhance the performance of machine learning models.

4.7 Concatenated features

Finally, the various features including GLCM, LBP, interaction, aggregated, statistical, and non-linear features are consolidated into a unified feature vector for every image. The process involves arranging all the features consecutively to create a lengthy vector. The concatenated feature vector serves as a representation of the image within the feature space, enabling its utilization in subsequent analysis or machine learning endeavours.

5 Results & discussion

The brain tumor classification model was implemented in Python. Statistical and non-linear feature extraction were used in conjunction with GLCM, LBP, Interaction, and aggregation to create the model. Finally, we classified brain tumors using a support vector machine.

Proposed Composite Feature Extraction Model Performance

In this work, a set of performance metrics have been computed to evaluate the effectiveness of the proposed composite Feature Extraction model, as illustrated in Table 3.

Table 3

Table 3 Performance metrics formula.

Tables 4–8 compares the proposed Composite Feature Extraction model’s sensitivity, precision, specificity, accuracy, DSC, FPR, and FNR metrics with 23 existing models.

Table 4

Table 4 Comparison of performance metrics: existing models with composite feature extraction (proposed model).

Table 5

Table 5 DSC comparison: composite feature extraction vs. existing models.

Table 4 presents a comprehensive evaluation of performance metrics for different models, including the Composite Feature Extraction model proposed in this study. The assessed metrics include Precision, Recall (or sensitivity), F1-score, Specificity, and Accuracy.

The proposed Composite Feature Extraction model demonstrates exceptional performance in brain tumor classification, achieving approximately 99.83% across all key metrics: Precision, Recall, F1-score, Specificity, and Accuracy. The obtained result indicates the model’s high reliability in accurately identifying tumors and healthy cases, effectively reducing false outcomes. The balanced F1-score shows the model’s consistent performance in precision and recall. Overall, the model’s robustness and superior performance signify its effectiveness in brain tumor classification, surpassing existing models.

The presented Table 5 provides a comparative analysis of the Dice Similarity Coefficient (DSC) values of different models utilized to classify three distinct types of brain tumors, namely glioma, meningioma, and pituitary tumors. The Dice similarity coefficient (DSC) is an essential metric to quantify the degree of similarity between the predicted and actual tumor regions observed in brain images. A greater DSC (Dice similarity coefficient) value indicates a higher level of accuracy in the model, particularly in accurately classifying and distinguishing glioma, meningioma, and pituitary tumors. As proposed, the Composite Feature Extraction model demonstrates exceptional performance with a Dice Similarity Coefficient (DSC) value of 99.6. This value signifies the model’s superior accuracy in effectively classifying the three distinct types of brain tumors, surpassing the performance of existing models.

Table 6 illustrates the Composite Feature Extraction model’s enhanced efficacy in classifying tumors. The model demonstrates superior performance compared to existing models, as evidenced by its remarkably low False Positive Rate (FPR) of 0.00625 and a False Negative Rate (FNR) of 0.0. These results highlight the model’s exceptional accuracy and reliability in effectively reducing false alarms and missed detections.

Table 6

Table 6 FPR and FNR comparison: composite feature extraction vs. existing models.

Table 7 presents the categorization of Meningioma, Glioma, and Pituitary Tumours utilizing various machine-learning techniques. The combined studies used a total of 3064 samples. The proposed methodologies for image classification include a Convolutional Neural Network (CNN) model achieving an accuracy of 97.3%. Additionally, a hybrid approach combining Convolutional Dictionary Learning and AlexNet achieves a 91-96% accuracy range. Another model, BrainMRNet, incorporates hypercolumns, attention modules, and residual blocks, achieving an accuracy range of 96-98%. The proposed Composite Feature Extraction model utilizing GLCM, LBP, and Composite Features achieves an accuracy of 99.83% compared with other existing models.

Table 7

Table 7 Comparison based on dataset: composite feature extraction vs. existing models.

In this research work, SVM classifier with a linear kernel is implemented to classify different types of tumors. The input data is processed and converted into the desired format using the kernel function. The complexity of a linear Support Vector Machine (SVM) is lower than that of a non-linear SVM, resulting in faster training.

Support Vector Machines (SVMs) can effectively classify data points by projecting them onto a feature space with many dimensions, even in cases where the data points are not linearly separable. Once a separator between the categories has been identified, the data is transformed, representing the division as a hyperplane. In the present context, the performance metrics of accuracy, precision, recall, and F1 score indicate that SVM can effectively discriminate between various classes of brain tumors. Accurately classifying brain tumors is paramount in medical diagnosis and subsequent treatment planning.

The comparison of various techniques for classifying brain tumors is presented in Table 8. Several models are included in this study, such as BMRI-NET, which is a stack ensemble model. Additionally, a two-channel deep neural network (DNN) model, GoogleNet with K-nearest neighbors (KNN), VGG-16, Resnet50, and InceptionV3 models, a simple convolutional neural network (CNN), a multiscale cascaded multitask network, and the DL (ResNet50V2) model are also considered. The accuracy of the data falls within the range of 96.3% to 99.68%.

Table 8

Table 8 Accuracy comparison: composite feature extraction vs. existing models.

The highest accuracy of 99.83% was achieved by employing a combination of Composite Feature Extraction techniques, namely Grey Level Co-occurrence Matrix (GLCM), Local Binary Patterns (LBP), and Composite Features, in conjunction with an SVM-Linear classifier. The findings of this study indicate that the technique examined in this research is the most prominent method for classifying brain tumors when compared to other approaches, exhibiting exceptional levels of accuracy.

From Figure 2, the model’s classification performance is considered exceptional based on the Area Under the Curve (AUC) values obtained from the Receiver Operating Characteristic (ROC) analysis. The model achieves a perfect Area Under the Curve (AUC) score of 1.00, indicating its ability to classify instances belonging to Class 0 and Class 2 accurately. Despite encountering challenges and exhibiting a few inaccuracies, the model’s overall performance remains commendable, evidenced by its high Area Under the Curve (AUC) value of 0.92. However, it is essential to consider the comprehensive evaluation of the model’s performance. It is noteworthy that the micro-average AUC achieves a near-perfect score of 0.99, indicating the model’s high efficacy across all classes.

Figure 2

Figure 2 Proposed composite feature extraction model ROC plot.

A significant amount of time, specifically 654.45 seconds, was dedicated to the analysis of image features, such as the Grey Level Co-occurrence Matrix (GLCM) and Local Binary Pattern (LBP), during the extraction process. However, the training process of the Support Vector Machine (SVM) classifier using these features was completed in a mere 0.16 seconds. The utilization of resources in this study was primarily focused on feature extraction rather than model training. With a strikingly low ratio of 0.00024389, the outstanding efficiency of computation is evident in this scenario, clearly demonstrating the high priority of computing resources.

The model attained a perfect average accuracy score of 1.0 on the training data, regardless of the varying sizes of the training sets. Nevertheless, notwithstanding the flawless score, the model exhibited commendable performance on the validation data. The predictive capabilities of the model exhibit a high level of robustness, as indicated by a mean accuracy score that falls within the range of 0.993 to 0.998. Furthermore, it is worth noting that the model consistently demonstrated strong performance across a diverse set of data folds, as indicated by the narrow range of standard deviations, which varied between 0.013 and 0.035.

6 Conclusion

Based on the obtained results, the classification model demonstrates exceptional performance, with an accuracy, precision, recall, and F1 score that are all close to 1, indicating a high success rate in classifying across all three classes of brain tumors (Glioma, Meningioma, Pitutary). The model also demonstrates a 100% True Positive Rate (no false negatives) and a 0% False Negative Rate (no false positives), in addition to a very low False Positive Rate, indicating an excellent balance between sensitivity and specificity.

This work presents a novel methodology that offers an improved representation of tumor image textures by combining LBP and GLCM features. The addition of interaction features, which are created by taking the outer product of the GLCM and LBP vectors, greatly improves the derived features’ ability to discriminate. This unique feature distinguishes the strategy from other approaches. Furthermore, an even more thorough representation of critical tumor image characteristics is obtained through the integration of aggregated, statistical, and non-linear information. Combining the approach with a linear support vector machine classifier yields 99.84% of accuracy rate. We opted to use the Figshare dataset so as to compare our classification accuracies with existing research done with same dataset. Being better than other research outcomes, this method can be applied to real time data.

As the most time-consuming phase, future research could concentrate on streamlining the extraction of features. In addition, future research may investigate the possibility of utilizing the extracted detailed features with less computational time. Another objective of future research is to analyze and classify high-grade brain tumors comprehensively utilizing high-grade brain tumors dataset to improve an understanding of complex tumors.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.

Ethics statement

Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the patients/participants or patients/participants’ legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

Author contributions

All authors listed have made a substantial, direct, and intellectual contribution to the work, and approved it for publication.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Acknowledgments

We would also like to thank SRM Institute of science and Technology, Vadapalani Campus for supporting us in providing good lab infrastructure to carry out our research.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Davis CP. Cancer causes, types, treatment, stages and prevention. In: Cancer: Symptoms, Causes, Treatment, Stages, Prevention (medicinenet.com). MedicineNet (2023).

Google Scholar

2. Cancer. In: Cancer (who.int). WHO Factsheet.

Google Scholar

3. Brain Tumor Brain tumor - Symptoms and causes (2022). Mayo Clinic (Accessed August 04, 2022).

Google Scholar

4. Abdusalomov AB, Mukhiddinov M, Whangbo TK. Brain tumor detection based on deep learning approaches and magnetic resonance imaging. Cancers (Basel) (2023) 15(16):4172. doi: 10.3390/cancers15164172

PubMed Abstract | CrossRef Full Text | Google Scholar

5. Kaplan K, Kaya Y, Kuncan M, Ertunç HM. Brain tumor classification using modified local binary patterns (LBP) feature extraction methods. Med Hypotheses (2020) 139:109696. doi: 10.1016/j.mehy.2020.109696

PubMed Abstract | CrossRef Full Text | Google Scholar

6. Kaya Y, Kuncan M, Kaplan K, MiNaz MR, Ertunç HM. A new feature extraction approach based on one dimensional gray level co-occurrence matrices for bearing fault classification. J Exp Theor Artif Intell (2020) 33(1):161–78. doi: 10.1080/0952813x.2020.1735530

CrossRef Full Text | Google Scholar

7. Solanki S, Singh UP, Chouhan SS, Jain S. Brain tumor detection and classification using intelligence techniques: an overview. IEEE Access (2023) 11:12870–86. doi: 10.1109/access.2023.3242666

CrossRef Full Text | Google Scholar

8. Yildirim M, Cengil E, Eroglu Y, Cinar A. Detection and classification of glioma, meningioma, pituitary tumor, and normal in brain magnetic resonance imaging using deep learning-based hybrid model. Iran J Comput Sci (2023) 6:455–464. doi: 10.1007/s42044-023-00139-8

CrossRef Full Text | Google Scholar

9. Shinde AS, Mahendra BM, Nejakar SM, Herur S, Bhat N. Performance analysis of machine learning algorithm of detection and classification of brain tumor using computer vision. Adv Eng Software (2022) 173:103221. doi: 10.1016/j.advengsoft.2022.103221

CrossRef Full Text | Google Scholar

10. Kuncan F, Kaya Y, Kuncan M. A novel approach for activity recognition with down-sampling 1D local binary pattern features. Adv Electrical Comput Eng (2019) 19(1):35–44. doi: 10.4316/aece.2019.01005

CrossRef Full Text | Google Scholar

11. Shil SK, Polly FP, Hossain MA, Ifthekhar MS, Uddin MN, Jang YM. (2017). An improved brain tumor detection and classification mechanism, in: 2017 International Conference on Information and Communication Technology Convergence (ICTC), Jeju, Korea (South. pp. 54–7. doi: 10.1109/ictc.2017.8190941

CrossRef Full Text | Google Scholar

12. Khalid S, Khalil T, Nasreen S. (2014). A survey of feature selection and feature extraction techniques in machine learning, in: 2014 Science and Information Conference, London, UK. pp. 372–8. doi: 10.1109/sai.2014.6918213

CrossRef Full Text | Google Scholar

13. Haralick RM. Statistical and structural approaches to texture. Proc IEEE (1979) 67(5):786–804. doi: 10.1109/PROC.1979.11328

CrossRef Full Text | Google Scholar

14. Bahadure NB, Ray AK, Thethi HP. Image analysis for MRI based brain tumor detection and feature extraction using biologically inspired BWT and SVM. Int J Biomed Imaging (2017) 2017:1–12. doi: 10.1155/2017/9749108

CrossRef Full Text | Google Scholar

15. Joseph RP, Singh CS. Brain tumor MRI image segmentation and detection in image processing. Int J Res Eng Technol (2014) 03(13):1–5. doi: 10.15623/ijret.2014.0313001

CrossRef Full Text | Google Scholar

16. Alfonse M, Salem M. An automatic classification of brain tumors through MRI using support vector machine. Egypt Comput Sci J (2016) 40:11–21.

Google Scholar

17. Shree NV, Kumar TKS. Identification and classification of brain tumor MRI images with feature extraction using DWT and probabilistic neural network. Brain Inf (2018) 5(1):23–30. doi: 10.1007/s40708-017-0075-5

CrossRef Full Text | Google Scholar

18. Yao J, Chen JC, Chow C. Breast tumor analysis in dynamic contrast enhanced MRI using texture features and wavelet transform. IEEE J Selected Topics Signal Process (2009) 3(1):94–100. doi: 10.1109/jstsp.2008.2011110

CrossRef Full Text | Google Scholar

19. Kumar P, Vijayakumar B. Brain tumour mr image segmentation and classification using by PCA and RBF kernel based support vector machine. Middle-East J Sci Res (2015) 23(9):2106–16.

Google Scholar

20. Saleck MM, Elmoutaouakkil A, Mouçouf M. (2017). Tumor detection in mammography images using fuzzy C-means and GLCM texture features, in: 14th International Conference on Computer Graphics, Imaging and Visualization, . pp. 122– 5.

Google Scholar

21. Cheng J. Brain tumor dataset. (2017). doi: 10.6084/m9.figshare.1512427.v5

CrossRef Full Text | Google Scholar

22. Demirhan A, Törü M, Güler İ. Segmentation of tumor and edema along with healthy tissues of brain using wavelets and neural networks. IEEE J Biomed Health Inf (2015) 19(4):1451–8. doi: 10.1109/jbhi.2014.2360515

CrossRef Full Text | Google Scholar

23. Haralick RM, Shanmugam K, Dinstein I. Textural features for image classification. IEEE Trans Syst Man Cybern (1973) 3(6):610–21. doi: 10.1109/TSMC.1973.4309314

CrossRef Full Text | Google Scholar

24. Gupta N, Bhatele PR, Khanna P. Glioma detection on brain MRIs using texture and morphological features with ensemble learning. Biomed Signal Process Control (2019) 47:115–25. doi: 10.1016/j.bspc.2018.06.003

CrossRef Full Text | Google Scholar

25. Rasool MI, Ismail NH, Boulila W, Ammar A, Samma H, Yafooz WMS, et al. A hybrid deep learning model for brain tumour classification. Entropy (2022) 24(6):799. doi: 10.3390/e24060799

PubMed Abstract | CrossRef Full Text | Google Scholar

26. Zulfiqar F, Bajwa UI, Mehmood Y. Multi-class classification of brain tumor types from MR images using EfficientNets. Biomed Signal Process Control (2023) 84:104777. doi: 10.1016/j.bspc.2023.104777

CrossRef Full Text | Google Scholar

27. Isselmou AEK, Xu G, Shuai Z, Saminu S, Javaid I, Ahmad IS, et al. Brain tumor detection and classification on MR images by a deep wavelet auto-encoder model. Diagnostics (2021) 11(9):1589. doi: 10.3390/diagnostics11091589

PubMed Abstract | CrossRef Full Text | Google Scholar

28. Biratu ES, Schwenker F, Debelee TG, Kebede SR, Negera WG, Molla HT. Enhanced region growing for brain tumor MR image segmentation. J Imaging (2021) 7(2):22. doi: 10.3390/jimaging7020022

PubMed Abstract | CrossRef Full Text | Google Scholar

29. Saba T, Mohamed AA, El-Affendi MA, Amin J, Sharif M. Brain tumor detection using fusion of hand crafted and deep learning features. Cogn Syst Res (2020) 59:221–30. doi: 10.1016/j.cogsys.2019.09.007

CrossRef Full Text | Google Scholar

30. Sobhaninia Z, Karimi N, Khadivi P, Samavi S. Brain tumor segmentation by cascaded multiscale multitask learning framework based on feature aggregation. Biomed Signal Process Control (2023) 85:104834. doi: 10.1016/j.bspc.2023.104834

CrossRef Full Text | Google Scholar

31. Raja PMS, Rani AV. Brain tumor classification using a hybrid deep autoencoder with Bayesian fuzzy clustering-based segmentation approach. Biocybernetics Biomed Eng (2020) 40(1):440–53. doi: 10.1016/j.bbe.2020.01.006

CrossRef Full Text | Google Scholar

32. Amin J, Sharif M, Gul N, Raza M, Anjum MA, Nisar MF, et al. Brain tumor detection by using stacked autoencoders in deep learning. J Med Syst (2019) 44(2). doi: 10.1007/s10916-019-1483-2

PubMed Abstract | CrossRef Full Text | Google Scholar

33. Asif S, Zhao M, Chen X, Zhu Y. BMRI-NET: A deep stacked ensemble model for multi-class brain tumor classification from MRI images. Interdiscip Sciences: Comput Life Sci (2023) 15:499–514. doi: 10.1007/s12539-023-00571-1

CrossRef Full Text | Google Scholar

34. Bodapati JD, Shaik NS, Naralasetti V, Mundukur NB. Joint training of two-channel deep neural network for brain tumor classification. SIViP (2021) 15(4):753–60. doi: 10.1007/s11760-020-01793

CrossRef Full Text | Google Scholar

35. Deepak S, Ameer PM. Brain tumor classification using deep CNN features via transfer learning. Comput Biol Med (2019) 111:103345. doi: 10.1016/j.compbiomed.2019.103345

PubMed Abstract | CrossRef Full Text | Google Scholar

36. Al-Shayeji M, Al-Buloushi J, Ashkanani A, Abed S. Enhanced brain tumor classification using an optimized multi-layered convolutional neural network architecture. Multimedia Tools Appl (2021) 80(19):28897–917. doi: 10.1007/s11042-021-10927-8

CrossRef Full Text | Google Scholar

37. Patel M. Brain tumor detection using MRI images, MS Thesis, California State University, San Bernardino, Electronic Theses Projects, and Dissertations (2023). Available at: https://scholarworks.lib.csusb.edu/etd/1602.

Google Scholar

38. Latif G. DeepTumor: framework for Brain MR image classification, Segmentation and Tumor Detection. Diagnostics (2022) 12(11):2888. doi: 10.3390/diagnostics12112888

PubMed Abstract | CrossRef Full Text | Google Scholar

39. Sharma AK, Nandal A, Dhaka A, Polat K, Nour M, Alenezi F, et al. HOG transformation-based feature extraction framework in modified Resnet50 model for brain tumor detection. Biomed Signal Process Control (2023) 84:104737. doi: 10.1016/j.bspc.2023.104737

CrossRef Full Text | Google Scholar

Keywords: brain tumor, GLCM, LBP, texture, composite feature, aggregated feature, non-linear feature

Citation: Dheepak G, J. AC and Vaishali D (2024) Brain tumor classification: a novel approach integrating GLCM, LBP and composite features. Front. Oncol. 13:1248452. doi: 10.3389/fonc.2023.1248452

Received: 05 July 2023; Accepted: 12 December 2023;
Published: 30 January 2024.

Edited by:

Dimitrios N. Kanakis, University of Nicosia, Cyprus

Reviewed by:

Melih Kuncan, Siirt University, Türkiye
Surendiran B, National Institute of Technology Puducherry, India

Copyright © 2024 Dheepak, J. and Vaishali. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: G. Dheepak, dg6154@srmist.edu.in

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.