Combining conventional ultrasound and ultrasound elastography to predict HER2 status in patients with breast cancer

Introduction: Identifying the HER2 status of breast cancer patients is important for selecting treatment. Previous studies have shown that ultrasound features are closely related to the subtype of breast cancer. Methods: In this study, we used features of conventional ultrasound and ultrasound elastography to predict HER2 status. Results and Discussion: The model built on features of both conventional ultrasound and ultrasound elastography achieved a higher AUROC than the model built on conventional ultrasound features alone (0.82 vs. 0.53). The SHAP method was used to explore the interpretability of the models. Compared with HER2− tumors, HER2+ tumors usually have greater elastic modulus parameters and more often show microcalcifications. Therefore, we concluded that the features of conventional ultrasound combined with ultrasound elastography could improve the accuracy of HER2 status prediction.

Accurate identification of the molecular subtype of breast cancer is essential for treatment. The 2018 American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) HER2 testing guideline and the 2019 Chinese breast cancer HER-2 detection guideline specify the IHC staining requirements and the interpretation of IHC and ISH results (Wolff et al., 2018). In this consensus, HER-2 IHC 3+ or HER-2 IHC 2+/ISH+ is defined as HER-2 positive, IHC 1+ or IHC 2+/ISH− is defined as HER-2 low expression, and IHC 0 is defined as HER-2 negative.
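The consensus definitions above amount to a small decision rule. The following is a minimal sketch of that rule only; the function name `her2_category`, the argument `ish_positive` and the string labels are hypothetical choices for illustration and are not taken from the guidelines or from this study.

```python
from typing import Optional


def her2_category(ihc: str, ish_positive: Optional[bool] = None) -> str:
    """Map an IHC score ("0", "1+", "2+", "3+") and an optional ISH result
    to the HER-2 category described in the text (hypothetical helper)."""
    if ihc == "3+":
        return "positive"
    if ihc == "2+":
        # IHC 2+ is equivocal: the category depends on the ISH result.
        if ish_positive is None:
            raise ValueError("IHC 2+ requires an ISH result")
        return "positive" if ish_positive else "low"
    if ihc == "1+":
        return "low"
    if ihc == "0":
        return "negative"
    raise ValueError(f"unknown IHC score: {ihc!r}")


print(her2_category("2+", ish_positive=True))
```

The sketch raises an error for an IHC 2+ case without an ISH result rather than guessing, since the text defines that case only in combination with ISH.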
So far, identification of HER2+ mainly relies on fluorescence in situ hybridization (FISH) and immunohistochemistry (IHC) (Baez-Navarro et al., 2023). However, the two methods are invasive procedures and may lead to seroma (Ebner et al., 2018) and infection (Bruening et al., 2010). Therefore, we need noninvasive, economical and accurate methods to predict HER2 status in breast cancer.
Ultrasound imaging technologies are non-invasive, convenient and affordable and have been widely used for breast cancer screening and diagnosis (Berg et al., 2015). It has been shown that ultrasonographic features are related to molecular subtypes of breast cancer (Wu et al., 2019;Gumowska et al., 2021). Many machine learning models for predicting molecular subtypes of breast cancer have been developed (Zhou et al., 2021;Ma et al., 2022). However, these models mainly relied on the characteristics of conventional ultrasound. In recent years, the development of ultrasound elastography (Barr, 2018) has provided new opportunities for breast cancer screening and diagnosis (Carlsen et al., 2015;Yao et al., 2023). As a new imaging technology, ultrasound elastography can evaluate the hardness of lesions and thus help identify their nature, which makes it an important supplement to conventional ultrasound. At present, the ultrasound elastography technologies used for breast diagnosis mainly include strain elastography and acoustic palpation elastography. Sound touch elastography (STE) is an ultrasound imaging technology developed recently in China. It can display tissue hardness information in the region of interest (ROI) in real time and provides elasticity values for the mass and its periphery through the Shell quantitative analysis toolkit, so that changes in the hardness of the lesion tissue can be measured accurately. However, to the best of our knowledge, no studies have explored the relations between the characteristics of ultrasound elastography and HER2+ status. In this study, we built a machine learning model for HER2 status prediction based on the characteristics of conventional ultrasound combined with ultrasound elastography. In addition, the Shapley additive explanations (SHAP) method (Lundberg et al., 2020;Lv et al., 2023) was used to explore the interpretability of the model.
We hope that the model can provide more valuable information for personalized healthcare of breast cancer.

Cohorts
Patients with breast cancer at the Affiliated Hospital of Xuzhou Medical University between January 2021 and December 2022 were included in this study. All diagnoses were confirmed by core needle biopsy or surgical pathology.
Exclusion criteria were as follows: 1) pregnant or lactating women; 2) tumor diameter more than 50 mm; 3) patients who had undergone interventional treatment (e.g., chemotherapy, radiotherapy) before ultrasound examination; 4) patients with severe organ insufficiency; 5) poor patient compliance. Finally, 51 patients with HER2+ breast cancer were enrolled in this study. As controls, we also recruited 52 patients with HER2− breast cancer and 50 patients with benign breast disease. The study follows the "Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis" (TRIPOD) statement (Collins et al., 2015). All patients were de-identified.

Ultrasound
Ultrasound scans were obtained using a Mindray Resona 7S color Doppler ultrasound system and an L14-5WU linear transducer with strain elastography and acoustic palpation elastography. The operations and assessments were performed by three physicians skilled in ultrasound elastography and conventional ultrasound. Specifically, all patients first underwent a conventional ultrasound examination. The location, size (maximum diameter), morphology, margins, orientation, echo pattern, microcalcification, and hyperechoic halo of the lesion were recorded. Next, the section with the most abundant blood flow was used to assess the blood flow classification (Adler classification; Adler et al., 1990) and to measure the resistance index (RI). Finally, all patients underwent an ultrasound elastography examination, in which the strain ratio, strain elasticity score, lesion mean elastic modulus (Amean), lesion maximum elastic modulus (Amax), lesion peripheral (shell 2 mm) mean elastic modulus (Smean), and lesion peripheral maximum elastic modulus (Smax) were recorded. Figure 1 shows examples of ultrasound elastography for (a) HER2+ breast cancer, (b) HER2− breast cancer and (c) benign breast disease.

Statistical analysis
Python (Version 3.7) was used for statistical analysis and visualization. One demographic feature, nine conventional ultrasound features, and six ultrasound elastography features were used in this study (Table 1; Table 2). Among these features, age, size, resistance index, strain elasticity score, strain ratio, Amean, Amax, Smean, and Smax are continuous variables, while orientation, shape, margin, echo pattern, microcalcification, hyperechoic halo and Adler classification are discrete variables. Continuous variables are presented as median ± interquartile range (IQR), and the Mann-Whitney test was used for group comparisons (e.g., HER2+ breast cancer vs. HER2− breast cancer). Discrete variables are presented as count (percentage), and the chi-square test was used for group comparisons. A two-sided p-value <0.05 was considered statistically significant.
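The group comparisons described above can be illustrated with a short standard-library sketch. It is not the authors' code. The function names are hypothetical, the p-values use a normal approximation (Mann-Whitney, without tie or continuity correction) and the 1-degree-of-freedom chi-square distribution (2x2 table, without continuity correction), and the numbers in the usage lines are toy values, not study data. `statistics.quantiles` requires Python 3.8 or later. In practice, library routines such as `scipy.stats.mannwhitneyu` and `scipy.stats.chi2_contingency` would normally be used instead.

```python
import math
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, IQR), where IQR = Q3 - Q1."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q3 - q1


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    U counts the pairs (a, b) with a > b; ties contribute 0.5.
    """
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n, m = len(x), len(y)
    mean_u = n * m / 2
    sd_u = math.sqrt(n * m * (n + m + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u, p


def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 count table (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with df = 1
    return chi2, p


# Toy values for illustration only (not study data).
group_a = [1, 2, 3, 4, 5]
group_b = [6, 7, 8, 9, 10]
print(median_iqr(group_a))
print(mann_whitney_u(group_a, group_b))
print(chi_square_2x2([[20, 5], [5, 20]]))
```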

Machine learning models
A tree-based machine learning approach was used for feature selection (Ke et al., 2017). In a tree-based model, zero-importance features are not used to split any node, so they have no impact on model performance. A previous study suggested that the best results are obtained when 70%-80% of the data is used for training and 20%-30% for testing (Gholamy et al., 2018). Therefore, all patients were randomly divided into a training set (80%) and a test set (20%). The extreme gradient boosting (XGBoost) model (Chen and Guestrin, 2016) was used to predict tumor status (benign tumor or breast cancer) and HER2 status (HER2+ or HER2−). Model hyperparameters (e.g., n_estimators, max_depth, learning_rate) were selected by k-fold cross-validation on the training set. Usually, k is set to 5 or 10. However, the dataset used in this study is small, and a larger k leads to larger fluctuations in model performance (Supplementary Figure S1). Therefore, k was set to 5. The model with the optimal hyperparameters was validated on the holdout test set, and the area under the receiver operating characteristic curve (AUROC) was used to evaluate model performance. The 95% confidence interval of the AUROC on the test set was calculated from 1000 bootstrap replicates. The SHAP method was used to explore the interpretability of the models (Lundberg et al., 2020).
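The evaluation steps described above (5-fold splitting of the training set, AUROC, and a bootstrap confidence interval) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' pipeline. The function names are hypothetical, a percentile bootstrap is assumed because the text does not specify the bootstrap variant, and the scores in the usage lines are made-up toy values.

```python
import random


def kfold_indices(n, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so the AUROC is undefined
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Toy test set of 11 positives and 11 negatives with made-up scores.
toy_labels = [1] * 11 + [0] * 11
toy_rng = random.Random(1)
toy_scores = [toy_rng.random() + 0.3 * y for y in toy_labels]
print(auroc(toy_labels, toy_scores), bootstrap_auroc_ci(toy_labels, toy_scores))
```

With a test set of only 22 patients, the bootstrap interval is necessarily wide, which is consistent with the wide intervals reported for the HER2 models.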
In addition, we also developed a logistic regression (LR) model to predict the status of HER2. We then compared performance of the LR model with that of the XGBoost model.

Cohort characteristics
The cohort included 51 patients with HER2+ breast cancer, 52 patients with HER2− breast cancer and 50 patients with benign breast disease. Between patients with breast cancer and patients with benign breast disease, all characteristics differed significantly (Table 1). Therefore, all features were used to predict tumor status (breast cancer or benign tumor). However, between patients with HER2+ breast cancer and patients with HER2− breast cancer, age, orientation, shape, echo pattern, hyperechoic halo and pathology did not differ significantly (Table 2). In addition, we used a tree-based machine learning model (i.e., LightGBM) to calculate the importance of the features. As shown in Supplementary Table S1, orientation, shape, margin, echo pattern, hyperechoic halo and Adler classification are zero-importance features. In tree-based machine learning models, such features have no effect on model performance. Therefore, microcalcification, Amean, resistance index, Smean, Amax, Smax, size and strain ratio were used to predict HER2 status (HER2+ or HER2−). Subsequently, we explored whether the features of conventional ultrasound combined with ultrasound elastography could improve the prediction accuracy of tumor status and HER2 status.

Prediction of tumor status
There were 82 patients with breast cancer and 40 patients with benign breast disease in the training set, and there were 21 patients with breast cancer and 10 patients with benign breast disease in the test set. All features (Table 1) were used to predict tumor status (breast cancer or benign tumor). For the model with features of conventional ultrasound, the cross-validation AUROC ranged from 0.98 to 1 (0.99 ± 0.01, Supplementary Figure S2A), and the corresponding AUROC of the test set (95% CI) was 0.99 (0.97-1). For the model with features of conventional ultrasound and ultrasound elastography, the cross-validation AUROC ranged from 0.97 to 1 (0.99 ± 0.01, Supplementary Figure S2B), and the corresponding AUROC of the test set (95% CI) was 1.00 (1.00-1.00). The AUROCs of the models with features of ultrasound elastography and/or conventional ultrasound are close to 1. One possible reason for this is that the held-out test set happens to be a "too good" subset. To rule out this possibility, the split into training set and test set was repeated 10 times, and we report more evaluation metrics (i.e., sensitivity, specificity, negative predictive value and positive predictive value). The averaged AUROC, sensitivity, specificity, negative predictive value and positive predictive value of the model with features of conventional ultrasound are 0.996 ± 0.009, 0.967 ± 0.036, 0.935 ± 0.059, 0.972 ± 0.024, and 0.934 ± 0.074, respectively. The averaged AUROC, sensitivity, specificity, negative predictive value and positive predictive value of the model with features of conventional ultrasound and ultrasound elastography are 0.997 ± 0.006, 0.975 ± 0.025, 0.960 ± 0.089, 0.988 ± 0.025, and 0.956 ± 0.045, respectively. Overall, both models can predict tumor status accurately (Figure 2).
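The additional metrics reported above all derive from the confusion matrix at a fixed decision threshold. The sketch below shows the definitions. The function name and the example predictions are hypothetical: the counts are chosen only to match the test-set size of 21 cancers and 10 benign lesions, and are not results from the study.

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV from binary labels (1 = cancer).

    Assumes every denominator is non-zero, i.e. both classes are present
    and the model predicts each class at least once.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical predictions: 20 of 21 cancers and 9 of 10 benign lesions correct.
y_true = [1] * 21 + [0] * 10
y_pred = [1] * 20 + [0] + [0] * 9 + [1]
print(diagnostic_metrics(y_true, y_pred))
```

Unlike the AUROC, these four metrics depend on the chosen threshold and, for the predictive values, on the class balance of the test set.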

Prediction of HER2 status
There were 40 patients with HER2+ breast cancer and 41 patients with HER2− breast cancer in the training set, and there were 11 patients with HER2+ breast cancer and 11 patients with HER2− breast cancer in the test set. As shown in Table 2, age, orientation, shape, echo pattern, hyperechoic halo and pathology did not show significant differences. Therefore, these features were not used to build machine learning models. For the model with features of conventional ultrasound, the cross-validation AUROC ranged from 0.53 to 0.93 (0.74 ± 0.13, Supplementary Figure S3A), and the corresponding AUROC of the test set (95% CI) was 0.53 (0.27-0.78). For the model with features of conventional ultrasound and ultrasound elastography, the cross-validation AUROC ranged from 0.69 to 0.88 (0.81 ± 0.07, Supplementary Figure S3B), and the corresponding AUROC of the test set (95% CI) was 0.82 (0.62-0.99). Therefore, we concluded that the features of conventional ultrasound combined with ultrasound elastography could improve the prediction accuracy of HER2 status (Figure 3).
Supplementary Table S1 provides valuable insights into the stepwise variable selection method. Next, we compared the performance of models with different features (i.e., top 8 features, top 10 features and top 16 features). As shown in Supplementary Figure S4,  prediction, we prefer to screen out more suspected HER2+ patients than to miss a possible HER2+ patient, so the F1-value should be preferred as an evaluation metric. Therefore, the LR model is a better choice for our prediction purposes (higher recall). However, for clinical prediction models, while the performance of the model is very important, the interpretability of the model should not be neglected. In recent years, the XGBoost model combined with the SHAP method has been widely used in cohort studies (Deng et al., 2022;Lv et al., 2023). These interpretable machine learning models give not only the prediction results but also the reasons for their judgments. Therefore, we prefer to use the XGBoost model. Next, we used the SHAP method to explore the interpretability of the model.

Interpretability of the model
The SHAP method can help us identify key factors for HER2+ at the patient level and at the cohort level. First, we identified key factors for HER2+ at the patient level. Figure 4 shows the patient with the highest SHAP value (Figure 4A) and the patient with the lowest SHAP value (Figure 4B). The baseline is the mean SHAP value of −0.1369. The predicted risk for the patient with the highest SHAP value is 2.43. Microcalcification, a larger Smean (67.31) and so on are potential key factors for HER2+. For the patient with the lowest SHAP value (−3.64), the absence of microcalcifications, a lower resistance index, a lower Amax and so on contribute to HER2−.
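The patient-level explanation relies on the additivity property of Shapley values: the baseline plus the sum of a patient's feature contributions equals the model output for that patient. The sketch below computes exact Shapley values for a small toy scoring function to illustrate this property. It is not the fitted XGBoost model: `toy_score`, its coefficients and the reference point are hypothetical, and a single reference point stands in for the background data that SHAP averages over. The final lines assume that the reported outputs (2.43 and −3.64) are on the log-odds scale, which the text does not state, and convert them to probabilities under that assumption.

```python
import math
from itertools import permutations


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x; absent features take baseline values."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        z = list(baseline)
        prev = f(z)
        for i in order:
            z[i] = x[i]          # add feature i to the coalition
            cur = f(z)
            phi[i] += cur - prev  # marginal contribution of feature i
            prev = cur
    return [v / len(orders) for v in phi]


def sigmoid(v):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-v))


def toy_score(z):
    """Hypothetical score with an interaction term (illustration only)."""
    microcalcification, s_mean, resistance_index = z
    return (-2.0 + 1.5 * microcalcification + 0.02 * s_mean
            + 1.0 * microcalcification * resistance_index)


x = [1, 67.31, 0.7]          # Smean = 67.31 is from the text; the rest is made up
baseline = [0, 40.0, 0.6]    # hypothetical reference patient
phi = shapley_values(toy_score, x, baseline)
base_value = toy_score(baseline)

# Additivity: the two printed values agree up to rounding error.
print(base_value + sum(phi), toy_score(x))

# If the reported outputs were log-odds (an assumption), the probabilities would be:
print(round(sigmoid(2.43), 3), round(sigmoid(-3.64), 3))
```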
Next, we identified key factors for HER2+ at the cohort level. As shown in Figure 5, microcalcification, Amean, Smean, size and resistance index are the top 5 key factors for identifying HER2 status. Compared with Smax and Amax, Amean and Smean are better indicators of HER2 status.
Finally, we used a clustering algorithm to explore relations between these features. As shown in Figure 6, patients with similar features and similar subtypes were grouped together. Overall, microcalcifications have a strong correlation with HER2+ (cluster 2). However, smaller tumor size and Amean have a negative effect on the model output (cluster 1). For patients without microcalcification, a larger Smean or Smax (cluster 3) increases the likelihood of HER2+. In addition, we also performed partial regression analysis. As shown in Supplementary Figures S5-S12, the effects of microcalcification, resistance index and Smean on HER2+ were more pronounced. These results indicate that conventional ultrasound combined with ultrasound elastography can predict HER2 status better.

Discussion
Compared with other subtypes of breast cancer, HER2+ breast cancer is more malignant, more aggressive, and more likely to recur and metastasize (Guarneri et al., 2013). In recent years, the development of HER2-targeted drugs has led to significant benefits for patients with HER2+ breast cancer (Kümler et al., 2014). Therefore, it is critical to identify the HER2 status of breast cancer patients accurately and quickly.
Ultrasound is widely used for breast cancer screening and diagnosis (Berg et al., 2015), and previous studies have shown that there are some correlations between ultrasound characteristics and breast cancer subtypes (Wu et al., 2019;Gumowska et al., 2021). Conventional ultrasound can evaluate the shape, size, margin, and echo pattern of tumors. In summary, the shape of breast cancer lesions is irregular, the margin of the lesions is not circumscribed, the interior of the lesion is rich in blood flow, and the echo pattern is not homogeneous (Table 1). Both the machine learning model with conventional ultrasound and the machine learning model with conventional ultrasound and ultrasound elastography showed excellent performance in predicting tumor status (Figure 2). However, machine learning models with conventional ultrasound alone showed only moderate performance in predicting HER2 status (Figure 3). Ultrasound elastography can evaluate the hardness of tumors, providing a new opportunity for the prediction of HER2 status (Carlsen et al., 2015;Yao et al., 2023). The introduction of tumor elasticity information substantially improves the performance of the machine learning model (test-set AUROC 0.82 vs. 0.53; Figure 3). The SHAP method can help us identify key factors for predicting HER2 status (Figures 4-6).
For conventional ultrasound, size, margin, microcalcification, Adler classification and resistance index were considered key factors for predicting HER2 status (Table 2; Figure 5). HER2 overexpression stimulates uncontrolled growth of cancer cells, leading to inadequate local blood supply and, in turn, to cell death and microcalcification (Zhou and Hung, 2003;Loibl and Gianni, 2017). Therefore, HER2+ tumors are usually larger and have microcalcifications (Table 2). In addition, HER2 overexpression increases cancer cell aggressiveness (Pupa et al., 2021). Therefore, the margins of HER2+ tumors are usually not circumscribed (Table 2). However, the prerequisite for rapid tumor growth and infiltration is the formation of a large number of microvessels (Furuya et al., 2005). Microvessels provide the nutrients and oxygen needed for tumor growth (Pluda and Parkinson, 1996). In this study, we found that HER2+ patients have a higher Adler classification (Table 2). This finding is consistent with previous studies (Pluda and Parkinson, 1996;Furuya et al., 2005).
For ultrasound elastography, we found that elastic modulus parameters (i.e., Amean, Amax, Smean, and Smax) were significantly higher in HER2+ tumors than in HER2− tumors (Table 2). This may be related to higher microvascular density and interstitial water in HER2+ tumors (Zhang et al., 2022;Kurt et al., 2023). Yoo et al. found that the hardness of the tumor is associated with tissue hypoxia (Yoo et al., 2020), and HER2 contributes to an increased hypoxic response in breast cancer by regulating HIF-2α (Jarman et al., 2019). Therefore, we speculated that the elastic modulus parameters of tumors can reflect HER2 status to some extent. In Figure 5, we found that microcalcification is the most important factor for predicting HER2 status, which is consistent with the study of Elias et al. (Elias et al., 2014). However, some HER2+ patients have no microcalcification. For these patients, elastic modulus parameters (i.e., Smean and Smax) can help us identify HER2 status (Figure 6) and thus improve the performance of machine learning models (Figure 3). Although this study is meaningful, it still has some limitations: 1) This study is a retrospective single-center study with a small number of cases, and bias was inevitable; 2) The features used in this study were human-defined. With the development of deep learning, features are expected to be extracted automatically from images (Lin et al., 2017;Banan et al., 2020).

FIGURE 6
Heatmap that identifies clusters of breast cancer patients who have similar characteristics and outcomes.

Conclusion
In conclusion, ultrasound features are closely related to HER2 status. We developed interpretable machine learning models based on conventional ultrasound and ultrasound elastography features to predict HER2 status. The model that included ultrasound elastography features performed better than the model based on conventional ultrasound features alone. Microcalcification, Amean, Smean, size and resistance index are the top 5 key factors for identifying HER2 status. These findings may be useful for breast cancer screening, diagnosis and personalized medicine.

Data availability statement
The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

Author contributions
JiL: conceptualization, methodology, investigation, visualization, writing-original draft; XZ, BC, JaL, YL, and JeL: data curation, writing-original draft; XZ, JoL, and NZ: writing-review and editing, funding acquisition, project administration, supervision. All authors contributed to the article and approved the submitted version.