- 1 Department of Interventional Ultrasound, PLA Medical College & Chinese PLA General Hospital, Beijing, China
- 2 Department of Ultrasound, Zhongda Hospital, Nanjing, China
- 3 Department of Breast Surgery, Affiliated Hospital of Putian University, Putian, China
- 4 Department of Ultrasound, Xingcheng People’s Hospital, Xingcheng, China
- 5 Department of Ultrasound Medicine, Lu’an People’s Hospital of Anhui Province, Liuan, China
- 6 Department of Ultrasound, The Fifth People's Hospital of Chengdu, Chengdu, China
- 7 Department of Ultrasound, Huashan Hospital, Shanghai, China
- 8 Department of Ultrasound, Guangxi Medical University Cancer Hospital, Nanning, China
- 9 Department of Ultrasound, The Fourth Hospital of Hebei Medical University, Shijiazhuang, China
- 10 Department of Ultrasound, The Third Xiangya Hospital, Changsha, China
- 11 Department of Ultrasound, Peking University Third Hospital, Beijing, China
- 12 General Surgery, Chinese PLA General Hospital, Beijing, China
- 13 Department of Ultrasound, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical College of Jinan University, Shenzhen Medical Ultrasound Engineering Center, Shenzhen People’s Hospital, Shenzhen, China
- 14 Department of Ultrasound Medicine, The First Affiliated Hospital of Nanchang University, Nanchang, China
- 15 Department of Ultrasound, Beijing Friendship Hospital, Beijing, China
- 16 Department of Ultrasound, The 2nd Affiliated Hospital of Harbin, Harbin, China
- 17 Department of Ultrasound, China-Japan Union Hospital of Jilin University, Changchun, China
- 18 Department of Ultrasound, Zhengzhou Central Hospital, Zhengzhou, China
- 19 Department of Ultrasound, The First Affiliated Hospital of Harbin Medical University, Harbin, China
- 20 Department of Ultrasound, Affiliated Hospital of Nanjing University of Chinese Medicine, Nanjing, China
- 21 Department of Ultrasound, Shengjing Hospital of China Medical University, Shenyang, China
- 22 Department of Ultrasound, The Affiliated Hospital of Inner Mongolia Medical University, Hohhot, China
- 23 Department of Ultrasound, The First Affiliated Hospital of Xinxiang Medical University, Xinjiang, China
Purpose: To predict human epidermal growth factor receptor 2 (HER2) expression in breast cancer (BC) using Sonazoid-enhanced ultrasound in a machine learning-based model.
Materials and methods: Between August 2020 and February 2021, patients with breast cancer who underwent surgical treatment without neoadjuvant chemotherapy were prospectively enrolled from 17 hospitals in China. HER2 expression status was assessed by immunohistochemistry or fluorescence in situ hybridization (FISH). The training set contained data from 11 hospitals and the validation set contained data from the other 6 hospitals. Clinical, B-mode ultrasound, contrast-enhanced ultrasound (CEUS), and time-intensity curve features were selected by the Least Absolute Shrinkage and Selection Operator. Based on the selected features, six prediction models were established to predict HER2 3+ and 2+/1+ expression: logistic regression (LR), support vector machine (SVM), random forest (RF), eXtreme Gradient Boosting (XGB), XGB combined with LR, and a fusion model.
Results: A total of 140 patients with breast cancer were enrolled in this study. Seven features related to HER2 3+ and six features related to HER2 2+/1+ were selected to establish prediction models. Among the individual classifiers, LR, SVM, and XGB showed the best prediction performance for both HER2 3+ and HER2 2+/1+ cases. These three models were then combined into a fusion model. In the validation set, the fusion model achieved the highest area under the receiver operating characteristic curve: 0.869 (95%CI: 0.715–0.958) for predicting HER2 3+ and 0.747 (95%CI: 0.548–0.891) for predicting HER2 2+/1+ cases. The model also correctly upgraded HER2 2+ cases to HER2 3+, consistent with the FISH test results.
Conclusion: Sonazoid-enhanced ultrasound can provide effective guidance for targeted therapy of breast cancer by predicting HER2 expression using machine learning approaches.
1 Introduction
According to the World Health Organization, breast cancer (BC) accounts for 500,000 deaths and 1.7 million new diagnoses annually (1). Characterized by overexpression of the human epidermal growth factor receptor 2 (HER2) gene and its protein, HER2-positive breast cancer accounts for 20–30% of breast cancer cases and requires distinct therapeutic strategies (2, 3). Trastuzumab and pertuzumab, which are targeted monoclonal antibody therapies, improve the survival outcomes of HER2-positive (HER2 3+) breast cancer (4–6). Recent reports have recommended HER2-targeted agents and antibody-drug conjugates (ADCs) as new clinical therapies for HER2-low expression (HER2 1+, 2+) breast cancer (7). The distinct pathological characteristics of HER2 0, HER2-low, and HER2-positive breast cancers have been the focus of research. Studies have reported that the 50% recurrence rate of HER2-positive breast cancers can be decreased by the use of HER2-targeted monoclonal antibodies (8).
For patients with HER2-positive cancers, preoperative targeted therapy could increase the chance of breast conservation and sentinel lymph node biopsy rather than mastectomy and axillary lymph node dissection (7, 9). The selection of breast cancer neoadjuvant treatment regimens (particularly monoclonal antibodies) depends on the results of preoperative core needle biopsy (CNB), especially molecular profiling by immunohistochemistry (IHC) and fluorescence in situ hybridization (FISH) (10–12). However, because of intratumoral heterogeneity, the limited tissue acquired by CNB may not capture the complete pathological characteristics of the tumor. This causes discordance between cores in 8% of HER2-positive cases and discordance in HER2 status between CNB and surgical pathology in approximately 26.6% of cases (11, 13, 14). Thus, HER2 expression levels in breast cancer could be underestimated, and the resulting false-negative results may cause HER2-positive cases to be missed, affecting clinical management and prognosis. Increasing the number of puncture sites may improve accuracy or reduce underestimation in the diagnosis of HER2 expression. However, the reported likelihood of core needle seeding in breast cancer varies from 2 to 63% (15–17), and increasing the number of punctures to obtain more tissue may also increase the risk of tumor seeding (16, 17).
Contrast-enhanced ultrasound (CEUS) provides vascular information about the tumor and has been widely used to differentiate benign from malignant breast lesions, assess pathological characteristics, and predict the response to neoadjuvant chemotherapy (NAC) (18, 19). CEUS can improve the categorization of suspicious breast lesions, reduce unnecessary biopsies, and improve the cancer yield rate of biopsy procedures (20). SonoVue (Bracco, Milan, Italy), the most widely used ultrasound contrast agent, consists of sulfur hexafluoride microbubbles and has shown better performance in low-intensity imaging (21). Sonazoid (GE Healthcare, Oslo, Norway), which consists of lipid-stabilized perfluorocarbon microbubbles, is more stable for long-term imaging and has a higher resistance to the ultrasound mechanical index (MI), making it more suitable for high-frequency linear array probe scanning (22, 23). Machine learning approaches have been widely applied for the early detection, diagnosis, and outcome prediction of breast cancer (24, 25). It has been reported that the diagnostic accuracy and sensitivity of CEUS in breast cancer can be improved by combining it with a machine learning approach (20).
Hence, our study aims to predict the HER2 status of breast cancer by combining B-mode ultrasound and Sonazoid-enhanced ultrasound features using machine learning models.
2 Materials and methods
2.1 Patients
This prospective, multicenter study was approved by the institutional ethics committee (ClinicalTrials.gov: NCT04657328). Written informed consent was obtained from all participants before the examinations. Between August 2020 and February 2021, 168 patients with 168 breast masses diagnosed as breast cancer by surgical pathology were enrolled from a multicenter cohort of 17 hospitals in China. According to current guidelines (26), HER2 status was determined using IHC for HER2 protein expression and FISH for equivocal cases (IHC 2+). The multicenter IHC results for HER2 expression were evaluated by experienced pathologists. The exclusion criteria were (1) absence of HER2 results and (2) absence of time-intensity curve (TIC) features due to substandard image acquisition. A total of 140 patients with known HER2 status were included in the study. The training set contained data from 11 hospitals, comprising 104 and 79 cases in the two analysis cohorts. The external validation set contained prospective data from the other 6 hospitals, comprising 36 and 28 cases in the two cohorts. In total, there were 26 HER2-positive (IHC 3+), 68 HER2-low (39 IHC 1+ and 29 IHC 2+), 39 HER2 0 (IHC 0), and 7 HER2-negative cases with an unspecified IHC score (IHC 0, 1+, or 2+) in the training and validation sets. In total, 88 patients with invasive ductal carcinoma, 3 with mucinous breast carcinoma, 1 with metaplastic breast carcinoma, and 12 with ductal carcinoma in situ were included.
Furthermore, to differentiate HER2-low expression cases from HER2 0 and exclude the confounding effect of HER2-positive expression levels in the analysis, 26 HER2-positive cases and 7 patients with uncertain HER2 expression status (only known as HER2-negative cases) in the cohort were excluded. Finally, 107 patients were included in the HER2-negative and low-expression group, containing 79 patients in the training cohort from the same 11 hospitals and 28 patients in the validation cohort. The study design is shown in Figure 1.
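The cohort sizes reported in this section can be cross-checked with simple arithmetic. The sketch below only restates counts given in the text; the variable names are illustrative and are not taken from the study's analysis code.

```python
# Counts copied from the cohort description in Section 2.1.
enrolled = 168
included = 140
status_counts = {
    "HER2-positive (IHC 3+)": 26,
    "HER2-low (IHC 1+ or 2+)": 68,
    "HER2 0 (IHC 0)": 39,
    "HER2-negative, IHC score unspecified": 7,
}

# Patients removed by the two exclusion criteria.
excluded = enrolled - included

# The four HER2 categories should add up to the included patients.
total_by_status = sum(status_counts.values())

# The HER2-low analysis drops the HER2-positive and unspecified cases.
low_expression_cohort = (
    included
    - status_counts["HER2-positive (IHC 3+)"]
    - status_counts["HER2-negative, IHC score unspecified"]
)

# Training/validation split of each analysis cohort, as reported.
positive_analysis_split = (104, 36)
low_analysis_split = (79, 28)

print(excluded, total_by_status, low_expression_cohort)
```

Both splits add up to their cohort totals (104 + 36 = 140 and 79 + 28 = 107), which is a quick way to confirm that the two analyses draw on the same set of included patients.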

Figure 1. Flowchart of study design. HER2: human epidermal growth factor receptor-2; IHC: immunohistochemistry; TIC: time intensity curve; CEUS: contrast-enhanced ultrasound. LR: logistic regression; SVM: support vector machine; RF: random forest; XGB: XGBoost.
2.2 B-mode and CEUS image acquisition
B-mode ultrasound and CEUS examinations were performed by radiologists from 17 hospitals with 10 ultrasound devices (Supplementary Table 1) equipped with a linear probe. All ultrasound examinations were conducted following a uniform diagnostic consensus. Prior to image acquisition, the participating radiologists, each with more than 3 years of experience in breast ultrasound, received systematic training in B-mode and CEUS breast examination and standardized training in breast CEUS interpretation according to the Sonazoid instructions and previous studies. They were required to complete a minimum of 50 independent breast CEUS case evaluations to ensure a consistent diagnostic consensus prior to the study. Breast masses were first identified using a B-mode ultrasound scan. Next, 0.015 mL/kg of perfluorobutane-filled microbubble contrast agent (Sonazoid; GE Healthcare, Oslo, Norway) was injected via a catheter (≥ 22-gauge) placed in the antecubital vein, followed by a 5 mL flush of 0.9% sodium chloride solution. A mechanical index of 0.18–0.24 was applied. The imaging timer was started as soon as the injection was completed. After 1 min of continuous assessment of the whole mass, intermittent scanning (10 s each time) was performed at 1.5 min, 2 min, 3 min, 4 min, and 5 min. For patients with multiple masses, images of the largest mass were preserved. Both B-mode and CEUS images and videos were stored in DICOM format on a hard disk at the hospital and sent to our study center. Finally, six radiologists with more than 15 years of experience in conventional breast ultrasound and breast CEUS independently evaluated the image features at the study center (Figure 2).
In B-mode breast ultrasound, the “strip-shaped echoic” feature represents thin, elongated, hyperechoic lines or bands within the breast tissue or mass. The CEUS characteristics evaluated included shape (regular or not), margin (well or poorly defined), wash-in time (earlier, later, synchronous), enhancement degree (hyperenhancement, isoenhancement, hypoenhancement), complete wash-out time of lesions (≤5 min or not), uptake pattern (centripetal, centrifugal, diffuse, no enhancement), and the presence of a homogeneous pattern, rim-like enhancement, a claw-shaped pattern, perfusion defects, an increase in lesion size compared with conventional ultrasound, and nourishing vessels. The time-intensity curve (TIC) features were extracted from the CEUS videos using external perfusion software (VueBox™) to quantitatively evaluate the microvasculature of the tumors.
2.3 Statistical analysis
R version 3.4.4, SPSS Version 23.0 (IBM, Armonk, NY, United States), and MedCalc 19.5.6 were used to perform the statistical analysis. Data are described as mean ± standard deviation or as numbers with percentages. The t-test, chi-square test, and the Least Absolute Shrinkage and Selection Operator (LASSO) were used to select the features. The regularization property of LASSO constrains the model coefficients through the penalty parameter (λ) and shrinks the coefficients of less important variables to zero to mitigate overfitting (27, 28). Logistic regression (LR), support vector machine (SVM), random forest (RF), eXtreme Gradient Boosting (XGB), a late fusion model based on the voting method, and XGB combined with LR were trained to classify HER2-positive status and HER2-low expression status in the two groups. In the combined model, XGB (constructing new features based on existing features) and LR (the classifier) were used together to establish the prediction model. Prediction models were established on the training set, and their performance was tested on the validation set (29). For internal validation, leave-one-out cross-validation (LOOCV) was performed on the training set to assess predictive accuracy and stability. External validation was performed to test the performance of the trained models, evaluate their generalizability, and identify potential biases. The receiver operating characteristic (ROC) curves of the predictive models were analyzed. The area under the ROC curve (AUC), accuracy, sensitivity, specificity, and 95%CI were assessed. The DeLong test was used to compare the AUC values of the different models.
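To illustrate how the LASSO penalty described above drives the coefficients of weak predictors to exactly zero, the following minimal sketch implements coordinate descent with soft-thresholding in plain Python. It is not the study's R code: it assumes a squared-error loss rather than a binary-outcome model, and the toy data, the function names, and the value of λ (`lam`) are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n) * sum((y - X b)^2) + lam * sum(|b|) by cycling over features."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Toy data (hypothetical): feature 0 drives y strongly, feature 1 only weakly.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]

beta = lasso_coordinate_descent(X, y, lam=0.5)
print(beta)  # the weak feature's coefficient is shrunk to exactly 0
```

In this toy example the strong feature keeps a shrunken but non-zero coefficient while the weak one is removed, which is the behaviour that makes LASSO usable as a feature selector. In practice λ is usually chosen by cross-validation.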
3 Results
3.1 Clinical characteristics
The clinical characteristics of 140 patients with breast cancer (mean age 52.35 ± 11.03 years, range 23–85 years) with 140 masses are shown in Table 1. In the training cohort, 104 patients were enrolled, including 20 HER2-positive cases. Of the 107 patients in the HER2 low expression group, 79 were included in the training cohort, with 56 IHC 2+ or 1+ and 23 IHC 0 cases.
3.2 B-mode and CEUS characteristics
In the 140 patients with breast cancer, the image features of B-mode ultrasound, CEUS, and TIC were assessed (Supplementary Table 2). LASSO regression across the clinical, B-mode with CEUS, and CEUS TIC characteristic groups selected seven features related to HER2-positive breast cancer: tumor size (cm), echotexture, strip-shaped echoic, macrocalcifications, microcalcifications, perfusion defects, and fall time (FT) of TIC (Figure 3). No clinical characteristics were selected. The distribution of the selected characteristics is listed in Table 2.

Figure 3. Feature selection in B-mode, CEUS, and TIC of the CEUS group by LASSO regression in 140 patients with breast cancer. (a,b) Selection of B-mode ultrasound and CEUS features. (c,d) Selection of TIC parameters.
Characteristics of B-mode imaging with CEUS.
Characteristics of CEUS TIC.
In the 107 cases in the HER2 low expression group, the image features of the three modalities were assessed (Supplementary Table 3). The imaging features related to HER2 low expression selected by LASSO regression were location, shape, strip-shaped echoic, perfusion defects, mean transit time (mTT), and FT. No clinical characteristics were selected. The selected characteristics are listed in Supplementary Table 4.
Characteristics in B-mode.
Characteristics of CEUS images.
Characteristics in TIC of CEUS.
3.3 Machine learning models for the prediction
The prediction model was established on the training set, and its performance was tested on the validation set. The effectiveness and stability of the training set, consisting of 104 cases, were validated using LOOCV, and the accuracy and Kappa were 0.871 and 0.446, respectively. In the training set of the HER2-positive (IHC 3+) and negative groups, six classifiers were established: logistic regression (LR), support vector machine (SVM), random forest (RF), XGB (XGBoost), a decision-level fusion model using hard voting based on LR, SVM, and XGB, and the XGB combined with LR model (29, 30).
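The LOOCV accuracy and Kappa reported above measure different things: accuracy counts agreement, whereas Cohen's Kappa corrects that agreement for what chance alone would produce. The sketch below shows both on toy data in plain Python. The classifier (a midpoint threshold between class means) and the data are hypothetical placeholders, not the study's models; only the 20-of-104 class balance in the last example comes from Section 3.1.

```python
def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 0.0


def loocv_predictions(X, y, fit, predict):
    """Hold out each case once, fit on the rest, and predict the held-out case."""
    preds = []
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(model, X[i]))
    return preds


# Placeholder classifier: threshold halfway between the two class means.
# It assumes every training fold still contains both classes.
def fit_midpoint(X, y):
    pos = [x[0] for x, t in zip(X, y) if t == 1]
    neg = [x[0] for x, t in zip(X, y) if t == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_midpoint(threshold, x):
    return 1 if x[0] > threshold else 0


# Toy single-feature data (hypothetical); the last case is an atypical negative.
X = [[1.0], [1.2], [0.9], [1.1], [3.0], [3.2], [2.8], [3.1]]
y = [0, 0, 0, 0, 1, 1, 1, 0]

preds = loocv_predictions(X, y, fit_midpoint, predict_midpoint)
accuracy = sum(t == p for t, p in zip(y, preds)) / len(y)
kappa = cohen_kappa(y, preds)

# With 20 positives among 104 training cases, always predicting "negative"
# already scores about 0.81 accuracy while kappa stays at 0.
baseline_true = [1] * 20 + [0] * 84
baseline_pred = [0] * 104
baseline_accuracy = sum(t == p for t, p in zip(baseline_true, baseline_pred)) / 104
baseline_kappa = cohen_kappa(baseline_true, baseline_pred)
```

The baseline example suggests why a high accuracy can coexist with a moderate Kappa when one class is much rarer than the other.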
The final result of the decision-level fusion model was determined by three single classifiers: LR, SVM, and XGB (which performed better than RF in this study). The hard-voting procedure is shown in Figure 4. In the XGB combined with LR prediction model, XGB was used to construct new variables reflecting the correlation of the selected variables. LR was used to combine the selected and new variables into the prediction model and to calculate the significance and weight coefficient of each variable. For the prediction of HER2-positive breast cancer, seven variables, including a novel feature (V11) generated by the XGB tree-based model trained on existing features, were selected for the final LR prediction model based on the feature importance rankings (Supplementary Figure 1).
The LR, SVM, RF, and XGB classifiers were established on three imaging modalities: (1) B-mode ultrasound, (2) B-mode ultrasound combined with CEUS, and (3) B-mode ultrasound combined with CEUS and TIC. The two fusion models were applied to the third, multimodal feature set to predict HER2-positive breast cancer.
The AUC, sensitivity, specificity, and accuracy of the four classifiers in the three modalities are shown in Table 3. The sensitivity of SVM increased from 0.728 (95%CI: 0.554–0.862) to 0.778 (95%CI: 0.608–0.899) when the CEUS modality was added. With all three modalities, SVM achieved the highest AUC among the four single classifiers, with an AUC of 0.806 (95%CI: 0.640–0.918), a sensitivity of 0.833 (95%CI: 0.359–0.996), and a specificity of 0.767 (95%CI: 0.577–0.901). The AUC values improved as imaging modalities were added. The performance of the two fusion models on the third modality is also shown in Table 3.
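The AUC values compared above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The following minimal sketch computes the AUC directly from that definition; the scores are made up for illustration and the function name is hypothetical.

```python
def auc_pairwise(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities for positive and negative cases.
positive_scores = [0.9, 0.8, 0.4]
negative_scores = [0.7, 0.3, 0.2, 0.1]

auc = auc_pairwise(positive_scores, negative_scores)
print(round(auc, 3))  # 11 of the 12 pairs are ordered correctly
```

Because this definition depends only on the ranking of scores, the AUC does not change with the decision threshold, unlike sensitivity and specificity.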
Based on the predictive performance of LR, SVM, RF, and XGB, the three top-performing individual classifiers for HER2 expression (LR, SVM, and XGB) were combined by hard voting into a decision-level fusion model, with the weighted ratio set at 1:1:1. Among the six models, the fusion model of the LR, SVM, and XGB classifiers performed best, with an AUC of 0.869 (95%CI: 0.715–0.958), a sensitivity of 1.000 (95%CI: 0.541–1.000), and a specificity of 0.668 (95%CI: 0.472–0.827). The ROCs of the six classifiers in the B-mode ultrasound combined with CEUS and TIC modality are shown in Figure 5. In the training cohort of 104 cases, 31 cases with definite IHC results were assessed as IHC 2+ by CNB, and two of them were reclassified as IHC 3+ according to the FISH results. The fusion model of LR, SVM, and XGB also predicted these two cases as IHC 3+.
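The hard-voting step described above can be sketched as a weighted count of the class labels predicted by each classifier. The snippet below is a minimal illustration, not the study's implementation: the function name, the dictionary layout, and in particular the tie-breaking rule are assumptions, since the text does not state how ties were resolved.

```python
def weighted_hard_vote(votes, weights, tie_break=1):
    """Return the label with the largest total voting weight.

    votes:     classifier name -> predicted label (0 or 1)
    weights:   classifier name -> voting weight
    tie_break: label returned when the totals are equal (an assumption;
               the source does not describe tie handling)
    """
    totals = {}
    for name, label in votes.items():
        totals[label] = totals.get(label, 0) + weights[name]
    best = max(totals.values())
    winners = [label for label, total in totals.items() if total == best]
    return winners[0] if len(winners) == 1 else tie_break


equal_weights = {"LR": 1, "SVM": 1, "XGB": 1}      # the 1:1:1 ratio
svm_heavy_weights = {"LR": 1, "SVM": 2, "XGB": 1}  # the 1:2:1 ratio

# Two of three classifiers vote positive: simple majority.
majority = weighted_hard_vote({"LR": 1, "SVM": 0, "XGB": 1}, equal_weights)

# Two of three vote negative under equal weights.
minority = weighted_hard_vote({"LR": 0, "SVM": 0, "XGB": 1}, equal_weights)

# With 1:2:1, SVM alone (weight 2) ties with LR + XGB (weight 1 + 1).
tie_case = weighted_hard_vote({"LR": 0, "SVM": 1, "XGB": 0}, svm_heavy_weights)

print(majority, minority, tie_case)
```

With equal weights and three voters, a tie cannot occur. With the 1:2:1 ratio reported for the HER2 low expression group, a tie arises whenever LR and XGB both disagree with SVM, so any implementation needs an explicit rule for that case.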

Figure 5. ROCs of the classifiers in predicting HER2-positive breast cancer based on B-mode ultrasound, CEUS, and TIC in the (a) training and (b) validation sets.
In the training set of the HER2 low expression and HER2-negative groups, prediction models based on the six classifiers in the third modality were also established. In the training set of 79 participants, the accuracy and kappa values were 0.864 and 0.637, respectively. The AUC values, sensitivity, specificity, and accuracy are shown in Table 4. The decision-level fusion model was obtained by voting among LR, SVM, and XGB, and the weighted ratio was set at 1:2:1 according to the performance of the classifiers. The fusion model of the LR, SVM, and XGB classifiers again achieved the highest AUC, 0.747 (95%CI: 0.548–0.891), with a sensitivity of 1.000 (95%CI: 0.735–1.000) and a specificity of 0.438 (95%CI: 0.198–0.701). The ROCs of the six prediction models in the HER2 low expression and negative group are shown in Figure 6. The AUCs for both HER2 prediction tasks were increased by the decision-level machine learning approach.
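The wide confidence intervals around the reported sensitivities reflect the small number of positive cases in the validation sets. As a hedged illustration, the sketch below computes an exact (Clopper-Pearson) binomial interval in plain Python. Two points are assumptions rather than statements from the text: that the reported intervals are exact binomial intervals, and that the validation sets contained 6 HER2-positive and 12 HER2-low cases (inferred from 26 − 20 and 68 − 56 in Sections 2.1 and 3.1).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for x successes in n trials."""

    def solve(f, target):
        # Bisection for an increasing function f on [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else solve(lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else solve(lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


# All positives detected in each validation set (counts inferred, see lead-in).
her2_positive_ci = clopper_pearson(6, 6)
her2_low_ci = clopper_pearson(12, 12)

print([round(v, 3) for v in her2_positive_ci])  # [0.541, 1.0]
print([round(v, 3) for v in her2_low_ci])       # [0.735, 1.0]
```

Under these assumptions the lower bounds match the reported 0.541 and 0.735, which shows that a sensitivity of 1.000 based on 6 or 12 positive cases is still compatible with a considerably lower true sensitivity.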

Table 4. Diagnostic performance of the classifiers in predicting HER2 low expression patients based on B-mode ultrasound, CEUS, and TIC characteristics.

Figure 6. The ROCs of the six prediction models in predicting HER2 low expression patients in the (a) training and (b) validation sets.
4 Discussion
4.1 Key findings in the context of prior literature
HER2-targeted therapy can reduce recurrence and increase the likelihood of breast-conserving surgery in patients with HER2-positive breast cancer. In this study, the fusion model of multiple single classifiers, based on machine learning approaches, performed best in predicting HER2 3+ and HER2 2+/1+ expression, with AUCs of 0.869 (95%CI: 0.715–0.958) and 0.747 (95%CI: 0.548–0.891), respectively. It also predicted the two equivocal IHC 2+ breast cancers as HER2 3+, in concordance with the FISH results.
In this research, multimodal imaging features, including B-mode ultrasound, CEUS, and TIC features, were obtained through assessment by radiologists. Previous studies that predicted HER2 expression using imaging features are shown in Supplementary Table 5. Compared with radiomic features acquired by software or with a single ultrasound modality, these features are more readily available and can provide abundant vascularity information. Vasculogenic mimicry (VM), which differs from angiogenesis formed by endothelial cells, is a vascular structure formed by cancer cells that transports tumor and blood cells in a channel network and is involved in tumor neovascularization (31, 32). In breast cancer, VM is associated with HER2-positive cases, which may be attributable to two secreted anticoagulant proteins, Serpine2 and Slpi, that promote VM formation. Both occur mostly in patients with HER2-positive breast cancer (33, 34). Studies have shown that CEUS can assess VM density in vitro, and the quantitative parameters of TIC are related to VM (35, 36). Thus, the microbubbles of CEUS may provide information on HER2-positive breast cancer neovascularization at the molecular level.
Previous studies have mostly focused on HER2 3+ expression in breast cancers using radiomic approaches. To the best of our knowledge, this is the first study to use a voting-based fusion of LR, SVM, and XGB models to prospectively predict HER2 3+ and 2+/1+ expression levels in breast cancer based on a multicenter study of Sonazoid-enhanced ultrasonography. In predicting both HER2-positive and HER2-low expression BC cases, the fusion model achieved the highest AUC values compared with the single machine learning models.
4.2 Clinical implications and innovations
In this study, tumor size, echotexture, strip-shaped echoic, macrocalcifications, and microcalcifications on B-mode ultrasound, perfusion defects on CEUS, and FT of TIC were predictive factors of HER2-positive breast cancer. Tumor location, shape, and strip-shaped echoic on B-mode ultrasound, perfusion defects on CEUS, and mTT and FT of TIC could predict HER2 low expression. Strip-shaped echoic, perfusion defects, and FT were therefore predictors of both HER2-positive and HER2 low expression, indicating that these features may be closely related to HER2 protein expression levels (2, 37).
Tumor size may reflect growth, indicating the prognosis of malignant tumors. Features of macrocalcifications and microcalcifications on B-mode ultrasound were associated with HER2-positive breast cancer in this study, which was also consistent with previous studies (38–41). Macrocalcification is regarded as the degeneration of the breast caused by injury and inflammation unrelated to cancer, while microcalcification is regarded as a calcium spot caused by rapid decomposition of cancer cells (38). With high aggressiveness and poor prognosis, HER2-positive breast cancer may be related to a faster growth rate than negative cases, indicating that more cell decomposition of the breast exists in positive cases (42, 43).
A strip-shaped echo mostly indicates the fibrosis inside the tumor. Malignant lesions can exhibit disordered hyperechoic strands, whereas benign lesions tend to exhibit organized linear echoes. Fibrosis in breast tumors is histologically regarded as fibroblasts and collagen fibers in the tumor center (44). Some studies have reported that fibrosis is positively related to HER2 expression and high aggressiveness of tumors (45), which is in contrast to the results of this study. In this study, fewer strip-shaped echoes were observed in HER2-positive breast cancer. A possible reason may be that most of our breast cancer cases were in stage I or II (100/104), and tumor cells were in the rapid growth phase, without undergoing necrosis and fibrosis. Further studies are still needed to determine the relationship between strip-shaped echoes and HER2 expression (45, 46).
Previous studies have also revealed that high HER2 expression might be related to the increased invasiveness of tumor cells and the formation of neovasculature (47). In some studies, perfusion defects in CEUS occurred more frequently in HER2-positive breast cancer, which might be caused by ischemic necrosis of the tumor when blood vessel growth lags behind the increased oxygen consumption of the tumor cells (48, 49). Other studies have also revealed that perfusion defects might be associated with uneven distribution of the contrast agent caused by heterogeneity and blood vessel distribution inside the tumor (47, 50). However, in this study, perfusion defects in Sonazoid-based CEUS were negatively associated with HER2-positive and low-expression breast cancers. In HER2-expressing cases, less fibrosis was observed, indicating the presence of abundant vascularity, compared with HER2-negative cases.
In the SVM models of the three modalities, the sensitivity in predicting HER2-positive breast cancer was increased by adding CEUS, from 0.728 (95%CI: 0.554–0.862) to 0.778 (95%CI: 0.608–0.899). After the TIC features were added, the AUC of the SVM model reached 0.806 (95%CI: 0.640–0.918). This result may indicate that the evaluation of microvasculature, especially through TIC features, could improve the performance of prediction models in HER2-positive breast cancer. In previous studies of Sonazoid-based CEUS in liver cancer, short mTT and FT could be factors that differentiate angiomyolipoma and hepatocellular carcinoma from hepatocellular carcinoma because of the different amounts of blood vessels (51). In this study, short FT may be associated with HER2 expression (IHC 3+, 2+, and 1+) in breast cancer, compared with HER2-negative expression cases. This may be related to the rapid excretion of Sonazoid microbubbles from intratumoral vessels in HER2-expressing breast lesions. FT may be related to the number of blood vessels inside tumors because abundant vessels may contribute to fast blood flow through the draining vein and a short contrast agent retention time. Therefore, HER2-expressing breast tumors tend to exhibit higher internal vascularity.
Among the 104 patients with breast cancer, a total of 31 cases with definite biopsy results were classified as IHC 2+ at the initial CNB. Two of these were finally classified as IHC 3+ according to the FISH results, revealing that 6.5% (2/31) of HER2-positive cases were underestimated by IHC in this study. The LR + SVM + XGB fusion model also predicted these two cases as IHC 3+, indicating that the fusion model could improve the detection of IHC 3+ compared with the CNB results assessed by pathologists.
4.3 Limitations and future directions
Our study used LR, SVM, and XGB decision-level fusion models to predict three HER2 expression levels in breast cancer in two cohorts based on a prospective multicenter study of Sonazoid-enhanced and B-mode ultrasound. However, this study has some limitations. First, the number of cases was limited because of the use of Sonazoid in breast CEUS multicenter studies. Second, this study only contained image features evaluated by radiologists. Radiomic features can capture quantifiable information that is unrecognizable to the naked eye, and using radiomic approaches in multi-modal ultrasound may improve the prediction of BC biomarkers. However, radiomic features extracted by software were less readily available than the features assessed by radiologists in this study. Third, our study only included images from B-mode ultrasound and CEUS. Additional modalities, such as MRI and mammography, are expected to be included in the prediction of HER2 expression.
5 Conclusion
In conclusion, multi-mode ultrasound, including B-mode ultrasound, CEUS, and TIC, can predict HER2 expression status. Moreover, the fusion model of machine learning classifiers can improve the prediction results.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving humans were approved by the Institutional Ethics Committee of Chinese PLA General Hospital, the center of the principal investigator, in Beijing, China (IRB number: 2020-300). Approval was obtained on July 23rd, 2020. The research was carried out in accordance with the Declaration of Helsinki. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
HZ: Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing. ML: Data curation, Writing – review & editing. HS: Data curation, Writing – review & editing. HL: Data curation, Writing – review & editing. NY: Data curation, Writing – review & editing. BC: Data curation, Writing – review & editing. YC: Data curation, Writing – review & editing. HD: Data curation, Writing – review & editing. WY: Data curation, Writing – review & editing. XJ: Data curation, Writing – review & editing. PZ: Data curation, Writing – review & editing. LC: Data curation, Writing – review & editing. JW: Data curation, Writing – review & editing. WX: Data curation, Writing – review & editing. XY: Data curation, Writing – review & editing. ZL: Data curation, Writing – review & editing. YuY: Data curation, Writing – review & editing. TW: Data curation, Writing – review & editing. HW: Data curation, Writing – review & editing. YuaY: Data curation, Writing – review & editing. CW: Data curation, Writing – review & editing. YiW: Data curation, Writing – review & editing. JS: Data curation, Writing – review & editing. YaW: Data curation, Writing – review & editing. XF: Data curation, Writing – review & editing. RL: Data curation, Writing – review & editing. PL: Funding acquisition, Resources, Writing – review & editing. JY: Project administration, Supervision, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by grants 82030047 and 92159305 from the National Scientific Foundation Committee of China.
Acknowledgments
We sincerely appreciate all the authors for their diligent work and valuable contributions during this period.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Gen AI was used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2025.1585823/full#supplementary-material
References
1. Qi, X, Zhang, L, Chen, Y, Pi, Y, Chen, Y, Lv, Q, et al. Automated diagnosis of breast ultrasonography images using deep neural networks. Med Image Anal. (2019) 52:185–98. doi: 10.1016/j.media.2018.12.006
2. Diéras, V, Miles, D, Verma, S, Pegram, M, Welslau, M, Baselga, J, et al. Trastuzumab emtansine versus capecitabine plus lapatinib in patients with previously treated HER2-positive advanced breast cancer (EMILIA): a descriptive analysis of final overall survival results from a randomised, open-label, phase 3 trial. Lancet Oncol. (2017) 18:732–42. doi: 10.1016/S1470-2045(17)30312-1
3. Krop, IE, Kim, SB, Martin, AG, LoRusso, PM, Ferrero, JM, Badovinac-Crnjevic, T, et al. Trastuzumab emtansine versus treatment of physician's choice in patients with previously treated HER2-positive metastatic breast cancer (TH3RESA): final overall survival results from a randomised open-label phase 3 trial. Lancet Oncol. (2017) 18:743–54. doi: 10.1016/S1470-2045(17)30313-3
4. von Minckwitz, G, Procter, M, de Azambuja, E, Zardavas, D, Benyunes, M, Viale, G, et al. Adjuvant pertuzumab and trastuzumab in early HER2-positive breast cancer. N Engl J Med. (2017) 377:122–31. doi: 10.1056/NEJMoa1703643
5. Bitencourt, AGV, Gibbs, P, Rossi Saccarelli, C, Daimiel, I, Lo Gullo, R, Fox, MJ, et al. MRI-based machine learning radiomics can predict HER2 expression level and pathologic response after neoadjuvant therapy in HER2 overexpressing breast cancer. EBioMedicine. (2020) 61:103042. doi: 10.1016/j.ebiom.2020.103042
6. Zardavas, D, Fouad, TM, and Piccart, M. Optimal adjuvant treatment for patients with HER2-positive breast cancer in 2015. Breast. (2015) 24:S143–8. doi: 10.1016/j.breast.2015.07.034
7. Choong, GM, Cullen, GD, and O'Sullivan, CC. Evolving standards of care and new challenges in the management of HER2-positive breast cancer. CA Cancer J Clin. (2020) 70:355–74. doi: 10.3322/caac.21634
8. Cronin, KA, Harlan, LC, Dodd, KW, Abrams, JS, and Ballard-Barbash, R. Population-based estimate of the prevalence of HER-2 positive breast cancer tumors for early stage patients in the US. Cancer Investig. (2010) 28:963–8. doi: 10.3109/07357907.2010.496759
9. Boughey, JC, McCall, LM, Ballman, KV, Mittendorf, EA, Ahrendt, GM, Wilke, LG, et al. Tumor biology correlates with rates of breast-conserving surgery and pathologic complete response after neoadjuvant chemotherapy for breast cancer: findings from the ACOSOG Z1071 (Alliance) prospective multicenter clinical trial. Ann Surg. (2014) 260:608–14. doi: 10.1097/SLA.0000000000000924
10. Das, A, Nair, MS, and Peter, SD. Computer-aided histopathological image analysis techniques for automated nuclear atypia scoring of breast cancer: a review. J Digit Imaging. (2020) 33:1091–121. doi: 10.1007/s10278-019-00295-z
11. Slostad, JA, Yun, NK, Schad, AE, Warrior, S, Fogg, LF, and Rao, R. Concordance of breast cancer biomarker testing in core needle biopsy and surgical specimens: a single institution experience. Cancer Med. (2022) 11:4954–65. doi: 10.1002/cam4.4843
12. Zheng, X, Yao, Z, Huang, Y, Yu, Y, Wang, Y, Liu, Y, et al. Deep learning radiomics can predict axillary lymph node status in early-stage breast cancer. Nat Commun. (2020) 11:1236. doi: 10.1038/s41467-020-15027-z
13. Allott, EH, Geradts, J, Sun, X, Cohen, SM, Zirpoli, GR, Khoury, T, et al. Intratumoral heterogeneity as a source of discordance in breast cancer biomarker classification. Breast Cancer Res. (2016) 18:68. doi: 10.1186/s13058-016-0725-1
14. Lu, Y, Zhu, S, Tong, Y, Fei, X, Jiang, W, Shen, K, et al. HER2-low status is not accurate in breast cancer core needle biopsy samples: an analysis of 5610 consecutive patients. Cancers (Basel). (2022) 14:6200. doi: 10.3390/cancers14246200
15. Santiago, L, Adrada, BE, Huang, ML, Wei, W, and Candelaria, RP. Breast cancer neoplastic seeding in the setting of image-guided needle biopsies of the breast. Breast Cancer Res Treat. (2017) 166:29–39. doi: 10.1007/s10549-017-4401-7
16. Liebens, F, Carly, B, Cusumano, P, van Beveren, M, Beier, B, Fastrez, M, et al. Breast cancer seeding associated with core needle biopsies: a systematic review. Maturitas. (2009) 62:113–23. doi: 10.1016/j.maturitas.2008.12.002
17. Stolier, A, Skinner, J, and Levine, EA. A prospective study of seeding of the skin after core biopsy of the breast. Am J Surg. (2000) 180:104–7. doi: 10.1016/S0002-9610(00)00425-6
18. Xie, Y, Chen, Y, Wang, Q, Li, B, Shang, H, and Jing, H. Early prediction of response to neoadjuvant chemotherapy using quantitative parameters on automated breast ultrasound combined with contrast-enhanced ultrasound in breast cancer. Ultrasound Med Biol. (2023) 49:1638–46. doi: 10.1016/j.ultrasmedbio.2023.03.017
19. Boca Bene, I, Ciurea, AI, Ciortea, CA, Ștefan, PA, Lisencu, LA, Dudea, SM, et al. Differentiating breast tumors from background parenchymal enhancement at contrast-enhanced mammography: the role of radiomics-a pilot reader study. Diagnostics (Basel). (2021) 11:1248. doi: 10.3390/diagnostics11071248
20. Luo, J, Tang, L, Chen, Y, Yang, L, Shen, R, Cheng, Y, et al. A prospective multicenter study on the additive value of contrast-enhanced ultrasound for biopsy decision of ultrasound BI-RADS 4 breast lesions. Ultrasound Med Biol. (2024) 50:1224–31. doi: 10.1016/j.ultrasmedbio.2024.04.010
21. Kotopoulis, S, Popa, M, Mayoral Safont, M, Murvold, E, Haugse, R, Langer, A, et al. SonoVue® vs. Sonazoid™ vs. Optison™: which bubble is best for low-intensity sonoporation of pancreatic ductal adenocarcinoma? Pharmaceutics. (2022) 14:98. doi: 10.3390/pharmaceutics14010098
22. Alter, J, Sennoga, CA, Lopes, DM, Eckersley, RJ, and Wells, DJ. Microbubble stability is a major determinant of the efficiency of ultrasound and microbubble mediated in vivo gene transfer. Ultrasound Med Biol. (2009) 35:976–84. doi: 10.1016/j.ultrasmedbio.2008.12.015
23. Hao, Y, Sun, Y, Lei, Y, Zhao, H, and Cui, L. Percutaneous Sonazoid-enhanced ultrasonography combined with in vitro verification for detection and characterization of sentinel lymph nodes in early breast cancer. Eur Radiol. (2021) 31:5894–901. doi: 10.1007/s00330-020-07639-2
24. Akselrod-Ballin, A, Chorev, M, Shoshan, Y, Spiro, A, Hazan, A, Melamed, R, et al. Predicting breast cancer by applying deep learning to linked health records and mammograms. Radiology. (2019) 292:331–42. doi: 10.1148/radiol.2019182622
25. Turkki, R, Byckhov, D, Lundin, M, Isola, J, Nordling, S, Kovanen, PE, et al. Breast cancer outcome prediction with tumour tissue images and machine learning. Breast Cancer Res Treat. (2019) 177:41–52. doi: 10.1007/s10549-019-05281-1
26. Wolff, AC, Hammond, MEH, Allison, KH, Harvey, BE, Mangu, PB, Bartlett, JMS, et al. Human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists Clinical Practice Guideline Focused Update. J Clin Oncol. (2018) 36:2105–22. doi: 10.1200/JCO.2018.77.8738
27. Ranstam, J, and Cook, JA. LASSO regression. Br J Surg. (2018) 105:1348. doi: 10.1002/bjs.10895
28. Iparragirre, A, Lumley, T, Barrio, I, and Arostegui, I. Variable selection with LASSO regression for complex survey data. Stat. (2023) 12:e578. doi: 10.1002/sta4.578
29. He, X, Pan, J, Jin, O, Xu, T, Liu, B, Xu, T, et al. Practical lessons from predicting clicks on ads at Facebook. In: Eighth International Workshop on Data Mining for Online Advertising. New York, NY, USA: Association for Computing Machinery (2014).
30. Zhou, S, Hu, C, Wei, S, and Yan, X. Breast cancer prediction based on multiple machine learning algorithms. Technol Cancer Res Treat. (2024) 23:15330338241234791. doi: 10.1177/15330338241234791
31. Maniotis, AJ, Folberg, R, Hess, A, Seftor, EA, Gardner, LMG, Pe'er, J, et al. Vascular channel formation by human melanoma cells in vivo and in vitro: vasculogenic mimicry. Am J Pathol. (1999) 155:739–52. doi: 10.1016/S0002-9440(10)65173-5
32. Chiao, MT, Yang, YC, Cheng, WY, Shen, CC, and Ko, JL. CD133+ glioblastoma stem-like cells induce vascular mimicry in vivo. Curr Neurovasc Res. (2011) 8:210–9. doi: 10.2174/156720211796558023
33. Wagenblast, E, Soto, M, Gutiérrez-Ángel, S, Hartl, CA, Gable, AL, Maceli, AR, et al. A model of breast cancer heterogeneity reveals vascular mimicry as a driver of metastasis. Nature. (2015) 520:358–62. doi: 10.1038/nature14403
34. Morales-Guadarrama, G, García-Becerra, R, Méndez-Pérez, EA, García-Quiroz, J, Avila, E, and Díaz, L. Vasculogenic mimicry in breast cancer: clinical relevance and drivers. Cells. (2021) 10:1758. doi: 10.3390/cells10071758
35. Zhou, YT, Cai, WW, Li, Y, Jiang, X, Feng, L, Zhu, QY, et al. Correlations between quantitative parameters of contrast-enhanced ultrasound and vasculogenic mimicry in murine tumor model: a novel noninvasive technique for assessment? Biol Proced Online. (2019) 21:11. doi: 10.1186/s12575-019-0101-5
36. Liu, H, Gao, M, Gu, J, Wan, X, Wang, H, Gu, Q, et al. VEGFR1-targeted contrast-enhanced ultrasound imaging quantification of vasculogenic mimicry microcirculation in a mouse model of choroidal melanoma. Transl Vis Sci Technol. (2020) 9:4. doi: 10.1167/tvst.9.3.4
37. Rothschild, HT, Clelland, E, Patterson, A, Molina-Vega, J, Kaur, M, Symmans, WF, et al. HER-2 low status in early-stage invasive lobular carcinoma of the breast: associated factors and outcomes in an institutional series. Breast Cancer Res Treat. (2023) 199:349–54. doi: 10.1007/s10549-023-06927-x
38. Xue, S, Zhao, Q, Tai, M, Li, N, and Liu, Y. Correlation between breast ultrasound microcalcification and the prognosis of breast cancer. J Healthc Eng. (2021) 2021:6835963. doi: 10.1155/2021/6835963
39. Wang, Y, Ikeda, DM, Narasimhan, B, Longacre, TA, Bleicher, RJ, Pal, S, et al. Estrogen receptor-negative invasive breast cancer: imaging features of tumors with and without human epidermal growth factor receptor type 2 overexpression. Radiology. (2008) 246:367–75. doi: 10.1148/radiol.2462070169
40. Eroles, P, Bosch, A, Alejandro Pérez-Fidalgo, J, and Lluch, A. Molecular biology in breast cancer: intrinsic subtypes and signaling pathways. Cancer Treat Rev. (2012) 38:698–707. doi: 10.1016/j.ctrv.2011.11.005
41. Cui, H, Sun, Y, Zhao, D, Zhang, X, Kong, H, Hu, N, et al. Radiogenomic analysis of prediction HER2 status in breast cancer by linking ultrasound radiomic feature module with biological functions. J Transl Med. (2023) 21:44. doi: 10.1186/s12967-022-03840-7
42. Xing, F, Gao, H, Chen, G, Sun, L, Sun, J, Qiao, X, et al. CMTM6 overexpression confers trastuzumab resistance in HER2-positive breast cancer. Mol Cancer. (2023) 22:6. doi: 10.1186/s12943-023-01716-y
43. Loibl, S, and Gianni, L. HER2-positive breast cancer. Lancet. (2017) 389:2415–29. doi: 10.1016/S0140-6736(16)32417-5
44. Mujtaba, SS, Ni, YB, Tsang, JYS, Chan, SK, Yamaguchi, R, Tanaka, M, et al. Fibrotic focus in breast carcinomas: relationship with prognostic parameters and biomarkers. Ann Surg Oncol. (2013) 20:2842–9. doi: 10.1245/s10434-013-2955-0
45. Hasebe, T, Mukai, K, Tsuda, H, and Ochiai, A. New prognostic histological parameter of invasive ductal carcinoma of the breast: clinicopathological significance of fibrotic focus. Pathol Int. (2000) 50:263–72. doi: 10.1046/j.1440-1827.2000.01035.x
46. Hasebe, T, Tsuda, H, Hirohashi, S, Shimosato, Y, Iwai, M, Imoto, S, et al. Fibrotic focus in invasive ductal carcinoma: an indicator of high tumor aggressiveness. Jpn J Cancer Res. (1996) 87:385–94. doi: 10.1111/j.1349-7006.1996.tb00234.x
47. Wang, XY, Hu, Q, Fang, MY, He, Y, Wei, HM, Chen, XX, et al. The correlation between HER-2 expression and the CEUS and ARFI characteristics of breast cancer. PLoS One. (2017) 12:e0178692. doi: 10.1371/journal.pone.0178692
48. Liang, X, Li, Z, Zhang, L, Wang, D, and Tian, J. Application of contrast-enhanced ultrasound in the differential diagnosis of different molecular subtypes of breast cancer. Ultrason Imaging. (2020) 42:261–70. doi: 10.1177/0161734620959780
49. Mie Lee, Y, Kim, SH, Kim, HS, Jin Son, M, Nakajima, H, Jeong Kwon, H, et al. Inhibition of hypoxia-induced angiogenesis by FK228, a specific histone deacetylase inhibitor, via suppression of HIF-1alpha activity. Biochem Biophys Res Commun. (2003) 300:241–6. doi: 10.1016/S0006-291X(02)02787-0
50. Jain, RK. Normalizing tumor vasculature with anti-angiogenic therapy: a new paradigm for combination therapy. Nat Med. (2001) 7:987–9. doi: 10.1038/nm0901-987
Keywords: human epidermal growth factor receptor 2, breast cancer, Sonazoid, ultrasound, machine learning
Citation: Zhang H, Lang M, Shen H, Li H, Yang N, Chen B, Chen Y, Ding H, Yang W, Ji X, Zhou P, Cui L, Wang J, Xu W, Ye X, Liu Z, Yang Y, Wei T, Wang H, Yan Y, Wu C, Wu Y, Shi J, Wang Y, Fang X, Li R, Liang P and Yu J (2025) Machine learning-based fusion model for predicting HER2 expression in breast cancer by Sonazoid-enhanced ultrasound: a multicenter study. Front. Med. 12:1585823. doi: 10.3389/fmed.2025.1585823
Edited by:
Juan Wang, The Second Affiliated Hospital of Xi’an Jiaotong University, China
Reviewed by:
Qiao Hu, The People’s Hospital of Guangxi Zhuang Autonomous Region, China
Xuejun Ni, Affiliated Hospital of Nantong University, China
Copyright © 2025 Zhang, Lang, Shen, Li, Yang, Chen, Chen, Ding, Yang, Ji, Zhou, Cui, Wang, Xu, Ye, Liu, Yang, Wei, Wang, Yan, Wu, Wu, Shi, Wang, Fang, Li, Liang and Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Jie Yu, jiemi301@163.com
†These authors have contributed equally to this work and share first authorship