ORIGINAL RESEARCH article

Front. Cardiovasc. Med., 25 July 2024

Sec. Atherosclerosis and Vascular Medicine

Volume 11 - 2024 | https://doi.org/10.3389/fcvm.2024.1392752

Screening for carotid atherosclerosis: development and validation of a high-precision risk scoring tool

  • Zhi-Xin Huang 1,2*

  • Lijuan Chen 3

  • Ping Chen 4

  • Yingyi Dai 1

  • Haike Lu 1

  • Yicheng Liang 1

  • Qingguo Ding 5

  • Piaonan Liang 6

  • 1. Department of Neurology, The Affiliated Guangdong Second Provincial General Hospital of Jinan University, Guangzhou, Guangdong, China

  • 2. Guangzhou University of Chinese Medicine, Guangzhou, Guangdong, China

  • 3. Department of Ultrasound, Songyang County People’s Hospital, Lishui, Zhejiang, China

  • 4. Department of Neurology, The First Hospital of Putian City, Putian, Fujian, China

  • 5. Department of Neurology, Nanhai Economic Development Zone People’s Hospital, Foshan, Guangdong, China

  • 6. Department of Rehabilitation Medicine, The Affiliated Guangdong Second Provincial General Hospital of Jinan University, Guangzhou, Guangdong, China


Abstract

Objective:

This study aimed to investigate the prevalence of carotid atherosclerosis (CAS), especially among seniors, and develop a precise risk assessment tool to facilitate screening and early intervention for high-risk individuals.

Methods:

A comprehensive approach was employed, integrating traditional epidemiological methods with advanced machine learning techniques, including support vector machines, XGBoost, decision trees, random forests, and logistic regression.

Results:

Among 1,515 participants, CAS prevalence reached 57.4%, concentrated within older individuals. Positive correlations were identified with age, systolic blood pressure, a history of hypertension, male gender, and total cholesterol. High-density lipoprotein (HDL) emerged as a protective factor against CAS, with total cholesterol and HDL levels proving significant predictors.

Conclusions:

This research illuminates the risk factors linked to CAS and introduces a validated risk scoring tool, highlighted by the logistic classifier's consistent performance during training and testing. This tool shows potential for pinpointing high-risk individuals in community health programs, streamlining screening and intervention by clinical physicians. By stressing the significance of managing cholesterol levels, especially HDL, our findings provide actionable insights for CAS prevention. Nonetheless, rigorous validation is paramount to guarantee its practicality and efficacy in real-world scenarios.

Introduction

Carotid atherosclerosis (CAS) is not just an indication of systemic atherosclerosis in the carotid arteries; its implications extend far beyond that (1, 2). First, CAS has garnered significant attention because of its close association with stroke (3), a grave health concern to which CAS is recognized as a major contributing factor. Notably, asymptomatic CAS has a prevalence of up to 40% among middle-aged and elderly individuals, implying that it goes unnoticed in many people (4). Rupture of atherosclerotic plaques in the carotid arteries can trigger transient ischemic attacks and strokes, profoundly affecting patients’ lives. Moreover, because atherosclerosis is a systemic disease, CAS is closely related to coronary atherosclerosis, further elevating the risk of cardiovascular and cerebrovascular diseases (5, 6). These multiple hazards underscore the urgent need for in-depth research and management of CAS.

Ultrasound diagnosis plays a pivotal role in studying and managing CAS (7). Carotid ultrasound examination provides crucial insights into disease severity. It facilitates early diagnosis and monitoring of disease progression by quantifying carotid intima-media thickness (CIMT) and detecting carotid atherosclerotic plaques (CAP) (8, 9). This non-invasive tool is safe, involves no radiation, and is applicable across age groups from children to seniors. These advantages make ultrasound a powerful method for characterizing CAS and guiding its clinical management. Our focus was on often-overlooked community populations. We used ultrasound for CAS screening, aiming to identify subjects at risk and evaluate associated factors. These data could enable the establishment of a robust scoring model to evaluate CAS risk in patients and tailor preventive strategies. Through this study, we aim to characterize CAS and inform its management, especially in community settings, filling knowledge gaps to enhance screening and care.

Methods

Study population

The study protocol was reviewed and approved by the Ethical Committee of Guangdong Second Provincial General Hospital (Approval number: GD2H-KY IRB-AF-SC.07-01.2), and the ethical guidelines of the 1975 Declaration of Helsinki were followed.

A prospective longitudinal study named Ivy Action was performed as a voluntary prevention screening program for ischemic cerebrovascular disease, targeting the adult population of multiple communities in Guangzhou in 2018–2019. The study included individuals aged 35 years or older who had no history of stroke or had recovered well after stroke (modified Rankin Scale score of 2 or less). Exclusion criteria were inability to communicate with the research team, mobility impairments hindering study participation, heart, liver, or kidney failure, and a history of malignancy. The study protocol included gathering information on basic socio-economic status, social and residential status, smoking, housework, physical activity, sleep, and dietary habits. Additionally, cardiovascular risk factors, family history, and medical history were recorded.

In our machine learning models, we employed a comprehensive set of risk factors, including age, gender, biomarkers, systolic and diastolic blood pressure, health status, lifestyle factors, family history of cardiovascular disease, and medical history (Table 1). These factors were chosen for their well-documented association with carotid atherosclerosis and their clinical significance in predicting cardiovascular events. The medical exam included non-invasive tests (resting blood pressure, anthropometric measurements), ECG, anxiety and depression scales, venous blood tests performed in a central laboratory with conventional enzymatic methods, echocardiography, and carotid duplex scans (10). Waist and hip circumference were measured with participants in light clothing. Cardiac history encompassed any medical history of coronary artery disease, myocardial infarction, angina, atrial fibrillation, or valvular heart disease. A waist-to-hip ratio above 0.90 for men or above 0.85 for women was considered central obesity.
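The central obesity rule can be written as a short check. The following is a minimal Python sketch, not the authors' R code; the function name and the "male"/"female" labels are illustrative assumptions, and only the two cutoffs come from the text.

```python
# Sex-specific waist-to-hip ratio cutoffs as stated in the text.
WHR_CUTOFF = {"male": 0.90, "female": 0.85}


def has_central_obesity(waist_cm: float, hip_cm: float, sex: str) -> bool:
    """Return True when the waist-to-hip ratio exceeds the sex-specific cutoff.

    The function name and the sex labels are hypothetical; the study only
    states the two cutoffs.
    """
    if hip_cm <= 0:
        raise ValueError("hip circumference must be positive")
    return waist_cm / hip_cm > WHR_CUTOFF[sex]
```

Both circumferences must be in the same unit; the ratio itself is unitless.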

Table 1

| Characteristic | No CAS, N = 646 (a) | CAS, N = 869 (a) |
|---|---|---|
| Sex | | |
|  Male | 157 (24%) | 350 (40%) |
|  Female | 489 (76%) | 519 (60%) |
| Age | 54 (45, 60) | 63 (59, 68) |
| Biomarkers | | |
|  hsCRP | 0.85 (0.43, 1.78) | 1.14 (0.59, 2.39) |
|   Unknown | 2 | 5 |
|  Uric Acid | 329 (274, 396) | 351 (297, 428) |
|   Unknown | 2 | 2 |
|  Hemoglobin | 5.60 (5.30, 5.80) | 5.70 (5.50, 6.10) |
|   Unknown | 2 | 2 |
|  Glucose | 4.69 (4.34, 5.18) | 5.03 (4.58, 5.91) |
|   Unknown | 2 | 2 |
|  Homocysteine | 10.4 (8.8, 12.6) | 11.7 (9.8, 14.3) |
|   Unknown | 2 | 2 |
|  LDL | 2.83 (2.37, 3.30) | 3.00 (2.45, 3.63) |
|   Unknown | 2 | 2 |
|  TG | 1.27 (0.93, 1.90) | 1.50 (1.10, 2.08) |
|   Unknown | 2 | 2 |
|  HDL | 1.31 (1.15, 1.52) | 1.29 (1.13, 1.49) |
|   Unknown | 2 | 2 |
|  TC | 5.26 (4.65, 5.91) | 5.42 (4.74, 6.16) |
|   Unknown | 2 | 2 |
| Blood pressure | | |
|  SBP | 120 (109, 131) | 133 (121, 146) |
|   Unknown | 18 | 11 |
|  DBP | 79 (73, 86) | 82 (75, 89) |
|   Unknown | 18 | 11 |
| Medical history | | |
|  Hypertension | 100 (15%) | 330 (38%) |
|  Dyslipidemia | 116 (18%) | 264 (30%) |
|  Diabetes | 29 (4.5%) | 114 (13%) |
|  Cardiac history | 26 (4.0%) | 109 (13%) |
|  Depression or anxiety | 124 (19%) | 154 (18%) |
|  Cognitive impairment | 13 (2.0%) | 19 (2.2%) |
| Family history | | |
|  Stroke | 103 (16%) | 140 (16%) |
|   Unknown | 0 | 1 |
|  Coronary heart disease | 65 (10%) | 105 (12%) |
|   Unknown | 0 | 1 |
|  Hypertension | 245 (38%) | 325 (37%) |
|   Unknown | 0 | 1 |
|  Diabetes | 106 (16%) | 109 (13%) |
|   Unknown | 0 | 1 |
| Health status | | |
|  BMI | 23.2 (21.5, 25.2) | 23.8 (21.8, 25.8) |
|   Unknown | 16 | 14 |
|  Central obesity | 323 (50%) | 592 (68%) |
|  Smoking | 76 (12%) | 159 (18%) |
|   Unknown | 1 | 2 |
|  Drinking | 103 (16%) | 204 (24%) |
|   Unknown | 2 | 3 |
|  Sleep | 357 (56%) | 492 (57%) |
|   Unknown | 10 | 6 |
| Lifestyle factors | | |
|  Living alone | 23 (3.6%) | 44 (5.1%) |
|   Unknown | 1 | 3 |
|  Sport levels | | |
|   Lack of physical activity | 270 (50%) | 355 (47%) |
|   Frequent physical activity | 270 (50%) | 393 (53%) |
|   Unknown | 106 | 121 |
|  Housework level | | |
|   No | 37 (6.3%) | 73 (9.0%) |
|   Occasionally | 46 (7.8%) | 68 (8.4%) |
|   30 min per day | 35 (5.9%) | 55 (6.8%) |
|   30 min–1 h per day | 90 (15%) | 130 (16%) |
|   1–3 h per day | 243 (41%) | 307 (38%) |
|   4 h per day | 140 (24%) | 180 (22%) |
|   Unknown | 55 | 56 |
|  Mutual action | | |
|   Virtually none | 221 (39%) | 299 (38%) |
|   1–2 days per week | 214 (38%) | 278 (35%) |
|   3–5 days per week | 85 (15%) | 138 (17%) |
|   6–7 days per week | 46 (8.1%) | 80 (10%) |
|   Unknown | 80 | 74 |

Demographic information for participants.

a. n (%); Median (IQR); “unknown” indicates missing values.

It is essential to note that the primary aim of this study was focused on the prevention and management of stroke. All examination procedures were provided free of charge, and participants had the option to voluntarily undergo testing, ensuring that the study adhered to ethical standards and obtained informed consent from the participants.

Utilizing high-resolution ultrasound to assess CAS

Examinations were conducted with the patient supine and the head slightly extended. Carotid intima-media thickness (IMT) was evaluated with a 7.5 MHz linear array transducer on a LOGIQ E ultrasound system (GE Healthcare, USA). The maximum IMT was measured at two sites and at all interfaces of the near and far walls of the common carotid artery (CCA), 1–1.5 cm below its bifurcation. To obtain anechoic images, the image gain, depth, and focus were adjusted for each participant. Plaques were defined as a focal thickening extending into the lumen by at least 0.5 mm or as an IMT greater than 1.5 mm, measured between the near and far walls of any carotid segment (internal carotid artery, common carotid artery, or carotid bulb). Carotid atherosclerosis (CAS) was defined as a maximal IMT of ≥1.0 mm and/or the presence of atherosclerotic plaques. To minimize inter-operator variation, a single experienced sonographer conducted all ultrasound examinations.
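The ultrasound criteria above amount to a simple decision rule. The sketch below is one possible Python encoding, assuming each carotid segment is summarized by its IMT and by how far any focal thickening extends into the lumen; the class and function names are hypothetical, and only the three thresholds come from the text.

```python
from dataclasses import dataclass

IMT_THRESHOLD_MM = 1.0     # maximal IMT at or above this value counts as positive
PLAQUE_FOCAL_MM = 0.5      # focal thickening extending into the lumen
PLAQUE_IMT_MM = 1.5        # IMT above this value is treated as plaque


@dataclass
class SegmentMeasurement:
    """One carotid segment (hypothetical record layout)."""
    imt_mm: float
    focal_protrusion_mm: float = 0.0


def is_plaque(seg: SegmentMeasurement) -> bool:
    """Plaque: focal thickening >= 0.5 mm into the lumen, or IMT > 1.5 mm."""
    return seg.focal_protrusion_mm >= PLAQUE_FOCAL_MM or seg.imt_mm > PLAQUE_IMT_MM


def is_positive(segments: list[SegmentMeasurement]) -> bool:
    """Positive when maximal IMT >= 1.0 mm and/or any segment carries a plaque."""
    if not segments:
        raise ValueError("at least one segment measurement is required")
    max_imt = max(seg.imt_mm for seg in segments)
    return max_imt >= IMT_THRESHOLD_MM or any(is_plaque(seg) for seg in segments)
```

Because an IMT above 1.5 mm already satisfies the 1.0 mm criterion, the plaque rule adds cases only through focal thickening in segments with a thinner wall.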

A comprehensive approach to machine learning modeling and evaluation

The analytical approach employed in this study leveraged R code to perform several critical steps. To address missing data, the “missRanger” package was utilized for imputing missing values. Subsequently, a diverse set of machine learning models, encompassing support vector machines (SVM), XGBoost, decision trees, random forests, and logistic regression, were employed. Hyperparameter tuning was conducted through random search to identify the optimal model configuration. To ensure robust model performance while mitigating overfitting, a 5-fold cross-validation strategy was implemented. Given the presence of imbalanced target classes, data balance was achieved through oversampling. Model performance evaluation primarily relied on the area under the curve (AUC) metric, facilitating the selection of the best-performing machine learning method as the final model.
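To make the resampling and evaluation steps concrete, the following is a minimal standard-library Python sketch of random oversampling, 5-fold index splitting, and a rank-based AUC. It is an illustration under stated assumptions, not the R pipeline used in the study: the function names are hypothetical, labels are assumed to be 0/1, and imputation, model fitting, and hyperparameter search are omitted.

```python
import random


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k roughly equal, shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

In a sketch like this, oversampling would be applied to each training fold only, so that duplicated rows never appear in the fold used for evaluation.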

In the final model evaluation, three key approaches were employed: (1) ROC and PRC (Precision-Recall Curve) plots provided in-depth insights into model performance. (2) Model calibration was conducted using the calibration function from the “calibrate” package. This process involved generating calibration plots for apparent and bias-corrected probabilities to comprehensively assess model performance. (3) Decision Curve Analysis (DCA) was carried out to evaluate the clinical utility of the model. Decision curves and the plot_decision_curve function were employed to visualize the net benefit of the model across various threshold probabilities.
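For readers unfamiliar with decision curve analysis, net benefit at a threshold probability pt is commonly computed as TP/n − (FP/n) × pt/(1 − pt). The Python sketch below implements that standard definition together with the "treat all" reference strategy; it is not the R function used in the study, and the function names are hypothetical.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat every individual regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A decision curve plots these quantities over a range of thresholds; a model is considered useful at a given threshold when its net benefit exceeds both the treat-all line and zero (treat none).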

Additionally, a forest plot was created using the forest_model function from the “forestmodel” package, offering deeper insights into the relationship between predictor variables and the outcome variable. Sample size calculation was executed using the ShowRegTable function, contributing to an understanding of dataset characteristics and ensuring an adequate sample size for the analysis.

In summary, this methodological approach comprised data imputation, machine learning modeling, hyperparameter tuning, cross-validation, data balancing, and comprehensive model performance evaluation, culminating in the selection of the best-performing model. The final model evaluation involved ROC and PRC plots, model calibration, Decision Curve Analysis, and a forest plot. Sample size calculations were conducted to support the robustness of the analysis.

Developing a risk scoring system

The scoring system was developed in the following steps.

Grouping and reference value selection: Risk factors were grouped based on their clinical significance or common usage. Within each group, a numerical reference value (Wij) was selected, typically the midpoint of the group.

Handling categorical variables: For categorical variables such as gender, one category was chosen as the reference and assigned a reference value of 0, while the other categories were assigned numerical values, typically 1.

Basic risk reference value: For each risk factor, one group was selected as the basic risk reference value (WiREF). The score for this group was set to 0 in the scoring tool, and the other groups received positive or negative scores according to their relationship with WiREF.

Scoring calculation: Using the regression coefficients estimated by a multiple logistic regression model (βi) and the reference values for each risk factor group (Wij), the distance (D) between each group and the basic risk reference value (WiREF) was calculated as D = (Wij−WiREF) * βi.

Risk probability calculation: Using the equation of the multiple logistic regression model, the predicted probability of risk was calculated for each score. This probability represents the likelihood of an individual experiencing the event given their risk factors. The formula is:

$$\hat{p} = \frac{1}{1 + \exp\left(-\sum_{i} \beta_i X_i\right)}$$

where $\sum_{i} \beta_i X_i$ represents a linear combination, which is approximately a constant term plus a multiple of the total score across the risk factors.

Analyses were conducted using the R Statistical language (version 4.3.1; R Core Team, 2023).

Results

Population characteristics

Initially, 1,560 participants were recruited; however, 45 individuals with incomplete carotid ultrasound examinations were excluded from analysis. Consequently, the final analytical sample comprised 1,515 subjects. Carotid ultrasound revealed CAS in 869 participants, a prevalence of 57.4% in the study cohort. Individuals in the CAS group were older than those in the non-CAS group (median 63 years vs. 54 years). Additionally, the majority of individuals with CAS were female (519 subjects, 60%). Detailed demographic information for the entire study population is presented in Table 1.

Machine learning model training and evaluation process

The preprocessing phase involved imputing missing values to ensure a complete dataset, as depicted in Figure 1. Subsequently, five machine learning models were explored, including logistic regression, support vector machines (SVM), XGBoost, decision trees and random forests. Hyperparameter optimization was carried out through random search across defined parameter spaces for each algorithm. To mitigate class imbalance, the “oversample” method from the classbalancing package was utilized, equalizing the training set's target categories. For imbalanced classification, the area under the receiver operating characteristic curve (AUC) served as a robust metric of overall performance. As shown in Figure 2, the ROC curves described each learner's discrimination capacity across thresholds, with the AUC score quantifying overall predictive power.

Figure 1

Figure 2

The AUC values for each learner on the training set were also computed and are shown in the figure. The performance of the machine learning algorithms on the training and test sets is summarized in Table 2. All classifiers achieved high AUCs on the training set (0.79–1.00). On the test set, however, AUCs fell to 0.75–0.82, with the logistic classifier achieving the highest value (0.82). This discrepancy underscores potential challenges in generalization to unseen data, with the logistic regression model showing the greatest robustness in the transition from training to testing. Therefore, the logistic regression model (Figure 3) was selected as the final model, incorporating six variables: age, systolic pressure, hypertension, total cholesterol, HDL cholesterol, and sex. Based on these six variables, the model was named the AP2C2S model.

Specifically, age showed a significant positive correlation with CAS (OR = 1.14, 95% CI: 1.13–1.16, P < 0.001), and systolic pressure also demonstrated a positive correlation (OR = 1.02, 95% CI: 1.01–1.02, P < 0.001). Conversely, high-density lipoprotein (HDL) exhibited a significant negative correlation with CAS (OR = 0.30, 95% CI: 0.17–0.50, P < 0.001), indicating its role as a protective factor against CAS. Additionally, total cholesterol (TC) showed a positive correlation with CAS (OR = 1.58, 95% CI: 1.35–1.86, P < 0.001). Gender, treated as a categorical variable, was significantly associated with CAS, with males showing a higher risk (OR = 2.09, 95% CI: 1.57–2.80, P < 0.001). Moreover, a history of hypertension was positively associated with CAS (OR = 1.53, 95% CI: 1.11–2.11, P = 0.009). These findings underscore the significance of these variables in predicting CAS risk and justify their inclusion in the AP2C2S model.

Table 2

| Learner | Area under the curve, training set | Area under the curve, test set | Sensitivity | Specificity | False negative rate | False positive rate |
|---|---|---|---|---|---|---|
| Logistic regression | 0.8505827 | 0.8208542 | 0.8216663 | 0.6595230 | 0.1783337 | 0.3404770 |
| Support vector machines | 0.9395673 | 0.8061250 | 0.8365956 | 0.6162075 | 0.1634044 | 0.3837925 |
| XGBoost | 1.0000000 | 0.7876161 | 0.8043851 | 0.6269529 | 0.1956149 | 0.3730471 |
| Decision trees | 0.7886977 | 0.7631142 | 0.8412331 | 0.5975671 | 0.1587669 | 0.4024329 |
| Random forests | 0.9995162 | 0.8145973 | 0.8596306 | 0.6207752 | 0.1403694 | 0.3792248 |
| Scale.oversample.classif.log_reg | 0.8495273 | 0.8218000 | 0.7595176 | 0.7214073 | 0.2404824 | 0.2785927 |
| Scale.oversample.classif.svm | 0.9466890 | 0.7996170 | 0.7871304 | 0.6672630 | 0.2128696 | 0.3327370 |
| Scale.oversample.classif.xgboost | 1.0000000 | 0.7943870 | 0.8020730 | 0.6502087 | 0.1979270 | 0.3497913 |
| Scale.oversample.classif.rpart | 0.7904094 | 0.7518600 | 0.7676433 | 0.6703518 | 0.2323567 | 0.3296482 |
| Scale.oversample.classif.ranger | 0.9996804 | 0.8163943 | 0.8366222 | 0.6718664 | 0.1633778 | 0.3281336 |

Comparison of the performance of machine learning classifiers.

This set of machine learning pipelines (“scale.oversample.classif.log_reg”, “scale.oversample.classif.svm”, “scale.oversample.classif.xgboost”, “scale.oversample.classif.rpart”, “scale.oversample.classif.ranger”) corresponds to different classifiers such as logistic regression, support vector machine, XGBoost, recursive partitioning tree (rpart), and Ranger random forest, aimed at accomplishing the final classification task. The ensemble framework is designed to enhance the model's ability to handle imbalanced data, thereby improving the overall performance of the classification algorithm.
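The four rate columns in Table 2 are linked in pairs: the false negative rate is one minus sensitivity, and the false positive rate is one minus specificity. A minimal Python sketch of how such rates follow from confusion-matrix counts is given below; the function name and the counts in the usage note are illustrative assumptions, not study data.

```python
def classification_rates(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Derive the Table 2 style rates from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        "sensitivity": tp / (tp + fn),          # true positives among all positives
        "specificity": tn / (tn + fp),          # true negatives among all negatives
        "false_negative_rate": fn / (tp + fn),  # equals 1 - sensitivity
        "false_positive_rate": fp / (tn + fp),  # equals 1 - specificity
    }
```

For instance, hypothetical counts of 80 true positives, 20 false negatives, 60 true negatives, and 40 false positives give a sensitivity of 0.8 and a specificity of 0.6.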

Figure 3

Expanding upon our analysis, we extracted the regression model and generated ROC and PRC curves (Figures 4A,B) to further assess the model. We performed consistency checks using the calibrate function and depicted the calibration curve (Figure 4C). Additionally, employing decision curve analysis allowed us to assess patient benefits at different thresholds, providing insights into the model's potential value in real clinical decision-making (Figure 4D). This series of analyses and evaluations contributes to a thorough understanding of the model's performance and applicability.

Figure 4

Development of a risk scoring tool (AP2C2S)

Firstly, we categorized the variables of the final model as per the methodology section, assigning values to each variable based on the description provided (refer to Table 3). Subsequently, utilizing the equation of the multiple logistic regression model, we calculated the risk prediction probability for each corresponding score. Through this iterative process, we established a comprehensive table illustrating the correspondence between total scores and risk prediction probabilities, as depicted in Table 4.

Table 3

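| Factors | Categories | Reference value (Wij) | βi | βi (Wij−WiREF) | Pointsij = D/B = (Wij−WiREF) * βi/B |
|---|---|---|---|---|---|
| Age | | | 0.127 | | |
| | 35–44 | 39.5 = W1REF | | 0 | 0 |
| | 45–54 | 49.5 | | 1.27 | 4 |
| | 55–64 | 59.5 | | 2.54 | 8 |
| | 65–74 | 69.5 | | 3.81 | 12 |
| | ≥75 | 78.5 | | 4.953 | 15 |
| SBP | | | 0.022 | | |
| | <120 | 107 | | −0.396 | −1 |
| | 120–129 | 125 = W2REF | | 0 | 0 |
| | 130–139 | 135 | | 0.22 | 1 |
| | 140–149 | 145 | | 0.44 | 1 |
| | 150–159 | 155 | | 0.66 | 2 |
| | ≥160 | 170 | | 0.99 | 3 |
| HDL | | | −1.28 | | |
| | <1.14 | 0.99 | | 0.2944 | 1 |
| | 1.14–1.29 | 1.22 = W3REF | | 0 | 0 |
| | 1.30–1.50 | 1.40 | | −0.2304 | −1 |
| | ≥1.51 | 1.89 | | −0.8576 | −3 |
| TC | | | 0.55 | | |
| | <4.70 | 3.98 = W4REF | | 0 | 0 |
| | 4.70–5.33 | 5.02 | | 0.572 | 2 |
| | 5.34–6.04 | 5.69 | | 0.9405 | 3 |
| | ≥6.05 | 7.23 | | 1.7875 | 5 |
| Sex | | | 0.88 | | |
| | Female | 0 = W5REF | | 0 | 0 |
| | Male | 1 | | 0.88 | 3 |
| Hypertension | | | 0.39 | | |
| | No | 0 = W6REF | | 0 | 0 |
| | Yes | 1 | | 0.39 | 1 |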

Development of a risk factor scoring tool using multifactor logistic regression.

HDL, high-density lipoprotein; SBP, systolic blood pressure; TC, total cholesterol.

Wij (reference value of risk factor group): the reference value of the j-th category of the i-th risk factor group. Typically, the median of the category is selected as the reference value.

WiREF (basic risk reference value): the baseline risk reference value of the i-th risk factor. When constructing the scoring tool, an appropriate group is selected as WiREF, with its score set to 0. Other groups receive positive or negative scores based on their relationship with WiREF.

Risk score calculation: the regression coefficients βi estimated by the multiple logistic regression model and the reference values Wij of each risk factor group are used to calculate the distance D between each group and the baseline WiREF. The formula is D = (Wij−WiREF) * βi.

Pointsij: the score of the i-th risk factor in the j-th category, derived from the multiple logistic regression model. It is the difference between the reference value (Wij) and the baseline risk reference value (WiREF) of that category, multiplied by the regression coefficient (βi) of the risk factor and divided by a constant (B). The constant B converts the units of the regression coefficients into scores, ensuring consistent use of scores within the risk scoring system.
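The points calculation described above can be sketched in a few lines of Python. The coefficients and reference values below are transcribed from Table 3. The value of the constant B is not stated in the article; B = 0.33 is an assumption chosen here because it reproduces every points value in Table 3 after rounding. The dictionary layout and function name are likewise illustrative.

```python
# factor: (beta_i, W_iREF, {category: W_ij}), transcribed from Table 3
TABLE3 = {
    "age": (0.127, 39.5, {"35-44": 39.5, "45-54": 49.5, "55-64": 59.5,
                          "65-74": 69.5, ">=75": 78.5}),
    "sbp": (0.022, 125, {"<120": 107, "120-129": 125, "130-139": 135,
                         "140-149": 145, "150-159": 155, ">=160": 170}),
    "hdl": (-1.28, 1.22, {"<1.14": 0.99, "1.14-1.29": 1.22,
                          "1.30-1.50": 1.40, ">=1.51": 1.89}),
    "tc": (0.55, 3.98, {"<4.70": 3.98, "4.70-5.33": 5.02,
                        "5.34-6.04": 5.69, ">=6.05": 7.23}),
    "sex": (0.88, 0, {"female": 0, "male": 1}),
    "hypertension": (0.39, 0, {"no": 0, "yes": 1}),
}

B = 0.33  # ASSUMPTION: not given in the article; chosen to match Table 3


def points_table(table, b):
    """Points_ij = round((W_ij - W_iREF) * beta_i / B) for every category."""
    return {
        factor: {cat: round((w - w_ref) * beta / b) for cat, w in cats.items()}
        for factor, (beta, w_ref, cats) in table.items()
    }
```

Calling `points_table(TABLE3, B)` yields, for example, 8 points for the 55–64 age group and −3 points for HDL ≥1.51, in line with the final column of Table 3.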

Table 4

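| Point total | Estimate of risk | Point total | Estimate of risk | Point total | Estimate of risk |
|---|---|---|---|---|---|
| −1 | 0.881957599 | 9 | 0.452914929 | 19 | 0.084023053 |
| 0 | 0.857062087 | 10 | 0.39917591 | 20 | 0.068567823 |
| 1 | 0.827940365 | 11 | 0.347760001 | 21 | 0.055782307 |
| 2 | 0.794309407 | 12 | 0.299663659 | 22 | 0.045264987 |
| 3 | 0.756041843 | 13 | 0.25561233 | 23 | 0.03665365 |
| 4 | 0.713225204 | 14 | 0.216039055 | 24 | 0.029629715 |
| 5 | 0.666211583 | 15 | 0.18110187 | 25 | 0.023918357 |
| 6 | 0.615644079 | 16 | 0.150728336 | 26 | 0.019286027 |
| 7 | 0.56244723 | 17 | 0.124673352 | 27 | 0.015536572 |
| 8 | 0.507774373 | 18 | 0.102578251 | 28 | 0.012506764 |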

The correspondence table between total score and risk prediction probability.

The seamless integration of this grouping and calculation procedure facilitates a systematic evaluation of each variable's contribution to the final model. It ensures a thorough and precise prediction of risk probability for patients. This methodical approach underscores the scientific rigor and practical utility of our risk assessment model in a clinical context.

To illustrate the application of the AP2C2S risk scoring tool concretely, we provide an example based on Tables 3, 4. Consider a 55-year-old male patient with a systolic blood pressure of 135 mmHg, total cholesterol (TC) of 5.5 mmol/L, high-density lipoprotein (HDL) of 1.2 mmol/L, and a history of hypertension. Referring to Table 3, we assign points for each risk factor: Age (55–64 years): 8 points; Gender (male): 3 points; SBP (130–139 mmHg): 1 point; HDL (1.14–1.29 mmol/L): 0 points; TC (5.34–6.04 mmol/L): 3 points; Hypertension (yes): 1 point. Adding up these points, the patient's total score is 16 points. Next, referring to Table 4, we find the CAS risk prediction probability corresponding to the total score. In this example, a score of 16 corresponds to a risk prediction probability of 15.07%.
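The worked example can be reproduced with a small lookup. The Python sketch below transcribes the category boundaries and points from Table 3 and the risk estimates from Table 4. The function name is hypothetical, and treating each category's lower bound as a cut-point (so that, for instance, an HDL of 1.295 falls in the 1.14–1.29 group) is an assumption, since the article does not say how values between listed ranges are handled.

```python
from bisect import bisect_right

# (lower bounds of the 2nd..last category, points per category) from Table 3
BINS = {
    "age": ([45, 55, 65, 75], [0, 4, 8, 12, 15]),
    "sbp": ([120, 130, 140, 150, 160], [-1, 0, 1, 1, 2, 3]),
    "hdl": ([1.14, 1.30, 1.51], [1, 0, -1, -3]),
    "tc": ([4.70, 5.34, 6.05], [0, 2, 3, 5]),
}
MALE_POINTS = 3
HYPERTENSION_POINTS = 1

# Point total -> estimate of risk, transcribed from Table 4
RISK_BY_TOTAL = dict(zip(range(-1, 29), [
    0.881957599, 0.857062087, 0.827940365, 0.794309407, 0.756041843,
    0.713225204, 0.666211583, 0.615644079, 0.56244723, 0.507774373,
    0.452914929, 0.39917591, 0.347760001, 0.299663659, 0.25561233,
    0.216039055, 0.18110187, 0.150728336, 0.124673352, 0.102578251,
    0.084023053, 0.068567823, 0.055782307, 0.045264987, 0.03665365,
    0.029629715, 0.023918357, 0.019286027, 0.015536572, 0.012506764,
]))


def ap2c2s_points(age, sbp, hdl, tc, male, hypertension):
    """Total AP2C2S points for one person (age in years, SBP in mmHg,
    HDL and TC in mmol/L, male and hypertension as booleans)."""
    if age < 35:
        raise ValueError("Table 3 starts at age 35")
    values = {"age": age, "sbp": sbp, "hdl": hdl, "tc": tc}
    total = sum(pts[bisect_right(cuts, values[name])]
                for name, (cuts, pts) in BINS.items())
    total += MALE_POINTS if male else 0
    total += HYPERTENSION_POINTS if hypertension else 0
    return total
```

For the patient described above, `ap2c2s_points(55, 135, 1.2, 5.5, male=True, hypertension=True)` returns 16, and `RISK_BY_TOTAL[16]` gives the tabulated estimate of 0.150728336.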

Discussion

Our study revealed a 57.4% prevalence of CAS among 1,515 participants, underscoring a significant risk, especially in the middle-aged and elderly population. CAS patients were older, and CAS correlated positively with traditional cardiovascular risk factors, including systolic blood pressure, age, history of hypertension, male gender, and total cholesterol. Notably, we identified high-density lipoprotein (HDL) as a protective factor against CAS, highlighting its role in risk mitigation. These findings indicate that total cholesterol and HDL levels could serve as significant predictors of CAS in community-dwelling men. This underscores the importance of managing cholesterol and increasing HDL levels to prevent the development of atherosclerosis. Specifically for elderly men with a history of hypertension, early adoption of secondary cardiovascular prevention measures is strongly recommended. Furthermore, using machine learning techniques (SVM, XGBoost, decision trees, random forests, and logistic regression), we obtained high AUCs during training (0.79–1.00). The logistic classifier, with a test AUC of 0.82, exhibited superior robustness in transitioning from training to testing, leading to its selection as the final model. This comprehensive approach enhances our understanding of CAS while providing a practical tool for risk assessment and personalized preventive strategies.

A traditional belief was that HDL particles, known for their role in reverse cholesterol transport, conferred cardiovascular benefits. However, current insights into HDL highlight its role in suppressing inflammation, oxidative stress, and stimulating endothelial function (11). Additionally, animal studies have indicated that recombinant HDL may inhibit the expression of carotid VCAM-1 (12) or aortic VCAM-1 and ICAM-1 (13). HDL has also been demonstrated to reduce the surface expression of ICAM-1, VCAM-1, and E-selectin by activating annexin A1 (14), a recognized anti-inflammatory mediator. In summary, the collective evidence from experimental models suggests that HDL particles directly influence endothelial cells, inhibiting proteins associated with endothelial activation. This mechanism holds the potential to decelerate the progression of atherosclerosis.

Notably, this study integrates traditional epidemiological analysis with advanced machine learning techniques, culminating in the creation of a practical risk scoring tool. By initially recruiting a substantial sample and utilizing carotid ultrasound to identify atherosclerosis, the research delves into population characteristics, revealing significant associations with demographic factors. The application of machine learning algorithms enhances the study's predictive capabilities, with logistic regression emerging as the most robust model (15). The incorporation of ROC and PRC curves, calibration checks, and decision curve analysis ensures a thorough evaluation of the model's performance and clinical utility. Additionally, the development of a risk scoring tool (AP2C2S) based on logistic regression results contributes to a systematic and precise assessment of individual risk probabilities. This holistic methodology reflects a novel and integrated approach to investigating and managing CAS, offering valuable insights for both research and clinical applications.

While our study yields valuable insights, certain limitations need consideration. The cross-sectional nature impedes causal inference, necessitating future longitudinal investigations. Additionally, the study's regional focus may impact generalizability, prompting caution in extending findings to broader populations. It is also worth noting that the AP2C2S model has not undergone direct comparison with the Systematic Coronary Risk Evaluation (SCORE) system (16) or other currently utilized scoring systems for carotid artery disease risk assessment. Therefore, while our findings provide valuable insights into the risk factors associated with carotid atherosclerosis, further research is needed to compare the AP2C2S model with other scoring systems and to validate its predictive accuracy across different demographic groups.

The risk scoring tool developed in this study serves as a foundation for further refinement and validation in diverse populations. Future research could explore additional biomarkers or imaging modalities to enhance risk prediction accuracy. The study sets the stage for evaluating personalized interventions based on risk scores, contributing to targeted and efficient preventive strategies.

Given the prolonged progression of atherosclerosis, our study's “preventive strategies” aim to achieve two key objectives: primary prevention for identifying high-risk individuals susceptible to cerebrovascular and cardiovascular complications, and secondary prevention targeting existing CAS patients. For primary prevention, our AP2C2S model facilitates the identification of candidates who could benefit from lifestyle modifications and early medical interventions, thereby reducing the risk of stroke. In the realm of secondary prevention, the same model serves as a valuable tool for prioritizing patients requiring heightened surveillance and more aggressive treatment approaches to impede CAS progression.

Conclusions

In conclusion, our study unveils a significant prevalence of CAS within the community, especially among the elderly. The introduction of the AP2C2S risk scoring tool, validated through the logistic classifier's robust performance across training and testing phases, offers a refined approach to risk assessment. This tool holds promise for identifying high-risk individuals within community health initiatives, potentially streamlining the process of screening and clinical intervention. By emphasizing the critical role of cholesterol management, particularly high-density lipoprotein (HDL), our research provides actionable insights that could inform CAS prevention strategies. However, we recognize the imperative for rigorous and extensive validation to ensure the tool's practicality and effectiveness in diverse real-world settings.

Statements

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by the Ethical Committee of Guangdong Second Provincial General Hospital. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

Z-XH: Writing – review & editing, Writing – original draft, Supervision, Project administration, Methodology, Funding acquisition, Conceptualization. LC: Writing – original draft, Validation, Methodology, Formal Analysis, Data curation. PC: Writing – review & editing, Methodology, Formal Analysis, Data curation. YD: Writing – review & editing, Validation, Methodology, Funding acquisition, Data curation. HL: Writing – original draft, Visualization, Methodology, Investigation, Data curation. YL: Writing – original draft, Methodology, Data curation, Conceptualization. QD: Writing – review & editing, Visualization, Project administration, Methodology. PL: Writing – review & editing, Project administration, Methodology.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article.

Z-XH was supported by the Science and Technology Program of Guangzhou, China (2024B03J0436), and Research Funds of Centre for Leading Medicine and Advanced Technologies of IHM (No. 2023IHM01052). YD was supported by the Medical Scientific Research Foundation of Guangdong Province, China (B2021382). The funding sources had no role in study design, data collection, analysis, or interpretation.

Acknowledgments

We thank all the community members who participated in the study and the health professionals who helped with the data collection.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Summary

Keywords

prevention, cerebrovascular disease, screening, ultrasound, machine learning, carotid atherosclerosis

Citation

Huang Z-X, Chen L, Chen P, Dai Y, Lu H, Liang Y, Ding Q and Liang P (2024) Screening for carotid atherosclerosis: development and validation of a high-precision risk scoring tool. Front. Cardiovasc. Med. 11:1392752. doi: 10.3389/fcvm.2024.1392752

Received

28 February 2024

Accepted

15 July 2024

Published

25 July 2024

Edited by

Teresa Padro, Institut de Recerca de l'Hospital de la Santa Creu i Sant Pau, Spain

Reviewed by

Oscar Rafael Escate Chávez, Sant Pau Institute for Biomedical Research, Spain

Dalin Tang, Worcester Polytechnic Institute, United States

*Correspondence: Zhi-Xin Huang
