A LASSO-Derived Risk Model for Subclinical CAC Progression in Asian Population With an Initial Score of Zero

Background: This study aimed to develop a prediction nomogram for subclinical coronary atherosclerosis in an Asian population with a baseline score of zero, and to compare its discriminatory ability with that of the Framingham risk score (FRS) and atherosclerotic cardiovascular disease (ASCVD) models.
Methods: Clinical characteristics, physical examination findings, and laboratory profiles of 830 subjects were retrospectively reviewed. The primary endpoint was subclinical coronary atherosclerosis in terms of coronary artery calcification (CAC) progression. A nomogram was established based on a least absolute shrinkage and selection operator (LASSO)-derived logistic model. Discrimination was evaluated by the area under the receiver operating characteristic curve, and calibration by the Hosmer–Lemeshow test and calibration curves, in the training and validation cohorts.
Results: The 830 subjects with a baseline score of zero, followed for an average of 4.55 ± 2.42 years, were randomly assigned to the training set or the validation set at a ratio of 2.8:1. CAC progression developed in 145 of 612 subjects (23.69%) in the training cohort and in 51 of 218 subjects (23.39%) in the validation cohort. The LASSO-derived nomogram included the following 10 predictors: “sex,” “age,” “hypertension,” “smoking habit,” “Gamma-Glutamyl Transferase (GGT),” “C-reactive protein (CRP),” “high-density lipoprotein cholesterol (HDL-C),” “cholesterol,” “waist circumference,” and “follow-up period.” Compared with the FRS and ASCVD models, the LASSO-derived nomogram had higher diagnostic performance and lower Akaike information criterion (AIC) and Bayesian information criterion (BIC) values. The discriminative ability, as determined by the area under the receiver operating characteristic curve, was 0.780 (95% confidence interval: 0.731–0.829) in the training cohort and 0.836 (95% confidence interval: 0.761–0.911) in the validation cohort. Moreover, satisfactory calibration was confirmed by the Hosmer–Lemeshow test, with P-values of 0.654 and 0.979 in the training and validation cohorts, respectively.
Conclusions: This validated nomogram provided useful predictive value for subclinical coronary atherosclerosis in subjects with a baseline score of zero, and could help clinicians and patients adopt timely primary preventive strategies in individual-based preventive cardiology.

INTRODUCTION
Subclinical atherosclerosis is a chronic, progressive, inflammatory disease of the arterial wall with a long asymptomatic phase (1)(2)(3). In recent years, non-invasive imaging modalities, including carotid ultrasonography and coronary calcium assessment by computed tomography (CT), have made it possible to detect and monitor the burden of subclinical atherosclerosis early in asymptomatic subjects. Coronary artery calcification (CAC) can be considered a surrogate marker of subclinical coronary atherosclerotic burden (2). An Agatston score of zero is known to be a powerful negative predictor of cardiovascular events with a long-term warranty period ("the power of zero") (4)(5)(6)(7)(8). The PESA study demonstrated that male sex, age, low-density lipoprotein cholesterol (LDL-C), hemoglobin A1c (HbA1c), vascular cell adhesion molecule-1 (VCAM) and cystatin are significant biologic predictors associated with subclinical atherosclerotic lesions in a Western asymptomatic population with low cardiovascular risk (9). Previous studies have investigated the risk factors associated with the warranty period of a zero score in Asian populations (8,10,11). To the best of our knowledge, no previous study has evaluated early preventive models for predicting the risk of subclinical coronary atherosclerosis, in terms of CAC progression, in an Asian population with a baseline score of zero. Therefore, we aimed to develop a LASSO (least absolute shrinkage and selection operator)-based risk model for predicting subclinical CAC progression from an initial score of zero in a hospital-based dataset, and to compare its discriminatory ability with that of other prediction models, namely the Framingham risk score (FRS) and the atherosclerotic cardiovascular disease (ASCVD) score.

Study Population and Baseline Characteristics
Eight hundred and thirty consecutive subjects who met the inclusion criteria were included in this study from April 2005 to December 2018. The inclusion criteria were as follows: (1) subjects attending medical check-ups underwent two consecutive scans (CAC scan and coronary CT angiography) during the follow-up period; and (2) subjects had a score of zero in the baseline scan. Because we did not use any human subjects or personally identifiable records in our study, informed consent was waived. The study protocol was approved by the institutional review boards of Kaohsiung Veterans General Hospital in accordance with the Declaration of Helsinki (IRB: VGHKS19-CT6-02). Clinical characteristics, physical examination findings, and laboratory profiles were retrospectively obtained from the patients' electronic medical records and reviewed by a trained study coordinator. Laboratory tests were performed on the same day as the baseline CT scans. Clinical demographic characteristics, collected from the patients' electronic medical records or questionnaires, included age, gender, current smoking habit, pack-years, hypertension, diabetes mellitus, and follow-up period. A physical examination was also conducted to collect data on body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure (DBP), body-fat percentage, and waist circumference. Laboratory profiles provided the biochemical variables, including uric acid, Gamma-Glutamyl Transferase (GGT), fasting glucose, hemoglobin A1c (HbA1c), C-reactive protein (CRP), low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), total cholesterol, and triglyceride levels. Hypertension was defined as SBP >140 mmHg, DBP >90 mmHg, or the use of anti-hypertensive medications. Diabetes mellitus was diagnosed in subjects receiving oral anti-diabetic or insulin medications. The Framingham risk score (%) in the first round, CAD-RADS categories in the first and final rounds, and the CAC score in the final round were also recorded.

CT Imaging Acquisition
All subjects underwent two consecutive scans, in the first round and the final round, over a mean follow-up period of 4.55 ± 2.42 years. In brief, a non-contrast CAC scan was performed before cardiac CT angiography on a 256 × 0.625-mm detector row CT system (Revolution CT, GE Healthcare, Milwaukee, USA) or a 64 × 0.5-mm detector row CT system (Aquilion 64; Toshiba Medical Systems). The CT acquisition protocol included two sequential acquisitions.
First, a non-contrast CAC scan was performed with a fixed tube voltage of 120 kVp and reconstructed at a slice thickness of 3 mm. Second, a prospectively ECG-triggered cardiac CT angiography (CTA) scan was performed with a fixed tube voltage of 120 kV and tube current modulation (mA modulation). CAC scoring was performed using the Agatston method with GE AW analysis software (12). Prior to cardiac CT angiography imaging, oral beta-blockers (metoprolol 100 mg) and sublingual nitroglycerin (Nitrostat, one tablet, 0.6 mg) were administered to all subjects without contraindication if the heart rate exceeded 65 beats per min. Intravenous contrast (Iopamidol 370) was administered at 5 mL/second, followed by a 40 mL 0.9% saline flush. Images were acquired and reconstructed at diastole (75-81% of the R-R interval) or at systole (37-43% of the R-R interval). All CT scans were reported by accredited cardiac radiologists. The severity of obstructive CAD, with standardized reporting of individual segmental coronary stenosis, was reported according to the Coronary Artery Disease Reporting and Data System (CAD-RADS™), published in 2016 by the Society of Cardiovascular Computed Tomography (SCCT) (13).
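For readers unfamiliar with Agatston scoring, the following is a minimal sketch of the arithmetic, not the GE AW implementation used in this study. It assumes each calcified lesion has already been segmented and summarized as an area (mm²) and a peak attenuation (HU); the lesion tuples and the 1 mm² minimum-area filter are illustrative assumptions.

```python
from typing import Iterable, Tuple


def density_weight(peak_hu: float) -> int:
    """Agatston density factor from a lesion's peak attenuation in HU."""
    if peak_hu < 130:
        return 0  # below the calcium threshold, lesion is not scored
    if peak_hu < 200:
        return 1
    if peak_hu < 300:
        return 2
    if peak_hu < 400:
        return 3
    return 4


def agatston_score(
    lesions: Iterable[Tuple[float, float]],
    min_area_mm2: float = 1.0,  # assumed noise filter, not taken from the paper
) -> float:
    """Sum area x density weight over (area_mm2, peak_hu) lesions."""
    total = 0.0
    for area_mm2, peak_hu in lesions:
        if area_mm2 < min_area_mm2:
            continue  # ignore specks smaller than the minimum area
        total += area_mm2 * density_weight(peak_hu)
    return total
```

A subject with no scorable lesion gets a score of zero, which corresponds to the baseline inclusion criterion of this cohort; any positive score at the final scan would count as progression.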

LASSO-Derived Prediction Model
We extracted 10 features through LASSO regression, using the value of lambda that minimized the cross-validation error, to construct the new prediction model, and compared its prediction accuracy and discriminatory ability with those of two other prediction models, the FRS and ASCVD models. In addition, we evaluated the discriminatory ability of the prediction models using the c-statistic, Akaike information criterion (AIC), and Bayesian information criterion (BIC). A higher c-statistic and lower AIC and BIC values were considered to indicate a more discriminatory model. Previous literature has shown that the FRS and ASCVD models were originally intended for 10-year cardiovascular event prediction (14,15). However, recent studies suggest that the FRS and ASCVD models also have scientific merit in predicting subclinical atherosclerosis (16,17).
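The three comparison metrics named above can be illustrated with a short sketch. It is a generic illustration rather than the SPSS or Stata routines used in the study: the c-statistic is computed as the proportion of concordant case/non-case pairs, and AIC and BIC use their standard definitions from the model log-likelihood. All function names are hypothetical.

```python
import math
from typing import Sequence


def c_statistic(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Share of (case, non-case) pairs in which the case has the higher score."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5  # ties count as half
    return concordant / (len(pos) * len(neg))


def log_likelihood(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Bernoulli log-likelihood of the observed outcomes."""
    eps = 1e-12  # guard against log(0)
    return sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1.0 - p, eps))
        for y, p in zip(y_true, y_prob)
    )


def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_lik


def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2.0 * log_lik
```

Because both criteria penalize the number of parameters, a 10-predictor model only scores lower than a simpler one if its gain in likelihood outweighs that penalty.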

Framingham Risk Score (FRS) Model
The FRS is the scoring system most commonly used to predict 10-year cardiovascular events. The components of the FRS include age, sex, total cholesterol, HDL-C, systolic BP, DM, and smoking habit. A total FRS was calculated for each eligible subject according to the algorithm developed by D'Agostino et al. (14). Current clinical guidelines recommend categorizing asymptomatic individuals into low-risk (FRS <10%), intermediate-risk (10-20%), and high-risk (>20%) subgroups for risk stratification.
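The risk strata above translate directly into a small helper. This sketch only encodes the thresholds stated in the text (it does not compute the FRS itself), and the function name is hypothetical.

```python
def frs_risk_category(frs_percent: float) -> str:
    """Map a 10-year FRS (in percent) to the low/intermediate/high strata."""
    if frs_percent < 0:
        raise ValueError("FRS percentage cannot be negative")
    if frs_percent < 10:
        return "low"
    if frs_percent <= 20:
        return "intermediate"  # 10-20% inclusive, as stated in the text
    return "high"
```

For example, the cohort mean FRS of 12.79% reported in the limitations section falls in the intermediate stratum.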

Statistical Analysis
The clinical characteristics and demographic profiles of the subjects in the training and validation cohorts were compared using Student's t-test for continuous variables and the chi-square test or Fisher's exact test for categorical variables. The primary study objective was to develop a LASSO-based prediction nomogram (with optimal lambda selection) for CAC progression in an Asian population with a baseline score of zero (19). A multivariable logistic regression model was used to estimate odds ratios (OR) and 95% CIs. We evaluated and compared the discriminatory ability of the three predictive models using the c-statistic (area under the ROC curve, AUC), the Bayesian information criterion (BIC), and the Akaike information criterion (AIC). A higher c-statistic and lower AIC/BIC values were considered to indicate a more discriminatory model (20). The c-statistic ranges from 0.5 (no ability to discriminate) to 1.0 (full ability to discriminate). Calibration was assessed by the Hosmer-Lemeshow goodness-of-fit statistic and by calibration graphs plotting predicted CAC progression against the observed rates in deciles of predicted risk (21).
A nomogram was established based on the LASSO-derived parameters in the training cohort. Statistical significance for all tests was set at P < 0.05. All statistical analyses were performed using SPSS 22.0 for Windows (SPSS Inc., Chicago, IL) and Stata version 13.0 (Stata Corp, College Station, TX, USA).
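The Hosmer-Lemeshow calibration check described above can be sketched as follows. This is a generic textbook version, not the Stata routine used in the study: subjects are sorted by predicted risk, split into equal-sized groups (deciles by default), and observed and expected event counts are compared. The p-value helper uses the closed-form chi-square tail that holds only for an even number of degrees of freedom, which covers the usual case of 10 groups and 8 degrees of freedom. Function names are hypothetical.

```python
import math
from typing import Sequence


def hosmer_lemeshow_statistic(
    y_true: Sequence[int], y_prob: Sequence[float], groups: int = 10
) -> float:
    """Sum over risk groups of (observed - expected)^2 / (expected * (1 - mean risk))."""
    pairs = sorted(zip(y_prob, y_true), key=lambda t: t[0])  # order by predicted risk
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        observed = sum(y for _, y in chunk)
        expected = sum(p for p, _ in chunk)
        variance = expected * (1.0 - expected / len(chunk))
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
    return stat


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability; exact closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

With 10 groups, the p-value would be `chi2_sf_even_df(stat, 8)`; a large p-value, such as the 0.654 and 0.979 reported here, gives no evidence of miscalibration.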

The Study Population Characteristics
Of the 830 subjects with a baseline score of zero in our study (555 men and 275 women), 196 had CAC progression events and 634 did not. The prevalence of CAC progression in the total study cohort was 23.61%.
In the study cohort of 830 subjects, about 17.2% of subjects had non-calcified plaques in the baseline scan. In the final round, about 41.9% of subjects had developed non-calcified or calcified plaques during the follow-up period of 4.55 ± 2.42 years, as shown in Table 1. The subjects were randomly assigned to the training set or the validation set at a ratio of 2.8:1, giving 612 subjects in the training cohort and 218 in the validation cohort, as shown in Figure 1. There were no significant differences between the training cohort and the validation cohort in any of the clinical characteristics, physical examination parameters, or laboratory profiles.
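A random split of this kind can be reproduced with a seeded shuffle. The sketch below is only an illustration: the seed, the use of integer subject IDs, and the function name are assumptions, and the original randomization procedure is not described in the text.

```python
import random
from typing import Iterable, List, Tuple


def split_cohort(
    subject_ids: Iterable[int], n_train: int, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle subject IDs reproducibly and split into training/validation lists."""
    ids = list(subject_ids)
    if not 0 <= n_train <= len(ids):
        raise ValueError("n_train must lie between 0 and the cohort size")
    rng = random.Random(seed)  # local generator keeps the split reproducible
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]


if __name__ == "__main__":
    train, validation = split_cohort(range(830), n_train=612)
    print(len(train), len(validation), round(len(train) / len(validation), 1))
```

Splitting 830 subjects into 612 and 218 gives the 2.8:1 ratio quoted above.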

LASSO-Derived Predictor for Subclinical CAC Progression
We conducted logistic regression with least absolute shrinkage and selection operator (LASSO) penalization, using 10-fold cross-validation, to reduce the dimensionality of the feature set for predicting subclinical CAC progression. Ten of the original 20 variables were selected for the prediction model. The final LASSO model with the optimal lambda included the following 10 variables with non-zero coefficients: "sex," "age," "hypertension," "smoking habit," "GGT," "CRP," "HDL-C," "cholesterol," "waist circumference," and "follow-up period." We then carried out multivariate analyses in the training cohort to establish the prediction model for subclinical CAC progression. The results of the multivariate logistic regression analysis are summarized in Table 2, which also shows the good performance of the LASSO-derived prediction model with the 10 selected variables.
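To show how an L1 penalty drives coefficients to exactly zero, and so performs variable selection, here is a minimal sketch of L1-penalized logistic regression fitted by proximal gradient descent. It is not the software routine used in the study: the solver, step size, synthetic data, and all function names are assumptions, and the 10-fold cross-validation over lambda is omitted for brevity (a single lambda is fitted per call).

```python
import math
import random
from typing import List, Sequence, Tuple


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink value toward zero by threshold; this step yields exact zeros."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(
    X: Sequence[Sequence[float]], y: Sequence[int], lam: float,
    step: float = 0.1, n_iter: int = 300,
) -> Tuple[List[float], float]:
    """Minimize mean log-loss + lam * sum(|w_j|); the intercept is not penalized.

    Features are assumed to be standardized so one penalty suits all of them.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= step * grad_b / n
        # gradient step on the log-loss, then the L1 proximal (shrinkage) step
        w = [soft_threshold(w[j] - step * grad_w[j] / n, step * lam) for j in range(d)]
    return w, b


def make_demo_data(n: int = 200, seed: int = 0):
    """Synthetic, hypothetical data: only the first of three features matters."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
    y = [1 if row[0] > 0 else 0 for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_demo_data()
    for lam in (10.0, 0.1, 0.01):
        w, _ = fit_l1_logistic(X, y, lam)
        print(lam, sum(1 for c in w if c != 0.0), "non-zero coefficients")
```

In a cross-validated workflow, this fit would be repeated over a grid of lambda values on each fold, and the lambda with the lowest held-out error would determine which variables keep non-zero coefficients.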

Development of the Nomogram
The probability of subclinical CAC progression in the training cohort with a baseline score of zero was estimated from the multivariable logistic regression model, which included ten potential predictive factors (sex, age, hypertension, smoking habit, GGT, CRP, HDL-C, cholesterol, waist circumference, and follow-up period). A nomogram was then generated from the multivariable logistic regression results to predict subclinical CAC progression, as shown in Figure 2. By adding up the scores identified on the points scale for each parameter and drawing a straight line down from the total, the estimated individual probability of subclinical CAC progression can be read off. As an example to better explain the nomogram model, if the male subject is age ...
Table 3 presents a summary of the discriminatory ability and diagnostic performance of the three prediction models, including the LASSO-derived, FRS, and ASCVD models.
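The point-summing logic of a nomogram built on a logistic model can be sketched in a few lines. This is a generic illustration under stated assumptions: the coefficient names, values, and predictor ranges passed to these functions are placeholders supplied by the caller, not the fitted values from Table 2 or Figure 2, and the scaling convention (the predictor with the widest effect spans 100 points) is the usual one rather than one confirmed by the text.

```python
import math
from typing import Dict, Tuple


def logistic_probability(
    intercept: float, coefs: Dict[str, float], values: Dict[str, float]
) -> float:
    """Predicted probability from a logistic model's linear predictor."""
    z = intercept + sum(coefs[name] * values[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


def nomogram_max_points(
    coefs: Dict[str, float], ranges: Dict[str, Tuple[float, float]]
) -> Dict[str, float]:
    """Maximum points per predictor, scaled so the widest effect spans 100."""
    spans = {
        name: abs(coefs[name]) * (high - low)
        for name, (low, high) in ranges.items()
    }
    largest = max(spans.values())
    return {name: 100.0 * span / largest for name, span in spans.items()}
```

Reading a nomogram then amounts to summing each predictor's points, mapping the total back to the linear predictor, and applying the logistic function to obtain the probability.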

Comparison of LASSO-Derived, FRS and ASCVD Models
In addition, the comparison of the three predictive models is summarized in Table 4. Our results demonstrated that the LASSO-based model had significantly superior discriminatory ability, with a higher c-statistic and lower AIC and BIC than the other two predictive models. Compared with the FRS and ASCVD models, the novel LASSO-derived nomogram model showed better diagnostic performance, with an AUC of 0.780 (95% CI, 0.731 to 0.829) for detecting subclinical CAC progression in an Asian population with a baseline score of zero, and with balanced sensitivity (78.49%) and specificity (67.62%).

DISCUSSION
We built and assessed a nomogram model for individually predicting subclinical CAC progression in subjects with a baseline score of zero. The predictive nomogram model incorporates clinical characteristics, physical examination findings, and laboratory profiles to guide individual prediction of subclinical coronary atherosclerosis. To the best of our knowledge, this is the first predictive nomogram for subclinical CAC progression in an Asian population. This study had three major findings. First, we developed a novel LASSO-derived nomogram prediction model based on clinical characteristics, physical examination findings, and laboratory profiles to predict subclinical atherosclerosis in subjects with a baseline score of zero, and demonstrated that it provides a good level of performance for predicting subclinical CAC progression in an Asian cohort. Second, compared with the FRS and ASCVD models, the LASSO-derived model exhibited significantly better discriminatory ability and the lowest AIC and BIC. Third, the LASSO-derived risk prediction model exhibited good discrimination and calibration in both the training and validation cohorts.
In this study, we consecutively selected and analyzed 830 subjects with a baseline score of zero, who were randomly divided into the training cohort and the validation cohort. Over the mean follow-up period of 4.55 ± 2.42 years, 196 subjects (23.61%) had CAC progression events in the study cohort. For subclinical CAC progression, the LASSO-derived model with an optimal cutoff value of <0.2205 (probability score) may be an ideal screening tool to help rule out subclinical CAC progression within the 5-year warranty period in a middle-aged Asian population with low to intermediate risk (sensitivity of 78.49%; specificity of 67.62%). Our findings are consistent with a growing body of literature on the natural course of CAC progression in populations with a zero score (4)(5)(6)(7)(8). Evidence from previous studies has demonstrated that a zero CAC score at the baseline scan confers a 5-year warranty period against future cardiac events in both Western and Asian asymptomatic populations with low to intermediate cardiovascular risk. In addition, we developed and validated a novel nomogram that integrates clinical characteristics, physical examination findings, and laboratory profiles. This nomogram can predict subclinical CAC progression more efficiently than the FRS or ASCVD model. Sarah et al. previously reported that the CVHI (cardiovascular health index) score was the most sensitive (94%) but least specific (14.9%) in identifying individuals with subclinical atherosclerosis assessed by noninvasive carotid intima-media thickness (CIMT) measurement, compared with the FRS and MetS (metabolic syndrome) scores (22). Our previous study demonstrated that the FRS had poor to fair diagnostic performance for predicting subclinical CAC progression in individuals with a baseline score of zero (8). Compared with the FRS and ASCVD models, the novel LASSO-derived nomogram model showed better performance, with an AUC of 0.780 (95% CI, 0.731 to 0.829) for detecting subclinical CAC progression in an Asian population with a baseline score of zero, and with balanced sensitivity (78.49%) and specificity (67.62%). With its relatively high sensitivity, our LASSO-derived model is suitable for ruling out subclinical CAC progression in subjects with a baseline score of zero.
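The sensitivity and specificity quoted at the 0.2205 probability cutoff follow from a simple confusion-matrix calculation, sketched below. The function name and the toy inputs are hypothetical; the sketch assumes that a predicted probability at or above the cutoff counts as a positive screen, so that values below 0.2205 rule out progression as described above.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_prob: Sequence[float], cutoff: float = 0.2205
) -> Tuple[float, float]:
    """Sensitivity and specificity when prob >= cutoff is called positive."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        positive = p >= cutoff  # below the cutoff, progression is ruled out
        if y == 1 and positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # toy example with made-up probabilities, not study data
    print(sensitivity_specificity([1, 1, 0, 0], [0.30, 0.10, 0.25, 0.05]))
```

Lowering the cutoff raises sensitivity at the expense of specificity, which is why a rule-out tool is typically tuned toward the sensitive end.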
Early detection of coronary atherosclerosis in its subclinical stage could affect the primary prevention of cardiovascular events and allow the prompt implementation of primary prevention strategies (1)(2)(3). The PESA study demonstrated a high prevalence of subclinical atherosclerosis (44%) in the iliac-femoral territory in an asymptomatic middle-aged population (23,24). Therefore, healthy lifestyle strategies, such as lifelong attention to diet, exercise habits, and smoking abstinence, or statin treatment in the subgroup with CAC >100, promoted through patient-centered shared decision making, are crucial for maintaining and prolonging cardiovascular health and for slowing the progression of coronary atherosclerosis in preventive cardiology (1)(2)(3)(25).

STRENGTHS AND LIMITATIONS
The study has two main strengths. First, its longitudinal nature allowed us to clearly identify the correct sequence of events, identify changes over time, eliminate recall bias, and provide insight into cause-and-effect relationships. Second, this study investigated a unique Asian population cohort. To date, no population-based study has focused on a prediction model for subclinical coronary atherosclerosis in an Asian population. Therefore, this study could investigate risk factors for subclinical coronary atherosclerosis associated with racial differences.
There are some limitations in this study. First, this is a single-center retrospective study focused on an Asian population. Therefore, the generalizability of the prediction model to Western populations is limited. Second, we did not use clinical cardiovascular events as the primary outcome because of the small sample size and the low to intermediate FRS risk of the cohort (FRS% 12.79 ± 8.90, N = 830). Therefore, the cost-benefit of predicting subclinical coronary atherosclerosis remains uncertain (26,27). Subclinical coronary atherosclerosis has become a threatening public health issue worldwide owing to behavioral, environmental, and genetic factors (28,29). In the USA, there is an increasing trend of sudden death from cardiovascular events among people classified as low risk by Framingham risk stratification (1,2,30). Therefore, greater attention to subclinical coronary atherosclerosis is mandatory in an era in which the subclinical stage of atherosclerosis is highly prevalent. Early detection combined with primary prevention, such as health promotion through lifestyle modification (diet, physical activity, smoking cessation, etc.), is a very important way to slow or reverse the progression of subclinical coronary atherosclerosis (31,32). Third, our relatively short follow-up period is a potential limitation. Longer follow-up studies are warranted to investigate the natural course of CAC progression over a 10-year period. Fourth, in this study we aimed to investigate a specific form of subclinical coronary atherosclerosis, namely coronary calcification. Therefore, other forms of subclinical atherosclerosis, such as the development of non-calcified coronary plaques or subclinical atherosclerosis in the carotid arteries, aorta, and iliac arteries, could not be assessed (23,24,33). In addition, previous studies have demonstrated that statin therapy may influence coronary plaque calcification (34). However, because of the retrospective study design, a complete history of lipid-lowering drug use was not collected. Further studies are warranted to assess multiterritorial subclinical atherosclerosis in Asian populations.

CONCLUSION
In summary, we successfully developed and validated a LASSO-derived prediction nomogram based on 10 conveniently obtained routine clinical parameters, namely "sex," "age," "hypertension," "smoking habit," "GGT," "CRP," "HDL-C," "cholesterol," "waist circumference," and "follow-up period," and demonstrated that it provides a good level of performance for predicting subclinical coronary atherosclerosis in subjects with a baseline score of zero. This nomogram could help clinicians identify subclinical coronary atherosclerosis in subjects at low to intermediate risk and guide primary preventive strategies in individual-based preventive cardiology.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Institutional review boards of Kaohsiung Veterans General Hospital. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements. Written informed consent was not obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.