
ORIGINAL RESEARCH article

Front. Endocrinol., 19 January 2026

Sec. Pediatric Endocrinology

Volume 16 - 2025 | https://doi.org/10.3389/fendo.2025.1728132

Development and comparison of multivariate diagnostic models for rapidly progressive central precocious puberty in girls: the role of serum osteocalcin

Wei Qin2, Runqi Wang1, Tao Xie1, Yanfei Chen1, Dan Zeng1, Ziting Ding1, Dan Lan1,3,4†*
  • 1Department of Pediatrics, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
  • 2Department of Pediatrics, The First People’s Hospital of Nanning, Nanning, China
  • 3Difficult and Critical Illness Center, Pediatric Clinical Medical Research Center of Guangxi, Nanning, China
  • 4The Key Laboratory of Children’s Disease Research in Guangxi’s Colleges and Universities, Education Department of Guangxi Zhuang Autonomous Region, Nanning, China

Objectives: To develop a diagnostic prediction model for rapidly progressive central precocious puberty (RP-CPP) and evaluate the contribution of osteocalcin (OC) to the model.

Methods: A total of 411 girls who met the criteria for central precocious puberty were selected. Of these, 219 were included in the training set, 87 in the internal validation set, and 105 in the external validation set. Binary logistic regression was used to construct the model. The model fit and diagnostic accuracy were assessed using the Akaike Information Criterion (AIC), calibration curves, and the area under the receiver operating characteristic curve (AUC). The model was presented in the form of a nomogram. Internal and external validations of the model were performed.

Results: Diagnostic models for RP-CPP were developed both with and without the inclusion of OC. Among all models, those that included OC consistently demonstrated smaller AIC values, higher AUC values, and lower prediction error rates. A model incorporating the duration of breast development, serum OC levels, mean ovarian volume, endometrial presence/absence, and breast Tanner staging demonstrated superior performance. The AUC for diagnosing RP-CPP was 0.973, with a sensitivity of 91.6% and specificity of 92.5%. The model performed well in the internal and external validation sets, demonstrating good clinical application value.

Conclusion: The inclusion of OC helps improve the predictive performance of the model. For the diagnosis of RP-CPP in girls, a model can be chosen that includes the duration of breast development, serum OC levels, mean ovarian volume, endometrial presence/absence, and breast Tanner staging. However, all samples were from a single center, and multicenter validation is still needed.

Introduction

Among patients with central precocious puberty (CPP), there are some special types, one of which is rapidly progressive central precocious puberty (RP-CPP). Epidemiological studies indicate that as the incidence of CPP increases, the prevalence of RP-CPP also shows a rising trend (1–4). The prominent clinical features of RP-CPP include advanced bone age (BA) and gonadal development, which may significantly impact adult height or psychological health. Early identification and proactive treatment of RP-CPP can bring greater benefits to patients, making early recognition particularly important. Therefore, the development of precise, rapid, and effective diagnostic methods remains an urgent priority for clinical researchers.

Various hormones and factors within the body interact to modulate the development of puberty. Researchers are investigating factors influencing pubertal progression and actively seeking biomarkers for early detection of RP-CPP. Zung et al. suggest that the measurement of morning urinary luteinizing hormone (LH) levels is a non-invasive and reliable method. Using a cut-off value of 1.16 IU/L for morning urinary LH to identify RP-CPP, the sensitivity and specificity are 83% and 72%, respectively, which assists in distinguishing between slowly progressive CPP (SP-CPP) and RP-CPP (5). The study by Calcaterra et al. demonstrated that an ultrasonic breast volume of ≥0.85 cm³ is an independent predictor of RP-CPP (6). According to Chen Yun et al., girls with RP-CPP have more advanced BA, higher basal LH levels, and larger ovarian and uterine volumes compared to those with SP-CPP, with BA being the most helpful factor in identifying RP-CPP in girls (7). Zhang et al.'s research shows that when the basal LH is ≥0.58 IU/L, the sensitivity for diagnosing RP-CPP is 77.5% and the specificity is 66.7% (8). Research by Kim et al. also confirmed that advanced BA is a risk factor for pubertal progression in girls (9). Furthermore, quantitative pituitary stalk perfusion by arterial spin labeling is used as a non-invasive method to identify progressive CPP with a sensitivity of 76%, specificity of 83%, and accuracy of 77% (10). However, the existing research findings still fall short in terms of sensitivity and specificity for diagnosing RP-CPP. Given the suboptimal performance of single factors in identifying RP-CPP, a few studies have explored multi-factorial approaches for diagnosis. A model incorporating uterine volume, estradiol levels, endometrial thickness, BA, and breast volume measured by ultrasound demonstrated a sensitivity and specificity of 77.1% and 83.3%, respectively, for distinguishing RP-CPP (6). Chen et al.'s research has shown that the combination of anti-Müllerian hormone (AMH) and inhibin B can differentiate between slowly progressive CPP and progressive CPP with a sensitivity of 80% and a specificity of 89.3% (11). Despite these efforts, accurately identifying RP-CPP in its early stages remains a significant challenge.

Osteocalcin (OC) serves as a marker of bone turnover, participating in the regulation of osteoblast and osteoclast activity, and is associated with bone growth and mineralization. Moreover, the metabolically active form of OC functions as an endocrine hormone and may be involved in sexual development (12, 13). Our previous study has shown that in girls with CPP, changes in serum OC levels precede those of estradiol and BA, potentially serving as an early biomarker for RP-CPP. When using a cutoff value of 107.05 ng/mL for serum OC, the sensitivity and specificity for diagnosing RP-CPP were 91.1% and 70.7%, respectively (14).

Based on our previous research (14, 15), this study aims to develop a diagnostic prediction model for RP-CPP based on serum OC. We developed separate models with and without OC to assess its impact on diagnostic performance and used multiple statistical approaches to select the optimal model. Additionally, a nomogram-based scoring system was constructed. This approach is designed to assist in developing a precise and efficient clinical diagnostic workflow.

Study population

Girls diagnosed with CPP at the First Affiliated Hospital of Guangxi Medical University between January 2020 and December 2022 were included in the dataset for model construction. This dataset was randomly divided into a training set and a test set (internal validation set) at a ratio of 7:3. Girls diagnosed with CPP between January 2023 and October 2023 were included as the external validation set.

Diagnostic Criteria for CPP in Girls: 1. Onset of breast development before age 8 or menarche before age 10. 2. Advanced bone age, exceeding chronological age by ≥1 year. 3. Gonadal enlargement, as evidenced by ultrasound findings of enlarged uterus and ovaries. 4. Gonadotropin releasing hormone (GnRH) stimulation test indicating activation of the hypothalamic-pituitary-gonadal axis (LH ≥ 5 IU/L and LH/FSH ratio ≥ 0.6) (16, 17). Exclusion Criteria: 1. Incomplete examination data. 2. Negative GnRH stimulation test. 3. Prior exposure to GnRH agonist (GnRHa) treatment. 4. Secondary CPP (e.g., due to tumors, congenital adrenal hyperplasia [CAH]). 5. Presence of underlying diseases or a family history of abnormalities.

Diagnostic Criteria for RP-CPP: Progression from one Tanner stage to the next in less than 6 months, accompanied by advanced BA (BA exceeding chronological age by more than 1 year or a significant advancement in BA over a short period) (5, 18). Alternatively, the interval from breast development to menarche is less than 2 years (16).

Grouping: Among the girls diagnosed with CPP included in this study, those meeting the diagnostic criteria for RP-CPP were classified into the RP-CPP group, and the remainder were assigned to the non–rapidly progressive CPP (NRP-CPP) group.

Methods

Data Extraction: Medical history records were obtained from the electronic medical record system to collect clinical information for girls diagnosed with CPP. The collected data included age, duration of breast development (time from breast development to initial consultation), height, weight, body mass index (BMI), birth weight, breast Tanner staging, and parental heights. Results of laboratory tests and imaging examinations were also collected.

BA assessment: BA was independently assessed by two experienced pediatric endocrinologists using the Greulich-Pyle (G.P.) atlas. The final BA was determined as the average of the two assessments. Ultrasonography Measurements: All girls underwent transabdominal ultrasonography to measure uterine length, endometrial thickness, and ovarian dimensions. Ovarian volume was calculated using the formula: Ovarian volume (cm³) = 0.5233 × length (cm) × width (cm) × depth (cm).
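The ovarian volume formula above can be sketched in a few lines of Python. The function names are hypothetical, and computing the mean ovarian volume as the average of the left and right ovaries is an assumption, since the text does not spell out how the mean was taken. The factor 0.5233 is close to π/6, the usual ellipsoid coefficient.

```python
def ovarian_volume_cm3(length_cm, width_cm, depth_cm):
    """Ellipsoid approximation from the text: V = 0.5233 x L x W x D (all in cm)."""
    for dimension in (length_cm, width_cm, depth_cm):
        if dimension <= 0:
            raise ValueError("ovarian dimensions must be positive")
    return 0.5233 * length_cm * width_cm * depth_cm


def mean_ovarian_volume_cm3(left_dims_cm, right_dims_cm):
    """Average of the two ovarian volumes (assumed definition of 'mean ovarian volume').

    Each argument is a (length, width, depth) tuple in cm.
    """
    left = ovarian_volume_cm3(*left_dims_cm)
    right = ovarian_volume_cm3(*right_dims_cm)
    return (left + right) / 2.0
```

For example, an ovary measuring 2.0 × 1.0 × 1.0 cm gives a volume of about 1.05 cm³ under this formula.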

Variable Selection for the Model: Potential factors influencing precocious puberty in girls were identified through a comprehensive literature review, clinical experience, and statistical analysis. Clinical data were extracted and compiled into a dataset. Initial screening of these factors was performed using univariate analysis, with a significance level of P < 0.05 as the threshold. Following this, multivariate analysis was conducted to further refine the selection of variables for inclusion in the model. Various multivariate analysis methods were employed, including cross-validated LASSO regression, stepwise regression, Bayesian best subset selection, and random forest.

Model Construction: A binary logistic regression model was constructed to diagnose RP-CPP. The dependent variable was whether the patient was diagnosed with RP-CPP, and the independent variables were selected based on the optimal results from various multivariate screening methods. Our previous study showed that OC aids in the early identification of RP-CPP (14); therefore, we developed separate models with and without OC to assess its impact on diagnostic performance.
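As a rough illustration of how a fitted binary logistic regression model turns predictor values into a diagnostic probability, the sketch below evaluates the logistic function and converts a coefficient to an odds ratio. The function names, predictor names, and all numbers are placeholders chosen for illustration; they are not the fitted coefficients of the models in this study.

```python
import math


def predicted_probability(intercept, coefficients, values):
    """Return P(outcome = 1) from a binary logistic model.

    coefficients and values are dicts keyed by predictor name.
    """
    linear = intercept + sum(beta * values[name] for name, beta in coefficients.items())
    # Numerically stable logistic function (avoids overflow for large |linear|).
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    exp_linear = math.exp(linear)
    return exp_linear / (1.0 + exp_linear)


def odds_ratio(beta):
    """Odds ratio for a one-unit increase in a predictor with coefficient beta."""
    return math.exp(beta)


# Placeholder numbers for illustration only; NOT the fitted values from Table 3.
demo_coefficients = {"predictor_a": 0.8, "predictor_b": -0.5}
demo_values = {"predictor_a": 2.0, "predictor_b": 1.0}
demo_probability = predicted_probability(-1.1, demo_coefficients, demo_values)
```

A nomogram is a graphical version of the same calculation: each predictor contributes points in proportion to its coefficient times its value, and the total maps to a probability through the logistic function.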

Sample Size Estimation for the Model: Based on previous analyses of risk factors associated with CPP, studies have shown that typically 3–6 risk factors are independently related to the occurrence of CPP (19). According to the empirical rule of 10 events per variable (10 EPV) (20), it is estimated that the risk prediction model developed in this study will include no more than 6 predictors. Therefore, a minimum of 60 positive cases (10 EPV × 6 predictors) is required to meet the sample size needs for model construction.
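The events-per-variable arithmetic can be written out as a one-line helper. The function name is hypothetical; the default of 10 follows the empirical rule cited in the text.

```python
def min_events_required(n_predictors, events_per_variable=10):
    """Minimum number of positive cases under an events-per-variable rule."""
    if n_predictors < 1 or events_per_variable < 1:
        raise ValueError("both arguments must be at least 1")
    return n_predictors * events_per_variable


required_events = min_events_required(6)  # 10 EPV x 6 predictors
```

With at most 6 predictors this gives 60 positive cases, which can be compared against the 77 RP-CPP cases reported for the modeling dataset in the Results.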

Statistical analysis

Means between groups were compared using independent-samples t-tests or analysis of variance (ANOVA). Proportions were compared using chi-square tests. For non-normally distributed data, Wilcoxon rank-sum tests were employed. Statistical analyses were conducted using SPSS version 23.0, and a significance level of P < 0.05 was used for the initial screening of independent factors associated with RP-CPP. Multivariate analysis was conducted using cross-validated LASSO regression, stepwise regression, Bayesian best subset selection, and random forest methods. The results from these different screening methods were compared, and the optimal screening result was selected based on the AIC values. The final independent variables included in the model were determined based on this optimal result. A binary logistic regression model was constructed to estimate the regression coefficients for each predictor. The effect sizes of the regression model were expressed as odds ratios (ORs) with their corresponding 95% confidence intervals (CIs). The goodness-of-fit of the model was assessed using the Hosmer-Lemeshow test, with P > 0.05 indicating adequate model fit. ROC curves were plotted and the AUC was used to evaluate the diagnostic performance of the model. Calibration plots were used to assess the consistency of the model's predictive performance. The clinical utility of the model was evaluated using decision curve analysis (DCA). The model was presented in the form of a nomogram. Analyses were performed using R version 4.0.0. All tests were 2 sided, and P < 0.05 was considered to indicate statistical significance.
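Two of the metrics named above, AIC and AUC, have simple definitions that a short sketch can make concrete. The code below is a plain-Python illustration, not the R workflow used in the study: AIC is computed as 2k − 2·ln(L) from a Bernoulli log-likelihood, and AUC is computed as the proportion of positive/negative pairs in which the positive case receives the higher predicted score (ties count as half). All function names are hypothetical.

```python
import math


def bernoulli_log_likelihood(labels, probabilities):
    """Log-likelihood of binary labels (0/1) under predicted probabilities in (0, 1)."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(labels, probabilities)
    )


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_parameters - 2 * log_likelihood


def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

Because AIC penalizes each additional parameter, two models with similar likelihoods are separated by their number of predictors, which is why AIC is a natural criterion for comparing the outputs of different variable selection methods.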

Results

Clinical characteristics

From a total of 1046 girls with precocious puberty, 411 girls who met the criteria for CPP were selected. Of these, 306 girls with CPP (77 with RP-CPP and 229 with NRP-CPP) were included in the modeling dataset and were randomly divided into a training set (train) and an internal validation set (test) in a ratio of 7:3 (219 in the training set and 87 in the internal validation set). An additional 105 girls with CPP (38 with RP-CPP and 67 with NRP-CPP) from a different time period were included as the external validation set (external). Comparison of clinical characteristics is detailed in Table 1.


Table 1. Comparison of clinical characteristics between the RP-CPP and NRP-CPP groups in the training set, test set, and external validation set.

In the modeling dataset, the mean age of the RP-CPP group was slightly higher than that of the NRP-CPP group (8.87 ± 0.73 vs. 8.29 ± 0.94 years, P < 0.001). Although the duration of breast development was shorter in the RP-CPP group (9.47 ± 5.53 vs. 11.18 ± 8.38 months, P = 0.042), the difference between bone age and chronological age (Δ(BA-CA)) was significantly greater in the RP-CPP group (1.95 ± 0.80 vs. 1.34 ± 0.86 years). The distribution of breast Tanner stages was as follows: in the RP-CPP group, Tanner stage 2 accounted for 28.57%, Tanner stage 3 for 53.25%, and Tanner stage 4 for 18.18%; in the NRP-CPP group, Tanner stage 2 accounted for 71.18%, Tanner stage 3 for 26.14%, and Tanner stage 4 for 2.68%. The RP-CPP group was predominantly in Tanner stage 3, whereas the NRP-CPP group was predominantly in Tanner stage 2. The relevant laboratory indicator results are shown in Table 1.

Based on the results of univariate statistical analysis, the following potential predictive factors were initially screened: age, duration of breast development, BA, Δ(BA-CA), height, weight, body mass index (BMI), breast development at Tanner stage 2 (yes/no), insulin-like growth factor 1 (IGF-1), OC, fasting insulin, total cholesterol (TC), high-density lipoprotein (HDL), free thyroxine (FT4), estradiol (E2), testosterone (T), basal luteinizing hormone (B-LH), basal follicle-stimulating hormone (B-FSH), uterine length, mean ovarian volume, and the presence or absence of endometrium.

Multivariate variable selection and model construction

Multivariate variable selection was performed using cross-validated LASSO regression, stepwise regression, Bayesian best subset selection, and random forest methods. Additionally, diagnostic models with and without OC were developed (Table 2).


Table 2. Results of model selection using cross-validated LASSO regression, stepwise regression, Bayesian best subset selection, and random forest methods in the training set.

In the training set, when OC was included, the screening results indicated that the Bayesian best subset selection method had the smallest AIC value (AIC = 76.55), suggesting a relatively optimal model, followed by cross-validated LASSO regression (AIC = 83.32). The screening results are shown in Table 2 and Figure 1. Both models exhibited high AUC values and low prediction error rates in the training set, internal validation set, and external validation set. Given that the model selected by cross-validated LASSO regression included 5 variables, while the model selected by Bayesian best subset selection included 7 variables, the model selected by cross-validated LASSO regression is more convenient for clinical application. When OC was excluded from the dataset, the stepwise regression method yielded the best screening results, but the AIC value increased to 99.09, and the number of included variables increased to 6. The AUC in the training set was 0.963 (sensitivity 93.4%, specificity 90.6%), but the AUC in the internal validation set decreased to 0.895, and the prediction error rates in both the training set and validation set increased. Therefore, the inclusion of OC contributes more effectively to enhancing the diagnostic performance of the model. The model selected by cross-validated LASSO regression was identified as the optimal model (Model-1), comprising: duration of breast development, serum OC levels, mean ovarian volume, endometrial presence/absence, and breast Tanner staging (whether the breast development is at Tanner stage 2 or not) (Tables 2, 3).

Figure 1
Panel a shows a plot of binomial deviance versus log of lambda. The red line indicates the trend, surrounded by gray error bars. Panel b depicts coefficient paths versus log lambda, with multiple lines representing different variables, converging near zero.

Figure 1. Mean square error of the model under different lambda, Loglambda and correlation coefficient. (a) Mean squared error (MSE) of the model across different λ values. The x-axis shows log(λ), and the y-axis shows MSE. Three-fold cross-validation was used for parameter selection in LASSO regression, with plots of partial likelihood deviance (binomial deviance) and log(λ). The left vertical dashed line indicates the log(λ) value corresponding to the minimum MSE; the right dashed line marks the largest λ within one standard error of the minimum MSE. Using the optimal λ, five predictors with non-zero coefficients were selected. (b) Log(λ) versus regression coefficients. LASSO regression was applied for variable selection among 21 candidate predictors. The plot displays the coefficient paths of all variables as a function of log(λ).


Table 3. Correlation coefficients, odds ratios (ORs) and 95% confidence intervals for the two optimal models.

Given that Model-1 included duration of breast development, a variable highly susceptible to subjective bias and difficult to measure accurately, we excluded it from the dataset and rebuilt the model. In the training set, the screening results indicated that stepwise regression and Bayesian best subset selection produced the same result, with the smallest AIC value (AIC = 108.53), suggesting a relatively optimal model. The optimal model (Model-2) included the following variables: serum OC levels, mean ovarian volume, and endometrial presence/absence (Tables 2, 3).

Evaluation of model diagnostic performance

The ROC curves for diagnosing RP-CPP using Model-1 and Model-2 are presented in Figure 2. The AUCs of Model-1 in the training set, test set, and external validation set were 0.973 (95% CI: 0.950-0.996), 0.972 (95% CI: 0.939-1.000), and 0.923 (95% CI: 0.869-0.972), respectively, indicating high diagnostic performance. Model-2 demonstrated high diagnostic performance in both the training and test sets, with AUCs of 0.947 (95% CI: 0.911-0.984) and 0.989 (95% CI: 0.970-1.000), respectively; however, performance declined slightly in external validation (AUC: 0.830; 95% CI: 0.754-0.906). The AUC, cut-off values, sensitivity, and specificity of the models are detailed in Table 4. The calibration plot for Model-1 is shown in Figure 3a. The curves for both the training and external validation sets closely followed the ideal reference line, indicating good agreement between predicted and observed probabilities for RP-CPP. Although slight deviation was noted in the internal validation set, overall calibration performance remained satisfactory. The calibration plot for Model-2 is shown in Figure 3b. Calibration curves for the training and test sets closely approximated the ideal reference line, indicating good agreement between predicted and observed probabilities. However, some deviation was observed in the external validation set, suggesting only moderate calibration performance. The decision curve analysis (DCA) is shown in Figure 4. The DCA curves of both models lie above the two reference lines, indicating favorable performance, with particularly strong clinical utility in the training and test sets, though slightly reduced in external validation.
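To make the link between a cut-off value and the reported sensitivity and specificity concrete, the following sketch computes both at a given threshold on the predicted probability. It also picks a cut-off by maximizing the Youden index (sensitivity + specificity − 1). The text does not state how the cut-offs in Table 4 were chosen, so the Youden rule here is an assumption, and the function names are hypothetical.

```python
def sensitivity_specificity(labels, scores, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        called_positive = s >= cutoff
        if y == 1:
            if called_positive:
                tp += 1
            else:
                fn += 1
        else:
            if called_positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(labels, scores):
    """Observed score that maximizes the Youden index (assumed selection rule)."""
    def youden_j(cutoff):
        sens, spec = sensitivity_specificity(labels, scores, cutoff)
        return sens + spec - 1.0

    # Ties are resolved in favor of the lowest qualifying cut-off.
    return max(sorted(set(scores)), key=youden_j)
```

Other selection rules, such as fixing a minimum sensitivity, would give different cut-offs from the same ROC curve.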

Figure 2
ROC curves for two models: Graph (a) for model-1, and graph (b) for model-2. Both curves plot sensitivity against specificity. For each graph, three lines represent the training (black), test (red), and external (green) datasets. Model-1 shows strong performance with all lines near the top-left corner, indicating high sensitivity and specificity. Model-2 displays similar high performance but with slight differences in curve positioning, especially for the test and external datasets. Legends are included for line identification.

Figure 2. The ROC curves for diagnosing RP-CPP of model-1 and model-2 in the training set, test set, and external validation set. (a) The AUCs of Model-1 in the training set, test set, and external validation set were 0.973 (95% CI: 0.950-0.996), 0.972 (95% CI: 0.939-1.000), and 0.923 (95% CI: 0.869-0.972), respectively, indicating high diagnostic performance. (b) Model-2 demonstrated high diagnostic performance in both the training and test sets, with AUCs of 0.947 (95% CI: 0.911-0.984) and 0.989 (95% CI: 0.970-1.000), respectively; however, performance declined slightly in external validation (AUC: 0.830; 95% CI: 0.754-0.906).


Table 4. AUC, cut-off values, sensitivity, and specificity of model-1 and model-2.

Figure 3
Six calibration plots showing predicted probabilities versus observed probabilities for training, test, and external datasets. Each panel displays lines for apparent, bias-corrected, and ideal models. Panels are labeled as a: train, test, external; and b: train, test, external. Each displays calibration performance with predicted and observed probabilities ranging from zero to one.

Figure 3. Calibration curves of model-1 and model-2 in the training set, test set, and external validation set. (a) Model-1 shows good consistency in predictive ability across the training set and external validation set, indicating good agreement between predicted and observed probabilities for RP-CPP. Although slight deviation was noted in the test set, overall calibration performance remained satisfactory. (b) The calibration curves for Model 2 in the training and test sets closely followed the ideal reference line, indicating good agreement between predicted and observed probabilities; however, some deviation was observed in the external validation set, suggesting only moderate calibration performance.

Figure 4
Two line graphs labeled “a” and “b” compare net benefit against high-risk thresholds. Both graphs include lines for train (black), test (red), external (green), all (light gray), and none (dark gray) datasets. Net benefit ranges from 0 to 0.3, and high-risk thresholds range from 0 to 1, with a cost-benefit ratio from 1:100 to 100:1.

Figure 4. Clinical decision curve analysis (DCA) of the models. In the DCA, the y-axis represents net benefit (NB). The horizontal line indicates a net benefit of zero, corresponding to the strategy of treating no patients. The upward-sloping curve represents the net benefit when all patients are treated. A higher net benefit reflects greater clinical utility of the model. (a) DCA for Model-1; (b) DCA for Model-2. The DCA curves for both models lie above the reference lines, demonstrating good clinical decision-making performance in the training and test sets, though with slightly reduced performance in external validation.
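The net benefit on the y-axis of a decision curve is commonly defined as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability and n is the number of patients. The sketch below assumes this standard definition, which the text does not state explicitly, and uses hypothetical function names. The "treat all" reference curve is obtained by calling every patient positive, and "treat none" is zero by definition.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(labels)
    if n == 0:
        raise ValueError("labels must not be empty")
    true_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * (threshold / (1.0 - threshold))


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: every patient is treated regardless of the model."""
    return net_benefit(labels, [1.0] * len(labels), threshold)
```

A model adds clinical value at a given threshold when its net benefit exceeds both the treat-all value and zero.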

Nomogram

Nomograms for predicting RP-CPP using Model-1 and Model-2 are illustrated in Figure 5. The nomogram for Model-1 was developed based on five independent factors: duration of breast development, serum OC levels, mean ovarian volume, endometrial presence/absence, and whether at breast Tanner stage 2. The nomogram for Model-2 was developed based on three independent factors: serum OC levels, mean ovarian volume, endometrial presence/absence. In these nomograms, a higher total score, calculated as the sum of points assigned to each factor, is indicative of a greater probability of RP-CPP.

Figure 5
A set of two nomograms labeled “a” and “b” used for RP-CPP diagnosis, showing scales for factors such as points, development of breast duration, OC, mean ovary size, endometrium, Tanner Stage 2, total points, and RP-CPP diagnosis probability. Both nomograms follow a similar structure with different axis values, used to visually assess clinical parameters.

Figure 5. Nomograms of model-1 and model-2. (a) Nomogram of model-1; (b) nomogram of model-2.

Discussion

RP-CPP is characterized by the rapid progression of both growth and sexual development. Currently, clinical identification of RP-CPP relies mainly on the assessment of Tanner stages of sexual development and the rapid advancement of bone age. However, these methods generally require a period of observation to accurately evaluate the rate of development, and therefore, there remains a lack of effective early identification methods for RP-CPP. Several studies have attempted to evaluate RP-CPP using parameters such as bone age, basal LH levels, IGF-1, inhibin B (INHB), anti-Müllerian hormone (AMH), and the size of the uterus or ovaries (5–7, 11, 21, 22). However, the findings from these studies indicate that reliance on a single factor remains inadequate for the early identification of RP-CPP. Endocrine disorders affecting pubertal development in girls are influenced by a complex interplay of genetic, environmental, nutritional, and psychosocial factors (3, 23–26). Given the inherent limitations of single-predictor approaches, integrating multiple variables may enhance diagnostic accuracy. This notion is reinforced by a recent review highlighting the challenges in differentiating CPP from benign variants and proposing a structured diagnostic framework that combines clinical features, hormonal assessments, and imaging findings (27). Therefore, this study aimed to develop a diagnostic prediction model for RP-CPP to explore methods for early identification.

To date, a limited number of studies have explored the use of multivariate analysis methods for the early diagnosis of RP-CPP. For instance, Calcaterra et al. developed a model combining uterine volume, estradiol levels, uterine endometrium, BA, and breast volume measured by ultrasound. This model achieved a C-statistic of 0.86 for distinguishing RP-CPP, with a sensitivity of 77.1% and a specificity of 83.3% (6). Another study evaluated the diagnostic value of various parameters for RP-CPP and found that basal LH ≥ 0.2 IU/L, estradiol (E2) ≥ 50 pmol/L, uterine length ≥ 3.5 cm, uterine width ≥ 1.5 cm, presence of endometrial echoes, and ovarian volume ≥ 2 cm³ were significantly associated with RP-CPP. The study further showed that diagnostic performance improved with the inclusion of more combined parameters: when three or more were present, the AUC was 0.71 (sensitivity 58%, specificity 85%) (21). The study by Chen et al. indicated that girls with progressive CPP exhibit lower levels of AMH and higher levels of INHB compared to those with slowly progressive CPP. The combined use of AMH and INHB for distinguishing between slowly progressive CPP and progressive CPP yielded an AUC of 0.92, with a sensitivity of 80% and a specificity of 89.3% (11). Despite the efforts of researchers to investigate various methods, current approaches for identifying RP-CPP remain insufficient, exhibiting suboptimal sensitivity and specificity.

Our earlier study showed that serum OC levels change earlier than estradiol and BA, suggesting its potential as an early biomarker for RP-CPP (14). Therefore, this study constructs distinct models with and without OC to validate whether the inclusion of OC can enhance the diagnostic performance of the models. Our findings indicate that, when OC is included as a variable in the dataset, the optimal model (Model 1) achieves an AIC of 83.32, which is superior to the optimal model that excludes OC (selected through stepwise regression, AIC = 99.09) and involves a smaller number of independent variables. Additionally, Model 1 demonstrates a higher AUC in the internal validation set compared to the model excluding OC (0.972 vs. 0.895). Moreover, the model incorporating OC had significantly lower prediction error rates in both the training set and internal validation set compared to the model excluding OC (training set: 5.48% vs 10.05%; internal validation set: 8.05% vs 12.64%). These findings suggest that OC contributes to enhancing the predictive performance of the models and may be beneficial for the early identification of RP-CPP.

In this study, cross-validated LASSO regression, stepwise regression, Bayesian best subset selection, and random forest methods were employed to screen independent variables. When OC was included in the dataset, the model selected by Bayesian best subset selection exhibited the smallest AIC value (76.55), outperforming those identified by cross-validated LASSO regression (AIC = 83.32), stepwise regression (AIC = 90.55), and random forest (AIC = 111.24). However, the Bayesian best subset model included 7 predictors, compared with 5 in the cross-validated LASSO model. Their AUCs were 0.981 vs. 0.973 in the training set (P > 0.05), both 0.972 in internal validation, and 0.930 vs. 0.923 in external validation (P > 0.05). This indicates that both models have similar diagnostic efficacy. Considering the convenience of clinical application, a model that incorporates fewer independent variables is preferred. Therefore, the model selected by cross-validated LASSO regression (Model 1) is considered superior. It achieved an AUC of 0.973 in the training set, with 91.6% sensitivity and 92.5% specificity, and showed robust performance in both internal and external validation. Given the substantial subjectivity in assessing breast development duration, we also reconstructed models excluding this variable. The results indicate that the model (Model 2) incorporating OC, mean ovarian volume, and endometrial presence/absence was optimal, as it yielded the lowest AIC value (108.53). This model achieved an AUC of 0.947 in the training set, with a sensitivity of 91.6% and specificity of 86.8%. Our findings indicate that both Model 1 and Model 2 exhibit high AUC, sensitivity, and specificity in training and validation sets. Goodness-of-fit assessments confirm excellent model performance, calibration curves demonstrate strong predictive consistency, and decision curve analysis underscores their clinical utility. However, if OC measurement is unavailable, an alternative model selected via stepwise regression (AIC = 99.09) can be used. This model includes duration of breast development, BA, HDL, mean ovarian volume, endometrial presence/absence, and whether at breast Tanner stage 2. It achieved an AUC of 0.963 in the training set (sensitivity: 93.4%; specificity: 90.6%) and demonstrated good predictive performance in both internal (AUC: 0.895) and external validation (AUC: 0.901). In summary, Model 1 and Model 2 are complementary and address distinct clinical needs. We recommend that clinicians select the model best aligned with their institutional resources and data availability, favoring Model 1 when accurate measurement of breast development duration is feasible.

In conclusion, this study adopted multiple methods to screen independent variables and selected the optimal model. Both internal and external validation tests confirmed that the model demonstrates excellent fitting performance and significant clinical utility. A nomogram was developed as a scoring system to facilitate clinical application. Moreover, our preliminary research indicated that OC might be beneficial for the early identification of RP-CPP. By incorporating OC into the model, we confirmed its ability to enhance predictive performance. Therefore, the inclusion of OC in the model may be applicable for the early diagnosis of RP-CPP.
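
To illustrate how a nomogram acts as a scoring system for a logistic regression model, the sketch below converts predictor values into points and maps the total back to a predicted probability. It is a generic construction under stated assumptions, not the nomogram developed in this study: the predictor names, coefficients, intercept, and axis ranges are hypothetical placeholders, and the scaling convention (the predictor with the widest contribution range spans 0 to 100 points) is one common choice.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Predictor:
    name: str
    beta: float   # log-odds coefficient (placeholder value, not from the study)
    low: float    # lowest value shown on this predictor's nomogram axis
    high: float   # highest value shown on this predictor's nomogram axis

    def contribution_bounds(self):
        """Smallest and largest contribution (beta * x) over the axis range."""
        ends = (self.beta * self.low, self.beta * self.high)
        return min(ends), max(ends)

def build_nomogram(intercept, predictors):
    """Return (points, probability) functions for a logistic model."""
    bounds = {p.name: p.contribution_bounds() for p in predictors}
    betas = {p.name: p.beta for p in predictors}
    # The predictor with the widest contribution range defines the 0-100 scale.
    max_span = max(hi - lo for lo, hi in bounds.values())
    # Linear predictor when every predictor scores 0 points.
    baseline = intercept + sum(lo for lo, _ in bounds.values())

    def points(name, value):
        lo, _ = bounds[name]
        return 100.0 * (betas[name] * value - lo) / max_span

    def probability(total_points):
        linear_predictor = baseline + total_points * max_span / 100.0
        return 1.0 / (1.0 + math.exp(-linear_predictor))

    return points, probability

# Hypothetical two-predictor model used only to exercise the functions.
predictors = [
    Predictor("continuous_marker", beta=0.05, low=0.0, high=100.0),
    Predictor("binary_sign", beta=1.5, low=0.0, high=1.0),
]
points, probability = build_nomogram(intercept=-3.0, predictors=predictors)

total = points("continuous_marker", 60.0) + points("binary_sign", 1.0)
risk = probability(total)
print(f"total points = {total:.1f}, predicted probability = {risk:.3f}")
```

Summing the points and reading off the probability gives the same result as evaluating the logistic model directly, which is why a nomogram can be used at the bedside without a calculator.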

The limitation of this study is that the samples analyzed were all sourced from a single center. Whether the findings are generalizable to other centers remains to be verified through multicenter, large-sample clinical validations.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by the Scientific Ethics Committee of the First Affiliated Hospital of Guangxi Medical University, Nanning, China (Approval No. 2023-E312-01). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants’ legal guardians/next of kin.

Author contributions

WQ: Data curation, Formal analysis, Methodology, Project administration, Writing – original draft, Writing – review & editing. RW: Data curation, Investigation, Methodology, Writing – review & editing. TX: Data curation, Methodology, Writing – review & editing. YC: Data curation, Investigation, Methodology, Writing – review & editing. DZ: Data curation, Investigation, Methodology, Writing – review & editing. ZD: Data curation, Investigation, Methodology, Writing – review & editing. DL: Conceptualization, Formal analysis, Funding acquisition, Project administration, Supervision, Writing – review & editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This study was supported by the Guangxi Clinical Research Center of Pediatric Disease (Grant No. GUI KE AD22035219), the Beijing Integrated Medicine Association (Grant No. ZHKY-2024-3314), the Key Laboratory of Children’s Disease Research in Guangxi’s Colleges and Universities, and the Open Project of Guangxi Key Laboratory of Precision Medicine for Genetic Diseases (Maternal and Child Health Hospital of Guangxi Zhuang Autonomous Region) (Grant No. GXWCH-ZDKF-2022-18).

Conflict of interest

The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that Generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Acinikli KY, Erbaş İM, Besci Ö, Demir K, Abacı A, and Böber E. Has the frequency of precocious puberty and rapidly progressive early puberty increased in girls during the COVID-19 pandemic? J Clin Res Pediatr Endocrinol. (2022) 14:302–7. doi: 10.4274/jcrpe.galenos.2022.2022-12-11

2. Chen Y, Chen J, Tang Y, Zhang Q, Wang Y, Li Q, et al. Difference of precocious puberty between before and during the COVID-19 pandemic: A cross-sectional study among Shanghai school-aged girls. Front Endocrinol. (2022) 13:839895. doi: 10.3389/fendo.2022.839895

3. Street ME, Ponzi D, Renati R, Petraroli M, D’Alvano T, Lattanzi C, et al. Precocious puberty under stressful conditions: new understanding and insights from the lessons learnt from international adoptions and the COVID-19 pandemic. Front Endocrinol. (2023) 14:1149417. doi: 10.3389/fendo.2023.1149417

4. Fava D, Pepino C, Tosto V, Gastaldi R, Pepe A, Paoloni D, et al. Precocious puberty diagnoses spike, COVID-19 pandemic, and body mass index: findings from a 4-year study. J Endocrine Society. (2023) 7:bvad094. doi: 10.1210/jendso/bvad094

5. Zung A, Burundukov E, Ulman M, Glaser T, Rosenberg M, Chen M, et al. The diagnostic value of first-voided urinary LH compared with GNRH-stimulated gonadotropins in differentiating slowly progressive from rapidly progressive precocious puberty in girls. Eur J Endocrinol. (2014) 170:749–58. doi: 10.1530/EJE-14-0010

6. Calcaterra V, Sampaolo P, Klersy C, Larizza D, Alfei A, Brizzi V, et al. Utility of breast ultrasonography in the diagnostic work-up of precocious puberty and proposal of a prognostic index for identifying girls with rapidly progressive central precocious puberty. Ultrasound Obstet Gynecology. (2009) 33:85–91. doi: 10.1002/uog.6271

7. Chen Y and Liu J. Do most 7- to 8-year-old girls with early puberty require extensive investigation and treatment? J Pediatr Adolesc Gynecology. (2021) 34:124–9. doi: 10.1016/j.jpag.2020.11.020

8. Zhang M, Sun J, Wang Y, Wu Y, Li X, Li R, et al. The value of luteinizing hormone basal values and sex hormone-binding globulin for early diagnosis of rapidly progressive central precocious puberty. Front Endocrinol. (2023) 14:1273170. doi: 10.3389/fendo.2023.1273170

9. Kim MR, Jung MK, and Yoo EG. Slower progression of central puberty in overweight girls presenting with precocious breast development. Ann Pediatr Endocrinol Metab. (2023) 28:178–83. doi: 10.6065/apem.2244062.031

10. Denis J, Dangouloff-Ros V, Pinto G, Flechtner I, Piketty M, Samara D, et al. Arterial spin labeling and central precocious puberty. Clin Neuroradiol. (2020) 30:137–44. doi: 10.1007/s00062-018-0738-5

11. Chen T, Wu H, Xie R, Wang F, Chen X, Sun H, et al. Serum anti-müllerian hormone and inhibin B as potential markers for progressive central precocious puberty in girls. J Pediatr Adolesc Gynecology. (2017) 30:362–6. doi: 10.1016/j.jpag.2017.01.010

12. Solhjoo S, Akbari M, Toolee H, Mortezaee K, Mohammadipour M, Nematollahi-Mahani SN, et al. Roles for osteocalcin in proliferation and differentiation of spermatogonial cells cocultured with somatic cells. J Cell Biochem. (2019) 120:4924–34. doi: 10.1002/jcb.27767

13. Lee WY, Jung G, Kim HR, Nam HK, Rhie YJ, and Lee KH. Serum osteocalcin levels in girls with central precocious puberty: relation to the onset of puberty. Tohoku J Exp Med. (2018) 245:239–43. doi: 10.1620/tjem.245.239

14. Qin W, Xie T, Chen Y, Zeng D, Meng Q, and Lan D. Osteocalcin: may be a useful biomarker for early identification of rapidly progressive central precocious puberty in girls. J Endocrinol Invest. (2025) 48:721–30. doi: 10.1007/s40618-024-02478-0

15. Qin W, Chen Y, Sooranna SR, Zeng D, Xie T, Meng Q, et al. Osteocalcin: A potential marker to identify and monitor girls with rapidly progressive central precocious puberty. J Paediatr Child Health. (2024) 60:593–600. doi: 10.1111/jpc.16632

16. Subspecialty Group of Endocrinologic, Hereditary and Metabolic Diseases, the Society of Pediatrics, Chinese Medical Association; Editorial Board, Chinese Journal of Pediatrics. Consensus statement for the diagnosis and treatment of central precocious puberty (2015). Zhonghua er ke za zhi = Chin J Pediatr. (2015) 53:412–8.

17. Kletter GB, Klein KO, and Wong YY. A pediatrician’s guide to central precocious puberty. Clin Pediatr. (2015) 54:414–24. doi: 10.1177/0009922814541807

18. Subspecialty Group of Endocrinologic, Hereditary and Metabolic Diseases, the Society of Pediatrics, Chinese Medical Association; Editorial Board, Chinese Journal of Pediatrics. Expert consensus on the diagnosis and treatment of central precocious puberty (2022). Zhonghua er ke za zhi = Chin J Pediatr. (2023) 61:16–22. doi: 10.3760/cma.j.cn112140-20220802-00693

19. Prosperi S and Chiarelli F. Early and precocious puberty during the COVID-19 pandemic. Front Endocrinol. (2022) 13:1107911. doi: 10.3389/fendo.2022.1107911

20. Riley RD, Ensor J, Snell KIE, Harrell FE Jr., Martin GP, Reitsma JB, et al. Calculating the sample size required for developing a clinical prediction model. BMJ (Clinical Res ed). (2020) 368:m441. doi: 10.1136/bmj.m441

21. Calcaterra V, Klersy C, Vinci F, Regalbuto C, Dobbiani G, Montalbano C, et al. Rapid progressive central precocious puberty: diagnostic and predictive value of basal sex hormone levels and pelvic ultrasound. J Pediatr Endocrinol Metab: JPEM. (2020) 33:785–91. doi: 10.1515/jpem-2019-0577

22. Liu Y, Cheng Y, Sun M, Hao X, and Li M. Analysis of serum insulin-like growth factor-1, fibroblast growth factor 23, and Klotho levels in girls with rapidly progressive central precocious puberty. Eur J Pediatr. (2023) 182:5007–13. doi: 10.1007/s00431-023-05174-y

23. Argente J, Dunkel L, Kaiser UB, Latronico AC, Lomniczi A, Soriano-Guillén L, et al. Molecular basis of normal and pathological puberty: from basic mechanisms to clinical implications. Lancet Diabetes Endocrinol. (2023) 11:203–16. doi: 10.1016/S2213-8587(22)00339-4

24. Calcaterra V, Cena H, Loperfido F, Rossi V, Grazi R, Quatrale A, et al. Evaluating phthalates and bisphenol in foods: risks for precocious puberty and early-onset obesity. Nutrients. (2024) 16:2732. doi: 10.3390/nu16162732

25. Calcaterra V, Verduci E, Magenes VC, Pascuzzi MC, Rossi V, Sangiorgio A, et al. The role of pediatric nutrition as a modifiable risk factor for precocious puberty. Life (Basel Switzerland). (2021) 11:1353. doi: 10.3390/life11121353

26. Mizgier M, Jarząbek-Bielecka G, Wendland N, Jodłowska-Siewert E, Nowicki M, Brożek A, et al. Relation between inflammation, oxidative stress, and macronutrient intakes in normal and excessive body weight adolescent girls with clinical features of polycystic ovary syndrome. Nutrients. (2021) 13:896. doi: 10.3390/nu13030896

27. Paparella R, Bei A, Brilli L, Maglione V, Tarani F, Niceta M, et al. Precocious puberty and benign variants in female children: etiology, diagnostic challenges, and clinical management. Endocrines. (2025) 6:29. doi: 10.3390/endocrines6020029

Keywords: diagnostic model, girls, nomogram, osteocalcin, rapidly progressive central precocious puberty

Citation: Qin W, Wang R, Xie T, Chen Y, Zeng D, Ding Z and Lan D (2026) Development and comparison of multivariate diagnostic models for rapidly progressive central precocious puberty in girls: the role of serum osteocalcin. Front. Endocrinol. 16:1728132. doi: 10.3389/fendo.2025.1728132

Received: 19 October 2025; Revised: 23 December 2025; Accepted: 24 December 2025; Published: 19 January 2026.

Edited by:

Rodolfo A. Rey, Hospital de Niños Ricardo Gutiérrez, Argentina

Reviewed by:

Mohamed Ahmed Abdullah, University of Khartoum, Sudan
Dorota Formanowicz, Poznan University of Medical Sciences, Poland

Copyright © 2026 Qin, Wang, Xie, Chen, Zeng, Ding and Lan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Dan Lan, landan@stu.gxmu.edu.cn; landan_ld@163.com

ORCID: Dan Zeng, orcid.org/0000-0001-5899-7040
