A Model for Predicting Polycystic Ovary Syndrome Using Serum AMH, Menstrual Cycle Length, Body Mass Index and Serum Androstenedione in Chinese Reproductive Aged Population: A Retrospective Cohort Study

Background A clinical diagnosis of polycystic ovary syndrome (PCOS) can be tedious, requiring many different tests and examinations. Furthermore, women with PCOS have increased risks of several metabolic complications, which need long-term health management. Therefore, we attempted to establish an easily applicable model to identify such women at an early stage.
Objective To develop an easy-to-use tool for screening PCOS based on medical records from a large assisted reproductive technology (ART) center in China.
Materials and Methods A retrospective observational cohort from Peking University Third Hospital was used in the study. Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression with 10-fold cross-validation was applied to construct the model. The area under the receiver operating characteristic curve (AUC), sensitivity, and specificity values were used to evaluate and compare the models.
Design, Setting, and Participants This retrospective cohort study included 21,219 ovarian stimulation cycle records from January to December 2019 in Peking University Third Hospital.
Main Outcomes and Measures The main outcome was whether there was a clinical diagnosis of PCOS. The independent variables included age, body mass index (BMI), upper limit of menstrual cycle length (UML), basal serum levels of anti-Müllerian hormone (AMH), testosterone, and androstenedione, and antral follicle count, among others.
Results We have established a new mathematical model for diagnosing PCOS using serum AMH and androstenedione levels, UML, and BMI, with AUC values of 0.855 (0.838–0.870), 0.848 (0.791–0.891), and 0.846 (0.812–0.875) in the training, validation, and testing sets, respectively. The contributions of the predictors to this model were: AMH 41.2%; UML 35.2%; BMI 4.3%; and androstenedione 3.7%. The 10 groups of women with the highest predicted probability of PCOS are presented. An online tool (http://121.43.113.123:8888/) has been developed to assist Chinese ART clinics.
Conclusions The models and online tool we established here might be helpful for screening and identifying women with undiagnosed PCOS in Asian populations and could assist in the long-term management of related metabolic disorders.


INTRODUCTION
Polycystic ovary syndrome (PCOS) is one of the most common endocrine and metabolic disorders in women of reproductive age, with a prevalence of 5.6% in the Chinese population (1). Despite growing interest in understanding the pathophysiology of PCOS, our knowledge is still incomplete. The diagnosis of PCOS in assisted reproductive technology (ART) clinics can be challenging, and women with this disorder can have different signs and symptoms: some might only have a few, whereas others can experience many. Even within the same woman, the number of symptoms experienced and their severity will change over time. Therefore, reaching a final diagnosis of PCOS can be long and tedious, requiring many different tests and examinations, so many women might not recognize that they have PCOS-associated conditions and might not seek appropriate evaluation and treatment. Furthermore, screening and diagnosis of PCOS is not an easy task for primary care physicians. These factors mean that PCOS is largely underdiagnosed. Indeed, a large retrospective study found that the prevalence of PCOS in women attending primary care ART clinics was lower than in community samples, indicating a general problem of underdiagnosis (2).
In addition to androgen excess, insulin resistance is thought to play an important role in the pathophysiology of PCOS (3)(4)(5). In an individual woman with PCOS, the symptoms change over time, from hirsutism in adolescence to infertility in reproductive age. Even after menopause, high levels of androgens and insulin resistance persist, resulting in a higher risk of type 2 diabetes and other related metabolic disorders (6,7). Therefore, irregular menstruation and/or hirsutism can no longer be regarded as a benign nuisance. Long-term health management, including tailored medication and lifestyle management, has received increasing attention (4). The prerequisite for offering women with PCOS long-term health management is to identify them early. Here we attempted to establish an online tool for screening PCOS, which might help identify undiagnosed PCOS in a timely manner and thus improve long-term health management for affected women.

Subjects
This was a retrospective observational cohort study in Peking University Third Hospital. In all, 21,219 ovarian stimulation cycle records from January to December 2019 were retrieved. We excluded the following: 3,289 cycles without menstrual cycle data; 150 without body mass index (BMI) information; 3,211 without one or more key hormone levels recorded; 2,546 without information on the antral follicle count (AFC); and 1,303 repeat cycles of the same patient. The study flowchart is shown in Figure 1. Finally, 11,720 cycle records were used. The menstrual cycle duration in this study refers to the upper limit of menstrual cycle length (UML). For example, if the duration of a patient's menstrual cycle was 30-90 days, 90 days was used as the UML. The basic characteristics of the cycle data are listed in Table 1. The need for informed consent by the patients was waived because the data in our analysis were deidentified, in accordance with the Declaration of Helsinki (8).
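The UML rule described above can be sketched in a few lines of Python. This is an illustration only: the function name and the assumption that cycle length is recorded as a string such as "30-90" or "28" are hypothetical and are not taken from the study database.

```python
def upper_menstrual_limit(cycle_length: str) -> int:
    """Return the upper limit of menstrual cycle length (UML) in days.

    Accepts a single value ("28") or a range ("30-90"); the string
    format is an assumption made for this sketch.
    """
    # Treat an en dash like a hyphen, then split the range into its ends.
    parts = cycle_length.replace("–", "-").split("-")
    days = [int(p.strip()) for p in parts if p.strip()]
    if not days:
        raise ValueError(f"no cycle length found in {cycle_length!r}")
    # The UML is the longest cycle length the patient reports.
    return max(days)


print(upper_menstrual_limit("30-90"))  # 90
print(upper_menstrual_limit("28"))     # 28
```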

Diagnosis of PCOS
Women with PCOS were diagnosed according to the 2003 Rotterdam criteria (9), which require the presence of at least two of the following: (1) ovulatory dysfunction (i.e., oligo/anovulation); (2) hyperandrogenism, diagnosed either clinically by cutaneous manifestations of androgen excess or biochemically by hyperandrogenemia (high testosterone or androstenedione levels in blood tests); or (3) polycystic ovaries on ultrasonography. Diagnoses of PCOS were made after the exclusion of phenotypically similar androgen excess disorders such as congenital adrenal hyperplasia, androgen-secreting tumors, Cushing syndrome, thyroid dysfunction, or hyperprolactinemia.
Hyperandrogenism was diagnosed clinically by the presence of excessive acne, androgenic alopecia, or hirsutism, or biochemically by serum levels of total testosterone (TES) ≥ 2.5 nmol/L or androstenedione (AND) ≥ 11.5 nmol/L. Hirsutism was diagnosed as described (10). Briefly, a modified Ferriman-Gallwey score of >4, or hair growth involving the upper lip, thighs, and lower abdomen with scores of >2, is used to diagnose hirsutism in our clinical practice. Oligomenorrhea was diagnosed as menstrual cycles lasting >35 days but <6 months. Amenorrhea was defined as the absence of menstruation for more than 6 months after a cyclic pattern had been established. A polycystic ovary on ultrasonography was defined as one containing 12 or more follicles measuring 2-9 mm in diameter or an ovary with a volume of greater than 10 mL. A single ovary meeting either or both of these definitions was deemed sufficient for the diagnosis. Hyperprolactinemia was diagnosed from two serum prolactin levels of >25 ng/mL. The clinical diagnosis of having or not having PCOS was recorded by our data supporting group, along with other diagnoses and basal or clinical information.
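To make the "at least two of three" rule concrete, the sketch below encodes the thresholds quoted above. It is a simplified illustration, not the clinical workflow used in the study: the `Findings` record and its field names are hypothetical, clinical hyperandrogenism is reduced to a single flag, and the required exclusion of similar disorders (congenital adrenal hyperplasia, hyperprolactinemia, and so on) is not modeled.

```python
from dataclasses import dataclass

TES_CUTOFF_NMOL_L = 2.5    # total testosterone threshold quoted in the text
AND_CUTOFF_NMOL_L = 11.5   # androstenedione threshold quoted in the text


@dataclass
class Findings:  # hypothetical record layout, for illustration only
    cycle_length_days: float         # longest reported cycle length
    clinical_hyperandrogenism: bool  # acne, androgenic alopecia, or hirsutism
    tes_nmol_l: float
    and_nmol_l: float
    max_follicles_per_ovary: int     # 2-9 mm follicles in the fuller ovary
    max_ovarian_volume_ml: float     # volume of the larger ovary


def rotterdam_features(f: Findings) -> dict:
    """Evaluate the three Rotterdam features separately."""
    return {
        # Cycles longer than 35 days cover oligomenorrhea and amenorrhea.
        "ovulatory_dysfunction": f.cycle_length_days > 35,
        "hyperandrogenism": (
            f.clinical_hyperandrogenism
            or f.tes_nmol_l >= TES_CUTOFF_NMOL_L
            or f.and_nmol_l >= AND_CUTOFF_NMOL_L
        ),
        # One ovary meeting either definition is sufficient.
        "polycystic_ovary": (
            f.max_follicles_per_ovary >= 12 or f.max_ovarian_volume_ml > 10
        ),
    }


def meets_rotterdam(f: Findings) -> bool:
    """True when at least two of the three features are present."""
    return sum(rotterdam_features(f).values()) >= 2


example = Findings(45, False, 1.0, 5.0, 14, 8.0)
print(rotterdam_features(example), meets_rotterdam(example))
```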

Antral Follicle Counts and Endocrine Assays
Antral follicles measuring 2-10 mm in diameter in both ovaries were counted on menstrual cycle day 2 using transvaginal ultrasound scans. On the same day, venous blood was collected for measuring follicle stimulating hormone (FSH), luteinizing hormone (LH), TES, AND, and estradiol (E2) concentrations. Blood samples for measuring AMH were collected on any day of the menstrual cycle prior to ovarian stimulation. Samples were inverted five times immediately after collection and centrifuged at 1,800 g for 10 min before endocrine assessment.
Serum levels of FSH, LH, TES, AND, and E2 were measured using a Siemens Immulite 2000 immunoassay system (Siemens Healthcare Diagnostics, Shanghai, P. R. China). The quality controls for FSH, LH, TES, AND, and E2 were supplied by Bio-RAD Laboratories (Hercules, CA, USA; Lyphochek Immunoassay Plus Control, Trilevel, catalog number 370, lot number 40370). Serum AMH concentrations were measured using an ultrasensitive two-site enzyme-linked immunosorbent assay (ELISA) (Ansh Laboratories LLC; Webster, TX, USA), using quality controls supplied with the ELISA kits. The coefficients of variation for the assays were <6% for AMH, FSH, and LH, and <10% for E2, AND, and TES.

Statistical Analysis
The diagnosis of having or not having PCOS was included as the dependent variable, and AFC, AMH level, age and other measures were included as independent variables. To make the model easier to apply in clinical practice, continuous variables were transformed into categorical variables. The grouping of each independent variable was based mainly on data exploration before analysis, combined with our clinical experience. The grouping criteria for each independent variable were kept the same across the three models. For variable selection, a proportion (70%) of the data was first selected randomly as a training set, which was used for model establishment, and the rest (30%) of the data was used as the testing set, which was used for model evaluation. A prediction model was then constructed in the training set. The scaled negative loglikelihood (-Log L (b)) was used to evaluate the final model: the smaller the value of scaled -Log L (b) in the validation set, the better the model fit. The logistic Least Absolute Shrinkage and Selection Operator (LASSO) model, a shrinkage method that can actively select from a large and potentially multicollinear set of variables and reduce the likelihood of overfitting, was applied to construct a predictive model. The logistic LASSO is a logistic regression approach that penalizes the absolute size of the regression coefficients according to the value of a penalty term, λ. With larger penalties, the estimates of weaker factors shrink toward zero, so that only the strongest predictors remain in the model. The value of λ was determined with 10-fold cross-validation, and the most predictive covariates, selected at the minimum value (λmin), were used to construct the PCOS diagnostic models (11).
The performance of each model was assessed using the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity, each with a 95% confidence interval (CI).
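For reference, the logistic LASSO estimate described above can be written in its usual textbook form. The notation is ours rather than the study's: $y_i \in \{0, 1\}$ is the PCOS diagnosis for cycle record $i$, $x_i$ is the vector of (dummy-coded) predictors, $n$ is the number of training records, $p$ is the number of coefficients, and $\lambda$ is the penalty term. The $1/n$ scaling is a common convention and may differ from the parameterization used by the software.

```latex
(\hat{\beta}_0, \hat{\beta})
  = \arg\min_{\beta_0,\,\beta}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \Big[ y_i \,(\beta_0 + x_i^{\top}\beta)
              - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\}
```

The first term is the average negative log-likelihood of the logistic model; the second is the L1 penalty. Because the penalty grows with the absolute size of each coefficient, increasing $\lambda$ drives the coefficients of weak predictors exactly to zero, which is what makes the method a variable selector.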
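As an illustration of these three measures, the sketch below computes them from scratch on a small made-up data set. The labels, scores, and 0.5 threshold are invented for the example and do not come from the study; the AUC is computed through its rank interpretation (the probability that a randomly chosen case receives a higher score than a randomly chosen non-case), and confidence intervals are not computed here.

```python
def auc(y_true, scores):
    """AUC as P(score of a case > score of a non-case); ties count 1/2."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Pairwise comparison is O(n^2); fine for a sketch, slow for large data.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold):
    """Classify as positive when score >= threshold, then tabulate."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: 1 = PCOS, 0 = no PCOS; scores are predicted probabilities.
y = [1, 1, 1, 0, 0, 0, 0, 1]
p = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.6]

print(auc(y, p))                           # 0.9375
print(sensitivity_specificity(y, p, 0.5))  # (0.75, 0.75)
```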
All the analyses in this study were performed using SAS JMP Pro (version 14.2; SAS Institute, Cary, NC, USA), and p < 0.05 was considered statistically significant.

The indicators in Table 1 were all significant in univariate analysis for the diagnosis of PCOS and were therefore included in multiple logistic regression with 10-fold cross-validation. When seven variables were included, the scaled -Log L (b) value in the validation set no longer decreased, as indicated in the model building process of Model 1 in Supplementary Document 1; thus, Model 1 was identified. The parameter estimates of Model 1 are shown in Table 2. The AMH level is generally considered to be a good substitute for the AFC for PCOS diagnosis (12)(13)(14); thus, we tried to establish another model without AFC. All the independent variables except AFC were included to identify the ideal model, using the same multiple logistic regression method with cross-validation. Model 2 was thereby identified; the variables included were age, AMH, UML, BMI, TES and AND. The parameter estimates are given in Supplementary Table 1.
The AUCs of Model 2 (without the AFC) in the training and validation data were 0.862 and 0.865, respectively, whereas the AUCs of Model 1 (with the AFC) in the training and validation data were 0.865 and 0.845, respectively. Inclusion of the AFC did not improve the performance of our models. The contribution of each variable is shown in Table 3. The main effect of AMH in Model 2 (without the AFC) was 35.1%, whereas the main effects of AMH and AFC in Model 1 (with the AFC) were 18.3% and 17.2%, respectively. Given the small contributions of TES and age in Model 2 (without the AFC) (Table 3), we eliminated them in subsequent model building. UML, AMH, BMI, and AND levels were included in this process to derive Model 3. The estimated parameter values and p values of each index are shown in Table 4. The main effects of each predictor contributing to Model 3 were: AMH 41.2%; UML 35.2%; BMI 4.3%; and AND 3.7%. Table 5 shows the AUC, sensitivity, and specificity of Model 3 in the training, validation and testing sets. The relationship between predicted probability and the prevalence of PCOS is shown in Figure 2. The prevalence of PCOS increased with the predicted probability. Table 6 shows the top 10 groups of women most highly predicted to have PCOS. The prevalence and predicted probability of PCOS in all groups are given in Supplementary Table 2.
Our algorithm for diagnosing PCOS has been developed into a website for Chinese ART clinics (http://121.43.113.123:8888/). The user inputs the required indicators and clicks 'calculate', and the predicted probability of PCOS and the risk group of the subject are displayed. The grouping criteria are as follows: a low-risk group, with a predicted probability of <10%; a medium-risk group, with a predicted probability of 10% to 50%; and a high-risk group, with a predicted probability of >50%.
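The grouping criteria amount to a simple threshold rule on the predicted probability. A minimal sketch follows; the function name is hypothetical and the code is not taken from the online tool, whose implementation is not described here.

```python
def risk_group(probability: float) -> str:
    """Map a predicted PCOS probability (0 to 1) to the stated risk group."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability < 0.10:
        return "low"      # predicted probability < 10%
    if probability <= 0.50:
        return "medium"   # predicted probability 10% to 50%
    return "high"         # predicted probability > 50%


for prob in (0.05, 0.10, 0.50, 0.9356):
    print(f"{prob:.2%} -> {risk_group(prob)}")
```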

DISCUSSION
There are good correlations between AMH level and polycystic ovarian morphology (12)(13)(14), and the serum AMH level has been increasingly regarded as a surrogate marker for PCOS and ovarian reserve assessment. Previous studies have found different AMH cut-off values to diagnose PCOS. However, because of small sample sizes, inappropriate controls and heterogeneous AMH assays (15), the application of these cut-offs in the diagnosis of PCOS has been limited. This might be why the introduction of AMH into the diagnosis of PCOS is controversial. We have established a four-item mathematical model (AMH + UML + BMI + AND) instead of a simple AMH cut-off to diagnose PCOS, with AUCs of 0.855, 0.848, and 0.846 in the training, validation, and testing sets, respectively. The contributions of each predictor in Model 3 are: AMH 41.2%; UML 35.2%; BMI 4.3%; and AND 3.7%.
Although AMH might serve as a potential diagnostic marker for PCOS, it is not currently recommended as a single-test parameter for this purpose by the International Evidence-based Guideline for the Assessment and Management of Polycystic Ovary Syndrome 2018 (16). Comprehensive models have been   Table 2 indicate that adjusting for UML, serum AMH level, the AFC, BMI, serum AND level and the serum TES level, the contribution of age is small, only 0.2%.
PCOS is mainly a hyperandrogenic disorder, as verified in various rodent models using androgen induction (20). However, how excessive androgens are produced remains largely unknown. Studies using mouse models have revealed that AMH is involved in regulating the hypothalamic-pituitary-ovarian axis and might stimulate the production of excess androgens (21,22). Administration of recombinant human AMH on gestational days 16.5, 17.5 and 18.5 in mice activated the AMH receptor in gonadotropin-releasing hormone-secreting neurons and increased the frequency of luteinizing hormone (LH) pulses, leading to increased levels of serum LH and TES and decreased levels of E2 and progesterone in pregnant mice on gestational day 19.5 (22). The elevated levels of serum LH and TES induced by high AMH levels lead to oligo-ovulation or anovulation and defective oocyte development in mother mice and their female offspring. Thus, AMH is being increasingly recognized as a potential marker for the diagnosis of such disorders in women undergoing ART.
Whether AFC, one of the measures acquired by ultrasonography (9), could be replaced by AMH has been debated. Our previous study has shown that AFC can be replaced by AMH in assessing ovarian reserve (23). However, AFC and AMH have different significance. AMH is secreted by immature granulosa cells in a gonadotropin (Gn)-independent manner, while the AFC reflects the development of small Gn-dependent follicles. Could AFC be replaced by AMH when screening PCOS?
Our results here show that the contribution of AMH to Model 2 (without AFC) is 35.1%, while the combined contribution of AMH and AFC in Model 1 (with AFC) is 35.5%, which suggests that AMH could potentially replace AFC in diagnosing or screening PCOS.
The etiology of PCOS is multifactorial (3,14). Affected patients with or without a normal BMI are observed commonly in our clinical practice. This heterogeneity means that the driving force might differ in patients with PCOS according to their BMI, which is consistent with many other studies (24,25). Thus, the etiology of such obese patients with PCOS but with normal AMH levels (such as group 96 in Supplementary Table 2) might differ from that in normal weight PCOS patients with high AMH levels (such as group 16), so follow-up treatments should also be different.
Because PCOS is so strongly associated with obesity, lean women with PCOS often go undiagnosed for years while they struggle to conceive. In our data, the prevalence of PCOS was 64/1071 (5.98%, CI: 4.71%-7.56%) for women with a BMI <18.5 kg/m2. When BMI <18.5 kg/m2 was combined with an AMH level >10 ng/mL, the prevalence of PCOS increased to 21/49 (42.86%, CI: 30.02%-56.73%). Similarly, when BMI <18.5 kg/m2 and AMH >10 ng/mL were combined with a UML of >90 days, the prevalence of PCOS increased to 10/13 (76.92%, CI: 49.74%-91.82%). These normal-weight or lean women still face fertility challenges, increased androgen levels and the resulting symptoms (such as acne, hirsutism, and hair loss), and an increased risk of diabetes and cardiovascular disease (26). The diagnostic models we have established here might help diagnose these patients in a timely fashion and facilitate their long-term health management.
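The method used for these confidence intervals is not stated. As a worked check, the three intervals quoted above appear to be consistent with the Wilson score interval for a binomial proportion at the 95% level; the sketch below assumes that method and uses only the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(events: int, n: int, level: float = 0.95):
    """Wilson score interval for a binomial proportion events/n."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    p = events / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Counts quoted in the text for lean women (BMI < 18.5 kg/m2).
for events, n in [(64, 1071), (21, 49), (10, 13)]:
    lo, hi = wilson_interval(events, n)
    print(f"{events}/{n}: {events / n:.2%} ({lo:.2%} to {hi:.2%})")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within 0% to 100% and behaves reasonably for small groups such as 10/13, which may be why it suits subgroup prevalences of this kind.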
It has been demonstrated that serum AMH levels in oligomenorrheic girls without evidence of hyperandrogenism are similar to levels in adolescents and adults with PCOS but elevated compared with normal adolescents and adults (27). Combined with the discovery that excessive AMH contributes to ovulatory disorders in mice (21), high AMH levels might be closely associated with oligomenorrhea. In our data, among the first group with a predicted PCOS probability of 93.56%, there are still more than 10% who do not meet the Rotterdam criteria (Supplementary Table 2). There is a high probability that these women who have not been diagnosed with PCOS are actually oligomenorrheic without evidence of hyperandrogenism. In addition, the possibility of a missed diagnosis cannot be ruled out.

Limitations of the Study
First, although our sample size was large, our PCOS diagnostic models are still based on retrospective data; further prospective studies are needed to prove their applicability in ART practice. Second, the different AMH values obtained using diagnostic kits from different manufacturers might also affect the applicability of our models. Third, it has been acknowledged that measurements of plasma or serum total TES levels have adequate sensitivity and clinical utility in men but are relatively inaccurate and insensitive in women, which severely limits their clinical utility (28). This might explain why the TES level was not significant in our models. A TES diagnostic kit with better performance might help to further optimize our diagnostic models. Fourth, other predictors are needed to improve the performance of our models in the future. Finally, our PCOS screening model is based on data from reproductive-aged women attending infertility clinics, not adolescent girls, and is therefore not suitable for PCOS screening in adolescents.

CONCLUSION
Women with PCOS are at increased risk of infertility and multiple metabolic disorders, which need long-term health management. However, the clinical diagnosis of PCOS is complicated, requiring different tests and physical examinations. Currently, screening and diagnosis of PCOS is not an easy task for general gynecologists and primary care physicians. Here, we establish an easily applicable PCOS screening model. By entering two serological indicators, the upper limit of the menstrual cycle length, and BMI, the probability and the risk of having PCOS can be predicted (http://121.43.113.123:8888/). The model can be used for early identification of women at higher risk of having PCOS, so that they can be referred for further diagnosis, thus contributing to their long-term health management. Moreover, it should be noted that our PCOS screening model is based on data from reproductive-aged women attending infertility clinics, not adolescent girls, and is therefore not suitable for PCOS screening in adolescents.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
HX, GF, and KA contributed to manuscript drafting and revising. HX, GF, KA, LC, and YH contributed to data analysis and interpretation. HX, LC, and RY prepared figures and tables. RL and JQ contributed to the conception of the study, manuscript revising and final approval. All authors reviewed the manuscript. All authors contributed to the article and approved the submitted version.