Predicted risk factors associated with secondary infertility in women: univariate and multivariate logistic regression analyses

Introduction: Infertile women are those who regularly engage in unprotected intercourse for a period of at least 1 year and are unable to become clinically pregnant. Primary infertility means the inability of a couple to conceive without any previous successful pregnancy. Secondary infertility refers to the inability to become pregnant for 12 months after at least one previous pregnancy. The objectives of the current study were to analyze risk factors for secondary infertility and to compare the predictive accuracy of artificial neural network (ANN) and multiple logistic regression models.
Methods: The study was conducted at The University Institute of Public Health, collecting data from Gilani Ultrasound Center over 18 months after approval of the synopsis. A total of 690 women (345 cases and 345 controls) were selected. Women selected for the case group had to be 20–45 years of age, could have any parity, and had a confirmed diagnosis of secondary infertility.
Results: Multiple logistic regression (MLR) and ANN were used. The chance of secondary infertility was 2.91 times higher in women living in a joint family [odds ratio (OR) = 2.91; 95% confidence interval (CI) (1.91, 4.44)] and 2.35 times higher in women who had relationship difficulties with their husband [OR = 2.35; 95% CI (1.18, 4.70)]. Marriage at an earlier age was associated with secondary infertility, with β being negative and OR being < 1 [OR = 0.94; 95% CI (0.88, 0.99)]. The area under the receiver operating characteristic (ROC) curve was 0.852 for the logistic regression model and 0.87 for the artificial neural network, which was better than logistic regression.
Discussion: Identified risk factors of secondary infertility are mostly modifiable and can be prevented by managing these risk factors.


Introduction
Infertility as a disease affects a couple's social, psychological, economic, and sexual wellbeing and is a significant global health issue. Infertility is often defined as a couple's failure to conceive after between 12 and 24 months of unprotected intercourse. Primary infertility occurs when a couple has never been successful in conceiving, whereas secondary infertility occurs after a previously confirmed pregnancy (1). Infertility affects 10% to 15% of couples worldwide. Female and male factors are to blame in 35% and 45% of instances, respectively, while the remaining couples either have a mix of factors or experience idiopathic infertility (2,3). Fertility rates vary across different parts of the world, and 10-15% of married couples struggle to conceive. Primary infertility and secondary infertility are the two categories of infertility (4).
In 2010, almost 40 million couples actively sought infertility therapy, of which 34 million were in developing countries. Worldwide, there were approximately 48.5 million infertile couples (10-15%). Women are thought to experience the highest rates of infertility (5). A significant majority of female factor infertility is due to the tubal factor. Pelvic inflammatory disease and acute salpingitis are the two most common causes of female factor infertility. Tubal damage results from approximately 12% of pelvic infection episodes, rising to 23% after two episodes and 54% after three episodes (6). Nevertheless, the reasons for infertility may vary geographically (7). When a woman is under 35 years of age, a fertility evaluation is often conducted after 1 year of regular unprotected sexual activity; for women aged 35 years or older, it is usually performed after 6 months. However, in women with irregular menstrual cycles or recognized risk factors for infertility, such as a history of pelvic inflammatory disease, endometriosis, or reproductive tract anomalies, the assessment may be initiated earlier (8).
Infertility in women can be diagnosed using various methods (9). Different risk factors are attributed to secondary infertility, including lifestyle variables such as diet, obesity, drinking, smoking, and environmental hazards, as well as factors secondarily connected to human infertility such as childbirth complications, postpartum practices, and symptoms of sexually transmitted diseases (10). Other prevalent causes of female infertility include anovulatory disorders, polycystic ovarian syndrome, peritubovarian adhesions, endometriosis, and uterine and cervical factors (11,12). Traditional statistical analysis approaches are used to discover the specific cause of infertility and to give effective predictors for prevention and management. Univariate analysis can be used to assess the association between the investigated factor and the treatment result. However, multivariate analysis (multivariate logistic regression) provides a high-accuracy model for predicting pregnancy. Many types of studies utilize the phrase "data mining" to evaluate and classify medical data (13). There is great hope for artificial neural network (ANN) technology, which has already been shown to be successful in pregnancy prediction and can replace traditional statistical prediction methods, such as regression analysis. This classifier is designed to learn information, generalize, and model linear or non-linear multidimensional relationships (13).
Different statistical techniques are being used to explore determinants of medical conditions or to classify them. The classification of individuals is a common problem, and traditional statistical classification methods, such as logistic regression (LR), have been extensively used in the medical field to explore determinants when the dependent variable has a dichotomous outcome (14,15). MLR comes from classical, probability-based statistics and is widely applied to such data, but it does not have the potential to solve non-linear problems (16).

Materials and methods

Study design and sampling
A case-control study was designed by non-probability consecutive sampling. This study was conducted at the University Institute of Public Health (UIPH), University of Lahore, Lahore, Pakistan, by incorporating data from the Gilani Ultrasound Center, Lahore, in a period of 18 months after approval of the synopsis. A total of 690 women (345 cases and 345 controls) were included in the analysis.

Inclusion and exclusion criteria
Inclusion criteria for the case group were women between the ages of 20 and 45 years, having any previous parity, and a valid diagnosis of secondary infertility (as per operational definition). Inclusion criteria for controls were women between the ages of 20 and 45 years and having any parity. Exclusion criteria for cases and controls were couples that were separated for 1 year at least, couples with male factor infertility, and infertile women with a history of tuberculosis or any organic lesion (fibroids, etc.).

Ethical considerations
While conducting the study, the ethical guidelines established by the ethics council of the University of Lahore were followed, and the participants' rights were upheld. All participants provided their written and informed consent to participate in the study.

Data collection procedure
After receiving approval for the study protocol from the institutional review board (IRB) of the university, female patients with secondary infertility who fulfilled the inclusion and exclusion criteria were included in the study. The collection of data and all information were kept anonymous. All women underwent a complete physical examination, with measurement of height, weight, and body mass index (BMI). Demographical data such as age, years since marriage, and duration of infertility were collected. Complete medical history and examination were conducted and recorded. Information was recorded on pro forma and analyzed.

Data analysis
All data were recorded and analyzed using the Stata program. In descriptive analysis, for quantitative data, mean ± SD was used, or in case of non-normality of data, median ± IQR was used. The independent sample t-test was applied for normally distributed data, and the Mann-Whitney U-test was applied for data that were not normally distributed. For categorical data, frequency (%) was used, and the chi-square test was used to analyze the significant association between cases and controls and other factors. In inferential statistics, multiple logistic regression was used in addition to ANN. The receiver operating characteristic (ROC) curve and the area under ROC were also calculated. The association was considered significant at a P ≤ 0.05.
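To make the area under the ROC curve concrete, the following minimal Python sketch computes it as the proportion of case-control pairs in which the case receives the higher predicted score (ties count as one half), which is the same pairwise quantity that underlies the Mann-Whitney U statistic. The function name `roc_auc` and the toy labels and scores are hypothetical and are not study data; the study itself used Stata, so this is an illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a case, 0 for a control.
    scores: predicted probability (or any risk score) for each woman.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # tie counts as half
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical toy data (not from the study): three cases, three controls.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to a model that ranks cases and controls no better than chance, and 1.0 to perfect separation.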

Results
The mean ages of the women in the case group and control group were 33.08 ± 4.17 years and 31.37 ± 4.36 years, respectively. The median age was significantly higher in the case group (34.0 ± 6 years) than in the control group (30.0 ± 7 years), with a P < 0.05. Similar to age, other sociodemographic, anthropometric, and medical risk factors were studied in participants of both case and control groups. All significant variables or variables with p <  2).
Advanced age had a positive effect on secondary infertility (β = 0.22), duration of marriage had a negative effect (β = −0.07), working status had a positive effect (β = 1.10), joint family had a positive effect (β = 1.07), and cousin marriage and relationship difficulties with husband also had a positive effect on secondary infertility as their βs were also positive, i.e., 0.  For the logistic regression model, the area under ROC was 0.852 (95% CI: 0.825, 0.880) (Figure 1).
As per the importance chart of the artificial neural network, the highest normalized importance was given to age at marriage (100%), followed by current age (years) (98.5%), history of pelvic inflammatory disease (39.9%), history of abortion (36.4%), menorrhagia (36.0%), cousin marriage (33.9%), history of breastfeeding to child (30.8%), type of family (27.1%), premature delivery (27.0%), violence during previous pregnancy by husband (26.5%), obesity (24.6%), profession (24.0%), relationship difficulties with husband (22.2%), uterine fibroids (21.5%), history of hypertension (17.0%), history of endometriosis (15.7%), history of diabetes (15.1%), history of urinary tract infection (14.0%), history of polycystic ovary syndrome (11.3%), and intermenstrual bleeding (13.0%) (Table 4, Figure 2). For ANN, the area under the curve was 0.872, which was better than logistic regression (Figure 3).
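The β coefficients reported in the Results can be read as odds ratios through the standard logistic regression relation OR = exp(β), with a 95% confidence interval of exp(β ± 1.96 × SE). The short Python sketch below applies this to the four β values quoted in the text. The helper names are hypothetical, and the standard error used for the joint-family example is an assumption, back-calculated from the confidence interval (1.91, 4.44) reported in the abstract, since standard errors are not quoted in the text.

```python
import math


def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def odds_ratio_ci(beta, se, z=1.96):
    """Wald-type confidence interval for the odds ratio (z = 1.96 for 95%)."""
    return math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=1.96):
    """Back-calculate the standard error of beta from a reported OR interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Coefficients as quoted in the Results (rounded to two decimals there).
reported_betas = {
    "current age": 0.22,
    "duration of marriage": -0.07,
    "working status": 1.10,
    "joint family": 1.07,
}
for name, beta in reported_betas.items():
    print(f"{name}: beta = {beta:+.2f} -> OR = {odds_ratio(beta):.2f}")

# Assumption: SE recovered from the CI (1.91, 4.44) given in the abstract.
joint_family_se = se_from_ci(1.91, 4.44)
joint_family_ci = odds_ratio_ci(1.07, joint_family_se)
print(f"joint family 95% CI = ({joint_family_ci[0]:.2f}, {joint_family_ci[1]:.2f})")
```

Because the β values in the text are rounded, the odds ratios computed this way can differ from the published ones in the second decimal place (for example, exp(1.07) is about 2.92 against the reported 2.91). A negative β always gives OR < 1, as for duration of marriage.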

Discussion
The single most significant factor affecting both spontaneous and treatment-related conception is female age. The threshold for advanced reproductive age lacks a universally agreed-upon definition; however, it is generally acknowledged that 35 years marks a significant point in terms of fertility (17). In this study, patients who have secondary infertility had a mean age of 33.08 ± 4.17 years. A study conducted in 2010 by Nosheen et al. (18) reported that the mean age in secondary infertility was 32 years, while Talib et al. (19) reported in 2007 that the mean age of secondary infertility was 29.4 years in women. Factors related to nutrition and lifestyle that impact fertility encompass conditions such as weight imbalances, anemia, and smoking. A report published by the American Society for Reproductive Medicine emphasizes that 12% of infertility instances arise from being either underweight or overweight. Additionally, a history of breastfeeding was associated with a higher likelihood of experiencing secondary infertility (20).
In the current study, ANNs gave better predictive accuracy based on their sensitivity, specificity, and area under the curve (AUC). For the logistic regression model, the area under ROC was 0.852, and for the artificial neural network, the area under ROC was 0.87, which was better than logistic regression. In research comparing machine learning techniques, the most suitable fit point on the logistic regression ROC curve had a sensitivity of 0.688 and a specificity of 0.615, whereas the ANN ROC curve gave a sensitivity of 0.935 and a specificity of 0.873 (21). Another study used logistic regression to predict the probability of preterm birth (PTB) and reported that a few variables significantly contributed to the risk of PTB. They found an elevated probability of PTB for women under the age of 35 years, with an OR of 1.8 and a 95% confidence interval (1, 3); for refugees, with an OR of 1.57 and a 95% confidence interval (1.05, 2.34); for antenatal visits of at least four, with an OR of 2.89 and a 95% confidence interval (1.30, 6.4); for medically induced pregnancies, with an OR of 4.01 and a 95% confidence interval (1.30, 6.4); for a history of previous preterm delivery, with an OR of 5.58 and a 95% confidence interval (3.13, 9.94); for a previous history of stillbirth, with an OR of 4.01 and a 95% confidence interval (1.59, 10.13); and for a previous history of cesarean section, with an OR of 1.78 and a 95% confidence interval (1.00, 3.00) (22).
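As a brief illustration of how sensitivity and specificity at a chosen ROC cut-off can be compared between models, the Python sketch below computes both from confusion-matrix counts and summarizes each model with Youden's J (sensitivity + specificity − 1). Youden's J is used here only as one common way to rank cut-offs; it is an assumption that it matches the "most suitable fit point" of the cited study (21). The confusion-matrix counts are hypothetical, while the two sensitivity/specificity pairs are those quoted above.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of actual cases that the model flags as cases."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of actual controls that the model flags as controls."""
    return true_neg / (true_neg + false_pos)


def youden_j(sens, spec):
    """Youden's J statistic: 0 for a chance-level cut-off, 1 for a perfect one."""
    return sens + spec - 1.0


# Hypothetical counts (not study data) showing the two definitions.
example_sens = sensitivity(true_pos=80, false_neg=20)   # 80 of 100 cases detected
example_spec = specificity(true_neg=85, false_pos=15)   # 85 of 100 controls cleared

# Sensitivity/specificity pairs quoted from reference (21).
j_logistic = youden_j(0.688, 0.615)
j_ann = youden_j(0.935, 0.873)
print(f"logistic regression J = {j_logistic:.3f}, ANN J = {j_ann:.3f}")
```

On these quoted values the ANN cut-off has the higher J, which agrees with the direction of the comparison described in the text.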
A study published in April 2019 examined the factors affecting birth weight by comparing multiple logistic regression analysis and ANN. A total of 223 newborn babies in Istanbul, Turkey, were included in this study. The strategy designed based on these records was assessed using logistic regression and ANN. For the ANN and the logistic regression models, the area under the receiver operating characteristic (AuROC) curve was 0.941 (SD = 0.0012) and 0.909 (SD = 0.019), respectively; although the ANN value was greater than the LR value, that investigation found that the outcomes were relatively similar (23). In the current study, for ANN, the area under the curve was 0.872, which was better than logistic regression.
Another study was conducted to compare logistic regression and ANNs to predict the outcomes in extremely low birth weight neonates. In that study, for both models, the AUC for analysis using the significant variables was greater than the AUC for analysis using the whole data set (p = 0.005). The AUC was mostly influenced by gestational age, birth weight, and the 5-min Apgar score, with similar contributions of the individual variables in both models. Using the significant variables, at 80% sensitivity, the specificity, PPV, and NPV were equal for both models (85% specificity, 72% PPV, and 90% NPV) (24). On the other side, the values of specificity and sensitivity for both groups were not the same.
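The positive and negative predictive values quoted above depend on the prevalence of the outcome as well as on sensitivity and specificity. The Python sketch below shows the standard Bayes relation. The prevalence of 0.325 is an assumption, chosen because it approximately reproduces the quoted 72% PPV and 90% NPV at 80% sensitivity and 85% specificity; it is not a figure reported in the text.

```python
def predictive_values(sens, spec, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and outcome prevalence."""
    true_pos = sens * prevalence
    false_pos = (1.0 - spec) * (1.0 - prevalence)
    true_neg = spec * (1.0 - prevalence)
    false_neg = (1.0 - sens) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as quoted; prevalence is an illustrative assumption.
ppv, npv = predictive_values(sens=0.80, spec=0.85, prevalence=0.325)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With the same sensitivity and specificity, a lower prevalence would lower the PPV and raise the NPV, which is why predictive values from one study population do not transfer directly to another.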
In our study, among participants in the case group, 60.9% of women were living in a combined family system, 64.9% were married to their cousins, and 88.1% had relationship problems with their husbands. These findings suggest that women should be very careful with respect to these factors before and after marriage. These factors emerged as potential risk factors associated with secondary infertility. Meanwhile, gynecological surgeries need to be considered carefully, especially surgeries on ovaries and fallopian tubes. Furthermore, women

Conclusion
This study identified that sociodemographic, anthropometric, family and social support, medical history, birth history, gynecological, and family medical history risk factors are the main causes of secondary infertility in women. Multiple logistic regression analysis was performed to check the specificity, sensitivity, and accuracy of the study. Hence, identified risk factors of secondary infertility are mostly modifiable and can be prevented or treated. By managing these risk factors, we can reduce the risk of secondary infertility.

FIGURE Normalized importance through artificial neural network.
TABLE Variable risk factors with significant differences are taken for logistic regression and artificial neural network.
TABLE Model summary and Hosmer-Lemeshow test.
TABLE Model of multiple logistic regression analysis.
TABLE Normalized importance through artificial neural network.