Using Deep Learning in a Monocentric Study to Characterize Maternal Immune Environment for Predicting Pregnancy Outcomes in the Recurrent Reproductive Failure Patients

Recurrent reproductive failure (RRF), such as recurrent pregnancy loss and repeated implantation failure, is characterized by complex etiologies and particularly associated with diverse maternal factors. It is currently believed that RRF is closely associated with the maternal environment, which is, in turn, affected by complex immune factors. Without the use of automated tools, it is often difficult to assess the interaction and synergistic effects of the various immune factors on the pregnancy outcome. As a result, the application of Artificial Intelligence (A.I.) has been explored in the field of assisted reproductive technology (ART). We first reviewed studies on the use of A.I. to develop prediction models for pregnancy outcomes of patients who underwent ART treatment. Only a limited number of models, based on genetic markers or common indices, have been established for predicting the pregnancy outcome of patients with RRF. We then applied A.I. to analyze the medical information of patients with RRF, including immune indicators. The entire clinical sample set (561 samples) was divided into two sets: 90% of the set was used for training and 10% for testing. Different data panels were established to predict pregnancy outcomes at four gestational stages: biochemical pregnancy, clinical pregnancy, ongoing pregnancy, and live birth. The prediction models of pregnancy outcomes were established using sparse coding, based on six data panels: basic patient characteristics, hormone levels, autoantibodies, peripheral immunology, endometrial immunology, and embryo parameters. The six data panels covered 64 variables. For biochemical pregnancy prediction, the area under the curve (AUC) using the endometrial immunology panel was the largest (AUC = 0.766, accuracy: 73.0%). The AUC using the autoantibodies panel was the largest in predicting clinical pregnancy (AUC = 0.688, accuracy: 78.4%), ongoing pregnancy (AUC = 0.802, accuracy: 75.0%), and live birth (AUC = 0.909, accuracy: 89.7%). Combining the data panels did not significantly improve the prediction of any of the four pregnancy outcomes. These results provide new insight into reproductive immunology and establish a basis for assisting clinicians in planning more precise and personalized diagnosis and treatment for patients with RRF.


INTRODUCTION
Pregnancy is a complex biological process that poses a great challenge to the maternal immune system. The unique immunology of the maternal-fetal interface has been recognized since the "fetal allograft" concept was first described by Sir Peter Brian Medawar in the early 1950s (1). Correct and precise crosstalk between fetus and mother is an important basis for the apposition, adhesion, implantation, and growth of the embryo in the uterus (2). Abnormal frequencies and functions of maternal immune cells are associated with reproductive failure, especially in cases of recurrent reproductive failure (RRF), such as recurrent pregnancy loss (RPL) and repeated implantation failure (RIF) (3).
In the conventional medical procedure, the patients with RRF are assessed and given a score, based on biomarkers that have been demonstrated to be of relevance to the disease. The treatment for the patients is based on the classification or scores (4,5). However, the etiologies of RRF are highly heterogeneous and the complex underlying interactions between the biomarkers make the creation of a personalized treatment strategy based on all known parameters impossible for the clinicians. Therefore, the design of a model that can accurately predict the outcome of treatment methods would be highly beneficial to the clinicians, enabling the choice of lower-risk treatments, thus alleviating the financial burden of the treatment cost and reducing treatment time.
In the field of assisted reproductive technology (ART), predictive models have been applied as decision aids to embryo, egg, sperm selection, and pregnancy outcome prediction, and for the intrinsic evaluation of various factors related to clinical outcomes (6,7). Presently, the validity of the applied models has been demonstrated by analyzing the correlation between factors and the treatment outcome or etiology (8). However, the varying degrees of accuracy and limitations of the applied models have inhibited their use in the routine implementation of in vitro fertilization (IVF) procedures (6). To address this problem, more complex Artificial Intelligence (A.I.) systems, such as artificial neural networks (ANNs), have been introduced in ART fields (9,10). A.I. systems are advantageous due to their significant information processing properties in terms of nonlinearity, high levels of parallelism, noise and fault tolerance, as well as learning, generalization, and adaptive capabilities (11). Nevertheless, few studies that focus on the pregnancy outcome prediction in patients with RRF exist.
Sparse coding is a common machine learning technique used to extract features from raw data. The core of sparse coding involves establishing a sparse representation of the raw data as a linear combination of basic elements called "atoms," which collectively form a library known as a "dictionary." The advantages of using sparse coding include: (1) training a learning model on a relatively small number of features from the raw data, which, in turn, lowers the computational cost during model training; (2) increased interpretability of the learning results, as critical features can be identified efficiently from the dictionary (12). It has been demonstrated that sparse coding can be applied in genome-wide association studies, neuroimaging, and oncology for object detection and classification tasks (13)(14)(15)(16). Sparse coding techniques have not been comprehensively studied in the area of reproductive medicine, and their application in immunological profile analysis of patients with RRF remains to be explored.
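The sparse representation described above is commonly written as an optimization problem. The form below is the standard textbook formulation, given as a sketch and not necessarily the exact objective used in this study. The symbols are notational assumptions: X is the data matrix, W_d is the dictionary whose columns w_j are the atoms, Z holds the sparse codes, and \lambda weights the sparsity penalty.

```latex
\min_{W_d,\,Z}\ \tfrac{1}{2}\,\lVert X - W_d Z \rVert_F^2 \;+\; \lambda\,\lVert Z \rVert_1
\quad \text{subject to} \quad \lVert w_j \rVert_2 = 1 \ \ \text{for every atom } w_j .
```

The first term rewards accurate reconstruction of the data from the atoms, and the \ell_1 term drives most entries of Z to zero, so that each sample is described by only a few atoms.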
The remainder of this article is organized as follows. A literature review in the field of A.I. and reproductive medicine is presented in the next section, followed by the methodology and demonstration of sparse coding application to analyze multidimensional clinical data of patients with RRF and predict their pregnancy outcomes. The results on the performance of the model are then introduced and finally, the concluding remarks are presented in the discussion.

Artificial Intelligence in Predicting Pregnancy Outcomes of Patients With Infertility
Machine learning is a subset of A.I. that enables computer algorithms to model the relationship between a set of observable data (input data) and another set of variables (output data) (17). It provides the ability to interpret and understand data and to develop predictive models based on experience. Machine learning methods include ANN, Support Vector Machines (SVM), C4.5, Classification and Regression Tree (CART), Random Forest (RF), K-Nearest-Neighbor (KNN), and so on. ANNs and SVMs are widely used in biomedical problems analysis. Machine learning methods can provide more options and richer task information for problem solving. At the same time, machine learning methods are gaining popularity in clinical decision-making (18)(19)(20).
The concept of a neural network is derived from the structure and function of biological neural networks. In particular, ANNs propose a system with stacked layers of interconnected processors, or nodes, that can form increasingly complex features in each successive layer (21). Raw information is supplied into the input layer and passes through the hidden layers by a weighted connection system. Finally, the output values of the transformed features are generated in the output layer, to predict the outcome. In a clinical setting, the input layer can represent medical data, the output layer can represent prognostic subclasses, and multiple hidden layers can represent feature detectors, used to capture higher-order correlations. The SVM algorithm classifies the input data by calculating support vectors that construct hyperplanes in a higher-dimensional space, where the features are separable. C4.5, CART, and RF are three decision-tree-based methods with non-parametric characteristics that map characteristics to outcomes using a partitioning procedure that recursively divides the source set of each node or branch point into disjoint subsets based on the value of a particular characteristic. KNN is an instance-based learning method that assigns classes to the data, based on nearest-neighbor decision rules.
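The layered flow of information described above can be illustrated with a minimal forward pass through one hidden layer. This is a sketch under stated assumptions and not a model from any of the cited studies: the function names, the tanh hidden activation, the softmax output, and the toy weights W1, b1, W2, and b2 are all hypothetical.

```python
import math

def dense(x, weights, biases):
    """Weighted connections: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def softmax(values):
    """Turn raw output scores into class probabilities."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, W1, b1, W2, b2):
    """Input layer -> hidden layer (tanh) -> output layer (softmax)."""
    hidden = [math.tanh(h) for h in dense(x, W1, b1)]
    return softmax(dense(hidden, W2, b2))

# Hypothetical example: 3 input variables, 2 hidden nodes, 2 outcome classes.
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [-1.0, 1.0]]
b2 = [0.0, 0.0]
probabilities = forward([0.2, 0.7, 1.5], W1, b1, W2, b2)
```

Here `probabilities` holds one value per outcome class; in a clinical setting the inputs would be patient variables and the outputs the predicted prognostic subclasses.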
On the basis of big data training iterations, dimensionality reduction can be applied to a large number of influential factors by using various machine learning methods for modeling and prediction. Simultaneously, relevant attributes with high influence can be extracted, and a prediction model with relatively high accuracy can be obtained. Hassan et al. (21) evaluated the predictive ability of five different machine learning models, namely, Multilayer Perceptron (MLP), SVM, C4.5, CART, and RF, on the success rate of IVF pregnancies. A hill-climbing feature (attribute) selection algorithm, combined with automatic classification using a machine learning technique, was used to reduce the number of most influential attributes to 19 for MLP, 16 for RF, 17 for SVMs, 12 for C4.5, and 8 for CART, in order to analyze and predict IVF pregnancies in a more accurate manner. The most influential factors were summarized as: age, fertility factor index, basal antral follicle count, number of mature eggs, sperm collection method, gametes, in vitro fertilization rate, 14-day follicle count, and embryo transfer date. Vogiatzi et al. (8) validated the efficiency of an ANN based on parameters correlated with live birth as a comprehensive tool for predicting clinical outcomes in patients undergoing ART. The ANN was constructed using 12 statistically significant parameters from the initial integration, with a cumulative sensitivity of 76.7% and a specificity of 73.4%. The standard deviation of the performance metrics evaluated between the training and the testing sets was low in the validation process, pointing to the stability of the constructed ANN. The constructed ANN, based on statistically significant live birth outcome variables, is a stable and efficient system with increased performance metrics. The validation of the system led to the recognition of its clinical value as a medical decision aid and provided a reliable method for the routine practice of IVF units in a user-friendly environment. Elson et al. (22) developed a decision tree based on a combination of clinical, morphological, and biochemical parameters predicting successful pregnancy outcomes that assisted the expectant management of women with tubal ectopic pregnancies. Significant differences were detected in maternal age, initial serum β-hCG, and progesterone between women who required surgery and those who recovered on their own. Decision tree analysis can thus serve as a guide for estimating the probability of a successful outcome for an individual patient.
Machine learning methods have, in general, high prediction accuracy; however, the final model prediction accuracy can vary.
Commonly used classifiers include SVMs, recursive partitioning, RF, adaptive augmentation, and KNN. Hafiz et al. (23) used data mining techniques to predict the implantation outcome of IVF and intracytoplasmic sperm injection (ICSI), which were found to be superior to other comparable methods using RF and recursive partitioning, with the corresponding area under the ROC curve (AUC) values reaching 84.23 and 82.05%, respectively. Ghaeini et al. (24) proposed an ICSI outcome prediction model based on decision trees and SVMs. The input variables of the model included parameters such as the medical history of the couple, hormone testing, and cause of infertility. The output variable was the occurrence of a clinical pregnancy. The accuracies of the decision tree method and SVMs were 70.3 and 75.7%, respectively. The performance of the SVM method was superior to the performance of the decision tree method.

Artificial Intelligence in Predicting Pregnancy Outcomes of Patients With RRF
Currently, research on predictive models for pregnancy outcomes in patients with RRF is limited and has mainly focused on classifying patients for better clinical management, ignoring the effects of relevant immune factors on pregnancy outcomes of the patients. Bruno et al. (25) used machine learning to stratify patients with RPL into different risk categories, validated their appropriate prognosis and potential treatments through diagnostic workup, provided a decision-support system tool to stratify RPL patients, and objectively addressed their appropriate clinical management. Immune factors were not accounted for and pregnancy outcomes were not predicted. Li et al. (26) suggested that RPL may be related to abnormally elevated numbers of uterine natural killer (uNK) cells. They pointed out the difficulty of counting uNK and stromal cells in histochemical sections, because of the close morphological proximity of stromal cells to epithelial cells. This paper was the first to report on the ability to distinguish between different cell morphologies and accurately count them using image recognition techniques. Researchers can greatly benefit from this method in analyzing immunohistochemical images. Nevertheless, its application is limited, as it cannot provide predictions based on cell counts alone. Mora-Sanchez et al. (27) concluded that the degree of allelic sharing of human leukocyte antigen (HLA) genes is related to RPL, combining immunogenetics with A.I. to create a personalized tool to elucidate the genetic causes of unexplained infertility and a gamete matching platform that could improve pregnancy success.
The representative recent literature on the development of predictive models for pregnancy outcomes in infertile patients and patients with RRF is summarized in Table 1. Notably, only a limited number of models have been established for predicting the pregnancy outcome of patients with RRF. The published models were based only on genetic markers or common indices. To investigate the impact of immune factors on pregnancy outcomes in patients with RRF, we applied A.I. to analyze the medical information of patients with RRF, including immune indices, for pregnancy outcome prediction.

Model Training and Performance
The initialization of the dictionary matrix (Wd) was carried out using uniformly distributed numbers. Each column of Wd was normalized to a magnitude of 1. The processed and normalized data set (X) and dictionary matrix were used as input into the Iterative Shrinkage and Thresholding-based Algorithm with coordinate descent to obtain the sparse representation (Z) (34).
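The initialization and sparse-representation steps above can be sketched for a single sample as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses plain ISTA (a proximal gradient step followed by soft thresholding) and not the coordinate-descent variant cited above, and the function names, the uniform range [-1, 1], the sparsity weight `lam`, and the iteration count are all hypothetical.

```python
import math
import random

def init_dictionary(n_features, n_atoms, seed=0):
    """Uniformly distributed entries, then each column scaled to magnitude 1."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n_atoms)]
         for _ in range(n_features)]
    for j in range(n_atoms):
        norm = math.sqrt(sum(W[i][j] ** 2 for i in range(n_features)))
        for i in range(n_features):
            W[i][j] /= norm
    return W

def soft_threshold(value, threshold):
    """Shrink a value toward zero; values within the threshold become zero."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)

def objective(x, W, z, lam):
    """0.5 * ||x - W z||^2 + lam * ||z||_1 for one sample x."""
    recon = [sum(w * zj for w, zj in zip(row, z)) for row in W]
    squared_error = sum((xi - ri) ** 2 for xi, ri in zip(x, recon))
    return 0.5 * squared_error + lam * sum(abs(zj) for zj in z)

def ista(x, W, lam=0.1, n_iter=200):
    """Sparse code z for sample x given dictionary W (rows = features)."""
    n_features, n_atoms = len(W), len(W[0])
    # Squared Frobenius norm: a safe upper bound on the Lipschitz constant.
    lipschitz = sum(w * w for row in W for w in row)
    z = [0.0] * n_atoms
    for _ in range(n_iter):
        residual = [sum(W[i][j] * z[j] for j in range(n_atoms)) - x[i]
                    for i in range(n_features)]
        gradient = [sum(W[i][j] * residual[i] for i in range(n_features))
                    for j in range(n_atoms)]
        z = [soft_threshold(z[j] - gradient[j] / lipschitz, lam / lipschitz)
             for j in range(n_atoms)]
    return z

# Hypothetical sample with 4 normalized variables and a 6-atom dictionary.
W_d = init_dictionary(n_features=4, n_atoms=6, seed=1)
x = [1.0, -0.5, 0.25, 2.0]
z = ista(x, W_d, lam=0.1, n_iter=300)
```

A larger `lam` yields a sparser code; once `lam` exceeds the largest correlation between the sample and any atom, every entry of the code stays at zero.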
Tanh and ReLU were selected as the activation functions of the hidden layers for the sparse representation. Softmax was used in the output layer for linear classification to obtain the prediction results. The cost of prediction was calculated using the sum of least squares between the prediction result and the true label. The Wd matrix was updated through backpropagation. Forward- and back-propagation were repeated until the optimal dictionary matrix was obtained (i.e., lowest cost). The data set was divided into training data set (90%) and testing data set (10%) for each data panel. The performance of the model was evaluated on the testing data set. The evaluation metrics included receiver operating characteristic (ROC) curves, accuracy, sensitivity, and specificity.
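The data split and the threshold-based evaluation metrics mentioned above can be sketched as follows. This is an illustration under stated assumptions: the random shuffling, the seed, the function names, and the toy labels are hypothetical, and the source does not state how samples were assigned to the two sets.

```python
import random

def split_90_10(samples, seed=0):
    """Shuffle the samples, then hold out 10% for testing."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = max(1, round(0.1 * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of true positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of true negatives that were detected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }

# Hypothetical usage with 100 sample indices and toy predictions.
train_ids, test_ids = split_90_10(range(100))
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1],
                         [1, 1, 0, 0, 0, 1, 0, 1])
```

With small testing sets, each misclassified sample shifts these metrics by several percentage points, which is worth keeping in mind when comparing panels.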

Clinical Characteristics of Samples
Following the literature review and in combination with the expertise of clinicians, six panels with 64 variables were considered as input variables. Three immune-related data panels, including the autoantibodies, peripheral immunology, and endometrial immunology panels were considered. Other IVF-related data panels contributing to the pregnancy outcome, including basic characteristics, hormones, and embryo panels were also considered. The clinical characteristics used in this study along with their respective description explaining their physical implications, type of the values, and their range in the collected data set are listed in Table 2. The mean age at the time of conception was 34.67 years and significantly different between the live birth group and non-live birth group. The average body mass index (BMI) was 21.69 kg/m², with no statistically significant differences detected (21.55 vs. 21.79). Statistically significant differences were detected in some of the 64 variables between the groups of patients who did and did not achieve a live birth (Table 2).

Model Performance on Immunological Data Panels
We tested the sparse coding model using various data panels including autoantibodies (Figure 1), peripheral immunology (Figure 2), endometrial immunology (Figure 3), and the combination of all three immunological data panels (Figure 4). A summary of prediction accuracy using various data panels is shown in

Performance of Model on Combined Data Panels
Additionally, the sparse coding model was tested using both IVF-related data panels and immunological data panels. The AUC during the training phase was 1.0 for the prediction of all four pregnancy outcomes, using combined data panels. The AUC during the testing phase ranged from 0.661 to 0.793, following a similar lower trend, as in the case of immunological data panels (Figure 5). The use of combined data panels did not significantly improve the prediction of any of the four pregnancy outcomes.
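The training and testing AUC values compared above can each be computed directly from predicted scores. The sketch below uses the rank-based definition of AUC (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half). The function name and the toy scores are hypothetical, and this is not necessarily how the values in this study were computed.

```python
def auc(labels, scores):
    """Rank-based AUC for 0/1 labels and real-valued prediction scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(positives) * len(negatives))

# Hypothetical scores: perfectly separated on one set, overlapping on another.
separated_auc = auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
overlapping_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Evaluating the same function on the training and testing predictions of one model makes the gap between the two phases explicit.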

DISCUSSION
A machine learning model was developed in this study for the prediction of the pregnancy outcomes of patients with RRF at four gestational periods, namely, biochemical pregnancy, clinical pregnancy, ongoing pregnancy, and live birth. The accuracy of the models for each stage in the testing data set ranged from 54.2 to 89.7%. We observed that the performance of the endometrial immunology panel in biochemical pregnancy prediction was superior to that of the autoantibodies and peripheral immunology panels. Consistent with this result, it has been reported that implantation failure in ART is thought to be mainly due to impaired endometrial receptivity (35). In addition, implantation failure and miscarriage occurrence have been reported to have different mechanisms (36). Antiphospholipid syndrome (APS), which is characterized by the presence of anti-cardiolipin autoantibodies (ACAs), is the most common autoimmune disease associated with RPL (37). However, the association between ACAs and RIF is somewhat controversial (38). Anti-thyroid autoantibodies (ATA) have also been demonstrated to correlate with RPL, while the association between ATA and RIF remains unclear (39)(40)(41). Our results showed that the AUC of the model using the autoantibodies panel was the highest in predicting clinical pregnancy, ongoing pregnancy, and live birth, while its AUC for biochemical pregnancy prediction was the lowest. Consequently, we conclude that different models are appropriate for pregnancy outcome prediction at different pregnancy periods.
Machine learning algorithms have been widely used in many complex scenarios, such as image analysis, diagnosis, classification, and prognosis (42). Multiple machine learning techniques have been applied to improve the success rate of ART. The A.I. application in reproductive medicine has focused mainly on oocytes evaluation and selection (43), sperm analysis and selection (44), and embryo selection (45). A few studies have attempted to establish models for IVF outcome prediction (23,46). Typical machine learning techniques such as Deep Artificial Neural Network (DANN) and Convolutional Neural Network (CNN) can be used to handle the high dimensionality data features, but very often these models are hard to interpret due to the "black-box" situation (47), which is usually not favored in biomedical applications. We adopted sparse coding which helps in the creation of an overcomplete information space composed of atom features with high dimensionality, which are critical to our model classifications. Simultaneously, the sparse representation of atoms can highlight the important features of patients using only a few atoms. It also enables us to visualize the features and interpret the classification or prediction results. To our knowledge, this is the first sparse coding-based prediction model based on reproductive big data including basic patient characteristics, hormone levels, immune status, and embryo parameter information for patients with RRF. This model represents an attempt at combining the reproductive immunology parameters with a machine learning algorithm.
FIGURE 4 | Combination of immunology-related panels (autoantibodies, peripheral immunology, and endometrial immunology) performance of sparse coding in predicting pregnancy outcomes at different pregnancy periods. (A) ROC plot of the training data set. (B) ROC plot of the testing data set.
In conventional clinical practice, clinicians can only provide the successful pregnancy probability to the patient according to the mean success rate of the fertility center. In addition to predicting pregnancy outcomes, clinicians are also concerned about developing effective treatment strategies based on the medical data of the patient. The models in the majority of the previous studies provided the live birth probability to the clinicians. Given the variation in the probability of success, the clinicians were unable to know how close the status of the patient is related to a successful pregnancy. The clinicians usually plan the treatment strategy according to the medical data of the patient and their experience on the underlying connection between different parameters. Interpreting the underlying relationships between medical data may influence the decision of the clinicians concerning treatment strategies. Future studies can include a thorough analysis of the immune status of the patients, by comparing the atoms which contribute to a successful pregnancy, generated in the sparse representation and assist clinicians to develop more personalized treatment strategies based on the comparison result.
Several limitations of this study need to be considered. First, the entire data set of this study was derived from a single reproductive immunology center. Second, other factors that potentially affect pregnancy outcomes, such as lifestyle (e.g., smoking history) and family genetic history, were not taken into consideration in our study. Finally, the performance of our model is related to the quantity and quality of the data. Therefore, the model presented here needs further study with more multi-center clinical data before full implementation in a clinical setting. Moreover, current A.I. is mainly used as a support system to improve the accuracy and efficacy of the clinicians, rather than as a stand-alone decision-making system. The clinicians should collaborate with algorithm engineers to continually optimize the model as it is applied in clinical work.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Shenzhen Zhongshan Urology Hospital, Shenzhen, China. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
WT, YZe, and WL were involved in the study design. CH, ZX, and WL were involved in the organization of the entire project, data analysis with a clinical perspective, and manuscript writing. DT, CY, and LW were involved in the establishment of the algorithm. YZh, YL, SY, and LD were involved in collecting and preparing the clinical data. ZL was mainly involved in the literature review about A.I. and reproductive medicine. All authors contributed to the article and approved the submitted version.