

Front. Aging Neurosci., 23 October 2023
Sec. Neurocognitive Aging and Behavior
Volume 15 - 2023

Predicting mild cognitive impairment among Chinese older adults: a longitudinal study based on long short-term memory networks and machine learning

Yucheng Huang1 Zishuo Huang1,2 Qingren Yang1,2 Haojie Jin1 Tingke Xu1 Yating Fu1 Yue Zhu1 Xiangyang Zhang3* Chun Chen1,4*
  • 1School of Public Health and Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
  • 2School of Innovation and Entrepreneurship, Wenzhou Medical University, Wenzhou, Zhejiang, China
  • 3The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China
  • 4Center for Healthy China Research, Wenzhou Medical University, Wenzhou, Zhejiang, China

Background: Mild cognitive impairment (MCI) is a transitional and potentially reversible stage preceding dementia. A systematic, scientific, and population-wide early screening system for MCI is lacking. This study aimed to construct prediction models using longitudinal data to identify potential MCI patients and to explore the critical features of MCI among Chinese older adults.

Methods: A total of 2,128 participants were selected from Waves 5–8 of the Chinese Longitudinal Healthy Longevity Survey. Cognitive function was measured using the Chinese version of the Mini-Mental State Examination. Long short-term memory (LSTM) networks and three machine learning techniques, using 8 sociodemographic features and 12 health behavior and health status features, were used to predict individual risk of MCI in the next year. The performance of the prediction models was evaluated through receiver operating characteristic curves and decision curve analysis. The importance of predictors in the prediction models was explored using the SHapley Additive exPlanations (SHAP) model.

Results: The area under the curve values of the three models were around 0.90, and decision curve analysis indicated that the net benefits of XGBoost and Random Forest were similar when the threshold was lower than 0.8. SHAP models showed that age, education, respiratory disease, gastrointestinal ulcer, and self-rated health were the five most important predictors of MCI.

Conclusion: This MCI screening method, combining LSTM and machine learning, successfully predicted the risk of MCI using longitudinal datasets and could enable health care providers to implement early interventions that delay progression from MCI to dementia, ultimately reducing the incidence and treatment cost of dementia.

1. Introduction

With an increasing older adult population worldwide, geriatric health concerns cannot be ignored. Aging results in declining physical and cognitive functions, leading to a high risk of disability and death (Klimova et al., 2017). Distinguishing pathological cognitive decline, generally referred to as dementia or cognitive impairment, from normal cognitive decline remains challenging. As an inevitable human phenomenon, aging is a significant factor in deteriorating cognitive function. With a global increase in life expectancy, older adults have an increased likelihood of developing dementia and cognitive impairment. The World Health Organization (WHO) stated that >55 million older adults had a diagnosis of dementia in 2021, with >139 million older adults estimated to be diagnosed with dementia in 2050 worldwide. In 2019, the annual cost of dementia-related treatment exceeded US $1.3 trillion (World Health Organization, 2021). China has the largest population of people with dementia, accounting for 25% of the global total. Aggregate expenditure on dementia in China reached US $195 billion in 2019 (Jia et al., 2020b; Mattap et al., 2022).

With no reversal therapies available, prevention of dementia remains a priority. Mild cognitive impairment (MCI), a risk factor for dementia, is considered a transitional stage between normal cognitive function and dementia, where there is objective cognitive decline but with a capacity to live independently. However, approximately 10–20% of older adults aged ≥65 years with MCI are diagnosed with dementia after 1 year (Langa and Levine, 2014). Delaying the progression of MCI to dementia is currently the most effective approach, as diverse treatments for MCI have proven to be effective and less costly (Langa and Levine, 2014; Anderson, 2019; Huang et al., 2022), with early identification and intervention in high-risk groups shown to prevent dementia onset in 40% of such cases.

Currently, screening techniques and questionnaires for MCI are limited. Because neurodegenerative disease begins to develop many years before symptoms are observed, imaging examinations and fluid biomarkers detect neurodegenerative and pathological changes most accurately when MCI screening is applied to a population with normal cognitive function. Imaging techniques, such as magnetic resonance imaging (MRI), positron emission computed tomography (PET), and single photon emission computed tomography (SPECT), can show small changes in brain structure, blood flow, metabolism, and neurotransmitters in patients with MCI. Nevertheless, because these techniques are scarce and inaccessible to the general public, they cannot be used as a common screening tool for MCI (Dunne et al., 2021). MCI questionnaires [Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA)] also have limited coverage, as they generally require a significant investment in personnel and training. Therefore, an effective, systematic, and convenient MCI screening method to identify high-risk older adults in the general population is urgently needed. Effective screening could be conducive to targeted interventions for those at high risk of MCI. One study reported significant changes through implementing appropriate early intervention for potential patients in England, namely, an 8.5% decrease in the incidence of dementia and a reduction in dementia-related expenditure of approximately $180 million (Mukadam et al., 2020). Owing to the irreversible nature of dementia, treatment for patients with dementia places considerable financial and psychological pressure on families and caregivers (Chiao et al., 2015). Given the significant negative effects of dementia, it is critical to identify high-risk individuals at an early stage.

Some studies have adopted multiple perspectives to identify risk factors in people with MCI. A national cross-sectional study in China that comprised 46,011 older adults showed that MCI was associated with sociodemographic characteristics, including age, sex, parental history, education level, residence, and marital status (Jia et al., 2020a). Several cohort studies have reported causal relationships between health status, health behaviors, and MCI. Chronic diseases, such as hypertension, stroke, and diabetes, as well as harmful lifestyle behaviors, such as smoking and alcohol consumption, significantly increase the risk of MCI, while regular physical exercise, tea/coffee consumption, and playing Mahjong can prevent cognitive impairment (Kivipelto et al., 2018; Kakutani et al., 2019; Zhang et al., 2020, 2022). Because collinearity among predictors limits conventional regression methods, some studies have applied machine learning based on imaging data or biomarkers to further determine whether an individual has MCI and to explore key features of MCI (Mirzaei et al., 2016; Wang et al., 2022; Alamro et al., 2023). However, most machine learning studies have used only single-wave data. Because neurodegenerative disorders have a natural history of progression, such studies ignore the dynamic and longitudinal nature of these diseases, which limits early identification and intervention.

Consequently, to address these deficiencies in previous studies, we used long short-term memory networks (LSTMs) in this study to capture the interdependence of predictors in longitudinal data. In combination with machine learning, it is possible to generate a model that can forecast the likelihood of conversion to MCI after several years. This model facilitates convenient and efficient screening for MCI and identification of risk groups for targeted intervention procedures. LSTMs are a form of recurrent neural network that address long-term dependencies and gaps between significant events in sequential data. Compared with traditional time series analysis such as the Autoregressive Integrated Moving Average (ARIMA) model, LSTMs generally produce better results on nonlinear and volatile time series data (Lou et al., 2022; Liu X. D. et al., 2023), although they are harder to interpret and take longer to train. LSTMs were originally introduced into medically relevant applications to forecast the incidence and prevalence of diseases, with considerable success during the COVID-19 pandemic (Borges and Nascimento, 2022; Gautam, 2022; Liu X. D. et al., 2023). In addition, several studies have shown the feasibility of combining LSTMs forecasts of individual characteristics with machine learning techniques to predict depression in older adults from longitudinal sequence data (Su et al., 2020; Lin et al., 2022).

No previous studies have used multiple waves of sequence data to predict potential MCI in older Chinese adults. LSTMs can effectively capture the temporal dependencies and trends of individual characteristics in longitudinal data from multiple waves, and machine learning can extract important variables with significant trends related to MCI. This study therefore hypothesizes that the combination of LSTMs and machine learning can identify older adults at high risk of MCI and guide the implementation of early interventions to prevent dementia.

2. Materials and methods

2.1. Data source and samples

The data used in this study were Waves 5–8 (2008, 2011, 2014, 2018) of the Chinese Longitudinal Healthy Longevity Survey (CLHLS), a secondary data series collected by the Center for Healthy Aging and Development and the China Mainland Information Group, Peking University, since 1998 (Center for Healthy Aging and Development Studies. The Chinese Longitudinal Healthy Longevity Survey (CLHLS)-Longitudinal Data, 1998–2018). Respondents in the CLHLS among the selected waves were randomly sampled from approximately half of the counties and city districts of China’s 23 mainland provinces. The CLHLS questionnaire includes a wide range of instruments, such as the Mini-Mental State Examination (MMSE), the Center for Epidemiologic Studies Depression Scale, and the Self-Rating Anxiety Scale. Previous studies have confirmed that the design of questionnaire and quality of datasets are excellent (Gu, 2008; Zeng, 2012).

The Wave 5 questionnaire of the CLHLS was used to obtain baseline characteristics of the older adults; 2,334 home-based interviewees responded continuously until Wave 8. After excluding respondents lacking answers or records for the cognition measurement (the MMSE questionnaire in this study) and respondents who had been diagnosed with dementia in Waves 5–7 based on their MMSE scores, 2,128 eligible participants were included in the final data preprocessing and statistical analysis.

2.2. Assessment of MCI and outcome variables

The MMSE has been widely applied to screen for cognitive dysfunction among older adults. In the CLHLS questionnaires, the MMSE was modified into a Chinese version, including 24 items within six dimensions: five items for orientation (five points in total), one for naming (seven points in total, one point for naming each kind of food), three for registration (three points in total), five for attention and calculation (five points in total), three for recall (three points in total), and seven for language (seven points in total). The final cognitive function score was the sum of the scores of the six dimensions, with a possible total of 30 points.

In this study, because of the age distribution of participants (age range, 70–80 years, 31.72%; age ≥ 80 years, 68.28%), MCI was defined as an MMSE score < 18 (patients with MCI = 1; normal participants = 0) (An and Liu, 2016; Gao et al., 2017).
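The scoring rule and outcome definition described in this section can be sketched in a few lines of Python. This is an illustrative sketch only: the dictionary keys and function names are hypothetical and are not taken from the CLHLS data files, and the point limits follow the six dimensions listed above.

```python
# Maximum points per MMSE dimension, as listed in Section 2.2.
# The dimension names used as keys are hypothetical labels.
MMSE_MAX_POINTS = {
    "orientation": 5,
    "naming": 7,
    "registration": 3,
    "attention_calculation": 5,
    "recall": 3,
    "language": 7,
}

MCI_CUTOFF = 18  # MCI is defined as a total MMSE score < 18 in this study


def mmse_total(scores):
    """Sum the six dimension scores after checking each is within range."""
    total = 0
    for dimension, max_points in MMSE_MAX_POINTS.items():
        value = scores[dimension]
        if not 0 <= value <= max_points:
            raise ValueError(f"{dimension} score {value} outside 0-{max_points}")
        total += value
    return total


def mci_label(scores):
    """Return 1 for MCI (total < 18) and 0 for normal cognition."""
    return 1 if mmse_total(scores) < MCI_CUTOFF else 0
```

The per-dimension maxima sum to 30, which matches the possible total stated above.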

2.3. Predictors

We considered three groups of individual characteristics from Waves 5–8 to fit the LSTMs and machine learning models (Supplementary Table 1): (i) sociodemographic characteristics, such as age, sex, geographical area, education level, marital status, residence, income level, and living status; (ii) health behavior factors, including active smoking, alcohol consumption, exercise, self-rated health [SRH], and sleep quality; and (iii) health status factors, such as a history of hypertension, diabetes, cardiopathy, stroke, chronic respiratory disease, cancer, or gastrointestinal ulcer.

2.4. Processing of missing values

To reduce the probability of bias during the imputation procedure, variables with >20% missing information were excluded to guarantee good performance (Jakobsen et al., 2017). The final predictors from CLHLS Waves 5–7 were imputed using the MICE package in R studio 4.2.3, applying the multivariate iterative random forest (“RF” method) imputation algorithm with five iterations to produce datasets with the least variance relative to the datasets before imputation.
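The exclusion rule can be illustrated with a short Python sketch, assuming the 20% threshold refers to the share of missing values per variable and that missing entries are represented as `None`. The function names and the toy data are hypothetical; the imputation itself was done with MICE in R and is not reproduced here.

```python
def missing_fraction(values):
    """Fraction of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)


def drop_sparse_variables(columns, max_missing=0.20):
    """Keep only variables whose missing fraction does not exceed max_missing."""
    return {
        name: values
        for name, values in columns.items()
        if missing_fraction(values) <= max_missing
    }


# Hypothetical toy data: "sleep_quality" is 20% missing, "income" is 40% missing.
toy_columns = {
    "sleep_quality": [3, None, 2, 4, 3],
    "income": [None, None, 1, 2, 1],
}
kept = drop_sparse_variables(toy_columns)
```

With this toy input only `sleep_quality` is retained, because `income` exceeds the 20% limit.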

2.5. Statistical analysis

Statistical analyses were performed in Python (version 3.9) using the Keras package (version 2.6.0) for deep learning and the Scikit-Learn package (version 1.1.2) for machine learning. We randomly partitioned the data into three disjoint sets: training, testing, and validation, with proportions of 60, 20, and 20%, respectively. Details of the hyperparameters of the LSTMs and the parameters of the three machine learning models are listed in Supplementary Tables 2, 3.
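A random 60/20/20 partition into disjoint sets might look like the following minimal sketch. It uses only the Python standard library rather than the Scikit-Learn utilities the analysis may have relied on, and the function name, seed, and rounding rule are assumptions made for illustration.

```python
import random


def three_way_split(n_samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle indices 0..n-1 and cut them into three disjoint index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_first = round(fractions[0] * n_samples)
    n_second = round(fractions[1] * n_samples)
    first = indices[:n_first]
    second = indices[n_first:n_first + n_second]
    third = indices[n_first + n_second:]  # whatever remains after the first two cuts
    return first, second, third


# 2,128 participants, split into training, testing, and validation indices.
train_idx, test_idx, val_idx = three_way_split(2128)
```

Splitting index lists rather than the data itself keeps the three sets disjoint by construction and lets the same partition be reused for every model.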

2.5.1. The multivariate LSTMs models

Machine learning techniques are generally applied to panel data from a cross-sectional perspective, but are not able to capture time-sensitive features. To forecast the development of predictors and explore potential outcomes, recurrent neural networks (RNNs) are used to capture the inputs of predictors from specific time periods and transfer information to subsequent time periods by combining the interdependence among predictors. However, traditional RNNs cannot cope with vanishing and exploding gradients when learning long-term dependencies owing to their simple neuron structure, whereas LSTMs can successfully handle these disadvantages through the use of a “forget gate” and the sigmoid function in each LSTMs unit. The LSTMs model has been validated as a powerful and precise model for forecasting time-series data in longitudinal studies. As shown in Figure 1, time-sensitive predictors in CLHLS Waves 5–6 were randomly split such that 70% of the samples were used to train the LSTMs model to forecast the values of the predictors in Wave 7, and the remaining 30% of the samples were used to test our LSTMs model. The model was then fitted to CLHLS Waves 6–7 to forecast predictors in Wave 8; these forecasts were combined with invariable features such as age, sex, education level, and geographical area, which did not need to be predicted over time, to constitute a new dataset.
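To make the role of the forget gate and the sigmoid function concrete, the sketch below implements one step of a textbook LSTM cell for a single scalar input. It is not the Keras model fitted in this study: the weights are hypothetical values chosen only for illustration, and a real model operates on vectors of predictors with learned weight matrices.

```python
import math


def sigmoid(z):
    """Logistic function; squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps gate name -> (w_x, w_h, bias)."""
    def pre(gate):
        w_x, w_h, b = p[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # share of the old cell state to keep
    i = sigmoid(pre("input"))         # share of the candidate to write
    o = sigmoid(pre("output"))        # share of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value to write
    c = f * c_prev + i * g            # updated cell state (long-term memory)
    h = o * math.tanh(c)              # updated hidden state (output)
    return h, c


# Hypothetical weights, chosen only for illustration.
params = {
    "forget": (0.5, 0.1, 0.0),
    "input": (0.4, 0.2, 0.0),
    "output": (0.3, 0.1, 0.0),
    "candidate": (0.6, 0.2, 0.0),
}

# Feed a two-wave sequence (e.g., one scaled predictor at Waves 5 and 6).
h, c = 0.0, 0.0
for x in (0.2, 0.7):
    h, c = lstm_step(x, h, c, params)
```

Because the cell state is updated additively and scaled by the forget gate, information from an earlier wave can be carried forward largely unchanged when the gate stays close to 1, which is what mitigates the vanishing-gradient problem of plain RNNs.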


Figure 1. The predictors of the LSTMs model for older adults with MCI from CLHLS wave 5 to wave 8.

2.5.2. Synthetic minority oversampling technique

Imbalanced data were a challenge for machine learning as the proportion of older adults with MCI was only 16.92% in this study. A common issue is that models tend to be biased toward the majority class, resulting in suboptimal performance. To address this problem, we applied the synthetic minority oversampling technique (SMOTE). SMOTE creates synthetic samples from the existing minority class through interpolation from its nearest neighbors, thereby increasing the number of minority samples in the datasets.
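The interpolation step at the core of SMOTE can be sketched as follows. This is a simplified, standard-library illustration for numeric features, not the implementation used in this study; the function names, the choice of `k`, and the seed are assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class samples with two numeric features each.
minority_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_samples = smote(minority_samples, n_new=10, k=2)
```

Each synthetic sample lies on the line segment between an existing minority sample and one of its nearest minority neighbours, so the new points stay inside the region already occupied by the minority class.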

2.5.3. Gradient boosting decision tree (GBDT)

The GBDT is an ensemble machine learning approach for classification and regression based on the CART algorithm. The GBDT improves prediction accuracy through gradually improving estimation using a boosting method. In addition, the GBDT utilizes a nonlinear regression procedure to improve tree accuracy. A series of decision trees was created, which produced a set of weak prediction models and generated loss functions. The final classification model was the weighted sum of all weak prediction models through each round of training.

2.5.4. Extreme gradient boosting

XGBoost is a scalable and efficient implementation of gradient boosting, a popular machine learning technique that combines weak learners (typically decision trees) into a strong ensemble model. XGBoost offers several advantages over other gradient boosting frameworks, such as parallelization, regularization, and missing value handling. In addition, XGBoost can handle encoded categorical variables.

2.5.5. Random Forest algorithm

Random Forest (RF) is a machine learning technique that builds an ensemble of decision trees and aggregates their predictions. RF can handle both classification and regression problems, as well as categorical and numerical features. It also provides measures of feature importance and variable selection. RF introduces randomness in two ways: by bootstrapping the training data for each tree, and by selecting a random subset of features for each split. To obtain a reliable final result, every decision tree contributed to the final decision: each sample was assigned to one of the two classes by majority vote across all decision trees.
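The two sources of randomness and the final vote can be sketched in isolation, without fitting any trees. The helper names below are hypothetical, and the choice of four features per split is an assumption for illustration (roughly the square root of the 20 predictors), not a setting reported in this study.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the training set for one tree)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(n_features, n_selected, rng):
    """Pick the features considered at one split, without replacement."""
    return sorted(rng.sample(range(n_features), n_selected))


def majority_vote(tree_predictions):
    """Return the class predicted by most trees for one sample."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(42)
sample_rows = bootstrap_sample(list(range(100)), rng)   # row indices for one tree
split_features = random_feature_subset(20, 4, rng)      # candidate features at a split
final_class = majority_vote([1, 0, 1, 1, 0])            # five trees vote on one sample
```

In the last line three of the five hypothetical trees predict class 1, so the sample is classified as MCI.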

2.6. Model assessment

To assess the outcomes of each machine learning model, we calculated the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity (equation 1), specificity (equation 2), accuracy (equation 3), and balanced accuracy (equation 4). True positives and true negatives indicate older adults who were correctly identified as patients with MCI or as members of the normal cognitive function group, respectively; false positives and false negatives indicate older adults who were inaccurately identified as patients with MCI or as members of the normal cognitive function group, respectively. Each machine learning model could predict the probability of cognitive impairment in older adults. If the predicted probability for an individual was greater than the threshold, that individual was regarded as a patient with MCI, and vice versa. To further evaluate and understand the prediction models, we calculated the net benefit of the machine learning models using decision curve analysis (DCA). This method indicated the proportion of patients who received a correct diagnosis minus the percentage of patients who were misdiagnosed under different threshold values.

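$$\text{Sensitivity} = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalseNegative}} \tag{1}$$

$$\text{Specificity} = \frac{\text{TrueNegative}}{\text{TrueNegative} + \text{FalsePositive}} \tag{2}$$

$$\text{Accuracy} = \frac{\text{TrueNegative} + \text{TruePositive}}{\text{TrueNegative} + \text{TruePositive} + \text{FalseNegative} + \text{FalsePositive}} \tag{3}$$

$$\text{Balanced accuracy} = \frac{2 \times \text{Specificity} \times \text{Sensitivity}}{\text{Specificity} + \text{Sensitivity}} \tag{4}$$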
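The four metrics and the net benefit used in DCA can be computed from the confusion counts at a given threshold, as in the sketch below. Balanced accuracy follows equation 4 as given in this article. The net benefit uses the standard decision-curve definition (true positives per participant minus false positives per participant weighted by the odds of the threshold); the article does not state its formula, so this choice is an assumption. Function names and the toy data are hypothetical.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Count TP, FP, TN, FN when probabilities above the threshold are called MCI."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob > threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Equations 1-4; assumes both classes are present in the data."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Equation 4 as written in this article (harmonic-mean form).
    balanced = 2 * specificity * sensitivity / (specificity + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced,
    }


def net_benefit(tp, fp, n, threshold):
    """Standard decision-curve net benefit (an assumed formula, see lead-in)."""
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data: 3 participants with MCI, 7 without.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.2, 0.7, 0.1, 0.1, 0.2, 0.1, 0.05, 0.25]
tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold=0.3)
metrics = classification_metrics(tp, fp, tn, fn)
nb = net_benefit(tp, fp, len(y_true), threshold=0.3)
```

Repeating the last three lines over a grid of thresholds yields the points of a decision curve such as the one in Figure 4.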

2.7. SHapley Additive exPlanations models

For the ensemble machine learning models applied in this study, the prediction process is generally opaque. Unlike traditional statistical models, it is difficult to understand their working mechanisms and the positive or negative contributions of individual predictors to the outcomes. To address this problem, post-hoc interpretations of the model output should be provided in machine learning studies. Shapley values, which originate in cooperative game theory, fairly allocate the payoff of a game among its players based on their individual and joint contributions; they were introduced into machine learning to explain the attribution of each input feature to the outcome. SHapley Additive exPlanations (SHAP) can provide various types of visualized explanations for machine learning models, including global feature importance, feature interaction, and feature dependence. In this study, SHAP was performed in Python using the shap package (version 0.42.1) and was used to visualize the importance of each predictor and to quantify the association between predictors and MCI (Ekanayake et al., 2022).
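The game-theoretic idea behind SHAP can be illustrated by computing exact Shapley values for a very small game. The sketch below averages each player's marginal contribution over all orderings; it does not use the shap package, which relies on efficient approximations for tree models. The payoff function, feature names, and numbers are hypothetical.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical payoff (e.g., predicted risk) for a coalition of features."""
    payoff = 0.0
    if "age" in coalition:
        payoff += 0.30
    if "education" in coalition:
        payoff += 0.10
    if {"age", "education"} <= coalition:
        payoff += 0.04  # joint effect, shared equally by the two features
    return payoff


phi = shapley_values(["age", "education", "sleep"], toy_value)
```

In this toy game the joint effect of 0.04 is split equally, giving 0.32 for age, 0.12 for education, and 0 for sleep, and the three values sum to the payoff of the full coalition (0.44). Exact enumeration grows factorially with the number of features, which is why practical tools approximate it.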

3. Results

As presented in Table 1, 2,146 older adults in the baseline CLHLS wave of 2008 participated in this study (older adults with MCI, 17.29%). The median age of patients with MCI was 92 years (range, 86–97 years), which was 10 years older than that of older adults with normal cognitive function (82 years, range, 78–88 years). The proportions of older adult males (46.62%) and females (53.38%) were relatively equal, with approximately two-thirds of the participants with MCI being female. Of older adults with MCI, 71.67% were illiterate, and 75.28% were single older adults. Older adults with low or very low-income levels comprised the majority of participants with MCI. The percentage of individuals living alone was higher among those with normal cognitive function than among those with MCI. Only 13.61% of older adults regularly exercised among those with MCI. People with normal cognitive function generally rated their health and sleep quality as better than those with MCI. A higher percentage of older adults in the normal group had a diagnosis of hypertension. A total of 14.17% of older adults with a history of stroke had poorer MMSE scores.


Table 1. Predicted characteristics in 2018 and odds ratio of older adults with MCI.

For further descriptive analysis, odds ratios (ORs) for each predictor were evaluated using univariate and multivariate logistic regression analyses. Among sociodemographic variables, the analysis showed that age was a risk factor for MCI (adjusted OR [aOR] 1.123, 95% CI 1.103–1.143). Compared with literate older adults, illiterate older adults had a higher risk of developing MCI (aOR 1.641, 95% CI 1.199–2.247). Older adults with very low income levels had a higher risk of MCI than their wealthier counterparts (aOR 5.673, 95% CI 1.067–30.180). Among health behavior/health status variables, older adults who did not regularly exercise had a high risk of MCI (aOR 2.277, 95% CI 1.596–3.248). Older adults with poor or very poor self-rated health had a higher risk of MCI compared with those who had very good self-rated health (aOR 2.069, 95% CI 1.145–3.740 and aOR 3.874, 95% CI 1.527–9.826, respectively). Moreover, older adults with no history of stroke had a reduced risk of MCI (aOR 0.515, 95% CI 0.347–0.776).

LSTMs model performance is illustrated in Figure 2. The mean squared errors of the training and validation sets were approximately equal (approximately 0.08) after 30 rounds of training, and the inflection points of both sets were close, indicating that the LSTMs model could be used to forecast the characteristics of older adults three years later. Table 2 and Figure 3A show the ROC curves and AUC values of the three machine learning models in the testing set (GBDT 0.902, 95% CI 0.879–0.925; XGBoost 0.928, 95% CI 0.908–0.948; and RF 0.938, 95% CI 0.919–0.956). Table 3 and Figure 3B show the performance of the three models in the validation set. The AUC values of all three machine learning models in the test set were >0.9. The three machine learning models produced similar results in the validation set, indicating that they were robust models for classifying patients with MCI and healthy people. Using 0.3 as a threshold (Table 2), XGBoost had the highest accuracy and balanced accuracy and the second-highest sensitivity, and RF had the highest sensitivity. The DCA results (Figure 4) showed that the net benefits of the XGBoost and RF models were close across the threshold range of 0–0.8, and the net benefit values were higher than 0.4 using 0.3 as a threshold.


Figure 2. The training and validation curve of LSTMs from CLHLS wave 5 to wave 7 (MSE, Mean squared error).


Figure 3. Performance of machine learning models in test set (A) and validation set (B).


Table 2. Performance of machine learning models in test set of predicting MCI among Chinese older adults.


Table 3. Performance of machine learning models in validation set of predicting MCI among Chinese older adults.


Figure 4. Decision curve analysis. The x-axis indicates the threshold probability of MCI. The y-axis indicates the net benefit.

Figure 5 illustrates the ranking of feature importance in MCI prediction. Age, education, and chronic respiratory disease were the first, second, and third-most important characteristics of older adults when predicting MCI in all three models, respectively. Younger literate older adults with no history of chronic respiratory disease had a lower probability of developing MCI. Self-rated health was also an important feature that presented a direct trend in MCI output. All three SHAP models indicated that having a gastrointestinal ulcer was one of the most important features for predicting potential MCI in patients; however, it did not show a clear tendency in relation to MCI progression.


Figure 5. Importance of predictors analyzed by the SHAP model. SHAP (SHapley Additive exPlanations) values are ranked by the contribution of each feature to the predictions made by GBDT/XGBoost/RF.

4. Discussion

To our knowledge, this study is the first to forecast cognitive impairment in older Chinese adults using an LSTMs model and machine learning based on CLHLS Waves 5–8, with predictors that included sociodemographic, health behavior, and health status characteristics. In total, 2,128 older adults were included in this study. Our LSTMs model produced robust results in the validation set; thus, it was capable of forecasting the feature values of older adults in the next wave. After the SMOTE algorithm was applied, the three machine learning approaches performed well in predicting MCI. Figure 6 depicts the conceptual framework of the discussion, which covers the accuracy of the prediction models, the results, and their value from multiple perspectives.


Figure 6. Conceptual framework of discussion in this study.

Regarding model precision, this prediction method combining LSTMs and machine learning can be successfully applied to longitudinal data to capture temporal information, thus improving the accuracy of MCI predictions in older adults (Chae et al., 2018; Wang et al., 2019; Su et al., 2020). To date, most studies have used LSTMs to forecast prevalence and incidence rates or temporal trends in medical-related applications (Borges and Nascimento, 2022; Gautam, 2022; Liu X. D. et al., 2023). In addition, LSTMs have shown excellent performance when predicting high-dimensional data such as air and water pollution (Kim et al., 2022; Middya and Roy, 2022). Building on these applications, some studies have used LSTMs to detect early health deterioration in individual clinical data (da Silva et al., 2021). Furthermore, the utilization of LSTMs to forecast individual features, followed by machine learning to construct predictive models, has been shown to be useful in disease prediction; for example, in the prediction of depression (Su et al., 2020; Lin et al., 2022) and in glaucoma assessment (Dixit et al., 2021). To date, no studies have utilized LSTMs and machine learning to establish a prediction model for MCI and explore its risk factors. Compared with the two previous prediction models using the CLHLS, this study achieved relatively high accuracy and robustness, with AUCs of 0.902 to 0.938 for the test set and 0.890 to 0.914 for the validation set, together with high sensitivity and specificity. One longitudinal study proposed combining a growth mixture model (GMM) with machine learning to forecast the MMSE trajectory of older adults. Because constant baseline individual characteristics were applied in the forecasting models, which introduced time-effect bias, the AUCs of their models ranged from 0.51 to 0.66 across eight machine learning techniques (Wu et al., 2022). The other study utilized sociodemographic and life behavioral features of Chinese older adults to construct prediction models, achieving a maximum accuracy of 0.7540 and a maximum AUC of 0.8269 (Wang et al., 2022). In conclusion, the outcomes of the LSTMs and machine learning framework demonstrate the feasibility and effectiveness of the study hypothesis.

Three decision tree-based models (GBDT, XGBoost, and RF) were used with SHAP to interpret individual predictions. Age, education level, chronic respiratory disease, gastrointestinal ulcers, and self-rated health were identified as the five most important predictors in this study. Age and education level have been reported in previous studies to be important predictors of MCI (Chun et al., 2022; Liu H. et al., 2023). Physiological decline in cognitive function is inevitable as people age (Langa and Levine, 2014), and age is a major predictor of MCI. Lower educational levels have been shown to be significantly associated with cognitive decline, and education in later life may also contribute to improved cognitive function (Peeters et al., 2020). According to our results, older adults with a formal education performed well in terms of MMSE scores. The other three features were not found to be strong predictors in other studies; however, they have all been shown to be closely associated with MCI. Older adults with no history of chronic respiratory disease are less likely to develop MCI. Common chronic respiratory diseases, such as chronic obstructive pulmonary disease and obstructive sleep apnea-hypopnea syndrome, lead to perennial hypoxia and hypercarbia (Olaithe et al., 2018), causing damage to brain functions, including language, execution, and attention. Ultimately, cognitive function continues to decline under these pathological conditions. Gastrointestinal ulcers did not show a clear trend in Figure 5, although changes in metabolic substances in the gastrointestinal tract under pathological conditions are reported to impair brain function via the gut-brain axis (Zeng et al., 2022). Moreover, a healthy gastrointestinal tract can guard against cognitive decline and mitigate neuroinflammation (Xiang et al., 2022); hence, this result needs to be verified in another study. The SHAP analysis illustrated a clear association between self-rated health and MCI; that is, good self-rated health may indicate good cognitive function and vice versa, which is consistent with previous cohort studies (Bond et al., 2006).

MCI prediction models could provide references for clinical practice and bring broad benefits to society; however, they still need adjustment and practice to meet the standards for real-world application. When applied to MCI screening, the most appropriate prediction model must strike a balance between sensitivity and specificity to achieve high precision and cost-effectiveness. Consequently, it is critical to determine the threshold for identifying patients with MCI and conducting further interventions. As shown in Figure 4, the XGBoost prediction model had the greatest net benefit and balanced accuracy when the threshold probability was <0.6. When the threshold probability was 0.3, RF had the highest sensitivity and identified most patients with MCI, but with relatively low cost-effectiveness owing to the proportion of misdiagnoses. Determining the final thresholds requires constant evaluation and collaboration between governments and healthcare providers to obtain optimal clinical, economic, and social outcomes.

Ongoing application of this approach and cooperation can be viewed from three perspectives: the nation (macro), healthcare providers (medium), and individuals (micro). As a macro-regulator, the government should enhance the utilization of big data and incorporate prediction models into various healthcare provider and public Internet platforms. This screening method could promote population health and reduce the disease burden. Various healthcare providers can select different thresholds in terms of specific medical conditions and testing technologies and change their criteria according to local prevalence and incidence. As psychiatric hospitals are generally equipped with adequate medical resources, the threshold for machine learning models could be relatively low to achieve suitable resource allocation. Once MCI predictive models become more sophisticated with continuous training and with more individual information available, such as risk genes or biomarkers, the threshold can be adjusted to pursue relatively high cost-effectiveness. In terms of the micro perspective, the general public could benefit through becoming more aware of their own and their families’ risk of MCI through the application of this prediction model, avoiding additional examinations and ameliorating individual MCI risk.

This study contributes to the prevention of MCI and dementia. First, the combination of an LSTM model and machine learning could precisely identify patients with MCI and their critical features several years earlier. Age, literacy level, chronic respiratory disease, gastrointestinal ulcers, and self-rated health were good predictors of MCI. Second, MCI prediction models have substantial clinical, economic, and social value when prediction is optimized under governmental direction and thresholds for MCI probability are adjusted to the specific needs of different healthcare providers. Finally, by supporting the prevention of dementia and MCI, this approach may help promote healthy aging.

5. Study limitations

This study had some limitations. First, although we examined the robustness of both the LSTM and machine learning models and included four waves of data, our findings need to be validated in another cohort. The lack of external validation may limit the performance and adaptability of the prediction models in different scenarios, as well as confidence in their predictive ability. Therefore, future researchers need to use other sources or types of data to validate this methodological framework and explore possibilities for improvement. Second, most predictors in this study were self-reported, which could have led to information bias. Third, the MMSE has a ceiling effect, meaning that it may not detect subtle changes in cognition that occur during MCI. Furthermore, MMSE scores could be affected by certain individual sociodemographic background factors (Arevalo-Rodriguez et al., 2021; Wu et al., 2022); therefore, in future studies, MCI evaluations should be more comprehensive and include the Montreal Cognitive Assessment and the Clinical Dementia Rating, in addition to biomarker detection and imaging examinations, for a more accurate clinical diagnosis. While this study proposes a convenient screening method using accessible individual features for the general public, outcomes obtained using this method are for reference only and cannot replace acknowledged MCI diagnostic standards.

6. Conclusion

This study showed that individual features could be predicted by combining LSTM and machine learning models. Using four waves of CLHLS data, the risk of MCI among older Chinese adults could be accurately predicted, and three SHAP models identified critical risk factors, such as age, education level, chronic respiratory disease, gastrointestinal ulcer, and self-rated health. The combination of LSTM and machine learning models captured the interdependence of predictors and generated an effective decision support system for healthcare providers to identify patients at high risk of MCI. With macro-level direction from the government, this screening method can continue to be optimized to obtain better thresholds for MCI screening. Our study findings may offer healthcare providers MCI screening support to implement early interventions to delay the progression from MCI to dementia, increase test availability among the population, and reduce incidence rates and treatment costs, ultimately contributing to healthy aging.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding authors.

Author contributions

YH: Conceptualization, Writing – original draft, Writing – review & editing, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Validation, Visualization. ZH: Conceptualization, Data curation, Formal analysis, Investigation, Project administration, Validation, Visualization, Writing – original draft, Writing – review & editing. QY: Validation, Writing – review & editing, Methodology, Software. HJ: Methodology, Software, Writing – review & editing, Data curation. TX: Writing – review & editing. YF: Writing – review & editing, Visualization. YZ: Writing – review & editing, Formal analysis. XZ: Writing – review & editing, Funding acquisition, Resources. CC: Funding acquisition, Resources, Writing – review & editing, Conceptualization, Investigation, Project administration, Supervision, Writing – original draft.


Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was supported by the National Natural Science Foundation of China (72274141), the Zhejiang Provincial Natural Science Foundation (LY22G030006), the 2023 Joint Project of the Science and Technology Department of the National Administration of Traditional Chinese Medicine and the Zhejiang Administration of Traditional Chinese Medicine (GZY-ZJ-KJ-23084), and the Point Leader Research and Development Project (2022C03G1890052).


Acknowledgments

We are grateful to the CLHLS participants for providing the data for this research. The CLHLS was supported by funds from the US National Institute on Aging, National Institutes of Health, the Duke/Duke-NUS Collaboration Pilot Project, the National Natural Science Foundation of China, the China Social Science Foundation, and the UN Fund for Population Activities. The CLHLS was managed by the Center for Healthy Aging and Development Studies, Peking University.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at:


References

Alamro, H., Thafar, M. A., Albaradei, S., Gojobori, T., Essack, M., and Gao, X. (2023). Exploiting machine learning models to identify novel Alzheimer’s disease biomarkers and potential targets. Sci. Rep. 13:4979. doi: 10.1038/s41598-023-30904-5

An, R., and Liu, G. G. (2016). Cognitive impairment and mortality among the oldest-old Chinese. Int. J. Geriatr. Psychiatry 31, 1345–1353. doi: 10.1002/gps.4442

Anderson, N. D. (2019). State of the science on mild cognitive impairment (MCI). CNS Spectr. 24, 78–87. doi: 10.1017/S1092852918001347

Arevalo-Rodriguez, I., Smailagic, N., Roqué-Figuls, M., Ciapponi, A., Sanchez-Perez, E., Giannakou, A., et al. (2021). Mini-mental state examination (MMSE) for the early detection of dementia in people with mild cognitive impairment (MCI). Cochrane Database Syst. Rev. 2021:CD010783. doi: 10.1002/14651858.CD010783.pub3

Bond, J., Dickinson, H. O., Matthews, F., Jagger, C., and Brayne, C. (2006). Self-rated health status as a predictor of death, functional and cognitive impairment: a longitudinal cohort study. Eur. J. Ageing 3, 193–206. doi: 10.1007/s10433-006-0039-8

Borges, D., and Nascimento, M. C. V. (2022). COVID-19 ICU demand forecasting: a two-stage prophet-LSTM approach. Appl. Soft Comput. 125:109181. doi: 10.1016/j.asoc.2022.109181

Center for Healthy Aging and Development Studies (2020). The Chinese Longitudinal Healthy Longevity Survey (CLHLS)-Longitudinal Data (1998–2018). Peking University Open Research Data Platform. Available at:

Chae, S., Kwon, S., and Lee, D. (2018). Predicting infectious disease using deep learning and big data. Int. J. Environ. Res. Public Health 15:1596. doi: 10.3390/ijerph15081596

Chiao, C. Y., Wu, H. S., and Hsiao, C. Y. (2015). Caregiver burden for informal caregivers of patients with dementia: a systematic review. Int. Nurs. Rev. 62, 340–350. doi: 10.1111/inr.12194

Chun, M. Y., Park, C. J., Kim, J., Jeong, J. H., Jang, H., Kim, K., et al. (2022). Prediction of conversion to dementia using interpretable machine learning in patients with amnestic mild cognitive impairment. Front. Aging Neurosci. 14:898940. doi: 10.3389/fnagi.2022.898940

da Silva, D. B., Schmidt, D., da Costa, C. A., da Rosa, R. R., and Eskofier, B. (2021). DeepSigns: a predictive model based on deep learning for the early detection of patient health deterioration. Expert Syst. Appl. 165:113905. doi: 10.1016/j.eswa.2020.113905

Dixit, A., Yohannan, J., and Boland, M. V. (2021). Assessing glaucoma progression using machine learning trained on longitudinal visual field and clinical data. Ophthalmology 128, 1016–1026. doi: 10.1016/j.ophtha.2020.12.020

Dunne, R. A., Aarsland, D., O’Brien, J. T., Ballard, C., Banerjee, S., Fox, N. C., et al. (2021). Mild cognitive impairment: the Manchester consensus. Age Ageing 50, 72–80. doi: 10.1093/ageing/afaa228

Ekanayake, I., Meddage, D., and Rathnayake, U. (2022). A novel approach to explain the black-box nature of machine learning in compressive strength predictions of concrete using Shapley additive explanations (SHAP). Case Stud. Constr. Mater. 16:e01059. doi: 10.1016/j.cscm.2022.e01059

Gao, M., Kuang, W., Qiu, P., Wang, H., Lv, X., and Yang, M. (2017). The time trends of cognitive impairment incidence among older Chinese people in the community: based on the CLHLS cohorts from 1998 to 2014. Age Ageing 46, 787–793. doi: 10.1093/ageing/afx038

Gautam, Y. (2022). Transfer learning for COVID-19 cases and deaths forecast using LSTM network. ISA Trans. 124, 41–56. doi: 10.1016/j.isatra.2020.12.057

Gu, D. (2008). “General data quality assessment of the CLHLS” in Healthy longevity in China: demographic, socioeconomic, and psychological dimensions. eds. Z. Yi, D. L. Poston, D. A. Vlosky, and D. Gu, vol. 20 (Dordrecht: Springer Netherlands), 39–60.

Huang, X., Zhao, X., Li, B., Cai, Y., Zhang, S., Wan, Q., et al. (2022). Comparative efficacy of various exercise interventions on cognitive function in patients with mild cognitive impairment or dementia: a systematic review and network meta-analysis. J. Sport Health Sci. 11, 212–223. doi: 10.1016/j.jshs.2021.05.003

Jakobsen, J. C., Gluud, C., Wetterslev, J., and Winkel, P. (2017). When and how should multiple imputation be used for handling missing data in randomised clinical trials – a practical guide with flowcharts. BMC Med. Res. Methodol. 17:162. doi: 10.1186/s12874-017-0442-1

Jia, L., Du, Y., Chu, L., Zhang, Z., Li, F., Lyu, D., et al. (2020a). Prevalence, risk factors, and management of dementia and mild cognitive impairment in adults aged 60 years or older in China: a cross-sectional study. Lancet Public Health 5, e661–e671. doi: 10.1016/S2468-2667(20)30185-7

Jia, L., Quan, M., Fu, Y., Zhao, T., Li, Y., Wei, C., et al. (2020b). Dementia in China: epidemiology, clinical management, and research advances. Lancet Neurol. 19, 81–92. doi: 10.1016/S1474-4422(19)30290-X

Kakutani, S., Watanabe, H., and Murayama, N. (2019). Green tea intake and risks for dementia, Alzheimer’s disease, mild cognitive impairment, and cognitive impairment: a systematic review. Nutrients 11:1165. doi: 10.3390/nu11051165

Kim, T., Shin, J., Lee, D., Kim, Y., Na, E., Park, J. H., et al. (2022). Simultaneous feature engineering and interpretation: forecasting harmful algal blooms using a deep learning approach. Water Res. 215:118289. doi: 10.1016/j.watres.2022.118289

Kivipelto, M., Mangialasche, F., and Ngandu, T. (2018). Lifestyle interventions to prevent cognitive impairment, dementia and Alzheimer disease. Nat. Rev. Neurol. 14, 653–666. doi: 10.1038/s41582-018-0070-3

Klimova, B., Valis, M., and Kuca, K. (2017). Cognitive decline in normal aging and its prevention: a review on non-pharmacological lifestyle strategies. Clin. Interv. Aging 12, 903–910. doi: 10.2147/CIA.S132963

Langa, K. M., and Levine, D. A. (2014). The diagnosis and management of mild cognitive impairment: a clinical review. JAMA 312, 2551–2561. doi: 10.1001/jama.2014.13806

Lin, S., Wu, Y., and Fang, Y. (2022). A hybrid machine learning model of depression estimation in home-based older adults: a 7-year follow-up study. BMC Psychiatry 22:816. doi: 10.1186/s12888-022-04439-4

Liu, X. D., Wang, W., Yang, Y., Hou, B. H., Olasehinde, T. S., Feng, N., et al. (2023). Nesting the SIRV model with NAR, LSTM and statistical methods to fit and predict COVID-19 epidemic trend in Africa. BMC Public Health 23:138. doi: 10.1186/s12889-023-14992-6

Liu, H., Zhang, X., Liu, H., and Chong, S. T. (2023). Using machine learning to predict cognitive impairment among middle-aged and older Chinese: a longitudinal study. Int. J. Public Health 68:1605322. doi: 10.3389/ijph.2023.1605322

Lou, H. R., Wang, X., Gao, Y., and Zeng, Q. (2022). Comparison of ARIMA model, DNN model and LSTM model in predicting disease burden of occupational pneumoconiosis in Tianjin, China. BMC Public Health 22:2167. doi: 10.1186/s12889-022-14642-3

Mattap, S. M., Mohan, D., McGrattan, A. M., Allotey, P., Stephan, B. C., Reidpath, D. D., et al. (2022). The economic burden of dementia in low- and middle-income countries (LMICs): a systematic review. BMJ Glob Health 7:e007409. doi: 10.1136/bmjgh-2021-007409

Middya, A. I., and Roy, S. (2022). Pollutant specific optimal deep learning and statistical model building for air quality forecasting. Environ. Pollut. 301:118972. doi: 10.1016/j.envpol.2022.118972

Mirzaei, G., Adeli, A., and Adeli, H. (2016). Imaging and machine learning techniques for diagnosis of Alzheimer’s disease. Rev. Neurosci. 27, 857–870. doi: 10.1515/revneuro-2016-0029

Mukadam, N., Anderson, R., Knapp, M., Wittenberg, R., Karagiannidou, M., Costafreda, S. G., et al. (2020). Effective interventions for potentially modifiable risk factors for late-onset dementia: a costs and cost-effectiveness modelling study. Lancet Healthy Longev. 1, e13–e20. doi: 10.1016/S2666-7568(20)30004-0

Olaithe, M., Bucks, R. S., Hillman, D. R., and Eastwood, P. R. (2018). Cognitive deficits in obstructive sleep apnea: insights from a meta-review and comparison with deficits observed in COPD, insomnia, and sleep deprivation. Sleep Med. Rev. 38, 39–49. doi: 10.1016/j.smrv.2017.03.005

Peeters, G., Kenny, R. A., and Lawlor, B. (2020). Late life education and cognitive function in older adults. Int. J. Geriatr. Psychiatry 35, 633–639. doi: 10.1002/gps.5281

Su, D., Zhang, X., He, K., and Chen, Y. (2020). Use of machine learning approach to predict depression in the elderly in China: a longitudinal study. J. Affect. Disord. 282, 289–298. doi: 10.1016/j.jad.2020.12.160

Wang, S., Wang, W., Li, X., Liu, Y., Wei, J., Zheng, J., et al. (2022). Using machine learning algorithms for predicting cognitive impairment and identifying modifiable factors among Chinese elderly people. Front. Aging Neurosci. 506, 14–28. doi: 10.1016/j.neuroscience.2022.09.009

Wang, G., Wang, L., Qiu, J., Yan, Z., Tai, K., Yu, W., et al. (2019). Fabrication of efficient formamidinium perovskite solar cells under ambient air via intermediate-modulated crystallization. Sol. Energy 187, 147–155. doi: 10.1016/j.solener.2019.05.033

World Health Organization (2021). World failing to address dementia challenge. Available at:

Wu, Y., Jia, M., Xiang, C., Lin, S., Jiang, Z., and Fang, Y. (2022). Predicting the long-term cognitive trajectories using machine learning approaches: a Chinese nationwide longitudinal database. Psychiatry Res. 310:114434. doi: 10.1016/j.psychres.2022.114434

Xiang, S., Ji, J. L., Li, S., Cao, X. P., Xu, W., Tan, L., et al. (2022). Efficacy and safety of probiotics for the treatment of Alzheimer’s disease, mild cognitive impairment, and Parkinson’s disease: a systematic review and meta-analysis. Front. Aging Neurosci. 14:730036. doi: 10.3389/fnagi.2022.730036

Zeng, Y. (2012). Towards deeper research and better policy for healthy aging --using the unique data of Chinese longitudinal healthy longevity survey. China Econ. J. 5, 131–149. doi: 10.1080/17538963.2013.764677

Zeng, W., Yang, F., Shen, W. L., Zhan, C., Zheng, P., and Hu, J. (2022). Interactions between central nervous system and peripheral metabolic organs. Sci. China Life Sci. 65, 1929–1958. doi: 10.1007/s11427-021-2103-5

Zhang, H., Peng, Y., Li, C., Lan, H., Xing, G., Chen, Z., et al. (2020). Playing mahjong for 12 weeks improved executive function in elderly people with mild cognitive impairment: a study of implications for TBI-induced cognitive deficits. Front. Neurol. 11:178. doi: 10.3389/fneur.2020.00178

Zhang, Y. R., Xu, W., Zhang, W., Wang, H. F., Ou, Y. N., Qu, Y., et al. (2022). Modifiable risk factors for incident dementia and cognitive impairment: an umbrella review of evidence. J. Affect. Disord. 314, 160–167. doi: 10.1016/j.jad.2022.07.008

Keywords: mild cognitive impairment, machine learning (ML), LSTM (long short-term memory networks), prediction model, China

Citation: Huang Y, Huang Z, Yang Q, Jin H, Xu T, Fu Y, Zhu Y, Zhang X and Chen C (2023) Predicting mild cognitive impairment among Chinese older adults: a longitudinal study based on long short-term memory networks and machine learning. Front. Aging Neurosci. 15:1283243. doi: 10.3389/fnagi.2023.1283243

Received: 25 August 2023; Accepted: 10 October 2023;
Published: 23 October 2023.

Edited by:

Mario Bernardo-Filho, Rio de Janeiro State University, Brazil

Reviewed by:

Marcia Cristina Moura Fernandes, Rio de Janeiro State University, Brazil
Hao Wang, Shanghai Jiao Tong University, China
José Alexandre Bachur, University of Franca, Brazil

Copyright © 2023 Huang, Huang, Yang, Jin, Xu, Fu, Zhu, Zhang and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Chun Chen; Xiangyang Zhang

These authors have contributed equally to this work