Incremental Value of CSF Biomarkers in Clinically Diagnosed AD and Non-AD Dementia

Background: Cerebrospinal fluid (CSF) biomarkers are used to diagnose Alzheimer disease (AD), especially in atypical clinical presentations. No consensus currently exists regarding cut-off values. This study aimed, firstly, to define optimal cut-off values for CSF biomarkers, and secondly, to investigate the most relevant diagnostic strategy for AD based on CSF biomarker combinations. Methods: A total of 380 patients were prospectively included: 140 with AD, 240 with various neurological diagnoses (non-AD). CSF biomarkers were measured using ELISA. Univariate and multivariate analyses were performed using random forest and logistic regression approaches. Results: Univariate receiver operating characteristic (ROC) curve analysis of T-Tau, P-Tau181, Aβ42, and Aβ40 concentrations and of the Aβ42/Aβ40 ratio showed AD cut-off values of ≥355 ng/L, ≥57 ng/L, ≤706 ng/L, ≥10,854 ng/L, and ≤0.059, respectively. Multivariate analysis using random forest and logistic regression found that the algorithm based on P-Tau181, Aβ42 concentrations and Aβ42/Aβ40 ratio yielded the best discrimination between AD and non-AD populations. The cross-validation technique of the final model showed a mean accuracy of 0.85 and a mean AUC of 0.89. Conclusion: This study confirms that the Aβ42/Aβ40 ratio was more useful than the Aβ40 concentration in discriminating AD from non-AD populations in daily practice. These results indicate that the Aβ42/Aβ40 ratio should be assessed in all cases, independently of Aβ42 concentrations.


INTRODUCTION
Alzheimer disease (AD), the most common cause of dementia, is a progressive neurodegenerative disease clinically characterized by memory impairment and/or deficit in other cognitive domains associated with functional decline (1). For many years, the diagnosis of AD remained probabilistic and was based on clinical features associated with neuropsychological testing and neuroimaging (2). These older criteria were useful at the stage of dementia, defined as cognitive impairment, including memory impairment, that impedes daily life activities and interactions with the social network. This clinical diagnostic method led to a diagnosis of "probable" or "possible" AD, based on different levels of clinical confidence. Despite a lack of specificity (70%), this method has relatively good sensitivity (80%) (3). To increase diagnostic accuracy, especially in cases of atypical clinical presentation, the International Working Group (IWG) recommendations proposed the use of diagnostic biomarkers such as neuroimaging (e.g., 18F-fluorodeoxyglucose PET, volumetric MRI and amyloid PET) and/or cerebrospinal fluid (CSF) biomarker assessment (4). The association of clinical criteria of AD with CSF biomarkers clearly improved diagnostic accuracy, raising it to over 80% (5). CSF biomarkers could also be useful for discriminating AD from other dementias [e.g., dementia with Lewy bodies (DLB)] (6).
Core CSF biomarker assessment is defined as a combination of amyloid-β 1-42 peptide (Aβ42, which is correlated with APP metabolism and amyloid deposition), Total Tau protein (T-Tau), which reflects neurodegeneration, and phosphorylated Tau protein (P-Tau181), which reflects tangle pathology (7). According to the literature, these core biomarkers have a high specificity and sensitivity for discriminating AD from other dementias (8). The typical CSF biomarker profile in AD combines increased T-Tau and P-Tau181 concentrations with a decreased Aβ42 peptide concentration. As misclassifications may be due to inter-individual variability in overall amyloid peptide production, the amyloid-β 1-40 peptide (Aβ40) assay, which closely reflects total amyloid load in the brain, has more recently been validated and implemented in the diagnostic sequence (9).
However, no consensus cut-off values have been established at this time, except for P-Tau181 with a cut-off of around 60 ng/L (10). The optimal cut-off value for Aβ40 peptide has not been widely studied, although increased concentrations have been reported in AD compared with non-AD populations (11). Hence, most clinical studies are based on ad hoc optimum cut-off definitions. Differences are mostly reported for amyloid peptide measurement (i.e., Aβ42 peptide cut-off values may range from 550 to 800 ng/L, even when using the same ELISA method) (12). It is widely acknowledged that pre-analytical factors play a key role in this variability, but they are not sufficient to wholly explain these discrepancies. To significantly reduce the substantial variability in Aβ42 measurements across laboratories, three Certified Reference Materials (CRMs) for the measurement of Aβ42 peptide have been produced by the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) and the Joint Research Centre (JRC), and are now available.
It has been clearly demonstrated that a combination of CSF biomarkers that includes calculation of the Aβ42/Aβ40 ratio significantly improves the discriminatory capacity in the diagnosis of AD (13,14). Nevertheless, the added value of CSF Aβ40 concentration in the classification strategy of AD diagnosis remains controversial (11,15). Different classification strategies involving CSF biomarkers are currently used. For instance, CSF Aβ40 peptide assessment may be performed together with the core biomarkers (i.e., T-Tau, P-Tau181 and Aβ42) in all patients, or may only be performed in a second stage, after assessment of the three core biomarkers, especially in case of conflicting results. However, there is no consensus as to the combination of CSF biomarkers that has the highest sensitivity and specificity for the diagnosis of AD. A recent study concluded that the CSF Aβ42/Aβ40 ratio should be calculated in all patients, independently of Aβ42 concentration, to improve AD diagnosis (16).
To this end, we undertook the present study firstly, to define optimal cut-off values for CSF biomarkers, and secondly, to evaluate the incremental value of Aβ40 concentration or Aβ42/Aβ40 ratio compared with the standard classification strategy of core CSF biomarkers. The secondary objective was to identify the diagnostic algorithm that was best able to distinguish AD patients from other dementias.

Study Design
This was a single-centre, prospective study to evaluate the incremental value of CSF biomarker combinations in the diagnosis of AD. An "AD profile" was defined as increased CSF concentrations of T-Tau and P-Tau181 associated with decreased Aβ42 peptide concentrations, and/or a decreased Aβ42/Aβ40 ratio. As previously described, an increased concentration of Aβ40 peptide was also considered a criterion compatible with an AD profile (13). In contrast, a "Normal profile" was considered if CSF concentrations of T-Tau, P-Tau181, Aβ42, Aβ40 and the Aβ42/Aβ40 ratio were within the optimal cut-off values identified by ROC curve analysis on our sample. Isolated decreased CSF Aβ42 peptide concentration associated with an Aβ42/Aβ40 ratio within the cut-off value was also considered as a normal profile.
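The profile definitions above can be read as a small rule set. The following Python sketch is illustrative only and is not the authors' implementation: the function name `classify_profile` and the label "Conflicting" are assumptions, the thresholds are the cut-offs reported in the abstract, and the handling of Aβ40 (required to be below its cut-off for a normal profile, not required for an AD profile) is one possible reading of the definitions.

```python
# Cut-offs reported in the abstract (ng/L, except the unitless ratio).
T_TAU_CUT = 355.0
P_TAU_CUT = 57.0
AB42_CUT = 706.0
AB40_CUT = 10854.0
RATIO_CUT = 0.059


def classify_profile(t_tau, p_tau, ab42, ab40):
    """Return 'AD', 'Normal' or 'Conflicting' (hypothetical labels)."""
    ratio = ab42 / ab40
    # AD profile: both tau markers increased AND (Abeta42 or ratio decreased).
    tau_increased = t_tau >= T_TAU_CUT and p_tau >= P_TAU_CUT
    amyloid_decreased = ab42 <= AB42_CUT or ratio <= RATIO_CUT
    if tau_increased and amyloid_decreased:
        return "AD"
    # Normal profile: tau markers, Abeta40 and the ratio within cut-offs.
    # An isolated low Abeta42 with a normal ratio still counts as normal,
    # so Abeta42 itself is deliberately not tested here.
    tau_normal = t_tau < T_TAU_CUT and p_tau < P_TAU_CUT
    if tau_normal and ab40 < AB40_CUT and ratio > RATIO_CUT:
        return "Normal"
    # Anything else is treated as a conflicting result (assumption).
    return "Conflicting"
```

With these rules, a patient with normal tau markers, an Aβ42 of 650 ng/L and an Aβ40 of 9,000 ng/L (ratio about 0.072) is returned as "Normal", which matches the isolated-low-Aβ42 case described in the text.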

Clinical Enrollment
This study included patients from the memory consultations or geriatric medicine units of Reims University Hospital (Reims, France) between January 2011 and September 2016. Subjects who were not affiliated to any social security regime, as well as subjects under legal guardianship were excluded. All included patients underwent baseline physical and neurological examination associated with neuropsychological evaluation including Mini-Mental State Examination (MMSE), brain imaging and CSF biological measurements. Patients were monitored until the end of the study by clinical examinations. Clinical, neuropsychological and imaging data were reviewed by an independent board (multidisciplinary team including neurologist, geriatrician, and psychiatrist), from baseline to the end of follow-up, for a final diagnosis, but the board members were blinded to the results of CSF biomarker assays.
A total of 380 patients were sequentially included in this study: 140 patients with AD and 240 patients with various non-AD diagnoses [i.e., dementia with Lewy bodies (DLB) (n = 12), frontotemporal lobar degeneration (n = 14), vascular dementia (n = 42), psychiatric disorder (n = 31), and mixed neurodegenerative disease (n = 70), composed of a combination of two or more concomitant neurodegenerative disorders, including AD]. For these patients, AD was not the main component of the diagnosis retained by the multidisciplinary team. All AD patients met the diagnostic criteria for probable AD according to the Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM IV) (17), and the criteria of the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association (NINCDS-ADRDA) (2). For patients with other types of dementia (non-AD population), the diagnoses were made according to specific criteria: DLB was diagnosed according to the DLB Consortium criteria (18), and frontotemporal lobar degeneration according to the criteria described by Neary et al. (19) and Rascovsky et al. (20). Vascular dementia was diagnosed according to the NINDS-AIREN criteria (21). Patients with psychiatric disorders, such as bipolar disorder, depression and anxiety, were diagnosed according to DSM IV criteria (17). The study was approved by the regional ethics committee (CPP Est III; protocol number 2014-A00056-41), and written informed consent was obtained from each participant and their main caregiver.

CSF Analysis
Lumbar puncture was performed in the sitting position according to standardized procedures. All CSF samples were collected in 10 mL polypropylene tubes (ref. 62.610.201, Sarstedt, Nümbrecht, Germany) and transported to the laboratory within 2 h. Samples were then centrifuged for 10 min at 2,000 g at 4°C, transferred to another polypropylene tube and immediately stored at −80°C until analysis. CSF T-Tau, P-Tau181, Aβ42, and Aβ40 peptide concentrations were measured using commercially available sandwich ELISAs (INNOTEST, Fujirebio Europe, Ghent, Belgium) according to the manufacturer's instructions. Biomarkers were included in routine clinical diagnosis runs of the laboratory, which takes part in the external quality control program of the Alzheimer's Association. CSF analyses were performed blinded to the clinical diagnoses.

Statistical Analysis
Statistical analyses were performed with R 3.1.4 (The R Foundation for Statistical Computing, http://www.r-project.org). Continuous variables are described as mean and standard deviation (SD) and categorical variables as number (percentage). A sample size of 380 patients including 140 cases and 240 controls provided >80% power to detect a minimal area under the receiver operating characteristics (ROC) curve (AUC) of 0.583 compared to the null hypothesis of an AUC of 0.50 (equivalent to no diagnostic value) using a 2-sided test at a significance level of 0.05. This sample size also provided at least 80% power to detect a 0.13 unit difference between a statistical model with an AUC of 0.58 and another statistical model with an AUC of 0.72 using a 2-sided z-test at the significance level α = 0.05. The normality of distributions was assessed using the Shapiro-Wilk and Kolmogorov-Smirnov tests. Non-parametric univariate analyses were done for continuous variables using the Mann-Whitney test. Categorical variables were assessed using the chi-square test, or two-tailed Fisher's exact tests when the expected number in any cell was <5.
To assess the discriminatory capacity of each biomarker in univariate analysis, a ROC curve was created, and the AUC was calculated. The cut-off values were estimated using the Youden index (22). The multivariate analysis was performed by a random forest algorithm and by binary logistic regression models in order to define the best diagnostic algorithm for the discrimination between AD and non-AD populations. The biomarkers with the greatest discriminatory capacity were identified by the Gini index using random forest algorithms (randomForest R package) (23). This procedure is considered a standard non-parametric classification method for constructing prediction rules without making any prior assumptions as to the form of the association between predictors and outcome. The random forest hyperparameters were tuned as described using the tuneRanger package and were as follows: number of threads/processors = 2, number of predictors sampled for splitting at each node = 5, minimum size of terminal nodes = 27, sample fraction = 0.202, number of trees to grow = 1000, respect.unordered.factors = order, and sampling of cases done with replacement (24). Other hyperparameters were set to their default values.
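The Youden-index step can be illustrated with a short, dependency-free Python sketch. It is not the code used in the study (the analyses were run in R): the function name `youden_cutoff`, the convention that ties keep the lowest candidate threshold, and the toy values are all assumptions made for illustration. For each candidate threshold the sketch computes sensitivity and specificity and keeps the threshold that maximizes J = sensitivity + specificity - 1.

```python
def youden_cutoff(values, labels, higher_is_positive=True):
    """Return (cutoff, sensitivity, specificity) maximizing J = Se + Sp - 1.

    labels: 1 for AD, 0 for non-AD. A value is called positive when it is
    >= cutoff (or <= cutoff when higher_is_positive is False).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    # Every observed value is tried as a candidate threshold.
    for cut in sorted(set(values)):
        if higher_is_positive:
            calls = [v >= cut for v in values]
        else:
            calls = [v <= cut for v in values]
        tp = sum(1 for c, y in zip(calls, labels) if c and y == 1)
        tn = sum(1 for c, y in zip(calls, labels) if not c and y == 0)
        se, sp = tp / n_pos, tn / n_neg
        j = se + sp - 1
        # Strict '>' keeps the first (lowest) threshold in case of ties.
        if best is None or j > best[0]:
            best = (j, cut, se, sp)
    return best[1], best[2], best[3]


# Toy P-Tau181-like values (made up for illustration, not study data).
values = [40, 45, 50, 55, 60, 70, 80, 90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, se, sp = youden_cutoff(values, labels)
```

For biomarkers that decrease in AD, such as Aβ42 or the Aβ42/Aβ40 ratio, the same function can be called with `higher_is_positive=False`.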
Binary logistic regression was applied to calculate the probability of AD from combined biomarkers (25,26). Among CSF concentrations of T-Tau, P-Tau181, Aβ42, Aβ40 and the Aβ42/Aβ40 ratio, biomarkers were selected by fitting a logistic regression model using forward/backward stepwise selection. The Akaike information criterion (27) and the Bayes information criterion (28) were used to find good candidate models for biomarker combinations. The goodness-of-fit and appropriateness of the logistic regression model were evaluated using the Nagelkerke R squared value and the Hosmer-Lemeshow test, the overall percentage of correct classification, absolute standardized residuals, and plots of residuals (Q-Q plot, residuals vs. covariates, and residuals vs. predicted values). Multicollinearity was checked for all analyses and the Wald test was used for hypothesis testing.
The stability and robustness of the final model and its estimates were validated using bootstrap resampling (n = 1000). ROC curves for combined biomarkers were constructed using the predictive probability as a covariate.
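As a rough illustration of bootstrap resampling with 1000 replicates, the Python sketch below draws resamples with replacement and reports a percentile interval for an arbitrary statistic. It is a generic sketch under stated assumptions, not the authors' procedure: the text does not say which interval type was used, and the names `bootstrap_percentile_ci` and `mean` are hypothetical. In the study the resampled quantity would be the fitted model's estimates rather than a simple mean.

```python
import random


def mean(xs):
    """Arithmetic mean, used here as a stand-in statistic."""
    return sum(xs) / len(xs)


def bootstrap_percentile_ci(data, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(data).

    Each replicate resamples len(data) observations with replacement.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    stats = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        stats.append(statistic(resample))
    stats.sort()
    # Empirical alpha/2 and 1 - alpha/2 quantiles of the replicates.
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

A narrow interval across replicates is one informal indication that an estimate is stable with respect to the sample at hand.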
The performance of the final model was validated using the cross-validation technique (29,30). To this end, we randomly split the data into twenty different estimation (or training) sets and twenty different test (or validation) sets. Each estimation set consisted of data from 95% of patients; the corresponding test set contained the data from the remaining 5% of patients. The final model was fitted to each of the estimation data sets and then the parameter estimates were used to predict the observed class for each test set. The accuracy (1 - misclassification error, or the proportion of correctly classified patients, i.e., the sum of true positive and true negative tests divided by the number of patients tested) and the AUC of ROC curves were assessed using pooled observed and predicted classes.
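The splitting scheme described above (twenty estimation sets of 95% of patients, each paired with a test set of the remaining 5%) corresponds to 20-fold cross-validation; with 380 patients, each test set holds 19 patients. The Python sketch below shows only the index bookkeeping and the accuracy definition. The function names and the fixed seed are assumptions, and the model fitting step is deliberately left out.

```python
import random


def twenty_fold_splits(n_patients, n_folds=20, seed=0):
    """Yield (train_idx, test_idx) pairs; each patient is tested exactly once."""
    idx = list(range(n_patients))
    random.Random(seed).shuffle(idx)  # random assignment of patients to folds
    folds = [idx[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = [i for j, f in enumerate(folds) if j != k for i in f]
        yield train_idx, test_idx


def accuracy(observed, predicted):
    """Proportion of correctly classified patients (1 - misclassification)."""
    correct = sum(1 for o, p in zip(observed, predicted) if o == p)
    return correct / len(observed)


splits = list(twenty_fold_splits(380))
```

In a full analysis, the model would be refitted on each `train_idx` and its predictions on the matching `test_idx` collected before computing accuracy and AUC.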

RESULTS
Demographic, clinical and biological characteristics of the study population are summarized in Table 1. T-Tau and P-Tau181 concentrations were significantly higher in the AD population than in the non-AD population (617 and 82 ng/L vs. 319 and 47 ng/L, p < 0.001, respectively) whereas CSF Aβ42 concentration and Aβ42/Aβ40 ratio were significantly lower in AD vs. non-AD patients (513 ng/L and 0.044 vs. 786 ng/L and 0.077, p < 0.001, respectively).
In the clinically diagnosed AD population (n = 140), 88 (62.9%) patients had a CSF AD profile when using a classification strategy based solely on the core CSF biomarkers (T-Tau, P-Tau181, Aβ42). Similarly, in the non-AD population (n = 240), 90 (37.5%) patients had a normal CSF profile and 23 (9.6%) patients had an AD profile (and were misclassified). The remaining patients (37.1% in AD and 52.9% in non-AD populations) had conflicting CSF biomarker results.
Multivariate analysis using a random forest machine learning approach was used to define the best diagnostic algorithm for the discrimination of AD from non-AD patients. After optimization, the Gini index showed that P-Tau181, Aβ42 and Aβ42/Aβ40 ratio were the CSF biomarkers with the best discriminatory capacity (53.3, 41.9, and 37.6, respectively). By contrast, the impact of T-Tau protein and Aβ40 peptide remained lower (27.7 and 16.4, respectively) (Figure 2A). To evaluate the discriminatory capacity of the final random forest model, a ROC curve was plotted, and showed an AUC of 0.85 (Figure 2B). The classification error of random forest (out-of-bag estimate) was about 20%. The sensitivity and specificity of the random forest model were about 76 and 82%, respectively. To improve classification of CSF biomarkers in daily practice, a multivariate logistic regression model was constructed. This analysis identified the same discriminating biomarkers as the random forest analysis (namely P-Tau181, Aβ42, and Aβ42/Aβ40 ratio) (Table 3). The final model retained by binary logistic regression combining these three main CSF biomarkers was cross-validated 20-fold. Cross-validation showed that this model yielded the most discriminatory approach [mean accuracy: 0.83 (0.65-1); mean AUC: 0.89 (0.7-1)] (Figure 3). This model validated the following equation, combining P-Tau181, Aβ42 concentrations and Aβ42/Aβ40 ratio biomarkers, which could be considered in daily practice to discriminate AD from non-AD patients: pAD was defined as the predictive probability of AD. The ROC curve of pAD showed an AUC of 0.89. The optimal cut-off was 0.387 (sensitivity 85% and specificity 85%).
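To make the use of pAD concrete, the Python sketch below shows the general form of a three-predictor logistic model and the application of the 0.387 cut-off. The fitted coefficients are not reproduced in this text, so they must be supplied by the caller; the dictionary key names, the function names, the assumption that pAD values at or above the cut-off are classified as AD, and the `demo_coef` values are all hypothetical placeholders and are not the study's estimates.

```python
import math

PAD_CUTOFF = 0.387  # optimal cut-off on pAD reported in the text


def p_ad(p_tau181, ab42, ratio, coef):
    """Predicted probability of AD from a fitted logistic model.

    coef: dict with keys 'intercept', 'p_tau181', 'ab42', 'ratio'
    (hypothetical key names; values must come from the fitted model).
    """
    z = (coef["intercept"]
         + coef["p_tau181"] * p_tau181
         + coef["ab42"] * ab42
         + coef["ratio"] * ratio)
    # Logistic (inverse logit) transform of the linear predictor.
    return 1.0 / (1.0 + math.exp(-z))


def classify(p):
    """Assumed rule: pAD at or above the cut-off is classified as AD."""
    return "AD" if p >= PAD_CUTOFF else "non-AD"


# Placeholder coefficients for demonstration only; NOT the study's estimates.
demo_coef = {"intercept": 0.0, "p_tau181": 0.02, "ab42": -0.002, "ratio": 0.0}
p = p_ad(82, 513, 0.044, demo_coef)
```

Because the cut-off (0.387) is below 0.5, a patient can be classified as AD even when the model assigns slightly less than even odds, which reflects the balance between sensitivity and specificity chosen on the ROC curve.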

DISCUSSION
In daily practice, CSF biomarkers are likely to play a key role in the diagnosis of AD, especially in the presence of discrepancies between imaging and clinical features, or atypical presentation of the disease. Nevertheless, the classification of CSF biomarker assay results can be difficult, due to the lack of consensus on clinically relevant thresholds, even when using the same assays. These differences are the combined result of pre-analytical and analytical factors. For example, pre-analytical factors, such as the choice of lumbar puncture needle, collection tube or storage tube, remain the main confounding factors, although recommendations have been published to promote their standardization (31).
Based on our cohort recruited in memory consultations and geriatrics wards, we firstly defined the optimal cut-off values that can adequately discriminate between AD and non-AD populations using INNOTEST ELISA methods (Fujirebio Europe, Ghent, Belgium). Recent studies have defined cut-off values using the same ELISAs as those used in our study. For instance, a study using data-driven Gaussian mixture modeling determined a cut-off of 680 ng/L for CSF Aβ42 peptide concentration, which is very close to that found in our study (12). In an autopsy-confirmed AD population, the different cut-off values were also consistent with our findings, with the greatest difference observed for Aβ42 peptide (638.5 vs. 706.0 ng/L, respectively) (14). The P-Tau cut-off value found in our study was consistent with those found in the literature of around 60 ng/L (10). The T-Tau protein cut-off value of 355 ng/L was close to the concentration commonly used in daily practice for the elderly population (32). Furthermore, the use of a gray zone in the decisional algorithm has been suggested to assist in the classification based on CSF biomarkers (e.g., +10% for T-Tau and P-Tau181 concentrations and −10% for Aβ42 concentrations and Aβ42/Aβ40 ratio) (33), in order to take account of the analytical variability of CSF biomarker assays. Only a few studies have defined a specific cut-off value for classification based on CSF Aβ40. For example, Dorey et al. suggested an increased concentration of Aβ40 peptide (>12,644 ng/L) in AD compared with non-AD patients. However, this cut-off value was determined in AD patients with Aβ42 concentrations above the cut-off value or non-AD patients with decreased Aβ42 concentration (11). In our study, all 380 patients were assessed (i.e., all AD and non-AD patients, blinded to the Aβ42 concentration) in order to define our optimal cut-off value of 10,854 ng/L. These cut-off values are in agreement with those commonly described in the literature (15,34).
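The gray-zone idea mentioned above can be sketched in Python as follows. This is one possible reading of the ±10% margins, not the scheme of the cited reference, whose details are not given here: the function name `gray_zone_status`, the three labels, and the choice to place the gray zone between the cut-off and the cut-off shifted by 10% in the abnormal direction are assumptions.

```python
def gray_zone_status(value, cutoff, increased_is_abnormal, margin=0.10):
    """Return 'abnormal', 'gray zone' or 'normal' for one biomarker.

    increased_is_abnormal: True for T-Tau and P-Tau181 (+10% margin),
    False for Abeta42 and the Abeta42/Abeta40 ratio (-10% margin).
    """
    if increased_is_abnormal:
        upper = cutoff * (1 + margin)
        if value >= upper:
            return "abnormal"
        if value >= cutoff:
            return "gray zone"  # within 10% above the cut-off
        return "normal"
    lower = cutoff * (1 - margin)
    if value <= lower:
        return "abnormal"
    if value <= cutoff:
        return "gray zone"  # within 10% below the cut-off
    return "normal"
```

For example, with the P-Tau181 cut-off of 57 ng/L, a value of 60 ng/L falls in the gray zone (57 to 62.7 ng/L), whereas with the Aβ42 cut-off of 706 ng/L a value of 650 ng/L falls in the gray zone (635.4 to 706 ng/L).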
CSF concentration of Aβ42 peptide is one of the most commonly used criteria in the diagnosis of AD, because of its accumulation in amyloid plaques in the brain. However, different studies have demonstrated that CSF concentrations of amyloid peptides (i.e., Aβ42 and Aβ40) may be influenced by interindividual variations in the total amyloid load linked to variations in production and/or turnover in the brain (9). A decreased CSF concentration of Aβ42 peptide may be the consequence of lower production of all amyloid peptides, or of an accumulation of Aβ42 peptide in amyloid plaques in the brain. In case of low levels of production of all amyloid peptides, the concentration of Aβ40 peptide will also be decreased, which could lead to profile misclassification. For this reason, the calculation of the Aβ42/Aβ40 ratio has been proposed to differentiate the two situations. The addition of the Aβ42/Aβ40 ratio to the diagnostic algorithm based on core CSF biomarker analysis improves diagnostic performance.
Univariate analysis was used to define cut-off values for CSF biomarkers, whereas multivariate analysis was preferred to study the most efficient combination of CSF biomarkers for the diagnosis of AD. The discriminatory capacities of T-Tau protein and Aβ40 concentrations were clearly weaker than those of P-Tau181 and Aβ42 concentrations and the Aβ42/Aβ40 ratio. Our findings are consistent with those of Slaets et al., who reported that the addition of the Aβ42/Aβ40 ratio to standard CSF biomarkers improved discrimination between AD and non-AD profiles, whereas the addition of Aβ40 peptide concentration alone did not (15).
In our cohort, the two multivariate prediction models showed that the algorithm based on P-Tau181, Aβ42, and Aβ42/Aβ40 seems to be the most relevant for discriminating AD from non-AD populations. Our results are consistent with the conclusion of a recent review that recommended measuring the Aβ42/Aβ40 ratio irrespective of the concentration of Aβ42 peptide (16). The results previously published by Bombois et al., which concluded that the measurement of CSF Aβ1-42 and p-Tau levels seems sufficient for the diagnosis of AD, are also in agreement with our data (35). However, for many years, the standard CSF biomarker analysis strategy in AD has been based on the interpretation of T-Tau, P-Tau181, and Aβ42 concentrations. Our results suggest that T-Tau protein measurement could be replaced by Aβ40 concentration measurement, in order to calculate the Aβ42/Aβ40 ratio. Our results are consistent with the literature, which indicates that CSF P-tau should be considered the most specific biomarker for AD (36). Nevertheless, despite its smaller contribution to the discrimination between AD and non-AD populations, the T-Tau protein assay remains useful for the evaluation of diseases associated with acute brain injury (37) (e.g., encephalitis, Creutzfeldt-Jakob disease, cerebral infarction). An acute brain injury is associated with neuronal death, which releases large amounts of T-Tau protein. The increase in CSF T-Tau protein concentration occurs from the first few days following the injury and lasts for some weeks, whereas P-Tau protein concentration remains normal (37). An increased T-Tau protein concentration may therefore be measured in the absence of neurocognitive disorders such as AD. The high rate of comorbidities commonly found in elderly patients may decrease the specificity of T-Tau protein concentration. In the absence of a control group, these results confirm that CSF biomarkers are useful in the differential diagnosis of AD vs. other causes of neurocognitive disorders, rather than in the discrimination between AD patients and healthy subjects.
The population of our study was older than those of previous studies. Interestingly, Ewers et al. showed that the specificity and negative predictive value of CSF biomarkers decreased with age (38). As a result, the increased number of false negative results (patients with AD with negative CSF biomarkers) may be more apparent in our relatively older population, which may also explain the relatively high proportion of misclassified results compared to the literature. The recently described limbic-predominant age-related TDP-43 encephalopathy (LATE) may also explain these results in an old population. LATE often coexists with tauopathy and amyloid-β plaques, and could mimic Alzheimer's-type dementia (39). Moreover, different confounding factors such as the high rate of mixed dementia in the elderly (40) or the discrepancies between neurocognitive disorders and neuropathological lesions of AD at autopsy (41,42) could decrease the discriminative power of CSF biomarkers.
The main limitation of this study is the lack of neuropathological validation of the diagnosis, which could explain the rate of misinterpreted results in both populations under study, especially the non-AD patients. Nevertheless, we attempted to minimize the rate of probable misdiagnosis by using the most specific and sensitive criteria for each patient in our study. Diagnoses were made by a multidisciplinary team composed of trained physicians (neurologists, geriatricians and psychiatrists). The 5-year follow-up allowed physicians to revise their diagnosis if an atypical clinical or cognitive sign was detected. A decrease of CSF Aβ42 peptide concentration has been reported in DLB patients, which may have affected the results in the non-AD population, despite the low number of patients with a retained diagnosis of DLB in our cohort. Clinicians were blinded to CSF biomarker results prior to clinical diagnosis, which excluded a circular reasoning bias in our findings. However, recent reports highlight that 10% to 30% of patients clinically diagnosed as AD by experts do not display AD neuropathologic changes at autopsy (43). That is why the lack of supporting biomarkers regarding the diagnosis of AD could also be considered a limitation of this study. A further limitation is the single-center nature of the study.

CONCLUSION
This study confirms that the Aβ42/Aβ40 ratio is more useful than the Aβ40 concentration alone for discriminating AD from non-AD populations in daily practice.
Both random forest and logistic regression multivariate analyses showed that a diagnostic algorithm based on P-Tau181, Aβ42, and the Aβ42/Aβ40 ratio is the most relevant for distinguishing AD from non-AD patients. These results suggest that the Aβ42/Aβ40 ratio should be calculated in all cases, independently of Aβ42 concentration. However, the T-Tau protein assay remains useful in daily practice to rule out differential diagnoses.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The study was approved by the regional ethics committee (CPP Est III; protocol number 2014-A00056-41), and written informed consent was obtained from each participant and their main caregiver.