Improving CSF Biomarkers’ Performance for Predicting Progression from Mild Cognitive Impairment to Alzheimer’s Disease by Considering Different Confounding Factors: A Meta-Analysis

Background: Cerebrospinal fluid (CSF) biomarkers’ performance for predicting conversion from mild cognitive impairment (MCI) to Alzheimer’s disease (AD) is still suboptimal. Objective: By considering several confounding factors we aimed to identify in which situations these CSF biomarkers can be useful. Data Sources: A systematic review was conducted on MEDLINE, PreMedline, EMBASE, PsycInfo, CINAHL, Cochrane, and CRD (1990–2013). Eligibility Criteria: (1) Prospective studies of CSF biomarkers’ performance for predicting conversion from MCI to AD/dementia; (2) inclusion of Aβ42 and T-tau and/or p-tau. Several meta-analyses were performed. Results: Aβ42/p-tau ratio had high capacity to predict conversion to AD in MCI patients younger than 70 years. The p-tau had high capacity to identify MCI cases converting to AD in ≤24 months. Conclusions: Explaining how different confounding factors influence CSF biomarkers’ predictive performance is mandatory to elaborate a definitive map of situations, where these CSF biomarkers are useful both in clinics and research.


INTRODUCTION
Mild cognitive impairment (MCI) confers a high risk of developing dementia, particularly Alzheimer's disease (AD). About 35% of MCI patients progress to AD, with an annual conversion rate of 5-10% (Mitchell, 2009). Because AD entails severe consequences, accurate prediction of MCI outcome is crucial for giving patients a prognosis and initiating therapeutic strategies as soon as possible. In this regard, the new MCI diagnostic criteria recommended by the National Institute on Aging-Alzheimer's Association (NIA-AA) emphasize the use of neuroimaging and cerebrospinal fluid (CSF) biomarkers (Albert et al., 2011). Although significant advances have been made in the field of neuroimaging, biomarkers based on CSF are at present the most convenient for studying disease progression.
The currently validated CSF biomarkers of AD are Aβ42, total tau (T-tau), and phosphorylated tau (p-tau). CSF Aβ42 is reduced, and T-tau and p-tau levels are increased, in MCI patients compared to healthy controls (Diniz et al., 2008). In addition, MCI patients with abnormal CSF biomarkers have an increased risk of progressing to AD (Herukka et al., 2005; Hansson et al., 2006, 2007; Bouwman et al., 2007; Brys et al., 2009; Mattsson et al., 2009; Shaw et al., 2009; Hertze et al., 2010; Buchhave et al., 2012). Buchhave et al. (2012) showed that 90% of MCI patients with pathologic CSF biomarkers developed AD within 9.2 years. This knowledge is now incorporated in the new diagnostic criteria for MCI, which indicate that positive biomarkers of Aβ accumulation (e.g., CSF Aβ42) and neuronal injury (e.g., CSF T-tau and p-tau) confer the highest likelihood that AD pathophysiological processes are the cause of the cognitive dysfunction, and that individuals with this biomarker profile are more likely to decline or progress to dementia due to AD in relatively short periods (Albert et al., 2011). Regarding predictive capacity, although single CSF biomarkers have shown unsatisfactory results, their combination could be suitable to identify which MCI patients will progress to dementia (Ferreira et al., 2014). Frontiers in Aging Neuroscience www.frontiersin.org In particular, the Aβ42/p-tau ratio has demonstrated high efficiency (Hansson et al., 2006; Mattsson et al., 2009; Buchhave et al., 2012; Parnetti et al., 2012; Roe et al., 2013). Two systematic reviews with meta-analysis have previously been published (Mitchell, 2009; Monge-Argilés et al., 2010). Mitchell (2009) only evaluated p-tau. Monge-Argilés et al. (2010) evaluated the three CSF biomarkers, but the group of MCI patients that converted to AD was compared to a mixed group of stable MCI cases and MCI patients that converted to non-AD dementias. 
Moreover, their analysis of combined CSF biomarkers was limited to only three studies and the combination procedure was not sufficiently detailed. Importantly, CSF biomarkers' predictive performance could be improved by considering different confounding factors such as the MCI subtype, time to AD conversion, and age (Ferreira et al., 2014). Previous studies show that the CSF biomarkers have better predictive capacity in amnestic MCI (Vos et al., 2013), in MCI patients that convert to AD in relatively short periods (e.g., <12 months) (Gaser et al., 2013), and in young MCI patients (Mattsson et al., 2012). However, most of the studies performed to date do not consider these confounding factors. These aspects, together with methodological variability, have made it difficult to propose definitive cut-off values for CSF biomarkers. For this reason, the dissemination of CSF biomarkers into clinical routine is compromised at present (Ferreira et al., 2014).
The main objective of this study was to carefully evaluate the capacity of the CSF biomarkers to predict conversion from MCI to AD in several clinically relevant situations. In particular, we aimed to identify for which specific MCI patients these CSF biomarkers might be useful in clinical practice. To address this question, several meta-analyses were performed for studies that prospectively analyzed the predictive performance of CSF Aβ42 and T-tau and/or p-tau. The included studies made baseline cross-sectional comparisons between MCI patients that converted to AD or dementia (MCI-C) and MCI patients that remained stable (MCI-S) at follow-up. We hypothesized that combined CSF biomarkers would have better predictive performance than single CSF biomarkers, and that this performance could be increased by controlling for different confounding factors such as the MCI subtype, time to AD conversion, and age, among others.

SEARCH STRATEGY AND SELECTION CRITERIA
A systematic review was conducted for the period between January 1990 and September 2013 in the following electronic databases: MEDLINE and PreMedline, EMBASE, PsycInfo, CINAHL, Cochrane Library, and CRD. The search strategy was developed for each database using the following Medical Subject Heading (MeSH) and free-text terms: "Alzheimer's disease diagnosis" or "Alzheimer's disease," and "abeta-42" or "T-tau" or "P-tau" or "tau" or "phospho-tau" or "phosphorylated tau." Examples for the two major databases are shown in Table S1 in Supplementary Materials (MEDLINE) and Table S2 in Supplementary Materials (EMBASE). In addition, reference sections were searched to identify relevant publications.
Inclusion criteria for this meta-analysis were studies that (1) performed a prospective analysis of the CSF biomarkers' performance for predicting conversion to AD or dementia in individuals with MCI at baseline; (2) included at least two CSF biomarkers, with Aβ42 always required along with T-tau and/or p-tau; and (3) were published in English or Spanish. Studies were excluded if they did not report sensitivity or specificity values, or any other data that enabled their calculation. Two reviewers independently performed the study selection (Daniel Ferreira and Amado Rivero-Santana), and in case of doubt and/or disagreement a third reviewer was consulted (Lilisbeth Perestelo-Pérez). The search yielded 1308 references after discarding duplicates. One hundred fifty-eight articles were selected by title and abstract. After applying eligibility criteria, 12 articles were eventually included (Hampel et al., 2004; Herukka et al., 2005; Parnetti et al., 2006, 2012; Eckerström et al., 2010; Hertze et al., 2010; Monge-Argilés et al., 2011; Buchhave et al., 2012; Ewers et al., 2012; Gaser et al., 2013; Toledo et al., 2013; Vos et al., 2013). Three of these studies included data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). As these studies represent different analyses of overlapping ADNI subsamples, only one ADNI study was included for each meta-analysis depending on the analyzed biomarker. If two ADNI studies were available for the same biomarker, the one with the largest sample was selected. The selection flow, including reasons for study exclusion at each phase, is shown in Figure 1.

DATA COLLECTION, RISK OF BIAS, AND EVALUATION OF METHODOLOGICAL QUALITY
A data extraction sheet was developed to collect relevant data, covering: author and publication year, country, objectives, methods (with special attention to participants' recruitment procedures, study design, follow-up length, CSF biomarkers evaluated including Aβ42/T-tau and Aβ42/p-tau ratios, diagnostic group characteristics, and statistical analyses), results, and conclusions.
Data extraction was carried out by two researchers (Daniel Ferreira and Amado Rivero-Santana), and quality and accuracy of the extraction was verified by a third researcher (Lilisbeth Perestelo-Pérez). Several strategies were followed in order to reduce the risk of bias related to publication, data availability, and reviewer selection (see Table S3 in Supplementary Materials). The QUADAS-2 scale (Whiting et al., 2011) was used in order to assess the methodological quality of the included studies. The scale was applied by two researchers (Amado Rivero-Santana and Daniel Ferreira), and in case of doubt and/or disagreements a third was consulted (Lilisbeth Perestelo-Pérez). Finally, this study was performed in accordance with the PRISMA statement (Liberati et al., 2009; Moher et al., 2010), which provides a detailed guideline of preferred reporting style for systematic reviews and meta-analyses.

IDENTIFICATION OF POTENTIAL CONFOUNDING FACTORS OF CSF BIOMARKERS' PREDICTIVE PERFORMANCE
We hypothesized that CSF biomarkers' predictive performance might be influenced by different confounding factors. To explore sources of heterogeneity, the following factors were defined a priori based on the literature (see Table 1 for a detailed description of the different factors and considered categories): (1) recruitment setting; (2) MCI subtype; (3) diagnostic criteria for MCI at baseline; (4) diagnostic criteria for AD at follow-up; (5) postmortem confirmation of AD pathology; (6) criteria for conversion from MCI to AD/dementia; (7) diagnosis at follow-up; (8) follow-up length (as a rough estimate of time to AD conversion); (9-11) MCI severity at baseline according to mini-mental state examination (MMSE), clinical rating [e.g., clinical dementia rating (CDR), global deterioration scale (GDS)], and magnetic resonance imaging (MRI) rating (i.e., degree of brain atrophy); (12) age; (13) gender distribution; (14) years of education; (15) family history of AD; (16) APOE e4 status; (17) technology applied for the CSF analysis; and (18) cut-offs for interpreting the CSF levels.

STATISTICAL ANALYSIS
For each article, true and false positive/negative values were calculated from sensitivity, specificity, positive predictive value, negative predictive value, and/or the rate of converters and nonconverters. A global meta-analysis was performed for each single CSF biomarker (i.e., Aβ42, T-tau, and p-tau) and two relevant ratios (i.e., Aβ42/T-tau and Aβ42/p-tau). Analyses were performed with the MetaDisc 1.1.1 software (Zamora et al., 2006). Sensitivity and specificity pooled estimates were calculated with random-effects models (DerSimonian and Laird, 1986), which yield more conservative estimates. For a qualitative interpretation of sensitivity and specificity results, values above 80% were considered indicative of satisfactory predictive performance according to international recommendations (The Ronald and Nancy Reagan Research Institute of the Alzheimer's Association and the National Institute on Aging Working Group, 1998). Positive and negative likelihood ratios were calculated from the resulting sensitivity and specificity values and interpreted following established guidelines (see these guidelines in Figure 2 footnotes) (Qizilbash, 2002). Likelihood ratios indicate how the pretest probability of disease is increased or decreased by the outcome of a diagnostic test. A positive likelihood ratio [LR+ = sensitivity/(1 − specificity)] greater than one increases the probability that the disease is present (in this context, progression to AD) and helps to rule in MCI-C cases. A negative likelihood ratio [LR− = (1 − sensitivity)/specificity] of less than one diminishes the probability that the disease is present and helps to rule out MCI-C cases. Statistical heterogeneity was explored with the Cochran Q-test. As this statistic has low power when few studies are available, a recommended p-value of 0.10 was established as the statistical significance threshold to detect heterogeneity (Hardy and Thompson, 1998). 
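The first step described above, recovering true and false positive/negative counts from reported accuracy figures, can be sketched as follows. This is only an illustration under stated assumptions: the function name `two_by_two` and the example study (40 converters, 60 stable cases, 85% sensitivity, 80% specificity) are hypothetical and are not taken from any included article. Counts are rounded to whole patients, so they are approximations of the original 2 × 2 table.

```python
def two_by_two(sensitivity, specificity, n_converters, n_stable):
    """Rebuild approximate 2x2 counts from reported accuracy values.

    sensitivity and specificity are proportions (0-1); n_converters is the
    number of MCI-C cases and n_stable the number of MCI-S cases.
    """
    tp = round(sensitivity * n_converters)  # converters with a pathological CSF result
    fn = n_converters - tp                  # converters with a normal CSF result
    tn = round(specificity * n_stable)      # stable cases with a normal CSF result
    fp = n_stable - tn                      # stable cases with a pathological CSF result
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Hypothetical study, for illustration only.
counts = two_by_two(0.85, 0.80, n_converters=40, n_stable=60)
print(counts)
```

Counts of this kind are the usual input format for diagnostic meta-analysis software, which is why they have to be derived when an article reports only percentages.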
Differences in sensitivity and specificity values for pairs of subgroup meta-analyses (e.g., MCI cases younger than 70 years vs. older than 70 years) were tested with the formula $Q_{BET} = Q_{TOT} - (Q_1 + Q_2)$, where $Q_{TOT}$ represents the overall inter-study variability, and $Q_1$ and $Q_2$ represent the inter-study variability within each subgroup in the comparison (Deeks et al., 2001). The $Q_{BET}$ statistic was then compared to a $\chi^2$ distribution with $J - 1$ degrees of freedom using a significance level of 0.05, where $J$ is the number of subgroups.
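A minimal sketch of the between-subgroup test described above is given below. It rests on several assumptions that are not stated in the text: sensitivities are compared on the logit scale with inverse-variance weights, which may differ from the computation used by the meta-analysis software; all function names are hypothetical; and the (true positive, false negative) counts are invented for illustration. For two subgroups (J = 2) the reference distribution has one degree of freedom, whose upper-tail probability can be obtained from the complementary error function in the standard library.

```python
import math


def cochran_q(estimates, variances):
    """Cochran Q: weighted sum of squared deviations from the pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))


def logit_sensitivity(tp, fn):
    """Logit of sensitivity and its approximate variance (assumed scale)."""
    return math.log(tp / fn), 1.0 / tp + 1.0 / fn


def q_of(studies):
    """Cochran Q for a list of (tp, fn) tuples."""
    estimates, variances = zip(*(logit_sensitivity(tp, fn) for tp, fn in studies))
    return cochran_q(estimates, variances)


def q_between(subgroup_1, subgroup_2):
    """Q_BET = Q_TOT - (Q_1 + Q_2) and its p-value for J = 2 subgroups."""
    q_tot = q_of(subgroup_1 + subgroup_2)
    q_bet = q_tot - (q_of(subgroup_1) + q_of(subgroup_2))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(max(q_bet, 0.0) / 2.0))
    return q_bet, p_value


# Invented (TP, FN) counts for two subgroups of studies.
younger = [(40, 10), (35, 8)]
older = [(30, 20), (28, 22)]
q_bet, p_value = q_between(younger, older)
print(f"Q_BET = {q_bet:.2f}, p = {p_value:.4f}")
```

With more than two subgroups the degrees of freedom become J − 1 and a general chi-square tail function would be needed in place of the one-degree-of-freedom shortcut used here.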

MAIN CHARACTERISTICS OF INCLUDED STUDIES AND METHODOLOGICAL QUALITY
Among the 12 studies included, 10 offered data about the diagnostic performance of Aβ42, 6 about T-tau, 5 about p-tau, and 6 about the Aβ42/T-tau and Aβ42/p-tau ratios. Main study characteristics are detailed in Table 2. Methodological quality (QUADAS-2) is shown in Table S4 in Supplementary Materials. In summary: (1) Patient selection: only two studies demonstrated low risk of bias; seven did not explicitly state consecutive or random samples; patients could have been inappropriately excluded in nine studies. (2) Diagnostic test: seven studies showed low risk of bias by using external cut-off values or establishing the cut-off in the study sample independently of the clinical diagnosis (i.e., mixture model analysis in Buchhave et al., 2012). (3) Diagnostic criterion: all the studies were classified as unclear given that postmortem confirmation of AD pathology was never performed. (4) Patient flow and follow-up: three studies demonstrated low risk of bias; all the studies applied the same reference standard to all the patients, but patients were followed for only two years or less in five studies; six studies did not include all baseline patients in the final analyses.

Table 3 shows sensitivity and specificity values with 95% CI, heterogeneity, and likelihood ratios. Heterogeneity was significant for the three single biomarkers as well as for the two evaluated ratios. The Aβ42/p-tau ratio showed the best performance with 85% sensitivity, 79% specificity, and a negative likelihood ratio of 0.19, indicating a moderate decrease in the probability that the disease is present.

TABLE 1 (excerpt) | Potential confounding factors and categories considered.

ADNI criteria (+ 1)

MCI severity at baseline
MMSE total score: not enough variability; all studies reporting MMSE have mean scores between 23 and 30 (7 + 1) vs.

CSF methods
Technology for CSF analysis: ELISA (6) vs. xMAP (3 + 1)
Cut-offs for interpreting CSF levels: great variability; internal (highest value of SN + SP or Youden's Index) (3) vs. external or independent of clinical diagnosis (mixture model analysis, obtained from another cohort in the same study; Hulstaert et al., 1999; Sjögren et al., 2001; Zetterberg et al., 2003; Shaw et al., 2009) (4 + 1) vs. both internal and external (2)

Between brackets, number of studies available for at least one biomarker; +1 refers to ADNI studies (only one ADNI study is included for each subgroup meta-analysis).

FIGURE 2 | Positive and negative likelihood ratios.
An LR+ greater than one increases the pretest probability that the disease is present [in this context progression from MCI to AD or, in other words, MCI due to AD (Albert et al., 2011)]. An LR− of less than one diminishes the pretest probability that the disease is present. The established guidelines (Qizilbash, 2002) state that an LR+ greater than 10 will often make conclusive changes to the pretest probability, indicating that the disease is likely present; an LR+ between 5 and 10 corresponds to a moderate increase in probability; and an LR+ between 2 and 5 corresponds to a small increase. An LR− of less than 0.1 will often make conclusive changes to the pretest probability that the disease is present, indicating that the disease is unlikely to be present; an LR− between 0.1 and 0.2 corresponds to a moderate decrease in probability; and an LR− between 0.2 and 0.5 corresponds to a small decrease. LR+, positive likelihood ratio; LR−, negative likelihood ratio; Global MA, global meta-analysis; MCI, mild cognitive impairment; AD, Alzheimer's disease.

Table 3 shows sensitivity and specificity values with 95% CI, inter-group difference ($Q_{BET}$), heterogeneity, and likelihood ratios. Heterogeneity was significant in most of the subgroup meta-analyses. Notably, CSF biomarkers' predictive performance was optimal (>80%) in two clinically relevant situations, and heterogeneity was no longer significant: (1) p-tau alone had 84% sensitivity and 93% specificity for MCI cases converting to AD in ≤24 months, significantly different from 59% sensitivity (p = 0.01) and 71% specificity (p < 0.001) in studies with follow-up periods >24 months; (2) the Aβ42/p-tau ratio showed 81% sensitivity and 91% specificity in MCI patients younger than 70 years, significantly different from 66% specificity in MCI patients older than 70 years (p < 0.001).
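As a rough check on the figures above, the likelihood-ratio formulas and the interpretation bands from the Figure 2 footnotes can be expressed in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the treatment of values falling exactly on a band boundary is an assumption, and the "minimal change" label for values outside the named bands does not come from the guidelines.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def interpret_lr_pos(lr_pos):
    """Map LR+ to the bands in the Figure 2 footnotes (boundary handling assumed)."""
    if lr_pos > 10:
        return "conclusive increase"
    if lr_pos >= 5:
        return "moderate increase"
    if lr_pos >= 2:
        return "small increase"
    return "minimal change"  # label assumed; band not named in the footnotes


def interpret_lr_neg(lr_neg):
    """Map LR- to the bands in the Figure 2 footnotes (boundary handling assumed)."""
    if lr_neg < 0.1:
        return "conclusive decrease"
    if lr_neg <= 0.2:
        return "moderate decrease"
    if lr_neg <= 0.5:
        return "small decrease"
    return "minimal change"  # label assumed; band not named in the footnotes


# Sensitivity/specificity pairs reported in the text.
reported = {
    "p-tau, conversion in <=24 months": (0.84, 0.93),
    "Abeta42/p-tau, younger than 70 years": (0.81, 0.91),
    "Abeta42/p-tau, global meta-analysis": (0.85, 0.79),
}
for label, (sens, spec) in reported.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(label, round(lr_pos, 2), interpret_lr_pos(lr_pos),
          round(lr_neg, 2), interpret_lr_neg(lr_neg))
```

Because the ratios are recomputed here from rounded percentages, small discrepancies from the values in Table 3 are to be expected.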
The Aβ42/p-tau ratio showed the best performance across the different subgroup meta-analyses. Sensitivity was slightly increased in studies including only amnestic MCI cases (heterogeneity no longer significant), MCI patients older than 70 years (heterogeneity no longer significant), and studies using ELISA. The Aβ42/T-tau ratio yielded optimal sensitivity values, but suboptimal specificity. Results were not satisfactory for single CSF biomarkers, except for the remarkably good diagnostic performance of p-tau noted above.
The analysis of positive likelihood ratios showed an extremely high increase in the probability that the disease is present (LR+ = 12) for p-tau in MCI cases converting to AD in ≤24 months (Figure 2; Table 3). Moreover, there was a moderate increase in the probability that the disease is present (LR+ = 5-10) for the Aβ42/p-tau ratio in two situations: MCI patients younger than 70 years, and studies using ELISA technology. The analysis of negative likelihood ratios showed a moderate decrease in the probability that the disease is present (LR− = 0.1-0.2) in several situations: p-tau and the Aβ42/T-tau ratio in MCI cases converting to AD in ≤24 months; and the Aβ42/p-tau ratio in the global meta-analysis as well as in all the subgroup meta-analyses (except in MCI patients younger than 70 years and studies using ELISA technology).
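How a likelihood ratio shifts the probability of conversion can be illustrated with the odds form of Bayes' theorem (post-test odds = pretest odds × LR). The sketch below assumes, purely for illustration, a pretest probability of 35%, borrowed from the overall MCI-to-AD progression figure cited in the Introduction; the function name is hypothetical and the resulting probabilities are not results of this meta-analysis.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Update a pretest probability with a likelihood ratio via odds."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


PRETEST = 0.35  # assumed pretest probability of conversion (illustrative only)

# LR+ = 12: p-tau in MCI cases converting within 24 months (reported above).
after_positive = post_test_probability(PRETEST, 12)

# LR- = 0.19: Abeta42/p-tau ratio in the global meta-analysis (reported above).
after_negative = post_test_probability(PRETEST, 0.19)

print(f"Pathological result: {after_positive:.2f}")
print(f"Normal result: {after_negative:.2f}")
```

The same function can be reused with any other pretest probability, which matters because conversion rates differ considerably between memory clinics and community settings.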

DISCUSSION
The two main findings in this study are that the Aβ42/p-tau ratio has a high capacity to predict AD conversion in MCI patients younger than 70 years, and that p-tau alone has a high capacity to identify MCI cases converting to AD in ≤24 months. The analysis of likelihood ratios showed that, in both situations, a CSF test result indicating pathological values of Aβ42/p-tau or p-tau significantly increases the probability that the disease is present [in this context progression from MCI to AD or, in other words, MCI due to AD (Albert et al., 2011)]. Better predictive performance of the CSF biomarkers in younger MCI patients has been recently shown in a large multicenter study (Mattsson et al., 2012). One possible explanation is that typical AD brain alterations increase with age in individuals without dementia (Green et al., 2000; Bennett et al., 2006), with about a third of cognitively normal elderly individuals showing an AD-like pattern of CSF biomarker alterations (Ewers et al., 2007; Bouwman et al., 2009; Mattsson et al., 2009; Shaw et al., 2009). This also occurs in stable MCI cases, thereby reducing specificity for AD and undermining CSF biomarkers' performance.
Regarding time to AD conversion, Gaser et al. (2013) showed that the CSF biomarkers generally performed better for MCI cases that converted to AD in <12 months than for MCI cases that converted to AD in >12 months. In the current meta-analysis, this finding still holds when 24 months is used as the threshold. However, Buchhave et al. (2012) reported that the combination of CSF biomarkers might not be advisable at 60 months before AD conversion. The reason is that, at that point, many MCI-C cases have normal T-tau levels but already pathological Aβ42 levels. In another study, the combination of CSF biomarkers with structural MRI showed >80% sensitivity during the first 18 months of follow-up, decreasing to 75% at 24 months, and to 68% at 36 months (Westman et al., 2012). Therefore, predictive value and biomarkers' utility strongly depend on the stage of the disease and time to conversion. Aβ42 performs better than tau 5-10 years before conversion to AD, but T-tau and p-tau have better predictive power 0-5 years before conversion (Buchhave et al., 2012). Other biomarkers, such as those based on structural MRI, perform best closest to AD diagnosis. Future research should thus pursue combining the CSF biomarkers not only with each other but also with other biomarkers. Recent studies show an increase in the diagnostic efficiency of CSF biomarkers when combined with neuroimaging biomarkers (Vos et al., 2012; Westman et al., 2012; Choo et al., 2013; Galluzzi et al., 2013; Prestia et al., 2013; Shaffer et al., 2013). The development of new combinations and indexes may contribute not only to predicting AD conversion but, importantly, to facilitating prediction of time to conversion, which is still challenging.
Importantly, the Aβ42/p-tau ratio showed satisfactory predictive performance in a heterogeneous group of MCI patients, which better represents the clinical reality (global meta-analysis). Moreover, it is noteworthy that the sensitivity was increased in two specific conditions: amnestic MCI patients and old MCI patients (>70 years). Recently, Vos et al. (2013) showed that the CSF biomarkers are more sensitive in amnestic MCI than in non-amnestic MCI patients. An explanation for this is MCI heterogeneity. Only 30-60% of MCI patients are affected by prodromal AD, whereas the others may stem from a variety of different etiologies and pathologies (Ritchie et al., 2001; Petersen, 2004). The amnestic subtype is mainly associated with AD pathology. Nonetheless, vascular etiology has also been proposed as an explanatory factor, especially in those cases with cognitive impairment encompassing other domains besides memory (Petersen, 2004; Winblad et al., 2004). In contrast, the non-amnestic subtype may have a higher likelihood of progressing to non-AD dementias such as dementia with Lewy bodies or frontotemporal lobar degeneration (Petersen, 2004; Winblad et al., 2004). In this regard, it seems reasonable that the CSF biomarkers validated for AD perform better in the amnestic MCI cases. In agreement with Vos et al. (2013), this may have implications for clinical implementation of the new revised criteria for MCI (Albert et al., 2011), given that both amnestic and non-amnestic subtypes are considered in these criteria as possible prodromal stages of AD-type dementia. The finding of better CSF biomarker sensitivity in old MCI patients is in line with the discussion above about the age-related increase in AD-like CSF biomarker patterns. Mattsson et al. (2012) also found increased sensitivity in MCI patients older than 65 years compared to MCI patients younger than 65 years. 
On the other hand, Aβ42/p-tau specificity was not increased in any of the subgroup meta-analyses except for young MCI patients (≤70 years), as discussed above. The three CSF biomarkers alone and the Aβ42/T-tau ratio showed suboptimal predictive power, except p-tau for MCI cases converting to AD in ≤24 months, as already noted.
A better performance of the Aβ42/p-tau ratio over the other CSF biomarkers has been reported in previous studies on MCI prediction (Hansson et al., 2006; Mattsson et al., 2009; Buchhave et al., 2012; Parnetti et al., 2012; Roe et al., 2013) and differential diagnosis between AD and other dementias (Maddalena et al., 2003; Jong et al., 2006; Holtzman, 2011). This finding is likely because this ratio reflects two aspects of AD pathology, i.e., plaques (Aβ42) and neurodegeneration (tau). Moreover, p-tau usually shows better performance than T-tau (Mitchell, 2009; Bloudek et al., 2011; van Harten et al., 2011), probably because p-tau is not only a marker of axonal damage and neuronal degeneration, as T-tau is, but is also more closely related to AD pathophysiology and the formation of neurofibrillary tangles (Anoop et al., 2010; Holtzman, 2011). In addition, CSF p-tau concentrations in dementia with Lewy bodies, frontotemporal lobar degeneration, and vascular dementia have been reported to be more comparable to concentrations in controls than to concentrations in AD patients (van Harten et al., 2011). This positively affects prediction of MCI due to AD.
Regarding the clinical value of the CSF biomarkers, results for negative likelihood ratios were normally better than results for positive likelihood ratios. This means that the CSF biomarkers are more useful to identify MCI patients that remain stable at follow-up (MCI-S) than to rule in MCI patients that will progress to AD or dementia (MCI-C). This finding supports the consideration made in the new MCI diagnostic criteria in relation to the biomarker profile suggesting that the MCI syndrome is unlikely to be due to AD [point 3.6.4 in Albert et al. (2011): "the definitive absence of evidence of either Aβ deposition or neuronal injury strongly suggests that the MCI syndrome is not due to AD"]. Our study shows that a normal result in the Aβ42/p-tau ratio yields a moderate decrease in the probability that the disease is present (conversion to AD). This is true in all the situations evaluated in the different subgroup meta-analyses, although we could not confirm this for MCI cases converting to AD in ≤24 months because only one study was available (Monge-Argilés et al., 2011). This single study reported 86% sensitivity and 75% specificity (Monge-Argilés et al., 2011). Therefore, it is quite probable that a meta-analysis of the Aβ42/p-tau ratio in MCI cases converting to AD in ≤24 months would provide a satisfactory negative likelihood ratio, given that both p-tau and the Aβ42/T-tau ratio showed optimal results. On the other hand, positive likelihood ratios were normally within the range of a small increase in the probability that the disease is present. The only two situations where conclusive increase was achieved are those already noted above (Aβ42/p-tau ratio in young MCI patients and p-tau in MCI cases converting to AD in ≤24 months). 
This finding may have implications for the consideration made in the new MCI diagnostic criteria regarding the biomarker pattern indicating a high likelihood that the MCI syndrome is due to AD (point 3.6.1 in Albert et al., 2011). In particular, young MCI patients with positive biomarkers of Aβ accumulation and neuronal injury seem to have an increased risk of declining or progressing to dementia due to AD in relatively short periods.
Determining in which specific situations the CSF biomarkers provide satisfactory predictive performance is of great relevance. In this study, some of those situations have been identified. However, despite these positive results, we acknowledge that much additional work needs to be done to validate the application of the CSF biomarkers as they are proposed in the new revised criteria for MCI (Albert et al., 2011). The main limitation for extending the use of the CSF biomarkers to the clinical routine is the difficulty of establishing appropriate cut-offs. There is great variability in the cut-offs applied across the different studies. This is in part related to differences in methodological aspects as well as the absence of technical standardization. In this meta-analysis, two aspects related to variability in CSF methods were considered. First, we tried to analyze the influence of different cut-offs for the CSF biomarkers. Due to the great variability found, it was not possible to group the studies in order to perform specific subgroup meta-analyses (Table 1). Second, the technology applied for the CSF analysis was also considered as a potential confounding factor. Results showed that sensitivity and specificity values differed depending on whether xMAP (Luminex, Austin, US) or ELISA (Innogenetics, Ghent, Belgium) technology was used. However, no clear pattern was found. Therefore, future research is needed to ascertain universal cut-off values for the CSF biomarkers. Several studies indicate that the standardization of laboratory procedures could contribute to reducing variability in the results (Hansson et al., 2006; Fagan et al., 2011; Mattsson et al., 2011).
Therefore, standardization of methodological aspects is expected to increase the clinical utility of the CSF biomarkers. In this meta-analysis, we demonstrate that several confounding factors are another source of variability in published diagnostic/predictive performance and cut-offs. We show that CSF biomarkers' performance can be improved and heterogeneity reduced by carefully considering these confounding factors. In this regard, future studies should aim to explain how these factors influence the diagnostic and predictive performance of the CSF biomarkers. This need is reinforced by the fact that we could not evaluate 13 of the 18 identified potentially confounding factors given the lack of studies directly addressing these aspects. A related limitation is the small number of studies available for some of the analyses. As a result, certain subgroup meta-analyses could be influenced by some of the other confounding factors. In order to evaluate this, an analysis of coincident studies across factors was performed. Table S5 in Supplementary Material shows that most of the subgroups were rather independent from each other. However, for p-tau and Aβ42/p-tau, studies including follow-up periods >24 months coincided with studies with AD diagnosis at follow-up; and for p-tau and Aβ42/T-tau, studies using xMAP technology coincided with studies including MCI cases older than 70 years (and vice versa only for Aβ42/T-tau: ELISA technology with studies including MCI cases younger than 70 years). Another limitation is that systematic reviews and meta-analyses are essential tools for summarizing evidence accurately and reliably, but might be susceptible to bias if not properly conducted. Following PRISMA recommendations (Liberati et al., 2009; Moher et al., 2010), several strategies were carefully considered in this study to reduce risk of bias related to publication, data availability, and reviewer selection. 
Evidence was rigorously reviewed and the literature search was supplemented with a manual query of relevant studies in order to minimize both publication and reviewer selection bias. Selected studies were carefully examined for clues suggesting that there may be missing results or data. Moreover, assessments were completed independently by more than one reviewer and consensus was required. Regarding the included studies, QUADAS-2 was applied to evaluate risk of bias and applicability of results. It should be noted that "domain 1" indicated a high probability of patient selection bias in six of the included studies, related to the inclusion of samples that were not completely consecutive or random, and to imperfect avoidance of inappropriate exclusions. In particular, patients were normally selected from specialized centers on the basis of availability of CSF data, a procedure not generally performed in all incoming patients. This may have a certain impact on the applicability of the results, although these six studies scored rather well in the other three domains, indicating that the index test, the standard test, and flow and timing are not compromised. Another drawback is that the follow-up period was used as a rough measure of time to AD conversion. Therefore, although it is clear that MCI-C cases in studies with follow-up ≤24 months converted to AD in less than 24 months, it is possible that some MCI-C cases in studies with follow-up >24 months also converted to AD before the threshold of 24 months. Finally, sensitivity and specificity values above 80% were considered indicative of optimal predictive performance according to international recommendations (The Ronald and Nancy Reagan Research Institute of the Alzheimer's Association and the National Institute on Aging Working Group, 1998). 
Higher levels are difficult to achieve given that analyses are derived from clinically diagnosed AD cases, in which the diagnostic accuracy already approximates 85% when validated by the standard pathologic diagnosis at autopsy (Mendez et al., 1992; Victoroff et al., 1995). None of the studies included in this meta-analysis performed postmortem AD confirmation. It is thus necessary to test CSF biomarkers' predictive performance in pathologically confirmed AD patients.
In conclusion, this study contributes to defining several situations in which the CSF biomarkers seem to be clinically useful for predicting conversion from MCI to AD. In particular, a baseline CSF test result indicating pathological Aβ42/p-tau values in MCI patients younger than 70 years confers a moderate increase in the likelihood of developing AD. Moreover, a baseline CSF test result indicating pathological levels of p-tau increases the likelihood of developing AD within the next 24 months. Advancing the knowledge of how different confounding factors influence the diagnostic and predictive performance of the CSF biomarkers is of utmost importance. Such knowledge will help the elaboration of a map of situations where the CSF biomarkers are useful, so that clinicians and researchers know when the new diagnostic criteria for MCI will be successful or otherwise prone to mistakes. This will be crucial when new disease-modifying treatments are available in the near future. Early prediction of MCI conversion to AD is expected to maximize treatment benefit if applied to the right people and before neuronal degeneration is too widespread and patients are already demented. In addition, this has ethical benefits because it is preferable not to treat patients with low risk of AD in trials that could cause side effects. Finally, this will also be important to enrich the samples with pure AD cases, both for research and clinical trials.