Validation of ICMR Neurocognitive Toolbox for Dementia in the Linguistically Diverse Context of India

Objectives: The growing prevalence of dementia, especially in low- and middle-income countries (LMICs), has raised the need for a unified cognitive screening tool that can aid its early detection. The linguistically and educationally diverse population in India contributes to challenges in diagnosis. The present study aimed to assess the validity and diagnostic accuracy of the Indian Council of Medical Research-Neurocognitive Toolbox (ICMR-NCTB), a comprehensive neuropsychological test battery adapted in five languages, for the diagnosis of dementia. Methods: A multidisciplinary group of experts developed the ICMR-NCTB by reviewing existing tools and incorporating culturally appropriate modifications. The finalized tests of the major cognitive domains of attention, executive functions, memory, language, and visuospatial skills were then adapted and translated into five Indian languages: Hindi, Bengali, Telugu, Kannada, and Malayalam. Three hundred fifty-four participants were recruited, including 222 controls and 132 dementia patients. The sensitivity and specificity of the adapted tests were established for the diagnosis of dementia. Results: A significant difference in the mean (median) performance scores between healthy controls and patients with dementia was observed on all tests of the ICMR-NCTB. The area under the curve for the majority of the tests included in the ICMR-NCTB ranged from 0.73 to 1.00, and the sensitivity and specificity of the ICMR-NCTB tests for identifying dementia ranged from 70 to 100% and 70.7 to 100%, respectively, across all five languages. Conclusions: The ICMR-NCTB is a valid instrument to diagnose dementia across five Indian languages, with good diagnostic accuracy. The toolbox was effective in overcoming the challenge of linguistic diversity. The study has wide implications for addressing the problem of a high disease burden and low diagnostic rate of dementia in LMICs like India.


INTRODUCTION
Dementia, a neurocognitive syndrome that affects the ability to perform everyday activities, has become a major health crisis worldwide, and research aimed at reducing its global disease burden is a priority (1, 2). There has been a significant rise in the number of elderly people with dementia, especially in developing regions of the world (3)(4)(5). Of the 47 million people living with dementia globally, about 63% currently live in low- and middle-income countries (LMICs) (5)(6)(7). The global figure is projected to increase further to 82 million by 2030 and 152 million by 2050, particularly in China, India, and Latin America (2,8,9). In India alone, at least 5.29 million people currently live with dementia, and this number is expected to double by 2035 (6). The prevalence of undetected dementia is also high globally, currently estimated at 61.7%, with India and China having a higher proportion than Europe and the USA (10).
There are various barriers to the diagnosis of dementia in LMICs. Major factors include low awareness, inadequate healthcare resources, and scarcity of diagnostic tools that are culturally and linguistically valid (1,8,11). As a result, both under-detection and overdiagnosis of dementia are possible (11,12). Hence, it is important to develop reliable diagnostic tools and instruments that are culturally, educationally, and linguistically valid and can help in early and accurate diagnosis (12,13). Additionally, the use of diagnostic tests that can be harmonized with future global studies is crucial. There have been some efforts toward developing comprehensive neuropsychological test batteries for use in various languages, such as the 10/66 global dementia studies (14), the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) neuropsychological battery (15), the NIMHANS neuropsychological battery for the elderly (16), the Spanish English Neuropsychological Assessment Scales (SENAS) (17), and the international harmonization standards proposed by the National Institute of Neurological Disorders and Stroke (NINDS) and the Canadian Stroke Network (CSN) (18,19). However, except for the studies by the 10/66 dementia research group, the majority have been conducted in the developed world.
A significant amount of variance has been detected in the prevalence rate of dementia in India, a country with a large and diverse population (6, 7, 20-25). Variability in sociodemographic factors; genetic, protective, and risk factors; and methodological factors could account for these differences. To unravel the complex nature of dementia, research priorities to examine such large variations in dementia detection should be set, particularly in LMICs (26).
Variability in dementia-screening instruments and the use of different diagnostic methods and criteria contribute significantly to regional variability in reported dementia prevalence. To accurately establish the incidence or prevalence rates of dementia that are comparable across diverse populations, it is crucial that diagnostic instruments are harmonized by developing standardized procedures that are sensitive toward linguistic, educational, and cultural variability in populations (3). Additionally, efforts should also be made to increase the availability and ease of accessibility of these diagnostic instruments across the societies that are limited in resources. Another major barrier to effective management is a delay in early detection and treatment due to the scarcity of skilled professionals. Training more personnel on standardized diagnostic tools will also be necessary for effective management. These challenges exist not only for the diagnosis, treatment, and care of dementia but also for a majority of other mental health conditions (27).
To overcome these major barriers, a multidisciplinary group of neurologists, neuropsychologists, speech and language pathologists, and experts from related fields collaborated on a project funded by the Indian Council of Medical Research (ICMR) (http://icmr.nic.in). The group focused on the development, adaptation, and validation of a comprehensive cognitive and functional test protocol, the ICMR-Neurocognitive Toolbox (ICMR-NCTB), in five Indian languages (Hindi, Bengali, Telugu, Kannada, and Malayalam) with sensitivity toward different literacy levels across India (28). This test battery was developed to screen for and diagnose dementia and mild cognitive impairment (MCI) in the early stages across the country (29), and to be suitable for global collaborative research in cognitive disorders (28). The ICMR-NCTB has been validated for the diagnosis of MCI in the Indian context and demonstrated a good sensitivity of 81.1% and specificity of 88.8% to diagnose all-cause MCI (29). Its usefulness for diagnosing dementia in the context of India remains to be established (30). Against the background of a high burden of dementia due to Alzheimer's disease (AD) and vascular dementia (VaD) in India, the present study aimed to determine the validity of the ICMR-NCTB for the diagnosis of dementia in the context of linguistic heterogeneity in India.

METHODS
The adaptation and validation involved a systematic process that included a review of existing international and national efforts at standardizing dementia diagnosis to identify culturally appropriate tests for the Indian context, adaptation for its use in five Indian languages and for both literates and illiterates, and validation in a cohort of individuals with normal cognition, mild cognitive impairment, and dementia across the multiple centers. This process has been detailed in an earlier report (28).
The ICMR-NCTB consisted of a range of tests that evaluate the major cognitive domains:  (38), and RAND Short Form Health Survey (RAND SF-36) (39). A uniform protocol for the diagnosis of normal cognition and dementia due to neurodegenerative diseases was followed in all five centers (28).
Patients were recruited from the outpatient services of the neurology, geriatric, and internal medicine clinics of the participating hospitals, and healthy controls were randomly drawn from senior citizen centers, community outreach services, and healthy relatives of patients in the clinics. Detailed demographic, cognitive, and medical histories were collected to determine eligibility for participation. Healthy controls were recruited if they fulfilled the following inclusion criteria: age >40 years and consent to participate; no history of head injury, infections, stroke, or other neurological, systemic, medical, or psychiatric disorders that can cause cognitive impairment; and no significant hearing or visual impairment that could interfere with testing. A standard, harmonized case record form was used to collect sociodemographic information and neurocognitive and functional data.
An experienced cognitive neurologist evaluated all subjects, and experienced psychologists administered the gold standard tests to all participants. Participants who had no subjective cognitive complaints and scored normally on Addenbrooke's Cognitive Examination-III (ACE-III), Clinical Dementia Rating (CDR), Rey Auditory Verbal Learning Test (RAVLT), and Color Trails Test (CTT) were considered healthy controls (28,29). Dementia was diagnosed based on clinical evaluation and the presence of impaired cognitive functions, as indicated by scores on ACE-III (40) and CDR (28,41), using the standard DSM-IV criteria (42). Patients were further classified into dementia subtypes: AD was diagnosed in patients who fulfilled the NIA-AA criteria for probable and possible AD (43), vascular dementia was diagnosed in patients who fulfilled the NINDS-AIREN criteria (44), and frontotemporal dementia (FTD) was diagnosed based on the criteria of Rascovsky et al. (45). Persons diagnosed with MCI, based on the modified Petersen criteria (46), were excluded from this study. The recruited participants were subsequently administered the complete ICMR-NCTB by a team of psychologists and clinicians who were blinded to the diagnosis. The research ethics committees of all participating centers approved the study, and consent was obtained from all participants and their family caregivers.

Statistical Analysis
To compare the demographic data and neuropsychological test scores of patients with dementia and controls, an independent-samples t-test for normally distributed continuous data or a Mann-Whitney U-test for non-normal data, χ2 tests or Fisher's exact tests for categorical data, and a trend test for ordinal data were used as appropriate. Test scores are reported as mean and standard deviation, except for TMT A & B scores, which are reported as median and interquartile range because of the variability of the scores in the dementia group. The external validity of the battery was determined from the receiver operating characteristic (ROC) curve using the area under the curve (AUC). The optimum cutoff scores were established with the corresponding sensitivity and specificity levels. All statistical analyses were performed using SPSS Statistics for Windows, version 23.0.
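As an illustration of the AUC used above, the following minimal Python sketch computes it directly from two groups of scores. It is not the authors' SPSS procedure. The function name `auc_from_scores` and the score lists are hypothetical, and the sketch assumes a test on which higher scores indicate better performance (for timed tests such as TMT, the comparison would be reversed). It uses the fact that the AUC equals the probability that a randomly chosen control outscores a randomly chosen patient, with ties counted as one half, which is the Mann-Whitney U statistic divided by the number of control-patient pairs.

```python
def auc_from_scores(control_scores, patient_scores):
    """Return the AUC as the share of control-patient pairs in which
    the control scores higher; tied pairs count as half a win."""
    wins = 0.0
    for c in control_scores:
        for p in patient_scores:
            if c > p:
                wins += 1.0
            elif c == p:
                wins += 0.5
    return wins / (len(control_scores) * len(patient_scores))


# Hypothetical scores for illustration only (not study data).
controls = [14, 12, 15, 11, 13, 16]
patients = [7, 9, 11, 6, 10]

auc = auc_from_scores(controls, patients)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.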

RESULTS
A total of 1,141 participants were recruited that included 991 controls and 185 patients with dementia. After matching the groups for age, education, and gender, 354 participants (222 controls and 132 patients with dementia) were included for further analysis. The patients were diagnosed with Alzheimer's disease (AD), vascular dementia (VaD), or frontotemporal dementia (FTD): AD−65, VaD−45, and FTD−22. The mean age of the healthy controls and patients with dementia was 65 years and 66 years, respectively. Participants were from both urban and rural backgrounds: 71% of controls and 77% of patients were urban dwellers. Of the 132 patients with dementia, 61 (46.30%) had very mild dementia, 47 (35.60%) mild dementia, 18 (13.60%) moderate dementia, and 6 (4.50%) severe dementia on the Clinical Dementia Rating Scale (CDR). Because of the heterogeneity in demographic characteristics in the overall cohort, language-wise analysis was conducted (Hindi: controls−40, dementia−20; Bengali: controls−45, dementia−29; Telugu: controls−45, dementia−33; Kannada: controls−57, dementia−15; and Malayalam: controls−35, dementia−35). The demographic characteristics and ACE-III scores of healthy controls and patients with dementia are presented in Table 1. The healthy control and dementia groups were matched for age, education, and gender in all the language groups. Healthy controls performed significantly better on ACE-III than patients with dementia [t (330) = 18.87, p < 0.001] in all five language groups. A significant difference in the mean (median) scores between healthy controls and patients with dementia was observed on all the tests of the ICMR-NCTB (Table 2). The ROC analysis revealed that the majority of the ICMR-NCTB tests had good discriminating power in differentiating cognitively impaired participants from the normal healthy group across the five languages (see Table 3 and Figures 1, 2).

DISCUSSION
In the present study, we assessed the diagnostic accuracy of the tests included in the ICMR Neurocognitive Toolbox in the detection of dementia across five Indian languages (Hindi, Bengali, Telugu, Kannada, and Malayalam). The ICMR-NCTB met standardized test requirements in all five languages, which indicates that the test adaptation and standardization were successful across languages. This study confirms the utility of the majority of the tests included in the ICMR-NCTB as effective instruments for the diagnosis of dementia, with relatively high sensitivity and specificity in a linguistically diverse context. Overall, the ICMR-NCTB appears promising in terms of validity based upon standard criteria for evaluating a dementia diagnostic test in the Indian context (28,47).
Dementia is one of the most important independent contributors to disability in the elderly, especially in low- and middle-income countries (LMICs), where the resources to diagnose and manage dementia are limited. While specialized services for dementia are increasingly available in high-income countries, such facilities are lacking in LMICs. In addition, primary care physicians in developing countries do not receive suitable training to diagnose dementia and its subtypes. The gap in the diagnosis of dementia is further widened by cross-cultural differences in understanding dementia that arise from linguistic and educational diversity. Therefore, a valid test battery that can be applied by clinicians and neuropsychologists in diagnosing dementia is crucial in the linguistically and educationally diverse Indian context.
The development and validation of a comprehensive NCTB protocol for the diagnosis of dementia, harmonized in five different languages, was an important facet of this study. It was established by following a common methodology that was applied on a large cohort consisting of persons with diverse linguistic profiles, which enabled it to be effectively utilized to detect cognitive deficits in early stages of dementia and help in reducing the variability in clinical diagnosis in hospitals and clinics across India. The main finding in our study was that the tests included in the ICMR-NCTB were found to be sensitive and specific in the identification of dementia in LMICs in all of the five Indian languages.
The external validity of each individual test included in the ICMR-NCTB was determined from the receiver operating characteristic (ROC) curve using the area under the curve (AUC), and optimum cutoff scores were established with the corresponding sensitivity and specificity levels.
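The way a cutoff score and its sensitivity and specificity relate to one another can be sketched as follows. The criterion used to choose the optimum cutoff is not stated here, so this sketch assumes Youden's J (sensitivity + specificity − 1) as one common choice. The function name `best_cutoff` and the score lists are hypothetical, and a score at or below the cutoff is treated as a positive screen for dementia.

```python
def best_cutoff(control_scores, patient_scores):
    """Try every observed score as a cutoff and return the tuple
    (cutoff, sensitivity, specificity, youden_j) with the largest J.
    A score <= cutoff counts as a positive screen (assumed convention)."""
    best = None
    for cutoff in sorted(set(control_scores) | set(patient_scores)):
        # Sensitivity: share of patients correctly screened positive.
        sensitivity = sum(p <= cutoff for p in patient_scores) / len(patient_scores)
        # Specificity: share of controls correctly screened negative.
        specificity = sum(c > cutoff for c in control_scores) / len(control_scores)
        youden_j = sensitivity + specificity - 1
        if best is None or youden_j > best[3]:
            best = (cutoff, sensitivity, specificity, youden_j)
    return best


# Hypothetical scores for illustration only (not study data).
controls = [14, 12, 15, 11, 13, 16]
patients = [7, 9, 11, 6, 10]

cutoff, sens, spec, j = best_cutoff(controls, patients)
print(f"cutoff <= {cutoff}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Other criteria, such as fixing a minimum sensitivity, would generally select a different cutoff from the same data.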
Our study showed that the Trail Making Test-A, a test of attention, included in the ICMR-NCTB accurately differentiated patients with dementia from healthy control participants with high sensitivity ranging from 71 to 93% and specificity ranging  (48). Category fluency (animals) showed high sensitivity (86-100%) and specificity (74-98%) at optimal cutoff points ranging from 8 to 11, except in Bengali, where the sensitivity of the category fluency task was moderate (66%) with good specificity (83%) at an optimal cutoff point of 11. This finding is in agreement with the verbal fluency test included in the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) neuropsychological battery, which showed a sensitivity of 75% and a specificity of 74% at an optimal cutoff point of <17 for the category fluency task (15). The lower category fluency cutoff score in the ICMR-NCTB compared to the CERAD neuropsychological battery might be due to the inclusion of patients with moderate and severe dementia. The NIMHANS Neuropsychological Battery for the Elderly demonstrated good discriminability for the animal fluency task (AUC = 0.99, 95% CI [0.96, 0.99]) (16), which was comparable to the discriminability of category fluency in our study (AUC = 0.77-0.99, 95% CI [0.58, 1.00]).
The episodic memory tests of the ICMR-NCTB (verbal learning test-TL and verbal learning test-DR) accurately differentiated patients with dementia from the healthy control group, which is consistent with the diagnostic criteria for the majority of dementia subtypes, including AD (49), that highlight episodic memory impairment in patients with dementia. The verbal learning test-DR showed high sensitivity and specificity ranging from 71 to 90% and 83 to 91%, respectively, with optimal cutoff points ranging from 2 to 3, and the verbal learning test-TL showed a sensitivity of 71-100% and specificity of 80-95% at optimal cutoff points ranging from 11 to 16. The high discriminability of the episodic memory tests of the ICMR-NCTB compares well with the word list DR (sensitivity = 94%, specificity = 85%, cutoff = 5) and word list learning (sensitivity = 90%, specificity = 83%, cutoff = 17) of the CERAD neuropsychological battery (15). The cutoff scores of the verbal learning test are consistent with studies done in LMICs like Brazil. A Brazilian epidemiological study derived a cutoff score of 3 in the literate group and 1 in the illiterate group for delayed recall of a word list test from the CERAD neuropsychological battery (50). A different study from India also established a cutoff score of 3 for the delayed verbal memory test from the Kolkata cognitive screening battery (33), which is also consistent with our study. The word list-delayed recall of the NIMHANS neuropsychological battery for the elderly (AUC = 0.99; 95% CI [0.97, 0.99]) revealed the highest discriminability (16), which is comparable with the verbal learning test-DR (AUC = 0.79-0.94; 95% CI [0.67, 0.99]).
The Picture Naming Test included in the ICMR-NCTB showed high sensitivity (71-100%) and specificity (73-100%) at optimal cutoff points ranging from 69 to 81 (maximum score = 90), which compares favorably with the naming test of the CERAD neuropsychological battery, with a sensitivity of 68% and specificity of 76% at an optimal cutoff point of 12 (maximum score = 15) (15).
The high sensitivity and specificity of the majority of the tests included in the ICMR-NCTB for diagnosing dementia compare favorably with those of other cognitive test batteries, such as the Spanish and English Neuropsychological Assessment Scales (SENAS) battery (51), which showed 80% sensitivity and specificity for a combination of word list learning and object naming to diagnose dementia. The sensitivity of the diagnostic algorithm against clinically diagnosed dementia in the widely used 10/66 pilot samples was 94%, and the specificity was 97% in people with high education and 93% in individuals with low education (52), which is comparable with the sensitivity and specificity of the ICMR-NCTB tests.
Tests included in the NINDS-CSN battery include Animal Naming Test (ANT), Wechsler Adult Intelligence Scale (WAIS)
An important feature of the study is that it is unique in comprehensively addressing the validity of each neuropsychological test included in the ICMR toolbox in a linguistically, educationally, and culturally heterogeneous population. A further strength of the ICMR-NCTB is that tests for all major cognitive domains of attention/executive function, language, memory, and visuospatial functions are incorporated, and optimum cutoff points with the corresponding sensitivity and specificity of the various cognitive domains are provided separately for the five Indian languages. This is crucial for the diagnosis of the dementia subtypes AD, VaD, and FTD, which have characteristic cognitive profiles. While AD is a disorder of memory, especially in the early stages (54), VaD is characterized by prominent executive dysfunction (55), and frontotemporal dementia syndromes present with language and/or executive function impairment (56). The advantage of including tests of all major cognitive domains in the ICMR-NCTB is reflected in the relatively high diagnostic sensitivity and specificity of the majority of the cognitive domains in this dementia cohort consisting of multiple subtypes. While the study has established successful discriminability between dementia and controls across all tests, the most efficacious combination of measures discriminating healthy controls from patients with dementia is yet to be determined. Certain limitations were identified. (i) The study was conducted in a literate population, and the patients with dementia studied were relatively young. (ii) We had a relatively small sample in the Kannada dementia group, which might be one of the reasons for the high sensitivity and specificity of the ICMR-NCTB tests in Kannada. (iii) Differing proportions of dementia subtypes in our dementia cohort might have led to the differences in age across dementia patients in the five language groups. (iv) We did not have enough participants to establish diagnostic validity separately for the subtypes of dementia. (v) For clinical and research generalizability, the test battery will need to be adapted to the illiterate group and studied in larger numbers in the future. (vi) The findings of the current study are applicable only to dementia cohorts seen in memory clinics and specialized centers, as the study was conducted in academic medical centers. (vii) There was variation in the cutoff scores across languages for the Modified Taylor Complex Figure test (MTCF) copy and delayed recall tests, as the sample size was not adequate owing to the inability of the low-educated participants and advanced dementia patients to perform the test. Therefore, the MTCF test could not be validated in the current study, and the tool might not be applicable to the low-literate population in the Indian context. Further validation of the ICMR-NCTB in larger and more diverse clinical and community populations is planned for the next phase of the study.
To conclude, we successfully validated a cognitive test battery in five different languages that is harmonized culturally and linguistically to diagnose dementia in India. The high specificity and sensitivity of the tests included in the ICMR-NCTB highlight its ability to detect dementia across languages. Our study thus establishes a benchmark for dementia research in India and is expected to provide a valuable tool for clinical practice and for multicentric preventive and therapeutic research in a socio-linguistically diverse context.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available upon request, without undue reservation.
Requests to access the datasets should be directed to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Research Ethics Committee of Nizam's

AUTHOR CONTRIBUTIONS
MV contributed toward project implementation, data collection, analyses, and manuscript writing. MT, AN, and SA contributed toward tool development and adaptation, project implementation, data collection, analyses, and manuscript writing. APap contributed toward data collection, data analyses, and manuscript writing. FV contributed toward statistical and data analyses. RD, MS, AS, AG, RNM, GI, RM, JN, SR, PS, and RV contributed toward tool development and adaptation, project implementation, data collection, analyses, and manuscript editing. APau, MP, US, and SK are expert panelists who contributed toward tool development and adaptation, project implementation, and manuscript editing. FA, GD, TM, SM, RH, JS, RK, AK, RN, LS, and YV contributed toward project implementation, data collection, and manuscript editing. All authors contributed to the article and approved the submitted version.