# INDIVIDUALIZED ASSESSMENT OF BRAIN AGING ACROSS THE LIFESPAN: APPLICATIONS IN HEALTH AND DISEASE

EDITED BY: Katja Franke, Christian Gaser, Nicolas Cherbuin and James H. Cole

PUBLISHED IN: Frontiers in Aging Neuroscience

### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-902-1 DOI 10.3389/978-2-88963-902-1

Topic Editors:

Katja Franke, University Hospital Jena, Germany; Christian Gaser, Friedrich Schiller University Jena, Germany; Nicolas Cherbuin, Australian National University, Australia; James H. Cole, University College London, United Kingdom

Citation: Franke, K., Gaser, C., Cherbuin, N., Cole, J. H., eds. (2020). Individualized Assessment of Brain Aging Across the Lifespan: Applications in Health and Disease. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-902-1

# Table of Contents

*Premature Brain Aging in Baboons Resulting From Moderate Fetal Undernutrition*

Katja Franke, Geoffrey D. Clarke, Robert Dahnke, Christian Gaser, Anderson H. Kuo, Cun Li, Matthias Schwab and Peter W. Nathanielsz

*MRI Visual Ratings of Brain Atrophy and White Matter Hyperintensities across the Spectrum of Cognitive Decline are Differently Affected by Age and Diagnosis*

Hanneke F. M. Rhodius-Meester, Marije R. Benedictus, Mike P. Wattjes, Frederik Barkhof, Philip Scheltens, Majon Muller and Wiesje M. van der Flier

Janet P. Trammell, Priscilla G. MacRae, Greta Davis, Dylan Bergstedt and Ariana E. Anderson

*Physiological Aging Influence on Brain Hemodynamic Activity During Task-Switching: A fNIRS Study*

Roberta Vasta, Simone Cutini, Antonio Cerasa, Vera Gramigna, Giuseppe Olivadese, Gennarina Arabia and Aldo Quattrone

*Bayesian Optimization for Neuroimaging Pre-processing in Brain Age Classification and Prediction*

Jenessa Lancaster, Romy Lorenz, Rob Leech and James H. Cole

*Gray Matter Network Disruptions and Regional Amyloid Beta in Cognitively Normal Adults*

Mara ten Kate, Pieter Jelle Visser, Hovagim Bakardjian, Frederik Barkhof, Sietske A. M. Sikkes, Wiesje M. van der Flier, Philip Scheltens, Harald Hampel, Marie-Odile Habert, Bruno Dubois and Betty M. Tijms for the INSIGHT-preAD study group

*Brain Aging and APOE ε4 Interact to Reveal Potential Neuronal Compensation in Healthy Older Adults*

Elisa Scheller, Lena V. Schumacher, Jessica Peter, Jacob Lahr, Julius Wehrle, Christoph P. Kaller, Christian Gaser and Stefan Klöppel

Natalie M. Zahr

Seyul Kwak, Hairin Kim, Jeanyung Chey and Yoosik Youm

*Predicting Age From Brain EEG Signals—A Machine Learning Approach*

Obada Al Zoubi, Chung Ki Wong, Rayus T. Kuplicki, Hung-wen Yeh, Ahmad Mayeli, Hazem Refai, Martin Paulus and Jerzy Bodurka

*Biological Brain Age Prediction Using Cortical Thickness Data: A Large Scale Cohort Study*

Habtamu M. Aycheh, Joon-Kyung Seong, Jeong-Hyeon Shin, Duk L. Na, Byungkon Kang, Sang W. Seo and Kyung-Ah Sohn

*Progressive Decline in Gray and White Matter Integrity in* de novo *Parkinson's Disease: An Analysis of Longitudinal Parkinson Progression Markers Initiative Diffusion Tensor Imaging Data*

Kirsten I. Taylor, Fabio Sambataro, Frank Boess, Alessandro Bertolino and Juergen Dukart

*A Nonlinear Simulation Framework Supports Adjusting for Age When Analyzing BrainAGE*

Trang T. Le, Rayus T. Kuplicki, Brett A. McKinney, Hung-Wen Yeh, Wesley K. Thompson, Martin P. Paulus and Tulsa 1000 Investigators

*Pre-aging of the Olfactory Bulb in Major Depression With High Comorbidity of Mental Disorders*

Fabian Rottstaedt, Kerstin Weidner, Thomas Hummel and Ilona Croy

# Premature Brain Aging in Baboons Resulting from Moderate Fetal Undernutrition

Katja Franke<sup>1</sup>\*, Geoffrey D. Clarke<sup>2</sup>, Robert Dahnke<sup>1</sup>, Christian Gaser<sup>1,3</sup>, Anderson H. Kuo<sup>2</sup>, Cun Li<sup>4,5</sup>, Matthias Schwab<sup>6</sup> and Peter W. Nathanielsz<sup>4,5</sup>

<sup>1</sup> Structural Brain Mapping Group, Department of Neurology, University Hospital Jena, Jena, Germany, <sup>2</sup> Radiology, University of Texas Health Science Center San Antonio, San Antonio, TX, USA, <sup>3</sup> Department of Psychiatry, University Hospital Jena, Jena, Germany, <sup>4</sup> Texas Pregnancy and Life Course Health Research Center, Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA, <sup>5</sup> Animal Science, University of Wyoming, Laramie, WY, USA, <sup>6</sup> Department of Neurology, University Hospital Jena, Jena, Germany

Contrary to the known benefits of moderate dietary reduction during adulthood for life span and health, maternal nutrient reduction during pregnancy is thought to affect the developing brain, probably resulting in impaired brain structure and function throughout life. Decreased fetal nutrient delivery is widespread in both developing and developed countries, caused by poverty and natural disasters, but also by maternal dieting, teenage pregnancy, pregnancy in women over 35 years of age, placental insufficiency, or multiple pregnancies. Compromised development of fetal cerebral structures has already been shown in our baboon model of moderate maternal nutrient reduction. The present study was designed to follow up and evaluate the effects of moderate maternal nutrient reduction on individual brain aging in the baboon during young adulthood (4–7 years; human equivalent 14–24 years), applying a novel, non-invasive neuroimaging aging biomarker. The study reveals premature brain aging of +2.7 years (p < 0.01) in the female baboon exposed to fetal undernutrition. The effects of moderate maternal nutrient reduction on individual brain aging occurred in the absence of fetal growth restriction or marked maternal weight reduction at birth, which stresses the significance of early nutritional conditions in life-long developmental programming. This non-invasive MRI biomarker allows further longitudinal in vivo tracking of individual brain aging trajectories to assess the life-long effects of developmental and environmental influences in programming paradigms, aiding preventive and curative treatments for cerebral atrophy in experimental animal models and humans.

Keywords: BrainAGE, developmental programming, in vivo, maternal nutrient restriction (MNR), machine learning, non-human primates, magnetic resonance imaging (MRI)

### Edited by:

Aurel Popa-Wagner, University of Rostock, Germany

### Reviewed by:

Ana M. Coto-Montes, Universidad de Oviedo Mieres, Spain; Daniel Ortuño-Sahagún, Centro Universitario de Ciencias de la Salud, Mexico

\*Correspondence:

Katja Franke katja.franke@uni-jena.de

Received: 20 December 2016; Accepted: 20 March 2017; Published: 11 April 2017

### Citation:

Franke K, Clarke GD, Dahnke R, Gaser C, Kuo AH, Li C, Schwab M and Nathanielsz PW (2017) Premature Brain Aging in Baboons Resulting from Moderate Fetal Undernutrition. Front. Aging Neurosci. 9:92. doi: 10.3389/fnagi.2017.00092

## INTRODUCTION

Although moderate dietary restriction during adulthood appears to lengthen lifespan (Fontana et al., 2010), dietary restriction during prenatal life has been clearly demonstrated to have the opposite effect, i.e., being related to an altered, suboptimal development of structure and function of multiple organ systems, a shortened lifespan, and increased prevalence of chronic diseases in later life (Entringer et al., 2012; Schuurmans and Kurrasch, 2013; Tarry-Adkins and Ozanne, 2014; Rando and Simmons, 2015; Zambrano et al., 2015), as well as permanent impairments in brain structure and function (Morgane et al., 1993; Morley and Lucas, 1997; Olness, 2003; Grantham-McGregor and Baker-Henningham, 2005; Wainwright and Colombo, 2006; Walker et al., 2007; Benton and ILSI Europe, 2008; Antonow-Schlorke et al., 2011; Rodriguez et al., 2012; Keenan et al., 2013; Muller et al., 2014). Still, fetal undernutrition due to decreased nutrient delivery and micronutrient deficiency is a worldwide societal challenge, with a multiplicity of causes, like poverty, natural disasters, war, and cultural habits, but also maternal dieting, teenage pregnancies, pregnancies in women over 35 years of age, women suffering from hyperemesis gravidarum or placental insufficiency, as well as in multiple pregnancies (Black et al., 2008; Baker et al., 2009; Beard et al., 2009; Roseboom et al., 2011; Raznahan et al., 2012; Zhang et al., 2015).

However, human retrospective studies are subject to lifestyle and environmental confounds and do not readily allow isolation and control of individual variables presumed to cause specific long-term outcomes, such as variable degrees of prenatal or postnatal nutrition, maternal stress, etc. (Symonds et al., 2000; Piras et al., 2014). Animal models that introduce controlled perturbations are required to determine, quantify, and understand the causal relationships between perinatal nutrient delivery and life-long effects on brain maturation and aging. Developmental programming studies have been mainly conducted in polytocous, altricial rodents, i.e., species with substantially different trajectories of fetal and neonatal brain development from monotocous, precocial mammals, including humans (Ganu et al., 2012; Fontana and Partridge, 2015). Non-human primates have many similarities in physiology, neuroanatomy, reproduction, development, cognition, and social complexity to humans (Vandeberg et al., 2009; Phillips et al., 2014). The baboon is an Old World primate and is the closest available species for relating findings to human programming in terms of reproduction, developmental physiology, gene function, or brain structure (Vandeberg et al., 2009; Atkinson et al., 2015).

To enable translational studies of the effects of malnutrition in humans, we have developed a baboon model of 30% reduction in global maternal nutrition during pregnancy, while controlling for all other psychosocial stressors. In the maternal nutrient restriction (MNR) fetus, we have already shown an altered trajectory of brain development (Antonow-Schlorke et al., 2011). Subsequently, the adolescent MNR baboon offspring showed altered postnatal cognitive and behavioral performances at 3.3 years of age (human equivalent 11.5 years) (Rodriguez et al., 2012; Keenan et al., 2013). To further track the life-long consequences of MNR in neuroanatomical maturation and aging, we aimed to develop an in vivo, non-invasive biomarker for brain aging that captures individual deviations in the MNR offspring and that translates to, and is comparable with, humans. Therefore, this study utilizes a baboon-specific adaptation of our BrainAGE method, recently developed for modeling brain aging in human samples. The BrainAGE method is based on a fully automatic preprocessing pipeline for structural in vivo brain magnetic resonance imaging (MRI) data and uses pattern recognition methods to evaluate individual brain aging (Franke et al., 2010, 2012). In humans, it has already been validated and applied in several studies, indicating subtle deviations in age-related brain structure due to various health and lifestyle conditions, with premature brain aging being related to cognitive decline and clinical symptoms (Franke et al., 2012, 2014; Gaser et al., 2013).

For the present study, we first developed a novel, fully automatic preprocessing pipeline for baboon brain MRI data and constructed sex-specific reference curves of baboon brain tissue volumes across the adult lifespan, based on non-invasive in vivo MRI data from 29 control subjects (aged 4–22 years; human equivalent 14–77 years). Second, we present a species-specific adaptation of our BrainAGE method based on pattern recognition methods to evaluate individual brain age in the baboon. Finally, we employed this non-invasive, non-terminal in vivo biomarker to study structural brain aging in MNR and control offspring at about 5 years of age (human equivalent 17.5 years). We hypothesized that moderate MNR during pregnancy would lead to premature neuroanatomical aging in the young adult offspring. In line with the sexual dimorphism hypothesis in Developmental Programming, we hypothesized that effects of decreased fetal nutrient delivery on individual brain aging would differ between male and female offspring (Aiken and Ozanne, 2013).

## MATERIALS AND METHODS

### Subjects and MRI Scanning

The study included two samples. The lifespan-sample, which was used to analyze changes in brain tissue across adulthood as well as to build and test the baboon-specific brain age estimation model, included 29 (15 female) healthy control subjects (Papio hamadryas), aged 4–22 years (mean age 9.5 ± 4.9 years) (**Table 1**), which is equivalent to 14–77 years in humans. A second sample of 11 subjects (5 female) formed the experimental group of subjects with fetal undernutrition due to MNR (see Production of MNR and CTR Offspring). Twelve same-aged subjects (5 female) from the lifespan-sample were included in the control (CTR) group. At the time of MRI data acquisition, MNR and CTR subjects were aged 4–7 years (**Table 2**), which is equivalent to 14–24 years in humans. Each subject was scanned on a 3 Tesla whole-body MRI scanner (TIM Trio, Siemens Medical Solutions, Malvern, PA), using a T1-weighted sequence.

### Production of MNR and CTR Offspring

Female baboons were housed in harem groups of 16 females and one vasectomized male at the Southwest National Primate Research Center at San Antonio, Texas, USA. Groups of mothers that eventually gave birth to the MNR and CTR offspring were socialized in the presence of a vasectomized male while eating Purina Monkey Diet 5038 (Purina, St. Louis, Missouri, USA) containing crude protein not less than 15%, crude fat not less than 5%, crude fiber not more than 6%, ash not more than 5% and added minerals not more than 3% ad libitum. The management of ad libitum feeding and nutrient reduction has previously been described in detail (Keenan et al., 2013). Following acclimation, the vasectomized male was replaced by a proven breeder male. Pregnancy was timed by following sex-skin turgescence (Hendrickx and Peterson, 1997). Following confirmation of pregnancy by ultrasound at 30 days gestation, baboons with moderate MNR received 70% of the average daily amount of feed eaten by the ad libitum control females on a weight-adjusted basis at the same gestational age. Water was continuously available. Mothers were of similar age (mean age ± SD: 11.5 ± 0.51 years) and morphometric phenotype. One cage was randomly selected for ad libitum feeding on normal primate feed pellets and one cage for mothers fed 70% of the feed eaten by control females on a weight-adjusted basis from the time of diagnosis of pregnancy (∼30 days gestation) for the rest of pregnancy and through lactation. Initially, 40 adult females were recruited to the study, with 18 mothers-to-be being placed on the reduced diet and 22 on ad libitum feed. All mothers delivered spontaneously at full term. In the whole cohort, male CTR offspring of ad libitum fed mothers weighed 930 ± 40 g (mean ± SEM; n = 10) at birth, male MNR offspring weighed 820 ± 39 g (n = 9; p < 0.05). Female CTR neonates weighed 820 ± 40 g (n = 12), female MNR neonates weighed 730 ± 40 g (n = 9; p < 0.05). Weights of the randomly chosen MRI subsample used for the present study were within the same range as the whole sample (**Table 1**). All offspring were reared with their mothers in group-based housing until 9 months of age. Juvenile offspring were transferred to the University of Texas Health Science Center at San Antonio in cohorts of 5–7 subjects over a 9-month period and housed individually in the visual and auditory presence of ≥ 6 other peers in the Laboratory Animal Resources facility.

TABLE 1 | Morphologic data in the lifespan-sample.

Data are displayed as mean ± standard deviation (SD). Bold type indicates statistical significance.

All animal procedures were performed in accordance with accepted standards of humane animal care approved by the Texas Biomedical Research Institute and University of Texas Health Science Center at San Antonio Institutional Animal Care and Use Committees and conducted in facilities approved by the Association for Assessment and Accreditation of Laboratory Animal Care International Inc (AAALAC).

### Basic Concept of the Brain Age Estimation Framework

We recently developed the brain age gap estimation (BrainAGE) framework to model healthy human brain aging (Franke et al., 2010). Its basic concept is the aggregation of the complex, multidimensional aging pattern across the whole brain into one single value, i.e., the estimated brain age. In human samples, the BrainAGE framework accurately and reliably estimates the age of individual brains with minimal preprocessing and parameter optimization using anatomical MRI scans (Franke et al., 2010, 2012). It also has the potential to identify pathological brain aging on an individual level (Franke et al., 2012; Gaser et al., 2013).

In general, the process includes three steps (**Figure 1**). First, the raw T1-weighted image data are preprocessed with a standardized voxel-based morphometry (VBM) pipeline. Second, data reduction is performed on the preprocessed MRI data in order to reduce computational costs, to avoid severe overfitting, as well as to produce a robust and widely applicable age estimation model. Third, relevance vector regression (RVR) is utilized to capture the multidimensional aging patterns across the whole brain in order to model brain aging over a wide age range and to subsequently estimate individual brain ages. In the present study, the brain age estimation framework includes a novel baboon-specific MRI preprocessing pipeline (see Preprocessing of MRI Data) as well as building the new species-specific model of healthy brain aging in baboons (see The BrainAGE Method and Its Baboon-specific Adaptation).

### Preprocessing of MRI Data

T1-weighted MR image data were acquired using a 3D acquisition scheme with in-plane images in the sagittal orientation in order to maximize isotropic resolution while avoiding aliasing artifacts. This acquisition strategy necessitated the use of slice-based inhomogeneity correction to remove MR protocol dependent slice artifacts (**Figure 2A**) (Van Leemput et al., 1999; Cohen et al., 2000). Then, a spatial adaptive nonlocal means (SANLM) filter (Christidis and Cox, 2006) was applied to reduce high-frequency noise. For segmentation and spatial registration, a baboon-specific tissue probability map (TPM) and a "Diffeomorphic Anatomical Registration using Exponentiated Lie algebra" (DARTEL) template (Ashburner, 2007) were required. The template was created in an iterative process based on a rescaled human template (**Figure 2B**). For initialization, an affine transformation was used to scale the human SPM12 TPM and the VBM12 DARTEL template map to the expected brain size of baboons. The image resolution of the template was changed to an isotropic voxel size of 0.75 mm. For each iteration step, the resulting tissue maps were averaged and smoothed with a full-width-at-half-maximum (FWHM) kernel of 2 mm to estimate an affine registration in order to create a new TPM, T1 average map and brain mask. For averaging data, the median function was used to reduce distortions by outliers and failed processing. Iterations were stopped when the change compared to the previous template fell below a pre-defined threshold, resulting in the final segmentation (**Figure 2C**).
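The reason for averaging with the median rather than the mean can be illustrated with a minimal sketch. It assumes NumPy and uses made-up gray matter probabilities at four voxels (hypothetical values, not data from the study); the last "subject" mimics a failed segmentation. The actual template generation ran in MATLAB with SPM12/CAT12, so this shows only the robustness argument, not the pipeline.

```python
import numpy as np

# Hypothetical GM probabilities at four voxels for five subjects.
# The last row mimics a failed segmentation (all zeros).
tissue_maps = np.array([
    [0.80, 0.60, 0.20, 0.10],
    [0.82, 0.58, 0.22, 0.12],
    [0.78, 0.62, 0.18, 0.08],
    [0.81, 0.61, 0.21, 0.11],
    [0.00, 0.00, 0.00, 0.00],  # outlier: failed processing
])

# Voxel-wise averages across subjects.
mean_template = tissue_maps.mean(axis=0)
median_template = np.median(tissue_maps, axis=0)

# Reference: the mean of the four successfully processed maps.
clean_mean = tissue_maps[:4].mean(axis=0)

# Largest voxel-wise deviation from the reference for each averaging rule.
mean_error = np.abs(mean_template - clean_mean).max()
median_error = np.abs(median_template - clean_mean).max()
print(f"max error, mean: {mean_error:.4f}  median: {median_error:.4f}")
```

In this toy case the single failed map pulls the mean template well away from the reference, while the median template stays close to it.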

### Data Reduction

Preprocessed MRI data were smoothed with a 3-mm FWHM smoothing kernel and images were resampled to 3 mm. Data were further reduced by applying principal component analysis (PCA).
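The PCA step can be sketched as follows. This is a minimal illustration in Python with NumPy, not the "Matlab Toolbox for Dimensionality Reduction" used in the study; the function name `pca_reduce`, the random input matrix, and the choice of 10 components are assumptions made for the example, since the number of retained components is not stated here.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X (subjects x voxels) onto the leading principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Thin SVD: the rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T
    return scores, components, mean

# Hypothetical input: 29 subjects, 500 smoothed and resampled GM voxels each.
rng = np.random.default_rng(0)
X = rng.normal(size=(29, 500))

scores, components, mean = pca_reduce(X, n_components=10)
print(scores.shape)  # one row of 10 component scores per subject
```

When such a reduction is combined with cross-validation, the projection is usually estimated on the training subjects only and then applied unchanged to the left-out subject via `(x_new - mean) @ components.T`; whether the study proceeded this way is not specified in this section.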

### The BrainAGE Method and Its Baboon-Specific Adaptation

The brain age estimation framework uses RVR, which was introduced as a Bayesian alternative to support vector machines (SVM) for obtaining sparse solutions to pattern recognition tasks (Tipping, 2000, 2001). Former results indicated favorable performance of RVR in capturing the typical age-specific atrophy patterns across the whole brain (Franke et al., 2010). A linear kernel was chosen since age estimation accuracy does not improve with non-linear kernels (Franke et al., 2010). In addition, and in contrast to SVM, parameter optimization during the training procedure is not necessary. More details can be found in Franke et al. (2010).

In general, the model is trained with preprocessed whole brain structural MRI data as well as the corresponding chronological ages of a training sample, resulting in a complex model of brain aging (**Figure 3A**, left panel). In other words, the algorithm uses those whole-brain MRI data from the training sample that represent the prototypical examples within the specified regression task (i.e., brain aging). In addition, voxel-specific weights can be calculated that represent the importance of each voxel within the specified regression task (i.e., brain aging).

Subsequently, the brain age of a test subject can be estimated using the individual tissue-classified MRI data, aggregating the complex, multidimensional aging pattern across the whole brain into one single value (**Figure 3A**, right panel). In other words, all the voxels of the test subject's MRI data are weighted by applying the voxel-specific weighting matrix. Then, the brain age is calculated by applying the regression pattern of healthy brain aging and aggregating all voxel-wise information across the whole brain. The difference between estimated brain age and the true chronological age will reveal an individual deviation score, namely the brain age gap estimation (BrainAGE) score. Consequently, positive or negative values of this deviation score directly quantify the amount of acceleration or deceleration in individual brain aging, respectively (**Figure 3B**).

Here, gray matter (GM) images resulting from the baboon-specific preprocessing pipeline described above were used to build the model of neuroanatomical aging in baboons. The brain age estimation model was trained and tested via leave-one-out cross-validation, i.e., preprocessed MRI data from 28 out of 29 baboons were used for training and the brain age of the left-out subject was estimated subsequently. This procedure was repeated 29 times. Brain ages in the MNR subjects were calculated using the whole lifespan sample as the training sample.
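The leave-one-out scheme described above can be sketched as follows. Note the substitution: RVR is not available in NumPy, so this example uses ridge regression with a linear kernel as a simple stand-in, which shares the linear kernel but not the sparse Bayesian formulation of RVR. The function names, the regularization value `lam`, and the synthetic features are assumptions for illustration only.

```python
import numpy as np

def fit_linear_kernel_ridge(X_train, y_train, lam=1.0):
    """Ridge regression in its linear-kernel form (a stand-in for RVR)."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_mean
    K = Xc @ Xc.T  # linear kernel matrix between training subjects
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - y_mean)
    weights = Xc.T @ alpha  # one weight per feature (voxel or component)
    return lambda X_new: y_mean + (X_new - x_mean) @ weights

def leave_one_out_ages(X, ages, lam=1.0):
    """Estimate each subject's age with a model trained on all other subjects."""
    n = len(ages)
    estimated = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i  # boolean mask excluding subject i
        predict = fit_linear_kernel_ridge(X[train], ages[train], lam)
        estimated[i] = predict(X[i:i + 1])[0]
    return estimated

# Synthetic data: 29 subjects, 50 features that drift linearly with age.
rng = np.random.default_rng(0)
ages = rng.uniform(4, 22, size=29)
pattern = rng.normal(size=50)
X = np.outer(ages, pattern) + rng.normal(scale=0.5, size=(29, 50))

brain_age = leave_one_out_ages(X, ages)
print(np.corrcoef(ages, brain_age)[0, 1])
```

For a new group such as the MNR subjects, the same `fit_linear_kernel_ridge` call would be made once on the full reference sample and the returned function applied to the new subjects' features.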

### Technical Notes

The whole brain age estimation framework works fully automatically. All MRI preprocessing, data reduction, model training, and brain age estimation is done using MATLAB. Preprocessing of the in vivo T1-weighted images was done using the toolboxes "Statistical Parametric Mapping" (SPM12; http://www.fil.ion.ucl.ac.uk/spm) and our new "Computational Anatomy Toolbox for SPM" (CAT12; http://dbm.neuro.uni-jena.de). PCA is performed using the "Matlab Toolbox for Dimensionality Reduction" (http://ict.ewi.tudelft.nl/~lvandermaaten/Home.html). To compute the age regression model as well as to predict the individual brain ages, the freely available toolbox "The Spider" (http://www.kyb.mpg.de/bs/people/spider/main.html) is used.

FIGURE 1 | General flowchart of the brain age estimation framework.

FIGURE 2 | (A) Shown are the original T1-weighted image and the slice-corrected version. (B) For the segmentation process, a baboon specific tissue probability map (TPM; shown as label map) was used in an iterative process, starting with a scaled human template (left) and refinements during each iteration. (C) The final baboon TPM was used to create the final segmentation of the individual MRI.

FIGURE 3 | Depiction of the original brain age estimation framework for humans. (A) The model of healthy brain aging is trained with the chronological age and preprocessed structural MRI data of a training sample (left; with an exemplary illustration of the most important voxel locations that were used by the age regression model). Subsequently, the individual brain age of a previously unseen test subject is estimated, based on MRI data (blue; picture modified from Schölkopf and Smola, 2002). (B) The difference between the estimated and chronological age results in the deviation (i.e., BrainAGE) score. Consequently, positive deviation scores indicate accelerated brain aging. (Image reproduced from Franke et al., 2012, with permission from Hogrefe Publishing, Bern).

Baboon TPM and template generation takes around 30 min per subject and iteration, amounting to about 48 h for the whole sample. For modeling healthy brain aging with RVR, one cross-validation loop in the lifespan-sample of 29 subjects takes between 0.2 and 1.2 s on Mac OS X, Version 10.6.8, 2.8 GHz Intel Core 2 Duo. Thus, the whole process of training the baboon-specific brain age estimation model and subsequent estimation of individual brain age for all 29 subjects takes about 20 s in total.

### Statistical Analyses

First, volumes of GM, white matter (WM), cerebrospinal fluid (CSF), and total intracranial volume (TIV) were analyzed using regression models. To test whether age effects were significantly associated with brain volumes, F statistics of linear and quadratic regression models were compared. To measure the accuracy of the brain age estimation model, the correlation coefficient between chronological and estimated brain age as well as the mean absolute error (MAE) was calculated:

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \text{BA}_i - \text{CA}_i \right| \tag{1}$$

with n being the number of subjects in the test sample, CA<sub>i</sub> the chronological age, and BA<sub>i</sub> the structural brain age estimated by the model. Best-fit was tested comparing F statistics of linear and quadratic regression models.
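The two accuracy measures, Equation (1) and the correlation coefficient, can be computed as in the following sketch. It uses only the Python standard library; the function names and the five example ages are hypothetical and are not data from the study.

```python
from math import sqrt

def mean_absolute_error(brain_ages, chron_ages):
    """Equation (1): mean of |BA_i - CA_i| over all test subjects."""
    n = len(chron_ages)
    return sum(abs(ba - ca) for ba, ca in zip(brain_ages, chron_ages)) / n

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical chronological and estimated brain ages (years).
chron = [4.0, 6.0, 10.0, 15.0, 20.0]
brain = [5.0, 5.0, 12.0, 14.0, 18.0]

mae = mean_absolute_error(brain, chron)  # absolute errors 1, 1, 2, 1, 2 -> 1.4
r = pearson_r(chron, brain)
print(mae, r)
```

The two measures are complementary: the correlation captures how well the ordering and spread of ages are reproduced, whereas the MAE expresses the typical error in years.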

Before analyzing brain aging in MNR baboons, birth weight, weight and age at time of MR scan, as well as brain volumes were compared between MNR and CTR groups using analysis of variance (ANOVA). As the range of chronological ages was relatively broad (4–7 years), age was included as a covariate (except for birth weight). Effect size was calculated using partial η<sup>2</sup>. To analyze differences in individual brain aging between both groups, the individual deviation scores, i.e., the BrainAGE scores, were calculated:

$$\text{BrainAGE score}_i = \text{BA}_i - \text{CA}_i \tag{2}$$
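As a small illustration of Equation (2) and its sign convention, the sketch below computes BrainAGE scores and group means. The subject values are invented for the example and the helper name `brainage_scores` is hypothetical; the sketch does not include the age covariate or the ANOVA used in the study.

```python
from statistics import mean

def brainage_scores(brain_ages, chron_ages):
    """Equation (2): estimated brain age minus chronological age, per subject."""
    return [ba - ca for ba, ca in zip(brain_ages, chron_ages)]

# Hypothetical subjects: (group, chronological age, estimated brain age), in years.
subjects = [
    ("CTR", 5.0, 5.5),
    ("CTR", 6.0, 5.5),
    ("MNR", 5.0, 7.0),
    ("MNR", 6.0, 9.0),
]

scores = brainage_scores([s[2] for s in subjects], [s[1] for s in subjects])

# Collect the scores by group and average them.
by_group = {}
for (group, _, _), score in zip(subjects, scores):
    by_group.setdefault(group, []).append(score)
group_means = {group: mean(values) for group, values in by_group.items()}
print(group_means)
```

A positive group mean indicates that brains in that group appear older than expected for their chronological age; a negative mean indicates the opposite.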

We have previously reported sex differences in neurodevelopment and cognitive performance for the same cohort of MNR baboons (Rodriguez et al., 2012; Keenan et al., 2013). Therefore, we also tested the effect of sex on structural brain aging. Female and male offspring of mothers in the control group are referred to as CTR females and CTR males, respectively, and female and male offspring of mothers in the moderate MNR group are referred to as MNR females and MNR males, respectively. All statistical testing was performed using MATLAB 7.11.

## RESULTS

### Baboon Brain Volumes across Adulthood

The lifespan-sample of healthy control subjects used to model healthy brain aging in baboons was aged 4–22 years at the time of data acquisition. Mean age did not differ between males and females (**Table 1**). Evaluation of in vivo MRI revealed that absolute GM volume, absolute WM volume, absolute CSF volume, as well as TIV were significantly higher in male compared to female subjects (**Table 1**). Absolute neocortical GM volume declined significantly with age, especially in males (**Table 3**). Absolute CSF volume increased with age only in females. Absolute WM volume as well as TIV did not vary with age.

To analyze characteristics of baboon brain tissue volumes during adulthood independent of individual differences in brain size, absolute brain volumes were corrected for TIV, resulting in individual proportions of GM, WM, and CSF in relation to individual TIV. Fractional GM, WM, and CSF volumes did not differ between sexes (**Table 1**). In males and females, the GM decline was strongly explained by age (**Figure 4A**; **Table 3**), with a linear age effect in females (adjusted R<sup>2</sup> = 0.85; p < 0.001) and a quadratic age effect in males (adjusted R<sup>2</sup> = 0.94; p < 0.001). Fractional WM volume increased with age (**Figure 4B**; **Table 3**), with males showing a stronger relationship (quadratic fit: adjusted R<sup>2</sup> = 0.75; p < 0.01) than females (linear fit: adjusted R<sup>2</sup> = 0.36; p < 0.05). Fractional CSF volume showed a moderate increase with age only in females (linear fit: adjusted R<sup>2</sup> = 0.31; p < 0.05), but not in males (**Figure 4C**; **Table 3**).
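One common way to decide between a linear and a quadratic age effect is sketched below: fit both polynomials, compare their adjusted R<sup>2</sup>, and compute the F statistic for the added quadratic term. This is an assumption about the general approach, since the exact comparison procedure is not detailed here; the function names and the synthetic GM fractions are hypothetical, and NumPy is assumed.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns residual sum of squares and adjusted R^2."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    rss = float(residuals @ residuals)
    tss = float(((y - y.mean()) ** 2).sum())
    n, p = len(y), degree  # p predictors besides the intercept
    adj_r2 = 1.0 - (rss / (n - p - 1)) / (tss / (n - 1))
    return rss, adj_r2

def extra_term_f(rss_linear, rss_quadratic, n):
    """F statistic with (1, n - 3) degrees of freedom for adding the quadratic term."""
    return (rss_linear - rss_quadratic) / (rss_quadratic / (n - 3))

# Synthetic fractional GM volumes with a curved decline over 4-22 years.
age = np.arange(4.0, 23.0)
gm_fraction = 0.55 - 0.0005 * age**2 + 0.002 * np.sin(3 * age)

rss_lin, adj_lin = fit_polynomial(age, gm_fraction, degree=1)
rss_quad, adj_quad = fit_polynomial(age, gm_fraction, degree=2)
f_quad = extra_term_f(rss_lin, rss_quad, n=len(age))
print(adj_lin, adj_quad, f_quad)
```

Adjusted R<sup>2</sup> penalizes the extra parameter, so the quadratic model is preferred only when the curvature explains more variance than the lost degree of freedom costs.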

### Baboon Brain Age Estimation Model

The baboon-specific brain age estimation model included a baboon-specific preprocessing pipeline for in vivo anatomical MRI scans and a machine-learning algorithm for pattern recognition in order to model a reference curve for normal aging of the baboon brain. Using preprocessed GM images, leave-one-out cross-validation in the whole lifespan-sample of healthy control subjects resulted in a correlation of r = 0.80 (p < 0.001; **Figure 5**) between chronological age and estimated brain age, with age estimation being slightly more accurate in females (r = 0.88) than in males (r = 0.81). The linear regression model resulted in the best fit (adjusted R <sup>2</sup> = 0.62; F = 47.6; p < 0.001). The MAE between chronological age and estimated brain age was 2.1 years for the whole modeling sample of healthy control subjects (females: MAE = 1.5 years; males: MAE = 2.8 years).
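The leave-one-out scheme and the two accuracy measures reported above (correlation and MAE) can be illustrated with a short sketch. It assumes a plain ridge regression as a stand-in for the machine-learning algorithm, which is not the model used in the study, and it runs on synthetic features rather than preprocessed GM images; all function names are hypothetical.

```python
import numpy as np


def loo_predict(features, age, ridge=1.0):
    """Leave-one-out age predictions from a ridge regression (stand-in model)."""
    n = len(age)
    predicted = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i  # every subject except subject i
        x_train, y_train = features[train], age[train]
        x_mean, y_mean = x_train.mean(axis=0), y_train.mean()
        xc = x_train - x_mean
        # Closed-form ridge solution on centered training data.
        gram = xc.T @ xc + ridge * np.eye(xc.shape[1])
        weights = np.linalg.solve(gram, xc.T @ (y_train - y_mean))
        predicted[i] = (features[i] - x_mean) @ weights + y_mean
    return predicted


def mean_absolute_error(age, predicted):
    """Mean absolute difference between estimated and chronological age."""
    return float(np.mean(np.abs(predicted - age)))


def pearson_r(age, predicted):
    """Pearson correlation between chronological and estimated age."""
    return float(np.corrcoef(age, predicted)[0, 1])


# Synthetic example (NOT study data): 29 subjects, 5 age-related features.
rng = np.random.default_rng(42)
age = rng.uniform(4, 22, size=29)
features = np.outer(age, rng.normal(size=5)) + rng.normal(scale=2.0, size=(29, 5))

estimated_age = loo_predict(features, age)
mae = mean_absolute_error(age, estimated_age)
r = pearson_r(age, estimated_age)
```

Each subject's age is estimated by a model that never saw that subject, so the resulting MAE and correlation are out-of-sample estimates rather than measures of training fit.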

### Morphometric Characteristics of MNR Model Baboon Offspring

In our whole non-human primate (NHP) model, MNR decreased birth weight (Li et al., 2013a). Birth weights of the MRI subsample in the present study, including a total of 11 randomly chosen offspring (5 female) with MNR and 12 randomly chosen control offspring (5 female) from mothers receiving full diet during pregnancy, fell within the same range as the total sample (Li et al., 2013a), with MNR offspring showing the tendency to weigh less than CTR offspring [F(1, 18) = 3.2; p = 0.09; **Table 2**]. At time of in vivo MRI data acquisition, subjects were aged 4–7 years (human equivalent 14–24 years) (**Table 2**). Chronological age did not differ between experimental groups [F(1, 18) = 0.1; n.s.] or gender [F(1, 18) = 2.9; n.s.]. At the time of the MRI scan, female MNR offspring weighed more than female CTR offspring [F(1, 7) = 7.2; p < 0.05; η <sup>2</sup> = 0.51], showing an altered postnatal growth profile as a result of programming. No differences in weight at time of MRI scan were found in males (**Table 2**).

In the sample of MNR and CTR offspring, absolute as well as relative brain volumes did not differ between experimental groups. However, absolute CSF volume was increased in female MNR offspring as compared to female CTR offspring. Even more interesting, fractional GM volume corrected for individual TIV was significantly decreased in female MNR offspring [F(1, 7) = 7.9; p < 0.05; η <sup>2</sup> = 0.50; **Table 2**].

### Brain Aging in Adult MNR Baboon

Baboon BrainAGE scores based on species-specific preprocessed GM images, which quantify baboon-specific neuroanatomical aging, were significantly increased by 2.74 years in young adult female MNR subjects as compared to young adult female CTR offspring [F(1, 7) = 11.4; p = 0.01; η <sup>2</sup> = 0.62; **Figure 6A**; **Table 2**], suggesting premature brain aging in female MNR offspring as a result of developmental programming due to fetal undernutrition. In males, BrainAGE scores did not differ between MNR and CTR offspring [F(1, 9) = 0.0; n.s.; **Figure 6B**; **Table 2**].
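As a worked illustration of the quantities in this paragraph, the sketch below computes a BrainAGE score as estimated minus chronological age and recovers η² from an F statistic and its degrees of freedom. The function names are hypothetical, and the η² formula assumes a single-factor comparison, in which η² coincides with partial η².

```python
def brain_age_gap(estimated_age, chronological_age):
    """BrainAGE score: positive values indicate an older-appearing brain."""
    return estimated_age - chronological_age


def eta_squared_from_f(f_value, df_effect, df_error):
    """Effect size from an F test: F * df_effect / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Reported test for female offspring: F(1, 7) = 11.4.
eta_sq_females = eta_squared_from_f(11.4, 1, 7)
```

For F(1, 7) = 11.4 this gives a value of about 0.62, in agreement with the effect size reported above.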

### DISCUSSION

To our knowledge this is the first study to combine the power of in vivo MRI data acquisition and the development of a controlled NHP model of developmental programming to follow the effects on structural brain development and aging. The translational power of these studies is shown by a parallel MRI study in the Dutch famine birth cohort, in which decreased total brain volume in late adulthood was already shown in those who had been undernourished prenatally (de Rooij et al., 2016). Further research in the Dutch famine birth cohort will investigate whether exposure to fetal undernutrition during early gestation has an effect on individual brain aging in late-life. This present study provides important novel insights into experimental brain programming using the closest available species with regard to human programming to model (experimentally induced) brain changes. First, we developed a novel fully-automatic baboon-specific MRI data preprocessing pipeline, based on and comparable to well-established preprocessing pipelines for human brain MRI data. The characteristics of brain structures across the baboon adult life course were analyzed, based on noninvasive in vivo MRI data. Second, we present a species-specific reference curve for brain aging in baboons, resulting from the novel adaptation of our well-established BrainAGE method. Using BrainAGE, future studies investigating experimentally induced brain changes in different programming paradigms can refer to and compare against these new reference curves for normal brain volumes and individual brain aging across the adult baboon life course. Third, applying this innovative brain aging biomarker, the present study is the first to reveal modifications of individual brain aging trajectories resulting from developmental programming in a non-human primate MNR model.

TABLE 3 | Sex-specific regression models with linear and quadratic fit for age effects in absolute and fractional (/TIV) baboon brain volumes across adulthood.

Asterisk indicates best-fit model. Bold type indicates significance.

The analyses of the in vivo MRI data from healthy, untreated baboons showed a strong decline in GM volume and increase in WM volume across the adult lifespan in males as well as females. In comparison, MRI studies in humans suggest linear decline in GM volume, non-linear age effect in WM volume decline as well as increase of CSF volume to be predominant during adulthood in both genders (Good et al., 2001; Resnick et al., 2003). In contrast, chimpanzees and rhesus monkeys show only very small (if any) age-related decline in GM and WM volumes during adulthood (Sherwood et al., 2011; Chen et al., 2013; Autrey et al., 2014). Thus, our results suggest the baboon to be the animal model closest to the human in terms of changes in brain tissue across the lifespan, and thus best suited for future translational studies.

Based on the age-related patterns of brain tissue loss, we have recently presented a fully-automatic brain age estimation framework for use in humans (Franke et al., 2010), which aggregates the complex aging patterns across the whole brain. The result is a single global estimation score of an individual "brain age" that accounts for the individual multidimensional aging pattern across the whole brain. Several studies in human samples provide evidence for the BrainAGE method to accurately and reliably model age-related spatiotemporal human brain changes as well as to estimate individual deviations from healthy brain aging trajectories. Moreover, BrainAGE results profoundly correlate with a number of general lifestyle and health parameters, disease markers, and cognitive functions (Franke et al., 2010, 2012, 2014; Gaser et al., 2013). Based on the original BrainAGE framework, this paper presents a species-specific adaptation for baboon brain aging, including a novel baboon-specific preprocessing tool for MRI data and validated machine learning methods for pattern recognition in order to model typical brain aging characteristics and to subsequently estimate individual brain ages. Here, this baboon-specific adaptation of the brain age estimation framework showed excellent performance in modeling baboon brain aging using in vivo MRI data from 15 female and 14 male baboons, aged 4–22 years.

Applying this novel, non-invasive in vivo MRI biomarker to a sample of baboons with 30% reduction in global maternal nutrition during pregnancy, this study shows premature brain aging of about 2.7 years in the young adult MNR female subjects (4–7 years; human equivalent 14–24 years). Several studies of fetal and postnatal MNR baboon offspring in this model have established a multi-system altered phenotype, affecting the cardiovascular system (Clarke et al., 2015), liver (Cox et al., 2006b), kidney (Cox et al., 2006a; Pereira et al., 2015), and brain (Antonow-Schlorke et al., 2011; Li et al., 2013a,b). With regard to brain development and function, a recent histological study utilizing the same baboon model of moderate MNR during pregnancy already indicated major impairments of fetal brain development, including disturbances of early organizational processes in cerebral development on a histological and gene product level, neurotrophic factor suppression, imbalances in cell proliferation and developmental cell death, impaired glial maturation and neuronal process formation, as well as altered gene expression (Antonow-Schlorke et al., 2011), resulting in an altered cognitive and behavioral phenotype during childhood and adolescence with female but not male MNR offspring demonstrating more variable and lower levels of persistence and attention and less emotional arousal than female CTR offspring by 3.3 years of age (human equivalent 11.5 years) (Keenan et al., 2013). The present neuroanatomical study on brain aging in the MNR baboon model is the first evidence in an NHP sample with MNR that appraises and quantifies the effects of fetal malnutrition on neuroanatomical aging in young adulthood, employing non-invasive in vivo data collection and a novel, baboon-specific evaluation approach. The increased BrainAGE scores in the female MNR baboons provide the first in vivo evidence for premature brain aging during young adulthood following impaired fetal brain development induced by moderate MNR during gestation. These results are in line with our former studies of brain development and (cognitive) function also showing stronger alterations due to MNR during gestation in the female baboon offspring (Antonow-Schlorke et al., 2011; Rodriguez et al., 2012; Keenan et al., 2013; Li et al., 2013a).

Developmental programming studies increasingly show distinct differences between males and females (Aiken and Ozanne, 2013). A potential mechanism for the different effects in males and females observed in this study is the differing timing of puberty and adolescence in male and female baboons, with its associated hormone changes. Puberty and adolescence in female baboons occur around the age of 3 to 4 years with baboons attaining adult size around 6 years of age. In contrast, puberty and adolescence in male baboons occur between 4 and 7 years, with males reaching full size around 10 years of age (Crawford et al., 1997; Jolly and Phillips-Conroy, 2003). These differing maturation trajectories probably also include brain maturation. Thus, processes of brain tissue gain as a result of pre-pubertal growth and pruning may still be continuing in the male subjects studied, whereas in the female subjects brain maturation was closer to completion, such that the processes of age-related tissue loss occurring during adulthood had already started. For future studies we propose to continue to apply this in vivo biomarker for brain aging in the same sample of MNR offspring during middle and late adulthood in order to comprehensively track the long-lasting effects of developmental programming on gender-specific alterations of individual brain aging trajectories.

Several established models for brain aging in humans showed significant relationships between individual brain aging and health and lifestyle variables as well as medical drug use (Franke et al., 2014; Habes et al., 2016). Delayed brain aging has been found to be associated with higher levels of education and physical activity (Steffener et al., 2016) as well as higher levels of meditation practice (Luders et al., 2016). Furthermore, advanced brain aging was shown to be indicative of poorer physical fitness, lower fluid intelligence, higher allostatic load, and increased mortality (Cole et al., 2017), and even to predict the onset of cognitive decline (Franke et al., 2012; Gaser et al., 2013). Additionally, a recent study on changes of individual BrainAGE during the course of the menstrual cycle in humans (Franke et al., 2015) indicates that the BrainAGE method is capable of reliably detecting even temporary neuroanatomical changes. Consequently, the BrainAGE method offers new approaches to monitor subtle neuroanatomical changes in longitudinal intervention and treatment studies in humans and experimental animal models, e.g., exploring the effects of daily activity, protective nutrients, or medication on individual brain structure.

In conclusion, to our knowledge this is the first study to utilize the power of MRI brain imaging in an NHP model of controlled decreased fetal nutrition and MNR. We have developed a novel, species-specific, non-invasive in vivo MRI biomarker for brain aging that shows great potential for further studies in our well-established NHP model of developmental programming and aging. For example, these offspring will be maintained to follow the aging process over the rest of their life course. The brain aging biomarker enables a well-controlled, repeatable, and non-invasive in vivo exploration of individual effects on subtle, yet clinically significant, changes in brain structure affecting neuroanatomical maturation and aging due to various environmental challenges experienced in human pregnancy (e.g., maternal obesity and diabetes, effects of maternal stress and placental insufficiency). Future studies can combine this MRI methodology with cognitive and behavioral studies, as well as treatment and intervention studies. Additionally, there is a clear need for gender-specific mechanisms, such as those shown here, to be taken into account in future studies. We observed premature brain aging in young adult female offspring. Furthermore, a species-specific and well-performing adaptation of the BrainAGE method for analyzing brain aging in rodents has recently been presented (Franke et al., 2016), thus also enabling future (longitudinal) studies of experimentally induced changes in brain maturation and aging in rodent models. In summary, the BrainAGE method can potentially identify a variety of environmental factors and mechanisms that induce premature brain atrophy at an individual level and contribute to a better understanding of healthy and pathological brain aging in animal models and humans.

# AUTHOR CONTRIBUTIONS

KF planned study design, developed part of the methods, performed statistical analyses, interpreted results, wrote manuscript. GC planned experimental design, acquired MRI data. RD developed part of the methods. CG contributed to methods development and manuscript. AK took care of animals, planned experimental design, acquired MRI data. CL took care of animals, planned experimental design, acquired MRI data. MS contributed to study design and manuscript. PN contributed to study design, experimental design, and manuscript.

# FUNDING

This work was supported in part by the European Community [FP7 HEALTH, Project 279281 (BrainAge) to KF], the German Research Foundation [DFG, Project FR 3709/1-1 to KF], and the National Institutes of Health [K25 DK089012 & R24 RR021367 to PN]. The sponsors had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.

## REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Franke, Clarke, Dahnke, Gaser, Kuo, Li, Schwab and Nathanielsz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# MRI Visual Ratings of Brain Atrophy and White Matter Hyperintensities across the Spectrum of Cognitive Decline Are Differently Affected by Age and Diagnosis

Hanneke F. M. Rhodius-Meester <sup>1</sup> \*, Marije R. Benedictus <sup>1</sup> , Mike P. Wattjes <sup>2</sup> , Frederik Barkhof <sup>2, 3</sup> , Philip Scheltens <sup>1</sup> , Majon Muller <sup>4</sup> and Wiesje M. van der Flier <sup>1, 5</sup>

<sup>1</sup> Department of Neurology, Alzheimer Center, VU University Medical Centre, Amsterdam Neuroscience, Amsterdam, Netherlands, <sup>2</sup> Department of Radiology and Nuclear Medicine, VU University Medical Centre, Amsterdam Neuroscience, Amsterdam, Netherlands, <sup>3</sup> Institutes of Neurology and Healthcare Engineering, UCL, London, UK, <sup>4</sup> Department of Internal Medicine, Section Geriatrics, VU University Medical Centre, Amsterdam, Netherlands, <sup>5</sup> Department of Epidemiology and Biostatistics, VU University Medical Centre, Amsterdam, Netherlands

Aim: To assess the associations of age and diagnosis with visual ratings of medial temporal lobe atrophy (MTA), parietal atrophy (PA), global cortical atrophy (GCA), and white matter hyperintensities (WMH) and to investigate their clinical value in a large memory clinic cohort.

### *Edited by:*

Nicolas Cherbuin, Australian National University, Australia

### *Reviewed by:*

Franziskus Liem, University of Zurich, Switzerland

Panteleimon Giannakopoulos, Université de Genève, Switzerland

*\*Correspondence:* Hanneke F. M. Rhodius-Meester h.rhodius@vumc.nl

*Received:* 20 January 2017 *Accepted:* 11 April 2017 *Published:* 09 May 2017

### *Citation:*

Rhodius-Meester HFM, Benedictus MR, Wattjes MP, Barkhof F, Scheltens P, Muller M and van der Flier WM (2017) MRI Visual Ratings of Brain Atrophy and White Matter Hyperintensities across the Spectrum of Cognitive Decline Are Differently Affected by Age and Diagnosis. Front. Aging Neurosci. 9:117. doi: 10.3389/fnagi.2017.00117

Methods: We included 2,934 patients (age 67 ± 9 years; 1,391 [47%] female; MMSE 24 ± 5) from the Amsterdam Dementia Cohort (1,347 dementia due to Alzheimer's disease [AD]; 681 mild cognitive impairment [MCI]; 906 controls with subjective cognitive decline). We analyzed the effect of age, APOE e4 and diagnosis on visual ratings using linear regression analyses. Subsequently, we compared diagnostic and predictive value in three age-groups (<65 years, 65–75 years, and >75 years).

Results: Linear regression analyses showed main effects of age and diagnosis and an interaction age∗diagnosis for MTA, PA, and GCA. For MTA the interaction effect indicated steeper age effects in MCI and AD than in controls. PA and GCA increased with age in MCI and controls, while AD patients had a high score regardless of age. For WMH we found a main effect of age, but not of diagnosis. For MTA, GCA and PA, diagnostic value was best in patients <65 years (optimal cut-off: ≥1). PA and GCA only discriminated in patients <65 years and MTA in patients <75 years. WMH did not discriminate at all. Taking into account APOE did not affect the identified optimal cut-offs. When we used these scales to predict progression in MCI using Cox proportional hazard models, only MTA (cut-off ≥2) had any predictive value, restricted to patients >75 years.

Conclusion: Visual ratings of atrophy and WMH were differently affected by age and diagnosis, requiring an age-specific approach in clinical practice. Their diagnostic value seems strongest in younger patients.

Keywords: Alzheimer's disease, mild cognitive impairment (MCI), MRI, prognosis, diagnostic test assessment

# INTRODUCTION

The current diagnostic criteria for mild cognitive impairment (MCI) and dementia due to Alzheimer's disease (AD) advise applying biomarkers, such as MRI features, to identify patients with (underlying) AD pathology (Dubois et al., 2007, 2014; Albert et al., 2011; McKhann et al., 2011). The criteria do not specify how MRI features should be measured, what cut-offs should be used and whether a patient's age should be taken into account (Frisoni et al., 2011). Studies demonstrating discriminatory value of atrophy, such as medial temporal lobe atrophy (MTA), parietal atrophy (PA) and global cortical atrophy (GCA) in AD, often use automatic quantitative MRI analysis (van de Pol et al., 2006; Sluimer et al., 2008; Henneman et al., 2009a; Trzepacz et al., 2014). However, these analyses are time-consuming and hence hard to apply in daily clinical practice. A feasible way of applying MRI features in daily practice is to use established visual rating scales for atrophy measures and vascular white matter changes (Scheltens et al., 1992, 1995; Wattjes et al., 2009).

The presence of MTA has been shown to differentiate patients with dementia due to AD from controls and to predict progression to dementia in MCI patients (Scheltens et al., 1992; Jack et al., 2002; Korf et al., 2004; Vos et al., 2012; Clerx et al., 2013; Ferreira et al., 2015). However, medial temporal lobe atrophy also occurs in normal aging (Jernigan et al., 2001; van de Pol et al., 2006; Barkhof et al., 2007). To discriminate both young and old controls from AD, an average score of the left and right sides of MTA ≥ 1 has been proposed for patients <75 years and MTA ≥ 1.5 for patients >75 years (Scheltens et al., 1992, 1995; Schoonenboom et al., 2008). Recently, two studies based on the same cohort have suggested increasing the cut-off for patients <75 years to MTA ≥ 1.5 and for patients >75 years to MTA ≥ 2, and adding a specific cut-off of MTA ≥ 2.5 for patients aged >85 years (Pereira et al., 2014; Ferreira et al., 2015). Since these studies used patients with a mean age of 75, it remains uncertain what the optimal cut-off in younger patients would be.

In younger patients, PA is increasingly recognized as an important feature of AD (Koedam et al., 2010). Rating PA improves the distinction of early onset AD patients from younger controls, but seems to be less suited to separate older AD patients from older controls (Lehmann et al., 2012; O'Donovan et al., 2013). No age-specific cut-offs have yet been suggested (Koedam et al., 2011; Ferreira et al., 2015). Only one study assessed the diagnostic value of combining MTA with PA, but this study did not take age into account (Ferreira et al., 2015). Being affected by parietal atrophy as well, the GCA scale has a lot of overlap with the PA scale. However, no cut-offs for the use of this scale as a diagnostic or predictive marker exist (Pasquier et al., 1996; Scheltens et al., 1997; Henneman et al., 2009b; Fjell et al., 2013).

Particularly in older patients, dementia pathology is often mixed, including neurodegenerative and vascular changes. Therefore, in addition to measures of atrophy, it is common practice to estimate the extent of small vessel disease (SVD), such as white matter hyperintensities (WMH), in the diagnostic workup (van der Flier et al., 2004; Kester et al., 2014). A recent CT study showed an unexpectedly low percentage of WMH in elderly patients (Claus et al., 2015). It has been suggested that WMH may predict progression in the MCI stage, but other studies have found no such effect (Prins et al., 2013; Mortamais et al., 2014). The WMH severity can be rated using Fazekas' scale, but optimal cut-offs for separating controls from AD taking into account age have not been reported (Fazekas et al., 1987). Details regarding the aforementioned scales can be found in **Table 1**.

The aim of our study was to explore the effect of age on the diagnostic value of visual ratings of MTA, PA, GCA, and WMH for discriminating controls from AD and for predicting progression to dementia in MCI in a very large memory clinic cohort (van der Flier et al., 2014). Second, we evaluated the effect of APOE genotype. Our ultimate goal is to provide practical support to clinicians to improve the effective incorporation of MRI visual rating scales in daily practice.

# METHODS

### Subjects

We included 2,934 patients from the Amsterdam Dementia Cohort who had visited the Alzheimer center between 2000 and 2015 (van der Flier et al., 2014). Of these patients, 906 were diagnosed with subjective cognitive decline (SCD), who served as controls, 681 with MCI and 1,347 with AD. Subjects were included if MRI and mini mental state examination (MMSE; Folstein et al., 1975), performed within 6 months of baseline diagnosis, were available. The local medical ethical committee approved the study, according to the declaration of Helsinki. All patients provided written informed consent for their clinical data to be used for research purposes.

### Clinical Assessment

At baseline, patients received a standardized and multidisciplinary work-up, including medical history, physical, neurological and neuropsychological examination, MRI and laboratory tests. Cognitive functions are assessed with a standardized test battery, including the MMSE and Cambridge cognitive examination for global cognitive decline (Folstein et al., 1975; Derix et al., 1991). For memory we use the visual association test (VAT) and Rey auditory verbal learning task (Saan and Deelman, 1986; Lindeboom et al., 2002). For language we use VAT naming and category fluency (Lindeboom et al., 2002; Van der Elst et al., 2006). For attention and executive functions we use the trail making test A and B and the digit span (Reitan, 1958; Lindeboom and Matto, 1994). More details can be found in our cohort paper (van der Flier et al., 2014). Diagnoses were made in a multidisciplinary consensus meeting (van der Flier et al., 2014). Patients were labeled as SCD when the cognitive complaints could not be confirmed by cognitive testing and criteria for MCI, dementia or any other neurological or psychiatric disorder known to cause cognitive complaints were not met. MCI was diagnosed using Petersen's criteria; in addition all patients fulfilled the core clinical criteria of the NIA-AA guidelines for MCI (Petersen, 2004; Albert et al., 2011). Patients were diagnosed with probable AD using the criteria of the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association; all patients also met the core clinical criteria of the National Institute on Aging-Alzheimer's Association guidelines for AD (McKhann et al., 1984, 2011).

TABLE 1 | Details of the visual rating scales used for MTA, PA, GCA, and WMH.


Sens, sensitivity; spec, specificity. Inter-rater and intra-rater reliability is presented as Cohen weighted Kappa.

### Follow-Up

Follow-up for MCI patients took place by annual routine visits to our memory clinic in which patient history, cognitive tests, and a physical and neurologic examination were repeated. Follow-up data were available in 464 (68%) MCI patients, with a mean duration of follow-up of 2.5 ± 1.7 years. Of these patients, 255 (55%) remained stable, 161 (35%) progressed to AD and 48 (10%) progressed to another type of dementia.

### MRI

Subjects were scanned with a standardized scan protocol on 1.0 T, 1.5 T, and 3.0 T whole body MRI systems as part of their diagnostic work-up. Over time, the core protocol remained comparable and always included 3DT1 with coronal slices and FLAIR with axial slices. Details on acquisition parameters per scanner can be found in Supplementary Table 1. All scans were visually rated by trained raters who had completed the required training and obtained a weighted kappa of at least 0.80 for MTA, 0.60 for GCA, and 0.70 for Fazekas, and were subsequently evaluated in a consensus meeting with our experienced neuroradiologist. The raters were blinded to diagnosis. Visual rating of MTA was performed on oblique coronal T1-weighted images according to the 5-point (range 0–4) Scheltens scale, using the average score of the left and right sides (Scheltens et al., 1992, 1995). PA was rated using the posterior cortical atrophy scale (range 0–3), using T1 and FLAIR weighted images viewed in sagittal, axial and coronal planes, computing an average score of the left and right sides (Koedam et al., 2010, 2011; Lehmann et al., 2013). Global cortical atrophy (GCA) was assessed visually on axial FLAIR images (range 0–3) (Pasquier et al., 1996). The severity of WMH was rated on axial FLAIR images using Fazekas' scale (range 0–3) (Fazekas et al., 1987). More details can be found in **Table 1**.

### APOE Genotyping

DNA was isolated from 10 ml of EDTA blood. APOE genotype was determined with the light cycler APOE mutation detection method (Roche diagnostics GmbH, Mannheim, Germany). According to APOE e4 status, patients were dichotomized into carriers (hetero- and homozygous) and non-carriers. APOE status was available for 2,410 (82%) subjects.

### Statistical Analyses

For statistical analyses, we used SPSS version 20 (IBM, Armonk, NY, USA). We compared visual ratings according to the baseline diagnosis (controls, MCI and AD) using Kruskal-Wallis tests and post-hoc Mann-Whitney U-tests. We used Spearman's correlations to assess correlations between visual rating scales.

We used linear regression analyses to assess the combined effect of age and diagnosis on visual ratings (using separate models for each rating scale). As independent variables we entered diagnosis (using dummy variables), age (continuous) and the interaction terms for age∗diagnosis. In a second model we additionally added APOE (dichotomized) as independent variable and the interaction term age∗APOE. To confirm the age effect on visual ratings we repeated the linear regression analyses entering as independent variable, instead of diagnosis, MMSE (continuous) and the interaction term for age∗MMSE. To allow comparison of the different models, we report standardized betas (st beta).
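The model structure described above (dummy-coded diagnosis, continuous age, and age∗diagnosis interaction terms, reported as standardized betas) can be sketched as follows. This is a hedged illustration on synthetic data, not the SPSS analysis itself; the function names are hypothetical, and computing each standardized beta as b · SD(x) / SD(y) is an assumption about how the coefficients were standardized.

```python
import numpy as np


def build_predictors(age, diagnosis):
    """Columns: MCI dummy, AD dummy, age, age*MCI, age*AD (controls = reference)."""
    mci = (diagnosis == "MCI").astype(float)
    ad = (diagnosis == "AD").astype(float)
    return np.column_stack([mci, ad, age, age * mci, age * ad])


def standardized_betas(predictors, outcome):
    """OLS with intercept; returns b * SD(x) / SD(y) for each predictor column."""
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    slopes = coefs[1:]  # drop the intercept
    return slopes * predictors.std(axis=0, ddof=1) / outcome.std(ddof=1)


# Synthetic example (NOT study data): a rating that rises with age and diagnosis.
rng = np.random.default_rng(1)
diagnosis = rng.choice(np.array(["control", "MCI", "AD"]), size=300)
age = rng.normal(67, 9, size=300)
rating = (
    0.03 * (age - 50)
    + 0.5 * (diagnosis == "MCI")
    + 1.0 * (diagnosis == "AD")
    + rng.normal(0, 0.4, size=300)
)

st_beta = standardized_betas(build_predictors(age, diagnosis), rating)
```

With controls as the reference category, the two interaction columns express how much the age slope in MCI and in AD differs from the age slope in controls.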

Subsequently, we created three age strata (<65 years, 65–75 years and >75 years) and evaluated the diagnostic ability of each visual rating scale to separate patients with dementia due to AD from controls per age group. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) and the Youden index [(sensitivity + specificity)-1] (Youden, 1950) were calculated for different cut-off points in the three age groups using cross tabulation. When we repeated the linear regression analyses adding APOE, we found only an effect of APOE e4 presence on the MTA scale. Therefore, we repeated the evaluation of diagnostic ability for MTA only, stratifying for APOE e4 carriers (controls vs. AD) and APOE e4 non-carriers (controls vs. AD), excluding the 524 (18%) subjects for whom APOE was not available. The highest Youden index indicated the optimal cut-point; we took a Youden index >0.50 as a minimum. For the scales showing a Youden index >0.50, we assessed the effect of combining the scales at their optimal cut-off. We created a new variable consisting of 4 levels: 1. normal MTA and normal PA (reference group), 2. normal MTA and abnormal PA, 3. abnormal MTA and normal PA, 4. abnormal MTA and abnormal PA. This was also done for the combination of MTA and GCA.
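The cut-off evaluation described above can be sketched in a few lines. The example below computes sensitivity, specificity, PPV, NPV and the Youden index from a cross tabulation at a given cut-off, picks the cut-off with the highest Youden index subject to the >0.50 minimum, and builds the 4-level combined variable. It uses toy data and hypothetical function names, and treats a rating at or above the cut-off as abnormal.

```python
import numpy as np


def diagnostic_metrics(scores, has_ad, cutoff):
    """Cross-tabulate 'score >= cutoff' against diagnosis (AD vs. control)."""
    positive = scores >= cutoff
    tp = int(np.sum(positive & has_ad))
    fp = int(np.sum(positive & ~has_ad))
    fn = int(np.sum(~positive & has_ad))
    tn = int(np.sum(~positive & ~has_ad))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
        "youden": sensitivity + specificity - 1,
    }


def optimal_cutoff(scores, has_ad, cutoffs, minimum_youden=0.50):
    """Cut-off with the highest Youden index, or None if it is not above the minimum."""
    best = max(cutoffs, key=lambda c: diagnostic_metrics(scores, has_ad, c)["youden"])
    youden = diagnostic_metrics(scores, has_ad, best)["youden"]
    return best if youden > minimum_youden else None


def combine_scales(mta_abnormal, pa_abnormal):
    """Levels: 1 both normal, 2 only PA abnormal, 3 only MTA abnormal, 4 both abnormal."""
    return 1 + 2 * mta_abnormal.astype(int) + pa_abnormal.astype(int)


# Toy example (NOT study data): averaged MTA scores for 3 controls and 5 AD patients.
scores = np.array([0.0, 0.5, 1.0, 1.0, 1.5, 2.0, 2.5, 3.0])
has_ad = np.array([False, False, False, True, True, True, True, True])
best_cutoff = optimal_cutoff(scores, has_ad, cutoffs=[1.0, 1.5, 2.0])
```

In practice the same functions would be applied separately within each age stratum, so that each stratum can end up with its own optimal cut-off.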

Finally, we assessed the predictive value of the visual ratings for dementia due to AD in MCI patients, stratified by age group. We used Cox proportional hazard models, taking into account variability in time to follow-up. Baseline MTA, PA, GCA, and WMH were entered in separate models, dichotomized at the previously derived optimal, age-specific cut-offs and, in addition, as continuous values. In a separate model, we evaluated the combined effect of MTA and PA and of MTA and GCA using the newly constructed 4-level variables, as described above. The event variable was progression to dementia due to AD, excluding subjects with progression to another type of dementia, and, in another model, progression to all types of dementia. Sex was entered as a covariate. Hazard ratios (HR) with 95% confidence intervals (CI) are presented.
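For orientation, the general form of a Cox proportional hazards model with one visual rating and sex as covariates can be written as below. This is the standard textbook formulation rather than a specification taken from the study; the symbols $x_{\text{rating}}$, $x_{\text{sex}}$, $\beta_1$, $\beta_2$ and $\mathrm{SE}$ are introduced here for illustration only.

$$
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_{\text{rating}} + \beta_2 x_{\text{sex}}\right),
\qquad
\mathrm{HR} = \exp(\beta_1),
\qquad
95\%\ \mathrm{CI} = \exp\!\left(\beta_1 \pm 1.96\,\mathrm{SE}(\beta_1)\right)
$$

Here $h_0(t)$ is the unspecified baseline hazard. For a dichotomized rating, the HR compares the hazard of progression in patients at or above the cut-off with that in patients below it, adjusted for sex.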

A p < 0.05 was considered significant. Since we focus on discriminatory and predictive value, rather than statistical significance, we did not adjust for multiple comparisons.

## RESULTS

### Baseline Characteristics

**Table 2** shows the baseline characteristics of the total population. Patients with MCI and AD were older and had more WMH than controls. Patients with AD were more often female and more often APOE e4 carriers, and had the lowest MMSE scores and the highest MTA, PA, and GCA scores compared with controls and MCI patients. When we assessed correlations between the visual rating scales using Spearman's rho, we found the strongest correlation between PA and GCA (r = 0.732) and the weakest correlation between WMH and PA (r = 0.133; Supplementary Table 2).

### Influence of Age and Diagnosis on Visual Ratings

We used linear regression analyses to assess the combined effect of age and diagnosis on each visual rating (**Figure 1** and **Table 3**). For MTA we found main effects of age and diagnosis. In addition, there was an interaction effect for age∗diagnosis, indicating a somewhat steeper age effect in patients with MCI and AD than in controls. For PA and GCA, we found main effects of age and diagnosis. In addition, there was an interaction effect for age∗diagnosis in AD, indicating that AD patients have a higher score regardless of their age, while in MCI and controls, PA and GCA increased with age. For WMH we found only a main effect of age, but no main effect of diagnosis and no interaction between age and diagnosis. When we added APOE and age∗APOE to the model, we found a main effect of APOE on MTA, indicating more MTA in case of APOE e4 presence, and an interaction effect of age∗APOE, indicating a steeper age effect in APOE non-carriers on MTA. However, for PA, GCA, and WMH we found no main effect of APOE and no interaction effect of age∗APOE.

TABLE 2 | Baseline characteristics of controls, MCI and AD patients in the total group.

Values are mean ± standard deviation or n (%). Group differences between the diagnostic groups were estimated using the chi-square test<sup>#</sup> and the Kruskal-Wallis test, with post hoc Mann-Whitney U-tests when appropriate. Please note that although we report mean ± standard deviation for the visual rating scales, we used non-parametric tests. <sup>a</sup>P < 0.05 compared to control, <sup>b</sup>P < 0.05 compared to MCI.

When we repeated the linear regression analyses with MMSE and age∗MMSE instead of diagnosis and age∗diagnosis, the same age effects were found. Details can be found in Supplementary Tables 3, 4.
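To make the meaning of the age∗diagnosis terms concrete, the sketch below uses the fact that, for a model containing diagnosis dummies, age, and all age∗diagnosis interactions, the least-squares point estimates are the same as fitting one straight line per diagnostic group: each interaction coefficient is that group's age slope minus the slope in the reference group. The function names and the noise-free data are hypothetical and serve only as illustration; they are not the study data or the authors' code.

```python
from typing import Dict, List, Sequence, Tuple


def slope_intercept(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interaction_terms(age: Sequence[float], score: Sequence[float],
                      diagnosis: Sequence[str],
                      reference: str = "control") -> Dict[str, float]:
    """Coefficients of score ~ diagnosis + age + age*diagnosis (dummy coding)."""
    groups: Dict[str, Tuple[List[float], List[float]]] = {}
    for a, s, d in zip(age, score, diagnosis):
        ages, scores = groups.setdefault(d, ([], []))
        ages.append(a)
        scores.append(s)
    ref_slope, ref_intercept = slope_intercept(*groups[reference])
    coefs = {"intercept": ref_intercept, "age": ref_slope}
    for d, (ages, scores) in groups.items():
        if d == reference:
            continue
        slope, intercept = slope_intercept(ages, scores)
        coefs[d] = intercept - ref_intercept      # group offset at age 0
        coefs[f"age*{d}"] = slope - ref_slope     # extra age slope in this group
    return coefs


# Noise-free synthetic ratings, for illustration only (not study data).
age = [50, 60, 70, 80] * 3
diagnosis = ["control"] * 4 + ["MCI"] * 4 + ["AD"] * 4
score = [0.2, 0.3, 0.4, 0.5,   # control: 0.01 * age - 0.3
         0.8, 1.0, 1.2, 1.4,   # MCI:     0.02 * age - 0.2
         1.5, 1.8, 2.1, 2.4]   # AD:      0.03 * age

coefs = interaction_terms(age, score, diagnosis)
print(coefs)
```

A positive `age*MCI` or `age*AD` coefficient corresponds to a steeper age effect in that group than in controls; standard errors and p-values would require a full regression routine and are not computed here.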

### Visual Ratings per Baseline Diagnosis and Age Groups

Since there was a clear effect of age on visual ratings, we categorized patients into three age strata: <65 years, 65–75 years and >75 years. **Figure 2** visualizes the mean score of each visual rating scale in the different age strata, according to baseline diagnosis. Group sizes for the diagnostic groups by age stratum are reported in the figure. For MTA, we found differences between all diagnostic groups in each age group. For PA and GCA, we found differences between all diagnostic groups in the stratum <65 years. For GCA, this was also found in the stratum 65–75 years. For PA, in the age group 65–75 years, only AD differed from SCD and MCI, while in the group >75 years AD differed only from MCI. For WMH we found differences between SCD and MCI and between SCD and AD in the age groups <65 years and 65–75 years. There were no differences between diagnostic groups in the >75 years stratum.

### Diagnostic Value of Visual Ratings to Separate AD from Controls per Age Group

Based on the highest Youden index, we determined the optimal cut-off for each rating scale in the total group and per age stratum (**Table 4**). A cut-off of MTA ≥ 1 was optimal for the total group and for <65 years, and a cut-off of MTA ≥ 1.5 for 65–75 years. In the patients aged >75 years no satisfactory cut-off could be derived. Both PA and GCA add sensitivity in the younger age range, as for these scales we found a high sensitivity at the cost of a lower specificity. Cut-offs of an average PA ≥ 1 and GCA ≥ 1 were optimal for <65 years. PA and GCA did not discriminate in the older age groups. WMH did not sufficiently discriminate between groups in any stratum. When we repeated the cross-tabulation to find the optimal cut-off for MTA in APOE carriers and non-carriers, results changed only marginally and optimal cut-offs were comparable (Supplementary Table 5).

Since MTA, PA, and GCA all had diagnostic value in the age group <65 years, we evaluated whether the combination of these scales improved their diagnostic value. **Table 5** shows that a combination of MTA with PA or GCA provides a very sensitive and specific indication for AD in the age group <65 years, especially when both ratings are abnormal. In case of one normal and one abnormal rating, the Youden index remained at or below 0.50, and the combination did not add to the application of MTA or GCA/PA alone.

([left + right]/2), PA ([left + right]/2), GCA, and WMH; x-axis: age group. The 95% confidence interval is shown by the error bars. \*Indicates a significant difference between diagnostic groups, using Kruskal-Wallis tests and post hoc Mann-Whitney U-tests.

Linear regression analyses were used, with separate models for each rating scale. As independent variables we entered diagnosis (using dummy variables), age (continuous) and the interaction terms for age\*diagnosis. St beta, standardized coefficient beta; p, p-value; F, Fisher; df, degrees of freedom.
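The four-level combined variable used for these comparisons can be sketched as below. The function names are hypothetical, the default cut-offs (MTA ≥ 1, PA ≥ 1) are the ones reported in the text for the age group <65 years, and the example ratings are invented for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple

LEVEL_LABELS = {
    1: "normal MTA, normal PA (reference)",
    2: "normal MTA, abnormal PA",
    3: "abnormal MTA, normal PA",
    4: "abnormal MTA, abnormal PA",
}


def combined_level(mta: float, pa: float,
                   mta_cutoff: float = 1.0, pa_cutoff: float = 1.0) -> int:
    """Map a pair of ratings onto the four-level combined variable."""
    mta_abnormal = mta >= mta_cutoff
    pa_abnormal = pa >= pa_cutoff
    if mta_abnormal and pa_abnormal:
        return 4
    if mta_abnormal:
        return 3
    if pa_abnormal:
        return 2
    return 1


def cross_tabulate(records: Iterable[Tuple[float, float, str]]) -> Counter:
    """Count subjects per (combined level, diagnosis) cell."""
    return Counter((combined_level(mta, pa), dx) for mta, pa, dx in records)


# Invented (MTA, PA, diagnosis) triples, for illustration only.
records = [
    (0.0, 0.0, "control"),
    (0.5, 1.0, "control"),
    (1.5, 0.5, "AD"),
    (2.0, 2.0, "AD"),
    (1.0, 1.5, "AD"),
]
table = cross_tabulate(records)
for (level, dx), n in sorted(table.items()):
    print(f"{LEVEL_LABELS[level]:<36} {dx:<8} {n}")
```

The same function applies to the MTA/GCA combination by passing the GCA rating and its cut-off in place of PA.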

### Prediction Ability of Visual Ratings per Age Group in MCI

Finally, we assessed the predictive value of the visual ratings for dementia due to AD in MCI patients. Details of the demographics and visual ratings of these MCI patients are provided in **Table 6**. In the patients aged 65–75 years there was more WMH in stable MCI than in progressive MCI patients; in patients >75 years MTA differed between stable and progressive MCI patients. Results of Cox proportional hazards models are shown in **Table 7**. Using age-specific cut-offs derived from the controls-AD comparisons, the predictive value of MTA was strongest in the oldest MCI patients. PA, GCA, and WMH were not associated with progression to dementia due to AD in any of the age groups. Combination of the visual ratings resulted in a predictive effect for an abnormal MTA with an abnormal PA in the age groups <65 years and 65–75 years. Combination of an abnormal MTA with an abnormal GCA resulted in the same effect in the age group <65 years. When we entered visual rating scales as continuous measures, MTA and WMH were slightly more predictive in older patients, and GCA in younger patients. When we repeated the Cox analyses with lower cut-offs, only the HR of MTA improved slightly in the age group >75 years. When we used progression to any type of dementia as outcome measure, results changed only marginally. Details are shown in Supplementary Table 6.

TABLE 4 | Discriminatory value of different cut-off points of MTA, PA, GCA, and WMH for differentiating AD from controls in total population and in three age groups.

The results are calculated using cross tabulation. Youden index = (sensitivity + specificity) − 1. Bold values are the cut-off values that showed the best differentiation. Sens, sensitivity; Spec, specificity; PPV, positive predictive value; NPV, negative predictive value.

TABLE 5 | Sensitivity and specificity for the combination of MTA and PA and for the combination of MTA and GCA for differentiating AD from controls in age group <65 years.

A new variable using 4 levels was created, using only the cut-offs with a Youden index > 0.50 from *Table 4*. Sensitivity and specificity are calculated using cross tabulation. Youden index = (sensitivity + specificity) − 1. Bold values are the combinations that showed the best differentiation. Sens, sensitivity; Spec, specificity.

### DISCUSSION AND CONCLUSION

In this very large memory clinic cohort with a broad age range, we studied the combined effect of age and diagnosis on the visual ratings of atrophy and WMH in controls, MCI and AD. This resulted in three main findings. First, we found an independent effect of age and diagnosis on MTA, resulting in different diagnostic and predictive value in the three age groups. Second, age and diagnosis had a different effect on PA and GCA, resulting in diagnostic value specifically in younger patients. Third, for WMH we found hardly any diagnostic or predictive value, while this measure was strongly related to age.

Our first finding, that MTA is affected by both age and diagnosis, is consistent with former studies (Launer et al., 1995; Bastos Leite et al., 2004; van de Pol et al., 2006; Barkhof et al., 2007). Earlier studies have suggested age-specific cut-offs for MTA (Scheltens et al., 1992; Koedam et al., 2011; Duara et al., 2013; Pereira et al., 2014; van de Pol and Scheltens, 2014; Ferreira et al., 2015). We found the best diagnostic performance of MTA in the youngest group, with an identified optimal cut-off of MTA ≥ 1, which is the same as in the original article but lower than the cut-off of MTA ≥ 1.5 advised by two recent articles (Scheltens et al., 1992, 1995, 1997; Barber et al., 1999; Pereira et al., 2014; Ferreira et al., 2015). Younger subjects should not have medial temporal atrophy at all; at an age <65 years even an MTA score of 1 is suspicious. This finding might be explained by differences in study populations. Our cohort contains a large subgroup <65 years, consisting of 1,047 controls and AD patients with a mean age of 58 ± 5 years. In former studies assessing the effect of age on MTA, the average age of the so-called younger groups was much higher. Also, our average MMSE is higher than in most studies, suggesting less advanced disease. The optimal cut-off of MTA ≥ 1.5 for 65–75 years was similar to recent studies (Schoonenboom et al., 2008; Pereira et al., 2014; Ferreira et al., 2015). For the subjects aged >75 years, sensitivity and specificity when applying an MTA ≥ 1.5 (sensitivity 0.81; specificity 0.62) or an MTA ≥ 2 (sensitivity 0.65; specificity 0.78) are comparable to previous studies, but the low Youden index indicates that diagnostic performance is modest. When we repeated our linear regression analysis including APOE, we found, comparable to earlier studies, more MTA in APOE carriers and a stronger age effect on MTA in non-carriers (Pereira et al., 2014; Ferreira et al., 2015). Apparently the presence of APOE e4 results in a more affected hippocampal region (van der Flier et al., 2011; van de Pol and Scheltens, 2014). The effect of APOE on MTA was subtle, however, and did not lead to different optimal cut-offs. This is in line with the fact that APOE genotype is generally not used in the diagnostic work-up of AD.

TABLE 6 | Baseline visual ratings of MCI patients according to diagnosis at follow-up by age group.

Values are mean ± standard deviation. Group differences were estimated using the Mann-Whitney test. Please note that although we report mean ± standard deviation for the visual rating scales, we used non-parametric tests. \*Difference between MCI and dementia due to AD at follow-up with p < 0.05.

When we attempted to predict progression to AD dementia in patients with MCI, MTA had the strongest predictive value in the oldest group (>75 years). PA, GCA, and WMH showed no predictive value. In addition, the predictive value of MTA in the younger patients was limited. This was an unexpected result, as previous studies have shown predictive ability for MTA and PA, especially in younger subjects (Korf et al., 2004; Staekenborg et al., 2009; Lehmann et al., 2012, 2013; Prins et al., 2013; Ferreira et al., 2015). However, in our study, the MCI subjects aged <65 years were younger than in previous studies and they had lower MTA scores. In addition, younger patients were less likely to show clinical progression than older patients (<65: 28% vs. 65–75: 44% vs. >75: 56%), resulting in less power. Apparently MCI patients <65 years constitute a different patient category than older MCI patients. It is conceivable that the prototypical patient with MCI due to AD is a patient who develops a typical, hippocampal type of AD, with an age-at-onset of about 75 years. Younger subjects with the earliest stages of cognitive decline tend to have an atypical presentation and a longer doctor's delay because of misdiagnosis, and they suffer a larger penalty from being stigmatized with an MCI label (Koedam et al., 2010; Barnes et al., 2015). As a result, younger subjects with AD often present to a memory clinic already at the dementia stage, which may result in a bias for the MCI population in this age group. In older subjects, MCI might be better recognized, which could explain the predictive value of MTA in this group. Also, in the patients aged 65–75 years there was more WMH in stable MCI than in progressive MCI patients. This suggests that the WMH, rather than AD, could be the cause of their cognitive decline, explaining why this specific group remained stable. Another reason for the low predictive value might be our choice to use the cut-offs derived from the controls-AD comparison. One could argue that patients with MCI might have subtler atrophy, being earlier in the disease trajectory, thus requiring more sensitive cut-offs. When we repeated the Cox analyses with lower cut-offs, however, predictive values did not improve.

TABLE 7 | Cox proportional hazard models; influence of MTA, PA, GCA, and WMH and combination of MTA/PA and MTA/GCA on progression of MCI to dementia due to AD in the three age groups.

Data are presented as hazard ratio (HR) (95% CI). Cox proportional hazard models compared progression to AD with non-converters (= stable MCI at follow-up). Time variable was time to follow-up in years; state variable was progression to AD. The visual ratings were entered dichotomized at the optimal cut-off derived from separating controls from dementia due to AD (*Table 4*). For the combination of MTA/PA and MTA/GCA a new 4-level variable, as presented in *Table 5*, was used. Sex was entered as covariate. Bold values are the HRs with p < 0.05.

Our second finding concerned the different effects of age and diagnosis on PA and GCA. Previous studies have shown that PA ratings have diagnostic value in early onset AD but do not help the separation of late onset AD from older controls (Koedam et al., 2011; Lehmann et al., 2012, 2013; O'Donovan et al., 2013). To date this has not been reflected in age-specific cut-offs for PA and GCA. To our knowledge, only one study assessed age-specific cut-offs for PA, finding a low diagnostic value, yet advising a cut-off of PA ≥ 1 for all age groups (Ferreira et al., 2015). In our study we found that patients with AD have a high score on PA and GCA regardless of age, while controls and MCI show increased PA and GCA scores with increasing age. These findings resulted in a high diagnostic value for both PA and GCA in patients <65 years, but no value of PA and GCA for patients >65 years. The optimal cut-off for both atrophy measures was a rating of ≥1. The original paper proposed a higher cut-off of PA ≥ 2, resulting in a high specificity at the cost of a low sensitivity (Koedam et al., 2011). With a lower cut-off of ≥1 we now found a reverse pattern in the age group <65 years, with a high sensitivity at the cost of a lower specificity. An additional finding of abnormal MTA greatly adds specificity to PA. In this subgroup of patients <65 years, a combination of an abnormal PA and MTA resulted in very high sensitivity and specificity; hence this should be regarded as alarming. In the preparation of this study, we also used a classification tree to improve the utility of combining visual ratings. However, this tree only improved the discrimination of controls from AD for the combination of MTA with PA in the age group <65 years. We decided to leave these analyses out of the paper, as the more complex modeling did not add to our message. Furthermore, since our aim was to evaluate the visual ratings as a clinician would, we chose to use statistics that were as simple as possible, reflecting clinical practice.

In our study, we found WMH mainly to be affected by age, but not by diagnosis. Various studies have advocated a synergistic effect of SVD and AD pathology on cognitive decline, while other studies have shown that SVD in AD was related to age and vascular risk factors, comparable to individuals without AD (Kester et al., 2014; Mortamais et al., 2014; Spies et al., 2014; Benedictus et al., 2015; Claus et al., 2015; Prins and Scheltens, 2015). Yet, in all these studies the diagnostic value of WMH for separating AD from controls has not been addressed. We found no diagnostic utility for WMH in discriminating AD from controls, which cannot be explained by the relatively young age of our study sample, since even in the oldest age stratum, WMH did not have any discriminatory value. Assessing WMH in the diagnostic workup remains important, because of the known negative effect of WMH on many outcomes, such as functional decline, (lacunar) infarcts, depression and mortality (Pantoni et al., 2005; van der Flier et al., 2005; Inzitari et al., 2007; Verdelho et al., 2010; Firbank et al., 2012). Furthermore, presence of WMH indicates a possible treatable cause in order to prevent further deterioration (Basile et al., 2006; Prins and Scheltens, 2015). These findings do not oppose the possible interaction of SVD and AD. Since WMH in this study was equally severe in aging controls, one might argue that dementia at older age is by definition "mixed." Perhaps in older subjects, having WMH, less AD damage is needed to develop dementia (van der Flier et al., 2004; Mortamais et al., 2014). These age effects persisted when we used MMSE score instead of clinical diagnosis which confirms our finding.

These findings have several clinical implications. The value of the visual ratings of atrophy and WMH differs across the age groups. This makes it of utmost importance to take into account the age of the patient when using MRI in the diagnostic work-up. Especially in the younger patients, MTA and PA/GCA have diagnostic value; atrophy at an age <65 is a bad sign. By combining MTA with PA/GCA, the value increases even further. Older age reduces the value of rating scales substantially; in older patients it is harder to separate the age effect from the AD effect. These findings are in line with the classical Braak model for MTA (Braak et al., 2006). However, the findings for PA are not in line with Braak, since especially young subjects showed severe PA only in AD cases, which is not observed in controls and MCI, whereas this difference disappears with increasing age. This suggests that a separate pathological staging model for younger patients may be warranted (Jagust et al., 2008; Fjell et al., 2013). In this patient group, visual ratings should be used to rule out AD in case of no atrophy rather than to provide conclusive evidence for AD when there is atrophy. Perhaps in the future more automated measures will be able to distinguish pathological from age-adequate brain aging, being able to pick up more subtle effects (Koikkalainen et al., 2016). Automatic quantification methods of brain atrophy, and other modalities such as FDG-PET, also have the advantage of providing objective measures, independent of the expertise of the clinician, whereas visual ratings are a subjective visual interpretation. Furthermore, these automatic methods are able to extract more information and combine information, for example on WMH and atrophy, and provide an estimate of the underlying neurodegenerative disease. Visual ratings of MRI have the advantage, however, that they are more feasible in daily clinical practice. Automatic quantification methods are dependent on scan protocol and quality, whereas visual ratings can be applied to images acquired with less advanced scanners. Also, these automatic methods often require costly and time-consuming software programs, while visual ratings can be applied in an instant, with the patient in front of the clinician.

This study has several limitations. First, neuropathological confirmation of the diagnosis was lacking. Especially in elderly patients with comorbid SVD, atrophy might also be the result of WMH or hippocampal sclerosis and not of amyloid pathology (Barkhof et al., 2007). Due to this we might have selected patients who have been misclassified as having AD. However, in this study we found a similar degree of WMH in all elderly subjects, regardless of diagnosis, diminishing the importance of specifying the etiology as mixed or not. Second, we used SCD as controls, although we cannot exclude the possibility that these patients had underlying AD. We feel that the comparison of AD with SCD patients is a clinically relevant comparison, however, as this is the differential diagnosis that a clinician has to make every day. Furthermore, underlying AD can also not be excluded in "pure" controls, as it is known that roughly one third of normal elderly harbor AD pathology (Chetelat et al., 2010; Vos et al., 2013). Third, the mean follow-up of 2.5 ± 1.7 years could imply that MCI patients who remained stable during this period might still progress to dementia after longer follow-up. Fourth, in our clinical work-up clinicians are not blinded to the MRI results. This might have resulted in bias. The effect of the MRI results on diagnosis might also have changed over time due to changing insights into the use of biomarkers. However, all diagnoses were made in our multidisciplinary consensus meeting, in which the clinical characteristics of the patient and the cognitive profile on neuropsychological testing are leading. A final limitation could be the use of different scanners with increasing field strength over time. This could also be regarded as a strength, however, as the visual ratings have the advantage that they are robust to scanner differences and easy to use.

Among the strengths of the current study is our harmonized diagnostic protocol according to which all patients were analyzed. All patients were selected from the same memory clinic. The large sample size and the broad age spectrum, ranging from 45 to 95 years, make these results robust. Furthermore, the scans were rated by experienced researchers after they had completed the required training (van der Flier et al., 2014).

To conclude, visual ratings are of use in daily practice, but should be interpreted with caution and with reference to a patient's age. The current research criteria advise the use of MTA in the diagnostic work-up for AD, but do not specify the amount of atrophy or the effect of age (Dubois et al., 2007, 2014). This study shows that MTA is strongly influenced by age and that age-related cut-offs are needed. PA and GCA seem to be of equal use for the diagnostic work-up in patients <65 years, and their information is incremental to the information in the MTA scale. Taking into account age-specific cut-offs and characteristics of each visual rating scale, use of visual rating scales for MRI can enhance recognition of AD for either diagnostic or research purposes, especially in younger patients.

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the medical ethical committee of the VU Medical Center with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the medical ethical committee of the VU Medical Center.

### AUTHOR CONTRIBUTIONS

HR drafted the manuscript and analyzed/interpreted data. MB, MW, FB, PS, and MM revised the manuscript and interpreted the data. WF drafted the manuscript, analyzed/interpreted data and supervised the project.

### ACKNOWLEDGMENTS

Research of the VUmc Alzheimer center is part of the neurodegeneration research program of the Amsterdam Neuroscience. The VUmc Alzheimer center is supported by Alzheimer Nederland and Stichting VUmc fonds. The clinical database structure was developed with funding from Stichting Dioraphte. WF is recipient of the ZonMW-Memorabel grant (no. 70.73305-98-205). HR is appointed on a grant from the European Seventh Framework Program project PredictND under grant agreement 611005.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnagi.2017.00117/full#supplementary-material




**Conflict of Interest Statement:** MW received speaking and consultancy fees from Biogen, Novartis, and Roche. FB serves/has served on the advisory boards of Bayer-Schering Pharma, Sanofi-Aventis, Biogen-Idec, TEVA, Merck-Serono, Novartis, Roche, Synthon BV, Jansen Research and Genzyme. He received funding from the Dutch MS Society and EU-FP7 and has been a speaker at symposia organised by the Serono Symposia Foundation and MedScape. PS has served as consultant for Wyeth-Elan, Genentech, Danone and Novartis and received funding for travel from Pfizer, Elan, Janssen, and Danone Research. WF performs contract research for Boehringer Ingelheim. Research programs of WF have been funded by ZonMW, NWO, EU-FP7, Alzheimer Nederland, CardioVascular Onderzoek Nederland, stichting Dioraphte, Gieskes-Strijbis fonds, Boehringer Ingelheim, Piramal Neuroimaging, Roche BV, Janssen Stellar. All funding is paid to her institution.

HR, MB, and MM declare that the research was conducted in the absence of any commercial or financial relationship that could be construed as a potential conflict of interest.

Copyright © 2017 Rhodius-Meester, Benedictus, Wattjes, Barkhof, Scheltens, Muller and van der Flier. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Structural Brain Network Changes across the Adult Lifespan

Ke Liu<sup>1</sup>, Shixiu Yao<sup>1</sup>, Kewei Chen<sup>2,3,4</sup>, Jiacai Zhang<sup>1</sup>, Li Yao<sup>1,5</sup>, Ke Li<sup>6</sup>, Zhen Jin<sup>6</sup> and Xiaojuan Guo<sup>1,7</sup>\*

*<sup>1</sup> College of Information Science and Technology, Beijing Normal University, Beijing, China, <sup>2</sup> Banner Alzheimer's Institute, Phoenix, AZ, United States, <sup>3</sup> Department of Mathematics and Statistics, Arizona State University, Tempe, AZ, United States, <sup>4</sup> Arizona Alzheimer's Consortium, Phoenix, AZ, United States, <sup>5</sup> National Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China, <sup>6</sup> Laboratory of Magnetic Resonance Imaging, Beijing 306 Hospital, Beijing, China, <sup>7</sup> Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China*

A number of magnetic resonance imaging (MRI) studies have shown age-related alterations in brain structural networks in different age groups. However, the specific age-associated changes in brain structural networks across the adult lifespan are underexplored. In the current study, we performed a multivariate independent component analysis (ICA) to identify structural brain networks based on covariant gray matter volume and then investigated the age-related trajectories of structural networks over the adult lifespan in 536 healthy subjects aged 20–86 years. Twenty independent components (ICs) were extracted in the ICA, and statistical analyses between age and ICA weights revealed 16 age-related ICs across the adult lifespan. Most of the trajectories of ICA weights demonstrated significant linear decline tendencies, and the corresponding structural networks primarily included the anterior and posterior dorsal attention networks, the ventral and posterior default mode networks, the auditory network, five cerebellum networks and the hippocampus-related network, which showed the most significant decreasing tendency among all ICs (*p* of age = 1.11E-77). Only the temporal lobe-related network showed a significant quadratic tendency with age (*p* of age<sup>2</sup> = 5.66E-06). Our findings not only provide insight into the patterns of the age-related changes of structural networks but also provide a foundation for understanding abnormal aging.

Keywords: independent component analysis, structural network, magnetic resonance imaging, gray matter volume, age-related changes

### Edited by:

*James H. Cole, Imperial College London, United Kingdom*

### Reviewed by:

*Yun Zhou, Johns Hopkins School of Medicine, United States*

*Xuemin Xu, University of Tennessee, Knoxville, United States*

### \*Correspondence:

*Xiaojuan Guo gxj@bnu.edu.cn*

Received: *07 June 2017* Accepted: *02 August 2017* Published: *17 August 2017*

### Citation:

*Liu K, Yao S, Chen K, Zhang J, Yao L, Li K, Jin Z and Guo X (2017) Structural Brain Network Changes across the Adult Lifespan. Front. Aging Neurosci. 9:275. doi: 10.3389/fnagi.2017.00275*

# INTRODUCTION

Magnetic resonance imaging (MRI) studies have shown that the brain undergoes remarkable structural development during childhood and adolescence and that those alterations continue even through adulthood (Good et al., 2001; Gogtay et al., 2004; Raji et al., 2012; Fjell et al., 2013; Mills et al., 2014). The global gray matter volume decreases linearly with age, and the total white matter volume shows an inverse U-shaped tendency in healthy adults (Good et al., 2001; Ge et al., 2002); however, regional brain changes are heterogeneous across the adult lifespan (Gogtay et al., 2004; Allen et al., 2005; Curiati et al., 2009; Ziegler et al., 2012; Fjell et al., 2013). For example, the gray matter volumes of the frontal, parietal and occipital lobes present linear decreases with age across the adult lifespan (Allen et al., 2005), while the hippocampus volume presents a non-linear trend with age (Allen et al., 2005; Fjell et al., 2013). Early structural MRI research used univariate methods, such as regions of interest (ROIs) or voxel-based morphometry (VBM), to investigate the age-related gray matter changes; however, these studies considered ROIs or brain voxels as independent variables and ignored the interregional covariant information among them.

A number of brain MRI studies have investigated anatomical network changes based on the structural covariance of gray matter volume in normal adults (Brickman et al., 2007; Bergfield et al., 2010; Li et al., 2013; Hafkemeijer et al., 2014). Li et al. used seed-ROI regression models to explore age-related changes of gray matter volumes in eight gray matter networks in young, middle-aged and older groups of healthy subjects aged 18–89 years (Li et al., 2013). Hafkemeijer et al. utilized independent component analysis (ICA) to extract nine gray matter anatomical networks in middle-aged to older normal participants (45–85 years), who were divided into four age subgroups (Hafkemeijer et al., 2014). Brickman et al. identified aging-related regional MRI covariance patterns in younger and older groups of healthy adults using a multivariate statistical model called the subprofile scaling model (SSM; Brickman et al., 2007). These studies showed that structural covariance patterns or networks demonstrated different age-related changes among the different age groups (Brickman et al., 2007; Li et al., 2013; Hafkemeijer et al., 2014). For example, there was a negative correlation between age and gray matter volume in four anatomical networks, including the medial visual cortical network, sensorimotor network, default mode network (DMN) and executive control network; however, gray matter volume was not significantly associated with age in five other networks, including the temporal network, auditory network, and three cerebellar networks (Hafkemeijer et al., 2014). It has been noted that the above-mentioned studies focused primarily on brain structural network changes in different age groups. However, the age-related trajectories of the brain structural networks across the adult lifespan need to be further explored.

Multivariate analysis methods can identify the inter-regional covariance relationship among different brain regions. Therefore, these approaches have been widely applied to brain imaging studies (Damoiseaux et al., 2006; Brickman et al., 2007; Mantini et al., 2007; Xu et al., 2009; Bergfield et al., 2010; McIntosh and Mišic, 2013; Guo et al., 2014; Hafkemeijer et al., 2014). ICA, as a popular data-driven multivariate analysis method, was introduced first to the studies of brain functional networks (Damoiseaux et al., 2006; Mantini et al., 2007) and then to those of structural networks (Xu et al., 2009; Guo et al., 2014; Hafkemeijer et al., 2014). Xu et al. presented a source-based morphometry (SBM) approach using ICA to study gray matter network differences between subjects with schizophrenia and healthy control subjects and confirmed the validity of ICA in structural MRI data (Xu et al., 2009). Guo et al. also applied ICA to examine structural covariance networks across healthy young adults and to determine their spatial consistency (Guo et al., 2014). Compared with other multivariate analysis methods, ICA is a higher-order statistical method and can decompose linear mixed signals into maximally independent components (Calhoun et al., 2009). In this way, ICA can effectively extract independent sources from complex brain imaging data without a priori information.

The purpose of the current study is to explore age-related gray matter changes at the network level across the adult lifespan. We applied the ICA method to identify structural gray matter covariance networks among 536 healthy subjects aged 20–86 years. Finally, regression analyses were performed on ICA weights and age to investigate age trajectories of the corresponding networks.

# MATERIALS AND METHODS

# Participants

Structural MRI data were obtained from a large public database of the Information eXtraction from Images (IXI) (http://brain-development.org/ixi-dataset/). In this study, 536 healthy subjects (Females/Males = 273/263, age range 20–86 years) were included. Specific information about the subjects is presented in **Table 1**. More details about demographic information of the participants are presented on the IXI database website (Kennedy et al., 2016).

# Data Acquisition

All structural MRIs were obtained at three different sites: a Philips 1.5T system at Guy's Hospital, a General Electric (GE) 1.5T system at the Institute of Psychiatry, and a Philips 3T scanner at Hammersmith Hospital. The T1-weighted structural MRIs were acquired using a magnetization-prepared rapid acquisition gradient-echo (MPRAGE) sequence. Scanning parameters for the Philips 1.5T scanner were: TR = 9.8 ms, TE = 4.6 ms, flip angle = 8°; those for the Philips 3T scanner were: TR = 9.6 ms, TE = 4.6 ms, flip angle = 8°. Scanning parameters for the GE 1.5T scanner were not available.

### Image Preprocessing

In this study, all structural MRI data were processed using the VBM8 toolbox (available at http://dbm.neuro.uni-jena.de/vbm8) (Ashburner and Friston, 2000; Good et al., 2001; Ashburner, 2007) in the Statistical Parametric Mapping (SPM8) software (available at: http://www.fil.ion.ucl.ac.uk/spm). In brief, using adaptive maximum a posteriori (MAP) estimation and partial volume estimation (PVE), all structural images were segmented into gray matter, white matter and cerebrospinal fluid. Subsequently, the diffeomorphic anatomical registration through exponentiated Lie algebra (DARTEL) approach (Ashburner, 2007) was applied to normalize each subject's gray matter image to the average DARTEL template, which was generated iteratively, and finally to Montreal Neurological Institute (MNI) space. Additionally, to preserve the total amount of gray matter in native space, each voxel of each gray matter image was multiplied by the Jacobian determinant from the normalization. Gaussian smoothing was then performed on each subject's gray matter image with a kernel of 8 mm full width at half maximum (FWHM).
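As a small illustration of two steps in this paragraph, the sketch below converts the 8 mm FWHM into the standard deviation of the corresponding Gaussian kernel and shows the voxel-wise Jacobian modulation on toy numbers. It is not the VBM8/SPM8 implementation; the function names and the toy values are assumptions made for illustration only.

```python
import math

def fwhm_to_sigma(fwhm_mm: float) -> float:
    """Convert the FWHM of a Gaussian kernel to its standard deviation."""
    # For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma.
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def modulate(warped_gm, jacobian_det):
    """Multiply each normalized gray matter value by its Jacobian determinant."""
    return [g * j for g, j in zip(warped_gm, jacobian_det)]

# 8 mm FWHM, as stated in the text.
sigma_mm = fwhm_to_sigma(8.0)

# Hypothetical values for three voxels (not taken from the study).
toy_gm = [0.8, 0.5, 0.2]    # normalized gray matter probabilities
toy_jac = [1.1, 0.9, 1.0]   # local volume change from the normalization
toy_modulated = modulate(toy_gm, toy_jac)
```

A Jacobian determinant above 1 marks a voxel that was compressed during normalization, so the product restores the gray matter amount that was present in native space.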

TABLE 1 | Sample characteristics of different age groups.

*<sup>a</sup>Three separate subsamples from different scanning sites in London: Guy's Hospital: Philips 1.5T/Hammersmith Hospital: Philips 3T/Institute of Psychiatry: General Electric 1.5T.*

*<sup>b</sup>The number of different ethnic groups in our IXI sample: Caucasian/Black/Asian/Other.*

*<sup>c</sup>Education levels: 1* = *No qualifications; 2* = *O-levels, GCSEs, or CSE; 3* = *A-levels; 4* = *Further education; 5* = *University or Polytechnic degrees.*

Multiple linear regression models were fitted to the spatially preprocessed gray matter maps to account for two confounding factors: scanner and gender. To avoid possible bias from the different scanners, the three scanning sites (Guy's Hospital, Institute of Psychiatry and Hammersmith Hospital) were represented by three 0/1 dummy variable columns in the regression models. Gender, also a nuisance factor in this study, was represented by one 0/1 dummy variable column. The adjusted gray matter images were entered into the subsequent ICA procedure.
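The confound adjustment described above can be sketched as a voxel-wise least-squares regression followed by taking residuals. This is a minimal illustration on synthetic data, not the authors' code: the function name, the toy site and gender labels, and the choice to add the grand mean back to the residuals are all assumptions.

```python
import numpy as np

def regress_out(gm, scanner, gender):
    """Remove scanner and gender effects from a subject-by-voxel matrix.

    gm      : (n_subjects, n_voxels) gray matter values
    scanner : (n_subjects,) integer site labels in {0, 1, 2}
    gender  : (n_subjects,) 0/1 labels
    """
    site_dummies = np.eye(3)[scanner]                 # three 0/1 columns, one per site
    design = np.column_stack([site_dummies, gender])  # plus one 0/1 gender column
    beta, *_ = np.linalg.lstsq(design, gm, rcond=None)
    residuals = gm - design @ beta
    # Assumption: add the grand mean back so values stay on the original scale.
    return residuals + gm.mean(axis=0)

# Synthetic example: 12 subjects, 5 voxels, with an artificial site offset.
rng = np.random.default_rng(0)
scanner = np.array([0, 0, 1, 1, 2, 2, 0, 1, 2, 0, 1, 2])
gender = np.array([0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0])
gm = rng.normal(0.5, 0.05, size=(12, 5)) + 0.1 * scanner[:, None]
adjusted = regress_out(gm, scanner, gender)
```

Because the residuals are orthogonal to every column of the design matrix, the adjusted maps have the same mean at each site, which is the property the adjustment is meant to achieve.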

### ICA

The ICA was implemented using the Fusion ICA Toolbox (FIT) (available at http://mialab.mrn.org/software/fit/index.html). In this study, the gray matter image of each subject was reshaped into a row vector, and the row vectors of all subjects were stacked to form a subject-by-voxel input data matrix. This matrix was then decomposed into a subject-by-source mixing matrix (also referred to as ICA weights) and a source-by-voxel source matrix of statistically independent sources (spatial components or brain networks) using the infomax algorithm, which minimizes the mutual information of the sources (Calhoun et al., 2009; Xu et al., 2009). The mixing matrix expresses the relationship between subjects and source networks, and the source matrix expresses the relationship between source networks and voxels across the whole brain. Each column of the mixing matrix represents the degree to which each subject contributes to the corresponding source network. Each row of the source matrix indicates the spatial distribution of a brain structural network, which expresses the covariant changes of gray matter volume within the brain (Xu et al., 2009). Finally, each source network was converted to a z-score map and reshaped to a 3D brain map with a threshold of Z = 3 to reveal the gray matter structural covariance patterns. The resulting ICA weights were used for the statistical analysis.
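The matrix bookkeeping and the final thresholding step can be illustrated as follows. The sketch does not perform the infomax decomposition itself (that was done in FIT); it only shows the shapes of the mixing and source matrices and how one source row could be turned into a thresholded z-score volume. The toy dimensions, the planted outlier voxel, and the use of the absolute value |Z| > 3 are assumptions.

```python
import numpy as np

def zscore_threshold(source_row, volume_shape, z_thresh=3.0):
    """Convert one source (a row of the source matrix) to a thresholded 3D z-map."""
    z = (source_row - source_row.mean()) / source_row.std()
    z_map = z.reshape(volume_shape)
    mask = np.abs(z_map) > z_thresh   # assumption: threshold applied to |Z|
    return z_map, mask

# Toy dimensions (hypothetical): 5 subjects, 2 sources, a 4 x 4 x 4 volume.
rng = np.random.default_rng(0)
volume_shape = (4, 4, 4)
n_voxels = int(np.prod(volume_shape))
mixing = rng.normal(size=(5, 2))            # subject-by-source (ICA weights)
sources = rng.normal(size=(2, n_voxels))    # source-by-voxel (spatial networks)
sources[0, 10] = 25.0                       # plant one strongly loading voxel
data = mixing @ sources                     # subject-by-voxel data matrix

z_map, mask = zscore_threshold(sources[0], volume_shape)
```

Column `mixing[:, k]` holds one weight per subject for network `k`, and it is these columns that enter the regression on age.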

### Statistical Analysis

Cubic, quadratic and linear regression analyses were performed separately between age (independent variable) and each column of the ICA weights (dependent variable) to explore the age-related trajectories of the networks throughout the adult lifespan. The Bayesian Information Criterion (BIC) was used to select the optimal regression model, i.e., the one with the smallest BIC value. For each optimal regression model, a single-sample t-test was performed on the regression coefficient of the highest-order age term, with the statistical significance threshold set at p < 0.05 with Bonferroni correction.
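A minimal sketch of this model-selection step is given below, using synthetic data. The Gaussian form of the BIC, n ln(RSS/n) + k ln(n), and the standardization of age are assumptions, since the text does not state the exact formula or software used. The Bonferroni threshold of 0.05/20 is inferred from the reported threshold of 2.50E-03 and the 20 extracted components.

```python
import numpy as np

def fit_polynomial(age, weights, degree):
    """Least-squares polynomial fit; returns BIC and the t-statistic of the top term."""
    x = (age - age.mean()) / age.std()                   # standardize for numerical stability
    design = np.vander(x, degree + 1, increasing=True)   # columns: 1, x, ..., x^degree
    beta, *_ = np.linalg.lstsq(design, weights, rcond=None)
    resid = weights - design @ beta
    rss = float(resid @ resid)
    n, k = design.shape
    bic = n * np.log(rss / n) + k * np.log(n)            # assumed Gaussian-likelihood BIC
    sigma2 = rss / (n - k)                               # residual variance estimate
    cov = sigma2 * np.linalg.inv(design.T @ design)      # coefficient covariance matrix
    t_top = beta[-1] / np.sqrt(cov[-1, -1])              # t of the highest-order age term
    return bic, t_top

def select_model(age, weights, degrees=(1, 2, 3)):
    """Fit linear, quadratic and cubic models and pick the smallest BIC."""
    results = {d: fit_polynomial(age, weights, d) for d in degrees}
    best = min(results, key=lambda d: results[d][0])
    return best, results

# Synthetic ICA weights with an inverted-U age trajectory (illustrative only).
rng = np.random.default_rng(0)
age = np.linspace(20, 86, 200)
weights = -0.001 * (age - 50) ** 2 + rng.normal(0, 0.05, size=age.size)
best_degree, results = select_model(age, weights)

bonferroni_alpha = 0.05 / 20   # 2.5e-03 for 20 components
```

Standardizing age changes the lower-order coefficients but leaves both the BIC and the t-statistic of the highest-order term unchanged, so the selected model and the test are unaffected. Converting the t-statistic to a p-value would require a t-distribution function, which is omitted here.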

Additionally, in order to evaluate the age range effect on the age-related patterns, we re-performed the same statistical analysis of the ICs for subjects aged 20–80 years, 20–70 years, and 20–60 years, respectively.

### RESULTS

Twenty independent components (ICs) were extracted in the ICA. The BIC and t-test revealed 16 ICs that were significantly associated with age at the Bonferroni-corrected threshold (**Figures 1**–**3**). Fifteen ICs showed significant linear declines (p < 2.50E-03), and only one IC (IC 17) had a quadratic trend (p = 5.66E-06). These structural networks included the anterior and posterior dorsal attention networks (DAN; **Figure 1**, IC 2 and IC 7), the ventral and posterior DMN (**Figure 1**, IC 6 and IC 11), the auditory network (**Figure 2**, IC 12), the sensory-motor network (**Figure 2**, IC 15), the language-related speech network (**Figure 2**, IC 3), the hippocampus-related network (**Figure 2**, IC 16), the caudate-related network (**Figure 2**, IC 9), the thalamus-related network (**Figure 2**, IC 13), the cerebellum networks (**Figure 3**, IC 4, IC 5, IC 14, IC 19, and IC 20), and the temporal lobe-related network (**Figure 3**, IC 17). The main brain clusters in each IC are described in **Table 2**. The hippocampus-related network (**Figure 2**, IC 16) showed the most significant decreasing tendency among them (p = 1.11E-77).

**Figures 1**–**3** show age-related changes in the ICs and the corresponding scatterplots with best fitted curves between ICA weights and age for each IC. **Table 3** lists the results of the regression statistics analysis.

The results for subjects aged 20–80 years showed that the ICA weights of the same 15 ICs exhibited significant linear declines (p < 2.50E-03), and only that of IC 17 had a significant quadratic trend (p = 4.17E-04). The results for subjects aged 20–70 years showed that 14 of the same ICs still exhibited significant linear declines (p < 2.50E-03), whereas one IC (IC 4) showed a non-significant linear reduction (p = 0.0191). In addition, IC 17 still had a quadratic trend, but it was non-significant (p of age<sup>2</sup> = 0.0081). The results for subjects aged 20–60 years showed that all 16 ICs had linear decreasing patterns. Twelve of the 16 ICs were still significant (p < 2.50E-03), and the remaining four ICs (IC 4, 17, 19, 20) were non-significant, with a minimum of p = 0.0062.

536 healthy adult subjects. The color bar represents Z scores. (B) and (D) columns: the orange scatterplots show the age-related patterns in different networks. The orange lines represent the fitted lines between age and ICA weights for each network.

FIGURE 2 | Age-related changes in gray matter structural networks. (A) and (C) columns: IC 12, 15, 3, 16, 9, and 13 represent structural network maps associated with age in 536 healthy adult subjects. The color bar represents Z scores. (B) and (D) columns: the orange scatterplots show the age-related patterns in different networks. The orange lines represent the fitted lines between age and ICA weights for each network.

# DISCUSSION

In the current study, we first performed multivariate ICA to investigate the brain structural covariance networks across the adult lifespan based on healthy subjects' MRI data from the IXI dataset. Then, we further explored the trajectories of the structural networks associated with age. We found 16 significant age-related networks. The ICA weights of 15 networks decreased linearly with age; only the ICA weights of the temporal lobe-related network (IC 17) showed a significant quadratic tendency with age.

In previous studies, researchers have extracted the DAN based on functional MRI (fMRI) data (De Luca et al., 2006; Fox et al., 2006; Mantini et al., 2007; Power et al., 2011). For example, Mantini et al. decomposed fMRI data via ICA to investigate the brain resting-state networks of 15 healthy subjects (20–29 years) and obtained a DAN mainly including the bilateral intraparietal sulcus (Mantini et al., 2007). We found that IC 2 and IC 7 corresponded to the anterior and posterior DAN. Some resting-state functional studies have explored functional connectivity density (FCD) changes of the DAN across the lifespan (Tomasi and Volkow, 2012; Betzel et al., 2014). Betzel et al. found two dorsal attention components (DorsAttnA and DorsAttnB); the modularity of DorsAttnA, which was mainly located in the temporo-occipital cortex, parieto-occipital cortex, and superior parietal lobule, showed a prominent age-related linear decrease of FCD across subjects aged 7–85 years (Betzel et al., 2014). Tomasi and Volkow used an FCD mapping approach and revealed statistically significant age-related FCD decreases in the DAN (r = −0.23, p < 1.00E-06) in healthy subjects (13–85 years) (Tomasi and Volkow, 2012). Our current results showed that the gray matter volumes of IC 2 and IC 7 also exhibited significant linear decreases with age (p = 1.36E-34 and p = 2.40E-13, respectively), which suggests that the functional and structural DAN have similar age-related patterns.

We found that the ICA weights of both the ventral and posterior DMN (IC 6 and IC 11) declined linearly with age from 20 to 86 years (**Figure 1**, **Tables 2, 3**). Several neuroimaging studies have proposed that the structural DMN changes with age not only during the developmental process (Bluhm et al., 2008; Supekar et al., 2010) but also in adult life (Luo et al., 2012; Spreng and Turner, 2013; Hafkemeijer et al., 2014). Spreng and Turner reported a significant linear decline with age in the structural covariance scores of the default network across the adult lifespan of 18–96 years (Spreng and Turner, 2013). Meanwhile, Hafkemeijer et al. revealed a negative association between age and gray matter volume in the DMN from 45 to 85 years of age (Hafkemeijer et al., 2014), in agreement with our results. Moreover, age-related changes can also be found in the functional connectivity (FC) of the DMN (Damoiseaux et al., 2008; Hafkemeijer et al., 2012; Onoda et al., 2012; Huang et al., 2015). Damoiseaux et al. demonstrated that the FC of the DMN decreased in older participants (age 70.7 ± 6.0 years) relative to young participants (age 22.8 ± 2.3 years) (Damoiseaux et al., 2008). The DMN consists of sub-networks, and different sub-networks are responsible for different cognitive functions (Uddin et al., 2009; Damoiseaux et al., 2012; Huang et al., 2015). The degree to which age affects the cognitive functions of the default mode sub-networks seems to differ (Huang et al., 2015). We also demonstrated that the two DMN ICs presented significant declining trends but to different degrees (p = 4.66E-50 for the ventral DMN and p = 1.41E-06 for the posterior DMN), possibly because of the distinct cognitive functions of the DMN sub-networks. Taken together, these findings indicate that the decreased functional connectivity within the DMN may be associated with structural network changes of the DMN.

### TABLE 2 | Brain regions showing age-related changes in structural networks.



\**Beta of the highest item in the regression model.*

Our ICA results revealed three other gray matter covariance networks: the auditory network (IC 12), the sensory-motor network (IC 15), and the language-related speech network (IC 3; **Figure 2** and **Table 2**). Significant linear decreases of the ICA weights with age were found in these three networks (**Table 3**). Li et al. also found that the structural associations in the auditory network and language-related speech network decreased significantly with age between the young and middle-aged groups and were relatively preserved or mildly changed between the middle-aged and old groups (Li et al., 2013). In contrast, they found an increasing tendency in structural associations within the motor network from the young group (18–23 years) to the middle-aged group (30–58 years), which differs from our result, and a downward trend from the middle-aged to the old group (60–69 years), but no significant difference between the young and old groups (Li et al., 2013). In addition, Zielinski et al. investigated the developmental structural changes in these networks in children and adolescents in four age categories (from 5 to 18 years) and found that the primary auditory and motor networks largely developed in early adolescence; in contrast, the language-related speech network showed a significant expansion in late adolescence (Zielinski et al., 2010). Further, an accelerated decline of gray matter volume in the middle and superior frontal gyrus (the main brain areas of IC 15) after the age of 20 years has been reported (Giorgio et al., 2010). A number of studies have illustrated an accelerated loss of gray matter volume in auditory-related regions (the main brain areas of IC 12) in aging adult brains (Good et al., 2001; Lemaître et al., 2005; Kalpouzos et al., 2009). The significant declines from 20 to 86 years of age shown by IC 12 and IC 15 in our study are consistent with the regional patterns of age-related gray matter loss in these studies.

Apart from the ICs discussed above, IC 16 included the left and right hippocampus and parahippocampal gyrus (**Figure 2**) and showed the most significant decreasing tendency among all ICs (p = 1.11E-77). Several studies have consistently reported an accelerated decline of gray matter volume in the hippocampus with age (Manrique et al., 2009; Fjell et al., 2013). Fjell et al. delineated age-related trajectories of the volume of 17 ROIs in healthy adults (18–94 years) via a non-parametric smoothing spline approach, and the hippocampus showed the fastest rate of loss (Fjell et al., 2013). Although we employed a different method from Fjell et al., the gray matter volume of the hippocampus-related network also showed the most severe aging-related atrophy (p = 1.11E-77) in comparison to the other networks. Further, the decline of memory and cognitive abilities with age has been frequently discussed (Schönknecht et al., 2005; Manrique et al., 2009; Rosenbaum et al., 2015). Our findings add to the growing evidence that the memory deficits of aging may be related to atrophy of the hippocampus.

We also identified five cerebellum networks: IC 4, IC 5, IC 14, IC 19, and IC 20 (**Figure 3** and **Table 2**). O'Reilly et al. found that the cerebellum contains at least two zones, a primary sensorimotor zone and a supramodal zone, which were equivalent to our network IC 14 (O'Reilly et al., 2010). Dobromyslin et al. found multiple cerebellar networks, and four of our cerebellum networks (all except IC 5) were spatially similar to four of theirs (all except a, f, g) (Dobromyslin et al., 2012). Significant linear declines of the ICA weights with age were observed in these five cerebellum networks (**Table 3**). Raz et al. revealed an age-related linear decline in the volume of the cerebellar hemispheres and vermis in healthy adults aged 18–81 years, which is in agreement with the trajectory of our cerebellum networks (Raz et al., 2001). The cerebellum is commonly involved in motor coordination and is now also considered to be related to the modulation of cognition and learning (Raz et al., 2000; Bernard and Seidler, 2014).

In our results, the trajectory of the temporal lobe-related network (IC 17; **Figure 3** and **Table 2**) was quadratic in age (t = −4.59, p of age<sup>2</sup> = 5.66E-06), showing an increasing trend from 20 to 50 years of age followed by an obvious decline. Age-related differences in the temporal anatomical network have been reported previously (Alexander et al., 2006; Brickman et al., 2007; Douaud et al., 2014; Hafkemeijer et al., 2014). Hafkemeijer et al. found that the temporal lobe-related network (network e) showed a non-significant decreasing trend in middle-aged to older adults (Hafkemeijer et al., 2014). Douaud et al. assessed brain structural networks in normal subjects (8–85 years), and the network mainly including the medial temporal areas showed a symmetric and strong non-monotonic relationship with age (Douaud et al., 2014). Alexander et al. used the SSM method to identify structural network patterns associated with age in healthy adults (22–77 years) and found that older age was associated with less gray matter in the frontal and temporal brain regions (Alexander et al., 2006). Sowell et al. found that the gray matter density in the temporal-related area showed a non-linear change with an inverted U-shaped curve with age across the lifespan (7 to 87 years) (Sowell et al., 2003). Because of the late maturation pattern of the temporal lobe in the human brain (Gogtay et al., 2004) and the memory, recognition and other functions related to the temporal lobe (Macsweeney et al., 2002; Diaconescu et al., 2013; Perrodin et al., 2014), the temporal lobe-related network might mature after other brain areas and then atrophy, thus presenting an inverted U-shaped tendency with age.

Regarding the two subcortical structures, IC 9 was recognized as a caudate-related network and IC 13 as a thalamus-related network. Damoiseaux et al. found that the functional brain network K, which contains the thalamus, putamen, insula, and transverse temporal gyrus, showed spatial overlap with IC 13 in this study (Damoiseaux et al., 2008). The trajectory of the thalamus-related network in our results showed a significant linear reduction across adulthood (**Figure 2** and **Table 3**), in accordance with Hafkemeijer et al.'s study, in which the gray matter volume in this network displayed a slightly negative association with age (Hafkemeijer et al., 2014). Although other studies have reported age-related neuroanatomical volume changes in subcortical structures, such as the caudate and thalamus, their findings were not entirely consistent (Fjell et al., 2013; Pfefferbaum et al., 2013; Serbruyns et al., 2015). Serbruyns et al. investigated the subregional atrophy of the bilateral thalamus and caudate from 22 to 79 years of age; the right thalamus showed atrophy from the 5th decade, and the left thalamus and the bilateral caudate from the 6th decade (Serbruyns et al., 2015). In Pfefferbaum et al.'s study, the best-fitting trajectories of the thalamus and caudate from 20 to 85 years were quadratic models (Pfefferbaum et al., 2013). To date, there have been relatively few studies on these two networks. Thus, their network patterns associated with age need to be investigated further.

We repeated the same statistical analysis of the ICs for three different age ranges (20–80, 20–70, and 20–60 years) to evaluate the effect of age range on the age-related patterns. Compared with the significant age-related results for subjects aged 20–86 years, the results for subjects aged 20–80 years showed the same age-related patterns. Although these results had different p-values from the original analysis, all met the significance level. The results for subjects aged 20–70 years showed similar age-related patterns. Specifically, among the 15 ICs reported as linear, 14 remained significant and one (IC 4) became non-significant. In addition, IC 17 still had a quadratic trend, but it was non-significant. The results for subjects aged 20–60 years showed slightly different age-related patterns. Specifically, all 16 ICs showed linear decreasing trends, with 12 significant ICs and four non-significant ICs (IC 4, 17, 19, 20). Overall, the results of these additional analyses are consistent with our original findings and also demonstrate the effect of age range on the age-related change patterns of brain structural networks. In our results for subjects aged 20–86 years, the trajectory of IC 17 showed a quadratic change with age. However, for shorter age ranges, such as 20–70 and 20–60 years, the quadratic trend was less pronounced or even changed to a linear path. Indeed, the significance level is associated with the sample size and the age distribution of participants in each age group. Results should therefore be reported with caution, and age-related patterns should be stated together with the specific age range of the subjects.

A specific limitation of our study is the estimation of the number of components. For ICA-related studies, there is a lack of available standards to determine the optimal number of ICA components. Most studies adopted 12 to 25 components for structural networks or resting-state functional networks (Beckmann et al., 2005; Damoiseaux et al., 2006, 2008; Smith et al., 2009). Based on these studies, we chose 20 as the number of ICA components. Additionally, the number of available subjects older than 80 years was smaller than those of the other age groups in the IXI database (http://www.brain-development.org). More subjects older than 80 years need to be included to confirm our findings. Finally, our study only investigated structural MRI data and lacked resting-state fMRI and diffusion tensor imaging (DTI) data. In future studies, we shall combine multi-modality data to examine anatomical and functional networks and the age-related relationships between them.

# CONCLUSION

In the current study, we used a multivariate ICA method to investigate the structural covariance patterns of gray matter volume through adulthood in 536 healthy subjects. Of the 16 age-related structural networks, all except the temporal lobe-related network showed a linear decline with age from 20 to 86 years. Our results largely confirmed previously reported findings. We note that our findings are confirmatory in nature but cover a continuous age range, thereby extending the previous findings. Our findings not only provide insight into the patterns of age-related structural changes at the network level in the human brain, but also provide a foundation for understanding abnormal aging.

# AUTHOR CONTRIBUTIONS

KLiu, SY, KC, LY, and XG conceived and designed the study. KLiu, SY, KC, and XG developed the methods. KLiu and SY analyzed and interpreted data. JZ, LY, KLi, and ZJ interpreted data. KLiu, SY, and XG drafted the manuscript.

### ACKNOWLEDGMENTS

This work was supported by the National Natural Science Foundation (NNSF), China (61671066), the Funds for International Cooperation and Exchange of NNSF, China (61210001), Key Program of NNSF, China (91320201), the Fundamental Research Funds for the Central Universities, China, China Scholarship Council, China, the National Institute of Mental Health, US (RO1 MH57899), the National Institute on Aging, US (9R01AG031581-10, P30 AG19610), and the State of Arizona. All structural MRI data used in the preparation of this article were obtained from a large public database of the Information eXtraction from Images (IXI) (http://brain-development.org/ixi-dataset/), funded by Engineering and Physical Sciences Research Council (EPSRC) of the UK (EPSRC GR/S21533/02).

### REFERENCES

Damoiseaux, J. S., Rombouts, S. A. R. B., Barkhof, F., Scheltens, P., Stam, C. J., Smith, S. M., and Beckmann, C. F. (2006). Consistent resting-state networks across healthy subjects. Proc. Natl. Acad. Sci. U.S.A. 103, 13848–13853. doi: 10.1073/pnas.0601417103


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Liu, Yao, Chen, Zhang, Yao, Li, Jin and Guo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Relationship of Cognitive Performance and the Theta-Alpha Power Ratio Is Age-Dependent: An EEG Study of Short Term Memory and Reasoning during Task and Resting-State in Healthy Young and Old Adults

Janet P. Trammell<sup>1</sup>\*, Priscilla G. MacRae<sup>1</sup>, Greta Davis<sup>1</sup>, Dylan Bergstedt<sup>1</sup> and Ariana E. Anderson<sup>2,3</sup>

### Edited by:

*James H. Cole, Imperial College London, United Kingdom*

### Reviewed by:

*Olusola Ajilore, University of Illinois at Chicago, United States Ines R. Violante, University of Surrey, United Kingdom*

### \*Correspondence:

*Janet P. Trammell janet.trammell@pepperdine.edu*

Received: *09 June 2017* Accepted: *23 October 2017* Published: *07 November 2017*

### Citation:

*Trammell JP, MacRae PG, Davis G, Bergstedt D and Anderson AE (2017) The Relationship of Cognitive Performance and the Theta-Alpha Power Ratio Is Age-Dependent: An EEG Study of Short Term Memory and Reasoning during Task and Resting-State in Healthy Young and Old Adults. Front. Aging Neurosci. 9:364. doi: 10.3389/fnagi.2017.00364*

*<sup>1</sup> Division of Social Sciences and Natural Sciences, Seaver College, Pepperdine University, Malibu, CA, United States, <sup>2</sup> Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States, <sup>3</sup> Department of Statistics, University of California, Los Angeles, Los Angeles, CA, United States*

Objective: The Theta-Alpha ratio (TAR) is known to differ based upon age and cognitive ability, with pathological electroencephalography (EEG) patterns routinely found within neurodegenerative disorders of older adults. We hypothesized that cognitive ability would predict EEG metrics differently within healthy young and old adults, and that healthy old adults not showing age-expected EEG activity may be more likely to demonstrate cognitive deficits relative to old adults showing these expected changes.

Methods: In 216 EEG blocks collected in 16 young and 20 old adults during rest (eyes open, eyes closed) and cognitive tasks (short-term memory [STM]; matrix reasoning [RM; Raven's matrices]), models assessed the contributing roles of cognitive ability, age, and task in predicting the TAR. A general linear mixed-effects regression model was used to model this relationship, including interaction effects to test whether increased cognitive ability predicted TAR differently for young and old adults at rest and during cognitive tasks.

Results: The relationship between cognitive ability and the TAR across all blocks showed age-dependency, and cognitive performance at the CZ midline location predicted the TAR measure when accounting for the effect of age (*p* < 0.05, chi-square test of nested models). Age significantly interacted with STM performance in predicting the TAR (*p* < 0.05); increases in STM were associated with increased TAR in young adults, but not in old adults. RM showed similar interaction effects with aging and TAR (*p* < 0.10).

Conclusion: EEG correlates of cognitive ability are age-dependent. Adults who did not show age-related EEG changes were more likely to exhibit cognitive deficits than those who showed age-related changes. This suggests that healthy aging should produce moderate changes in Alpha and TAR measures, and the absence of such changes signals impaired cognitive functioning.

Keywords: cognition, aging neuroscience, Theta Alpha Ratio, EEG, aging

# INTRODUCTION

From 2014 to 2060, the number of older adults in the U.S. population is expected to more than double (US Census Bureau, 2016), an increase that will likely lead to parallel increases in the number of age-related chronic diseases. For instance, the prevalence of dementia—a disease with annual costs estimated to be \$157–\$215 billion (Hurd et al., 2013)—increases from 1 in 20 persons for those 71–79 years of age, to 1 in 3 persons for those over 90 years of age (Plassman et al., 2007). With aging, there is a linear decline in executive function beginning in the third decade of life, despite the fact that overall acquired knowledge (crystalized intelligence) continues to improve through the first five decades of life (Salthouse, 2012). Therefore, understanding cognitive function in old adults and being able to identify brain activity associated with optimal cognitive performance could lead to the development of better prevention and treatment of dementia and cognitive impairment.

Changes in cognition with aging have been examined in numerous studies, but the nature of this relationship has only more recently been examined with electroencephalography (EEG) technology. These studies have examined age-related changes in cognition and EEG metrics, often with inconsistent results. Cognitive functioning, as measured by memory, declines with increasing age (e.g., Gilbert and Levee, 1971; Salthouse, 1990; Parkin and Walter, 1991; Yokota et al., 2000; Hartman et al., 2001; Salat et al., 2002), but the EEG metrics associated with or underlying this decline are poorly understood. For instance, Delta (1–4 Hz) relative power has been found to correlate with cognitive performance with inconsistent results. Vlahou et al. (2014) found Delta power positively associated with executive function and perceptual speed in old, but not young adults, suggesting that associations between cognitive performance and Delta power may depend upon age. Finnigan and Robertson (2011) did not show a relationship between Delta relative power and any cognitive measure (recall, attention, executive function) in 73 healthy old adults (mean age 60) who had subjectively complained of memory loss but had no objective measures of memory dysfunction.

Other EEG metrics, such as Alpha (8–12 Hz), are also associated with cognitive performance. Individual Peak Alpha Frequency (iPAF), the peak of spectral Alpha power of the EEG, is generally positively correlated with memory and attention at all ages (Klimesch, 1997, 1999; Angelakis et al., 2004; Clark et al., 2004; Grandy et al., 2013). However, Finnigan and Robertson (2011) found no relationship between iPAF and cognitive performance in old adults. Independent of neurocognitive performance, Alpha rhythms decrease as a function of age (Chiang et al., 2011), with even more dramatic change in Alpha rhythms seen in neurodegenerative disorders leading to dementia such as Huntington's disease (Streletz et al., 1990) and Alzheimer's disease (Montez et al., 2009; Basar and Guntekin, 2013). This suggests that age should modulate the associations of cognitive ability with Alpha power.

Theta (4–8 Hz) power studies of cognitive ability have also generated less-consistent age-dependent findings. Significant positive correlations were reported between Theta power and cognitive deficits in healthy adults (Jelic et al., 1996) and higher baseline Theta power was found to be indicative of subsequent cognitive decline (Jelic et al., 2000; Prichep et al., 2006). In contrast, others reported Theta power positively associated with memory, executive functioning, perceptual speed, reasoning, and attention in old adults (Cummins and Finnigan, 2007; Cummins et al., 2008; Finnigan and Robertson, 2011; Vlahou et al., 2014). During the encoding phase of a spatial navigation task (cognitive mapping), Lithfous et al. (2015) found that Theta activity positively correlated with accuracy for young but not old adults. In comparison to young adults, old adults showed both reduced accuracy and reduced Alpha and Theta during encoding.

In addition to separate Theta and Alpha band analyses, their ratio has more recently been implicated as a potentially important indicator of cognitive ability in old adults. In older adults with amnesic mild cognitive impairment (aMCI), the Theta to Alpha Ratio (TAR) was increased relative to controls (Bian et al., 2014), a finding closely echoed by Moretti (2015). The reverse metric, the Alpha to Theta Ratio, was used to discriminate individuals with probable Alzheimer's disease from healthy older controls (Schmidt et al., 2013), and differed between patients with mild and severe Alzheimer's disease (Penttilä et al., 1985). Furthermore, the Alpha 1 (8–10 Hz) to Theta Ratio was able to discriminate individuals with and without cognitive impairment among older individuals with Parkinson's disease (Bousleiman et al., 2015). Differences in the TAR metric between young and old adults have not yet been examined. In sum, changes in Theta and Alpha bands are likely predictive of cognitive impairment (Klimesch, 1999; Fonseca et al., 2011) in old adults, but this relationship may depend upon age.

To investigate the relationship between aging, EEG metrics (Theta, Alpha, and TAR) and cognitive performance (short term memory and reasoning), we modeled these relationships in 36 healthy adults, 16 young and 20 old, using EEG across six different blocks (Resting and Active states). Given the associations of cognitive performance with Theta/Alpha levels within old adults, we hypothesized that EEG signatures previously associated with cognitive functioning may differ within healthy young and old adults. The outcome measure evaluated was TAR in three midline regions (Fz, Cz, Pz). Furthermore, we assessed whether the correlations between EEG metrics (iPAF, relative Alpha, relative Delta, and relative Theta) and cognitive performance were the same during the cognitive tasks as at rest, as most research has investigated the relationship only between resting EEG metrics and cognition. Collectively, this manuscript identifies whether EEG signatures of increased cognitive performance are consistent within healthy old and young adults, and whether aging changes these signatures in the absence of any overt pathology.

## METHOD

### Participants

Sixteen young adults (20.7 ± 0.9 years, range 20–29 years of age, 8 women and 8 men) and 20 high functioning old adults (72.9 ± 2.5 years, range 70–79 years of age, 14 women and 6 men) completed the study after signing informed consent approved by the Institutional Review Board of Pepperdine University. Young adults were recruited from the university via on-campus advertisements. Old adults were recruited from the local community via advertisements in the local senior center newsletter. All participants received a \$20 gift card to a local grocery store for participating in the study.

### Procedures

Young and old participants completed a questionnaire about their health history. Participants diagnosed with a concussion, stroke, epilepsy, neurological disease (dementia, Parkinson's disease, schizophrenia), or diabetes; or who experienced a heart attack, congestive heart failure, or cancer in the last year; or who were currently taking hypertensive or psychotropic medications, were excluded from the study. After passing the initial health history screening, qualified participants were asked to refrain from consuming alcohol, caffeinated beverages, and any other central nervous system stimulants for 4 h prior to the cognitive assessment session. Once the participant arrived at the lab, he/she gave informed consent and was screened for cognitive impairment, depression, and visual deficits. In order to be included in the study, participants were required to pass the cognitive assessment, scoring >26 (range 0–30) on the Mini-Mental State Examination (MMSE; Folstein, 1975); the depression assessment, scoring <4 on the abbreviated 15-item Geriatric Depression Scale (GDS; Sheikh and Yesavage, 1986); and demonstrate normal or corrected-to-normal vision (20/20). All 16 of the young adults successfully passed all screens and 20 out of 21 old adults passed. One older woman was excluded for scoring below 26 on the MMSE. Following these screenings, each participant was prepared for EEG recording and completed the cognitive tasks (described below). The session took approximately 90 min.

### EEG Recording

Each participant was seated in a dimly lit room and fitted with an electrode cap. EEG was sampled with 19 electrodes in standard 10–20 International Electrode System placements (FP1, FP2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5/P7, P3, Pz, P4, T6/P8, O1, O2), with reference to linked ears. Impedance was maintained below 10 kOhms and within 1.5 kOhm difference between sites. EEG data was collected using a Mitsar 201 M amplifier, EEGStudio v1.6/WinEEG v.2.103.70 software (Mitsar Ltd., St. Petersburg, Russia), and electrode caps (Electro-Cap Intl. Inc., Eaton, OH).

EEG recording involved separate recording blocks in the following order: an initial 5-min eyes-open resting baseline (EO1, block 1), an initial 5-min eyes-closed resting baseline (EC1, block 2), randomly ordered cognitive tests (matrix and short term memory tests, blocks 3 and 4), a final 5-min eyes-open resting baseline (EO2, block 5), and a final 5-min eyes-closed resting baseline (EC2, block 6). EEG data was plotted, filtered (bandpass: 0.1–30.0 Hz, notch: 55.0–65.0 Hz), and carefully inspected using manual artifact-rejection for all tasks. Episodic artifacts including eye blinks, eye movements, jaw tension, body movements, and EKG interference were removed from all channels by two trained researchers and subsequently reviewed by a third researcher to reach consensus on any discrepancies. Relative power was computed by dividing the band specific values by total power. The data was divided into epochs of 500 samples of continuous artifact free data (2 s). The WinEEG spectral analysis tool was used to determine EEG relative power activity in the following frequency bands: Delta (0.5–4.0 Hz), Theta (4.0–8.0 Hz), Alpha (8.0–12.0 Hz), and Beta (12.0–24.0 Hz), during EC1 and during the cognitive tasks. iPAF was determined by evaluating the maximal difference peak (6.0–13.5 Hz) in occipital and parietal electrode sites during Alpha suppression (EC1-EO1). A natural log transform was applied to all EEG variables to normalize the data distribution.
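As a rough illustration of the spectral measures described above, the sketch below computes relative band power and iPAF from artifact-free 2-s epochs. It is not the WinEEG implementation: the Hann-windowed periodogram, the 0.5–24 Hz range used as "total power", the function names, and the synthetic signals are all assumptions made for the example. The 250 Hz sampling rate is inferred from 500-sample epochs lasting 2 s.

```python
import numpy as np

FS = 250  # Hz; inferred from 500-sample epochs lasting 2 s
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 12.0), "beta": (12.0, 24.0)}


def mean_spectrum(epochs, fs=FS):
    """Average Hann-windowed periodogram over epochs (rows = epochs)."""
    epochs = np.asarray(epochs, dtype=float)
    epochs = epochs - epochs.mean(axis=1, keepdims=True)  # remove DC offset
    window = np.hanning(epochs.shape[1])
    power = np.abs(np.fft.rfft(epochs * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(epochs.shape[1], d=1.0 / fs)
    return freqs, power.mean(axis=0)


def relative_power(freqs, psd, bands=BANDS, total=(0.5, 24.0)):
    """Band power divided by total power (total range is an assumption)."""
    in_total = (freqs >= total[0]) & (freqs < total[1])
    total_power = psd[in_total].sum()
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() / total_power
            for name, (lo, hi) in bands.items()}


def ipaf(freqs, psd_ec, psd_eo, lo=6.0, hi=13.5):
    """Frequency of the largest EC-minus-EO power difference in [lo, hi]."""
    mask = (freqs >= lo) & (freqs <= hi)
    diff = psd_ec[mask] - psd_eo[mask]
    return freqs[mask][np.argmax(diff)]


# Synthetic single-channel data: noise for eyes open, noise plus a
# 10 Hz rhythm for eyes closed (illustrative only, not study data).
rng = np.random.default_rng(0)
t = np.arange(500) / FS
eo = rng.normal(size=(30, 500))
ec = rng.normal(size=(30, 500)) + 3.0 * np.sin(2 * np.pi * 10.0 * t)

freqs, psd_eo = mean_spectrum(eo)
_, psd_ec = mean_spectrum(ec)
rel_ec = relative_power(freqs, psd_ec)
peak = ipaf(freqs, psd_ec, psd_eo)
print(rel_ec, peak)
```

With 2-s epochs the frequency resolution is 0.5 Hz, so iPAF estimated this way can only take values on a 0.5 Hz grid. In the study, the resulting EEG variables were natural-log transformed before analysis.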

Since Alpha rhythm changes with age, memory load, and pathology, the logarithm of the Theta/Alpha ratio was calculated within each block for each electrode using the maximum power within two frequency bands for Alpha: Alpha 1 (8–10 Hz) and Alpha 2 (10–12 Hz), as recommended by Klimesch (1999) and Haegens et al. (2014). This allows the Alpha power to vary based upon activity and age, yet constrains it within the frequency band of the individual peak Alpha frequency for all but two participants. These models were additionally replicated after removing the two older participants whose iPAF of 7.5 Hz was outside the 8–12 Hz region, to assess whether deviations in peak Alpha may unduly influence the relationship between cognitive ability and the TAR. Finally, all models were also replicated using Alpha set uniformly at 8–12 Hz, to assess whether a fixed Alpha band power would identify a similar relationship to a variable Alpha.
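A minimal sketch of the log TAR calculation follows. It takes one reading of the description above, namely that Alpha is the larger of the Alpha 1 and Alpha 2 band powers; the function name and the band-power values are hypothetical and are not study data.

```python
import math


def log_tar(theta, alpha1, alpha2):
    """Natural log of Theta / max(Alpha 1, Alpha 2) for one electrode and block."""
    alpha = max(alpha1, alpha2)  # assumed reading of "maximum power within two bands"
    if theta <= 0 or alpha <= 0:
        raise ValueError("band powers must be positive")
    return math.log(theta / alpha)


# Hypothetical relative powers at one electrode, for illustration only
blocks = {
    "EC":  {"theta": 0.15, "alpha1": 0.40, "alpha2": 0.20},
    "STM": {"theta": 0.25, "alpha1": 0.10, "alpha2": 0.15},
}
tar = {name: log_tar(**powers) for name, powers in blocks.items()}
print(tar)
```

Because the ratio is log-transformed, a value below zero means Alpha exceeds Theta, which is the pattern one would expect when Alpha rises during an eyes-closed block.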

### Cognitive Tasks

The cognitive assessments were completed on a computer using E-Prime® software. Participants completed two cognitive tasks: Short Term Memory (STM) and Raven's Matrices (RM). The cognitive tasks were counterbalanced for each participant and occurred during EEG blocks 3 and 4. In STM, participants were shown a list of 12 words (see Appendix A in Supplementary Material) in random order appearing sequentially for 1 s each. Immediately after the last word, participants were given as much time as needed to recall the words. The STM task was repeated for a total of four trials using the same randomized 12-item word list. In order to control for typing inability in some participants, all participants were asked to write their responses on a blank sheet of paper and responses were then typed into the program by the experimenter. In the RM task, which was used to measure reasoning, participants selected a missing image from an incomplete 3 × 3 matrix, based on horizontally- and vertically-progressing patterns. Participants were allotted 10 min to complete 18 problems, and performance was measured by the percent correct of total attempted problems.

### Modeling

In primary analyses, a general linear mixed-effects regression model was used to predict the TAR using age group (young vs. old), block (EO, EC, STM, and RM), total STM score, and RM percent correct. Interaction effects were included for STM scores/Age/RM scores to investigate whether EEG correlates of cognitive performance may be age-dependent, and to account for the covariance between the cognitive tests (STM and RM). This model was assessed separately for each location (Fz, Cz, Pz). Participant ID was included as a random effect to account for repeated measures. To directly test the hypothesis that age modulates the relationship between cognitive performance and the TAR, this model was compared with a model predicting the TAR without age in a hierarchical regression, using a chi-square ANOVA test of nested models.
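One way to write the model described above is sketched below. It assumes pairwise interactions among age group, STM, and RM (the exact interaction terms are not spelled out in the text), a random intercept per participant, and no further covariates. Here $i$ indexes participants and $b$ blocks, $A_i$ is the age-group indicator, $S_i$ the total STM score, and $R_i$ the RM percent correct.

$$
\log \mathrm{TAR}_{ib} = \beta_0 + \beta_{\mathrm{block}(b)} + \beta_A A_i + \beta_S S_i + \beta_R R_i + \beta_{AS} A_i S_i + \beta_{AR} A_i R_i + \beta_{SR} S_i R_i + u_i + \varepsilon_{ib},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ib} \sim \mathcal{N}(0, \sigma^2)
$$

The nested-model comparison then takes the usual likelihood-ratio form, with both models fit by maximum likelihood:

$$
\chi^2 = 2\,\bigl(\ell_{\text{full}} - \ell_{\text{reduced}}\bigr), \qquad \mathrm{df} = p_{\text{full}} - p_{\text{reduced}}
$$

Under the assumptions of this sketch, the reduced model drops $\beta_A$, $\beta_{AS}$, and $\beta_{AR}$, giving $\mathrm{df} = 3$; the degrees of freedom in the actual analysis depend on the terms the authors included.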

In secondary analyses, descriptive statistics detailing the correlations among cognitive performance, age and EEG metrics were computed.

## RESULTS

### Cognitive Tests

### STM

Young adults (M<sup>Y</sup> = 8.75, SD = 2.02) recalled more words than old adults (M<sup>O</sup> = 6.70, SD = 2.06), t(34) = 3.00, p = 0.005, across all four STM trials (see **Figure 1**). In t-tests for each trial individually, young adults recalled significantly (p ≤ 0.05) or marginally (p ≤ 0.10) more words than old adults on each of the four trials. As results for each trial were similar, trial 4 was used for secondary correlational analyses.


### Reasoning

Out of 18 problems, young adults had a higher percentage of correctly completed problems [M<sup>Y</sup> = 55.69, SD = 15.16; M<sup>O</sup> = 32.57, SD = 12.98; t(34) = 4.93, p < 0.0005, see **Figure 1**]. RM and STM were correlated (r = 0.60), with a similar relationship for both age groups (see **Figure 2**).
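The two group comparisons reported above can be checked from the published summary statistics. The sketch below assumes a pooled-variance (Student's) two-sample t-test, which is consistent with the reported 34 degrees of freedom for groups of 16 and 20; the function name is illustrative.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's two-sample t statistic and degrees of freedom from summary stats."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Means and SDs as reported in the text (young n = 16, old n = 20)
t_stm, df_stm = pooled_t(8.75, 2.02, 16, 6.70, 2.06, 20)
t_rm, df_rm = pooled_t(55.69, 15.16, 16, 32.57, 12.98, 20)
print(round(t_stm, 2), df_stm, round(t_rm, 2), df_rm)
```

Both statistics come out within rounding of the reported values, t(34) = 3.00 for STM and t(34) = 4.93 for RM.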

### EEG Analyses

### Primary Model: TAR and Cognitive Performance

Cognitive performance, task, age, and gender were used to predict TAR EEG activity across blocks in three midline locations. For Fz (frontal) and Pz (posterior) locations, only block (EO, EC, STM, RM) was statistically significant in predicting the TAR (p < 0.001, Supplementary Tables 1, 2). For all regions, the EC block produced significantly lower TAR than the EO, STM, or RM blocks (p < 0.001).

For the Cz (central) region, the TAR was dependent upon cognitive performance only after accounting for age, as determined by a hierarchical regression comparing the ability to predict the TAR with and without age (chi-square test of nested models, p < 0.05; **Tables 1**, **2**). Cognitive performance significantly interacted with age in the regression model such that young adults had substantial increases in TAR with increased STM compared to old adults (p < 0.05; **Figure 3**, **Table 1**). For decreased RM, subjects showed increased TAR after holding constant the effects of age, STM, and task (p < 0.05). The interaction effect between aging and RM suggested that young adults also had increases in TAR with increasing RM compared to older adults after holding constant all else (p < 0.10; **Table 2**, **Figure 4**).

To assess the sensitivity of these findings to individual variation in Alpha, these models were also replicated after excluding the 2 participants whose iPAF fell outside the traditional 8–12 Hz window, which did not change these findings (Supplementary Table 3). Similarly, results were consistent when using Alpha fixed at 8–12 Hz instead of separate Alpha 1 (8–10 Hz) and Alpha 2 (10–12 Hz) windows. This replication suggests that the individual variation in Alpha did not drive the interactions between aging and cognitive performance. When modeling just Alpha separately, increased Alpha was associated with increased cognitive ability in general, but the aging effects were not statistically significant (p > 0.05, Supplementary Table 4).

When assessing Theta and Alpha separately, increases in STM performance were associated with a decrease in Theta and Alpha for both age groups (**Figure 5**). Holding constant cognitive performance, young adults had greater Theta and Alpha than old adults. For RM, age showed different relationships with cognitive performance in predicting Theta and Alpha power (**Figure 6**), with increased RM showing increased Alpha in young adults and decreased Alpha and decreased Theta in old adults.

### Secondary Model: Descriptive Analyses of Fz, Cz, and Pz by Age and Bandwidth

EEG activity during EC1 and during cognition was correlated with iPAF, relative Delta power, relative Theta power, and relative Alpha power at the three midline sites: Fz, Cz, and Pz. The cognitive task dependent measures were trial 4 of the STM (as results for trials 1–3 were similar to trial 4) and RM percent correctly completed.

### Correlations

All correlation analyses were conducted with Pearson correlations, two-tailed tests. Statistical significance was not reached after Bonferroni correction, so these results are presented descriptively with uncorrected p-values; see Supplementary Tables 5–8 for correlations by age, for the entire sample, and for young and old adults separately.
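To make the correction step concrete, the sketch below computes a Pearson correlation and applies a Bonferroni adjustment to a family of p-values. All numbers are hypothetical placeholders rather than study data, and the helper names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and which tests remain significant."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p <= alpha / m for p in p_values]
    return adjusted, significant


# Hypothetical scores and relative Delta values for six participants
stm = [5, 7, 6, 9, 8, 10]
rel_delta = [0.20, 0.24, 0.21, 0.30, 0.27, 0.29]
r = pearson_r(stm, rel_delta)

# Hypothetical uncorrected two-tailed p-values from a family of three tests
raw_p = [0.004, 0.030, 0.200]
adj_p, significant = bonferroni(raw_p)
print(round(r, 3), adj_p, significant)
```

The adjustment scales with the number of tests in the family, so with many site-by-band-by-task correlations and a small sample, uncorrected effects of moderate size can readily fail to survive correction.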

### Age

iPAF was significantly negatively correlated with age. During the EC1 condition, relative Delta was significantly negatively correlated with age at the Fz site. Relative Theta was significantly negatively correlated with age at Fz and Cz. Relative Alpha was not significantly correlated with age.

TABLE 1 | Theta Alpha Ratio (TAR) model parameters for Cz.

*Cognitive performance did not predict the EEG TAR measure until accounting for the effects of age (p < 0.05, chi-square ANOVA test of nested models). The TAR within an EEG block was modeled using a general linear mixed-effects regression model. Increases in STM were more strongly associated with an increased TAR (p < 0.05) in young than old adults. Increases in RM were associated with decreases in TAR. Interpretation of coefficients is with respect to a baseline of an Older Male during Eyes Closed EEG block, implying that Eyes Closed condition was significantly different than all other tasks (p < 0.001).*

*Significance codes: 0.001 "\*\*\*"; 0.01 "\*\*"; 0.05 "\*"; 0.1 "."*

### iPAF

iPAF was not correlated with STM performance but was significantly positively correlated with RM across the entire sample. iPAF was not correlated with performance within young or within old adults.



*When not including the effects of age, the TAR changed with activity but did not significantly depend on cognitive performance. Interpretation of coefficients is with respect to a baseline of an Older Male during Eyes Closed EEG block, implying that Eyes Closed condition was significantly different than all other tasks (p < 0.001).*

*Significance codes: 0.001 "\*\*\*"; 0.01 "\*\*"; 0.05 "\*"; 0.1 "."*

### Relative Delta

Across the entire sample, relative Delta was generally positively correlated with cognitive performance. During EC1, relative Delta was significantly positively correlated with STM at all three sites. During the STM task, relative Delta was significantly positively correlated with STM at Pz. RM was not correlated with relative Delta during EC1 or during the matrix task. For young adults, during EC1, relative Delta was significantly positively correlated with STM at Fz and marginally correlated at Cz and Pz. Relative Delta was not significantly correlated with performance during the STM task. Relative Delta was also not significantly correlated with RM performance during EC1 or during the RM task. For old adults, relative Delta was generally positively correlated with performance. During EC1, relative Delta was significantly positively correlated with STM at Pz and marginally positively correlated at Cz. Relative Delta during the STM task was also significantly positively correlated at Pz. Relative Delta was not correlated with RM performance.

### Relative Theta

Across the entire sample, relative Theta was positively correlated with cognitive performance, but this was sensitive to activity/block. STM performance was not significantly correlated with relative Theta during EC1 or during the STM task. During EC1, relative Theta was not significantly correlated with RM performance. During the Matrix task, relative Theta was significantly positively correlated with RM at the Fz and Cz sites. For both young and old adults separately, no correlations reached statistical significance, but relative Theta was generally negatively correlated with performance.

### Relative Alpha

Relative Alpha was generally negatively correlated with cognitive performance for the entire sample, although no correlations reached statistical significance. Correlations for young adults showed that relative Alpha was negatively correlated with performance. During EC1, relative Alpha was not significantly correlated with STM performance. During the STM task, relative Alpha was negatively correlated with STM performance at Pz. Relative Alpha was also not significantly correlated with RM performance during EC1 or during the RM task. Correlations for old adults likewise showed that relative Alpha was generally negatively correlated with performance. During EC1, relative Alpha was not correlated with STM performance. During the STM task, relative Alpha was significantly negatively correlated with STM at Pz. Relative Alpha was not correlated with RM performance.

## DISCUSSION

Consistent with the literature, young adults reliably outperformed old adults on STM and RM measures of cognitive function (Salthouse, 2012). The TAR marker of cognitive performance showed different trends for high-functioning young and old adults in the Cz region. For all locations, the most significant predictor of TAR was activity: TAR increased in EO, STM, and RM blocks compared to EC blocks (p < 0.001). The most likely reason for this TAR decrease during the EC block is the general increase in Alpha, occurring during wakeful relaxation with shut eyes (Barry et al., 2007).

Some differences seen between the RM and STM cognitive assessments/TAR measurement may reflect the different brain regions recruited during such cognitive tasks; matrix reasoning produces fMRI activation in right frontal and bilateral parietal regions (Prabhakaran et al., 1997), while verbal short-term memory tasks produce activation in the posterior temporal regions, supramarginal gyri, Broca's area, and dorsolateral premotor cortex (Henson et al., 2000). Different susceptibility of these brain regions to the aging process, as well as compensatory recruitment of other regions for specific cognitive tasks, may also contribute to the differences seen between the two assessments (STM, RM) within our old cohort. For example, patients with Alzheimer's disease showed decreased fMRI cortical activation but increased hippocampal activation during a STM task relative to controls, suggesting compensatory recruitment (Peters et al., 2009). Similarly, cortical recruitment strategies change with age; healthy elderly adults used frontal areas for a spatial working memory task, whereas healthy younger adults recruited parietal areas (McEvoy et al., 2001).

Increased TAR was associated with decreases in RM performance in old adults, with the opposite trend seen in young adults. This is consistent with the findings of Bian et al. (2014), who found that diabetic older patients with MCI had an increased TAR compared to diabetic older adults without MCI, as well as Moretti (2015) who found an increased Theta frequency power in MCI adults due to Alzheimer's disease. The increased Theta frequency was also associated with hippocampal atrophy, which suggests that a greater TAR would be associated with greater hippocampal atrophy when holding Alpha constant (Moretti, 2015).

Progressive atrophy of the hippocampus, measured through MRI, also correlated with decreased EEG cortical Alpha power in adults with Alzheimer's disease (Babiloni et al., 2009), while Penttilä et al. (1985) found that decreases in Alpha in those with Alzheimer's disease were specific to late-stage disease. Jelic et al. (1996) found cognitive impairment was associated with increased Theta and decreased Alpha power in adults with Alzheimer's disease. Although our study also found that increased Theta was associated with decreased cognitive functioning in healthy old adults, decreased Alpha was associated with increased cognitive ability for RM (**Figure 6**). This suggests that the relationship between cognitive ability and Alpha is non-linear, where optimal cognitive functioning occurs when Alpha shows moderate age-associated changes. This closely echoes the earlier MEG findings of Vlahou et al. (2014), who found enhanced Delta and Theta power with increased executive functioning and perceptual speed, but only within healthy older adults, further suggesting a non-linear relationship between aging, brain activity, and cognitive functioning. Healthy aging produces changes in EEG spectra in the absence of pathology (Polich, 1997), so age-related and pathology-related power changes seen in diseases of the elderly are ultimately dependent upon the "control" group being studied.

For all participants within most blocks, cognitive performance (STM and RM) was generally positively correlated with iPAF, relative Delta, and relative Theta, and negatively correlated with relative Alpha (Supplementary Table 6), although these pairwise correlations did not survive correction for multiple comparisons and are presented descriptively. As seen in the separate correlational analyses for young adults (Supplementary Table 7), iPAF and relative Delta did not show correlations in a general direction, and both relative Theta and relative Alpha were generally negatively correlated with cognitive performance. In contrast, old adults showed general positive correlations between cognitive performance and both iPAF and relative Delta (Supplementary Table 8). Similarly, old adults showed general negative correlations between cognitive performance and both relative Theta and relative Alpha. Furthermore, correlations were generally similar both at rest (EC1) and during cognition. As relative Delta activity correlated similarly for young and old adults, our correlational data suggest that these relationships with cognitive performance remain stable regardless of age. Overall, few bivariate correlations were significant after correction for multiple comparisons, likely due to the individual blocks being underpowered because of small sample size and the use of two-tailed (nondirectional) rather than one-tailed (directional) testing. Age was negatively correlated with iPAF, relative Delta, and relative Theta (Supplementary Table 5).

Overall, our correlations are consistent with other findings in regards to iPAF (Klimesch, 1997, 1999; Angelakis et al., 2004; Clark et al., 2004; Grandy et al., 2013), relative Delta (Vlahou et al., 2014), relative Theta (Jensen and Tesche, 2002; Cummins and Finnigan, 2007; Cummins et al., 2008; Vlahou et al., 2014) and findings of Alpha power inversely correlated with age (Hartikainen et al., 1992; Hong and Rebec, 2012) but inconsistent with Finnigan and Robertson (2011), who found positive relationships between relative Theta and cognitive performance in old adults, but no association of resting iPAF, relative Delta power, and relative Alpha power with cognitive performance. Thus, while our study design was a close replication of Finnigan and Robertson (2011), with the addition of a young adult group and an older age range in the old adult group (70–79 instead of 55–73), and EEG metrics recorded both at rest and during the cognitive assessment, our correlational findings align better with other studies.

Our findings have lent some clarity to the mixed results in the literature regarding age, EEG, and cognitive performance, yet there are some shortcomings to our study. Because our sample size was small, this study was underpowered to detect the smaller effect sizes in other brain regions. Thus, the non-significance of the TAR in this analysis in frontal regions (Fz) does not confirm the absence of an age-dependent cognition relationship, but may rather suggest that the effect size is smaller in Fz than in the central region (Cz). A larger sample would also have allowed us to evaluate more hypotheses, which were purposefully few to avoid false positives due to multiple comparisons. Our findings were in a subset of high-functioning healthy adults, so EEG correlates of cognitive performance may differ in participants having an exclusionary medical diagnosis. Moreover, participants were not followed after the conclusion of the study to track incidence of exclusionary diagnoses, so some may have been in the premorbid stages during the study period. Lastly, while we deliberately chose to focus on a narrow age range for old adults (70–79), it is possible that these results are not generalizable to old adults across a larger age spectrum, which may explain the differences between our findings and those of Finnigan and Robertson (2011), which included participants as young as 55. Future research would benefit from a longitudinal analysis of adult EEG activity and cognitive performance across a larger age spectrum.

FIGURE 5 | As STM scores increased, Cz Theta and Alpha decreased for both age groups. Shaded area indicates 95% confidence intervals. Plots do not illustrate the contributing effects of other covariates (e.g., block and STM).

Future research should include additional measures of cognition and brain activity. For instance, the Stroop task would serve as a measure of inhibitory control, and may be associated with other EEG measures such as P3 amplitude (Saliasi et al., 2013). In addition, it is unlikely that age by itself is the only predictor of EEG activity and performance. Many other factors that vary widely in old adults, such as physical activity, education, sleep quality and quantity, and social support and interaction, are likely to co-vary with and predict cognition. Our descriptive measures of EEG (e.g., localized relative power) were purposefully selected because their simplicity makes them more clinically applicable; however, more complex global measures of EEG behavior such as coherence may yield a deeper understanding of the relationship between cognitive performance and age. This also is a direction for future work.

In conclusion, this study adds to the existing body of knowledge by illustrating that healthy aging is associated with changes in EEG activation patterns and cognitive performance. EEG markers such as the TAR may disambiguate cognitive changes specific to healthy and pathological aging. The significant interaction effects between aging and cognitive performance indicated that a failure to show age-related changes resulted in "young" EEG signatures but impaired cognitive performance. Rather than aging-related changes being a marker of detriments in cognitive performance, a "healthy" person whose EEG patterns do not change with age is more likely to exhibit cognitive impairment than a person who shows normal age-related changes.



## ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the following guidelines: APA guidelines, the Belmont Report, U.S. Department of Health and Human Services PART 46: PROTECTION OF HUMAN SUBJECTS, California Experimental Subjects Bill of Rights, and Pepperdine University's Seaver IRB, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by Pepperdine University's Seaver IRB.

## AUTHOR CONTRIBUTIONS

All authors meet the following authorship contribution criteria: Substantial contributions to the conception or design of the work, as well as the acquisition, analysis, and interpretation of data for the work; Drafting the work and revising it critically; Final approval of the version to be published; Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

## ACKNOWLEDGMENTS

The authors would like to thank Nancy Zelaya, Jessica Chao, and Brian Cheah for their assistance with data collection; SenseLabs, and in particular Sarah Wyckoff and Leslie Sherlin, for training our staff and the use of their EEG equipment; and the Malibu Senior Center for assistance with participant recruitment. Supported in part by the National Institutes of Health (NIH): NIMH R03MH106922, NIA K25AG051782. AA holds a Career Award at the Scientific Interface from BWF.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnagi.2017.00364/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Trammell, MacRae, Davis, Bergstedt and Anderson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Physiological Aging Influence on Brain Hemodynamic Activity during Task-Switching: A fNIRS Study

Roberta Vasta<sup>1\*†</sup>, Simone Cutini<sup>2†</sup>, Antonio Cerasa<sup>3,4</sup>, Vera Gramigna<sup>1</sup>, Giuseppe Olivadese<sup>3</sup>, Gennarina Arabia<sup>4</sup> and Aldo Quattrone<sup>1,3,5</sup>

*<sup>1</sup> Neuroscience Research Center, University Magna Graecia, Catanzaro, Italy, <sup>2</sup> Department of Developmental Psychology, University of Padova, Padova, Italy, <sup>3</sup> Institute of Bioimaging and Molecular Physiology, Neuroimaging Research Center, Consiglio Nazionale Delle Ricerche (CNR), Catanzaro, Italy, <sup>4</sup> Institute S. Anna, Research in Advanced Neurorehabilitation, Crotone, Italy, <sup>5</sup> Institute of Neurology, Department of Medical and Surgical Sciences, University Magna Graecia, Catanzaro, Italy*

The task-switching (TS) paradigm is a well-known, validated tool for exploring the neural substrates of cognitive control, in particular the activity of the lateral and medial prefrontal cortex. This work aimed to investigate how physiological aging influences the hemodynamic response during the execution of a color-shape TS paradigm. Multi-channel functional near-infrared spectroscopy (fNIRS) was used to measure hemodynamic activity in 38 healthy volunteers (55% male, age range: 19–69 years), comprising 27 young (30.00 ± 7.90 years) and 11 elderly participants (57.18 ± 9.29 years), during the execution of the TS paradigm. Two holders were placed symmetrically over the left and right hemispheres to record cortical activity [oxy- (HbO) and deoxy-hemoglobin (HbR) concentrations] of the dorso-lateral prefrontal cortex (DLPFC), the dorsal premotor cortex (PMC), and the dorso-medial part of the superior frontal gyrus (sFG). The TS paradigm requires participants to repeat the same task over a variable number of trials and then to switch to a different task during the trial sequence. A two-sample *t*-test was carried out to detect differences in cortical responses between groups. Multiple linear regression analysis was used to evaluate the impact of age on prefrontal neural activity. Elderly participants were significantly slower than young participants in both color- (*p* < 0.01, *t* = −3.67) and shape-single tasks (*p* = 0.026, *t* = −2.54) as well as in switching (*p* = 0.026, *t* = −2.41) and repetition trials (*p* = 0.012, *t* = −2.80). Differences in cortical activation between groups were revealed for the HbO mean concentration of the switching task in the PMC (*p* = 0.048, *t* = 2.94). In the whole group, switching trials were associated with significantly longer reaction times, which correlated positively with age. Multivariate regression analysis revealed that the HbO mean concentration of the switching task in the PMC (*p* = 0.01, β = −0.321) and of the shape single-task in the sFG (*p* = 0.003, β = 0.342) were the best predictors of age effects. Our findings demonstrate that TS might be a reliable instrument for measuring cognitive resources in older people. Moreover, the fNIRS-related brain activity extracted from the frontoparietal cortex might become a useful indicator of aging effects.

Keywords: task-switching, physiological aging, functional near-infrared spectroscopy, cognitive control, regression analysis

### Edited by:

*Nicolas Cherbuin, Australian National University, Australia*

### Reviewed by:

*Felix Scholkmann, University Hospital Zurich, Switzerland Frithjof Kruggel, University of California, Irvine, United States*

### \*Correspondence:

*Roberta Vasta r.vasta@unicz.it*

*† These authors have contributed equally to this work.*

Received: *20 September 2017* Accepted: *15 December 2017* Published: *08 January 2018*

### Citation:

*Vasta R, Cutini S, Cerasa A, Gramigna V, Olivadese G, Arabia G and Quattrone A (2018) Physiological Aging Influence on Brain Hemodynamic Activity during Task-Switching: A fNIRS Study. Front. Aging Neurosci. 9:433. doi: 10.3389/fnagi.2017.00433*

# INTRODUCTION

The task-switching paradigm is a well-known and validated tool for exploring executive control processes and the neural correlates of cognitive cost, and it is often used to assess age-related executive deficits (Wilckens et al., 2017). Typically, this paradigm requires the repetition of the same task over a variable number of trials (i.e., repetition trials) and the rapid alternation between two different tasks at some point in the trial sequence (i.e., switch trials). For each trial a reaction time (RT) is registered. Switch cost refers to the finding that performance is slower (longer RTs) and less accurate on switch trials than on repetition trials, and it is thought to reflect the executive processes required to deactivate the task set relevant on the previous trial and to activate the currently relevant task set (Monsell, 2003). Task-switching performance may be improved using task cues, which provide valid information about the upcoming target and allow time to prepare for a given trial (Schapkin et al., 2014). The task-cue effect is associated with the maintenance and reconfiguration of a task set in working memory (Wilckens et al., 2017).

In the last decades several studies have highlighted the fundamental role of the dorsolateral and ventrolateral prefrontal cortex (dlPFC, vlPFC), the supplementary and pre-supplementary motor areas (SMA, pre-SMA) and the superior and inferior lobules of the parietal cortex in task-switching (Dove et al., 2000; Braver et al., 2003; Brass and von Cramon, 2004; Wager et al., 2004; Ruge et al., 2005; Badre and Wagner, 2006; Crone et al., 2006; Slagter et al., 2006; for reviews see Ruge et al., 2013; Jamadar et al., 2015).

Physiological aging modulation of these brain areas in switching task has been extensively investigated by using both structural and functional advanced magnetic resonance imaging (DiGirolamo et al., 2001; Milham et al., 2002; Gold et al., 2010; Zhu et al., 2014; Hakun et al., 2015; Eich et al., 2016; Jolly et al., 2017), suggesting that age-related changes in behavioral performance are associated with changes in neural patterns of activation. Specifically, older adults show a less specific cerebral activation and the recruitment of additional frontal regions that are not activated in younger adults (DiGirolamo et al., 2001; Milham et al., 2002; Gold et al., 2010).

Although these neuroimaging modalities are highly effective and reliable, they are also very expensive and invasive. fNIRS is a non-invasive neuroimaging technique that makes it possible to investigate brain hemodynamics with reasonable temporal and spatial resolution, quantifying task-related changes in oxygenated hemoglobin (HbO) and deoxygenated hemoglobin (HbR) concentrations (Scholkmann et al., 2014). Crucially for our purposes, fNIRS provides a remarkable added value in cognitive neuroscience because it allows information on cortical activity to be gathered while overcoming some limitations imposed by other neuroimaging techniques, thereby increasing the ecological validity of the tasks used to test participants (Cutini et al., 2012, 2014; Cutini and Brigadoi, 2014). Being non-invasive, portable, and less susceptible to motion artifacts than other neuroimaging techniques gives NIRS strong ecological validity for use in situated cognition paradigms (Ferreri et al., 2014).

For this reason, this neuroimaging technique has gained growing interest in the last 10 years, particularly in the field of cognitive aging (Agbangla et al., 2017), with studies focusing on several domains of cognitive function, such as language (Scherer et al., 2012; Amiri et al., 2014), episodic memory (Ferreri et al., 2014), executive functions (verbal fluency; Herrmann et al., 2006; Kahlaoui et al., 2012; Obayashi and Hara, 2013), working memory (Vermeij et al., 2012, 2014a,b, 2016), and inhibition and cognitive flexibility (Schroeter et al., 2003; Laguë-Beauvais et al., 2013; Hagen et al., 2014; Müller et al., 2014).

In the context of attentional control functions, very few studies have monitored brain hemodynamic changes during task-switching by using fNIRS technology (Cutini et al., 2008; Laguë-Beauvais et al., 2013). Among them, just one study has evaluated the effect of physiological aging on the modulation of brain areas during inhibition and switching tasks (Laguë-Beauvais et al., 2013). The authors compared fNIRS-related functional brain activation patterns in the prefrontal cortex of older and younger adults during a modified Stroop task with interference and switching conditions, using a classical univariate statistical approach. Conversely, the lesson learnt from one decade of neuroimaging studies provides consistent evidence of the advantage of multivariate analysis in moving from group-level statistical results to a full description of a biological phenomenon (Habeck, 2010).

For this reason, our aim was to develop an ecologically sound and easily applicable means of assessing both behavioral and neurofunctional age-related changes in task-switching by using functional near-infrared spectroscopy (fNIRS) and a multivariate statistical approach. In this study, we investigated the hemodynamic response in the frontoparietal areas during the execution of a task-switching paradigm by means of fNIRS in a population of healthy participants, and we characterized the selective influence of physiological aging on the brain hemodynamic response by using multiple linear regression.

Within the present framework, we sought to explore whether the recruitment of additional frontal regions is a pervasive phenomenon that can be observed in the vast majority of the frontal lobe or whether it is restricted to a subset of regions. In this regard, multiple linear regression gave us the chance to observe a possible dissociation between those regions that might help to compensate for age-related cognitive decline and those regions that might indeed be less activated in elderly participants.

### MATERIALS AND METHODS

Participants were recruited from the University of Catanzaro, Polyclinic "Magna Graecia," community recreational centers and hospital personnel through local advertisements. Inclusion criteria were: (1) no evidence of dementia or depression symptoms according to DSM-V criteria; (2) no use of antidepressant, anxiolytic, or antipsychotic drugs that could affect cerebral blood flow; (3) right-handedness; and (4) absence of chronic medical conditions (heart disease, hypertension, or diabetes). According to these criteria, 38 right-handed healthy volunteers (21 males and 17 females, in the age range of 19–69 years, mean age = 37.87 ± 14.94 years) were considered eligible for this study. All participants had normal or corrected-to-normal vision and normal color vision. All participants gave written informed consent. The study was approved by the Ethical Committee of the University "Magna Graecia" of Catanzaro, according to the Helsinki Declaration.

# Experimental Procedure

The experiment was carried out in a sound-attenuated and dimly lit room. Participants were seated in a comfortable chair while performing a color-shape task-switching paradigm (Hakun et al., 2015) that was designed using E-Prime 3.0 software (Schneider et al., 2012; Psychology Software Tools, Pittsburgh, PA). The synchronization between fNIRS recording and stimulus timing was performed through RS232 serial port communication. The stimuli consisted of two possible shapes (circle or square), in one of two possible colors (red or blue), presented on a computer screen. Participants were asked to keep the index and middle fingers of the right hand on the "left" and "right" arrow keys of the computer keyboard, respectively, throughout the entire experiment.

At the beginning of each trial, an instructional cue (the word "color" or "shape") was displayed to the participant for 150 ms. Upon the presentation of a color stimulus, participants had to press the "right" key in response to "red," or the "left" key in response to "blue." Upon the presentation of a shape stimulus, participants had to press the "right" key for "square," or the "left" key for "circle." Each stimulus was presented for a maximum of 3,000 ms and was replaced, upon detection of a response, with a black screen whose duration varied randomly from 8 to 10 s. Then a 200 ms central fixation (plus sign) signaled the start of the next trial. Before recording, participants received task instructions and practiced each condition.

As shown in **Figure 1**, the task was composed of three main blocks: (a) in the color block, participants were required to distinguish between red and blue stimuli; (b) in the shape block, participants were asked to judge whether the stimulus was a circle or a square; (c) in the switching block, shape and color stimuli were shown alternately to participants. Blocks (a) and (b) were regarded as single blocks, whereas switching blocks included repetition and switch trials.

In single blocks, participants performed a sequence of 20 experimental trials with the same instructional cue (color or shape) repeated on each trial, while during switching blocks the color and shape tasks were presented pseudo-randomly, with an equal number of repetitions and switches between consecutive trials (40 switch trials and 40 repetition trials).
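The pseudo-random ordering of a switching block can be sketched as follows. This is a minimal illustration, not the authors' E-Prime script: the function name `make_switching_block`, the extra unclassified starting trial, and the use of a seeded shuffle are all assumptions made for the example.

```python
import random

def make_switching_block(n_switch=40, n_repeat=40, seed=None):
    """Return a list of (task, trial_type) pairs for one switching block.

    Hypothetical helper: the first trial has no predecessor, so it is
    labeled "start" and is followed by n_switch + n_repeat classified trials.
    """
    rng = random.Random(seed)
    # Equal numbers of switch and repeat transitions, in random order.
    transitions = ["switch"] * n_switch + ["repeat"] * n_repeat
    rng.shuffle(transitions)

    current = rng.choice(("color", "shape"))
    trials = [(current, "start")]
    for kind in transitions:
        if kind == "switch":
            current = "shape" if current == "color" else "color"
        trials.append((current, kind))
    return trials

if __name__ == "__main__":
    for task, kind in make_switching_block(seed=0)[:6]:
        print(task, kind)
```

Shuffling the list of transitions, rather than the list of tasks, guarantees the stated 40/40 split between switch and repetition trials by construction.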

RTs and the percentage of correct responses were calculated for single blocks, repetition trials, and switch trials.

An independent two-sample t-test was performed to evaluate behavioral differences between groups in task-switching performance. We calculated Cohen's d (Cohen, 1998) as a measure of effect size, which indicates the magnitude of mean differences (using the estimated marginal means) in SD units.
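As an illustration of these two quantities, the sketch below computes the pooled-variance (Student) t statistic and Cohen's d for two independent samples. It assumes the pooled-SD form of d and equal-variance t-test; the paper does not state which variant its statistical software applied, and the function names are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    return math.sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b))
                     / (na + nb - 2))

def cohens_d(a, b):
    """Mean difference expressed in pooled-SD units."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)

def student_t(a, b):
    """Equal-variance two-sample t statistic (df = len(a) + len(b) - 2)."""
    na, nb = len(a), len(b)
    return (mean(a) - mean(b)) / (pooled_sd(a, b) * math.sqrt(1 / na + 1 / nb))

if __name__ == "__main__":
    # Toy reaction times (arbitrary units), not data from the study.
    young = [1.0, 2.0, 3.0, 4.0, 5.0]
    elderly = [2.0, 3.0, 4.0, 5.0, 6.0]
    print(cohens_d(young, elderly), student_t(young, elderly))
```

With the first group entered as the faster one, both statistics are negative, which matches the sign convention of the t and d values reported for the RT comparisons.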

On the whole group, an ANOVA analysis was carried out on median RT values and the percentage of correct responses in order to highlight possible differences among single blocks, repetition and switch trials. Moreover, a simple regression analysis was performed in order to evaluate the effect of aging on task-switch performances.

FIGURE 2 | Anatomical ROIs localization. Rendering of the skull surface showing the detection channels to record brain activity in the right and left dorso-lateral prefrontal cortex (DLPFC), the right and left dorsal premotor cortex (PMC), and the right and left dorso-medial part of the superior frontal gyrus (sFG) during color-shape task-switching paradigm, according to 10/20 system.

### fNIRS Probe Location and Data Acquisition

A 52-channel NIRS machine (ETG-4000 Optical Topography System; Hitachi Medical Co., Japan) working with two different wavelengths (695 and 830 nm) and a sample frequency of 10 Hz was used to measure relative changes of absorbed near-infrared light during color-shape task switching.

Two "4 × 4" measurement grids were attached to a regular swimming cap. Eight emitters and eight detectors, yielding a total of 24 measurement channels, were used for each hemisphere (**Figure 2**), with each source/detector pair separated by 3 cm. According to the international 10/20 system, the channel grids were placed to cover the following ROIs in each hemisphere: the dorsolateral prefrontal cortex (DLPFC), the dorsal premotor cortex (PMC) and the dorso-medial part of the superior frontal gyrus (sFG) (see "Data Sheet 1" for an example of photon migration and penetration depth). **Figure 2** shows that the dorso-medial part of the sFG, the DLPFC, and the PMC were bilaterally covered by the present probe arrangement.

# fNIRS Data Analysis

A preliminary visual inspection of the fNIRS intensity signal time-course of each source-detector pair was used to detect the presence of physiological activity and to test fNIRS signal quality.

Noisy source-detector pairs were manually discarded on the basis of the absence of physiological activity in both the 830 and 695 nm signals. Channels that visually showed movement artifacts were excluded from the analysis. A moving average method with a window width of 5 s was used to identify and remove any short-term movement artifacts. Raw fNIRS data were converted into optical density changes and then bandpass filtered between 0.005 and 0.5 Hz to remove low-frequency drifts and cardiac fluctuations. The relative changes in the concentrations of HbO, HbR, and total hemoglobin (HbT) were estimated from changes in the optical properties of the light using the Beer-Lambert law (Cope and Delpy, 1988; Delpy et al., 1988).
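The conversion from light intensity to hemoglobin concentration changes can be sketched as below, assuming the modified Beer-Lambert formulation with two wavelengths, in which each optical density change equals (ε_HbO·ΔHbO + ε_HbR·ΔHbR) multiplied by the source-detector distance and a differential pathlength factor (DPF). The function names are hypothetical, and the extinction coefficients and DPF values are inputs the reader must supply from a published table; none are taken from the paper.

```python
import math

def optical_density_change(intensity, baseline_intensity):
    """Delta OD = -ln(I / I0) for one wavelength and one sample."""
    return -math.log(intensity / baseline_intensity)

def mbll_two_wavelengths(d_od_1, d_od_2, eps_1, eps_2,
                         distance_cm, dpf_1, dpf_2):
    """Solve the 2x2 modified Beer-Lambert system for (dHbO, dHbR).

    eps_k is a pair (eps_HbO, eps_HbR) at wavelength k. Units must be
    consistent between the coefficients and the path length.
    """
    path_1 = distance_cm * dpf_1
    path_2 = distance_cm * dpf_2
    a11, a12 = eps_1[0] * path_1, eps_1[1] * path_1
    a21, a22 = eps_2[0] * path_2, eps_2[1] * path_2
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("wavelengths do not separate HbO from HbR")
    # Cramer's rule for the two unknown concentration changes.
    d_hbo = (d_od_1 * a22 - a12 * d_od_2) / det
    d_hbr = (a11 * d_od_2 - a21 * d_od_1) / det
    return d_hbo, d_hbr
```

Total hemoglobin (HbT) then follows as the sum of the two returned values.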

Each trial was baseline corrected by subtracting the mean intensity of the optical signal recorded during the 2 s preceding trial onset from the overall hemodynamic activity.
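A minimal sketch of this baseline correction is given below, assuming a signal stored as a plain list sampled at the 10 Hz rate reported for the device, so that the 2 s pre-onset window spans 20 samples. The function name and argument layout are hypothetical.

```python
def baseline_correct(trial_signal, onset_index, fs_hz=10.0, baseline_s=2.0):
    """Subtract the mean of the pre-onset window from the whole trial.

    trial_signal: samples that include at least baseline_s seconds
    before the trial onset located at onset_index.
    """
    n_baseline = int(round(fs_hz * baseline_s))
    if onset_index < n_baseline:
        raise ValueError("not enough samples before trial onset")
    window = trial_signal[onset_index - n_baseline:onset_index]
    offset = sum(window) / len(window)
    return [sample - offset for sample in trial_signal]
```

After correction the pre-onset window has zero mean, so post-onset values express the change relative to the state immediately before the trial.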

Then the HbO and HbR mean concentrations during the vascular response were calculated for each subject and task in all channels of interest from the standardized (z-score) grand average waveform.

# Statistical Analysis

Two different statistical approaches were used to evaluate the age-related influence on task-switching activity. Initially, we considered the aging effect as a discrete factor. For this reason, we divided the healthy sample into young and elderly groups. Age ≥50 years was used as the cut-off for defining elderly people, since several studies demonstrated the presence of physiological neurodegenerative processes starting after the so-called nonelderly adult phase (18–50 years; Pieperhoff et al., 2008; Terribilli et al., 2011). Twenty-seven young participants (mean age = 30.00 ± 7.90 years) and 11 elderly participants (mean age = 57.18 ± 9.29 years) were matched for gender (Chi-square test, p > 0.05) and education level (t-test, p > 0.05).

Next, simple regression or multivariate regression analyses considering the aging effect as a continuous factor were employed. Statistical analyses were performed with SPSS Version 12.0 (https://www.ibm.com/software/products/it/spss-statistics). Assumptions of normality were tested for all continuous variables by using the Kolmogorov–Smirnov test. Unpaired t-tests and analysis of variance were employed as appropriate for behavioral data. Finally, Pearson correlation analysis was used to evaluate the relationship between age and task performance. For all statistical analyses, a p-level of 0.05 was considered to be significant. Moreover, Cohen's d was also calculated as a measure of effect size (Cohen, 1998).

A similar approach was used for fNIRS data. We started with an independent two-sample t-test to evaluate differences between groups in hemodynamic activation (HbO and HbR mean concentration) within ROIs. Next, to evaluate how physiological aging could selectively influence brain hemodynamic response, we performed a multiple linear regression, according to the model: age = β x (predictors) + constants. In particular, with the aim of quantifying the relative contribution to aging of each task (color single-task, shape single-task, repetition trials, switch trials), each hemodynamic parameter (HbO and HbR mean concentrations) and each ROI (for both the hemispheres separately), we performed a regression analysis using a multiple linear model including all predictors.
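The model age = β × (predictors) + constant can be illustrated with an ordinary least-squares fit. The sketch below solves the normal equations in plain Python and reports R²; it is an illustration of the technique rather than the SPSS procedure used in the study, the function names are hypothetical, and it returns unstandardized coefficients, so its output is not directly comparable to the β values reported in the Results.

```python
def fit_ols(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... ; returns [b0, b1, ...].

    X is a list of predictor rows without the intercept column.
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented normal equations: (A^T A | A^T y).
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(M[k][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(p):
            if k != col:
                factor = M[k][col] / M[col][col]
                M[k] = [a - factor * b for a, b in zip(M[k], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]

def r_squared(X, y, beta):
    """Proportion of variance in y explained by the fitted model."""
    pred = [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in X]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot
```

In this setting each row of `X` would hold one participant's HbO and HbR mean concentrations per task and ROI, and `y` would hold the participants' ages.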

# RESULTS

### Demographic Data

Demographic features of all subjects are summarized in **Table 1**. No differences were detected in gender (p = 0.876) and education level (p = 0.312) between young and elderly participants.

### Behavioral Data

An independent two sample t-test revealed significantly longer reaction time (RT) for elderly compared to young participants in both color- (p < 0.01, t = −3.67; Cohen's d = −1.21, effect size = 0.52) and shape-single tasks (p = 0.026, t = −2.54; Cohen's d = −0.84, effect size = 0.39) as well as switching (p = 0.026, t = −2.41; Cohen's d = −0.79, effect size = 0.37) and repetition trials (p = 0.012, t = −2.80; Cohen's d = −0.92, effect size = 0.42) (see **Table 1**). No significant differences between groups were detected in the correct response percentage (**Table 1**).
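The text does not state how the "effect size" values reported next to Cohen's d were obtained. One common conversion from d to a correlation-type effect size, r = |d| / √(d² + 4), which assumes equal group sizes, reproduces the reported values to two decimals. The check below is a sketch under that assumption; that the authors used this formula is not confirmed by the source.

```python
import math

def d_to_r(d):
    """Convert Cohen's d to a correlation-type effect size (equal-n form)."""
    return abs(d) / math.sqrt(d * d + 4.0)

# (Cohen's d, reported effect size) pairs as given in the text.
reported = {
    "color single-task": (-1.21, 0.52),
    "shape single-task": (-0.84, 0.39),
    "switching trials": (-0.79, 0.37),
    "repetition trials": (-0.92, 0.42),
}

if __name__ == "__main__":
    for name, (d, r_reported) in reported.items():
        print(f"{name}: r = {d_to_r(d):.2f} (reported {r_reported:.2f})")
```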

An ANOVA analysis was carried out on median RT values and the percentage of correct responses of the whole group, considering the four task conditions (color, shape, switch, and repetition trials). As expected, switch trials were associated with longer RTs (F = 12.4, p < 0.001) and worse accuracy (F = 5.6; p = 0.001) with respect to single blocks (**Figure 3**). A simple regression analysis was performed in order to evaluate the effect of aging on switch trials. As expected, performance was positively correlated with age for repetition (r = 0.378; p = 0.019) and switching trials (r = 0.316; p = 0.045).

### fNIRS Data

An independent two-sample t-test revealed differences in cortical activation between young and older participants for HbO mean concentration of switching task in the left PMC (p = 0.048, t = 2.94; effect size = 0.44 and Cohen's d = 0.97).

The multiple linear regression analysis, performed on the whole sample, highlighted that the age variance explained by the linear model was about 80% (R<sup>2</sup> = 0.806) and that the best age predictors were the HbO mean concentration for the shape single-task in the sFG (p = 0.003, β = 0.342) and the HbO mean concentration for the switching task in the PMC (p = 0.01, β = −0.321) (**Figure 4**).

# DISCUSSION

In the last decades, neuroimaging studies have focused particularly on understanding the neurofunctional bases of the effects of physiological aging on cognitive processes. In particular, functional neuroimaging studies have shown that cognitive control processes involve a broad network centered on frontoparietal areas (e.g., Corbetta and Shulman, 2002; Dosenbach et al., 2008), which are thought to subserve different cognitive operations (D'Esposito et al., 1995; Duncan et al., 1996; Posner and DiGirolamo, 1998). Age-related worsening in behavioral performance is associated with changes in neural patterns of activation, involving the under-recruitment of task-specific regions (deactivations and a decreased spatial extent of activation), hemispheric lateralization, and the recruitment of additional brain areas, especially frontal regions (DiGirolamo et al., 2001; Milham et al., 2002; Gold et al., 2010). This increased frontal activation has led to opposing interpretations: evidence of an adaptive compensatory mechanism that preserves cognitive functioning (Reuter-Lorenz and Cappell, 2008; Davis et al., 2009; Reuter-Lorenz and Park, 2014) or evidence of age-related brain dysfunction (Colcombe et al., 2005; Rypma et al., 2005, 2006; Zarahn et al., 2007; Stern, 2009; Gold et al., 2013; Zhu et al., 2015).

The most common tasks used to define cognitive reserve in elderly people are Go/NoGo (inhibitory control), n-back (working memory), and task-switching (cognitive control). Age-related alterations in brain activation tend to be especially pronounced on tasks that emphasize cognitive control processes (Drag and Bieliauskas, 2010). As a consequence, among these, task-switching has been extensively used to evaluate the influence of physiological aging on executive deficits (DiGirolamo et al., 2001; Milham et al., 2002; Gold et al., 2010; Zhu et al., 2014; Hakun et al., 2015; Eich et al., 2016; Jolly et al., 2017). However, the existing knowledge in this research field has mainly been obtained with advanced neuroimaging methods, such as positron emission tomography (PET) (Berry et al., 2016), structural MRI (Zhu et al., 2014; Jolly et al., 2017), and functional MRI (DiGirolamo et al., 2001; Milham et al., 2002; Gold et al., 2010; Hakun et al., 2015; Eich et al., 2016).

Although these conventional neuroimaging modalities have proven to be effective and reliable, their well-known limitations (they are very expensive, invasive, and impose several constraints on patients with physical limitations) make them unsuitable for large-scale application. fNIRS is a non-invasive neuroimaging technique able to investigate in vivo brain hemodynamics (Villringer et al., 1993) with reasonable temporal and spatial resolution, quantifying task-related changes in oxygenated hemoglobin (HbO) and deoxygenated hemoglobin (HbR) concentrations (Cutini et al., 2014).


Owing to its advantages of being non-invasive, more ecological than conventional neuroimaging methodologies, and able to investigate in vivo brain hemodynamics (Villringer et al., 1993) with reasonable temporal and spatial resolution, fNIRS technology has been recognized as a suitable tool for application in the field of cognitive aging (Agbangla et al., 2017).

The most consistently reported pattern of age-related differences in brain activation is the increased involvement of the prefrontal cortex. This overactivation in older adults is often interpreted as a compensatory mechanism when it is concomitant with preserved cognitive performance. Recent results of fNIRS studies on working memory (Vermeij et al., 2012, 2014a,b, 2016) are in agreement with this model, showing that higher activation at a high cognitive load was predictive of higher behavioral improvements, whereas relatively higher prefrontal recruitment at a low cognitive load was related to worse behavioral performance and improvement.

In addition, to the best of our knowledge only one study (Laguë-Beauvais et al., 2013) monitored age-related modulation on task switch using fNIRS during a modified Stroop task. Their univariate statistical approach also confirmed that the two executive processes of interference and switching are associated with distinct patterns of prefrontal activation and that both these patterns appear more spread out in the PFC of older adults.

Our work aimed at overcoming the intrinsic limit of univariate statistics by combining the well-known task-switching paradigm and fNIRS technology with a multivariate statistical approach. Indeed, we were able to disentangle the relative contribution of age-related functional alterations of frontoparietal areas and to evaluate a possible dissociation between different mechanisms (deactivations or hyperactivation) that those regions adopt to compensate for age-related cognitive decline.

In particular, our behavioral data confirm previous evidence of the importance of task-switching in defining cognitive cost. Moreover, this greater cognitive demand correlates with age, confirming that the aging effect can be captured during specific cognitive tasks. In addition, fNIRS confirms that this effect also occurs at the neurobiological level, with an increase in functional activity of the frontal areas. Although the association between sFG activity and aging has been found in other neuroimaging studies (DiGirolamo et al., 2001; Milham et al., 2002; Gold et al., 2010; Zhu et al., 2014, 2015; Hakun et al., 2015; Berry et al., 2016; Eich et al., 2016; Jolly et al., 2017), the new finding that merits highlighting is the opposite trend in the correlation between PMC activity (HbO and HbR mean concentrations) and aging. Although surprising, this result may be explained by the fact that age-related cognitive decline is associated with an increase in compensatory functional activity and a simultaneous decrease in the cortical activity of other regions. In particular, it has been demonstrated that, during working memory with increasing task load, older adults showed decreased connectivity and a decreased ability to suppress activity in other brain regions. The deactivation of other strictly connected brain regions is essential for the correct execution of cognitively demanding tasks (Sambataro et al., 2010). The positive correlation between age and HbO concentration change in the sFG during the single task could be explained in terms of an additional effort required by the working memory process in older people. This finding is consistent with the results of a neuropsychological study (du Boisgueheneuc et al., 2006), which found that patients with a left sFG lesion exhibited a working memory deficit when compared with all control groups.

It is worth noting that the present probe arrangement did not include short-separation channels, which are typically employed to eliminate systemic, task-dependent physiological oscillations that might confound the evaluation of the task-evoked brain hemodynamic response (Tachtsidis and Scholkmann, 2016). The issue arises for a specific reason: besides capturing hemodynamic variations related to cortical activity, the signal from fNIRS channels with a standard source-detector distance (e.g., 3 cm) is contaminated by physiological hemodynamic fluctuations (e.g., heartbeat and Mayer waves) located both in the vasculature of the layers overlying the brain and in the brain itself (Caldwell et al., 2016). Because the proportion of photons reaching the cortex decreases as the source-detector distance is shortened, short-separation channels measure the same global, superficial hemodynamic fluctuations visible in standard channels while being insensitive to brain activity (Brigadoi and Cooper, 2015); indeed, confounding effects from extracerebral contamination and systemic factors are eliminated by regressing the signal obtained from the short channels out of the one observed in the standard channels. This procedure ensures that the activity found in standard channels can be safely attributed to brain activation. Although with the present arrangement we cannot completely rule out physiological contamination in our results, two aspects deserve careful consideration. The first concerns the design of the experimental protocol: the stringent control condition provided by repetition trials makes it unlikely that the different hemodynamic patterns between switch and repetition trials can be attributed to physiological oscillations; the experimental paradigm was specifically designed to have just one difference (i.e., the reconfiguration of the task set) between switch and repetition trials. The same line of reasoning has been recently highlighted in theoretical works (e.g., Scholkmann et al., 2013; Tachtsidis and Scholkmann, 2016) and it can be appreciated in recent fNIRS studies on clinical populations (e.g., Cutini et al., 2016). Second, the hallmark of extracerebral contamination is its ubiquitous presence in all channels, implying that all regions should show the same hemodynamic pattern; crucially, in our results we observed a clear functional dissociation between PMC and sFG. Taken together, these two observations strongly suggest that the hemodynamic activity found in the present study is mainly driven by cortical activation.
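The short-channel regression described above, which was not available with the present probe arrangement, can be sketched as a single-regressor least-squares fit: the short-channel signal is scaled to best match the standard channel and then subtracted. The function name is hypothetical, and published pipelines often use more elaborate variants (for example, several short channels or a general linear model).

```python
def regress_out_short_channel(standard, short):
    """Remove the component of `standard` explained by `short`.

    Both inputs are equal-length lists of samples. Returns the cleaned,
    mean-centered standard-channel signal and the fitted scaling factor.
    """
    if len(standard) != len(short):
        raise ValueError("signals must have the same length")
    n = len(standard)
    mean_short = sum(short) / n
    mean_std = sum(standard) / n
    s0 = [s - mean_short for s in short]
    y0 = [y - mean_std for y in standard]
    denom = sum(s * s for s in s0)
    if denom == 0:
        raise ValueError("short-channel signal is constant")
    # Least-squares scaling of the superficial signal.
    beta = sum(s * y for s, y in zip(s0, y0)) / denom
    cleaned = [y - beta * s for y, s in zip(y0, s0)]
    return cleaned, beta
```

The residual is, by construction, uncorrelated with the short-channel signal, which is why the remaining task-related activity is attributed to deeper tissue.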

In conclusion, we might speculate that the two active regions found with fNIRS are both bound to physiological aging but they might be representative of two distinct cognitive processes that are partially dissociable.

### AUTHOR CONTRIBUTIONS

RV: analyses and interpretation of the data, statistical analysis and drafting/revising the manuscript, final approval of the version to be published; SC: study concept and design, analyses and interpretation of the data, statistical analysis, and drafting/revising the manuscript, final approval of the version to be published; AC: study concept and design, data collection and interpretation and drafting/revising the manuscript, final approval of the version to be published; VG: data collection, analysis and interpretation, and drafting/revising the manuscript, final approval of the version to be published; GO: data collection and final approval of the version to be published; GA and AQ: drafting/revising the manuscript, final approval of the version to be published.

# FUNDING

This study was supported by MIUR (Ministero Universita' e Ricerca Italiana; PON03PE\_00009-NEUROMEASURES).

### ACKNOWLEDGMENTS

The authors wish to thank Prof. Christophe Grova, Associate Professor in Physics Department and PERFORM centre, Concordia University and Zhengchen Cai, PhD student in the same department for their help and support in the determination of probabilistic model of photon migration through the head and in performing sensitivity analysis.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnagi.2017.00433/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Vasta, Cutini, Cerasa, Gramigna, Olivadese, Arabia and Quattrone. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Bayesian Optimization for Neuroimaging Pre-processing in Brain Age Classification and Prediction

### Jenessa Lancaster<sup>1</sup>, Romy Lorenz<sup>1</sup>, Rob Leech<sup>1</sup> and James H. Cole<sup>1,2</sup>\*

<sup>1</sup> Computational, Cognitive and Clinical Neuroimaging Laboratory, Division of Brain Sciences, Department of Medicine, Imperial College London, London, United Kingdom, <sup>2</sup> Department of Neuroimaging, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom

### Edited by:

Christian Gaser, Friedrich-Schiller-Universität Jena, Germany

### Reviewed by:

Nils Muhlert, University of Manchester, United Kingdom; Franziskus Liem, University of Zurich, Switzerland; Martin Lotze, University of Greifswald, Germany

\*Correspondence:

James H. Cole james.cole@kcl.ac.uk; james.cole@imperial.ac.uk

Received: 18 October 2017; Accepted: 23 January 2018; Published: 12 February 2018

### Citation:

Lancaster J, Lorenz R, Leech R and Cole JH (2018) Bayesian Optimization for Neuroimaging Pre-processing in Brain Age Classification and Prediction. Front. Aging Neurosci. 10:28. doi: 10.3389/fnagi.2018.00028

Neuroimaging-based age prediction using machine learning is proposed as a biomarker of brain aging, relating to cognitive performance, health outcomes and progression of neurodegenerative disease. However, even leading age-prediction algorithms contain measurement error, motivating efforts to improve experimental pipelines. T1-weighted MRI is commonly used for age prediction, and the pre-processing of these scans involves normalization to a common template and resampling to a common voxel size, followed by spatial smoothing. Resampling parameters are often selected arbitrarily. Here, we sought to improve brain-age prediction accuracy by optimizing resampling parameters using Bayesian optimization. Using data on N = 2003 healthy individuals (aged 16–90 years) we trained support vector machines to (i) distinguish between young (<22 years) and old (>50 years) brains (classification) and (ii) predict chronological age (regression). We also evaluated generalisability of the age-regression model to an independent dataset (CamCAN, N = 648, aged 18–88 years). Bayesian optimization was used to identify optimal voxel size and smoothing kernel size for each task. This procedure adaptively samples the parameter space to evaluate accuracy across a range of possible parameters, using independent sub-samples to iteratively assess different parameter combinations to arrive at optimal values. When distinguishing between young and old brains, a classification accuracy of 88.1% was achieved (optimal voxel size = 11.5 mm<sup>3</sup>, smoothing kernel = 2.3 mm). For predicting chronological age, a mean absolute error (MAE) of 5.08 years was achieved (optimal voxel size = 3.73 mm<sup>3</sup>, smoothing kernel = 3.68 mm). This was compared to performance using default values of 1.5 mm<sup>3</sup> and 4 mm, respectively, resulting in MAE = 5.48 years, though this 7.3% improvement was not statistically significant. When assessing generalisability, best performance was achieved when applying the entire Bayesian optimization framework to the new dataset, out-performing the parameters optimized for the initial training dataset. Our study outlines the proof-of-principle that neuroimaging models for brain-age prediction can use Bayesian optimization to derive case-specific pre-processing parameters. Our results suggest that different pre-processing parameters are selected when optimization is conducted in specific contexts. This potentially motivates use of optimization techniques at many different points during the experimental process, which may improve statistical sensitivity and reduce opportunities for experimenter-led bias.

Keywords: brain aging, Bayesian optimization, T1-MRI, machine learning, pre-processing

## INTRODUCTION

The aging process affects the structure and function of the human brain in a characteristic manner that can be measured using neuroimaging. This quantifiable relationship was key to the early demonstrations of voxel-based morphometry (VBM) (Good et al., 2001) and to this day represents one of the most robust relationships between a measurable phenomenon (i.e., aging) and brain structure. This makes aging an ideal target for evaluating novel neuroimaging analysis tools. More recently, researchers have used this relationship to develop neuroimaging-based tools for predicting chronological age in healthy people using machine learning (Franke et al., 2010; Cole et al., 2017b). A 'brain-predicted age' determined from magnetic resonance imaging (MRI) scans represents an intuitive summary measure of the natural deterioration associated with the effects of the aging process on the brain, and has the potential to serve as a biomarker of age-related health (Cole, 2017).

The extent to which brain-predicted age is greater than an individual's chronological age has been associated with accentuated age-associated physical and cognitive decline (Cole et al., 2017c). Specifically, an 'older'-appearing brain has been associated with decreased fluid intelligence, reduced lung function, weaker grip strength, slower walking speed and an increased likelihood of mortality in older adults (Cole et al., 2017c). Factors which could contribute to an increased brain-predicted age include genetic effects, neurological or psychiatric conditions, or poor physical health (Koutsouleris et al., 2013; Löwe et al., 2016; Steffener et al., 2016; Cole et al., 2017c,d; Pardoe et al., 2017). Potentially, individuals at increased risk of experiencing the negative consequences of brain aging, such as cognitive decline and neurodegenerative disease, could be identified by measuring brain-predicted age in clinical groups or even screening the general population.

Despite promising results to date, models for generating brain-predicted age still contain measurement error, and efforts to improve accuracy, and particularly generalisability to data from different MRI scanners, are warranted. Training on large cohorts of healthy adults is one approach to reducing error, with the lowest reported mean absolute error (MAE) values lying between 4 and 5 years (Konukoglu et al., 2013; Irimia et al., 2014; Steffener et al., 2016; Cole et al., 2017b). Notably, individual errors range across the population from perfect prediction to discrepancies as great as 25 years. While brain-predicted age has high test–retest reliability (Cole et al., 2017b), and a proportion of this variation likely reflects underlying population variability, a substantial amount of noise certainly remains. A model with MAE = 0 is therefore highly unlikely; however, the lower bound of prediction error has not yet been reached, as indicated by the gradual improvements in performance seen in more recent methods. Efforts to reduce noise, improve prediction accuracy and, in particular, improve generalisability to new data are therefore essential if such approaches are to be applied to individuals in a clinical setting, the ultimate goal of any putative health-related biomarker.

A key issue in brain-age prediction, along with many other neuroimaging approaches, is the choice of methods for extracting features or summary measures from raw data for further analysis (Jones et al., 2005; Klein et al., 2009; Franke et al., 2010; Andronache et al., 2013; Shen and Sterr, 2013). For example, the majority of brain-age prediction pipelines have used T1-weighted MRI data and either generated voxelwise maps of brain volume (e.g., Cole et al., 2015) or summary measures of cortical thickness and subcortical volumes (e.g., Liem et al., 2017). The parameters set during image pre-processing are commonly the defaults supplied by the software developer or are based on prior studies. Nevertheless, the choice of pre-processing parameters may have a strong influence on the outcome of any subsequent data modeling, and ideally should be optimized on a case-by-case basis. This optimization is rarely conducted, as trial-and-error approaches are time-consuming and often ill-posed. Importantly, using sub-optimal pre-processing may reduce experimental precision, which increases the likelihood of false positives or false negatives and reduces reproducibility. In the worst-case scenario, p-hacking may occur, whereby pre-processing is manually tuned to minimize the p-values of the subsequent hypothesis tests within the same sample. Here, we outline a principled Bayesian optimization strategy for identifying optimal values for pre-processing parameters in neuroimaging analysis, implementing sub-sampling to avoid bias. We then demonstrate proof-of-principle applied to the problem of age prediction using machine learning.

Bayesian optimization is an efficient and unbiased approach to parameter selection, which avoids both the failure to adequately search a large parameter space and the drawbacks of an exhaustive search. Instead, it utilizes a guided sampling strategy, assessing a subgroup of points from within the possible parameter space and testing these values on subsets of the total sample (Brochu et al., 2010; Snoek et al., 2012). This data division strategy ensures performance tests reflect out-of-sample prediction and always evaluate differing conditions on separate data, reducing the likelihood of overfitting. This intelligent selection of a small number of points for evaluation allows the characterization of parameter space and the solution of the optimization problem to be accomplished in fewer steps, making it a computationally efficient approach (Pelikan et al., 2002).

The current work used Bayesian optimization to attempt to optimize image pre-processing parameters for: (i) distinguishing the brains of young and old adults (classification), (ii) predicting chronological age (regression), and (iii) evaluating the generalisability of the resulting optima for the regression task to an independent dataset. The classification task was included to allow evaluation of Bayesian Optimization hyper-parameters and to show the applicability of Bayesian optimization to different contexts. We hypothesized that by using Bayesian optimization we would improve model accuracy compared to previously used 'non-optimized' values. The study was designed to show proof-of-principle of the applicability of Bayesian optimization to help improve neuroimaging pre-processing in a principled and unbiased fashion.

### MATERIALS AND METHODS

### Neuroimaging Datasets

This study used data compiled from 14 public sources (see **Table 1**), and as per our previous research (e.g., Cole et al., 2015), referred to here as the brain-age healthy control (BAHC) dataset. Data included T1-weighted MRI from 2003 healthy individuals aged 16–90 years (male/female = 1016/987, mean age = 36.50 ± 18.52). All BAHC participants were either originally included in studies of healthy individuals, or as healthy controls from case-control studies. As such, all were screened according to local study protocols to ensure they had no history of neurological, psychiatric or major medical conditions. Images were acquired at 1.5T or 3T with standard T1-weighted MRI sequences (see **Table 1**). Ethical approval and informed consent were obtained locally for each study covering both participation and subsequent data sharing.

Additionally, the Cambridge Centre for Aging Neuroscience (CamCAN) neuroimaging cohort was used as an independent validation dataset (Shafto et al., 2014; Taylor et al., 2017). These data were obtained from the CamCAN repository<sup>1</sup>. The CamCAN cohort consisted of T1-weighted images (acquired at 3T, using a 3D MPRAGE sequence: repetition time = 2250 ms, echo time = 2.99 ms, inversion time = 900 ms; flip angle = 9°; field of view = 256 × 240 × 192 mm; 1 mm isotropic voxels) from 648 participants aged 18–88 years (male/female = 324/324, mean age 54.28 ± 18.56). This study used similar exclusion criteria, including only healthy individuals. Ethical approval for CamCAN was obtained locally, including the permission to subsequently make anonymised versions of the data publicly available.

An outline of the analysis pipeline for the study has been included as **Figure 1**. The details of each stage are as follows.

### Pre-processing to Prediction

Normalized brain volume maps were created following a standard VBM protocol, described previously (Cole et al., 2015). This involved segmentation of raw T1-weighted images into gray matter maps using SPM12 (University College London, London, United Kingdom). Images were normalized to a study-specific template in MNI152 space using DARTEL for non-linear registration (Ashburner, 2007). This step involved resampling to a common voxel size, modulation to retain volumetric information and spatial smoothing; the specific voxel size and smoothing kernel size parameters were chosen by the Bayesian optimization protocol as detailed below.

After pre-processing, gray matter volume images were converted to vectors of ASCII-format intensity values. These were used as the input features for subsequent classification or regression analysis. This was performed in MATLAB using the support vector machine (SVM) program. For the binary classification problem of distinguishing young and old participants, SVMs were used. For predicting age as a continuous variable, SVM regression (SVR) was used. Both SVM and SVR procedures used a linear kernel to map the input data into a computationally efficient feature space.

### Bayesian Optimization of Pre-processing

Bayesian optimization was used to identify optimal preprocessing parameters, based on the accuracy of the subsequent model predictions (either classification or regression). Hence, the Bayesian optimization procedure can be seen as an additional outer layer of analysis, that surrounds the standard pipeline (preprocessing through to model accuracy evaluation). The Bayesian optimization process runs multiple iterations of this internal pipeline using different sub-samples of the dataset, exploring the parameter space to select varying image pre-processing options based on their influence on the objective function (i.e., classification or regression accuracy).

A key advantage of Bayesian optimization derives from its 'surrogate' model that represents the relationship between an algorithm and the initially unknown objective function. This surrogate model is progressively refined in a closed-loop manner, by automatically selecting points in the parameter space. This provides an informed coverage of that space, based on the performance of previously sampled points. This aspect makes Bayesian optimization highly efficient, reducing the number of iterations necessary to identify optima of complex objective functions (Brochu et al., 2010; Lorenz et al., 2017). We used the MATLAB Bayesian Optimization Algorithm<sup>2</sup> implementation, which internally defines a number of optimization parameters, including selecting the covariance kernel and tuning the hyperparameters of the process.
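The closed-loop procedure described above can be illustrated with a minimal sketch. This is not the MATLAB implementation used in the study: it is a simplified, pure-Python stand-in that assumes a Gaussian-process surrogate with a fixed squared-exponential kernel, the plain Expected Improvement criterion, random candidate sampling, and a synthetic objective (`toy_error`) in place of the real pre-processing and cross-validation pipeline. All function names, the kernel length-scale and the noise level are hypothetical choices made for illustration.

```python
import math
import random


def rbf(a, b, length=3.0):
    """Squared-exponential covariance between two parameter points."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def gp_posterior(X, y, x_new, noise=1e-2):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k = [rbf(a, x_new) for a in X]
    y_mean = sum(y) / len(y)
    alpha = solve(K, [v - y_mean for v in y])
    mu = y_mean + sum(ki * ai for ki, ai in zip(k, alpha))
    var = 1.0 - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mu, math.sqrt(max(var, 1e-12))


def expected_improvement(mu, sigma, best):
    """Expected Improvement for a minimization problem."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf


def bayes_opt(objective, bounds, n_burn_in=5, n_guided=7, seed=0):
    """Random burn-in points, then points chosen by maximizing EI."""
    rng = random.Random(seed)

    def sample():
        return tuple(rng.uniform(lo, hi) for lo, hi in bounds)

    X = [sample() for _ in range(n_burn_in)]
    y = [objective(x) for x in X]
    for _ in range(n_guided):
        best = min(y)
        candidates = [sample() for _ in range(200)]

        def score(c):
            mu, sigma = gp_posterior(X, y, c)
            return expected_improvement(mu, sigma, best)

        x_next = max(candidates, key=score)
        X.append(x_next)
        y.append(objective(x_next))
    i = min(range(len(y)), key=y.__getitem__)
    return X[i], y[i], list(zip(X, y))


def toy_error(params):
    """Synthetic stand-in for 'model error given (voxel size, FWHM)'."""
    voxel, fwhm = params
    return 5.0 + 0.05 * (voxel - 4.0) ** 2 + 0.02 * (fwhm - 4.0) ** 2


best_params, best_error, history = bayes_opt(toy_error, [(1.0, 15.0), (1.0, 10.0)])
```

In the study each evaluation of the objective is a full resample, train and cross-validate cycle on a fresh sub-sample, so the objective is noisy; the sketch uses a deterministic function only to keep the example short.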

In this study, we aimed to optimize the final normalized voxel size and the full-width half-maximum (FWHM) of the spatial smoothing kernel used during final resampling. Conventionally, these are set between 1 and 2 mm<sup>3</sup> and either 4 or 8 mm FWHM, respectively, for VBM studies conducted using SPM. In our previous work on brain-age prediction we used 1.5 mm<sup>3</sup> voxel dimensions and a 4 mm smoothing kernel (Cole et al., 2015, 2017a,b,c,d; Pardoe et al., 2017), which we subsequently refer to as 'un-optimized' pre-processing parameters. Regression and classification accuracy were compared between optimized and un-optimized models using Steiger's z-test for dependent correlations and McNemar's chi-square test for paired nominal data, respectively. Using Bayesian optimization, a wider range of values was considered: between 1 and 30 mm isotropic was permitted for voxel size and 1–20 mm FWHM for smoothing kernel size for the classification task, while ranges of 1–15 mm and 1–10 mm were used for the regression task (the latter, smaller parameter space was used to reduce computation time for the more intensive task). The resampling step took approximately 30 s per image, running on a desktop computer with 16 GB RAM. A full optimization cycle of 12 iterations (see below) thus took approximately 8 h to complete.

<sup>1</sup>http://www.mrc-cbu.cam.ac.uk/datasets/camcan/

<sup>2</sup>https://uk.mathworks.com/help/stats/bayesian-optimization-algorithm.html

TABLE 1

LONI, Laboratory of Neuro Imaging Image & Data Archive (https://ida.loni.usc.edu). ABIDE consortiums comprising data from various sites with different scanners/parameters. \*OASIS scans were acquired four times and then averaged to increase signal-to-noise ratio.
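Because smoothing kernel sizes are quoted throughout as FWHM, it may help to see how an FWHM value maps onto the standard deviation of the kernel. The sketch below assumes a Gaussian smoothing kernel and uses the standard relation σ = FWHM / (2√(2 ln 2)); the function names are hypothetical and are not part of SPM or of the pipeline used here.

```python
import math

# Standard conversion factor for a Gaussian: sigma = FWHM / (2 * sqrt(2 * ln 2))
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def fwhm_to_sigma_mm(fwhm_mm):
    """Standard deviation (mm) of a Gaussian kernel with the given FWHM (mm)."""
    return fwhm_mm * FWHM_TO_SIGMA


def sigma_in_voxels(fwhm_mm, voxel_size_mm):
    """Kernel standard deviation expressed in units of isotropic voxels."""
    return fwhm_to_sigma_mm(fwhm_mm) / voxel_size_mm


# The 'un-optimized' setting: 4 mm FWHM on a 1.5 mm isotropic grid.
sigma_mm = fwhm_to_sigma_mm(4.0)        # about 1.70 mm
sigma_vox = sigma_in_voxels(4.0, 1.5)   # about 1.13 voxels
```

Expressing the kernel width in voxels makes clear that the two optimized parameters interact: the same FWHM spans fewer voxels as the voxel size grows.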

### Classifying Young and Old Adults

We categorized the 500 oldest individuals (aged 51–90 years) and the 500 youngest (aged 16–22 years) as the "old" and "young" groups for classification. Each iteration of Bayesian optimization used a sub-sample of the total dataset (N = 1000) to test a combination of pre-processing parameter values. Participants were divided into subsets of size n stratified by age, such that each subset contained an approximately representative distribution of participants from across the age range, resulting in a total of N/n iterations (rounded down). We used n = 80 (40 participants from each group) as the sample for each iteration, giving 12 complete iterations of Bayesian optimization (1000/80 = 12.5, rounded down). This included a burn-in phase (i.e., preliminary phase of unevaluated samples to initialize the process) of 5 randomly sampled points from within the parameter ranges to begin characterization of the search space, followed by 7 iterations of 'guided' active sampling. In each iteration, a voxel size and smoothing kernel size combination was selected and used for resampling during DARTEL normalization of each subject's images. Normalized images were then converted to feature vectors and a binary SVM classifier was trained and assessed using 10-fold cross-validation. SVM hyperparameters were left at default values. Classifier error was the objective function to be minimized. Bayesian optimization used the Expected Improvement Plus (EI+) acquisition function, with the default exploration-exploitation ratio of 0.5.
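The sub-sampling arithmetic above can be made concrete with a short sketch. The text does not specify exactly how the age-stratified subsets were built, so the approach below (sort each group by age, thin it evenly so that it divides exactly, then deal participants round-robin into subsets) is only one plausible reading; the function names and the synthetic participant lists are hypothetical.

```python
def age_stratified_subsets(young, old, subset_size=80):
    """Split two groups of (participant_id, age) into age-stratified subsets.

    Each subset receives subset_size // 2 participants from each group,
    spread across that group's age range.
    """
    per_group = subset_size // 2
    n_subsets = min(len(young), len(old)) // per_group
    n_keep = n_subsets * per_group
    subsets = [[] for _ in range(n_subsets)]
    for group in (young, old):
        ranked = sorted(group, key=lambda person: person[1])
        # Thin the group evenly across the age range so it divides exactly.
        kept = [ranked[j * len(ranked) // n_keep] for j in range(n_keep)]
        # Deal round-robin so every subset spans the group's age range.
        for j, person in enumerate(kept):
            subsets[j % n_subsets].append(person)
    return subsets


def optimization_hours(n_iterations, images_per_iteration, seconds_per_image=30.0):
    """Rough wall-clock estimate for the resampling step alone."""
    return n_iterations * images_per_iteration * seconds_per_image / 3600.0


# Synthetic participants standing in for the 500 youngest and 500 oldest.
young = [("y%d" % i, 16 + i % 7) for i in range(500)]
old = [("o%d" % i, 51 + i % 40) for i in range(500)]

subsets = age_stratified_subsets(young, old)          # 12 subsets of 80
hours = optimization_hours(len(subsets), 80)          # 12 * 80 * 30 s = 8 h
```

With 500 participants per group and 40 drawn per group per iteration, 12 complete subsets can be formed and 20 participants per group are left unused, which matches the 12 iterations (5 burn-in plus 7 guided) reported above.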

### Regression Prediction of Chronological Age

Next, we used Bayesian optimization in the context of regression models which predict chronological age in healthy individuals using brain volume images. This was done by first identifying optimal pre-processing parameters through Bayesian optimization, then applying them to the full training dataset and comparing the resulting prediction accuracy to that achieved in using 'un-optimized' values. The regression analysis used n = 80 participants per iteration, with age values spanning the full range (16–90 years) and the same Bayesian optimization hyper-parameters as above. The MAE in age prediction across 10-fold cross-validation using SVM regression (i.e., SVR) was the objective function to be minimized. SVR hyper-parameters were left at default values. To enable both the optimization search and make use of the full sample size available for training a generalisable regression model, the dataset was divided into a training set and a held-out test set. Bayesian optimization was first carried out using 1803 of the 2003 total participants to determine the optimal voxel size and smoothing kernel size values (allowing 22 total optimization iterations). A regression model was then trained on these 1803 images pre-processed using the identified optimal values, and tested on the 200 held-out participants in the test set (pre-processed using the same optimized parameters) to give an unbiased out-of-sample measure of age prediction performance.

A final regression model was trained on all 2003 participants (the full model). This allowed us to evaluate how well the optimized pre-processing parameters generalized to the independent CamCAN dataset. We compared three possible approaches for this independent validation step: (1) application of the BAHC-derived full model to age prediction on CamCAN data, (2) application of the entire Bayesian optimization framework to the CamCAN dataset followed by regression training, (3) using the BAHC-derived pre-processing parameters to process the CamCAN dataset, but training a new regression model for age prediction. In case #1 the optimized voxel size and smoothing kernel size from the BAHC dataset were applied to the CamCAN dataset. In case #2, the pre-processing parameters were optimized afresh, using only the CamCAN dataset. Case #3 is something of an intermediate iteration, generalizing the optimized pre-processing parameters, but not the trained regression model.

### Performance Stability

Finally, we performed several experiments to assess reproducibility and variability of different solutions to the classification task (i.e., young vs. old). This was done to allow inference regarding which image pre-processing parameters had the greatest impact on prediction accuracy, and to establish robustness of the model. We tested the consistency of model solutions across repetitions and participant sets, using different random seeds to create shuffled groups and burn-in points. We also varied the acquisition function of the Bayesian approach. This included comparing the results using six different acquisition functions: Expected Improvement (EI), EI per second, EI+, EI per second +, Lower Confidence Bound, and Probability of Improvement. Finally, model solutions were compared across differing values for the exploration-exploitation ratio, ranging from 10 to 90% exploration. These tests were conducted in the context of voxel size and smoothing kernel size, using ranges of 1–15 mm<sup>3</sup> and 1–10 mm FWHM, respectively. The acquisition function "Expected Improvement +" has a property which allows for escape from a local optimum, returning to exploratory behavior when a region is thought to be over-exploited. The 'plus' modification to the EI function enables this escape: a region is treated as over-exploited when the standard deviation of the posterior objective function there is less than the standard deviation of the included noise multiplied by the exploration ratio.
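For reference, the two quantities described in this paragraph can be written out explicitly. The first line is the standard closed form of Expected Improvement for a minimization problem under a Gaussian posterior; the second is the over-exploitation check as we read it from the description above. The symbols (posterior mean $\mu$, posterior standard deviation $\sigma$, best observed value $f_{\min}$, noise standard deviation $\sigma_{\mathrm{noise}}$, exploration ratio $t_{\sigma}$) are notation introduced here for illustration and are not taken from the toolbox.

```latex
\begin{aligned}
\mathrm{EI}(x) &= \mathbb{E}\big[\max\{0,\; f_{\min} - f(x)\}\big]
  = \big(f_{\min} - \mu(x)\big)\,\Phi(z) + \sigma(x)\,\varphi(z),
  \qquad z = \frac{f_{\min} - \mu(x)}{\sigma(x)}, \\
\text{over-exploited at } x &\iff \sigma(x) < t_{\sigma}\,\sigma_{\mathrm{noise}} .
\end{aligned}
```

Here $\Phi$ and $\varphi$ are the standard normal cumulative distribution and density functions. With the default exploration ratio $t_{\sigma} = 0.5$, a point is flagged once the surrogate's uncertainty there falls below half the estimated noise level.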

### RESULTS

### Classification Analyses

Model performance accuracy from ten-fold cross-validation was 88.1% for correct classification of neuroimaging data as either young or old (see **Table 2**). The optimized voxel size was 11.5 mm<sup>3</sup> and smoothing kernel size was 2.3 mm. The performance using un-optimized values (1.5 mm<sup>3</sup> voxels and 4 mm smoothing kernel) was 80.3%. The optimized model performance was significantly better than the un-optimized case (McNemar test, χ<sup>2</sup> = 18.76, p < 0.001). The parameter space exploring the expanded range of voxel size and smoothing kernel size values yielded by the model is shown in **Figure 2**.

TABLE 2 | Classification performance for distinguishing young and old brains.
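The McNemar comparison reported above operates on the paired predictions of the two classifiers, and only the discordant pairs matter. The sketch below shows the usual form of the statistic; the discordant counts in the usage lines are invented for illustration (the study's contingency table is not given in the text), and whether a continuity correction was applied in the original analysis is not stated, so it is exposed as an option.

```python
import math


def mcnemar(b, c, continuity=False):
    """McNemar's chi-square test for paired nominal data.

    b: cases classified correctly by model A only
    c: cases classified correctly by model B only
    Returns (chi2, p) with 1 degree of freedom.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs: the test is undefined")
    diff = abs(b - c) - (1 if continuity else 0)
    chi2 = max(diff, 0) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Hypothetical discordant counts, not the study's data.
chi2, p = mcnemar(b=10, c=30)
chi2_cc, p_cc = mcnemar(b=10, c=30, continuity=True)
```

Cases that both models classify correctly, or both misclassify, do not enter the statistic, which is why two models with similar overall accuracy can still differ significantly.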

### Classification Model Stability

Stability and reproducibility of model solutions were explored in the classification problem. Correlations between models in different scenarios are shown in **Figure 3**. These were: (a) across 10 repetitions of the final classification protocol, (b) across the six different acquisition functions, and (c) across 5 different exploration-exploitation ratios ranging between 90% exploration and 90% exploitation. In all three cases, model solutions showed high cross-correlation across replications, as well as across varying settings of the optimization process. The choice of acquisition function for Bayesian sampling and the choice of exploration-exploitation ratio of this function had little impact on final model performance. Similar models were reproducible across repetitions and in randomly shuffled participant sub-sets. This behavior implied that a stable model exists in the outlined parameter space. These observations supported our use of the default acquisition function options for the classification analysis: Expected Improvement Plus (EI+), with an exploration ratio of 0.5. We thus adopted these parameters for the regression task.

FIGURE 3 | … function solutions across (A) replicates of the protocol, (B) six different acquisition functions, and (C) a range of exploration-exploitation ratios between 0.1 and 0.9 (where exploitation is 1-exploration).

### Age Prediction Regression

**Figure 4** shows the objective function model created during regression, when varying smoothing kernel size and voxel size (1–10 mm FWHM and 1–15 mm, respectively) and minimizing the MAE in SVR age prediction. The lowest MAE observed for an individual sample (N = 80) was 7.17 years and the objective model function estimated a minimum error of 8.52 years (lowest point on the surface fitted in **Figure 4**). The estimated optimal voxel size and smoothing kernel size values were 3.73 mm<sup>3</sup> and 3.68 mm, respectively. Following optimization, these values were used to pre-process the full dataset and train a regression model for subsequent application to other datasets.

The final resulting model was applied to predict ages for the remaining 200 holdouts and achieved a MAE of 5.08 years (**Figure 5A**). This was compared to MAE = 5.48 years when using the un-optimized pre-processing values, though this value was not significantly different from the optimized case (Steiger test p > 0.1). The absolute error observed ranged from 0 to 22.78 years. In the hold-out set the Pearson's correlation between predicted and true age was r = 0.941, with R<sup>2</sup> = 0.89 using optimized pre-processing. Using un-optimized pre-processing parameters, we found r = 0.927, R<sup>2</sup> = 0.86.
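The accuracy measures quoted above are simple to compute from paired true and predicted ages. The following is a minimal sketch with hypothetical function names and made-up ages; it also checks two pieces of arithmetic that follow from the reported numbers, namely that moving from MAE = 5.48 to MAE = 5.08 years is a relative improvement of about 7.3%, and that the reported R<sup>2</sup> values are consistent with squaring the reported r.

```python
import math


def mean_absolute_error(predicted, actual):
    """Mean absolute difference between predicted and true ages (years)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def relative_improvement(baseline_mae, new_mae):
    """Fractional reduction in error relative to the baseline."""
    return (baseline_mae - new_mae) / baseline_mae


# Made-up ages, for illustration only.
true_age = [20, 40, 60]
predicted_age = [25, 38, 66]
mae = mean_absolute_error(predicted_age, true_age)   # (5 + 2 + 6) / 3
r = pearson_r(predicted_age, true_age)

# Arithmetic on the values reported in the text.
improvement_pct = 100 * relative_improvement(5.48, 5.08)
r_squared_optimized = 0.941 ** 2
r_squared_unoptimized = 0.927 ** 2
```

MAE is expressed in years and is therefore directly interpretable, whereas r and R<sup>2</sup> depend on the age range of the sample, which is worth bearing in mind when comparing cohorts with different age distributions.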

The relevance of this approach for an independent dataset (i.e., CamCAN) was considered in three different ways. (1) The BAHC-trained model was applied to the CamCAN data pre-processed with the BAHC-informed optimum voxel size and smoothing kernel size values. Here the CamCAN participants were included in the test set only. This achieved a MAE of 6.08 years, r = 0.929, with R<sup>2</sup> = 0.86 (**Figure 5B**). This was an improvement compared to the performance when using un-optimized pre-processing values (voxel size = 1.5 mm<sup>3</sup>, smoothing kernel size = 4 mm) which resulted in MAE = 6.76 years. (2) The CamCAN data was analyzed entirely independently; the full Bayesian optimization framework was instead applied to the CamCAN data to discover new, CamCAN-specific pre-processing optima, and a new regression model was trained with 588 participants and tested on 60 participants (giving a similar training-testing ratio as used in the BAHC dataset). This resulted in optimal values of 8.41 mm<sup>3</sup> for voxel size and 3.54 mm for smoothing kernel size and yielded MAE = 5.46 years, r = 0.91, R<sup>2</sup> = 0.83. (3) The CamCAN cohort was pre-processed using the BAHC-informed optimum values but a new regression model was trained and tested within the CamCAN participants. This model resulted in MAE = 6.21 years, r = 0.89, R<sup>2</sup> = 0.79. The model from case #2 was significantly more accurate than the model from case #3 (Steiger test, t = 1.67, p = 0.05).

### DISCUSSION

Using Bayesian optimization, we present a conceptual improvement to conventional pipelines for distinguishing young and old brains or predicting age using neuroimaging data. The Bayesian optimization-derived optima for voxel size and smoothing kernel size showed improved performance over 'un-optimized' defaults for the classification of young and old brains, though performance in brain-age prediction was similar to un-optimized values. Potentially, the values previously used are relatively near the optimum thanks to the testing and experience of the researchers involved. In fact, the derived optimal smoothing kernel was very near the un-optimized value (3.68 mm vs. 4 mm). It is also important to note that optima derived from the Bayesian optimization process should not be regarded as definitive, as a completely exhaustive search of the parameter space is not conducted (nor desirable due to time constraints). Our results are important as they suggest that the same pre-processing parameters are not optimal for different prediction tasks (i.e., classification vs. regression) or for different datasets (BAHC vs. CamCAN). Often, researchers will apply parameters used in one context to another. This may not necessarily be best practice, and our work shows proof-of-principle that Bayesian optimization can be used to improve image pre-processing in a principled and unbiased fashion.

FIGURE 5 | … the BAHC-trained model on (A) the hold-out N = 200 test set from BAHC (MAE = 5.08 years, r = 0.941, R<sup>2</sup> = 0.89), and (B) on participants from the CamCAN dataset, using the pre-processing values optimized on the BAHC dataset (MAE = 6.08 years, r = 0.929, R<sup>2</sup> = 0.86).

Beyond optimizing model performance, our Bayesian optimization approach also allows for relative comparison of the influence of different parameters. This potentially provides novel information regarding the prediction problem at hand. For example, here we found that varying voxel size had a much greater impact on overall performance than did smoothing kernel size. This was seen in all experiments; the change in performance across the full range of values was much smaller for smoothing kernel size than voxel size, and is clearly seen in the surface plots (**Figures 2**, **4**). This suggests that in future neuroimaging pre-processing pipeline design, there is more to be gained from optimizing voxel size than smoothing kernel size. The target voxel size for normalization is often not considered, though it has an important impact on the degree of partial volume effects, number of simultaneous statistical tests undertaken, spatial resolution and subsequent inferences made about anatomical specificity. Our findings suggest that more weight needs to be placed on this important parameter when relating volumetric MRI data to age.

Importantly, the conclusions regarding specific optimal values are related to the particular application in which they are tested. Within this study, we observed a notable difference in the optimal voxel size for classification (11.5 mm<sup>3</sup>) compared to regression (3.73 mm<sup>3</sup>). Potentially, the more gross distinction between young and old brains benefits from a coarser resolution which increases signal-to-noise ratio, while the subtler patterns underlying gradual age-associated changes in brain structure require finer-grained representation. Alternatively, the much larger voxel size identified here could result in better classification by reducing data dimensionality, with this size representing the optimal trade-off between representing the information and simplifying a high-dimensional feature space for more effective classification. Either way, the discrepancy in optimal voxel size between classification and regression reinforces the point that systematic evaluation of parameter specifications should be conducted case-by-case. Commonly, 'one-size-fits-all' is the prevailing heuristic for setting pre-processing parameters in neuroimaging analysis, where the defaults are assumed to be adequate. Our findings show that this is not necessarily the case, supporting the use of optimization techniques to improve experimental precision.
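The dimensionality argument above can be quantified with simple arithmetic. Assuming the reported voxel sizes are isotropic edge lengths in millimetres (as the parameter ranges in the Methods suggest), the number of voxels covering a fixed volume scales with the inverse cube of the edge length. The helper below is a hypothetical illustration and ignores masking and boundary effects.

```python
def feature_reduction_factor(reference_mm, new_mm):
    """How many times fewer voxels a grid of new_mm isotropic voxels contains
    than a grid of reference_mm isotropic voxels covering the same volume."""
    return (new_mm / reference_mm) ** 3


# Relative to the 'un-optimized' 1.5 mm grid:
classification_factor = feature_reduction_factor(1.5, 11.5)  # about 451-fold fewer features
regression_factor = feature_reduction_factor(1.5, 3.73)      # about 15-fold fewer features
```

On this reading, the classification optimum corresponds to a feature vector several hundred times shorter than the default, which is in line with the suggestion that reduced dimensionality may help the classifier.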

The age-prediction accuracy achieved in the BAHC dataset (MAE = 5.08 years) is comparable to the current performance seen in similar research (Konukoglu et al., 2013; Mwangi et al., 2013; Irimia et al., 2014; Cole et al., 2015, 2017b,c; Steffener et al., 2016). In these studies, pre-processing parameters are set somewhat arbitrarily (cf. Franke et al., 2010). Here, the Bayesian optimization method offered a more principled approach. The prediction tools used here were common methods selected for computational efficiency (i.e., SVMs), in contrast with some of the studies capitalizing on state-of-the-art techniques and advanced modeling such as deep learning or Gaussian process regression (Cole et al., 2017b,c). One limitation of Bayesian optimization is the computational time needed to derive optima, which may slow its adoption with newer, computationally intensive algorithms. The duration of the optimization also depends on the process to be optimized. Here, we selected image resampling parameters as these can be computed rapidly. One important target for optimization should be image registration; however, the speed of most non-linear registration algorithms currently makes this a time-consuming goal.

We explored the generalisability of the model and the general framework. The resulting MAE in the CamCAN dataset of 6.08 years (BAHC-defined model and parameters), 5.18 years (CamCAN-defined model and parameters) and 6.38 years (BAHC-defined parameters, CamCAN model) provides some interesting insights. Though the BAHC-derived model produced reasonable performance, the highest accuracy was achieved when re-optimizing and re-training within the independent cohort, using the full Bayesian optimization framework. Interestingly, the optimal smoothing kernel size was similar between the two datasets (3.68 mm vs. 3.54 mm), while there was a marked difference in optimal voxel size (3.73 mm<sup>3</sup> vs. 8.41 mm<sup>3</sup>). Speculatively, this could be due to the specific acquisition parameters at this site, or a result of latent sample characteristics, as truly random sampling is hard to achieve. This suggests that there is unlikely to be a ground-truth optimum for a given parameter, highlighting the importance of defining such optima in a given context. This relates to another potential limitation of the approach: sufficiently large datasets are necessary for the optimization to work effectively. While this is increasingly possible due to the drive to share data and the availability of large, publicly accessible cohorts (e.g., Alzheimer's Disease Neuroimaging Initiative, Human Connectome Project, UK Biobank), this may not be possible in certain clinical contexts, particularly regarding rarer diseases. Nevertheless, the 'generalized' performance of the model was still reasonable in this case, which suggests that it is incumbent on researchers to decide what constitutes sufficient prediction accuracy in each context.

Bayesian optimization is a robust and elegant way to tune preprocessing pipelines in an efficient and automated manner. In addition to the parameter optima, an additional strict, unbiased estimate for performance and generalisability is generated. The objective function model created provides detailed information on model performance and new insights can be gained from mapping the entire parameter space by enabling visualization of relationships between key components of the analysis and performance. This could allow for informed decision-making in experimental design, such as allowing for cost-benefit analysis in the case where optimal parameters only lead to marginal improvement in performance over other values which are easier, quicker, or less costly to enact. This could be critical in applications where the varied inputs represent expensive or invasive procedures, such as MRI scanning or obtaining CSF samples from lumbar punctures.
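To make the general idea concrete, the following is a minimal sketch of Bayesian optimization over two pre-processing parameters, written with the Python standard library only. It is not the implementation used in this study: the squared-exponential kernel, the expected-improvement acquisition rule, the fixed length scale, the unit-square parameter grid and the `toy_mae` objective are all assumptions chosen for illustration. In practice the objective would be the cross-validated prediction error obtained after resampling the images with the candidate parameters.

```python
import math

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two points (assumed length scale)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(A[i][i] - s, 1e-12))
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def forward(L, b):
    """Solve L z = b for lower-triangular L."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def backward(L, z):
    """Solve L^T x = z for lower-triangular L."""
    n = len(z)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def expected_improvement(mu, sd, best):
    """Expected improvement over the current best value (minimization)."""
    z = (best - mu) / sd
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sd * pdf

def bayes_opt(objective, candidates, init, n_iter=10, jitter=1e-4):
    """Evaluate `init`, then pick n_iter further candidates by expected improvement."""
    X = list(init)
    y = [objective(x) for x in X]
    for _ in range(n_iter):
        mean_y = sum(y) / len(y)
        K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(X)]
             for i, a in enumerate(X)]
        L = cholesky(K)
        alpha = backward(L, forward(L, [v - mean_y for v in y]))
        best, pick, pick_ei = min(y), None, -1.0
        for c in candidates:
            if c in X:  # never re-evaluate a point
                continue
            k = [rbf(c, x) for x in X]
            mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
            v = forward(L, k)
            sd = math.sqrt(max(1.0 - sum(vi * vi for vi in v), 1e-12))
            ei = expected_improvement(mu, sd, best)
            if ei > pick_ei:
                pick, pick_ei = c, ei
        if pick is None:
            break
        X.append(pick)
        y.append(objective(pick))
    i = min(range(len(y)), key=y.__getitem__)
    return X[i], y[i], X

def toy_mae(params):
    """Hypothetical error surface: steeper along voxel size than along smoothing."""
    voxel, smooth = params  # both rescaled to [0, 1]
    return (voxel - 0.3) ** 2 + 0.2 * (smooth - 0.6) ** 2

grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
start = [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
best_params, best_value, history = bayes_opt(toy_mae, grid, start, n_iter=10)
```

The surrogate model fitted in each iteration plays the role of the objective function model described above: because it is defined over the whole parameter grid, it can be plotted to show how strongly each parameter influences performance.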

Though our analysis focused on neuroimaging pre-processing to illustrate the strengths of Bayesian optimization methods, the potential applications are far-reaching. Optimization could be applied anywhere in the experimental pipeline: in questions of experimental design, stimuli choice, data acquisition, statistical method or algorithm selection, prediction methods or final model selection. The current literature on Bayesian optimization topics applies mainly to tuning of machine learning algorithms (Snoek et al., 2012) and though machine learning is widely used in neuroscience, few studies have capitalized on this strategy to improve neuroimaging analysis or neuroimaging-based prediction (Lorenz et al., 2016, 2017). In machine learning contexts, and especially in applied multi-disciplinary fields like neuroscience where researchers may not necessarily have expertise regarding every relevant experimental parameter, more widespread use of a priori unbiased parameter optimization could be highly beneficial.

Our study shows the potential of Bayesian optimization to improve neuroimaging pre-processing by reducing prior assumptions, for both classification and regression in the context of brain aging. Future research into brain aging and other neuroscientific areas could benefit from applying principled optimization approaches to improve study sensitivity and reduce bias.

# ETHICS STATEMENT

All study data were obtained from repositories of publicly available data. These data were originally acquired with written informed consent of all study participants.

# AUTHOR CONTRIBUTIONS

RLe and JC conceived and designed the study. RLo, RLe, and JC developed the methods. JL, RLe, and JC analyzed and interpreted the data. JL, RLo, RLe, and JC drafted and revised the manuscript.

# FUNDING

JC was funded by a Research Councils UK/UK Research and Innovation/Medical Research Council Rutherford Fund Fellowship. RLo was supported by a Leverhulme Trust Research Project Grant made to Imperial College London.

# ACKNOWLEDGMENTS

The authors would like to acknowledge the principal investigators of the studies who made their data openly accessible for research. Data collection and sharing for one of the datasets used in this project was provided by the Cambridge Centre for Ageing and Neuroscience (CamCAN). CamCAN funding was provided by the UK Biotechnology and Biological Sciences Research Council (Grant No. BB/H008217/1), together with support from the UK Medical Research Council and University of Cambridge, United Kingdom.

### REFERENCES



cognitive impairment. Neuroimage 148, 179–188. doi: 10.1016/j.neuroimage. 2016.11.005


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling editor is currently co-organizing a Research Topic with one of the authors JC, and confirms the absence of any other collaboration.

Copyright © 2018 Lancaster, Lorenz, Leech and Cole. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Gray Matter Network Disruptions and Regional Amyloid Beta in Cognitively Normal Adults

Mara ten Kate<sup>1</sup>\*, Pieter Jelle Visser<sup>1,2</sup>, Hovagim Bakardjian<sup>3,4</sup>, Frederik Barkhof<sup>5,6</sup>, Sietske A. M. Sikkes<sup>1,7</sup>, Wiesje M. van der Flier<sup>1,7</sup>, Philip Scheltens<sup>1</sup>, Harald Hampel<sup>3,4,8,9</sup>, Marie-Odile Habert<sup>10</sup>, Bruno Dubois<sup>3,4</sup> and Betty M. Tijms<sup>1</sup> for the INSIGHT-preAD study group

<sup>1</sup>Alzheimer Center & Department of Neurology, Amsterdam Neuroscience, VU University Medical Center, Amsterdam, Netherlands, <sup>2</sup>Department of Psychiatry & Neuropsychology, School for Mental Health and Neuroscience, Maastricht University, Maastricht, Netherlands, <sup>3</sup>Département de Neurologie, Pitié-Salpêtrière University Hospital, Institut de la Mémoire et de la Maladie d'Alzheimer, Paris, France, <sup>4</sup> Institut du Cerveau et la Moelle Epinière (ICM)/Brain and Spine Institute, Pitié-Salpêtrière Hospital, Sorbonne Universities, Pierre and Marie Curie University, Paris, France, <sup>5</sup>Department of Radiology and Nuclear Medicine, Amsterdam Neuroscience, VU University Medical Center, Amsterdam, Netherlands, <sup>6</sup> Institutes of Neurology and Healthcare Engineering, University College London, London, United Kingdom, <sup>7</sup>Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, Netherlands, <sup>8</sup>AXA Research Fund & Sorbonne University Chair, Paris, France, <sup>9</sup>Sorbonne University, GRC no. 21, Alzheimer Precision Medicine (APM), AP-HP, Pitié-Salpêtrière Hospital, Paris, France, <sup>10</sup>Nuclear Medicine Department, Laboratoire d'Imagerie Biomédicale, Sorbonne Universités, Pitié-Salpêtrière University Hospital, Paris, France

### Edited by:

Christian Gaser, Friedrich Schiller Universität Jena, Germany

### Reviewed by:

Panteleimon Giannakopoulos, Université de Genève, Switzerland

Yong Liu, Brainnetome Center, China

> \*Correspondence: Mara ten Kate m.tenkate1@vumc.nl

Received: 21 September 2017 Accepted: 27 February 2018 Published: 15 March 2018

### Citation:

ten Kate M, Visser PJ, Bakardjian H, Barkhof F, Sikkes SAM, van der Flier WM, Scheltens P, Hampel H, Habert M-O, Dubois B and Tijms BM (2018) Gray Matter Network Disruptions and Regional Amyloid Beta in Cognitively Normal Adults. Front. Aging Neurosci. 10:67. doi: 10.3389/fnagi.2018.00067

The accumulation of amyloid plaques is one of the earliest pathological changes in Alzheimer's disease (AD) and may occur 20 years before the onset of symptoms. Examining associations between amyloid pathology and other early brain changes is critical for understanding the pathophysiological underpinnings of AD. Alterations in gray matter networks might already start at early preclinical stages of AD. In this study, we examined the regional relationship between amyloid aggregation measured with positron emission tomography (PET) and gray matter network measures in elderly subjects with subjective memory complaints. Single-subject gray matter networks were extracted from T1-weighted structural MRI in cognitively normal subjects (n = 318, mean age 76.1 ± 3.5, 64% female, 28% amyloid positive). Degree, clustering, path length and small world properties were computed. Global and regional amyloid load was determined using [<sup>18</sup>F]-Florbetapir PET. Associations between standardized uptake value ratio (SUVr) values and network measures were examined using linear regression models. We found that higher global SUVr was associated with lower clustering (β = −0.12, p < 0.05) and lower small world values (β = −0.16, p < 0.01). Associations were most prominent in orbito- and dorsolateral frontal and parieto-occipital regions. Local SUVr values showed less anatomical variability and did not convey additional information beyond global amyloid burden. In conclusion, we found that in cognitively normal elderly subjects, increased global amyloid pathology is associated with alterations in gray matter networks that are indicative of incipient network breakdown towards AD dementia.

Keywords: amyloid beta, PET, gray matter network, graph theory, MRI, subjective memory complaints, Alzheimer's disease

# INTRODUCTION

Amyloid pathology is hypothesized to be one of the earliest events in the pathological cascade of Alzheimer's disease (AD; Jack et al., 2013; Villemagne et al., 2013), and has been associated with future cognitive decline in cognitively normal subjects (Donohue et al., 2017). Understanding associations between amyloid pathology and other early pathological processes is critical as secondary prevention trials are shifting towards the earliest disease stages. AD can be considered as a disconnectivity disease (Delbeuck et al., 2003). In this study, we examined the relation between amyloid depositions measured with positron emission tomography (PET) and disruptions of gray matter networks in elderly subjects.

Brain areas involved in similar cognitive functions tend to develop in a coordinated way (Andrews et al., 1997; Alexander-Bloch et al., 2013b; Váša et al., 2018). Such co-variation of gray matter structure can be measured using structural T1-weighted MRI images and represented as a network (Lerch et al., 2006; Bassett et al., 2008; Tijms et al., 2012; Alexander-Bloch et al., 2013a). In cognitively normal subjects, brain networks tend to have a "small-world" organization, and it has been proposed that such a network organization provides an optimal balance of specialized information processing and integration (Sporns et al., 2004; Humphries and Gurney, 2008; Alexander-Bloch et al., 2013a). Using group level approaches (i.e., one network per diagnostic group), several studies have shown that gray matter network measures are disrupted in AD dementia compared to controls (He et al., 2008; Yao et al., 2010; Pereira et al., 2016). Using our method to extract gray matter networks on a single-subject level (Tijms et al., 2012), we have shown that worse gray matter network disruptions in AD dementia are associated with more severe symptoms, and worse functioning in specific cognitive domains (Tijms et al., 2013a, 2014).

In cognitively normal older adults, lower cerebrospinal fluid (CSF) amyloid beta 1–42 levels, indicative of abnormal amyloid aggregation in the brain, are already associated with disrupted gray matter network measures (Tijms et al., 2016), suggesting that at very early stages of the disease networks are starting to disorganize in the direction often observed in dementia stages of AD (Tijms et al., 2013a; Pereira et al., 2016). This suggests that gray matter networks are sensitive enough to detect very early brain changes related to abnormal amyloid metabolism. However, as CSF is an indirect measure of amyloid plaques, it remains unclear whether gray matter network disruptions are linked to local amyloid deposits or to a global effect of amyloid pathology.

In the present study, we examined the regional relationship between amyloid depositions measured with PET and gray matter network disruptions in a large cohort of cognitively normal elderly subjects with subjective memory complaints. Since the Apolipoprotein E (APOE) ε4 allele, a genetic risk factor for sporadic AD (Bertram et al., 2010), is associated with amyloid pathology (Jansen et al., 2015) and functional and structural brain changes (Cherbuin et al., 2007; Trachtenberg et al., 2012) in cognitively normal subjects we also examined whether APOE ε4 modified the relationship between amyloid and gray matter networks.

# MATERIALS AND METHODS

### Subjects

We analyzed baseline data from the ongoing INSIGHT-preAD study (Dubois et al., 2018). INSIGHT-preAD is a monocentric longitudinal cohort study in 318 cognitively normal elderly subjects (age between 70 and 85 years) with subjective memory complaints, recruited from the community in the wider Paris area, France. All subjects underwent amyloid PET and MRI scans as well as an extensive battery of neuropsychological exams. Subjective memory complaints were defined by an affirmative answer to both of the following questions: "Are you complaining about your memory?" and "Is it a regular complaint which lasts more than 6 months?", in the absence of any objective memory deficits (mini-mental state examination (MMSE) ≥ 27, 16-item Free and Cued Selective Reminding Test (FCSRT) total score ≥ 41). Exclusion criteria were having a neurological or psychiatric disorder that could interfere with cognition (e.g., epilepsy, brain tumor, stroke), or a contra-indication for MRI or amyloid PET scan. APOE genotype was determined as previously described (Teipel et al., 2017). Subjects were classified as APOE ε4 carriers if they had one or two APOE ε4 alleles and as non-carriers otherwise. This study was carried out in accordance with the recommendations of the French national medical research Ethics Committee, which approved the protocol. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

### PET Acquisition and Preprocessing

Amyloid PET images were acquired on a Philips Gemini GXL CT-PET scanner using [<sup>18</sup>F]-Florbetapir (AVID radiopharmaceuticals). Subjects received a single intravenous dose of approximately 370 MBq (range 333–407 MBq). Fifty minutes post-injection, three 5-min frames were obtained (128 × 128 acquisition matrix, 2 × 2 × 2 mm<sup>3</sup> voxels). Images were reconstructed using an iterative LOR-RAMLA algorithm with 10 iterations and a smooth post-reconstruction filter. Attenuation, scatter and random coincidence corrections were integrated in the reconstruction. Frames were realigned, averaged and quality-checked. Image analysis of PET data was performed by CATI (Centre d'acquisition et traitement des images<sup>1</sup>). Structural MRI images were co-registered to the PET images using Statistical Parametric Mapping software version 8 (SPM8; Wellcome Department of Cognitive Neurology, London, UK). PET images were corrected for partial volume effects with the RBV-sGTM method (Thomas et al., 2011) using gray and white matter tissue maps. Using the normalization parameters from the spatial normalization of structural MRI images, a set of cortical regions of interest (ROIs) was mapped to each subject's native-space PET image. This was performed for 12 cortical ROIs (bilateral precuneus, posterior and anterior cingulate, inferior parietal, middle temporal gyrus and orbitofrontal cortex) defined in Clark et al. (2012) and a reference region (a combination of pons and whole cerebellum). For each individual, parametric PET images were created by dividing each voxel by the mean activity extracted from the reference region. Global standardized uptake value ratios (SUVr) were computed by averaging the mean activity of the 12 cortical ROIs. Regional SUVr from the 12 cortical regions was used to explore local relationships between amyloid load and gray matter networks. A global SUVr threshold for abnormality was determined by performing a linear regression analysis between the above-described method and the method used by Besson et al. (2015), which used PET scans from controls from the IMAP (Multimodal Imaging of Early-Stage AD) study. This strategy has previously been used to reliably estimate relationships between different tracers and processing methods (Landau et al., 2014). A global SUVr threshold of 0.79 corresponded to the IMAP cohort's threshold of 1.005 (Besson et al., 2015). Thus, subjects with a SUVr above 0.79 in the present study were considered amyloid positive.

<sup>1</sup> cati-neuroimaging.com
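The SUVr arithmetic described above can be sketched as follows. This is an illustrative example only: the function names and the uptake values are hypothetical, and the sketch works on per-ROI mean uptake values, whereas the study created voxel-wise parametric images before averaging. Only the 0.79 cut-off is taken from the text.

```python
from statistics import mean

AMYLOID_POSITIVE_THRESHOLD = 0.79  # global SUVr cut-off reported in the text

def regional_suvr(roi_means, reference_mean):
    """Divide each ROI's mean uptake by the mean uptake of the reference region."""
    if reference_mean <= 0:
        raise ValueError("reference-region uptake must be positive")
    return {roi: value / reference_mean for roi, value in roi_means.items()}

def global_suvr(suvr_by_roi):
    """Average the regional SUVr values into one global value."""
    return mean(suvr_by_roi.values())

def is_amyloid_positive(global_value, threshold=AMYLOID_POSITIVE_THRESHOLD):
    """A subject counts as amyloid positive when global SUVr exceeds the cut-off."""
    return global_value > threshold

# Hypothetical mean uptake values (arbitrary units) for three of the 12 ROIs.
example_rois = {"precuneus_L": 5.2, "precuneus_R": 5.0, "orbitofrontal_L": 4.1}
example_suvr = regional_suvr(example_rois, reference_mean=6.0)
example_status = is_amyloid_positive(global_suvr(example_suvr))
```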

### MRI Acquisition and Preprocessing

Whole-brain scans were obtained using a 3T scanner (Siemens Magnetom Verio) with a 12-channel head coil. Isotropic structural three-dimensional T1-weighted images were acquired using a sagittal MPRAGE sequence (256 × 240 acquisition matrix, 1 × 1 × 1 mm<sup>3</sup> voxels, repetition time = 2300 ms, echo time = 2.98 ms, inversion time = 900 ms, flip angle = 9°). The structural 3D T1 images were segmented using Statistical Parametric Mapping software version 12 (SPM12; Wellcome Department of Cognitive Neurology, London, UK) running in MATLAB 2011a (MathWorks Inc., Natick, MA, USA). Quality of all gray matter segmentations was visually inspected and none had to be excluded. After segmentation, all gray matter segmentations were resliced into 2 × 2 × 2 mm<sup>3</sup> voxels to reduce the total number of voxels. Total gray matter volume (GMV) and total intracranial volume (TIV; i.e., GMV + white matter volume + CSF) were computed from segmented images in native space.

### Single-Subject Gray Matter Networks

Single-subject gray matter networks were computed based on cortical similarity from native space gray matter segmentations, using an automated method as previously described (Tijms et al., 2012<sup>2</sup>). Briefly, nodes in these networks represent brain areas (regions of 3 × 3 × 3 voxels defined by a template-free approach as described in Tijms et al., 2012), and connections are based on similarity in the spatial structure of gray matter density values as quantified with a Pearson's correlation. Networks were binarized using subject-specific thresholds as determined with a random permutation method that ensured a similar chance to include at most 5% spurious correlations in the network (Noble, 2009).
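As a rough illustration of the similarity step, the sketch below correlates the gray matter values of every pair of cubes and keeps a connection when the correlation reaches a threshold. It simplifies the published method in ways that should be read as assumptions: each cube is passed in as a flat list of values (27 for a 3 × 3 × 3 cube), and the threshold `r_threshold` is supplied by the caller rather than derived from the random permutation procedure. The function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long sequences of values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant cube carries no spatial structure to correlate
    return sxy / math.sqrt(sxx * syy)

def binarized_network(cubes, r_threshold):
    """Symmetric 0/1 adjacency matrix: 1 where the correlation reaches the threshold."""
    n = len(cubes)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(cubes[i], cubes[j]) >= r_threshold:
                adj[i][j] = adj[j][i] = 1
    return adj
```

In the study the threshold is subject-specific, so `r_threshold` would differ between subjects.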

<sup>2</sup>https://github.com/bettytijms/Single\_Subject\_Grey\_Matter\_Networks

The following network measures were computed based on the average of all nodes: size of the network (i.e., total number of nodes in the network), connectivity density (i.e., ratio of existing connections to maximum possible number of connections), average degree (i.e., number of edges of a node), characteristic path length (i.e., shortest distance between two nodes), clustering coefficient (i.e., level of interconnectedness between the neighbors of a node), and betweenness centrality (i.e., the proportion of shortest paths that run through a node). Next, we also estimated normalized path length λ and normalized clustering coefficient γ by dividing the averaged measures across nodes of each network by properties that were derived from averaging 20 randomized reference networks of equal size and degree (Maslov and Sneppen, 2002). Last, we measured the small world network property, which is defined as having more clustering than a random network while having an average path length similar to that of a random network (Watts and Strogatz, 1998). These computations were performed using scripts from the Brain Connectivity Toolbox adapted for large sized networks (Rubinov and Sporns, 2010<sup>3</sup>). For regional network measures, we computed the average network properties across all nodes within each region of the automated anatomical labeling (AAL) atlas (Tzourio-Mazoyer et al., 2002). These 90 anatomical areas were defined for each subject in native space by warping the AAL atlas using the inverted parameters that were calculated when normalizing subject space images to standard space. The 12 cortical regions for which PET data was available were matched to the corresponding AAL region.
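The whole-brain measures listed above can be illustrated on a small binary network. The sketch below is not the Brain Connectivity Toolbox code used in the study. It assumes an undirected, connected network given as a 0/1 adjacency matrix, and it takes the clustering and path length of the randomized reference networks as inputs instead of generating them. The small world value is written as σ = γ/λ, a commonly used index (Humphries and Gurney, 2008); that this is the exact index computed in the study is an assumption.

```python
from collections import deque

def to_neighbors(adj):
    """Convert a 0/1 adjacency matrix into a list of neighbor sets."""
    return [{j for j, e in enumerate(row) if e} for row in adj]

def density(nb):
    """Existing connections divided by the maximum possible number of connections."""
    n = len(nb)
    edges = sum(len(s) for s in nb) / 2
    return edges / (n * (n - 1) / 2)

def mean_degree(nb):
    """Average number of edges per node."""
    return sum(len(s) for s in nb) / len(nb)

def clustering(nb):
    """Average over nodes of the fraction of neighbor pairs that are connected."""
    values = []
    for s in nb:
        k = len(s)
        if k < 2:
            values.append(0.0)
            continue
        links = sum(1 for a in s for b in s if a < b and b in nb[a])
        values.append(2 * links / (k * (k - 1)))
    return sum(values) / len(values)

def characteristic_path_length(nb):
    """Mean shortest-path distance over all reachable node pairs (breadth-first search)."""
    total, pairs = 0, 0
    for src in range(len(nb)):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in nb[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def small_world(c, l, c_rand, l_rand):
    """gamma = C / C_rand, lambda = L / L_rand, sigma = gamma / lambda."""
    gamma = c / c_rand
    lam = l / l_rand
    return gamma, lam, gamma / lam
```

A value of σ above 1 corresponds to the verbal definition in the text: more clustering than a random network (γ > 1) with a path length close to that of a random network (λ ≈ 1).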

### Statistical Analysis

Demographic measures were compared between amyloid positive and amyloid negative subjects using Student's t-test or Mann-Whitney-Wilcoxon test for continuous data and chi-square test for categorical data. We used two linear regression models to study the association between global amyloid burden (continuous) and each whole brain network measure. Model 1 included the network measure as the dependent variable and age, gender and global amyloid SUVr as independent predictors. Additional correction for total GMV was performed in model 2. Additionally, we tested whether there was an interaction effect of APOE ε4 on the association between amyloid burden and network measures in both models.
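Written out, the two models and the interaction test take the following form, where $Y_i$ is a whole brain network measure for subject $i$. The notation is ours, and the inclusion of an APOE ε4 main effect alongside the interaction term is an assumption, since the text does not spell out the exact specification.

$$
\begin{aligned}
\text{Model 1:}\quad & Y_i = \beta_0 + \beta_1\,\text{age}_i + \beta_2\,\text{gender}_i + \beta_3\,\text{SUVr}_i + \varepsilon_i \\
\text{Model 2:}\quad & Y_i = \beta_0 + \beta_1\,\text{age}_i + \beta_2\,\text{gender}_i + \beta_3\,\text{SUVr}_i + \beta_4\,\text{GMV}_i + \varepsilon_i \\
\text{Interaction (Model 1):}\quad & Y_i = \beta_0 + \beta_1\,\text{age}_i + \beta_2\,\text{gender}_i + \beta_3\,\text{SUVr}_i + \beta_4\,\text{APOE}_i + \beta_5\,(\text{SUVr}_i \times \text{APOE}_i) + \varepsilon_i
\end{aligned}
$$

In the interaction model, a coefficient $\beta_5$ that differs from zero would indicate that the association between amyloid burden and the network measure differs between APOE ε4 carriers and non-carriers. The interaction for model 2 adds the same two APOE terms to the model 2 equation.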

For those network measures for which we found a global effect, we examined the regional specificity of amyloid pathology and gray matter network measures using three analyses. In the first analysis, we assessed the association between global amyloid burden and regional network measures. In the second analysis, we examined the association between regional SUVr values and network measures of the same region. In the third analysis, we used the model from the second analysis with additional correction for global SUVr. The aim of this third model was to assess whether regional SUVr values provided additional information above global SUVr. Regional associations were corrected for age, gender, TIV and local GMV, and, for clustering and path length, also for local degree. Regional associations were corrected for multiple testing using a false discovery rate (FDR) procedure (pFDR; Benjamini and Yekutieli, 2001). Regional associations were visualized using BrainNet viewer (Xia et al., 2013). All statistical analyses were performed in R (R version 3.3.1<sup>4</sup>).
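For readers unfamiliar with FDR adjustment, the sketch below computes adjusted p-values with the step-up procedure of the cited reference (Benjamini and Yekutieli, 2001). It is an illustration in Python, not the R code used in the study, and it assumes that the Benjamini-Yekutieli variant, as opposed to the Benjamini-Hochberg variant, is the one that was applied. The function name is hypothetical.

```python
def by_adjust(pvalues):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    m = len(pvalues)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction factor
    # Walk from the largest p-value to the smallest, keeping a running minimum
    # so that adjusted values never decrease as the raw p-values increase.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted

def significant_regions(pvalues_by_region, alpha=0.05):
    """Regions whose adjusted p-value falls below alpha."""
    regions = list(pvalues_by_region)
    adjusted = by_adjust([pvalues_by_region[r] for r in regions])
    return [r for r, p in zip(regions, adjusted) if p < alpha]
```

Dropping the factor `c_m` gives the Benjamini-Hochberg procedure, which is less conservative.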

<sup>3</sup>www.brain-connectivity-toolbox.net

<sup>4</sup>http://www.R-project.org

# RESULTS

### Cohort Characteristics

Subject characteristics for the total sample and according to amyloid status are described in **Table 1**. We included 318 subjects with a median age of 76 (range 69–85), of whom 204 (64%) were female. All subjects were cognitively normal at the time of inclusion with an average MMSE of 29 (range 27–30). All subjects had a fully connected gray matter network with an average size of 6744 nodes (SD = 619) and average network density of 15% (SD = 1). There were 88 (28%) subjects with a positive amyloid PET scan and 58 (18%) of the subjects were APOE ε4 carriers. Amyloid positive subjects were older, were more often APOE ε4 carriers, and had lower total GMV, lower clustering and normalized clustering γ, and lower small world values. Regional amyloid PET SUVr values in the whole sample and according to amyloid status are presented in Supplementary Table S1.

### Relationship Between Global Amyloid Burden and Whole Brain Network Measures

Higher global amyloid SUVr values were associated with lower total GMV (β = −0.1, standard error 0.05, p = 0.04). Higher global amyloid SUVr values were also associated with lower whole brain clustering, lower normalized clustering coefficient γ, lower normalized path length λ, and lower small world property when correcting for age and gender (**Table 2**, **Figure 1**). The associations with normalized clustering coefficient γ and small world property remained significant after additionally correcting for total GMV. No associations were found between global amyloid SUVr and whole brain network size, degree, network density and betweenness centrality. There was no interaction effect of APOE ε4 on the association between global amyloid SUVr and any of the network measures.

### Relationship Between Global Amyloid Burden and Regional Network Measures

Next, we examined the relationship between global amyloid burden and regional network measures to assess whether effects were localized in specific regions or equally distributed across the cortex. Higher global amyloid SUVr values were associated with lower clustering values in right calcarine and left superior occipital gyrus, and with lower path length in the right superior occipital cortex (all pFDR < 0.05). Using a more liberal threshold of an uncorrected p-value < 0.05, effects were more widespread including orbito- and dorsolateral frontal and parieto-occipital cortex for clustering, and medial and orbito-frontal, posterior parieto-occipital and temporal regions for path length (**Figure 2**).

### Relationship Between Regional Amyloid Burden and Regional Network Measures

Subsequently we examined the relationship between regional SUVr and network measures of the same region. There were no significant associations at pFDR < 0.05. Repeating the analysis with an exploratory uncorrected p-value showed that higher regional amyloid SUVr in the left precuneus was associated with lower clustering in the left precuneus (β = −0.06, p = 0.03), and higher SUVr in the right precuneus was associated with lower path length in the right precuneus (β = −0.08, p = 0.01). We also found an association between higher SUVr in right orbitofrontal cortex and lower path length in right orbito-frontal cortex (β = −0.06, p = 0.03).

Next, we aimed to assess whether changes in network measures were driven by regional amyloid plaques, rather than a global effect of amyloid. However, models in which we additionally corrected for global SUVr suffered from multicollinearity issues, as global SUVr was strongly correlated with regional amyloid burden (all ROIs showed a Pearson's r ≥ 0.9 with a p-value below 1 × 10<sup>−20</sup>; **Figure 3B**). This suggests that amyloid was homogeneously distributed across the cortex, which was supported by exploratory analyses that showed associations of lower clustering in left precuneus with increased PET SUVr values in 10 out of 11 other regions (**Figure 3A**). Similarly, lower path length in right precuneus was also associated with higher amyloid SUVr values in six other regions.

# DISCUSSION

In this study we found that increasing amyloid load measured by amyloid PET is associated with alterations in gray matter network measures in an elderly cohort of cognitively normal subjects with subjective memory complaints. Higher amyloid SUVr was associated with lower clustering, lower normalized clustering γ, lower normalized path length λ, and lower small world values. Our results suggest that gray matter network alterations may be part of the early pathological changes in AD, which can already be detected in cognitively normal subjects with subjective memory complaints in the absence of manifest cognitive impairment.

Previous studies using group level approaches have found an association between amyloid pathology and gray matter covariance in cognitively normal subjects (Oh et al., 2014; Teipel et al., 2017). Using a multivariate analysis, these studies have found amyloid pathology to be associated with a pattern of decreased GMV in medial temporal lobe, cingulate gyrus and prefrontal cortex. Using a single-subject approach to derive gray matter networks, we extend these findings by showing within-individual associations between amyloid load and gray matter network changes.

### Relationship Between Amyloid Burden and Clustering

Results from the present study are in line with our previous study in an independent cohort of cognitively normal subjects, in which we found an association between lower amyloid beta 1–42 in CSF (representative of abnormal amyloid metabolism) and changes in gray matter network measures (Tijms et al., 2016). In that study we also found an association between increased amyloid pathology and whole brain lower clustering values, indicating that there are fewer connections between neighboring areas in the brain, suggesting less effective local integration. Here, using PET to measure amyloid depositions in the brain we extend those findings by showing that lower



Key: APOE, apolipoprotein E; FCSRT-TR, total recall of the Free and Cued Selective Reminding Test; GMV, gray matter volume; IQR, interquartile range; MMSE, mini-mental state examination; PET, positron emission tomography; SUVr, standardized uptake value ratio. Cut-point for amyloid positivity SUVr > 0.79. <sup>∗</sup>p < 0.05, ∗∗p < 0.01 different between amyloid positive and amyloid negative subjects.

normalized clustering values γ are also associated with more severe amyloid burden. Changes in normalized clustering values γ suggest that the global network organization is also affected by amyloid deposition. In our previous study we did not find an association between normalized clustering and amyloid CSF (Tijms et al., 2016). A potential explanation for this discrepancy could be the difference in age between both populations, as subjects in the current study are approximately 20 years older than in our previous CSF study (median age 56 vs. 76 years). As amyloid pathology increases with age, subjects in the present study had on average more amyloid pathology (28% being classified as amyloid abnormal vs. 6% in the previous study). The percentage amyloid positive subjects falls within the expected range for the age group in both studies (Jansen et al., 2015). Another explanation for the differences in findings could be the method to measure amyloid pathology. Some studies have suggested that amyloid alterations may be detected somewhat earlier in CSF than on PET (Mattsson et al., 2015; Palmqvist et al., 2016), which is particularly relevant in cognitively normal subjects. CSF and PET measure slightly different aspects of amyloid pathology. In CSF, soluble amyloid beta 1–42 monomeres are measured, which decrease when amyloid aggregates in the brain. Soluble CSF amyloid beta 1–42 levels may also be influenced by other factors such as amyloid beta production and non-fibrillary aggregation (Mattsson et al., 2015), possibly making CSF more sensitive for the earliest stages of amyloid aggregation. Amyloid PET provides a more direct measure of amyloid deposition with ligands binding to the amyloid beta in fibrillary plaques (Mathis et al., 2012), leading to floor effects within the normal range. 
It is likely that in our previous study, in a younger population that showed mostly normal CSF values, we captured the earliest signs of incipient network disorganization related to very early pathological changes. Lower clustering values associated with increased amyloid load have also been observed for structural connectivity measured with diffusion tensor imaging, independent of cognitive status (Prescott et al., 2014). Lower gray matter clustering values have previously also been reported in subjects with AD dementia and subjects with mild cognitive impairment who later convert to dementia (Tijms et al., 2013a, 2018; Pereira et al., 2016). Taken together, these studies might suggest that during the progression of Alzheimer pathology, clustering values gradually worsen, starting with decreased regional connections and progressively leading to more extensive changes rendering networks more similar to randomly organized networks.

<sup>∗</sup>p < 0.05, ∗∗p < 0.01. Model 1 is adjusted for age and gender. Model 2 is adjusted for age, gender and total gray matter volume. NA, not applicable.

represents the cut-off for amyloid positivity (SUVr > 0.79).

# Relationship Between Amyloid Burden and Path Length

The relationship between amyloid pathology and path length is less straightforward. In this study we found an association between increased amyloid pathology on PET and lower normalized path length λ values, although this was not significant when correcting for GMV. In our earlier CSF study, we found the opposite association, with lower CSF amyloid values being associated with increased un-normalized path length (Tijms et al., 2016). In that study, the increased path length values were accompanied by lower connectivity density values. With a decreasing number of connections, the average path length may increase. In the present study, we did not find an association between amyloid pathology and connectivity density. Possibly, this discrepancy is explained by the age difference between the populations studied. Network density may decrease with advancing age, and the average connectivity density was 15% in the present study, compared to 20% in our previous younger cohort. Path length values might also change non-linearly during the progression of Alzheimer pathology. Possibly, path length values first increase in the earliest stages of amyloid accumulation due to the loss of connections, and eventually decrease again when the network breaks down and becomes more randomly organized. Such an inverted U-shape trajectory of path length changes has previously been observed in functional networks during aging (Smit et al., 2012). Decreased path length values associated with network breakdown might reflect advanced disease stages when many brain areas show atrophy, and thus would show spurious similarities. In patients with AD dementia, both decreased and increased path length values have been reported across and within different imaging modalities (Xie and He, 2012; Tijms et al., 2013b; Kim et al., 2016; Duan et al., 2017). Given these inconsistencies in the literature regarding path length changes in AD, and the influence of other variables on path length, path length may not be a good measure to assess and track AD-related gray matter connectivity changes. Longitudinal studies are needed to further characterize normal gray matter network changes associated with aging and pathological changes associated with amyloid pathology and brain atrophy.

# Relationship Between Amyloid Burden and Small World Values

Finally, we found an association between increased amyloid SUVr and lower small world values. Small world values indicate how much a network is locally integrated compared to a random network while retaining a short path length. Small world values are based on the relation between normalized clustering coefficient and normalized path length. Hence, changes in small world values can be caused by a change in either of these measures. In this study, the decrease in small world values associated with increasing amyloid load can be explained by a relatively larger decrease in normalized clustering compared to normalized path length with increasing amyloid load. Decreases in small world values have previously also been reported in subjects with AD dementia compared to cognitively normal subjects (Tijms et al., 2013a; Kim et al., 2016; Pereira et al., 2016), and have been associated with future cognitive decline in amyloid positive non-demented subjects (Tijms et al., 2018). Some studies have also reported increased small-world values in subjects with AD dementia for different imaging modalities (Tijms et al., 2013b; Duan et al., 2017). Differences between studies might be due to differences in methods to construct the networks or non-linear changes with disease progression, possibly reflecting non-linear changes in path length. Longitudinal studies are needed to further investigate trajectories of network changes with advancing disease.
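
The dependence of the small world value on both normalized measures can be made concrete with a minimal sketch. It assumes the commonly used ratio definition σ = γ / λ, where γ and λ are the clustering coefficient and path length divided by their random-network references; the function names and the numbers are hypothetical and are not taken from the study.

```python
def normalized(value, random_reference):
    """Ratio of an observed network measure to its random-network reference."""
    if random_reference <= 0:
        raise ValueError("random reference must be positive")
    return value / random_reference


def small_world(gamma, lam):
    """Small world value, assumed here to be sigma = gamma / lambda."""
    if lam <= 0:
        raise ValueError("normalized path length must be positive")
    return gamma / lam


# Hypothetical values for illustration only (not study data):
# gamma drops by 20% while lambda drops by less than 5%.
sigma_low_amyloid = small_world(gamma=1.50, lam=1.10)
sigma_high_amyloid = small_world(gamma=1.20, lam=1.05)

print(f"sigma (low amyloid):  {sigma_low_amyloid:.3f}")
print(f"sigma (high amyloid): {sigma_high_amyloid:.3f}")
```

Because γ is in the numerator and λ in the denominator, a decrease in γ that is proportionally larger than the decrease in λ lowers σ, which is the pattern described above.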

# Regional Associations Between Amyloid PET and Gray Matter Network Measures

At a local level, increased global amyloid PET SUVr values were associated with decreased clustering in orbito- and dorsolateral frontal areas as well as parieto-occipital areas. Several of the regional correlations correspond to our previous results with CSF amyloid values (Tijms et al., 2016). Increased global amyloid PET was also associated with decreased path length in various brain areas. The associations between global amyloid and regional changes were quite widespread, and some of these areas are known regions of amyloid deposition (Braak and Braak, 1996). When examining the relationship between regional amyloid load and regional network changes, we found an effect in the precuneus and orbito-frontal cortex. These may be the regions of earliest amyloid accumulation (Villeneuve et al., 2015). When we further studied the anatomical specificity of these relationships, however, we found that many of the observed associations between local network measures and amyloid pathology were largely explained by global amyloid SUVr values. Our results are in line with other studies that did not find a direct relationship between local amyloid plaque deposits and localized measures of neuronal injury (Jack et al., 2008; Altmann et al., 2015; Grothe and Teipel, 2016). Possibly, the poor anatomical correspondence between localized plaque burden and neuronal injury markers is explained by the difference in the time at which these biomarkers become abnormal. Amyloid pathology may start to accumulate up to 20 years before the onset of symptoms and plateaus at a relatively early stage (Jack et al., 2013; Villemagne et al., 2013). Markers of neurodegeneration, on the other hand, are more closely related to the onset of symptoms (Jack et al., 2009; Da et al., 2014). Gray matter network alterations might be sensitive enough to detect very subtle brain structural changes associated with amyloid pathology, and precede more overt manifestations of neurodegeneration such as atrophy. 
Longitudinal studies are necessary to further examine the temporal relation between amyloid deposits and gray matter network changes. Possibly, the observed association between amyloid and gray matter network measures may reflect the presence of tau in addition to amyloid pathology. Regional tau deposits may show more clear associations with regional disruptions of brain structure and function (Ossenkoppele et al., 2016; Xia et al., 2017). With the advent of new tau-binding ligands for PET, the anatomical relation between amyloid plaques, tau deposits and gray matter network changes can be examined in future studies (Villemagne et al., 2015).

### Effect of APOE

In agreement with previous studies in cognitively normal subjects, we did not find an effect of APOE ε4 genotype, a major genetic risk factor for AD, on the association between amyloid pathology and gray matter network measures (Oh et al., 2011; Tijms et al., 2016; Teipel et al., 2017). Although APOE ε4 genotype has been associated with amyloid pathology in cognitively normal subjects in a large meta-analysis study (Jansen et al., 2015), it seems that subsequent structural brain alterations are not different for APOE ε4 carriers and non-carriers. This suggests that APOE ε4 most strongly affects (the age of) amyloid aggregation, but not necessarily the anatomical locations that will show most pronounced structural brain changes.

### Limitations

A potential limitation of the present study is that we only had local SUVr values available for a subset of anatomically relevant cortical regions, for which the regional SUVr were all highly correlated with global SUVr. As such, the possibility that other anatomical areas might show more variability in amyloid depositions cannot be excluded (Villain et al., 2012). Additionally, amyloid load was assessed using semiquantitative SUVr values, which do not take into account confounding variables that may influence tracer uptake, such as flow effects, and so this might have introduced noise to the data (van Berckel et al., 2013). We presently studied subjects with subjective memory complaints, a population that might be enriched for preclinical AD, because these subjects may have higher chances of amyloid pathology and be at increased risk of cognitive decline (Jessen et al., 2014). Although this makes our study clinically relevant, this limits generalizability to the broader population. We used a cross-sectional approach to study the relationship between amyloid PET and gray matter networks. Longitudinal amyloid PET and structural MRI data might give more insight into the relationship between amyloid pathology, gray matter network disruptions and cognitive decline. Finally, it is possible that the association between amyloid and gray matter network changes reflects the presence of tau pathology. We were not able to examine this in the present sample as we did not have information on tau pathology from CSF or PET. Future studies may focus on examining the relationship between amyloid, tau and gray matter network changes.

### CONCLUSION

In summary, we found that in cognitively normal subjects, global amyloid burden is associated with alterations in gray matter network measures. These results suggest that gray matter network alterations may occur at a very early stage in the pathogenesis of AD.

# AUTHOR CONTRIBUTIONS

MK and BMT analyzed the data and drafted the manuscript. PJV, HB, FB, SAMS, WMF, PS, HH, M-OH and BD revised the manuscript for important intellectual content. HB, HH, BD and BMT conceived and designed the study.

# FUNDING

The study was promoted by Institut National de la Santé et de la Recherche Médicale (INSERM) in collaboration with ICM, IHU-A-ICM and Pfizer and has received support within the "Investissement d'Avenir" (Association Nationale de la Recherche et de la Technologie (ANR)-10-AIHU-06) program. The study was promoted in collaboration with the "CHU de Bordeaux" (coordination CIC EC7), the promoter of the Memento cohort, funded by the Foundation Plan-Alzheimer. The study was further supported by AVID/Lilly. This research publication benefited from the support of the Program "PHOENIX" led by the Sorbonne University Foundation and sponsored by la Fondation pour la Recherche sur Alzheimer. MK and PJV are appointed on a grant from the EU/EFPIA Innovative Medicines Initiative Joint Undertaking (EMIF Grant No. 115372). HH is supported by the AXA Research Fund, the "Fondation partenariale Sorbonne Université" and the "Fondation pour la Recherche sur Alzheimer", Paris, France. This work benefited from French State aid under the "Investissements d'avenir" program (ANR-10-IAIHU-06). BMT received funding from the Memorabel grant programme of the Netherlands Organisation for Health Research and Development (ZonMW Grant No. 733050506). FB is supported by the NIHR biomedical research center at UCLH.

### ACKNOWLEDGMENTS

INSIGHT-preAD study group: Audrain C, Auffret A, Bakardjian H, Baldacci F, Batrancourt B, Benakki I, Benali H, Bertin H, Bertrand A, Boukadida L, Cacciamani F, Causse V, Cavedo E, Cherif Touil S, Chiesa PA, Colliot O, Dalla Barba G, Depaulis M, Dos Santos A, Dubois B, Dubois M, Epelbaum S, Fontaine B, Francisque H, Gagliardi G, Genin A, Genthon R, Glasman P, Gombert F, Habert MO, Hampel H, Hewa H, Houot M, Jungalee N, Kas A, Kilani M, La Corte V, Le Roy F, Lehericy S, Letondor C, Levy M, Lista S, Lowrey M, Ly J, Makiese O, Masetti I, Mendes A, Metzinger C, Michon A, Mochel F, Nait Arab R, Nyasse F, Perrin C, Poirier F, Poisson C, Potier MC, Ratovohery S, Revillon M, Rojkova K, Santos-Andrade K, Schindler R, Servera MC, Seux L, Simon V, Skovronsky D, Thiebaut M, Uspenskaya O, Vlaincu M. INSIGHT-preAD Scientific Committee Members: Dubois B, Hampel H, Bakardjian H, Colliot O, Habert MO, Lamari F, Mochel F, Potier MC, Thiebaut de Schotten M.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnagi.2018.00067/full#supplementary-material

### REFERENCES


cortical networks in health and schizophrenia. J. Neurosci. 28, 9239–9248. doi: 10.1523/JNEUROSCI.1929-08.2008


in persons without dementia: a meta-analysis. JAMA 313, 1924–1938. doi: 10.1001/jama.2015.4668


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 ten Kate, Visser, Bakardjian, Barkhof, Sikkes, van der Flier, Scheltens, Hampel, Habert, Dubois and Tijms. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Brain Aging and APOE ε4 Interact to Reveal Potential Neuronal Compensation in Healthy Older Adults

Elisa Scheller<sup>1,2</sup>\*, Lena V. Schumacher<sup>2,3,4</sup>, Jessica Peter<sup>1,5</sup>, Jacob Lahr<sup>1,2</sup>, Julius Wehrle<sup>6,7,8,9</sup>, Christoph P. Kaller<sup>2,3,10</sup>, Christian Gaser<sup>11,12</sup> and Stefan Klöppel<sup>1,2,5</sup>

<sup>1</sup> Department of Psychiatry and Psychotherapy, Faculty of Medicine, Medical Center-University of Freiburg, University of Freiburg, Freiburg, Germany, <sup>2</sup> Freiburg Brain Imaging Center, University of Freiburg, Freiburg, Germany, <sup>3</sup> Department of Neurology, Faculty of Medicine, Medical Center-University of Freiburg, University of Freiburg, Freiburg, Germany, <sup>4</sup> Medical Psychology and Medical Sociology, Faculty of Medicine, University of Freiburg, Freiburg, Germany, <sup>5</sup> University Hospital of Old Age Psychiatry and Psychotherapy Bern, Bern, Switzerland, <sup>6</sup> Department of Medicine I, Medical Center-University of Freiburg, Freiburg, Germany, <sup>7</sup> Berta-Ottenstein-Programme, Faculty of Medicine, University of Freiburg, Freiburg, Germany, <sup>8</sup> German Cancer Consortium (DKTK), Freiburg, Germany, <sup>9</sup> German Cancer Research Center (DKFZ), Heidelberg, Germany, <sup>10</sup> BrainLinks-BrainTools Cluster of Excellence, University Medical Center Freiburg, Freiburg, Germany, <sup>11</sup> Department of Psychiatry and Psychotherapy, Jena University Hospital, Jena, Germany, <sup>12</sup> Department of Neurology, Jena University Hospital, Jena, Germany

Compensation implies the recruitment of additional neuronal resources to prevent the detrimental effect of age-related neuronal decline on cognition. Recently suggested statistical models comprise behavioral performance, brain activation, and measures related to aging- or disease-specific pathological burden to characterize compensation. Higher chronological age as well as the APOE ε4 allele are risk factors for Alzheimer's disease. A more biological approach to characterize aging compared with chronological age is the brain age gap estimation (BrainAGE), taking into account structural brain characteristics. We utilized this estimate in an fMRI experiment together with APOE variant as measures related to pathological burden and aimed at identifying compensatory regions during working memory (WM) processing in a group of 34 healthy older adults. According to published compensation criteria, better performance along with increased brain activation would indicate successful compensation. We examined the moderating effects of BrainAGE on the relationship between task performance and brain activation in prefrontal cortex, as previous studies suggest predominantly frontal compensatory activation. Then we statistically compared them to the effects of chronological age (CA) tested in a previous study. Moreover, we examined the effects of adding APOE variant as a further moderator. Herewith, we strived to uncover neuronal compensation in healthy older adults at risk for neurodegenerative disease. Higher BrainAGE alone was not associated with an increased recruitment in prefrontal cortex. When adding APOE variant as a second moderator, we found an interaction of BrainAGE and APOE variant, such that ε4 carriers recruited right inferior frontal gyrus with higher BrainAGE to maintain WM performance, thus showing a pattern compatible with successful neuronal compensation. Exploratory analyses yielded similar patterns in left inferior and bilateral middle frontal gyrus. 
These results contrast with those from a previous study, where we found no indication of compensation in prefrontal cortex in ε4 carriers with increasing CA. We conclude that BrainAGE together with APOE variant can help to reveal potential neuronal compensation in healthy older adults. Previous results on neuronal compensation in frontal areas corroborate our findings. Compensatory brain regions could be targeted in affected individuals by training or stimulation protocols to maintain cognitive functioning as long as possible.

### Edited by:

Lutz Jäncke, Universität Zürich, Switzerland

### Reviewed by:

Laura Lorenzo-López, University of A Coruña, Spain

Anders Fjell, University of Oslo, Norway

### \*Correspondence:

Elisa Scheller elisa.scheller@psychologie.uni-freiburg.de

Received: 30 October 2017 Accepted: 05 March 2018 Published: 20 March 2018

### Citation:

Scheller E, Schumacher LV, Peter J, Lahr J, Wehrle J, Kaller CP, Gaser C and Klöppel S (2018) Brain Aging and APOE ε4 Interact to Reveal Potential Neuronal Compensation in Healthy Older Adults. Front. Aging Neurosci. 10:74. doi: 10.3389/fnagi.2018.00074

Keywords: functional magnetic resonance imaging, BrainAGE, aging, APOE, neuronal compensation, working memory, multiple regression, moderator analysis

# INTRODUCTION

Neuronal compensation, as an individual's reaction to cognitive challenge with the aim of maintaining cognitive performance, has been increasingly investigated over the past years in healthy aging and beginning neurodegenerative disease. Several theoretical frameworks describe compensation as a flexible recruitment of additional neuronal resources when existing networks reach their capacity limits and are no longer sufficient for successful cognitive performance (Cabeza, 2002; Davis et al., 2008; Park and Reuter-Lorenz, 2009; Stern, 2009; Reuter-Lorenz and Park, 2014). Together with these frameworks, a debate on the interpretation of increased brain activation as compensatory arose (Price and Friston, 1999; Friston and Price, 2003; Grady, 2012). Moreover, an increase in brain activation cross-sectionally might not be maintained longitudinally, but be transformed into an overall decrease in functional response over time (Nyberg et al., 2010). To overcome this debate, clear-cut criteria to unambiguously characterize increased brain activation in frontal cortex as compensatory have been published recently (Cabeza and Dennis, 2013). Cabeza and Dennis state that successful compensation is indicated by an increase in activation that is positively related to task performance. In addition, guidelines for a translation of such criteria to statistical models have been suggested (Gregory et al., 2017). To establish a statistical model of compensation, three components are necessary: task performance, a measure of brain activation, and a measure related to pathological or disease burden for the condition investigated (Cabeza and Dennis, 2013; Gregory et al., 2017).

Measures of task performance and brain activation for compensation models can be derived from task-based fMRI experiments. Approximations of pathological burden specific to healthy aging or a certain neurodegenerative disease can be specified according to the group investigated. In patients with beginning neurodegenerative disease, regional brain volume of an area affected by the disease can be used, e.g., striatal volume in Huntington's disease (Klöppel et al., 2015; Gregory et al., 2017). Regarding healthy aging, chronological age (CA) is an easily available proxy of biological age or pathological burden due to aging and has been utilized in models of compensation before (Scheller et al., 2017), though CA is linear in nature and does not cover individual deviations from average aging. A more sophisticated measure covering such deviations is the brain age gap estimation (BrainAGE, Franke et al., 2010), as it takes into account individual heterogeneity of brain anatomy. BrainAGE is estimated from T1-weighted structural MRI scans with the help of kernel regression. After estimation, BrainAGE constitutes the deviation (in years) from an individual's CA. The measure has proved helpful as a biomarker predicting the conversion from Mild Cognitive Impairment (MCI) to Alzheimer's disease (AD) and even outperforms other established markers of disease progression (Gaser et al., 2013). Moreover, BrainAGE is significantly related to markers of poor health such as indices of the metabolic syndrome and kidney and liver function in healthy older adults (Franke et al., 2014), and thus seems to reflect pathological burden in great detail. Recently, BrainAGE has proved a viable biomarker for aging, as individuals with older-appearing brains showed an increased mortality risk (Cole et al., 2017). 
On the other hand, younger-appearing brains seem to be related to higher years of education and the number of flights of stairs climbed daily, i.e., physical exercise (Steffener et al., 2016).

The current study is the first to combine BrainAGE with functional imaging data and to implement BrainAGE as a proxy for pathological burden in a model of neuronal compensation. Hence, we aimed at linking changes in brain activation to an underlying structural correlate (Gregory et al., 2017). This study builds on recent work on the same sample, where CA was successfully combined with APOE variant as moderator variables in a multiple regression model to unequivocally detect successful compensation in older adults at risk for neurodegenerative disease (Scheller et al., 2017). Carrying the APOE ε4 allele is an established risk factor for sporadic AD (Corder et al., 1993; Farrer et al., 1997). In a previous study, we showed that ε4 carriers activated medial frontal and inferior frontal areas to a greater extent compared to non-ε4 carriers during a working memory (WM) task, pointing to successful compensation in genetically burdened individuals. These effects were not additionally moderated by CA. Therefore, in the current study we sought to investigate whether BrainAGE might help to reveal additional compensatory areas in the same sample. In longitudinal data, BrainAGE changed to a greater extent in APOE ε4 carriers compared to non-carriers (Löwe et al., 2016). Thus, it is of great interest to examine a combination of both biomarkers within one model of compensation.

In the present experiment, we restricted our search for compensatory brain activity to prefrontal cortex, as previous studies of WM and APOE variant suggest potential compensation predominantly in prefrontal areas (Filbey et al., 2006, 2010; Wishart et al., 2006; Chen et al., 2013). Reviews of WM function in healthy older adults as well as early neurodegenerative disease corroborate these findings (Cabeza and Dennis, 2013; Reuter-Lorenz and Park, 2014; Scheller et al., 2014). Several behavioral and structural imaging studies point to an association between WM function and APOE allele status, as WM and other frontal cognition deficits in older APOE ε4 carriers were detected (Reinvang et al., 2010; Caselli et al., 2011; Bender and Raz, 2012; Greenwood et al., 2014). Moreover, frontal areas such as the dorsolateral prefrontal cortex (DLPFC) are investigated as targets for noninvasive brain stimulation (NIBS), e.g., transcranial direct current stimulation (tDCS) combined with cognitive training (Flöel, 2014; Jones et al., 2017; Ruf et al., 2017). In the future, tailoring stimulation to compensatory areas could constitute an approach to maintain cognitive abilities as long as possible (Scheller et al., 2014).

Taken together, the aim of the current study was to investigate compensatory recruitment in healthy older individuals to maintain WM function. First, we hypothesized that individuals with higher BrainAGE might require additional neuronal resources to perform a cognitively demanding WM task successfully. To this end, the moderating effect of BrainAGE on the relationship between task performance and brain activation was investigated with multiple regression with interaction effects. This assessment was compared to previously published findings of the same sample with CA instead of BrainAGE as a moderator (Scheller et al., 2017). Second, we examined the additional burden by the APOE ε4 allele on the relationship between task performance and brain activation and therefore implemented APOE variant as a second moderator variable. We hypothesized that the combined burden of a higher BrainAGE and the ε4 allele might reveal an augmented need for compensation in prefrontal cortex. With these analyses, our approach aimed at yielding further insight into differential compensatory mechanisms depending on the respective measure of pathological burden in healthy older individuals.

# MATERIALS AND METHODS

### Sample

Thirty-four community-dwelling healthy older adults (20 females, mean age 68.82 years, SD 5.33, range 61–80) were recruited as part of a project investigating neuronal plasticity in aging and early neurodegeneration at the University Medical Center Freiburg. All were right-handed, had normal or corrected-to-normal visual acuity and no history of psychiatric or neurological disease as well as adequate performance in a sensitive cognitive test (Montreal Cognitive Assessment [MOCA] score ≥24; Nasreddine et al., 2005). We chose the MOCA cut-off at 24 to reduce the number of false positive exclusions (Luis et al., 2009; Roalf et al., 2013). The local Ethics Committee approved the study and all participants gave written informed consent prior to participation.

# BrainAGE Estimation and APOE Genotyping

BrainAGE of each participant was estimated based on the individual T1-weighted anatomical image with a voxel size of 1 mm<sup>3</sup> and respective CA. A detailed description of the algorithm can be found in previous publications (Franke et al., 2010; Gaser et al., 2013). We implemented the BrainAGE measure based on both gray and white matter to incorporate the entire brain structure in our model. In short, relevance vector regression (RVR) is utilized as statistical framework. The model is trained with the help of structural imaging data of a training sample, in this case we used 547 subjects of the IXI database (http://brain-development.org/ixi-dataset/). Then, the BrainAGE index of all participants was estimated using individuals' segmented T1-weighted images that were derived using the SPM8 package (http://www.fil.ion.ucl.ac.uk/spm) and the VBM8 toolbox (http://www.neuro.uni-jena.de/). We used the affine registered segmentations for gray and white matter that were smoothed with a 4-mm full-width-at-half-maximum smoothing kernel. After smoothing, data were resampled to 8 mm and a data reduction was performed by applying principal component analysis (PCA), utilizing the "MATLAB Toolbox for Dimensionality Reduction" (https://lvdmaaten.github.io/drtoolbox/). PCA was only performed on the training sample and the estimated transformation parameters were subsequently applied to the test sample. The BrainAGE score indicates the deviation (in years) from an individual's CA. A positive BrainAGE score implies an older-appearing brain (Cole et al., 2017), whereas a negative score signifies that the respective individual's brain is younger-appearing. The score can be added to CA to directly compare both measures (**Figure 2**), e.g., the BrainAGE of +2.5 of an individual with a CA of 79 yields a value of 81.5 years. In the current manuscript, we use "BrainAGE" to label the deviation from CA, and "estimated BrainAGE" to label BrainAGE + CA in years.
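
The logic of this pipeline can be illustrated with a minimal sketch: PCA is estimated on the training sample only, the same transformation is applied to the test sample, a regression model maps components to age, and BrainAGE is the model estimate minus CA. The sketch makes several loudly stated assumptions: the data are synthetic, the number of components is arbitrary, and ridge regression is swapped in for RVR purely to keep the example dependency-light. It is not the published BrainAGE implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): rows are subjects, columns are
# smoothed, downsampled tissue-segment values.
n_train, n_features = 200, 50
age_train = rng.uniform(20, 86, n_train)
age_test = np.array([62.0, 68.0, 71.0, 75.0, 79.0])  # chronological ages (CA)
feature_weights = rng.normal(size=n_features)


def simulate_features(ages):
    # Features vary linearly with age plus noise; purely illustrative.
    noise = rng.normal(scale=5.0, size=(ages.size, n_features))
    return np.outer(ages, feature_weights) + noise


X_train, X_test = simulate_features(age_train), simulate_features(age_test)

# 1) PCA is estimated on the training sample only ...
train_mean = X_train.mean(axis=0)
_, _, vt = np.linalg.svd(X_train - train_mean, full_matrices=False)
components = vt[:10]  # number of retained components is an arbitrary choice


def project(X):
    # ... and the same centering and rotation are applied to any sample.
    return (X - train_mean) @ components.T


def add_intercept(Z):
    return np.column_stack([np.ones(len(Z)), Z])


# 2) Ridge regression as a stand-in for RVR (assumption, not the paper's model).
A = add_intercept(project(X_train))
penalty = 1e-3 * np.eye(A.shape[1])
penalty[0, 0] = 0.0  # do not penalize the intercept
coef = np.linalg.solve(A.T @ A + penalty, A.T @ age_train)

# 3) "Estimated BrainAGE" is the model output; "BrainAGE" is its deviation from CA.
estimated_brainage = add_intercept(project(X_test)) @ coef
brainage = estimated_brainage - age_test

for ca, gap in zip(age_test, brainage):
    print(f"CA {ca:.0f} years, BrainAGE {gap:+.2f} years")
```

Fitting the PCA on the training sample alone keeps information from the test subjects out of the feature transformation, which is the point of the train/test separation described above.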

Genotyping of the sample is described in detail in previous work (Scheller et al., 2017). Twelve participants were identified as ε3/ε4 heterozygotes, 17 as ε3/ε3 homozygotes, two as ε2/ε3 heterozygotes and three as ε2/ε2 homozygotes. The allele frequencies in our sample reflect the distribution in the German population (∼8% ε2, ∼78% ε3, and ∼14% ε4; Corbo and Scacchi, 1999). As subsample sizes did not allow for further stratification, we decided to classify APOE genotype as a dichotomous variable (12 ε4 carriers, who were all ε3/ε4 heterozygotes, and 22 non-ε4 carriers) for our statistical models. A classification of all participants according to their allele status as well as their BrainAGE index can be found in Supplement 1.
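
As a worked check of the genotype figures given above, the following sketch counts alleles from the reported genotype counts and derives the dichotomous carrier variable. Only the four genotype counts come from the text; the variable names and the `e2`/`e3`/`e4` labels are illustrative.

```python
from collections import Counter

# Genotype counts as reported in the text (34 participants in total).
genotype_counts = {
    ("e3", "e4"): 12,
    ("e3", "e3"): 17,
    ("e2", "e3"): 2,
    ("e2", "e2"): 3,
}

# Each participant contributes two alleles.
allele_counts = Counter()
for alleles, n_people in genotype_counts.items():
    for allele in alleles:
        allele_counts[allele] += n_people

total_alleles = sum(allele_counts.values())
allele_freq = {a: c / total_alleles for a, c in allele_counts.items()}

# Dichotomous classification used in the statistical models.
n_carriers = sum(n for g, n in genotype_counts.items() if "e4" in g)
n_noncarriers = sum(genotype_counts.values()) - n_carriers

for allele in sorted(allele_freq):
    print(f"{allele}: {allele_counts[allele]}/{total_alleles} = {allele_freq[allele]:.1%}")
print(f"e4 carriers: {n_carriers}, non-carriers: {n_noncarriers}")
```

With 68 alleles in total, the counts work out to 8 ε2, 48 ε3 and 12 ε4, i.e., roughly 12%, 71% and 18%; with a sample of this size, each additional carrier shifts a frequency by about 1.5 percentage points.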

## Verbal n-Back Task and Behavioral Data

The blocked verbal n-back task consisting of 0-, 1-, and 2-back conditions has been described in detail in our previous manuscript (Scheller et al., 2017). In short, participants lay supine in the scanner while viewing instructions and stimuli via a mirrored projection system. They responded with the index and middle finger of their right hand using a custom-built 2-button response box. After a 5 s instruction screen, letters were presented one at a time for 1,500 ms each with 1,000 ms blank screen inter-stimulus interval (ISI), while subjects had to indicate by pressing a button with their dominant index finger whether the currently presented letter was the same as the previous (1-back) or second-last (2-back) letter. If the current letter was not the same, they had to indicate this by pressing a button with their middle finger. A third condition (0-back) served as a baseline to contrast against 1-back and 2-back and did not include working memory load. Here, either the letter A or B was presented for 1,500 ms each with 1,000 ms ISI and subjects had to press with their index finger if the current letter was A and with their middle finger if it was B. The three conditions (2-back, 1-back, 0-back) were presented in blocks of 10 letters in a pseudo-randomized order. There were six blocks per condition resulting in 60 trials per condition and each block lasted 30 s including 5 s of instructions. As an index of task performance, we computed accuracy as percentage of correct responses for each condition as well as reaction times as the latency between stimulus display and corresponding button press. Compatibility with the normal distribution was tested using Kolmogorov–Smirnov tests for all performance indices. To assess changes in performance between low and high WM load across the whole sample, we computed paired t-tests or Wilcoxon signed-rank tests where appropriate for accuracy and reaction times. 
In addition, performance indices were compared between ε4 and non-ε4 carriers with independent t-tests to ensure that identified brain activation differences were not caused by a performance deficit. For the imaging data analyses, accuracy as percentage of correct responses was used as measure of task performance.
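
The task structure and the accuracy index described above can be sketched as follows. The timing constants are taken from the text; the letter sequence, the simulated responses, and the choice to score the first n letters of a block as non-targets are assumptions made only for illustration.

```python
def nback_targets(letters, n):
    """True where a letter matches the one presented n positions earlier.

    Assumption: the first n letters of a block cannot be targets.
    """
    return [i >= n and letters[i] == letters[i - n] for i in range(len(letters))]


def accuracy_percent(responses, targets):
    """Percentage of trials on which the same/different response was correct."""
    correct = sum(r == t for r, t in zip(responses, targets))
    return 100.0 * correct / len(targets)


# Timing parameters as described in the text (seconds).
STIMULUS_S, ISI_S, INSTRUCTION_S = 1.5, 1.0, 5.0
LETTERS_PER_BLOCK, BLOCKS_PER_CONDITION = 10, 6

block_duration_s = INSTRUCTION_S + LETTERS_PER_BLOCK * (STIMULUS_S + ISI_S)
trials_per_condition = LETTERS_PER_BLOCK * BLOCKS_PER_CONDITION

# Hypothetical 2-back block with one simulated error on the fourth letter.
letters = list("BKBKTLTRRR")
targets_2back = nback_targets(letters, 2)
responses = targets_2back.copy()
responses[3] = not responses[3]

print(f"block duration: {block_duration_s} s, trials per condition: {trials_per_condition}")
print(f"accuracy in this block: {accuracy_percent(responses, targets_2back)} %")
```

The block arithmetic reproduces the stated design: 10 letters at 2.5 s each plus 5 s of instructions gives 30 s per block, and six blocks give 60 trials per condition.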

# Imaging Data

### Data Acquisition and Processing

Detailed descriptions of imaging data acquisition as well as preprocessing procedures can be found in a previous study of the same sample (Scheller et al., 2017). Preprocessing and subsequent statistical analyses were performed with the Statistical Parametric Mapping software (SPM8 r4667; Wellcome Trust Centre for Neuroimaging, http://www.fil.ion.ucl.ac.uk/spm). In short, functional images were coregistered to respective anatomical scans, which were segmented with the VBM8 toolbox (r435; http://dbm.neuro.uni-jena.de/software/). First-level analysis was conducted using a general linear model (GLM) approach in native space. Normalization took place before 2nd level analyses, such that images were resampled to a spatial resolution of 1.5 cubic mm and smoothed with a 6 mm full width at half maximum (FWHM) Gaussian kernel. For group multiple regression analyses, contrast images of interest comparing 2-back vs. 0-back conditions were entered in a multiple regression model. Activation peaks were anatomically labeled with the Anatomy Toolbox for SPM (version 1.8, http://www.fz-juelich.de/inm/inm-1/DE/Forschung/\_docs/SPMAnatomyToolbox/SPMAnatomyToolbox\_node.html, see e.g., Eickhoff et al., 2005). All group analyses were computed voxel-wise but restricted to a region of interest that comprised the entire prefrontal cortex due to the predominantly frontal location of compensatory areas in previous studies (see section Introduction). To this end, we defined an anatomical mask with the WFU pickatlas toolbox (Maldjian et al., 2003) comprising all bilateral frontal cortical areas (including the insular cortex).
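
For readers less familiar with smoothing parameters, the 6 mm FWHM kernel mentioned above can be converted to a Gaussian standard deviation with the standard relation FWHM = 2·sqrt(2·ln 2)·σ. The short sketch below is only an arithmetic illustration; the function name is hypothetical, and reading the stated resolution as 1.5 mm isotropic voxels is an assumption.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full width at half maximum to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_mm = fwhm_to_sigma(6.0)     # about 2.55 mm
sigma_voxels = sigma_mm / 1.5     # assumes 1.5 mm isotropic voxels
```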

### Multiple Regression Analysis

To investigate whether BrainAGE and APOE variant significantly moderate the relationship between task performance and brain activation and thus to draw conclusions on neuronal compensation, we chose multiple regression as the favorable statistical model. Multiple regression is a simplification of the model suggested for longitudinal data by Gregory et al. (2017). Compared to the often-found group analyses in the field of neuroimaging, multiple regression is advantageous, as the sample can be investigated in a continuous fashion instead of being artificially dichotomized. Moreover, the possibility to introduce moderator variables in multiple regression enables investigation of interaction effects across the whole continuum represented by the sample (for introductions, see Cohen et al., 2003; Jaccard and Turrisi, 2003; Hayes, 2013). Hence, we allowed performance, BrainAGE and APOE variant to interact in the prediction of brain activation to investigate potential compensation.

To assess the moderator effects of BrainAGE and APOE variant, we defined a multiple regression model for the 2-back condition of our n-back task (**Figure 1**). As expected, the 1-back condition showed a ceiling effect (**Table 1**), thus not offering sufficient variability for meaningful further interpretation. To construct the multiple regression model, we entered 2-back accuracy as "focal" predictor, i.e., the primary predictor of interest, and both BrainAGE and APOE variant (ε4 vs. non-ε4) as one continuous and one dichotomous moderator, respectively. The contrast image 2-back > 0-back constituted the dependent variable or outcome in both models. The resulting regression models contained behavioral performance, BrainAGE, and APOE variant as well as all possible product terms for two-way and three-way interactions of these variables as regressors. To control for confounds, gender and years of education were entered as nuisance variables. Results from such multiple regression models with interaction effects are conditional on the specific centering of the predictors. Therefore, all continuous predictors were mean-centered prior to entering the model to yield meaningful results representing average sample characteristics (cf. Jaccard and Turrisi, 2003).

FIGURE 1 | Conceptual diagram of the multiple regression model employed to analyze interaction effects in fMRI data allowing for moderated moderation. WM performance is the "focal" predictor, whereas BrainAGE as well as APOE variant act as moderators. In this model, 2- as well as 3-way interactions are possible, so that the focal predictor and both moderators can interact with each other. See also model templates for PROCESS Macro, http://www.afhayes.com/.
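
The regressor set described above (focal predictor, two moderators, all two-way and three-way products, and nuisance variables) can be sketched as follows. This is a hedged illustration rather than the IFX Toolbox implementation: the function and column names are hypothetical, and the 0/1 coding of APOE variant and gender as well as the centering of education are assumptions.

```python
from statistics import mean


def center(values):
    """Mean-center a list of numbers."""
    m = mean(values)
    return [v - m for v in values]


def build_design(accuracy, brainage, e4_carrier, sex, education):
    """Build one dict of regressors per participant.

    accuracy, brainage, education: continuous, mean-centered here.
    e4_carrier: 1 for e4 carriers, 0 for non-carriers (coding assumed).
    sex: 0/1 nuisance variable (coding assumed).
    """
    perf = center(accuracy)
    bage = center(brainage)
    edu = center(education)
    rows = []
    for p, b, g, s, e in zip(perf, bage, e4_carrier, sex, edu):
        g = float(g)
        rows.append({
            "intercept": 1.0,
            "perf": p,
            "brainage": b,
            "apoe": g,
            "perf_x_brainage": p * b,
            "perf_x_apoe": p * g,
            "brainage_x_apoe": b * g,
            "perf_x_brainage_x_apoe": p * b * g,  # moderated moderation term
            "sex": float(s),
            "education": e,
        })
    return rows
```

Because the continuous predictors are centered before the products are formed, each lower-order coefficient is interpreted at the sample mean of the other continuous predictors, which is the point made in the paragraph above.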

All multiple regression analyses were conducted voxel-wise with the in-house developed IFX Toolbox for SPM8 (Kaller et al., 2012) based on the assessment of interaction effects as described by Jaccard and Turrisi (2003) and Bauer and Curran (2005). The toolbox identifies activation peaks showing significant interaction effects. We assessed respective interaction contrasts within prefrontal cortex at a voxel-level threshold of p < 0.05, family-wise error (FWE) corrected for multiple comparisons, paralleling previous work (Scheller et al., 2017). For further exploratory analyses, we applied an uncorrected threshold (p < 0.001).

Subsequently, the signal of significant peaks was extracted using the volume of interest function in SPM and further analyzed with the PROCESS macro version 2.16 in SPSS (Hayes, 2013) with a model allowing for moderated moderation (PROCESS Model 3). PROCESS and the IFX toolbox are based on the same literature and use equivalent implementations of multiple regression with interaction effects. PROCESS allows for further examination of interaction effects, e.g., the addition of more covariates. Hence, we double-checked and further characterized interaction effects with PROCESS. To be able to directly compare BrainAGE and CA in one regression model, we added CA as a nuisance variable to the model with task performance, BrainAGE, and APOE variant as predictors. We decided against operationalizing CA as a fourth moderator due to sample size constraints and because CA was not identified as a significant moderator in previous work on the same sample (Scheller et al., 2017).

To visualize interactions, we used the Johnson–Neyman technique (Johnson and Neyman, 1936; Bauer and Curran, 2005; Hayes, 2013) considering the conditional effect of task performance on brain activity across the whole range of the moderator variables (see e.g., Kaller et al., 2015; Scheller et al., 2017 for examples with neuroimaging data). With this approach, it is possible to compute regions of significance within confidence bands for the moderator variable, i.e., BrainAGE ranges for which task performance significantly relates to the activation of certain brain regions.
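
The logic of the Johnson–Neyman technique can be sketched for the simpler case of one continuous moderator (as when the bands are drawn separately for each APOE group). In a model with coefficients b_focal for performance and b_inter for the performance × moderator product, the conditional effect at moderator value m is b_focal + b_inter·m, and the region boundaries are the values of m where its t-ratio equals the critical t. The code below is an illustration under these assumptions; function names are hypothetical, and the coefficient variances, covariance, and critical t must be supplied from a fitted model.

```python
import math


def conditional_effect(b_focal, b_inter, var_focal, cov_fi, var_inter, m):
    """Conditional effect of the focal predictor at moderator value m, with its SE."""
    effect = b_focal + b_inter * m
    se = math.sqrt(var_focal + 2.0 * m * cov_fi + m * m * var_inter)
    return effect, se


def jn_boundaries(b_focal, b_inter, var_focal, cov_fi, var_inter, t_crit):
    """Moderator values at which the conditional effect is exactly at threshold.

    Solves (b_focal + b_inter*m)^2 = t_crit^2 * Var(b_focal + b_inter*m) for m.
    Returns a sorted list with zero, one, or two boundaries.
    """
    t2 = t_crit ** 2
    a = b_inter ** 2 - t2 * var_inter
    b = 2.0 * (b_focal * b_inter - t2 * cov_fi)
    c = b_focal ** 2 - t2 * var_focal
    if a == 0:
        return [] if b == 0 else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])
```

Evaluating `conditional_effect` on a grid of moderator values and adding ± t_crit·SE around the effect yields the confidence bands; the moderator ranges where the band excludes zero are the regions of significance.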

# RESULTS

### Behavioral Data and BrainAGE Estimation

Participants performed well across all task conditions. Ceiling effects were present in the 0-back and 1-back conditions. The 2-back condition showed high variability in performance (**Table 1**). BrainAGE and 2-back performance as well as their residuals were compatible with a normal distribution, as confirmed by Kolmogorov–Smirnov tests (statistical threshold p < 0.05), and hence fulfilled the prerequisites of multiple regression. CA, BrainAGE, and task performance in all conditions did not differ between ε4 carriers and non-carriers (**Table 1**). BrainAGE was not significantly correlated with CA (Pearson's r = −0.09, p = 0.60) or with task performance (r = 0.01, p = 0.95). Of the 34 participants, 14 obtained a positive BrainAGE index pointing to accelerated atrophy; 20 participants obtained a negative BrainAGE index and thus were estimated as younger than their CA (**Figure 2**, Supplement 1). Thus, there was a trend toward younger-appearing brains in the sample (Supplement 1). This reflects the neuropsychological characterization as cognitively intact older adults and potentially the high level of education (**Table 1**). Further analyses of task performance data can be found in a recent study of the same sample (Scheller et al., 2017).

TABLE 1 | Demographics, neuropsychological, and behavioral n-back data.

CA, chronological age; MOCA, Montreal Cognitive Assessment; \*<sup>p</sup> represents the p-values after the assessment of differences between ε4 and non-ε4 carriers.

FIGURE 2 | (A) Boxplot of CA and estimated BrainAGE across the sample. The BrainAGE score was added to CA for immediate comparability. The line within the boxes represents the median, the outer lines of the boxes depict the first and the third quartile, respectively. The error bars reach from the first and third quartile to the respective extrema. Note that estimated BrainAGE shows higher variability than CA as well as a trend toward lower values. (B) Scatterplot of CA and estimated BrainAGE. The angle bisector represents the line where CA equals estimated BrainAGE. Please note that the data point that is situated directly on the angle bisector belongs to a participant with a positive BrainAGE of 0.09.

### Imaging Data: Multiple Regression

### Task Performance as Focal Predictor, BrainAGE vs. CA as a Moderator

The main effect of task for the 2-back condition represented the well-known frontoparietal WM network (Owen et al., 2005). For an overview of MNI coordinates and images, please see Scheller et al. (2017).

We did not identify significant interaction effects of BrainAGE and task performance on activation in PFC. Hence, BrainAGE cannot be considered a moderator of the relationship between performance and brain activation such that individuals with higher BrainAGE need to recruit additional frontal regions to maintain performance. Similarly, CA did not moderate the relationship between performance and brain activation significantly, as shown in previous work. Thus, compensatory activation could not be detected.

### Task Performance as Focal Predictor, BrainAGE vs. CA and APOE Variant as Moderators

After inspecting the above-described two-way interactions of performance and BrainAGE, we tested the three-way interaction of performance, BrainAGE, and APOE variant (ε4 vs. non-ε4) to determine if the addition of genetic burden as a moderator would reveal compensatory effects. Indeed, we found a significant three-way interaction in right inferior frontal gyrus pars orbitalis at p < 0.05 FWE corrected (rIFG; MNI x = 26 y = 24 z = −18; T = 6.91; cluster extent k = 50; R<sup>2</sup> = 0.82; p < 0.001). The R<sup>2</sup> increase due to the inclusion of the three-way interaction of task performance, BrainAGE, and APOE variant (R<sup>2</sup> change) was 0.29, p < 0.001. The effect size Cohen's f<sup>2</sup> of 1.61 was large according to the guideline defining f<sup>2</sup> ≥ 0.02, f<sup>2</sup> ≥ 0.15, and f<sup>2</sup> ≥ 0.35 as small, medium, and large effects, respectively (Cohen, 1988). Individuals carrying the ε4 allele recruited this region to a greater extent with increasing BrainAGE, as can be derived from the dark-gray regions of significance within the Johnson–Neyman confidence bands (**Figure 3B**). The positive region of significance starts at positive BrainAGE values, which can be seen on the x-axis (**Figure 3B**, right column); hence, the effect is present in individuals with increased atrophy. Moreover, the slope of the line for ε4 carriers is positive and the region of significance is situated above the x-axis, signifying an increase of brain activation with better performance. Altogether, ε4 carriers showed activation compatible with successful neuronal compensation according to the criteria by Cabeza and Dennis (2013), while this effect was not present in non-ε4 carriers, as can be derived from missing regions of significance (**Figure 3B**, left column).
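
The reported effect sizes are numerically consistent with the common definition of Cohen's f<sup>2</sup> for an R<sup>2</sup> increase, f<sup>2</sup> = R<sup>2</sup> change / (1 − R<sup>2</sup> of the full model). Whether the authors computed it exactly this way is an assumption; the sketch below, with hypothetical function names, simply checks the arithmetic against the values reported in this section.

```python
def cohens_f2_change(r2_full, r2_change):
    """Cohen's f^2 for the variance added by a set of regressors."""
    return r2_change / (1.0 - r2_full)


def effect_size_label(f2):
    """Conventional labels following the thresholds cited from Cohen (1988)."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"


# (R^2 of full model, R^2 change) as reported for each peak in this section
reported = {
    "rIFG": (0.82, 0.29),  # 0.29 / 0.18 -> about 1.61
    "lIFG": (0.59, 0.25),  # 0.25 / 0.41 -> about 0.61
    "lMFG": (0.72, 0.25),  # 0.25 / 0.28 -> about 0.89
    "rMFG": (0.59, 0.22),  # 0.22 / 0.41 -> about 0.54
}
f2_values = {roi: round(cohens_f2_change(*vals), 2) for roi, vals in reported.items()}
```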

We further investigated the activation peak in rIFG with the PROCESS macro for SPSS. To be able to discern the incremental variance explained by using BrainAGE instead of CA as a moderator, we implemented CA as an additional nuisance variable. BrainAGE as a predictor significantly contributed to the model [unstandardized regression coefficient b = −0.01, standard error (SE) = 0.02, t = −2.12, p = 0.05], which was not the case for CA (b = 0.002, SE = 0.005, t = 0.48, p = 0.64). The three-way interaction of task performance, BrainAGE, and APOE variant proved highly significant as already derived from the analysis with the IFX toolbox (b = 0.70, SE = 0.11, t = 6.19, p < 0.0001). It is possible to report so-called simple effects in multiple regression allowing for interaction effects (see Cohen et al., 2003; Hayes, 2013). In short, simple effects are comparable to main effects in principle, but are conditional on the centering (here: mean-centering) of the remaining predictors in the model (Cohen et al., 2003). Simple effects change significantly as a function of the moderator if there is a significant interaction concerning the predictor of interest. We assessed simple effects for all predictors and found that there was a highly significant simple effect of APOE allele status in rIFG (t = −6.26, p < 0.001), and, as already pointed out above, a significant simple effect of BrainAGE (t = −2.12, p = 0.05), but not a significant effect of accuracy. This underlines that the interaction is driven by APOE allele status and is significant only at positive BrainAGE. An overview of simple and interaction effects in our multiple regression model can be found in Supplement 2. Taken together, performance and rIFG activation were positively related at positive BrainAGE scores in ε4 carriers and there was no such relationship in non-ε4 carriers at any BrainAGE (**Figure 3B**). With the help of BrainAGE, we were able to explain a greater amount of variance compared to the implementation of CA alone.

FIGURE 3 | [...] gyrus. Z-coordinates of the respective peak are depicted above the slices. (B) Johnson–Neyman (JN) confidence bands depicting the relationship of task performance and brain activation as a conditional effect across the whole range of the first moderator variable BrainAGE. To visualize the three-way interaction, JN bands are depicted separately for the second moderator APOE variant (ε4 vs. non-ε4).

For exploratory purposes, we tested the same interaction with an uncorrected threshold (p < 0.001). Interestingly, we found a similar interaction in the contralateral homologous area to the above-reported peak in rIFG (lIFG pars orbitalis; MNI x = −28 y = 27 z = −12; t = 4.09; k = 63; R<sup>2</sup> = 0.59, p = 0.01; R<sup>2</sup> change = 0.25, p < 0.001; Cohen's f<sup>2</sup> = 0.61) as well as in left and right middle frontal gyrus (lMFG; MNI x = −34 y = 3 z = 55; t = 5.05; k = 57; R<sup>2</sup> = 0.72, p < 0.001; R<sup>2</sup> change = 0.25, p < 0.001; Cohen's f<sup>2</sup> = 0.89; rMFG; MNI x = 38 y = 9 z = 40; t = 3.88; k = 28; R<sup>2</sup> = 0.59, p = 0.01; R<sup>2</sup> change = 0.22, p = 0.002; Cohen's f<sup>2</sup> = 0.54). In all three areas, the activation pattern was similar to that of the peak in rIFG, with APOE ε4 carriers exhibiting increased recruitment with higher BrainAGE together with better performance, and with significance regions beginning around the mean BrainAGE of −0.46 years (**Figure 3**). Hence, the effect is significant predominantly in individuals with positive BrainAGE. Concerning non-ε4 carriers, we identified a negative region of significance in lIFG (**Figure 3**). Hence, individuals without genetic burden recruited the area less with increasing BrainAGE together with better task performance. The additional areas, although not significant at a corrected threshold when analyzed with SPM, proved to be significant when further tested with the PROCESS macro including CA as a nuisance variable. The three-way interaction of task performance, BrainAGE, and APOE variant remained highly significant in lIFG (b = 0.71, SE = 0.19, t = 3.78, p < 0.0001), lMFG (b = 1.53, SE = 0.34, t = 4.53, p < 0.0001), and rMFG (b = 0.87, SE = 0.25, t = 3.52, p < 0.0001), confirming the incremental benefit of the moderator variables.

The same three-way interaction was tested with CA instead of BrainAGE as a moderator in a previous study of the same sample. At an FWE corrected threshold (p < 0.05) as well as at an uncorrected threshold (p < 0.001), we did not find a moderating effect of CA and APOE on the relationship between performance and prefrontal activation. A voxel-based morphometry (VBM) analysis of the same sample confirmed that there were no differences in gray matter volume between APOE ε4 and non-ε4 carriers (Scheller et al., 2017), so that the identified differential activation is not biased by structural abnormalities.

# DISCUSSION

The aim of the current study was to investigate potential compensatory recruitment in healthy older individuals with a compensation model comprising the components task performance, brain activation, and two measures related to pathological burden in aging, BrainAGE and APOE variant. First, individuals with higher BrainAGE did not require additional neuronal resources to perform a cognitively demanding WM task successfully; hence, compensatory brain activation was not detected when BrainAGE alone moderated the relationship between performance and PFC activation. Second, the three-way interaction of performance, BrainAGE, and APOE variant was examined. Here we found increased activation in bilateral inferior frontal as well as bilateral middle frontal gyrus at higher BrainAGE with better performance, fulfilling one clear-cut criterion of successful compensation (Cabeza and Dennis, 2013). The interaction was driven by ε4 carriers; thus, with the combination of a positive BrainAGE and the unfavorable ε4 allele, potential compensatory frontal recruitment in our sample of healthy older adults became apparent. Moreover, the effect remained significant after the variance explained by CA was partialed out.

## Task Performance as Focal Predictor, BrainAGE vs. CA as a Moderator

When testing the moderating effects of BrainAGE on the relationship between task performance and prefrontal activation, we did not find significant interaction effects and thus no indication of compensation. Fourteen of thirty-four study participants had an estimated higher BrainAGE compared to their CA. Hence, the majority of our sample exhibited younger-appearing brains, which might be the reason why we did not identify compensatory areas when BrainAGE alone was implemented as a moderator. Younger-appearing brains are associated with more years of education (Steffener et al., 2016), which is reflected in our highly educated sample drawn from the older population of a university town. The result also conforms to previously obtained findings with CA instead of BrainAGE as a moderator, as we did not find additional activation in PFC with higher CA in recent work (Scheller et al., 2017). Compensatory patterns might surface in individuals with higher CA/BrainAGE compared to those investigated here, but our sample was presumably too high-functioning due to the abovementioned selection bias to detect compensatory recruitment with proxies of biological age alone.

## Task Performance as Focal Predictor, BrainAGE vs. CA and APOE Variant as Moderators

After combining BrainAGE with APOE variant as moderator variables to yield a more detailed characterization of pathological burden, we identified increased prefrontal recruitment compatible with successful neuronal compensation. Specifically, the three-way interaction of task performance, BrainAGE, and APOE variant was significant in rIFG and, as revealed by exploratory analyses, also in lIFG and bilateral MFG, with large effect sizes. The increased recruitment of these areas along with better task performance confirms one criterion for successful compensation (Cabeza and Dennis, 2013), such that better performance is associated with higher activation in doubly burdened individuals with the ε4 allele and a positive BrainAGE. A second criterion for successful compensation as stated by Cabeza and Dennis, namely the disruption or enhancement of this positive relationship between task performance and brain activation, is not included in the current study. CA did not affect these interaction effects, as we controlled its influence by implementing CA as a nuisance variable. The results further corroborate previously reported findings on compensatory recruitment in prefrontal cortex. For instance, medial PFC was found to be increasingly activated in ε4 carriers before (Filbey et al., 2010) as well as ventromedial PFC with slightly different fMRI tasks (Wishart et al., 2006). Recent work suggests that the ability to modulate MFG activation (among others) to increasing levels of difficulty in an n-back task is associated with successful cognitive aging (Kennedy et al., 2017).

The negative effect in lIFG observed in non-ε4 carriers might be a sign of processing efficiency, i.e., they perform best when using a concise task-related network without additional areas, as their need for compensation might still be small due to lack of genetic burden (Goh and Park, 2009; Reuter-Lorenz and Park, 2014). Activating additional areas might therefore be a sign of dedifferentiation in non-ε4 carriers, which could explain the negative association with task performance (Goh, 2011). Association of lIFG activation with both successful and unsuccessful compensation depending on genetic burden underlines the importance of distinguishing APOE variants in future studies. The absence of effects compatible with potential compensation in non-ε4 carriers at our chosen statistical threshold does not signify that these individuals do not compensate or are not able to compensate. Based on previous work on a large multicentric sample (Klöppel et al., 2015), we assume that compensation is highly variable across individuals; hence, group-level statistics might not be able to capture such effects. As ε4 carriers and non-ε4 carriers performed equally well in the WM task, we conclude that ε4 carriers needed to recruit additional neuronal resources to reach the same level of accuracy as non-ε4 carriers.

Our compensation-related findings can also be viewed as an elaboration of previous results of the same sample. Prior to the availability of the sample's BrainAGE coefficients, we had already identified bilateral areas on the margin of IFG and insula as compensatory areas in APOE ε4 carriers (Scheller et al., 2017). With the help of BrainAGE, we were able to better stratify these findings by revealing a further association of IFG recruitment with BrainAGE: Not only do APOE ε4 carriers show a need for compensation, but this need is especially pronounced in APOE ε4 carriers with older-appearing brains, i.e., individuals with maximal pathological burden. Recent work suggests that APOE ε4 is associated with a different lifespan trajectory regarding the modulation of brain activation under cognitive load (Foster et al., 2017). This corroborates our findings of APOE variant as a strong moderator with high impact on neuronal recruitment.

DLPFC is part of the well-established WM network (Owen et al., 2005; Rottschy et al., 2012) and frontal areas show compensatory activation across several cognitive domains (Cabeza, 2002; Davis et al., 2008; Cabeza and Dennis, 2013), hence we restricted our analysis to frontal cortex. In addition, compensation is assumed to take place in usually task-relevant areas and—when neuronal resources decline—in additional areas often close to the established task network (Reuter-Lorenz and Park, 2014). Thus, we can assume that increased activation in inferior as well as middle parts could have buffered beginning deficits in DLPFC.

Why we observe a significant interaction of task performance, BrainAGE, and APOE and not of task performance, CA, and APOE cannot be determined unambiguously, as our sample size was not sufficient for more complex statistical models, i.e., a four-way interaction of task performance, CA, BrainAGE, and APOE variant. A larger sample or a replication sample would be desirable to strengthen our findings. Still, taking brain structure into account when approximating pathological burden helped to obtain a more fine-grained picture of compensatory recruitment. Consequently, we would argue that our ε4 carriers showed compensation in PFC, but that potential compensatory recruitment could only be revealed when taking into account a combination of risk factors, i.e., the most accurate approximation of pathological burden available. Moreover, our cross-sectional design only captures inter-individual variability. To follow individuals' compensation trajectories, i.e., the initialization and further development of compensatory recruitment, longitudinal investigations are needed (Nyberg et al., 2010; Gregory et al., 2017). Finally, to strengthen the reported findings, future work will need to show that a second criterion of successful compensation (Cabeza and Dennis, 2013) is also fulfilled, namely the disruption or enhancement of the identified positive relationship between task performance and brain activation by, e.g., NIBS procedures.

# CONCLUSION

BrainAGE together with APOE variant has proved a helpful proxy of pathological burden to be implemented in models of neuronal compensation. The suggested combination of structural and functional imaging as well as genetic data translating theoretical frameworks to statistical models of compensation should be transferred to other cognitive domains as well as further samples of healthy older individuals and patients with beginning neurodegenerative disease. As proposed in a review of studies on compensation (Scheller et al., 2014), previous results could be revisited with the same model of compensation as suggested here. Structural imaging and herewith the opportunity to compute BrainAGE is easily available in functional imaging studies and thus key compensatory regions of specific cognitive functions could be identified, further characterized and potentially amplified by non-invasive brain stimulation combined with cognitive training programs.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the ethics committee of the Albert Ludwigs University Freiburg with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the ethics committee of the Albert Ludwigs University Freiburg.

# AUTHOR CONTRIBUTIONS

ES, JP, LS, CK, and SK: designed the study; JP, JL, and LS: acquired data; ES, CK, and CG: analyzed data; ES, JW, and SK: interpreted data for the study; JW and ES: created figures; ES and SK: drafted the manuscript and all authors revised it critically for important intellectual content. All authors gave their final approval of the version to be published and agree to be accountable for all aspects of the work.

## FUNDING

The study was partly supported by the BrainLinks-BrainTools Cluster of Excellence (German Research Foundation grant no. EXC 1086).

### REFERENCES


### ACKNOWLEDGMENTS

The authors wish to thank Hansjoerg Mast and Verena Landerer for their help in fMRI data acquisition.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnagi.2018.00074/full#supplementary-material

apolipoprotein E-ε4 healthy adults. Brain Imaging Behav. 4, 177–188. doi: 10.1007/s11682-010-9097-9


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Scheller, Schumacher, Peter, Lahr, Wehrle, Kaller, Gaser and Klöppel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Aging Brain With HIV Infection: Effects of Alcoholism or Hepatitis C Comorbidity

Natalie M. Zahr 1,2 \*

*<sup>1</sup> Neuroscience Program, SRI International, Menlo Park, CA, United States, <sup>2</sup> Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford University, Stanford, CA, United States*

As successfully treated Human Immunodeficiency Virus (HIV)-infected individuals age, the cognitive and health challenges of normal aging ensue, burdened by HIV, treatment side effects, and high-prevalence comorbidities, notably Alcohol Use Disorders (AUD) and Hepatitis C virus (HCV) infection. In 2013, people over 55 years old accounted for 26% of the estimated number of people living with HIV (∼1.2 million). The aging brain is increasingly vulnerable to endogenous and exogenous insult which, coupled with HIV infection and comorbid risk factors, can lead to additive or synergistic effects on cognitive and motor function. This paper reviews the literature on neuropsychological and *in vivo* Magnetic Resonance Imaging (MRI) evaluation of the aging HIV brain, while also considering the effects of comorbidity for AUD and HCV.

Keywords: alcohol use disorder, alcoholism, hepatitis C, magnetic resonance imaging, magnetic resonance spectroscopy, diffusion tensor imaging, neuropsychological tests

### Edited by:

*Aurel Popa-Wagner, Department of Neurology, University Hospital Essen, Germany*

### Reviewed by:

*James H. Cole, King's College London, United Kingdom Valerie Cardenas, Neurobehavioral Research, United States*

### \*Correspondence:

*Natalie M. Zahr nzahr@stanford.edu*

Received: *31 July 2017* Accepted: *20 February 2018* Published: *22 March 2018*

### Citation:

*Zahr NM (2018) The Aging Brain With HIV Infection: Effects of Alcoholism or Hepatitis C Comorbidity. Front. Aging Neurosci. 10:56. doi: 10.3389/fnagi.2018.00056*

# INTRODUCTION

The concept and benefits of combining multiple drugs for treatment of Human Immunodeficiency Virus (HIV) infection were introduced in 1996 (Gulick et al., 1997; Hammer et al., 1997). Polydrug therapies, referred to as highly active Antiretroviral Therapy (HAART) or, equivalently, combination Antiretroviral Therapy (cART), were quickly incorporated into clinical practice, resulting in significantly reduced rates of hospitalizations, Acquired Immune Deficiency Syndrome (AIDS), and death (Moore and Chaisson, 1999). Because highly effective combination regimens have since been the default in ART, and because newer one-pill options make use of the word "combination" obsolete, there has been a recent trend in referring to HIV treatments as ART instead of HAART or cART (Myhre and Sifris, 2017). Despite the effectiveness of ART in reducing HIV viral load and improving immune function, HIV infection continues to have major untoward public health and clinical consequences (Powderly, 2002).

Each year in the United States (US), 55,000–60,000 new infections are reported, with an estimated total of ∼1.2 million infected individuals. In 2013, people ≥50 years old accounted for 17–26% (or up to 312,000 individuals) of the HIV population (Center for Disease Control and Prevention, 2013). Older individuals are more likely to be diagnosed later in the course of the disease; indeed, 40% of people ≥55 are diagnosed with AIDS at the time of HIV diagnosis (Lindau et al., 2007; Brooks et al., 2012; Center for Disease Control and Prevention, 2015, 2016a,b). As individuals infected with HIV live longer (e.g., Thompson and Jahanshad, 2015), they are likely to accrue central nervous system (CNS) risk from factors such as substance use disorders (e.g., alcoholism), comorbid infections [e.g., hepatitis C virus (HCV)], and medical conditions associated with ART treatment (Woods et al., 2004).

The considerable comorbidity of HIV infection and alcoholism (Cook et al., 2001; Miguez et al., 2003; Samet et al., 2004, 2007; Conigliaro et al., 2006; Fuller et al., 2009; Bonacini, 2011) poses a greater public health burden than either condition alone. Individuals who drink heavily or have been diagnosed with DSM-IV alcohol abuse/dependence or DSM-5 alcohol use disorder (AUD) are more likely to engage in risky sexual behaviors, delay testing for HIV, and postpone treatment (Fritz et al., 2010; Howe et al., 2011). Conversely, AUD may make it difficult for infected patients to follow the complex medication regimen prescribed to treat HIV or interfere with basic mechanisms of pharmacological treatment. HCV infects ∼25% of HIV-infected people in the US (Center for Disease Control and Prevention, 2011). HIV patients co-infected with HCV, who are also likely to drink heavily (>50 g alcohol/day), have higher mortality rates than low or moderate drinkers (Bonacini, 2011).

Cross-sectional studies have been instrumental in identifying brain regions and systems affected in HIV infection, but are limited to speculation about the potential interaction of these effects with aging and variables that change with disease progression or mitigation (e.g., Ances et al., 2012). Inconsistency in findings may be, at least in part, attributable to the cross-sectional examination of a dynamic disease. Indeed, any conclusion determining whether aging interacts with and exacerbates the untoward effects of HIV infection, or alternatively, whether disease progression is a greater contributor than age to decline, requires longitudinal study of the relevant variables in HIV-infected groups (e.g., Holt et al., 2012; Spudich and Ances, 2012).

In longitudinal modeling of the interactions of aging and HIV, two potential trajectories are often considered: premature (additive) or accelerated (synergistic) aging. Infection may facilitate processes compromised by older age, resulting either in premature aging, during which changes occur earlier than but in parallel to normal aging, or in accelerated aging, wherein changes occur at a faster rate than in normal aging (**Figure 1**). Results may also depend on the metric evaluated (e.g., neuropsychological performance vs. brain volumes).
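The distinction between premature and accelerated aging can be made concrete with a simple linear sketch. The example below is illustrative only: the linear form, the function names, and all coefficient values are assumptions chosen for clarity and are not taken from any study cited here. Premature aging is represented as a constant group offset (parallel trajectories), whereas accelerated aging is represented as an age-by-group interaction (diverging trajectories).

```python
def predicted_value(age, hiv, intercept=100.0, age_slope=-0.5,
                    group_offset=0.0, interaction=0.0):
    """Hypothetical linear model: y = b0 + b1*age + b2*hiv + b3*(age*hiv).

    hiv is 0 for controls and 1 for the HIV group. All coefficients are
    illustrative assumptions, not estimates from the literature.
    """
    return (intercept + age_slope * age
            + group_offset * hiv + interaction * age * hiv)


def group_gap(age, **params):
    """Difference between control and HIV predictions at a given age."""
    return predicted_value(age, 0, **params) - predicted_value(age, 1, **params)


# Premature (additive) aging: offset only, no age-by-group interaction.
premature = dict(group_offset=-5.0, interaction=0.0)
# Accelerated (synergistic) aging: the group difference grows with age.
accelerated = dict(group_offset=0.0, interaction=-0.25)

ages = (40, 60, 80)
gap_premature = [group_gap(a, **premature) for a in ages]
gap_accelerated = [group_gap(a, **accelerated) for a in ages]
```

Under these assumptions, the group gap is the same at every age in the premature case and widens with age in the accelerated case, which is why a significant age-by-diagnosis interaction is typically taken as evidence for accelerated rather than premature aging.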

In the following, the literature on brain structure and function in HIV and relevant comorbidities (i.e., AUD, HCV) is reviewed, with a focus on longitudinal studies to help clarify the independent or interactive effects of older age. **Table 1** provides a list of references used herein, concentrating on manuscripts published after 2007, for HIV and each comorbid condition, also indicating cross-sectional or longitudinal studies. **Table 2** summarizes key findings highlighted in this review.

### MEDICAL AND PSYCHIATRIC EFFECTS OF HIV AND COMORBIDITIES

Age-related medical conditions (e.g., diabetes, hypertension, coronary artery disease, stroke, Alzheimer's disease) are not usually observed in the general population until over age 60; in HIV-infected patients, such conditions may present at middle age or sooner (Guaraldi et al., 2014). HIV infection is also associated with frailty, the likelihood of which increases with age (Desquilbet et al., 2007). Accelerated aging in HIV may put affected individuals at increased risk for non-HIV-associated cancers (Nasi et al., 2014) and dementias (Verma and Anand, 2014; Sheppard et al., 2015).

HIV-infected patients self-report feelings of apathy, lethargy, and depression (Hardy and Vance, 2009; Robertson et al., 2009; Lane et al., 2012; Zayyad and Spudich, 2015). Indeed, aging with HIV may lead to higher rates of psychiatric comorbidities (e.g., major depression, bipolar disorder, anxiety; Valcour et al., 2004; Effros et al., 2008; Leserman, 2008; Havlik et al., 2011). Medical or psychiatric comorbidities in HIV complicate access to care, interfere with self-management, and often necessitate a greater reliance on caregivers.

Because healthy aging results in global increases in immune activation and immune senescence (Schuitemaker et al., 2012), it is thought that a canonically dysregulated immune system (e.g., altered T cell production) can hasten medical or psychiatric disease (Önen and Overton, 2011), thereby contributing to premature or accelerated aging in HIV (Watkins and Treisman, 2012; Zapata and Shaw, 2014).

Medical conditions associated with AUD include liver, lung, and cardiac disease (Simet and Sisson, 2015). AUD-related liver disease has a negative effect on the progression of HIV infection (Petry, 1999; Braithwaite et al., 2007; Soboka et al., 2014; Tran et al., 2014). HIV-infected patients who drink heavily are furthermore at increased risk for cardiovascular disease (Kelso et al., 2015), certain types of cancer (McGinnis et al., 2006), and diabetes (Butt et al., 2009; Wakabayashi, 2014). AUD independently presents with depression and reduced quality of life (Sassoon et al., 2012); alcoholism in HIV likely has an additive effect on depression (Sullivan L. E. et al., 2011), stress, and anxiety (Pence et al., 2008).





TABLE 2 | Summary of findings from manuscripts listed in Table 1.


HCV liver damage progresses more rapidly in HIV and may accelerate the course and impair the management of HIV (Luetkemeyer et al., 2006; Weber et al., 2006; Chamie et al., 2007; Kim and Chung, 2009; Soriano et al., 2010). In addition, individuals seropositive for HCV have co-occurring insulin resistance beyond what might be predicted by chance (Harrison, 2008). HCV patients frequently report fatigue, lassitude, depression, and poor quality of life (Hilsabeck et al., 2003; Adinolfi et al., 2015). Emerging evidence supports an additive role of HCV and HIV on depression (Ramasubbu et al., 2012), which can negatively impact medical outcomes (Šprah et al., 2017).

### NEUROPSYCHOLOGICAL AND MOTOR EFFECTS OF HIV AND COMORBIDITIES

HIV-associated neurocognitive disorder (HAND) is ideally assessed using comprehensive neuropsychological batteries and interpreted using demographically appropriate normative data (Antinori et al., 2007). Assessment of HAND allows for grading of functional impairment (Marder et al., 2003; Sacktor et al., 2016), from asymptomatic neurocognitive compromise to HIV-associated dementia (HAD) (Day et al., 1992; Maj et al., 1994; Robertson et al., 2011; Nakazato et al., 2014). The prevalence of HAD on the severe end of the spectrum has declined with ART (Gates and Cysique, 2016). Mild to moderate cognitive deficits in HIV, by contrast, remain an issue (Vivithanaporn et al., 2010; Manji et al., 2013; Underwood et al., 2017b). Despite heterogeneity (Dawes et al., 2008; Vassallo et al., 2015; Joseph et al., 2016), neuropsychological assessments of treatment-stabilized HIV patients often report compromise in domains of attention, psychomotor speed, memory, and executive control (Hinkin et al., 1999; Martin et al., 2003; Becker et al., 2015). Visuospatial abilities are relatively spared (Cysique et al., 2006), but may be sensitive to age-HIV interactions (Foley et al., 2013). Persistent cognitive impairments post-ART have been attributed to a variety of factors (e.g., immunological, genetic, psychosocial) (e.g., Arentoft et al., 2015; Thaler et al., 2015; Hobkirk et al., 2017), including ART, in particular efavirenz (Ciccarelli et al., 2011; Romão et al., 2011; Funes et al., 2014; Ma et al., 2016), advancing age (e.g., Morgan et al., 2011; Brew and Chan, 2014; Jacks et al., 2015; Jiang et al., 2016; Gomez et al., 2017), and comorbidity for substance use (Rosenbloom et al., 2010; Sassoon et al., 2012; Míguez-Burbano et al., 2014) or HCV infection (Devlin et al., 2012).

Motor symptoms described in the treated HIV population include slowing, clumsiness, poor balance, and loss of fine motor control (Fama et al., 2007; Robertson et al., 2007; Sullivan E. V. et al., 2011; Bernard et al., 2013; Wilson et al., 2013; Prakash et al., 2016). Peripheral neuropathy, a persisting and prevalent (15–40%, Newton, 1995; Evans et al., 2008) HIV-associated disturbance in the post-ART era (Geraci and Simpson, 2001; Robertson et al., 2011; Kranick and Nath, 2012; Gabbai et al., 2013), is also associated with older age (Saylor et al., 2017) and ART (Dragovic and Jevtovic, 2003; Venhoff et al., 2010; Birbal et al., 2016; Weldegebreal et al., 2016; Adoukonou et al., 2017; Benevides et al., 2017) and likely contributes to impaired motor control.

Indeed, toxicity of ART goes beyond originally reported side effects of medications. An unexpected relationship between high current CD4 and deterioration of clinical status is an active area of investigation (e.g., Jernigan et al., 2011; Nasi et al., 2017) and a growing concern for the aging HIV population (Manji et al., 2013; Zaffiri et al., 2013). This condition, referred to as immune reconstitution inflammatory syndrome (IRIS), applies to HIV patients who experience worsening symptoms as a result of ART-mediated immune restoration (Venkataramana et al., 2006; Johnson and Nath, 2010). The effects of IRIS on brain structure may not be visible with conventional MRI (Narvid et al., 2016), but may be detectable with quantitative diffusion tensor imaging (DTI) (Zhu et al., 2013), which focuses on the integrity of white matter microstructure.

To account for variability seen in neuropsychological performance in cross-sectional studies (Schretlen et al., 2003), it has been posited that an increase over time (6-month interval) in intra-individual variability (or dispersion) in neurocognitive performance contributes to poorer antiretroviral medication adherence, which in turn can lead to additional neurocognitive impairments, precipitating a deteriorating cycle (Thaler et al., 2015). The Multicenter AIDS Cohort Study (MACS) enrolled a total of 6972 men from sites in Baltimore, Washington, Chicago, Los Angeles, and Pittsburgh at three separate time points: in 1984–1985, 1987–1991, and 2001–2003. Neuropsychological evaluation included measures from multiple domains. A data-driven Mixed Membership Trajectory Model technique was used to investigate potential trajectories of cognitive impairment. The findings suggest three distinct trajectories: "normal aging" was defined as a low probability of mild impairment until age 60; "premature aging" was defined as mild impairment starting at age 45–50 (i.e., "premature aging" relative to "normal aging" was offset to the left by 25+ years); "unhealthy aging" was defined as mild impairment at ages 20–39. Clinically defined AIDS, HCV infection, depression, and race affected an individual's trajectory classification (Molsberry et al., 2015). Our work comports with the results of the MACS study, showing that cognitive performance slope differences between control and HIV groups can be modeled as premature aging, in that differences between the patients and the controls occur without interactions with aging (Pfefferbaum et al., 2014).

Studies focused on neuropsychological performance in AUD show impairments in memory, psychomotor speed, and executive functioning; problems in visuospatial and emotional regulation domains appear to be unique to AUD (Chanraud et al., 2007; Fama et al., 2009; Oscar-Berman et al., 2014; Wilcox et al., 2014; Le Berre et al., 2017). Motor effects of AUD include compromise of upper limb motor abilities, gait, and balance (Sullivan et al., 2000b,c, 2002; Vassar and Rose, 2014). Peripheral neuropathy reported in AUD has been related to nutritional deficiencies (Chopra and Tiwari, 2012; Noble and Weimer, 2014).

Substance abuse can independently contribute to neuropsychological impairments in HIV (e.g., Gomez et al., 2017). In a recent study, 52% of HIV-positive patients showed cognitive deficits, often related to high alcohol consumption (McNamara et al., 2017). In studies aimed at discerning the independent effects of HIV and AUD (e.g., Fama et al., 2012), impairments in planning and free recall of visuospatial material marked AUD, whereas impairments in psychomotor speed, sequencing, narrative free recall, and pattern recognition marked HIV. Our work demonstrates that tests of executive function, episodic memory, and processing efficiency (expressed as age- and education-corrected composite Z-scores) show a graded effect, with HIV+AUD performing worse than controls on executive function and episodic memory and worse than AUD alone or HIV alone on episodic memory (Fama et al., 2016); in HIV+AUD, age was a unique predictor of poor episodic memory (**Figure 2**). Our work comports with the literature that comorbidity for HIV+AUD results in compounding effects (Rothlind et al., 2005; Fama et al., 2014) on declarative memory (Fama et al., 2009, 2016), remote memory (Fama et al., 2011), selective attention and conflict processing (Schulte et al., 2005), psychomotor speed (Sassoon et al., 2007), gait and balance (Fama et al., 2007), and quality of life (Rosenbloom et al., 2007).
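As an illustration of how age- and education-corrected composite Z-scores of the kind mentioned above might be derived, the following sketch standardizes each raw test score against a value predicted from a normative regression and then averages the resulting Z-scores within a domain. This is one common approach, not necessarily the procedure used in the cited studies; the test names, regression coefficients, and participant values are hypothetical.

```python
from statistics import mean

# Hypothetical normative regressions (assumed to be fit in healthy controls):
# expected score = intercept + age * age_coef + education * edu_coef
NORMS = {
    "test_a": {"intercept": 50.0, "age": -0.25, "education": 0.5, "resid_sd": 5.0},
    "test_b": {"intercept": 30.0, "age": -0.125, "education": 1.0, "resid_sd": 4.0},
}


def corrected_z(raw, age, education, norm):
    """Z-score of a raw score relative to the age/education-expected score."""
    expected = (norm["intercept"] + norm["age"] * age
                + norm["education"] * education)
    return (raw - expected) / norm["resid_sd"]


def composite_z(raw_scores, age, education, norms):
    """Average the corrected Z-scores of all tests belonging to one domain."""
    z_values = [corrected_z(raw_scores[name], age, education, norms[name])
                for name in raw_scores]
    return mean(z_values)


# Hypothetical participant: 48 years old with 12 years of education.
participant_scores = {"test_a": 39.0, "test_b": 34.0}
domain_score = composite_z(participant_scores, age=48, education=12, norms=NORMS)
```

Because each Z-score expresses performance relative to what is expected for a person of the same age and education, a residual age effect on the composite (such as that reported for episodic memory in HIV+AUD) reflects decline beyond that expected from normal aging.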

HCV-infected individuals experience cognitive decline even in the absence of cirrhosis-associated hepatic encephalopathy or other indices of liver damage (Karaivazoglou et al., 2007). Some groups have argued that cognitive deficits in HCV are due to interferon treatment (Asnis and Migdal, 2005; Capuron et al., 2005; Reichenberg et al., 2005), but cognitive deficits persist despite successful antiviral (interferon) therapy (Thein H. H. et al., 2007; Weissenborn et al., 2009; Cattie et al., 2014; Kuhn et al., 2017). Although the literature is heterogeneous and characterized by cross-sectional rather than longitudinal assessments of relatively small and select cohorts, neurocognitive deficits reported in HCV include compromised attention, memory, and psychomotor speed (Forton et al., 2002; Hilsabeck et al., 2002; Capuron et al., 2005; Iriana et al., 2017) with fewer reports of deficits in executive functioning (Córdoba et al., 2003; Weissenborn et al., 2004), fine-motor coordination (Vigil et al., 2008), and presence of peripheral neuropathy (Adinolfi et al., 2015; Mathew et al., 2016).

Studies reporting on the combined effects of HIV and HCV on neuropsychological performance suggest that the two viruses result in similar neurocognitive consequences (cf., Parsons et al., 2006; Thein H. et al., 2007; Martin-Thormeyer and Paul, 2009; Martin et al., 2015; Molsberry et al., 2015) with comorbidity associated with greater neurocognitive impairment than in either infection alone (Hilsabeck et al., 2003; von Giesen et al., 2004; Cherner et al., 2005; Letendre et al., 2005; Richardson et al., 2005; Sun et al., 2013; Caldwell et al., 2014; but see: Perry et al., 2005; Soogoor et al., 2006; Clifford et al., 2015), particularly on measures of memory (Hilsabeck et al., 2005; Hinkin et al., 2008), executive functioning (Ryan et al., 2004), and motor dexterity (Cherner et al., 2005).

In summary, available evidence suggests that neurocognitive performance in ART-treated HIV individuals shows premature aging. HIV, AUD, and HCV can independently impair neuropsychological functioning and appear to have additive effects on some domains of cognition, which in practice can have significant effects on key outcomes such as employment status (van Gorp et al., 1999; Heaton et al., 2004), medication adherence (Hinkin et al., 2004), and driving safety (Marcotte et al., 2006).

### IN VIVO NEUROIMAGING OF HIV AND COMORBIDITIES

### Macrostructural Magnetic Resonance Imaging (MRI)

In the ART era, clinical MRI scanning reveals relatively few gross intracranial abnormalities in HIV, particularly when neurological signs are absent (Nishijima et al., 2014). Although severe brain atrophy is uncommon in HIV stabilized by treatment, brain volume deficits can be detected with quantitative methods in select regions of the cortex, basal ganglia, and cerebellum (Aylward et al., 1993; Di Sclafani et al., 1997; Stout et al., 1998; Tagliati et al., 1998; Ragin et al., 2012; Kallianpur et al., 2013; Underwood et al., 2017a). Cortical areas with gray matter volume deficits in HIV with viral suppression, relative to healthy controls, include frontal, cingulate, sensorimotor, and parietal regions (Heaps et al., 2012; Li et al., 2014; Pfefferbaum et al., 2014; Clark et al., 2015; Janssen et al., 2015; Wang et al., 2016). Those without complete viral suppression exhibit greater volume deficits than virally-suppressed individuals (Cardenas et al., 2009; Kallianpur et al., 2013; Hines et al., 2016). The imaging literature typically reports the effects of HIV on gray matter volume (for exceptions, see Corrêa et al., 2016a; du Plessis et al., 2016; Castillo et al., 2017). In studies that assessed cortical thickness rather than cortical volume, HIV effects can be evident in areas such as the insula and temporal cortices (Kallianpur et al., 2012; Sanford et al., 2017).

Subcortical regions with significantly smaller volumes, particularly in older HIV subjects relative to healthy controls, include thalamus, hippocampus, caudate, putamen, and pallidum (Dewey et al., 2010; Li et al., 2013; Wade et al., 2015; du Plessis et al., 2016; Wright et al., 2016; Sanford et al., 2017). Brain tissue abnormalities have been reported to correlate with nadir CD4 cell counts (Thompson et al., 2005; Jernigan et al., 2011; Kallianpur et al., 2012; Hua et al., 2013). However, HIV individuals with an active lifestyle (energy use above resting expenditure) were found to have a larger putamen (Ortega et al., 2015), and longitudinal study reveals that increasing CD4 counts (notwithstanding IRIS) are associated with increases in subcortical gray matter volumes (Fennema-Notestine et al., 2013) and slower tissue volume declines (Pfefferbaum et al., 2014).

As described for compromised neuropsychological performance in HIV, brain volume deficits in the ART era may be associated with more traditional risk factors (e.g., age, education, diabetes) than with HIV-related variables (Bonnet et al., 2013; Lake et al., 2017; but see Ragin et al., 2011; Kallianpur et al., 2016). Although HIV and aging appear to contribute independently to heighten brain structural vulnerability (Ances et al., 2012), HIV may accelerate brain aging (Cysique et al., 2013; Cole et al., 2017). Consequently, despite persistent control of plasma viremia, older HIV-infected patients demonstrate more rapid progressive brain compromise when compared to healthy aging (Clifford et al., 2017).

The few published longitudinal volumetric MRI studies have been conducted over relatively brief intervals, typically 1–2 years. An initial study found a faster rate of cortical volume decline in mild (CDC stage A) and severe (CDC stage C) stages of HIV infection relative to changes observed in infection-free controls and faster rates of white matter volume decline in the HIV-infected subgroup with stage C than stage A severity level. Further, decline in caudate nucleus volume and increase in ventricular volume were greater in the HIV-infected group that progressed from a less severe to a more severe CDC stage across MRI sessions, and these changes in brain volumes correlated with decline in CD4 cell count (Stout et al., 1998). A 2-year longitudinal study indicated widespread white matter volume loss and posterior gray matter loss (parietal, occipital, and cerebellar) in virally-suppressed HIV individuals, depending on analysis approach; those without complete viral suppression exhibited accelerated volume loss in gray and white matter compared with declines measured in controls (Cardenas et al., 2009). Examination of HIV-infected individuals before and about 6 months after starting ART revealed improvement in neuropsychological test performance but no appreciable change in regional brain volumes (Ances et al., 2012). In this relatively small study, older age and HIV infection were independently related to smaller volumes of the caudate, with evidence for premature aging of the caudate in HIV-infected participants, while volumes of the amygdala and corpus callosum were sensitive to HIV but not aging.

We evaluated brains of 51 HIV and 65 controls from 351 longitudinal MRI scans and concurrent neuropsychological evaluation collected 2 or more times over 6 months to 8 years (Pfefferbaum et al., 2014). Although HIV individuals were in good general health and free of clinically detectable dementia, significant volume effects, where HIV-infected participants had greater volumes in CSF regions and smaller volumes in tissue regions than controls, were found in the Sylvian fissures, cingulum, insula, thalamus, and hippocampus. Significant slope effects, where the HIV-infected group showed greater change per year over the years of observation than the control group, were detected in the lateral ventricles, insula, and hippocampus. Greater acceleration in slope with advancing age in the HIV-infected individuals was found for frontal, temporal, and parietal cortices and thalamus (**Figure 3**). In this study, the most consistent and robust predictors of brain volume trajectories were CD4 count and duration of HIV infection (Pfefferbaum et al., 2014).

Effects of HIV and comorbid substance abuse on brain structure can depend on the substance and quantity consumed (e.g., Durazzo et al., 2007; Thames et al., 2017). In AUD, volume deficits are evident in brain regions including frontal cortex (Pfefferbaum et al., 1997; Cardenas et al., 2005, 2007), cerebellum (i.e., hemispheres; Sullivan et al., 2000a,c; De Bellis et al., 2005; Chanraud et al., 2007, 2009a; Boutte et al., 2012), pons (Pfefferbaum et al., 2002a; Sullivan, 2003; Chanraud et al., 2009b), mammillary bodies (Shear et al., 1996; Sullivan et al., 1999), hippocampus, thalamus (Sullivan, 2003; De Bellis et al., 2005; Chanraud et al., 2007; Pitel et al., 2012; van Holst et al., 2012), caudate (Boutte et al., 2012), putamen (Jernigan et al., 1991), amygdala (Fein et al., 2006), and nucleus accumbens (Sullivan et al., 2005). Those with both HIV infection and alcoholism show ventricular enlargement greater than in either condition alone (Rosenbloom et al., 2010). Quantitative analysis of MRI brain structural data from a cross-sectional study of four groups (controls, AUD, HIV, HIV+AUD) revealed regional volume deficits in all three patient groups: HIV alone had relatively few deficits, except in thalamus (Pfefferbaum et al., 2012), as has recently been replicated (Janssen et al., 2015); HIV+AUD showed moderate to severe abnormalities affecting multiple brain regions (e.g., frontal and temporal cortices, thalamus, corpus callosum, Sylvian fissure, 3rd ventricle); and HIV+AUD with an AIDS diagnosis had the most serious untoward effects on brain structure (Pfefferbaum et al., 2012).

In non-cirrhotic HCV patients relative to controls, a recent study suggests that cortical thickness is reduced in frontal and occipital cortices (Hjerrild et al., 2016; also see Iwasa et al., 2012). HIV + HCV co-infection has been associated with increased incidence of neurovascular disease (Jernigan et al., 2011; Ojaimi et al., 2014; but see Ramos-Casals et al., 2007) and compromised brain perfusion (Bladowska et al., 2014), but the effects of HIV and HCV co-infection on brain macrostructural integrity remain an area for further investigation. In summary, structural imaging suggests that HIV infection may lead to accelerated aging of the brain, which is compounded by AUD comorbidity, particularly in subcortical regions such as the thalamus. Additional work is required to determine whether non-cirrhotic HCV is associated with regional brain volume deficits and whether HIV+HCV coinfection has additive effects on reducing regional brain volumes.

### White Matter Hyperintensities

White matter damage can be measured by examining white matter hyperintensities (WMH) on fluid attenuated inversion recovery (FLAIR) images from MRI. WMH may reflect vascular or inflammatory brain changes (Maniega et al., 2015; Shoamanesh et al., 2015). The prevalence of cerebrovascular events in HIV remains higher, in relatively younger patients, despite treatment, than in the general population (Haddow et al., 2014; Arentzen et al., 2015). The frequency of cerebrovascular disease increases with age (Kendall et al., 2014) and HIV individuals with cerebrovascular disease are more likely to have cognitive deficits (Foley et al., 2008; Nakamoto et al., 2012).

WMH are a frequent finding on brain MRI of elderly subjects (over age 60) and are associated with hypertension (e.g., Rostrup et al., 2012; Peng et al., 2014). A number of studies report a greater prevalence of WMH in HIV relative to healthy controls (Foley et al., 2008; Su et al., 2016), specifically affecting frontal lobes (McMurtray et al., 2008). While one study reports that with older age, patients with HIV have a greater number of WMH relative to age-matched healthy controls, related to a history of AIDS, current CD4, and active HCV infection (Seider et al., 2016), another found similar WMH volumes in HIV and controls (Watson et al., 2017), explained by hypertension (Su et al., 2016; Watson et al., 2017).

There is little evidence that alcohol consumption increases WMH load (e.g., Anstey et al., 2006). In non-cirrhotic HCV patients relative to controls, imaging provides evidence for an increased incidence of WMH representing cerebral vasculitis (Heckmann et al., 1999; Casato et al., 2005; Bezerra et al., 2011). Indeed, in HIV, the presence of HCV was the strongest predictor of WMH (Robinson-Papp et al., 2017).

### Structure/Function Relationships

A primary goal of evaluating structure/function relationships in HIV is to advance understanding of the neural substrates of HIV-associated motor and cognitive compromise. Significant, but non-specific correlations have been reported between the severity of global brain atrophy and general cognitive impairment in HIV (Becker et al., 2012; Steinbrink et al., 2013; but see Heaps et al., 2015). By contrast, cortical thinning of the retrosplenial cortex has been proposed as a selective contributor to general cognitive impairment in HIV (Shin et al., 2017). A number of studies report deficits in regional brain volumes associated with poor cognitive performance in HIV: the caudate with psychomotor performance (Kieburtz et al., 1996; Paul et al., 2002; Kallianpur et al., 2016); the anterior cingulate with emotion processing (Clark et al., 2015); the prefrontal cortex with verbal learning and memory (Rubin et al., 2016). The assortment of brain regions implicated likely reflects heterogeneity in disease course. Indeed, post-ART, global, cortical-driven pathogenesis rather than subcortical dysfunction is a more likely contributor to the varying HIV clinical manifestations (Foley et al., 2008). Cognitive heterogeneity post-ART thus requires further evaluation of select brain structure/function relationships, particularly in stably-treated, aging HIV cohorts with comorbid risk factors.

In HIV-infected alcoholics, smaller thalamic volumes were associated with poorer performance on tests of explicit (immediate and delayed) and implicit (visuomotor procedural) memory (Fama et al., 2014), again indicating the thalamus as a structure that is particularly susceptible to HIV and the compounding effects of AUD. The potential for segmentation of thalamic subregions (Behrens et al., 2003; Deoni et al., 2007; Zhang et al., 2010; Deistung et al., 2013; Kim et al., 2013; Barron et al., 2014) holds promise for a more refined understanding of brain structure/function relationships and affected neural circuitry (Fama et al., 2016) in HIV.

### Magnetic Resonance Spectroscopy (MRS)

MRS is a modality used to quantify brain metabolites, typically N-acetyl aspartate (NAA), choline-containing compounds (Cho), and total creatine (tCr). NAA is an indicator of neuronal integrity, with decreases suggesting neuronal dysfunction (e.g., Zahr et al., 2010, 2013). The signal from Cho, including contributions from free choline, glycerophosphorylcholine, and phosphorylcholine (Miller, 1991), is a marker for cell membrane synthesis and turnover. The signal from tCr, with contributions from creatine and phosphocreatine, represents the high-energy biochemical reserves of neurons and glia (Inglese et al., 2003). Less frequently reported, as their quantification is more challenging, are levels of myo-Inositol (mI) and glutamate (Glu). Because mI, an osmolyte, is primarily present in glial cells (Brand et al., 1993), it is considered a glial marker. Glu is a ubiquitous molecule used in cellular metabolism and is the principal excitatory neurotransmitter (Thangnipon et al., 1983; Fonnum, 1984).

MRS studies of HIV patients commonly report that neuronal injury (dysfunction or loss) is associated with low levels of NAA and changes (both increases and decreases) in Glu levels (often quantified from the combined resonance of glutamate + glutamine and referred to as Glx) in regions including frontal cortex and basal ganglia (López-Villegas et al., 1997; Chang et al., 1999, 2013; Paul et al., 2008; Hua et al., 2013; Harezlak et al., 2014; Bairwa et al., 2016; longitudinal: Lentz et al., 2011; Sailasuta et al., 2012; Gongvatana et al., 2013; Young et al., 2014; Scott et al., 2016; Rahimy et al., 2017). Similar findings of abnormally low NAA (McAndrews et al., 2005) are also reported in HCV in regions such as the occipital cortex (Weissenborn et al., 2004; but see Bokemeyer et al., 2011).

During acute/early infection and at two follow-up time points (2 and 6 months), greater numbers of activated (CD16+) monocytes were associated with lower NAA and higher Cho levels in frontal cortex (Lentz et al., 2011). Similarly, above control levels of Cho were identified in basal ganglia in acute HIV; these resolved to control levels at 6 months following initiation of ART (Sailasuta et al., 2012). Similar findings (longitudinal increases in Cho) were reported in frontal white matter and parietal gray matter prior to ART initiation, with resolution following ART (Young et al., 2014). By contrast, a study in chronic HIV with longer intervals between MRS showed that despite stable ART and virological suppression, and in both asymptomatic and cognitively impaired subgroups, HIV-infected subjects showed significant annual decreases in brain metabolites (including NAA, Cho, tCr, and Glx) in midfrontal cortex, frontal white matter, and basal ganglia (Gongvatana et al., 2013).

Most MRS studies show lower levels of NAA in recently sober alcoholics relative to healthy subjects in several brain regions including frontal areas (Fein et al., 1994; Jagannathan et al., 1996; Seitz et al., 1999; Bendszus et al., 2001; Schweinsburg et al., 2003; Durazzo et al., 2004, 2010; Meyerhoff et al., 2004) and cerebellum (Jagannathan et al., 1996; Seitz et al., 1999; Bendszus et al., 2001; Parks et al., 2002; Durazzo et al., 2010). Neuronal compromise (reduced NAA) appears to be compounded in HIV+AUD (Pfefferbaum et al., 2005). Below control levels of Cho in AUD patients shortly following detoxification are also reported in frontal (Fein et al., 1994; Durazzo et al., 2004; Ende et al., 2005) and cerebellar (Seitz et al., 1999; Bendszus et al., 2001; Parks et al., 2002; Ende et al., 2005; Pfefferbaum et al., 2005; but see Modi et al., 2011; Hermann et al., 2012) regions.

Neuroinflammation in either HIV or HCV has been associated with elevated levels of mI, Cho, and tCr in frontal and basal ganglia regions (Chong et al., 1993; English et al., 1997; Forton et al., 2001, 2002, 2008; Chang et al., 2002, 2013, 2014; Fuller et al., 2004; Weissenborn et al., 2004; McAndrews et al., 2005; Grover et al., 2012; Bladowska et al., 2013). MRS studies of HIV + HCV suggest that co-infection might be associated with higher mI (Garvey et al., 2012) and less variability and more reliability in reported metabolite changes (Vigneswaran et al., 2015).

In a previously published work, we challenged the specificity of Cho and mI as markers of neuroinflammation. Significant group effects were evident for striatal Cho and striatal mI, higher in HIV+AUD than in controls (**Figure 4**). Correlations evaluated in HIV groups only (i.e., HIV, HIV+AUD) demonstrated that having HCV or an AIDS-defining event was associated with higher Cho; lower Cho levels, however, were associated with low thiamine levels and with ART. Higher levels of mI were related to greater lifetime alcohol consumed, whereas ART was associated with lower mI levels (Zahr et al., 2014). These results demonstrate that competing mechanisms can influence Cho and mI levels, and that elevations in these metabolites cannot necessarily be interpreted as reflecting a single underlying mechanism such as neuroinflammation.

# Microstructural Diffusion Tensor Imaging (DTI)

Examination of brain microstructural integrity using DTI has detected subtle HIV-related differences from controls [e.g., low fractional anisotropy (FA) and high mean diffusivity (MD)] in markers of myelin (radial or transverse diffusivity) and axonal (axial or longitudinal diffusivity) integrity, even in normal-appearing white matter, notably in corpus callosum and frontal lobe white matter (e.g., Pfefferbaum et al., 2007, 2009a; Chen et al., 2009; Hoare et al., 2011; Towgood et al., 2011; Du et al., 2012). Variable results from DTI studies may be due, at least in part, to timing of evaluation relative to treatment (i.e., treatment-naïve, currently un-medicated, chronically medicated, or older HIV-infected individuals). For example, in early, treatment-naïve HIV infection, white matter impairment (Tang et al., 2017) correlated with days since infection (Wright et al., 2015). In those on ART, a number of fiber tracts, including those of the corpus callosum and corona radiata, are often reported as compromised (Leite et al., 2013; Xuan et al., 2013; Su et al., 2016; Wang et al., 2016). Effects on DTI metrics may also depend on presence of neurological complications (Corrêa et al., 2015), with symptomatic individuals showing effects extending to frontal areas (Zhu et al., 2013). Chronic relative to initial infection often shows more substantial differences in DTI metrics related to biomarkers of infection (e.g., viral load and immune compromise), disease duration, and ART duration (Wright et al., 2015; Cordero et al., 2017; Strain et al., 2017), which complicates attempts to distinguish effects of age, as age is often correlated with the duration of infection and ART.
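The scalar DTI metrics named above follow from the three eigenvalues of the diffusion tensor. The sketch below uses the standard textbook definitions (not taken from any of the cited studies); the eigenvalues passed in are hypothetical values chosen only to illustrate how lower FA typically accompanies higher MD and radial diffusivity.

```python
import math

def dti_scalars(l1, l2, l3):
    """Return (FA, MD, AD, RD) from tensor eigenvalues ordered l1 >= l2 >= l3."""
    md = (l1 + l2 + l3) / 3.0                      # mean diffusivity
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    fa = math.sqrt(1.5 * num / den) if den > 0 else 0.0  # fractional anisotropy
    ad = l1                                        # axial (longitudinal) diffusivity
    rd = (l2 + l3) / 2.0                           # radial (transverse) diffusivity
    return fa, md, ad, rd

# Hypothetical eigenvalues in units of 1e-3 mm^2/s (illustrative only).
coherent = dti_scalars(1.7, 0.3, 0.3)   # well-organized fiber bundle
degraded = dti_scalars(1.5, 0.6, 0.6)   # more isotropic diffusion
isotropic = dti_scalars(1.0, 1.0, 1.0)  # FA is zero when all eigenvalues match
```

In this toy comparison the "degraded" profile has lower FA together with higher MD and higher radial diffusivity, which is the pattern described in the text as a sign of compromised white matter.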

Most DTI studies report independent effects of age and HIV on DTI metrics, but no evidence for an interaction (Gongvatana et al., 2011; Towgood et al., 2011), even in subjects over the age of 60 (Nir et al., 2014). Instead, for example, longer HIV duration may interact with the presence of the apolipoprotein E4 allele (which increases the risk for Alzheimer's disease) (Jahanshad et al., 2012; Wendelken et al., 2016) or impaired glucose metabolism (Nakamoto et al., 2012) to compromise the brain in older HIV-infected individuals. A single study reported significant age by HIV interactions for decreased FA in the posterior limbs of the internal capsules, cerebral peduncles, and anterior corona radiata in HIV+ relative to seronegative control participants (Seider et al., 2016); HIV duration as measured by time since diagnosis was not a significant predictor of white matter damage in the described cohort suggesting that the reported interaction truly reflected the effects of aging. Support for an interactive effect of aging and HIV on DTI metrics comes from a longitudinal DTI study suggesting greater than normal age-related changes on the genu of HIV patients at 1 year follow up (Chang et al., 2008). A more recent longitudinal study, with an approximate 2-year follow-up interval, did not show differences in metrics between the first and second evaluation (Corrêa et al., 2016b), possibly because viremia was better controlled in the later study.


Although widespread abnormalities in white matter microstructure correlate with general cognitive compromise in HIV (Nir et al., 2014; Strain et al., 2017; Underwood et al., 2017a; Watson et al., 2017), more specific microstructure/function relationships have also been reported. For example, planning deficits correlated with low FA in anterior thalamic radiations, inferior fronto-occipital fasciculi, superior longitudinal fasciculi, corpus callosum genu, and uncinate fasciculi (Corrêa et al., 2015); motor impairments correlated with low FA in various motor tracts (Bernard et al., 2013); and self-reported signs of peripheral neuropathy correlated with abnormally high callosal diffusivity (Pfefferbaum et al., 2009a).

DTI has revealed microstructural damage related to alcoholism in cerebral areas that appear intact in structural MRI analyses (e.g., Pfefferbaum and Sullivan, 2002; Sullivan et al., 2003; Pfefferbaum et al., 2006). Quantitative fiber tracking has demonstrated, in alcoholics compared with controls, greater FA deficits in anterior than in posterior fibers of supratentorial and infratentorial white matter bundles as well as low FA in tracts of the corpus callosum, centrum semiovale, internal and external capsules, fornix, superior cingulate, and longitudinal fasciculi (Pfefferbaum et al., 2000, 2002b, 2009b; Pfefferbaum and Sullivan, 2005; Müller-Oehring et al., 2009; Trivedi et al., 2013; Fortier et al., 2014).

Quantitative analysis of DTI data from a cross-sectional study of four groups (controls, AUD, HIV, HIV+AUD) revealed, in all patient groups relative to controls, lower integrity of callosal regions (Pfefferbaum et al., 2007) and the uncinate fasciculus (Schulte et al., 2012); degradation of callosal microstructure showed evidence for compounded AUD+HIV effects (Pfefferbaum et al., 2007).

In HCV, FA has been reported as low in fiber tracts including the corpus callosum, middle cerebellar peduncles (Bladowska et al., 2013), external capsules, fronto-occipital fasciculi (Bladowska et al., 2013; Thames et al., 2015), longitudinal fasciculi (Bladowska et al., 2013; Kuhn et al., 2017), and corona radiata (Kuhn et al., 2017). Studies of HIV+HCV coinfection show greater brain-wide diffusivity with voxel-based analysis (Stebbins et al., 2007) and higher diffusivity and lower FA by region-of-interest analysis (Gongvatana et al., 2011). A study evaluating co-infection on corpus callosum microstructure reported no additive effects (Heaps-Woodruff et al., 2016), whereas another study using TBSS noted compromise of the corona radiata in HIV + HCV co-infection (Seider et al., 2016).

### SUMMARY AND CONCLUSIONS

Getting old with HIV appears to cause premature aging with respect to medical conditions, psychiatric comorbidities, and neurocognitive performance. Structural MRI findings suggest accelerated aging of select brain gray matter volumes, but equivocal evidence for an interactive increase in WMH burden in older HIV-infected individuals. Current DTI studies are similarly conflicting as to whether older age and HIV have interactive effects on white matter integrity. The literature remains sparse with respect to longitudinal studies, which will help distinguish between healthy, premature, and accelerated aging with HIV.

ART has largely controlled the HIV epidemic, but fundamental questions regarding the precise cause of neurocognitive dysfunction in HIV remain. In the post-ART era, persistent issues related to an aging HIV population include effects of common comorbid conditions, such as AUD and HCV infection. Neuroimaging points to the sensitivity of the thalamus to HIV infection. High-resolution imaging and segmentation of thalamic substructures may provide a more refined understanding of the substrates underlying cognitive decline in HIV. DTI has been underutilized in studying the HIV brain and thus also holds promise for clarifying the brain regions involved in HIV-associated cognitive and motor impairments and in explicating mechanisms that may contribute to dysfunction with age. Free water imaging, a DTI analysis method that improves the specificity and sensitivity of DTI by accounting for extracellular water (Pasternak et al., 2009, 2012; Metzler-Baddeley et al., 2012), may permit a better understanding of neuroinflammatory processes in HIV (Strain et al., 2017) and aging. A better understanding of the aging HIV brain will help in the development of integrated healthcare approaches for these complicated patients.

### AUTHOR CONTRIBUTIONS

NMZ envisioned and wrote this review manuscript and is accountable for all aspects of the work.

### FUNDING

This study was supported with grant funding from the National Institute of Alcohol Abuse and Alcoholism (NIAAA): AA017347, AA017168, and AA013521.

### ACKNOWLEDGMENTS

NMZ would like to acknowledge Drs. Adolf Pfefferbaum and Edith V. Sullivan for supporting this work and reviewing in-progress versions of the manuscript.



**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Zahr. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Feeling How Old I Am: Subjective Age Is Associated With Estimated Brain Age

Seyul Kwak<sup>1</sup>, Hairin Kim<sup>1</sup>, Jeanyung Chey<sup>1</sup>\* and Yoosik Youm<sup>2</sup>

<sup>1</sup>Department of Psychology, Seoul National University, Seoul, South Korea, <sup>2</sup>Department of Sociology, Yonsei University, Seoul, South Korea

While the aging process is a universal phenomenon, people perceive and experience their own aging quite differently. Subjective age (SA), referring to how individuals experience themselves as younger or older than their actual age, has been highlighted as an important predictor of late-life health outcomes. However, it is unclear whether and how SA is associated with the neurobiological process of aging. In this study, 68 healthy older adults completed an SA survey and underwent magnetic resonance imaging (MRI) scans. T1-weighted brain images from open-access datasets were used to construct a model for age prediction. We used both voxel-based morphometry (VBM) and age-prediction modeling to explore whether the three SA groups (i.e., feels younger than, the same as, or older than actual age) differed in their regional gray matter (GM) volumes and predicted brain age. The results showed that elderly individuals who perceived themselves as younger than their real age showed not only larger GM volume in the inferior frontal gyrus and the superior temporal gyrus, but also younger predicted brain age. Our findings suggest that the subjective experience of aging is closely related to the process of brain aging and underscore the neurobiological mechanisms of SA as an important marker of late-life neurocognitive health.

### Edited by:

Nicolas Cherbuin, Australian National University, Australia

### Reviewed by:

Manuel De Vega, Universidad de La Laguna, Spain

Deep R. Sharma, SUNY Downstate Medical Center, United States

### \*Correspondence:

Jeanyung Chey jychey@snu.ac.kr

Received: 03 January 2018 Accepted: 18 May 2018 Published: 07 June 2018

### Citation:

Kwak S, Kim H, Chey J and Youm Y (2018) Feeling How Old I Am: Subjective Age Is Associated With Estimated Brain Age. Front. Aging Neurosci. 10:168. doi: 10.3389/fnagi.2018.00168

Keywords: subjective age, self-perceptions of aging, gray matter atrophy, VBM, brain age

### INTRODUCTION

Subjective age (SA) refers to how individuals experience themselves as younger or older than their chronological age. Subjective perception of aging does not coincide with the chronological age and shows large variability among individuals (Rubin and Berntsen, 2006). The concept of SA has been highlighted in aging research as an important construct because of its relevance to late-life health outcomes. Previous studies have suggested that SA is associated with various outcomes, including physical health (Barrett, 2003; Stephan et al., 2012; Westerhof et al., 2014), self-rated health (Westerhof and Barrett, 2005), life satisfaction (Barak and Rahtz, 1999; Westerhof and Barrett, 2005), depressive symptoms (Keyes and Westerhof, 2012), cognitive decline (Stephan et al., 2014), dementia (Stephan et al., 2016a), hospitalization (Stephan et al., 2016b) and frailty (Stephan et al., 2015a). Although chronological age is a primary factor in explaining these late-life health outcomes, these studies suggest that SA can be another construct that characterizes individual differences in the aging process.

The interoceptive hypothesis posits that a significant number of functions, both physical and cognitive, decline with age and that this decline is subsequently followed by an awareness of such age-related changes (Diehl and Wahl, 2010). In other words, feeling subjectively older may be a sensitive marker or indicator reflecting age-related biological changes. This hypothesis is supported by several studies that have reported significant associations between older SA and poorer biological markers, including C-reactive protein (Stephan et al., 2015c), diabetes (Demakakos et al., 2007), and body mass index (Stephan et al., 2014). Moreover, the indices of biological age (MacDonald et al., 2011), including peak expiratory flow and grip strength, were also associated with SA, even after demographic factors, self-rated health and depressive symptoms were controlled for (Stephan et al., 2015b). Among a variety of biological aging markers, a decrease in neural resources constitutes a major dimension of age-related changes in addition to physical, socio-emotional and lifestyle changes (Diehl and Wahl, 2010). In line with the interoceptive hypothesis, the subjective experience of aging may partly result from one's subjective awareness of age-related cognitive decline. For example, subjective reports of one's own cognitive decline have received attention as an important source of information for the prediction of subtle neurophysiological changes. Even when no signs of decline are found in cognitive test scores, subjective complaints of cognitive impairment may reflect early stages of dementia or pathological changes in the brain (de Groot et al., 2001; Reid and MacLullich, 2006; Stewart et al., 2008; Yasuno et al., 2015). It is, thus, possible to examine a link between the subjective experience of aging and neurophysiological aging.

To assess age-related brain structural changes and widespread loss of brain tissue, neuroanatomical morphometry methods have been widely used (Good et al., 2001; Fjell et al., 2009a; Raz et al., 2010; Matsuda, 2013). Moreover, the large neuroimaging datasets and newly developed machine learning techniques have made it possible to estimate individualized brain markers (Gabrieli et al., 2015; Cole and Franke, 2017; Woo et al., 2017). This approach can be advantageous for interpreting an individualized index for brain age, since predictive modeling can represent multivariate patterns expressed across whole brain regions, unlike massive and iterative univariate testing. In recent studies, estimated brain age was found to predict indicators of neurobiological aging, including cognitive impairment (Franke et al., 2010; Franke and Gaser, 2012; Löwe et al., 2016; Liem et al., 2017), obesity (Ronan et al., 2016) and diabetes (Franke et al., 2013).

Although SA has predictive values for future cognitive decline or dementia onset, few studies have examined the neurobiological basis of such outcomes. Combining both regional morphometry and the brain age estimation method, this study will provide an integrated picture of how each individual undergoes a heterogeneous brain aging process and supply further evidence of the neural underpinnings of SA (Kotter-Grühn et al., 2015). Using analyses for voxel-based morphometry (VBM) and age-predicting modeling, we aimed to identify whether younger SA is associated with larger regional brain volumes and lower estimated brain age. We also examined possible mediators, including self-rated health, depressive symptoms, cognitive functions and personality traits that were candidates for explaining the hypothesized relationship between SA and brain structures.

# MATERIALS AND METHODS

## Subjects

The participants in this study were subsampled from the 3rd wave of the Korean Social Life, Health and Aging Project (KSHAP), which consisted of 591 older adults. KSHAP is a community-based cohort study that collected data from the entire population of older adults in Township K. In all, 195 elderly individuals received a thorough health survey, psychosocial surveys and a neuropsychological assessment. The health survey was conducted in 2014, and both the psychosocial survey and neuropsychological assessment were conducted in 2015. The following exclusion criteria were applied: psychiatric or neurological disorders, vision or hearing problems, having metal in the body that cannot be removed, hypertension and/or diabetes that cannot be controlled with drugs and/or insulin, and a history of losing consciousness due to head trauma. Older adults with cognitive impairment were excluded using neuropsychological tests and semi-structured interviews. The details of the screening procedures are described in a previous publication (Joo et al., 2017). The final sample comprised 68 subjects who did not meet any of the exclusion criteria (mean age = 71.38, SD = 6.41, range = 59–84). MRI acquisition was done in 2015, and subsequent analysis was blinded to the identity of all participants. The study was approved by the Institutional Review Board of Yonsei University, and all participants provided written informed consent to the research procedures.

### Subjective Age Groups

The following question was verbally asked to assess comparative SA: "How old do you feel, compared to your real age?" (Westerhof and Barrett, 2005). Participants responded with one of three categorized age identity options: "I'm younger than my real age" (younger SA), "I'm the same as my real age" (same SA) and "I'm older than my real age" (older SA; Boehmer, 2007). Among the subjects of KSHAP who did not have cognitive impairment (n = 137), those who identified themselves as younger SA were the greatest in proportion (40.1%), followed by same SA (34.3%) and older SA (25.5%). The gender ratio did not significantly differ among the three SA groups (χ<sup>2</sup> = 4.324, p = 0.112).

### Cognitive Functions

The Mini-Mental State Examination for Dementia Screening (MMSE-DS; Han et al., 2010), category fluency test (Kang et al., 2012) and episodic memory and working memory indices from the Elderly Memory Disorder Scale (Chey, 2007) were used to assess cognitive functions. The category fluency test asked participants to generate words from two semantic categories (i.e., supermarket and animal), each within a minute (Kang et al., 2000). The episodic memory index was calculated by adding the correct rates on the Long-Delay Free Recall of the Elderly Verbal Learning Test (EVLT), Delayed Recall on the Story Recall Test (SRT) and Delayed Reproduction on the Simple Rey Figure Test (SRFT). The EVLT is a nine-word learning test utilizing the California Verbal Learning Test paradigm (Chey et al., 2006). The SRT requires subjects to recall a paragraph containing 24 semantic units (An and Chey, 2004). The SRFT is a simplified version of the Rey-Osterrieth Complex Figure Test modified for the elderly population (Park et al., 2011). All delayed recall subtests were administered 15−30 min after the immediate recall session. The working memory index was the sum of the longest correctly repeated backward digit span and the longest correctly reproduced Corsi block-tapping sequence (Song and Chey, 2006).
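The two composite indices described above can be sketched as follows. This is an illustration only: it assumes that "correct rate" means the proportion of items correctly recalled, takes the EVLT and SRT maxima from the text (nine words, 24 semantic units), and treats the SRFT maximum score as a hypothetical parameter (`srft_max`) because the source does not state it.

```python
EVLT_MAX = 9    # nine-word list (from the text)
SRT_MAX = 24    # 24 semantic units (from the text)

def episodic_memory_index(evlt_recall, srt_recall, srft_score, srft_max):
    """Sum of the three delayed-recall correct rates (each between 0 and 1).

    srft_max is a hypothetical parameter; the source does not give the
    maximum score of the Simple Rey Figure Test.
    """
    return evlt_recall / EVLT_MAX + srt_recall / SRT_MAX + srft_score / srft_max

def working_memory_index(backward_digit_span, corsi_span):
    """Sum of the longest backward digit span and the longest Corsi span."""
    return backward_digit_span + corsi_span

# Made-up scores for a single participant.
example_episodic = episodic_memory_index(9, 24, 10, srft_max=10)
example_working = working_memory_index(4, 5)
```

Under these assumptions the episodic memory index ranges from 0 to 3, so each subtest contributes equally regardless of its raw score range.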

### Health and Psychosocial Covariates

We assessed potential mediators and covariates that could account for or confound the association between SA and brain structural characteristics. The covariates were selected on the basis of previously reported associations of SA with self-rated health (Stephan et al., 2015b), personality traits (Stephan et al., 2012), cognitive functions (Stephan et al., 2014) and depressive symptoms (Keyes and Westerhof, 2012). Participants rated the level of their global health on a 5-point Likert scale: poor, slightly poor, good, very good and excellent. Higher values represented better self-rated health. Depressive symptoms were measured using the 30-item Geriatric Depression Scale (Yesavage et al., 1982), on which respondents indicated with a binary "yes" or "no" whether they had experienced a given symptom during the past week. The personality traits of extraversion and openness were assessed using the NEO Five-Factor Inventory (Costa and McCrae, 1992) on a 4-point Likert scale, ranging from 1 (strongly disagree) to 4 (strongly agree).

### Voxel-Based Morphometry Analysis

Magnetic resonance imaging (MRI) scans were acquired on a 3 Tesla MAGNETOM Trio scanner with a 32-channel coil at Seoul National University Brain Imaging Center. Whole-brain T1-weighted magnetization-prepared rapid gradient-echo (MPRAGE) images were acquired for each subject with the following parameters: TR = 2300 ms, TE = 2.3 ms, FOV = 256 × 256 mm<sup>2</sup>, and flip angle (FA) = 9°. Whole-brain VBM analysis was carried out to determine the association between regional gray matter (GM) density and SA groups. The preprocessing of imaging data was conducted using the Statistical Parametric Mapping software (SPM12; Wellcome Department of Imaging Neuroscience, London, UK) implemented in Matlab version R2015b (MathWorks). T1 images were bias-corrected and segmented into five tissue classes, based on a non-linearly registered tissue probability map (Ashburner and Friston, 2005). The segmented native images were summed to infer individual total intracranial volume (TIV). To spatially normalize the GM images into standard space with enhanced accuracy of inter-subject registration (Ashburner, 2007), we used diffeomorphic anatomical registration using exponentiated Lie algebra (DARTEL). A customized template was created, and the resulting deformation fields were applied to the previously segmented GM images to warp them non-linearly into standardized MNI space. During the transformations, the total amount of GM was preserved. All images were smoothed using an 8-mm full width at half maximum (FWHM) Gaussian kernel.
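As a small worked example of the smoothing step, a Gaussian kernel specified by its FWHM can be converted to a standard deviation with the general relation σ = FWHM / (2√(2 ln 2)). The snippet below only illustrates that conversion for the 8-mm kernel; it is not the SPM12 implementation.

```python
import math

def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel's FWHM (mm) to its standard deviation (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

sigma_mm = fwhm_to_sigma(8.0)  # roughly 3.4 mm for the 8-mm kernel

# Sanity check: at half the FWHM (4 mm from the center) an unnormalized
# Gaussian with this sigma falls to half of its peak value.
half_max = math.exp(-(4.0 ** 2) / (2.0 * sigma_mm ** 2))
```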

We first used F-contrast to test voxel-wise differences in GM volume among the three SA groups. For exploratory purposes, the F-test result was examined in a liberal cluster-defining threshold (z = 2.33, k > 500). The main effect of the F-map indicated differences in volumes of regional GM among the three SA groups. Based on the directional hypothesis that younger SA would be associated with larger brain volumes, we additionally conducted three pairwise t-tests (younger > same, same > older, younger > older). We identified the post hoc test results based on whether the voxels above cluster-level (cluster-defining threshold of z = 3.09) or voxel-level family-wise error (FWE) p < 0.05 were included in the aforementioned clusters of F-test results. The cluster-level FWE rate was estimated based on Gaussian random field theory. Each VBM analysis was conducted after adjusting for age, gender, education and TIV.
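The two cluster-defining thresholds can be read as uncorrected one-tailed p-values. The sketch below converts them with the standard normal distribution from Python's standard library; it only illustrates the correspondence and does not reproduce any SPM12 thresholding code.

```python
from statistics import NormalDist

def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 1.0 - NormalDist().cdf(z)

p_exploratory = one_tailed_p(2.33)  # close to 0.01
p_post_hoc = one_tailed_p(3.09)     # close to 0.001
```

Under this reading, z = 2.33 corresponds to the liberal exploratory threshold of p < 0.01 (uncorrected) and z = 3.09 to the stricter p < 0.001 threshold used to define clusters for the FWE-corrected post hoc tests.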

### Age-Prediction Datasets

To estimate the degree of the age-related brain structural changes that have occurred in an individual, we implemented the out-of-sample modeling and prediction scheme that has recently been used to estimate brain ages (Franke et al., 2010; Liem et al., 2017; Rudolph et al., 2017). We utilized two publicly accessible datasets which consisted of T1-weighted MRI images: The Open Access Series of Imaging Studies (OASIS<sup>1</sup>) and The Information eXtraction from Images (IXI<sup>2</sup>). The cross-sectional (Marcus et al., 2007) and longitudinal datasets (Marcus et al., 2010) of the OASIS consisted of 512 healthy adults aged 18–94 (mean age = 51.64, SD = 24.91, female: 256). Subjects diagnosed with dementia (clinical dementia rating ≥ 0.5) were excluded from the OASIS dataset. The IXI dataset consisted of 545 healthy adults aged 19–86 (mean age = 48.78, SD = 16.53, female: 342). To reduce non-linear age effects in model construction (Fjell et al., 2013, 2014) and avoid biased modeling in predicting subjects of KSHAP who have a relatively old age range (59–84), we selected subjects above the age of 40 as the training sample. The finalized age-prediction data included 598 subjects (mean age = 63.28, SD = 12.97, female: 383).

### Age-Prediction Image Preprocessing

To apply a standardized preprocessing analysis pipeline across different MRI scan protocols in both KSHAP and the age-prediction datasets, we used a fully automated preprocessing procedure implemented in CAT12 r1113 (Computational Anatomy Toolbox, Structural Brain Mapping Group, Departments of Psychiatry and Neurology, Jena University Hospital<sup>3</sup>). First, a spatial-adaptive non-local means (SANLM) denoising filter (Manjón et al., 2010) was employed. Segmentation algorithms based on the adaptive maximum a posteriori (AMAP) technique, implemented in CAT12, were used to classify brain tissue into three classes: GM, white matter (WM) and cerebrospinal fluid (CSF). Additionally, partial volume estimation (PVE) was used to create a more accurate segmentation for the two mixed classes: GM–WM and GM–CSF. Projection-based estimation of cortical thickness was conducted in the segmented images (Dahnke et al., 2012, 2013), which showed a comparable accuracy with other surface-based tools (Righart et al., 2017). In total, 156 values were extracted from the CAT12 region of interest (ROI) analysis pipeline, including 148 cortical thickness values and averaged GM density in eight subcortical structures (bilateral caudate, putamen, amygdala and hippocampus). Cortical areas were defined based on automatic parcellation of gyri and sulci (Destrieux et al., 2010), while subcortical volumes were defined using the Neuromorphometric atlas<sup>4</sup>. Identical procedures for preprocessing and extracting ROI values were applied to the KSHAP data.

<sup>1</sup>http://www.oasis-brains.org

<sup>2</sup>http://brain-development.org/ixi-dataset

<sup>3</sup>http://dbm.neuro.uni-jena.de/cat/

### Partial Least Square Regression Modeling

To effectively summarize and explain age-related characteristics of brain structure, we constructed a cross-validated partial least squares regression (PLSR) model using the caret package for R (Kuhn, 2015). PLSR reduces high-dimensional data into orthogonal components that have the greatest covariance with the output (the target of the prediction) before multiple regression analysis is conducted. In contrast to dimension reduction with principal component analysis, which considers only the variance of the predictors, PLSR extracts orthogonal components in a way that is more relevant to the outcome at the model construction stage. The PLS method is utilized in neuroimaging studies to effectively summarize the highly collinear data structures that are observed across brain regions (McIntosh and Lobaugh, 2004; Krishnan et al., 2011; Rudolph et al., 2017). In this study, the PLSR model was constructed to find linearly combined latent components highly predictive of the age of an individual. The training and estimation procedure of the age-prediction model was based on 156 ROI values from the open-access datasets (n = 598).
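For readers less familiar with PLSR, the single-response case can be written compactly. The block below is a textbook formulation of PLS1 (the NIPALS form), given here as an assumption about the general technique rather than a description of the exact algorithm that caret ran for the authors. The symbols are introduced for illustration: $X_0$ is the column-centered matrix of the 156 ROI values, $y$ the centered chronological ages, and $K$ the number of retained components.

```latex
\begin{aligned}
w_k &= \arg\max_{\lVert w \rVert = 1} \operatorname{Cov}\!\left(X_{k-1} w,\; y\right)^2
      = \frac{X_{k-1}^{\top} y}{\lVert X_{k-1}^{\top} y \rVert}, \\
t_k &= X_{k-1} w_k, \qquad
p_k = \frac{X_{k-1}^{\top} t_k}{t_k^{\top} t_k}, \qquad
q_k = \frac{y^{\top} t_k}{t_k^{\top} t_k}, \\
X_k &= X_{k-1} - t_k p_k^{\top}, \\
\hat{y} &= \bar{y} + \sum_{k=1}^{K} q_k t_k .
\end{aligned}
```

Each weight vector $w_k$ maximizes covariance with age, whereas a principal component would maximize $\operatorname{Var}(X w)$ without reference to $y$; the deflation step makes successive scores $t_k$ mutually orthogonal.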

<sup>4</sup>http://Neuromorphometrics.com

### Cross-Validation of the Age-Prediction Model

To construct a PLSR model applicable to the independent data and to achieve generalizability, we optimized the number of PLS components, using leave-one-out cross-validation (LOOCV). Although sequentially adding more components of latent variables would derive a more complex model in explaining the given data, cross-validation procedures must be applied to determine whether such a complex model ultimately explains the novel data that are independent from training data. Using the LOOCV procedure, we iteratively partitioned the training data (n = 597) to construct a model and predicted the left-out single-subject data. Each left-out procedure of modeling and predicting was repeated 598 times. Within each model, with differing number of components, the root mean squared error (RMSE) of the iterated LOOCV procedures was calculated. Examining the out-of-sample prediction error for each PLS component, the PLSR model was optimized between under-fitted and over-fitted models (Whelan and Garavan, 2014; Gabrieli et al., 2015; Rudolph et al., 2017). This approach identified the optimal model that showed the lowest RMSE and the greatest explained variance (R<sup>2</sup>) in predicting the age of the left-out subject.
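The component-selection procedure can be sketched in a few lines. The example below is a stand-in written in Python with NumPy, not the authors' R/caret code: `pls1_fit`, `pls1_predict` and `loocv_rmse` are hypothetical helper names, the PLS1 fit uses a plain NIPALS loop, and the data are synthetic (60 simulated subjects, 12 features) rather than the 598 × 156 training matrix.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS (NIPALS); return (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yc = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yc
        w /= np.linalg.norm(w)          # weight vector maximizing covariance with y
        t = Xk @ w                      # component scores
        tt = t @ t
        p = Xk.T @ t / tt               # X loadings
        q.append(yc @ t / tt)           # regression of y on the scores
        Xk = Xk - np.outer(t, p)        # deflate X before the next component
        W.append(w)
        P.append(p)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # coefficients in the original feature space
    return coef, x_mean, y_mean

def pls1_predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean

def loocv_rmse(X, y, n_components):
    """Leave-one-out RMSE: refit on n-1 subjects, predict the held-out one."""
    errors = []
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        model = pls1_fit(X[train], y[train], n_components)
        errors.append(pls1_predict(model, X[i:i + 1])[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic stand-in data (NOT the OASIS/IXI data): two latent factors drive
# both the simulated regional measures and the simulated age.
rng = np.random.default_rng(0)
n, p = 60, 12
latent = rng.normal(size=(n, 2))
X = latent @ rng.normal(size=(2, p)) + 0.3 * rng.normal(size=(n, p))
age = 65 + 8 * latent[:, 0] - 5 * latent[:, 1] + rng.normal(scale=2.0, size=n)

rmse_by_k = {k: loocv_rmse(X, age, k) for k in range(1, 7)}
best_k = min(rmse_by_k, key=rmse_by_k.get)   # component count with the lowest LOOCV RMSE
```

Selecting `best_k` by held-out error rather than by training fit is what guards against the over-fitted models described above; with very few components the model under-fits, and with too many it starts to fit noise in the training subjects.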

### Statistical Analysis of Predicted Age

The brain ages of the individual subjects from the KSHAP data (n = 68) were predicted based on the weights of the cross-validated PLSR model, using the inputs for 156 brain regional values. We examined bivariate correlations between the real ages and predicted ages for the KSHAP data. Then, we used analysis of covariance (ANCOVA) to compare the differences in the predicted brain age between the SA groups, adjusting for the linear effects of gender, education and age. The adjusted mean of predicted age indicated the difference. We additionally examined whether the ANCOVA result changed when self-rated health, depressive symptoms, cognitive function and personality traits were additionally included as covariates. We determined the significance of the group difference at p < 0.05 level.
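An ANCOVA of this kind can be expressed as a comparison of two nested linear models: one with the covariates only and one that adds dummy variables for the SA groups. The sketch below is an assumption about how such a test can be computed in general, not the authors' analysis script; `ancova_group_f` is a hypothetical helper and the data are simulated, with only the sample size (68), the three groups and the three covariates taken from the text.

```python
import numpy as np

def rss(design, y):
    """Residual sum of squares of an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)

def ancova_group_f(y, group, covariates):
    """F-test for a categorical group effect after adjusting for covariates."""
    n = len(y)
    levels = np.unique(group)
    # Dummy-code the groups, using the first level as the reference.
    dummies = np.column_stack([(group == g).astype(float) for g in levels[1:]])
    reduced = np.column_stack([np.ones(n), covariates])   # covariates only
    full = np.column_stack([reduced, dummies])            # covariates + group
    df_effect = dummies.shape[1]
    df_error = n - full.shape[1]
    rss_reduced, rss_full = rss(reduced, y), rss(full, y)
    f_value = ((rss_reduced - rss_full) / df_effect) / (rss_full / df_error)
    return f_value, df_effect, df_error

# Simulated data (illustrative only): 68 subjects, three SA groups coded 0/1/2.
rng = np.random.default_rng(1)
n = 68
group = np.arange(n) % 3
gender = rng.integers(0, 2, size=n).astype(float)
education = rng.normal(9, 4, size=n)
age = rng.uniform(59, 84, size=n)
predicted_age = 35 + 0.5 * age + 2.0 * group + rng.normal(0, 4, size=n)

covariates = np.column_stack([gender, education, age])
f_value, df_effect, df_error = ancova_group_f(predicted_age, group, covariates)
```

With 68 subjects, two group dummies, three covariates and an intercept, the error degrees of freedom are 68 − 6 = 62, which matches the F(2,62) reported in the Results.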


Higher self-rated health denotes poorer health. MMSE, Mini-mental State Examination. Spearman's rho indicates rank-order correlation from younger to older subjective age. ∗∗p < 0.01, <sup>∗</sup>p < 0.05.

Pairwise t-tests of the three groups indicated whether family-wise error (FWE)-corrected (voxel-level or cluster-level p < 0.05) voxels were included in the initially identified F-test clusters.

# RESULTS

Rank-order correlation analysis showed that SA group was positively (from younger to older) associated with fewer years of education, lower working memory performance and poorer self-rated health (**Table 1**). Group comparisons specifically showed significant differences in chronological age (same < older; p = 0.043), depressive symptoms (same < older; p = 0.035) and MMSE-DS (younger, same > older; p = 0.013, p = 0.002, respectively).

We then examined how SA was associated with regional GM volume using VBM analysis. The exploratory ANCOVA showed differences among the SA groups in regional volume in the right inferior frontal gyrus, right superior temporal gyrus, bilateral striatum and left postcentral gyrus (**Figure 1**, **Table 2**). Post hoc comparisons between each pair of groups identified four significant clusters among the F-test results (voxel-level or cluster-level FWE p < 0.05). Pairwise comparisons showed that those with younger SA had larger regional GM density than those in the same or older SA group.

Regional gray matter (GM) density with significant group differences (p < 0.01, uncorrected, k > 500). Post hoc pairwise t-test indicated whether family-wise error (FWE)-corrected (voxel-level or cluster-level p < 0.05) voxels were included in the initially identified F-test clusters.

From the open-access database (n = 598), we constructed a PLSR model of brain structural morphology predicting chronological age. Models with sequentially added latent variables (from 1 to 15 components) were cross-validated using LOOCV. The model with five components had the greatest accuracy in predicting left-out data (**Figure 2**; RMSE = 6.795, R<sup>2</sup> = 0.726). The PLSR models with more or fewer than five components showed relatively larger RMSE and smaller explained variance in predicting the left-out data, which indicated that these were either under-fitted or over-fitted models. The bilateral hippocampus, superior temporal gyrus, and inferior prefrontal cortex had the highest average of all coefficient weights, indicating the brain structures important
in predicting the chronological age of the individuals in the final cross-validated model (**Table 3**). From the cross-validated PLSR model, the chronological age of the KSHAP subjects (n = 68) was predicted with moderate accuracy (**Figure 3A**, R<sup>2</sup> = 0.179, p = 3.32 × 10<sup>−4</sup>, mean absolute error = 5.74). Subjects in their 60s had relatively higher predicted ages, whereas those in their 80s showed estimated ages lower than their real ages.

The ANCOVA result showed a significant difference between SA groups in the predicted age, when gender, education and chronological age were adjusted for (F(2,62) = 4.441, p = 0.016, η <sup>2</sup> = 0.125). Spearman's correlation analysis showed that ordinal ranking from younger to older was positively associated with predicted brain age when the effect of chronological age was partialled out (ρ = 0.314, p = 0.009). As shown in **Figure 3B**, a post hoc test showed significant differences in predicted age between the younger SA group and the other two groups (younger > same, p = 0.039; younger > older, p = 0.009) but the difference between the same and older SA was not significant (same > older, p = 0.558). We repeated the ANCOVA tests by adding covariate terms for depressive symptoms, cognitive functions, personality traits and self-rated health, all of which had been found to be associated with SA in previous studies. The ANCOVA results, however, remained unchanged, even when each covariate was entered (ps < 0.05, η <sup>2</sup> = 0.105–0.142).

### DISCUSSION

SA was associated with decreased regional GM volume and predicted brain age as well. Our findings suggest that feeling subjectively older than one's age may reflect relatively faster aging brain structures, whereas those who feel subjectively younger would have better-preserved and healthier structures. This study, to our knowledge, was the first attempt to examine the neuroanatomical underpinnings of SA.

Among 156 regions of interest (ROIs) of regional cortical thickness and subcortical volumes, 20 ROIs with the highest weights are noted in descending order.

When we examined regional structural differences in GM using VBM analysis, we found that the volumes of the
inferior prefrontal cortex, posterior superior temporal gyrus and striatal region showed the strongest association with SA groups. Previous studies have indicated that the volume of the right insula is associated with metacognition and awareness of task performance (Cosentino et al., 2015), and the right posterior temporal gyrus plays an important role in processing the awareness of one's body and spatial representations (Karnath et al., 2001; Blanke et al., 2002). Neural degradation in these regions may affect how one tracks one's physical state and one's perception of age-related changes. On the other hand, the core mechanism of SA has been located in the frontostriatal dopaminergic system, which plays a central role in explaining healthy brain aging and cognitive decline (Bäckman et al., 2010). Meanwhile reduced brain volumes of the inferior frontal cortex have been found to be associated with inefficient functioning of inhibitory control processing (Turner and Spreng, 2012; Aron et al., 2014). The ability to inhibit or suppress irrelevant or no-longer-relevant information has been proposed as a core process in explaining the age-related cognitive decline in a variety of tasks (Hasher and Zacks, 1988; Hasher et al., 1991). The deficiency in tasks requiring the function of cognitive control may have affected overall appraisal of one's state of cognitive aging. It is notable, however, that other brain regions highly susceptible to aging (Fjell et al., 2009b, 2014) did not show a prominent association with SA in our results. That is, among the many brain regions undergoing age-related structural changes, specific areas were more relevant to explaining how older adults feel their own process of aging. It should also be noted that interpretations based on the variation of regional morphology and reverse inferences should be made with great caution (Poldrack, 2006).

To clarify the age-relatedness of the association between the brain structures and SA, we applied the out-of-sample modeling procedure to derive an overall index of brain aging. Since individual differences in brain structures can occur not only due to age-related neurophysiological changes but also from other pre-existing individual differences such as personality traits (Kanai and Rees, 2011), the application of the age-prediction modeling was crucial in interpreting the neuroanatomical differences observed in the study. Importantly, we found that age-related multivariate patterns expressed in cortical thickness and subcortical volumes differed among the SA groups. More specifically, the brain regions that were highly predictive of chronological age were found to be partly overlapping with the regions identified in the VBM analysis, including the hippocampus, superior temporal gyrus and inferior frontal cortex. Those who felt younger than their real age also had younger brain structural patterns and vice versa.

Our results may suggest that, in concert with the interoceptive hypothesis (Diehl and Wahl, 2010), feeling younger or older than one's chronological age can be an indirect perception of neurobiological aging rather than a psychological defense against negative age stereotypes (Weiss and Lang, 2012) or social comparison (Mussweiler et al., 2000). If individual differences in SA result mainly from social impacts on attitudinal representation, it is less likely that feeling younger is associated with markers of neurobiological aging. Examining the correlates of SA using objectively measured aging markers other than self-reported measures may strengthen the validity of the interoceptive hypothesis. Although previous studies have already shown that those with older SA have poorer biological aging markers (Stephan et al., 2015b,c), our findings extend the hypothesis that older SA is associated with greater progression of brain aging process and poorer brain health. Significant tissue atrophy in the GM and older brain age may be reflective of cerebrovascular risks (Seo et al., 2012; Lockhart and DeCarli, 2014), and such changes may cause older adults to appraise their deteriorated functions as being a result of their aging (Vestergren and Nilsson, 2011).

Consistent with previous studies that have reported the clinical significance of subjective perceptions of cognitive decline (Rabin et al., 2017), subjective appraisal of one's own decline may provide information on neurobiological changes not otherwise detectable with objective cognitive tests. If feeling older in one's SA is affected by decreased cognitive function or cognitive efficacy (Boehmer, 2007; Schafer and Shippee, 2010; Stephan et al., 2011), it is likely that SA reflects pathological brain changes and subtle decreases in neural capacity that cannot otherwise be detected. Diminished volumes of GM and older brain age may lead to reduced processing efficiency in a variety of demanding cognitive tasks, and this prolonged mismatch between reduced neural resources and burdensome environmental demands can create a subjective perception of aging. Individuals with older SA may feel older because they experience frequent negative sensations as they make more cognitive efforts in daily life compared to those who report younger or same SA. Benign cognitive failures that occur daily tend to be attributed to age-related changes among older adults (Vestergren and Nilsson, 2011). It is possible that the effect of brain aging may influence the awareness of age-related change more directly than a mere appraisal of physical health (Cole and Franke, 2017), as seen from our result, which remained unchanged even when the effect of self-rated health was accounted for. Even though we observed that higher SA was associated with lower cognitive performance, especially in working memory function, adjustment for these differences did not change the relationship between SA and brain age.
While we assumed that SA reflects a stable and accurate perception of age-related changes in the brain, another possibility considered in this study was that people can feel older due to excessively self-referential and negative emotional states (Reid and MacLullich, 2006; Rabin et al., 2017). That is, when the experience of benign age-related cognitive decline is overestimated, an older adult can perceive him- or herself as older than their real age (Pearman et al., 2014; Hülür et al., 2015). However, according to the results of our study, neither self-rated health nor depressive symptoms accounted for the significant difference in brain age. These additional considerations may suggest that individual differences in SA stably reflect a prolonged and accumulated status of the brain aging to a certain degree that does not fluctuate in temporary conditions (Hughes et al., 2013; Stephan et al., 2013; Geraci et al., 2018).

Another notable finding was that the younger SA group showed a significant difference in predicted brain age. Although most previous studies have examined linear and continuous effects of SA on various outcomes, our findings suggest that feeling younger and feeling older may not be symmetric or linear cognitive processes (Kotter-Grühn and Hess, 2012; Weiss and Lang, 2012), and health consequences may also differ between the two categorized groups. In our study, while older SA group showed a tendency to have poorer cognitive function and exhibit greater depressive symptoms, feeling younger was especially associated with younger structural characteristics of the brain.

Several limitations should be noted. Although we constructed an age-predicting model that accurately explains real age across the left-out subjects at a level comparable with a previous study (Franke et al., 2010), the age-prediction model showed relatively lower accuracy among the KSHAP subjects. The low correlation between real age and estimated brain age may be explained by the fact that the predictive performance of external validation is typically poorer than that of internal validation because independent datasets do not guarantee homogeneous sample characteristics, data collection protocols and modeling parameters (Woo et al., 2017). Extensive screening procedures based on neuropsychological assessments could have resulted in over-representing healthy older adults free from severe neuropathology in our study relative to the open-access datasets. This in turn would have lowered the slope between real age and estimated brain age. Moreover, cross-sectional age effects can be underestimated, especially when age ranges are confined to the 60s through the 80s (Fjell et al., 2014). If the KSHAP data had included midlife subjects, predicted ages may have been more accurate across subjects. Another limitation is the coarse measurement of SA. The low resolution of the current categorical measure may have resulted in the loss of information regarding the extent to which the participants feel younger or older than their age within each categorized SA group. In addition, recent studies have underscored the multidimensional aspects of SA (Diehl et al., 2014), and attempted to additionally separate the concept of SA into negative stereotypes of aging (Levy et al., 2016) and self-identification based on the social reference group (Barak, 2009), other than interoceptive awareness. The interpretation could have been clearer if we had questioned both the aspects of social influence and of the internal awareness of SA separately.
Further investigation is required to distinguish specific neural mechanisms of both interoceptive perception and social influence. Lastly, although we have mainly interpreted SA as being a result of age-related brain change, maintaining a younger SA may also lead to a physically and mentally more active lifestyle, which in turn leads to a healthier brain. Future longitudinal studies will further elucidate these temporal relationships.

# AUTHOR CONTRIBUTIONS

JC and YY obtained funding and supervised the study. SK: data collection, study design, data analysis and manuscript writing. HK: data collection and data analysis.


### FUNDING

This research was supported by the National Research Foundation of Korea (NRF-2017S1A3A2067165), funded by the Ministry of Education, Science and Technology. This research has been supported by the AMOREPACIFIC Foundation.

### ACKNOWLEDGMENTS

We thank S. Cho for constructive feedback on the manuscript.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Kwak, Kim, Chey and Youm. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Predicting Age From Brain EEG Signals—A Machine Learning Approach

Obada Al Zoubi<sup>1,2</sup>, Chung Ki Wong<sup>1</sup>, Rayus T. Kuplicki<sup>1</sup>, Hung-wen Yeh<sup>1</sup>, Ahmad Mayeli<sup>1,2</sup>, Hazem Refai<sup>2</sup>, Martin Paulus<sup>1</sup> and Jerzy Bodurka<sup>1,3</sup>\*

<sup>1</sup> Laureate Institute for Brain Research, Tulsa, OK, United States, <sup>2</sup> Department of Electrical and Computer Engineering, University of Oklahoma, Tulsa, OK, United States, <sup>3</sup> Stephenson School of Biomedical Engineering, University of Oklahoma, Norman, OK, United States

Objective: The brain age gap estimate (BrainAGE) is the difference between the estimated age and the individual's chronological age. BrainAGE has been studied primarily using MRI techniques. EEG signals in combination with machine learning (ML) approaches have not commonly been used for human age prediction and BrainAGE estimation. We investigated whether age-related changes affect brain EEG signals, and whether we can predict chronological age and obtain BrainAGE estimates using a rigorous ML framework with novel and extensive EEG feature extraction.

Methods: EEG data were obtained from 468 healthy, mood/anxiety, eating and substance use disorder participants (297 females) from the Tulsa-1000, a naturalistic longitudinal study based on Research Domain Criteria framework. Five sets of preprocessed EEG features across channels and frequency bands were used with different ML methods to predict age. Using a nested-cross-validation (NCV) approach and stack-ensemble learning from EEG features, the predicted age was estimated. The important features and their spatial distributions were deduced.

### Edited by:

Christian Gaser, Friedrich-Schiller-Universität-Jena, Germany

### Reviewed by:

Mihai Moldovan, University of Copenhagen, Denmark Safikur Rahman, Yeungnam University, South Korea

> \*Correspondence: Jerzy Bodurka jbodurka@laureateinstitute.org

Received: 31 March 2018 Accepted: 01 June 2018 Published: 02 July 2018

### Citation:

Al Zoubi O, Ki Wong C, Kuplicki RT, Yeh H, Mayeli A, Refai H, Paulus M and Bodurka J (2018) Predicting Age From Brain EEG Signals—A Machine Learning Approach. Front. Aging Neurosci. 10:184. doi: 10.3389/fnagi.2018.00184

Results: The stack-ensemble age prediction model achieved R<sup>2</sup> = 0.37 (0.06), Mean Absolute Error (MAE) = 6.87 (0.69) and RMSE = 8.46 (0.59) in years. The correlation between chronological age and predicted age was r = 0.6. The feature importance revealed that age predictors are spread out across different feature types. The NCV approach produced a reliable age estimation, with features behaving consistently across different folds.

Conclusion: Our rigorous ML framework and extensive EEG signal features allow a reliable estimation of chronological age, and BrainAGE. This general framework can be extended to test EEG association with and to predict/study other physiological relevant responses.

Keywords: aging, human brain, EEG, machine learning, feature extraction, BrainAGE

# INTRODUCTION

Brain changes due to age have been studied for decades (e.g., Lindsley, 1939; Harmony et al., 1990; Lao et al., 2004) and, more recently, using genetics (Lu et al., 2017). The term BrainAGE (the difference between predicted age and chronological age) was introduced to examine and capture any disease-related deviations from natural aging by comparing BrainAGE estimates in a disease
group to healthy control group. Magnetic Resonance Imaging (MRI) has been widely used to build predictive models for age by utilizing white matter (WM) and gray matter (GM) properties. Franke et al. (2010) employed T1-weighted (T1w) MRI structural images to establish a framework (using a kernel method for regression) for automatically and efficiently estimating the age of healthy individuals. This framework proved to be a reliable, scanner-independent, and efficient method for age estimation in healthy subjects, yielding a correlation of r = 0.92 between the estimated and the real age in the test samples and a mean absolute error of 5 years. Similarly, Cole et al. (2017) used Deep Learning (DL) to study BrainAGE using both pre-processed and raw T1w MRI images. Their approach predicted age with minimal efforts by achieving a correlation between age and predicted-age of r = 0.96 and MAE = 4.16 years. Also, Valizadeh et al. (2017) obtained R <sup>2</sup> = 0.77 from large healthy subjects (n = 3,144) by training features from various anatomical brain regions. Càmara et al. (2007) studied age-related changes in water self-diffusion in cerebral white matter using Diffusion Tensor Imaging (DTI). Their results revealed white matter changes with age in different brain regions, like the corpus callosum, prefrontal regions, the internal capsule, the hippocampal complex, and the putamen. Functional MRI (fMRI) imaging was also used to predict age alone, or combined with other imaging approach. For instance, Dosenbach et al. (2010) were able to explain up to 55% of their sample variance from the functional MRI connectivity (fcMRI) data. Likewise, Qin et al. (2015) related the developmental changes in the amplitude of low-frequency spontaneous fluctuations in resting-state fMRI to age. They reported MAE of 4.6 years between chronological age and predicted-age. More recently, Liem et al. 
(2017) utilized cortical anatomy and whole-brain functional connectivity for predicting brain-based age, achieving MAE = 4.29 years. Several BrainAGE studies revealed changes and differences among clinical groups. For example, BrainAGE estimates in schizophrenia patients were attributed to accelerated aging when compared to healthy and bipolar subjects (Nenadić et al., 2017). In addition, individuals diagnosed with medically refractory epilepsy had a higher predicted age than healthy subjects (Pardoe et al., 2017).
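The accuracy figures quoted above (MAE, R<sup>2</sup>, r) and the BrainAGE gap itself can all be derived from a pair of age vectors. The following is a minimal sketch, assuming NumPy is available; the function name `brain_age_metrics` and the four example ages are hypothetical and serve only as an illustration, not as data from any cited study.

```python
import numpy as np

def brain_age_metrics(chronological_age, predicted_age):
    """Return the per-subject BrainAGE gap plus MAE, RMSE, R^2 and Pearson r."""
    y = np.asarray(chronological_age, dtype=float)
    y_hat = np.asarray(predicted_age, dtype=float)
    gap = y_hat - y                                   # BrainAGE: predicted minus chronological age
    ss_res = float(np.sum(gap ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return {
        "brain_age_gap": gap,
        "mae": float(np.mean(np.abs(gap))),
        "rmse": float(np.sqrt(np.mean(gap ** 2))),
        "r2": 1.0 - ss_res / ss_tot,                  # coefficient of determination
        "r": float(np.corrcoef(y, y_hat)[0, 1]),      # Pearson correlation
    }

# Hypothetical ages in years for four subjects (illustration only).
metrics = brain_age_metrics([20, 30, 40, 50], [22, 29, 43, 48])
```

A positive gap means the model judges the brain to be older than the chronological age. Note that R<sup>2</sup> computed this way is not the square of r in general, which is one reason studies that report only one of the two are hard to compare directly.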

Herein, we focus on studying BrainAGE using EEG signals. Several studies have demonstrated that EEG features such as EEG rhythmic activity (e.g., delta, theta, alpha, beta, and gamma) change as a function of age (Matthis et al., 1980; Clarke et al., 2001; Marshall et al., 2002; Ashburner, 2007; Cragg et al., 2011). For instance, Benninger et al. (1984) found that the theta band showed an increase in power spectra, while delta exhibited a decrease, for healthy children between 4 and 17 years. Gasser et al. (1988) showed that: (i) the relative power increases with age in fast bands, while it decreases for the slow bands in healthy children and adolescents (6–17 years), (ii) all bands showed an increase in the absolute power except for alpha-2. Analyzing the coherence of EEG in resting state revealed that younger healthy subjects had a lower coherence than elderly ones for theta, alpha-3, beta-2, and beta-2 (Kikuchi et al., 2000). The beta relative power was positively correlated with age for older subjects in the resting eyes-closed condition (Marciani et al., 1994). The alpha reactivity decreased and showed a negative correlation with age in the older group when performing mental tasks (Marciani et al., 1994). The theta power was shown to increase from resting to arithmetic task for the younger group, while decreasing for the older group (Widagdo et al., 1998). Moreover, the delta band beta-3 power showed an increase from resting to arithmetic tasks, while alpha was decreased (Widagdo et al., 1998). A more recent study used four-channel EEG recordings to investigate age-related changes in EEG power from thousands of subjects throughout adulthood (Hashemi et al., 2016). Their findings showed an overall age-related shift in band power from lower to higher frequencies and a gradual slowing of the peak α frequency with age.
Furthermore, studying the source of cortical rhythm suggested that occipital delta and posterior cortical alpha rhythms decrease in magnitude during physiological aging with both linear and nonlinear trends (Babiloni et al., 2006). Age prediction from EEG was studied in Dimitriadis and Salis (2017), where the authors used functional connectivity features from EEG to predict age from 94 healthy subjects. Their results showed an accuracy of R<sup>2</sup> = 0.60 for eyes-open and R<sup>2</sup> = 0.48 for eyes-closed.

The influence of diseases on EEG features has been investigated elsewhere. For instance, Saletu et al. (1995) used the mean EEG power spectrum to study group differences between multi-infarct dementia (MID) and dementia of Alzheimer's disease (AD) and compared them with a healthy group. The MID group showed a significant increase of theta activity in occipital regions and a decrease in alpha activity. Abnormalities in cortical neural synchronization in the delta and alpha bands were observed in subjects with mild cognitive impairment due to AD (ADMCI) and to Parkinson's disease (PDMCI) (Babiloni et al., 2016). Differentiating subjects with Alzheimer's disease from healthy ones was studied in Babiloni et al. (2016). The authors reported 70% accuracy using the power and functional connectivity of cortical sources, which was later improved to 77% using an Artificial Neural Network (Triggiani et al., 2017). **Table 1** provides a summary of studies that specifically reported age prediction performance from brain imaging data.

In this study, we propose a robust and rigorous framework to predict BrainAGE using different features of EEG signals recorded during fMRI. First, we extended a recent open-source EEG feature extraction software in MATLAB (Toole and Boylan, 2017) to provide a feature representation of individual subjects. Then, we applied a set of machine learning (ML) methods to predict age from the features. Our proposed framework and a proof-of-concept analysis revealed that robust BrainAGE predictors span multiple EEG signal features, including separate channels and frequencies. The overall accuracy indicated that EEG BrainAGE is a promising approach to study brain maturity and has the capacity to reveal different factors that affect natural aging.

# METHODS

## Participants

Participants were selected from the first 500 subjects of the Tulsa 1000 (T-1000), a naturalistic study that is assessing and longitudinally following 1,000 individuals, including healthy comparisons and treatment-seeking individuals with mood disorders and/or anxiety, substance use, and eating disorders. The T-1000 aims to determine how disorders of affect, substance use, and eating behavior organize across different levels of analysis with a focus on predictors of long-term prognosis, symptom severity, and treatment outcome (Victor et al., 2018). The T-1000 study is conducted at the Laureate Institute for Brain Research. The human research protocol of the study was approved by the Western Institutional Review Board. All participants provided written informed consent and received financial compensation for participation. As described in detail in Victor et al. (2018), the study participants were screened on the basis of a treatment-seeking history and dimensional psychopathology scores: Patient Health Questionnaire (PHQ-9) ≥ 10 and/or Overall Anxiety Severity and Impairment Scale (OASIS) ≥ 8, Drug Abuse Screening Test (DAST-10) score > 3, or Eating Disorder Screen (SCOFF) score ≥ 2. Each participant underwent approximately 24 h of testing over the course of 1 year including a standardized diagnostic assessment, self-report questionnaires, behavioral and physiological measurements indexing RDoC domains, magnetic resonance imaging focusing on brain structure and reward-related processing, fear processing, cognitive control/inhibition, interoceptive processing, and blood/microbiome collection. Please refer to Figure S1 in the Supplementary Material for detailed information about the demographics of the dataset.

### EEG Recording

EEG signals were recorded simultaneously with fMRI using a 32 channel MR-compatible EEG system arranged according to the international 10–20 system from Brain Products GmbH. ECG signal was recorded using an electrode on the subject's back. In order to synchronize the EEG system clock with the 10 MHz MRI scanner clock, a Brain Products' SyncBox device was utilized. The EEG acquisition temporal resolution, and measurement resolution were 0.2 ms (i.e., 16-bit 5 kS/s sampling) and 0.1 µV respectively. A hardware filtering throughout the acquisition in a frequency band between 0.016 and 250 Hz was applied to EEG signals.

We included EEG data collected from 468 subjects (mean age: 34.8 years, 297 females). One resting EEG-fMRI run was conducted for each subject; lasting 8 min. The participants were instructed to relax and keep their eyes open and fixate on a cross.

Magnetic resonance (MR) images were acquired simultaneously via a General Electric Discovery MR750 whole-body 3 T MRI scanner with a standard 8-channel, receive-only head coil array. A single-shot gradient-recalled EPI sequence with Sensitivity Encoding (SENSE) was employed for the fMRI acquisition. The fMRI data has not been used in this paper.

### EEG Data Preprocessing

For each scan, the EEG data were preprocessed with an in-house script developed in MATLAB. The script was designed to remove the MR gradient artifact and the cardioballistic artifact from the EEG data. The details of the preprocessing script are as follows. The MR gradient artifact was first removed from the EEG data using optimal basis sets (Allen et al., 2000; Delorme and Makeig, 2004; Niazy et al., 2005). Then the EEG data were band-pass filtered between 1 and 70 Hz, down-sampled to 4 ms temporal resolution, and band-stop filtered (1 Hz bandwidth) at the harmonics of the fMRI slice selection frequency (19.5 Hz), the AC power line frequency (60 Hz), and a 26 Hz vibration artifact frequency (Mayeli et al., 2016). Then the cardioballistic artifact was corrected using optimal basis sets subtraction (Niazy et al., 2005), which requires the timing of the artifact cycle. In order to achieve a robust artifact cycle determination, the script determined the artifact cycle using the cardioballistic component directly from the EEG-fMRI data (Wong et al., 2018), which was extracted by independent component analysis (Bell and Sejnowski, 1995) and was automatically identified (Wong et al., 2016).

### EEG Feature Extraction

Feature extraction is an essential phase in any EEG analysis that depends on finding a common feature representation among EEG samples. The existing literature provides an extensive range of feature extraction approaches based on a variety of signal processing methods (Jenke et al., 2014). The choice of a feature extraction method depends on the application of the prediction and on the trade-off between interpretability and performance. For instance, advanced feature extraction methods can be used at the cost of interpretability, and such approaches have been shown to outperform the typical approaches (Dimitriadis and Salis, 2017; Al Zoubi et al., 2018). In our case, BrainAGE emphasizes the interpretation and understanding of the predictors, since the goal is to find those features that influence BrainAGE modeling. Thus, we adopted a set of features similar to that used by Toole and Boylan (2017), which covers a wide range of commonly used EEG features. However, our work takes an extensive approach and surveys all features from all channels and bands without reducing features by averaging, as was done in Toole and Boylan (2017). That is, all features from all possible channels, bands and feature types were extracted from the EEG. In addition, the types of features used here are commonly used in the literature to analyze EEG data, so the interpretation and replication of such features are less challenging than for uncommon features. However, our approach results in a relatively large number of EEG features. Therefore, feature selection and suitable ML algorithms are needed to deduce the important predictors. All features were extracted from each subject independently and arranged in one row/sample.

### General Configuration

EEG bands of interest are [δ = 0.5–4; θ = 4–7; α = 7–13; β = 13–30; W = 0.5–30] Hz using the bipolar montage of the EEG, where W denotes the whole frequency range of the EEG. We denote the EEG time series as x<sup>i</sup>[n], where i = δ, θ, α, β, W is the frequency band and n is the channel index (the total number of channels is N = 31). We selected five types of features: amplitude, range, spectral, connectivity, and fractal dimension. We divided the EEG recording from each subject into 60 s epochs with 50% overlap between epochs (14 epochs). **Figure 1** illustrates the feature extraction process. For each channel, we divided the signal into m epochs and then filtered each epoch into the corresponding frequency bands. A specific feature extraction was applied to each sub-segment, yielding m values. Finally, we estimated the channel-level feature for the corresponding frequency band as the average across all epochs. The process is slightly different for Fractal Dimension (FD) features, since we estimate these features without filtering into the frequency bands.
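The epoch-then-average scheme described above can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the authors' MATLAB code: the helper names `split_into_epochs` and `channel_level_feature` are hypothetical, the sampling rate of 250 Hz is taken from the 4 ms resolution after down-sampling, the signal is synthetic noise, and band filtering is omitted for brevity.

```python
import numpy as np

def split_into_epochs(signal, fs, epoch_s=60.0, overlap=0.5):
    """Return an array of shape (n_epochs, samples_per_epoch) of overlapping epochs."""
    x = np.asarray(signal, dtype=float)
    length = int(round(epoch_s * fs))
    step = int(round(length * (1.0 - overlap)))
    if length <= 0 or step <= 0 or x.size < length:
        raise ValueError("signal too short or invalid epoch settings")
    starts = range(0, x.size - length + 1, step)
    return np.stack([x[s:s + length] for s in starts])

def channel_level_feature(signal, fs, feature_fn, **epoch_kwargs):
    """Apply feature_fn to every epoch and average the m resulting values."""
    epochs = split_into_epochs(signal, fs, **epoch_kwargs)
    return float(np.mean([feature_fn(epoch) for epoch in epochs]))

fs = 250                                        # Hz, assumed from the 4 ms resolution
rng = np.random.default_rng(0)
x = rng.standard_normal(8 * 60 * fs)            # synthetic stand-in for one 8 min channel
epochs = split_into_epochs(x, fs)
mean_std = channel_level_feature(x, fs, np.std)
```

With these assumed settings a full 480 s recording yields 15 windows, whereas the text reports 14 epochs, so the authors' exact boundary handling (for example, trimming at the start or end of the run) presumably differs from this sketch.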

### Amplitude Domain Features

The amplitude features characterize the statistical properties of the signal power A<sup>i</sup><sub>power</sub> and the signal envelope E<sup>i</sup><sub>mean</sub>. We calculated (i) the mean, (ii) standard deviation, (iii) skewness, and (iv) kurtosis for each channel across frequency bands. E<sup>i</sup><sub>mean</sub> is calculated as the mean of the envelope e<sup>i</sup>[n], which is defined in complex notation as e<sup>i</sup>[n] = |x<sup>i</sup>[n] + jH{x<sup>i</sup>[n]}|<sup>2</sup>, where H{·} is the Hilbert transform.
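As a rough illustration of the envelope definition above, the analytic signal x + jH{x} can be obtained with an FFT by zeroing the negative frequencies. The sketch below assumes NumPy only; the function names are hypothetical, the kurtosis is the non-excess form (an assumption, since the text does not say which form was used), and the test signal is a synthetic 10 Hz cosine rather than EEG.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal x + j*H{x}, computed by suppressing negative frequencies."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0                            # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0                   # keep Nyquist once
        weights[1:n // 2] = 2.0                 # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def envelope(x):
    """Squared magnitude of the analytic signal, as in the definition above."""
    return np.abs(analytic_signal(x)) ** 2

def amplitude_stats(values):
    """Mean, SD, skewness and (non-excess) kurtosis of a 1-D array."""
    v = np.asarray(values, dtype=float)
    mu, sd = v.mean(), v.std()
    z = (v - mu) / sd
    return {"mean": float(mu), "sd": float(sd),
            "skewness": float(np.mean(z ** 3)), "kurtosis": float(np.mean(z ** 4))}

fs = 250
t = np.arange(4 * fs) / fs
x = 2.0 * np.cos(2 * np.pi * 10 * t)            # synthetic tone with amplitude 2
env = envelope(x)                               # constant 4 for a pure tone
power_stats = amplitude_stats(x ** 2)           # statistics of the signal power
```

For a pure tone the envelope is flat at the squared amplitude, which makes this a convenient sanity check before applying the same functions to band-filtered EEG epochs.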

### Range Domain Features (rEEG)

Range features account for peak-to-peak voltage changes and characterize changes in the signal over time. To achieve that, we segmented each epoch into short-time portions, each with a window size of w = 2 s and an overlap of 50%. Then, for each segment, we calculated the corresponding peak-to-peak range. This produced a set of samples from each epoch, from which we estimated the mean, median, 5th and 95th percentiles, standard deviation, coefficient of variation and the measure of symmetry.
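The range computation can be sketched as below, assuming NumPy. The function names are hypothetical, and the symmetry measure in particular is an assumed definition (the skew of the 5th and 95th percentiles around the median); the text does not give its formula, so treat that entry as a placeholder.

```python
import numpy as np

def range_eeg(epoch, fs, window_s=2.0, overlap=0.5):
    """Peak-to-peak amplitude in each short overlapping window of one epoch."""
    x = np.asarray(epoch, dtype=float)
    length = int(round(window_s * fs))
    step = int(round(length * (1.0 - overlap)))
    starts = range(0, x.size - length + 1, step)
    return np.array([x[s:s + length].max() - x[s:s + length].min() for s in starts])

def range_features(epoch, fs):
    """Summary statistics of the windowed peak-to-peak values."""
    r = range_eeg(epoch, fs)
    p5, median, p95 = np.percentile(r, [5, 50, 95])
    mean, sd = r.mean(), r.std()
    spread = p95 - p5
    return {
        "mean": float(mean), "median": float(median),
        "p5": float(p5), "p95": float(p95), "sd": float(sd),
        "cv": float(sd / mean),
        # Assumed symmetry definition, not taken from the text.
        "asymmetry": float(((p95 - median) - (median - p5)) / spread) if spread > 0 else 0.0,
    }

fs = 250
t = np.arange(60 * fs) / fs
tone_ranges = range_eeg(np.cos(2 * np.pi * 5 * t), fs)     # unit-amplitude 5 Hz tone
rng = np.random.default_rng(1)
noise_features = range_features(rng.standard_normal(60 * fs), fs)
```

A 60 s epoch with 2 s windows and 50% overlap gives 59 range values, each equal to 2 for the unit-amplitude tone.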

### Spectral Domain Features

Spectral features have been the most commonly used features for EEG. To extract the spectral features, we applied Welch's periodogram to estimate the power spectral density (PSD), using a Hamming window with a length of 2 s and an overlap of 50%. The following spectral features were extracted: (1) power, (2) relative power, (3) entropy (using Wiener and Shannon methods), (4) edge frequency (the cut-off frequency below which 95% of the spectral power is contained), and (5) differences between consecutive short-time spectral estimations.
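A compact way to see how these spectral quantities relate is to compute them from one Welch estimate. The sketch below uses NumPy only and is an assumption-laden illustration: the function names are hypothetical, the edge frequency is evaluated over the whole 0.5–30 Hz range, the Shannon entropy is normalized by the number of bins, and the difference between consecutive spectra (feature 5) is omitted.

```python
import numpy as np

def welch_psd(x, fs, window_s=2.0, overlap=0.5):
    """One-sided Welch PSD with a Hamming window (density scaling)."""
    x = np.asarray(x, dtype=float)
    n = int(round(window_s * fs))
    step = int(round(n * (1.0 - overlap)))
    win = np.hamming(n)
    periodograms = []
    for start in range(0, x.size - n + 1, step):
        seg = x[start:start + n]
        seg = (seg - seg.mean()) * win                  # remove DC, then taper
        periodograms.append(np.abs(np.fft.rfft(seg)) ** 2)
    psd = np.mean(periodograms, axis=0) / (fs * np.sum(win ** 2))
    if n % 2 == 0:
        psd[1:-1] *= 2.0                                # all bins except DC and Nyquist
    else:
        psd[1:] *= 2.0                                  # all bins except DC
    return np.fft.rfftfreq(n, d=1.0 / fs), psd

def spectral_features(freqs, psd, band, whole=(0.5, 30.0), edge=0.95):
    """Power, relative power, entropies and edge frequency for one band."""
    band_sel = (freqs >= band[0]) & (freqs <= band[1])
    whole_sel = (freqs >= whole[0]) & (freqs <= whole[1])
    in_band, in_whole = psd[band_sel], psd[whole_sel]
    p = in_band / in_band.sum()                         # normalized spectrum in the band
    cumulative = np.cumsum(in_whole) / in_whole.sum()
    return {
        "power": float(in_band.sum() * (freqs[1] - freqs[0])),
        "relative_power": float(in_band.sum() / in_whole.sum()),
        "shannon_entropy": float(-np.sum(p * np.log(p)) / np.log(p.size)),
        "wiener_entropy": float(np.exp(np.mean(np.log(in_band))) / np.mean(in_band)),
        "edge_frequency": float(freqs[whole_sel][np.searchsorted(cumulative, edge)]),
    }

fs = 250
t = np.arange(60 * fs) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10 * t) + 0.01 * rng.standard_normal(t.size)   # synthetic alpha tone
freqs, psd = welch_psd(x, fs)
alpha = spectral_features(freqs, psd, band=(7.0, 13.0))
```

For this synthetic 10 Hz tone almost all of the 0.5–30 Hz power falls in the alpha band, and the 95% edge frequency sits close to 10 Hz. In practice a library routine such as SciPy's Welch estimator would normally replace the hand-written `welch_psd`.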

### Connectivity Domain Features

We calculated the brain symmetry index (BSI) as the mean of the PSD difference between the left and right hemispheres for each frequency band (i = δ, θ, α, β, γ).

Let a<sub>i</sub> and b<sub>i</sub> be the lower and upper frequency limits of band i. The BSI for band i is:

$$C_{BSI}^{i} = \frac{1}{b_i - a_i} \sum_{k=a_i}^{b_i} \left| \frac{P_{left}[k] - P_{right}[k]}{P_{left}[k] + P_{right}[k]} \right| \tag{1}$$

with

$$P_{left}[k] = \frac{\sum_{m=1}^{M/2} P_m[k]}{M/2} \text{ and } P_{right}[k] = \frac{\sum_{m=\frac{M}{2}+1}^{M} P_m[k]}{M/2} \tag{2}$$
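Equations (1) and (2) can be illustrated as follows. This is a sketch under stated assumptions: the PSD is held in an array with one row per channel, the left-hemisphere channels occupy the first half of the rows, and the normalization is taken as the mean over the frequency bins in the band. The function name `brain_symmetry_index` is hypothetical.

```python
import numpy as np

def brain_symmetry_index(psd, freqs, band):
    """BSI for one band.

    psd   : array (M channels, n_freqs); left-hemisphere channels first
    freqs : array (n_freqs,) of bin frequencies in Hz
    band  : (lower, upper) frequency limits in Hz
    """
    m = psd.shape[0]
    p_left = psd[: m // 2].mean(axis=0)      # Eq. (2), left average
    p_right = psd[m // 2:].mean(axis=0)      # Eq. (2), right average
    lo, hi = band
    mask = (freqs >= lo) & (freqs <= hi)
    ratio = np.abs((p_left[mask] - p_right[mask]) /
                   (p_left[mask] + p_right[mask]))
    return float(ratio.mean())               # Eq. (1), averaged over bins

freqs = np.arange(0.0, 30.5, 0.5)
# Synthetic PSD: two left channels with power 3, two right channels with power 1
psd = np.vstack([np.full((2, len(freqs)), 3.0),
                 np.full((2, len(freqs)), 1.0)])
bsi_alpha = brain_symmetry_index(psd, freqs, (7.0, 13.0))
bsi_equal = brain_symmetry_index(np.ones((4, len(freqs))), freqs, (7.0, 13.0))
```

A perfectly symmetric spectrum gives 0, and the index approaches 1 as power concentrates in one hemisphere.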

Also, we calculated the median and lag of maximum correlation coefficient of the Spearman correlation between envelopes of hemisphere-paired channels and coherence between channel pairs.

### Fractal Dimension Domain Features

The fractal dimension of a time series estimates to what extent the fractal pattern changes with respect to the scale at which it is measured. We applied the Higuchi method with k = 6 to each EEG channel to estimate the FD.
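For readers unfamiliar with the Higuchi method, a compact sketch follows. It assumes that the value k = 6 quoted above is the maximum scale (often written k<sub>max</sub>); the function name and test signals are illustrative only.

```python
import numpy as np

def higuchi_fd(x, kmax=6):
    """Higuchi fractal dimension of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    ks = np.arange(1, kmax + 1)
    lengths = []
    for k in ks:
        per_offset = []
        for m in range(k):
            idx = np.arange(m, n, k)          # subsampled series x[m], x[m+k], ...
            n_steps = len(idx) - 1
            if n_steps < 1:
                continue
            dist = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / (n_steps * k)    # Higuchi normalization factor
            per_offset.append(dist * norm / k)
        lengths.append(np.mean(per_offset))   # curve length at scale k
    # FD is the slope of log L(k) against log(1/k)
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(lengths), 1)
    return float(slope)

fd_line = higuchi_fd(np.arange(1000.0))                        # smooth ramp
fd_noise = higuchi_fd(np.random.default_rng(0).standard_normal(5000))
```

A straight line has FD close to 1 and white noise has FD close to 2, which brackets the values expected for EEG.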

**Table 2** summarizes the extracted set of features from EEG data.

### Feature Reduction

After feature extraction, we eliminated features that either varied little among subjects or were highly correlated with other features, using the "findCorrelation" function in the "caret" package (Kuhn, 2008), version "6.0-78." The "findCorrelation" function evaluates the pair-wise correlations of the features and finds the highest absolute pair-wise correlation. If two features are highly correlated (Pearson's r ≥ 0.9), it eliminates the one with the highest mean absolute correlation. It should be noted that other feature selection methods could be used to select the best features within the NCV approach. However, the interpretation of such an approach could be challenging, i.e., the features selected in the inner loop of the NCV may vary across folds. In addition, other feature selection methods would have to be applied within each loop of the NCV, which increases the computational overhead. Thus, removing correlated features provides a better way to select features in this case. Figures S2, S3 in Supplementary show the correlation matrices before and after removing the correlated features.
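The correlation-based elimination described above can be approximated in Python as shown below. This is a simplified greedy sketch of the idea, not a port of caret's "findCorrelation"; the function name `drop_correlated` and the synthetic data are assumptions.

```python
import numpy as np

def drop_correlated(X, cutoff=0.9):
    """Return indices of columns kept after removing highly correlated ones.

    Repeatedly finds the most correlated remaining pair; if |r| >= cutoff,
    drops the member with the larger mean absolute correlation.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(corr, 0.0)
    keep = list(range(X.shape[1]))
    while True:
        sub = corr[np.ix_(keep, keep)]
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        if sub[i, j] < cutoff:
            break
        mean_abs = sub.mean(axis=1)
        keep.pop(i if mean_abs[i] >= mean_abs[j] else j)
    return keep

rng = np.random.default_rng(0)
a = rng.standard_normal(200)
X = np.column_stack([a,                                    # feature 0
                     a + 0.01 * rng.standard_normal(200),  # near-copy of 0
                     rng.standard_normal(200)])            # independent
kept = drop_correlated(X, cutoff=0.9)
```

Here one of the two near-duplicate columns is removed and the independent column is retained.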

### Machine Learning Methods

Selecting appropriate ML algorithms is a critical step to achieve robust BrainAGE estimation. With each subject's features represented in one row, the final dataset has dimension n × m, with n = 468 and m = 863. We used the R package "caret" to run a set of regression algorithms: Elastic Net (ENET), Support Vector Regression (SVR), Random Forest (RF), extreme gradient boosting tree (XgbTree), and Gaussian Process with Polynomial Kernel (gaussprPoly). The aim was to test different ML techniques in order to provide a better estimation of age. First, ENET is a linear regression technique that uses L1 and L2 regularization to prevent overfitting. Second, SVR uses optimization to build the regression model, but on a high-dimensional representation of the training data. In our case, we used a radial basis function kernel to project the data into a high-dimensional space. Third, RF is one of the most common ensemble techniques; it subsamples the feature space of the training data to build multiple weak learners. Thus, different models are produced from the training data and then averaged to minimize the variance across models. Fourth, XgbTree combines ensemble learning, optimization, and regularization to build a generalized model from the training data. Finally, gaussprPoly is a probabilistic approach that builds a regression model by learning the distribution of the training data given the response (age). Similar to the kernel function in SVR, gaussprPoly adopts a polynomial kernel to project the data into a high-dimensional space.

### TABLE 2 | The extracted features from EEG data.

To provide an unbiased prediction of age, nested cross-validation (NCV) was adopted in building the age prediction models (Varma and Simon, 2006). **Figure 2** depicts the NCV procedure, which consists of two main loops: the inner and outer loops. The inner loop is used to find the best parameters from the training set, while the outer loop is used to evaluate the best parameters on the testing set. To elaborate on the NCV, let subscripts refer to data and models from the inner loop of the NCV, while superscripts represent those from the outer loop. In our run, we used 10-fold cross-validation for the inner loop (K<sup>I</sup> = 10) and 10-fold cross-validation for the outer loop (K<sup>O</sup> = 10). The inner loop was used to estimate the best parameters on the training data (Tr<sup>l</sup>) using a grid search and the one-standard-error rule. Each inner loop consists of five repeats (R = 5) for each method. The outer loop uses the best obtained models to build a stack-ensemble model. The best models are represented by their best parameters θ<sup>l</sup><sub>i</sub>, where i is the index of the corresponding method M<sup>i</sup> (i = 1..r) and l refers to fold l of the outer loop. The symbol "P" refers to the prediction process associated with each method. Stacking ensemble helps to improve the stability of prediction by combining the predictions from several models; i.e., predictions from the five methods were combined by learning weights via a general linear model (GLM). In detail, the GLM was trained on the resampled predicted age from the inner loop (yTr<sup>l</sup><sub>i</sub>). Then, the GLM was used to provide one weighted-average prediction in 10-fold cross-validation (K<sub>Ens</sub> = 10). From there, the best stack-ensemble model (θ<sup>l</sup><sub>Ens</sub>) was used to predict age for the testing set (ŶTs<sup>l</sup>). That is, the prediction of age is calculated for the individual methods, yTr<sup>l</sup><sub>i</sub> = P<sub>i</sub>(Tr<sup>l</sup>, θ<sub>i</sub>), and then the weighted average is estimated for fold l.

$$\widehat{YTs^l} = P_{\mathrm{Ens}}\left(\left[yTr_1^l,\; yTr_2^l,\; \ldots,\; yTr_r^l\right],\; \theta_{\mathrm{Ens}}^{l}\right)$$
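The weighting step can be illustrated with an ordinary least-squares fit over the base-model predictions. This is a hedged sketch: the five synthetic "base learners" are stand-ins for the five methods, the function names are hypothetical, and the fit is evaluated in-sample only, whereas the study learns the weights on inner-loop predictions and applies them to the held-out fold.

```python
import numpy as np

def fit_stack(base_preds, y):
    """Least-squares weights (with intercept) over base-model predictions."""
    design = np.column_stack([np.ones(len(y)), base_preds])
    weights, *_ = np.linalg.lstsq(design, y, rcond=None)
    return weights

def predict_stack(base_preds, weights):
    """Weighted combination of the base-model predictions."""
    return weights[0] + base_preds @ weights[1:]

rng = np.random.default_rng(0)
age = rng.uniform(18, 58, size=200)                  # synthetic ages
# Synthetic base predictions: true age plus noise of increasing size
base = np.column_stack([age + rng.normal(0, s, size=200)
                        for s in (6, 7, 8, 9, 10)])
w = fit_stack(base, age)
stacked = predict_stack(base, w)
mse_base = np.mean((base - age[:, None]) ** 2, axis=0)
mse_stack = np.mean((stacked - age) ** 2)
```

On the data used to fit the weights, the stacked mean squared error cannot exceed that of any single base model, because each base model is a special case of the weighted combination.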

After iterating over all folds of the outer loop, a prediction of age is available for the entire dataset. In addition, the variable importance of the predictors in the stacking ensemble models was estimated across the outer loop of the NCV. Finally, the predicted and chronological age values were used to estimate the BrainAGE for the dataset. **Figure 3** shows the overall framework for estimating the BrainAGE.
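The two-loop structure can be made concrete with a small, self-contained sketch. Several simplifications are assumptions of this example and not the study's setup: a single ridge regression replaces the five caret learners and the stacking step, the inner loop picks the penalty by lowest summed squared error instead of the one-standard-error rule, there are no repeats, and the data are synthetic. The gap is computed as predicted minus chronological age, the usual BrainAGE convention.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression with an unpenalized intercept."""
    Xb = np.column_stack([np.ones(len(X)), X])
    penalty = lam * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)

def ridge_predict(X, w):
    return np.column_stack([np.ones(len(X)), X]) @ w

def kfold(n, k, rng):
    """Random partition of range(n) into k folds."""
    return np.array_split(rng.permutation(n), k)

def nested_cv(X, y, lams, k_outer=10, k_inner=10, seed=0):
    rng = np.random.default_rng(seed)
    y_hat = np.empty(len(y))
    for test in kfold(len(y), k_outer, rng):
        train = np.setdiff1d(np.arange(len(y)), test)
        # Inner loop: choose the penalty using the training fold only
        inner_err = np.zeros(len(lams))
        for val_pos in kfold(len(train), k_inner, rng):
            val = train[val_pos]
            fit = np.setdiff1d(train, val)
            for j, lam in enumerate(lams):
                w = ridge_fit(X[fit], y[fit], lam)
                inner_err[j] += np.sum((ridge_predict(X[val], w) - y[val]) ** 2)
        best = lams[int(np.argmin(inner_err))]
        # Outer loop: refit on the whole training fold, predict held-out fold
        w = ridge_fit(X[train], y[train], best)
        y_hat[test] = ridge_predict(X[test], w)
    return y_hat

rng = np.random.default_rng(1)
X = rng.standard_normal((120, 5))                       # synthetic features
age = 35 + X @ np.array([3.0, -2.0, 1.0, 0.0, 0.0]) + rng.standard_normal(120)
predicted = nested_cv(X, age, lams=[0.01, 1.0, 100.0])
brain_age_gap = predicted - age                         # predicted minus chronological
r2 = 1 - np.sum((age - predicted) ** 2) / np.sum((age - age.mean()) ** 2)
```

Every subject receives exactly one out-of-fold prediction, so the resulting gap is never computed from a model that saw that subject during training or tuning.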

### RESULTS

The NCV R-squared performance of the stack-ensemble and the underlying methods is shown in **Figure 4**. The individual performance of each ML method was calculated before the stack-ensemble phase. The results showed that SVR with a radial kernel achieved the best individual accuracy, with R<sup>2</sup> = 0.34 (0.056), MAE = 7.01 (0.68) years, and RMSE = 8.7 (0.63) years. The stack-ensemble improved the overall performance, with R<sup>2</sup> = 0.37 (0.064), MAE = 6.87 (0.69) years, and RMSE = 8.46 (0.59) years.

The correlation between predicted age and age is shown in **Figure 5**, while the BrainAGE variable is plotted in **Figure 6**.

Feature importance was estimated such that the importance scores sum to 100 within each fold of the outer loop of the NCV. The importance scores were then averaged across folds, and we report the mean across all folds. **Figure 7** shows the top 15 predictors of age. The color of the bars represents the Pearson's correlation between each predictor and age. From the graph, we can see that "spectral flatness of beta band from channel TP9" is the most important predictor of age, with r = 0.34. Please refer to Figure S4 in Supplementary for detailed plots of the relationship between the top predictors and age.
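The normalization and averaging of importance scores amounts to the following small computation. The raw scores are made-up numbers for two folds and three features, and the function name is hypothetical.

```python
import numpy as np

def average_importance(raw_scores):
    """Scale each fold's scores to sum to 100, then average across folds.

    raw_scores : array (n_folds, n_features) of non-negative importances
    """
    raw = np.asarray(raw_scores, dtype=float)
    per_fold = 100.0 * raw / raw.sum(axis=1, keepdims=True)
    return per_fold.mean(axis=0)

raw = np.array([[2.0, 1.0, 1.0],     # fold 1 -> 50, 25, 25
                [6.0, 3.0, 1.0]])    # fold 2 -> 60, 30, 10
mean_importance = average_importance(raw)   # 55, 27.5, 17.5
```

Because each fold sums to 100 before averaging, the averaged scores also sum to 100 and can be read as percentage contributions.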

The relationship between chronological age and the top features was studied with the Partial Dependence Plot (PDP) (Friedman et al., 2001). For each training model, the consistency across folds was examined by overlaying the PDP curves, since the same feature should behave similarly across the folds of the outer loop of the NCV. **Figure 8** shows the PDP for the top feature. As can be seen, the PDPs of the individual folds (thin lines) behave consistently. Figure S6 in Supplementary shows PDPs for the top features.
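A partial dependence curve can be computed for any fitted model by fixing one feature at each grid value and averaging the predictions over the data. The sketch below uses a toy linear model as a stand-in for a trained fold model; `partial_dependence` and the toy data are assumptions for illustration.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Average prediction as one feature is set to each value in `grid`."""
    X = np.asarray(X, dtype=float)
    curve = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value        # overwrite the feature for all rows
        curve.append(predict(X_mod).mean())
    return np.array(curve)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
toy_model = lambda data: 2.0 * data[:, 0] + data[:, 1]   # stand-in predictor
grid = np.array([-1.0, 0.0, 1.0])
pdp = partial_dependence(toy_model, X, feature=0, grid=grid)
```

Computing one such curve per outer-loop model and overlaying them gives the fold-consistency check described above.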


To show the spatial distribution of feature importance, MNE software (Gramfort et al., 2013) was used. More specifically, the feature importance scores obtained from the NCV were averaged based on the feature type and categorized based on the frequency bands. The resultant mapping for the feature importance scores is shown in **Figure 9**. Figure S5 in Supplementary presents the PDPs for the top features.

Finally, we considered the effect of the number of samples on the performance of age prediction by testing our framework on different numbers of samples. **Figure 10** plots the NCV R<sup>2</sup> as a function of the number of samples in our dataset.

### DISCUSSION

In the discussion part, we address the results, our research goals and elaborate on different implementation details. In addition, we compare our results with related work and point out various aspects of differences.

FIGURE 7 | The top 15 important features for predicting age, sorted from most important (bottom) to least important (top). The vertical axis shows the scoring values from the stack-ensemble model predictor, while the color indicates the correlation between each feature and age.

### Age-Related Changes Are Affecting Brain EEG Signals

Results suggest that aging indeed affects human brain EEG signals. We have also determined that a comprehensive feature extraction from EEG signals is required to capture the relationship between chronological age and the age predictors. This suggests that aging is reflected broadly in the EEG signals without a single predominant feature, and also that the EEG predictors used here are influenced by age and/or disease through different mechanisms. In addition to feature extraction, selecting the best features is important to improve the performance and reduce the complexity of the model. We eliminated the correlated features to select the best features, which improved the overall R<sup>2</sup>. Our removal of correlated features preserves the consistency among NCV folds and, importantly, eases the interpretation of the results. The age-related changes in EEG are strongly supported by the literature (Benninger et al., 1984; Gasser et al., 1988; Marciani et al., 1994; Widagdo et al., 1998; Kikuchi et al., 2000; Babiloni et al., 2006; Hashemi et al., 2016) and by our results as well, where the correlation between the top four features and age was relatively high, with r = 0.34, 0.3, 0.26, and 0.24, respectively.

### Can Age Be Predicted From EEG Signals?

Using NCV to obtain an unbiased prediction of age, we were able to achieve a reasonable accuracy. The best individual results were obtained by SVR (R<sup>2</sup> = 0.34) and were slightly improved by the stack-ensemble approach (R<sup>2</sup> = 0.37). The correlation between predicted age and age (r = 0.6) shows the ability of our model to predict age. The overall feature importance scores were extracted for each fold in the outer loop of the NCV and then averaged across all folds. The feature importance showed that the important predictors are spread out across different feature types and bands. In addition, we used PDPs to examine the consistency of features across the outer loops of the NCV, where we showed that the top features behave similarly across folds.

The effect of the number of samples on prediction accuracy is shown in **Figure 10**. The graph indicates that adding more samples may improve performance. When testing on 50 samples, the overall accuracy was R<sup>2</sup> ≈ 0.26, which shows that the features are informative for predicting age even from a small number of samples. It should be noted that our sample size is smaller than those of other works, especially those that used MRI.

We found no differences in age prediction between the female and male groups. Both groups had closely matched average chronological ages: 34.47 (10.65) years for the female group and 35.29 (10.47) years for the male group. The average predicted age was 34.78 (6.87) years for the female group and 35.11 (6.04) years for the male group. The MAE was 6.99 (5.10) and 6.66 (4.77) years for the female and male groups, respectively.

Mapping the spatial distribution of feature importance scores revealed that age predictors do not correspond uniquely to specific channels, frequency bands, or a specific feature domain. That is, different feature types capture some characteristics of the EEG, but not the whole relationship. For example, **Figure 7** showed that among the top 15 important features, the spectral features are positively correlated with age, while the rEEG features are negatively correlated. That is, each feature type captures a specific aspect of the relationship with age. Thus, providing heterogeneous features can improve the predictability of age. This is also supported by **Figure 9**, where the spatial distribution of feature importance scores does not exhibit a uniform representation. Our analysis shows that the relative contribution to feature importance is 46, 31, 18, 3, and 2% for spectral, rEEG, amplitude, FD, and connectivity features, respectively. It should be noted that the number of features is not the same across domains, which is especially the case for the FD and connectivity features. Similarly, feature contributions are also spread out across bands as follows: 31, 21, 27, and 18% for theta, delta, alpha, beta, and theta, respectively.

### Comparison With Other Works

Predicting age from EEG features was also studied in Dimitriadis and Salis (2017). They reported a higher prediction accuracy (0.6) than the one obtained in the current study (0.4). A number of differences may contribute to this disparity. Perhaps the most significant is that they seem to have performed feature selection using the response variable and the entire dataset, which will generally lead to more optimistic evaluations than performing feature selection within a nested cross-validation framework, as done here. Additionally, we report R<sup>2</sup> as 1 − SS<sub>resid</sub>/SS<sub>total</sub> (where SS<sub>resid</sub> is the sum of squared residuals from the regression and SS<sub>total</sub> is the total sum of squared differences from the mean) taken from the model prediction, while they seem to have reported the R<sup>2</sup> of a line fit through Age vs. Predicted Age. Other differences include the feature sets used and the fact that our data were collected during fMRI, which may leave some residual artifact. Furthermore, we use interpretation-friendly features here.
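The difference between the two R<sup>2</sup> definitions is easy to see on a toy example. In the sketch below the ages and predictions are made-up values: the predictions are perfectly correlated with age but shrunk toward the mean, so a line fit through age vs. predicted age is perfect while the model-based R<sup>2</sup> is not.

```python
import numpy as np

def r2_model(y, y_hat):
    """R^2 as 1 - SS_resid / SS_total, computed from the predictions."""
    ss_resid = np.sum((y - y_hat) ** 2)
    ss_total = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_resid / ss_total

def r2_line_fit(y, y_hat):
    """R^2 of a straight line fitted through y vs. y_hat (squared Pearson r)."""
    return np.corrcoef(y, y_hat)[0, 1] ** 2

age = np.array([20.0, 30.0, 40.0, 50.0, 60.0])
predicted = 40.0 + 0.5 * (age - 40.0)         # shrunk toward the mean age

print(r2_model(age, predicted))               # 0.75
print(round(r2_line_fit(age, predicted), 6))  # 1.0
```

The line-fit version ignores systematic bias and scaling in the predictions, so it can only be equal to or higher than the model-based value whenever the latter comes from predictions that are not themselves a least-squares fit.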

Predicting age from functional brain imaging is probably more challenging than from structural imaging. One can notice from **Table 1** that fMRI generally yields a lower performance than MRI data. The best result was reported by Cole et al. (2017), with r = 0.96 from structural imaging of healthy subjects. EEG and fMRI are both functional imaging modalities, and thus it is more appropriate to compare EEG results with fMRI results. Our method's performance is lower than those of the fMRI works reported in Dosenbach et al. (2010), with R<sup>2</sup> = 0.55, and Qin et al. (2015), with MAE = 4.6 years. Without a direct comparison between EEG and fMRI on the same dataset, it is hard to draw conclusions about the amount of information that each domain carries. Although fMRI/MRI may yield a higher accuracy, it comes at extra cost and with less portability compared to EEG.

The contribution of some features to BrainAGE is in line with previous works (Chiang et al., 2011; Zappasodi et al., 2015). For instance, our findings show the negative correlation between age and alpha power spectra in healthy groups that was reported in Chiang et al. (2011). This correlation trend could also be observed in other frequency bands, especially the delta and theta bands. FD is positively correlated with age for healthy subjects, which is consistent with the finding in Zappasodi et al. (2015). However, Zappasodi et al. (2015) showed that FD increases for ages from 20 to 50 years and then decreases. Since our upper age limit is 58 years, the pattern is increasing overall over the range from 18 to 58 years. Figures S6, S7 in Supplementary provide a spatial mapping of the correlation values between the spectral and FD features and age.

### CONCLUSIONS

We have introduced a rigorous framework for BrainAGE estimation based on EEG brain signals. A proof-of-concept analysis showed that it is possible to build a robust BrainAGE estimation by harnessing both an extensive EEG feature representation and suitable ML algorithms. ML and NCV play a significant role in identifying informative features, studying the spatial distribution of significant predictors, and providing unbiased prediction. In addition, we showed how to evaluate and interpret the results using feature importance scores and partial dependence plots. The introduced framework can be extended to test associations with, and to predict, other physiologically relevant measures based on EEG brain signals.

# AUTHOR CONTRIBUTIONS

All authors contributed significantly to work regarding conception and design; acquisition, analysis; drafting the work and revised critically; approval of the version to be published, and carrying the responsibility for achieving the accuracy or integrity of any part of the work.

### FUNDING

The study was supported by the Laureate Institute for Brain Research and William K. Warren foundation, and in part by the W81XWH-12-1-0697 grant from the U.S. Department of Defense, and the P20 GM121312 award from National Institute of General Medical Sciences, National Institutes of Health.

# ACKNOWLEDGMENTS

We are particularly grateful to Dr. Vadim Zotev for continuous help with EEG hardware, Julie Owen, Bill Alden, Julie DiCarlo, and Greg Hammond for helping with MRI and EEG-fMRI scanning. We would like to thank Dr. Patrick Britz, Dr. Robert Störmer, Dr. Mario Bartolo, and Dr. Brett Bays of Brain Products, GmbH for their help and technical support.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnagi.2018.00184/full#supplementary-material

### REFERENCES

Kuhn, M. (2008). Caret Package. J. Stat. Softw. 28, 1–26. doi: 10.18637/jss.v028.i05


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Al Zoubi, Ki Wong, Kuplicki, Yeh, Mayeli, Refai, Paulus and Bodurka. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Biological Brain Age Prediction Using Cortical Thickness Data: A Large Scale Cohort Study

Habtamu M. Aycheh<sup>1†</sup>, Joon-Kyung Seong<sup>2†</sup>, Jeong-Hyeon Shin<sup>2</sup>, Duk L. Na<sup>3,4,5</sup>, Byungkon Kang<sup>1</sup>, Sang W. Seo<sup>3,4,5</sup>\* and Kyung-Ah Sohn<sup>1</sup>\*

<sup>1</sup> Department of Software and Computer Engineering, Ajou University, Suwon, South Korea, <sup>2</sup> School of Biomedical Engineering, Korea University, Seoul, South Korea, <sup>3</sup> Department of Neurology, Samsung Medical Center, School of Medicine, Sungkyunkwan University, Seoul, South Korea, <sup>4</sup> Neuroscience Center, Samsung Medical Center, Seoul, South Korea, <sup>5</sup> Department of Health Sciences and Technology, Clinical Research Design and Evaluation, SAIHST, Sungkyunkwan University, Seoul, South Korea

Brain age estimation from anatomical features has been attracting more attention in recent years. This interest in brain age estimation is motivated by the importance of biological age prediction in health informatics, with an application to early prediction of neurocognitive disorders. It is well-known that normal brain aging follows a specific pattern, which enables researchers and practitioners to predict the age of a human's brain from its degeneration. In this paper, we model brain age predicted by cortical thickness data gathered from large cohort brain images. We collected 2,911 cognitively normal subjects (age 45–91 years) at a single medical center and acquired their brain magnetic resonance (MR) images. All images were acquired using the same scanner with the same protocol. We propose to first apply Sparse Group Lasso (SGL) for feature selection by utilizing the brain's anatomical grouping. Once the features are selected, a non-parametric non-linear regression using the Gaussian Process Regression (GPR) algorithm is applied to fit the final age prediction model. Experimental results demonstrate that the proposed method achieves the mean absolute error of 4.05 years, which is comparable with or superior to several recent methods. Our method can also be a critical tool for clinicians to differentiate patients with neurodegenerative brain disease by extracting a cortical thinning pattern associated with normal aging.

Edited by: James H. Cole, King's College London, United Kingdom

### Reviewed by:

Heath R. Pardoe, New York University, United States Franziskus Liem, Universität Zürich, Switzerland

### \*Correspondence:

Sang W. Seo sangwonseo@empas.com Kyung-Ah Sohn kasohn@ajou.ac.kr

†These authors have contributed equally to this work

Received: 30 December 2017 Accepted: 31 July 2018 Published: 22 August 2018

### Citation:

Aycheh HM, Seong J-K, Shin J-H, Na DL, Kang B, Seo SW and Sohn K-A (2018) Biological Brain Age Prediction Using Cortical Thickness Data: A Large Scale Cohort Study. Front. Aging Neurosci. 10:252. doi: 10.3389/fnagi.2018.00252

Keywords: aging, cortical thickness, cortical lobe, regression analysis, ROI, Sparse Group Lasso, Gaussian process

### INTRODUCTION

Aging is a biological process that exhibits distinct attributes from childhood to old age. Human brain aging is affected by progressive and regressive neuronal processes due to cell growth and death (Silk and Wood, 2011). Moreover, environmental factors and health conditions affect structural changes in the brain (Pannacciulli et al., 2006; Chee et al., 2009; Ziegler et al., 2012). Thus, the structure of the brain changes continuously throughout the life span. Human brain degeneration follows a specific pattern during the normal aging process (Seidman et al., 2004; Fjell et al., 2009; Fjell and Walhovd, 2010). This provided the groundwork for studies that predict brain age from brain atrophy patterns. The majority of these studies were inspired by the clinical benefits of biological brain age estimation for the early prediction of neurocognitive disorders. For example, diseases related to Alzheimer's that change brain aging patterns can be examined. According to neuroscience studies, the brain can be macro-anatomically grouped into six major lobes: frontal, temporal, parietal, limbic, occipital, and insula (Allen et al., 2005). The rate of decline of cortical thickness in the elderly varies across cortical lobes (Resnick et al., 2003; Seidman et al., 2004; Allen et al., 2005; Fjell et al., 2009; Fjell and Walhovd, 2010; Lemaitre et al., 2012; Ziegler et al., 2012; Ruigrok et al., 2014). These studies identified variability in the brain aging process from person to person and among age groups (neonatal, youth, and adult ages). This evidence indicates the variability of anatomical measurement trajectories in different cortical regions. The demographic variables of education and gender are also confounding factors that influence cortical thickness (Tang et al., 2013; Li et al., 2014; Mortby et al., 2014; Ruigrok et al., 2014; Ritchie et al., 2017; Belathur Suresh et al., 2018; Thow et al., 2018).
In particular, the recent work in Belathur Suresh et al. (2018) and Thow et al. (2018) demonstrated the impact of education on cortical thickness for the accuracy of Alzheimer's disease detection. These findings indicate the potential of gender and education as predictive variables in addition to brain anatomical features. We therefore also include these demographic variables in our study.

Brain age prediction is an active research area. There have been continuous research efforts in the estimation of human biological brain age using magnetic resonance imaging (MRI). Currently, there are two major trends in using brain MRI to predict age: (1) using raw image data and (2) using cortical anatomical measures. The acquisition of both data sources is related to the arguments for and against using surface-based or voxel-based registration methods, as described in Clarkson et al. (2011). The works in both directions are continually improving and have significantly assisted practitioners and researchers in the neurology-related research domain. Interesting and fruitful brain-age-prediction results are presented in Ashburner (2007), Franke et al. (2010), Cole et al. (2015, 2017), Kondo et al. (2015), Wang et al. (2015), Alam et al. (2016), Cherubini et al. (2016), Cole and Franke (2017), Liem et al. (2017), Valizadeh et al. (2017), and Madan and Kensinger (2018). Recently, the prediction of brain age from 3D raw image data using deep learning, presented by Cole et al. (2017), showed a promising result. Cole et al. (2017) used convolutional neural network (CNN) algorithms and obtained a best mean absolute error (MAE) of 4.16 years. This result is an improvement on their prior work on brain age prediction using Gaussian Process Regression (GPR), which had an MAE = 4.66 years (Cole et al., 2015). The prediction of brain age using surface-based features has also been studied (Kondo et al., 2015; Wang et al., 2015; Liem et al., 2017; Valizadeh et al., 2017; Madan and Kensinger, 2018). The recent study by Madan and Kensinger (2018) compared different parcellation approaches to extract explanatory features for brain age prediction from MRIs. They reported a median absolute error (MdAE) of 6–7 years using Relevance Vector Regression (RVR).
This prediction result was obtained with the combination of cortical thickness and fractal dimension. Another recent study by Valizadeh et al. (2017) presented a detailed feasibility analysis of age prediction from surface-based measures. They described brain age prediction using anatomical measures such as cortical thickness, surface area, cortical volume, and their combinations from human brain MRI using 148 regional cortex compartments. Their overall analysis showed the plausibility of age prediction from brain surface-based features with high accuracy. The best performance was obtained using a neural network prediction model, where the prediction errors were similar to prior results reported in Wang et al. (2015) and Cherubini et al. (2016). The analysis by Valizadeh et al. (2017) revealed an additional important point: prediction error increases with increasing age, specifically in older adults. Wang et al. (2015) used an RVR model (Tipping, 2001) to estimate age on the basis of different anatomical measures such as cortical surface area, cortical thickness, mean curvature, Gaussian curvature, and a combination of these measures, using 148 regions of interest (ROIs). The best performance was obtained with a combination of cortical thickness and curvature features, with a reported root mean square error (RMSE) of 5.57 years. The findings of Wang et al. (2015) support the idea that, among surface-based features, cortical thickness is more informative for age-related morphometric changes across the life span than other types of features. The age prediction algorithm reported in Kondo et al. (2015) also used RVR based on a local feature extraction approach from T1-weighted MRI. They used 90 local regions of white matter, gray matter, and cerebrospinal fluid (CSF) to reduce the requirement for high-order features when combining brain anatomical features for age estimation to simplify the medical implications.
There are also several studies that have investigated the potential of functional connectivity measures derived from resting state functional MRI (rsfMRI) data for the brain age prediction (Dosenbach et al., 2010; Vergun et al., 2013; Smyser et al., 2016; Liem et al., 2017). In these studies, functional connectivity measures were derived from rsfMRI data based on regions of interest (ROIs) defined by different parcellation methods and the support vector regression was adopted to build age prediction models.

In general, the major components of brain age estimation from MRI are feature extraction, feature selection or identification of the explanatory variables from a brain feature dataset, and the regression model for fitting the target age. According to state-of-the-art studies, a consistent age-to-brain-development relationship pattern is exhibited by surface-based brain features (Clarkson et al., 2011; Madan and Kensinger, 2018). Further, model overfitting is one of the challenging issues in brain age estimation using cortical measures, due to the inherent correlation among brain features. Wang et al. (2015) and Valizadeh et al. (2017) employed principal component analysis (PCA) for dimension reduction and feature extraction. Kondo et al. (2015) used a local feature extraction approach. For model fitting, the majority of the works focused on kernel-based regression, specifically RVR models for the prediction of age from brain anatomical features. The recent implementation of deep learning algorithms also benefits from the automatic feature learning property of the algorithm when the sample size is sufficient (Cole et al., 2017).

In this paper, we propose the prediction of brain age based on cortical thickness data by first applying Sparse Group Lasso (SGL) (Simon et al., 2013) for selecting important features from each major cortical lobe and then using GPR (Rasmussen and Williams, 2006) for fitting the age prediction model. The rate of decline in cortical thickness can differ in each cortex lobe. Human aging is related to a healthy brain-change pattern within the respective cortical lobes. SGL is a regularized regression method for grouped variables that supports feature selection on a group level and within group level. As such, SGL is robust and consistent in feature selection. Thus, SGL is an appropriate approach to select explanatory features on and within the cortical lobes. Then, we deploy GPR, which is a non-parametric nonlinear regression method to predict the target subject's brain age based on the selected features. We obtained prediction accuracy of MAE = 4.05 years by using the proposed method, which is comparable with or superior to several recent methods.

# METHODS

### Study Participants

Study participants were recruited from the Health Promotion Center of Samsung Medical Center (SMC), Seoul, Korea. The study population comprised men and women 45 years of age or older who underwent a comprehensive health screening exam between January 1, 2009 and December 31, 2014. Of the eligible participants, 3,370 attended a preventative medical check-up, which included an assessment of cognitive function and dementia status. All study participants underwent a high-resolution 3.0 Tesla (3T) brain MRI, which included three-dimensional (3D) volume images as a part of their dementia assessment. The assessment procedure used for the participants has been described in detail elsewhere (Lee et al., 2016b). Participants were excluded for meeting disqualifying conditions: 202 participants with missing educational data or missing Mini-Mental State Examination (MMSE) score; 178 participants who showed significant cognitive impairment, defined by MMSE scores below the 16th percentile in age-, sex-, and education-matched norms or through an interview conducted by a qualified neurologist; and 136 participants with unreliable analyses of cortical thickness due to head motion, blurring of the MRI, inadequate registration to a standardized stereotaxic space, misclassification of tissue type, or inexact surface extraction, for which the image preprocessing and cortical thickness computation process were manually checked and corrected by an expert neuroanatomist. Participants were also excluded if they had a cerebral, cerebellar, or brainstem infarction; hemorrhage; brain tumor; hydrocephalus; severe cerebral white matter hyperintensities (deep white matter ≥2.5 cm and caps or band ≥1.0 cm); or severe head trauma by personal history. The final sample size was 2,911 healthy individuals (1,460 males and 1,451 females). All 2,911 participants underwent a 3T brain MRI using the same type of scanner with the same scan parameters.
We parcellated the cerebral cortex into 148 cortical ROIs based on the Destrieux Atlas (http://surfer.nmr.mgh.harvard.edu/fswiki/CorticalParcellation). For each of the 148 cortical ROIs, the average cortical thickness was computed. Because we learn the prediction model for brain age from the chronological ages of healthy individuals, we further detected and excluded outlier samples to minimize the potential bias from individuals with unknown health conditions. After excluding noise and outliers (as explained in section Filtering Outliers From Cortical Thickness Data), 2,705 observations (1,368 males and 1,337 females) remained. The age range of the subjects in this study was 45–91 years. The mean age of the subjects was 64.2 years, with a standard deviation of 7.1 years (male: 65.2 ± 6.9 years; female: 63.1 ± 7.2 years). See **Table 1** for more details.

This study was approved by the Institutional Review Board at the Samsung Medical Center. The requirement for informed consent was waived, as we only used de-identified data collected for clinical purposes during the health screening exams.

### Image Acquisition and Preprocessing

3D T1-weighted Turbo Field Echo MRI images were acquired from all participants in this study using a Philips 3T Achieva MRI scanner with the same imaging parameters (sagittal slice thickness 1.0 mm, over-contiguous slices with 50% overlap; no gap; repetition time 9.9 ms; echo time 4.6 ms; flip angle 8°; and matrix size 240 × 240 reconstructed to 480 × 480 over a 240-mm field of view).

For each subject, we first performed image preprocessing using FreeSurfer v5.1.0 (Athinoula A. Martinos Center at the Massachusetts General Hospital, Harvard Medical School; http://surfer.nmr.mgh.harvard.edu/). FreeSurfer was used to volumetrically segment and parcellate the cortex from T1-weighted images (Dale et al., 1999; Fischl et al., 1999, 2002; Desikan et al., 2006; Destrieux et al., 2010; Fischl, 2012; Klein and Tourville, 2012). We first constructed the outer and inner cortical surface meshes from the MR volume of each subject. The two meshes are isomorphic, with the same vertices and connectivity, because the outer surface was constructed by deforming the inner surface. To establish inter-subject correspondence, we resampled each subject's cortical surface to 40,962 vertices per hemisphere using a previously proposed method (Cho et al., 2012).

To smooth the cortical thickness data, we adapted the noise removal procedure proposed by Cho et al. (2012) to our problem setting. Cho et al. (2012) employed the manifold harmonic transform (MHT) to represent the cortical thickness data in terms of its spatial frequency components (Vallet and Lévy, 2008). For the transform, the Laplace-Beltrami operator is used to obtain the basis functions, which makes it possible to suppress noise by filtering out high-frequency components (Cho et al., 2012). Because the high-frequency components of the transformed cortical thickness data were regarded as noise, those components were filtered out, and the cortical thickness data were then reconstructed using only the low-frequency components (Chung et al., 2007).

The mean cortical thickness values of the 148 distinct ROIs were computed from each subject's brain MRI and used as predictor variables. The confounding variables of gender and education were also included among the predictors because both are associated with cortical thinning in normal aging (Tang et al., 2013; Li et al., 2014; Mortby et al., 2014; Ruigrok et al., 2014; Ritchie et al., 2017; Belathur Suresh et al., 2018; Thow et al., 2018). The gender feature is a binary variable that indicates whether the subject is male (0) or female (1). The education feature is a numeric value that reflects the level of education the subject has attained, which is related to the number of years of study (zero indicates no education, and a higher number of years of study indicates a higher level of education). Thus, we had 150 explanatory variables.

### Filtering Outliers From Cortical Thickness Data

In this study, we used mean cortical thickness data extracted from brain images for brain age prediction. The proposed model is a supervised learning method whose response variable is chronological age, under the assumption that chronological age and brain age are the same for a healthy subject. Outliers can significantly affect the prediction accuracy of a model when non-robust statistical methods are used. Moreover, cortical thickness measures can deviate from the expected range because of overlooked health factors, lifestyle conditions, environmental effects, and other related factors that affect cortical thickness. Most importantly, we are interested in outliers arising from subjects who may have been included in the study sample because of bias or latent conditions not captured by the cognitive health assessment. Risk factors related to cognitive health are elevated in older adults, so our target sample (age 45–91 years) requires additional attention regarding the reliability of the subjects' cognitive health. Accordingly, we investigated the effects of outliers in the dataset by using an outlier-filtering method suited to the representation of our data.

The choice of outlier detection method depends predominantly on the nature and representation of the dataset, which requires careful investigation of the problem domain, such as the presence of small perturbations in the values. In this study, we adapted the Local Outlier Factor (LOF) method to our dataset (Breunig et al., 2000). LOF is a density-based unsupervised outlier detection method that compares each point to its local neighborhood instead of to the global data distribution. The age range in our study was from 45 to 91 years. In this age range, cortical thickness gradually decreases with increasing age. Thus, LOF can be used to identify outliers once subjects of similar age are grouped together. Accordingly, we grouped ages into eight intervals: 45–49, 50–54, ... LOF was then applied to each interval to identify outliers. We checked for outliers per age group because cortical thickness changes gradually with age, and we needed to account for the expected variability between the youngest subjects (approximately age 45) and the oldest subjects (around age 91) in the target sample.

In our dataset, we had N = 2,911 subjects and p = 148 features (mean cortical thickness values) for each subject. The demographic features, gender and education, were not included in this step. We checked for outliers in each subset of the 2,911 observations defined by the specified age groups. That is, we had sub-datasets $S_{n_i \times p}$, where $n_i$ is the number of observations in the given age group. Based on this, the outlier scores were computed using the LOF algorithm as stated in Breunig et al. (2000). LOF uses the k-nearest-neighbor approach to compute the outlier scores, with k set heuristically. The algorithm compares the density of each point to the density of its k closest neighbors. Note that the distance between two points is the distance between two vectors $X_1 \in \mathbb{R}^p$ and $X_2 \in \mathbb{R}^p$. A higher LOF score indicates a potential outlier. An illustration of our proposed outlier filtering approach is displayed in **Figure 2**. After outlier filtering, 2,705 samples remained for the training set and testing set. The number of identified outliers for each age group is shown in **Figure 3**.
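The per-age-group LOF scoring described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes Euclidean distance, exactly k nearest neighbors per point (ties broken by index), distinct points within a group, and uniform 5-year bins starting at age 45. The function names (`lof_scores`, `lof_by_age_group`), the bin width, and the choice of k are hypothetical, and the paper does not state the value of k or the score threshold it used.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def lof_scores(points, k):
    """Local Outlier Factor for each point (Breunig et al., 2000).

    Simplification (assumption): each neighborhood holds exactly k points,
    and the points are assumed distinct so no reachability sum is zero.
    """
    n = len(points)
    dist = [[euclidean(p, q) for q in points] for p in points]
    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        nn = order[:k]
        neighbors.append(nn)
        k_dist.append(dist[i][nn[-1]])  # distance to the k-th nearest neighbor
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean ratio of the neighbors' density to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]

def lof_by_age_group(ages, features, k, width=5, start=45):
    """Score each subject against the other subjects in the same age bin."""
    groups = {}
    for idx, age in enumerate(ages):
        groups.setdefault((age - start) // width, []).append(idx)
    scores = [None] * len(ages)
    for members in groups.values():
        if len(members) <= k:
            continue  # too few subjects to form a k-neighborhood; left unscored
        group_scores = lof_scores([features[i] for i in members], k)
        for i, s in zip(members, group_scores):
            scores[i] = s
    return scores

# Toy data (hypothetical): six subjects aged 45-49, one of them far from the rest,
# plus a single subject in the 50-54 bin.
ages = [46, 47, 48, 46, 49, 47, 52]
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0), (9.0, 9.0)]
scores = lof_by_age_group(ages, features, k=2)
```

Scores close to 1 indicate a point whose local density matches that of its neighbors; markedly larger scores flag candidates for removal. Any cut-off applied to these scores would be a further assumption.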

# Brain Age Modeling Methods

Brain age prediction is a supervised regression problem. The response variable is chronological age. The predictor variables are the mean cortical thickness measures of the 148 ROIs and the confounding factors of gender and education. The main challenge of brain age prediction from brain anatomical features is overfitting due to the correlation among brain features. Thus, model parsimony is crucial for both prediction and inference. Balancing the tradeoff between bias and variance helps to overcome overfitting when analyzing the explanatory variables used in the brain age model. Accordingly, the main concern of our approach is selecting the most important predictor variables from the existing features while maintaining good generalization.

The prediction accuracy of brain age using the given P-dimensional covariates can be improved by combining the individual strengths of the learning algorithms. The macro-anatomical grouping of the cortex into cerebral lobes favors the use of the SGL model. We combine SGL, which selects the most important features at the group level and within groups, with the prediction power of kernel methods such as GPR. Specifically, we use the SGL model described in section Sparse Group Lasso to select the top q important features from the P covariates. Then, the brain age prediction model is fitted using GPR. The general framework of the proposed brain age prediction model is depicted in **Figure 1**.

### Sparse Group Lasso

SGL is a robust and consistent regularization model with $L_1$ and $L_2$ penalties for grouped variables (Chatterjee et al., 2012; Simon et al., 2013). SGL has a sparse effect both between and within groups. Sparsity is a property of learning methods that results when only a small number of coefficients of the model are non-zero. The majority of real-world problems can be sparsely represented because only a subset of the underlying features is required to fit a model that generalizes well to test instances. Regularization methods are frequently used to keep the complexity of the model at a reasonable level and thereby prevent overfitting.

Considering the regression problem of predicting the brain ages of N individuals, $Y \in \mathbb{R}^{N}$, based on $X \in \mathbb{R}^{N \times P}$, we can represent the problem in multiple linear regression form as given in Equation (1).

$$Y = X\beta + \varepsilon \tag{1}$$

where $Y \in \mathbb{R}^{N}$ is the vector of the response variable, age; $X \in \mathbb{R}^{N \times P}$ is the matrix of predictor variables; $\beta \in \mathbb{R}^{P}$ is the weight vector of unknown parameters; and $\varepsilon \in \mathbb{R}^{N}$ is a vector of random errors. We omit the bias without loss of generality. The Ordinary Least Squares (OLS) method then estimates the parameters $\beta$ by minimizing the cost function given in Equation (2).

$$\left\| Y - X\beta \right\|_{2}^{2} \tag{2}$$

However, OLS performs poorly under high collinearity and on high-dimensional data, including when the number of observations is less than the number of predictors. Lasso (Least Absolute Shrinkage and Selection Operator) is a regularization method that promotes sparsity by extending the OLS cost function with an additional penalty term (Tibshirani, 1996). Lasso estimates $\beta$ by minimizing Equation (3).

$$\frac{1}{2} \left\| Y - X\beta \right\|_2^2 + \lambda \left\| \beta \right\|_1 \tag{3}$$

Further, Yuan and Lin (2007) developed the Group Lasso method for grouped variables. It estimates $\beta$ by minimizing the objective function given in Equation (4).

$$\min_{\beta} \frac{1}{2} \left\| Y - \sum_{l=1}^{m} X^{(l)} \beta^{(l)} \right\|_{2}^{2} + \lambda \sum_{l=1}^{m} \sqrt{p_l} \left\| \beta^{(l)} \right\|_{2} \tag{4}$$

where m is the number of groups; $X^{(l)}$ and $\beta^{(l)}$ are the predictors and coefficients in group $l$, $l = 1, 2, \cdots, m$, respectively; and $p_l$ is the length of $\beta^{(l)}$. The Group Lasso produces a sparse set of groups that are related to the response variable. That is, if a group is selected in the model, then all coefficients in the group are non-zero. SGL combines the Group Lasso and Lasso methods to select the most important variables between and within groups. Considering m groups of predictors in total, the general SGL model is given in Equation (5).

$$\begin{aligned} \min_{\beta} & \frac{1}{2N} \left\| Y - \sum_{l=1}^{m} X^{(l)} \beta^{(l)} \right\|_{2}^{2} \\ & + (1 - \alpha) \lambda \sum_{l=1}^{m} \sqrt{p_l} \left\| \beta^{(l)} \right\|_{2} + \alpha \lambda \left\| \beta \right\|_{1} \end{aligned} \tag{5}$$

where $X^{(l)}$ is the matrix of predictors in group $l$; $\beta^{(l)}$ is the coefficient vector of group $l$; and $p_l$ is the total number of features in group $l$, $l = 1, \cdots, m$. $\alpha$ and $\lambda$ are hyper-parameters of the model. The value of $\alpha$ lies between zero and one and controls the weight assigned to the $L_1$ and $L_2$ penalties. That is, $\alpha = 0$ produces the Group Lasso model; $\alpha = 1$ produces the Lasso model; and $0 < \alpha < 1$ gives a balance between the two schemes. In our case we used $\alpha = 0.25$. An optimal estimate of the tuning parameter $\lambda$ is important for prediction accuracy. We used 10-fold cross validation over a sequence of 50 $\lambda$ values, taking as optimal the value beyond which a further increase of $\lambda$ does not provide a substantial decrease of the cost function.
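To make the role of $\alpha$ in Equation (5) concrete, the following minimal sketch evaluates the SGL penalty and objective for a given coefficient vector. It only evaluates the objective; it does not perform the optimization that the R package carries out. The function names, the nested-list data layout, and the toy coefficients are assumptions introduced for illustration.

```python
import math

def sgl_penalty(beta_groups, lam, alpha):
    """Penalty part of Equation (5).

    beta_groups: list of coefficient lists, one list per group.
    """
    # Group term: sum over groups of sqrt(p_l) * ||beta^(l)||_2
    group_term = sum(
        math.sqrt(len(b)) * math.sqrt(sum(c * c for c in b)) for b in beta_groups
    )
    # Lasso term: ||beta||_1 over all coefficients
    l1_term = sum(abs(c) for b in beta_groups for c in b)
    return (1.0 - alpha) * lam * group_term + alpha * lam * l1_term

def sgl_objective(y, x_groups, beta_groups, lam, alpha):
    """Full objective of Equation (5).

    x_groups[l][i] is the feature row of observation i restricted to group l.
    """
    n = len(y)
    residual_sq = 0.0
    for i in range(n):
        pred = sum(
            c * v
            for xg, bg in zip(x_groups, beta_groups)
            for c, v in zip(bg, xg[i])
        )
        residual_sq += (y[i] - pred) ** 2
    return residual_sq / (2.0 * n) + sgl_penalty(beta_groups, lam, alpha)

# Toy coefficients (hypothetical): three groups, the second one entirely zero.
beta = [[3.0, 4.0], [0.0], [0.0, 0.0, -2.0]]
lasso_only = sgl_penalty(beta, lam=1.0, alpha=1.0)   # reduces to ||beta||_1
group_only = sgl_penalty(beta, lam=1.0, alpha=0.0)   # reduces to the Group Lasso penalty
mixed = sgl_penalty(beta, lam=1.0, alpha=0.25)       # the setting used in the paper
```

With $\alpha = 0.25$ the penalty is dominated by the group term, so whole groups tend to be dropped first, while the smaller $L_1$ share can still zero out individual coefficients inside a retained group.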

The resulting coefficients of SGL are sparse, that is, only a small number of the coefficients are non-zero; hence, the most important features, those with non-zero coefficients, are selected automatically. SGL therefore supports simultaneous feature selection and regression coefficient estimation in a single framework. In the proposed approach, we applied SGL using the cerebral lobe classification of the cortex as the primary groups. Cortical regions outside of the major cerebral lobes were grouped as "Others." The gender and education features were each treated as a singleton group, i.e., their group size was one. Thus, the value of m in the SGL model was nine. **Table 2** summarizes the details of the groupings used in our analysis. We fitted the model using the R package SGL (https://CRAN.R-project.org/package=SGL).

### Gaussian Process Regression

A Gaussian process (GP) is a collection of random variables, where any finite subset of the variables follows a multivariate Gaussian distribution (Rasmussen and Williams, 2006). A GP can be used to describe probabilities over arbitrary functions, which allows us to apply it in a regression setting called Gaussian Process Regression (GPR). GPR is a non-parametric regression model based on the Bayesian approach (Rasmussen and Williams, 2006). A multivariate GP can capture local patterns of covariance between individual points. Moreover, the combination of multiple Gaussians in a GP can model non-linear relationships, and it is more flexible than parametric models. GPR previously demonstrated high accuracy in predicting age from voxel-based morphometry of T1-MRI data (Cole et al., 2015).

Considering the training dataset of input-target pairs $\{(x_i, y_i)\}_{i=1}^{m}$, GPR models the output $y_i$ as a function $f$ of the input $x_i$, as given in Equation (6):

$$
y_i = f\left(\mathbf{x}_i\right) + \varepsilon_i \tag{6}
$$

where $\varepsilon_i \sim N(0, \sigma^2)$. Let $Y \in \mathbb{R}^{m}$ be the vector of the response variables $y_i$, and $X \in \mathbb{R}^{m \times P}$ be the matrix of features. The GP over the function values we are trying to estimate is specified by a mean function $m(X)$ and a covariance function $K(X, X')$, as given in Equation (7).

$$f(\mathbf{x}) \sim GP\left(m\left(X\right), \ K(X, X')\right) \tag{7}$$

The covariance function $K(X, X')$, which is called a kernel function, describes the relationship between the function values at all input points $X$ and $X'$. The prior mean $m(X)$ is usually set to zero without loss of generality, i.e., the set of function values has a zero-mean Gaussian distribution, as indicated in Equation (8).

$$f\left(X\right) \sim GP(0, K\left(X, X'\right))\tag{8}$$

For some valid covariance function $K(X, X')$, considering the test dataset $\{(x_i^*, y_i^*)\}_{i=1}^{n^*}$ with the corresponding response variable vector $Y_* = (y_1^*, y_2^*, \cdots, y_{n^*}^*)^T \in \mathbb{R}^{n^*}$ and the $n^* \times P$ matrix of features $X_* = (x_1^*, x_2^*, \cdots, x_{n^*}^*)^T \in \mathbb{R}^{n^* \times P}$, the prediction of the response variable $Y_*$ from the predictor variables $X_*$ can be obtained by using the conditional Gaussian distribution given in Equation (9).

$$P\left(Y_{\ast} \mid Y, X, X_{\ast}\right) \sim N(\boldsymbol{\mu}^{\ast}, \boldsymbol{\Sigma}^{\ast}) \tag{9}$$

where,

$$\begin{aligned} \boldsymbol{\mu}^{\ast} &= K(X_{\ast}, X) \left(K(X, X) + \sigma^2 I\right)^{-1} Y \\ \boldsymbol{\Sigma}^{\ast} &= K(X_{\ast}, X_{\ast}) + \sigma^2 I - K(X_{\ast}, X) \left(K(X, X) + \sigma^2 I\right)^{-1} K(X, X_{\ast}) \end{aligned}$$
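The predictive mean and covariance in Equation (9) can be computed directly, as in the minimal sketch below. It is not the kernlab implementation: it assumes an RBF kernel parameterized by a hypothetical `length_scale`, a fixed noise variance, a zero prior mean (so in practice the ages would be centered first), and a small hand-written Gauss-Jordan solver in place of a numerical library. All function names and toy values are illustrative.

```python
import math

def rbf_kernel(a, b, length_scale=1.0):
    """RBF kernel exp(-||a - b||^2 / (2 * length_scale^2)) (parameterization assumed)."""
    sq = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-sq / (2.0 * length_scale ** 2))

def kernel_matrix(rows_a, rows_b, length_scale=1.0):
    return [[rbf_kernel(a, b, length_scale) for b in rows_b] for a in rows_a]

def solve(a, b):
    """Solve A Z = B for Z by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def gpr_predict(x_train, y_train, x_test, noise_var, length_scale=1.0):
    """Return (mu*, Sigma*) of Equation (9) for a zero-mean GP prior."""
    n, n_test = len(x_train), len(x_test)
    k = kernel_matrix(x_train, x_train, length_scale)
    for i in range(n):
        k[i][i] += noise_var                                  # K(X, X) + sigma^2 I
    k_s = kernel_matrix(x_test, x_train, length_scale)        # K(X*, X)
    k_ss = kernel_matrix(x_test, x_test, length_scale)        # K(X*, X*)
    weights = solve(k, [[v] for v in y_train])                # (K + sigma^2 I)^-1 Y
    mean = [sum(k_s[a][j] * weights[j][0] for j in range(n)) for a in range(n_test)]
    k_s_t = [[k_s[a][j] for a in range(n_test)] for j in range(n)]  # K(X, X*)
    v = solve(k, k_s_t)                                       # (K + sigma^2 I)^-1 K(X, X*)
    cov = [
        [
            k_ss[a][b] + (noise_var if a == b else 0.0)
            - sum(k_s[a][j] * v[j][b] for j in range(n))
            for b in range(n_test)
        ]
        for a in range(n_test)
    ]
    return mean, cov

# Toy one-dimensional example (hypothetical values).
x_train = [[0.0], [1.0], [2.0]]
y_train = [0.0, 1.0, 0.0]
mean_near, cov_near = gpr_predict(x_train, y_train, [[1.0]], noise_var=1e-6)
mean_far, cov_far = gpr_predict(x_train, y_train, [[100.0]], noise_var=1e-6)
```

Near a training point the predictive mean follows the observed value and the variance shrinks toward the noise level, whereas far from the data the prediction reverts to the zero prior mean with variance $K(x_*, x_*) + \sigma^2$. This is the source of the prediction interval a GPR model provides alongside each age estimate.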

In addition, a special case of GPR called Relevance Vector Regression (RVR) is used for comparison, as it has been widely used previously in predicting brain age from T1-MRI data (Ashburner, 2007; Franke et al., 2010; Kondo et al., 2015; Wang et al., 2015; Madan and Kensinger, 2018). RVR is a Bayesian sparse learning model used for regression and classification (Tipping, 2001). RVR determines the relationship between the target output and the covariates by enforcing sparsity. Given the training dataset of input-target pairs $\{(x_i, y_i)\}_{i=1}^{N}$, the prediction for unseen data $x$ can be defined as a linear combination of kernel functions, as given in Equation (10).

TABLE 2 | Grouping list of predicting variables.

$$f\left(\mathbf{x}\right) = \beta_0 + \sum_{i=1}^{N} \beta_i \Phi_i(\mathbf{x}) \tag{10}$$

where $\beta = (\beta_0, \beta_1, \cdots, \beta_N)$ is the vector of weights and $\Phi_i(x) = K(x, x_i)$ is the kernel function defining the basis function for each example of the training set. Because $y_i = f(x_i; \beta) + \varepsilon_i$ with $\varepsilon_i \sim N(0, \sigma^2)$, we have $P(y_i \mid x_i) = N(y_i \mid f(x_i; \beta), \sigma^2)$. Accordingly, the likelihood based on the Gaussian distribution is given in Equation (11).

$$P\left(Y \mid \beta, \sigma^{2}\right) = \left(2\pi\sigma^{2}\right)^{-\frac{N}{2}} \exp\left\{-\frac{1}{2\sigma^{2}}\left\|Y - \Phi\beta\right\|^{2}\right\} \tag{11}$$

where $Y = (y_1, y_2, \cdots, y_N)^T$, and $\Phi$ is the $N \times (N + 1)$ design matrix with $\Phi = [\Phi(x_1), \Phi(x_2), \cdots, \Phi(x_N)]^T$ and $\Phi(x_i) = [1, K(x_i, x_1), K(x_i, x_2), \cdots, K(x_i, x_N)]^T$.
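The structure of the RVR design matrix and of the prediction in Equation (10) can be illustrated with the short sketch below. It only builds $\Phi$ and evaluates $f(x)$ for given weights; it does not perform the sparse Bayesian estimation of the weights. The RBF kernel parameterization, the function names, and the toy inputs are assumptions.

```python
import math

def rbf_kernel(a, b, length_scale=1.0):
    """RBF kernel exp(-||a - b||^2 / (2 * length_scale^2)) (parameterization assumed)."""
    sq = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-sq / (2.0 * length_scale ** 2))

def rvr_design_matrix(x_train, length_scale=1.0):
    """N x (N + 1) matrix whose i-th row is [1, K(x_i, x_1), ..., K(x_i, x_N)]."""
    return [
        [1.0] + [rbf_kernel(xi, xj, length_scale) for xj in x_train]
        for xi in x_train
    ]

def rvr_predict(x_new, x_train, beta, length_scale=1.0):
    """Equation (10): beta_0 plus a weighted sum of kernels centered on training points."""
    return beta[0] + sum(
        b * rbf_kernel(x_new, xi, length_scale) for b, xi in zip(beta[1:], x_train)
    )

# Toy inputs (hypothetical): three training points with two features each.
x_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
phi = rvr_design_matrix(x_train)
# A sparse weight vector: only the bias and the first training point contribute.
beta = [2.0, 1.0, 0.0, 0.0]
prediction = rvr_predict(x_train[0], x_train, beta)
```

In a fitted RVR model most entries of $\beta$ are driven to zero, so only the few training points with non-zero weights (the relevance vectors) enter the sum.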

We fitted the GPR and RVR models using the R package "kernlab" (http://www.jstatsoft.org/v11/i09/). The GPR was fitted with a Radial Basis Function (RBF) kernel on the most important features selected from the P covariates by SGL, as described in section Sparse Group Lasso.

### Deep Neural Network

Deep learning was used for comparison, as it has previously shown high accuracy in predicting brain age from raw MRI data (Cole et al., 2017). An artificial neural network (ANN) is a machine learning algorithm inspired by the structure of the brain (Goodfellow et al., 2016). Architecturally, the neurons of an ANN are interconnected and arranged in input, hidden, and output layers. ANNs are typically classified as single-layer or multilayer based on the number of hidden layers. Neural networks with only one hidden layer are called shallow neural networks. Multilayer neural networks, i.e., those with two or more hidden layers, are called deep neural networks (DNNs) or deep learning. DNNs are very helpful for automatic feature learning from complex non-linear data representations.

The autoencoder is a special type of neural network architecture that can be used for feature extraction. It is an unsupervised learning method that attempts to reconstruct its input (Goodfellow et al., 2016). Technically, autoencoders are feedforward neural networks with one hidden layer in which the target output is the same as the input. In other words, the input is compressed to a lower-dimensional code representation in the hidden layer, and the output is reconstructed from this representation. The objective of the autoencoder is to learn an efficient and compact hidden representation of the input from which the input can be successfully reconstructed. Both the encoder and the decoder functions employ a form of non-linearity in order to learn rich representations. A stacked autoencoder (SAE) is a DNN consisting of multiple layers of sparse autoencoders (Bengio et al., 2007). The outputs of each layer are wired as the inputs of the next layer. SAEs are trained in an unsupervised, greedy, layer-wise fashion. That is, once the first layer is pre-trained as an autoencoder with all the other layers frozen, its output is wired as input to the next hidden layer. This layer-wise training continues to the last layer.

For brain age estimation, we used the H2O package in R (https://github.com/h2oai/h2o-3) to fit the DNN model. The rectified linear unit (ReLU) activation function and dropout regularization were used to train the DNN model. The network had four hidden layers with 94, 48, 48, and 94 neurons, respectively. In addition, we used a stacked autoencoder to extract features from the P = 150 covariates irrespective of the cortical lobe grouping structure. The SAE takes the covariate vector $X \in \mathbb{R}^{P}$ and maps it to the deep hidden representation $h \in \mathbb{R}^{P'}$, $P' < P$. The extracted features (vector $h \in \mathbb{R}^{P'}$) are used as input to fit the regression models for comparison with the SGL approach. The SAE had two hidden layers with 94 and 61 neurons, respectively. One of the challenges of deep learning is that it requires a very large dataset in order to obtain adequate prediction accuracy.

### Cross Validation

The dataset (N = 2,705) was stratified by age and randomly divided into a training set and a test set. About 70% of the dataset was used for the training set ($N_1$ = 1,895) and the remaining 30% for the test set ($N_2$ = 810). The 10-fold cross validation was performed on the training set. We repeated this train-test split 10 times to obtain reliable generalization estimates. For each of the ten repetitions, the N observations were randomly resampled and divided into a training set and a test set. Then, the training set was randomly split into 10 folds for cross validation. When the first fold was used as the validation set, the model was fit on the remaining 9 folds, and the mean square error ($MSE_1$) was computed on the held-out fold. For each fold, the MSE was computed as given in Equation (12).

$$MSE_k = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2, \quad k = 1, 2, \cdots, 10 \tag{12}$$

where $y_i$ is the actual age, $\hat{y}_i$ is the predicted age, and n is the number of subjects in the validation set of a given fold. The 10-fold cross validation estimate (CVE) was computed by averaging Equation (12) over the folds, as shown in Equation (13).

$$CVE = \frac{1}{10} \sum_{k=1}^{10} MSE_k \tag{13}$$
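Equations (12) and (13) can be sketched as follows. To keep the example self-contained, a one-variable least-squares line stands in for the actual regression models; this substitution, the function names, the fixed random seed, and the toy data are assumptions made for illustration and are not the models evaluated in the paper.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly partition the indices 0..n-1 into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x (placeholder for the real model)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def cross_validation_estimate(xs, ys, k=10, seed=0):
    """Return (CVE, [MSE_1, ..., MSE_k]) following Equations (12) and (13)."""
    folds = k_fold_indices(len(xs), k, seed)
    mse_per_fold = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Equation (12): mean squared error on the held-out fold only.
        mse = sum((model(xs[i]) - ys[i]) ** 2 for i in held_out) / len(held_out)
        mse_per_fold.append(mse)
    # Equation (13): average of the per-fold errors.
    return sum(mse_per_fold) / k, mse_per_fold

# Toy data (hypothetical): an exactly linear relationship, so the CVE is near zero.
xs = [float(i) for i in range(50)]
ys = [2.0 * x + 1.0 for x in xs]
cve, per_fold = cross_validation_estimate(xs, ys, k=10)
```

In a hyper-parameter search such as the one for $\lambda$, this estimate would be computed once per candidate value, using the training set only, so that the test set stays untouched until the final evaluation.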

All performance results reported for the brain age prediction model were computed on the test set. Thus, for each resampled test dataset, we computed the accuracy of age estimation in terms of RMSE and MAE, as given in Equations (14) and (15).

$$RMSE_{j} = \sqrt{\frac{1}{N_2} \sum_{i=1}^{N_2} \left(\hat{y}_i - y_i\right)^2} \tag{14}$$

$$MAE_{j} = \frac{1}{N_2} \sum_{i=1}^{N_2} \left| \hat{y}_i - y_i \right| \tag{15}$$

where $N_2$ is the number of samples in the test set of each repeat. Finally, the generalization accuracy of the model is given as the average of Equations (14) and (15) over the ten repeats, as given in Equations (16) and (17).

$$RMSE = \frac{1}{10} \sum_{j=1}^{10} RMSE_j \tag{16}$$

$$MAE = \frac{1}{10} \sum_{j=1}^{10} MAE_j \tag{17}$$
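As a worked illustration of Equations (14) to (17), the sketch below computes RMSE and MAE for each repeat and then averages them. The function names and the two toy repeats are hypothetical; the paper uses ten repeats with 810 test subjects each.

```python
import math

def rmse(predicted, actual):
    """Equation (14): root mean square error on one test set."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

def mae(predicted, actual):
    """Equation (15): mean absolute error on one test set."""
    n = len(actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / n

def average_over_repeats(repeats):
    """Equations (16) and (17): mean RMSE and mean MAE across repeats.

    repeats: list of (predicted_ages, chronological_ages) pairs, one per repeat.
    """
    rmses = [rmse(p, a) for p, a in repeats]
    maes = [mae(p, a) for p, a in repeats]
    return sum(rmses) / len(rmses), sum(maes) / len(maes)

# Two toy repeats (hypothetical ages in years).
repeats = [
    ([66.0, 60.0, 71.0], [64.0, 63.0, 70.0]),  # errors of +2, -3, +1 years
    ([55.0, 80.0, 47.0], [55.0, 80.0, 47.0]),  # perfect predictions
]
mean_rmse, mean_mae = average_over_repeats(repeats)
```

Because RMSE squares the errors before averaging, it is never smaller than MAE on the same test set and is more sensitive to a few large age errors, which is why both are reported.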

# RESULTS AND DISCUSSION

### Effect of Outliers

The dataset in this study included only cognitively normal subjects. The objective of outlier checking was to identify subjects who might violate this condition, because our sample consisted of older adults (age 45–91 years), who are at increased risk of cognitive disorders. Outliers affect the prediction accuracy of a model when non-robust statistical methods are used. As described in section Filtering Outliers From Cortical Thickness Data, we designed a procedure suited to the mean cortical thickness data representation by adapting the LOF method. The proposed outlier filtering approach yielded a slight performance improvement in all tested models, as displayed in **Table 3**. The RMSE and MAE results are in years and were computed on the test dataset.

We compared the generalization results of different regression models on the dataset after removing outliers (N = 2,705 samples). The models were trained using 10-fold cross-validation repeated 10 times for reliability of the performance results. The best accuracy was obtained by the GPR model (RMSE = 5.18 years and MAE = 4.08 years), as shown in **Table 4**. All other models showed comparable results.

### Performance of Hybrid Methods

As described in section Sparse Group Lasso, we used the SGL model to select the most important features based on the macro-anatomical grouping of the cortex. The major benefit of SGL is its robustness in selecting features at the group level and within groups, and the grouping of brain structures by cerebral lobe fits this strength well. In our dataset we had N = 2,705 observations and P = 150 covariates (148 ROIs and the demographic variables gender and education). Among the P = 150 covariates, the SGL method selected 94 features (93 ROIs and Education). Gender was not selected as an important predictor by SGL. Accordingly, we applied RVR and GPR to the selected features to estimate brain age. The combination of SGL and GPR showed improved performance (RMSE = 5.16 years and MAE = 4.05 years), as indicated in **Table 5**. The paired t-test between the MAEs from SGL + GPR and SGL + RVR produced a p-value of 0.004, which indicates that the performance difference is statistically significant. The results from GPR and SGL + GPR also differed significantly, with a p-value of 0.003.

In addition, we tested a stacked autoencoder (SAE) for feature extraction based on the ungrouped covariates. The SAE was used to extract the features, and we applied RVR and GPR to estimate the brain age. The results were comparable with those of SGL combined with GPR. However, if the actual selected features need to be known, they cannot be identified in the case of SAE, because SAE learns weighted combinations of the input features, whereas SGL selects a subset of the original features. The results of the hybrid methods are presented in **Table 5**.

TABLE 3 | Performance results before and after filtering outliers.

| # | Model | All data: RMSE | All data: MAE | After removing outliers: RMSE | After removing outliers: MAE |
|---|-------|----------------|---------------|-------------------------------|------------------------------|
| 1 | OLS | 5.409 | 4.240 | 5.264 | 4.112 |
| 2 | SGL | 5.347 | 4.162 | 5.265 | 4.071 |
| 3 | GPR | 5.274 | 4.151 | 5.139 | 4.033 |
| 4 | RVR | 5.368 | 4.213 | 5.179 | 4.063 |
| 5 | DNN | 5.378 | 4.181 | 5.160 | 4.022 |

TABLE 4 | Performance comparison of different regression models.

TABLE 5 | Performance results of hybrid approaches; SD, standard deviation.

Overall, the SGL and GPR hybrid model showed a marginal improvement over GPR alone. In addition, SGL identifies the most important cortical regions. Many studies have shown that late-life education reduces the rate of cortical thinning (Belathur Suresh et al., 2018; Thow et al., 2018). Thus, education can be considered as one of the features for brain age prediction.

The value of $R^2$ we obtained across the different methods is around 0.5, which is lower than in previous works. We conjecture that this is partly due to the age range of our study population. In this work, we studied an older population (age 45–91 years, with a mean age around 64) than other studies did. Moreover, owing to the more complex non-linear relationship between normal aging and the anatomical structure of the cortex in older adults, the amount of variance in the dependent variable explained by the independent variable(s) could be smaller. Therefore, $R^2$ may not always be a suitable measure for comparing results across different age ranges and sample sizes. We would like to emphasize that our method has an additional advantage over previous methods in that it identifies the most contributing features and the cortical regions that show a thinning pattern due to normal aging.

The fitted plots for the SGL + GPR model are presented in **Figure 4**. The plot on the left displays the fitted lines of both chronological age and predicted age, indexed from the lowest to the highest actual age. For example, the index for age 45 is "1," and the index for age 91 is m = 810, where m is the number of samples in the test set. The prediction interval of the test sample is shown as the gray area in **Figure 4** (left). The scatter plot of chronological age vs. predicted age is shown on the right in **Figure 4**. The estimation result reveals an age-related bias such that the predicted age is higher for younger subjects and lower for older subjects. This appears to be due to the sample size imbalance across age groups. The prediction of age tends to be more accurate where more samples exist for each age, particularly in the 50–79 range, and the estimates for the two extreme ranges with fewer samples tend to be substantially influenced by the estimates for the majority of samples in the middle. See **Table 1** for the detailed distribution of ages in the target sample. A similar observation and discussion have been presented in Pardoe and Kuzniecky (2018).

### Consistency of SGL

We used a resampling method in order to verify the consistency of the SGL feature selection. We trained the SGL model with ten different random sample combinations of our dataset to validate its consistency. The experimental results showed that the majority of features selected were also repeatedly selected in the 10 random trials. Among the 94 features selected by SGL, 61 of them were repeatedly selected in different resampling trials. Education was selected in all ten trials. Gender was not selected in any of the ten trials. Among the 148 ROIs, most of the repeatedly selected cortex regions were from the frontal lobe, temporal lobe, and parietal lobe. The results of age estimation on these repeatedly-selected features showed comparable results with the 94 features. This shows the credibility of the selected regions, indicating the regions having cortical thinning patterns associated with normal aging. The results of age estimation on the repeatedly selected 61 features are shown in **Table 6**.

### Analysis of the Proposed Model

We designed a brain age prediction model with cortical thickness data extracted from 148 ROIs and the confounding demographic variables of gender and education. The proposed method used SGL to select the most important features contributing to brain age prediction. SGL selected 94 features, i.e., 93 ROIs and Education. The selected cortical regions are shown in **Figure 5** with different colors for the cortical lobe groupings. The majority of the selected regions are cortical regions with a thinning pattern associated with normal aging (Allen et al., 2005; Fjell et al., 2009; Lee et al., 2016a,b). Our target age range is from 45 to 91 years. In this range, cortical thickness declines due to normal aging. Education is one of the most important factors that reduce the rate of cortical thinning (Thow et al., 2018). Thus, Education contributes to the prediction of brain age as a confounding factor. We used the selected features to predict brain age with a GPR model and obtained a good accuracy of MAE = 4.05 years. The combination of SGL with GPR has two benefits. First, it identifies the most contributing cortical regions that show a thinning pattern due to normal aging. Second, it offers a comparable or superior generalization result to GPR alone or other models.

TABLE 6 | Age estimation on repeatedly selected 61 features; the number of repetition is ten.


The importance of surface-based morphology in comparison to voxel-based morphology was reviewed by Mechelli et al. (2005), Clarkson et al. (2011), and Madan and Kensinger (2018). The surface-based approach allows examining distinct measures of cortical structure. In contrast, the volume-based approach that estimates gray matter volume is influenced by a combination of structural features. In surface-based morphology, the predictive value of each specific ROI's cortical thickness can be assessed.

For an insightful analysis of our approach in relation to related studies of age prediction from brain anatomical features, a comparison of our model with state-of-the-art studies is presented in **Table 7**. The prediction accuracy of our approach is comparable with that of recent state-of-the-art studies. The major contribution to the improvement of generalization accuracy is the SGL regularization technique used to select important features that best fit the age prediction model. It is worth mentioning that each previous study used for comparison had a slightly different experimental setting, because it either used a small sample size or studied relatively young subjects (see the mean age of the previous studies in **Table 7**). In this work, we studied only older adults (age 45–91 years). In this older age range, the risk of dementia is higher and cognitive functioning declines due to normal aging. Moreover, owing to the more complex nonlinear relationship between normal aging and the cortical anatomical structure of older adults, the amount of variance in the dependent variable explained by the independent variable(s) could be different.

FIGURE 5 | Visualization of the features significantly contributing to age estimation: the warm and cold color represent important features in predicting brain age using cortical thinning patterns.


TABLE 7 | Our model vs. related studies: N, number of samples; \* given in median absolute error (MdAE).

Cortical thickness measurements are known to be sensitive to the choice of scanner vendor, imaging protocol, and site. Our dataset was collected at one center (SMC) using a single scanner with the same protocol, which is one limitation to consider. In addition, our study aims to model biological age from healthy subject samples, and therefore the analysis of cortical thinning due to pathology is out of scope. A related point is that atrophy in the subcortical structures might be important in normal aging. Our study focuses on cortical change due to normal aging; atrophy in the subcortical structures can be considered in a future study.

### Analysis of Brain Features Significantly Contributing to Age Estimation

We extracted and visualized the brain features that significantly contribute to the age estimation. **Figure 5** shows a cortical thinning pattern specifically associated with normal aging, extracted by the SGL model. As shown in the figure, our findings were consistent with previous studies. A recent study from our group suggested that there were brain regions vulnerable to brain aging. Specifically, compared to those in their twenties and thirties, participants in their forties showed thinning primarily in the medial and lateral frontal and inferior parietal regions, and cortical thinning occurred across most of the cortices with increasing age (Lee et al., 2016a,b).

We find that age-related cortical thinning occurs in areas responsible for executive processing tasks, spatial cognition, vocabulary learning, and episodic memory retrieval, which are also known to be associated with age-related cognitive decline (Pochon et al., 2001; Monsell, 2003; Cavanna and Trimble, 2006; Singh-Curry and Husain, 2009; Caspers et al., 2012; Barbey et al., 2013). Furthermore, our findings could support the "last in, first out" hypothesis (McGinnis et al., 2011). That is, late-maturing regions of the brain, such as the heteromodal association cortices, are preferentially vulnerable to age-related loss of structural integrity.

### CONCLUSIONS

We presented a brain age estimation model using cortical thickness data extracted from T1 MRI. We designed a feature selection approach that identifies cortical regions associated with cortical thinning and better generalizes brain age prediction. The best prediction accuracy was obtained with the SGL + GPR hybrid model, with MAE = 4.05 years, which is comparable with results obtained by several recent state-of-the-art studies. The automatic feature learning of the stacked auto-encoder, a deep learning approach, also gave a comparable result when combined with GPR. In general, the analysis in this research indicates the desirability of feature selection strategies for designing predictive models of brain age from surface-based features that are capable of generalizing.

### AUTHOR CONTRIBUTIONS

J-KS, SS, and K-AS designed the main idea and directed the overall analysis. HA, J-HS, and BK developed the algorithm and carried out the data processing and experiments. DN and SS helped with the data collection and result interpretation. HA, J-KS, SS, and K-AS wrote the manuscript with input from all authors.

### FUNDING

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MOE: No. 2016R1D1A1B03933875, No. 2016R1A6A3A11932796, MSIP: No. 2016R1A2B4014398 & No. 2017R1A2B2005081).

and a parcellation of the temporal region. Neurobiol. Aging 26, 1271–1278. doi: 10.1016/j.neurobiolaging.2005.05.023


studies of older adults: a shrinking brain. J. Neurosci. 23, 3295–3301. doi: 10.1523/JNEUROSCI.23-08-03295.2003


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Aycheh, Seong, Shin, Na, Kang, Seo and Sohn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Progressive Decline in Gray and White Matter Integrity in de novo Parkinson's Disease: An Analysis of Longitudinal Parkinson Progression Markers Initiative Diffusion Tensor Imaging Data

Kirsten I. Taylor<sup>1</sup>, Fabio Sambataro<sup>1,2</sup>, Frank Boess<sup>1</sup>, Alessandro Bertolino<sup>1,3,4</sup> and Juergen Dukart<sup>1</sup>\*

<sup>1</sup> Neuroscience, Ophthalmology, and Rare Diseases, Pharma Research and Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland, <sup>2</sup> Department of Experimental and Clinical Medical Sciences (DISM), University of Udine, Udine, Italy, <sup>3</sup> Department of Basic Medical Science, Neuroscience, and Sense Organs, University of Bari, Bari, Italy, <sup>4</sup> Psychiatry Unit, Bari University Hospital, Bari, Italy

Background: Progressive neuronal loss in neurodegenerative diseases such as Parkinson's disease (PD) is associated with progressive degeneration of associated white matter tracts as measured by diffusion tensor imaging (DTI). These findings may have diagnostic and functional implications but their value in de novo PD remains unknown. Here we analyzed longitudinal DTI data from Parkinson's Progression Markers Initiative de novo PD patients for changes over time relative to healthy control (HC) participants.

Methods: Baseline and 1-year follow-up DTI MRI data from 71 PD patients and 45 HC PPMI participants were included in the analyses. Whole-brain fractional anisotropy (FA) and mean diffusivity (MD) images were compared for baseline group differences and group–by–time interactions. Baseline and 1-year changes in DTI values were correlated with changes in DTI measures and symptom severity, respectively.

Results: At baseline, PD patients showed significantly increased FA in brainstem, cerebellar, anterior corpus callosal, inferior frontal and inferior fronto-occipital white matter and increased MD in primary sensorimotor and supplementary motor regions. Over 1 year, PD patients showed a significantly stronger decline in FA compared to HC in the optic radiation and corpus callosum and in parietal, occipital, posterior temporal, posterior thalamic, and vermis gray matter. Significant increases in MD were observed in white matter of the midbrain, optic radiation and corpus callosum, and in gray matter of prefrontal, insular and posterior thalamic regions. Baseline brainstem white matter (WM) FA values predicted 1-year changes in white matter FA and gray matter MD values. White but not gray matter changes in both FA and MD were significantly associated with changes in symptom severity.

### Edited by:

Katja Franke, Universitätsklinikum Jena, Germany

### Reviewed by:

Federica Agosta, Università Vita-Salute San Raffaele, Italy

Panteleimon Giannakopoulos, Université de Genève, Switzerland

> \*Correspondence: Juergen Dukart juergen.dukart@gmail.com

Received: 24 May 2018 Accepted: 21 September 2018 Published: 08 October 2018

### Citation:

Taylor KI, Sambataro F, Boess F, Bertolino A and Dukart J (2018) Progressive Decline in Gray and White Matter Integrity in de novo Parkinson's Disease: An Analysis of Longitudinal Parkinson Progression Markers Initiative Diffusion Tensor Imaging Data. Front. Aging Neurosci. 10:318. doi: 10.3389/fnagi.2018.00318

Conclusion: Significant gray and white matter DTI alterations are observable at the time of PD diagnosis and expand in the first year of de novo PD to other cortical and white matter regions. This pattern of DTI changes is in line with preclinical and neuroanatomical studies suggesting that the increased spatial spread of alpha-synuclein neuropathology is the key mechanism of PD progression. Taken together, these findings suggest that DTI may serve as a sensitive biomarker of disease progression in early-stage PD.

Keywords: fractional anisotropy, mean diffusivity, Parkinson's disease, DTI, aging

# INTRODUCTION

Diffusion tensor imaging (DTI) of brain grey matter (GM) and white matter (WM) integrity is a potentially valuable tool to quantify progressive neurodegeneration in Parkinson's disease (PD). In particular, two indices extracted from DTI – fractional anisotropy (FA) and mean diffusivity (MD) – are commonly applied to study tissue integrity. FA provides an anisotropy measure of water diffusion presumably reflecting preferential directions of fiber orientation whilst MD quantifies the overall diffusivity reflecting tissue density or its loss in a longitudinal setting. A key working hypothesis of the underlying pathophysiological process leading to neurodegeneration and subsequent clinical decline in PD is that alpha-synuclein pathology starting in brainstem GM and WM progressively spreads through connected white matter fiber systems to cortical GM structures during the course of the disease (Braak et al., 2003). If DTI alterations in PD indeed reflect the resulting neurodegenerative process due to alpha-synuclein pathology, these alterations would also be predicted to show a progressive spatial spread. Yet, most DTI studies to date in PD have focused on cross-sectional assessment of single regions of interest (ROI) such as substantia nigra pars compacta (SNpc) or basal ganglia. These brain regions have been shown to be affected in PD and linked with its pathognomonic motor symptoms, although heterogeneity across studies is high (see Cochrane and Ebmeier, 2013; Schwarz et al., 2013 for reviews). However, some whole-brain GM and WM studies also provided evidence for altered diffusion processes in other brain structures, including the cortex (Karagulle Kendi et al., 2008; Mole et al., 2016). Whilst MD has been consistently shown to be increased in PD, differential findings emerged with respect to the directionality of FA alterations, with studies reporting evidence of both increases and decreases in anisotropy measures (Mole et al., 2016; Atkinson-Clement et al., 2017).
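To make the two indices concrete, the sketch below computes MD and FA from the three eigenvalues of a diffusion tensor using the conventional definitions: MD is the mean of the eigenvalues, and FA is the normalized dispersion of the eigenvalues around that mean, ranging from 0 (isotropic diffusion) to 1 (diffusion along a single direction). These are the standard textbook formulas rather than a description of the PPMI processing pipeline, and the example eigenvalues are illustrative only.

```python
import math


def mean_diffusivity(l1, l2, l3):
    """MD: arithmetic mean of the three tensor eigenvalues."""
    return (l1 + l2 + l3) / 3.0


def fractional_anisotropy(l1, l2, l3):
    """FA: sqrt(3/2) * ||lambda - MD|| / ||lambda||, between 0 and 1."""
    md = mean_diffusivity(l1, l2, l3)
    dispersion = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    magnitude = l1 ** 2 + l2 ** 2 + l3 ** 2
    if magnitude == 0.0:
        return 0.0  # no diffusion signal; define FA as 0 to avoid 0/0
    return math.sqrt(1.5 * dispersion / magnitude)


# Illustrative eigenvalues in mm^2/s for a strongly directional voxel.
eigenvalues = (1.7e-3, 0.3e-3, 0.3e-3)
print(mean_diffusivity(*eigenvalues))
print(fractional_anisotropy(*eigenvalues))
```

Because FA depends on the relative spread of the eigenvalues, the loss of one fiber population in a crossing-fiber voxel can raise FA even though tissue has been lost, whereas MD rises whenever overall diffusivity increases.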

Only very few studies to date have focused on longitudinal characterization of DTI abnormalities in PD patients (Chan et al., 2016; Loane et al., 2016; Guttuso et al., 2018; Minett et al., 2018) and only one evaluated white matter FA and MD alterations in a de novo PD population (Minett et al., 2018). Additionally, none of these studies examined potential GM pathology as assessed through FA and MD. It therefore remains unclear if and to what extent FA and MD alterations represent potential early diagnostic and progression biomarkers in PD patients. Moreover, it also remains unclear if these alterations reflect symptom severity or predict disease progression in the early PD population.

Here we aimed to address the question of the value of GM and WM FA and MD as early diagnostic and progression biomarkers in a de novo PD population. We further evaluated the relationship between these imaging indices and clinical symptoms observed in the respective patient population.

## MATERIALS AND METHODS

### Participants

The DTI sample comprised 116 Parkinson's Progression Marker Initiative (PPMI) participants who completed 1 year follow-up: 71 with a recent diagnosis of PD and 45 healthy control (HC) participants. PD and HC groups did not differ with respect to age or gender distribution, but did differ with respect to MDS-UPDRS (Movement Disorder Society-sponsored revision of the Unified Parkinson's Disease Rating Scale, Goetz et al., 2008) subscale and total scores, as expected (see **Table 1**). All PD patients were treatment naïve at baseline but were allowed to start PD medication upon need. Only categorical (yes/no) information was recorded on the respective treatment categories (L-dopa, dopamine agonists or other PD medication). This study was carried out in accordance with Good Clinical Practice (GCP) regulations and International Conference on Harmonization (ICH) guidelines. PPMI is a large multicenter study and each site independently received ethics approval of the protocol. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

### Data Acquisition

Baseline and 1-year follow-up examinations in the PPMI study included the administration of the MDS-UPDRS (Goetz et al., 2008). Cardiac-triggered DTI MR sequences were acquired on a Siemens 3T TIM Trio scanner using a 12-channel matrix head coil and a two-dimensional echo-planar DTI sequence with the following parameters: TR/TE = 900/88 ms, flip angle = 90°, voxel size = 2 × 2 × 2 mm<sup>3</sup>, 72 slices, 64 gradient directions with a b-value of 1000 s/mm<sup>2</sup>. One non-gradient volume (b = 0 s/mm<sup>2</sup>) was also acquired. See Marek et al. (2011) and the online PPMI protocol<sup>1</sup> for details of the PPMI study and image acquisition, respectively.

<sup>1</sup>http://www.ppmi-info.org/wp-content/uploads/2018/02/PPMI-AM-13-Protocol.pdf

### TABLE 1 | Subject group characteristics.

fnagi-10-00318 October 4, 2018 Time: 15:23 # 3


df, degrees of freedom; PD, Parkinson's disease; HC, healthy controls; med, medication; SD, standard deviation; MDS-UPDRS, Movement Disorder Society-sponsored revision of the Unified Parkinson's Disease Rating Scale; MoCA, Montreal Cognitive Assessment Test Scoring.

### DTI Image Processing

Details of the PPMI DTI pre-processing pipeline, including non-linear distortion correction and FA and MD computation, can be found on the PPMI website<sup>2</sup>. FA and MD maps were computed using the TEEM tool as described on the PPMI website<sup>2</sup>. All maps were downloaded with the corresponding structural T1 scans. All further pre-processing was performed using Statistical Parametric Mapping software (SPM12, Friston et al., 1994). To improve precision for longitudinal evaluation, structural scans from different time points were first co-registered to a mean image from those time points for each subject. FA and MD maps were subsequently co-registered to the respective structural scans. The structural MRI was segmented into GM and WM compartments. FA and MD maps were then masked with the respective binarized compartments to obtain GM and WM FA and MD estimates and normalized into MNI space based on structural T1 information. A Gaussian smoothing kernel of 8 mm FWHM (full width at half maximum) was subsequently applied.
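Smoothing kernels are specified here by their FWHM, whereas many image-processing routines expect the Gaussian standard deviation. The sketch below shows the standard conversion, sigma = FWHM / (2 * sqrt(2 * ln 2)). The optional conversion to voxel units assumes an isotropic voxel size of 2 mm, which matches the acquisition resolution but is an assumption for the normalized images, since their resolution is not stated in the text. The function name is hypothetical.

```python
import math

# Conversion factor between FWHM and standard deviation of a Gaussian.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def fwhm_to_sigma(fwhm_mm, voxel_size_mm=None):
    """Convert a Gaussian FWHM in mm to a standard deviation.

    Returns sigma in mm, or in voxel units if `voxel_size_mm` is given
    (isotropic voxels assumed).
    """
    sigma_mm = fwhm_mm * FWHM_TO_SIGMA
    if voxel_size_mm is None:
        return sigma_mm
    return sigma_mm / voxel_size_mm


print(round(fwhm_to_sigma(8.0), 3))        # approx. 3.397 (mm)
print(round(fwhm_to_sigma(8.0, 2.0), 3))   # approx. 1.699 (voxels, assuming 2 mm voxels)
```

An 8 mm FWHM kernel therefore corresponds to a standard deviation of roughly 3.4 mm, so each smoothed voxel value reflects signal from a neighborhood several voxels wide.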

### Statistical Analyses

All voxel-wise statistical analyses of the imaging data were performed using SPM12. Subsequent statistical analyses of the extracted imaging eigenvariates and clinical data were computed using SPSS 25 (IBM Corp., Armonk, NY, United States). We first compared the baseline FA and MD GM and WM maps between PD and HC controlling for the effects of age and sex using two-sample t-tests. We then tested for group–by–time interactions (i.e., differential FA and MD trajectories across PD and HC between baseline and 1-year follow-up) using a flexible factorial design including the factors group, time and subject and controlling for sex and baseline age. To correct for multiple comparisons, an exact permutation based cluster threshold (p < 0.05) was applied to all voxel-wise analyses combined with an uncorrected voxel-wise threshold of p < 0.01 (Dukart et al., 2017). Eigenvariates adjusting for the effects of covariates of no interest were extracted from the most significant clusters identified in baseline comparisons and group–by–time interaction analyses in each contrast. Effect sizes (Cohen's d) for the differentiation between PD and HC and percent changes from baseline (for longitudinal analyses only) were then computed to estimate the magnitude of observed alterations. Using these eigenvariates, we then used general linear models to evaluate whether baseline alterations in FA and MD observed in PD were predictive of current clinical severity or 1-year changes in clinical severity as measured by MDS-UPDRS total or subscale scores while controlling for age, sex and treatment mode at follow-up (as random effects, for change prediction only). Similarly, we tested if these baseline DTI alterations were predictive of 1-year FA and MD alterations observed in the PD population in group–by–time interaction analyses controlling for age and sex. Further, we evaluated using general linear models if changes observed in FA and MD over time (all entered as covariates into a single model) were related to changes in function (MDS-UPDRS I, II, III, and total scores) while controlling for the random effects of PD medication status (L-dopa, dopamine agonists or other PD medication), age and sex. Lastly, we tested if medication may have affected disease progression effects on FA and MD identified in the above group–by–time interaction analyses. For this we computed analyses of variance in PD testing for treatment–by–time interactions on FA and MD alterations identified in the longitudinal analyses, controlling for age and sex. We thereby tested for effects of all three medication types recorded in the PPMI database (L-dopa, dopamine agonists, and other PD medication).

<sup>2</sup>http://www.ppmi-info.org/wp-content/uploads/2011/12/DTI-processing-Pipeline3.pdf
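The two magnitude measures named in the statistical analyses above, Cohen's d and percent change from baseline, can be sketched as follows. The text does not state which variant of Cohen's d was computed. This sketch assumes the common pooled-standard-deviation form and takes the first group minus the second, so the sign convention, the function names and the toy values are all assumptions.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d (group_a minus group_b) using the pooled standard deviation.

    Assumes the pooled-SD variant; each group needs at least two values.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def percent_change(baseline, follow_up):
    """Change from baseline to follow-up, expressed in percent of baseline."""
    return 100.0 * (follow_up - baseline) / baseline


# Toy values only: two small groups with equal spread and a mean difference of 1.
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
# A drop from 0.50 to 0.48 is a change of about -4 percent.
print(percent_change(0.50, 0.48))
```

Under this convention, a negative d for a longitudinal FA contrast indicates that the first group (here taken to be PD) declined more than the second.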

# RESULTS

### Imaging Alterations

At baseline, significantly increased FA was observed in PD in midbrain, pons and cerebellum, anterior corona radiata, anterior corpus callosum and left inferior and inferior-occipital longitudinal fasciculus WM (Cohen's d = 1.17) (**Figure 1** and **Table 2**). MD was increased in PD in bilateral primary sensorimotor and supplementary motor GM (Cohen's d = 0.38) (**Figure 2**). No significant differences between PD and HC were observed in FA GM and MD WM.

In the group–by–time interaction analyses, significant differences between PD and HC in FA changes over time were observed in both GM (Cohen's d = −1.0, HC: 1.6%, PD: −4.1%) and WM (Cohen's d = −0.94, HC: 0.0%, PD: −2.3%) (**Figure 1** and **Table 2**). Stronger FA GM decreases were thereby observed in PD in a cluster covering primarily bilateral parietal, occipital, posterior temporal and posterior thalamic and vermis regions. Similarly, stronger FA decreases were observed in PD in the underlying WM including the left optic radiation and anterior and middle corpus callosum. For MD, significantly stronger increases were observed in PD in GM (Cohen's d = 0.95, HC: 0.5%, PD: 2.8%) and WM (Cohen's d = 0.8; HC: 0.8%; PD: 3.6%), including bilateral medial and lateral prefrontal lobes, right insula and posterior thalamus GM and the midbrain WM, left optic radiation and anterior and middle corpus callosum (**Figure 2** and **Table 2**). These differential longitudinal FA and MD changes observed in PD were not associated with PD medication status at follow-up (all p > 0.85).

# Associations Between FA and MD and Function

In PD patients, baseline FA and MD values were neither significantly associated with baseline symptom severity nor with changes in symptom severity from baseline to 1-year follow-up. Baseline FA but not MD alterations in PD significantly predicted changes in WM FA [F(1,65) = 7.7, p = 0.007] and GM MD [F(1,65) = 4.1, p = 0.048] which had been observed in the group–by–time interaction analyses described above (**Figures 3A,B**). Higher baseline FA in the brainstem predicted stronger declines in FA in the posterior corona radiata and corpus callosum. Importantly, both MD and FA WM changes were significantly associated with changes in MDS-UPDRS total scores [FA: F(1,61) = 4.7, p = 0.034; MD: F(1,61) = 4.6, p = 0.036] but not with the MDS-UPDRS subscale scores (**Figures 3C,D**). FA and MD GM changes were not significantly associated with changes in symptom severity as measured by the MDS-UPDRS.

# DISCUSSION

We find significant baseline FA and MD increases in de novo diagnosed PD patients' brainstem and subcortical WM and cortical GM. Within the first year after diagnosis, DTI abnormalities spread to significant portions of the initially unaffected cortex and WM. The progression of FA and MD changes in GM was spatially distinct, with MD changes localized to the frontal lobe regions, whereas FA changes were localized to parietal, occipital and posterior temporal regions. WM changes in both FA and MD co-localized in the posterior corpus callosum and corona radiata. Both FA and MD changes in WM but not in GM were clinically relevant, demonstrating significant correlations with changes in MDS-UPDRS total scores.

Previous research on diffusion alterations in PD has largely focused on abnormal cross-sectional findings in the substantia nigra (Cochrane and Ebmeier, 2013; Schwarz et al., 2013) or other ROIs (Hall et al., 2016), with a minority of studies adopting a hypothesis-free, whole-brain approach (for a review see Hall et al., 2016). With respect to the anatomical location of FA and MD alterations, our cross-sectional and longitudinal findings agree well with previous studies in more advanced PD reporting FA and/or MD alterations in the midbrain (Leh et al., 2007; Cochrane and Ebmeier, 2013; Lenfeldt et al., 2013; Schwarz et al., 2013), cerebellum (Fling et al., 2013; Vercruysse et al., 2015), anterior corona radiata (Agosta et al., 2014; Lee et al., 2014), anterior corpus callosum (Melzer et al., 2012; Agosta et al., 2014; Chan et al., 2014; Lee et al., 2014; Canu et al., 2015), left inferior and inferior-occipital fasciculi (Kamagata et al., 2012; Deng et al., 2013; Lee et al., 2014) and bilateral primary sensorimotor and supplementary motor areas (Fling et al., 2013; Vercruysse et al., 2015). Most of these studies reported decreased FA and/or increased MD in the respective regions. Importantly, all of these studies focused on more advanced disease stages. In contrast, a more complex picture emerged from the present results in the de novo PD population with respect to the directionality of FA changes. Whilst our longitudinal data indeed suggest a faster FA decline in PD as compared to HC, at baseline we observed only increases in WM FA in a widespread anatomical network including corticospinal tracts and subcortical WM. These discrepant findings with respect to FA, as compared to results in more advanced PD patients, point to a more complex picture of the underlying pathology, in particular at early clinical disease stages. Importantly, our findings are in line with two out of three previous studies in de novo PD reporting numerically increased FA values in white matter (Tessa et al., 2008; Zhang et al., 2016). FA directionality is often difficult to interpret in terms of underlying neuropathology. For example, a selective loss of specific fiber directions in crossing-fiber regions may result in increased FA, whilst a general loss of fibers along a specific WM tract is expected to result in decreased FA.

FIGURE 1 | Results of voxel-based morphometry analyses of fractional anisotropy (FA) maps. Bar plots show the eigenvariates extracted from the displayed significant clusters. Outline colors of the bar plots correspond to the contrast colors of respective clusters. HC, healthy control participant; PD, Parkinson's disease patients; GM, gray matter; WM, white matter.


TABLE 2 | Results of voxel-wise analyses of FA and MD maps.

PD, Parkinson's disease; FA, fractional anisotropy; GM, gray matter; HC, healthy controls; MD, mean diffusivity; MNI, Montreal Neurological Institute space; WM, white matter.


Additionally, depending on the original structure composition, a neurodegenerative or a neuroinflammatory process (both being key contributors in PD) can lead to increases and decreases in FA. An increased baseline FA in de novo PD may therefore represent an example of the former situation or point to a different, more complex underlying neuropathology. We note that, in particular, the finding of increased FA in the corticospinal tract is in line with a recent meta-analysis reporting this region as the only one showing consistently increased FA across studies (Atkinson-Clement et al., 2017). Moreover, the significant correlation observed between this increased baseline FA and longitudinal FA and MD changes suggests that higher baseline FA is indeed associated with the stronger FA loss and MD increase observed in the first year of follow-up. These findings support the relevance of these baseline FA values as a potential predictor of the future spread of pathology.

Far fewer studies analyzed the longitudinal progression of DTI signals in PD. Most notably, Zhang et al. (2016) analyzed PPMI DTI data from 122 PD patients and 50 HCs using a WM and subcortical region of interest (ROI) approach based on the JHU-DTI-MNI (Type I WMPM) atlas<sup>3</sup>. GM DTI findings were not analyzed, and the authors selected radial and axial, as opposed to mean, diffusivity measures for their analyses. Using this ROI approach, PD patients did not differ from controls in any WM or subcortical region of interest at baseline. However, in line with the present findings, FA declined significantly more rapidly in PD patients' substantia nigra, midbrain, thalamus, corpus callosum, and frontal white matter. FA changes in all ROIs were not associated with changes in MDS-UPDRS scores. We note that the present study's positive relationship between FA and MDS-UPDRS changes was derived from global, as opposed to ROI, FA changes. Nevertheless, the converging findings between the present and Zhang et al. (2016) studies support the use of DTI to understand WM changes in de novo PD. Importantly, DTI-based measures may provide complementary information in addition to other imaging modalities such as volumetric and resting state functional MRI. On the one hand, although volumetric MRI is very sensitive to alterations in regional gray matter volume and cortical thickness, it provides little insight into the underlying gray and white matter microstructural changes. Such volumetric changes presumably reflect irreversible damage due to loss of the underlying tissue. On the other hand, resting state functional MRI measures provide an insight into neural dysfunction, i.e., loss of activity or connectivity (Dukart et al., 2017). However, such local functional alterations may also reflect changes in the underlying structure (i.e., due to partial volume effects) or damage of remote structures, i.e., reduced input (Dukart and Bertolino, 2014). In that sense, DTI provides a complementary insight into the underlying microstructural changes in the respective regions. As diffusion processes may be affected by any alterations in tissue organization such as inflammatory, demyelinating or other processes, DTI provides a potentially more sensitive biomarker to pick up early microstructural changes in PD prior to tissue loss detected through volumetric MRI and without confounding factors related to interpretation of resting state functional MRI.

<sup>3</sup>http://cmrm.med.jhmi.edu

FIGURE 3 | Results of regression analyses between baseline and longitudinal FA and MD values and clinical scores. Significant associations between baseline FA and longitudinal FA and MD changes are displayed in (A) and (B). Significant associations between longitudinal FA and MD changes and changes in symptom severity as measured with the MDS-UPDRS (Movement Disorder Society-sponsored revision of the Unified Parkinson's Disease Rating Scale) are displayed in (C) and (D).

Braak et al. (1999) demonstrated that alpha-synuclein-positive inclusions are present in the axons of PD patients, e.g., the intramedullary vagal axons from the dorsal motor vagal area, which itself contained many alpha-synuclein-positive Lewy bodies. Moreover, Iseki et al. (2001) found that ubiquitin-positive inclusions in the central nucleus of the amygdala of brains from patients with dementia with Lewy bodies were also alpha-synuclein and tyrosine hydroxylase positive, suggesting that the degeneration of terminal axons of affected substantia nigra neurons was the source of the amygdala pathology in these patients. Consistent with this spreading model, the present study found that baseline brainstem FA WM values significantly correlated with both 1-year changes in FA WM and MD GM values. These findings indicate that DTI may serve as a valuable biomarker of both disease progression and, extrapolating from the preclinical study of Spencer et al. (2017), also drug efficacy in studies of disease-modifying agents.

Burke and colleagues suggest that axonal damage in PD may occur prior to neuronal loss, as described in their "axonal dying back" model (Burke and O'Malley, 2013; Tagliaferro and Burke, 2016). The evidence for this claim derives primarily from various mouse models of PD [e.g., BAC, heterozygous null gene engrailed1 (En1), and Nurr1 transgenic mouse models], which all show axonal pathology prior to neuronal loss (for a review see Tagliaferro and Burke, 2016).

Correspondingly, in humans, circa 70% of nigrostriatal dopaminergic terminals are estimated to be lost at the time of PD diagnosis, while neurodegeneration has reached only 30% of substantia nigra dopaminergic neurons (Cheng et al., 2010). Moreover, a recent study of Alzheimer-related pathology suggests that DTI abnormalities may portend the downstream aggregation of pathological proteins: in a multi-modal imaging study with MRI DTI and FTP tau PET, Jacobs et al. (2018) demonstrated that abnormal MD in the hippocampal cingulum bundle predicted tau accumulation in the downstream-connected posterior cingulate cortex 2 years later in amyloid-positive older individuals. We note that, as in synucleinopathies, intraneuronal pathologically aggregated tau proteins in Alzheimer's disease are likewise hypothesized to spread from neuron to neuron along WM tracts (Clavaguera et al., 2009). Taken together, these findings suggest that DTI may be used not only as an early marker of alpha-synuclein pathology, but also of future upstream neuronal cell loss and downstream pathological alpha-synuclein aggregation.

The present study demonstrated baseline brainstem and cortical abnormalities in de novo PD WM and GM with DTI, as expected, and a propagation of these abnormalities within the first year of de novo PD to include large swathes of cortical regions. Importantly, both the FA and MD changes were functionally relevant, as evidenced by significant correlations with changes in MDS-UPDRS total scores. The patterns of cortical WM neurodegeneration in PD mimicked those described during normal aging, with stronger FA effects in frontal and stronger MD effects in parietal regions, which together suggest a selective vulnerability of these brain regions. In the context of preclinical PD models suggesting that axons are affected early in synucleinopathy (i.e., prior to neuronal loss) (Cheng et al., 2010), the present findings indicate that DTI-based measures of WM and GM integrity may represent powerful early biomarkers of disease progression in de novo PD.


# DATA AVAILABILITY STATEMENT

The datasets analyzed for this study can be found in the PPMI data repository (http://www.ppmi-info.org/access-dataspecimens/download-data/).

### AUTHOR CONTRIBUTIONS

JD, FS, FB, and AB conceptualized the analysis plan. JD analyzed the data. KT and JD wrote the manuscript.

# FUNDING

KT, FS, FB, AB, and JD are current or former full-time employees of F. Hoffmann-La Roche, Basel, Switzerland. The authors received no specific funding for this work. F. Hoffmann-La Roche provided financial contribution in the form of salary for all authors but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

## ACKNOWLEDGMENTS

Data used in the preparation of this article were obtained from the Parkinson's Progression Markers Initiative (PPMI) database (www.ppmi-info.org/data). For up-to-date information on the study, visit www.ppmi-info.org. PPMI – a public-private partnership – is funded by the Michael J. Fox Foundation for Parkinson's Research and funding partners, including Abbvie, Avid Radiopharmaceuticals, Biogen Idec, Bristol-Myers Squibb, Covance, GE Healthcare, Genentech, GlaxoSmithKline, Lilly, Lundbeck, Merck, Meso Scale Discovery, Pfizer, Piramal, Roche, and UCB.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Taylor, Sambataro, Boess, Bertolino and Dukart. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Nonlinear Simulation Framework Supports Adjusting for Age When Analyzing BrainAGE

Trang T. Le<sup>1,2†</sup>, Rayus T. Kuplicki<sup>1\*†</sup>, Brett A. McKinney<sup>2,3</sup>, Hung-Wen Yeh<sup>1</sup>, Wesley K. Thompson<sup>4</sup>, Martin P. Paulus<sup>1</sup> and Tulsa 1000 Investigators<sup>‡</sup>

<sup>1</sup> Laureate Institute for Brain Research, Tulsa, OK, United States, <sup>2</sup> Department of Mathematics, University of Tulsa, Tulsa, OK, United States, <sup>3</sup> Tandy School of Computer Science, University of Tulsa, Tulsa, OK, United States, <sup>4</sup> Department of Psychiatry, University of California, San Diego, La Jolla, CA, United States

### Edited by:

Katja Franke, Universitätsklinikum Jena, Germany

### Reviewed by:

Guray Erus, University of Pennsylvania, United States

Franziskus Liem, Universität Zürich, Switzerland

### \*Correspondence:

Rayus T. Kuplicki

rkuplicki@laureateinstitute.org

†These authors have contributed equally to this work

‡Tulsa 1000 Investigators: Robin L Aupperle, Jerzy Bodurka, Yoon-Hee Cha, Justin S. Feinstein, Sahib S. Khalsa, Jonathan Savitz, W Kyle Simmons, Teresa A Victor

Received: 30 March 2018 Accepted: 21 September 2018 Published: 24 October 2018

### Citation:

Le TT, Kuplicki RT, McKinney BA, Yeh H-W, Thompson WK, Paulus MP and Tulsa 1000 Investigators (2018) A Nonlinear Simulation Framework Supports Adjusting for Age When Analyzing BrainAGE. Front. Aging Neurosci. 10:317. doi: 10.3389/fnagi.2018.00317

Several imaging modalities, including T1-weighted structural imaging, diffusion tensor imaging, and functional MRI, can show chronological age-related changes. With machine learning algorithms, an individual's imaging data can be used to predict their age with reasonable accuracy. While details vary according to modality, the general strategy is to: (1) extract image-related features, (2) build a model on a training set that uses those features to predict an individual's age, (3) validate the model on a test dataset, producing a predicted age for each individual, (4) define the "Brain Age Gap Estimate" (BrainAGE) as the difference between an individual's predicted age and his/her chronological age, (5) estimate the relationship between BrainAGE and other variables of interest, and (6) make inferences about those variables and accelerated or delayed brain aging. For example, a group of individuals with overall positive BrainAGE may show signs of accelerated aging in other variables as well. There is inevitably an overestimation of the age of younger individuals and an underestimation of the age of older individuals due to "regression to the mean." The correlation between chronological age and BrainAGE may significantly impact the relationship between BrainAGE and other variables of interest when they are also related to age. In this study, we examine the detectability of variable effects under different assumptions. We use empirical results from two separate datasets [training = 475 healthy volunteers, aged 18–60 years (259 female); testing = 489 participants including people with mood/anxiety, substance use, eating disorders and healthy controls, aged 18–56 years (312 female)] to inform simulation parameter selection.
Outcomes in simulated and empirical data strongly support the proposal that models incorporating BrainAGE should include chronological age as a covariate. We propose either including age as a covariate in step 5 of the above framework, or employing a multistep procedure where age is regressed out of BrainAGE prior to step 5, producing BrainAGE Residualized (BrainAGER) scores.

Keywords: BrainAGE, simulation, false positives, SVR, MRI, aging

# INTRODUCTION

Aging is a biological process that can affect behavioral and cognitive dimensions. Biological age as measured by telomere length deviates from an individual's chronological age as a result of environment, lifestyle, and genetics (Shammas, 2011). However, other measures of biological age that may be particularly relevant to psychopathology can involve structural and functional changes in the brain.

Several imaging modalities, including T1-weighted structural imaging (Franke et al., 2010), diffusion tensor imaging (Han et al., 2014; Lin et al., 2016), and functional MRI (Tian et al., 2016) have been used in conjunction with machine learning algorithms to predict an individual's age. Recently, integration of neuroimaging data of different feature types and across multiple modalities has been shown to improve age prediction (Erus et al., 2015; Liem et al., 2017; Gutierrez Becker et al., 2018). While the details vary according to modality, the general strategy has been to (1) extract image-related features, (2) build a model on a training set composed of healthy participants using these features to predict participant age, (3) apply that model to a testing set, producing a predicted age for each individual, (4) compute the difference between a participant's predicted age and chronological age (often referred to as Brain Age Gap Estimate, BrainAGE, or brain predicted age difference, brain-PAD), (5) test for relationships between other variables of interest and BrainAGE, and (6) make inferences about accelerated or delayed brain aging (Cole and Franke, 2017). Variables of interest have included physical fitness (Ritchie et al., 2017), physical activity (Steffener et al., 2016), cognitive impairment after traumatic brain injury (Cole et al., 2015), mortality risk in elderly participants (Cole et al., 2018), acute ibuprofen administration in healthy participants (Le et al., 2018), or status of various diseases and disorders such as diabetes (Franke et al., 2013), Alzheimer's disease (Gaser et al., 2013; Löwe et al., 2016), psychiatric disorders (Koutsouleris et al., 2014; Nenadić et al., 2017), and human immunodeficiency virus (Wilkins, 2017).

Support Vector Regression (SVR) with a radial kernel is a commonly used machine learning algorithm to predict age and compute BrainAGE (Franke et al., 2010), along with other methods such as Gaussian process and relevance vector regression (Drucker et al., 1997). The residual error of these age-predicting models, BrainAGE, is necessarily correlated with age, which results in an overestimation of the age of younger individuals and an underestimation of the age of older individuals. This occurs because these algorithms, like all regression methods, are subject to the fundamental phenomenon of "regression toward the mean" (Galton, 1886). A theoretical basis for this phenomenon is presented in section Theoretical Basis for the Age-BrainAGE Correlation. In practice, the correlation between chronological age and BrainAGE is visually evident in many figures of chronological vs. predicted age (Franke et al., 2010; Cole et al., 2018). While most studies involving BrainAGE have not discussed the age-BrainAGE correlation, some have accounted for this correlation by using predicted age as the primary outcome, which is similar to the correction we propose (Erus et al., 2015; Habes et al., 2016).

The age-BrainAGE correlation may affect the apparent relationship between BrainAGE and variables of interest when these other variables are also related to age. In the clinical neuroscience domain, for example, we may be interested in covariates including physiological variables such as body composition or psychological measures of mood or testing performance, some of which have clear relationships with age. In this study, we examine the detectability of multiple covariate effects in both real and simulated data. Using real data, we characterized relationships between BrainAGE, age, and other variables of interest. Then, we generated a known "ground truth" with characteristics similar to what we observed in real data. In our simulation model, age has a direct effect on the variables of interest, which may in turn affect simulated imaging features. We include both linear and nonlinear effects at each level.

The goals of the current study are: (1) to highlight the universal correlation between chronological age and BrainAGE in theory and practice and (2) to develop a general framework for simulating age-dependent data that can be used to investigate the effect of the age-BrainAGE correlation in subsequent analyses. One of the challenges of determining the best practices for using BrainAGE in statistical modeling is that variables of interest may be related to age, but not directly related to accelerated or delayed brain aging. In that case, spurious relationships with BrainAGE may be observed. Our results strongly support the proposal that models including BrainAGE as an independent variable should be adjusted for chronological age as well.

### METHODS

We begin with a theoretical explanation for regression toward the mean and the concurrent correlation between the residuals and observed values for any regression. Then, we show in our own data the relationships between chronological age, BrainAGE, and other covariates of interest as a basis for the parameters in our simulations. Finally, we describe a simulation approach to generate data with a comparable age effect on brain image features and show how the age-BrainAGE correlation can contribute to observed relationships, even when the simulated independent variables do not associate with imaging features. The R scripts for simulation and analysis are publicly available on the GitHub repository https://github.com/lelaboratoire/BrainAGE-simulation.

### Theoretical Basis for the Age-BrainAGE Correlation

### Regression Toward the Mean

Consider n data points $(x_i, y_i)$, $i = 1, \ldots, n$, used to fit a simple linear regression $y = \alpha + \beta x + \varepsilon$. Least-squares estimation leads to

$$
\hat{\beta} = r_{xy} \frac{s_y}{s_x}, \qquad \hat{\alpha} = \overline{y} - \hat{\beta}\,\overline{x},
$$

where $r_{xy}$ is the Pearson correlation between x and y, and $s_x$ and $s_y$ are their respective standard deviations. Substituting these formulas into the fitted line $\hat{y} = \hat{\alpha} + \hat{\beta} x$ gives

$$\frac{\hat{y} - \overline{y}}{s_y} = r_{xy} \left( \frac{x - \overline{x}}{s_x} \right).$$

In this setting, regression toward the mean refers to the phenomenon that the standardized predicted value of y is closer to its mean than that of x to its mean for any imperfect correlation, $-1 < r_{xy} < 1$. The weaker the correlation, the greater the extent of regression toward the mean. For perfect correlations ($|r_{xy}| = 1$), the standardized distance between the predicted value of y and its mean equals that of x to its mean and there is no regression toward the mean. The implication for BrainAGE is that the age of younger individuals tends to be overestimated and the age of older individuals tends to be underestimated.
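The shrinkage identity above can be checked numerically. The following minimal Python sketch fits a least-squares line using the two formulas for the slope and intercept, then compares the standardized prediction with the standardized predictor. The data-generating line, noise level, sample size, and function names are illustrative assumptions, not values from the study.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_line(x, y):
    """Least-squares fit via beta = r * s_y / s_x and alpha = ybar - beta * xbar."""
    r = pearson(x, y)
    beta = r * statistics.stdev(y) / statistics.stdev(x)
    alpha = statistics.fmean(y) - beta * statistics.fmean(x)
    return alpha, beta, r


# Illustrative data (assumed): y depends linearly on x with added Gaussian noise.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [2.0 + 0.6 * xi + rng.gauss(0.0, 1.0) for xi in x]

alpha, beta, r = fit_line(x, y)

# Standardized distance from the mean for the most extreme x and for its prediction.
x0 = max(x)
z_x = (x0 - statistics.fmean(x)) / statistics.stdev(x)
z_pred = (alpha + beta * x0 - statistics.fmean(y)) / statistics.stdev(y)

print(f"r = {r:.3f}, z_x = {z_x:.3f}, z_pred = {z_pred:.3f}")
```

Because the correlation in this toy example is imperfect, the standardized prediction is pulled toward zero by exactly the factor r.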

### Partition of Variance or Analysis of Variance (ANOVA)

In the general setting $y = f(X) + \varepsilon$, where X can be any dimension and $f(\cdot)$ can be any regression model, the variance of y is partitioned into a part that can be explained by X and a part due to random error: $\sigma_y^2 = \sigma_X^2 + \sigma_\varepsilon^2$. Then

$$\begin{aligned} \text{Cov}\left(y, f(X)\right) &= \sigma_X^2, \qquad \text{Cov}\left(y, \varepsilon\right) = \sigma_\varepsilon^2, \\ \text{Corr}\left(y, f(X)\right) &= \frac{\sigma_X^2}{\sqrt{\sigma_X^2 + \sigma_\varepsilon^2}\sqrt{\sigma_X^2}} = \frac{\sigma_X}{\sqrt{\sigma_X^2 + \sigma_\varepsilon^2}}, \\ \text{Corr}\left(y, \varepsilon\right) &= \frac{\sigma_\varepsilon}{\sqrt{\sigma_X^2 + \sigma_\varepsilon^2}}. \end{aligned}$$

For $\hat{y} = \hat{f}(X)$, $y = \hat{f}(X) + \hat{\varepsilon}$ and

$$\text{Corr}\left(y, \hat{f}(X)\right) = \frac{\hat{\sigma}_X}{\sqrt{\hat{\sigma}_X^2 + \hat{\sigma}_\varepsilon^2}}, \qquad \text{Corr}\left(y, \hat{\varepsilon}\right) = \frac{\hat{\sigma}_\varepsilon}{\sqrt{\hat{\sigma}_X^2 + \hat{\sigma}_\varepsilon^2}}$$

where $\hat{\sigma}_X^2 = \text{Var}(\hat{f}(X)) = \text{Var}(\hat{y})$ and $\hat{\sigma}_\varepsilon^2 = \text{Var}(\hat{\varepsilon})$.

Thus, $\text{Corr}(y, \hat{\varepsilon}) > 0$ unless $\hat{f}(X)$ predicts y perfectly with $\hat{\sigma}_\varepsilon = 0$. The correlation formulas suggest that the correlation between the residual and y decreases as the correlation between y and $\hat{y}$, i.e., the prediction accuracy of $\hat{f}(X)$, increases. **Supplementary Figure 1** illustrates this phenomenon using a simple simulation where y was a function of x plus random normal noise. As the noise decreases (and fit increases), the correlation between y and the residuals decreases as well.
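A small simulation in the spirit of the one described above can make the trade-off concrete. This Python sketch is an assumption-laden illustration (the age-like range of x, the two noise levels, and the function names are hypothetical, and it is not the authors' supplementary code): it fits a least-squares line to y = x + noise and reports the correlation of y with the fitted values and with the residuals.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_and_residual_correlations(noise_sd, n=400, seed=1):
    """Simulate y = x + noise, fit OLS, return (Corr(y, y_hat), Corr(y, residual))."""
    rng = random.Random(seed)
    x = [rng.uniform(18.0, 60.0) for _ in range(n)]
    y = [xi + rng.gauss(0.0, noise_sd) for xi in x]
    mx, my = statistics.fmean(x), statistics.fmean(y)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    alpha = my - beta * mx
    y_hat = [alpha + beta * xi for xi in x]
    resid = [yi - yh for yi, yh in zip(y, y_hat)]
    return pearson(y, y_hat), pearson(y, resid)


# Two assumed noise levels: a poor fit and a good fit.
noisy_fit, noisy_resid = fit_and_residual_correlations(noise_sd=10.0)
clean_fit, clean_resid = fit_and_residual_correlations(noise_sd=2.0)

print(f"noise sd 10: Corr(y, y_hat) = {noisy_fit:.2f}, Corr(y, resid) = {noisy_resid:.2f}")
print(f"noise sd  2: Corr(y, y_hat) = {clean_fit:.2f}, Corr(y, resid) = {clean_resid:.2f}")
```

For a least-squares fit with an intercept, the two squared correlations sum to one, so any loss in prediction accuracy reappears as correlation between the outcome and the residual.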

In the context of BrainAGE, the goal is to find $\hat{f}(\cdot)$ that best predicts chronological age (y) using brain measures as X, and BrainAGE is computed as $-\hat{\varepsilon} = \hat{y} - y$. Because $\hat{f}(X)$ never predicts chronological age perfectly, BrainAGE remains correlated with age. When BrainAGE is used as the response variable in subsequent analyses to make inferences on a covariate Z, it is important to check whether Z is associated with chronological age. If Z is not associated with chronological age, then one may simply evaluate the bivariate association between BrainAGE and Z. On the other hand, if Z is associated with both chronological age and BrainAGE, chronological age may confound the relationship between BrainAGE and Z (Elwood, 1992) and should be taken into account. Confounding effects can be addressed at study design (e.g., randomization and matching) or in statistical analysis [e.g., stratification of the confounder or including the confounder as a covariate (Pourhoseingholi et al., 2012)]. For example, Franke et al. (2010) considered a variable Z that represents two groups (ill vs. healthy) and selected two groups of individuals with similar chronological age (so Z is not associated with chronological age) to compare their BrainAGE. In the current work, we include chronological age as a covariate and evaluate this approach in the context of BrainAGE.

### Empirical Data

We used two separate datasets to illustrate the correlation between BrainAGE and chronological age and the effect this can have on associations with covariates of interest. All data were collected at the Laureate Institute for Brain Research between 2009 and 2017. All protocols were approved by Western Institutional Review Board (www.wirb.com). Participants signed written informed consent and received financial compensation for their participation.

### Training Dataset

Structural MRI data were collected from 475 healthy volunteers (mean age ± sd = 30.5 ± 10.3 years; age range = 18–60 years; 259 female) between 2009 and 2017. Each participant was scanned in a 3T GE MR750 whole body scanner. Scans were acquired using axial T1-weighted MP-RAGE sequences with a 24 cm FOV, 256 × 256 acquisition matrix, 8-degree flip angle and 0.9375 × 0.9375 mm in-plane resolution with no gap. Other parameters varied within the following ranges: TR 5.736–6.292 ms, TE 1.896–2.104 ms, and slice thickness 0.9–1.2 mm, with either an 8-channel (General Electric, Milwaukee, WI) or a 32-channel (Nova Medical Inc., Wilmington, MA) phased array coil. Healthy neuropsychiatric status was assessed using either the Mini-International Neuropsychiatric Interview (Sheehan et al., 1998) or the Structured Clinical Interview for DSM-IV (First et al., 2002).

### Testing Dataset

Structural MRI data were collected from 489 participants (mean age ± sd = 34.6 ± 10.6 years; age range = 18–56 years; 312 female) as part of Tulsa 1000, a longitudinal observational study including people with mood/anxiety, substance use, eating disorders, and healthy controls. Inclusion criteria for the participant populations were Patient Health Questionnaire ≥10, Overall Anxiety Severity and Impairment Scale ≥8, Drug Abuse Screening Test >3, or SCOFF ≥2. Exclusion criteria included a history of significant brain trauma, neurological disorders, change in medication within 6 weeks prior to scanning, bipolar disorder, and schizophrenia. Scanning parameters for this dataset were: 24 cm FOV, 256 × 256 acquisition matrix, 186 axial slices, 0.9 mm slice thickness with no gap, TR/TE = 5/2.012 ms, using an 8-channel phased array coil (General Electric, Milwaukee, WI). Testing and training sets differed on mean age (t = 6.2, p < 0.0001, mean difference 4.2 years) and sex composition (χ<sup>2</sup> = 8.2, p = 0.004).

All participants in the testing dataset also underwent an intense battery of assessments including self-report, clinical interviews, neuropsychological testing, and body composition analysis. For full details, please see Victor et al. (2018). From these, we selected 154 measures, which were used to illustrate the normal range of correlations with age and how these can affect the relationship between BrainAGE and covariates of interest.

### Image Processing

All images in both the testing and training sets were processed using Freesurfer version 6.0.0 (Dale et al., 1999) in order to produce gray/non-gray matter masks. Then, using a procedure similar to that of Franke et al. (2010) but implemented in AFNI, all gray matter masks were transformed to MNI space via affine transformation, smoothed with an 8 mm Gaussian kernel, and downsampled to 8 × 8 × 8 mm voxels. This produced a set of 3,707 voxels per participant, with the value at each voxel representing the fraction of that voxel comprised of gray matter.

R (version 3.2.2) and the R package caret (version 6.0.76) were used to fit a support vector regression (SVR) model with radial basis functions. The ε (tolerance margin) was fixed at 0.000145, and the cost parameter was tuned using 5 repeats of 10-fold cross validation in the training set, with a grid search that allowed cost to vary from 0.25 to 4,096. The final best model (cost = 2) was then applied to the testing set to produce one predicted age for each participant. BrainAGE was taken to be predicted age minus chronological age.
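The resampling scheme described above can be sketched without any model fitting. The Python snippet below is a hypothetical illustration only: the study used the R package caret, and the function name, the seed, and the choice of a powers-of-two cost grid are assumptions (the grid merely matches the reported endpoints of 0.25 and 4,096). It generates the train/validation index splits for 5 repeats of 10-fold cross validation on 475 training participants.

```python
import random


def repeated_kfold(n_samples, n_folds=10, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, val_idx) for repeated k-fold cross validation."""
    rng = random.Random(seed)
    for rep in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal shuffled indices into n_folds roughly equal folds.
        folds = [order[f::n_folds] for f in range(n_folds)]
        for f, val_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield rep, f, train_idx, val_idx


# Assumed grid: powers of two spanning the reported cost range 0.25 to 4,096.
cost_grid = [2.0 ** k for k in range(-2, 13)]

# 475 is the size of the training dataset reported in the text.
splits = list(repeated_kfold(n_samples=475))

print(f"{len(cost_grid)} candidate cost values, {len(splits)} train/validation splits each")
```

Each candidate cost value would be scored on every validation split, and the value with the best average performance would then be refit on the full training set.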

Additionally, we define the Brain Age Gap Estimate Residualized (BrainAGER) to be the residual of the regression of BrainAGE on age to remove the remaining linear bias of age. This way, we have a measure of deviation from expected age that is linearly uncorrelated with chronological age.
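As a minimal sketch of this residualization, the Python snippet below computes BrainAGE as predicted minus chronological age and then takes the residual of a least-squares regression of BrainAGE on age. The toy age predictor (predictions shrunk toward the mean age with slope 0.65 plus Gaussian noise) and all variable names are assumptions chosen for illustration, not the study's SVR output.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def residualize(y, x):
    """Residuals from the least-squares regression of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return [b - (my + beta * (a - mx)) for a, b in zip(x, y)]


rng = random.Random(42)
age = [rng.uniform(18.0, 56.0) for _ in range(489)]
mean_age = statistics.fmean(age)

# Assumed toy predictor: predictions regress toward the mean age, plus noise.
predicted_age = [mean_age + 0.65 * (a - mean_age) + rng.gauss(0.0, 5.0) for a in age]

brain_age = [p - a for p, a in zip(predicted_age, age)]  # BrainAGE
brain_ager = residualize(brain_age, age)                 # BrainAGER

print(f"Corr(age, BrainAGE)  = {pearson(age, brain_age):.3f}")
print(f"Corr(age, BrainAGER) = {pearson(age, brain_ager):.3f}")
```

By construction, the residualized score has zero sample mean and zero linear correlation with chronological age, whereas the raw score is negatively correlated with age in this toy setting.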

### Simulation

To investigate the effect of the age-BrainAGE correlation on subsequent modeling results, we simulated hierarchical correlation structures among brain features, chronological age and covariates using a generative biological model (**Figure 1**). We then generated two groups of independent variables. Within each group of variables, some are dependent on age and others are not. One group was used in the simulation of neuroimaging features, while the other was not. We randomly split the data set into two subsets, trained SVR on the training set and computed BrainAGE on the testing set. On the testing set, we conducted linear regressions of BrainAGE on all independent variables, both with and without chronological age. With 1,000 replications, we assessed the significance of the contribution from the independent variables by examining the distribution of the resulting p-values.

### Model Definition

A realistic simulation model should capture the properties of normal age-related brain volumetric data, such as brain region-dependent changes and nonlinear chronological age dependence (Fjell et al., 2013). A realistic simulation should also include the ability to generate age-dependent deviations from the normal population and age-dependent covariates that may influence BrainAGE nonlinearly. We consider a biological causal path model and develop a novel age-basis-function approach for simulating BrainAGE data with covariates (**Figure 1** and **Supplementary Figure 2**).

Denoting age by A, we assumed an underlying (unobserved) biological process represented by m functions of age, denoted as $f_m(A)$, which we referred to as age basis functions (ABFs). Here, without a function space defined, the term "basis" is used loosely to indicate the elementary functions that can be combined linearly to form any variable of interest y:

$$y = \sum_{m=1} w_m f_m(A) + \epsilon. \tag{1}$$

In this study, we implemented three monotone decreasing ABFs that can generate a wide range of non-linear functions (**Supplementary Figure 3**), and used these ABFs to simulate covariates of interest and the features extracted from an imaging modality.

### **Simulating covariates**

A covariate of interest $Z_j$ for participant i with chronological age $A_i$ was generated by

$$Z_{ij} = \sum_{m=1}^{3} \alpha_{mj} f_m(A_i) + \epsilon_{ij} \tag{2}$$

where $\alpha_{mj}$ is a covariate-specific weight and the covariate-specific error $\epsilon_{ij} \sim N(0, \sigma_j^2)$ denotes Gaussian noise with mean 0 and standard deviation $\sigma_j$.
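Equation (2) can be sketched in a few lines of Python. The three basis functions below are hypothetical stand-ins (a linear, a concave, and an exponential decreasing curve on the 18 to 60 year range); the functions actually used in the study are shown in its supplementary material and may differ. The weights, noise level, and names are likewise assumptions.

```python
import math
import random

AGE_MIN, AGE_MAX = 18.0, 60.0


def f1(age):
    """Hypothetical linear decreasing ABF, from 1 at AGE_MIN to 0 at AGE_MAX."""
    return 1.0 - (age - AGE_MIN) / (AGE_MAX - AGE_MIN)


def f2(age):
    """Hypothetical concave decreasing ABF."""
    return 1.0 - ((age - AGE_MIN) / (AGE_MAX - AGE_MIN)) ** 2


def f3(age):
    """Hypothetical exponentially decaying ABF."""
    return math.exp(-(age - AGE_MIN) / 10.0)


ABFS = (f1, f2, f3)


def simulate_covariate(ages, alphas, sigma, rng):
    """Equation (2): Z_ij = sum_m alpha_mj * f_m(A_i) + eps_ij, eps_ij ~ N(0, sigma^2)."""
    return [
        sum(a * f(age) for a, f in zip(alphas, ABFS)) + rng.gauss(0.0, sigma)
        for age in ages
    ]


rng = random.Random(7)
ages = [rng.uniform(AGE_MIN, AGE_MAX) for _ in range(200)]

# A covariate built from the linear ABF only, and one that is pure noise.
z_linear = simulate_covariate(ages, alphas=(1.0, 0.0, 0.0), sigma=0.1, rng=rng)
z_noise = simulate_covariate(ages, alphas=(0.0, 0.0, 0.0), sigma=0.1, rng=rng)
```

Setting a weight to zero switches off the corresponding basis function, so the pattern of zero and non-zero weights controls which age trends a covariate carries.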

### **Simulating imaging modality**

The proportional gray-matter volume for voxel k of a participant i with chronological age $A_i$ was generated by

$$v_{ik} = \sum_{m=1}^{3} w_{mik} f_m(A_i) + \epsilon_i \tag{3}$$

or, in short, $v_{ik} = f(A_i) + \epsilon_i$, where $\epsilon_i$ represents Gaussian noise with mean 0 and standard deviation $\sigma_v$. This setting allows capturing within-participant correlations (4b) and spatial dependence within participants (4c):

$$\text{Var}\left(v_{ik}\right) = \text{Var}\left(f(A_i)\right) + \sigma_v^2 \tag{4a}$$

$$\text{Cov}\left(v_{ik}, v_{i'k}\right) = \text{Cov}\left(f(A_i), f(A_{i'})\right) + \sigma_v^2 \tag{4b}$$

$$\text{Cov}\left(v_{ik}, v_{ik'}\right) = \text{Var}\left(f(A_i)\right) \tag{4c}$$

Note that the weight function $w_{mik}(A_i)$ allows the weights of ABFs to vary across individuals and volumes, and as a function of an individual's chronological age.

To further make the imaging modality dependent on some covariates, we let

$$
w_{mik} = w_{mk} + D_i \tag{5}
$$

where $w_{mk}$ is the population mean weight for ABF $f_m$ at voxel k, and the participant-level departure $D_i$ depends on the first q variables (covariates):

$$D_i = \gamma \sum_{j=1}^{q} Z_{ij}(A_i) \tag{6}$$

FIGURE 1 | … particular individual i, the ABFs are combined to create volume k's gray matter proportion $v_{ik}$ (orange, black, and blue arrows) and age-dependent covariates of interest, $Z_{ij}(A)$, with a different set of coefficients $\alpha_j$. (B) Some of the $Z_{ij}$ are then fed back into the $w_{ik}$ when generating volume $v_{ik}$, which leads to two levels of age association between covariate and BrainAGE. (C) Proportional gray matter volume (volumetric data) generated from non-linear combinations of ABFs. (D) Predicted-age and BrainAGE computed from simulated volumetric data and simulated chronological age with Support Vector-based regression; (E) Test for association between BrainAGE and covariates of interest.

Other measurable variables, $Z_j$ with $j > q$, do not contribute to the weight deviations. In addition to the age-related imaging features that are generated from the ABFs, we also added 25% "background" features that do not correlate with age. Other parameters such as the standard deviation of the noise $\epsilon$ were chosen with the objective of yielding $R^2$ and MAE values that closely match our empirical results when the volumetric features were used as inputs to the support vector regression (SVR) model to estimate chronological age. Nevertheless, the choice of parameters and even the simulation design matrix do not affect the overall improvement of the regression that includes age as an explanatory variable over the regression without age.
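The voxel-generation step, in which the first q covariates shift the ABF weights before the voxel values are formed, can be sketched as follows. This Python example is a simplified illustration under stated assumptions: the three basis functions, the weights, the covariate values, and the function names are hypothetical, and noise is drawn independently for each voxel, which is one possible reading of the noise term.

```python
import math
import random

AGE_MIN, AGE_MAX = 18.0, 60.0

# Hypothetical monotone decreasing age basis functions (stand-ins for the study's ABFs).
ABFS = (
    lambda a: 1.0 - (a - AGE_MIN) / (AGE_MAX - AGE_MIN),
    lambda a: 1.0 - ((a - AGE_MIN) / (AGE_MAX - AGE_MIN)) ** 2,
    lambda a: math.exp(-(a - AGE_MIN) / 10.0),
)


def departure(z_row, q, gamma):
    """Participant-level departure: gamma times the sum of the first q covariates."""
    return gamma * sum(z_row[:q])


def simulate_voxels(age, z_row, mean_weights, q, gamma, sigma_v, rng):
    """Voxel values v_ik = sum_m (w_mk + D_i) * f_m(A_i) + Gaussian noise."""
    d_i = departure(z_row, q, gamma)
    basis = [f(age) for f in ABFS]
    return [
        sum((w + d_i) * b for w, b in zip(w_k, basis)) + rng.gauss(0.0, sigma_v)
        for w_k in mean_weights
    ]


rng = random.Random(3)
mean_weights = [(0.5, 0.2, 0.1), (0.1, 0.4, 0.3)]  # w_mk for two hypothetical voxels
z_row = [0.8, -0.2, 1.5, 0.3]                      # four covariates for one participant

# Only the first q = 2 covariates influence the weights.
voxels = simulate_voxels(30.0, z_row, mean_weights, q=2, gamma=0.05, sigma_v=0.01, rng=rng)
print([round(v, 3) for v in voxels])
```

In this sketch, changing the third or fourth covariate leaves the voxel values untouched, which mirrors the distinction between covariates that do and do not feed back into the imaging features.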

Finally, we carried out linear regressions of the covariates of interest on BrainAGE, with and without including age as an explanatory variable in the model. Over 100 replications, we assessed the detectability of the covariates as significant contributors to BrainAGE by examining their p-value distributions. In the ideal case, we should detect relationships between BrainAGE and the covariates $Z_j$.
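The comparison of models with and without age can be illustrated with a toy example. The Python sketch below treats BrainAGE as the response, simulates a covariate that depends on age but has no direct link to BrainAGE, and tests its slope with and without adjusting for age. The adjusted slope is obtained by residualizing both variables on age, which is equivalent to including age in the linear model. All numbers and names are assumptions, and the p-values use a normal approximation to the t distribution rather than an exact test.

```python
import random
import statistics


def ols_residuals(y, x):
    """Residuals from the least-squares regression of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return [b - (my + beta * (a - mx)) for a, b in zip(x, y)]


def slope_test(y, x, n_other_predictors=0):
    """t statistic and approximate two-sided p-value for the slope of y on x.

    If x and y were residualized on other predictors, pass their count so the
    residual degrees of freedom are reduced accordingly.
    """
    n = len(y)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    rss = sum((b - my - beta * (a - mx)) ** 2 for a, b in zip(x, y))
    df = n - 2 - n_other_predictors
    t = beta / (rss / df / sxx) ** 0.5
    p = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t)))  # normal approximation
    return t, p


rng = random.Random(11)
n = 2000
age = [rng.uniform(18.0, 56.0) for _ in range(n)]
mean_age = statistics.fmean(age)

# Assumed toy data: Z depends on age only; BrainAGE is negatively related to age.
z = [0.03 * a + rng.gauss(0.0, 1.0) for a in age]
brain_age = [-0.35 * (a - mean_age) + rng.gauss(0.0, 5.0) for a in age]

t_unadj, p_unadj = slope_test(brain_age, z)
t_adj, p_adj = slope_test(
    ols_residuals(brain_age, age), ols_residuals(z, age), n_other_predictors=1
)

print(f"without age: t = {t_unadj:.2f}, p = {p_unadj:.3g}")
print(f"with age:    t = {t_adj:.2f}, p = {p_adj:.3g}")
```

In this toy setting the unadjusted model reports a strong association that is produced entirely by the shared dependence on age, while the adjusted test statistic is much smaller.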

### Simulation Steps


In steps 3 and 4, we simulated 16 covariate types in each of 1,000 replicate data sets (**Supplementary Table 1**). The 16 variables were simulated by using all 8 possible combinations of the three age basis functions. Half of them contributed to the weights $w_{mik}(A)$, which consequently affected the gray matter density. For example, $Z_2$ and $Z_{10}$ were both derived from only the linear basis function $f_1$, but $Z_{10}$ does not influence the aging.

FIGURE 2 | Similar out-of-sample R<sup>2</sup> when applying SVR to predict age, as well as negative correlation between BrainAGE and chronological age, in T1000 data and simulated data. (a,b) Chronological age vs. predicted age in the testing dataset, with a mean absolute error (MAE) of 4.78 years and R<sup>2</sup> = 0.65 in (a) and MAE = 5.15, R<sup>2</sup> = 0.841 in (b). The overlaid black 45-degree line and blue regression line show regression toward the mean. (c,d) Chronological age vs. BrainAGE (r = −0.63). The negative correlation between BrainAGE and chronological age indicates that younger participants tend to have positive BrainAGE and older participants tend to have negative BrainAGE. (e,f) After removing the linear trend in (c,d), there is no relationship between age and BrainAGER (r = 0.001). BrainAGER has an expected value of 0, regardless of chronological age.

Additionally, the complete simulation procedure was carried out for two scenarios: one with relatively large and another with relatively small effects of the covariates on BrainAGE. This was achieved by modifying the constant γ in Equation (6) so that, in one case, the final weights $w_{mik}$ have a larger fold change relative to the original weights. In particular, the fold change is computed as

$$FC = \frac{\overline{w}_{mik}}{w_{mk}} = \frac{w_{mk} + \overline{D}_{mik}}{w_{mk}},\tag{7}$$

where $\overline{D}_{mik}$ is the average of $D_{mik}(A)$ across all ages.

FIGURE 3 | Relationship between age-covariate correlation and the difference in measured correlation. The difference between using BrainAGE and BrainAGER depends on the age-covariate relationship. (A) Covariate-BrainAGER correlations as a function of the covariate-BrainAGE correlation, with points colored according to the Age-Covariate correlation. The 45-degree line is shown, and covariates more strongly related to age are further from the line. (B) The squared difference in r between using BrainAGE and BrainAGER as a function of the variance explained by age.



The last column contains the direct correlation between each covariate and age (r<sub>age</sub>). For BrainAGE, where age is not considered, there are 17 covariates with FDR-adjusted p < 0.05, and for BrainAGER, which is residualized on age, there are six covariates with adjusted p < 0.05. Cells with p < 0.05 are bold.

# RESULTS

# Empirical

### Covariate Correlations With Age

Observed Pearson correlations between age and the 154 clinical variables ranged from −0.33 (PROMIS physical function) to 0.29 (waist circumference) (**Supplementary Figure 4**). Because any confounding effect of the correlation between age and covariates of interest is likely to be worse with larger correlations, we focused on simulated covariates that correlated with age with an r of up to 0.3.

### Age Prediction Accuracy and Bias

After fitting on the training dataset, SVR achieved a mean absolute error of 4.84 years and explained 64% of the variance in age in the testing dataset (**Figure 2a**). This is comparable to the cross-validated performance on the training set, where MAE was 5.1 years and R<sup>2</sup> was 0.59. The correlation between age and predicted age was 0.82. On the other hand, regression toward the mean led to a negative relationship between age and BrainAGE (r = −0.63, **Figure 2c**). After removing the linear trend as shown in **Figure 2c**, we observed no relationship between age and BrainAGER (r = 0.001, **Figure 2e**). More explicitly, BrainAGE had a positive expected value at low chronological age and a negative expected value at high chronological age, while BrainAGER had an expected value of 0 regardless of actual age.
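For readers who want to reproduce accuracy summaries of this kind, the following Python sketch computes the mean absolute error and a coefficient of determination from paired chronological and predicted ages. The four data points are made up, and defining R<sup>2</sup> as one minus the ratio of residual to total sum of squares is an assumption, since the text does not state which definition was used.

```python
import statistics


def mean_absolute_error(age, predicted):
    """Mean absolute difference between predicted and chronological age."""
    return statistics.fmean([abs(p - a) for a, p in zip(age, predicted)])


def r_squared(age, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    mean_age = statistics.fmean(age)
    ss_res = sum((a - p) ** 2 for a, p in zip(age, predicted))
    ss_tot = sum((a - mean_age) ** 2 for a in age)
    return 1.0 - ss_res / ss_tot


# Made-up example: young ages are overestimated, older ages underestimated.
age = [20.0, 30.0, 40.0, 50.0]
predicted = [24.0, 31.0, 38.0, 45.0]

brain_age = [p - a for p, a in zip(predicted, age)]  # BrainAGE per participant
mae = mean_absolute_error(age, predicted)
r2 = r_squared(age, predicted)

print(f"BrainAGE = {brain_age}, MAE = {mae:.2f} years, R^2 = {r2:.3f}")
```

Even in this tiny example, BrainAGE falls steadily from positive to negative as age increases, which is the pattern described in the paragraph above.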

### Relationships Among Age-Covariate, Covariate-BrainAGE, and Covariate-BrainAGER Correlations

To investigate the effect that the correlation between BrainAGE and chronological age can have on the conclusions of an imaging study, we computed the correlations between each of the covariates and age, BrainAGE, and BrainAGER. Larger age-covariate correlations led to larger differences in measured correlation between that covariate and BrainAGER or BrainAGE (**Figure 3A**, colored points far from the 45° line). When age did not correlate with a covariate, BrainAGE and BrainAGER tended to give similar results (gray points, near the 45° line). When age correlated positively with a covariate (e.g., BMI), BrainAGER gave more positive values, and when age correlated negatively with a covariate (e.g., PROMIS physical function), BrainAGER gave more negative values. Similarly, the greater the variance explained by age, the greater the squared difference in r between using BrainAGE and BrainAGER (**Figure 3B**).
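The direction of these shifts can be read off a standard identity. As a sketch, assume that BrainAGER is the residual of a linear regression of BrainAGE on age fitted in the same sample (an assumption of this illustration). The correlation of a covariate x with BrainAGER is then the semipartial correlation:

$$
r_{x,\mathrm{BrainAGER}} = \frac{r_{x,\mathrm{BrainAGE}} - r_{x,\mathrm{age}}\; r_{\mathrm{age},\mathrm{BrainAGE}}}{\sqrt{1 - r_{\mathrm{age},\mathrm{BrainAGE}}^{2}}}
$$

Because the age-BrainAGE correlation is negative (r = −0.63 in the testing data), the second term in the numerator adds a positive amount when x correlates positively with age and a negative amount when x correlates negatively with age. When the age-covariate correlation is 0, the two correlations differ only by the rescaling in the denominator.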

**Table 1** shows the top 22 variables that are significantly correlated with either BrainAGE or BrainAGER after FDR correction for 154 tests. Notably, 17 variables were related to BrainAGE, and the strongest relationships were among variables strongly correlated with age, including body composition (percent body fat r = −0.2, percent body water r = 0.2, percent dry lean mass r = 0.2) and sensation seeking (r = 0.18). BrainAGER was only significantly correlated with six variables including waist to hip ratio (r = 0.15), color naming scaled (r = −0.15), and lean body mass (r = 0.17).
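For readers unfamiliar with FDR adjustment, the sketch below implements the Benjamini-Hochberg step-up procedure. The text does not state which FDR procedure was applied, so the choice of Benjamini-Hochberg is an assumption, and the p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values, not taken from the study.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

With 154 tests, as in the empirical analysis, the same function would be applied to the full vector of 154 raw p-values for each of BrainAGE and BrainAGER.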

### Simulation

### Negative Correlation Between BrainAGE and Chronological Age in Simulated MRI Data

We set the parameters of our simulation algorithm to achieve realistic characteristics of experimental data, such as the distribution of correlations between volumes and chronological age and the negative correlation between computed BrainAGE and chronological age. This negative correlation was also present in previous models, such as those using Gaussian Process Regression (Cole et al., 2018) and Relevance Vector Regression (Franke et al., 2010). Simulated results closely mirrored empirical results. The simulated testing data had an MAE of 4.58 years and an R<sup>2</sup> of 0.71 (**Figure 2b**). In our simulated data, we observed an overestimation of younger participants' ages and an underestimation of older participants' ages (**Figure 2d**). After removing the effect of age on BrainAGE, simulated BrainAGER had an expected value of 0 regardless of actual age (**Figure 2f**).

### Reduction of False Discoveries in Regressions That Include Age as an Explanatory Variable

In the linear models regressing BrainAGE on the 16 covariates of interest with simulated large effect sizes (FC = 1.255), we observed the following: when age was not included as an explanatory variable, many age-related covariates showed a statistically significant association with BrainAGE (**Figures 4A,C**), even when they did not contribute to the weights that made up the neuroimaging features (**Figure 4**, **orange** boxplots above the horizontal). These false positives (FP) were simply the result of the relationship between these covariates and chronological age, which is part of BrainAGE's defining formula. Moreover, several covariates that were simulated to contribute to the brain structure volumes had p-values that were on average above 0.05 (**Figure 4**, **blue** boxplots below the horizontal).

When age was included in the regression as an extra explanatory variable, the significance increased (p-values decreased) for all variables that were generated to have an association with the imaging features, even variables that were already detected in the previous regression without age (**Figures 4B,D**). Further, the decrease in significance (increase in p-values) for unrelated covariates indicated a significant decrease in the number of false positives. Variation in the p-values across covariates came from their different (linear and nonlinear) age dependencies and effects on volumetric variation. In other words, the real "significance" of a covariate depended on which age basis functions it was generated from and how it affected the brain features (w1<sup>k</sup>, w2<sup>k</sup>, or w3<sup>k</sup>). Simulations with a smaller effect size (FC = 1.170, **Figures 4C,D**) showed a similar effect, though attenuated, for covariates that were contributors to wm<sup>k</sup>. The positive rate (true and false) across 100 replications is quantified in **Supplementary Table 2**. Values in this table represent the portion of each boxplot above the horizontal line, which is the TP rate for covariates that had an influence on imaging features and the FP rate for covariates that did not.
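The false-positive mechanism can be reproduced in a much simpler setting than the simulation framework described above. The sketch below is an illustration under stated assumptions, not the authors' generative model: a single covariate depends on age only, BrainAGE comes from a hypothetical predictor that regresses toward the mean age, and the t-statistic for the covariate is computed with and without age in the model. The adjusted t-statistic is obtained from the partial correlation, which is equivalent to the t-statistic of the covariate's coefficient in a two-predictor linear regression.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_stat_simple(y, x):
    """t-statistic for the slope when regressing y on x alone."""
    n = len(x)
    r = pearson(x, y)
    return r * math.sqrt((n - 2) / (1 - r * r))

def t_stat_adjusted(y, x, z):
    """t-statistic for x when regressing y on x and z together."""
    n = len(x)
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    r_partial = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return r_partial * math.sqrt((n - 3) / (1 - r_partial ** 2))

rng = random.Random(1)
n = 2000
age = [rng.uniform(18, 65) for _ in range(n)]
mean_age = sum(age) / n

# Covariate related to age only (correlation of roughly 0.3); by
# construction it has no effect on the simulated BrainAGE.
covariate = [0.023 * a + rng.gauss(0, 1) for a in age]

# Hypothetical BrainAGE with a built-in negative dependence on age.
brain_age_gap = [-0.35 * (a - mean_age) + rng.gauss(0, 6) for a in age]

t_without_age = t_stat_simple(brain_age_gap, covariate)
t_with_age = t_stat_adjusted(brain_age_gap, covariate, age)
```

In this toy setting the covariate looks strongly associated with BrainAGE when age is omitted, and the association shrinks toward chance level once age is included.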

# DISCUSSION

This study aims to highlight the relationship between chronological age and BrainAGE and its transitive effect on the relationship between BrainAGE and covariates of interest that are also related to age. We propose a solution to this problem: either use BrainAGER, or in the simple case of posthoc linear regression, use chronological age as a covariate in subsequent analyses. We developed a simulation framework to generate data with complex, but known, relationships between the original imaging features, age, and a set of covariates that may also be related to age. Then, we were able to quantify the effect that accounting for age has on the ability to detect actual and spurious correlations with covariates in subsequent analyses.

Our main findings can be separated into three parts: analytical, empirical, and simulated data results. The analytical results provide a theoretical basis for the age-BrainAGE correlation, and the analyses using real and simulated data demonstrate this effect in practice. For the empirical data, there were three main findings: (1) many variables that may be of interest are correlated with age with Pearson coefficients of up to r = 0.3, (2) BrainAGE is strongly negatively correlated with chronological age (r = −0.63 in our dataset), (3) BrainAGER provides a measure of deviation between predicted and actual age that is not dependent on age, and has substantially different correlations with covariates that are correlated with age when compared to BrainAGE.

differing age dependencies (linear and nonlinear) and effects on volumetric variation. Blue boxes are variables that have a direct (TRUE) effect on BrainAGE, orange boxes are variables that do not have a direct effect on BrainAGE (FALSE), and this effect is relatively large in the top (A,B) and small in the bottom (C,D) plots. Boxplots on the left (A,C) do not use age as an explanatory variable and models on the right (B,D) include age as an explanatory variable. "Significance" was measured by –log(p). Horizontal line is at –log(0.05).

Since it is unknown which covariates are actually related to premature aging, we then developed a simulation framework to generate synthetic data. Simulated data showed: (1) similar characteristics to actual data when used to train and test a model on separate datasets, and (2) increased detectability of true positives and decreased occurrence of false positives when accounting for the age-covariate relationship, with this being modulated by the size of the simulated effect on physiology.

Based on our observations in both real and simulated data, we recommend that the relationship between chronological age and BrainAGE should be accounted for. The two methods proposed in this study are either: (1) regress BrainAGE on age and keep the residuals, producing BrainAGER, which is centered on 0 regardless of a participant's actual age, or (2) include age as a regressor when doing follow-up analyses. In fact, these two methods will produce the same coefficients in the case of linear regression, with slightly larger t-statistics in the second case. The advantage of using BrainAGER is simplicity and generalizability; it could be used as the dependent variable in any arbitrary model, rather than being confined to simple linear regression. While the focus of this study is not to show specific correlates of premature aging, it is worth noting that 17 variables were significantly correlated with BrainAGE whereas only 6 were related to BrainAGER, with 1 variable (PROMIS Alcohol Negative Consequences) overlapping between the two sets (**Table 1**). Thus, accounting for the age-BrainAGE relationship results in a vastly different set of positive findings and would lead to a remarkably different interpretation of these data. More explicitly, not correcting for the age-BrainAGE correlation would lead to an extensive set of spurious results in this dataset.

### Limitations

There are a few cases where the age-BrainAGE correlation is not relevant. When comparing two groups with matched age, any differences in BrainAGE are not likely to be caused by the relationship with age. When the individuals being examined are in a restricted age range, there is not likely to be much contribution from the age-BrainAGE correlation. Also, when the variable of interest is not related to age, removing the effect of age makes almost no difference (**Figure 3B**). When none of these conditions holds, however, our findings suggest that age should be included as an explanatory variable in a final model that aims to detect associations of brain anomalies with covariates of interest.

The magnitude of the age-BrainAGE correlation is directly related to the accuracy of the prediction model. The fact that the residuals are correlated with observed values is a characteristic of regression in general, regardless of the specific data domain, and has a theoretical basis described in section Theoretical Basis for the Age-BrainAGE Correlation. Several factors may decrease the model performance on our testing set, and thereby increase the age-BrainAGE correlation. Specifically, the distribution of age ranges in our samples is non-uniform, which may lead to more weight being given to the middle of the distribution. There are substantial differences between the testing and training sets we used including age, sex, and diagnosis. It may therefore be possible to improve model performance on the testing set by subsampling the training set to have a more uniform distribution of ages and to match the testing set on several factors. However, model performance is already comparable across testing and training sets (R <sup>2</sup> of 0.59 and MAE of 5.1 years, compared to 0.64 and 4.84) and is comparable with what has been previously reported.
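As a simplified illustration of this link (a sketch under idealized assumptions, not the derivation given in the section cited above), suppose the predicted age $\hat{y}$ came from an ordinary least-squares fit evaluated on the same sample, with coefficient of determination $R^2$. In that case $\operatorname{Cov}(y, \hat{y}) = \operatorname{Var}(\hat{y}) = R^2 \operatorname{Var}(y)$, and with BrainAGE defined as $\hat{y} - y$:

$$
\operatorname{Cov}(y, \hat{y} - y) = (R^2 - 1)\operatorname{Var}(y), \qquad
\operatorname{Var}(\hat{y} - y) = (1 - R^2)\operatorname{Var}(y)
$$

$$
\operatorname{Corr}(y, \mathrm{BrainAGE}) = \frac{(R^2 - 1)\operatorname{Var}(y)}{\sqrt{\operatorname{Var}(y)}\,\sqrt{(1 - R^2)\operatorname{Var}(y)}} = -\sqrt{1 - R^2}
$$

Under these assumptions, $R^2 = 0.64$ would give a correlation of −0.60, which is close to the −0.63 observed on the testing set. An SVR model evaluated out of sample need not satisfy this identity exactly, so the agreement should be read as indicative only.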

Although the simulation was carefully designed and executed, because of the model's complexity, we have not fully explored all scenarios with different simulation parameters. However, we have identified effect size as the most important parameter and showed how it influenced the results. When varying other parameters, we still observed a reduction in the number of false positives when age is included as an explanatory variable in the final regression (results not shown). Moreover, while determining the parameters, we aimed to obtain realistic patterns as we observed in real data, such as similar distributions of the correlation values.

By constructing and studying an appropriate generative model containing covariates that have linear and non-linear relationship with age, we demonstrated that the correlation between covariates and age should be considered when making inferences about the relationship between BrainAGE and these covariates.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Belmont report and the U.S. Department of Health and Human Services PART 46: PROTECTION OF HUMAN SUBJECTS. The protocol was approved by the Western IRB. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

# AUTHOR CONTRIBUTIONS

TL, RK, BM, H-WY, WT, and MP contributed to the design of the study. TL and RK wrote the manuscript. TL performed simulation analyses and RK performed empirical analyses. The T-1000 investigators contributed to the conception of the study, selection of the assessments and discussion regarding the interpretation of the results. All authors revised the manuscript critically for important intellectual content and approved the final manuscript.

# FUNDING

This work has been supported in part by The William K. Warren Foundation, the National Institute of Mental Health Award Numbers K23MH112949 (SSK), K23MH108707 (RLA), K01MH096175-01 (WKS), and the National Institute of General Medical Sciences Center Grant Award Number 1P20GM121312. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.


# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnagi.2018.00317/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Le, Kuplicki, McKinney, Yeh, Thompson, Paulus and Tulsa 1000 Investigators. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Pre-aging of the Olfactory Bulb in Major Depression With High Comorbidity of Mental Disorders

Fabian Rottstaedt <sup>1</sup> \*, Kerstin Weidner <sup>1</sup> , Thomas Hummel <sup>2</sup> and Ilona Croy <sup>1</sup>

<sup>1</sup>Department of Psychosomatic Medicine and Psychotherapy, Technische Universität Dresden, Dresden, Germany, <sup>2</sup>Smell & Taste Clinic, Department of Otorhinolaryngology, Technische Universität Dresden, Dresden, Germany

Recent studies suggest that accelerated aging of the brain is a neuroanatomical signature of the state of mental diseases. In major depression, this pre-aging effect is negatively associated with the duration since the first onset of the disease. The olfactory bulb (OB) shrinks with age in healthy subjects, and patients with mental diseases show reduced OB volumes, especially those with major depression. It is unclear whether this OB reduction in mental diseases resembles a pre-aging process and whether it is associated with the duration since the onset of the mental disease. To address this, we investigated OB volume in 73 patients (mean age 40.4 years, SD = 12.1 years, 57 women) with major depression and mixed comorbid mental diseases (diagnoses ranged from 1 to 6, median: 3) and 51 healthy controls (mean age 39.2 years, SD = 13.0 years, 26 women) matched for age and sex. Patients' first onset of disease ranged from 15 to 53 years (mean 24.2 years). All participants underwent structural MR imaging with a spin-echo T2-weighted sequence covering the anterior and middle segments of the skull base. All results were corrected for total intracranial volume (TIV) and sex. Individual OB volume was calculated by planimetric manual contouring and the pronounced diameter change in transition from bulb to tract was used as the distal demarcation of the OB. Inter-rater correlation between two independent persons analyzing the data was high (IRC = 0.81, p < 0.005). An age-dependent decline of the OB volume was confirmed in healthy controls (r = −0.34, p < 0.05). However, this pattern was altered in patients, where the OB volume was not related to age but to the duration since the onset of the mental disease (r = −0.25, p < 0.05). This association remained stable when controlling for age. Additionally, analyses of age sub-groups revealed that the association between duration since the onset of the mental disease and OB volume was mainly driven by the group aged 50 years and above (r = −0.68; p < 0.01). We conclude that there are time windows where the OB volume is susceptible to the effects of a mental disease, e.g., depression. These effects result in cumulative pre-aging in the OB in older patients with mental diseases.

Keywords: depression, olfactory bulb, MRI, biological markers, aging, premature

### Edited by:

Christian Gaser, Friedrich-Schiller-Universität Jena, Germany

### Reviewed by:

Martin Witt, Universitätsmedizin Rostock, Germany Shengying Qin, Shanghai Jiao Tong University, China

### \*Correspondence:

Fabian Rottstaedt fabian.rottstaedt@uniklinikumdresden.de

Received: 12 March 2018 Accepted: 17 October 2018 Published: 08 November 2018

### Citation:

Rottstaedt F, Weidner K, Hummel T and Croy I (2018) Pre-aging of the Olfactory Bulb in Major Depression With High Comorbidity of Mental Disorders. Front. Aging Neurosci. 10:354. doi: 10.3389/fnagi.2018.00354

# INTRODUCTION

Throughout human life, the brain's neural architecture undergoes a steady transformation. Whereas the early years from birth to adolescence are determined by the interplay of neural growth and differentiation (Shaw et al., 2008), adulthood is mostly characterized by depletion (Fjell et al., 2013). Especially after midlife, brain weight (Skullerud, 1985) and whole brain volume decrease (Scahill et al., 2003; Hedman et al., 2012), and the volume of most brain structures likewise shrinks with increasing age (Allen et al., 2005; Raz and Rodrigue, 2006). Accordingly, the olfactory bulb (OB) shows its peak volume around the age of 40 years, from where it linearly decreases with age (Buschhüter et al., 2008). In line with this, olfactory functioning shows the same trajectory, peaking around the age of 40 years and decreasing thereafter (Buschhüter et al., 2008). Most investigations of the aging brain are based on the frontal cortex and hippocampus (Fjell et al., 2014), showing that over the life span both regions are particularly vulnerable to age and undergo comparable decline in GM volume (Fjell et al., 2014).

Depressed patients show alterations that are similar to the described processes of aging. These particularly affect brain networks involving limbic and prefrontal regions. Similar to aging, the typical GM reduction patterns in depression especially concern the hippocampus and prefrontal cortex (Bora et al., 2012; Sacher et al., 2012; Singh et al., 2013). Interestingly, depression is also related to premature reduction of OB volume (Negoias et al., 2010; Croy et al., 2013), which was suggested as a biological marker for the disease (Kohli et al., 2016; Croy and Hummel, 2017; Rottstaedt et al., 2018). We hence aimed to investigate the association of OB volume decline with age in healthy controls and depressed patients. We assumed that the age-related decline of OB volume seen in healthy participants is shifted to a younger age in patients with depression.

# MATERIALS AND METHODS

### Participants

This study was embedded in a larger design (for complete information on materials and methods, see Rottstaedt et al., 2018). Of the patient cohort investigated there, only patients with diagnosed major depression (n = 73, all inpatients of the Department of Psychosomatic Medicine and Psychotherapy of the Dresden University Hospital) were included in this investigation. Structured anamnestic interviews (German version of the SCID-I; Wittchen, 1997) performed by trained psychotherapists had previously been completed. The patient group included 57 females and 16 males, aged between 19 and 62 (M ± SD = 40.4 ± 12.1) years (see **Table 1** for further demographic and illness-related parameters). Diagnoses included unipolar depression or recurrent depressive disorder (N = 73), anxiety disorders (N = 47), somatoform disorders (N = 20), posttraumatic stress disorder (N = 37), substance abuse (N = 9) and eating disorders (N = 18; see supplementary information for individual data). The median number of diagnosed mental disorders per patient was 3 (range from 1 to 6 diagnoses; see **Supplementary Table S1** for an overview of all patients and their diagnoses). The majority of patients received medical treatment (see **Supplementary Table S1**). Fifty-one age- and sex-matched healthy controls (26 females and 25 males; 20–69 years; M ± SD = 39.16 ± 13.0 years) were recruited; they were required not to meet criteria for a mental disorder (confirmed by the Patient Health Questionnaire; Spitzer et al., 1999). Exclusion criteria were concomitant nasal pathology (e.g., severe septal deviation, sinonasal disease) or potential brain pathology, which was ascertained by detailed medical interview and whole brain magnetic resonance imaging (MRI) scans.

The groups did not differ in terms of age (t(124) = 0.68, p = 0.50). Sex distribution was not equal between groups (χ<sup>2</sup>(1, 124) = 10.8, p < 0.01). Hence, all statistical analyses were controlled for sex.

### Ethics Statement

The study followed the Declaration of Helsinki on Biomedical Research Involving Human Subjects and was approved by the local Ethics Committee (EK 51022015). All participants provided written informed consent.

### Magnetic Resonance Imaging Procedures

MRI scans were performed using an 8-channel phased-array head coil (3T Siemens Magnetom Verio scanner; Siemens Healthcare, Erlangen, Germany).

In order to obtain OB volume measures, a fast spin-echo T2-weighted sequence covering the anterior and middle segments of the skull base was acquired (TR = 8,090 ms; TE = 97 ms; voxel size 2 × 2 × 2 mm<sup>3</sup>; flip angle 123°; in total 36 contiguous slices of 2 mm thickness, coronal orientation with no gap).

### Statistical Analysis

Data was analyzed using SPSS 21 for Windows (SPSS Inc., Chicago, IL, USA). AMIRA 3D visualization and modeling system (Visage Imaging, Carlsbad, CA, USA) was used to calculate OB volumes.

### Calculation of OB Volumes

### TABLE 1 | Descriptive characteristics of the patient and control group.

Age is shown in years; volumes of OB and total intracranial volume are shown in mm<sup>3</sup>. n, number; %, percentage; OB, olfactory bulb; OBleft, volume of left OB; OBright, volume of right OB; OBbest, greater volume of left or right OB; BDI, score in Beck's Depression Inventory; ∗∗p < 0.01, ∗∗∗p < 0.001.

Based on manual segmentation of the acquired T2-weighted coronal data, all OB volumes were calculated by the same experimenter (FR), who was blinded to the diagnosis of the participant. The pronounced diameter change in transition from bulb to tract was used as the proximal demarcation of the OB. On each coronal slice, the shape of the right and left OB was outlined manually and OB volumes were calculated by planimetric manual contouring (surface in mm<sup>2</sup>). All surfaces were then added and multiplied by 2 (2-mm slice thickness) to obtain an estimated overall volume (see **Figure 1**). The volume of the left and right OB was calculated for each participant. The larger of the two volumes was then used for all further analyses. Hence the term OB volume refers to "best OB volume." This approach of calculating and analyzing OB volumes has previously been shown to be highly reliable and accurate (Yousem et al., 1997; Hummel et al., 2015). Inter-rater correlation between two independent persons analyzing the OB volume was high (IRC = 0.81, p < 0.005).
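The volume estimate described under Calculation of OB Volumes (sum of the contoured slice areas multiplied by the 2-mm slice thickness, then the larger of the two sides) can be sketched as follows. The function names and the area values are hypothetical and serve only to illustrate the arithmetic; they are not study data.

```python
SLICE_THICKNESS_MM = 2.0  # contiguous coronal slices with no gap

def ob_volume_mm3(slice_areas_mm2, thickness_mm=SLICE_THICKNESS_MM):
    """Estimate OB volume as (sum of contoured slice areas) x slice thickness."""
    return sum(slice_areas_mm2) * thickness_mm

def best_ob_volume_mm3(left_areas_mm2, right_areas_mm2):
    """Return the larger of the left and right OB volumes ("best OB volume")."""
    return max(ob_volume_mm3(left_areas_mm2), ob_volume_mm3(right_areas_mm2))

# Hypothetical contour areas in mm^2, one value per coronal slice.
left = [3.0, 6.5, 8.0, 7.5, 4.0]   # sums to 29.0 mm^2 -> 58.0 mm^3
right = [2.5, 6.0, 9.0, 8.5, 5.0]  # sums to 31.0 mm^2 -> 62.0 mm^3

best = best_ob_volume_mm3(left, right)  # 62.0
```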

### Correlation Analysis

Pearson correlation coefficients were computed in healthy controls and depressed patients separately to assess associations between age and OB volume. In depressed patients, Pearson correlation coefficients for the association between OB volume and the duration since the first onset of mental disease were computed and additionally controlled for age. The first onset of mental disease was taken from the documentation of the structured anamnestic interviews (German version of the SCID-I; Wittchen, 1997), in which a detailed inquiry into episodes of mental disease is essential.
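Controlling a correlation for age amounts to a partial correlation. The sketch below shows one common way to compute it, by correlating the residuals that remain after regressing each variable on age. The analyses in this study were run in SPSS; this code, its function names, and the synthetic data are assumptions for illustration only.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residualize(y, x):
    """Residuals of a simple linear regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def partial_corr(x, y, control):
    """Correlation of x and y after removing the linear effect of control."""
    return pearson(residualize(x, control), residualize(y, control))

# Synthetic, hypothetical data: 73 "patients" with age, illness duration,
# and an OB volume that declines with duration.
rng = random.Random(7)
age = [rng.uniform(19, 62) for _ in range(73)]
duration = [a * rng.uniform(0.1, 0.9) for a in age]
ob_volume = [60.0 - 0.3 * d + rng.gauss(0, 8) for d in duration]

r_raw = pearson(duration, ob_volume)
r_partial = partial_corr(duration, ob_volume, age)
```

The residual-based value agrees with the closed-form first-order partial correlation computed from the three pairwise correlations.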

Furthermore, we divided the group of depressed patients into three age groups: young (18–34 years), middle-aged (35–49 years) and old (50 years and above). For each group, we separately computed Pearson correlation coefficients for the association between OB volume and the duration since the first onset of the mental disease.

All analyses were controlled for six medical conditions (medication in general, Antidepressants, Neuroleptics, Antiepileptics, Soporifics/Tranquilizer, other drugs) by introducing them sequentially as confounding variables in a partial correlation design.

### Between Group Analysis

OB volumes were compared between patients and controls using ANCOVA with sex and total intracranial volume (TIV) as covariates. Sex was chosen as a covariate because women exhibit a smaller OB volume than men (Buschhüter et al., 2008) and sex distribution was not equal between groups; TIV was chosen as a covariate because it was positively associated with OB volume (r = 0.24; p < 0.01).
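An ANCOVA for a two-level group factor can be expressed as a comparison of two nested linear models: one with the covariates only and one with the covariates plus group. The sketch below computes the resulting F statistic in plain Python. It is an illustration under stated assumptions, not the SPSS procedure used in the study; the data are synthetic, the effect sizes are invented, and TIV is given in cm<sup>3</sup> here purely for numerical convenience.

```python
import random

def ols_rss(columns, y):
    """Residual sum of squares of y regressed on `columns` plus an intercept."""
    n = len(y)
    # Center each predictor; with an intercept this leaves the RSS unchanged
    # and keeps the normal equations well conditioned.
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    X = [[1.0] + [col[i] for col in centered] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[r] * row[c] for row in X) for c in range(p)]
         + [sum(row[r] * yi for row, yi in zip(X, y))]
         for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                factor = A[r][k] / A[k][k]
                A[r] = [a - factor * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def ancova_f(group, covariates, y):
    """F statistic with (1, n - p) df for a two-level group effect."""
    n = len(y)
    rss_reduced = ols_rss(covariates, y)
    rss_full = ols_rss(covariates + [group], y)
    df_resid = n - (len(covariates) + 2)  # intercept + covariates + group
    return (rss_reduced - rss_full) / (rss_full / df_resid)

# Synthetic, hypothetical data: 73 patients (group = 1) and 51 controls.
rng = random.Random(3)
group = [1.0] * 73 + [0.0] * 51
sex = [1.0 if rng.random() < 0.4 else 0.0 for _ in group]  # 1 = male
tiv = [rng.gauss(1400.0, 120.0) for _ in group]            # cm^3
ob = [60.0 + 5.0 * s + 0.02 * (t - 1400.0) - 10.0 * g + rng.gauss(0, 8)
      for g, s, t in zip(group, sex, tiv)]

f_group = ancova_f(group, [sex, tiv], ob)
```

A p-value would follow from the F distribution with the stated degrees of freedom; that step is omitted here to keep the sketch free of external dependencies.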

All analyses were controlled for six medical conditions (medication in general, Antidepressants, Neuroleptics, Antiepileptics, Soporifics/Tranquilizer, other drugs) by introducing them sequentially as confounding variables in a partial correlation design.

# RESULTS

In healthy controls, OB volume was negatively associated with age (r = −0.34, p < 0.05; see **Figure 2A**), which was not the case in depressed patients (r = −0.10; p = 0.39). However, when controlling for age, OB volume was related to the duration since the first onset of mental disease (r = −0.25, p < 0.05; **Figure 2B**). In the group of patients, the oldest individuals (aged 50 years and above) showed the strongest association between the duration since the first onset of the mental disease and OB volume (r = −0.68; p < 0.01) when controlling for age, whereas for the youngest (aged between 18 and 34 years; r = 0.26; p = 0.18) and middle-aged (aged between 35 and 49 years; r = −0.25; p = 0.33) patients the associations did not reach significance. In the oldest age group, a manifestation of OB volume reduction was evident after 5 years of mental disease (see **Figure 2**). Inclusion of medication did not change the results.

Depressed patients exhibited smaller OB volumes than healthy controls (F(2,124) = 6.0; p < 0.05; for further comparisons of the two groups, see **Table 1**). Inclusion of medication did not change the results.

# DISCUSSION

An age-dependent decline of the OB volume has been shown in healthy people (Yousem et al., 1998; Buschhüter et al., 2008). However, this pattern was altered in patients, where OB volume was not related to age but to the duration since the first onset of the mental disease.

In detail, the OB volume showed an accelerated decrease with the duration of the mental disease in patients of the oldest age group in the sample. Notably, the data suggest that this acceleration was evident after about 5 years of duration of the mental disease. No such association was found for the middle-aged sub-group. The youngest patient sub-group showed highly reduced OB volumes compared to the respective healthy control group. However, this group still showed a trend towards an increase of the OB volume over time, which can also be seen in healthy controls at this age in other studies (Yousem et al., 1998; Buschhüter et al., 2008). We conclude that there are time windows where the OB volume is susceptible to the effects of depression, namely young and old age. This is in line with the observation of developmental time windows of the human brain (Lupien et al., 2009), which implies that certain brain areas show increased vulnerability during specific developmental stages.

Two mechanisms are possible here: (a) reduced OB volume could be a pre-existing factor of the mental disease and hence indicate increased vulnerability for a mental disease (as already formulated by Croy and Hummel, 2017); and (b) the OB volume reduction could be the consequence of the mental disease. In line with the vulnerability hypothesis, the data show that patients of the youngest age group exhibit reduced OB volumes compared to healthy controls. On the other hand, the very same result can also be interpreted as a consequence of reduced OB volume growth, which happens during this time window in healthy individuals and seems diminished in patients with depression. Furthermore, the results of the group of older patients are in favor of the consequence hypothesis: in healthy aging, the OB volume starts to decrease at the age of about 40 years (Buschhüter et al., 2008). The manifestation of a mental disease increases those normal aging effects. Also known as the neurotoxicity hypothesis, this theory suggests that a long-term increase of individual stress levels leads to prolonged exposure to glucocorticoids, which reduces the ability of neurons to resist insults, increasing the rate at which they are damaged by other toxic challenges or ordinary attrition (Lupien et al., 2009). In patients aged 35–49 years, however, the absence of a relation between depression and OB volume could be interpreted as a higher resilience of the OB to damage during middle age.

Gray matter volume reductions throughout the brain are well-known in depression and affect particularly brain networks involving limbic and prefrontal regions (Bora et al., 2012; Sacher et al., 2012; Singh et al., 2013). Recently, the OB volume reduction was suggested as an additional biomarker for the disease (Croy and Hummel, 2017; Rottstaedt et al., 2018). As depression is connected with increased stress levels (Liu and Alloy, 2010), risk for hypertension (Grippo and Johnson, 2009), diabetes (Talbot and Nouwen, 2000; Ali et al., 2006) and reduced physical activity (Camacho et al., 1991; Teychenne et al., 2008)—conditions known to drive brain aging processes (Raz and Rodrigue, 2006; Lupien et al., 2009; Fjell et al., 2013)—the neural alterations usually reported in depression could also be seen as manifestations of accelerated brain aging. Supporting this assumption, it could be shown that especially early-onset depression is associated with accelerated brain aging (Koutsouleris et al., 2013) and that depression duration rather than age predicts hippocampal volume loss (Sheline et al., 1999). Our findings point in the same direction and we assume that the manifestation of depression provokes cumulative pre-aging in the OB, especially in older subjects.

To sufficiently explore which of the provided theories best explains the association between OB volume reduction and depression, long-term studies are necessary. It is possible that both mechanisms (preterm aging as a consequence of depression, and increased vulnerability) could work together in a vicious circle and hence are not mutually exclusive (Lupien et al., 2009; Croy and Hummel, 2017): exposure to stress during developmental periods of certain brain regions might alter their development and lead to increased vulnerability to mental disorders (de Kloet et al., 2005; Cohen et al., 2006), e.g., depression. On the other hand, long-term exposure to stress during periods of mental disease could also affect brain organization (Frodl et al., 2008). For the OB, that would mean cumulative volume reduction that accelerates the decline seen in older individuals.

# LIMITATIONS

There are two major limitations when interpreting the results: (1) the presented data are cross-sectional, and longitudinal designs are necessary to sufficiently explore aging trajectories; and (2) the group sizes of the investigated age sub-groups are rather small. Hence, the interpretations of the results concerning those sub-groups should be treated with care. Investigations in larger cohorts are necessary to finally confirm or disprove the divided interpretations.

# AUTHOR CONTRIBUTIONS

FR: data acquisition, data analysis and interpretation, drafting the article. KW and TH: substantial contribution to design of the study, critical revision of article. IC: design of the study, data interpretation, revision of article for important intellectual content.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnagi.2018.00354/full#supplementary-material



**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Rottstaedt, Weidner, Hummel and Croy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
