Computer-Aided Diagnosis and Localization of Lateralized Temporal Lobe Epilepsy Using Interictal FDG-PET

Interictal FDG-PET (iPET) is a core tool for localizing the epileptogenic focus, potentially before structural MRI, that does not require rare and transient epileptiform discharges or seizures on EEG. The visual interpretation of iPET is challenging and requires years of epilepsy-specific expertise. We have developed an automated computer-aided diagnostic (CAD) tool that has the potential to work both independent of and synergistically with expert analysis. Our tool operates on distributed metabolic changes across the whole brain measured by iPET to both diagnose and lateralize temporal lobe epilepsy (TLE). When diagnosing left TLE (LTLE) or right TLE (RTLE) vs. non-epileptic seizures (NES), our accuracy in reproducing the results of the gold standard long term video-EEG monitoring was 82% [95% confidence interval (CI) 69–90%] or 88% (95% CI 76–94%), respectively. The classifier that both diagnosed and lateralized the disease had overall accuracy of 76% (95% CI 66–84%), where 89% (95% CI 77–96%) of patients correctly identified with epilepsy were correctly lateralized. When identifying LTLE, our CAD tool utilized metabolic changes across the entire brain. By contrast, only temporal regions and the right frontal lobe cortex were needed to identify RTLE accurately, a finding consistent with clinical observations and indicative of a potential pathophysiological difference between RTLE and LTLE. The goal of CADs is to complement, not replace, expert analysis. In our dataset, the accuracy of manual analysis (MA) of iPET (∼80%) was similar to CAD. The squared correlation between our CAD tool and MA, however, was only 30%, indicating that our CAD tool does not recreate MA. The addition of clinical information to our CAD did not substantively change performance. These results suggest that automated analysis might provide clinically valuable information to focus treatment more effectively.


INTRODUCTION
It is difficult to differentiate between patients with epilepsy (PWE) and those with non-epileptic seizures (NES). The clinical assessment relies on the report of untrained witnesses or the patients themselves. A non-epileptic seizure is defined as the presence of external seizure symptoms and/or signs with no electrographic features characteristic of epilepsy. Long term video-EEG monitoring has shown consistently that roughly one third of patients diagnosed with "medication refractory epilepsy" in fact suffer from NES (Kerr et al., 2012a). Because they do not suffer from epilepsy, these patients with NES (PWN) are not treated effectively with anti-epileptic drugs (AEDs). For the majority of PWN, the NES are a manifestation of dissociative or conversion disorder in which their psychological challenges manifest themselves physically (Marchetti et al., 2008, 2009). A minority of PWN suffer from organic, non-epileptic maladies that can be confused with seizure disorder including, but not limited to, dementia and cardiovascular disease (Sahaya et al., 2011). The gold standard for the differential diagnosis and pre-surgical assessment of epilepsy includes 72 or more hours of video-EEG monitoring (Cragar et al., 2002; LaFrance and Devinsky, 2004). However, 10% of patients admitted for this extensive assessment leave with inconclusive results (Kerr et al., 2012a). Considering that one sixth of PWE are diagnosed with medication refractory epilepsy (Privitera, 2011), improved methods to identify PWN, who do not benefit from AEDs, could reduce the morbidity and both the financial and social cost of treating epilepsy.
Improved diagnostic tools could also help PWE. The difficulty in ruling out non-epileptic etiologies speaks to the challenge of adequately localizing and characterizing each patient's epileptic etiology. The major seizure type discriminations are focal vs. generalized; partial vs. complex; and lesional vs. nonlesional. Each of these key discriminations leads patients down a different treatment path. When medication or other novel treatments like the vagus nerve stimulator fail, as they frequently do, the patient is left to consider resective neurosurgery. Recent reports have shown that surgery is most effective earlier in the course of disease. Improved diagnostic tools could more quickly and effectively diagnose patients with epileptic seizures and therefore speed the progression toward considering the surgical option.
Ultimately, our goal is to establish a general, automated computer-aided diagnostic (CAD) tool that effectively combines clinical information, manual interpretation of EEG and imaging technologies as well as automated analysis of interictal FDG-PET (iPET), EEG, structural MRI (sMRI), and diffusion MRI for all subtypes of epilepsy and NES. To accomplish this, we first must develop effective CAD tools that harness the information from each modality for a limited set of epileptic localizations. We have begun already to address automated analysis of interictal EEG for a wide variety of epilepsy subtypes (Kerr et al., 2012a). Others have described effective CAD tools that diagnose and lateralize temporal lobe epilepsy (TLE) using structural and diffusion MRI (Farid et al., 2012;Focke et al., 2012;Keihaninejad et al., 2012).
The clinical, metabolic, and structural differences between left and right TLE can be subtle. Some theories suggest that TLE is inherently a bilateral disease. Potentially, due to the strong functional link between the hippocampi, the only clinical difference is that the aura of patients with left TLE (LTLE) more frequently includes language dysfunction. Over time, patients with LTLE more commonly develop verbal memory deficits, compared to non-verbal memory deficits in right TLE (RTLE) (Delaney et al., 1980; Kim et al., 2003). This functional connection between the hippocampi may also lead some patients to be falsely lateralized using scalp EEG because a small seizure onset zone (SOZ) in one hippocampus can induce larger scale ictal activity in the contralateral hippocampus with very little time delay. This can lead neurologists to falsely conclude that the SOZ is either bilateral or in the contralateral hippocampus. Structural and metabolic imaging can reduce these errors by demonstrating that one temporal lobe is asymmetrically affected, as shown by the previous CAD tools that lateralize TLE (Farid et al., 2012; Focke et al., 2012; Keihaninejad et al., 2012). Studies of the functional connectivity of these epileptic networks, however, conclude that there are very few, if any, differences between the two lateralizations (Zhang et al., 2010; Liao et al., 2011; Morgan et al., 2011, 2012; Pittau et al., 2012; McCormick et al., 2013). Recently, Pereira et al. (2010) suggested that more patterns of functional connectivity change in LTLE compared to RTLE. However, after patients suffer from intractable seizures for 10 or more years, the intrahemispheric hippocampal connectivity linearly increases with the duration of disease, suggesting that over time lateralized disease may become bilateral disease (Morgan et al., 2011).
Because patients with bilateral hippocampal disease are no longer considered surgical candidates, improved methods to distinguish left and right TLE early in the course of disease are needed.
In this manuscript, we discuss the development of an automated CAD tool to diagnose, and lateralize, TLE using iPET. We also begin to address how to combine our CAD tool with manual analysis (MA) and incorporate it into clinical practice. Using a mutual information-based feature selection technique, we examine how our methods reveal more about the distributed metabolic abnormalities that are associated with the different anatomical locations of the epileptogenic focus.
The realistic goal of CAD tools is to complement, not to replace, expert analysis. Therefore, we focus on how clinical information and expert analysis can work synergistically with our automated technology. To summarize the major clinical differences, patients with NES are characteristically females in the third decade of life with psychiatric co-morbidities (Sahaya et al., 2011). PWE, however, also have significant psychiatric co-morbidities including potentially reduced financial and social independence due to the suspension of their driver's and, frequently, professional license. Particularly in adult onset epilepsy, age-associated changes in metabolism may confound the interpretation of iPET, possibly leading to an increased diagnostic uncertainty. It is well established that 80-90% of medication refractory epilepsy is "PET positive" (Lee and Salamon, 2009). The rate of PET positivity in NES has not been studied extensively; therefore, the true positive predictive value of iPET is unclear. Although these differences in clinical presentation are salient, their quantitative effect on diagnostic probabilities is unknown. Therefore, we also examined how simple clinical information and expert manual interpretation can be incorporated into our quantitative CAD tool.
The standard of care for the pre-surgical assessment for epilepsy is the manual correlation of iPET with numerous other diagnostic modalities. The goal of this assessment is to simultaneously verify the diagnosis of epilepsy, characterize the seizure etiology, and identify the location and extent of the SOZ. Expert radiologists and neurologists can detect metabolic asymmetries indicative of the epileptogenic focus or foci (Person et al., 2010). The exact threshold at which asymmetric metabolism is attributed to pathologic change or seen as a variant of normal is part of the art of neuroradiology (Benbadis et al., 2000;Reuber et al., 2002). Once non-epileptic etiologies have been ruled out, our previous work demonstrated that the quantitative degree of metabolic asymmetry is correlated with surgical outcome (Lin et al., 2007). Surgical outcome is improved further when iPET is co-registered to sMRI because of improved characterization of the focus or foci (Chandra et al., 2006;Rastogi et al., 2008;Salamon et al., 2008;Lee and Salamon, 2009). These hypometabolic lesions are thought to be secondary to increased inhibitory neuron cell death, gliosis, and abnormal functional connectivity resulting in altered functional metabolism.
The size of the hypometabolic lesion tends to be larger than the SOZ, potentially due to functional changes in nearby tissue secondary to the presence of the epileptogenic lesion (Juhasz et al., 1999; Matheja et al., 2001; Henry and Roman, 2011). Such reports are major limitations to the wide implementation of iPET in epilepsy practices (Barrington et al., 1998; So et al., 2000; Henry et al., 2011). In addition to the limitation of counting statistics, which makes the quantitative radioactivity intensity of iPET less certain in hypometabolic lesions (Kerr and Lau, 2012), the biological hypothesis is that the epileptogenic abnormality induces metabolic abnormality at the SOZ and also at closely associated and/or functionally connected regions (Henry et al., 1990, 1993; Sperling et al., 1990; Sadzot et al., 1992; Arnold et al., 1996; Dlugos et al., 1999; Bouilleret et al., 2002; Rusu et al., 2005; Nelissen et al., 2006; Takaya et al., 2006). The epileptogenic lesion commonly is larger and more diffuse in left TLE than right TLE, potentially because of the high degree of functional connectivity between specialized foci within the left temporal lobe associated with language and other functions (Toga and Thompson, 2003; Barrick et al., 2005; Iturria-Medina et al., 2011; Haneef et al., 2012; Kucyi et al., 2012). These insights parallel the trend in dementia that atrophy starts focally then spreads more quickly to functionally connected regions (Zhou et al., 2012). The limited sensitivity of iPET unaligned with sMRI to characterize extratemporal lesions may be partly due to the insufficient description of the local functional network of each extratemporal focus and thereby reduced detection of a characteristic pattern of metabolic abnormalities associated with each focus. In general, an improved insight into the clinical interpretation and value of metabolic abnormalities outside the SOZ is needed.
To overcome this limitation, the iPET analysis is used in combination with other diagnostic modalities to determine which tissue to resect.
Clinical description, EEG, MRI, and FDG-PET each describe separate facets of the pathophysiological etiology, and therefore all play critical roles in the diagnosis of epilepsy, and in the identification of the epileptogenic lesion (Struck et al., 2011). Each modality, however, also has unique limitations. EEG provides an in-depth description of the seizures and interictal epileptiform spikes. These seizures and spikes, however, are rare events: only 50% of PWE exhibit diagnostic interictal epileptiform spikes and/or seizure activity during the first outpatient scalp EEG (Gilbert et al., 2003). The characteristic signs of epilepsy in structural and diffusion MRI may not be measurable until years after the first seizure because these methods require the detection of atrophic tissue and/or subtle regions of cortical dysplasia (Swartz et al., 1992; Reutens et al., 1996; Van Paesschen et al., 1998; Liu et al., 2002; Jung da and Lee, 2010; Bernasconi et al., 2011; Schmidt and Pohlmann-Eden, 2011; Dabbs et al., 2012). MA uses the contralateral structure to assess if atrophy is present, but a certain degree of asymmetry is expected (Farid et al., 2012; Keihaninejad et al., 2012). It takes years of specific experience in manually analyzing sMRIs from PWE to reliably discriminate between normal variation and pathologic changes. Once these relatively large-scale changes in neural structure have occurred, it is less likely that both invasive and noninvasive treatments will be effective.
iPET can localize the epileptogenic lesion without observing rare events and, potentially, before changes are detectable on sMRI and/or diffusion tensor imaging (DTI) (Theodore et al., 1990; Ryvlin et al., 1991; Swartz et al., 1992; Gaillard et al., 1995; Debets et al., 1997; Knowlton et al., 1997, 2008; Blum et al., 1998; Drzezga et al., 1999; Benedek et al., 2004; Carne et al., 2004; Chandra et al., 2006; Yun et al., 2006; Uijl et al., 2007; Willmann et al., 2007; Rastogi et al., 2008; Salamon et al., 2008; Duncan, 2009; Lee and Salamon, 2009; Lerner et al., 2009; Liew et al., 2009; Brodbeck et al., 2010; Chinchure et al., 2010; Kim et al., 2011; Jupp et al., 2012). As discussed above, the presence of metabolic abnormalities outside the SOZ, however, complicates the effective localization of the SOZ using iPET alone (Henry and Roman, 2011). An improved description of these induced changes outside the SOZ may help spare healthy tissue from resective surgery. Given the recent report that resective neurosurgery for epilepsy is more effective earlier in disease, we believe that iPET may play a critical role in characterizing patients with unremarkable MRIs and inconclusive EEGs earlier in the course of their disease.

PATIENT DATA
All 105 patients included in our analysis were admitted to the University of California, Los Angeles (UCLA) Seizure Disorder Center's video-EEG Epilepsy Monitoring Unit (EMU) between 2005 and 2012. Each patient's diagnosis was based on a consensus panel review of their clinical history, physical and neurological exam, neuropsychiatric testing, video-EEG, iPET, ictal FDG-PET, structural and diffusion MRI, and/or CT scan. This multimodal assessment is the gold standard for epilepsy diagnosis and localization of the epileptic focus (Cragar et al., 2002; LaFrance and Devinsky, 2004). The patients included in this analysis were chosen because they had an FDG-PET after 2005; had no history of penetrative neurotrauma, including neurosurgery; were determined by consensus diagnosis to have a single, lateralized epileptogenic focus; and had no suspicion of mixed non-epileptic and epileptic seizure disorder. These patients were diagnosed with LTLE (n = 39), RTLE (n = 34), or NES (n = 32). PET images were determined to be interictal by clinical findings and concurrent scalp EEG.
PET and MRI images were acquired according to the best clinical practices at the time of acquisition. PET/CT studies were acquired using a Siemens Biograph scanner. After a minimum fasting period of 6 h, patients received 0.14 mCi/kg of 18F-FDG intravenously. During the ensuing 40 min uptake period, with concomitant EEG monitoring to confirm interictal status, the patients waited in a quiet, dimly lit room with their eyes open. PET images were reconstructed with an iterative algorithm (OSEM: 2 iterations, 8 subsets). CT images were reconstructed using filtered back projection at 3.4 mm axial intervals to match the slice separation of the PET data, and used for attenuation correction.

COMPUTER-AIDED DIAGNOSTIC TOOL TRAINING AND VALIDATION
Automated analysis of the iPET records was performed in four stages. (1) First, each image was screened for gross structural and/or metabolic abnormalities by S.T.N., N.M.R., and/or W.T.K., and images with such abnormalities were excluded (n = 21). These excluded subjects are not reflected in the sample sizes quoted above. (2) NeuroQ (Syntermed, GA, USA) was used to segment each brain into 47 regions of interest (ROIs) and then to calculate the average radioactivity in each ROI, normalized by the whole brain radioactivity (Table A1 in Appendix). (3) The minimum redundancy-maximum relevancy (mRMR) toolbox for MATLAB (Mathworks, MA, USA) was used to generate a ranked list of the ROI metabolisms (features) within each training set that were maximally relevant to the diagnosis of epilepsy and minimally redundant with all higher ranked features (Peng et al., 2005). The representative number of features to exclude and quantal levels was selected based on our method discussed previously (Kerr et al., 2012a,b) (see below). In each of the training sets, the feature ranking was determined exclusive of the test patient's data. We expect the ranked lists to be similar, but not identical, across training sets. For purely illustrative purposes, the full dataset was used to create the ranked list in Table 2. (4) Weka was used to implement leave-one-out cross-validation of a cost-sensitive Multilayer Perceptron (MLP) that was weighted to maximize balanced accuracy, defined as the mean of sensitivity and specificity (Bouckaert et al., 2010). Using this method, we examined our ability to distinguish either LTLE or RTLE from NES and assessed our ability to diagnose and lateralize disease simultaneously. For the remainder of this manuscript, the latter tool that discriminates LTLE vs. RTLE vs. NES is called the trinary classifier. Similarly, the binary CAD tools are referred to by the laterality of epilepsy that is being detected. The comparison to NES is not stated, but can be assumed.
We then compared our CAD tool's performance to the results of MA alone.

MACHINE LEARNING ALGORITHMIC DETAILS
The MLP was implemented with default parameters in Weka (Bouckaert et al., 2010). All input features were normalized to values between negative and positive 1. No limit was set on the number of hidden layers or nodes within each hidden layer. These parameters were optimized within each training set independently. The learning rate and momentum were set to 0.3 and 0.2, respectively. Five hundred epochs were used for training. During training, models with more than 20 consecutive errors were excluded. The trinary classifier was created by decomposing the three class problem into three 1-against-1 problems that were combined using majority voting. No three-way ties occurred during training or testing.
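The 1-against-1 decomposition with majority voting can be sketched as follows. This is a minimal illustration in Python rather than the Weka implementation used in the study; the function name `one_vs_one_vote`, the `toy_rules` dictionary, and the asymmetry score it thresholds are all hypothetical and stand in for the three trained pairwise MLPs.

```python
from collections import Counter
from itertools import combinations

CLASSES = ("LTLE", "RTLE", "NES")


def one_vs_one_vote(pairwise, x, classes=CLASSES):
    """Majority vote over all 1-against-1 classifiers.

    `pairwise` maps each (class_a, class_b) pair to a callable that returns
    class_a or class_b for input x. Returns None on a tie for first place.
    """
    votes = Counter(pairwise[pair](x) for pair in combinations(classes, 2))
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # e.g. a three-way tie, one vote per class
    return ranked[0][0]


# Hypothetical threshold rules on a made-up left-minus-right asymmetry score.
# They are placeholders for trained pairwise classifiers, not study results.
toy_rules = {
    ("LTLE", "RTLE"): lambda x: "LTLE" if x < 0 else "RTLE",
    ("LTLE", "NES"): lambda x: "LTLE" if x < -0.1 else "NES",
    ("RTLE", "NES"): lambda x: "RTLE" if x > 0.1 else "NES",
}
```

With three classes there are exactly three pairwise votes, so either one class collects two votes or each class collects one; the latter is the three-way tie that the text reports never occurred.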
Balanced accuracy was optimized using a cost-sensitive classifier in which a false positive was given a cost of n+ and a false negative was given a cost of n−, where n+ and n− represent the number of PWE and NES in the full sample, respectively. In the trinary classifier, the cost was set as the sum of the number of patients in the other two diagnostic classes.
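One way to see why this cost assignment targets balanced accuracy is that the total cost is proportional to one minus the balanced accuracy. The sketch below checks that identity numerically. The group sizes (39 and 32) follow the LTLE vs. NES comparison, but the split into correct and incorrect calls is invented for illustration, and the function names are our own.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def total_cost(tp, fn, tn, fp):
    """Total misclassification cost under the weighting described above."""
    n_pos = tp + fn  # patients with epilepsy (n+)
    n_neg = tn + fp  # patients with NES (n-)
    # Each false positive costs n+, each false negative costs n-.
    return fp * n_pos + fn * n_neg


# Hypothetical confusion counts: 39 epilepsy and 32 NES patients in total.
example = dict(tp=32, fn=7, tn=26, fp=6)

cost = total_cost(**example)
# Identity: cost = 2 * n+ * n- * (1 - balanced accuracy)
identity = 2 * 39 * 32 * (1 - balanced_accuracy(**example))
```

Because the factor 2 · n+ · n− is fixed for a given sample, minimizing this cost is equivalent to maximizing balanced accuracy, regardless of how unequal the two groups are.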
Cyclical leave-one-out cross-validation (CL1OCV) was used to assess the performance of the MLP. In this paradigm, all patients but one were used to determine the features selected and to train the algorithm. The single remaining patient was then tested using the model built upon the other patients. The identity of the test patient was permuted until every patient had been the test case once and only once.
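The key property of this paradigm is that feature selection is repeated inside every fold, so the held-out patient never influences which features are chosen. The following sketch shows that loop structure only. As loudly labeled assumptions, it replaces mRMR with a simple difference-of-class-means ranking and replaces the MLP with a nearest-centroid rule; all function names and the toy data are hypothetical.

```python
from statistics import mean


def rank_features(X, y, k):
    """Stand-in ranking (NOT mRMR): spread of per-class means per feature."""
    labels = sorted(set(y))

    def score(j):
        means = [mean(r[j] for r, lab in zip(X, y) if lab == c) for c in labels]
        return max(means) - min(means)

    return sorted(range(len(X[0])), key=score, reverse=True)[:k]


def fit_centroids(X, y, cols):
    """Stand-in model (NOT an MLP): one centroid per class."""
    return {
        c: [mean(r[j] for r, lab in zip(X, y) if lab == c) for j in cols]
        for c in set(y)
    }


def predict(centroids, row, cols):
    def dist(c):
        return sum((row[j] - m) ** 2 for j, m in zip(cols, centroids[c]))

    return min(sorted(centroids), key=dist)


def leave_one_out(X, y, k):
    predictions = []
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        # Selection and training both exclude the held-out patient i.
        cols = rank_features(X_train, y_train, k)
        model = fit_centroids(X_train, y_train, cols)
        predictions.append(predict(model, X[i], cols))
    return predictions


# Toy data: feature 0 separates the groups, feature 1 is uninformative.
X = [[1.0, 0.5], [0.9, 0.4], [1.1, 0.6], [0.2, 0.5], [0.1, 0.6], [0.3, 0.4]]
y = ["TLE", "TLE", "TLE", "NES", "NES", "NES"]
```

Calling `leave_one_out(X, y, k=1)` returns one prediction per patient, each made by a model that never saw that patient during either selection or training.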
To determine the number and identity of the input features, the mRMR algorithm requires that the number of input features, F, and quantal levels, Q, be set a priori. For the calculation of mutual information, the features were smoothed into Q quantal bins akin to the bins in a histogram. Classification, however, utilizes unsmoothed features. The choice of input features smoothed into quantal levels was determined to be most representative of the performance of the algorithm across a wide variety of choices of F and Q (Kerr et al., 2012b). This choice was made by selecting a point within a region of F-Q parameter space that performed significantly better than the naïve classifier with 95% confidence based on random field theory correction where the spatial smoothness is estimated directly from the data (for more details, see Worsley et al., 1992, 2004; Chauvin et al., 2005). The naïve classifier classifies all test exemplars as the most common class in the training set. Under the CL1OCV procedure, these input features were determined independently for each of the training samples. The illustrated rank order of features was calculated based on the full dataset, and does not necessarily match the rank list of any individual training sample.
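To make the roles of F and Q concrete, the sketch below bins each feature into Q levels, estimates mutual information from the binned values, and greedily selects F features by relevance minus mean redundancy. It is an illustration only, not the MATLAB mRMR toolbox: the equal-frequency binning rule, the difference form of the criterion, and all function names are assumptions, since the text does not specify them.

```python
from collections import Counter
from math import log2


def quantize(values, q):
    """Assign each value to one of q equal-frequency bins (assumed rule).

    Tied values are split by position, which is adequate for a sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * q // len(values)
    return bins


def mutual_information(a, b):
    """Mutual information in bits between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * log2(c * n / (pa[u] * pb[v])) for (u, v), c in pab.items())


def mrmr(features, labels, n_select, q):
    """Greedy selection: maximize relevance minus mean redundancy."""
    binned = [quantize(f, q) for f in features]
    relevance = [mutual_information(b, labels) for b in binned]
    selected, remaining = [], list(range(len(features)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(binned[j], binned[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < n_select:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Here `n_select` plays the role of F and `q` the role of Q. A feature that duplicates an already selected one is penalized by its redundancy term, which is the behavior that distinguishes this ranking from ordering ROIs by a univariate statistic.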
When clinical information was incorporated into the algorithm, the same methodology was applied as above, except that all exemplars with missing data were excluded from analysis. In these additional analyses, we did not re-sample the parameter space of F and Q. We simply used the selections determined in the previous analysis.

MANUAL ANALYSIS OF PET AND MRI RECORDS
Manual analyses of the iPET and sMRI records were performed based on the review of clinical records primarily written by Dr. Noriko Salamon. Dr. Salamon has 10 years of experience in the pre-surgical assessment of epilepsy using FDG-PET and MRI. All manual interpretation was conducted for the clinical assessment of each patient when it occurred, prior to the CAD tool development. Therefore, Dr. Salamon was blinded to the automated results. Due to the unclear relationship between structural and metabolic abnormalities, asymmetries, and epilepsy, all abnormal results were interpreted to be consistent with some form of epilepsy. Not all patients had sMRI (n = 6) and iPET (n = 1) reports available; therefore all analysis regarding MA of neuroimaging includes only patients with available records. These patients had raw iPET data available; they therefore were included in the automated analysis.

COMBINATION OF CLINICAL INFORMATION WITH COMPUTER-AIDED DIAGNOSTIC INFORMATION
To examine the combined power of clinical knowledge, MA and our automated analysis, we assessed the linear correlation of detecting epilepsy with CAD compared to MA, and also incorporated clinical information and MA into our algorithm in two ways. First, the clinical literature suggests that patients with NES are more likely to be female, begin having seizures in the third decade of life, have a decreased duration of disease and have increased seizure frequency (Table 1). Although we did not see a significant difference in seizure frequency within our dataset, we included this feature to better match clinical practice. These clinical features were then added to the input and leave-one-out cross-validation was repeated. Secondly, to explore how our computational methods can complement clinical wisdom, we included the results of MA of the iPET and sMRI as two additional input features and re-evaluated CAD performance. For the trinary classifier only, we split each of the features describing the iPET and sMRI MA to indicate if a left and/or right sided abnormality was reported.
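The linear correlation between two diagnostic readouts can be illustrated with a short sketch. We assume here that each readout is coded as 1 for "epilepsy" and 0 for "NES" and that the statistic is the squared Pearson correlation; the text does not state the exact coding, so both are assumptions, and the example vectors are invented.

```python
def squared_correlation(a, b):
    """Squared Pearson correlation between two equal-length 0/1 sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov * cov / (var_a * var_b)


# Invented readouts for six patients: the two methods agree on four of six.
cad_calls = [1, 1, 1, 0, 0, 0]
ma_calls = [1, 1, 0, 0, 0, 1]
r2 = squared_correlation(cad_calls, ma_calls)
```

In this toy case the two methods agree on two thirds of patients, yet the squared correlation is only 1/9. This shows how two readouts of similar accuracy can still share little information, which is the sense in which one is "not redundant" with the other.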
To assess the applicability of our CAD as a separate modality that could be considered as part of the clinical assessment of epilepsy, we calculated the likelihood ratios (LRs) of each of the combinations of our CAD with MA of iPET and/or sMRI. This was done only for the binary classifiers, because LRs have a clear formulation only for binary outcomes. The likelihood ratio is defined by the likelihood that a patient with a certain combination of diagnostic outcomes has epilepsy, divided by the likelihood that the same patient has NES. Intuitively, a likelihood ratio of two implies that the patient is twice as likely to have epilepsy. The 95% confidence intervals of chance were calculated using exact binomial intervals by considering the likelihood ratio of a classifier that diagnosed patients according to their prior likelihood alone, conditioned upon the assumption that the same total number of patients would have the diagnostic outcome of interest. For example, 39 of 71 patients had LTLE when we discriminated between LTLE and NES, therefore the median LR is 1.2. Thirty-five patients from the NES vs. LTLE group had negative MA of their iPET. Therefore, we use a binomial distribution with 35 trials and success probability of 39 over 71 to yield a 95% confidence interval of 0.94-3.38.
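The likelihood ratio as defined in this paragraph can be written as a small helper. This is a sketch of the definition only; the function name is our own, and it does not attempt to reproduce the chance confidence interval, whose exact construction is not fully specified here.

```python
def likelihood_ratio(n_epilepsy, n_nes):
    """Ratio of epilepsy to NES patients sharing one diagnostic outcome.

    Returns infinity when no NES patient had the outcome, matching bars
    that run off the scale in a plot of these ratios.
    """
    if n_nes == 0:
        return float("inf")
    return n_epilepsy / n_nes


# Prior ratio for the LTLE vs. NES comparison: 39 LTLE and 32 NES patients.
prior_lr = likelihood_ratio(39, 32)
```

For the LTLE vs. NES comparison this gives 39/32, about 1.2, which is the chance-level value quoted in the text. An outcome is informative to the extent that its ratio departs from this prior in either direction.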

RESULTS
All of our results are compared to the gold standard diagnosis from the consensus panel. The clinical trial statistics of each of our automated diagnostic tools matched, but were not redundant with, expert MA of both interictal PET and sMRI (Figure 1) (56-81%)]. The pattern in sensitivities, specificities, and odds ratios all parallel this trend, where our automated diagnostic tools are non-statistically superior to MA of iPET, which, in turn, is non-statistically superior to MA of sMRI (Figure 1). The accuracy of our trinary CAD tool that simultaneously diagnoses epilepsy and lateralizes disease was 76% (66-84%), where 89% (77-96%) of patients correctly identified with epilepsy were also lateralized correctly. MA to diagnose and lateralize was 78% (69-86%) accurate with 89% (76-94%) correctly lateralized using iPET and 71% (61-80%) accurate with 91% (78-97%) correctly lateralized using sMRI.
The rank order of the features used in our algorithm parallels the clinical observation that the epileptogenic networks in LTLE are broader than in RTLE. The LTLE vs. NES classifier achieved its performance by utilizing trends across almost the entire brain, including 42 of the 47 features in the final algorithm. In contrast, the RTLE vs. NES classifier only needed to measure the metabolism in six regions (bilateral temporal cortex and two associated regions of cortex) to achieve its impressive performance (Table 2). As expected, the trinary classifier utilized an intermediate number of features to achieve its accuracy (30 of 47). The rank list of these features matches the biological intuition based on knowledge about the potential connectivity of epileptogenic networks (Table 2).
When the same automated analysis was used to combine clinical findings with our iPET data, performance did not change significantly. After the four clinical factors were added to the input of our tools, the accuracy changed to 79% (66-88%), 68% (56-79%), and 64% (54-73%) for the LTLE, RTLE, and trinary classifiers, respectively (Figure 3). These accuracies did not substantively change when only sex and duration of disease were considered (results not shown). Adding the results of MA of both iPET and sMRI to our iPET data changed the accuracy to 82% (73-91%), 77% (67-88%), and 68% (59-77%) for the LTLE, RTLE, and trinary classifiers, respectively. When all information sources contributed to the algorithm, the accuracy changed to 77% (68-88%), 74% (64-85%), and 76% (68-84%) for the LTLE, RTLE, and trinary classifiers, respectively. We also combined the results of MA with our CAD tool manually using LRs. After doing so, the likelihood was generally only significant if all considered modalities agreed. Viewed alone, MA and our CAD increased the likelihood of the predicted outcome between two- and ninefold (p < 0.02; Figure 4A). When two analysis streams were combined, if both analyses agreed, the likelihood of the predicted outcome was increased between 8- and 27-fold (p < 3 × 10^-4; Figures 4B,C). If all three analyses agreed, the likelihood of the predicted outcome increased more than 15-fold (p < 1.3 × 10^-5; Figure 4D). However, in most cases, if there was any disagreement, the likelihood did not change significantly, most probably due to the small numbers of patients with each potential outcome. There are two key exceptions: (1) Given iPET results indicating NES over RTLE using either MA or CAD, the sMRI could be largely ignored (p < 1.1 × 10^-2). (2) If both MA and CAD of iPET agreed that a patient suffered from LTLE and not NES, the sMRI results could be similarly ignored (p < 3.3 × 10^-2).

FIGURE 1 | CAD tool performance matches manual analysis. These figures indicate the accuracy, sensitivity and specificity of the LTLE (A), RTLE (B) and trinary (C) classifiers. The performance of our CAD tools matched that of MA and was superior to just using gender alone. The error bars indicate standard error of the mean performance for each measure. The translucent region indicates the performance of a naïve classifier. *Indicates significant differences from the naïve classifier with a confidence level of 95% or more.

inferior (i), lateral (l), median (m), anterior (a), and posterior (p). The trailing C signifies cortex. Note that the LTLE vs. NES and trinary classifiers include information from 42 and 30 ROIs, respectively. To better understand the benefit of mRMR, this list can be directly compared to the list of ROIs ranked by t-statistics in Table A1 in Appendix.

FIGURE 2 | CAD tool is not redundant with manual analysis. The squared correlation of our CAD tools' results with those of MA of the iPET or sMRI from the same patients was below 50%. This indicates that while some information is shared, the majority of information provided by our CAD tools is not captured by MA. The correlation between MA of iPET and sMRI is similar in magnitude to the correlation of CAD with MA; therefore, the CAD could potentially be seen as similar to another informative modality. *Indicates significant differences of the correlation from zero with a confidence level of 95% or more.

DISCUSSION
These results demonstrate how our CAD tool has the potential for clinical application, while also confirming and elucidating the distributed effects of epilepsy on the entire brain. Our CAD tool's diagnostic performance for TLE matches, but is not redundant with, expert MA of iPET and sMRI. When considered in the context of recent reports of CAD tools for epilepsy based on sMRI and interictal EEG data (Farid et al., 2012; Focke et al., 2012; Keihaninejad et al., 2012; Kerr et al., 2012a), CAD is proving especially applicable to epilepsy. Further, if more work confirms the hypothesis that metabolic changes in iPET are observable before the structural changes in sMRI, our iPET tool may have better clinical utility than these existing sMRI tools. In contrast to MA, this and other CAD tools can be quickly and efficiently applied by minimally trained technicians, emergency physicians, and primary care providers as preliminary analysis of the iPET images (van Ginneken et al., 2011; Kerr et al., 2012c). The performance of MA can vary with experience and fatigue of the observer; automated tools are consistent over time. Upon further validation, these CAD results could also be incorporated into the consensus diagnoses with minimal cost if iPET already has been obtained.

CLINICAL IMPACT
Our CAD tools could provide valuable clinical information that may help readily identify which treatments may be effective in patients who present with uncharacterized and/or medication-refractory seizures (Kerr et al., 2012a,c). In particular, 15 of our 105 patients were admitted twice to achieve definitive characterization or localization of their seizures. The appropriate binary classifier correctly diagnosed 12 (80%) of these challenging patients. This valuable information might reduce the need for multiple video-EEG admissions. Additionally, 28% (9/32) of our PWN were admitted for improved characterization of their previously-diagnosed "epilepsy," and 16% (12/73) of our PWE were admitted for the differential diagnosis of epilepsy, indicating that non-epileptic etiologies were not ruled out sufficiently. The trinary CAD effectively diagnosed 67% (14) of these particularly challenging patients. Despite this impressive performance, the ultimate goal of CAD is to complement, not replace, MA.

However, if there is disagreement, the likelihood ratio is generally not significantly different from chance. The translucent bars indicate the 95% confidence interval for chance with the relevant sign (see Materials and Methods). The numbers above the translucent bars indicate the total number of patients with each outcome. The bars that go off the scale of the graph diverge toward zero or infinity because no patients of a certain class had that outcome. *Indicates significant differences of the correlation from zero with a confidence level of 95% or more.

COMBINATION OF AUTOMATED ANALYSIS WITH CLINICAL WISDOM
Our finding that performance almost uniformly, although not statistically significantly, decreased when the automated algorithm incorporated clinical information indicates that automated analysis cannot and should not replace manual interpretation across information modalities. We suspect that performance decreased due to ineffective modeling of the contribution of the clinical information and over-fitting. The statistical distribution of the clinical factors was very different from that of the metabolic data; therefore, the same model likely cannot effectively utilize both modalities. The efficient incorporation of multimodality information into machine learning is an active area of theoretical research, and well-validated methods are not yet available. Now that CAD tools using interictal EEG (Kerr et al., 2012a), sMRI (Farid et al., 2012;Focke et al., 2012;Keihaninejad et al., 2012), and iPET have been published, we believe it will be extremely exciting to assess how these various tools can be combined.
We expected that the best performance would be achieved when our CAD is used synergistically with MA. The low correlations between the CAD results and MA suggest that our CAD tool provides information that is not evident on visual inspection. These results emphasize that PET is not redundant with MRI (Henry et al., 1999). Physicians could learn to view CAD as analogous to another imaging modality that provides valuable, but not perfectly diagnostic, clinical insight. This synergistic application of computer-aided diagnosis after manual interpretation already has proven beneficial in the detection of lung nodules by the FDA and is an active area of translational research (Kerr et al., 2012c;Wang et al., 2012). The key differences between MA and automated analysis are the ability to entirely ignore certain pieces of data, and to rule that the results are inconclusive.
The results summarized above, and the LRs for each analysis stream individually, show that both MA and CAD are useful clinically. If the analysis streams agree, the diagnostic certainty increases substantially, but at a cost: as more analyses are added, more patients have inconclusive results because the analyses did not agree, and the LRs are not significant. Even though our sample size is large compared to other studies of this type, there were not enough patients in our dataset with each diagnostic outcome to explain the clinical implication of disagreeing analyses adequately. This matter of inconclusive results is a common challenge faced in clinical practice. Physicians struggle regularly with those types of decisions. When MA of iPET and sMRI are combined, they need to agree to yield meaningful results. However, our analysis shows that in some specific cases, if both the MA and CAD of iPET agree, the sMRI is not needed. This parallels the finding we suggested above: iPET may be more clinically useful than sMRI to diagnose and lateralize epilepsy. The hypometabolic abnormality may be present earlier in disease (Theodore et al., 1990;Ryvlin et al., 1991;Swartz et al., 1992;Gaillard et al., 1995;Debets et al., 1997;Knowlton et al., 1997, 2008;Blum et al., 1998;Drzezga et al., 1999;Benedek et al., 2004;Carne et al., 2004;Chandra et al., 2006;Yun et al., 2006;Uijl et al., 2007;Willmann et al., 2007;Rastogi et al., 2008;Salamon et al., 2008;Duncan, 2009;Lee and Salamon, 2009;Lerner et al., 2009;Liew et al., 2009;Brodbeck et al., 2010;Chinchure et al., 2010;Kim et al., 2011;Jupp et al., 2012), and it may provide slightly more accurate disease characterization, as seen in our dataset. In settings where the PET scanner is not combined with the MRI scanner, and/or when the cost of imaging is a limiting factor (both common occurrences), the effective application of our CAD could result in substantial cost savings.

PATHOPHYSIOLOGICAL INSIGHTS
Our methods also reveal a potential difference in the pathophysiology of left vs. right TLE. This may help explain why CAD tools perform slightly better when diagnosing RTLE compared to LTLE (Farid et al., 2012;Focke et al., 2012;Keihaninejad et al., 2012). The finding that mostly bilateral temporal ROIs, the right inferior frontal cortex and left sensorimotor cortex provide non-redundant diagnostic information for RTLE is consistent with the clinical wisdom that the epileptogenic network in RTLE is more focal than in LTLE. The inclusion of temporal regions echoes the conventional wisdom that focal hypometabolism and asymmetry reflect characteristic changes due to epilepsy. This suggests that conservative resection of the temporal lobe may result in increased rates of seizure freedom in RTLE compared to LTLE due to complete resection of the SOZ. Further, seizures that originate in the left temporal lobe may secondarily generalize more frequently in LTLE. These differences have not yet been studied clinically.
The trends in the extratemporal regions included in the algorithms suggest that the primary lesion may induce metabolic changes in functionally or anatomically associated regions. This is substantiated further by the finding that almost all regions of the brain provide diagnostic information in LTLE. This in turn mirrors the increased stereotypic connectivity of the left temporal lobe. Even though the interconnectivity of the right hemisphere is higher than that of the left hemisphere, the left hemisphere has strong connections between specialized foci (Barrick et al., 2005;Iturria-Medina et al., 2011;Kucyi et al., 2012). We hypothesize that the SOZ may induce abnormal metabolism along these strong, stereotyped connections. This change cannot be attributed to language specifically in our dataset because we did not identify the laterality of language dominance in our patients. Compared to our t-statistic ranking, it may seem surprising that the metabolism of the midbrain was ranked first by mRMR for LTLE vs. NES. This rank may indicate a non-linear change in the metabolism within the dorsal midbrain anticonvulsant zone, which has itself been identified in animals to be part of the network that modulates seizure threshold (Shehab et al., 1995). The exact relationship between epilepsy and midbrain metabolism is unclear, however. The lack of distributed atrophy in LTLE measured by sMRI suggests that these changes are not associated with distributed cell death or gliosis (Farid et al., 2012;Focke et al., 2012;Keihaninejad et al., 2012). Instead, we hypothesize that this change reflects abnormal metabolism in these regions due to altered neural connectivity and/or activity secondary to the epileptogenic lesion. This is supported by the finding that LTLE was associated with more changes in functional connectivity than RTLE was (Pereira et al., 2010).
This also explains why we observed metabolic changes in the right thalamus in RTLE: recent work demonstrates that the connectivity of the right thalamus with the right hippocampus is reduced in RTLE (Morgan et al., 2012). The presence of such distributed changes also supports the finding that the size of the hypometabolic lesion visualized on PET may be larger than the SOZ (Juhasz et al., 1999;Matheja et al., 2001;Henry and Roman, 2011). It is particularly interesting to note that the extent of these distributed changes is underappreciated by t-statistics comparing LTLE to NES. This indicates that there is a complex, likely non-linear, relationship between the metabolism of the hypometabolic lesion and its associated tissue that may be better understood by mutual information.
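A toy example may clarify why a region can rank low by t-statistic yet high by a mutual-information criterion such as the relevance term of mRMR. The sketch below uses synthetic values, not data from this study. The bin edges, group sizes, and function names are hypothetical, and only the relevance term is computed (mRMR additionally penalizes redundancy between features). The synthetic "patient" values deviate from the control values in both directions, so the group means are identical while the distributions differ.

```python
import math
from collections import Counter
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for the difference in means of two samples."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


def mutual_information(values, labels, edges):
    """Mutual information (bits) between a binned feature and a class label."""
    bins = [sum(v >= e for e in edges) for v in values]  # index of each bin
    n = len(values)
    joint = Counter(zip(bins, labels))
    p_bin = Counter(bins)
    p_label = Counter(labels)
    return sum(
        (c / n) * math.log2(c * n / (p_bin[b] * p_label[y]))
        for (b, y), c in joint.items()
    )


# Synthetic values: controls cluster near zero; "patients" are shifted
# either up or down, so both groups have a mean of zero.
controls = [-0.5, -0.25, 0.0, 0.25, 0.5] * 4
patients = [-2.4, -2.2, -2.0, -1.8, -1.6, 1.6, 1.8, 2.0, 2.2, 2.4] * 2

values = controls + patients
labels = [0] * len(controls) + [1] * len(patients)

print("t-statistic:", welch_t(patients, controls))
print("mutual information (bits):", mutual_information(values, labels, [-1.0, 1.0]))
```

In this construction the t-statistic is zero because the means coincide, whereas the binned feature determines the class label completely, giving one bit of mutual information for two equally sized groups.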
The inclusion of the contralateral hippocampus in both of the binary classifiers lends itself to multiple interpretations that are all supported by biologically sound hypotheses. Firstly, a salient feature of LTLE or RTLE could be asymmetric metabolism, as suggested clinically; therefore, the metabolism of the contralateral hippocampus was compared to the observed metabolism in the ipsilateral hippocampus. Alternatively, the interhemispheric connectivity between the hippocampi is high; therefore, under our hypothesis that changes in metabolism spread according to functional connections, the metabolism in the contralateral hippocampus may be one of the first induced changes due to the epileptic lesion. Lastly, if LTLE and RTLE are inherently bilateral diseases, then the metabolism in the contralateral hippocampus may also be abnormal. This also provides an explanation for why LTLE and RTLE were not perfectly distinguished.
In addition to diagnosing epilepsy, our algorithm lateralized disease efficiently with an accuracy of approximately 90% when epilepsy was diagnosed correctly. This impressive accuracy could be clinically useful for pre-surgical planning, when used in combination with other clinical and radiological information. Although our current sample size is too small to assess this potential fully, our results suggest that similar methodology could be applied to a larger dataset with more diverse and specific SOZ localizations to yield an objective and reliable tool to assist in pre-surgical SOZ localization. Our data suggest that this approach likely would identify and utilize distributed metabolic findings associated with each epileptic lesion to improve performance. Instead of blurring the boundary of the SOZ by detecting affected tissue outside the SOZ, the improved understanding of these distributed effects may lead to more refined characterization of this clinically vital SOZ. However, the spatial resolutions of our outcome classes were insufficient to assess the utility of this method directly to identify candidate lesions for resective surgery.
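The distinction between overall trinary accuracy and lateralization accuracy among patients correctly identified with epilepsy can be made concrete with a small sketch. The confusion matrix below is hypothetical and does not reproduce our results, and the function name `trinary_metrics` is introduced only for illustration. Lateralization accuracy is taken here as the fraction of PWE classified as either LTLE or RTLE whose side was correct.

```python
def trinary_metrics(confusion):
    """Return (overall_accuracy, lateralization_accuracy).

    confusion[true_class][predicted_class] holds patient counts for the
    classes "LTLE", "RTLE", and "NES".
    """
    classes = ("LTLE", "RTLE", "NES")
    tle = ("LTLE", "RTLE")

    total = sum(confusion[t][p] for t in classes for p in classes)
    correct = sum(confusion[c][c] for c in classes)

    # PWE whom the classifier labeled as epileptic, on either side.
    detected = sum(confusion[t][p] for t in tle for p in tle)
    # Of those, the patients assigned to the correct side.
    lateralized = sum(confusion[c][c] for c in tle)

    return correct / total, lateralized / detected


# Hypothetical counts, chosen only to illustrate the calculation.
confusion = {
    "LTLE": {"LTLE": 20, "RTLE": 2, "NES": 3},
    "RTLE": {"LTLE": 3, "RTLE": 18, "NES": 4},
    "NES": {"LTLE": 2, "RTLE": 2, "NES": 16},
}

overall, lateralization = trinary_metrics(confusion)
print(f"overall accuracy: {overall:.1%}")
print(f"lateralization accuracy given detection: {lateralization:.1%}")
```

Because the second quantity conditions on epilepsy having been detected, it can exceed the overall accuracy, which also counts PWE classified as NES and PWN classified as epileptic as errors.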
While our lateralization accuracy is exciting, there is also a potential clinical interpretation of the patients who were falsely-lateralized. Functional connectivity between the temporal lobes is particularly strong. In a minority of patients, this connectivity allows epileptogenic activity to spread quickly from the SOZ to the contralateral temporal lobe on EEG, resulting in the appearance of either bilateral or falsely-lateralized disease. Similarly to the distributed networks discussed above, this high degree of functional connectivity also may induce metabolic abnormalities in the contralateral temporal lobe that may be indistinguishable from the primary lesion. This hypothesis can be tested by comparing these falsely-lateralized patients to patients with bilateral TLE. This comparison requires a detailed methodological treatment of non-mutually exclusive classes in machine learning and therefore lies outside the scope of the current manuscript.
To characterize these and other pathophysiological insights, most studies utilize healthy neurologically normal controls. In contrast, we prefer the use of PWN as our control group. In brief, when constructing a control group, one aims to match the patients in the pathologic group in all aspects other than the pathology. Unlike neurologically normal controls, PWN have been exposed to AEDs and other medications similarly to PWE, have an increased prevalence of TBI and some other risk factors for epilepsy (Sahaya et al., 2011), have regular and frequent meetings with health care providers, and have much more strict inclusion criteria. Lastly, and perhaps most importantly, physicians do not consider whether all of their patients have epilepsy; they assess only the patients with seizures. Therefore, in our opinion, the use of PWN as the control group is a benefit of our study because it maximizes the clinical relevance of our results while simultaneously improving its statistical selectivity.

LIMITATIONS AND FUTURE DIRECTIONS
Because our retrospective dataset was collected as part of clinical care, our approach has a few important limitations. The accuracy of MA reported in our patients is worse than the rates quoted in previous literature (Rastogi et al., 2008;Salamon et al., 2008;Lee and Salamon, 2009). Given UCLA's status as a tertiary referral center, the decrease in manual accuracy likely indicates that our patients had more heterogeneous etiologies and/or were more complex and difficult to diagnose than patients at other centers. This suggests that our CAD tool may perform better on other datasets. Our iPETs and MRIs were collected on varying cameras with varying resolutions. This demonstrates the flexibility of our automated analysis using NeuroQ. The efficacy of the MA of older and limited resolution data may not be comparable to that of more current and higher resolution data. After establishing the efficacy of our method, we plan to both validate our tool prospectively on data from other centers, and to incorporate multi-center data into our algorithm to further improve its performance. Additionally, we only discuss the combination of CAD results with independently derived MA. Future work will examine the efficacy of CAD tools informed by MA and vice versa.
Critics of our approach might claim that the significant gender and age difference of the patients with NES compared to PWE may lead to our CAD simply detecting the age and/or gender of the patients. While we do not expect this to be the case for RTLE, the utilization of language areas by the LTLE classifier might reflect differences in gender, and not epileptogenic pathology. However, the performance of our CAD was significantly higher than when clinical information was used directly; therefore, the algorithm utilized more information than just clinical data to achieve its strong performance. These significant differences in clinical factors largely mirror the observed differences in clinic; therefore, our dataset better matches the population to which our CAD tool would be applied. The only notable exception is the significant age difference between LTLE and RTLE, which was unexpected. Due to the naturalistic nature of our data collection scheme, we did not correct for this difference. However, we note that, as with the NES group, the use of age alone performed significantly worse than our tool, and the addition of age to the iPET data to control for its effect did not significantly change performance.
Another key caveat to the direct application of our tool to clinical practice is the fact that epilepsy is an extremely heterogeneous disease. The generalization of our method to bilateral TLE, extratemporal foci and multifocal epilepsy will be critical before it can be incorporated into clinical practice. In particular, even though NES mimic all types of seizures, it is uncommon for TLE to be mistaken for NES. Instead, it is more common that NES appear to have a focus in frontal cortex (LaFrance and Benbadis, 2011). Therefore, the literature suggests that the highest impact CAD tool would discriminate between frontal lobe epilepsy and NES and another, separate tool could be used to lateralize TLE. Based on our results above (see section Clinical Impact), we believe that our TLE-specific tool may be clinically applicable. For the first publication demonstrating the applicability of computer-aided diagnosis based on iPET data, we chose to focus on the diagnosis and lateralization of TLE, based on prior findings that the sensitivity of iPET is highest for TLE. Our future work then can address generalizing our methods to the other epilepsies, including bilateral TLE and frontal lobe epilepsy.

CONCLUSION
Despite a few caveats, and upon further validation with data from other centers, our automated methods could provide unique information for the effective and efficient characterization of epilepsy, with the potential to decrease the fraction of patients with NES that are being treated (inappropriately) with AEDs, and to more quickly triage patients with medication refractory epilepsy toward surgical intervention. This may help achieve the ultimate goal: a global reduction in seizures.