AUTHOR=Giannini Lucia A. A., Xie Sharon X., Peterson Claire, Zhou Cecilia, Lee Edward B., Wolk David A., Grossman Murray, Trojanowski John Q., McMillan Corey T., Irwin David J.

TITLE=Empiric Methods to Account for Pre-analytical Variability in Digital Histopathology in Frontotemporal Lobar Degeneration

JOURNAL=Frontiers in Neuroscience

VOLUME=Volume 13 - 2019

YEAR=2019

URL=https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2019.00682

DOI=10.3389/fnins.2019.00682

ISSN=1662-453X

ABSTRACT=Digital pathology is increasingly prominent in neurodegenerative disease research, but variability in immunohistochemical staining intensity between staining batches prevents large-scale comparative studies. Here we provide a statistically rigorous method to account for staining batch effects in a large sample of brain tissue with frontotemporal lobar degeneration with tau inclusions (FTLD-Tau, N=39) or TDP-43 inclusions (FTLD-TDP, N=53). We analyzed the relationship between duplicate measurements of digital pathology, i.e. percent area occupied by pathology (%AO) for grey matter (GM) and white matter (WM), from two distinct staining batches. We found a significant difference in duplicate measurements from distinct staining batches in FTLD-Tau (mean difference: GM=1.13±0.44, WM=1.28±0.56; p<0.001) and FTLD-TDP (GM=0.95±0.66, WM=0.90±0.77; p<0.001), and these measurements were linearly related (R-squared [Rsq]: FTLD-Tau GM=0.92, WM=0.92; FTLD-TDP GM=0.75, WM=0.78; all p<0.001). We therefore used linear regression to transform %AO from distinct staining batches into equivalent values. Using a train-test set design, we examined transformation prerequisites (i.e. Rsq) from linear modeling in training sets, and we applied equivalence factors (i.e. beta, intercept) to independent testing sets to determine transformation outcomes (i.e. intraclass correlation coefficient [ICC]). First, random iterations (x100) of linear regression showed that smaller training sets (N=12-24), feasible for prospective use, have acceptable transformation prerequisites (mean Rsq: FTLD-Tau≥0.9; FTLD-TDP≥0.7). When cross-validated on independent complementary testing sets, in FTLD-Tau, N=12 training sets resulted in 100% of GM and WM transformations with optimal transformation outcomes (ICC≥0.8), while in FTLD-TDP, N=24 training sets resulted in optimal ICC in testing sets for 72% of GM and 98% of WM transformations.
We therefore propose training sets of N=12 in FTLD-Tau and N=24 in FTLD-TDP for prospective transformations. The transformation enabled us to significantly reduce batch-related differences in duplicate measurements in FTLD-Tau (GM/WM: p<0.001 both) and FTLD-TDP (GM/WM: p<0.001 both), and to decrease the necessary sample size estimated in a power analysis in FTLD-Tau (GM: -40%; WM: -34%) and FTLD-TDP (GM: -20%; WM: -30%). Finally, we tested generalizability of our approach using a second, open-source, image analysis platform and found similar results. We conclude that a small sample of tissue stained in duplicate can be used to account for pre-analytical variability such as staining batch effects, thereby improving methods for future studies.
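
The train-test transformation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and several choices are assumptions that the abstract does not specify: the regression direction (batch 1 %AO regressed on batch 2 %AO), the ICC form (two-way, absolute agreement, single measure), the function names, and the synthetic data, which are invented for illustration only. One random split fits beta and intercept on a training set, applies them to the complementary testing set, and reports the training Rsq and the testing ICC.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = beta * x + intercept.

    Returns (beta, intercept, rsq).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    intercept = my - beta * mx
    rsq = sxy ** 2 / (sxx * syy)
    return beta, intercept, rsq


def icc_agreement(a, b):
    """Two-way, absolute-agreement, single-measure ICC for paired values.

    Assumption: the abstract does not state which ICC form was used.
    """
    n, k = len(a), 2
    grand = mean(a + b)
    row_means = [(p + q) / 2 for p, q in zip(a, b)]
    col_means = [mean(a), mean(b)]
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((v - grand) ** 2 for v in a + b)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def train_test_transform(batch1, batch2, n_train, rng):
    """One random split: fit on the training set, evaluate on the rest.

    Returns (training Rsq, testing ICC after transformation).
    """
    idx = list(range(len(batch1)))
    rng.shuffle(idx)
    train, test = idx[:n_train], idx[n_train:]
    # Assumed direction: express batch 2 values as batch 1 equivalents.
    beta, intercept, rsq = fit_line(
        [batch2[i] for i in train], [batch1[i] for i in train]
    )
    transformed = [beta * batch2[i] + intercept for i in test]
    icc = icc_agreement([batch1[i] for i in test], transformed)
    return rsq, icc


# Synthetic duplicate measurements (hypothetical, not data from the study).
rng = random.Random(0)
batch1 = [rng.uniform(0.0, 10.0) for _ in range(39)]
batch2 = [0.5 * v + 1.0 + rng.gauss(0.0, 0.1) for v in batch1]

# Repeat the random split 100 times and summarize.
results = [train_test_transform(batch1, batch2, 12, rng) for _ in range(100)]
print("mean training Rsq:", mean(r for r, _ in results))
print("share of splits with testing ICC >= 0.8:",
      sum(icc >= 0.8 for _, icc in results) / len(results))
```

Repeating the split many times, as in the final lines, mirrors the abstract's random iterations: the mean training Rsq summarizes the transformation prerequisite, and the share of splits reaching ICC≥0.8 summarizes the transformation outcome.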