
Original Research Article | Provisionally accepted. The full text will be published soon.

Front. Med. | doi: 10.3389/fmed.2019.00193

A High-Performance System for Robust Stain Normalization of Whole-Slide Images in Histopathology

Andreea Anghel1*, Milos Stanisavljevic1, Sonali Andani1, Nikolaos Papandreou1, Jan Hendrik Rueschoff2, Peter Wild3, Maria Gabrani1 and Haralampos Pozidis1
  • 1IBM Research - Zurich, Switzerland
  • 2Institute of Pathology and Molecular Pathology, University Hospital Zurich, Switzerland
  • 3University Hospital Frankfurt, Germany

Stain normalization is an important processing task for computer-aided diagnosis (CAD) systems in modern digital pathology. This task reduces the color and intensity variations present in stained images from different laboratories. Consequently, stain normalization typically increases the prediction accuracy of CAD systems. However, there are computational challenges that this normalization step must overcome, especially for real-time applications: the memory and run-time bottlenecks associated with the processing of images in high resolution, e.g., 40X. Moreover, stain normalization can be sensitive to the quality of the input images, e.g., when they contain stain spots or dirt. In this case, the algorithm may fail to accurately estimate the stain vectors.
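The unsupervised stain-vector estimation referred to below is not reproduced in this abstract; as a minimal, Macenko-style sketch (the function name, parameter names, and default values are our own illustrative choices, not the paper's implementation), the core steps are an optical-density transform followed by a robust estimation of the two stain directions:

```python
import numpy as np

def estimate_stain_vectors(rgb, io=255.0, beta=0.15, alpha=1.0):
    """Unsupervised stain-vector estimation in optical-density (OD) space.

    Minimal Macenko-style sketch; parameters (io, beta, alpha) and their
    defaults are illustrative assumptions, not the paper's settings.
    rgb: uint8 array of shape (H, W, 3).
    Returns a 3x2 matrix whose columns are the two estimated stain vectors.
    """
    # Beer-Lambert law: convert RGB intensities to optical density.
    od = -np.log((rgb.reshape(-1, 3).astype(np.float64) + 1.0) / io)

    # Discard near-transparent (background) pixels with low OD.
    od = od[np.all(od > beta, axis=1)]

    # Principal plane of the OD point cloud (two largest eigenvectors).
    _, eigvecs = np.linalg.eigh(np.cov(od.T))
    plane = eigvecs[:, 1:3]
    proj = od @ plane

    # Robust extreme directions: angles at the alpha / (100 - alpha) percentiles.
    phi = np.arctan2(proj[:, 1], proj[:, 0])
    phi_min, phi_max = np.percentile(phi, [alpha, 100 - alpha])
    v1 = plane @ np.array([np.cos(phi_min), np.sin(phi_min)])
    v2 = plane @ np.array([np.cos(phi_max), np.sin(phi_max)])

    stains = np.stack([v1, v2], axis=1)
    return stains / np.linalg.norm(stains, axis=0)
```

If artifacts such as stain spots or dirt dominate the OD cloud, the extreme-angle step above can latch onto them, which is the failure mode the abstract refers to.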

We present a high-performance system for stain normalization using a state-of-the-art unsupervised method based on stain-vector estimation. Using a highly optimized normalization engine, our architecture enables high-speed and large-scale processing of high-resolution whole-slide images. This optimized engine integrates an automated thresholding technique to determine the useful pixels and uses a novel pixel-sampling method that significantly reduces the processing time of the normalization algorithm. We demonstrate the performance of our architecture using measurements from images of different sizes and scanner formats that belong to four different datasets. The results show that our optimizations achieve up to 58x speedup compared to a baseline implementation. We also prove the scalability of our system by showing that the processing time scales almost linearly with the number of tissue pixels present in the image.
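As a rough illustration of the two optimizations described above, the following sketch combines a simple optical-density threshold for background removal with uniform random sampling of tissue pixels; the threshold value, sample size, and uniform-sampling strategy are assumptions for illustration only, not the paper's automated-thresholding or pixel-sampling technique:

```python
import numpy as np

def sample_tissue_pixels(rgb, od_threshold=0.15, sample_size=100_000, seed=0):
    """Select a subset of tissue pixels for stain-vector estimation.

    Illustrative sketch: threshold away non-tissue (background) pixels,
    then estimate stain vectors on a random sample instead of all pixels.
    All parameter values here are assumptions, not the paper's.
    """
    pixels = rgb.reshape(-1, 3).astype(np.float64)

    # Optical density per pixel; background (white) pixels have near-zero OD.
    od = -np.log((pixels + 1.0) / 255.0)
    tissue_idx = np.flatnonzero(np.all(od > od_threshold, axis=1))

    # Estimating stain vectors on a bounded random subset keeps the cost of
    # that step independent of slide size.
    rng = np.random.default_rng(seed)
    if tissue_idx.size > sample_size:
        tissue_idx = rng.choice(tissue_idx, size=sample_size, replace=False)
    return pixels[tissue_idx]
```

Restricting the estimation to a bounded sample of tissue pixels is one plausible way the stain-vector step can be made cheap even at 40X resolution, while the reported near-linear scaling in tissue pixels would then stem mainly from applying the normalization to the full image.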

Furthermore, we show that the output of the normalization algorithm can be adversely affected when the input images include artifacts. To address this issue, we enhance the stain normalization pipeline by introducing a parameter cross-checking technique that automatically detects the distortion of the algorithm’s critical parameters. To assess the robustness of the proposed method, we employ a machine learning (ML) pipeline that classifies images for detection of prostate cancer. The results show that the enhanced normalization algorithm increases the classification accuracy of the ML pipeline in the presence of poor-quality input images. For an exemplary ML pipeline, our new method increases the accuracy on an unseen dataset from 0.79 to 0.87.
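A parameter cross-check of this kind could, for example, compare the estimated stain vectors against reference values and fall back when the deviation is too large; the cosine-similarity criterion, the tolerance, and the reference H&E vectors below are illustrative assumptions standing in for the paper's actual checking rules:

```python
import numpy as np

# Reference H&E stain vectors (columns: hematoxylin, eosin). The values are
# common literature defaults used here for illustration, not the paper's.
REFERENCE_STAINS = np.array([[0.650, 0.072],
                             [0.704, 0.990],
                             [0.286, 0.105]])

def cross_check_stains(estimated, reference=REFERENCE_STAINS, min_cosine=0.9):
    """Fall back to reference stain vectors when the estimate looks distorted.

    Sketch of a cross-checking idea: the cosine-similarity test and the 0.9
    tolerance are assumptions, not the paper's parameter-checking scheme.
    Returns (stain_matrix, estimate_was_accepted).
    """
    est = estimated / np.linalg.norm(estimated, axis=0)
    ref = reference / np.linalg.norm(reference, axis=0)

    # Column-wise cosine similarity between estimated and reference vectors.
    cos = np.sum(est * ref, axis=0)
    if np.any(cos < min_cosine):
        # Artifacts (stain spots, dirt) likely skewed the estimate; use the
        # reference vectors instead of the distorted ones.
        return ref, False
    return est, True
```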

Keywords: stain normalization, whole-slide image analysis, large-scale image analysis, tumor detection, convolutional neural network, digital pathology imaging

Received: 05 May 2019; Accepted: 15 Aug 2019.

Copyright: © 2019 Anghel, Stanisavljevic, Andani, Papandreou, Rueschoff, Wild, Gabrani and Pozidis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Dr. Andreea Anghel, IBM Research - Zurich, Rüschlikon, Switzerland, aan@zurich.ibm.com