Original Research Article. Provisionally accepted; the full text will be published soon.

Front. Immunol. | doi: 10.3389/fimmu.2019.02009

A computational pipeline for the diagnosis of CVID patients

Annelies Emmaneel1,2*, Delfien Bogaert2,3,4,5,6,7, Sofie Van Gassen1,2, Simon Tavernier2,3,4,7, Melissa Dullaers2,3,4,7,8, Filomeen Haerynck2,3,6,7 and Yvan Saeys1,2
  • 1Department of Applied Mathematics, Computer Science and Statistics, Faculty of Sciences, Ghent University, Belgium
  • 2VIB-UGent Center for Inflammation Research (IRC), Belgium
  • 3Ghent Clinical Immunology, Ghent University Hospital, Belgium
  • 4Ghent University Hospital, Belgium
  • 5Center for Medical Genetics, Ghent University Hospital, University of Ghent, Belgium
  • 6Department of Pediatrics, Division of Pediatric Immunology and Pulmonology, Ghent University Hospital, Belgium
  • 7VIB-UGent Center for Inflammation Research (IRC), Belgium
  • 8Department of Internal Medicine, Ghent University, Belgium

Common variable immunodeficiency (CVID) is one of the most frequently diagnosed primary antibody deficiencies (PADs), a group of disorders characterized by a decrease in one or more immunoglobulin (sub)classes and/or impaired antibody responses caused by inborn defects in B cells in the absence of other major immune defects. CVID patients suffer from recurrent infections and disease-related, noninfectious complications such as autoimmune manifestations, lymphoproliferation and malignancies. A timely diagnosis is essential for optimal follow-up and treatment. However, CVID is by definition a diagnosis of exclusion; it therefore covers a heterogeneous patient population, which makes a definite diagnosis difficult to establish. To aid the diagnosis of CVID patients and distinguish them from patients with other PADs, we developed a machine learning pipeline that performs automated diagnosis based on flow cytometric immunophenotyping. Using this pipeline, we analyzed the immunophenotypic profile in a pediatric and adult cohort of 28 patients with CVID, 23 patients with idiopathic primary hypogammaglobulinemia, 21 patients with IgG subclass deficiency, 6 patients with isolated IgA deficiency, 1 patient with isolated IgM deficiency and 100 unrelated healthy controls. Flow cytometry analysis is traditionally done by manual identification of the cell populations of interest. However, this approach has severe limitations, including the subjectivity of manual gating and a bias towards known populations. To overcome these limitations, we propose an automated computational flow cytometry pipeline that successfully distinguishes CVID phenotypes from other PADs and healthy controls.
In contrast to the traditional manual analysis, our pipeline is fully automated: it performs quality control and data pre-processing, identifies cell populations (gating), and derives features from these populations to build a machine learning classifier that distinguishes CVID from other PADs and healthy controls. This results in a more reproducible flow cytometry analysis and improves the diagnosis compared to manual analysis: our pipelines achieve on average a balanced accuracy score of 0.93 (± 0.07), whereas using the manually extracted populations yields an average balanced accuracy score of 0.72 (± 0.23).
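
The abstract reports balanced accuracy rather than plain accuracy. The short sketch below illustrates why that choice matters for a cohort this imbalanced. It assumes a binary "CVID versus everything else" framing and a hypothetical trivial classifier that never predicts CVID; neither is stated in the abstract, and the function names are illustrative only. Balanced accuracy is computed here as the mean of the per-class recalls, which is its usual definition.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class weighs equally regardless of size."""
    recalls = []
    for c in sorted(set(y_true)):
        n_c = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / n_c)
    return sum(recalls) / len(recalls)


# Cohort sizes taken from the abstract: 28 CVID patients versus
# 23 + 21 + 6 + 1 + 100 = 151 other PAD patients and healthy controls.
# The binary framing itself is an assumption made for this illustration.
y_true = ["CVID"] * 28 + ["other"] * 151

# Hypothetical trivial classifier (not the authors' method): never predicts CVID.
y_pred = ["other"] * 179

plain = accuracy(y_true, y_pred)              # 151 / 179
balanced = balanced_accuracy(y_true, y_pred)  # (0.0 + 1.0) / 2

print(f"accuracy:          {plain:.3f}")     # 0.844
print(f"balanced accuracy: {balanced:.3f}")  # 0.500
```

Under these assumptions, a classifier that misses every CVID patient still scores about 0.84 on plain accuracy but only 0.5 on balanced accuracy, which is why the balanced score is the more informative basis for comparing 0.93 with 0.72.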

Keywords: CVID, flow cytometry, FlowSOM, computational pipeline, PAD

Received: 15 Mar 2019; Accepted: 08 Aug 2019.

Edited by:

Tomas Kalina, Charles University, Czechia

Reviewed by:

Klaus Warnatz, University of Freiburg, Germany
Jan Stuchly, Charles University, Czechia  

Copyright: © 2019 Emmaneel, Bogaert, Van Gassen, Tavernier, Dullaers, Haerynck and Saeys. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: PhD. Annelies Emmaneel, Department of Applied Mathematics, Computer Science and Statistics, Faculty of Sciences, Ghent University, Ghent, Belgium, annelies.emmaneel@ugent.be