%A Gomez-Alvarez,Vicente
%A Revetta,Randy P.
%D 2020
%J Frontiers in Microbiology
%C
%F
%G English
%K Nitrification,Bioindicators,Receiver-operating characteristic (ROC),microbiome,machine learning
%Q
%R 10.3389/fmicb.2020.571009
%W
%L
%M
%P
%7
%8 2020-September-16
%9 Original Research
%#
%! Monitoring the DWDS with artificial intelligence
%*
%<
%T Monitoring of Nitrification in Chloraminated Drinking Water Distribution Systems With Microbiome Bioindicators Using Supervised Machine Learning
%U https://www.frontiersin.org/articles/10.3389/fmicb.2020.571009
%V 11
%0 JOURNAL ARTICLE
%@ 1664-302X
%X Many drinking water utilities in the United States using chloramine as a disinfectant treatment in their drinking water distribution systems (DWDS) have experienced nitrification episodes, which detrimentally impact the water quality. Identification of potential predictors of nitrification in DWDS may be used to optimize current nitrification monitoring plans and ultimately help to safeguard drinking water and public health. In this study, we explored the water microbiome from a chloraminated DWDS simulator operated through successive operational schemes of stable and nitrification events and utilized the 16S rRNA gene dataset to generate high-resolution taxonomic profiles for bioindicator discovery. Analysis of the microbiome revealed both an enrichment and depletion of various bacterial populations associated with nitrification. A supervised machine learning approach (naïve Bayes classifier) trained with bioindicator profiles (membership and structure) was used to classify water samples. Performance of each model was examined using the area under the curve (AUC) from the receiver-operating characteristic (ROC) and precision-recall (PR) curves.
The ROC- and PR-AUC gradually increased to 0.778 and 0.775 when genus-level membership (i.e., presence and absence) was used in the model and increased significantly using the structure (i.e., distribution) dataset (AUCs = 1.000, p < 0.01). Community structure significantly improved the predictive ability of the model beyond that of membership only, regardless of the type of data (sequence- or taxonomy-based model) we used to represent the microbiome. In comparison, an ATP-based model (bulk biomass) generated lower AUCs of 0.477 and 0.553 (ROC and PR, respectively), which is equivalent to a random classification. A combination of eight bioindicators was able to correctly classify 85% of instances (nitrification or stable events) with an AUC of 0.825 (sensitivity: 0.729, specificity: 0.894) on a full-scale DWDS test set. An abiotic-based model using total chlorine/NH2Cl and NH3 generated AUCs of 0.740 and 0.861 (ROC and PR, respectively), corresponding to a sensitivity of 0.250 and a specificity of 0.957. The AUCs increased to > 0.946 with the addition of NO2 concentration, which is indicative of nitrification in the DWDS. This research provides evidence of the feasibility of using bioindicators to predict operational failures in the system (e.g., nitrification).