AUTHOR=Gurke Robert, Etyemez Semra, Prvulovic David, Thomas Dominique, Fleck Stefanie C., Reif Andreas, Geisslinger Gerd, Lötsch Jörn TITLE=A Data Science-Based Analysis Points at Distinct Patterns of Lipid Mediator Plasma Concentrations in Patients With Dementia JOURNAL=Frontiers in Psychiatry VOLUME=Volume 10 - 2019 YEAR=2019 URL=https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2019.00041 DOI=10.3389/fpsyt.2019.00041 ISSN=1664-0640 ABSTRACT=Based on accumulating evidence of a role of lipid signaling in many pathophysiological processes including psychiatric diseases, a data-driven analysis was designed to evaluate possible biomarker development based on a targeted lipidomics approach covering different candidate mediators. Using unsupervised methods of data structure detection, implemented as hierarchical clustering, emergent self-organizing maps of neuronal networks, and principal component analysis, a cluster structure was found in the input data space comprising serum concentrations of d = 35 different lipid markers of various classes, acquired in n = 94 subjects who were healthy or had a clinical diagnosis of depression, bipolar disorder, ADHD, or dementia. Patients with dementia appeared separated from the other clinical groups, indicating that dementia is associated with a distinct pattern of lipid mediator serum concentrations, possibly providing a basis for a future biomarker. This hypothesis was assessed using supervised machine-learning methods for feature selection and classifier building, implemented as random forests, k-nearest neighbors, and support vector machines, to estimate whether lipid mediators provide sufficient information to establish the diagnosis of dementia at a higher accuracy than by guessing. This succeeded using a set of d = 7 markers comprising GluCerC16:0, Cer24:0, Cer20:0, Cer16:0, Cer24:1, C16 sphinganine, and LacCerC16:0, at an accuracy of 77%.
By contrast, using randomly selected lipid markers reduced the diagnostic accuracy to 65% or less, whereas training the algorithms with randomly permuted data resulted in failure to diagnose dementia, emphasizing that the selected lipid mediators reflect a particular pattern in this disease and possibly qualify as biomarkers.
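
The logic of the two negative controls described above (random markers and permuted data) can be illustrated with a minimal sketch. It is not the authors' pipeline. It uses synthetic Gaussian data in place of the lipid measurements, a plain k-nearest-neighbors classifier written with the Python standard library, and repeated random hold-out splits. The balanced labels, the effect size of 2.0, the split fraction, `k = 5`, and all function names are assumptions made for illustration; only the sizes n = 94, d = 35, and d = 7 come from the abstract.

```python
import random

N_SUBJECTS, N_MARKERS, N_SELECTED = 94, 35, 7  # sizes quoted in the abstract


def make_synthetic_data(rng):
    """Synthetic stand-in for the lipid data: only the first 7 columns carry signal."""
    X, y = [], []
    for i in range(N_SUBJECTS):
        label = i % 2  # hypothetical balanced labels, 1 = "dementia"
        row = [rng.gauss(0.0, 1.0) for _ in range(N_MARKERS)]
        if label == 1:
            for c in range(N_SELECTED):
                row[c] += 2.0  # assumed effect size, chosen for illustration only
        X.append(row)
        y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    votes = sum(train_y[i] for i in order[:k])
    return 1 if 2 * votes > k else 0


def holdout_accuracy(X, y, columns, rng, test_fraction=1 / 3):
    """Accuracy of k-NN on one random train/test split, using only `columns`."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    train_X = [[X[i][c] for c in columns] for i in train]
    train_y = [y[i] for i in train]
    hits = sum(knn_predict(train_X, train_y, [X[i][c] for c in columns]) == y[i]
               for i in test)
    return hits / n_test


def run_controls(seed=0, repeats=50):
    """Mean accuracy for selected markers, random markers, and permuted labels."""
    rng = random.Random(seed)
    X, y = make_synthetic_data(rng)
    selected = list(range(N_SELECTED))
    others = list(range(N_SELECTED, N_MARKERS))
    totals = {"selected": 0.0, "random": 0.0, "permuted": 0.0}
    for _ in range(repeats):
        totals["selected"] += holdout_accuracy(X, y, selected, rng)
        # Control 1: same number of markers, drawn at random from the remainder.
        totals["random"] += holdout_accuracy(X, y, rng.sample(others, N_SELECTED), rng)
        # Control 2: selected markers, but labels shuffled to break any association.
        totals["permuted"] += holdout_accuracy(X, rng.sample(y, len(y)), selected, rng)
    return {name: total / repeats for name, total in totals.items()}


for name, accuracy in run_controls().items():
    print(f"{name:>8}: {accuracy:.2f}")
```

In this toy setting the non-selected columns are pure noise by construction, so both controls should hover near chance. In the study, randomly chosen markers still reached up to 65%, which suggests that real lipid markers outside the selected set carry some residual information.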