Original Research Article
Variance-Preserving Estimation of Intensity Values Obtained from Omics Experiments
- 1 Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo, Brazil
- 2 Department of Statistics, Institute of Mathematics and Statistics, University of São Paulo, Brazil
Faced with the lack of reliability and reproducibility in omics studies, more careful and robust methods are needed to overcome the existing challenges in multi-omics analysis. In conventional omics data analysis, signal intensity values (denoted by M and A values) are estimated without accounting for pixel-level uncertainties, which may reflect noise and systematic artifacts. For example, intensity values from two-color microarray data are estimated by taking the mean or median of the pixel intensities within each spot and are then subjected to within-slide normalization by LOWESS. Focusing on the estimation and normalization of gene expression profiles, we propose a spot quantification method that takes pixel-level variability into account. In addition, to preserve relevant variation that may be removed by LOWESS normalization with poorly chosen parameters, we propose a parsimonious parameter selection method that considers intrinsic characteristics of microarray data, such as heteroskedasticity. The usefulness of the proposed methods is illustrated by an application to real intestinal metaplasia data. Compared with conventional approaches, the analysis is more robust and conservative, identifying fewer but more reliable differentially expressed genes. At the same time, preserving variability allowed the identification of new differentially expressed genes. Using the proposed approach, we identified differentially expressed genes involved in pathways in cancer and confirmed some molecular markers already reported in the literature.
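The conventional pipeline described above (per-spot mean or median, then within-slide LOWESS) can be sketched as follows. This is a minimal illustration, not the method proposed in the article. It assumes the usual two-color convention M = log2(R/G) and A = (1/2) log2(R·G), which the abstract does not spell out. The function names (`spot_intensity`, `ma_values`, `local_linear_fit`, `normalize_within_slide`) and the default `span` are hypothetical. The smoother is a simplified single-pass local linear fit with tricube weights and omits the robustness iterations of full LOWESS.

```python
import math
import statistics


def spot_intensity(pixels, use_median=True):
    """Conventional spot summary: median (or mean) of the pixel intensities.

    Pixel-level variability is discarded here, which is the limitation
    the article sets out to address.
    """
    return statistics.median(pixels) if use_median else statistics.fmean(pixels)


def ma_values(red, green):
    """Assumed standard definitions: M = log2(R/G), A = 0.5 * log2(R*G)."""
    m = math.log2(red / green)
    a = 0.5 * math.log2(red * green)
    return m, a


def local_linear_fit(a_vals, m_vals, span=0.4):
    """Single-pass local linear regression of M on A with tricube weights.

    Simplified stand-in for LOWESS (no robustness iterations).
    `span` is the fraction of points used in each local fit.
    """
    n = len(a_vals)
    k = max(2, math.ceil(span * n))
    fitted = []
    for x0 in a_vals:
        dists = sorted(abs(x - x0) for x in a_vals)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1.0 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in a_vals]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, a_vals)) / sw
        my = sum(wi * y for wi, y in zip(w, m_vals)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, a_vals))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, a_vals, m_vals))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def normalize_within_slide(a_vals, m_vals, span=0.4):
    """Subtract the intensity-dependent trend from the M values."""
    trend = local_linear_fit(a_vals, m_vals, span)
    return [m - t for m, t in zip(m_vals, trend)]


if __name__ == "__main__":
    # Toy pixel data for one spot in each channel (made-up numbers).
    red = spot_intensity([980, 1024, 1100, 4000])
    green = spot_intensity([250, 256, 260, 270])
    print(ma_values(red, green))
```

The choice of `span` controls how much of the M-versus-A trend is treated as artifact: a small value follows the data closely and can remove genuine variation, which is the concern that motivates the parameter selection method in the article.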
Keywords: Delta method, Pixel-level uncertainty, Spot quantification, Optimal LOWESS normalization, Two-color microarray, Variability preservation, Parameter selection
Received: 05 Apr 2019;
Accepted: 16 Aug 2019.
Copyright: © 2019 Ribeiro, Soler and Hirata Jr. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Dr. Adèle H. Ribeiro, Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo, São Paulo, 05508-090, São Paulo, Brazil, firstname.lastname@example.org