REVIEW article

Front. Environ. Sci.

Sec. Biogeochemical Dynamics

Reframing natural organic matter research through compositional data analysis

  • 1. Kobe University, Kobe, Japan

  • 2. Carnegie Institution for Science, Washington, United States

  • 3. Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany

  • 4. Universitat de Girona, Girona, Spain

  • 5. Universitat Politècnica de Catalunya, Barcelona, Spain


The final, formatted version of the article will be published soon.

Abstract

Compositional data (CoDa) are prevalent in environmental research. They represent parts of a whole, such as percentages, proportions, and relative or absolute abundances. They are arrays of positive values whose relevant information lies in the ratios between their components. Standard statistical techniques developed for real-valued random observations often yield spurious results and are therefore unsuitable for CoDa, which have unique geometric properties. CoDa analysis is now widely recognized across research fields ranging from geoscience to social science, with a recent surge in popularity in microbial genomics. However, its adoption remains limited in natural organic matter (NOM) research, even though NOM data from key analytical tools such as mass spectrometry, fluorescence spectroscopy, and nuclear magnetic resonance spectroscopy are all compositional. Given the structural similarity between NOM data and high-throughput sequencing data, for which CoDa analysis has been successfully adopted, we argue that CoDa analysis should also be consistently integrated into NOM research to prevent analytical pitfalls and misleading inferences. A few pioneering studies have applied CoDa analysis to NOM data, and a wide array of useful open-source tools is already available. This paper walks step by step through the application of CoDa analysis to NOM research, using ultrahigh-resolution mass spectrometry data as an illustrative example. The goal is to give the community an overview of CoDa analysis and practical guidance on its use.
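
The idea that compositional information lies in ratios can be illustrated with a minimal sketch of the centered log-ratio (clr) transform, a standard log-ratio transformation in CoDa analysis. The sketch below is not taken from the article: the intensity values are hypothetical, the function names `closure` and `clr` are chosen for illustration, and all parts are assumed to be strictly positive (zeros would need separate treatment before taking logarithms).

```python
import numpy as np

def closure(x):
    """Rescale each row so that its parts sum to 1 (relative abundances)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)

def clr(x):
    """Centered log-ratio: log of each part relative to the row's geometric mean."""
    log_x = np.log(closure(x))  # assumes strictly positive parts
    return log_x - log_x.mean(axis=1, keepdims=True)

# Hypothetical peak intensities: 3 samples (rows) x 4 molecular formulae (columns).
# Sample 2 is sample 1 multiplied by 2, e.g. the same material measured
# at a higher overall signal.
intensities = np.array([
    [120.0, 30.0, 50.0, 200.0],
    [240.0, 60.0, 100.0, 400.0],
    [100.0, 80.0, 20.0, 300.0],
])

clr_values = clr(intensities)

# Scale invariance: samples 1 and 2 carry the same ratios, so their
# clr coordinates coincide even though the raw intensities differ.
same_composition = np.allclose(clr_values[0], clr_values[1])

# Each clr-transformed row sums to zero by construction.
row_sums = clr_values.sum(axis=1)
```

In this sketch, two samples that differ only in total signal map to the same clr vector, which is the sense in which only the ratios between components are informative.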

Keywords

CoDa, EEM, FT-ICR MS, mass spectrometry, NMR, PARAFAC, sum constraint

Received

13 October 2025

Accepted

20 February 2026

Copyright

© 2026 Kida, Merder, Dittmar, Pawlowsky-Glahn and Egozcue. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Morimaru Kida

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
