Challenges and perspectives in MS-based omics approaches for ecotoxicology studies: An insight on Gammarids sentinel amphipods

The aquatic environment is one of the most complex biosystems, as organisms at all trophic levels may be exposed to a multitude of pollutants. A major goal of ecotoxicology is to investigate the impact of toxic pollutants on ecosystems through the study of sentinel organisms. Over the past decades, mass spectrometry (MS)-based omics approaches have been extended to sentinel species under both laboratory and field exposure conditions. Single-omics approaches enable the discovery of biomarkers mirroring the health status of an organism. However, because they cover only a restricted part of the molecular cascade, they provide only a partial understanding of complex ecotoxicological effects. In contrast, a more complete understanding of ecotoxicity pathways can be reached through multi-omics approaches. In this perspective, we provide a state of the art and a critical evaluation of further developments in MS-based single- and multi-omics studies in aquatic ecotoxicology. As a case example, we review the literature on Gammarids, freshwater amphipods that are non-model sentinel organisms sensitive to pollutants and environmental changes and crucial species for downstream ecosystems.


Introduction
Since the advent of industrialization, human activities have increased as a result of population growth, leading to the release of novel entities into the environment (Persson et al., 2022). Water, an essential substance for living organisms on Earth, is also one of the most heavily polluted environments. In the biosphere, freshwater ecosystems represent one of the most delicate compartments, hosting around 10% of the animal kingdom (Balian et al., 2008; Brondizio et al., 2019). The toxicological effects on freshwater organisms caused by exposure to pollutants may include altered reproduction, changes in feeding habits, physiological or morphological anomalies, migration, death, and extinction (Ahmed et al., 2022). Adverse effects on living organisms are related to modifications at different biological levels, including modulation of gene expression, protein synthesis or metabolic pathways. During the past decades, ecotoxicological research has relied on robust analytical techniques and modern bioinformatics approaches for the discovery of genes, proteins, metabolites, and lipids involved in stress responses. Among these technologies, mass spectrometry (MS)-based omics represents a gold standard for both structural and quantitative analysis of thousands of compounds down to ultratrace levels (Girolamo et al., 2013; Groh and Suter, 2020). Research in past years has mainly relied on single omics (proteomics, lipidomics, or metabolomics, to name a few), leading to a limited view of the investigated system. More recently, multi-omics approaches, based on the integration of multiple omics datasets acquired from the same sample, have been implemented in ecotoxicology studies, leading to a more effective elucidation of complex processes (Nam et al., 2022; Faugere et al., 2023).

State-of-the-art on omics approaches for ecotoxicological research on Gammarids
Gammarids represent the most abundant macroinvertebrate species, in terms of biomass, in freshwater environments. More importantly, they are keystone species: they are involved in the detritus cycle, provide prey for secondary consumers, and take part in the food web. Furthermore, Gammarids are very sensitive to diverse chemical stressors such as metals, organic compounds, or oil spills, and may bioaccumulate toxic compounds. For almost a century, Gammarids have been used as bioindicator species in freshwater ecotoxicological assessment. Considering these features and drawing on our own experience in the field, we focus on Gammarids to present the advancements and perspectives on the use of MS-based omics in ecotoxicology research.

Proteomics approaches
Proteomics, i.e., the large-scale study of proteins expressed by an organism, has been widely used in ecotoxicology for biomarker discovery. Protein sequence identification has historically been performed through shotgun approaches in which peptide proxies, obtained after tryptic digestion of proteins, are analyzed through liquid chromatography tandem mass spectrometry (LC-MS/MS). The first studies performed on model organisms annotated proteins by comparing experimental fragmentation spectra with in silico fragmentation spectra of peptides derived from nucleotide sequences. Proteogenomics, based on high-resolution MS (i.e., shotgun proteomics), has enabled protein identification in the absence of complete genomic data, using unfinished genomes or species-specific RNA sequences, including for non-model organisms (Armengaud et al., 2014). In the case of Gammarus fossarum, proteogenomics has enabled the identification of 1,873 proteins involved in reproductive pathways, the characterization of the female core-proteome (Trapp et al., 2016), and the identification of proteins related to endocrine perturbation caused by exposure to xenobiotics (Trapp et al., 2015; Koenig et al., 2021). Even though proteogenomics has gained ground in ecotoxicology research, the technique remains costly and suffers from poor reproducibility and reduced sensitivity (Aggarwal et al., 2022).
For quantification purposes, improvements in both sensitivity and selectivity have been reached using targeted MS employing the multiple reaction monitoring (MRM) acquisition mode (Liebler and Zimmerman, 2013). This MS-based technique relies on low-resolution mass spectrometers (triple quadrupole or hybrid quadrupole-linear ion trap), which enable three stages of analysis: precursor ion selection, fragmentation, and fragment ion selection. Each precursor/fragment pair defines an MRM transition. Only ions satisfying both m/z criteria (precursor and fragment ions) are detected, which increases specificity and signal-to-noise ratio in the analysis of complex samples. Ecotoxicological research has taken advantage of MRM for the discovery and quantification of vitellogenin-related potential ecotoxicological biomarkers in Gammarus fossarum (Simon et al., 2010). However, classic MRM mode limits the number of monitored transitions, as a compromise must be found between the dwell time per transition and the total duty cycle (Rodriguez-Aller et al., 2013). While the dwell time corresponds to the time spent acquiring a single MRM transition, the duty cycle reflects the share of the acquisition time spent monitoring a given analyte. Importantly, the higher the duty cycle, the greater the number of points acquired across a chromatographic peak, which results in better-quality data. Conversely, increasing the number of transitions monitored in a single analysis (multiplexing ability) may lead to poor reproducibility due to lower duty cycles, especially when coupled with Ultra High-Pressure Liquid Chromatography (UHPLC), which produces narrower peaks. The MRM³ approach increases sensitivity and selectivity by monitoring, in addition to the classic transitions, second-generation product ions, but it does not improve the multiplexing ability (Jaffuel et al., 2013). A first step towards increased multiplexing has been achieved with the scheduled MRM algorithm. 
In this operational mode, narrow retention time windows are set up to acquire the MRM transitions only during the expected chromatographic elution of each analyte (Bertsch et al., 2010). Although prior knowledge of the retention times is required, algorithms adapt dwell times to maintain optimal duty cycles. This allows the monitoring of hundreds of compounds in a single analysis without sacrificing signal-to-noise ratio and reproducibility (Leprêtre et al., 2022). This multiplexing approach relies heavily on the stability of measured retention times and suffers from matrix effects, which can cause time-window shifts. Beyond the tedious complications in terms of consumables, instrument time and operator work, this may compromise both intra-laboratory reproducibility and method transferability among different analytical platforms. More recently, Scout-MRM (renamed scout-triggered MRM or stMRM) has enabled a more reliable multiplexing method for both quantification and identification (Rougemont et al., 2017; Ayciriex et al., 2020; Salvador et al., 2020). Monitoring of concurrent MRM transitions is triggered by marker transitions of known/exogenous compounds (scout compounds), instead of retention time windows. When spiked into the biological sample, scout compounds ideally distribute uniformly along the chromatogram. They allow the concomitant MRM transitions of analytes to be monitored during an acquisition window spanning from the first scout triggering transition to a second one exceeding a chosen intensity threshold. Faugere et al. (2020) optimized and applied, for the first time in aquatic ecotoxicology, a method based on Scout-MRM mode for the broad and multiplexed analysis of proteins in adult Gammarids. Based on the preliminary optimization of 44 labelled peptides (Gouveia et al., 2017), the Scout-MRM method enabled the multiplexed quantification and identification of 157 proteins. 
The study is a good example in which low-resolution mass spectrometers serve both for robust quantification (increased signal-to-noise ratio and specificity) and for identification, based on the simultaneous monitoring of 4 MRM transitions per analyte. The relevance of the optimized method was demonstrated through its application to adult Gammarids exposed to pesticide contamination, and to the modulation of proteins involved in key physiological pathways.
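The trade-off between multiplexing and the number of points acquired per chromatographic peak can be illustrated with a short calculation. The Python sketch below is only a simplified model under stated assumptions: every transition uses the same dwell time, a fixed inter-transition pause is added, and all numerical values (600 transitions, 10 ms dwell, 3 ms pause, 6 s peak width, 60 s scheduling windows) are hypothetical and are not taken from the cited studies. It compares an unscheduled acquisition, in which all transitions are monitored during the whole run, with a scheduled one, in which only the transitions whose retention time windows overlap are monitored concurrently.

```python
def cycle_time_s(n_concurrent, dwell_s, pause_s=0.003):
    """Time needed to loop once over all concurrently monitored transitions."""
    return n_concurrent * (dwell_s + pause_s)


def points_per_peak(peak_width_s, cycle_s):
    """Approximate number of data points acquired across one chromatographic peak."""
    return peak_width_s / cycle_s


def max_concurrent(expected_rts_s, window_s):
    """Largest number of scheduled windows that are open at the same time."""
    events = []
    for rt in expected_rts_s:
        events.append((rt - window_s / 2, 1))   # window opens
        events.append((rt + window_s / 2, -1))  # window closes
    events.sort()  # at equal times, closings (-1) are processed before openings (+1)
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical method: 600 transitions eluting every 2 s along the gradient.
expected_rts = [2.0 * i for i in range(600)]
dwell, peak_width, window = 0.010, 6.0, 60.0

unscheduled_points = points_per_peak(peak_width, cycle_time_s(len(expected_rts), dwell))
n_overlap = max_concurrent(expected_rts, window)
scheduled_points = points_per_peak(peak_width, cycle_time_s(n_overlap, dwell))

print(f"unscheduled: {unscheduled_points:.2f} points per peak")
print(f"scheduled:   {n_overlap} concurrent transitions, {scheduled_points:.2f} points per peak")
```

In this toy setting the unscheduled method yields less than one point per peak, whereas scheduling keeps the number of concurrent transitions low enough to sample each peak more than ten times. The same model also suggests why a retention time shift larger than half the window width would cause an analyte to be missed, which is the weakness that scout-triggered acquisition is intended to address.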

Metabolomics and lipidomics approaches
Like proteomics, metabolomics and lipidomics give in-depth access to stress responses, here at the level of metabolites and lipids in exposed organisms. Importantly, metabolites and lipids are extremely diverse chemical compounds, with different polarities and a great structural diversity that results in many isobaric and isomeric forms. MS-based metabolomics and lipidomics present many advantages over historical NMR-based analyses, principally the greater sensitivity and the possibility of coupling with separation techniques (e.g., liquid chromatography) for the detection and characterization of compounds in complex biological samples. Moreover, MS enables faster analyses and reduces the required sample quantity (Letertre et al., 2021). While MRM-based techniques are mainly used for quantitative analysis, the advent of high-resolution mass spectrometry (HRMS) has enabled exact mass measurements of ions and the acquisition of high-accuracy fragmentation mass spectra for the identification of potential biomarkers (Rampler et al., 2020; Heiles, 2021). Coupling separation techniques (liquid or gas chromatography) to MS is routinely used either in a targeted approach, to investigate a known subset of compounds, or in a non-targeted approach, to obtain unbiased information on the whole range of analytes present in the examined sample (Bletsou et al., 2015). Modern MS platforms providing high resolving power, mass accuracy and sensitivity have opened the way to alternative approaches, such as shotgun metabolomics, lipidomics, and glycomics (Hu et al., 2020; Bui et al., 2022). These techniques enable fast characterization and quantification of metabolites and lipids in crude extracts ionized through direct injection into an electrospray (ESI) source and resolved using high-resolution analyzers (Han and Gross, 2005; García-Sevillano et al., 2015). 
In this respect, these terms differ from the expression "shotgun proteomics", which indicates bottom-up techniques based on the digestion of crude protein extracts and analysis through LC-MS/MS (Wu and Maccoss, 2002). While proteomics has been widely applied to study stress responses in Gammarids, metabolomics and lipidomics applications remain limited. Targeted metabolomics has been applied to measure changes in the concentration of 29 selected metabolites in G. pulex exposed to xenobiotics (Gómez-Canela et al., 2016), while the combination of non-targeted metabolomics and chemometrics has highlighted putative metabolic biomarkers in G. fossarum males and females exposed to pharmaceuticals (Bonnefoy et al., 2019). Similar approaches have been applied to the discovery of lipid biomarkers involved in the perturbation of the reproductive function of female G. fossarum caused by exposure to fenoxycarb, a carbamate insecticide (Arambourou et al., 2018). Although these studies are relatively recent, biomarker discovery has not always been followed by identification or structural characterization. A single study performed on G. pulex provided tentative metabolite identities through molecular formula assignment from exact mass measurements and KEGG database screening (Sheikholeslami et al., 2020).
"Old is the new black": what is the contribution of other MS and bioinformatics tools in ecotoxicology studies?
The literature reviewed above focuses on Gammarid species. It is only a case study, and it reflects the need for ecotoxicology to move towards high-throughput analytical methods that are well established in other application areas such as phytochemistry, pharmacology or medicine. As already mentioned, identifying compounds in the absence of isolated and characterized reference compounds is a challenge in omics sciences. Data dependent analysis (DDA) and data independent analysis (DIA) are nowadays routinely used to acquire high-resolution fragmentation spectra for thousands of compounds in a single measurement (Fernández-Costa et al., 2020). While DDA-MS performs fragmentation only for selected precursor ions above a certain intensity threshold defined by the operator, DIA-MS enables an indiscriminate fragmentation of all precursor ions. Sequential window acquisition of all theoretical mass spectra (SWATH-MS) is a modified version of DIA-MS in which ion fragmentation is triggered by specific isolation windows (Anjo et al., 2017). In this acquisition mode, all precursors which fall within the detected m/z range are fragmented without prior detection, ensuring in-depth acquisition for a broad range of compounds in complex samples. The method has been specifically optimized for the identification and quantification of non-labelled proteins, but it has recently shown potential applications in metabolomics and lipidomics (Raetz et al., 2020). In all cases, the annotation of the compounds depends strictly on the possibility of comparing them with reference fragmentation spectra. Molecular networking (available on the GNPS, MetGem and MS-DIAL platforms) is a bioinformatic tool that groups structurally related compounds which fragment similarly (Tsugawa et al., 2015; Olivon et al., 2018; Nothias et al., 2020). These tools allow faster compound annotation by matching against database fragmentation spectra.
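The difference between the two acquisition strategies described above can be sketched in a few lines of Python. This is only an illustrative model, not vendor acquisition software: the survey scan values are hypothetical, the `top_n` limit is an assumption commonly associated with DDA methods rather than something stated in this article, and the fixed-width isolation windows are a simplification of SWATH-type schemes.

```python
def dda_select(precursors, intensity_threshold, top_n):
    """DDA-like selection: keep the most intense precursors above a threshold.

    precursors is a list of (m/z, intensity) tuples from one survey scan.
    """
    eligible = [p for p in precursors if p[1] >= intensity_threshold]
    eligible.sort(key=lambda p: p[1], reverse=True)
    return eligible[:top_n]


def isolation_windows(mz_start, mz_end, width):
    """Fixed-width, contiguous isolation windows covering [mz_start, mz_end)."""
    windows = []
    low = mz_start
    while low < mz_end:
        windows.append((low, min(low + width, mz_end)))
        low += width
    return windows


def dia_assign(precursors, windows):
    """DIA-like acquisition: every precursor in a window is co-fragmented."""
    assignment = {w: [] for w in windows}
    for mz, intensity in precursors:
        for low, high in windows:
            if low <= mz < high:
                assignment[(low, high)].append((mz, intensity))
                break
    return assignment


# Hypothetical survey scan: (m/z, intensity)
scan = [(401.2, 5e5), (455.7, 2e3), (512.3, 8e4), (618.9, 1.5e3), (733.4, 3e5)]

dda_hits = dda_select(scan, intensity_threshold=1e4, top_n=2)
dia_hits = dia_assign(scan, isolation_windows(400, 800, 100))

print("DDA fragments:", [mz for mz, _ in dda_hits])
for window, members in dia_hits.items():
    print("DIA window", window, "->", [mz for mz, _ in members])
```

In this toy scan, the DDA-like selection fragments only the two most intense precursors, whereas the DIA-like scheme fragments all five, at the cost of mixed fragmentation spectra whenever several precursors share a window.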
The use of ultrahigh-resolution analyzers such as Fourier Transform Ion Cyclotron Resonance (FT-ICR) and Orbitrap can offer unrivaled resolution (~10^6 at m/z 200 for FT-ICR in direct infusion mode), mass accuracy typically below 1 ppm, high sensitivity and good dynamic range. This enables the detection of thousands of peaks in a run of a few minutes, the resolution of isobars, and access to the fine isotopic peak distribution (Hernández et al., 2012). Unique formula assignment for thousands of peaks allows global molecular profiling through the use of van Krevelen diagrams (Laszakovits and MacKay, 2021).
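As a minimal sketch of how an assigned formula is checked and then placed on a van Krevelen diagram, the Python example below computes a neutral monoisotopic mass, the mass error in ppm, and the H/C and O/C ratios that serve as diagram coordinates. The assumptions are deliberately simple: formulas contain only C, H, N, O, P and S, masses are neutral (adducts and the electron mass are ignored), and the "measured" value is hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
    "S": 31.97207117,
}


def parse_formula(formula):
    """Turn a simple formula such as 'C6H12O6' into {'C': 6, 'H': 12, 'O': 6}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Neutral monoisotopic mass of the formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def van_krevelen(formula):
    """Return the (H/C, O/C) coordinates used in a van Krevelen diagram."""
    counts = parse_formula(formula)
    carbon = counts["C"]
    return counts.get("H", 0) / carbon, counts.get("O", 0) / carbon


theoretical = monoisotopic_mass("C6H12O6")  # a hexose such as glucose
measured = 180.06350                        # hypothetical measured neutral mass
print(f"theoretical mass: {theoretical:.5f}")
print(f"mass error: {ppm_error(measured, theoretical):.2f} ppm")
print("hexose (H/C, O/C):", van_krevelen("C6H12O6"))
print("palmitic acid (H/C, O/C):", van_krevelen("C16H32O2"))
```

Plotting O/C against H/C for all assigned formulas separates broad compound classes: in this example the sugar and the fatty acid share the same H/C ratio but lie far apart along the O/C axis.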
In addition, ion mobility mass spectrometry (IMS) enables separation of isomers based on their three-dimensional conformation in the gas phase, expressed by their collision cross section (CCS). While these MS tools are nowadays routinely applied in metabolomics for the characterization of natural compounds in plants (Calabrese et al., 2022, 2023) or human biofluids (Zhu et al., 2021), applications in ecotoxicological research remain very limited (Taylor et al., 2009; Duarte et al., 2022) and are totally unexplored in Gammarids.
In all the aforementioned techniques, information on the spatial distribution of compounds of interest in biological tissue is missing. MS imaging (MSI) is an emerging field in life science allowing label-free molecular imaging of biological tissue sections. With this approach, the spatial distribution of molecules within a tissue or whole organism can be obtained, giving another level of omics information and hints on metabolic pathways (Amstalden van Hove et al., 2010). Although this technique has been widely used in biology, MS imaging is relatively new in the ecotoxicology field (Fu et al., 2021). Nano-Secondary Ion Mass Spectrometry (nanoSIMS) imaging has been used to study the selective distribution of toxic silver and gold nanoparticles in the cuticle and the gut area, respectively, of G. fossarum tissue sections (Mehennaoui et al., 2018), and Time-of-Flight Secondary Ion Mass Spectrometry (ToF-SIMS) has been employed to assess the dynamic changes in lipid composition during the maturation of oocytes in G. fossarum (Fu et al., 2021). IMS has also been integrated into MSI, offering in situ separation and mapping of isobaric and isomeric lipids in the muscle and oocytes of G. fossarum females and revealing differential lipid composition and abundance in the different organs (Fu et al., 2020).
The integration of all the aforementioned approaches can improve both high-throughput MS-based single-layer and multi-layer omics approaches (Figure 1).

FIGURE 1
Summary of MS-based techniques for a highly multiplexed, high-throughput multi-omics approach applied to freshwater sentinel organisms such as Gammarids in ecotoxicology.

Frontiers in Analytical Science, frontiersin.org

Challenges and advancements on multiomics approach for ecotoxicological research
Omics approaches pave the way for an initial understanding of ecotoxicological adverse responses but lack the ability to predict the complex mechanisms at play in stressed organisms. There is nowadays a growing awareness of the need to apply multi-omics approaches, based on multi-layer analysis, to obtain high-value integrative information. Although multi-omics approaches are increasingly widespread, some scientific challenges still have to be faced. Aside from large investments in terms of diverse analytical platforms, qualified operators, time and resources, multi-omics requires optimized analytical protocols and modern tools for data analysis and integration. For example, an effort should be made to reduce biases among the different omics layers upstream of the analysis. In ecotoxicology, different organs or organisms are used to obtain pertinent samples specific to each single-omics layer, which introduces biological variability into the analysis. In addition, analytical variability may arise from multiple sample extraction steps and instrumental runs. Variability can thus be reduced by obtaining multi-omics information from a single sample and, in the ideal case, from a single organism, down to a single cell (Li et al., 2021). Under these conditions, results from the different layers can be correlated more reliably, for a better understanding of ecotoxicological effects. Recently, Faugere et al. developed a procedure based on a liquid-liquid extraction step (MTBE/MeOH) for the simultaneous extraction of proteins, lipids, and metabolites from a single sample of G. fossarum, which improved compound recovery and repeatability compared with the classic preparation of independent sample fractions (Faugere et al., 2023). In the study, MS-based multi-omics in combination with multivariate data analysis revealed specific proteomic, metabolomic and lipidomic signatures of the different female reproductive stages. Apart from this example, few multi-omics developments for ecotoxicology applications have been published, and most of the present literature on MS-based multi-omics approaches in freshwater ecotoxicology is restricted to the study of model organisms, such as zebrafish (Danio rerio) or organisms belonging to the Daphniidae family (Huang et al., 2017; Wang et al., 2019; Jia et al., 2022; Marana et al., 2022). A further question concerns multi-omics data integration and the development of predictive models for highly complex datasets with intrinsic variability. In recent years, increasingly sophisticated approaches have been developed based on the expansion of classic chemometrics, artificial intelligence and machine learning. Among these, "data integration analysis for biomarker discovery using latent components" (DIABLO) and the "multivariate integrative method" (MINT) enable, respectively, N-integration (the same N biological samples measured on different 'omics platforms) and P-integration (several independent data sets or studies measured on the same predictors) of multi-omics datasets (Rohart et al., 2017). These methods account for complex factors such as heterogeneity between omics platforms and give adequate weight to the different omics layers. 
In addition, weighted correlation network analysis (WGCNA) has been extended to MS-based proteomics and metabolomics to find clusters of highly correlated proteins and metabolites (lipids, sugars, and so on), and to highlight connections between specific pollutants and adverse outcomes (Pei et al., 2017; Degli Esposti et al., 2019). Efforts are also being made towards the analysis of complex multi-omics datasets sampled frequently over time in longitudinal studies (Bodein et al., 2022). Among the available platforms, MetaboAnalyst 5.0 and Workflow4Metabolomics (within the Galaxy framework) stand out for their user-friendly and MS-driven workflows, from single-omics data exploration and analysis to comprehensive integration and visualization of multi-omics datasets. Table 1 lists the principal platforms for MS data integration in single and multi-omics applications, together with a brief description and their advantages.
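The idea behind a weighted correlation network can be conveyed with a deliberately reduced sketch. The Python example below is not the WGCNA package and does not reproduce its full procedure (which relies on topological overlap and hierarchical clustering). It only illustrates the soft-thresholding step, in which the absolute Pearson correlation between two features is raised to a power `beta`, and then groups features with a simple connected-components rule, which is an assumption made here for brevity. Feature names, abundances, `beta` and the cutoff are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant feature: treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def soft_threshold_adjacency(features, beta=6):
    """Unsigned weighted adjacency |r|**beta for every pair of features."""
    names = sorted(features)
    adjacency = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            adjacency[(a, b)] = abs(pearson(features[a], features[b])) ** beta
    return adjacency


def modules(features, adjacency, cutoff):
    """Group features linked by adjacency >= cutoff (connected components)."""
    neighbours = {name: set() for name in features}
    for (a, b), weight in adjacency.items():
        if weight >= cutoff:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, groups = set(), []
    for name in sorted(features):
        if name in seen:
            continue
        stack, group = [name], []
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                group.append(node)
                stack.extend(neighbours[node] - seen)
        groups.append(sorted(group))
    return sorted(groups)


# Hypothetical abundances of four features measured in the same six samples.
abundances = {
    "prot_A": [1, 2, 3, 4, 5, 6],
    "lipid_B": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],
    "metab_C": [5, 3, 6, 2, 7, 4],
    "metab_D": [6, 5, 4, 3, 2, 1],
}

adjacency = soft_threshold_adjacency(abundances, beta=6)
print(modules(abundances, adjacency, cutoff=0.5))
```

With these toy values, the protein, the lipid and the anti-correlated metabolite fall into one module, while the weakly correlated metabolite stays alone. Because the adjacency is unsigned, positively and negatively correlated features are grouped together, and raising the correlation to a power suppresses weak correlations more strongly than a linear weight would.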

Conclusion
The journey to a better understanding of freshwater ecosystems and gammarid species has just begun. In the near future, the main efforts in ecotoxicology studies should be directed towards the use of alternative mass spectrometry platforms for the identification and quantification of the entire metabolome, lipidome, and proteome. From another perspective, universal sample treatments compatible with high-throughput multi-omics analyses still have to be optimized and validated, with the highest expectations placed on single-cell analysis. Going further, novel and increasingly complete methods for integrating large datasets based on MS-omics (and other non-MS-based omics) data can fill the gap between molecular response, toxicity pathways and apical physiological effects in exposed organisms, and provide a more holistic view in ecotoxicology.

Data availability statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding authors.

Author contributions
VC and SA designed the concept and wrote the article, with contributions from AS, YC, TAB, AE, AC, OG, and DD-E. VC prepared the figure. All authors contributed to manuscript revision, read and approved the submitted version.

Funding
VC was supported by a post-doctoral fellowship of the SENS research funding of the Université Claude Bernard Lyon 1. This work was also supported by the French National Research Agency (ANR) (young investigator grant, ANR-18-CE34-0008 PLAN-TOX and ANR-18-CE34-0013 APPROve), and the French GDR "Aquatic Ecotoxicology" framework which aims at fostering stimulating discussions and collaborations for more integrative approaches.

Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.