Recent methodological developments in data-dependent analysis and data-independent analysis workflows for exhaustive lipidome coverage

Untargeted lipidomics applied to biological samples typically involves the coupling of separation methods to high-resolution mass spectrometry (HRMS). Achieving exhaustive coverage of the lipidome with high confidence in structure identification remains highly challenging due to the wide concentration range of lipids in complex matrices and the presence of numerous isobaric and isomeric species. The development of innovative separation methods and HRMS(/MS) acquisition workflows has helped improve the situation, but issues remain regarding confident structure characterization. To overcome these issues, thoroughly optimized MS/MS acquisition methods are needed. For this purpose, different methodologies have been developed to enable MS and MS/MS acquisition in parallel. These methodologies, derived from proteomics, are referred to as Data-Dependent Acquisition (DDA) and Data-Independent Acquisition (DIA). In this context, this perspective paper presents the latest developments in DDA- and DIA-based lipidomic workflows and lists available bioinformatic tools for the analysis of the resulting spectral data.


Introduction
Lipids represent a large class of primary metabolites that fulfil vital biological functions in maintaining homeostasis in living organisms, notably energy storage and cell signaling (Züllig et al., 2020). They are also involved in a large number of heterogeneous diseases including cancer, neurodegenerative and metabolic disorders. Lipids are classified into eight categories according to the LIPID MAPS classification system (Fahy et al., 2005; 2009; Liebisch et al., 2020), available online through the curated LIPID MAPS Structure Database (LMSD), which gathered almost 48,000 unique lipid species in 2022 (https://www.lipidmaps.org/). Those categories can be further divided into classes, in which lipids share some building blocks but differ in the length of their alkyl chains and their number of unsaturations. This creates a high diversity of closely related lipid species whose distinction is highly challenging (Quehenberger et al., 2010; Harayama and Riezman, 2018).
Lipidomics can be defined as a system-level analysis of lipids and their interacting factors (Wenk, 2005). Lipidomics is a subset of metabolomics requiring dedicated sample preparation, as lipids are mostly soluble in organic solvents (e.g., methyl tert-butyl ether, chloroform and methanol for Bligh-Dyer and Folch extractions), as well as specific analytical tools to qualitatively and quantitatively characterize complex samples.
One of the major challenges in lipidomics is to distinguish isobaric and isomeric lipid species by implementing optimized hyphenated mass spectrometry (MS)-based methods and/or tandem mass spectrometry approaches. The fine structure elucidation of complex lipids, such as glycerophospholipids, requires the identification of the polar head group, the chain lengths and the position of the fatty acids together with the number and position of unsaturations or other chemical modifications of the lipid backbone. The precise characterization of lipid species is mandatory to interpret their biological or chemical roles in a defined context (Wenk, 2010).
Among the technical developments of the last 70 years, gas chromatography coupled to electron ionization-mass spectrometry (GC-EI-MS) can be considered the first major tool for lipidomics. Even though the technique required complex sample preparation, including subclass separation and chemical derivatization, GC-MS allowed efficient profiling of the fatty acids composing complex lipids (Chiu and Kuo, 2020).
The availability of multiple ionization sources, particularly soft ionization sources such as Electrospray Ionization (ESI), Matrix-Assisted Laser Desorption/Ionization (MALDI) and Atmospheric Pressure Chemical/Photo-Ionization (APCI/APPI), has made mass spectrometry the technique of choice for lipidome characterization and quantification. APCI and APPI sources are mostly employed for compounds that are difficult to ionize by ESI, mainly neutral lipids such as triacylglycerols or carotenoids (Byrdwell, 2001; Imbert et al., 2012).
The advent of liquid chromatography coupled to mass spectrometry (LC-MS) in the 1980s offered new perspectives, as intact complex lipids could be directly separated, detected and isolated for tandem mass spectrometry (MS/MS) experiments. Nevertheless, the low-resolution instruments available at that time did not offer sufficient mass measurement accuracy, which made isobaric species difficult to distinguish. The commercialization of fast and robust high-resolution (HR)/high-mass accuracy tandem mass spectrometers paved the way for the analysis of high-mass lipids thanks to extended mass ranges and dynamic ranges (up to 4 decades on modern instruments) and to high ionization efficiency for increased sensitivity in both polarities and for all lipid sub-classes. Time-of-flight (TOF) mass analyzers show the highest repetition rate. In terms of mass accuracy and resolving power, Fourier transform mass spectrometers, such as Orbitrap and Fourier-transform ion cyclotron resonance (FTICR) instruments, surpass TOF instruments, although their slower acquisition rates limit the number of data points per chromatographic peak. The Orbitrap-based instruments, since their first commercialization with the LTQ-Orbitrap in 2006, have seen innovations in sensitivity, resolution and scan speed (Eliuk and Makarov, 2015). Lipidome investigation can now be achieved at high throughput either by a shotgun approach (direct introduction into the mass spectrometer) or by hyphenated MS-based techniques such as normal- or reversed-phase liquid chromatography or supercritical fluid chromatography (Rustam and Reid, 2018). The common analysis workflow for those methods is based on the acquisition, at the MS level, of robust data for biological and technical replicates together with quality control (QC) injections to correct any deviation of the analytical chain.
The identification of relevant markers or compounds differentiating sample groups is then accomplished by tandem mass spectrometry (MS/MS), following multiple injections into the LC-MS/MS system using targeted MS/MS workflows. MS/MS spectral data are necessary for the structure elucidation of lipids. Most existing lipid databases include tandem mass spectra acquired under low-energy non-resonant collision-induced dissociation (CID) or higher-energy collisional dissociation (HCD) conditions on hybrid TOF and Orbitrap-based instruments. However, multiple types of ion activation modes (i.e., resonant and non-resonant activation modes) exist and can be employed in MS/MS, although they are currently used to a lesser extent in high-throughput lipidomics workflows.
As in classical proteomics approaches, the parallel acquisition of HRMS and MS/MS data can be achieved using either a Data-Dependent Analysis (DDA) or a Data-Independent Analysis (DIA) approach. In this Perspective article, technical and practical considerations on DDA and DIA for lipidomics are discussed to offer an overview of the latest developments and future trends in the field of lipidomics.

DDA-based acquisition workflows
The DDA fragmentation method allows successive acquisitions of MS and MS/MS data. The most widely used DDA acquisition method is often designated as "DDA Top N". It consists of a cycle of a single HRMS scan followed by the automatic selection and fragmentation of the N most intense signals in a defined m/z range (Figure 1). The first HRMS scan provides high mass accuracy measurements, typically below 1-2 ppm. An unambiguous chemical formula can be easily determined for small lipids, below m/z 300-400, but a much higher mass accuracy (<0.1 ppm) would be required to determine the elemental composition of complex lipids. Automatic MS/MS scans are then programmed based on rules, such as peak intensity and mass range, applied to the HRMS data. This method enables the fragmentation of a high number of ions and leads to high-quality and high-purity MS/MS spectra since the selection window is restricted to monoisotopic ions. Nevertheless, precursor ion selection is a (semi-)stochastic event that suffers from low analytical reproducibility and favors the selection of the most abundant (but sometimes biologically irrelevant) precursor ions to the detriment of other co-eluting relevant ones (Fenaille et al., 2017). These issues can be partially overcome through the implementation of exclusion rules, typically applied after one or several consecutive MS/MS experiments. The number of selected ions must be thoroughly optimized considering the overall MS/MS duty cycle (usually a few tens to hundreds of milliseconds per MS/MS spectrum) and the LC peak width. Too high a number of precursor ions will limit the number of complete DDA cycles recorded per chromatographic peak, thus degrading the quality of peak integration and quantification at the MS level due to an insufficient number of data points.
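The trade-off between N and the number of data points per chromatographic peak can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the scan times and the peak width are assumed values chosen within the ranges quoted above, not settings taken from any cited study.

```python
def cycles_per_peak(top_n, ms1_time_s, ms2_time_s, peak_width_s):
    """Estimate how many complete DDA cycles fit across one LC peak.

    One cycle = one HRMS survey scan + top_n MS/MS scans.
    """
    cycle_time_s = ms1_time_s + top_n * ms2_time_s
    return peak_width_s / cycle_time_s


# Assumed, illustrative values: 0.25 s survey scan, 0.05 s per MS/MS
# spectrum, 6 s wide chromatographic peak.
for n in (5, 10, 20):
    points = cycles_per_peak(n, ms1_time_s=0.25, ms2_time_s=0.05, peak_width_s=6.0)
    print(f"Top {n}: {points:.1f} survey scans per peak")
```

Under these assumptions, going from Top 5 to Top 20 reduces the number of survey scans recorded across the peak from about 12 to fewer than 5.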
Fast MS/MS scanning instruments such as hybrid TOF instruments are clearly at an advantage here, with acquisition rates up to 100 Hz in both the MS and MS/MS modes (at 50 K resolution) for the TripleTOF 6600 and timsTOF instruments, compared to the latest [...] (Neumann et al., 2013). First, an LC run at the MS level is performed, and the data are directly analyzed with the preprocessing software XCMS for automated annotation and statistical analysis to select significant features. Methods with inclusion lists are then created and run on pooled samples using the targeted MS/MS (tMS2) acquisition mode in order to confirm previous annotations while limiting the number of LC-MS/MS experiments. This method enables a better coverage of the sample by selecting relevant lower-abundance features, at the cost of losing the unbiased nature of DDA. Koelmel et al. developed an Iterative Exclusion (IE-Omics) approach whose goal is to create several non-redundant inclusion lists (Koelmel et al., 2017). First, a classic DDA acquisition is performed, and all the fragmented ions are added to an exclusion list. The workflow is repeated up to 5 times until the intensity of the precursor ions becomes too low to yield exploitable MS/MS data for automated annotation. When compared to five independent DDA scans performed on the same sample, the IE-Omics method shows a 30% increase in lipidome coverage for a plasma sample.
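The iterative exclusion logic described above can be sketched as follows. This is a simplified illustration, not the IE-Omics implementation: a single list of (m/z, intensity) pairs stands in for a whole injection, and the function names, the m/z tolerance and the intensity threshold are hypothetical.

```python
def top_n_precursors(survey_scan, n, excluded, tol=0.01):
    """Pick the n most intense m/z values that are not on the exclusion list.

    survey_scan: list of (mz, intensity) tuples.
    excluded: m/z values already fragmented in earlier injections.
    tol: m/z matching tolerance in Da (assumed value).
    """
    candidates = [
        (mz, inten) for mz, inten in survey_scan
        if not any(abs(mz - ex) <= tol for ex in excluded)
    ]
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in candidates[:n]]


def iterative_exclusion(survey_scan, n, max_rounds=5, min_intensity=1e3):
    """Simulate repeated injections, excluding ions fragmented previously."""
    # Ignore ions too weak to give exploitable MS/MS spectra.
    usable = [(mz, i) for mz, i in survey_scan if i >= min_intensity]
    excluded, rounds = [], []
    for _ in range(max_rounds):
        selected = top_n_precursors(usable, n, excluded)
        if not selected:
            break  # nothing left above the intensity threshold
        rounds.append(selected)
        excluded.extend(selected)
    return rounds


# Made-up peak list for illustration.
scan = [(760.585, 9e5), (782.569, 4e5), (734.569, 2e5),
        (496.340, 5e4), (524.371, 8e3), (810.601, 5e2)]
for i, selected in enumerate(iterative_exclusion(scan, n=2), start=1):
    print(f"injection {i}: {selected}")
```

Each round reaches precursors of lower abundance than the previous one, and the loop stops early once no usable ion remains, which mirrors the stopping criterion given in the text.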
A strategy based on an automated DDA inclusion list workflow, called Data-Set-Dependent Acquisition (DsDA), was developed by Broeckling et al. (Broeckling et al., 2018). This method combines MS data processing and target prioritization to enable a more comprehensive MS/MS coverage of the metabolome. The workflow consists of a Top4 DDA scan, after which the data are automatically processed and a score is attributed to each MS/MS spectrum to determine its importance for fragmentation. A module, written in the R programming language, forces the workflow to fragment precursor ions not previously fragmented in order to improve the metabolome coverage. This novel method yielded 45% more MS/MS features. In the recently introduced BoxCar method, the MS spectrum is segmented into several windows to improve the detection sensitivity and the dynamic range of the MS acquisition (Meier et al., 2018). Under these conditions, less abundant ions can be isolated during DDA, thus providing unprecedented structural information.
Another recently introduced gas-phase fractionation approach is the so-called Parallel Accumulation Serial Fragmentation (PASEF, available on timsTOF Pro instruments), which synchronizes Trapped Ion Mobility Spectrometry (TIMS) with MS/MS precursor selection and fragmentation. This acquisition mode drastically increases the fragmentation speed and detection sensitivity, thereby approaching almost 100% duty cycle and allowing the annotation of up to 1,100 unique lipids in 1 μL of human plasma (Vasilopoulou et al., 2020). All the DDA methods presented here suffer from the same issue: as the sample often needs to be injected several times (up to five in both positive and negative ionization modes), its volume needs to be sufficient and the overall analysis time is significantly extended, making it impossible to apply these strategies to large cohort analysis in medicine or biology.

FIGURE 1. Scheme illustrating the DDA, SWATH, and AIF acquisition workflows. Reproduced from Fenaille et al. (2017) with permission from Elsevier, copyright (2017).

Frontiers in Analytical Science
frontiersin.org

DIA-based acquisition workflows
DIA was first described in the proteomics field in 2004 (Venable et al., 2004) and has since been implemented for lipidomics (Schwudke et al., 2006). The application of DIA is essentially motivated by the wider accessible dynamic range (4 orders of magnitude) (Gillet et al., 2012), the higher sensitivity and reproducibility and, more importantly, the unbiased characterization of complex biological matrices (Venable et al., 2004).
Analyzing the lipidome by direct injection into the mass spectrometer without prior chromatographic separation (shotgun approach) is highly attractive owing to the short acquisition time, the low solvent consumption, and the ease of implementation, even though ion suppression effects are usually reported (Schwudke et al., 2011; Yang and Han, 2016). Shotgun DIA is mainly based on the MS^ALL workflow (Simons et al., 2012), which aims at recording MS/MS spectral information with 1 Da isolation windows over a large mass range in a run lasting less than 6 min (Simons et al., 2012). The MS^ALL mode proved efficient for identifying and quantifying lipids belonging to different classes in various biological matrices (Simons et al., 2012; Gao et al., 2016). MS^E on Q-TOF instruments and All Ion Fragmentation (AIF) on Orbitrap-based instruments consist of acquiring alternating scans of all ions, without mass filtering, at two or three collision energies (Plumb et al., 2006) (Figure 1). Precursor information is retrieved at low collision energy whereas specific product ions are obtained at higher collision energy, which facilitates the annotation step by combining all the MS/MS data into consensus MS/MS spectra (Plumb et al., 2006; Diedrich et al., 2013).
One of the first implementations of the DIA-AIF method used LC coupled to a single-stage Orbitrap mass analyzer with polarity switching in a single analytical run and either in-source CID or HCD fragmentation (Gallart-Ayala et al., 2013). Both phospholipids and triacylglycerols from serum samples were annotated for the diagnosis of canine mammary cancer thanks to known retention times and typical diagnostic product ions. Similarly, Castro-Perez et al. reported the advantages of alternating low and high collision energies in positive and negative ionization modes to identify lipids from human plasma (Castro-Perez et al., 2010). More recently, lipids from human plasma and dermal fibroblast extracts were annotated using DIA-AIF and HCD in both polarities (Ventura et al., 2020a; 2020b).
In order to improve the association of precursor ions with their corresponding fragments, SWATH (Sequential Window Acquisition of All THeoretical fragment ion spectra) was developed (Gillet et al., 2012). The principle is to define sequential MS/MS experiments in reduced m/z windows, typically 10-20 Da wide, that together cover the whole mass range; all ions within each window are then fragmented simultaneously (Figure 1). It was initially developed on hybrid TOF instruments since high-speed acquisition is mandatory, but it can now be implemented on Orbitrap-based instruments (Fenaille et al., 2017). SWATH experiments were performed to characterize 214 and 140 lipids from immortalized keratinocytes in positive and negative ionization modes, respectively. As statistical treatment of the data is mainly based on retention times, the authors strongly advise adding internal standards to correct any deviation (Calderón et al., 2020; Cebo et al., 2021). Interestingly, SWATH can be employed for compound annotation using MS/MS library search but also for quantification at the MS/MS level because, unlike classic DDA, it offers continuous MS/MS spectra acquisition for particular fragment ions.
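The tiling of the mass range into sequential fixed-width isolation windows can be sketched as below. This is a minimal illustration rather than any vendor's method editor: the function names, the m/z 400-1000 range and the optional overlap parameter are assumptions made for the example.

```python
def swath_windows(mz_start, mz_end, width, overlap=0.0):
    """Return (low, high) isolation windows tiling [mz_start, mz_end]."""
    if width <= overlap:
        raise ValueError("width must exceed overlap")
    windows = []
    low = mz_start
    while low < mz_end:
        high = min(low + width, mz_end)
        windows.append((low, high))
        if high >= mz_end:
            break
        # The next window starts where this one ends, minus any overlap.
        low = high - overlap
    return windows


def window_index(windows, precursor_mz):
    """Indices of the windows whose isolation range contains precursor_mz."""
    return [i for i, (lo, hi) in enumerate(windows) if lo <= precursor_mz <= hi]


# Assumed acquisition range m/z 400-1000 with 20 Da windows.
windows = swath_windows(400.0, 1000.0, 20.0)
print(len(windows), "windows per cycle")
print("m/z 760.585 is fragmented in window", window_index(windows, 760.585))
```

Every precursor falling in the same window is fragmented together, so narrower windows give cleaner MS/MS spectra but require more MS/MS scans per cycle.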
One of the main drawbacks of DIA workflows is the (highly) convoluted nature of the resulting tandem mass spectra. Indeed, isobaric or isomeric lipid species from different classes can produce overlapping isotopic clusters that are co-isolated and fragmented together, thus yielding highly convoluted and thereby difficult-to-match MS/MS spectra. For instance, protonated phosphatidylcholine (PC 33:1) and phosphatidylethanolamine (PE 36:1) share the same elemental composition (Züllig and Köfeler, 2021). The frequent presence of sodium adducts brings additional spectral complexity, as sodiated lipid species are isobaric with protonated unsaturated ones, differing by only ~2 mDa in some extreme cases (Höring et al., 2020; Züllig and Köfeler, 2021).
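Both situations can be checked with a short monoisotopic mass calculation. The sketch below is illustrative: PC 33:1 and PE 36:1 come from the text, whereas the pair [PC 34:1+Na]+ / [PC 36:4+H]+ is an example chosen here to show a sodiated/protonated overlap, and the helper names are hypothetical.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "P": 30.9737620, "Na": 22.9897693}
ELECTRON = 0.00054858


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONO[el] * count for el, count in formula.items())


def adduct_mz(formula, adduct):
    """m/z of a singly charged [M+H]+ or [M+Na]+ ion."""
    return neutral_mass(formula) + MONO[adduct] - ELECTRON


PC_33_1 = {"C": 41, "H": 80, "N": 1, "O": 8, "P": 1}
PE_36_1 = {"C": 41, "H": 80, "N": 1, "O": 8, "P": 1}
PC_34_1 = {"C": 42, "H": 82, "N": 1, "O": 8, "P": 1}
PC_36_4 = {"C": 44, "H": 80, "N": 1, "O": 8, "P": 1}

# Isomers: identical composition, so identical m/z at any resolving power.
print(f"[PC 33:1+H]+ {adduct_mz(PC_33_1, 'H'):.4f}")
print(f"[PE 36:1+H]+ {adduct_mz(PE_36_1, 'H'):.4f}")

# Isobars: sodiated PC 34:1 versus protonated PC 36:4.
sodiated = adduct_mz(PC_34_1, "Na")
protonated = adduct_mz(PC_36_4, "H")
delta_mda = abs(sodiated - protonated) * 1000
print(f"[PC 34:1+Na]+ {sodiated:.4f}, [PC 36:4+H]+ {protonated:.4f}, "
      f"difference {delta_mda:.1f} mDa")
```

The second pair differs by the mass of Na minus H on one side and C2 minus H2 on the other, which amounts to about 2.4 mDa; resolving the two ions near m/z 782.57 would therefore call for a resolving power on the order of 3 × 10^5.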
In DIA workflows, such as SWATH, the direct link between a given precursor ion and its corresponding fragments is often lost, which renders the interpretation of MS/MS data of unknown lipid species much more difficult than with data resulting from DDA protocols. The data treatment step, i.e., the concatenation and alignment of precursor and fragment ions, is indeed often considered the critical point. This can be performed according to retention times (Castro-Perez et al., 2010) or by specific data filtering through mass defect calculation (Stagliano et al., 2010). A correlation-based deconvolution (CorrDec) method that uses the correlation of ion abundances between precursor and product ions has been developed to improve the metabolite annotation rate (Tada et al., 2020).
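The idea of linking fragments to a precursor through correlated abundances can be illustrated with a few lines of code. This is a sketch of the general principle only and not the CorrDec algorithm: it assumes abundances measured for the same ions across several samples, and the function names, the 0.9 threshold and the data are made up for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant trace carries no correlation information
    return cov / sqrt(var_x * var_y)


def assign_fragments(precursor_abundance, fragment_abundances, min_r=0.9):
    """Keep fragments whose abundance across samples tracks the precursor."""
    return sorted(
        mz for mz, abundances in fragment_abundances.items()
        if pearson(precursor_abundance, abundances) >= min_r
    )


# Made-up abundances of one precursor and three co-isolated fragments
# measured in five samples.
precursor = [10, 20, 30, 40, 50]
fragments = {
    184.073: [1.1, 2.0, 2.9, 4.2, 5.0],   # follows the precursor
    504.345: [2, 4, 6, 8, 10],            # follows the precursor
    264.269: [5.0, 4.8, 5.1, 5.0, 4.9],   # unrelated to the precursor
}
print(assign_fragments(precursor, fragments))
```

Fragments originating from a co-isolated but unrelated precursor tend to show a weak correlation and are discarded, which is what makes the deconvoluted spectrum cleaner.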
While SWATH techniques provide a better coverage of the lipidome, this comes at the cost of lower spectral quality. Therefore, the DIA SWATH method has been combined with DDA, leading to the hybrid DaDIA mode (Guo et al., 2021). The sequence is composed of blanks and samples analyzed in DIA SWATH mode and quality controls (QCs) analyzed in DDA mode. The DDA MS/MS data provide high-quality spectra for the highest-intensity features and are used to improve the deconvolution of the SWATH data. This hybrid method enables a better lipidome coverage while improving the quality of deconvoluted SWATH spectra to a level similar to that of DDA spectra. A combination of 2D-LC, DIA MS^E and DDA Top3, called HDDIDDA, outperforms DDA with a slightly higher annotation rate and MS^E with a slightly higher coverage, at the cost of complex data treatment (Wang et al., 2022).
Even if the DIA approach looks appealing in terms of the number of annotated lipid species, the annotation step requires MS/MS databases that are as exhaustive and reliable as possible, including normalized retention times if available. Acquiring MS/MS spectra from pure authentic standards is the best way to generate spectral databases, but this is clearly unrealistic due to the cost and availability of pure compounds. The most feasible strategy is to generate in silico low-energy fragmentation spectra thanks to the well-documented mechanisms of lipid fragmentation (Murphy, 2015), as done in the LipidBlast database. As already documented in the literature, MS/MS data are strongly dependent on the type of tandem mass spectrometer and on the collision energy used, which is challenging to normalize, especially in the case of small molecule analysis (Murphy, 2015). Thus, the information related to the relative intensities of the fragments cannot be easily employed to refine structure annotation, making the match scoring process less robust. Several software solutions (Kitata et al., 2021) include modules capable of handling the complete pipeline, from the conversion of vendor raw files to statistical analysis, through identification from integrated and/or implementable lipid libraries, as exemplified in Table 1. While attributing product ions to the precursor ion is unproblematic in DDA mode, reconstructing MS/MS data in DIA mode remains highly challenging. To the best of our knowledge, only two robust software tools, i.e., MS-DIAL and DecoMetDIA, offer a deconvolution step for untargeted analysis in the DIA mode (Tsugawa et al., 2019; Yin et al., 2019; Tada et al., 2020). Multiple processing strategies for DIA deconvolution are emerging in proteomics (Kitata et al., 2021) and are expected to be applied to small compounds in the near future.

TABLE 1
[...] (Goracci et al., 2017). By Molecular Discovery.
SwathTUNER. DIA: Tool to tune the method and optimize window widths for SWATH acquisitions (Zhang et al., 2015).
MS-DIAL. DDA, DIA: Deconvolution of MS/MS spectra from SWATH acquisitions (two available algorithms) and identification from implementable lipid libraries with MS/MS scoring. Univariate and unsupervised statistics (e.g., t-test, PCA) (Tsugawa et al., 2015; Tada et al., 2020).
MetDIA. DIA: Targeted peak detection from SWATH acquisitions according to implementable lipid libraries with MS/MS scoring, univariate statistics on precursor ion areas (Li et al., 2016).
LipiDex. DDA: Peak detection and identification from implementable lipid libraries (Hutchins et al., 2018).
Skyline. DDA, DIA: Targeted approach for identification according to indexed retention time and collision cross-section, MS and MS/MS spectra from implementable lipid libraries (Kirkwood et al., 2022).
DecoMetDIA. DIA: Deconvolution of MS/MS spectra from SWATH acquisitions and identification according to implementable lipid libraries (Yin et al., 2019).
Greazy/LipidLama. DDA: Peak detection and identification from a set of theoretical phospholipids and fragments (Kochen et al., 2016).

Discussion and conclusion
Considering the rapid development of mass spectrometers in terms of scan speed, resolving power and mass accuracy, the analytical strategies for lipidomics have widely expanded during the last two decades. Beyond the use of high mass accuracy at the MS level to suggest molecular formulas, MS/MS experiments are fully integrated in lipidomics workflows to provide semi-quantification and structural characterization. Both DDA and DIA methodologies require tandem mass spectrometers with high acquisition rates to access the deep lipidome. High-quality MS/MS data are needed to facilitate annotation, but the total number of MS/MS scans on a single feature is limited. Iterative DDA makes it possible to overcome this limitation but significantly increases the sample consumption and the overall experimental time, making this approach incompatible with the analysis of large cohorts. Conversely, DIA significantly increases the number of annotated lipid species but is still strongly limited by the available software solutions (notably regarding spectra deconvolution).
Free software tools are reviewed in Table 1, and three of them, namely MS-DIAL (Tsugawa et al., 2015), DecoMetDIA (Yin et al., 2019) and DecoID (Stancliffe et al., 2021), offer efficient bioinformatic solutions for the deconvolution of mixed tandem mass spectra obtained from SWATH workflows. Also, as shown by Barbier Saint Hilaire et al., a careful design of SWATH isolation windows can significantly impact the detection rate of targeted compounds (Barbier Saint Hilaire et al., 2020). The free software tool SwathTUNER (Table 1) permits the automated design of variable isolation windows according to either the Total Ion Current (TIC) or the precursor density (Zhang et al., 2015) to significantly lower the complexity of tandem mass spectra obtained in the DIA-SWATH mode. Also, the commercial vendor Sciex offers the SWATH® Acquisition module on its TripleTOF® systems for the automatic design of variable isolation windows, enabling an all-in-one acquisition workflow (https://sciex.com/technology/swath-acquisition).
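Designing variable isolation windows from precursor density can be sketched as an equal-count partition of the observed precursor m/z values. The code below is a simplified illustration and not the SwathTUNER algorithm: the function name, the precursor list and the choice of three windows are assumptions made for the example.

```python
def variable_windows(precursor_mzs, n_windows, mz_start, mz_end):
    """Split [mz_start, mz_end] so each window holds a similar number of precursors."""
    mzs = sorted(mz for mz in precursor_mzs if mz_start <= mz <= mz_end)
    if n_windows < 1 or len(mzs) < n_windows:
        raise ValueError("need at least one precursor per window")
    edges = [mz_start]
    for k in range(1, n_windows):
        # Place each boundary midway between the two precursors that
        # straddle the k-th equal-count cut point.
        cut = k * len(mzs) // n_windows
        edges.append((mzs[cut - 1] + mzs[cut]) / 2)
    edges.append(mz_end)
    return list(zip(edges[:-1], edges[1:]))


# Made-up precursor list from a survey run, denser between m/z 730 and 840.
precursors = [496.3, 520.3, 524.4, 732.6, 734.6, 758.6,
              760.6, 782.6, 786.6, 788.6, 810.6, 834.6]
for low, high in variable_windows(precursors, 3, 400.0, 1000.0):
    count = sum(low <= mz < high for mz in precursors)
    print(f"{low:.1f}-{high:.1f}: {count} precursors")
```

Windows become narrow where precursors are crowded and wide where they are sparse, so that each MS/MS spectrum contains a comparable number of co-fragmented species. Duplicate m/z values at a cut point would yield a zero-width window, which a real implementation would need to handle.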
For about a decade, ion mobility (IM) has offered the possibility to separate ions not only by their m/z values but also by their shape in the gas phase, thus enabling the measurement of collision cross-sections (CCSs). The IM timescale is fully compatible with the chromatographic timescale and the fast acquisition rate of modern mass spectrometers. Several groups demonstrated that lipid isomers bearing different fatty acyl chains but the same total number of carbons, including Z/E isomers, can be efficiently separated by IM-MS (Paglia et al., 2015). MS^E experiments coupling LC to IM-QTOF were performed for the characterization of non-purified glycolipid extracts from Caenorhabditis elegans (Witting et al., 2021). The multidimensional information collected from chromatographic separation (retention time), ion mobility (CCS) and MS^E acquisition permits extended lipid annotation. Ion mobility coupled to DIA or DDA approaches paves the way for a rather complete structural annotation and (semi-)quantification of the deep lipidome, including all minor species, while also significantly improving the quality of both DDA and DIA tandem mass spectra by separating isobaric/isomeric lipid ion species prior to their isolation and fragmentation. The file sizes, from several hundreds of MB to several GB per injection, the limited availability of software for efficiently handling the resulting datasets and the difficulty of generating absolute collision cross-sections for highly flexible molecules, such as lipids in the gas phase, strongly restrict the full exploitation of the data acquired by DIA- or DDA-ion mobility.

TABLE 1 (Continued)
[...] (Zha et al., 2018).
LipidIMMS. DDA, DIA: Targeted approach for identification according to retention time and collision cross-section predictions, MS and MS/MS spectra from integrated lipid libraries (Zhou et al., 2019).
A last issue in MS/MS is the efficient location of double bonds on lipid structures, as dissociations at low energy are not efficient for this purpose (Ma et al., 2016). Prior chemical derivatization of carbon-carbon double bonds, either directly by a chemical agent (e.g., ozonolysis, epoxidation, hydroxylation) or facilitated by UV irradiation (e.g., Paternò-Büchi reaction), can lead to reaction products that can be further fragmented under low-energy dissociation conditions to pinpoint the location of double bonds (Cao et al., 2018; Harris et al., 2018). Nonetheless, the routine and broad application of these methods remains limited due to the long reaction time required and their poor compatibility with online LC-MS/MS workflows (Poad et al., 2017). Other fragmentation techniques can also be implemented. Electron-capture dissociation (ECD) on a divalent metal complex of glycerophospholipids was proposed for double bond location (James et al., 2011). Other electron-activated dissociation (EAD) types have been devised, such as electron-induced dissociation (EID) and Electron-Impact Excitation of Ions from Organics (EIEIO), which are compatible with singly charged precursor ions (Baba et al., 2021). The ultraviolet photodissociation (UVPD) activation mode has recently been used to distinguish sn-positional isomers and to determine the location of carbon-carbon double bonds (Williams et al., 2017). Only a few commercial instruments are currently equipped with EAD, EID, EIEIO, or UVPD, and relatively low fragmentation yields are usually reported for those methods, restricting their application to abundant species even though the fragmentation process is fully compatible with the LC peak width.
Even if major progress in lipidomics has been reported in recent years, improvements at each step of the workflow, from sample preparation, chromatography and MS and MS/MS acquisition to automated data treatment, are needed to expand lipidome coverage.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding authors.