Integration of Wnt-inhibitory activity and structural novelty scoring results to uncover novel bioactive natural products: new Bicyclo[3.3.1]non-3-ene-2,9-diones from the leaves of Hymenocardia punctata

In natural products (NPs) research, methods for the efficient prioritization of natural extracts (NEs) are key for discovering novel bioactive NPs. In this study, a biodiverse collection of 1,600 NEs, previously analyzed by UHPLC-HRMS2 metabolite profiling, was screened for Wnt pathway regulation. The results of the biological screening drove the selection of a subset of 30 non-toxic NEs with an inhibitory IC50 ≤ 5 μg/mL. To increase the chance of finding structurally novel bioactive NPs, Inventa, a computational tool for automated scoring of NEs based on structural novelty, was used to mine the HRMS2 analysis and dereplication results. Four of the 30 bioactive NEs were shortlisted by this approach. The most promising sample was the ethyl acetate extract of the leaves of Hymenocardia punctata (Phyllanthaceae). Further phytochemical investigation of this species resulted in the isolation of three known prenylated flavones (3, 5, 7) and ten novel bicyclo[3.3.1]non-3-ene-2,9-diones (1, 2, 4, 6, 8–13), named Hymenotamayonins. Assessment of the Wnt inhibitory activity of these compounds revealed that two prenylated flavones and three novel bicyclic compounds showed interesting activity without apparent cytotoxicity. This study highlights the potential of combining Inventa’s structural novelty scores with biological screening results to effectively discover novel bioactive NPs in large NE collections.


Introduction
Nature is a valuable source of chemical diversity, offering a wide range of molecules with therapeutic properties (Newman and Cragg, 2020). Plants serve as important reservoirs of bioactive natural products (NPs) that have been utilized for medicinal purposes for centuries. NPs often exhibit complex chemical structures due to evolutionary processes that enable them to interact with biological targets in precise ways (Feher and Schmidt, 2003; Atanasov et al., 2021). These characteristics are challenging to replicate synthetically, making NPs exceptionally suitable as starting points for drug development (Clark, 1996; Dias et al., 2012; Allard et al., 2023). Natural extracts (NEs) of plant origin possess a vast chemical diversity of NPs, positioning them as highly promising assets for the exploration and advancement of novel therapeutic agents. Although only ca. 20% of plant species have been investigated to date, finding novel or rare structural scaffolds is becoming increasingly difficult. This challenge arises because species that are taxonomically related tend to biosynthesize similar constituents (David et al., 2015).
The most common approaches used for the selection of NEs prior to in-depth phytochemical studies include high-throughput bioactivity screening, traditional use of given medicinal plants, and literature reports (Hostettmann and Terreaux, 2000; Sarker et al., 2005; Sarker and Nahar, 2012). The identification of the active principles is classically performed by bioguided isolation. This strategy is resource-intensive and time-consuming due to the need for multiple rounds of fractionation and bioassays. There is also a risk of losing bioactivity during the isolation process, while other concerns include false positives, selectivity issues in bioassays, and missed synergistic effects (Pieters and Vlietinck, 2005; Hamburger, 2019; Najmi et al., 2022). To overcome some of these limitations and improve the chances of finding bioactive NPs of interest, strategies like structural dereplication and extensive metabolite annotation through metabolomics are increasingly being integrated early in research workflows (Olivon et al., 2017; Caesar et al., 2021).
Early structural identification of NPs in NEs can assist researchers in avoiding already reported active NPs or in efficiently searching for analogs of previously reported bioactive NPs (Hubert et al., 2017; Selegato et al., 2023). With the increased throughput of computational annotation methods in metabolomics, it is now possible to evaluate the chemical space of large NE collections (Gaudry et al., 2023). This information can be used to prioritize samples in the search for structurally novel NPs (Quiros-Guerrero et al., 2022).
To automatically mine the large amount of metabolite profiling data and make use of prior pharmacognosy knowledge, we recently introduced Inventa (Quiros-Guerrero et al., 2022), a metabolomics bioinformatic workflow designed to streamline the NE selection process. Its primary objective is to pinpoint NEs with a heightened probability of containing structurally novel NPs within NE collections that have undergone untargeted UHPLC-HRMS2 metabolite profiling. Inventa follows a structured process and takes as input the results from the MZmine data processing (Schmid et al., 2023), the subsequent MS2 spectral data organization using Feature-Based Molecular Networking (FBMN) (Nothias et al., 2020), and the MS2 spectra annotation from advanced computational methods like TIMA (Allard et al., 2016; Rutz et al., 2019) and SIRIUS (Dührkop et al., 2019). The annotation results of the features [a peak with an m/z value at a given retention time (RT)] detected in the samples include molecular formulas, chemical classes based on NPClassifier (Dührkop et al., 2021; Kim et al., 2021), and structural candidates (Dührkop et al., 2015; Cabral et al., 2016). It integrates previous literature reports for the considered taxon by conducting automated searches in the LOTUS initiative (Rutz et al., 2022), where NP structure occurrences are catalogued in their respective source organisms. Additionally, it exploits the MEMO (Gaudry et al., 2022) spectral fingerprints to evaluate the spectral diversity exhibited by a particular sample within a set of NEs. Based on all these data, Inventa calculates four individual component scores: the proportion of annotated features in each NE, the specificity of these features within an NE data set, the number of reported structures in the NE taxon, and the spectral divergence of the individual NE within the data set. It provides a combined score (Priority Score, PS) that enables the prioritization of NEs based on their potential for containing structurally novel NPs. In this study we intend to evaluate how Inventa can be combined with bioactivity screening results to highlight structurally novel bioactive NPs capable of regulating the Wnt signaling pathway.
The Wnt signaling pathway (Q155769) is critical in several biological processes like embryonic development, tissue homeostasis, and cellular proliferation (Blagodatski et al., 2020; Boudou et al., 2022; Liu et al., 2022). However, when dysregulated, it has been associated with several disorders, including cancer (Shaw et al., 2019b; Lim et al., 2021; Jiang et al., 2022), Alzheimer's disease (Inestrosa et al., 2012), and osteoporosis (Houschyar et al., 2018; Lojk and Marc, 2021). Many of the current cancer treatments affect rapidly dividing cells, resulting in notable side effects since these cells are essential for tissue maintenance in adults. A more targeted and specific approach, with fewer side effects, may be possible by targeting the Wnt signaling pathway exclusively in the cancer cells (Shaw et al., 2019b). Several NPs from diverse plant species have been reported to act on the Wnt signaling pathway through disruption of the Wnt/β-catenin cascade (Pooja and Karunagaran, 2014; Gu et al., 2019). Thus, the discovery of NPs capable of inhibiting or regulating the Wnt signaling pathway has become a topic of significant interest in drug discovery programs (Fuentes et al., 2015; Nusse and Clevers, 2017).
The collection of NEs used for the Wnt-pathway regulation screening consists of a subset of 1,600 NEs from the Pierre Fabre Laboratories (PFL) Library that were previously analyzed by massive UHPLC-HRMS2 metabolite profiling and processed with different annotation workflows. The data set was publicly disclosed, allowing researchers to explore a wide range of chemical compositions across different plant species (Allard et al., 2023). The NEs were generated directly from the plant material by maceration with ethyl acetate, followed by SiO2-SPE filtration. This method was optimized for the recovery of medium-polarity compounds, which is crucial for the objectives of the HTS program conducted by PFL. The samples were prepared in DMSO at a concentration of 5 mg/mL (Allard et al., 2023). This set of 1,600 NEs has been exploited for the development of bioinformatics tools (Gaudry et al., 2022; Gaudry et al., 2023).
In the search for structurally novel bioactive NPs from plants, we sought to investigate the UHPLC-HRMS2 metabolite profiling and Wnt-pathway regulation screening results for this set of 1,600 NEs. Then, to increase the chances of selecting active NEs containing novel NPs, Inventa was used to calculate priority scores for structural novelty. The combination of the screening results and Inventa's scores highlighted several bioactive NEs with a high potential of containing structurally novel NPs.

Selection of promising NEs by combining bioactivity results and structural novelty scores
The same samples used for the UHPLC-HRMS2 metabolite profiling previously described by Allard et al. (2023) were screened for the presence of compounds with potential Wnt-regulatory activity. The screening experimental design used the BT-20 triple-negative breast cancer (TNBC) cell line, stably transfected with the TopFlash reporter construct and sensitive to purified Wnt3a stimulation (Koval et al., 2014; Shaw et al., 2019a). The NEs were screened in single repeats at five different concentrations (50, 25, 12.5, 6, and 3 μg/mL) and cytotoxicity was monitored at the same time. Given that the assay does not include positive control compounds, an NE or compound is considered 'toxic' if its IC50 value against Renilla luciferase is less than 1.7 times the estimated TopFlash IC50 value. This indicates that any reduction observed in the TopFlash response is likely influenced by a significant toxic effect (Shaw et al., 2019a).
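The toxicity criterion described above can be expressed as a simple ratio test. The following is a minimal sketch, not the screening pipeline used in the study: the function name `is_toxic` and the example IC50 values are hypothetical, and only the 1.7 factor is taken from the text.

```python
# Minimal sketch of the toxicity flag; only the 1.7 ratio comes from the text.
# The function name and the example IC50 values are hypothetical.
TOXICITY_RATIO = 1.7


def is_toxic(ic50_renilla: float, ic50_topflash: float,
             ratio: float = TOXICITY_RATIO) -> bool:
    """Return True when the Renilla IC50 is below ratio x the TopFlash IC50.

    Both IC50 values must be expressed in the same unit (e.g. ug/mL).
    """
    return ic50_renilla < ratio * ic50_topflash


# Illustrative values only, not measured data.
flag_close = is_toxic(ic50_renilla=6.0, ic50_topflash=4.0)   # 6.0 < 6.8
flag_far = is_toxic(ic50_renilla=40.0, ic50_topflash=4.0)    # 40.0 >= 6.8
print(flag_close, flag_far)
```

Under this reading, a sample whose Renilla signal drops at nearly the same concentration as its TopFlash signal is flagged, because the apparent Wnt inhibition cannot be separated from a general loss of cell viability.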
The results of the Wnt-regulatory bioactivity testing showed that out of the 1,600 NEs, 497 exhibited either Wnt-regulatory or cytotoxic activity. Among these active samples, 389 NEs were Wnt regulators: 148 Wnt potentiators (79 NEs were non-toxic and 69 NEs had a toxicity IC50 > 50 μg/mL) and 241 Wnt inhibitors (all non-toxic). The remaining 108 NEs were solely cytotoxic, with IC50 values ranging between 0.30 μg/mL and 50 μg/mL.
Out of these 241 inhibitory NEs, 132 NEs showed a Wnt-inhibition IC50 < 50 μg/mL, with 53 NEs having a Wnt-inhibition IC50 < 10 μg/mL. Focusing on samples capable of inhibiting the Wnt pathway is essential for discovering NPs to treat diseases linked to the dysregulation of this pathway. Therefore, in this study, only inhibitory NEs were further considered.
To further reduce the list, only NEs with an inhibition IC50 below 5 μg/mL were considered. This reduced the 241 inhibitory NEs to 30 NEs comprising 25 different species from 17 different botanical families, with Fabaceae and Euphorbiaceae being the most prominent, each contributing 5 NEs. Within this reduced set, there were 23 different genera, with the majority represented by only 1 NE, except for Elegia with three samples, and Pandanus, Euphorbia (Q146567), Ehretia (Q276756), and Baliospermum with two samples each.
The biological screening results drove the first selection step, resulting in a subset of 30 non-toxic NEs with an inhibitory IC50 ≤ 5 μg/mL (see Figure 1A). To further refine the selection, additional criteria based on UHPLC-HRMS2 metabolite profiling data were used, specifically the Inventa scores, which evaluate the potential structural novelty of the metabolites within NEs (Quiros-Guerrero et al., 2022). These scores were calculated for the entire 1,600 NEs set using the positive ionization (PI) mode UHPLC-HRMS2 data (Allard et al., 2023) (Figures 1B,C). They were therefore not limited to the 30 active extracts alone and thus better reflected the potential of each extract to hold new structures, since the reference sample set was much broader than the one restricted by the biological activity filter.
The rough ranking based on PS significantly reduces the number of samples to consider, which is important in large data sets. However, within the list of top-ranked samples, it is important to evaluate each parameter individually and, when possible, refine the literature search. This provides a better overview of the available data. The PS enables a rapid estimate of the likelihood that a sample contains structurally novel NPs. This should not be interpreted as an absolute ranking. In this study, the focus shifted from the entire collection of 1,600 NEs to a much smaller subset of 30 active NEs. Instead of selecting these NEs based on their overall PS assigned by Inventa, a more meticulous approach was adopted: each Inventa component for these NEs was individually assessed for a more precise selection.
The importance of the different novelty score components in the selection of NEs lies in their multifaceted evaluation of the structural richness and dissimilarities among samples. Inventa operates at two levels: first, by assessing individual features within each extract to gauge their specificity and annotation status, and second, by comparing the overall spectral space of each extract to measure dissimilarities in a sample set and potentially highlight NEs holding a pool of spectra correlated to a very specific metabolome. Subsequently, it integrates data from literature reports for the taxon, highlighting NEs potentially containing novel NPs (Quiros-Guerrero et al., 2022). Together, these scores offer a thorough evaluation of the potential for extracts to contain structurally novel NPs and help researchers pinpoint NEs with untapped metabolic potential, thereby facilitating the discovery of novel NPs with potential therapeutic applications.
First, to retain only NEs potentially holding specific constituents (a specific pool of MS2 spectra), only those with a Similarity Component (SC) value of '1' were further considered. This score highlights extracts containing metabolites whose MS2 spectra are significantly different from those of all 1,600 extracts in the data set. The SC employs the MEMO metric (Gaudry et al., 2022) to generate a matrix containing all MS2 information in the form of peaks and neutral losses (Huber et al., 2020; 2021) and automatic outlier detection machine learning algorithms to emphasize NEs that display substantial spectral dissimilarity (Quiros-Guerrero et al., 2022). Out of the 30 NEs, only six remained after this filter. Then, only NEs with a Literature Component (LC) value close to '1' were selected. The value of LC reflects a rough estimation of the extent of the prior phytochemical knowledge for a given taxon (according to LOTUS). The closer to '1', the fewer compounds have been reported at the species, genus, and family levels for a given sample. Inventa calculates the number of reported compounds for each species, genus, and family, and this data forms part of the final information provided for each sample. Based on this information, only NEs with fewer than 10 reported compounds at both the genus and species levels were further considered (Figure 1D).
This reduced the 6 NEs to only 4 NEs with few compounds reported, since the species Derris scandens (Q15488445, Fabaceae) and Iris lactea (Q6747387, Iridaceae Q155941) presented over 100 and 400 compounds reported at the genus level, respectively (see Wikidata Query results for genus Derris and Iris). Additionally, the remaining 4 NEs presented a Class Component (CC) of '1'. A CC value of '1' indicates that there are chemical classes proposed by CANOPUS (Dührkop et al., 2021) not yet reported for both the species and the genus (according to LOTUS). This suggests a high probability of discovering unreported NPs within these NEs (Quiros-Guerrero et al., 2022).
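The component-wise filtering described above (SC equal to 1, then fewer than 10 reported compounds at both species and genus level) can be sketched as follows. This is an illustration only and does not use Inventa's actual output format: the `ExtractScores` class, its field names, and the three example records are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExtractScores:
    """Hypothetical record holding the values used for the shortlist."""
    name: str               # illustrative sample label
    sc: float               # Similarity Component (1 = spectrally distinct)
    reported_species: int   # compounds reported for the species (per LOTUS)
    reported_genus: int     # compounds reported for the genus (per LOTUS)


def shortlist(extracts, max_reported=10):
    """Keep extracts with SC == 1 and fewer than max_reported compounds
    reported at both the species and the genus level."""
    return [
        e for e in extracts
        if e.sc == 1
        and e.reported_species < max_reported
        and e.reported_genus < max_reported
    ]


# Illustrative values only, not taken from the study data.
demo = [
    ExtractScores("extract_A", sc=1, reported_species=0, reported_genus=3),
    ExtractScores("extract_B", sc=1, reported_species=12, reported_genus=150),
    ExtractScores("extract_C", sc=0, reported_species=0, reported_genus=0),
]
selected_names = [e.name for e in shortlist(demo)]
print(selected_names)
```

In this toy set, `extract_B` is dropped because its genus is already well studied and `extract_C` because its spectra are not distinct from the rest of the collection, which mirrors the two successive filters applied to the 30 active NEs.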
To further refine the selection process, a thorough complementary literature search was done on this final set of three species, considering all the possible botanical synonyms according to the WFO Plant List. This search revealed no direct reports for the Brosimum solanifolium species. However, for one of its synonyms, Baliospermum montanum (Q3595677), the literature reports the presence of alkaloids, daphnanes, ingenanes, and phorbol esters (tiglianes Q27117179) like montanin (Q27107381) and baliospermin (Q27105913) (Seigler, 1994; Mali and Wadekar, 2008) with proven anticancer activity (Ogura et al., 1978). These metabolites were not accessible through LOTUS, so they were not considered in the first instance. Since active NPs have been reported for B. solanifolium, both extracts were not considered further. For A. villosa, several reports already described various bioactivities of ethanolic extracts from different plant parts (Srikrishna et al., 2008; Venkataraman et al., 2010; Nanna et al., 2021). Additionally, preliminary metabolite profiling indicated a high concentration of fatty acids. Therefore, this plant was not initially considered for further study.
In contrast, for Hymenocardia punctata, there were no existing reports on its chemical composition or bioactivity evaluation. This lack of information was consistent with its initial LC score based on LOTUS. Consequently, the ethyl acetate extract of H. punctata leaves was identified as the most promising candidate for the discovery of novel NPs. This plant is a flowering shrub from the Phyllanthaceae family, found in Myanmar, Thailand, Laos, Cambodia, the Malay Peninsula, and Sumatra (van Welzen, 2016).

Dereplication results overview for the ethyl acetate extract of Hymenocardia punctata leaves
According to Inventa's results, the annotation rate for the H. punctata extract was notably high (ca. 75%). To further explore the regions of the chromatogram that were annotated, the original (PI) UHPLC-HRMS2 chromatogram (from the 1,600 NEs collection metabolite profiling) was carefully compared with the Ionmap generated by Inventa (Figure 2). Upon examination of the SIRIUS and ISDB annotation results, as well as the outcomes of Ion Identity FBMN (II-FBMN, see Supplementary Figure S1) (Nothias et al., 2020; Schmid et al., 2021), it emerged clearly that the most intense features (Figure 2A) were not annotated (green dots on the Ionmap in Figure 2B).
For the subsequent phytochemical studies, the leaves of H. punctata were extracted on a larger scale with hexane, ethyl acetate (HPE), and methanol (HPM). The HPE and HPM extracts underwent UHPLC-HRMS2 metabolite profiling. Additionally, a Charged Aerosol Detector (CAD) was used to obtain semiquantitative information (Ligor et al., 2013; Gamache, 2017). After careful composition assessment, only HPE was further considered. As shown in Figure 3A, the features of interest were present and correlated with the major compounds in the extract according to the CAD chromatographic trace (Figure 3B).
The (PI) UHPLC-HRMS2 data from HPE was used to generate a new II-FBMN, which confirmed most of the information obtained for the original extract from the 1,600 NEs collection. The most intense peaks were clustered together, indicating their close structural relationship (II-FBMN, see Supplementary Figure S2). The chemical class and structural annotations obtained through GNPS, SIRIUS and CANOPUS (Dührkop et al., 2019; 2021) suggested that most compounds derived from the shikimate-phenylpropanoid and terpenoid pathways (refer to the Treemap overview in Supplementary Figure S3 and Supplementary Table S1).
Both the CAD and MS traces confirmed that the major constituents of HPE were not annotated. This, together with the novelty scores given by Inventa, confirmed that HPE is a promising extract for the search for new bioactive NPs.

HPLC-based bioactivity profiling of the ethyl acetate extract of Hymenocardia punctata leaves
An HPLC-based bioactivity profiling (Hamburger, 2019) was carried out to establish a relationship between the major unannotated chromatographic peaks (potentially new NPs) and the observed bioactivity of HPE. A small amount of HPE (ca. 10 mg) was fractionated by semi-preparative HPLC-UV under optimized chromatographic conditions. The column effluent was collected into a 96 deep-well plate and the Wnt-regulatory bioactivity of each dried micro-fraction was assessed. The HPLC-based bioactivity profile confirmed that the bioactivity was mainly related to the major unannotated peaks (see Supplementary Figure S4).

Isolation and de novo structural characterization of compounds from the Hymenocardia punctata leaves ethyl acetate extract
According to the dereplication results and the HPLC-based bioactivity profiling, the isolation efforts should focus on the peaks with retention times between 3.5 and 6 min (see Figure 3). An in-depth phytochemical study of the HPE extract was carried out to corroborate the presence of structurally novel NPs and evaluate their Wnt-regulatory activity. The chromatographic conditions used for the HPLC-based bioactivity profiling were adapted to the flash chromatography scale using a geometrical gradient transfer (Guillarme et al., 2008). Several of the fractions obtained contained mixtures of compounds that were further separated by high-resolution semi-preparative HPLC using dry-load injection (Queiroz et al., 2019).
The relative retention times of the compounds in the PI HRMS2 chromatogram are depicted in Figure 3 (numbers in parentheses). Starting from only 55 g of dried plant material, the 13 compounds were recovered in sufficient amounts to allow full structural characterization and assessment of biological activity.
The NMR data of 10 were closely related to those of 2 and the HRMS data confirmed that both molecules had the same MF

TABLE 2 1H NMR (600 MHz) and 13C NMR (151 MHz) data of compounds 9-13 in CD3OD. NA: signal not detected due to the keto-enol tautomerism in system C(5)-C(6)-C(7).
From a biosynthetic point of view, the new compounds isolated from H. punctata may have been formed by rearrangement of an 8-prenylflavane, as suggested for acutifolin A, tazettone A, and tazettone B (Figure 7). A similar rearrangement also occurs when catechin is subjected to basic conditions, forming catechinic acid (Sears et al., 1974; Ibrahim et al., 2007; Khiari et al., 2017). The extraction was carried out under neutral conditions, which supports the authenticity of the compounds as genuine NPs rather than extraction artifacts. The proposed biosynthetic pathway involves an electrophilic species that introduces the hydroxy group at position C-10 in acutifolin A and tazettones A and B. We hypothesize that, in our case, this electrophilic species is a dimethylallyl diphosphate (DMAPP) unit, resulting in the prenylation of position C-10 (Yazaki et al., 2009; Zhou et al., 2021). This hypothesis is plausible since we were able to isolate prenylated flavonoids (3, 5 and 7), which is an indication of their abundance in this plant.

Evaluation of the Wnt-regulatory activity of isolated compounds
The Wnt-regulatory activity of all isolated compounds from H. punctata was evaluated using two different cancer cell lines representing TNBC, known for its reliance on Wnt signaling: BT-20 and HCC1395 cells. Additionally, Human Embryonic Kidney HEK293 cells were used to represent non-malignant cells. The results for the active compounds are shown in Table 3 (and Supplementary Figure S7).
The results demonstrated that the prenylated flavone 7 exhibited the highest potency against the HEK293 cell line, with an IC50 value of 12 μM. Other compounds exhibited significantly higher selectivity against HCC1395 cancer cells. For example, the other prenylated flavone 3 demonstrated the highest selectivity, followed by one of the newly discovered bicyclic compounds, 4, with IC50 values of 13 μM and 14 μM, respectively. Notably, their potency against both BT-20 and HEK293 cells was nearly two-fold lower. In the case of the BT-20 cell line, 7 displayed the highest activity, followed by 4, with IC50 values of 16 μM and 26 μM, respectively. It is worth noting that, in general, the prenylated flavones (3 and 7) had lower IC50 values compared to the novel bicyclic compounds (1, 2 and 4) in at least two out of the three different cell models. Additionally, for all compounds, the specificity for Wnt inhibition was controlled by the absence of effects of the compounds on co-transfected constitutively expressed Renilla luciferase, serving as a reporter of cytotoxic or other negative effects of the compounds on cell well-being (Shaw et al., 2019a).
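The "nearly two-fold" difference in potency between cell lines can be checked with a simple IC50 ratio. The sketch below uses only the IC50 values quoted in the text; the dictionary layout and the helper name `fold_difference` are hypothetical, and the full data set is in Table 3.

```python
# IC50 values (in micromolar) quoted in the text; layout and names are illustrative.
ic50_um = {
    ("3", "HCC1395"): 13.0,
    ("4", "HCC1395"): 14.0,
    ("4", "BT-20"): 26.0,
    ("7", "BT-20"): 16.0,
    ("7", "HEK293"): 12.0,
}


def fold_difference(compound: str, reference_line: str, other_line: str) -> float:
    """Ratio of IC50 in other_line to IC50 in reference_line.

    A value above 1 means the compound is less potent in other_line.
    """
    return ic50_um[(compound, other_line)] / ic50_um[(compound, reference_line)]


fold_4 = fold_difference("4", reference_line="HCC1395", other_line="BT-20")
print(f"Compound 4, BT-20 vs HCC1395: {fold_4:.2f}-fold")
```

For compound 4 this gives a ratio of 26/14, consistent with the statement that its potency in BT-20 cells is nearly two-fold lower than in HCC1395 cells.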
Over the past two decades, numerous studies have revealed that flavonoids and structurally related compounds possess inhibitory effects on human diseases by targeting various cellular signaling components (Amado et al., 2011; 2014). Flavonoids have been recognized as inhibitors of the Wnt pathway, with many of them shown to inhibit it by downregulating the levels of β-catenin (Fuentes et al., 2015). For instance, apigenin (Q424567), the first flavonoid to be reported as a Wnt inhibitor, has been found to decrease β-catenin levels and promote cell cycle arrest in breast and colorectal cancer (Song et al., 2000; Landesman-Bollag et al., 2001; Amado et al., 2011). However, to date, there have been no reports on whether and how prenylation changes the potency of flavones towards the Wnt/β-catenin pathway. A few reports show the direct activity of prenylated isoflavones, such as 8-prenylgenistein, in promoting osteogenesis (Zhang et al., 2018; Qiu et al., 2020). Our results demonstrate that prenylated flavones, and similar analogs like the new bicyclic compounds, selectively disrupt the Wnt/β-catenin pathway, albeit with a potency only modestly improved over that reported for apigenin, from ca. 30 μM down to 10-20 μM. This is paralleled by other studies showing that structurally similar prenylated chalcones such as derricin and derricidin isolated from Lonchocarpus sericeus (Q15471182) reduce cell viability and induce cell cycle arrest in colorectal cancer HCT116 cells (Q28334584) through negative modulation of the Wnt/β-catenin pathway (Stevens, 2020).

General experimental procedures
The plant material was extracted in a Thermo Scientific Dionex ASE 350 Accelerated Solvent Extractor (Thermo Scientific™, Bremen, Germany). HPLC analyses were performed on an HP 1260 Infinity Agilent High-Performance Liquid Chromatography System equipped with a photodiode array detector (HPLC-PDA) (Agilent Technologies, Santa Clara, CA, United States) connected to an Evaporative Light Scattering Detector (ELSD, SEDERE, Orleans, France). The HPLC-based bioactivity profiling was performed on an HP 1260 Agilent Infinity II High-Performance Liquid Chromatography system equipped with a photodiode array detector (HPLC-PDA) and a sample collector (Agilent Technologies, Santa Clara, CA, United States). Flash chromatography was performed on a Sepacore instrument (Buchi, Flawil, Switzerland) composed of a pump module C-605, fraction collector model C-620, and UV spectrophotometer model C-640. The semi-preparative HPLC was performed using a Shimadzu system consisting of LC-20A module pumps, an SPD-20A UV/Vis detector, a 7725I Rheodyne® valve, and an FRC-10A fraction collector (Shimadzu, Kyoto, Japan). The system was controlled using the LabSolutions software from Shimadzu. NMR spectroscopic data were acquired on a Bruker Avance Neo 600 MHz spectrometer equipped with a QCI 5 mm Cryoprobe and a SampleJet automated sample changer (Bruker BioSpin, Rheinstetten, Germany). Chemical shifts are reported in parts per million (ppm, δ), and coupling constants are reported in Hertz (Hz, J). The residual CD3OD/CDCl3 signals (δH 3.31/7.26, δC 49.8/77.16) were used as internal standards for 1H and 13C, respectively. Comprehensive assignments were based on 2D-NMR spectroscopy techniques such as COSY, edited-HSQC, HMBC, and ROESY. Electronic Circular Dichroism (ECD) spectra were measured using a JASCO J-815 spectrometer (Loveland, CO, United States) in methanol, utilizing a 1 cm cell. The scan speed was set at 200 nm/min in continuous mode, scanning from 400 nm to 165 nm. Optical rotations were determined in methanol using a JASCO P-1030 polarimeter (Loveland, CO, United States) with a 1 mL, 10 cm tube.

Figure caption: Proposed biosynthesis for the bicyclo[3.3.1]non-3-ene-2,9-dione core of compounds 1, 2, 4, 6, and 8-13.

Extraction of plant material
PFL supplied the dried, ground leaves of H. punctata (Phyllanthaceae) (identifier V114372GP-01). This plant was part of the PFL collection registered with the European Commission on 1 April 2020, under accession number 03-FR-2020. This official registration acknowledges the collection's compliance with legal standards for access and management. It signifies that the PFL collection adheres to the criteria of the European Access and Benefit Sharing Regulation, which enforces the Nagoya Protocol's directives at the European level. These directives pertain to accessing genetic resources and the fair sharing of the benefits derived from their use (Nagoya Protocol, 2011).
A mass of 55 g of plant material was extracted in a 100 mL pressure-resistant stainless steel extraction cell using an ASE system. A cellulose filter (Dionex™ 100, Thermo Scientific™, Bremen, Germany) was added at the bottom and the top of the cell to prevent solid particles from reaching the internal system. The extraction was performed with a 60% rinse volume under pressure at 40 °C, with three cycles and a static time of 5 min per cycle. The sample was extracted successively with HPLC-grade hexane (Fisher Chemicals, Reinach, Switzerland), ethyl acetate (Fisher Chemicals, Reinach, Switzerland), and methanol (Fisher Chemicals, Reinach, Switzerland). The resulting extracts were dried at 35 °C under vacuum in a rotary evaporator to yield 0.32 g of hexane extract (HPH), 1.13 g of ethyl acetate extract (HPE), and 2.60 g of methanolic extract (HPM).
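The extraction yields implied by these masses can be computed directly. This is a minimal sketch using only the figures stated above (55 g of dried material and the three extract masses); the variable names are illustrative.

```python
# Masses stated in the text; variable names are illustrative.
dry_mass_g = 55.0
extract_masses_g = {"HPH": 0.32, "HPE": 1.13, "HPM": 2.60}

# Yield of each extract as a percentage of the dried plant material (w/w).
yields_pct = {
    name: 100.0 * mass / dry_mass_g for name, mass in extract_masses_g.items()
}
total_yield_pct = 100.0 * sum(extract_masses_g.values()) / dry_mass_g

for name, pct in yields_pct.items():
    print(f"{name}: {pct:.2f}% (w/w)")
print(f"Total: {total_yield_pct:.2f}% (w/w)")
```

The ethyl acetate extract (HPE), from which all compounds were isolated, therefore corresponds to roughly 2% of the dried leaf mass.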

Electronic circular dichroism (ECD) calculations
The absolute configuration assigned to all compounds was based on a comparison between the calculated and experimental ECD spectra. The calculations were based on the relative configuration determined through 2D NMR ROESY experiments. The structures were used to find the conformers through a random rotor search algorithm (number of conformers, 100) employing the MMFF94s force field in Avogadro v1.2.0 (Hanwell et al., 2012). The conformers were further optimized at the PM3 and B3LYP/6-31G(d,p) levels of theory in Gaussian 16 software (© 2015-2022, Gaussian Inc., Wallingford, CT, United States of America) with the SCRF model in methanol (Nugroho and Morita, 2014; Mándi and Kurtán, 2019). All optimized conformers were checked for imaginary frequencies. The conformers were subjected to ECD calculations using TD-DFT at the B3LYP/def2-SVP level with an SCRF model in methanol in Gaussian 16 software. The calculated ECD spectrum was generated in SpecDis 1.71 software (Berlin, Germany) based on a Boltzmann-weighted average. Supplementary Figure S6 shows the results. The ECD calculations were performed on the Baobab HPC cluster at the University of Geneva.
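The Boltzmann-weighted averaging step can be illustrated with a short sketch. It is not the procedure implemented in the software cited above: the function names are hypothetical, and the temperature (298.15 K), the energy unit (kcal/mol), and the assumption that all conformer spectra share the same wavelength grid are choices made here for illustration.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872041  # gas constant in kcal/(mol K)


def boltzmann_weights(energies_kcal, temperature_k=298.15):
    """Return normalized Boltzmann populations for conformer energies.

    energies_kcal: conformer energies in kcal/mol (absolute or relative).
    temperature_k: assumed temperature; 298.15 K is an illustrative default.
    """
    e_min = min(energies_kcal)
    factors = [
        math.exp(-(e - e_min) / (R_KCAL_PER_MOL_K * temperature_k))
        for e in energies_kcal
    ]
    total = sum(factors)
    return [f / total for f in factors]


def weighted_spectrum(spectra, weights):
    """Population-weighted average of per-conformer spectra.

    spectra: list of equally long intensity lists on a common wavelength grid.
    """
    n_points = len(spectra[0])
    return [
        sum(w * s[i] for w, s in zip(weights, spectra))
        for i in range(n_points)
    ]


# Illustrative input: two conformers of equal energy and toy intensities.
weights = boltzmann_weights([0.0, 0.0])
averaged = weighted_spectrum([[1.0, -1.0], [3.0, 1.0]], weights)
print(weights, averaged)
```

Low-energy conformers dominate the average, so a conformer lying a few kcal/mol above the global minimum contributes very little to the final calculated spectrum.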
The mass analyzer was calibrated using a mixture of caffeine, methionine-arginine-phenylalanine-alanine-acetate (MRFA), sodium dodecyl sulfate, sodium taurocholate, and Ultramark 1621 in an acetonitrile/methanol/water solution containing 1% formic acid by direct injection. Control of the instruments was done using Thermo Scientific Xcalibur software v. 4.6.67.17. Full scans were acquired at a resolution of 30,000 FWHM (at m/z 200) and MS2 scans at 15,000 FWHM in the range of 100-1,000 m/z, with one microscan, time (ms): 200; RF lens (%): 70; AGC target custom (Normalized AGC target (%): 300); maximum injection time (ms): 130; Microscans: 1; data type: profile; Use EASY-IC(TM): ON. Dynamic exclusion mode: Custom; Exclude after n times: 1; Exclusion duration (s): 5; Mass tolerance: ppm; low: 10, high: 10; Exclude isotopes: true. Apex detection: Desired Apex Window (%): 50. Isotope Exclusion: Assigned and unassigned with an exclusion window (m/z) for unassigned isotopes: 8. The Intensity threshold was set to 2.5E5, and a targeted mass exclusion list was used. The centroid data-dependent MS2 (dd-MS2) scan acquisition events were performed in discovery mode, triggered by Apex detection with a trigger detection (%) of 300 and a maximum injection time of 120 ms, performing one microscan. The top three abundant precursors (charge states 1 and 2) within an isolation window of 1.2 m/z were considered for MS/MS analysis. For precursor fragmentation in the HCD mode, normalized collision energies of 15, 30, and 45% were used. Data was recorded in profile mode (Use EASY-IC(TM): ON).
The chromatographic separation was performed on a Waters BEH C18 column (50 × 2.1 mm i.d., 1.7 μm, Waters, Milford, MA) using the following gradient (time (min), %B): 0.5, 8.2; 7, 99; 8, 99; 8.10, 8.2; 9.75, 8.2. The mobile phases were A) water with 0.1% formic acid and B) acetonitrile with 0.1% formic acid. The flow rate was set to 600 μL/min, the injection volume was 1 μL, and the column was kept at 40 °C. The PDA detector was used from 210 to 400 nm with a resolution of 1.2 nm. The CAD was kept at 40 °C, 5 bar N2, and power function 1, with a data collection rate of 20 Hz.

MZmine data preprocessing
The converted files were processed with MZmine3 (Schmid et al., 2023). For mass detection at the MS1 level, the noise level was set to 1.0E6. For MS2 detection, the noise level was set to 0.00. The ADAP chromatogram builder parameters were set as follows: Minimum consecutive scans, 5; Minimum intensity for consecutive scans, 1.0E6; Minimum absolute height, 1.0E6; and m/z tolerance, 0.0020 or 10.0 ppm. The Local minimum feature resolver algorithm was used for chromatogram deconvolution with the following parameters: Chromatographic threshold, 80; Minimum search range RT/Mobility (absolute), 0.10; Minimum relative height, 1%; Minimum absolute height, 1.0E6; Min ratio of peak top/edge, 1.0; Peak duration range, 0.01-1.0 min; Minimum scans, 5. Isotopes were detected using the 13C isotope filter with an m/z tolerance of 0.0050 or 8.0 ppm, a retention time tolerance of 0.05 min (absolute), a maximum charge of 2, and the most intense isotope as the representative. Each file was filtered to remove duplicates using the Duplicate peak filter with an m/z tolerance of 0.005 or 10 ppm and an RT tolerance of 0.10 min. The Feature list row filter was applied with the following parameters: Minimum features in an isotope pattern, 2; Retention time, 0.50-7.00 min; Feature duration range, 0.1-1.0 min; and only the ions with an associated MS2 spectrum were kept. The resulting filtered list was subjected to Ion Identity Networking (Schmid et al., 2021), starting with the metaCorrelate module (RT tolerance, 0.10 min; minimum height, 1.0E5; Intensity correlation threshold, 1.0E5; and Correlation Grouping with the default parameters), followed by the Ion identity networking (m/z tolerance, 8.0 ppm; check, one feature; Minimum height, 1.0E3; Ion identity library [maximum charge, 2; maximum molecules/cluster, 2; Adducts small networks without major ion, yes; Delete networks without monomer, yes]), Add ion identities networks (m/z tolerance, 8 ppm; Minimum height, 1.0E5; Annotation refinement (Minimum size, 1; Delete small networks without major ion, yes; Delete small networks: Link threshold, 4; Delete networks without monomer, yes)), and Check all ion identities by MS/MS (m/z tolerance (MS2), 10 ppm; min-height (in MS2), 1.0E3; Check for multimers, yes; Check neutral losses (MS1 -> MS2), yes) modules. The resulting aligned peak list was exported as a .mgf file for further analysis.
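Two of the settings above lend themselves to a short illustration: the paired tolerances (for example "0.0020 or 10.0 ppm") and the Feature list row filter. The sketch below is not MZmine code. It assumes that the larger of the absolute and ppm tolerances applies at a given m/z, and the dictionary keys (`rt`, `duration`, `isotopes`, `has_ms2`) are hypothetical names introduced only for this example.

```python
def mz_window(mz, abs_tol=0.0020, ppm_tol=10.0):
    """Return (low, high) m/z bounds, assuming the larger of the two tolerances applies."""
    tol = max(abs_tol, mz * ppm_tol * 1e-6)
    return mz - tol, mz + tol


def keep_row(row, rt_range=(0.50, 7.00), duration_range=(0.1, 1.0), min_isotopes=2):
    """Apply the row-filter criteria listed in the text to one feature (hypothetical dict)."""
    return (
        rt_range[0] <= row["rt"] <= rt_range[1]                       # retention time, min
        and duration_range[0] <= row["duration"] <= duration_range[1]  # feature duration, min
        and row["isotopes"] >= min_isotopes                            # features in isotope pattern
        and row["has_ms2"]                                             # must carry an MS2 spectrum
    )


# Hypothetical feature rows.
rows = [
    {"rt": 3.20, "duration": 0.20, "isotopes": 2, "has_ms2": True},
    {"rt": 7.50, "duration": 0.20, "isotopes": 3, "has_ms2": True},   # elutes too late
    {"rt": 4.10, "duration": 0.15, "isotopes": 2, "has_ms2": False},  # no MS2 spectrum
]
kept = [r for r in rows if keep_row(r)]
```

Under this assumption, the absolute tolerance dominates below m/z 200 and the 10 ppm term dominates above it.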

Spectral organization through molecular networking
A molecular network for HPE was constructed from the .mgf file exported from MZmine3, using the FBMN workflow on the GNPS platform (Wang et al., 2016; Nothias et al., 2020). The precursor ion mass tolerance was set to 0.02 Da with an MS2 fragment ion tolerance of 0.02 Da. A network was created in which edges were filtered to have a cosine score above 0.7 and more than six matched peaks. The spectra in the network were then searched against the GNPS spectral libraries. All matches between network spectra and library spectra were required to have a score above 0.6 and at least three matched peaks. Job link: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=c9e133b094404c0ab373c991b8924fb0.
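The edge and library-match thresholds can be sketched as follows. This is a simplified greedy cosine score, not the GNPS implementation, which uses a modified cosine that also accounts for the precursor mass difference. The square-root intensity scaling, the function names, and the example spectra are assumptions made for illustration.

```python
import math


def cosine_match(spec_a, spec_b, tol=0.02):
    """Greedy cosine between two peak lists [(mz, intensity), ...].

    Returns (score, matched_peaks). Intensities are square-root scaled
    (an assumption) and each peak is paired at most once.
    """
    a = [(mz, math.sqrt(i)) for mz, i in spec_a]
    b = [(mz, math.sqrt(i)) for mz, i in spec_b]
    norm_a = math.sqrt(sum(i * i for _, i in a))
    norm_b = math.sqrt(sum(i * i for _, i in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0, 0
    # All candidate pairs within the fragment tolerance, best products first.
    pairs = sorted(
        ((ia * ib, x, y)
         for x, (ma, ia) in enumerate(a)
         for y, (mb, ib) in enumerate(b)
         if abs(ma - mb) <= tol),
        reverse=True,
    )
    used_a, used_b = set(), set()
    dot, matched = 0.0, 0
    for product, x, y in pairs:
        if x in used_a or y in used_b:
            continue
        used_a.add(x)
        used_b.add(y)
        dot += product
        matched += 1
    return dot / (norm_a * norm_b), matched


def keep_edge(score, matched):
    """Network edge: cosine above 0.7 and more than six matched peaks."""
    return score > 0.7 and matched > 6


def keep_library_match(score, matched):
    """Library hit: score above 0.6 and at least three matched peaks."""
    return score > 0.6 and matched >= 3


# Hypothetical spectrum with seven well-separated fragments.
spectrum = [(50.0 + 10.0 * k, 100.0 + k) for k in range(7)]
score, matched = cosine_match(spectrum, spectrum)
```

A spectrum compared with itself scores 1.0 with all seven peaks matched, so it passes both filters; dropping one fragment leaves six matched peaks, which fails the edge filter but still satisfies the library criterion.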

Taxonomically informed metabolite annotation
The .mgf file exported from MZmine3 was also annotated by spectral matching against an in silico database to obtain putative annotations (Allard et al., 2016). The resulting annotations were subjected to taxonomically informed metabolite scoring (https://taxonomicallyinformedannotation.github.io/tima-r/, v2.8.2) and re-ranking (Rutz et al., 2019) based on the chemotaxonomic information available on LOTUS (Rutz et al., 2022). The in silico database used for this process includes the combined records of the Dictionary of Natural Products (DNP, v30.2) and the LOTUS Initiative outputs.

Wnt activity assessing assay
Cell lines and culture conditions
The BT-20, HCC1395, and HEK293 cell lines were cultured and assayed in Dulbecco's Modified Medium (Thermo Fisher Scientific) supplemented with 10% Fetal Calf Serum and 1% penicillin-streptomycin at 37 °C and 5% CO2.
Luciferase-based assay of the Wnt-dependent transcriptional activity
Purified Wnt3a was obtained from mouse L-cells stably transfected with Wnt3a, as previously described (Willert et al., 2003), with our own modifications (Xu et al., 2020). The three cell lines, stably transfected with the M50 Super 8×TopFlash plasmid, were seeded at a density of 6,000 cells/well in white tissue-culture-treated 384-well plates in 20 µL/well maintenance medium. The cells were also transfected with the pCMV-RL plasmid to allow for constitutive expression of Renilla luciferase, using XtremeGENE 9 reagent according to the manufacturer's protocol. At 24 h post-transfection, the medium was replaced with compounds at twice the indicated concentrations in 10 µL/well maintenance medium. Following a 1-h preincubation, Wnt3a was added to a final concentration of 500 ng/mL in an additional 10 µL/well volume. After a further 24 h of incubation, the medium was removed and measurements were taken using a Tecan Infinite M200 PRO plate reader equipped with a two-channel dispensing unit, by sequentially injecting 15 µL of each of the buffer solutions for activity measurements of firefly and Renilla luciferase, as described previously (Boudou et al., 2022). The resulting dose-response data for this and the MTT assay were fitted using GraphPad Prism 9 software (v9.4.0, Boston, United States) to obtain IC50 values. Since the assay is designed to not use positive control compounds, an extract or compound is considered 'toxic' if its IC50 value against Renilla luciferase is less than 1.7-fold the estimated TopFlash IC50, indicating that the decrease observed in the TopFlash response is affected by a strong toxic effect.
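The toxicity criterion and the extract selection threshold used in this study (non-toxic NEs with a Wnt inhibitory IC50 ≤ 5 μg/mL) reduce to two comparisons. The following minimal sketch shows them, with function names and example values that are hypothetical and not taken from the screening data.

```python
def is_toxic(ic50_renilla, ic50_topflash, min_ratio=1.7):
    """Flag a sample as 'toxic' when the Renilla IC50 is below 1.7-fold the TopFlash IC50."""
    if ic50_renilla <= 0 or ic50_topflash <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_renilla < min_ratio * ic50_topflash


def passes_selection(ic50_topflash, ic50_renilla, max_ic50=5.0):
    """Keep non-toxic extracts with a TopFlash IC50 at or below the cutoff (μg/mL)."""
    return ic50_topflash <= max_ic50 and not is_toxic(ic50_renilla, ic50_topflash)


# Hypothetical extracts: (TopFlash IC50, Renilla IC50) in μg/mL.
examples = {
    "extract_a": (2.0, 10.0),   # active, 5-fold window: selected
    "extract_b": (2.0, 3.0),    # active but 1.5-fold window: toxic
    "extract_c": (6.0, 100.0),  # non-toxic but above the 5 μg/mL cutoff
}
selected = [name for name, (top, ren) in examples.items() if passes_selection(top, ren)]
```

Both IC50 values must be expressed in the same unit for the ratio to be meaningful.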

Conclusion
The findings of this study demonstrate the potential of combining Inventa's structural novelty scores with bioactivity results for guiding the discovery of structurally novel bioactive NPs in collections of NEs. Through the evaluation of Wnt-regulation activity results and Inventa's scores, a collection of 1,600 NEs was narrowed down to four active NEs with a high potential of containing structurally novel NPs.
Inventa's multifaceted approach to evaluating structural richness and dissimilarities among extracts proved instrumental in this process. By assessing individual features within each extract and comparing the overall spectral space, Inventa identifies extracts with potentially unknown specialized metabolism. By integrating data from these two levels and incorporating literature reports for the taxon, Inventa highlights extracts with high novelty potential. The priority score, derived from its four components, provides a comprehensive evaluation of an NE's potential to contain novel NPs. While Inventa's novelty scores may not directly correlate with observed bioactivity, they play a crucial role in prioritizing NEs and narrowing the selection prior to in-depth phytochemical study. This approach mitigates the risk of prioritizing known NPs and underscores the importance of employing comprehensive bioinformatics approaches in sample selection. Thus, Inventa helps researchers identify NEs harboring structurally novel NPs with potential therapeutic applications.
The subsequent phytochemical investigation of H. punctata leaves led to the isolation of ten novel bicyclo[3.3.1]non-3-ene-2,9-diones and three known prenylated flavones. Some of the newly isolated compounds exhibited appreciable IC50 values and showed no apparent cytotoxicity in three different cell lines, indicating their potential as Wnt inhibitory compounds. This work illustrates the utility of Inventa in assisting the efficient selection of active NEs from large sample collections for the identification of novel and bioactive NPs.

FIGURE 1 Overview of the general strategy for the selection of promising NEs for the discovery of structurally novel bioactive NPs in collections of NEs. (A) Bioactivity-driven selection: The NEs collection is reduced according to the bioactivity screening results. In this study, only non-toxic NEs with a Wnt inhibitory IC50 ≤ 5 μg/mL were considered. (B) General overview of the UHPLC-HRMS2 profiling and annotation workflow used for the characterization of the 1,600 NEs collection by Allard et al. (2023). (C) The UHPLC-HRMS2 and annotation results for the entire 1,600 NEs collection were fed to Inventa to calculate the structural novelty scores. (D) Novelty-driven selection: Each component given by Inventa for the reduced set of NEs was individually assessed for a more precise selection. SC: Similarity Component; CC: Class Component; rsp: reported compounds in species; rsg: reported compounds in genus.
FIGURE 2 (A) Original UHPLC-HRMS2 chromatogram of Hymenocardia punctata leaves ethyl acetate extract from the 1,600 NEs collection. (B) Inventa's ion map for the original UHPLC-HRMS2 chromatogram of Hymenocardia punctata leaves ethyl acetate extract. Each dot represents one feature, with its size proportional to the intensity. The color shows the annotation status: green, unannotated; yellow, annotated.
H 5.80 (CH-6), two quaternary carbons (δC 61.7 (C-10) and 67.2 (C-8)), and three carbonyl carbons (δC 181.2 (C-5), 201.3 (C-7), and 207.7 (C-9)) matched the number of carbons (31) in the molecular formula. The HMBC correlations from CH2-11 to CH-2, C-7, C-8, and C-9, from CH2-16 to CH2-4, C-5, C-9, and C-10, from CH-6 to C-5, C-8, and C-10, from CH-2 to C-7 and C-9, and from CH2-4 to C-5 allowed the 3-prenyl-4-methoxy-5-hydroxyphenyl group, the prenyl group, and the 2,3-dihydroxy-3-methylbutyl group to be positioned on a bicyclic [3.3.1] core as shown in Figure 4. The carbons C-8 and C-10 are bridged by a ketone (C-9) with a typical chemical shift (δC 207.7). These carbons, together with CH-2, CH2-3, and CH2-4, form the first six-membered ring. The second ring shares carbons C-8, C-9, and C-10 and contains a keto-enol system spanning C-5, C-6, and C-7. According to the molecular formula, an additional ring must be formed either between the tertiary alcohol at C-18 and C-5, giving a tetrahydropyran ring, or between the secondary alcohol at C-17 and C-5, giving a tetrahydrofuran ring. No HMBC correlation between CH-17 and C-5 was observed, so HMBC alone could not settle this question. Measurements in CDCl3 were carried out to look for these correlations while avoiding solvent exchange, but they did not show any correlations either. However, comparison of the CH-17 and C-18 chemical shifts (δC 92.8 and δC 71.1, respectively) with literature values for oblongifolin R (which has a tetrahydrofuran ring with CH at δC 94.8 and C at δC 71.1) and oblongifolin S (which has a tetrahydropyran ring with CH at δC 69.6 and C at δC 86.9) (Zhang et al., 2014) led to the conclusion that a five-membered ring is present. The two-dimensional structure of compound 1 was established as shown in Figure 4. The relative configuration was established through the ROESY correlations observed in CD3OD and CDCl3. The spectra recorded in CDCl3 helped identify the pseudo-equatorial and axial positions of CH2-
FIGURE 3 (A) PI UHPLC-HRMS2 chromatogram of de novo Hymenocardia punctata leaves ethyl acetate extract (HPE) between 3.5 min and 6 min. (B) Charged Aerosol Detector semiquantitative trace of HPE. The chromatographic peaks colored in green correspond to unannotated features after the dereplication process. The mass-to-charge ratios (m/z) of the most intense ions in the PI HRMS2 trace are shown. The m/z values colored in green (unannotated features) and yellow (annotated features) were also found in the original HRMS2 profiling of Hymenocardia punctata (in the 1,600 NEs collection, Figure 2A).

TABLE 3
Wnt-inhibition assay results, IC50 (µM), for the isolated compounds 1-4 and 7.