Functional Characterization of Genes Coding for Novel β-D-Glucosidases Involved in the Initial Step of Secoiridoid Glucosides Catabolism in Centaurium erythraea Rafn

Secoiridoid glucosides (SGs) are monoterpenoids derived from the iridoid cyclopentane-C-pyran skeleton with β-D glucose linked at C1 position. Coordinated metabolic processes, such as biosynthesis and catabolism of SGs, ensure constitutive presence of these bitter tasting compounds in plant tissues, which plays a decisive role in the defense against pathogens and herbivores. These compounds are susceptible to hydrolysis mediated by enzymes β-glucosidases, and the resulting aglycones are subsequently directed toward different metabolic pathways in plants. Function of two β-D-glucosidases (named CeBGlu1 and CeBGlu2) from centaury (Centaurium erythraea Rafn; fam. Gentianaceae), belonging to the glycoside hydrolase 1 (GH1) family, was confirmed using in vitro assays with recombinant proteins, following their heterologous expression in E. coli and His-tag affinity purification. Although they show slightly differential substrate preference, both isoforms display high specificity toward SGs and the organ-specific distribution of transcripts was positively correlated with the content of SGs in diploid and tetraploid C. erythraea plants. Transient overexpression of CeBGlu1 and CeBGlu2 in C. erythraea leaves induced changes in metabolite profiles. The effectiveness of transgene overexpression has been altered by plant ploidy. UHPLC/DAD/(±)HESI − MS2 profiling of leaves of diploid and tetraploid C. erythraea genotypes revealed that the amounts of major SGs; sweroside, swertiamarin, and gentiopicrin was decreased in agroinfiltrated leaves, especially when CeBGlu1 and CeBGlu2 were co-expressed with transgene silencing suppressor p19. The work demonstrates that in planta metabolic engineering adopting transient overexpression of CeBGlu1 and CeBGlu2 is a suitable tool for the modulation of SGs content and glucosides/aglycones ratio, which might have substantial effects on overall phytochemistry of C. erythraea.

The content of SGs in C. erythraea tissues is a net result of the two dynamic metabolic processes, their biosynthesis and their catabolism ( Figure 1A). Catabolism of SGs starts with their deglycosylation catalyzed by β-glucosidases. Cleavage of the glucose (Glu) residue from SW, SWM, and GP results in the formation of aglycones (Figures 1A,B), which are further metabolized in plant tissues through isomerization, reduction and oxidation reactions. Gentiopicral and erythrocentaurin are the common products of SWM and GP hydrolysis (Ishiguro et al., 1983;Zeng et al., 2013;Božunović et al., 2018), while SW is metabolized into naucledal and epinaucledal (Purdy and McLean, 1977;El-Sedawy et al., 1990) (Figure 1C). In order to better understand the catabolism of SGs in C. erythraea, we focused our attention on β-glucosidases (β-D-glucoside glucohydrolases, E.C. 3.2.1.21), categorized into the glycoside hydrolase family 1 (GH1), the largest GH family in plants. GH1 β-glucosidases are enzymes that hydrolyze glycosidic bonds to release non-reducing terminal glucosyl residues from various compounds -benzoxazinoid, cyanogenic, iridoid and phenolic glucosides -as well as glucosinolates (Morant et al., 2008a). Genes coding for β-glucosidase (β-Glu) enzymes have been previously characterized in some members of the order Gentianales, including C. roseus and Rauvolfia serpentina (Geerlings et al., 2000;Warzecha et al., 2000;Barleben et al., 2007). Isolation and heterologous expression of C. roseus β-glucosidase with high specificity toward strictosidine which directly derives from SEC, provided valuable information on the function of this β-glucosidase involved in metabolic pathway of indole alkaloids (Geerlings et al., 2000). The aim of the present study was to isolate and functionally characterize C. erythraea β-glucosidase with high substrate specificity toward SGs using in vitro and in planta assays. Our presumption was that the action of C. erythraea β-glucosidase with high specificity toward SGs is essential for the catabolism of these compounds, and is indirectly related to defense against herbivores and pathogens. Secoiridoids in the form of glucosides possess remarkable antimicrobial effects , and their hydrolysis/deglycosylation mediated by β-D-glucosidases release biologically active aglycones which provide more efficient antioxidant protection . Moreover, during hydrolysis of SGs, glucose is also released, which may serve as an alternative source of energy under stress conditions.

Plant Material
Plants used in experiments in vitro were obtained as previously described by Filipović et al. (2019). Briefly, mother stock shoot cultures of C. erythraea diploids and tetraploids originating from seeds collected at the locality Tjentište (Sutjeska National Park, Bosnia and Herzegovina), in 2007 and 2016, respectively, were maintained in vitro on half-strength MS medium ( 1 / 2 MS, Murashige and Skoog, 1962) in 370 ml glass jars. Root cultures, established from root segments of 3-month-old diploid and tetraploid plants, were grown on solid 1/2 MS medium in Petri dishes. Spontaneously regenerated shoots, formed on root explants, were further transferred on fresh 1/2 MS medium in 370 ml glass jars for rooting. After 10 weeks of culturing, shoots and roots of the obtained regenerated plants were harvested and used in experiments to determine the expression patterns of CeBGlu candidates and their activities. All in vitro cultures were maintained at a temperature of 25 ± 2 • C under fluorescent light of 47 µmol s −1 m 2 and a 16 h/8 h light/dark photoperiod.
Agroinfiltration experiments were performed under greenhouse conditions. C. erythraea seedlings were established in greenhouse, in pots filled with Floradur B seed substrate for multiplication (Floragard Vertriebs-GmbH, Oldenburg, Germany). Two-month-old seedlings were individually transferred into pots with Floragard growth medium (Floragard Vertriebs-GmbH, Oldenburg, Germany), and grown under greenhouse conditions at 50-85% humidity. Five-month-old diploid and tetraploid plants, displaying rosette phenotype, were used in agroinfiltration experiments. Seeds of tetraploid C. erythraea, collected in 2006, were obtained from Ecological-Botanical Garden of the University of Bayreuth (Germany). Diploid individuals were of the same origin as those used in in vitro experiments. All the C. erythraea accessions used in the present study are deposited within the seed collection at the

RNA Isolation and cDNA Synthesis
Total RNA from approximately 150 mg of C. erythraea shoots and roots was isolated using modified CTAB method (Gasic et al., 2004). Isolated RNA was quantified using Qubit 3.0 Fluorometer (Thermo Fisher Scientific, United States), and its integrity was confirmed using gel electrophoresis. Prior to RT-PCR, isolated RNA was treated with DNAse I (Thermo Fisher Scientific, United States) to deplete contaminating genomic DNA. First strand cDNA was synthesized from 300 µg RNA using the RevertAid Premium First Strand cDNA Synthesis Kit (Thermo Fisher Scientific, United States).

Isolation and Cloning of β-Glucosidase Candidate Genes
Centaurium erythraea leaf transcriptome databases (Malkov and Simonović, 2011;Ćuković et al., 2020) was searched for homologs of Catharanthus roseus strictosidine β-glucosidase (Geerlings et al., 2000). After selecting the most promising candidate, gene specific primers were designed using Primer3Plus software 1 in order to amplify the full length of the CeBGlu coding sequence. A list of primers used for cloning and expression analysis is given in Supplementary Table 1.
The full length amplification of the candidate gene was performed using Q5 Hot Start High-Fidelity DNA Polymerase (New England Biolabs, United States) and gene specific forward and reverse primers (Supplementary Table 1) following cycling conditions: one cycle of 98 • C for 3 min, 35 cycles of 98 • C for 30 s, 62 • C for 2 min, and 72 • C for 5 min followed by a final extension of 72 • C for 10 min in a thermal cycler (Eppendorf, Austria). The amplicon was separated by 1% agarose gel electrophoresis and then purified by GeneJET Gel Extraction Kit (Thermo Fisher Scientific, United States) according to manufacturer's instructions. The purified amplicon was cloned into pTZ57R/T cloning vector using T/A PCR product cloning kit (Thermo Fisher Scientific, United States). In summary, 20 µl of purified PCR product was mixed with 3 µl of pTZ57R plasmid vector (600 ng), 6 µl of 5× ligation buffer, 1 µl of T4 DNA Ligase (5 U µl −1 ), in 1.5 ml microtube. A total of 30 µl reaction mixture was incubated for 1 h at room temperature to let the ligation reaction take place.
For transformation, 5 µl of the ligation product was added into 100 µl of Mach1 E. coli competent cells, incubated on ice for 20 min and then heat-shocked for 45 s in water bath at 42 • C. The mixtures were immediately placed on ice and subsequently cultivated with 250 µl liquid Luria-Bertani (LB) media followed by 1 h incubation at 37 • C. The transformed cells were cultivated on LB agar plate containing ampicillin (100 µg ml −1 ), which was followed by overnight incubation at 37 • C. Positive transformants were verified by colony PCR with CeBGlu gene-specific primers (Supplementary Table 1). Cells harboring the recombinant plasmid were cultured in ampicillin containing liquid LB medium overnight at 37 • C in a shaker incubator at 220 rpm. GeneJET Plasmid Miniprep Kit (Thermo Fisher Scientific, United States) was used to purify plasmids from 2 ml of Mach1 E. coli overnight culture according to the manufacturer's instructions. Recombinant plasmids were confirmed by restriction digestion using XhoI and KpnI restriction enzymes (Thermo Fisher Scientific, United States). Subsequent sequencing has confirmed the isolation of two highly similar gene variants named CeBGlu1 and CeBGlu2.

Organ-Specific Profiling of CeBGlu Expression
For qRT-PCR, due to the very high similarity between CeBGlu1 and CeBGlu2 sequences, it was possible to design only one pair of primers common to both isoforms (Supplementary Table 1). For CeBGlu expression analysis, SYBR Green I (Maxima SYBR Green/ROX Kit, Thermo Scientific, United States) was used. Amplification was conducted in Light cycler QuantStudio 3 (Thermo Fisher Scientific, United States), according to the manufacturer's instructions. General thermocycler conditions were 95 • C for 10 min; 40 cycles of 95 • C for 15 s; 60 • C for 30 s; 72 • C for 30 s and final extension at 72 • C for 10 min. Expression levels of targeted genes were calculated according to the 2 − Ct method (Livak and Schmittgen, 2001). EF1 gene expression was used as endogenous control to normalize all data (primer sequences are presented within Supplementary Table 1). Presented results are obtained from three biological replicates.

Organ-Specific Profiling of Total Hydrolytic Activity Against Secoiridoid Glycosides
Shoots and roots of diploid and tetraploid C. erythraea plants were ground to a fine powder using liquid nitrogen, and proteins were isolated in 100 mM potassium-phosphate buffer (pH 6.5) supplemented with phenylmethylsulfonyl fluoride (PMSF) and 5 mM ascorbate. Protein content was determined according to Bradford (1976) using bovine serum albumin as a standard.
Activity of protein extracts to reduce the content of secoiridoid glycosides was evaluated using standards of SW, SWM and GP as substrates. The reduction in quantity of the mentioned substrates is the net result of combined enzymatic activities present in the extract including glucosyl hydrolases which produce the respective aglycones and unknown biosynthetic enzymes involved in transformations of secoiridoids ( Figure 1A).
All reference compounds were diluted in dH2O (w:v = 1:1) and kept as a stock solutions. Aliquots of 30 µl were vacuum evaporated and diluted in 300 µl 50 mM citrate phosphate buffer pH 5.5 containing 10 µg of protein extracts. Reactions were incubated for 24 h at 37 • C. Subsequently, 700 µl of methanol was added, and samples were centrifuged at 10,000 g for 10 min. Supernatants were filtered through 0.2 µm cellulose filters and subsequently analyzed for the content of SW, SWM and GP.

Phylogenetic Analysis
For phylogenetic tree creation, multiple sequence alignments were generated using the Muscle algorithm. A neighborjoining tree was constructed using MEGA X, Version 10.2.6 (Kumar et al., 2018). Cluster stability was estimated with 1,000 bootstrap replicates. The evolutionary distances were computed using the Poisson correction method. All ambiguous positions were removed for each sequence pair (pairwise deletion option). There were a total of 766 positions in the final dataset. The data were converted into Newick format and transferred to Dendroscope (Huson and Scornavacca, 2012) for creating the final phylogenetic tree. Refer to Supplementary Table 2 for a list of plant protein sequences used for phylogenetic analysis and their corresponding accession numbers, including those of CeBGlu1 and CeBGlu2 functionally characterized within the present study.

Sequence Analysis, 3D Modeling and Ligand Docking
Multiple sequence alignments were generated with the DECIPHER R package (Wright, 2015) using 5 iterations and 5 refinements. Subcellular localization based on primary sequence was estimated using Light attention (Stärk et al., 2021).
Tertiary protein structure was estimated using AlphaFold2.1 (Jumper et al., 2021) queried via UCSF ChimeraX 1.4rc (Pettersen et al., 2021). The generated structures were assessed using MolProbity 4.4 (Williams et al., 2018) via SWISS-MODEL Workspace (Waterhouse et al., 2018). Obtained models were compared to each other and with experimental PDB structures using pairwise structure alignment 2 and the jFATCAT rigid model (Li et al., 2020), as well as using ChimeraX after superposition using the matchmaker command with default parameters (Pettersen et al., 2021). Protonation state for the predicted structures was estimated using PlayMolecule ProteinPrepare (Martínez-Rosell et al., 2017). Ligand docking was performed using AlphaFold predicted protein structure of CeBGlu1. Ligand preparation was performed starting from tridimensional sdf files obtained from PubChem for SWM (PubChem id: 442435), SW (PubChem id: 161036) and GP (PubChem id: 88708). Conformers for these compounds were generated using the ETKDG version 3 method with imposing small ring torsion angle preferences (Wang et al., 2020) using RDKit 2021.09.5 (Landrum et al., 2022). A 100 conformers were generated per compound and subsequently filtered using an RMSD threshold of 0.5 Å so that only those conformations that are at least 0.5 Å RMSD away from all retained are kept. The geometry of the resulting conformers was optimized using Merck molecular force field (MMFF94s) as implemented in RDKit 2021.09.5 (Landrum et al., 2022). The resulting ligand conformations were docked into the predicted enzyme structure using AutoDock Vina 1.2.3 (Eberhardt et al., 2021). The docking box was defined based on the coordinates of the malo secologanin ligand in the experimental structure of the raucaffricine β-D glucosidase from Rauvolfia serpentina (PDB: 3U5Y, Xia et al., 2012) after superposition of the structure to the AlphaFold model of CeBGlu1. The docking box center was defined as x = −1.964, y = 0.374, z = −6.148, while the size of the box was 18, 18.75, and 21 Å in each direction, respectively. The docking was performed using autogrid4 (Morris et al., 2009) precalculated affinity maps, with autodock4 scoring function. Flexible docking was performed where the Glu476 side chain was allowed to change conformations. The exhaustiveness of the algorithm was set to 64. The highest estimated affinity poses, as reported by AutoDock Vina, were inspected and compared to the position of secologanin in 3U5Y, and the best pose based on these two criteria for each ligand was analyzed using UCSF ChimeraX 1.4rc (Pettersen et al., 2021) and protein-ligand interaction profiler (PLIP, Adasme et al., 2021).

Accession Numbers
Sequence data from this article can be found in the NCBI database under the following accession numbers: ON060690 (CeBGlu1); ON060691 (CeBGlu2).

Heterologous Expression and Purification of 6His-Tagged Proteins
For heterologous expression, CeBGlu1 and CeBGlu2 were cloned into bacterial expression vector pRSETA, yielding the final plasmids pRSETA:CeBGlu1 and pRSETA:CeBGlu2 (Figure 2A). Amplification of the gene sequences by PCR was performed using Phusion high-fidelity DNA polymerase (Thermo Fisher Scientific, United States) and a pair of CeBGlu1/CeBGlu2 gene specific primers with added XhoI/KpnI restriction sites (Supplementary Table 1). The constructs were heat-shock transformed, as described above, into Mach1 E. coli competent cells which were further cultured on LB medium containing ampicillin. Polymerase chain reaction using gene specific primers, XhoI/KpnI double restriction digestion and sequencing were applied to verify the constructs containing transgenes of interest. pRSETA vector contains a N-terminal polyhistidine (6×His) tag, which can be used as an affinity ligand for protein purification. The pRSETA:CeBGlu1 and pRSETA:CeBGlu2 recombinant constructs were used for heat-shock transformation of BL21 (DE3) CodonPlus-RIL cells (Stratagene, United States). Cultures were grown overnight in 20 ml 2 × Yeast Extract Tryptone (2 × YT) media broth supplemented with ampicillin (100 µg ml −1 ), 50 µg ml −1 kanamycin and 17 µg ml −1 chloramphenicol at 37 • C with shaking at 220 rpm. On the next day 200 ml culture was initiated by inoculating above mentioned overnight cultures, with the initial OD600 set to 0.1. Cultures were incubated at 37 • C with shaking at 230 rpm. After reaching an OD600 of 0.4-0.5, protein expression was induced using 0.1 mM isopropyl-β-D-thiogalactopyranoside (IPTG). Subsequently, incubation continued for 4 h at 21 • C. Following centrifugation for 10 min at 5,000 g and removal of supernatant, the cell pellets were harvested. All of the purification steps were performed at 4 • C. For each purification batch, Ni-NTA agarose (Qiagen, United States) beads were equilibrated with equal volumes of lysis buffer (50 mM NaH 2 PO 4 , 300 mM NaCl, 10 mM imidazole pH 8.0). Bacterial cells were collected by centrifugation and re-suspended in 500 µl of lysis buffer. Following lysozyme (Sigma Aldrich, Germany) addition (1 mg ml −1 final concentration), samples were incubated for 30 min on ice. After 10 min of centrifugation (12,000 g at 4 • C) supernatants were loaded into columns containing Ni-NTA resin and incubated at 4 • C using FALC F205 rotary tube mixer (Falc Instruments, Treviglio, Italy). After 1 h, samples were spun at 12,000 g for 5 min at 4 • C. Supernatant was discarded and resin was rinsed for 3 times using washing buffer (50 mM NaH 2 PO 4 , 300 mM NaCl, 50 mM imidazole, pH 8.0). The recombinant protein was eluted with increased imidazole concentration in elution buffer (50 mM NaH 2 PO 4 , 300 mM NaCl, 250 mM imidazole, pH 10.2).
Eluted fractions were analyzed using 5-10% SDS-PAGE using Mini-PROTEAN II Electrophoresis Cell (Bio-Rad, United States) followed by Coomassie blue staining and immunoblotting. Proteins were electro-transferred to PVDF membrane (Amersham Biosciences, Germany), using Mini Trans-Blot Module electric transfer system (Bio-Rad, United States). Transfer on the membrane was performed at 4 • C for 90 min at a constant voltage of 60 V. Membrane was blocked with 10% (w/v) non-fat dry milk (NFDM; Nestle, United States) in phosphate-buffered saline containing 0.05% Tween-20 overnight at 4 • C. The presence of 6×His labeled proteins was confirmed using His-probe antibody in 1:100 dilution (Acc. No. sc-53073, Santa Cruz Biotechnology, United States), which was followed by incubation with goat anti-mouse IgG-HPR (1:5,000, Agrisera Antibodies, Sweden). The bound antibodies were visualized by enhanced chemiluminescence (ECL). The membrane was incubated for 5 min at room temperature using ECL solution containing 100 mM Tris-HCl pH 8.5, 0.2 mM p-coumaric acid, 1.25 mM 3-aminophthalhydrazide and 1.7 µl of 30% H 2 O 2 . Detection was performed by exposure to radiographic film (Medical X-ray Green/MXG Film, Carestream Health, United States) for 10 min.

Confirmation of CeBGlu1 or CeBGlu2
Hydrolytic Activity Using in vitro Enzymatic Assays Beta-glucosidase activity of CeBGlu1 and CeBGlu2 was determined in a reaction with 20 µg of recombinant proteins and 10 mM 4-nitrophenyl β-D-glucopyranoside (pNPG) as a substrate, in 50 mM citric phosphate buffer pH 5.5 at 40 • C for 48 h. The reaction was stopped by the addition of ice cold 1 M Na 2 CO 3 (1:1 = v:v) and colorimetric detection of p-nitrophenol, the product of pNPG hydrolysis, was spectrophotometrically measured at 410 nm (Esen and Blanchard, 2000).
Aerial parts of C. erythraea (100 mg) were ground into homogeneous powder using liquid nitrogen and extracted with 1 ml of 99.8% methanol (AppliChem GmbH, Germany) by vortexing for 30 s and subsequent sonication for 10 min using an ultrasonic bath (RK100, Bandelin, Berlin, Germany). After centrifugation at 10,000 g for 10 min supernatants were filtered using 0.2 µm syringe filters (Agilent Technologies, Santa Clara, CA, United States). C. erythraea methanol extract (10 µl) was evaporated in a Vacuum Rotor Evaporator (Eppendorf Concentrator 5301, Germany) at room temperature. Subsequently, dried extracts were dissolved in 300 µl 50 mM citrate phosphate buffer (pH = 5.5) containing 20 µg of purified recombinant enzyme (CeBGlu1 or CeBGlu2) and incubated for 48 h at 37 • C. Following incubation, 700 µl of methanol was added to stop the reaction, and reaction mixtures were centrifuged for 10 min at 10,000 g. Supernatants were filtered through 0.2 µm cellulose filters (Agilent Technologies, United States) and subsequently subjected to UHPLC/DAD/(±)HESI−MS 2 quantification of SGs. Control samples were prepared by replacing recombinant protein with the elution buffer used for protein purification.
Hydrolytic activity of CeBGlu1 and CeBGlu2 was further tested using standards of epideoxyloganic acid, loganin, secologanin, sweroside, swertiamarin, gentiopicrin, apigetrin, isoquercitrin, and vitexin as substrates. All reference compounds were diluted in dH 2 O (w:v = 1:1) and kept as a stock solutions. Aliquots of 30 µl were dried in vacuum evaporator and diluted in 300 µl 50 mM citrate phosphate buffer pH = 5.5. Reaction mixture containing 30 µl of previously diluted standard compound and 20 µg of purified recombinant enzyme in a final volume of 300 µl 50 mM citrate phosphate buffer pH = 5.5 was incubated for 48 h at 37 • C. Subsequently, 700 µl of methanol was added, and samples were centrifuged at 10,000 g for 10 min. Control samples contained elution buffer instead of purified recombinant protein, and the final concentration of standards was 3 µg ml −1 . Supernatants were filtered through 0.2 µm cellulose filters and injected into UHPLC/DAD/(±)HESI−MS 2 instrument.

Construction of CeBGlu Expression Plasmids and Agrobacterium-Mediated Transformation
CeBGlu1 and CeBGlu2 sequences were PCR-amplified from pTZ57R/T plasmids with the addition of PacI/AvrII restriction sites. Amplicons were ligated to transient expression vector pJL-TRBO (Lindbo, 2007), and positive colonies were identified by colony PCR using vector-and gene-specific primers. All PCR reactions were performed with Phusion high-fidelity DNA polymerase (Thermo Fisher Scientific, United States) and a pair of CeBGlu1/CeBGlu2 gene specific primers (Supplementary Table 1). After heat-shock transformation with pJL-TRBO:CeBGlu1 and pJL-TRBO:CeBGlu2 (Figure 2B), Mach1 E. coli competent cells were cultured overnight in LB liquid medium containing kanamycin (50 µg ml −1 ) at 37 • C in a shaker incubator at 230 rpm (IKA KS 4000 ic control, China). GeneJET Plasmid Miniprep Kit (Thermo Fisher Scientific, United States) was used to purify plasmids from 5 ml of Mach1 E. coli cultures according to the manufacturer's instructions. After plasmid isolation, each CeBGlu sequence was confirmed by sequencing.
The recombinant pJL-TRBO:CeBGlu1 and pJL-TRBO:CeBGlu2 plasmids extracted from Mach1 cells were transferred to A. tumefaciens GV3101 strain by electroporation using "Gene Pulser" (Bio-Rad, United States) and subsequently cultured in LB plates containing kanamycin (50 µg ml −1 ), gentamicine (25 µg ml −1 ) and rifampicin (10 µg ml −1 ). A single colony of recombinant bacteria was inoculated into 5 ml liquid LB media containing antibiotics and incubated overnight at 28 • C with shaking at 230 rpm. On the next day, 1 ml of bacterial suspension was sub-cultivated in 10 ml of liquid LB media containing 10 mM MES-KOH (pH 5.5) and 20 µM acetosyringone. Cultures were incubated overnight at 28 • C with shaking at 230 rpm. The agrobacterial cells were harvested by centrifugation for 20 min at 3,000 g, and the pellet was resuspended in the infiltration medium (10 mM MES, 10 mM MgCl 2 and 100 µM acetosyringone) to a final OD600 of 1.0. After an incubation at room temperature for 4 h, cultures were introduced into abaxial surface of leaves of five-month-old C. erythraea plantlets using a blunt tipped plastic syringe and applying gentle pressure. Additionally, to prevent the silencing of transgene expression in C. erythraea, pJL-TRBO:CeBGlu1 and pJL-TRBO:CeBGlu2 were co-infiltrated in a 1:1 ratio with pBIN vector expressing the p19 silencing-suppressor gene from TBSV (pBIN:p19). After agroinfiltration, plants continued to grow for 5 days in the greenhouse. Leaves were harvested from diploid and tetraploid C. erythraea plants and immediately frozen at liquid nitrogen. Samples were stored at −80 • C until further use.

Plant Methanol Extract Preparation
Plant material (shoots and roots) of C. erythraea was manually ground in liquid nitrogen into fine powder and diluted in 96% methanol (w:v = 10:1). Following vortexing for 1 min, extraction was performed overnight at 4 • C. The next day, extraction was continued for 20 min in an ultrasonic bath (RK100, Bandelin, Germany) maintained at room temperature. Samples were centrifuged at 8,000 g for 20 min and supernatants were filtered using 15 mm RC filters with 0.22 µm pore size (Agilent Technologies, United States). Samples were stored at 4 • C until use. All extractions were performed in biological triplicates.

UHPLC/DAD/(±)HESI-MS 2 Quantification of Targeted β-D Glucosides
Samples were analyzed using Dionex Ultimate 3000 UHPLC system (Thermo Fisher Scientific, Germany) equipped with a DAD detector and connected to a triple quadrupole mass spectrometer (TSQ Quantum Access MAX, Thermo Fisher Scientific, Switzerland). Samples were chromatographically separated on a Hypersil gold C18 column (50 × 2.1 mm) with 1.9 µm particle size (Thermo Fisher Scientific, United States) thermostated at 40 • C. Mobile phase, consisting of water + 0.01% acetic acid (A) and MS grade acetonitrile (B), was eluted at flow rate of 0.4 ml min −1 according to Banjanac et al. (2017). Injection volume was set to 10 µl. DAD absorption was acquired at λmax = 260 and 320 nm. A triple quadrupole mass spectrometer with a heated electrospray ionization (HESI) was operated with a following parameters: vaporizer temperature 300 • C, spray voltage 4,000 V, sheet gas (N 2 ) pressure 27 AU, ion sweep gas (N 2 ) pressure 1.0 AU and auxiliary gas (N 2 ) pressure at 10 AU, capillary temperature 275 • C, skimmer offset 0 V. Argon was used as the collision gas in the collisioninduced fragmentation of the selected reaction monitoring (SRM) mode of the instrument, and collision energies (cE) were set as shown in Supplementary Table 3. Calibration curves of targeted compounds showed excellent linearity with correlation coefficients r = 0.999, p < 0.001. Total concentrations of targeted secoiridoids were obtained by calculating their peak areas, and were expressed as µg per 100 mg of plant fresh weight (µg 100 mg −1 FW). Xcalibur software (version 2.2) was used for the instrument control, data acquisition and analysis. All analyses were performed using three biological replicates.

Statistical Analysis
Statistical significance was determined by using Minitab Statistical Software (Minitab, State College, PA, United States). For statistical analysis of relative gene expression and compound quantification in shoots and roots of diploid and tetraploid plants one-way ANOVA was performed followed by Fisher LSD test (p < 0.05). For comparison of hydrolytic activities, as well as for comparing SGs content in leaves after agroinfiltration with CeBGlu1 and CeBGlu2 cloned into pJL-TRBO expression vector (without or with silencing inhibitor p19) Student's t-tests were used for data analysis (p < 0.05).

RESULTS AND DISCUSSION
Specific hydrolytic enzymes activate many glycosylated compounds in plants (e.g., glucosinolates, alkaloids, benzoxazinoids, cyanogenic, and (seco)iridoid glucosides), which often defines their dual-defense system against herbivores. Highly active and unstable aglycones released from iridoid and secoiridoid glycosides and monoterpenoid indole alkaloids usually display more prominent biological activities toward herbivores and pathogens, by adversely affecting their enzymatic machinery. Aglycones react with nucleophilic side chains of amino acids to form covalent protein complexes (Bartholomaeus and Ahokas, 1995;Konno et al., 1999;Kim et al., 2000;Guirimand et al., 2010), and act as unspecific enzyme inhibitors (Bartholomaeus and Ahokas, 1995;Ling et al., 2003). The typical defense compounds of Plantaginaceae are iridoid glycosides, which retard growth and/or enhance mortality of non-adapted herbivores. As a part of the dual defense system, Plantago lanceolata and P. major possess β-glucosidases that hydrolyze aucubin, one of the two major iridoid glycosides in these species, and thereby release protein-denaturing aglycones (Pankoke et al., 2013(Pankoke et al., , 2015. Oleaceae species like Ligustrum obtusifolium and Olea europaea are rich sources of SGs that after tissue disruption are metabolized by endogenous plant β-glucosidases (Konno et al., 1999;Mazzuca et al., 2006). Oleuropein β-glucosidase (OeGlu) from O. europea is reported to have a defensive role in young organs and meristem tissues as well as in mature tissues, where it activates oleuropein into a potent protein crosslinking agent during de-compartmentalization caused by pests (Koudounas et al., 2015). Oleuropein β-glucosidase (OeGlu) is highlighted as a molecular target of high biotechnology interest to regulate qualitative and quantitative content of bioactive secoiridoids in olive oils, and thus their organoleptic properties (Koudounas et al., 2021). Likewise, in MIAs-rich C. roseus and R. serpentina (fam. Apocynaceae), strictosidine is hydrolyzed by β-glucosidases (Guirimand et al., 2010). Following the disruption of strictosidine and strictosidine β-D-glucosidase (SGD) compartmentalization in C. roseus, initiated by cellular breakup after tissue wounding, the highly reactive strictosidine aglycone with prominent feeding-deterrent/toxic properties is released, and further conducted toward the production of cytotoxic MIAs (Guirimand et al., 2010(Guirimand et al., , 2011.
Secoiridoid glucosides of C. erythraea are defense compounds constitutively present in tissues that, due to their extremely bitter taste, pre-ingestively deter feeding and thus reduce consumption rates of various non-adapted herbivores. The presumption of the present study was that C. erythraea might also possess β-glucosidases displaying high specificity toward SGs. By activating aglycones of SEC, SW, SWM, and GP, these hydrolytic enzymes could be involved in initial steps of their catabolism. Furthermore, resulting aglycones might offer multilevel protective features to plants, and, as previously shown, shape their antioxidant properties .

Isolation of Full Length of CeBGlu, Comparison With Homologs, and Phylogenetic Analysis
By exploring transcriptomic resources of C. erythraea leaves, we identified a candidate for β-Dglucosidase (CeBGlu) displaying high homology with strictosidine β-D-glucosidase previously characterized from C. roseus (Geerlings et al., 2000). Following PCR, cloning of amplified product into pTZ57R/T vector and sequencing, two highly similar CeBGlu candidates (CeBGlu1 and CeBGlu2) were revealed. Full lengths of isolated CeBGlu1 and CeBGlu2 comprised open reading frames of 1,659 bp encoding polypeptides of 552 amino acids with calculated molecular masses of 62.06 kDa. The two CeBGlu gene candidates shared a high level of similarity (99.3%) and differed at 12 nucleotide sites, i.e., 4 amino acids (Figure 3 and Supplementary  Table 4). Based on primary structure both enzymes were predicted to be localized in the lysosome/vacuole.
Phylogenetic tree was constructed incorporating β-Dglucosidase amino acid sequences of diverse plant species to narrow the prediction of the function of both CeBGlu candidates (Figure 4). The neighbor-joining tree grouped the enzymes into several clusters based on their role in plants: defense response, lignification, hormone deglycosylation, βmannosidase or myrosinase activities. Defense-related enzymes included a separate cluster that belonged to monocots, and four clusters belonging to dicots, having different types of substrates: cyanogenic glucosides, isoflavonoid conjugates, alkaloid glucosides, and terpenoid glucosides. Interestingly, even though their substrates belong to the group of terpenoids, the two C. erythraea candidates (CeBGlu1 and CeBGlu2) showed the highest homology with β-D-glucosidases of the order Gentianales that prefer alkaloid glucosides as substrates (GsSGD, RsRG, RsSGR1, RvSGD, CrSTR, and CiIpeglu1) (Figure 4).

Comparative Analysis of Organ-Specific Secoiridoid Glucosides Content and CeBGlu Expression and Activity
Plant glucosidase genes are developmentally regulated (Morant et al., 2008a), and exhibit different spatial expression patterns depending on their physiological functions. In this sense, we analyzed the organ-specific distribution of transcript levels and activities of the two C. erythraea β-glucosidase candidates, in parallel with the content of major SGs, in both diploid and tetraploid genotypes. The presence of CeBGlu1 and CeBGlu2 transcripts in C. erythraea was validated by amplifying their specific fragments using qRT-PCR. Due to the high sequence similarity between the two gene candidates, a combination of primers common to both transcripts were employed ( Figure 5A). CeBGlu was amplified at low level when cDNA from roots was used as a template in qRT-PCR analysis. Expression analysis revealed that the transcription of CeBGlu is regulated in an organspecific manner (Figure 5A). The CeBGlu expression pattern revealed significantly higher transcript levels in shoots compared to roots of C. erythraea, in both diploid and tetraploid genotypes. However, tetraploid plants displayed significantly higher CeBGlu transcript levels in shoots and roots compared to diploids.
In parallel, content of major secoiridoids (SW, SWM, and GP) was significantly higher in shoots than in roots of both diploid and tetraploid C. erythraea genotypes (Figure 5B). It is well documented that aerial parts are the major site of SGs biosynthesis and accumulation in common centaury (Šiler et al., 2012, 2014Matekalo et al., 2018;Božunović et al., 2019;Filipović et al., 2019). Similarly, as in our previous study , the dominant SG in leaves of diploid and tetraploid plants was SW, followed by GP and SWM. Diploid plants displayed slightly higher GP content in shoots than tetraploid ones, which FIGURE 3 | Multiple sequence alignment of CeBGlu1 and CeBGlu2 with β-glucosidases from species belonging to the Gentianales order. The two Glu catalytic residues are colored blue, active site residues involved in sugar specificity are colored olive, and residues differing between CeBGlu1 and CeBGlu2 are indicated in red. The secondary structure diagram shown below the alignment is based on the CeBGlu1 AlphaFold model. The pLDDT score (predicted lDDT-Cα) shown below the alignment is a per-residue estimate of AlphaFold confidence on a scale from 0 to 100. Lower values are often associated with disordered regions. It is shown for CeBGlu1 AlphaFold model. RsRG sequence corresponds to the PDB:3U5Y glucosidase from Rauvolfia serpentina. is in accordance with our previous study (Filipović et al., 2019), and higher content of SW in roots compared to tetraploids.
To test the total hydrolytic activity of C. erythraea organs, pure SW, SWM and GP were subjected to an in vitro enzymatic assay using crude protein extract of shoots or roots, and the decrease in SG content was analyzed using UHPLC/DAD/(± )HESI−MS 2 analysis. SWM and GP were efficiently hydrolyzed, while no significant decrease in SW content was recorded ( Figure 5C). Shoots of both diploid and tetraploid genotypes displayed more intensive hydrolysis of SWM than corresponding roots. Both shoots and roots were efficient in hydrolyzing GP. These results generally indicate higher SG-related β-D-glucosidase activity in shoots than in roots, which corresponds to higher expression level of the two candidates identified within the present study.
Lower content of SGs in shoots and roots of tetraploid individuals when compared to diploids might be, at least partially, ascribed to higher expression level of SGs-related β-D-glucosidases and higher activity of these enzymes in tetraploids. Tetraploids most likely display more intense SGs catabolism which is reflected through lower SGs content. As tetraploid plants used in the present study are not a direct offspring of the diploid ones, the observed divergence between them is most likely also influenced by differences in their genomes.

Cloning and Heterologous Expression of CeBGlu in E. coli
Isolated BGlu candidates were sub-cloned into the pRSETA vector; pRSETA:CeBGlu1 and pRSETA:CeBGlu2 constructs were transformed into E. coli BL21 (DE3) competent cells. To achieve simple and efficient purification of the enzyme, CeBGlu1 and CeBGlu2 were expressed in fusion with the His-tag.
Following expression of recombinant proteins in E. coli, they were isolated using His-tag affinity purification and subsequently resolved in SDS-PAGE (Figure 6Aa and FIGURE 4 | The neighbor-joining tree generated using MEGA X (Version 10.2.6) with a bootstrap of 1,000 replicates, illustrating the phylogenetic relationship of the CeBGlu1 and CeBGlu2 candidates (red letters) relatively to homologous genes from various plant species (refer to Supplementary Table 2 for a list of plant protein sequences used for phylogenetic analysis and their corresponding accession numbers). The optimal tree is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 replicates) are shown next to the branches. Groups of enzymes with experimentally defined functions are labeled with different colors. Figure 1). SDS-PAGE analysis of the purified recombinant proteins detected bands with a slightly higher CeBGlu1 and CeBGlu2 molecular masses of ∼70 kDa than expected (62.06 kDa) (could be, at least partially, ascribed to a His-tail of recombinant proteins). Purified preparations of recombinant CeBGlu1 and CeBGlu2 were also analyzed by western blot using the anti-6×His antibody (Figure 6Ab). In both cases, a protein band with the same molecular mass as deduced from the SDS-gel was observed.
CeBGlu1 and CeBGlu2 hydrolytic activity was subsequently tested in enzymatic reactions with SG-rich methanol extract of C. erythraea as a substrate, and changes in the content of major SGs (SW, SWM, and GP) were recorded ( Figure 6C). CeBGlu2 was more efficient than CeBGlu1 in hydrolyzing SWM, which was the most abundant SG in C. erythraea methanol extract used in experiments. CeBGlu1 and CeBGlu2 were equally efficient against SW and GP. As the amounts of SW (2.0 µg ml −1 extract), SWM (3.5 µg ml −1 extract) and GP (2.1 µg ml −1 extract) in C. erythraea methanol extract were not equal, it was not possible to make conclusions about the preferable substrates of the two CeBGlu.
To test the substrate specificity, β-D-glucosidase activity of recombinant CeBGlu1 and CeBGlu2 enzymes was determined against several plant-derived glucosides of the β-D-type ( Figure 6D). Hydrolytic activity of CeBGlu1 and CeBGlu2 was tested against iridoid glucosides (1,5,9-epideoxyloganic acid-1,5,9-eDLA and loganin-LOG), secoiridoid glucosides (SEC, SW, SWM and GP), and flavonoid glucosides (apigetrin-AP, isoquercitrin-IQ, and vitexin-VI). Pure 1,5,9-eDLA was isolated from methanol extracts of Nepeta rtanjensis as described by Aničić et al. (2021), while AP and VI, as well as standards of iridoids and secoiridoids, were commercially purchased. As standards for aglycones of secoiridoids were not available, the β-D-glucosidase activity of recombinant proteins was evaluated by UHPLC/DAD/(±)HESI−MS 2 quantification of targeted glucosides (Supplementary Figure 2). Their content in reaction mixtures were compared to those of the negative controls where recombinant proteins were excluded. Although two recombinant proteins share 99.3% similarity of amino acid sequences, CeBGlu1 and CeBGlu2 display slightly differential hydrolytic activity and specificity against tested substrates ( Figure 6D). The highest activity of CeBGlu1 was recorded when using GP and AP as substrates, which were hydrolyzed by 95.4 and 97.1%, respectively. This enzyme was also efficient in hydrolyzing LOG (76.0%). The highest hydrolytic activity of CeBGlu2 was recorded for GP (99.7%) and SW (98.3%), followed by AP (63.5%) and SWM (52.2%). Generally, both CeBGlu1 and The bars indicate the quantity of substrates (%) not enzymatically hydrolyzed in an in vitro enzymatic assay. Control group, where no protein extract was added, was set to 100%, and all the values are presented relatively, in respect to control. Data are mean ± SD (n = 3). Red asterisks denote significantly different values according to the t-test, p-values, *p < 0.05, **p < 0.01, ***p < 0.001.
CeBGlu2 display higher affinity for GP, than for SW and SWM. Although significant activity of CeBGlu1 and CeBGlu2 was recorded when using AP (apigenin 7-O-glucoside) as a substrate, one should bear in mind that, up to the best of our knowledge, this compound was not previously detected in C. erythraea methanol extracts (Šiler and Mišić, 2016;Banjanac et al., 2017;Božunović et al., 2018). Slight hydrolytic activity of CeBGlu1 and CeBGlu2 with 1,5,9-eDLA, SEC and IQ was recorded, but it was not statistically significant ( Figure 6D). The two enzymes displayed no hydrolytic activity against VI, an apigenin 8-Cglycoside, which was also not previously detected in C. erythraea. β-D-glucosidases display substantial substrate promiscuity (Hannemann et al., 2018) so the hydrolysis of several substrates is not uncommon. Many beta-glucosidases have transglucosidase activities in addition to their hydrolase activity (Opassiri et al., 2004). However, the number of glucoconjugates in plants is likely larger than the number of beta-glucosidases and the enzymes tend to have overlapping specificities, which complicates determination of their exact functions. These cases of multiple functions have been categorized into "multitasking, " where the enzyme carries out multiple functions at the same time, or "moonlighting, " where the enzyme has two different functions in divided situations, according to Ketudat Cairns et al. (2015).

Transient Overexpression of CeBGlu1
and CeBGlu2 in Leaves of Diploid and Tetraploid Centaurium erythraea Transient expression of recombinant proteins in plants using tobacco mosaic virus (TMV) based viral vectors have been well documented (Chiong et al., 2021). TMV-based overexpression pJL-TRBO vector can be used for the insertion of a target gene by agroinfiltration into plants. Agrobacterium-mediated delivery of genes, results in highly efficient transient expression of the foreign protein, producing up to 100-fold more recombinant protein compared to a non-viral system (Lindbo, 2007).
To verify the functional identity of the CeBGlu gene, Agrobacterium-mediated transient expression in centaury leaves was carried out. To that end, the CeBGlu coding region was subcloned into the pJL-TRBO vector. The resultant plasmids designated pJL-TRBO:CeBGlu1 and pJL-TRBO:CeBGlu2 were used to transform A. tumefaciens GV3101 for the transient in planta expression. The two CeBGLU candidates were overexpressed in leaves of five-month-old diploid and tetraploid C. erythraea plants, alone or in combination with p19 (Figure 7). Phytochemical profiling of SGs in leaves of diploid and tetraploid C. erythraea plants, harvested 5 days after agroinfiltration, was performed. Overexpression of CeBGlu1 and CeBGlu2 induced changes of SGs content in leaves of diploid and tetraploid C. erythraea plants when co-expressed with p19 gene silencingsuppressor (Figure 7). In general, phytochemical changes were more pronounced in leaves of diploid plants. Significant decrease in SWM content was recorded in diploid plants jointly overexpressing CeBGlu1 and p19, or CeBGlu2 and p19. The content of SW in diploid plants was significantly decreased following overexpression of CeBGlu1 and CeBGlu2 in combination with p19, but also when CeBGlu2 was over-expressed alone. In tetraploid plants, SW content was decreased following agroinfiltration with CeBGlu1 and CeBGlu2 in combination with p19. No statistically significant alterations in GP content in response to agroinfiltration with CeBGlu1 and CeBGlu2 was observed, in both diploid and tetraploid plants. Viral-encoded suppressor of gene silencing, the p19 protein of tomato bushy stunt virus (TBSV), which is believed to prevent the onset of post-transcriptional gene silencing in the infiltrated tissues (Voinnet et al., 2003;Shah et al., 2013), obviously allowed high level of CeBGlu1 and CeBGlu2 transient expression, which resulted in pronounced phytochemical changes in C. erythraea leaves. Previous studies have suggested that transgene expression or RNA silencing in plants can be affected by ploidy, and is significantly less efficient in tetraploids than in diploids (Finn et al., 2011). In other words, transgenes are more prone to transcriptional inactivation in polyploids than in diploids (Mittelsten Scheid et al., 1996, which often poses a major limitation to polyploids improvement via biotechnology (Gao et al., 2013). Thus, it is not surprising that the use of p19 more efficiently increases the transient expression of recombinant CeBGlu1 and CeBGlu2 proteins in leaves of C. erythraea diploids than in tetraploids, which is accompanied with the more pronounced decrease in SGs content (Figure 7). Taken together, overexpression of the two CeBGlu isogenes decreased the content of SGs in C. erythraea leaves, which additionally confirmed the β-glucosidase function of these two genes. The entire centaury genome sequence is still unknown, yet it probably contains many other β-glucosidases of diverse as well as similar functions and substrate specificities, which could also contribute to shaping the secoiridoid profiles of these remarkable plants.

Comparative 3D Modeling of CeBGlu1 and CeBGlu2
To gain insights into sequence-structure relations of the two centaury β-glucosidases their 3D structure was inferred using AlphaFold (models provided as Supplementary Material 1). The obtained models had a low percentage of Ramachandran outliers and a low clash score (Supplementary Table 5). The percentage of Ramachandran favored Cα angles was between 92 and 93% for the two models. The majority of residues not within the favored parts of the Ramachandran diagram had a low pLDDT value (a per-residue estimate of AlphaFold confidence on a scale from 0 to 100) suggesting they are in disordered regions. The two predicted models corresponding to CeBGlu1 to CeBGlu2 are highly structurally similar, especially if the likely disordered regions are not taken into account -the RMSD of the superimposed pruned 475 Cα atoms in the two structures is 0.216 Å. Both obtained models showed a high degree of structural similarity to the experimental structure of the raucaffricine β-D-glucosidase from R. serpentina (PDB: 3U5Y, Supplementary Table 5; Xia et al., 2012). The models have a (β/α) 8-barrel (Figures 8A,B) fold characteristic for the CAZy GH1, consisting of eight parallel β-strands forming a barrel-like structure. These strands are connected via intricate helical regions and loops (Figure 8). Strands β5 and β6 are longer compared to other, and protrude the barrel (Figures 8A,B). Based on AlphaFold secondary structure annotation these are split into two strand regions each. Together with an antiparallel strand βa1 (Figure 8) which is not a part of the barrel, β5 and β6 form a sheet. After the (β/α) 8-barrel structure is a sheet formed from two antiparallel strands βa2 and βa3. The regions with low pLDDT are on the termini, between β4 and β5, as well as between β6 and βa1, matching closely the regions with missing density in the experimental structure of raucaffricine β-D-glucosidase from R. serpentina which strengthens the conclusion these regions are disordered.
The enzymes from GH1 family have a retaining double displacement mechanism in which one Glu residue acts as a proton donor for the leaving group, while another Glu residue acts as a nucleophile (Ketudat Cairns and Esen, 2010). FIGURE 9 | Retaining double displacement reaction mechanism of GH1 β-glucosidase enzymes. Glu187 is the catalytic proton donor which activates the leaving group while Glu420 residue is the nucleophile. The reaction proceeds via a covalent glycosyl enzyme intermediate with an inverted bond which is hydrolyzed by nucleophilic attack of a water molecule, activated by the proton donor residue, resulting in second bond inversion and in overall anomeric retention.
FIGURE 10 | Position of docked ligands in the CeBGlu1 active site. Interactions of docked ligands: gentiopicrin (A), sweroside (B) and swertiamarin (C) with the CeBGlu1 active site residues. Ligands are colored yellow with oxygen shown in red; active site residues are colored green with oxygen shown in red and nitrogen in blue. H-bonds are indicated with yellow dashed lines while the hydrophobic surface encompassing the aglycone is indicated with a silver mesh. (D) Superposition of ligands docked to the CeBGlu1 active: gentiopicrin (yellow), sweroside (pink) and swertiamarin (green) with secologanin (gray) from the experimental structure of glucosidase from Rauvolfia serpentina (PDB: 3U5Y). Superposition of CeBGlu1 with 3U5Y was performed using matchmaker command in ChimeraX using default options.
The nucleophilic attack forms a covalent glycosyl enzyme intermediate with an inverted bond (Figure 9). This covalent intermediate is hydrolyzed by nucleophilic attack of a water molecule, activated by the proton donor residue, resulting in second bond inversion and in overall anomeric retention. The catalytic proton donor and nucleophile in the two centaury β-glucosidases correspond to Glu187 and Glu420, respectively (Figure 9). Glu187 is located in the coil just after β4, while Glu420 is situated at the end of β7, while their Cα atoms are positioned at a distance of 10.3 Å in the models. The pKa values of respective carboxylic groups were estimated to be ∼8 for Glu187 and ∼5.5 for Glu420 which is in concordance to their proposed role of proton donor and nucleophile, respectively.
The two centaury β-glucosidases differ in four amino acids which include, from CeBGlu1 to CeBGlu2: A211D, S278C, D307E and D332E; the first two being radical replacements, while the latter two are conservative replacements. Position 211 is located in the disordered region between β4 and β5; 278 is located in β5, proximate to, but with a side chain facing away from the active site, while 307 and 332 are located in the helical regions connecting β5 to β6 (Figure 8). Based on the type, position and orientation of the differing amino acids at positions 211, 307, and 332 no impact of the mentioned amino acids replacements on the overall enzyme performance can be expected. It should be noted that these amino acids are not at the dimer interface based on comparison with 3U5Y (result not shown), so not even a longdistance interaction, due to differential multimer binding, is to be expected. However, one cannot rule out the possible indirect roles of these amino acids on the overall enzyme performance. On the other hand, the position 278 is close to the substrate binding site, and the amino acids around it -Ile277 and Gln279 are directly involved in interactions with the aglycone (Figure 10). Thus, even though the predicted structures of CeBGlu1 to CeBGlu2 are highly similar, including the active site, and the orientation of Ile277 and Gln279 side chains, S278C could have an impact on substrate recognition and overall catalytic performance. It should be noted there are no nearby Cys with which potential disulfide bonds can be formed by Cys287. Thus, the observed differences in the substrate preferences of the two CeBGlu enzymes (Figure 6), might, at least partially, be ascribed to the differences in their amino acid sequences, especially at position 287.
To explore and visualize the binding of substrates to the active site ligand docking was analyzed only for CeBGlu1, because of the very high overall similarity of the predicted models for the two β-glucosidases. To paraphrase quantitatively, the RMSD between the 236 atom pairs belonging to residues Gln37, Trp142, Asn186, Glu187, Thr190, Ile277, Gln279, Asn347, Trp392, Glu420, Trp469, Glu476, and Trp477 which form the active sites including the catalytic residues between the two predicted structures is 0.625 Å. Most open-source software for ligand docking is able to perform some sort of conformer search during docking, usually limited to rotatable bonds. The new AutoDock-Vina version implements a macrocycle conformer sampling method (Eberhardt et al., 2021), however, an analogs method appropriate for small aliphatic rings is lacking. Due to this, sets of conformers were pre generated by the ETKDG3 method with small ring torsion angle preferences (Wang et al., 2020) for GP, SW and SWM and used for docking. Prior to docking, while comparing the centaury β-glucosidase models with similar experimental structures from the PDB it was noticed CeBGlu1 and CeBGlu2 Glu476 side chain clashes with the glucose in the active center of superimposed experimental structures. Therefore, a flexible docking procedure was employed where Glu476 was allowed to change conformations, while all other enzyme residues were rigid. For GP the highest scoring docked pose (Table 1 and Figure 10A) seems to be very close to what could be expected in vivo, at least based on comparison to the pose of the secologanin ligand in the experimental structure of raucaffricine β-D-glucosidase from R. serpentina (Figure 10D, 3U5Y, Xia et al., 2012). The specificity of the active site for βglucosides is achieved by numerous H-bonds with the sugar OH groups: Asn186, Glu187 and Asn347 form H-bonds with O2, Gln37 and Trp477 with O3, Gln37, Trp469 and Glu476 with O4 on β-glucose ( Figure 10A). The aglycone binding pocket is tight and hydrophobic, enclosed from one side by Trp392 which is parallel to the aglycone, and from the others by Trp142, Thr190 and Ile277. The only polar interaction with the ligand is via an H-bond between Gln279 and the keto oxygen. The relatively tight space in the aglycone binding site allows mostly planar aglycones, while Gln279 position favors aglycons with H-acceptors or H-donors in the appropriate position. The positions of the highest scoring poses for SW and SWM were not suitable for hydrolysis, because the aglycone occupied the position of glucose. However, the 2nd best pose for SWM and 3rd best pose for SW (Table 1 and Figures 10B,C), which scored only slightly worse, were very similar both to the previously mentioned docked GP, as well as to secologanin ligand in 3U5Y ( Figure 10D). The suboptimal highest scoring poses for SWM and SW might be the result of limitations of the used docking procedure, and imperfections in the scoring function.

CONCLUSION
The present study highlights the direct link between β-D-glucosidases characterized within the present study (CeBGlu1 and CeBGlu2) and the content of SGs in analyzed C. erythraea tissues. Thus, it provides the first evidence of SGs-related β-glucosidases in centaury plants, and advances our understanding of SGs deglycosylation, as a part of their catabolism. These enzymes could serve as a molecular target of high biotechnological interest, in order to produce centaury plants with an optimal composition of secoiridoids, which are essential defense compounds in plants with beneficial effects in human and animal health. One of the major challenges ahead is to gain better understanding of how plants regulate the function of CeBGlu1 and CeBGlu2 in distinct tissues and developmental stages, and in response to biotic and abiotic stress stimuli, which is the course of our further work. Furthermore, it is imperative to assess their biological function and confirm their localization at the cellular and organelle level and define conditions under which these enzymes come into contact with their physiological substrates. Alternatively, gene silencing of the CeBGlu1 and CeBGlu2 combined with metabolic profiling of silenced plants might provide more information on the role and function of these genes, and could also result in elevated amounts of SGs in tissues. In parallel, unrevealing the remaining unknown steps of SGs biosynthetic pathway in C. erythraea is of essential importance for establishing the biotechnology-based production of these valuable bioactive compounds. Accumulating knowledge will, in the future, enable manipulation of SGs biosynthesis and catabolism through multi-target metabolic engineering, which will further enable large-scale production of desired secoiridoids in C. erythraea.

DATA AVAILABILITY STATEMENT
The data presented in this study are deposited in the National Center for Biotechnology Information (nih.gov) repository (NCBI; https://www.ncbi.nlm.nih.gov/) under accession numbers ON060690 and ON060691.