Structure–Activity Relationship Analysis of Flavonoids and Its Inhibitory Activity Against BACE1 Enzyme Toward a Better Therapy for Alzheimer’s Disease

Drug development in Alzheimer’s disease (AD) suffers from a high attrition rate. In 2021, 117 agents tested in phases I and II and 36 agents tested in phase III were discontinued. Natural product compounds may be good lead compounds for AD as they contain functional groups that are important for binding against key AD targets such as β-secretase enzyme (BACE1). Hence, in this study, 64 flavonoids collected from rigorous literature search and screening that have been tested from 2010 to 2022 against BACE1, which interferes in the formation of amyloid plaque, were analyzed. The 64 unique flavonoids can be further classified into five core fragments. The flavonoids were subjected to clustering analysis based on its structure, and each representative of the clusters was subjected to molecular docking. There were 12 clusters formed, where only 1 cluster contained compounds from two different core fragments. Several observations can be made where 1) flavanones with sugar moieties showed higher inhibitory activity compared to flavanones without sugar moieties. The number of sugar moieties and position of glycosidic linkage may also affect the inhibitory activity. 2) Non-piperazine-substituted chalcones when substituted with functional groups with decreasing electronegativity at the para position of both rings result in a decrease in inhibitory activity. Molecular docking indicates that ring A is involved in hydrogen bond, whereas ring B is involved in van der Waals interaction with BACE1. 3) Hydrogen bond is an important interaction with the catalytic sites of BACE1, which are Asp32 and Asp228. As flavonoids contain favorable structures and properties, this makes them an interesting lead compound for BACE1. However, to date, no flavonoids have made it through clinical trials. Hence, these findings may aid in the design of highly potent and specific BACE1 inhibitors, which could delay the progression of AD.


INTRODUCTION
Alzheimer's disease (AD) is a disorder characterized by the progressive degeneration of the structure and function of the central nervous system (Plaingam et al., 2017). Globally, individuals with this disorder have steadily increased in numbers, where in 2016, there were 43.8 million cases reported, which is expected to increase to 131.5 million by 2050. In Malaysia, it is estimated that in 2016, there were about 50,000 individuals diagnosed with AD, and this number is projected to increase to 123,000 by 2030 (Feigin et al., 2019). AD is an irreversible disease where it will slowly affect memory and thinking skills. If the disease is not properly managed, the person diagnosed with AD may not be able to carry out daily activities (Nazarko, 2019). In 2021, drug development for all types of AD suffered from a high attrition rate (around 95%), where 117 agents tested in phases I and II and 36 agents tested in phase III were discontinued (Cummings et al., 2021). In 2020, clinical trials for early-to-mild and mild-to-moderate phases of AD involving semagacestat, bapineuzumab, and solanezumab were halted due to lack of improvement in cognitive function. While these drugs showed good results in phases I and II such as reducing the CSF biomarker level, little or no significant improvement in cognitive and functional endpoints were observed during phase III. An increase in dose did not display a change in the cognitive and functional endpoints but resulted in several severe toxic effects such as vascular edema in bapineuzumab (Huang et al., 2020). Currently, only six drugs are approved by the Food and Drug Administration (FDA), which are donepezil, rivastigmine, galantamine, memantine, combination capsule of memantine with donepezil, and aducanumab (Cummings et al., 2021). Aducanumab is the first disease-modifying drug that has been approved by the FDA in June 2021. 
Although aducanumab clears the amyloid plaque, its efficacy in improving cognitive function is still lacking in evidence (Walsh et al., 2021). This highlights the insufficient evidence of the clinical benefit as well as the narrow therapeutic index of potential AD drugs as some of the challenges faced in the development of AD drugs (Mehta et al., 2017).
The neuronal cell death that occurs in AD is due to molecular and cellular changes in the brain. AD is characterized by the presence of extracellular amyloid-β (Aβ) proteins in senile plaques and intracellular deposits of tau (τ) proteins in neurofibrillary tangles (NFTs) that affect memory and cognition (Hampel et al., 2020). Amyloid precursor protein (APP) is an integral membrane protein found in various tissues, such as synapses of neurons. APP acts as a regulator of synaptic formation and repair, anterograde neuronal transport, and iron export. In healthy neurons, APP is cleaved by α-secretase through the non-amyloidogenic pathway to produce the soluble APPα ectodomain (APPsα) and a membrane-bound carboxyl-terminal fragment (CTF) known as C83. C83 is then cleaved by γ-secretase to form p3 and the APP intracellular domain (AICD) (Chen et al., 2017). The activity of α-secretase is associated with proteases such as tumor necrosis factor-alpha converting enzyme (TACE) and a disintegrin and metalloproteinase domain-containing protein 9 (ADAM9) and ADAM10 (Chen et al., 2017). However, in AD, β-secretase (BACE1) and γ-secretase cleave APP through the amyloidogenic pathway to form Aβ, an insoluble peptide. APP is first cleaved by BACE1 to produce two protein fragments, APPsβ (the secreted ectodomain) and C99 (the membrane-bound fragment) (Ashrafian et al., 2020). C99 is then cleaved by γ-secretase, which is composed of proteins such as anterior pharynx defective 1 (APH1), presenilin enhancer 2 (PEN2), nicastrin, and presenilin (PS1 or PS2), and forms AICD and the C-terminus of Aβ peptides. The insoluble Aβ peptide, which is made up of 42 amino acids, can aggregate and form amyloid plaques in the brain, a pathological hallmark of AD (Ashrafian et al., 2020). Hence, inhibiting BACE1 could interfere with the formation of amyloid plaque, since BACE1 cleavage is the initial and rate-limiting step of Aβ production.
Several inhibitors against BACE1 have been developed over the past few years (Abeysinghe et al., 2020; Hampel et al., 2021; Zhumanova et al., 2021); however, most of them have failed during preclinical trials. BACE1 has a large catalytic pocket, and hence the inhibitors need to be large enough to interact with key amino acid residues of the BACE1 active site (Coimbra et al., 2018). However, large inhibitors may not be able to cross the blood-brain barrier (BBB), as only low-molecular-weight and lipophilic compounds are able to cross the BBB unless aided by transporters (Pardridge et al., 2005; Banks, 2009).
BACE1 is structurally homologous to other aspartic proteases of the pepsin family such as BACE2 and cathepsin D (CTSD) where the inhibition of these enzymes can increase the production of Aβ and protein deposits in several organs. Hernández et al. (2016) performed alignment studies, molecular dynamics simulations, and docking studies on these three aspartic proteases to study their differences. Based on the result, selective BACE1 inhibition can be achieved through strong electrostatic interactions with Asp32 and Asp228 of the catalytic site, in addition to a large number of hydrogen bonds, π-π interaction, and van der Waals interaction. BACE1 is also limited by its highly flexible catalytic site. It has been observed that the presence or absence of an inhibitor within the active site influences the flap's conformation in BACE1. A better understanding of the binding of substrates and inhibitors to BACE1 is required before developing an inhibitor (Xu et al., 2012). Chakraborty et al. (2011) studied the dynamic transition of BACE1 employing normal mode analysis (NMA) using a simplified elastic network model (ENM). The dynamic transition of BACE1 from an open to a closed conformation can be observed from this study using the combined approach of cavity and volume calculation obtained from the dynamics of BACE1 encoded by normal mode. In the open conformation, a large catalytic cavity allowed the binding of a wide range of substrates with the help of the 10s and F loops. Both loops are part of the cavity, and the F loop forms the C-terminal lining of the cavity. Meanwhile, in the closed conformation, the F loop detaches from the cavity, and the 10s loop moves upward and lines the cavity, which causes the cavity to squeeze and tightly hold the substrate. The effect of myricetin on the BACE1 enzyme was also studied to explain the pharmacophoric feature of BACE1 inhibitors using molecular docking. 
The docking results showed that myricetin forms a strong interaction with BACE1 in the closed conformation compared to that in the open conformation. In the closed conformation of BACE1, myricetin formed interactions with the enzyme through hydrogen bonding by binding to the catalytic aspartic acid residues (Asp228) and forming van der Waals interactions with Asp32 and Ser35. Myricetin also binds with other residues such as Pro70, Val69, Tyr71, and Thr72 through hydrogen bonds.
In addition, BACE1 inhibitors also need to overcome potential drug-drug interactions from activity against CYP450 enzymes and potential hERG channel inhibition. Inhibition of the hERG channel could cause QT interval prolongation that may lead to ventricular arrhythmias known as Torsades de Pointes (TdP) (Coimbra et al., 2018). Malone and Hancox (2020) evaluated evidence from the scientific literature to find potential association between AD drugs (donepezil, galantamine, and rivastigmine) and the risk of QT interval prolongation and Torsades de Pointes (TdP). From the study, one of the AD drugs, which is donepezil, showed a risk of producing QT interval prolongation and Torsades de Pointes (TdP).
Natural product compounds may be potential candidates for BACE1 inhibitors. Several natural product compounds have been shown to bind simultaneously to the catalytic amino acid residues of BACE1, which are Asp228 and Asp32. Wagle et al. (2019) performed molecular docking between BACE1 (PDB ID: 2wjo) and compounds isolated from Cirsium japonicum var. maackii, namely luteolin, luteolin 5-O-β-D-glucopyranoside, and luteolin 7-O-β-D-glucopyranoside. The results showed that luteolin 5-O-β-D-glucopyranoside and luteolin 7-O-β-D-glucopyranoside interacted with both Asp228 and Asp32 residues via hydrogen bonds. Additionally, natural product compounds have low activity against the hERG channel due to the presence of aromatic rings (Kratz et al., 2017; Babiaka et al., 2020). Moreover, natural product compounds have the ability to cross the blood-brain barrier (BBB) as they have physicochemical properties that are favorable for BBB permeation (Carecho et al., 2020). Low-molecular-weight polyphenol molecules can cross the BBB through passive permeation and carrier-mediated transport. Flavonoids are able to cross the BBB through transporters such as ATP-binding cassette (ABC) transporters, organic anion transporters (OATs), and organic anion transporting polypeptides (OATPs) (Lu et al., 2014; Stieger et al., 2017; Williamson et al., 2018).
Flavonoids are one of the essential classes of natural product compounds. They are secondary metabolites that can be found in several parts of a plant. The main structure of flavonoids contains two aromatic rings (ring A and ring B) connected by three carbon atoms, which form an oxygenated heterocycle (ring C) (Figure 1). There are about eight subclasses of flavonoids, which are flavones, flavonols, flavanols, isoflavonoids, chalcones, anthocyanins, flavanones, and neoflavonoids. The subclasses are categorized based on the basic flavan ring system, alkylation, glycosylation, and hydroxylation (Vauzour et al., 2008; Panche et al., 2016). The presence of the chromone structure in the backbone of many flavonoids is an important feature in several anti-HIV, anti-inflammatory, antibacterial, and anticancer drugs as well as those used in neurodegenerative diseases, inflammatory diseases, and diabetes (Reis et al., 2017).
Several flavonoids from various plants such as myricetin, kaempferol, morin, apigenin, luteolin, and polymethoxyflavones have been studied for their inhibitory activity against BACE1. A study by Youn et al. (2016a) showed that 5,7-dimethoxyflavone (DMF), 5,7,4′-trimethoxyflavone (TMF), and 3,5,7,3′,4′-pentamethoxyflavone (PMF) exhibited strong BACE1 inhibitory activities with no suppression of other enzyme activities such as α-secretase and other serine proteases. This indicates that these three flavonoid compounds are relatively specific and selective toward the BACE1 enzyme. Moreover, the study by Zhumanova et al. (2021) also reported that O-methylated quercetins, flavonoids isolated from the aerial part of the endemic Caragana balchaschensis (Kom.) Pojark, were significantly effective in inhibiting BACE1, with IC50 values ranging from 1.2 to 6.5 μM. Additionally, flavonoids such as kaempferol and quercetin have diverse bioactivities and are known to exert neuroprotective effects and inhibit BACE1 activity. A number of flavonoids have also been investigated for anti-acetylcholinesterase (AChE) and butyrylcholinesterase (BChE) activities. In one study, macluraxanthone was found to be the most potent and specific inhibitor of both AChE and BChE, with IC50 values of 8.47 and 29.8 μM, respectively (Khan et al., 2009).
Despite their favorable structure and properties, to date, no flavonoids have advanced further in clinical trials as BACE1 inhibitors. Hence, the aim of this study was to analyze the data published between 2010 and 2022 regarding flavonoids and their activity against the BACE1 enzyme. Specifically, the two-dimensional (2D) structural patterns as well as the binding patterns of the flavonoids are analyzed, with the aim of discovering the functional relationship between structure and activity. Understanding the structure-activity relationship between flavonoids and BACE1 is important in designing highly potent and specific BACE1 inhibitors.

MATERIALS AND METHODS
In a data collection process similar to that of a systematic review, two authors (Nur Intan Saidaah, NS; and Zafirah Liyana Abdullah, ZA) independently reviewed and evaluated all articles found in the literature search, including titles, abstracts, and

Keyword Identification
A comprehensive literature search was performed through several databases such as ScienceDirect, Scopus, and PubMed using search strings (Table 1). The search was limited to articles published from 2010 to 2022, and only research and journal articles were selected as the document type. In this study, 389 articles were retrieved in the first round of electronic literature searches. After removing review papers, conference proceedings, letters, meta-analyses, abstracts, emails, and unrelated topics (n = 299), 90 articles were identified for screening.

Screening
In the screening stage, 90 articles were screened based on their titles and abstracts. Articles were excluded based on the following criteria: 1) studies that did not include the BACE1 enzyme inhibition assay and 2) studies that did not involve flavonoids as compounds against BACE1. Duplicate articles were also removed. A total of 74 articles were excluded based on these criteria.

Eligibility
A total of 16 articles were evaluated for their eligibility. All articles were examined based on inclusion and exclusion criteria (Table 2). Four studies were excluded because they involved plant extracts, reported only in silico experiments without in vitro validation, had no full text available, and/or reported IC50 values that were not available in M (Table 2). Finally, 12 studies with 64 flavonoids were selected after 2 duplicates were removed.
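The article counts reported across the identification, screening, and eligibility stages can be checked with simple arithmetic. The snippet below is only a bookkeeping sketch of the numbers stated in the text; the function and variable names are hypothetical and are not part of the original workflow.

```python
# Article counts reported in the text for each stage of the literature filter.
RETRIEVED = 389               # first round of electronic searches
REMOVED_BY_TYPE = 299         # reviews, proceedings, letters, unrelated topics, etc.
EXCLUDED_AT_SCREENING = 74    # title/abstract criteria and duplicate articles
EXCLUDED_AT_ELIGIBILITY = 4   # plant extracts, in silico only, no full text, etc.


def remaining(start: int, removed: int) -> int:
    """Return the number of records left after one exclusion step."""
    if removed > start:
        raise ValueError("more records removed than available")
    return start - removed


screened = remaining(RETRIEVED, REMOVED_BY_TYPE)
eligible = remaining(screened, EXCLUDED_AT_SCREENING)
included = remaining(eligible, EXCLUDED_AT_ELIGIBILITY)

print(f"screened={screened}, eligible={eligible}, included={included}")
```

Running the sketch reproduces the 90, 16, and 12 articles reported at the three stages.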

Data Extraction
Information on all 64 flavonoids from the 12 studies was extracted, including authors, year of publication, flavonoid structure, and BACE1 inhibitory activity. A chemical annotation of each flavonoid (SMILES) was identified and tabulated in an Excel spreadsheet, which was then converted into CSV format for further analysis.

Molecular Properties and Clustering
The molecular properties of the flavonoids were calculated based on their structure in DataWarrior (www.openmolecules.org). Next, the flavonoids were clustered based on their chemical structure. All of the extracted information in the CSV file was uploaded into the DataWarrior software. Under the Chemistry tab, Calculate Properties was selected. Properties for druglikeness such as cLogP, H-acceptors, H-donors, and total surface area were chosen. Automatic structure-activity relationship (SAR) analysis was performed under the Chemistry tab to determine the core fragments and functional groups of the flavonoid compounds. In this analysis, SMILES was used to generate the scaffold, with Murcko scaffold selected as the scaffold type. After the core fragments and functional groups were determined, a cluster analysis was performed. In the same Chemistry tab, Cluster Compounds or Reactions was selected. Clustering started with the construction of a complete similarity matrix of all flavonoids, calculated using the default descriptor, FragFp, which is similar to MDL keys and consists of 512 predefined structural fragments. The two flavonoids with the highest similarity were merged into the first cluster. The similarity matrix was then updated by removing all similarity values related to these two flavonoids and replacing them with a new set of similarities between the cluster center and the other compounds, calculated as the mean of the two original similarity values. The process continued by merging the most similar pair and terminated when the highest similarity fell below 0.8.
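The merging procedure described above can be illustrated with a minimal sketch. This is not DataWarrior's implementation: Tanimoto similarity on sets of fragment indices is used here as a stand-in for the FragFp comparison, the four toy fingerprints are hypothetical, and the function names are invented for illustration. Only the rules taken from the text (merge the most similar pair, average the two old similarities, stop below 0.8) are modeled.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Shared fragments divided by fragments present in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def cluster(fps: dict, cutoff: float = 0.8) -> list:
    """Merge the most similar pair until the best similarity is below cutoff."""
    clusters = {name: [name] for name in fps}  # cluster id -> member names
    sim = {frozenset(pair): tanimoto(fps[pair[0]], fps[pair[1]])
           for pair in combinations(fps, 2)}
    while sim:
        pair, best = max(sim.items(), key=lambda kv: kv[1])
        if best < cutoff:
            break
        a, b = sorted(pair)
        merged = a + "+" + b
        others = [c for c in clusters if c not in pair]
        # Similarity of the new cluster to each other cluster:
        # mean of the two original similarity values.
        new_sims = {frozenset((merged, c)):
                    (sim[frozenset((a, c))] + sim[frozenset((b, c))]) / 2
                    for c in others}
        # Drop every entry that involved either merged member.
        sim = {k: v for k, v in sim.items() if not (k & pair)}
        sim.update(new_sims)
        clusters[merged] = clusters.pop(a) + clusters.pop(b)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy fingerprints (sets of fragment indices), not real FragFp data.
toy_fps = {
    "A": frozenset({1, 2, 3, 4, 5}),
    "B": frozenset({1, 2, 3, 4, 5, 6}),
    "C": frozenset({1, 2, 3, 4, 5, 6, 7}),
    "D": frozenset({20, 21, 22}),
}
print(cluster(toy_fps))
```

In this toy case B and C merge first (similarity 6/7). Although A and B alone exceed the cutoff (5/6), the averaged similarity between A and the merged B+C cluster falls below 0.8, so A stays separate, which shows how the averaging step affects the final number of clusters.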

Molecular Docking
Representative compounds from each cluster were subjected to molecular docking against BACE1. The crystal structure of BACE1 (PDB ID: 2wjo, 2.50 Å) was obtained from the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB). Using Discovery Studio Visualizer software, heteroatoms and water were removed, and polar hydrogens were added to the BACE1 structure. BACE1 was loaded into the PyRx software and set as the macromolecule. The three-dimensional (3D) structures of each representative compound and the reference ligand (2,2,4-trihydroxychalcone; CID: 5811533) were obtained from the PubChem Compound Database (NCBI). These files were uploaded to the Open Babel tab in PyRx, and energy minimization was selected for all the ligands. The force field used by default in the Open Babel software package is the universal force field. The docking of BACE1 and the flavonoid compounds was simulated using PyRx (AutoDock Vina) to determine the binding energy of each flavonoid with BACE1. The Lamarckian genetic algorithm (LGA) method was used for all calculations between the protein and ligand. The docking site on the protein target was defined by establishing a grid box with the dimensions of X: 61.0616, Y: 51.4261, and Z: 59.9516 Å, with a center of X: 16.0076, Y: 39.7409, and Z: 40.5572 Å. The complexes of BACE1 with the flavonoid compounds were visualized using Discovery Studio Visualizer (Trott and Olson, 2010; Dallakyan and Olson, 2015).
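For readers who prefer the command-line version of AutoDock Vina over the PyRx interface, the grid box above can be written as a Vina configuration file. The sketch below only formats the reported center and size; the receptor and ligand file names are hypothetical placeholders, and whether the study used any settings beyond the box is not stated in the text.

```python
# Grid box reported in the text (all values in angstroms).
CENTER = (16.0076, 39.7409, 40.5572)
SIZE = (61.0616, 51.4261, 59.9516)


def vina_config(receptor: str, ligand: str, center, size) -> str:
    """Build the text of an AutoDock Vina config file for one docking run."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value}")
    return "\n".join(lines) + "\n"


def box_corners(center, size):
    """Return the minimum and maximum corners of the search box."""
    low = tuple(c - s / 2 for c, s in zip(center, size))
    high = tuple(c + s / 2 for c, s in zip(center, size))
    return low, high


# File names below are assumptions for illustration only.
print(vina_config("bace1_2wjo.pdbqt", "ligand.pdbqt", CENTER, SIZE))
print(box_corners(CENTER, SIZE))
```

The corner coordinates are a quick way to confirm that the box encloses the catalytic residues (Asp32 and Asp228) in the prepared receptor file.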

General Findings
From the 12 articles that were filtered, 66 flavonoids were collected, together with information such as chemical structure, molecular properties, and IC50 values. Two duplicates, naringenin and baicalein, were found and removed. In total, 64 compounds were filtered (Figure 2). For each of the 64 flavonoids, the core fragment was identified; these include flavone, flavanone, chalcone, rotenoid, and isoflavone. The core fragment of one compound, neocyclomorusin, could not be identified.
FIGURE 2 | Flowchart of the literature search conducted and number of articles filtered at each stage. The process is similar to that of a systematic review.
Frontiers in Chemistry | www.frontiersin.org June 2022 | Volume 10 | Article 874615

Clustering Analysis Based on Core Fragment and Functional Groups of Flavonoids
Using DataWarrior, the 64 flavonoids were clustered based on the similarity of their structures, which resulted in 12 clusters. Full details of each compound and its cluster are listed in Table 3, and a graph showing the distribution of IC50 values for the compounds in each cluster is shown in Figure 3. Each cluster contains only compounds with one distinct core fragment, with the exception of cluster 4, which contains compounds with either a flavanone or a flavone core fragment, all bearing at least one sugar moiety. Flavonoids from these 12 studies exhibited IC50 values ranging from 1.2 to 147 μM against the BACE1 enzyme. From Figure 3, it can be seen that clusters 1, 2, 7, and 9 contain compounds with the flavone core exclusively, with the highest number of compounds in cluster 1 (31 compounds). Additionally, cluster 4 contains two compounds with the flavone core. The standard structure of flavone consists of a double bond between C2 and C3, with a ketone at C4. The flavones showed cLogP values of 0.21-7.5, a total surface area of 191-422 m²/g, and a molecular weight of 270-541 g/mol. Flavones in clusters 1, 2, 7, and 9 contain around 0-5 hydrogen bond donors and 3-8 hydrogen bond acceptors, but flavones in cluster 4 contain 6 hydrogen bond donors and 12 hydrogen bond acceptors. The higher number of hydrogen bond acceptors and donors of compounds in cluster 4 may be attributed to the presence of a sugar moiety. The IC50 values of compounds with the flavone core fragment were below 100 μM except for mormin in cluster 7 (103.5 μM) and morusinol in cluster 9 (135.9 μM). Compound 3,7,3′-tri-O-methylquercetin from cluster 1 showed the lowest IC50 value, which was 1.20 μM. 3,7,3′-tri-O-methylquercetin has one hydroxyl group at C5 and one methoxy group at C7 (ring A). It also has a hydroxyl group at C4′ and a methoxy group at C5′ (ring B). In addition, the presence of a methoxy group at C3 (ring C) may contribute to the inhibition of BACE1.
It can be seen from Figure 3 that compounds with the flavone core fragment do not show an obvious correlation between the 2D structure and bioactivity: the clusters have distinct structures, but their ranges of IC50 values overlap.
Cluster 4 is the only cluster that includes compounds with different core fragments; it contains four flavanones and two flavones. Unlike flavones, flavanones have no double bond between C2 and C3, but all compounds in cluster 4 contain at least one sugar moiety. The IC50 values of compounds in cluster 4 range from 2.34 to 23.2 μM. Didymin showed the most potent BACE1 inhibition in this cluster, with an IC50 of 2.34 μM, which may be due to the presence of disaccharides at ring A. Although poncirin also has disaccharides at ring A, the different position of the glycosidic linkage may explain why the IC50 value of poncirin is slightly higher than that of didymin. Prunin, which has only one sugar moiety, showed a lower IC50 value than hesperidin, which, like didymin, has two sugar moieties. The substitution of the hydroxyl group at different positions of ring B may explain the difference in the IC50 values observed. Additionally, the two flavones in the cluster, 3,7-di-O-methylquercetin-4′-O-glucoside and 3,7-di-O-methylquercetin-3′-O-glucoside, had the highest IC50 values of 21.2 and 23.2 μM, respectively. Compounds with the flavanone core can also be found in cluster 3, but without a sugar moiety. The physicochemical properties of the flavanones in clusters 3 and 4 differ considerably. In cluster 3, the cLogP ranged between 2 and 2.8 (vs. −0.18 to 0.17 in cluster 4), with a total surface area of 186-215 m²/g (vs. 294-411 m²/g) and a molecular weight ranging from 256 to 303 g/mol (vs. 434-611 g/mol). In cluster 3, IC50 values ranged from 22 to 39 μM. The comparison of flavanones between clusters 3 and 4 showed the most obvious relationship between 2D structure and bioactivity, where flavanones with sugar moieties produce lower IC50 values. Thirteen flavonoids with the chalcone core fragment were retrieved from the literature screening.
Chalcone structures contain an α,β-unsaturated ketone with two aromatic rings (rings A and B). These chalcones can be further categorized as piperazine- and non-piperazine-substituted chalcones. The piperazine-substituted chalcones (8 compounds) were all grouped in cluster 12, while the non-piperazine-substituted chalcones were grouped in clusters 5 (1 compound) and 6 (4 compounds). The SAR of the chalcones as a whole was not obvious; however, the relationship is clearer among the non-piperazine-substituted chalcones. Cardamonin (cluster 5) showed the highest inhibitory activity among all chalcones, with an IC50 of 4.35 μM. In contrast, compounds in cluster 6 showed IC50 values above 4.7 μM, with compounds 2j ((E)-1-(4-bromo-2-hydroxy-5-iodophenyl)-3-(4-fluorophenyl)propanone), 2h ((E)-1-(4-chloro-2-hydroxy-5-iodophenyl)-3-(4-methoxyphenyl)propanone), 2n ((E)-3-(4-fluorophenyl)-1-(2-hydroxy-5-iodo-4-methoxyphenyl)prop-2-en-1-one), and 2p ((E)-1-(2-hydroxy-5-iodo-4-methoxyphenyl)-3-(4-methoxyphenyl)prop-2-en-1-one) showing IC50 values of 4.703, 13.82, 25.07, and 70.79 μM, respectively. Cardamonin contains a hydroxyl group at the ortho and para positions of ring A and no substitution in ring B. All compounds in cluster 6 are substituted at the ortho, para, and meta positions of ring A and the para position of ring B, and all contain iodine at the meta position of ring A. Compound 2j, which has the lowest IC50 in cluster 6, is substituted with bromine and fluorine at the para position of rings A and B, respectively. Substitution with functional groups with decreasing electronegativity at the para position of rings A and B seems to result in an increase in IC50 values.
When the bromine in 2j is replaced with chlorine in ring A and the fluorine is replaced with methoxy in ring B, this produces compound 2h, which has a higher IC50 value. When methoxy is substituted at the para position of ring A, this produces compounds with even higher IC50 values, as seen in 2n and 2p. These two compounds differ in the substitution at the para position of ring B, where 2n is substituted with a fluorine and recorded a lower IC50 value than 2p, which is substituted with a methoxy group. This suggests that the potency of non-piperazine-substituted chalcones decreases with the substitution of functional groups with decreasing electronegativity. Additionally, the absence of substitution at ring B seems to correlate with higher activity, and, as in ring A, substitution at the para position with functional groups with decreasing electronegativity results in an increase in the IC50 value. In terms of the pharmacokinetic profile, the cLogP values for the non-piperazine-substituted chalcones were in the range of 2.5-4.22, with a total surface area of 212-255 m²/g and a molecular weight of 270-448 g/mol. The number of hydrogen bond acceptors in these clusters was around 2-4, while the number of hydrogen bond donors was between 1 and 2. With regard to the piperazine-substituted chalcones, although they differ only by the substitution at the para position of ring B, no correlation can be established between the properties of the functional groups (i.e., electronegativity or lipophilicity) and the IC50 values.
The lowest IC50 value among the piperazine-substituted chalcones was observed in compound PC3 ((2E)-3-(4-methoxyphenyl)-1-[4-(1-piperazinyl)phenyl]-2-propen-1-one; 6.72 μM), where a methoxy is substituted at the para position of ring B. This is followed by PC8 ((2E) In terms of the pharmacokinetic profile, the cLogP values for the piperazine-substituted chalcones were in the range of 2.7-4.9, with a total surface area of 246-276 m²/g and a molecular weight of 306-360 g/mol, which are comparable to those of the non-piperazine-substituted chalcones.
The remaining three clusters contained only one compound each, where cyclomorusin, neocyclomorusin, and biochanin A were clustered in clusters 8, 10, and 11, respectively, each representing a distinct core fragment. Neocyclomorusin from cluster 10 showed the highest IC50 value, which was 146.1 μM.
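Because the reported IC50 values span roughly two orders of magnitude, comparisons across clusters are often easier on a logarithmic scale. The sketch below converts a few of the IC50 values quoted above to pIC50 using the standard definition, pIC50 = −log10(IC50 in mol/L). This conversion is not part of the original analysis, and the dictionary labels are shorthand chosen for illustration.

```python
import math


def pic50(ic50_um: float) -> float:
    """Convert an IC50 given in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    # 1 uM = 1e-6 mol/L, so -log10(x * 1e-6) = 6 - log10(x).
    return 6.0 - math.log10(ic50_um)


# IC50 values (uM) quoted in the text; the labels are shorthand.
reported_ic50_um = {
    "most potent flavone (cluster 1)": 1.20,
    "didymin (cluster 4)": 2.34,
    "cardamonin (cluster 5)": 4.35,
    "neocyclomorusin (cluster 10)": 146.1,
}

for name, value in reported_ic50_um.items():
    print(f"{name}: IC50 = {value} uM, pIC50 = {pic50(value):.2f}")

fold = max(reported_ic50_um.values()) / min(reported_ic50_um.values())
print(f"weakest / strongest IC50 ratio: {fold:.0f}-fold")
```

A higher pIC50 corresponds to a more potent inhibitor, so the ordering is the reverse of the IC50 ordering.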
structure. However, the IC50 value of cyclomorusin was in contrast with the docking result, where cyclomorusin showed lower potency in inhibiting BACE1 compared to other flavonoids (101.2 μM).
Another compound, neocyclomorusin, which belongs to cluster 10, also showed a low binding score of −9.80 kcal/mol. Similar to cyclomorusin, neocyclomorusin showed lower potency in inhibiting BACE1 (146.1 μM) than the rest. Uniquely, these two compounds have a similar pentacyclic structure formed by the ring closure of the prenyl group. As can be seen in Figures 4S-T, the embedded oxygen in the pentacyclic structure of neocyclomorusin binds to Ser35 of BACE1 through a hydrogen bond. Moreover, a π-anion interaction was observed between ring B of neocyclomorusin and Asp32 of BACE1. This π-anion interaction is stronger than the hydrogen bond (Bartlett et al., 2013). The presence of these structures in neocyclomorusin may contribute to the lower binding energy against BACE1.
Of all the cluster representatives, didymin (cluster 4) had the highest number of hydrogen bonds, which was six (Ile126, Trp76, Asp228, Gly230, Ser35, and Gly34), compared to the reference ligand, which formed only three hydrogen bonds, with Asp106, Gln73, and Lys107. Didymin showed a binding energy of −9.40 kcal/mol, with the hydroxyl groups of the two sugar molecules forming most of the hydrogen bond interactions with the amino acid residues of BACE1. One of the hydroxyl groups interacts with the catalytic aspartic residue, Asp228. Other interactions such as π-sigma (Tyr71), π-π T-shaped (Phe108), and π-alkyl (Leu30) were present between didymin and BACE1. The π-sigma interaction was observed between Tyr71 and the methyl group of the sugar moiety, whereas the other π interactions involved the aromatic rings of didymin.

DISCUSSION
Owing to their favorable structures and properties, flavonoids have been considered potential BACE1 inhibitors. However, to date, no flavonoids have advanced into clinical trials as BACE1 inhibitors. Here, an analysis of flavonoids that have been tested as BACE1 inhibitors in studies published from 2010 to 2022 was conducted to discover the pharmacophoric features of flavonoids against BACE1. In the clustering analysis, flavonoids were clustered based on their core fragments, and binding interactions between flavonoids and BACE1 were analyzed using molecular docking. These two analyses were performed to determine the relationship between the chemical structure and inhibitory activity of flavonoids against BACE1. From the results of the clustering and molecular docking of the 64 flavonoids compiled, several observations can be made. First, compounds with the flavanone core fragment showed an apparent relationship between the 2D structure and bioactivity. Flavanone core fragments were divided into two clusters, where compounds in the two clusters differed by the presence of sugar moieties. The sugar moiety may increase inhibition toward BACE1, as seen in Table 3, where most of the flavonoids in cluster 4 have lower IC50 values than those in cluster 3. This has also been discussed by Ali et al. (2019), where the number and position of sugar moieties and the different positions of the glycosidic linkages of flavanone may affect the inhibitory activity of BACE1. A few studies have also demonstrated that the presence of a sugar moiety in natural product compounds contributes to inhibitory activity against BACE1, such as terpenoids isolated from Dipsacus radix and rubrofusarin and its derivatives from Cassia obtusifolia Linn (Shrestha et al., 2018; Wang et al., 2021).
Second, the non-piperazine-substituted chalcone fragment also showed an obvious relationship between 2D structure and bioactivity. Substitution at the para position of rings A and B with functional groups of decreasing electronegativity seems to increase the IC50 value, that is, to decrease potency. When ring A carries a hydroxyl group at the meta position and ring B is unsubstituted, as in cardamonin, the lowest IC50 value among the chalcones is obtained. Molecular docking shows that cardamonin formed three hydrogen bonds with BACE1 compared with only one for compound 2n ((E)-3-(4-fluorophenyl)-1-(2-hydroxy-5-iodo-4-methoxyphenyl)prop-2-en-1-one). In a study by Ma et al. (2011), a series of hydroxychalcones was evaluated for inhibitory activity against BACE1. The structure-activity relationship from that study showed that the inhibitory activity of the chalcones against BACE1 was governed by the hydroxyl substituents on rings A and B, and the most potent chalcone was substituted with four hydroxyl groups (IC50 = 0.27 μM). The high potency may be attributed to the ability of the compounds to form hydrogen bonds with the catalytic site. Van der Waals interactions may also play an important part in the interaction between chalcones and BACE1. When compounds in cluster 6 were substituted at the para position of rings A and B with groups of decreasing electronegativity (Br > Cl > CH3O and F > CH3O, respectively), the IC50 value increased. In addition, flavonoids with chalcone core fragments did not violate Lipinski's rule of five, which suggests good absorption and bioavailability. Chalcones are also known as "privileged structures," as both natural and synthetic chalcone derivatives have shown compelling biological activities with clinical potential against different types of diseases (Zhuang et al., 2017).
Third, the number and position of sugar moieties attached to the flavanone core play an important role in inhibiting BACE1. These sugar moieties can be linked to an aglycone as monosaccharides, disaccharides, or oligosaccharides. In this study, the flavanones in cluster 4 differ in the number of sugar moieties and in the positions at which they are attached to the core fragment. Didymin, which contains a disaccharide, showed the most potent inhibitory activity against BACE1 among the flavanones in cluster 4. Moreover, the hydroxyl group of one of the sugar units in didymin interacts with the catalytic aspartic residue of BACE1 (Asp228). Although poncirin also contains a disaccharide, it exhibited a different level of potency against BACE1 owing to a different position of the glycosidic linkage. These results are supported by Shrestha et al. (2018), who investigated the activity of rubrofusarin and its derivatives against AChE and BACE1. That study found that the glucose moiety at position C6 of nor-rubrofusarin is responsible for the inhibitory activity against BACE1. Moreover, a compound with two sugar moieties inhibited AChE nine times more strongly than a compound with no sugar moiety. Choi et al. (2016) also showed that the sugar units of ginsenosides formed the most interactions with the residues of the BACE1 active site through hydrogen bonds and van der Waals interactions.
The majority of BACE1 inhibitors seem to have a large number of hydrogen bond donors and acceptors and form strong hydrogen bonds with Asp32 and Asp228 of BACE1 (Hernández et al., 2016). In this study, didymin formed the most hydrogen bonds among the representative flavonoids, including a hydrogen bond with Asp228. This is consistent with the low binding energy score (−9.80 kcal/mol) and high inhibitory activity (IC50 = 2.34 μM) of didymin. This finding is supported by molecular docking studies that examined flavonols and flavones as BACE1 inhibitors. In those studies, myricetin was found to form the largest number of hydrogen bonds with BACE1. Two hydrogen bonds were formed with Asp32 through C3-OH of ring C. Several other hydrogen bonds were formed between Trp198 and C4′-OH and C5′-OH of ring B and between Gln73 and C7-OH of ring A (Shimmyo et al., 2008). A study by Shrestha et al. (2018) also showed that nor-rubrofusarin 6-O-β-D-glucoside had a binding energy of −8.34 kcal/mol in the allosteric inhibition mode with BACE1 and formed six hydrogen bonds with Gln303, Gln304, Glu339, and Gly156. In the catalytic inhibition mode, nor-rubrofusarin 6-O-β-D-glucoside showed a binding energy of −6.61 kcal/mol, with six hydrogen bonds formed with Asp32, Trp76, Asn37, Ile26, and Tyr198 (Shrestha et al., 2018). Although didymin formed the most hydrogen bonds with BACE1, it violates two of Lipinski's rules, as it has 14 hydrogen bond acceptors and 7 hydrogen bond donors owing to the presence of the sugar moiety.
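The rule-of-five check mentioned above can be sketched as a simple threshold count. This is a minimal illustration, assuming the standard Lipinski limits (no more than 5 hydrogen bond donors, no more than 10 hydrogen bond acceptors, molecular weight no more than 500 Da, and logP no more than 5); the function name `count_violations` is hypothetical, and only the descriptors that are supplied are evaluated.

```python
# Standard rule-of-five limits; a value above the limit counts as one violation.
LIMITS = {"hbd": 5, "hba": 10, "mw": 500.0, "logp": 5.0}


def count_violations(hbd=None, hba=None, mw=None, logp=None) -> int:
    """Count rule-of-five violations among the descriptors that are supplied.

    Descriptors left as None are skipped, so the count reflects only the
    properties that were actually provided.
    """
    supplied = {"hbd": hbd, "hba": hba, "mw": mw, "logp": logp}
    return sum(
        1
        for name, value in supplied.items()
        if value is not None and value > LIMITS[name]
    )


# Hydrogen bond donor and acceptor counts for didymin as stated in the text;
# molecular weight and logP are not given there, so they are not evaluated.
print(count_violations(hbd=7, hba=14))  # prints 2
```

Both hydrogen bond counts exceed their limits, which corresponds to the two violations attributed to the sugar moiety.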
Another unique flavonoid structure that should be highlighted is the pentacyclic structure found in cyclomorusin and neocyclomorusin. The extra ring in both structures may contribute to the low binding energy scores observed. However, a previous study reported that the pentacyclic structure of both compounds resulted in lower potency against BACE1, as a free hydroxyl group is needed for BACE1 inhibitory activity (Cho et al., 2011). Further study of the pentacyclic structure is needed to understand whether it contributes significantly to BACE1 inhibitory activity. It should be noted that molecular dynamics simulation, which could not be performed in this study, would further enhance our understanding of the binding interactions between flavonoids and BACE1 and should be performed in future studies. Molecular dynamics simulation provides more detail on the movement of every atom in a protein than molecular docking, which only generates the binding mode of a ligand to a protein and predicts the number of possible conformations. Molecular dynamics simulation can capture important biomolecular processes such as conformational change, ligand binding, and protein folding, revealing the positions of all atoms at a very fine temporal resolution (Hollingsworth and Dror, 2018).
Despite various studies supporting flavonoids as potential BACE1 inhibitors, there are significant challenges with the isolation, purification, and pharmacokinetic properties of flavonoids. Flavonoids are very difficult to isolate, as only a small amount can be obtained at one time. Additionally, flavonoids are highly metabolized and have low solubility and poor oral absorption (Amawi et al., 2017). However, several approaches could address these issues. With regard to isolation, nanoharvesting has been applied to increase product yield. Kurepa et al. (2014) used nanoparticles such as anatase TiO2 to conjugate flavonoids rich in enediol and catechol groups. This technique eliminates the use of organic solvents and yielded a higher percentage of flavonoid compounds (Kurepa et al., 2014). Moreover, structural modifications can be made to increase the solubility and stability of flavonoids. The replacement of a hydroxyl group with an ethyl group in quercetin was shown to improve its stability against metabolic enzymes by preventing oxidative degradation. Insertion of the ethyl group also increased the lipophilicity of quercetin from 10.7% to 18.8% (Grande et al., 2016). Micro- and nano-delivery systems, such as nano-emulsions and nano-crystals, could also help to increase the bioavailability of flavonoids (Amawi et al., 2017). Yi et al. (2017) developed a novel silybin nano-crystal using high-pressure homogenization. The silybin nano-crystal showed a faster release rate in vitro and a higher peak concentration in vivo than the silybin coarse powder. This demonstrated that the nano-crystal technique increases bioavailability and is a promising oral drug delivery system for poorly soluble drugs such as flavonoids (Yi et al., 2017).

CONCLUSION
In this study, flavonoids that have been tested against the BACE1 enzyme in studies published from 2010 to 2022 were analyzed. Specifically, a structure-activity relationship (SAR) analysis involving clustering and molecular docking was conducted. The key findings are as follows: 1) flavanones with sugar moieties showed higher inhibitory activity than those without sugar moieties, and the number of sugar moieties and the position of the glycosidic linkage may also affect the inhibitory activity; 2) non-piperazine-substituted chalcones substituted at the para position of both rings with functional groups of decreasing electronegativity showed decreased inhibitory activity, and molecular docking indicates that ring A is involved in hydrogen bonding, whereas ring B is involved in van der Waals interactions with BACE1; and 3) hydrogen bonding is an important interaction with the catalytic site of BACE1. It should be noted that the SAR analysis performed in this study is limited to 2D similarities only, and further studies involving other similarity metrics are warranted. Molecular dynamics studies are also warranted to examine the behavior of BACE1 with flavonoids in full atomic detail and at very fine temporal resolution. Overall, these findings may aid the design of highly potent and specific BACE1 inhibitors, which could help to delay the progression of AD.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.