A Perspective on Reagent Diversity and Non-covalent Binding of Reactive Carbonyl Species (RCS) and Effector Reagents in Non-enzymatic Glycation (NEG): Mechanistic Considerations and Implications for Future Research

This perspective focuses on illustrating the underappreciated connections between reactive carbonyl species (RCS), initial binding in the nonenzymatic glycation (NEG) process, and nonenzymatic covalent protein modification (here termed NECPM). While glucose is the central species involved in NEG, recent studies indicate that the initially-bound glucose species in the NEG of human hemoglobin (HbA) and human serum albumin (HSA) are non-RCS ring-closed isomers. The ring-opened glucose, an RCS structure that reacts in the NEG process, is most likely generated from previously-bound ring-closed isomers undergoing concerted acid/base reactions while bound to protein. The generation of the glucose RCS can involve concomitantly-bound physiological species (e.g., inorganic phosphate, water, etc.); here termed effector reagents. Extant NEG schemes do not account for these recent findings. In addition, effector reagent reactions with glucose in the serum and erythrocyte cytosol can generate RCS (e.g., glyoxal, glyceraldehyde, etc.). Recent research has shown that these RCS covalently modify proteins in vivo via NECPM mechanisms. A general scheme that reflects both the reagent and mechanistic diversity that can lead to NEG and NECPM is presented here. A perspective that accounts for the relationships between RCS, NEG, and NECPM can facilitate the understanding of site selectivity, may help explain overall glycation rates, and may have implications for the clinical assessment/control of diabetes mellitus. In view of this perspective, concentrations of ribose, fructose, Pi, bicarbonate, counter ions, and the resulting RCS generated within intracellular and extracellular compartments may be of importance and of clinical relevance. Future research is also proposed.


INTRODUCTION
Reactive carbonyl species (RCS) are electrophiles containing one or more carbonyl functional groups, typically aldehydes and/or ketones, which are present under physiological conditions. RCS arise in vivo (a) as enzymatic and/or metabolic products (Tessier, 2010; Turk, 2010); (b) from exogenous sources (Uribarri et al., 2007); and (c) from nonenzymatic glycation processes (Tessier, 2010). Certain RCS exist as species in equilibrium with other non-RCS isomers and, as such, can be termed transient RCS. The ring-opened isomer of the monosaccharide D-glucose is a transient RCS because it equilibrates with other less reactive ring-closed isomers that do not contain a reactive carbonyl. A biological system that exhibits adverse effects as a result of high RCS concentration is said to be under "carbonyl stress" (Suzuki and Miyata, 1999; Turk, 2010). The biochemical implications of carbonyl stress include, but are not limited to: (a) increased oxidation of carbohydrates and lipids (oxidative stress); (b) increases in the steady-state level of reactive species, including reactive oxygen species (ROS), advanced lipoxidation end products (ALE), and advanced glycation end products (AGE); and (c) perturbed cellular metabolism (Yim et al., 2001; Thornalley, 2005; Shumaev et al., 2009). Carbonyl stress has clinical implications in many chronic and degenerative diseases including diabetes, obesity, kidney and heart diseases, atherosclerosis, and neurodegenerative diseases.
The transient ring-opened glucose RCS isomer reacts nonenzymatically with intracellular and extracellular proteins to form glycated proteins, both in vivo (Bookchin and Gallop, 1968; Bunn et al., 1975) and in vitro (Baynes et al., 1989) over prolonged periods of time (weeks to months) via a multireaction process termed non-enzymatic glycation (NEG). For this perspective, we define the NEG process as the four-stage process that involves the initial binding of glucose, followed by Schiff base formation, then Amadori formation, and finally the formation of AGE. One result of the process is the production of covalently-modified protein(s), which can alter protein function (Philippe and Bourdon, 2011) and is thought to be linked to the chronic complications of diabetes mellitus (Forbes and Cooper, 2013). The measurement of glycated human hemoglobin (i.e., HbA1c) is currently a cornerstone of the management of diabetes mellitus. It is used for diagnostic and management purposes (Swislocki, 2012). The pathophysiological implications of NEG also include age-related chronic diseases such as microangiopathy, retinopathy, and nephropathy (Brownlee, 1995; Baynes, 2001).
Glucose is not, however, the only species whose noncovalent binding to proteins can lead to covalently-modified protein. When other sugars (e.g., ribose, fructose, G6P, etc.) and non-sugar metabolites (e.g., glyceraldehyde, glyoxal, etc.) bind to protein and ultimately generate covalently-modified proteins, we refer to the process as non-enzymatic covalent protein modification (NECPM). The NECPM process, as defined here, differs from NEG in that NEG involves only glucose whereas NECPM involves non-glucose species. In NECPM, the bound species may, or may not, follow the four-stage NEG process, may have a different degree of reversibility relative to NEG, and may lead to a different array of AGE relative to NEG. The reason for the NEG/NECPM distinction is to emphasize that different reagents via different mechanistic pathways can lead to different covalently-modified proteins. For example, most of the non-glucose RCS metabolites are aldehydes that cannot cyclize but have the potential for significant hydrate formation (Swenson and Barker, 1971; Kitayama et al., 2002). In addition, the binding and follow-up chemistry of hydrates is completely unrelated to the NEG process. The NEG versus NECPM distinction is also of value because of the clinical significance of glucose as the predominant glycating agent in vivo and the realization that sites for protein modification are not identical for glucose versus glyceraldehyde. As an example, Lys82 on the β-chain of HbA is not glycated by glucose but is covalently modified by glyceraldehyde (Acharya and Manning, 1980; Delpierre et al., 2004; Ito et al., 2010). We posit that glucose follows an NEG pathway whereas glyceraldehyde follows a disparate NECPM pathway.
The goal of this perspective is to illustrate the diversity of reagents and the multiplicity of mechanisms that can be involved in early processes that lead to the covalent modification of proteins. For this purpose, HbA and human serum albumin (HSA) serve as model proteins. The reagents considered include glucose (in NEG), non-glucose sugars, RCS formed in vivo (which participate in NECPM), and effector reagents. We define an effector reagent as a small molecule that concomitantly binds with a reagent in a protein pocket and can facilitate or inhibit covalent bond making or bond breaking (e.g., water, Pi, bicarbonate, etc.). As an interdisciplinary team of authors, we posit concise, comprehensive schemes for NEG and NECPM that reflect recent research. Finally, we put forward future research directions and issues to be considered.

INITIAL GLUCOSE/PROTEIN BINDING IN NEG: THE ROLE OF RCS
Glucose undergoes reversible mutarotation in aqueous media, whereby five different isomers interconvert, including a pair of ring-closed pyranoses (α and β) and a pair of ring-closed furanoses (α and β). The ring-closed species are not RCS and are not sufficiently electrophilic to significantly react with protein amino acid residues. The fifth structure, through which the four ring-closed isomers interconvert, is a ring-opened isomer containing a free aldehyde group. This species is a transient RCS that is sufficiently electrophilic to react with amino acid residues in NEG.
Results from HSA (Wang et al., 2013) illustrate that the initially-bound glucose species are the non-RCS ring-closed isomers that then ring open once bound. This observation was confirmed by Clark et al. (2013) for glucose interactions with HbA and is consistent with studies of glucose and other monosaccharides binding to several enzymes (D-xylose isomerase, Lee et al., 1990; Allen et al., 1994; galactose mutarotase, Thoden et al., 1994; and phosphoglucose isomerase, Lee and Jeffrey, 2005). In fact, we are unaware of any detailed mechanistic investigation in support of the direct binding of the transient, acyclic glucose isomer to protein. The direct binding (and subsequent reaction) of the ring-opened isomer of glucose in NEG is very unlikely because only 0.002% of glucose in aqueous solution is in the ring-opened form at equilibrium (Hayward and Angyal, 1977; Bunn and Higgins, 1981), corresponding to a concentration of just 0.12 µM in human plasma/serum. This concentration is likely an overestimate of the actual concentration of available ring-opened isomer because the ring-opened isomer rapidly re-closes. The lifetime of the ring-opened isomer is exceedingly short, as it reverts (via intramolecular processes) to ring-closed isomers at a rate faster than the ¹H NMR time scale at 300 MHz (75 ms, or ∼10⁻⁴ s; Bryant, 1983). Moreover, ring re-closure is likely faster than the process of binding and then reacting once bound. In contrast, the ring-closed isomers of glucose are present at roughly 50,000 times the concentration of the ring-opened isomer, are long-lived, and bind better than does the ring-opened isomer (Clark et al., 2013). Thus, the ring-opened RCS structure is most likely generated from previously-bound, ring-closed isomers (Clark et al., 2013; Wang et al., 2013; e.g., Figure 1). Once generated, the transient RCS must then react with protein before it either reverts to a non-electrophilic, ring-closed isomer or exits the protein pocket.
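The concentration argument above can be checked with a few lines of arithmetic. The following Python is a minimal sketch, not part of the original analysis: the 0.002% ring-opened fraction and the 0.12 µM figure are taken from the text, whereas the total glucose concentration of 6 mM is an assumption back-calculated from those two numbers, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the ring-opened glucose concentration.
# The 0.002% fraction comes from the text; the total glucose concentration
# is an assumption back-calculated from the quoted 0.12 µM figure.

OPEN_FRACTION = 0.002 / 100   # fraction of glucose that is ring-opened at equilibrium
TOTAL_GLUCOSE_MM = 6.0        # assumed total glucose (mM), implied by 0.12 µM


def ring_opened_um(total_glucose_mm: float, open_fraction: float = OPEN_FRACTION) -> float:
    """Return the equilibrium ring-opened glucose concentration in µM."""
    return total_glucose_mm * 1000.0 * open_fraction


def closed_to_open_ratio(open_fraction: float = OPEN_FRACTION) -> float:
    """Return the ratio of ring-closed to ring-opened glucose."""
    return (1.0 - open_fraction) / open_fraction


if __name__ == "__main__":
    print(f"ring-opened glucose: {ring_opened_um(TOTAL_GLUCOSE_MM):.2f} µM")
    print(f"closed/open ratio:   {closed_to_open_ratio():,.0f}")
```

Under these assumptions the ring-closed isomers outnumber the ring-opened isomer by a factor of about 50,000, which matches the ratio quoted in the text.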
Extant schemes for NEG depict the ring-opened isomer as the singular species that directly binds to protein, when, in fact, it is overwhelmingly the ring-closed, non-RCS isomers that bind and then are reversibly converted to the ring-opened glucose isomer while bound (Clark et al., 2013; Wang et al., 2013). These findings are not reflected in historic or even in recent NEG schemes (Rodwell et al., 2015; Welch et al., 2016).

GLUCOSE RCS GENERATION WHILE BOUND TO PROTEIN
How Does a Glucopyranose Ring Open While Bound?
When pure α-glucopyranose is placed in DMSO (a polar aprotic solvent), mutarotation does not proceed (Ballash and Robertson, 1973). Thus, spontaneous ring opening does not occur. Therefore, generation of the reactive ring-opened RCS of glucose, whether in aqueous media or while bound to a protein, requires an effector reagent. Water, as an effector reagent, is thought to bridge the anomeric OH and the hemiacetal oxygen of a glucopyranose and serve as both an acid and a base in a concerted process (Silva et al., 2006; Qian, 2012). A typical protein pocket has a dielectric constant of 2-4 and, as such, there are often no water molecules in the pocket (Simonson and Brooks, 1995). That said, one or more water molecules can exist within protein pockets from incomplete desolvation of the protein upon glucose binding (Mowbray and Cole, 1992). Further, water complexes of Pi and/or glucose, and/or the concomitant binding of independent water molecules are also possible (Silva et al., 2006; Qian, 2012). For surface proteins with water exposure, water should be readily available. Thus, water can serve as an effector reagent to assist ring opening of glucose. However, effector reagents other than water (serving as an acid, base, or acid/base catalyzing species) should also be considered. For example, effector reagents for HbA include physiological anions/buffers such as Pi (Figure 1), bicarbonate, 2,3-bisphosphoglycerate (2,3-BPG), etc. The ring opening of a bound glucopyranose requires the deprotonation of the OH on the anomeric carbon and the protonation of the hemiacetal oxygen (on C5). Ideally, this occurs without the formation of new ionic intermediates in a low dielectric environment. Therefore, irrespective of the identity of the effector reagent, concerted reactions are most likely.
Given the presence of acidic and basic amino acid residues in protein pockets, effector reagents are not always necessary. For example, the amine forms (R-NH₂) of internal lysine and histidine side chains and of the N-terminal amino acid residue can act as bases, while the ammonium ions (R-NH₃⁺) of internal lysines and N-terminal amino acids, and/or Pi, can act as acids (Figure 1). That said, the common external effector reagents (water, Pi, etc.) have greater geometric freedom to accommodate the proper alignment for concerted ring opening and also have greater buffering ability (Figure 1). Thus, rates of NEG and NECPM should increase as effector reagent concentration increases.
In fact, Pi is known to accelerate HbA glycation, though the mechanism(s) are uncertain (Gil et al., 1988, 2004; Kunika et al., 1989). In recent work (Clark et al., 2013; Smith et al., in preparation), ³¹P and ¹H NMR of model reactions and computational assessment of reaction thermodynamics and protein/substrate interactions highlight that: (1) Pi and a glucopyranose can undergo reversible, concomitant binding to HbA glycation sites, generating an array of HbA-bound Pi/glucopyranose complexes of comparable energy with different geometries, (2) the Pi within the HbA-bound Pi/glucopyranose complex achieves a geometry to ring-open the bound glucopyranose, and (3) one such geometry has Pi bridge a protein-bound glucopyranose, enabling Pi to abstract the proton on the anomeric OH while protonating the hemiacetal oxygen in a concerted process (Figure 1).
We assert that the observed site selectivity for glycation with glucose (NEG) may be related to the ability to generate the necessary electrophile (via ring-opening of the glucopyranose while bound), which varies from site-to-site.

THE POTENTIAL OF NON-GLUCOSE RCS IN NECPM
Glucose is not the only monosaccharide that is a physiological, transient RCS. Fructose (Bunn and Higgins, 1981; Wang et al., 2013), ribose (Bunn and Higgins, 1981), and G6P (glucose-6-phosphate, Haney and Bunn, 1976) may also undergo ring opening while bound and potentially proceed toward covalent protein modification. Bunn et al. (1975) and Swamy et al. (1993) show that each of these species covalently modifies proteins (HbA and lens crystallin, respectively) more extensively than glucose on a per molecule basis. It is reasonable to assert that ring-closed non-RCS isomers of these monosaccharides are the predominant species that initially bind. As such, they likely ring open while bound to generate the bound RCS needed for further reaction (Wang et al., 2013). Moreover, when glyceraldehyde is used instead of glucose for in vitro protein modification, covalent modification is much more extensive than that for glucose (Hamada et al., 1996). This is likely because glyceraldehyde proceeds via an NECPM mechanism that may involve the binding of both aldehydes and their hydrates, which proceed through later mechanisms unrelated to those in NEG.
FIGURE 1 | Schemes showing the mechanistic diversity for the production of the non-covalently bound reactive carbonyl species (RCS) in a protein pocket (either at the surface of the protein or in an internal pocket) from initially bound glucopyranose in intracellular (hemoglobin) and extracellular (albumin) proteins. In an amino-acid-residue-mediated mechanism (1), the protein (shown as a semi-circle) itself is acting as both the acid and base. In the Pi-mediated mechanism (2), an amino acid residue acts as the acid while the concomitantly bound Pi (as an effector reagent) acts as the base. In another Pi-mediated mechanism (3), Pi bridges the bound glucopyranose and acts as the effector reagent for both the acid and base chemistry. Water or bicarbonate, etc. can also play the role of the effector reagent (in fact, water is the effector reagent in glucose mutarotation in aqueous solution, Silva et al., 2006). Each of these mechanisms is an example of a concerted reaction that does not generate net charge and is similar to a charge relay enzymatic mechanism, such as that for chymotrypsin (Blow et al., 1969; Tsukuda and Blow, 1985; Park et al., 2016). Mechanisms are depicted as taking place with β-glucopyranose, but comparable mechanisms with α-glucopyranose also occur.
The formation of RCS derived from monosaccharide degradation prior to protein interaction needs to be considered. At pH 7, Pi enhances the conversion of α-glucopyranose to β-glucopyranose in D₂O by a factor of 21 (Bailey et al., 1970). Once Pi generates a ring-opened glucose isomer (before binding to protein), a second Pi can further react with the acyclic glucose isomer to form additional RCS (Thornalley et al., 1999; Henning et al., 2014). When summing the degradation products from the above work and extrapolating to include fructose, ribose, and G6P degradation products, more than 20 Pi-promoted RCS can theoretically be formed prior to protein binding. Binding of such RCS is consistent with the rapid modification of protein by species like glyceraldehyde (Acharya and Manning, 1980). Further support comes from preliminary computations in our laboratory indicating that at least a subset of these RCS undergo binding to HbA and HSA (with similar exothermicity to glucopyranose binding) with geometries suitable to react, such that the RCS can act as electrophiles and covalently modify HbA at Val1 of the β-chain, the site specific for HbA1c formation by glucose (Bookchin and Gallop, 1968; Bunn et al., 1975). Various researchers (Toi et al., 1967; Oya et al., 1999; Thornalley, 2005; Nasiri et al., 2011) have highlighted that certain of these RCS (e.g., glyoxal and methylglyoxal) react with arginine residues and lead to the covalent modification of proteins (which we, here, would assert are NECPM processes).
This highlights yet another distinction between NEG and NECPM, as glucose (in NEG) does not react readily with arginine.
Arguments for the importance of these RCS and NECPM are: (1) degradation products from glucose, fructose, and ribose are far more electrophilic than glucose (e.g., methylglyoxal is 20,000 times more reactive than glucose, Thornalley et al., 1999); (2) unlike the glucopyranoses, many of the RCS do not need to be modified, while bound, to make an electrophile and, therefore, there is no need for the concomitant binding of an effector reagent; and (3) the residence time demand (the required amount of time bound in order to react; Tummino and Copeland, 2008) within a protein pocket is much shorter for most of these RCS than for glucose. On the other hand, the expected concentrations of these RCS in plasma will likely be in the low µM range (Henning et al., 2014), well below the normal physiological concentrations of glucose (4-5 mM). However, while each independent RCS (other than glucose) has a comparatively low concentration, it is the summed concentration of all monosaccharide degradation products (10-20 µM) that matters most. Intracellular concentrations of these RCS are unknown.
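The trade-off described above, between high intrinsic reactivity and low concentration, can be made concrete with a crude reactivity-weighted comparison. The Python below is only an illustrative sketch: the 20,000-fold relative reactivity of methylglyoxal and the 5 mM glucose concentration come from the text, but the 1 µM methylglyoxal concentration is a hypothetical value chosen from the "low µM range", and the simple product of concentration and relative reactivity is an assumed proxy that ignores binding affinity, site selectivity, and residence time.

```python
# Crude, illustrative comparison of reactivity-weighted concentrations.
# The weighting (concentration x relative reactivity) is an assumed proxy,
# not a kinetic model from the source.

def weighted_reactivity(conc_um: float, relative_reactivity: float) -> float:
    """Return concentration (µM) scaled by reactivity relative to glucose."""
    return conc_um * relative_reactivity


# Glucose: 5 mM (from the text), relative reactivity defined as 1.
glucose_score = weighted_reactivity(conc_um=5000.0, relative_reactivity=1.0)

# Methylglyoxal: 20,000x glucose (from the text); 1 µM is hypothetical.
methylglyoxal_score = weighted_reactivity(conc_um=1.0, relative_reactivity=20000.0)

if __name__ == "__main__":
    print(f"glucose:       {glucose_score:,.0f}")
    print(f"methylglyoxal: {methylglyoxal_score:,.0f}")
```

With these inputs the methylglyoxal score exceeds the glucose score despite a 5,000-fold lower concentration, which is the qualitative point of arguments (1) through (3); the numerical values themselves should not be read as predicted modification rates.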

NUCLEOPHILE FORMATION IN NEG
The bound, electrophilic RCS may react with a nucleophilic N-terminal amino acid residue, the R-NH₂ side chain of a lysine, or possibly an arginine in NEG and NECPM. At physiological pH, these amino acid residues are often in their non-nucleophilic, ammonium ion form (R-NH₃⁺). This important detail is often missing from current NEG schemes, and its omission is inconsistent with the basic requirement for a nucleophile in the progression of NEG to the Schiff base.
Physiological anions (such as Pi and bicarbonate) can produce nucleophiles by deprotonating ammonium ions, a role not previously highlighted.
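The point that most amines are protonated at physiological pH can be illustrated with the Henderson-Hasselbalch relationship. The Python below is a minimal sketch under stated assumptions: the pKa values are generic textbook-style figures chosen for illustration (they are not taken from the source and will differ for specific residues in specific protein environments), and the function name is hypothetical.

```python
# Fraction of an amine present in its neutral, nucleophilic R-NH2 form,
# from the Henderson-Hasselbalch relationship.
# The pKa values below are illustrative assumptions, not source data.

def fraction_deprotonated(pka: float, ph: float = 7.4) -> float:
    """Return the fraction of amine in the R-NH2 form at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ILLUSTRATIVE_PKA = {
    "N-terminal alpha-amine": 8.0,   # assumed typical value
    "lysine epsilon-amine": 10.5,    # assumed typical value
}

if __name__ == "__main__":
    for group, pka in ILLUSTRATIVE_PKA.items():
        frac = fraction_deprotonated(pka)
        print(f"{group}: {100.0 * frac:.2f}% as R-NH2 at pH 7.4")
```

Under these assumed pKa values, only a minority of N-terminal amines and well under 1% of lysine side chains are in the nucleophilic form at pH 7.4, which is why a base capable of deprotonating the ammonium ion matters for progression to the Schiff base.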

Perspective on Reagent and Mechanistic Diversity
Many RCS besides the ring-opened glucose isomer and an array of different effector reagents can potentially be involved. From a mechanistic perspective, bases may include Pi, histidine, lysine/N-terminal amino acid amines, and water, while reagents acting as acids may include water, Pi, bicarbonate, 2,3-BPG, and potentially others. The potential nucleophiles are the amines of N-terminal amino acid residues and internal lysines and arginines (if reacted with aldehydic RCS). Therefore, reactions to generate bound RCS can involve many combinations of reagents, via an array of different mechanisms. We propose that noncovalent binding mechanisms are likely to differ from protein-to-protein and from site-to-site on the same protein under the same conditions. This is significant because with multiple mechanisms there is potential for multiple rate-determining steps. A new scheme is proposed (Figure 2) for NEG and NECPM processes to account for this reagent and mechanistic diversity.

Perspective on NEG Site Selectivity
What dictates the site selectivity of covalent protein modification and the extent of NEG? What creates a glycation hot spot [e.g., the N-terminal valine on the β-chain of HbA forming HbA1c (Delpierre et al., 2004) and Lys195 on human albumin (Wang et al., 2013)]? The site-to-site binding affinity of the transient RCS electrophile cannot explain site selectivity (Clark et al., 2013). We propose that both the binding affinity for RCS and effector reagents and their binding geometries within protein pockets differ from site-to-site. Moreover, the identity of the effector reagent and the intrinsic mechanism may vary from site-to-site.

Perspective on NEG Rate
For transient RCS, the likelihood of arriving at the proper geometry with the proper amino acid charge state and a suitably proximate nucleophile within the lifetime of the bound species (e.g., before re-ring-closure or dissociation of glucose from the protein pocket) is low. These contingencies might help explain why the NEG process is so slow.

Perspective on Clinical Implications
The term "glycation gap" refers to the difference between measured HbA1c and HbA1c predicted from a concurrent measure of NEG of serum proteins (i.e., fructosamine), or other indices of glycemic control. We posit that differential concentrations of effector reagents in erythrocytes vs. serum (e.g., Pi and 2,3-BPG) for certain patients may be relevant to understanding the mechanistic basis for the glycation gap. A further consideration is the differential concentrations of Na⁺,
FIGURE 2 | Two proposed schemes for nonenzymatic covalent protein modification emphasizing the early, non-covalent interactions. (a) A non-RCS precursor (1) is exposed to a protein pocket (an internal pocket or at the surface of the protein); here the example is β-glucopyranose (but it can be fructose, ribose, G6P, etc.). The non-RCS precursor noncovalently binds (2). It may dissociate from the protein pocket or it may proceed via 3 or 3′ [if an effector reagent (ER) such as Pi, water or bicarbonate, etc. were to concomitantly bind]. Under the influence of amino acid residues only, acid/base reactions with the initially-bound non-RCS precursor generate a noncovalently bound, transient, reactive RCS (3). In this case, the ring-opened glucose isomer may dissociate from the protein pocket or it may proceed via 4 to the Schiff base. An effector reagent such as Pi (mono or dibasic), water, bicarbonate, 2,3-BPG or other physiological species can concomitantly bind with the non-RCS precursor and then function as an acid/base reagent to generate a non-covalently bound, transient, reactive RCS; in this case, the ring-opened glucose isomer (3′). Any of the bound reagents may dissociate at any point in the 3 or 3′ transition to 5. If the ring-opened glucose does not dissociate, it can proceed to the Schiff base via 4′. The bound, transient, reactive RCS (4, 4′), in this case, the ring-opened glucose isomer, can form either without an ER (2-3) or by an ER (2-3′). Schiff base can form with facilitation if an ER concomitantly binds at this point (4) or via 4′ where the ER did not dissociate. The covalently bound, protonated Schiff base has three fates: it can reverse back to noncovalently bound species (4), it can isomerize to a cyclic glycosylamine via 5, or it can undergo Amadori formation via 5′. Note: Amadori formation may or may not involve an ER. The Amadori intermediate has three fates: it can reverse back to the Schiff base (5′), it can isomerize to a cyclic Amadori (6), or it can go on via multiple mechanisms to AGE (6′). (b) Non-glucose RCS and/or hydrates of the RCS are exposed to a protein pocket, can noncovalently bind, and then proceed by variable mechanisms (which may or may not include ER) to generate variable types of covalently-modified proteins. A subset of the RCS are sufficiently electrophilic to enable arginine residues to be covalently modified. These processes do not necessarily go through Schiff bases and/or Amadori intermediates. The covalent modification may or may not be reversible and may or may not involve AGE-type structures.
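As a minimal sketch of the glycation gap defined in the clinical perspective above, the Python below subtracts an HbA1c value predicted from fructosamine from the measured HbA1c. The linear prediction model and the names `predict_hba1c`, `glycation_gap`, `slope`, and `intercept` are assumptions made for illustration; the coefficients in the usage lines are hypothetical placeholders, not values from any study or from clinical data.

```python
# Glycation gap = measured HbA1c - HbA1c predicted from fructosamine.
# A linear prediction is assumed here for illustration only.

def predict_hba1c(fructosamine_umol_l: float, slope: float, intercept: float) -> float:
    """Return HbA1c (%) predicted from fructosamine via an assumed linear fit."""
    return slope * fructosamine_umol_l + intercept


def glycation_gap(measured_hba1c: float, fructosamine_umol_l: float,
                  slope: float, intercept: float) -> float:
    """Return measured minus predicted HbA1c (percentage points)."""
    return measured_hba1c - predict_hba1c(fructosamine_umol_l, slope, intercept)


if __name__ == "__main__":
    # Hypothetical regression coefficients and patient values.
    gap = glycation_gap(measured_hba1c=8.0, fructosamine_umol_l=300.0,
                        slope=0.01, intercept=4.0)
    print(f"glycation gap: {gap:+.2f} percentage points")
```

A positive gap in this sketch means measured HbA1c is higher than serum-protein glycation would predict; differences in erythrocyte versus serum effector reagent concentrations are one proposed contributor to such a gap.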
Perspective on Future Directions: What Needs to be Considered?
(1) Because the generation of RCS while bound to protein has a temporal component, the measurement of time-dependent geometry/energetics and residence time of bound reacting species utilizing Molecular Dynamics simulations (Dror et al., 2012) and/or 2D NMR methods (Williamson, 2013) is advised.
(2) To date, the only variables assumed to determine the degree of intracellular protein (i.e., HbA) modification are glucose concentration, protein lifetime, and cell permeability (Cohen et al., 2008; Malka et al., 2016). In view of this perspective, concentrations of ribose, fructose, Pi, bicarbonate, other cations and anions, and the resulting RCS generated within intracellular and extracellular compartments may be of importance and of clinical relevance. Clinical measures of the aforementioned species vs. disease progression in diabetic patients and a comparison with normal controls would be insightful.
(3) If RCS generated from reactions of monosaccharides are important, then reagents that selectively remove RCS from either serum or the erythrocyte cytosol may decrease the extent of protein modification (specifically by NECPM). The development of potential RCS scavengers (carbonyl trapping compounds), AGE formation inhibitors, and chemicals capable of breaking AGE-protein crosslinks is ongoing (Cho et al., 2007), and together with mechanistic investigations of various effector reagents under physiological conditions, may provide novel strategies to manage health complications related to excessive NEG and NECPM.
(4) Research on effector reagent implications on equilibria between reacting bound species (Schiff base, protonated Schiff base, cyclic glycosylamines, etc.) and the resulting impact on site selectivity at stages after initial binding is warranted (especially as extended to proteins beyond HbA and HSA, and also to enzymes).
In summary, this perspective illustrates far greater reagent diversity and mechanistic complexity than previously thought, even when considering only implications in the initial binding stages. This presents a persistent challenge to analyze, describe, understand and potentially reduce the extent of NEG or NECPM processes.

AUTHOR CONTRIBUTIONS
KR, RH, and PR conceived the project and led the writing of the perspective with input and contributions from MH and AS.

FUNDING
The project described was supported by an Institutional Development Award (IDeA) from the National Institute of General Medical Sciences of the National Institutes of Health under Grant #P20GM103408. Funds were also provided by the Departments of Biological Sciences and Chemistry, and the College of Science and Engineering (CoSE) at Idaho State University.