A Review of Methodologies for the Detection, Quantitation, and Localization of Free Cysteine in Recombinant Proteins: A Focus on Therapeutic Monoclonal Antibodies

Free-cysteine residues in recombinant biotherapeutics such as monoclonal antibodies can arise from incorrect cellular processing of disulfide bonds during synthesis or by reduction of disulfide bonds during the harvest and purification stage of manufacture. Free cysteines can affect potency, induce aggregation, and decrease the stability of therapeutic proteins, and the levels and positions of free cysteines in proteins are closely monitored by both manufacturers and regulators to ensure safety and efficacy. This review summarizes the latest methodologies for the detection and quantification of free cysteines.


INTRODUCTION
Cysteine residues (Cys) in proteins are the most conserved residues throughout the entire proteome. They are redox-active, meaning that they can be oxidized or reduced, and this imparts several distinct functions such as catalysis in enzyme active sites and the formation of disulfide bonds (Wong and Hogg, 2010). Disulfide bonds are the covalent bonds formed between the oxidized sulfur atoms of Cys residues and provide mechanical stabilization of protein tertiary and quaternary structures. This is particularly true for proteins that reside extracellularly, where disulfide bonds help protect them from the harsh, pH-variable, protease-rich environment (Pace et al., 1988).
Recombinant DNA technology has facilitated the bulk production of biotherapeutic proteins. In particular, immunoglobulins (Ig) have been utilized in the form of monoclonal antibodies (mAbs) (Carrara et al., 2021) to treat many inflammatory diseases and cancers. Immunoglobulin gamma subtype 1 (IgG1) is the most common mAb scaffold in antibody therapeutics (Shepard et al., 2017) and consists of two light chains (composed of two Ig domains each) and two heavy chains (formed from four Ig domains each). Figure 1 shows how these chains are arranged to form the distinctive Y shape of IgG1: each of the 12 Ig domains is stabilized by a buried intrachain disulfide bond, and the quaternary structure is stabilized by four interchain disulfide bonds, giving 16 in total (Janeway et al., 2001).
Although the disulfide bonding patterns of IgG1 are well conserved and there are relatively few noncanonical Cys found, even in the variable region, free-Cys have been detected in Ig extracted from sera and in recombinantly produced mAbs. The majority of detected free-Cys arise from incomplete processing within the host cell during manufacture, where conditions of high cellular stress are encountered, or through extracellular reduction by intracellular host proteins such as thioredoxin during the harvest and purification of mAbs. Free-Cys arising from disulfide bond reduction in mAbs are undesirable due to the negative effects they have on affinity (Harris, 2005), function (Gurjar et al., 2019), aggregation (Trivedi et al., 2009; Buchanan et al., 2013; Chung et al., 2017), and stability (Lacy et al., 2008); manufacturers go to great lengths to minimize the amount of free-Cys in therapeutic mAb preparations (Trexler-Schmidt et al., 2010). Although as yet there are no guidelines from regulators on acceptable levels, manufacturers justify the levels on a safety and efficacy basis for each product. Furthermore, in the development of structurally diverse next-generation therapeutic antibody platforms and antibody-drug conjugates, exogenous cysteines are often added to stabilize structures (Sawant et al., 2020) or to conjugate payloads (You et al., 2021). The methods discussed herein can easily be adapted to quantify the level of free-Cys in these systems.
The purpose of this study is to review the relevant methods for identifying and quantifying free-Cys in proteins, with a focus on mAbs. Pros and cons are discussed to provide insight into the methodologies so that readers are able to select them and improve upon their application. The focus is on methodologies developed over the last 15 years, presented in three sections of increasing technical complexity: 1) spectroscopic methods, 2) hybrid spectroscopic-mass spectrometry methods, and 3) wholly mass spectrometry-based methods. Typical workflows for each of these method classes are represented in Figure 2.
FIGURE 1 | Schematic of the IgG1 mAb characteristic Y-shaped structure. Heavy chains are shown in dark blue, and light chains are shown in light blue. Ig-like domains are represented as bulges in the linear protein sequence and are named HV, heavy variable; HC, heavy constant; LV, light variable; and LC, light constant. Disulfide bonds are represented by bars, with green being disulfide bonds buried within Ig domains and red being exposed interchain disulfide bonds. Reproduced with permission from Gurjar et al., 2019.
FIGURE 2 | Typical workflows for the three method classes discussed in this review highlighting how each of the method steps is addressed.
Frontiers in Molecular Biosciences | www.frontiersin.org June 2022 | Volume 9 | Article 886417

SPECTROSCOPIC METHODS
The first reliable spectroscopic method for the determination of free-Cys in proteins was developed by Ellman (Ellman, 1959). Free-Cys in a protein are reacted with 5,5′-dithio-bis(2-nitrobenzoic acid) (DTNB), forming stable yellow-colored 2-nitro-5-thiobenzoic acid (TNB), which can be quantified by measuring the absorbance at 412 nm and applying a molar extinction coefficient of 13,600 M⁻¹ cm⁻¹. However, with a limit of detection of around 3 μM free-Cys, the method is not sensitive enough for the detection of the low levels encountered in mAbs. Wright and Viola (1998) used a systematic approach to greatly increase the sensitivity of DTNB down to 0.3 μM using carefully controlled conditions such as dialysis, an extended range of standards, and careful control of protein to reagent ratios with accurate quantitation of protein concentration. Partial denaturing conditions and control of reaction times can be used to distinguish surface-accessible from buried Cys. They also utilized the fluorescent reagent monobromobimane (mBBr), which reacts with free-Cys to form fluorescent adducts which emit at 360 nm when excited at 280 nm. Free-Cys are routinely detectable down to 1 μM, and concentrations as low as 10 nM can be reached using high-performance liquid chromatography (HPLC) separation of the labeled proteins with the Cys adducts detected using a spectroscopic detector (Fahey et al., 1981). ThioGlo reagents are naphthopyranones derivatized with maleimide which react rapidly with accessible Cys and produce a highly fluorescent product (Langmuir et al., 1995), with sensitivity down to 50 fmol and much increased reproducibility with HPLC methods (Ercal et al., 2001).
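As a rough illustration of the DTNB calculation described above, the following sketch applies the Beer-Lambert law with the extinction coefficient quoted in the text. The function names, the blank correction, the dilution factor, and the example absorbance and protein concentration are illustrative assumptions, not values taken from any of the cited studies.

```python
# Minimal sketch: free-thiol concentration from a DTNB (Ellman) assay.
# Assumes a linear Beer-Lambert response; all example numbers are hypothetical.

EPSILON_TNB = 13_600.0  # M^-1 cm^-1 at 412 nm, the value quoted in the text


def thiol_molar(a412_sample, a412_blank=0.0, path_cm=1.0, dilution=1.0,
                epsilon=EPSILON_TNB):
    """Return the free-thiol concentration (mol/L) of the undiluted sample."""
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    corrected = a412_sample - a412_blank          # subtract the reagent blank
    return corrected / (epsilon * path_cm) * dilution


def thiol_per_protein(thiol_m, protein_m):
    """Return mol free thiol per mol protein."""
    if protein_m <= 0:
        raise ValueError("protein concentration must be positive")
    return thiol_m / protein_m


# Hypothetical reading: A412 = 0.068 in a 1 cm cuvette, no dilution.
conc = thiol_molar(0.068)
print(f"{conc * 1e6:.2f} uM free thiol")               # 5.00 uM free thiol
print(f"{thiol_per_protein(conc, 100e-6):.3f} mol/mol")  # 0.050 mol/mol
```

The example lands just above the roughly 3 μM detection limit mentioned in the text, which is why the unmodified assay struggles with the low free-Cys levels typical of mAbs.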
An interesting take on colorimetric quantification of thiols is the papain amplification assay (Singh et al., 1993; Singh et al., 1995), in which an inactivated mixed disulfide form of papain is reacted with a thiol to generate a stoichiometric amount of active papain. This is assayed using an N-benzoyl-L-arginine-p-nitroanilide substrate that releases the chromogenic product p-nitroaniline. As this is an amplification assay, very high levels of sensitivity approaching those of fluorescence can be reached.
Although none of the aforementioned methods have been specifically applied to free-Cys in mAbs, there is no reason why they could not be adapted to perform quick, reliable quantitation of free-Cys in therapeutic mAbs. In 2002, however, Zhang and Czupryn undertook a detailed study on quantifying free-Cys in mAbs using N-(1-pyrenyl)maleimide (NPM), which fluoresces at 380 nm when covalently linked to a Cys (Zhang and Czupryn, 2002). For quantitation, a standard curve of NPM-derivatized N-acetyl-Cys was used, and free-Cys levels of 0.02 and 0.1 mol/mol of protein were detected under native and denaturing conditions, respectively.
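Quantitation against a standard curve, as used for the NPM-derivatized N-acetyl-Cys standards above, can be sketched as a least-squares line fit followed by interpolation of the sample signal. The fitting helper, the linear-response assumption, and every number below are hypothetical and are not data from Zhang and Czupryn (2002).

```python
# Minimal sketch: interpolate a sample's free-Cys level from a linear
# fluorescence standard curve. All values are made up for illustration.


def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired points")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interpolate(signal, slope, intercept):
    """Convert a measured signal into concentration using the fitted line."""
    return (signal - intercept) / slope


# Hypothetical standards: thiol concentration (uM) vs fluorescence (a.u.)
standards_um = [0.0, 1.0, 2.0, 4.0, 8.0]
fluorescence = [5.0, 105.0, 205.0, 405.0, 805.0]

slope, intercept = fit_line(standards_um, fluorescence)
sample_um = interpolate(255.0, slope, intercept)   # labeled mAb sample signal
protein_um = 50.0                                  # independently measured

print(f"{sample_um:.2f} uM free-Cys")                 # 2.50 uM free-Cys
print(f"{sample_um / protein_um:.3f} mol/mol protein")  # 0.050 mol/mol protein
```

In practice the sample signal should fall within the range spanned by the standards, since the fit says nothing about linearity outside it.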
An interesting development of maleimide labeling has recently been employed, in which free-Cys in mAbs are labeled with N-tert-butylmaleimide (NtBM) (Welch et al., 2018). When unlabeled and NtBM-labeled mAbs are analyzed on C4 RP-HPLC, there is a retention time shift associated with the NtBM-labeled mAb, allowing resolution of the two peaks and quantitation of free-Cys. A similar method utilizes N-cyclohexylmaleimide as the free thiol label and hydrophobic interaction chromatography for separation (Wei et al., 2019).

HYBRID SPECTROSCOPIC/MASS SPECTROMETRY METHODS
Combining mass spectrometry with spectroscopic quantitation of free-Cys, Chumsae et al. (2009) reported a combined fluorescent label and mass spectrometry approach that not only identifies the number of free-Cys but also simultaneously localizes them. Five recombinant IgG1 mAbs were first treated with 5-iodoacetamidofluorescein (5-IAF) under partially denaturing conditions of 4 M guanidine hydrochloride to expose all free sulfhydryl groups; a 10:1 5-IAF:mAb ratio ensured efficient alkylation and 'fixed' the redox state of free-Cys. Following this, the remaining disulfide-bonded Cys were differentiated from the nonbonded Cys by reacting them with IAA after full denaturation and reduction. Initial analysis by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) revealed the number of free-Cys, as each one alkylated with 5-IAF gave a mass shift of 387.4 Da. Then, to determine where the free-Cys resided, the mAbs were digested with trypsin and the tryptic peptides separated by RP-HPLC using fluorescence detection at 520 nm, specific for 5-IAF-modified peptides. These peaks were collected and the sequences were determined by MALDI-TOF MS, allowing the determination of the positions and domain localization of the free cysteines. Huh et al. (2013) used a similar approach but with a fluorescent Alexa Fluor C-5-coupled maleimide reagent (AF594) as the probe for free-Cys in several recombinant IgG1 and IgG2 mAbs. Under partial denaturing conditions (7 M guanidine HCl), the levels of free-Cys in the intact antibodies measured by total AF594 fluorescence with RP-HPLC separation were similar to those obtained with DTNB; mAb constant domains contained 1-2.7% free-Cys for IgG1 and 1-2.8% for IgG2. To identify the free-Cys-containing peptides, the mAbs were Lys-C-digested and separated on RP-HPLC, and the experimental masses of the peptides plus labels were compared to theoretical masses. Comparison of the total fluorescence at 594 nm of the peptide peaks to an AF594 standard curve was used to quantitate the levels of free-Cys per peptide. They then applied these methods to show that mechanical agitation of the mAbs results in breakage of disulfide bonds and covalent aggregation via the liberated Cys.
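The intact-mass step described above, where each 5-IAF label adds 387.4 Da, amounts to dividing the observed mass shift by the per-label shift. A minimal sketch follows; the function name and the example masses are hypothetical, and real spectra would need a tolerance suited to the mass accuracy of the instrument.

```python
# Minimal sketch: estimate the number of labeled Cys from an intact-mass shift.
# The 387.4 Da shift per 5-IAF label is the value quoted in the text;
# the example masses are hypothetical.

SHIFT_5IAF_DA = 387.4


def labels_from_mass_shift(labeled_mass, unlabeled_mass, shift_per_label):
    """Return (n_labels, residual_da) for the nearest whole number of labels.

    residual_da is the part of the shift not explained by n_labels and
    gives a quick sanity check on the assignment.
    """
    if shift_per_label <= 0:
        raise ValueError("shift_per_label must be positive")
    delta = labeled_mass - unlabeled_mass
    n_labels = round(delta / shift_per_label)
    residual = delta - n_labels * shift_per_label
    return n_labels, residual


# Hypothetical intact masses (Da) before and after 5-IAF labeling.
n, residual = labels_from_mass_shift(148_774.8, 148_000.0, SHIFT_5IAF_DA)
print(n, f"{abs(residual):.2f}")  # 2 0.00
```

A large residual relative to the per-label shift would suggest a mixed population or an unrelated modification rather than a clean integer number of labels.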

WHOLLY MASS SPECTROMETRY METHODS
Coupling differential alkylation of free-Cys using stable isotope pairs to high-sensitivity, high-resolution LC-MS/MS provides the most comprehensive analysis of the redox and disulfide-bonded state of Cys in proteins/mAbs. Xiang et al. used ¹²C-iodoacetic acid (¹²C-IAA) and ¹³C-iodoacetic acid (¹³C-IAA) to differentially label five mAbs and quantify the levels and location of free-Cys in the mAb sequence. The 2 Da mass shift between the labels meant that the authors could identify, and distinguish between, the free-Cys originally present in the mAbs and the free-Cys liberated by denaturation and reduction of the mAbs. Liquid chromatography-mass spectrometry (LC-MS) was performed after multi-enzyme digest (trypsin, Lys-C, chymotrypsin, Asp-N, or Glu-C), and MS peaks for each peptide were identified from calculated masses of the peptide sequence plus any Cys modification. They calculated peptide isotope peak areas from MS1 spectra for both the ¹²C-IAA and ¹³C-IAA peptide adducts to obtain relative percentages of each form. Spiking experiments showed they could accurately quantify down to 0.5% free-Cys for each peptide, and over the five mAbs they studied, they found levels of free-Cys ranging from 1.5 to 5.6%, with the heavy chain CH3 domain having the highest level of free-Cys. The same group went on to further utilize this method in determining the stability of each disulfide bond in IgG1 mAbs (Liu et al., 2010). It should be noted that the main pitfall of this method is that it is limited to LC-MS analysis and does not make use of modern LC-MS/MS peptide sequencing and modification localization, therefore relying on time-consuming manual matching of LC-MS peaks with theoretically predicted peptide masses.
A similar method that did use a full nano-LC-MS/MS analysis employed ¹⁸O-labeled iodoacetamide (¹⁸O-IAA), whereby differential labeling was carried out by alkylating free-Cys with normal IAA, then denaturing, reducing, and alkylating with ¹⁸O-IAA, followed by trypsin digestion and LC-MS/MS analysis (Wang and Kaltashov, 2012; Wang and Kaltashov, 2015). The percentage free-Cys was calculated using the peak area of the extracted ion chromatogram for a given ¹⁸O-IAA-labeled peptide from the total ion chromatogram and quoted as a percentage of the total area of the IAA + ¹⁸O-IAA extracted ion chromatograms. Recombinant human transferrin was used as a model protein, but this could easily be applied to mAbs or other therapeutic proteins.
In another example reported by Chiu (Chiu, 2019), a stable isotope pair of 2-iodo-N-phenylacetamide (¹²C-IPA) and its carbon-13 derivative (¹³C-IPA) was used. This pair has a 6 Da mass difference, and the hydrophobic nature of the alkylating agent allows it to penetrate the hydrophobic core of protein domains without the need for partial denaturation, as demonstrated by quantitation of disulfide bond redox states in the platelet integrin αIIbβ3 (Chiu, 2019; Pijning et al., 2021) and influenza A hemagglutinin (Florido et al., 2021). Differential alkylation, data analysis, and determination of % free-Cys were performed as in the aforementioned method.
In addition to stable isotopes of iodoacetamide-based Cys alkylating agents, there are also stable isotopes of maleimide-derived Cys alkylating agents. N-Ethylmaleimide (d0-NEM) and d5-N-ethylmaleimide (d5-NEM) can be used to differentially alkylate cysteines within a protein, with d5-NEM producing a 5 Da mass shift compared to d0-NEM. An early example of this was the quantification of thioredoxin-catalyzed disulfide bond reduction in the cell surface receptor CD44 (Kellett-Clarke et al., 2015). This maleimide chemistry was also used by Robotham and Kelly, who devised a complete strategy for the quantitation of free-Cys in mAbs at both the intact mAb level and the peptide level (Robotham and Kelly, 2019). They looked at three commercially available mAbs and used β-lactoglobulin A as a control, as it is known to have a free-Cys. These proteins were reacted with maleimide-PEG2-biotin (MPB), which adds a mass of 525 Da per labeled Cys, and the intact labeled mAbs were analyzed by LC-MS. In spiking experiments, the authors could quantify mAbs containing a free-Cys down to less than 2% (~0.02 mol SH per mol protein) of the total mAb population. Furthermore, reduction of the labeled mAbs with TCEP allowed the independent analysis of the heavy and light chains; thus, the percentage of MPB labeling on each of the heavy and light chains could be ascertained. All their estimations of free-Cys levels agreed well with spectroscopic methods. They went on to demonstrate site-specific quantitation of free-Cys in different redox states by differential labeling using the d0-NEM/d5-NEM isotope pair. mAbs partially denatured by 6 M guanidine hydrochloride were labeled at pH 5.5 with d0-NEM, after which the mAbs were fully denatured and reduced prior to further labeling with d5-NEM. The mAbs were deglycosylated with PNGase, trypsin-digested, and subjected to LC-MS/MS analysis. Of the 17 cysteine residues in SigmaMAb, 16 were identified, and the percentage of free-Cys was calculated by comparing the area of the d0-NEM-labeled peptide to the total area of the d0-NEM + d5-NEM-labeled peptides. Again, spiking experiments showed that <2% free-Cys per peptide could be easily detected.
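The ratio calculation described above for the d0-NEM/d5-NEM pair can be sketched in a few lines. The peptide names and peak areas are hypothetical, and the sketch assumes each peptide carries a single labeled Cys and that the light and heavy forms ionize equally, as is generally expected for an isotope pair.

```python
# Minimal sketch: percentage free-Cys per peptide from an isotope-pair
# experiment, where the light label (d0-NEM) marks natively free Cys and the
# heavy label (d5-NEM) marks Cys liberated after reduction.
# Peptide names and peak areas are hypothetical.


def percent_free_cys(area_light, area_heavy):
    """Return 100 * light / (light + heavy) for one Cys-containing peptide."""
    if area_light < 0 or area_heavy < 0:
        raise ValueError("peak areas must be non-negative")
    total = area_light + area_heavy
    if total == 0:
        raise ValueError("peptide not detected in either labeled form")
    return 100.0 * area_light / total


# Extracted ion chromatogram areas: (d0-NEM form, d5-NEM form)
peak_areas = {
    "hypothetical_CH3_peptide": (2.0e6, 9.8e7),
    "hypothetical_CL_peptide": (5.0e5, 9.95e7),
}

for peptide, (light, heavy) in peak_areas.items():
    print(f"{peptide}: {percent_free_cys(light, heavy):.1f}% free-Cys")
# hypothetical_CH3_peptide: 2.0% free-Cys
# hypothetical_CL_peptide: 0.5% free-Cys
```

Because both forms are measured in the same run, no external standard curve is needed; the same function applies to the other isotope pairs discussed in this section once the light and heavy roles are assigned.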
More recently, we developed a differential alkylation strategy to investigate mAbs that does not require stable isotope pairs (Gurjar et al., 2019). Instead, IAA is used to initially alkylate native free-Cys, and then NEM is subsequently used to alkylate free-Cys liberated upon denaturation and reduction. Additionally, a "standard" is prepared: in this case, an mAb that is fully denatured, reduced, and 100% alkylated with IAA. A label-free LC-MS/MS analysis is performed to sequence the peptides and localize the alkylated Cys; then, extracted peak areas of IAA-labeled Cys peptides in the sample runs are compared to extracted peak areas of the same IAA-labeled Cys peptide in the 100% standard (control). Non-Cys-containing peptides are used to normalize the intensities of the peptides between the different LC-MS/MS runs. The method was applied to five therapeutic mAbs, allowing quantitation of the redox state and amount of free-Cys after various treatments, with sensitivity down to 2%. This has since been successfully applied to determine free-Cys levels in therapeutic recombinant coagulation factor VIII products (Arsiccio et al., 2022). A recent development of this method forgoes the use of a fully alkylated mAb standard and directly quantifies NEM-labeled free-Cys against IAA-labeled Cys derived from disulfide bonds (Li et al., 2021). However, it should be noted that the ionization properties of NEM- and IAA-labeled peptides will not be comparable, which may lead to inaccurate quantitation and should be addressed in detailed method validation.
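The label-free comparison against a 100% alkylated standard can be sketched as follows. The text states only that non-Cys-containing peptides are used to normalize between runs; taking the median sample-to-standard ratio of those peptides as the normalization factor is an assumption made here for illustration, and all peptide names and areas are hypothetical.

```python
# Minimal sketch: label-free % free-Cys relative to a fully IAA-alkylated
# standard run. Median-ratio normalization is an illustrative assumption,
# not necessarily the procedure of the cited work.
from statistics import median


def normalization_factor(sample_refs, standard_refs):
    """Median sample/standard area ratio over shared non-Cys peptides."""
    shared = sample_refs.keys() & standard_refs.keys()
    if not shared:
        raise ValueError("no shared reference peptides between runs")
    return median(sample_refs[p] / standard_refs[p] for p in shared)


def percent_free_cys_label_free(sample_area, standard_area, factor):
    """IAA-labeled peptide area in the sample as % of the 100% standard."""
    if standard_area <= 0 or factor <= 0:
        raise ValueError("standard area and factor must be positive")
    return 100.0 * (sample_area / factor) / standard_area


# Hypothetical peak areas of non-Cys reference peptides in each run.
sample_refs = {"ref_pep_A": 8.0e6, "ref_pep_B": 4.0e6, "ref_pep_C": 1.6e7}
standard_refs = {"ref_pep_A": 1.0e7, "ref_pep_B": 5.0e6, "ref_pep_C": 2.0e7}

factor = normalization_factor(sample_refs, standard_refs)
pct = percent_free_cys_label_free(2.4e5, 1.0e7, factor)
print(f"normalization factor {factor:.2f}, free-Cys {pct:.1f}%")
# normalization factor 0.80, free-Cys 3.0%
```

Comparing the same IAA-labeled peptide in both runs sidesteps the differing ionization of NEM- and IAA-labeled peptides, at the cost of depending on run-to-run normalization.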

DISCUSSION
Since the 1959 study by Ellman (Ellman, 1959), the sensitivity and limits of free-Cys quantification in proteins have steadily improved as technologies have evolved. Purely spectroscopic methods permit the quantitation of free-Cys within a protein/mAb but do not reveal where the free-Cys reside. A hybrid approach combining spectroscopic free-Cys detection with mass spectrometry can provide partial localization information but no residue-level detail. This is because it only utilizes MS to provide an experimental peptide mass that is then compared to the theoretical mass of the peptide in question plus any additional probe mass; no peptide sequencing occurs. As such, it requires a lot of manual annotation and offline data analysis.
The evolution of MS/MS peptide sequencing when coupled to the nano-ultra-HPLC separation technology allows for the fast and efficient sequencing of peptides along with identification and quantitation of any post-translational modification of amino acids either during synthesis or with exogenously added chemical probes (Prus et al., 2019). Furthermore, the use of stable isotopes of the same thiol-reactive probe allows for the quantitation of free-Cys levels in a single mass spectrometry run, alleviating the need for any standard curves to be generated and therefore any commutability issues that may occur between the behavior of the standards and the samples to be analyzed. Isotope pairs of Cys-reactive probes offer a means of identifying which label has alkylated the Cys on a given peptide based on a mass shift between the heavy and light probes; however, they possess the same physicochemical characteristics showing no difference in specificity or reactivity, nor do they introduce different ionization characteristics or retention time shifts into the LC-MS/MS analysis. Despite this, the stable isotope technology has mainly evolved in the redox-labile allosteric disulfide bond field where it is utilized in determining the relative reactivity of multiple disulfide bonds within a protein as well as the percentage of reduction in each disulfide bond, either in the native state or after treatment (Cook and Hogg, 2013). Strangely, given its ease of use and sensitivity, its uptake in quantifying free-Cys in therapeutic proteins such as mAbs has been slow and sparse.
The differing sensitivity and complexity of the methods mean that varying amounts of protein are needed for each analysis. This ranges from low mg to high μg for spectroscopic methods, especially if coupled with HPLC, to low μg and below for the LC-MS/MS methods. The different method classes lend themselves to analysis at different stages of recombinant mAb development and manufacture. For example, if the aim is to monitor the overall level of free-Cys in an mAb product at different manufacturing stages, online spectroscopic methods will provide a good, scalable inline solution. However, at the research and development stage, where many clones are being assessed for their stability and material is at a premium, a full LC-MS/MS analysis might be beneficial to pinpoint the regions of the mAb where the free-Cys occur.

AUTHOR CONTRIBUTIONS
CM researched and wrote the manuscript.

FUNDING
CM is partially funded by the NIHR Policy Research Programme (NIBSC Regulatory Science Research Unit). The views expressed in the publication are those of the author(s) and not necessarily those of the NHS, the NIHR, the Department of Health, arm's length bodies, or other government departments.