Peptide Model of the Mutant Proinsulin Syndrome. II. Nascent Structure and Biological Implications

Toxic misfolding of proinsulin variants in β-cells defines a monogenic diabetes syndrome, designated mutant INS-gene induced diabetes of the young (MIDY). In our first study (previous article in this issue), we described a one-disulfide peptide model of a proinsulin folding intermediate and its use to study such variants. The mutations (LeuB15→Pro, LeuA16→Pro, and PheB24→Ser) probe residues conserved among vertebrate insulins. In this companion study, we describe 1H and 1H-13C NMR studies of the peptides; key NMR resonance assignments were verified by synthetic 13C-labeling. Parent spectra retain nativelike features in the neighborhood of the single disulfide bridge (cystine B19-A20), including secondary NMR chemical shifts and nonlocal nuclear Overhauser effects. This partial fold engages wild-type side chains LeuB15, LeuA16 and PheB24 at the nexus of nativelike α-helices α1 and α3 (as defined in native proinsulin) and flanking β-strand (residues B24-B26). The variant peptides exhibit successive structural perturbations in order: parent (most organized) > SerB24 >> ProA16 > ProB15 (least organized). The same order pertains to (a) overall α-helix content as probed by circular dichroism, (b) synthetic yields of corresponding three-disulfide insulin analogs, and (c) ER stress induced in cell culture by corresponding mutant proinsulins. These findings suggest that this and related peptide models will provide a general platform for classification of MIDY mutations based on molecular mechanisms by which nascent disulfide pairing is impaired. We propose that the syndrome’s variable phenotypic spectrum—onsets ranging from the neonatal period to later in childhood or adolescence—reflects structural features of respective folding intermediates.


INTRODUCTION
The mutant proinsulin syndrome (MPS) is a prototypical disease of toxic protein misfolding. Unlike toxic extracellular aggregation as observed among neurodegenerative diseases (1) and diverse amyloid disorders (2), in the MPS a dominant mutation impairs proinsulin folding efficiency in a critical intracellular organelle: the endoplasmic reticulum (ER). Impaired foldability of the variant protein in pancreatic β-cells leads to aberrant aggregation and in turn induces chronic ER stress (3-5). Although the unfolded protein response (UPR) evolved as an adaptive pathway [broadly conserved among eukaryotic cells (6,7)], its chronic activation in β-cells impairs glucose-stimulated insulin secretion and β-cell viability [for review, see (8,9)]. Also designated mutant INS-gene induced diabetes of the young (MIDY) (4), MPS encompasses a range of patient phenotypes, representing subtypes from permanent neonatal diabetes mellitus (PNDM) to maturity-onset diabetes of the young (MODY) (10-13).
Among monogenic endocrine syndromes in general [such as complete or partial androgen insensitivity (14)], a given mutation may be associated with a range of phenotypes, even in the same kindred (15). A given mutation in the androgen receptor, for example, may be associated with male development, somatic female development with Müllerian regression, or ambiguous genitalia (16). The complexity of genotype-phenotype relationships (GPR) in such syndromes presumably reflects the influence of modifier genes in multigenic regulatory pathways (17). Modifier genes have also been inferred in the genetics of Type 1 diabetes mellitus (DM). GPRs in MPS may be more straightforward: the extent of β-cell dysfunction and velocity of β-cell loss (together determining age of diabetes onset in a particular patient) may reflect mutation-specific molecular properties, i.e., whether a given amino-acid substitution is associated with a severe (PNDM) or mild (MODY) perturbation to foldability. In our initial study [preceding article in this issue (18)] we described the design of a 49-residue peptide model of an early on-pathway proinsulin folding intermediate and its application to three representative MIDY mutations. This single-chain model contains only one disulfide bridge and is thus designated 1SS. This bridge (cystine B19-A20) is the first to accumulate among populated partial folds in the in vitro folding pathway of proinsulin and homologous factors (19,20). The present study provides a detailed two-dimensional NMR study of the parent 1SS peptide and representative MIDY-related variants.
Native insulin contains two chains, B (30 residues) and A (21 residues) (21). Its native structure contains three α-helices stabilized by two inter-chain disulfide bridges (cystines B7-A7 and B19-A20) and one intra-chain bridge (A6-A11; Figure 1A) (23). Whereas chain combination between the isolated chains is inefficient (24), cellular biosynthesis exploits a single-chain precursor, proinsulin, wherein disulfide pairing is intramolecular (25). Human proinsulin contains a disordered 35-residue connecting domain (C domain) between Thr B30 and Gly A1. Efficiency of disulfide pairing in single-chain precursors can be augmented by shortening the C domain or deleting it entirely (26). The present peptide model contains a peptide bond between residues B28 and A1 (red bars in Figure 1B); the native Pro B28 is substituted by Lys to permit convenient enzymatic cleavage to a two-chain hormone (red in sequence N in Figure 1B) (27). The 49-residue synthetic precursor (designated "DesDi") exhibits remarkable folding efficiency, enabling preparation of certain insulin analogs refractory to classical chain combination (27). In the 1SS model cystine B7-A7 (solvent exposed in native insulin) is pairwise substituted by Ser whereas cystine A6-A11 (buried in the core of native insulin) is pairwise substituted by Ala (18). Segmental α-helical propensity and solubility were augmented by acidic surface substitutions His B10→Asp (28,29) and Thr A8→Glu (30). The seven amino-acid substitutions in the parent 1SS peptide (positions B7, B10, B28, A6-A8 and A11) are highlighted in red in Figure 1B (entry 1SS-WT). Relative to a homologous two-chain peptide model of the corresponding 1SS IGF-I folding intermediate (Figure 1C) (22), we anticipated that the 1SS DesDi-based single-chain model would exhibit a conformational equilibrium biased toward a collapsed state (Figure 1D).
To connect our model to patient phenotypes, three MIDY mutations were introduced into the 1SS peptide model (Figure 1B). Two (Pro B15 and Pro A16) are associated with neonatal-onset DM (31,32); the third (Ser B24) is associated with onset in early adulthood (33). The structural environments of these conserved side chains are shown in Figure 2A. Whereas the side chains of Leu B15 and Leu A16 are buried in the hydrophobic core (Figures 2B, C, E, F), the aromatic side chain of Phe B24 packs within a crevice overlying internal cystine B19-A20 such that one side of the aromatic ring is exposed to solvent (Figures 2B-D). Initial characterization of these peptides was described in our companion article (18). Whereas the Pro substitutions introduced marked perturbations in folding efficiency in a mini-proinsulin ["DesDi" (27)], Ser B24 was well tolerated. Synthetic yields mirrored residual α-helix contents (as inferred from far-UV circular dichroism; CD) in the corresponding 1SS peptides (18). A further correlation was observed between these properties and a pertinent pathogenetic process: extent of ER stress induced in a human cell line (HEK 293T) on transient expression of the corresponding mutant proinsulins. The coherence of these correlations (18) motivated this companion study wherein NMR spectroscopy provides a residue-specific view.
In this companion study we employ two-dimensional 1H- and [1H, 13C]-NMR methods to interrogate the 1SS peptide models in relation to native insulin. Analysis of main-chain 1H and 13C chemical shifts in the parent peptide (34-36) provided evidence for nativelike nascent α-helices in the B domain (residues B9-B19) and A domain (A12-A20), together in accordance with CD-defined α-helix content (18). Although chemical-shift dispersion in this and the variant partial folds is more limited than in the spectrum of a native insulin monomer (37), key side-chain resonance assignments were verified by site-specific 13C labeling in chemical peptide synthesis. Analysis of signature chemical shifts and framework nuclear Overhauser effects (NOEs) provides evidence for a nativelike folding nucleus in the parent 1SS peptide that is dependent on maintenance of the B19-A20 disulfide bridge. The clinical mutations perturb this nascent structure in the order Ser B24 (least perturbed) >> Pro A16 > Pro B15 (most perturbed). Together, these findings suggest that the DesDi-based 1SS model will provide a general platform for comparative biophysical studies of that subset of MIDY mutations that perturb initial closure of cystine B19-A20 in proinsulin biosynthesis in pancreatic β-cells.
Automated couplings utilized diisopropylcarbodiimide (DIC)/6-Cl-hydroxybenzotriazole (6-Cl-HOBt) in N-methyl pyrrolidone (NMP) whereas Fmoc deprotections used 20% piperidine in NMP. α-Carboxyl-protected Asp was used in place of Asn in all syntheses of DesDi analogs to accommodate the use of ChemMatrix® Rink-Amide resin (loading = 0.46 mmol/g). The Tribute peptide synthesizer used heating protocols: coupling was performed for 6 min at 60°C except for Cys/His (2 min at 25°C, then 5 min at 60°C) and Arg (20 min at 25°C, then 5 min at 60°C); deprotection was performed twice (30 sec at 50°C, then 3 min at 50°C). Reagent conditions were otherwise similar to ABI protocols except that DMF was used as solvent and the resin of choice was H-Asn(Trt)-HMPB-ChemMatrix® resin. Peptides were cleaved with a trifluoroacetic acid (TFA) cocktail (2.5% vol/vol of each: β-mercaptoethanol, triisopropylsilane, anisole, and water) followed by ether precipitation.

Folding and Purification of N and N* DesDi Analogs
Crude peptides from ether precipitation were dissolved in glycine buffer (20 mM glycine and 2 mM cysteine hydrochloride, pH 10.5) to a final peptide concentration of 0.1 mM. The pH of this solution was readjusted to 10.5 to account for traces of residual TFA present in lyophilized peptides. This solution was […] See Table S1 for LC-MS retention times and mass verification.

Synthesis of Isotopically Labeled Peptides
Isotopically labeled 1SS and control peptides were prepared on 0.05 mmol scales. N* was assembled in its entirety as a single batch whereas the 1SS peptides were assembled as a single batch (0.1 mmol) through residue B25. At that point, half of the 1SS resin was used to complete assembly of the 1SS-Ser B24 analog. The remaining resin was used to complete the 1SS synthesis. Coupling of isotopically labeled amino acids (Cambridge Isotopes Inc., Tewksbury, MA) was performed manually using a single 0.3 mmol mixture consisting of equivalent amounts of labeled Fmoc amino acid, 1-[Bis(dimethylamino)methylene]-1H-1,2,3-triazolo[4,5-b]pyridinium 3-oxid hexafluorophosphate (HATU), and N,N-diisopropylethylamine (DIEA) reacted with each individual batch of resin (2.25X overall excess). The same automated synthesis protocol used for unlabeled peptide syntheses was then used for the addition of subsequent unlabeled amino acids to the labeled peptide assembly. The following labeled amino acids were used (residue positions given in blue in Scheme S1): Fmoc-Gly[13C2, 15N]-OH, Fmoc-Leu[13C6, 15N]-OH, Fmoc-Ile[13C6, 15N]-OH, Fmoc-Val[13C5, 15N]-OH, and Fmoc-Tyr[13C9, 15N]-OH. The peptides were cleaved from the resin by treatment with TFA cocktail as described above. The folding and purification of these isotope-labeled 1SS peptides followed essentially the same protocol as described in the companion article (18). Purity of the materials was confirmed by LC-MS with an Agilent 1260 Infinity/6120 Quadrupole instrument utilizing a Kinetex C8 2.6-μm 100 Å (75 mm × 2.1 mm) column and a 10-min 10-80% acetonitrile elution gradient (1 mL/min flow rate) (see Figures S1-S3).

Purification of Clinical Analogs
Wild-type insulin and insulin lispro were purified from U-100 pharmaceutical formulations of Humulin® and Humalog® (Eli Lilly and Co.), respectively, using preparative RP-HPLC (C4 10 μm, 250 × 20 mm Proto 300 column; Higgins Analytical, Inc.) utilizing buffer A (0.1% TFA in H2O) and a 10-min elution gradient of 20→70% buffer B (0.1% TFA in acetonitrile). Following lyophilization of the collected protein fraction, purity was verified using analytical HPLC (TARGA C8 5 μm [250 mm × 4.6 mm]; Higgins Analytical, Inc.) with a 35-min elution gradient of 25→50% buffer B; molar mass was verified with an Applied Biosystems 4700 Proteomics Analyzer utilizing MALDI-TOF in reflector mode. Chromatographic retention times and mass measurements for these clinical analogs are given in Table S1.

Proinsulin Constructs
Plasmids expressing full-length human proinsulin or variants were constructed by polymerase chain reaction (PCR). Mutations in proinsulin were introduced using QuikChange ™ (Stratagene). Constructions were verified by DNA sequencing.

NMR Spectroscopy
1H NMR spectra were acquired at a proton frequency of 700 MHz at pD 7.4 (direct meter reading) at 35°C. 1H-13C heteronuclear single-quantum coherence (HSQC) spectra were acquired at natural abundance as described. The spectra were obtained at a 13C frequency of 176 MHz at a constant temperature of 308 K using the "hsqcetgp" Bruker pulse sequence as described by the manufacturer. Aliphatic 1H-13C HSQC spectra were acquired with FID size 2048 × 128, 800 scans, 1.0 sec relaxation delay, sweep widths of 11 ppm (1H) and 70 ppm (13C), and offsets of 4.7 and 40 ppm in the 1H and 13C dimensions, respectively. Similar parameters were used to acquire aromatic 1H-13C HSQC spectra, except with a sweep width of 40 ppm and an offset of 125 ppm in the 13C dimension. Data were processed with Topspin 4.0.6 (Bruker Biospin) and analyzed with Sparky software (38) using a 90°-shifted sine window function to a total of 2048 × 1024 data points (F2 × F1), followed by automated baseline and phase correction. All NMR data were acquired using a Bruker 700 MHz spectrometer equipped with a quadruple-resonance [1H, 19F, 13C, 15N] liquid-helium-cooled cryoprobe.
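As a reading aid for the acquisition parameters above, the following minimal sketch converts the stated aliphatic HSQC sweep widths from ppm to Hz and lists the resulting window edges. The class and method names are hypothetical, the 176 MHz value is the nominal 13C frequency quoted in the text, and the sketch is not part of the acquisition software.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """One spectral dimension (hypothetical helper, not a Bruker data type)."""
    nucleus: str
    freq_mhz: float    # nominal spectrometer frequency for this nucleus (MHz)
    sweep_ppm: float   # sweep width (ppm)
    offset_ppm: float  # carrier offset, taken here as the window centre (ppm)

    def sweep_hz(self) -> float:
        # 1 ppm corresponds to freq_mhz Hz at this nucleus
        return self.sweep_ppm * self.freq_mhz

    def window_ppm(self) -> tuple[float, float]:
        half = self.sweep_ppm / 2.0
        return (self.offset_ppm - half, self.offset_ppm + half)


# Aliphatic HSQC parameters as stated in the text
aliphatic = [
    Dimension("1H", 700.0, 11.0, 4.7),
    Dimension("13C", 176.0, 70.0, 40.0),
]

for dim in aliphatic:
    low, high = dim.window_ppm()
    print(f"{dim.nucleus}: {dim.sweep_hz():.0f} Hz, {low:.1f} to {high:.1f} ppm")
```

Under these assumptions the 1H window spans roughly -0.8 to 10.2 ppm (7700 Hz) and the 13C window 5 to 75 ppm (12320 Hz).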

Secondary Structure Analysis
Protein secondary structure was inferred from selected 1H and 13C secondary chemical shifts (1HN, 1Hα, 13Cα and 13Cβ) as described (35,39,40). In such algorithms 13Cα and 1Hα chemical shifts distinguish α-helix from β-strand or random coil (41) whereas 13Cβ secondary shifts are more sensitive to β-strand. These chemical shifts in the parent 1SS model were assigned on the basis of 2D 1H-1H NOESY, TOCSY, DQF-COSY in D2O and H2O (10% D2O) and natural-abundance 1H-13C HSQC spectra. Corresponding secondary shifts were extracted from observed chemical shifts ($\Delta\delta = \delta_{\text{obs}} - \delta_{\text{coil}}$). Secondary structural elements were predicted by TALOS+ (34-36).
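The secondary-shift definition above (observed shift minus random-coil shift) can be illustrated with a short sketch. This is not the TALOS+ procedure used in the study: the run-finding helper, its threshold, and its minimum run length are illustrative assumptions, and the input values are hypothetical placeholders rather than data from the paper.

```python
from __future__ import annotations


def secondary_shifts(observed: dict[str, float], coil: dict[str, float]) -> dict[str, float]:
    """Return delta_obs - delta_coil for residues present in both tables."""
    return {res: observed[res] - coil[res] for res in observed if res in coil}


def runs_above(deltas: dict[str, float], order: list[str],
               threshold: float, min_len: int) -> list[tuple[str, str]]:
    """Return (first, last) labels of consecutive runs with delta >= threshold.

    Threshold and minimum run length are caller-supplied assumptions.
    """
    runs: list[tuple[str, str]] = []
    current: list[str] = []
    for res in order:
        if deltas.get(res, float("-inf")) >= threshold:
            current.append(res)
        else:
            if len(current) >= min_len:
                runs.append((current[0], current[-1]))
            current = []
    if len(current) >= min_len:
        runs.append((current[0], current[-1]))
    return runs


# Hypothetical 13C-alpha shifts (ppm) for six placeholder residues
order = ["r1", "r2", "r3", "r4", "r5", "r6"]
obs = {"r1": 55.0, "r2": 57.5, "r3": 58.0, "r4": 57.0, "r5": 58.5, "r6": 55.5}
coil = {res: 55.0 for res in order}  # placeholder random-coil value

delta = secondary_shifts(obs, coil)
print(runs_above(delta, order, threshold=1.0, min_len=4))
```

Positive 13Cα secondary shifts sustained over several consecutive residues are the pattern conventionally associated with α-helix; a dedicated predictor such as TALOS+ combines several nuclei and a database search rather than a single threshold.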

Molecular Modeling
Structural ensembles were calculated by simulated annealing using XPLOR-NIH (42)(43)(44). The models of the one-disulfide proinsulin and one-disulfide DesDi intermediates (containing cystine B19-A20) were generated using distance restraints pertaining to residues A16-A21 and B15-B26 as observed in an engineered proinsulin (45) or an engineered insulin monomer (46). To allow for protein flexibility in these partial folds, upper bounds on long-range distance restraints were increased by 3 Å relative to NMR-derived upper bounds obtained in prior studies of insulin and proinsulin (45,46).
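The loosening of long-range upper bounds described above can be sketched as a simple transformation of a restraint list. The data class, the function name, and the sequence-separation cutoff used to define "long-range" are assumptions made for illustration; the text does not state the cutoff, and the sketch does not reproduce the XPLOR-NIH restraint format.

```python
from __future__ import annotations
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DistanceRestraint:
    """Hypothetical restraint record using sequential residue indices."""
    res_i: int
    res_j: int
    upper: float  # upper bound (angstroms)


def loosen_long_range(restraints: list[DistanceRestraint],
                      extra: float = 3.0,
                      min_separation: int = 5) -> list[DistanceRestraint]:
    """Add `extra` angstroms to the upper bound of long-range restraints.

    min_separation (in residues) is an assumed definition of long-range.
    """
    loosened = []
    for r in restraints:
        if abs(r.res_i - r.res_j) >= min_separation:
            loosened.append(replace(r, upper=r.upper + extra))
        else:
            loosened.append(r)
    return loosened


# Hypothetical restraints: one sequential, one long-range
example = [DistanceRestraint(15, 16, 3.5), DistanceRestraint(15, 44, 5.0)]
for r in loosen_long_range(example):
    print(r)
```

Only the second restraint is relaxed (5.0 to 8.0 Å); short-range restraints keep their original bounds.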

RESULTS
One-dimensional 1H-NMR spectra of the 1SS peptide model and its variants were presented in our companion study (18) in relation to spectra of the native state (provided as Figure S4 for convenience of the reader). Molecular properties of these peptides are summarized in Table S1. Although the 1D spectra were in overall accordance with trends in synthetic yield, CD deconvolution and redox stability (18), interpretation was limited by the small number of resolved features. To circumvent this limitation, 2D homonuclear and 2D 1H-13C HSQC NMR spectra were obtained at natural abundance. Analysis was undertaken in reference to baseline HSQC spectra of native DesDi (as the single-chain precursor and as cleaved two-chain hormone analog; Figures 3A, B). Near-complete assignment of 1HN, 1Hα, 13Cα and 13Cβ resonances in the parent 1SS peptide model enabled mapping of secondary structure based on the pattern of secondary chemical shifts (Figure 4) (34-36). α-Helical segments comprise residues B9-B18 and A13-A19 (Figure 4E), a subset of native secondary structure. Chemical shifts (referenced below) and estimates of chemical-shift dispersion are tabulated in Tables S2-S9. 1H-13C HSQC spectra provide correlations between a 13C atom and an attached proton (1H) via a one-bond J-coupling (47,48). Initial spectra were obtained at 35°C. Key resonance assignments in the parent 1SS peptide and 1SS-Ser B24 variant were verified by site-specific 13C labeling of residues Val B12, Leu B15, Gly B23, Phe B24, Tyr B26 and Ile A2 (Scheme S1 and Figure S5). In each figure aromatic 2D spectra are shown at left, and aliphatic spectra at right.
The spectrum of the native state is better illustrated by two-chain [Asp B10, Glu A8]-DesDi-insulin (Figure 3B; black) than by its single-chain precursor (Figure 3A; gray) due to selected resonance broadening in the latter spectrum; such broadening may reflect partial dimerization (stronger in single-chain analogs) and/or conformational exchange intermediate on the timescale of NMR chemical shifts. The baseline 2D 1H-13C HSQC spectrum of two-chain [Asp B10, Glu A8]-DesDi-insulin is notable for its resolution of many individual spin systems; selected resonance assignments are provided in Figures 3A, B. Of particular interest are the aromatic resonances of Phe B24 and Tyr B26, which pack against the central B-chain α-helix and influence the chemical shifts (via their aromatic ring currents) of the side chains of Leu B11 and Leu B15. These respective upfield aromatic and upfield methyl resonances provide markers of the native B-chain supersecondary structure (49,50). The upfield chemical shifts of Ile A2 by contrast reflect A-chain supersecondary structure (in particular the aromatic ring current of Tyr A19); that of Ile A10 reflects inter-chain packing of His B5 (51,52). The latter structural features (and their NMR signatures) are reinforced by cystines A6-A11 and B7-A7 in native insulin.
The 2D 1H-13C HSQC spectrum of the parent 1SS model (Figure 3C; green) is remarkable for preservation of some upfield shifts (Phe B24, Tyr B26, Leu B11 and Leu B15) but not others (Ile A2 and Ile A10). The former subset of nativelike chemical shifts indicates maintenance of B-chain secondary structure flanking Cys B19 (and therefore the B19-A20 disulfide bridge) whereas the latter attenuation of structure-dependent secondary shifts indicates destabilization of long-range structure involving the A2-A10 segment in accordance with removal of the B7-A7 and A6-A11 disulfide bridges (50,53,54). This combination of preserved and attenuated chemical shifts in the spectrum of the parent 1SS model is illustrated by overlay of its spectrum (green) and that of two-chain [Asp B10, Glu A8]-DesDi-insulin (black) in Figure 3D. The spectrum of the parent 1SS peptide in turn provides a baseline for comparative analysis of the MIDY-derived variants (Figure 5). Chemical-shift dispersion is markedly attenuated in each of the variants (Figures 5B-D), most markedly in 1SS-Pro B15 (Figure 5D). In the case of 1SS-Ser B24 (Figure 5B) such attenuation may reflect both loss of ordered structure and removal of the Phe B24 aromatic ring current (52), potentially a confounding issue (the same issue pertains to Ser B24-insulin analogs with native pairing; Figure S6). A subtle trend is observed in the 1Hε chemical shift of Tyr B26 (arrows in Figures 5B-D), in which slight upfield shifts are observed in two cases (1SS-Ser B24 > 1SS-Pro A16) but not in 1SS-Pro B15. This trend is shown in an enlargement of spectra in Figure S7. Analysis of 1H-13C chemical shifts was extended by 2D 1H-1H NOE spectroscopy (NOESY). Of particular interest are interproton NOEs (reflecting distances < 5 Å) between aromatic and aliphatic side chains.
Such NOEs are prominent in the spectrum of native insulin (upper panel of Figures 6A, B), shown in relation to the corresponding TOCSY (total correlation spectroscopy) spectra of aromatic spin systems (lower panel). These inter-residue NOEs are in part retained in the NOESY spectrum of the parent 1SS peptide (upper panel in Figure 6C). Particularly notable is the retention of close contacts between Phe B24 and Tyr B26 and the methyl groups of Leu B15, resolved due to native-like chemical-shift dispersion. A subtle feature is observed in the aromatic TOCSY spectra: the spin systems of Tyr B16 and Tyr A19, downfield of the mobile and solvent-exposed side chain of Tyr A14 in native insulin (lower panels of Figures 6A, B), are retained in attenuated form in the TOCSY spectrum of the 1SS peptide (lower panel of Figure 6C). NOEs between aromatic and aliphatic protons are observed in the spectra of the variants, but with decreased dispersion (inset boxes in Figures 6D-F); in the case of 1SS-Pro B15, the overall integrated cross-peak envelope intensity is reduced (Figure 6F). Although as expected the aromatic spin system of Phe B24 is absent in the TOCSY spectrum of 1SS-Ser B24 (lower panel of Figure 6D), subtle upfield shifts of Phe B24 are retained in 1SS-Pro A16 and 1SS-Pro B15. These trends are shown in expanded form in Figure S8. We imagine that the latter conformational ensembles contain a minor fraction of compact substates with long-range contacts, which nonetheless are less populated than in the parent 1SS peptide. This interpretation is supported by more detailed examination of these NOESY regions (Figures 7C-F) in relation to the structural relationships in native insulin (Figures 7A, B). Aromatic-methyl NOEs in the parent 1SS model are shown in expanded form in Figure 7C; long-range contacts are prominent in the neighborhood of cystine B19-A20, proposed to constitute the specific folding nucleus of proinsulin (green side chains in Figure 7A) (22).
Also observed are long-range NOEs from Tyr B26 to the methyl groups of Ile A2 and Val A3 , presumably reinforced by the B28-A1 peptide bond in the DesDi framework and foreshadowing subsequent steps in A-domain segmental folding associated with pairing of the remaining two cystines. This subset of these nativelike long-range NOEs can be resolved in the variants despite their attenuated chemical-shift dispersion ( Figures 7D-F).
The above degree of organization in the nascent structure of the parent 1SS model is dependent on pairing of Cys B19 and Cys A20. Whereas overlay of the 1H-13C HSQC spectra of the parent 1SS peptide and two-chain [Asp B10, Glu A8]-DesDi-insulin reveals similar limits of dispersion (green versus black in Figure 8A; inset vertical and horizontal arrows), reduction of the 1SS disulfide bridge by deuterated dithiothreitol (to yield a linear peptide) led to loss of 1H-13C chemical-shift dispersion (green versus brown in Figure 8B). The reduced 1SS peptide exhibits limited dispersion, with a pattern that is similar in detail to that of 1SS-Pro B15 (brown versus red in Figure 8C). Possible transient or nascent long-range interactions in the linear 1SS peptide have not been investigated. 1H-13C HSQC spectra of the four 1SS peptides are overlaid in Figure 8D; relative main-chain dispersions exhibit the qualitative trend: parent > Ser B24 > Pro A16 > Pro B15 in accordance with the findings above. This trend was made quantitative by detailed analysis of respective secondary shifts [in reference to tabulated random-coil values (55)] as shown in the four histograms in Figure S11. Reliance on 1Hα/13Cα chemical shifts circumvents the confounding absence of the B24 ring current in 1SS-Ser B24 as these resonances are less influenced by aromatic ring currents (52,56). The greater main-chain chemical-shift dispersion in 1SS-Ser B24 and 1SS-Pro A16 relative to 1SS-Pro B15 was accentuated by lowering the temperature from 35 to 10°C. Stacked plots of 1D 1H-NMR spectra are shown for each 1SS peptide as a function of temperature in the range 5-35°C (in steps of 5°C) in Figure S9. At lower temperatures spectra of the parent 1SS peptide, 1SS-Ser B24 and 1SS-Pro A16 exhibit conformational broadening of upfield aromatic and aliphatic features, suggesting slowing of conformational fluctuations into the millisecond […] (Figure S10). This trend extends to the aromatic and methyl regions of the HSQC spectra (Figure S12).
The chemical shift of Tyr B26 Hε and a resolved methyl resonance in the reduced 1SS peptide are likewise independent of temperature (Figures S12, S13). We speculate that the anomalous NMR properties of 1SS-Pro B15, indicating loss of nascent structure relative to the other 1SS peptides, rationalize this mutation's essentially complete block to the folding of Pro B15-DesDi, even in the presence of stabilizing substitutions Asp B10 and Glu A8 [preceding article in this issue (18)].

DISCUSSION
The discovery of proinsulin by Steiner and colleagues in 1967 [(25,57); for review, see (58)] solved a problem encountered in the chemical synthesis of insulin: inefficient specific disulfide pairing encountered in chain combination (24). Although the isolated A and B chains of insulin contain sufficient information to specify the native structure (59), yield is reduced by competing off-pathway reactions, including formation of cyclic peptides and amyloid. Proinsulin is nonetheless itself difficult to fold. The majority of human cell lines in routine laboratory use do not efficiently fold proinsulin (4), leading to detectable disulfide isomers (3). The specialized folding environment of the β-cell ER is adapted to the biochemical demands of proinsulin biosynthesis, and yet even so physiological overexpression of the INS gene, as a compensatory response to peripheral insulin resistance (60), can induce chronic ER stress and contribute to the progression of prediabetes and Type 2 DM (8). The growing collection of MIDY mutations in proinsulin associated with toxic misfolding leading to β-cell dysfunction (9) has motivated the hypothesis that insulin sequences, well conserved among vertebrates (23,61), are entrenched at the edge of foldability (46). The marginal stability of insulin and proinsulin, relative to such classical model proteins as bovine pancreatic trypsin inhibitor and hen egg white lysozyme, is associated with qualitative differences in their respective refolding properties (62-67) (see also Supplemental Discussion).
In this and our companion study (18) we have introduced a single-chain peptide model of an early proinsulin folding intermediate. A framework ("DesDi") was provided by an innovative mini-proinsulin containing a peptide bond between residues B28 and A1, with Pro B28 substituted by Lys to enable facile enzymatic cleavage to liberate an active insulin analog (27). The B28-A1 peptide bond enables successful oxidative folding of the 49-residue synthetic precursor even in the presence of mutations (such as Val A16) that otherwise block classical chain combination (68). The B28-A1 inter-chain tether in DesDi presumably favors a productive orientation between A- and B-domain folding determinants and limits off-pathway events. We further stabilized DesDi by enhancing the α-helical propensity of the central B-domain segment [His B10→Asp (29)] and N-terminal A-domain segment [Thr A8→Glu (30)]. Increasing the net negative charge through these acidic substitutions would also be expected to enhance solubility and retard competing formation of amyloid (69). A one-disulfide model of an initial proinsulin folding intermediate was thus obtained by pairwise substitution of exposed cystine B7-A7 by Ser and internal cystine A6-A11 by Ala (18). The present study builds on our foundational characterization of the 1SS peptide model to interrogate nascent structure by two-dimensional 1H and 1H-13C NMR spectroscopy. In accordance with prior NMR studies of a two-chain peptide model of a one-disulfide IGF-I folding intermediate (22), the parent 1SS model contains a subset of native secondary structure: central B-domain α-helix (residues B9-B19), C-terminal A-domain α-helix (A12-A20) and nascent β-strand (B24-B26). Molecular models of the parent model and the corresponding proinsulin intermediate are shown in Figure 9A in relation to the solution structure of an engineered proinsulin monomer (45).
In these models cystine B19-A20 is integral to the hydrophobic mini-core formed at the confluence of the nascent elements of secondary structure. We envision that this nativelike subdomain represents the first organized nucleus in a series of successive folding landscapes ( Figure 9B). Although disulfide chemistry in polypeptides can exhibit (especially at basic pH) complex patterns of native and non-native disulfide exchange and rearrangement, this structural perspective offers a simplified view of the predominant proinsulin folding scheme at neutral pH ( Figure 9C). This scheme in principle provides a framework for interpreting clinical mutations that impair folding efficiency.
Comparative NMR studies of the variant 1SS peptides suggest structural mechanisms of impaired foldability. In particular, patterns of chemical shifts and NOEs provide evidence of native-like tertiary structure in the neighborhood of cystine B19-A20 and its destabilization in the variant peptides in rank order Ser B24 >> Pro A16 > Pro B15 (least organized). Their respective ensembles of partial folds each exhibit a subset of nativelike long-range NOEs (presumably reflecting fractional occupancies of analogous molten-globule states that foreshadow native structural relationships) but with progressively more complete averaging of chemical shifts in this series (Figure S14). Among these 1SS peptides and native insulin, striking correlations are observed between CD-defined α-helix contents [in the same rank order (18)] and NMR parameters: mean 1Hα chemical-shift dispersion (Figure 10A) and average 1Hα/13Cα main-chain secondary shifts (Figure 10B). The biological importance of these CD- and NMR-derived biophysical parameters is demonstrated by their further correlation with levels of ER stress induced by expression of the corresponding mutant proinsulin (Figures 10C, D) (4,33). Although Pro B15 is more profoundly perturbing than is Pro A16, each is associated with neonatal-onset DM (31,32) and so must surpass the threshold for post-natal β-cell ER stress leading to the rapid progression of β-cell dysfunction and death (8). A surprising aspect of the present NMR studies is the extent to which the parent 1SS peptide retains native-like spectroscopic features. Such structure is lost on reduction of cystine B19-A20. We ascribe the nascent organization of the parent 1SS peptide to (a) the B28-A1 peptide bond, which orients flanking B- and A-domain segments, and (b) stabilizing α-helical substitutions Asp B10 and Glu A8.
Although such extensive nascent structure would not be expected in a one-disulfide analog of proinsulin (86 residues), the richness of the 1SS NMR spectrum suggests that it would be of future interest to develop a biosynthetic expression system, so that uniform 13C and 15N isotopic labeling would enable application of powerful 3D/4D heteronuclear NMR methods (72,73), including residual dipolar couplings (74). We foresee that such high-resolution analysis would enable comparison between the nascent structure and dynamics of the 1SS peptide and the classic globular structure of native insulin (23). Further, such complete 1H, 13C and 15N characterization would provide a rigorous platform for comparative studies of those MIDY mutations that perturb pairing of Cys B19 and Cys A20.

CONCLUDING REMARKS
The spectrum of diabetes-associated mutations in the insulin gene implicates diverse genotype-phenotype relationships (9,75). These include not only mutations associated with toxic misfolding of proinsulin within the ER (the present focus), but also those affecting upstream translocation of nascent preproinsulin (76,77) and downstream trafficking, prohormone processing, and receptor binding (75). Each class of clinical mutations promises an opportunity to dissect respective molecular mechanisms critical to wild-type hormone biosynthesis and function. Because mutations may introduce mild, intermediate or severe biochemical perturbations, their comparative study may reveal quantitative thresholds of dysfunction associated with clinical features, such as age of diabetes onset or degree of genetic penetrance. Adult-onset Ser B24 represents a mild perturbation of folding efficiency whereas both neonatal-onset Pro A16 and Pro B15 mutations, albeit distinct in location and degree of structural perturbation, must be below the threshold of foldability required for β-cell viability. Extending the present approach to additional MIDY mutations may define molecular determinants of this threshold. The mutant proinsulin syndrome thus promises to provide an intriguing model to relate chemistry to biology in a prototypical disease of intracellular protein misfolding.
Insulin residues are denoted by residue type (in standard three-letter code) followed by the chain and position (e.g., Phe B24 designates a phenylalanine at the 24th position of the B chain).
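For readers handling residue labels programmatically, the notation just described (three-letter residue code, chain letter, position) can be parsed as in the sketch below. The function and type names are hypothetical, and restricting the chain to A or B is an assumption that matches the labels used in this article.

```python
import re
from typing import NamedTuple


class ResidueLabel(NamedTuple):
    residue: str   # three-letter amino-acid code, e.g. "Phe"
    chain: str     # "A" or "B" (assumed restriction)
    position: int  # 1-based position within the chain


# Accepts "PheB24" as well as "Phe B24" (optional whitespace before the chain)
_LABEL = re.compile(r"^([A-Z][a-z]{2})\s*([AB])(\d+)$")


def parse_label(text: str) -> ResidueLabel:
    """Parse a label such as 'Phe B24' into its three components."""
    match = _LABEL.match(text.strip())
    if match is None:
        raise ValueError(f"not a residue label: {text!r}")
    return ResidueLabel(match.group(1), match.group(2), int(match.group(3)))


print(parse_label("PheB24"))
print(parse_label("Leu A16"))
```

A substitution such as His B10→Asp would need an additional field for the replacement residue; that extension is left out of this sketch.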