Tuning the Solubility of Self-Assembled Fluorescent Aromatic Cages Using Functionalized Amino Acid Building Blocks

We previously reported novel fluorescent aromatic cages that are self-produced using a set of orthogonal dynamic covalent reactions, operating simultaneously in one pot, to assemble up to 10 components through 12 reactions into a single cage-type structure. We now introduce N-functionalized amino acids as new building blocks that enable tuning of the solubility and analysis of the resulting cages. A convenient divergent synthetic approach was developed to tether different side chains to the N-terminus of a cysteine-derived building block. Our studies show that this chemical functionalization does not prevent the subsequent self-assembly and effective formation of the desired cages. While the originally described cages required 94% DMSO, the new ones bearing hydrophobic side chains were found to be soluble in organic solvents (up to 90% CHCl3), and those grafted with hydrophilic side chains were soluble in water (up to 75% H2O). Fluorescence studies confirmed that, despite cage functionalization, the aggregation-induced emission properties of these architectures are retained. Thus, this work significantly expands the range of solvents in which these self-assembled cage compounds can be generated, which in turn should enable new applications, possibly as fluorescent sensors.

We recently reported a method for obtaining a new type of molecular cage-like system using a set of two orthogonal reversible reactions (disulfide and acyl-hydrazone formation) that occur simultaneously (Drozdz et al., 2017). This was the first report of multi-dynamic and multi-component cages selectively obtained in a one-pot process, and characterized in semi-aqueous media, by applying two distinct reversible covalent bonds. The reported cages were based on two simple building blocks: tetraphenylethylene-tetraaldehyde (TPE-Ald), known for its fluorescence properties such as aggregation-induced emission (AIE) (Zhao et al., 2012; Mei et al., 2015; Feng et al., 2018), and cysteine hydrazide, a small, chiral molecule containing three different functional groups: amine, hydrazide, and thiol. Several factors were found to contribute to the exceptional selectivity of cage formation, among which the presence of dimethyl sulfoxide (DMSO) in the reaction mixture proved essential for effective cage generation. This is related to the participation of this solvent in the oxidation of thiols into disulfides, which is promoted even under slightly acidic conditions, as previously reported (Tam et al., 1991; Atcher and Alfonso, 2013). Both thiol oxidation and disulfide exchange usually proceed under mildly basic conditions, whereas acyl-hydrazone formation requires an acid catalyst. Therefore, the simultaneous formation of these two bonds, despite clear progress in this area, remains a challenging task, and the use of DMSO as co-solvent helps in this regard. The presence of reversible covalent bonds in the cage structure is crucial, as it allows thermodynamically stable products to be generated by self-correction of intermediate kinetic products, while maintaining dynamic features that are important for the responsiveness of those structures.
These dynamic features are particularly important when considering the potential use of such systems in the selective complexation of guest molecules or in "self-healing" processes. The first generation of our doubly dynamic cages, despite their interesting structural features, required a well-defined solvent mixture (H2O-DMSO, with the latter predominating at 75 to 94%). This restricted the investigation of properties such as fluorescence to this solvent system, and limited the potential applications one can envisage. Therefore, to expand the scope of our cages, we decided to modify the structure of one of the building blocks in order to generate two doubly dynamic, multi-component cage systems presenting similar structural properties but distinct hydrophobic/hydrophilic character. We chose to take advantage of the reactive α-amine present on amino acids and thus to functionalize the cysteine hydrazide building block at its N-terminus. It was of course necessary to ensure first that the preferential formation of the cage structure would be retained, the principal intent being simply to modify the solubility of the cages, and then to characterize the spectroscopic properties of the cages in a wider range of solvents. We describe here our syntheses of cages derived from modified cysteine hydrazide units and the properties of these new architectures in various media.

Design of Building Blocks
Our investigations began with the design and selection of several structurally distinct molecular components for the efficient construction of doubly dynamic tetrapodal cage systems. As shown in Figure 1, such cages can be formed in a self-assembly process between two tetratopic components constituting the upper and lower planes of the cage and four ditopic molecules acting as bridging linkers. Both types of building blocks must have functional groups that allow facile formation of reversible covalent bonds such as acyl-hydrazones and disulfides. The TPE aldehyde (TPE-Ald) was retained as the aromatic core of the cage-like structures, because we wanted our new cages to have a structure analogous to those reported previously (Drozdz et al., 2017). We decided to use the amino group of cysteine hydrazide to insert a structural extension via amide bond formation. Coupling reactions based on different amino acid systems are well described and run under mild conditions. We selected two organic acids to modify the cysteine hydrazide: a more polar and hydrophilic component, 2-(2-methoxyethoxy)acetic acid (DEG), containing a short diglycol fragment, in order to achieve enhanced aqueous solubility, and a non-polar component, 2-ethylhexanoic acid (EH), to increase hydrophobicity and generate cages soluble in organic solvents. It should be mentioned that we used racemic 2-ethylhexanoic acid to make the synthesis more cost-effective and facile. We assumed here that one of the key features for this project is to maintain the R-configuration of the cysteine hydrazide moieties, and that the use of racemic EH acid would not affect the self-assembly process of cage formation.
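The component and reaction counts quoted for these cages (10 components, 12 reactions) can be reproduced with simple functional-group bookkeeping. The sketch below is illustrative only: it assumes that each cage contains two TPE-Ald cores with four aldehydes each, that each cysteine hydrazide derivative contributes one hydrazide and one thiol, and that each ditopic bridging linker is a disulfide-linked pair of cysteine units. The class and function names are hypothetical and not taken from the original work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildingBlock:
    """Reactive groups carried by one building block (illustrative model)."""
    name: str
    aldehydes: int = 0
    hydrazides: int = 0
    thiols: int = 0


def cage_bookkeeping(core: BuildingBlock, linker: BuildingBlock, n_cores: int = 2) -> dict:
    """Count components and bonds for a closed cage.

    Assumption: every aldehyde forms one acyl-hydrazone and every thiol
    ends up in a disulfide (two thiols per disulfide).
    """
    total_aldehydes = n_cores * core.aldehydes
    n_linkers = total_aldehydes // linker.hydrazides  # one hydrazide per aldehyde
    acyl_hydrazones = total_aldehydes
    disulfides = (n_linkers * linker.thiols) // 2
    return {
        "components": n_cores + n_linkers,
        "acyl_hydrazones": acyl_hydrazones,
        "disulfides": disulfides,
        "reactions": acyl_hydrazones + disulfides,
        "bridges": disulfides,  # each disulfide dimer acts as one ditopic bridge
    }


tpe_ald = BuildingBlock("TPE-Ald", aldehydes=4)
cys_hyd = BuildingBlock("N-functionalized Cys-Hyd", hydrazides=1, thiols=1)
summary = cage_bookkeeping(tpe_ald, cys_hyd)
print(summary)
```

Under these assumptions the tally gives two cores plus eight cysteine units, eight acyl-hydrazones, and four disulfides, which matches the four bridging linkers described above.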

Synthesis of Building Blocks
The syntheses (Figure 2) began with the preparation of a partially protected cysteine hydrazide. Commercially available Fmoc-STr-L-Cys-OH (1) was coupled with tert-butyl carbazate, and the Fmoc group was then removed in a standard reaction with piperidine in DMF. N-Hydroxysuccinimide-activated esters (4 and 5) of the diethyl-glycol acid and the ethylhexyl acid were obtained by reaction with DCC and N-hydroxysuccinimide (NHS), and H-L-Cys-STr-Hyd-Boc (3) was then combined with the activated esters in amide coupling reactions. In this way, two fully protected derivatives of cysteine hydrazide tethered with DEG and EH moieties were obtained (6 and 7). In the final step, the hydrazide and thiol groups were deprotected in both the DEG and EH derivatives in a solution of TFA/TIS (9/1). Synthetic protocols and characterization of all obtained compounds are available in the experimental section.

FIGURE 1 | General scheme of the molecular cage formation process. The three functional groups involved in the self-assembly process are marked with colored atom labels (aldehyde: red; hydrazide: blue; thiol: orange). Structural modifications that change the physical properties of the cage are shown with a background color (light blue for the diethyl glycol moiety and light yellow for the ethyl-hexyl moiety). The color gradient under the structure of each cage shows the contribution of individual fragments to the solubility of the entire system.

Self-Assembly of Cages
Cage formation using the new building blocks DEG-L-Cys-Hyd (9) and EH-L-Cys-Hyd (10) was first assessed in the original solvent system (DMSO/H2O 94/6). In both cases, LC-MS monitoring showed that complete conversion was reached after 3 days of equilibration at 50 °C, with no TPE-Ald left and the appearance of a new single peak in the mass spectra that corresponded to the expected cage compounds, DEG-CAGE and EH-CAGE, respectively. Cage formation using orthogonal dynamic covalent reactions, namely acyl-hydrazone and disulfide formation, was therefore confirmed with the new N-functionalized cysteine-derived building blocks (Figure 3). We then varied the nature of the solvent. First, DEG-CAGE formation was studied in a solvent system with an increasing proportion of water, from DMSO In samples with less than 25% DMSO, LC-MS indicated the presence of intermediates, which corresponded to acyl-hydrazone condensation products with no sign of higher structures generated by disulfide bond formation. Similarly, EH-CAGE formation was studied in organic solvents (from 100% DMSO to 100% CHCl3), and LC-MS showed effective cage formation up to 90% CHCl3 (Table 1; Electronic Supplementary Information). Here again, a minimum of 10% DMSO seems to be essential for complete cage formation; otherwise, acyl-hydrazone intermediates are formed (Electronic Supplementary Information) and do not react further. At this stage, we attribute the need for the indicated percentages of DMSO in the reaction mixture both to solubility and to the facilitated oxidation of thiols to disulfides. Both cages were also characterized by 1H NMR. To prepare the samples, the original post-self-assembly mixture was evaporated to dryness and the residue was redissolved in the deuterated solvent. The recorded 1H NMR spectra confirmed the complex structure of the tetrapodal cages, which exist as a mixture of isomers (Figure 3C).
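The solvent dependence reported above can be summarized as a small lookup, which may be convenient when planning a solvent screen. This is a minimal sketch that encodes only the lower DMSO limits stated in the text (25% DMSO for DEG-CAGE in water, 10% DMSO for EH-CAGE in chloroform). The names `MIN_DMSO_PERCENT`, `CO_SOLVENT`, and `expected_outcome` are hypothetical, and the function is a restatement of the LC-MS observations rather than a predictive model.

```python
# Lowest DMSO content (vol %) at which the text reports effective cage formation.
# Below these values only acyl-hydrazone intermediates were observed by LC-MS.
MIN_DMSO_PERCENT = {"DEG-CAGE": 25.0, "EH-CAGE": 10.0}

# Co-solvent paired with DMSO for each cage in the screen described in the text.
CO_SOLVENT = {"DEG-CAGE": "H2O", "EH-CAGE": "CHCl3"}


def expected_outcome(cage: str, dmso_percent: float) -> str:
    """Return the species reported for a given DMSO content (illustrative only)."""
    if cage not in MIN_DMSO_PERCENT:
        raise KeyError(f"unknown cage: {cage}")
    if not 0.0 <= dmso_percent <= 100.0:
        raise ValueError("dmso_percent must be between 0 and 100")
    if dmso_percent >= MIN_DMSO_PERCENT[cage]:
        return "cage"
    return "acyl-hydrazone intermediates"


for cage, dmso in [("DEG-CAGE", 25.0), ("DEG-CAGE", 10.0), ("EH-CAGE", 10.0), ("EH-CAGE", 0.0)]:
    co_solvent_percent = 100.0 - dmso
    print(f"{cage}: {dmso:.0f}% DMSO / {co_solvent_percent:.0f}% {CO_SOLVENT[cage]} -> "
          f"{expected_outcome(cage, dmso)}")
```

The thresholds are treated as sharp cut-offs purely for illustration; the experiments sampled discrete solvent ratios, so the behavior between sampled compositions is not established by the text.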
Due to the high complexity of these molecules, we employed semiempirical molecular modeling (Recife Model 1) to gain further insight into the structural features of the cages (Figure 3D). Based on these models, we estimated the size and volume of each cage. Optimization of DEG-CAGE showed that the internal cavity is retained in comparison to the unmodified cage. The length and shape of the side chains allow them to fit fully into the side grooves of the capsule, which increases the size and volume of the molecule (∼4,200 Å³ of spherical volume). After optimization, EH-CAGE has no visible cavity inside the molecule. The preferred configuration of the ethyl-hexanoic chains forces the TPE backbones of the EH-CAGE molecule to twist slightly around the central axis, while the aromatic rings adjust and stack on each other, which gives the molecule a flat shape and makes EH-CAGE smaller than DEG-CAGE (∼3,200 Å³ of spherical volume).
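To put the two modeled volumes on a length scale, they can be converted to the radius of a sphere of equal volume using V = (4/3)πr³. The snippet below is a rough sketch under the assumption that "spherical volume" refers to the volume of an enclosing or equivalent sphere; the text does not define the term, and the function name is hypothetical.

```python
import math


def sphere_radius_from_volume(volume_a3: float) -> float:
    """Radius (in Å) of a sphere with the given volume (in Å^3): r = (3V / 4π)^(1/3)."""
    if volume_a3 <= 0:
        raise ValueError("volume must be positive")
    return (3.0 * volume_a3 / (4.0 * math.pi)) ** (1.0 / 3.0)


# Approximate volumes quoted in the text for the optimized structures.
modeled_volumes = {"DEG-CAGE": 4200.0, "EH-CAGE": 3200.0}

radii = {cage: sphere_radius_from_volume(v) for cage, v in modeled_volumes.items()}
for cage, radius in radii.items():
    print(f"{cage}: equivalent radius ~{radius:.1f} Å, diameter ~{2 * radius:.1f} Å")
```

Because radius scales with the cube root of volume, the roughly 25% smaller volume of EH-CAGE corresponds to a much smaller relative change in equivalent radius.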

Fluorescence Properties
The fluorescence spectra of both cage compounds, DEG-CAGE and EH-CAGE, were recorded in different solvent systems at the same concentration (0.5 mM). The shape of the spectra remained essentially unaltered in all solvents, showing an emission maximum around 510-520 nm. In all solvent mixtures employed, both cages showed significant emission enhancement relative to TPE-Ald, as can be seen by comparison with its fluorescence spectrum (Figure 4). The observed emission enhancement is attributed to rigidification and suppression of phenyl ring torsion of the two TPE units within a single cage molecule (Shultz and Fox, 1989). Increasing the water content decreases the fluorescence emission intensity of DEG-CAGE about five-fold, whereas increasing the chloroform content increases the fluorescence emission intensity of EH-CAGE about two-fold (Figure 4).

CONCLUSIONS
We have reported herein the design and synthesis of new building blocks, based on N-functionalized cysteine-derived amino acids, for the one-pot multi-component self-assembly of fluorescent aromatic cages. Our results establish a convenient synthetic strategy and show that placing such side groups does not hinder cage formation, which highlights the robustness of the approach, and enables tuning of the solubility of the corresponding cage compounds. These cages preserved interesting aggregation-induced emission properties in a now much wider range of solvent mixtures (from 90% CHCl3/DMSO to 75% H2O/DMSO), which we believe can be harnessed for sensing applications.

MATERIALS AND METHODS
Solvents and chemicals were purchased from commercial suppliers and used without further purification. Preparative purifications were performed by silica gel flash column chromatography (Merck, 40-60 µm). HPLC analyses were performed on a Waters HPLC 2695 (EC Nucleosil 300-5 C18, 125 × 3 mm column, Macherey-Nagel) equipped with a Waters 996 DAD detector. The following linear gradient of solvent A (99.9% water and 0.1% TFA) into solvent B (99.9% acetonitrile and 0.1% TFA) was used: 0-100% of solvent B in 10 min; flow 1 mL/min. Retention times are given in minutes. LC/MS analyses were performed on a Shimadzu LCMS2020 (Phenomenex Kinetex C18, 2.6 µm × 7.5 cm, 100 Å) equipped with an SPD-M20A detector, with the following linear gradient of solvent A (99.9% water and 0.1% HCOOH) into solvent B (99.9% acetonitrile and 0.1% HCOOH): 0-100% of solvent B in 10 min; flow 1 mL/min. Retention times are given in minutes. Fluorescence analyses were carried out on an AF-2500 HITACHI fluorescence spectrophotometer. UV-Vis absorption spectra were measured on a UV-3100pC UVisco spectrophotometer. Samples of cage compounds were studied at 0.25 mM concentration. The excitation wavelength was set at 320 nm and emission spectra were recorded in the range 400-600 nm. 1H and 13C NMR spectra were recorded at 400 MHz for 1H and 101 MHz for 13C (Bruker Avance 400) in deuterated solvents. Chemical shifts are reported in ppm relative to the solvent residual peak. HR-MS analyses were carried out at the Laboratoire de Mesures Physiques, IBMM, Université de Montpellier, using Micromass Q-Tof instruments. The TPE aldehyde (TPE-Ald) and Fmoc-L-Cys-STr-Hyd-Boc (2) were obtained according to the previously reported method (Drozdz et al., 2017).
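The linear gradient used for both the HPLC and LC/MS methods (0-100% solvent B in 10 min at 1 mL/min) can be expressed as a simple function of time, which is helpful when relating a retention time to the eluent composition at that moment. The sketch below is illustrative: the function names are hypothetical, and holding the composition constant before 0 min and after 10 min is an assumption, since the text does not describe equilibration or wash steps. It also ignores the instrument dwell volume.

```python
def percent_b(t_min: float, start: float = 0.0, end: float = 100.0,
              duration_min: float = 10.0) -> float:
    """Percentage of solvent B at time t for a linear gradient.

    Assumption: composition is held at `start` before t = 0 and at `end`
    after the gradient finishes.
    """
    if t_min <= 0.0:
        return start
    if t_min >= duration_min:
        return end
    return start + (end - start) * t_min / duration_min


def solvent_flows(t_min: float, flow_ml_min: float = 1.0) -> tuple:
    """Return (flow of A, flow of B) in mL/min at time t."""
    fraction_b = percent_b(t_min) / 100.0
    return (flow_ml_min * (1.0 - fraction_b), flow_ml_min * fraction_b)


for t in (0.0, 2.5, 5.0, 10.0):
    flow_a, flow_b = solvent_flows(t)
    print(f"t = {t:4.1f} min: {percent_b(t):5.1f}% B "
          f"(A {flow_a:.2f} mL/min, B {flow_b:.2f} mL/min)")
```

With these settings the gradient slope is 10% B per minute, so a peak eluting at a given retention time left the pump at roughly ten times that value in percent B, before any dwell-volume correction.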

Synthesis Protocols of Essential Chemicals and Building Blocks
Synthesis of H-L-Cys-STr-Hyd-Boc (3)
Fmoc-L-Cys-STr-Hyd-Boc (2.3 g, 3.31 mmol) was dissolved in a solution of DMF/piperidine (8:2, v/v; 70 mM) and stirred for 1 h at r.t. The reaction mixture was then concentrated in vacuo. The crude residue was purified by flash chromatography on silica gel eluting with a gradient of DCM

DATA AVAILABILITY
All datasets generated for this study are included in the manuscript and/or the Supplementary Files.

AUTHOR CONTRIBUTIONS
MK performed all organic synthesis, most experiments, analysis, and co-wrote the paper. PC performed molecular modeling and optimization of the tetrapodal cages. SU performed some analysis, interpreted the results, and co-wrote the paper. AS interpreted the results and co-wrote the paper.