Unraveling Heparan Sulfate Proteoglycan Binding Motif for Cancer Cell Selectivity

Membrane heparan sulfate proteoglycans (HSPG) regulate cell proliferation, migration, and differentiation and are therefore considered key players in cancer cell development processes. Here, we used the NT4 peptide to investigate how the sulfation pattern of HSPG on cells drives binding specificity. NT4 is a branched peptide that binds the glycosaminoglycan (GAG) chains of HSPG. It has already been shown to inhibit growth factor-induced migration and invasiveness of cancer cells, implying antagonist binding of HSPG. The binding affinity of NT4 with recombinant HSPG showed that NT4 bound glypican-3 and -4 and, with lower affinity, syndecan-4. NT4 binding to the cancer cell membrane was inversely correlated with sulfatase expression. NT4 binding was higher in cell lines with lower expression of SULF-1 and SULF-2, which confirms the determinant role of sulfate groups for recognition by NT4. Using 8-mer and 9-mer heparan sulfate (HS) oligosaccharides with analog disaccharide composition and different sulfation sites, a possible recognition motif was identified that includes repeated 6-O-sulfates alternating with N- and/or 2-O-sulfates. Molecular modeling provided a fully descriptive picture of binding architecture, showing that sulfate groups on opposite sides of the oligosaccharide can interact with positive residues on two peptide sequences of the branched structure, thus favoring multivalent binding and explaining the high affinity and selectivity of NT4 for highly sulfated GAGs. NT4 and possibly newly selected branched peptides will be essential probes for reconstructing and unraveling binding sites for cancer-involved ligands on GAGs and will pave the way for new cancer detection and treatment options.


INTRODUCTION
Heparan sulfate proteoglycans (HSPG) are a large family of heterogeneous molecules found in the extracellular matrix (ECM) and on the membranes of vertebrate cells. They are composed of a protein linked to sulfated glycosaminoglycan (GAG) chains, which are linear polymers of repeated disaccharide units consisting of an amino sugar and uronic acid, that can be modified with sulfate groups at various positions. HSPG can be classified by their localization as extracellular, intracellular, pericellular, and cell surface associated. Cell surface HSPG include the two families of syndecans and glypicans and betaglycan, a transmembrane proteoglycan (PG) with heparan and chondroitin sulfate chains. Glycosaminoglycan moieties in membrane-associated HSPG do not differ much in saccharide composition but are very different in sulfation pattern in terms of positions and number of sulfates (1,2). Since membrane HSPG regulate cell proliferation, adhesion, migration, and differentiation (3,4), they are considered key players in cancer cell development (1). This is because GAG chains of HSPG interact with a large number (>435) of extracellular regulatory proteins, such as growth factors, chemokines, and morphogens (5). Indeed, drugs directed against HSPG are being evaluated in preclinical models. For example, peptides directed against syndecan-1 have shown therapeutic promise in preclinical models of breast cancer and myeloma (6-8).
NT4 peptide is a tetrabranched peptide that binds to GAG chains of HSPG. Its branched structure, obtained by synthesizing four copies of the 13-amino-acid sequence on a branching core of lysines, makes NT4 stable to proteolytic enzymes and gives it a long half-life (9,10). NT4 binds cell lines of different human cancers, including colon adenocarcinoma, pancreas adenocarcinoma, bladder cancer, and breast cancer (11,12). It does not bind PgsA-745 cells (Chinese hamster ovary cell mutant), which lack GAG chains, being deficient in xylosyltransferase, the enzyme responsible for anchorage of GAG chains to the protein core (13). Tumor selectivity was very evident in surgical resections of colon, pancreas, and bladder cancer, stained with NT4 conjugated with a fluorescent probe, compared to the healthy counterparts (14-16).
NT4 peptides can be conjugated with different functional units and can selectively deliver drugs for cancer therapy or transport tracers for tumor imaging (11,12,15-18). Using drug-conjugated NT4, we obtained a significant reduction in tumor growth or even tumor regression (11,14,17), compared to animals treated with the unconjugated drug under identical conditions. NT4 transports the chemotherapeutic moiety to the cancer cell membrane and, ultimately, into the cell (14-16). In animal models of cancer, the higher concentration of the cytotoxic drug at the site of the tumor, obtained by targeting with the peptide, resulted in better efficacy than the free drug (11,14,17). We found that the high selectivity of NT4 toward cancer cells and tissues resides in its high-affinity binding to sulfated GAGs, with preferential high-affinity binding to heparin and heparan sulfate (HS) compared to chondroitin sulfate (CS) (13,19). Importantly, NT4 inhibited oriented migration of pancreas adenocarcinoma cells (13) as well as growth factor-induced migration and invasiveness of breast cancer cells, implying antagonist binding to HSPG (13,20).
Here, we report how the sulfation pattern of HSPG on cells can drive binding specificity. Regardless of which HSPG are expressed on cancer cells, the linear GAG polymers are the only HSPG moiety exposed on the outer membrane and are responsible for specificity.
The glycoside sequence and sulfation pattern of GAGs are crucial for ligand binding and are synthesized by enzymes in the Golgi apparatus and modified by extracellular enzymes that can introduce recognition patterns for growth factors (2) and other binding proteins. The specificity of GAG-ligand interactions has been reported in several studies. For example, it has been described in the case of the fibroblast growth factor (FGF)-heparin interaction, where the key residues on FGF and GAG chains were identified (21). The FGF-HS-FGFR1 ternary complex can only be formed in the presence of 6-O-sulfate groups on HS (22,23). Interestingly, it has been observed that short analogs of heparin, i.e., heparin oligosaccharides, featuring one or two 6-O-sulfate groups on the reducing end of glucosamine, can fully activate FGF2 signaling (24). 6-O-sulfation of HS is also reported to be necessary to prompt the response of primary fibroblasts to transforming growth factor-β1 (TGFβ1), whereas 6-O-sulfates negatively regulate Wnt signaling (25,26).
NT4 binds a specific pattern and competes with GAG binding proteins for important biological functions like angiogenesis and migration. As such, NT4 was used here to define the fine structure of binding sites on GAG chains.
MATERIALS AND METHODS
High-performance liquid chromatography (HPLC) purification was performed on a C18 Jupiter column (Phenomenex). Water with 0.1% trifluoroacetic acid (TFA) (A) and methanol (B) were used as eluents. Linear gradients over 30 min were run at flow rates of 0.8 and 4 ml/min for analytical and preparative procedures, respectively. All compounds were also characterized on a Bruker Ultraflex matrix-assisted laser desorption/ionization time-of-flight/time-of-flight (MALDI TOF/TOF) mass spectrometer. NT4 (pyELYENKPRRPYIL) 4

NT4 Binding
Cells were incubated with 1 µM NT4-biotin for 30 min at room temperature and then incubated with 1 µg/ml streptavidin-fluorescein isothiocyanate (FITC). For heparinase treatment, cells were incubated for 1 h at 37 °C on the plates with 0.03 IU/ml heparinase I/III blend (Sigma Aldrich), and then harvested and incubated with the same concentration of heparinase in suspension for an additional hour at 37 °C before NT4 staining. All experiments were repeated two times. P values were calculated using a two-tailed Student t-test and GraphPad Prism 5.0 software.

Real-Time Polymerase Chain Reaction (qRT-PCR)
Total RNA samples were extracted from different human cancer cells (1 × 10⁶ cells) with TRIzol (Invitrogen, Milan, Italy). For quantitative RT-PCR, RNA samples were retrotranscribed using the High-Capacity cDNA Synthesis Kit (Applied Biosystems, Monza, Italy) and amplified on an Abi Prism 7000 instrument (Applied Biosystems, Monza, Italy) using the TaqMan Universal PCR Master Mix (Applied Biosystems) following the manufacturer's instructions.
To determine the efficiency of each TaqMan gene expression assay, standard curves were generated by serial dilution of cDNA, and quantitative evaluations of target and housekeeping gene levels were obtained by measuring threshold cycle numbers (Ct). A relative quantitative analysis was performed using the 2^(−ΔΔCt) value, where ΔCt = Ct (target) − Ct (endogenous control) and ΔΔCt = ΔCt (sample) − ΔCt (calibrator). Beta actin was used as an endogenous control, and the sample with the lowest expression was used as a calibrator (syndecan-3 in HT-29).
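The relative quantification described above can be sketched in a few lines of Python. This is a minimal illustration of the standard 2^(−ΔΔCt) calculation, not the authors' analysis script: the function names are hypothetical, the Ct values are invented for the example, and the formula assumes amplification efficiencies close to 100% for both target and control assays.

```python
def delta_ct(ct_target: float, ct_control: float) -> float:
    """ΔCt = Ct(target) - Ct(endogenous control)."""
    return ct_target - ct_control


def relative_expression(ct_target: float, ct_control: float,
                        cal_ct_target: float, cal_ct_control: float) -> float:
    """Fold change 2^(-ΔΔCt) of a sample relative to the calibrator."""
    ddct = (delta_ct(ct_target, ct_control)
            - delta_ct(cal_ct_target, cal_ct_control))
    return 2.0 ** (-ddct)


# Hypothetical Ct values, for illustration only (not data from this study).
# Sample: target Ct 24, control Ct 18. Calibrator: target Ct 27, control Ct 18.
sample_fold = relative_expression(24.0, 18.0, 27.0, 18.0)
calibrator_fold = relative_expression(27.0, 18.0, 27.0, 18.0)
print(sample_fold, calibrator_fold)
```

By construction the calibrator has a relative expression of 1, so every other value reads as a fold change over the least expressed sample.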

Gene Expression of Human Sulfatases by RT-PCR
PANC-1, HT-29, MDA-MB-231, and MCF-7 cells were seeded in 6-well plates (5 × 10⁵ cells per well) and cultured overnight in a CO₂ incubator. Total RNA was extracted using an RNA isolation kit (Macherey-Nagel) according to the manufacturer's instructions. RNA was quantified by spectrophotometry at 260 and 280 nm and verified by agarose gel electrophoresis. The same quantity of RNA for every cell line was loaded on the gel. One-step RT-PCR (QIAGEN) was applied for retrotranscription and human cDNA amplification of SULF-1 (393 bp) and SULF-2 (434 bp). The following oligonucleotides were used as primers: SULF-1 primers, 5′-ACTTCCACTGCCTGCGTAATGA-3′ (sense) and 5′-ATGAACGCTTTGAGGCTAGGCA-3′ (antisense); SULF-2 primers, 5′-CCCAGAAGCTCACAAAGGAAAACG-3′ (sense) and 5′-AATGTCCACAACTGCGAGGGAT-3′ (antisense). The following PCR conditions were applied: for SULF-1, 30 cycles of denaturation at 94 °C for 60 s, annealing at 58 °C for 60 s, and extension at 72 °C for 90 s; for SULF-2, 30 cycles of denaturation at 94 °C for 60 s, annealing at 54 °C for 60 s, and extension at 72 °C for 60 s. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as experimental control. Signals were detected using Image LAS4010 (GE Healthcare). Densitometry analysis was carried out using ImageJ software. The value 100% corresponds to GAPDH gene expression for each cell line. The experiment was performed twice. P values were calculated using a one-tailed Student t-test and GraphPad Prism 5.0 software.

Expression of Sulf-1
HT-29, PANC-1, MDA-MB-231, and MCF-7 cells were seeded in 6-well plates (1.5 × 10⁶ cells per well), previously coated with 10 µg/ml plasma fibronectin, and maintained overnight in a CO₂ incubator. Cells were lysed according to the antibody supplier's instructions (Abcam). Total proteins (20 µl/lane) were separated by 12% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and transferred to a nitrocellulose membrane (GE Healthcare). The membrane was saturated with 5% w/v nonfat dry milk in PBS containing 0.1% Tween 20 for 1 h at room temperature and then incubated with specific antibodies [rabbit polyclonal to sulfatase 1/SULF-1 antibody (1 µg/ml, Abcam), and mouse anti-GAPDH monoclonal antibody (1 µg/ml, Invitrogen)]. After washing, the membrane was incubated with horseradish peroxidase-conjugated anti-rabbit immunoglobulin G (IgG) (1:2,000, Cell Signaling) in the case of anti-sulfatase 1/SULF-1 antibody and with horseradish peroxidase-conjugated anti-mouse IgG (1:10,000, ThermoFisher) in the case of anti-GAPDH antibody. Signals were detected using Image LAS4010 (GE Healthcare). Densitometry analysis was carried out using ImageJ software. The value 100% corresponds to average GAPDH protein expression for the four cell lines. The experiment was performed three times. P values were calculated using a parametric, unpaired Student t-test, and GraphPad Prism 5.0 software.

Surface Plasmon Resonance (SPR) Experiments
Experiments were performed on a Biacore T100 instrument (GE Healthcare). All materials were purchased from GE Healthcare unless otherwise specified. Full-length recombinant human HSPG were purchased from R&D Systems. Syndecan-3, syndecan-4, and glypican-3 were obtained from the mouse myeloma cell line (NS0), and glypican-4 was obtained from the Chinese Hamster Ovary cell line. The activity of syndecan-4, glypican-3, and glypican-4 was measured by the supplier as the ability of the immobilized protein to bind FGF-basic. The activity of syndecan-3 was measured by the supplier as the ability of the immobilized protein to inhibit adhesion of Saos-2 human osteosarcoma cells to human fibronectin.
NT4-biotin was captured on a CM5 sensor chip where streptavidin had previously been immobilized by standard amine coupling. Briefly, the sensor chip surface was activated with a mixture of 0.1 M 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC) and 0.4 M N-hydroxysuccinimide (NHS) for 7 min at a flow rate of 5 µl/min. Streptavidin was injected over the surface for 7 min, and finally, 1 M ethanolamine pH 8.5 was used to block any remaining activated carboxyl groups. NT4-biotin, diluted in HBS-EP+ (Hepes 10 mM, NaCl 150 mM, EDTA 3.4 mM, 0.05% P20, pH 7.4) to 30 µg/ml, was injected for 2 min at a flow rate of 10 µl/min.
HSPG and oligosaccharides were diluted to different concentrations in HBS-EP+ and then injected over immobilized NT4 peptides. The sensor chip surface was regenerated with a short pulse of 10 mM NaOH/0.5 M NaCl 5 min after the end of the injections.
Kinetics were analyzed with the Biacore T100 evaluation 1.1.1 software using the 1:1 Langmuir model to fit the curves.
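For readers unfamiliar with the 1:1 Langmuir model, the sketch below writes out its closed-form association and dissociation phases and the relation KD = kd/ka. It is a minimal illustration in Python, not the Biacore evaluation software's fitting routine: the function names are hypothetical, and the rate constants and Rmax are invented values, not those reported in Table 1.

```python
import math


def kd_equilibrium(ka: float, kd: float) -> float:
    """Equilibrium dissociation constant KD = kd / ka, in M."""
    return kd / ka


def association(t: float, conc: float, ka: float, kd: float, rmax: float) -> float:
    """Response (RU) at time t (s) while analyte at conc (M) is injected.

    Closed-form solution of dR/dt = ka*conc*(rmax - R) - kd*R with R(0) = 0.
    """
    kobs = ka * conc + kd              # observed rate constant, 1/s
    req = rmax * ka * conc / kobs      # steady-state response at this conc
    return req * (1.0 - math.exp(-kobs * t))


def dissociation(t: float, r0: float, kd: float) -> float:
    """Response (RU) at time t (s) after the injection ends, starting at r0."""
    return r0 * math.exp(-kd * t)


# Hypothetical parameters, for illustration only (not values from Table 1).
ka, kd, rmax = 1.0e5, 1.0e-3, 100.0    # 1/(M*s), 1/s, RU
KD = kd_equilibrium(ka, kd)

# A very long injection at conc = KD approaches half of rmax.
r_plateau = association(1.0e6, KD, ka, kd, rmax)
print(KD, r_plateau)
```

Fitting software estimates ka and kd by adjusting them until curves of this form match the sensorgrams recorded at each analyte concentration. A multivalent ligand such as NT4 may deviate from the 1:1 assumption, so constants obtained this way are best read as apparent values.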

Modeling of NT4-Sulfated Oligosaccharide Complex
NT4 was modeled in an extended conformation using PyMOL (The PyMOL Molecular Graphics System, Version 1.4, Schrödinger, LLC) and refined by energy minimization with the Gromacs package (27) and Amber force field (28). The molecule was centered in a triclinic box with at least 10 Å between the solute and the periodic box border; the box was filled with TIP3P water, and the system was neutralized by adding counterions. A new force field entry was created for lysine in the scaffold by reparameterization of the standard lysine residue from the Amber library, taking covalent bonding of the side-chain amine into account. The peptide was linked to available amines of the scaffold. The three-dimensional (3D) structure of the 8-mer heparin oligosaccharide was derived from the canonical helical structure of heparin (PDB ID 1HPN, ¹C₄ conformer) (29). The GLYCAM06 force field parameters (30) were used for GAGs.

RESULTS
In previous papers, we reported NT4 binding and internalization into different cancer cell lines by immunofluorescence and flow cytometry (11,13,14,19). In previous confocal microscopy experiments, NT4 conjugated with biotin (NT4-biotin) proved to be completely internalized only after 2 h at 37 °C (14,16). Degradation of NT4-biotin by living cells was previously assessed by mass spectrometry and showed that the molecule was still stable after 24 h (14). NT4 binding and internalization into those cancer cells or tissues were completely inhibited by heparin and HS (13,19). We also demonstrated that NT4 binds to heparin and HS with high affinity and to CS with lower affinity (13).
To further assess the specificity of binding of the NT4 peptide to HSPG in HT-29 colon adenocarcinoma, PANC-1 pancreas adenocarcinoma, and MDA-MB-231 and MCF-7 breast cancer human cell lines, we first treated the cells with the heparinase I/III blend that removes HS from proteoglycans. We then incubated the cells with NT4. Flow cytometry analysis showed that NT4 binding to cancer cells treated with heparinase was much lower than to control cells (Figure 1). Glypican and syndecan levels have recently been studied with a view to defining new tumor markers or prognostic tools (6,31). Elevated levels of glypican-1 are found in pancreas carcinoma where increased expression is associated with poor prognosis (32). Levels of glypican-1 and syndecan-2 are also increased in colorectal cancer (1). Breast cancer was found to upregulate glypican-1 (33-35) and syndecan-4 (36) and to downregulate glypican-3 (37), while loss of glypican-3 promotes tumor proliferation and metastasis (38). Glypican-2 is upregulated in neuroblastoma and associated with poor overall survival (1). The roles of glypican-4 and syndecan-3 in tumors are still underexplored. Expression of syndecans (Figure 2, shades of green) was generally higher than that of glypicans (Figure 2, shades of blue). Among syndecans, syndecan-4 was the most expressed in all cell lines, followed by syndecan-3 in MCF-7, MDA-MB-231, and PANC-1 cells. Among glypicans, glypican-4 was the most expressed, but only in MCF-7 cells (Figure 2).

Sulfatases Modulate NT4 Binding on Cancer Cells
Human sulfatase 1 (hSULF-1) and human sulfatase 2 (hSULF-2) are extracellular enzymes that remove 6-O-sulfate groups from HS chains. Modified expression of both sulfatases, particularly SULF-1, has been associated with different cancers (38). By hydrolyzing 6-O-sulfate groups, hSULF-1 and hSULF-2 modulate binding of HS-binding proteins, such as growth factors and cytokines, and, finally, have effects on cell signaling (38). For example, hSULF-1, acting on HS, reduces the formation of the FGF2-FGFR-HS complex and consequently impairs FGF2 signaling (39). Figure 3A shows the relative abundance of mRNA of hSULF-1 and hSULF-2 in HT-29, PANC-1, MCF-7, and MDA-MB-231 cells as measured by RT-PCR. The two sulfatases were expressed very differently in the different cell lines. SULF-1 protein expression was also measured in the same cell lines using a specific anti-SULF-1 antibody (Figure 3B). PANC-1 and HT-29 cells showed much lower expression of sulfatases, which implies that their sulfated GAG chains retain more 6-O-sulfate groups than cancer cells with higher expression of sulfatases, such as MCF-7 and MDA-MB-231.
The pattern of NT4 cell binding detected by flow cytometry (Figure 3C) suggests that cells expressing lower levels of sulfatases, particularly SULF-1, such as PANC-1 and HT-29, bind NT4 better than the others. The higher presence of the 6-O-sulfate groups is therefore correlated with higher binding of NT4 to those cell lines.

Affinity of NT4 for Recombinant HSPG and Sulfated GAGs
We used SPR to measure the affinity of NT4 binding to recombinant syndecans and glypicans, selected among those highly expressed by HT-29, PANC-1, MDA-MB-231, and MCF-7 cancer cell lines. We found that NT4 does not bind syndecan-3, whereas it binds syndecan-4, glypican-3, and glypican-4 (Figures 4A-D) with different affinities, the affinity of both glypicans being five times greater than that of syndecan-4. SPR analysis also enabled kinetic evaluation of NT4 binding to HSPG, showing different kinetic rates of association and dissociation (Table 1).
Binding of NT4 to synthetic oligosaccharides carrying different sulfation patterns was also analyzed. We used 8-mer and 9-mer oligosaccharides with the following sulfation patterns: no sulfation in oligosaccharide S00; four N-sulfate groups in S04; six sulfate groups in S06a, including four N-sulfates and two 6-O-sulfates; six sulfate groups in S06b, including four N-sulfates and two 2-O-sulfates; and, finally, 12 sulfate groups in S12, four in the 6-O position, four in 2-O, and four in N. We observed that the more sulfate groups there were, the higher the affinity of the oligosaccharide for the peptide. We also found a correlation between sulfation in position 6 of oligosaccharides and NT4 binding affinity. Indeed, S06a, which carries the same number of sulfates as S06b, bound NT4 better by virtue of having two 6-O-sulfates (Figure 4). The best-binding oligosaccharide was S12, which carries repeated 6-O-sulfates, like S06a, but the 6-O-sulfates in S12 alternate with 2-O- or N-sulfates, making binding more stable (Table 1).
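The sulfation patterns listed above are easy to mix up, so a small bookkeeping sketch in Python may help. The counts are transcribed from the description in the text; the dictionary layout, the function names, and the ranking heuristic (total sulfates first, then number of 6-O-sulfates) are assumptions introduced here to mirror the qualitative trend described, and they do not reproduce the measured constants in Table 1.

```python
# Sulfate counts per position for each synthetic oligosaccharide,
# transcribed from the description in the text.
OLIGOSACCHARIDES = {
    "S00":  {"N": 0, "2-O": 0, "6-O": 0},
    "S04":  {"N": 4, "2-O": 0, "6-O": 0},
    "S06a": {"N": 4, "2-O": 0, "6-O": 2},
    "S06b": {"N": 4, "2-O": 2, "6-O": 0},
    "S12":  {"N": 4, "2-O": 4, "6-O": 4},
}


def total_sulfates(name: str) -> int:
    """Total number of sulfate groups carried by the named oligosaccharide."""
    return sum(OLIGOSACCHARIDES[name].values())


def qualitative_rank(names):
    """Order names from weakest to strongest expected binder.

    Heuristic only: more sulfates rank higher, and ties are broken by the
    number of 6-O-sulfates, following the trend described in the text.
    """
    return sorted(
        names,
        key=lambda n: (total_sulfates(n), OLIGOSACCHARIDES[n]["6-O"]),
    )


ranking = qualitative_rank(OLIGOSACCHARIDES)
print(ranking)
```

Counts alone cannot express whether 6-O-sulfates alternate with N- or 2-O-sulfates along the chain; capturing that would require a per-residue representation.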

Graphical Model of Interaction of NT4 and a Sulfated Oligosaccharide
NT4 was modeled with PyMOL and refined by energy minimization. The 3D structure of the positively charged stretch of the NT4 peptide sequence (K6PRRP10), previously demonstrated to be critical for heparin binding (19), resulted in an extended conformation that lowers steric hindrance between rigid prolines and their preceding amino acids bearing a large side chain. This conformation gives rise to a triangular pattern formed by the charged termini of K6, R8, and R9, with 6-8 and 8-9 distances of ∼12 Å and an angle of ∼130° between residues 6-8-9.
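The triangular pattern can be characterized with two distances and one angle computed from the coordinates of the three charged termini. The Python sketch below shows that calculation under stated assumptions: the coordinates are hypothetical points placed to reproduce the reported ∼12 Å distances and ∼130° angle, not atom positions taken from the minimized model, and the function names are illustrative.

```python
import math


def distance(a, b) -> float:
    """Euclidean distance between two 3D points, in the input units (Å)."""
    return math.dist(a, b)


def angle_deg(a, vertex, b) -> float:
    """Angle a-vertex-b in degrees."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(b, vertex)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


# Hypothetical positions of the charged termini (Å), for illustration only.
theta = math.radians(130.0)
r8 = (0.0, 0.0, 0.0)
k6 = (12.0, 0.0, 0.0)
r9 = (12.0 * math.cos(theta), 12.0 * math.sin(theta), 0.0)

d_68 = distance(k6, r8)
d_89 = distance(r8, r9)
angle_689 = angle_deg(k6, r8, r9)
print(d_68, d_89, angle_689)
```

The same two functions can be applied to the sulfur atoms of a candidate sulfate triplet on the oligosaccharide to check whether its geometry is compatible with the peptide pattern.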
The 8-mer oligosaccharide was chosen for the in silico study on the basis of the SPR results that identified S12 (12 sulfate groups in an 8-mer) as the best-binding oligosaccharide, and its 3D structure was derived from the canonical helical structure of heparin (PDB ID 1HPN, ¹C₄ conformer) (29).
Previous studies showed that the binding of heparin and HS to polypeptides is ionic in nature (40-42). The charge-based interactions between the acidic substituents on the polysaccharide and basic residues on the polypeptide are reported to dominate the interface, and charges have to be in an appropriate 3D pattern (43). For example, FGF1 proved to prefer a specific pattern of sulfate groups in a specific spatial distribution (44). Following this evidence, a match between charge clusters was sought by means of 3D molecular graphics.
Indeed, the sulfates of GlcNSᵢ₋₃-IdoA2Sᵢ-GlcNS6Sᵢ₊₁ (corresponding to GlcN₂-IdoA₅-GlcN₆ and GlcN₄-IdoA₇-GlcN₈), lying on the same side of the helix, form a pattern with distances and angles coherent with those of the charged side chains of KPRR, and a specific geometry of interaction of charges is suggested (yellow dashed lines in Figure 5). Similar results hold for the ¹C₄ and ²S₀ cyclic forms of the oligosaccharide. On an 8-mer saccharide, this pattern is found twice on opposite sides of the helix, possibly interacting with two different NT4 peptide arms.
This interaction model also explains the almost total loss of binding for S04 (N-sulfates only), where alternate side sulfates are unable to form any negative charge cluster (Figure 5) that could fit with the positive cluster of NT4.
The in silico modeling provides a theoretical picture of the interaction that can help in understanding the binding activity of NT4. In particular, the fact that the oligosaccharide has two negative clusters on opposite sides of the molecule could reinforce the hypothesis of multiple binding with NT4.

DISCUSSION
HSPG are synthesized by most animal cells, but due to the variable composition and sulfation of their GAG chains, their ability to interact with specific ligands may be modulated under different physiological and pathological conditions, including cancer. Tumor stroma is composed of the ECM, including proteoglycans, fibronectin, collagen, cytokines, and growth factors. Cells that populate the tumor stroma, like immune system cells, fibroblasts, and endothelial cells, together with tumor cells, can modify the stroma as the tumor evolves. The ECM of the tumor stroma is very different from that of normal tissues (1) due to tumor remodeling that also triggers tumor invasiveness (1). HSPG accumulate in remodeled stroma and are, in turn, modified on their glycosidic chains by tumor-dependent glycosyltransferases, sulfotransferases, sulfatases, and heparanases (6,45). The presence and amount of these GAG-related enzymes help identify high-risk tumors and develop targeting therapies (46).
In colon tumors, for example, significant upregulation of extracellular sulfatases SULF-1/2 has been observed and may indicate general alteration of HS 6-O-sulfation patterns in colon tumors (47).
As discussed in the introduction, hundreds of different extracellular regulatory proteins, such as growth factors, chemokines, and morphogens, also involved in cancer, interact with the GAG portion of HSPG, requiring specific glycoside sequences and sulfation patterns (23).
The peculiar post-translationally regulated variability of HSPG has made it difficult to study their activity in cancer cell biology.
NT4 is already known to have major effects on cancer cells, such as inhibition of migration and invasion of ECM induced by FGF (20).
We examined the expression of syndecans and glypicans in a panel of cancer cell lines that NT4 binds. The binding affinity of NT4 with human rHSPG expressed by these cells was then analyzed by SPR. NT4 did not bind syndecan-3, but it bound glypican-3 and -4, and also syndecan-4, but with one fifth of the affinity shown for glypicans. Glypicans and syndecans have different GAG chains: glypicans only carry HS chains, whereas syndecans-2 and 4 have HS chains and syndecans-1 and 3 have HS and CS chains (4,48). Besides, HS posttranslational modifications occur in clusters, i.e., HS has some domains that are more densely sulfated than others. For example, the FGF binding domain that has 2-, 6-, and N-sulfation, carries seven sulfated groups in five residues, whereas the anti-thrombin binding domain contains six sulfated groups in five residues. In contrast, CS has more homogeneously sulfated patterns with long tracts carrying an average of four sulfates every five residues (49).
The NT4 affinity profile is therefore consistent with our previous results that showed a preference of the peptide for HS chains featuring patches of dense sulfation, compared to CS (49).
Frontiers in Oncology | www.frontiersin.org
Another important finding regarding NT4 recognition of sulfated GAG chains came from the analysis of sulfatase expression in the same panel of human cancer cell lines. NT4 binding to the cancer cell membrane was inversely correlated with expression of sulfatases. NT4 binding was higher in cell lines with lower expression of sulfatases, particularly SULF-1, i.e., HT-29 and PANC-1, confirming the determinant role of 6-O-sulfate groups for recognition by NT4.
Using 8-mer and 9-mer HS oligosaccharides with analog disaccharide composition and different sulfation sites, a possible recognition motif was identified that includes repeated 6-O-sulfates alternating with N- and/or 2-O-sulfates. This finding is again consistent with the preference of NT4 for HS more than for CS. CS carries GAG chains with 2-O-sulfates and 4-O-sulfates, whereas HS has 6-O-sulfates alternating with N- or 2-O-sulfates.
The possible structure of the NT4-sulfated oligosaccharide complex was then reconstructed by molecular modeling, taking into account our information on amino acids in NT4 sequences, i.e., KPRR, previously demonstrated to be essential for heparin and HS binding (13,19). The modeling showed that the distance between the crucial positive residues of NT4 is compatible with ionic interaction with sulfates on the oligosaccharide. Moreover, assuming a helical structure of the oligosaccharide, which is considered usual for sulfated oligosaccharides, sulfate groups lying on opposite sides of the helix can interact with positive residues on two peptide sequences of the branched structure, thus favoring multivalent binding and explaining the high affinity and selectivity of NT4 for highly sulfated GAGs. Being a branched peptide, NT4 can bind multivalently to repeated domains on the same GAG chain or on different GAG chains of the same HSPG, improving binding affinity. Specificity of GAG ligand binding, which allows formation of the GAG-ligand-receptor complex that triggers signal transduction, is mediated by multivalent electrostatic interactions between GAGs and growth factors or proteins of the ECM. The presence of binding sites of growth factors and proteins on GAG chains is no longer disputed, and the exact structure and motifs of the recognition patterns are being explored (23,50,51).
NT4 and possibly newly selected branched peptides can be designed and used to unravel the exact structure of binding sites on GAG chains. These tools will be essential probes for reconstructing binding sites for cancer-involved ligands on GAGs, paving the way for new cancer detection and treatment options.

DATA AVAILABILITY
All datasets generated for this study are included in the manuscript and/or the supplementary files.