Defining the Extracellular Matrix of Rhabdomyosarcoma

Rhabdomyosarcoma (RMS) is the most common soft-tissue sarcoma of childhood with a propensity to metastasize. Current treatment for patients with RMS includes conventional systemic chemotherapy, radiation therapy, and surgical resection; nevertheless, little to no improvement in long term survival has been achieved in decades—underlining the need for target discovery and new therapeutic approaches to targeting tumor cells or the tumor microenvironment. To evaluate cross-species sarcoma extracellular matrix production, we have used murine models which feature knowledge of the myogenic cell-of-origin. With focus on the RMS/undifferentiated pleomorphic sarcoma (UPS) continuum, we have constructed tissue microarrays of 48 murine and four human sarcomas to analyze expression of seven different collagens, fibrillins, and collagen-modifying proteins, with cross-correlation to RNA deep sequencing. We have uncovered that RMS produces increased expression of type XVIII collagen alpha 1 (COL18A1), which is clinically associated with decreased long-term survival. We have also identified significantly increased RNA expression of COL4A1, FBN2, PLOD1, and PLOD2 in human RMS relative to normal skeletal muscle. These results complement recent studies investigating whether soft tissue sarcomas utilize collagens, fibrillins, and collagen-modifying enzymes to alter the structural integrity of surrounding host extracellular matrix/collagen quaternary structure resulting in improved ability to improve the ability to invade regionally and metastasize, for which therapeutic targeting is possible.


INTRODUCTION
Rhabdomyosarcoma (RMS) are highly malignant tumors known to phenocopy some of the early events in skeletal muscle embryogenesis but are also known to arise from tissues not known to contain striated muscle or muscle stem cells (1). Embryonal rhabdomyosarcoma (eRMS) and alveolar rhabdomyosarcoma (aRMS) are the major histological subtypes. eRMS is the most common subtype accounting for half of all RMS cases and occurs most often in the head, neck, and genitourinary tract (2). aRMS usually occurs in adolescents and young adults and is commonly found in the trunk and extremities.
To address the pressing and unmet clinical need for new treatments of soft tissue sarcomas, we have generated multiple genetically-engineered mouse (GEM) models of aRMS, eRMS and the undifferentiated pleomorphic sarcoma (UPS) subtype of nonrhabdomyosarcoma soft-tissue sarcoma (NRSTS) (3-7) (Figure 1). Lymphatic and hematogenous (pulmonary) metastasis is a predominant feature (4) and a primary cause of mortality in these models (9). We characterized these soft tissue sarcoma models and demonstrate them to be representative of the human diseases by histopathology, gene expression and other features (3,4,8,10). Furthermore, these conditional models have the special features of knowing the cell-of-origin as well as the mutational profile making them a valuable tool to study RMS ( Figure 1).
Very little is known about the composition of the RMS tumor microenvironment (TME). In this study, we have uncovered that type XVIII collagen alpha 1 (encoded by COL18A1) is expressed highly in aRMS and eRMS. Interestingly, the variant of rhabdomyosarcoma cell cultures that produce COL18A1 is the 'alveolar' subtype, thus called because rich collagen stroma encasing the tumor cells is reminiscent of lung histology.
In addition to analyzing the expression of collagens in RMS, we also examined the expression of the post-translational collagenmodifying enzymes such as prolyl 3-hydroxylase (P3H2, LEPREL1), lysyl hydroxylase 1 (LH1, PLOD1) and lysyl hydroxylase 2 (LH2, PLOD2) which facilitate collagen maturation (reviewed in (11)). Whereas prolyl 4-hydroxylation of the Yaa position of Gly-Xaa-Yaa by P4H enzyme is ubiquitous and provides stability to the collagen triple helix, prolyl 3-hydroxylation by P3H2 in the Xaa position of a Gly-Xaa-4Hyp sequence can be far less frequent-yet biologically essential for processes such as the maternal-fetal interface with respect to collagen IV (12). Like type IV collagen, type XVIII collagen (COL18) contains a relatively large number of Gly-Pro-Hyp sequences (13). Lysyl hydroxylases target the Yaa position lysine residues in Gly-Xaa-Lys tripeptides (11). This posttranslational lysyl hydroxylation allows subsequent glycosylation of hydroxylysine. Hydroxylysines and O-linked glycosylation of hydroxylysines within procollagen are essential to the formation of intra-and inter-molecular crosslinks (11). The critical underlying biology of the extracellular matrix not only has implications for histological appearance, but may also create therapeutic opportunities (14), which motivated this study.

Murine RMS Cell Cultures
All murine cell cultures were isolated from tumors generated in house by our genetically engineered mouse model (GEMM). Isolation of primary tumor cells and culturing method have been previously described (15,16). Briefly, tumors are dissected from mice under sterile conditions. Tumors are digested over night at 4°C in a 1% collagenase IV solution (17104019, Sigma Aldrich, St Louis, MO, USA) diluted in DMEM. After 18 h, cells are briefly vortexed then passed through a 70 mM filter. Cells were centrifuged at 200 g for 5 min, and the cell pellet was resuspended in 13 mL growth media then transferred to a T75 tissue culture flask. Cells were expanded at 37°C in 5% CO 2 .

Immunoblotting
For the analysis of secreted proteins, the medium was collected and proteins were precipitated with 0.2 mg/mL ammonium sulfate, centrifuged, dissolved in Tris Buffered Saline (TBS), and run on the SDS-PAGE. Western blotting with antibodies against COL18A1 (anti-NC1, 1087+ (20); anti-NC11 1112+ obtained from Dr. Takako Sasaki (unpublished)) was performed after transfer to a nitrocellulose membrane.
The Anti-NC11 antibody has not yet been published, though an associated manuscript is currently in review, and a general method is provided. Polyclonal rabbit anti-NC11 antibody was raised as follows: the RNA isolated from the mouse endothelioma cell line eEnd.2 was used to amplify reverse transcriptase PCR to obtain complementary DNA (cDNA) encoding the TSP1-like domain (NC11) of mouse collagen XVIII. The amplified cDNA was inserted into the episomal expression vector pCEP-Pu, which contained in addition the signal peptide of BM-40, to transfect human EBNA-293 cells (22). The recombinant TSP1 domain was purified from conditioned medium using dimethylaminoethanol (DEAE) cellulose followed by Superose 6 molecular sieve chromatography and then used for immunizing of a rabbit. The specificity of all the anti-NC11 antibody was demonstrated by a complete lack of staining with tissues from Col18a1 −/− mice.

Tissue Microarrays
Four samples of formalin-fixed paraffin embedded (FFPE) human RMS were available to include in a custom mouse model tissue microarray (TMA) (1× aRMS, 3× eRMS). Fortyeight murine model sarcomas were also included, representing developmental stages and genotypes including early myoblast (origin), postnatal stem cell (origin), maturing myofiber (origin), Pax3:Foxo1-expressing, Trp53 wild type or mutated and Rb1 wildtype or mutated (5,7,8). Four samples of murine normal skeletal muscle were included in a custom mouse model tissue microarray (TMA) as the negative control.
The TMA was stained with a standard H&E stain for histologic verification. Co-author AM classified each tumor as nonrhabdomyosarcoma and rhabdomyosarcoma. The latter was further divided into aRMS, eRMS, pleomorphic RMS and RMS not otherwise specified (RMS NOS). We have included a detailed spreadsheet file as Supplementary  Table 1). This TMA is publicly available by request.
Each IHC stain was scored by co-author SHG for intensity and percentage of positive cells in each individual tumor sample (some tumors were represented more than once in the TMA). The intensity of each stain was scored accordingly: 0-negative; 1indeterminate; 2-weak positive; 3-strong positive. The intensity and percentages were averaged for all cases containing more than one TMA fragment. Scoring of intensity was further simplified in a binary scheme where average intensity of greater than 1.5 was considered positive, and 1.5 or less was considered negative.

Immunocytochemistry Staining
Cells were allowed to recover post sort in culture for 2 to 3 days then replated in eight chamber slides at low confluency and allowed to incubate for 2 to 3 days. Cytology was conducted using standard protocol. Briefly, Cells were then fixed in 4% paraformaldehyde in PBS at room temperature for 20 min. After protein and streptavidin/biotin blocking steps, cells were washed again, and primary antibody diluted in 5% NGS in PBS was added overnight at 4°C. Alexafluor 546 conjugation-based antibody (Invitrogen) at 1:200 was added, and cells were incubated for 1 h at room temperature. Slides were mounted using the VECTASHIELD Mounting Medium with 40, 6diamidino-2-phenylindole (Vector Laboratories) and visualized with a Zeiss LSM 700 confocal microscope.

RNA Sequencing Analysis
Tissue for RNA sequencing experiments were obtained from flash frozen tissue samples obtained from biopsy or autopsy tissue (for human subjects) or from flash frozen tumor tissues preserved following necropsy (for murine models). All human tumor tissues were obtained through the pediatric tumor tissue banking program denoted CureFAST following consent by patient and/or patient's guardians. The CureFAST program is overseen by the Independent Review Board (IRB) that oversees all human-related research at the Children's Cancer Therapy Development Institute.
Raw transcriptome sequencing data as short read fastq files were aligned to the GRCh38 human reference genome for human data and GRCm38 murine reference genome for mouse data using STAR aligner (23), and aligned transcripts were subsequently quantified for overall expression using RSEM (24). Normal tissue gene expression data matched to individual samples was generally unavailable, thus region-specific skeletal muscle tissue gene expression data was accessed from the Genotype-Tissue Expression (GTEx) project to serve as a population normal and enable comparative expression analysis (25). Sequencing data availability information is provided in the Data Availability statement.

Statistical Analysis
Binary immunohistochemistry expression values were compared between histologic groups using the Fisher exact test. Statistical analyses were performed using R statistical software (version 3.3.1; R Foundation, Vienna, Austria) with the R commander graphical user interface.
RNA expression comparisons between histologic groups were analyzed using one-way analysis of variance (ANOVA) for cohort comparisons, and individual comparisons between relevant groups (human tumor vs. human cell line, human tumor vs. human normal, human cell line vs. human normal, murine tumor vs. murine normal) were made using the two-stage linear step-up procedure FDR adjustment of Benjamini, Krieger and Yekutieli for multiple comparisons. Statistical significance was set at *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001.
Error bars indicate mean ± SD or SEM. Statistical analysis was performed with GraphPad Prism 7.0.

COL18A1 Expression Is Elevated in RMS
COL18A1 showed significantly increased expression at the mRNA level in human and murine aRMS and eRMS tumors relative to normal muscle (Figures 2 and 3). COL4A1, FBN1, and FBN2 also showed significant overexpression versus normal muscle in both human aRMS and eRMS, and COL4A1 was significantly expressed in murine aRMS (Figures 2 and 3). Fibrillin-1 (FBN1) and fibrillin-2 (FBN2) are two subtypes of the fibrillin glycoprotein incorporated into elastic tissue in the extracellular matrix, and FBN2 is a known immunohistochemical biomarker of eRMS (26), which was reflected at the RNA level for eRMS more so than aRMS. Increased expressions of COL4A2, PLOD1, and PLOD2 were seen for human RMS tumors ( Figures  2 and 3). For NRSTS, Col18a1 and Plod2 (but not Fbn1 and Fbn2) were elevated in mouse tumors (Figure 4). Statistical tests comparing RNA expression values are presented in Supplementary Table 2.
Given that RNA studies were conducted on whole tumor, which includes tumor cells and stromal cells, we sought to understand and further differentiate the source of these collagen and collagenassociated gene product expressions by immunohistochemistry on custom tissue microarrays. The expressions of COL18A1, PLOD1, PLOD2, FBN2, COL4A1, and COL4A2 were present in both murine aRMS and murine eRMS (Table 1 and Figure 5). However, FBN1 was not expressed in murine aRMS and was weakly expressed in only one model of murine eRMS (Table 1 and Figure 5). Models presented as representative images for IHC results include murine aRMS models (U21459, U24988, U48484, U20745, and U23674),   a human aRMS model (pcb00380), murine eRMS models (U37125, U34278, U24085, and U24055), and a human eRMS model (pcb00232). The complete set of 48 murine RMS models and four human RMS models used for IHC analysis and associated scores are provided in Supplementary Table 3. Undifferentiated sarcomas, which include a variety of morphologic phenotypes from spindled to epithelioid and pleomorphic expressed all collagen-modifying enzyme to varying degrees ( Table 1). No statistically significant trends were found in human RMS due to small sample size. Given the visual similarity between staining of extracellular matrix proteins and IHC background, a cohort of positive control images showing no background when used on skeletal muscle tissues are provided (Supplementary Figure 1). Comparison of RMS versus undifferentiated sarcomas in mice showed a significantly higher expression of PLOD2 in RMS (p = 0.044). When comparing aRMS to eRMS, a significantly higher expression of PLOD1 was seen in the aRMS subtype (p = 0.05) ( Table 1). In general, murine sarcomas with Rb1 nullizygosity were associated with the undifferentiated morphology, except for one case identified as aRMS which also showed lower expression of COL18A1, COL4A1, and PLOD2 compared to Rb1 wildtype sarcomas (p = 0.0035, 0.04, and 0.08, respectively).
When comparing samples based on cell-of-origin (early myoblast, postnatal stem cell and maturing myoblast), a significant increase in PLOD1 was seen in cases of early myoblast and maturing myoblast origin RMS compared to the postnatal stem cell origin (p = 0.04). No other significant differences in cell or lineage-of-origin were seen across the other IHC markers.
To visualize the intracellular versus extracellular localization of the ECM related protein in RMS cell lines, we performed immunocytochemistry (ICC) across several cultured murine RMS cells which showed expression of FBN1, the NC2 domain of COL16, COL4A1 as well as the NC1 and NC11 domains of COL18A1 ( Figure 6). Murine cell cultures used for ICC included FIGURE 6 | Immunocytochemical detection of ECM-related proteins in murine sarcoma models. Expression of ECM proteins in primary cell cultures derived from murine RMS. Immunocytochemistry using five independent antibodies against fibrillin-1, collagen XVI, collagen IV and two antibodies recognizing NC1 and NC11 domains of collagen XVIII. For each antibody, staining was intracellular and/or extracellular. Murine cell cultures used for ICC included U23674 (murine aRMS), U48484 (murine metastatic aRMS), U33915 (murine spindle cell eRMS), U57810 (murine eRMS), and U34279 (murine pleomorphic sarcoma). U23674 (murine aRMS), U48484 (murine metastatic aRMS), U33915 (murine spindle cell eRMS), U57810 (murine eRMS), and U34279 (murine pleomorphic sarcoma). Intracellular and secreted (extracellular) COL18A1 was readily detectable by ICC ( Figure 6). Secreted COL18A1 in the conditioned media was also detected by immunoblotting (Figure 7).

Clinical Significance of the Expression of COL18A1
Biopsy samples were studied for gene expression analysis as previously described (27). In the biopsy samples provided by the Intergroup Rhabdomyosarcoma Study-IV (IRS-IV) (27), higher expression of COL18A1, COL4A1, and COL4A2 was correlated with worse outcomes in human RMS patients (Figure 8), with the alveolar subtype showing worsened survival compared to other RMS subtypes ( Figure 8). Additionally, increased expression of FBN1 and FBN2 was correlated with worsened survival in aRMS ( Figure 8); however, statistical significance is not seen for FBN1 and FBN2 when examining all RMS ( Figure 8). IRS-IV samples used for analysis consisted of every available aRMS and eRMS sample in the IRS-IV database for which expression data was available. Samples available differed between groups.

DISCUSSION
The structural integrity of collagen fibers is reliant on a balance of post-translational modification and extracellular matrix enzyme composition, which can be significantly altered in tumors. To date, studies of the RMS TME are limited. However, mutations in COL2A1 have recently been associated with chondrosarcoma risk (28).
As individual prognostic markers, increased COL18A1, COL4A1, and COL4A2 expressions are all associated with decreased long-term survival in all pediatric RMS subtypes, with the alveolar subtype showing worsened survival versus other subtypes. aRMS is most often fatal when metastatic (29), yet recent studies of related soft tissue sarcomas suggest that collagen subtype and modification can determine the ease with which these sarcomas metastasize (14); furthermore, altering collagen modifications is therapeutically amenable with FDA-approved agents that suppress metastasis in mouse soft tissue sarcoma models. From the perspective of therapeutic opportunities, another study has introduced a 3D murine model recapitulating the in vivo structure of aRMS and the ECM, which has promising possibilities for tumor behavior and therapeutic exploration (30).
In this study, we have shown that RMS expresses specific collagen proteins (COL18A1, COL4A1) at significantly increased levels in pediatric RMS relative to normal muscle; this expression in turn represents a worse prognosis (COL18A1, COL4A1, COL4A2). Strong IHC expression of COL18A1 was shown to have a worse outcome in pulmonary carcinomas (31) similar to COL18A1 RNA expression in RMS. Moderate to strong COL18A1 IHC staining is also seen in greater than twothirds of murine and human RMS samples in our study. Due to small sample size in human RMS, statistical analysis is limited to data generated from the murine RMS cohort. The dysregulated expression of COL18A1 and the worse prognosis associated with increased expression raises the possibility that COL18A1 mediates RMS metastasis. Furthermore, modification or cleavage of COL18A1 (e.g. endostatin) perhaps alters tumor biology e.g., dysregulating tumoral angiogenesis or increasing metastatic potential (32). In our exploration of tumoral production of collagen, we also sought to explore the possibility of RMS utilizing enzyme production to alter the collagen matrix of surrounding tissues. This phenomenon in theory is an efficient way for a tumor originating in a stromal environment to quickly spread locally and gain access to blood supply/nutrition. Both PLOD1 and PLOD2 were shown to have increased overall RNA expression in human RMS relative to normal muscle. PLOD1 is an enzyme responsible for catalyzing the hydroxylation of lysyl residues in collagen-like peptides and is known to be deficient in patients with kyphoscoliotic form of Ehlers-Danlos syndrome (33). Although PLOD1 protein expression was present in less than 50% of RMS overall, a significantly larger number of murine aRMS samples showed PLOD1 expression compared to eRMS; however, this finding did not correlate with RNA expression patterns as PLOD1 expression in both murine aRMS and eRMS is statistically indistinguishable. A significant increase in PLOD1 RNA expression was seen in both human aRMS and  eRMS relative to normal muscle; unexpectedly, PLOD1 protein expression is significantly higher in eRMS than aRMS.
Previous studies have shown correlation of PLOD2 expression with metastatic potential in soft tissue sarcoma (14) akin to our discoveries in RMS. The fact that a larger number of aRMS show increased PLOD1 expression compared to eRMS may be related to the possibility that the more aggressive alveolar subtype utilizes hypoxia-dependent mechanisms to break down collagen. However, PLOD1/2 RNA expression did not correlate with worse outcome when looking at all RMS samples as a cohort and adjusting for stage.
In an attempt to isolate a specific tumoral mechanism based on the myogenic cell-of-origin, we analyzed whole transcriptome sequencing data segmented across murine aRMS and eRMS, and protein expression by IHC in RMS segmented by disease indication and by cell-of-origin (early myoblast, postnatal stem cell and maturing myoblast). Protein expression analysis by IHC yielded one important finding: significantly fewer cases of RMS with postnatal stem cell as cell-of-origin showed PLOD1 positivity compared to other cellsof-origin. Given that cell-of-origin influences pharmacological response (5), absence of PLOD1 positivity could potentially have diagnostic or even therapeutic value. Note that in the context of murine RMS models which have undergone whole transcriptome sequencing, the cell-of-origin status cleanly separated by histological diagnosis of aRMS or eRMS. Thus, we did not perform cell-of-origin based statistical comparison of extracellular matrix protein gene expression.
A more aggressive, undifferentiated morphology was associated with the Rb1 nullizygous mouse sarcomas and was not seen in eRMS subtypes. The Rb1 nullizygous sarcomas had a significantly lower number of tumors expressing COL18A1, COL4A1, and PLOD2. As described above, expression of COL18A1 and PLOD2 is associated with increased tumor infiltration, metastasis, and worse overall survival and thus offer potential therapeutic targets. This preliminary data suggests these targets are present in a higher number of aRMS/ eRMS cases than in undifferentiated sarcomas and may represent a treatment opportunity.
Overall, our findings imply that RMS produces an imbalance in expression of a variety of collagens and collagen-modifying enzymes implicated in tumor growth and metastasis. COL18A1 expression is significantly increased in all RMS compared to normal muscle and is associated with worse overall survival. Our identification of overexpression of additional collagen markers at the RNA and protein level, specifically PLOD1 and PLOD2, may be considered for evaluation with potential FDA-approved or investigational therapies targeting these enzymes in future studies.