Mapping Autoantibodies in Children With Acute Rheumatic Fever

Background Acute rheumatic fever (ARF) is a serious sequela of Group A Streptococcus (GAS) infection associated with significant global mortality. Pathogenesis remains poorly understood, with the current prevailing hypothesis based on molecular mimicry and the notion that antibodies generated in response to GAS infection cross-react with cardiac proteins such as myosin. Contemporary investigations of the broader autoantibody response in ARF are needed to both inform pathogenesis models and identify new biomarkers for the disease.
Methods This study used a multi-platform approach to profile circulating autoantibodies in ARF. Sera from patients with ARF, matched healthy controls and patients with uncomplicated GAS pharyngitis were initially analysed for autoreactivity using high content protein arrays (ProtoArray, 9000 autoantigens), and further explored using a second protein array platform (HuProt Array, 16,000 autoantigens) and 2-D gel electrophoresis of heart tissue combined with mass spectrometry. Selected autoantigens were orthogonally validated using conventional immunoassays with sera from an ARF case-control study (n=79 cases and n=89 matched healthy controls) and a related study of GAS pharyngitis (n=39) conducted in New Zealand.
Results Global analysis of the protein array data showed an increase in total autoantigen reactivity in ARF patients compared with controls, as well as marked heterogeneity in the autoantibody profiles between ARF patients. Autoantigens previously implicated in ARF pathogenesis, such as myosin and collagens, were detected, as were novel candidates. Disease pathway analysis revealed several autoantigens within pathways linked to arthritic and myocardial disease. Orthogonal validation of three novel autoantigens (PTPN2, DMD and ANXA6) showed significant elevation of serum antibodies in ARF (p < 0.05), and further highlighted heterogeneity, with patients reactive to different combinations of the three antigens.
Conclusions The broad yet heterogeneous elevation of autoantibodies observed suggests that epitope spreading, with an expansion of the autoantibody repertoire, likely plays a key role in ARF pathogenesis and disease progression. Multiple autoantigens may be needed as diagnostic biomarkers to capture this heterogeneity.


INTRODUCTION
Acute rheumatic fever (ARF) is a serious multi-focal autoimmune sequela of Group A Streptococcal (GAS) infection, presenting with a combination of signs and symptoms including one or more of the major manifestations used for diagnosis as part of the Jones criteria (1, 2): arthritis, carditis, Sydenham's chorea, erythema marginatum and subcutaneous nodules. Approximately 60% of ARF cases progress to chronic rheumatic heart disease (RHD), which can cause permanent heart valve damage (3), with an estimated 33 million people living with RHD globally (4). Although ARF rates declined over the twentieth century, the disease persists in low-income countries and amongst disadvantaged communities in some high-income countries, with Indigenous Māori and Pacific children in New Zealand and Aboriginal children in Australia having some of the highest incidences in the world (5,6).
The clinical manifestations of ARF usually develop 2-4 weeks after a GAS pharyngitis infection, with growing evidence also implicating GAS skin infections in disease (7). The pathogenesis pathway for ARF remains poorly understood. The current hypothesis involves "molecular mimicry", wherein antibodies initially targeting specific GAS components are proposed to cross-react with human tissue (8,9). This is largely based on M-protein specific antibodies and T-cells which cross-react with cardiac myosin, laminin and tropomyosin antigens found in the heart and synovium (9). The role of molecular mimicry remains the subject of debate, with an alternative hypothesis suggesting that infection with GAS causes disruption of the extracellular matrix, which exposes cryptic collagen epitopes, and triggers an autoimmune response (10,11). Additional autoantibodies could then be generated due to increased inflammation, subsequent tissue damage and epitope spreading (12).
There are few contemporary studies investigating the broader autoantibody response in ARF and a lack of application of unbiased approaches to study the ARF autoantibody repertoire. Protein microarray technologies enable quantification of autoantibody responses to large proportions of the human proteome (13). These technologies have been used to identify novel autoantibodies and associated disease pathways in a broad range of immune-mediated diseases, including lupus, some cancers and the recently described Multisystem Inflammatory Syndrome in Children (MIS-C) that can develop following SARS-CoV-2 infection (14)(15)(16). This study aimed to apply high-content protein microarray technology to ARF to enable a comprehensive analysis of the disease's autoantibody landscape. This unbiased array-based approach was taken to inform antibody-driven pathogenesis and identify possible novel disease biomarkers.

Protein Microarrays
Human ProtoArrays (protein microarray platform v5.0) were run following the manufacturer's instructions to detect serum autoantibodies (ThermoFisher, Massachusetts, USA). Samples were diluted 1:500 and antibody binding was detected with an Alexa Fluor 647-labelled goat anti-human IgG antibody. Arrays were scanned using a GenePix4000B microarray scanner and array grids were aligned using the GenePix Pro 5.0 software (Molecular Devices). Raw data were background corrected using the "saddle" correction (17) from the Bioconductor limma package (18), and data were quantile normalized followed by differential expression statistical analysis using linear models and empirical Bayes statistics with the limma package. Protein antigens with p < 0.05 and a fold-change of > 2.0 were considered significant. For proteins with duplicated identifiers (proteins with more than one variant on the arrays), the variant with the highest absolute fold-change was kept for further analysis. Autoantigens were cross-validated using HuProt v3.0 arrays conducted by CDI Laboratories (Baltimore, USA). Serum antibodies were detected using an Alexa Fluor 532-labelled anti-human IgG secondary antibody and data were quantile normalized as previously described for HuProt arrays (19). Proteins with p < 0.1 and fold-change > 1.5 were considered significant.
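The normalization and fold-change thresholding described above can be illustrated with a minimal sketch. It is written in Python rather than R and does not reproduce limma's background correction or its moderated statistics, so the p-value criterion is omitted. The function names `quantile_normalize` and `fold_change_hits` and the toy intensities are assumptions made for illustration only.

```python
from statistics import mean


def quantile_normalize(arrays):
    """Quantile-normalize equal-length intensity lists (one list per array).

    Each value is replaced by the mean of the values that hold the same
    rank across all arrays. Ties are broken by position (a simplification).
    """
    n = len(arrays[0])
    sorted_cols = [sorted(a) for a in arrays]
    rank_means = [mean(col[i] for col in sorted_cols) for i in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # feature indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def fold_change_hits(case_means, control_means, names, threshold=2.0):
    """Return names whose case/control ratio exceeds the threshold.

    Only the fold-change criterion is applied here; the significance test
    used in the study is not reproduced.
    """
    return [
        name
        for name, case, ctrl in zip(names, case_means, control_means)
        if ctrl > 0 and case / ctrl > threshold
    ]


# Toy data: two arrays with three features each (hypothetical values).
print(quantile_normalize([[5, 2, 3], [4, 1, 6]]))
print(fold_change_hits([10, 3, 8], [4, 2, 4], ["A", "B", "C"]))
```

The strict inequality mirrors the "> 2.0" wording, so a feature with a ratio of exactly 2.0 is not reported as a hit.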

Array Analysis and Visualizations
Analysis and visualizations were carried out in R (version 4.0.2) within RStudio (version 1.2.5042) using the tidyverse suite of packages (20). UpSet plots were produced using the ComplexHeatmap package (21). Venn diagrams were produced using jvenn (22). Heatmaps and hierarchical clustering (average linkage, Euclidean distance) were carried out using Morpheus (https://software.broadinstitute.org/morpheus). Disease pathway analysis of differentially bound proteins was carried out in Metascape, using a custom analysis for enrichment in DisGeNET disease pathways (23,24). Tissue specificity of proteins was assessed using "Normal tissue data" downloaded from the Human Protein Atlas (HPA) (https://www.proteinatlas.org/about/download) (25). Data from the HPA were filtered using both the "reliability score" and "level". The COMPARTMENTS database was also used to filter proteins via the "confidence score" (26). The filtering parameters applied were: an enhanced or supported "reliability score" and a high or medium expression "level" in heart muscle, as well as a "confidence score" of > 4 for plasma membrane or extracellular space localization.
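The filtering rule stated above can be written as a small predicate. This is a sketch only: the record layout, the field names (`reliability`, `heart_level`, `plasma_membrane_conf`, `extracellular_conf`) and the entries P1 to P3 are hypothetical and do not correspond to the actual column names of the HPA or COMPARTMENTS downloads.

```python
def passes_filters(protein):
    """Apply the expression and localization filters to one protein record."""
    hpa_ok = (
        protein["reliability"] in {"Enhanced", "Supported"}
        and protein["heart_level"] in {"High", "Medium"}
    )
    # Either compartment may satisfy the confidence threshold (> 4).
    localization_ok = max(
        protein["plasma_membrane_conf"], protein["extracellular_conf"]
    ) > 4
    return hpa_ok and localization_ok


# Hypothetical records for illustration; not real database entries.
candidates = [
    {"name": "P1", "reliability": "Enhanced", "heart_level": "High",
     "plasma_membrane_conf": 5, "extracellular_conf": 2},
    {"name": "P2", "reliability": "Approved", "heart_level": "High",
     "plasma_membrane_conf": 5, "extracellular_conf": 5},
    {"name": "P3", "reliability": "Supported", "heart_level": "Medium",
     "plasma_membrane_conf": 4, "extracellular_conf": 3},
]
kept = [p["name"] for p in candidates if passes_filters(p)]
print(kept)
```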
To compare the individual antigens, and combination of antigens, for performance in discriminating ARF from control groups, the receiver operating characteristic curve (ROC) and area under the curve (AUC) values (from a logistic regression model) for each antigen or combination of all three antigens were calculated by comparing ARF and control samples using the pROC package (27). Confidence intervals for AUC and differences in AUC were obtained using bootstrapping (n=2000) implemented in the pROC package.
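As a rough illustration of the ROC analysis for a single antigen, the sketch below computes an empirical AUC and a percentile bootstrap confidence interval. It is a Python approximation, not the pROC implementation: the logistic regression used for the three-antigen combination is not reproduced, resampling within each group is an assumption, and the function names and toy values are hypothetical.

```python
import random


def auc(cases, controls):
    """Empirical AUC: probability that a case scores above a control.

    Ties count as one half (the Mann-Whitney formulation).
    """
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def bootstrap_auc_ci(cases, controls, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC, resampling within each group."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        boot_cases = [rng.choice(cases) for _ in cases]
        boot_controls = [rng.choice(controls) for _ in controls]
        stats.append(auc(boot_cases, boot_controls))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy ELISA-like readings (hypothetical values).
toy_cases = [0.31, 0.42, 0.55, 0.18, 0.47]
toy_controls = [0.10, 0.22, 0.15, 0.30, 0.12]
print(auc(toy_cases, toy_controls))
print(bootstrap_auc_ci(toy_cases, toy_controls))
```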

Study Participants
Human sera were obtained from several studies conducted in New Zealand. Each had appropriate ethical approval, and all participants (or their proxies) provided written informed consent. All ARF cases were diagnosed according to the New Zealand modification of the Jones criteria (1, 28). The sera for ProtoArrays were from a study conducted in the Waikato District Health Board region (2012 to 2015; ethics CEN/12/06/017) including ARF (n=3) and GAS pharyngitis (n=3), as well as ethnically matched healthy controls (n=3) from the Auckland arm of the Children of SCOPE study (29). The sera for the HuProt arrays (ARF with carditis (n=7), ARF without carditis (n=5) and matched healthy controls (n=6)) and ELISA orthogonal validation (ARF n=79 and controls n=85) were from participants recruited as part of the Rheumatic Fever Risk Factors (RF RISK) study (30). This nationwide study conducted between 2014 and 2017 (ethics 14/NTA/53) included first-episode ARF patients and closely matched healthy controls (30). Controls were matched by age, ethnic identification, socioeconomic deprivation (using the New Zealand Deprivation Index score (31)) and geographic area. Sera from children with GAS-positive pharyngitis (n=39) used for orthogonal validation ELISAs were from participants recruited as part of a paediatric study investigating GAS skin and throat infections conducted in the Auckland region (2018-2019; ethics 17/NTA/262) (32).

Global Analysis of Autoantibody Reactivity in Acute Rheumatic Fever
The ProtoArrays initially utilized to profile the autoantibody response in ARF contain over 9000 human proteins expressed in insect cells. As autoantibodies are present in all individuals (33)(34)(35), serum binding from ARF patients (n=3) was compared to that of healthy children (n=3) and children with GAS-positive pharyngitis (n=3) as controls. Following array QC and normalization, the total antibody reactivity, or fluorescence intensity, was determined for each array. ARF arrays showed an increased number of total reactivities compared to controls (p < 0.0001) (Figure 1A), suggesting an overall increase in autoantibodies in ARF patient sera. The total reactivity observed on the ARF arrays was markedly increased compared to both the GAS-positive pharyngitis and healthy controls (Supplementary Figure 1), and as the overarching goal was to identify ARF-specific autoantibodies rather than those associated with GAS pharyngitis, the control groups were combined for the subsequent data analysis. The antibody reactivity signals for ARF patients were filtered to include only proteins with a > 2.0 fold-increase in fluorescence intensity compared to the mean of the combined controls (healthy and GAS pharyngitis). This selected for autoantibodies with stronger reactivity in ARF and enabled the autoantibody profiles of individual ARF patients to be compared. A total of 1013 autoantibodies showed > 2.0 fold-increased reactivity in ARF compared to the combined controls, with each of the ARF patients having similar numbers of proteins with an increased signal (687 in patient A1, 556 in patient A2 and 541 in patient A3) (Figure 1B). Nearly half (47%, 480/1013) of the autoantibodies were unique to an individual patient, with only 23% (238/1013) shared between all three ARF patients and the remaining 29% (295/1013) present in two ARF patients but not the third.
Taken together, these results show a global increase in autoantibodies in ARF sera with marked heterogeneity in the autoantibody profiles for each of the ARF patients assessed.
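The sharing categories reported above (unique to one patient, present in two, shared by all three) can be tallied as in the sketch below, which also checks that the reported category counts agree with the per-patient totals. The function name `sharing_profile` and the toy protein labels are assumptions for illustration; the numeric check uses only figures stated in the text.

```python
from collections import Counter


def sharing_profile(patient_hits):
    """Count how many proteins are found in exactly 1, 2, ... patients."""
    per_protein = Counter(
        protein for hits in patient_hits.values() for protein in set(hits)
    )
    by_multiplicity = Counter(per_protein.values())
    return {k: by_multiplicity.get(k, 0) for k in range(1, len(patient_hits) + 1)}


# Toy example with hypothetical protein labels.
toy = {"A1": {"p1", "p2", "p3"}, "A2": {"p2", "p3"}, "A3": {"p3", "p4"}}
print(sharing_profile(toy))

# Consistency check on the reported numbers: a protein seen in k patients
# contributes k times to the sum of the per-patient totals.
unique, in_two, in_three = 480, 295, 238
total_proteins = unique + in_two + in_three
summed_patient_totals = unique + 2 * in_two + 3 * in_three
print(total_proteins, summed_patient_totals, 687 + 556 + 541)
```

Both totals agree with the text: the three categories sum to 1013 proteins, and the weighted sum matches the combined per-patient counts.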

Autoantibodies Target Proteins in Relevant Disease Pathways
To identify differentially bound proteins in sera from ARF patients and perform pathway analysis, proteins with significantly elevated fluorescence intensity in the ARF group compared with the combined control group were selected (p < 0.05 and fold-change > 2.0). In total, 841 proteins were bound significantly more by ARF serum IgG, compared with 693 proteins in controls (Figure 2A and Supplementary Data). This is in line with the prior global analysis suggesting higher overall autoantibodies in the ARF group. Encouragingly, some proteins that have previously been implicated in the pathogenesis of ARF and RHD were identified (10). These included extracellular matrix proteins, fibronectin (36) and collagens (10, 37, 38) (FSD1L, COL-/2/9/14-A1), as well as intracellular proteins involved in muscle contraction, tropomyosins and myosin (39, 40) (TPM2/3 and MYL6) (Figure 2A). As the tropomyosin previously linked to ARF is cardiac tropomyosin (TPM1), a sequence alignment was performed with the TPM2 and TPM3 isoforms identified in this study. This showed high sequence identity between TPM1 and TPM2 (85.563%) and TPM3 (91.197%) (Supplementary Figure 2).
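Percent identity values such as those above are typically obtained by counting identical positions across the columns of a pairwise alignment. The sketch below shows that calculation under stated assumptions: the alignment tool and gap handling used in the study are not given, the denominator is taken as the full alignment length, and the toy fragments are hypothetical rather than real tropomyosin sequence.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both rows.

    Gap characters ("-") never count as matches; the denominator is the
    full alignment length (one of several conventions in use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Toy aligned fragments (hypothetical).
print(round(percent_identity("MDAIKKKMQ", "MEAIKKKMQ"), 3))

# Assumption, not stated in the text: if the alignments span 284 columns,
# 243 and 259 identical positions would reproduce the reported percentages.
print(round(100 * 243 / 284, 3), round(100 * 259 / 284, 3))
```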
To explore disease connections, the 841 proteins bound by ARF IgG were subjected to disease pathway analysis. This identified three noteworthy pathways, which were both significantly enriched (p < 0.01, fold-enrichment > 1.5) (Figure 2B and Supplementary Data) and related to two of the major criteria used to diagnose ARF (carditis and arthritis). The "Myocardial Ischemia", "Autoimmune Arthritis" and "Juvenile Rheumatoid Arthritis" disease pathways contain 33, 11 and 13 proteins targeted by autoantibodies in ARF sera, respectively. Unsupervised hierarchical clustering on these pathway proteins shows serum from ARF patients clustering separately from the control groups with respect to fluorescence intensity (Figure 2C). This indicates that ARF autoantibodies target a diverse range of proteins that are enriched in relevant disease pathways.

Multi-Platform Validation of Array Hits
To further explore and validate the ARF autoantibody repertoire, including antigens identified via the ProtoArray analysis, additional high-content arrays (HuProt arrays, > 16,000 human proteins, expressed in yeast cells) were conducted using a distinct cohort of patients: ARF patients with carditis (n=7), ARF patients without carditis (n=5) and healthy controls (n=6). A focussed analysis of the HuProt array data identified 158 human proteins bound significantly more by serum IgG in carditis patients compared to healthy controls (p < 0.1 and fold-change > 1.5) (Figure 3A). Interestingly, one of these proteins, ANXA6, was previously identified in a pilot mass spectrometry analysis of 2-Dimensional Electrophoresis (2-DE) separated human heart lysate and ARF sera conducted in our laboratory (Supplementary Methods). Nine of the 158 proteins identified via the HuProt analysis overlapped with those identified by the ProtoArrays (Figure 3B). When these nine proteins were filtered for expression in heart muscle [using Human Protein Atlas data (25)] as well as localization in or near the plasma membrane [using the COMPARTMENTS database (26)], just two proteins remained: PTPN2 and DMD (Figure 3B and Supplementary Data). When the same analysis and filtering was applied to the control groups, five overlapping proteins were identified, but none of these passed the filters for expression location. Plotting normalized fluorescence values from the HuProt arrays for PTPN2 and DMD plus ANXA6 illustrates the increased autoantibodies in ARF compared to healthy controls (Figure 3C).

Orthogonal Validation Using ELISA
To orthogonally validate hits from the arrays, ELISAs were performed with DMD, PTPN2 and ANXA6 as antigens. These antigens represent different aspects of ARF disease mechanisms including an immune cell signalling protein [PTPN2 (41)], a central component of the extracellular matrix in muscle fibre [DMD (42)] and a protein abundantly expressed in cardiomyocytes and chondrocytes during osteoarthritis [ANXA6 (43,44)]. A large cohort was used for validation comprising sera from children with first-episode ARF (n=79), closely matched healthy controls (n=85), as well as children with GAS pharyngitis (GAS positive throat swab and elevated streptococcal serology, n=39) (Supplementary Table 1). Significantly elevated autoantibodies were observed in the ARF patient group compared to both healthy controls and the GAS pharyngitis group for all three antigens (Figure 4A). The lack of reactivity to these antigens in the GAS pharyngitis group confirms that these autoantibodies are associated with disease, and not with the prior GAS infection. Receiver operating characteristic (ROC) curves were generated to assess the predictive performance of each of the three antigens alone as well as in combination to distinguish ARF sera from the combined controls (healthy and GAS pharyngitis) (Figure 4B).
The area under the curve (AUC) metric showed DMD had the best predictive performance (AUC = 0.857, CI: 0.805-0.904), followed by PTPN2 (AUC = 0.787, CI: 0.718-0.847) and ANXA6 (AUC = 0.642, CI: 0.565-0.716). DMD performed significantly better than PTPN2 (p < 0.05) which, in turn, performed significantly better than ANXA6 (p < 0.001). There was no significant gain in performance, above DMD alone, when combining all three assays (AUC = 0.861, CI: 0.809-0.908, p = 0.583).

[Figure legend fragment (upset plots): Autoantibodies shared between all three ARF patients are indicated in grey, those shared between at least two patients in blue and those unique to an individual patient in red. The number of autoantibodies in each category is indicated on vertical bar charts with respective colours. Upset plots include only antibodies that showed at least a two-fold enrichment when compared with the mean of grouped controls. Total number of autoantibodies identified using this threshold per patient is indicated on horizontal bar charts (Set size).]

To further explore the biomarker potential of these three antigens, cut-offs for positivity were determined using the Youden index (45). This method identifies the cut-off on the ROC curve that maximizes the sum of sensitivity and specificity; the resulting values were DMD = 0.190, PTPN2 = 0.274 and ANXA6 = 0.275 (Figure 4A). These cut-offs were applied and the ARF patients were categorized as having positive or negative reactivity to each antigen in the form of an autoantibody barcode (Figure 4C). In keeping with the superior predictive performance of DMD in the ROC analysis, this antigen yielded the highest number of positive ARF cases (62/79, 78%). However, the barcode also illustrates that patients with first episode ARF have every combination of biomarkers tested ranging from positive for one, two or three antigens through to negative for
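The Youden index cut-off selection described in this section can be sketched as follows. This is an illustrative Python version, not the analysis code used in the study: the function name, the convention that a reading at or above the cut-off is positive, and the toy readings are all assumptions.

```python
def youden_cutoff(cases, controls):
    """Return (cut-off, J) maximizing J = sensitivity + specificity - 1.

    Candidate cut-offs are the observed values; a reading is called
    positive when it is greater than or equal to the cut-off (assumed
    convention). The lowest cut-off wins if several share the maximum J.
    """
    best_cutoff, best_j = None, -1.0
    for t in sorted(set(cases) | set(controls)):
        sensitivity = sum(x >= t for x in cases) / len(cases)
        specificity = sum(y < t for y in controls) / len(controls)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_cutoff, best_j = t, j
    return best_cutoff, best_j


# Toy readings (hypothetical values).
toy_arf = [0.3, 0.4, 0.5, 0.1]
toy_ctrl = [0.05, 0.1, 0.2, 0.15]
cutoff, j_stat = youden_cutoff(toy_arf, toy_ctrl)
print(cutoff, j_stat)

# Positive/negative call for each ARF sample against this one cut-off.
barcode = [int(x >= cutoff) for x in toy_arf]
print(barcode)
```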

DISCUSSION
This study has comprehensively investigated the serum autoantibody repertoire in ARF patients using multiple approaches. High content arrays revealed an overall increase in autoantibodies in ARF, with a large proportion of the antibodies unique to individual patients. The current pathogenesis models for ARF following GAS infection are centred either on molecular mimicry and the development of autoantibodies to host coiled-coil proteins, or on extracellular matrix disruption and exposure of cryptic epitopes (10,11). While the array analysis in this study did identify autoantibodies to myosin, tropomyosin and collagens that might support mimicry and extracellular matrix disruption, the breadth of the autoantibody reactivity observed also points to epitope spreading playing a role in pathogenesis (12).
Epitope spreading, that is the involvement of antigens beyond those that initially trigger the autoimmune response, is thought to be central to the pathogenesis of other systemic autoimmune diseases such as rheumatoid arthritis (46). It has also been suggested to have a role in ARF pathogenesis (12), and is in keeping with the systemic and heterogeneous nature of symptoms associated with the disease including poly-migratory arthritis, carditis, subcutaneous nodules and in some, neurological symptoms or chorea (1,8). The pathway analysis applied to the array data in this study enabled filtering of the large number of autoantibodies identified and revealed antigens associated with disease pathways relevant to ARF symptoms; "Myocardial Ischemia", "Autoimmune Arthritis" and "Juvenile Rheumatoid Arthritis". Yet even within these pathways a diverse range of antigens were targeted by patient serum antibodies, suggesting that a loss of tolerance and epitope spreading may occur in relevant tissues. This is consistent with an ARF pathogenesis model in which a loss of tolerance, driven by inflammation, enhances a dysregulated immune response. There is increasing evidence to suggest that repeated GAS infections prime the immune response for a loss of tolerance in ARF (47)(48)(49), and it follows that the presence of inflammation observed in ARF patients (12,50) could enhance the dysregulated autoimmune response, counteracting tolerance mechanisms, and contribute to epitope spreading and further damage in specific tissues. In order to validate the analysis of the high content protein arrays, the presence of autoantibodies to three of the antigens identified, DMD, PTPN2 and ANXA6, was assessed in a large ARF cohort. 
While there was a significant increase in autoantibodies to each of these three antigens in ARF, there was also variability at an individual patient level such that a continuum of reactivity was observed, ranging from autoantibodies to all three antigens in some patients to an absence of autoantibodies to the three antigens in others. The three proteins validated in this study were selected based on their detection across multiple analyses and represent different aspects of potential pathways and tissues involved in ARF, in particular immune signalling [PTPN2 (41)], cardiac tissue [ANXA6 (43), DMD (42)] and joint tissue [ANXA6 (44)]. However, all appear to be associated with the plasma membrane rather than being fully extracellular antigens. It is therefore possible that these antigens are not involved in initiating disease, but rather are exposed as a result of inflammation-driven tissue damage and epitope spreading. It is important to note that in autoimmune disease in general not all autoantibodies are pathogenic, and only those targeting cell surface proteins are generally thought to cause clinical manifestations (51). In a similar vein, the anti-myosin autoantibodies observed in ARF (8,12,36,40) could well be the result of tissue damage and cardiomyocyte burst given the intracellular location of myosin within the myocardium (52).
This study has several limitations. The ProtoArray antigens are expressed in insect cells, such that the glycosylation patterns on the extracellular antigens will differ from those that may be present in human tissue. Similarly, membrane-associated antigens may be misfolded or under-represented on both of the array platforms utilised, given the uniform approach required to express and purify such large numbers of antigens in parallel.
To overcome this limitation, and expand the antigen space examined in the context of ARF, alternative approaches such as Phage Immunoprecipitation Sequencing [PhIP-Seq (53)] and Rapid Extracellular Antigen Profiling [REAP (54)] of the human proteome could be considered in future studies. Finally, the initial array analysis was based on small patient numbers and it is possible that additional ARF autoantibodies would be detected with larger cohorts. Despite this, the analysis and filtering approach applied to the array data successfully identified three novel ARF autoantigens, each validated in a large patient cohort, supporting the use of high content arrays as a discovery tool.
In conclusion, this study has utilized high content protein array platforms to assess autoantibodies present in ARF in an unbiased fashion. The broad yet heterogeneous elevation of autoantibodies in ARF patients supports a pathogenesis model in which tissue damage and inflammation lead to a loss of tolerance to endogenous proteins and subsequent epitope spreading. Whilst autoantibodies have diagnostic potential, a panel comprising multiple antigens will likely be needed to capture individual heterogeneity.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Health and Disability Ethics Committee with the following ethics approval numbers (CEN/12/06/017, 14/NTA/53, 17/NTA/262). Written informed consent to participate in this study was provided by the participants or their legal guardian/next of kin.

FUNDING
This work was supported by funding from the Maurice Wilkins Centre, the Heart Foundation of New Zealand (small project grant #1576) and Cure Kids (project grants #3545 and #3585). NM was funded by a Heart Foundation Senior Research Fellowship for part of the study period (#1650). The RF RISK study, from which samples were obtained, was funded by the Health Research Council of New Zealand (HRC) Rheumatic Fever Research Partnership (Ministry of Health, Te Puni Kōkiri, Cure Kids, Heart Foundation, and HRC).