Comparative Analysis of the Secretome and Interactome of Trypanosoma cruzi and Trypanosoma rangeli Reveals Species Specific Immune Response Modulating Proteins

Chagas disease, a zoonosis caused by the flagellate protozoan Trypanosoma cruzi, is a chronic and systemic parasitic infection that affects ~5–7 million people worldwide, mainly in Latin America. Chagas disease is an emerging public health problem due to the lack of vaccines and effective treatments. According to recent studies, several T. cruzi secreted proteins interact with the human host during cell invasion. Moreover, some comparative studies with T. rangeli, which is non-pathogenic in humans, have been performed to identify proteins directly involved in the pathogenesis of the disease. In this study, we present an integrated analysis of canonical putative secreted proteins (PSPs) from both species. Additionally, we propose an interactome with the human host and gene family clusters, and a phylogenetic inference of a selected protein. In total, we identified 322 PSPs in T. cruzi and 202 in T. rangeli. Among the PSPs identified in T. cruzi, we found several trans-sialidases, mucins, MASPs, proteins with phospholipase A2 domains (PLA2-like), and proteins with Hsp70 domains (Hsp70-like), which have been previously characterized and demonstrated to be related to T. cruzi virulence. PSPs found in T. rangeli were related to protozoan metabolism, specifically carboxylases and phosphatases. Furthermore, we also identified T. rangeli PSPs that may interact with the human immune system, including heat shock and MASP proteins, but in a lower number compared to T. cruzi. Interestingly, we describe a hypothetical hybrid interactome of PSPs which reveals that T. cruzi secreted molecules may be down-regulating IL-17 whilst T. rangeli may enhance the production of IL-15. These results will pave the way for a better understanding of the pathophysiology of Chagas disease and may ultimately lead to the identification of molecular targets, such as key PSPs, that could be used to minimize the health outcomes of Chagas disease by modulating the immune response triggered by T. cruzi infection.

INTRODUCTION
Trypanosoma cruzi and Trypanosoma rangeli belong to the genus Trypanosoma and the family Trypanosomatidae. They are the only two trypanosomes that infect humans in Latin America (1).
T. rangeli and T. cruzi share the same mammalian hosts and triatomine vectors in overlapping areas. Although T. rangeli is non-pathogenic to mammalian hosts, it is harmful to insects. In particular, it causes morphological abnormalities or even lethal effects in molting and feeding in the genus Rhodnius (2). It has been recently reported that T. rangeli is more closely related to Old World trypanosomes of bats, civets, rats, and monkeys than to T. cruzi (3). T. cruzi causes Chagas disease, which is endemic in Latin America. Chagas disease afflicts 6-7 million individuals worldwide and has been associated with negative economic impacts in developing countries (4). Approximately 28,000 new cases of Chagas disease are diagnosed and 12,000 deaths are reported every year (4), indicating that Chagas disease is still a relevant public health issue (5). In addition to these estimates, demographic and migratory changes have resulted in the spread of the disease to non-endemic areas in the American continents, including the United States, and to other continents (6). T. rangeli shares morphological similarity and immunological cross-reactivity with T. cruzi, and they can co-occur as natural mixed infections in both vertebrate hosts and insect vectors in a broad geographical area (7).
The genome of T. rangeli has been sequenced and compared to that of T. cruzi (8). It has been shown that T. rangeli contains a smaller number of gene copies of virulence factors encoded by multigene families such as the mucin-associated proteins (MASPs), trans-sialidases (TS), and mucins, and a reduced repertoire of genes encoding antioxidant enzymes compared to T. cruzi (8). Additionally, transcriptomic analysis showed that T. rangeli contains genes encoding factors of virulence and pathogenicity, such as gp63, sialidases, and oligopeptidases, as has been described in other kinetoplastids (9). Proteins that are secreted to the extracellular medium usually contain an N-terminal signal peptide (SP) that drives them to the classical endoplasmic reticulum (ER)/Golgi-dependent secretion pathway [reviewed in Watanabe Costa et al. (10)]. In addition, other proteins without the canonical SP are secreted via non-classical pathways, usually through the shedding of extracellular vesicles (11). A significant amount of data on the genome structure and expression of human parasitic trypanosomes is currently available. There have been many studies of the biological functions of T. cruzi secreted proteins and their role in parasite-host interactions and pathogenesis [reviewed in Watanabe Costa et al. (10)].
As it is possible that secreted molecules from trypanosomes modulate host pathways, including the immune response, the present study focused on species-specific potentially secreted proteins (PSPs) from T. cruzi and T. rangeli and their interaction with the host immune response. Using a computational pipeline which integrates genomic and proteomic data with bioinformatics predictions, we built an in silico secretome of both species and identified classical PSPs (i.e., those displaying the SP without transmembrane domains). Additionally, we built an interactome and evolution models of a few selected PSPs. This comparative analysis of T. cruzi and T. rangeli may provide new insights into how these sympatric species evolved and adapted to the mammalian host.

Protein Sequences of Trypanosoma cruzi and T. rangeli
The protein sequences of T. cruzi (T. cruzi Sylvio X10/1-2012) used in this study were acquired from the TriTrypDB kinetoplastid database. The T. cruzi protein sequences were downloaded in FASTA format on January 14, 2015 and September 20, 2019. The protein sequences of T. rangeli (T. rangeli SC58) were provided by the Laboratory of Bioinformatics of the LNCC-National Laboratory of Scientific Computing and deposited in TriTrypDB (12). Both sequence sets contained automatic protein annotations.

Prediction and Selection of PSPs
SignalP (version 4.1; default parameters) (13) was used to predict the presence and location of signal peptide (SP) cleavage sites in the amino acid sequence. SecretomeP (version 1.0; default parameters) (14) was used to perform predictions of secreted proteins by the non-classical pathway (lacking the canonical SP). In addition to SignalP, SecretomeP predicts arginine and lysine cleavage sites in eukaryotic protein sequences and subcellular localization.
In the case of the parasites studied here, there were some proteins with an SP but also with additional transmembrane regions (which may belong to the cytoplasmic membrane or to the nuclear membrane, for example). Therefore, the program TMHMM (version 2.0) (15) was used to predict transmembrane helices. HMMTOP (version 2.0) (16,17), Tmpred (18), and DAS-TMfilter (19) were used to validate the TMHMM predictions. The prediction was carried out in the following steps: (i) generate FASTA files of clean protein sequences (with start methionine and translated stop codon) for each species, (ii) run the SignalP and SecretomeP pipelines, (iii) run the TMHMM program to identify transmembrane proteins, (iv) validate the TMHMM data with online programs (HMMTOP 2.0, Tmpred, and DAS-TMfilter). The final result included only proteins with an SP and no additional transmembrane regions. The steps of the pipeline are shown in Supplementary Figure 1.
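The selection rule applied at the end of these steps can be sketched as a simple filter. This is a minimal illustration only, assuming the predictor outputs have already been parsed into one record per protein; the `Prediction` record, its field names, and the example identifiers are hypothetical and do not reproduce the actual output formats of SignalP or TMHMM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical per-protein summary of already-parsed predictor output."""
    protein_id: str
    has_signal_peptide: bool  # assumed to come from a signal peptide predictor
    tm_helices: int           # assumed count of predicted transmembrane helices


def select_psps(predictions):
    """Keep proteins that carry a signal peptide and no transmembrane helix."""
    return [
        p.protein_id
        for p in predictions
        if p.has_signal_peptide and p.tm_helices == 0
    ]


# Illustrative records (not real data).
example = [
    Prediction("protA", True, 0),   # SP, no TM helix: kept as a PSP
    Prediction("protB", True, 2),   # SP plus TM helices: discarded
    Prediction("protC", False, 0),  # no SP: discarded
]
print(select_psps(example))
```

In practice the transmembrane count would be cross-checked against the validating predictors before a protein is discarded, which this sketch does not attempt.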

Annotation of PSPs
The second part of the analysis consisted of evaluating the results from the protein prediction programs and confirming them through computational and manual inspection. In addition, Blast2GO (20) (https://www.blast2go.com/) provided further information for proteins annotated as hypothetical, which aided the manual inspection.
The Blast2GO annotations were based on the similarity levels of the local file sequences with the QBlast database (21). Mapping parameters were changed to search results on the non-redundant reference protein (nr) (21), PSD (21), UniProt (22), Swiss-Prot (23), TrEMBL (23), RefSeq (24), GenPept (21), and PDB (25). The assignment of functional terms from the set of GO terms was assembled in the mapping step. Pie charts were used to show the biological functions of T. cruzi and T. rangeli PSPs.
In the case of proteins annotated as hypothetical in TriTrypDB for both species, we used the InterPro annotation (26), and domain entries were characterized when available. InterPro is embedded in TriTrypDB (https://tritrypdb.org/tritrypdb/showQuestion.do?questionFullName=GeneQuestions.GenesByInterproDomain); once a Gene ID is entered in the search, a page opens showing the features of the sequence, including "13. Protein features and properties", which lists the InterPro domains.

Gene Family Comparative Analysis
OrthoMCL (version 1.4) (27-29) (https://orthomcl.org/orthomcl/?rm=orthomcl) was used to verify which PSPs were expanded in both species. FASTA sequences of PSPs of T. cruzi and T. rangeli were used in the comparative analysis. In summary, OrthoMCL performs an all-against-all comparison of protein sequences and finds clusters based on reciprocal similarities. Protein clusters exhibiting bidirectional similarities between at least two Trypanosoma species were considered orthologs and those with bidirectional similarities within each species were classified as paralogs.
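The ortholog/paralog labeling described above can be illustrated with a short sketch. It does not run OrthoMCL; it assumes the clusters are already available as a mapping from cluster name to member IDs, and it assumes that the species can be read from the ID prefix (`TCSYLVIO_` for T. cruzi, `TRSC58_` for T. rangeli). The cluster names and ID suffixes below are hypothetical placeholders.

```python
def species_from_prefix(protein_id):
    """Assumption: the gene ID prefix identifies the species."""
    if protein_id.startswith("TCSYLVIO_"):
        return "T. cruzi"
    if protein_id.startswith("TRSC58_"):
        return "T. rangeli"
    raise ValueError(f"unknown ID prefix: {protein_id}")


def classify_clusters(clusters):
    """Label each cluster by the species composition of its members."""
    labels = {}
    for name, members in clusters.items():
        species = {species_from_prefix(m) for m in members}
        if len(species) >= 2:
            labels[name] = "orthologs"   # similarity between species
        elif len(members) >= 2:
            labels[name] = "paralogs"    # similarity within one species
        else:
            labels[name] = "singleton"   # no partner in either species
    return labels


# Hypothetical clusters with placeholder ID suffixes.
clusters = {
    "c1": ["TCSYLVIO_A", "TRSC58_A"],
    "c2": ["TCSYLVIO_B", "TCSYLVIO_C"],
    "c3": ["TRSC58_B"],
}
print(classify_clusters(clusters))
```

Counting the labels per species gives the three regions of a two-set Venn diagram (shared, T. cruzi only, T. rangeli only).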

Criteria for Selection of PSPs for Interactome Analysis
From the OrthoMCL results, some proteins were selected for both phylogeny and interactome analyses. The following selection criteria were used: (a) potential modulators of the host immune system were selected based on the available protein annotation in the results of the Blast2GO similarity annotation and a literature survey of each protein. For the hypothetical or unknown proteins, conserved domains in other species were investigated to infer their characterization in NCBI databases, such as Pfam (European Bioinformatics Institute) (30); (b) some degree of identity with proteins of the human host. The sequences were individually verified in BLAST (nr database) with the "Organism" field filled with Homo sapiens. The degree of identity can be visualized in the search result by sorting the "Ident" column in descending order. Proteins sharing homology with human proteins were considered and those that did not share homology were withdrawn from the interactome analyses; (c) annotations not yet determined in databases. Hypothetical or unknown proteins were prioritized in the analysis because of the importance of exploring proteins not yet well-documented in the literature. To obtain some basic data on these proteins, such as biological functions, the conserved domains shared with proteins of other species were investigated; and (d) proteins found in both species and those present in only one species. For this criterion, OrthoMCL data were used to generate a Venn diagram containing proteins of both species at the intersection, with proteins present only in each species outside the intersection.
These criteria allowed us to elaborate a comparative analysis between PSPs of T. rangeli and T. cruzi and to identify virulence-related proteins of T. cruzi, possibly interacting with the human immune system.

Phylogenetic Inference
T. cruzi phospholipase A2 (PLA2) was selected for phylogenetic inference. Sequences were obtained by a MEGABLAST (31) search (search for highly similar sequences) using the patatin protein domain from PLA2 (phospholipase A2) of the Trypanosoma cruzi Sylvio strain as the query, resulting in a set of 4,354 sequences (Supplementary Table 6).
We obtained a multiple sequence alignment with the Muscle algorithm (32), included in the Seaview software (33), based on the 4,354 patatin domain protein sequences of PLA2 (phospholipase A2). Phylogenetic relationships were analyzed by neighbor-joining (NJ) and Bayesian methods. For NJ trees, the confidence score of each node was assessed by bootstrap analysis (34) with 100 replicates. Bayesian inference was performed using MrBayes 3.2 software (35) on a reduced set of 78 PLA2 patatin sequences aligned by Muscle (36), and the LG model with a Γ distribution (LG+G) to correct for rate variation among sites was selected as the substitution model for Bayesian inference (Supplementary Table 7). Substitution models were chosen based on the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), using MEGA 6 software ("find best DNA model" option) (37). The Bayesian consensus tree was constructed with four million generations, sampling every 100 generations, until the standard deviation of split frequencies was under 0.01. Phylogenetic tree figures were then formatted with the FigTree v1.3.1 software (http://tree.bio.ed.ac.uk/software/figtree/).
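For readers unfamiliar with the two criteria used for model selection, the sketch below computes them from their standard definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of free parameters, n the number of alignment sites, and ln L the maximized log-likelihood. It is not the MEGA implementation; the model names, log-likelihoods, and parameter counts are invented for illustration.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


def best_model(fits, n_sites, criterion="bic"):
    """Return the name of the model with the lowest score.

    `fits` maps a model name to a (log_likelihood, n_params) pair.
    """
    def score(item):
        _, (lnl, k) = item
        return aic(lnl, k) if criterion == "aic" else bic(lnl, k, n_sites)

    return min(fits.items(), key=score)[0]


# Invented values, for illustration only.
fits = {"model_A": (-1000.0, 10), "model_B": (-990.0, 30)}
print(best_model(fits, n_sites=200, criterion="aic"))
print(best_model(fits, n_sites=200, criterion="bic"))
```

BIC penalizes extra parameters more heavily than AIC once the alignment has more than about eight sites (ln n > 2), so the two criteria can disagree on parameter-rich models.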

Gene and Pathway Network Analysis
The software Ingenuity Pathway Analysis (IPA) (Qiagen, USA-www.ingenuity.com) was used to build gene and pathway interaction networks for the parasite PSPs identified by screening for orthologs in human protein databases. Proteins were selected for the human interactome according to the criteria described above (see Criteria for Selection of PSPs for Interactome Analysis). IPA maintains a graphical database of networks of interacting genes (Ingenuity Knowledge Base, IKB, Qiagen, USA-www.ingenuity.com); the analysis performed was based on the content dated 2019-05. Molecules are represented as nodes, and the biological relationship between two nodes is represented as an edge (line). All edges are supported by at least one reference from the literature, from a textbook, or from canonical information stored in the IKB.

Identification of the Potentially Secreted Proteins (PSPs) of T. cruzi and T. rangeli
Of the 10,794 predicted protein sequences from the T. cruzi Sylvio X10/1 genome, 658 sequences were found to contain the canonical SP and 322 (∼49%) among them did not contain transmembrane helices according to the TMHMM 2.0 algorithm (Figure 1A, Supplementary Table 1). These latter sequences were categorized here as potentially secreted or secretory proteins (PSPs) that are targeted to the endoplasmic reticulum by the signal recognition particle (SRP)-receptor pathway. Among the 7,448 proteins of T. rangeli SC58, 492 sequences had the SP and 202 (41%) of these sequences presented the SP without transmembrane helices (Figure 1A, Supplementary Table 1). Comparing the number of PSPs from T. cruzi and T. rangeli, we observed that the percentage of PSPs was very similar in both species. Only 3.3 and 2.7% of total protein sequences of T. cruzi and T. rangeli corresponded to the PSPs, respectively (Figure 1A). Among the SP-bearing proteins, those containing transmembrane loops were discarded because they could be targeted to the membranes of other cell organelles, such as nuclear and mitochondrial membranes, or could be cell surface membrane-spanning proteins. It is noteworthy that the PSPs (Figure 1A) include proteins that are associated with membranes through other post-translational modifications, as well as membrane-bound proteins (38).

Annotations and Biological Functions of PSPs
Among the PSPs, most proteins of T. cruzi were annotated as hypothetical, followed by a group of T. cruzi multigene families such as mucin-associated surface proteins (MASP), mucin (TcMUCII), trans-sialidases and trans-sialidase-like proteins (TS), RHS, gp63 surface protease, and DGF-1 (Supplementary Table 2).
Concerning T. rangeli PSPs, cellular and metabolic processes are the main biological functions (Figure 1C, Supplementary Figure 2; for detailed functional categories of PSPs, see Supplementary Table 3). Other proteins were involved in the regulation of biological processes, such as response to stimuli, localization, and organization of cellular components and biogenesis (Figure 1C; for detailed functional categories of PSPs, see Supplementary Table 3).

Comparative Analysis of OrthoMCL Cluster Composition
Using OrthoMCL software, protein clusters (families) were identified for T. cruzi and T. rangeli. A total of 123 PSPs were present only in T. cruzi, 49 proteins were shared by both species, and 13 proteins were found only in T. rangeli (Figure 1D, Table 1, Supplementary Table 4).
In comparison to T. rangeli, we observed a higher number of genes belonging to multigene families in T. cruzi, such as mucins (TcMUCII), MASPs, and TS/TS-like proteins (39-47) (Table 1, Supplementary Table 4); dispersed gene family protein 1 (DGF-1) (40,48-52); gp63 (40,53-56); and gp90, a putative 90 kDa surface protein which was renamed SAP (Serine, Alanine, and Proline-rich protein) (57,58). Stoco et al. (8) demonstrated that the T. rangeli genome contains a smaller number of sequence copies from MASP, mucin, and TS/TS-like proteins compared to the T. cruzi genome, which is consistent with our PSP findings. All these proteins have been reported to participate in the host invasion process, including virulence and pathogenicity. The MASP family is the second largest T. cruzi family of proteins, representing ∼6% of the parasite genome (40,41). MASPs may be involved in lymphocyte activation, promoting polyclonal expansion and hypergammaglobulinemia, which delays the specific humoral response characteristic of the acute phase of Chagas disease. This polyclonal B cell activation promotes a nonspecific immune response that prevents the specific response from occurring, and thereby prevents parasite neutralization and elimination (59-62). Studies have shown that TS/TS-like proteins are virulence factors, as they are involved in the adhesion of T. cruzi to the host cell, internalization, and intracellular survival [reviewed in (63)]. T. cruzi mucins (TcMUCII) found in the present study are present on the parasite surface and have two main functions: (i) provide protection against vector and host defense mechanisms and (ii) ensure the targeting of invasion to specific cells or tissues (43,64-67). Some studies have reported that the gp63 glycoproteins identified by this study have the ability to inactivate the host complement system, facilitating the invasion and survival of T. cruzi (54-56,68). SAP binds to the host cell and induces intracellular Ca2+ mobilization and host cell lysosome exocytosis (57,58).
Among the hypothetical proteins, we identified a domain from heat shock protein 70 (Hsp70) which is present in both species (Supplementary Table 1, green lines). Hsp70 interacts with cells of the immune system, exerting immunoregulatory effects. Exogenously added Hsp70 has potent cytokine activity and binds with high affinity to the plasma membrane, triggering rapid intracellular Ca2+ flow, activating nuclear factor κB (NF-κB), and positively regulating proinflammatory cytokine expression in human monocytes (69). From a diagnostic point of view, although anti-Hsp antibodies are unable to distinguish chagasic patients from those infected with other trypanosomes, T. cruzi anti-Hsp70 antibodies can distinguish between healthy and infected patients and between those in the acute phase and those in the chronic phase (70,71). These results, in combination with the findings of our study, corroborate the importance of studying T. cruzi secreted proteins.
Several PSPs were found in both species, including gp63 surface proteases, glucose-regulated proteins, serine/threonine phosphatases, methyltransferases, DNA ligases, cytochrome b5 reductase, acetyltransferases, and others (Table 1, Supplementary Tables 1, 4). These proteins are related to cellular processes, replication, and metabolism. Among these proteins, only gp63 has been described in previous studies as being related to T. cruzi infection in host cells. Gp63 is differentially expressed in specific stages of the parasite cycle and is more highly expressed in amastigotes than in epimastigotes or trypomastigotes (72). Previous genomic studies have demonstrated that despite the non-pathogenic nature of T. rangeli in mammals, several secreted and surface protein genes associated with virulence and pathogenicity in other pathogenic trypanosomatids, such as gp63, are present in T. rangeli (8, 73) (Table 1, Supplementary Tables 1, 4). Finally, in the group of protein families found only in T. rangeli, some hypothetical proteins were grouped with domains in MASPs, WD40, and phosphoenolpyruvate carboxylase, in addition to phosphatases and heat shock proteins HsIVU (Table 1, Supplementary Table 4).
Interestingly, some SP-displaying proteins already described to be secreted in the CL Brener clone, such as cruzipain, were not found in T. cruzi Sylvio (X10/1-2012) secretome searches. These differences may be due to the high polymorphism among T. cruzi strains from different lineages (CL Brener-lineage TcVI and Sylvio X10/1-lineage TcI), which is likely reflected in their infectivity modeling in different hosts, pathology evolution, and clinical manifestations (76). Conversely, a biological secretome (77) of T. cruzi CL Brener showed that many proteins found using our in silico strategy are also found in the biological secretome, such as protein kinases, lipases, heat shock proteins, DNA repair proteins, superoxide dismutase, trans-sialidases, gp63, mucins, MASPs, and DGF-1. This indicates that in silico strategies are useful in selecting PSPs before proceeding to subsequent biological studies. To find orthologs for PSPs in other isolates of T. cruzi, we performed a manual search with BLASTP using a sample of 186 PSPs of Sylvio X10/1 as query against the genomes of clones Dm28 (TcI), CL Brener (TcVI, Esmeraldo-like, and non-Esmeraldo-like haplotypes), and TCC (TcVI) deposited in TriTrypDB (Supplementary Table 5). One hundred and three orthologs (55.4%) found in these isolates can be classified as PSPs according to our criteria. They are distributed as follows: 45 PSPs (24.2%) were found in the three isolates, whereas 58 PSPs (31.2%) were found in one or two isolates. Eighty-three proteins (44.6%) could not be classified as PSPs in any of the three isolates. The variability regarding the presence of the SP can be noticed when comparing the PSPs of T. cruzi Sylvio X10/1 (TcI) by BLAST with orthologs of other strains from the same TcI lineage, such as Dm28, or clones CL Brener and TCC from a different lineage (TcVI).
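The percentages reported for the BLASTP comparison follow directly from the counts given in the text. The short check below reproduces them; only the counts come from the text, and the variable and function names are illustrative.

```python
def percent(part, total):
    """Share of `part` in `total`, as a percentage rounded to one decimal."""
    return round(100 * part / total, 1)


total_queries = 186      # Sylvio X10/1 PSPs used as BLASTP queries
in_three_isolates = 45   # orthologs classified as PSPs in all three isolates
in_one_or_two = 58       # orthologs classified as PSPs in one or two isolates
not_classified = 83      # not classified as PSPs in any isolate

classified = in_three_isolates + in_one_or_two

# The three categories partition the query set.
assert classified + not_classified == total_queries

summary = {
    "classified": percent(classified, total_queries),
    "all_three": percent(in_three_isolates, total_queries),
    "one_or_two": percent(in_one_or_two, total_queries),
    "not_classified": percent(not_classified, total_queries),
}
print(summary)
```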

Characteristics of Proteins Selected for Phylogeny and Interactome Analyses
After OrthoMCL analysis, some proteins were selected for both phylogenetic inference and interactome analyses (see Materials and Methods: 2.5. Criteria for selection of PSPs for interactome analysis), taking into account that the selected proteins must show indications of involvement in the modulation of the host immune system. TCSYLVIO_003749 (Supplementary Table 1, yellow line), annotated as a hypothetical protein, was selected only for phylogeny because it did not show the similarity to human proteins required for the interactome analysis. It is a hypothetical PSP with a phospholipase A2 (PLA2) domain found in T. cruzi Sylvio. This protein was also found in T. cruzi CL Brener, exhibiting an SP but no transmembrane regions (TcCLB.506705.40). Although this protein did not meet protein filter criterion "b" (see Materials and Methods: Criteria for selection of PSPs for interactome analysis), previous studies have shown a relationship between T. cruzi phospholipases and parasite invasion and survival in the host (78,79). In T. rangeli, the orthologous protein (TRSC58_01394) has transmembrane regions, and thus it was not included in the secretome list. It is a hypothetical triacylglycerol lipase 3 (TGL3) carrying the conserved protein domains of superfamilies DUF 3336, patatin, and phospholipase A2 (PLA2). This is referred to as PLA2-like in later analyses. TGL3 is reported to be responsible for triacylglycerol lipase activity in the lipid particle. Patatin is a family consisting of several plant glycoproteins (80). The phospholipases present in T. cruzi could be related to host immune evasion mechanisms. It has been shown in Vero cells that phospholipase A1 is involved in cellular lipid modifications leading to protein kinase C (PKC) activation. With PKC activation, Ca2+ release occurs from intracellular stores, contributing to parasite invasion (79). DNAJ (TCSYLVIO_005847) (Table 1, Supplementary Table 1, blue lines, Supplementary Table 4) was found to be a PSP only in T. cruzi and was selected to build a hypothetical hybrid interactome, as it has homology with human proteins. The T. cruzi DNAJ (or Hsp40) protein acts as an Hsp70 chaperone, forming a protein folding pathway that integrates with Hsp90. Several environmental stimuli acting upon the parasite during evolutionary selection have resulted in a very expanded and varied chaperone network. Heat shock protein expression is increased during the transition of the parasite from the vector insect (temperature ∼26 °C) to the human host (∼37-38 °C), during which there is differentiation from trypomastigotes to amastigotes. Heat shock proteins are also involved in pathogenesis in the host. For example, Hsp90, paralleling the human interactome proteins HSP90AB1 and HSP90AA1, may be involved in the development and intracellular growth of the parasite and represents a potential target for therapeutic interventions (81-84). Real et al. (85) demonstrated that Leishmania (L.) amazonensis (a trypanosomatid that causes leishmaniasis) secretes an ortholog of mouse Hsp70 kDa protein 5 (HSPA5) (69% identity; 90% sequence coverage). Considering that parasite factors mimic their mammalian counterparts, a hypothetical Leishmania-mouse interactome was proposed to identify host components that could be affected by the secretion of Leishmania HSPA5. The model predicted an interaction between L. amazonensis HSPA5 and mammalian Toll-like receptor 9, which is implicated in important immune responses such as cytokine and nitric oxide production (85).
We selected three proteins found only in T. rangeli by OrthoMCL (Table 1, Supplementary Table 1, blue lines, Supplementary Table 4) that share homology with human proteins: two heat shock HsIVU proteins (TRSC58_01986 and TRSC58_04483) and a 4-nitrophenyl phosphatase (TRSC58_05352). They are present in several strains of T. cruzi (e.g., CL Brener), but there they do not have SPs as they do in T. rangeli. According to BLAST, proteins TRSC58_01986 and TRSC58_04483 (heat shock domain HsIVU) have homology with some human proteins, such as ATPases and ATP-dependent metalloproteinases, and TRSC58_05352 (4-nitrophenyl phosphatase) has homology to some human phosphatases.

PLA2, a Possible Immune Modulator, Has a Plant Phospholipase Domain Which Has Not Been Acquired by Horizontal Transfer
Phospholipases are involved in multiple physiological processes, including the generation of lipid signaling. Tc-PLA1, T. cruzi phospholipase A1, is secreted into the extracellular medium during the infective stages (amastigotes and trypomastigotes). It displays high membrane binding activity and may be involved in parasite-host interaction events prior to cellular invasion (78,79). Interestingly, Tc-PLA1 does not display a SP in the Sylvio strain, but does in the CL Brener strain. We then chose the hypothetical protein PLA2, another phospholipase, to generate a phylogenetic tree because of a potential function in modulating host immune mechanisms and parasite invasion due to a similar function in PLA1 and the existence of a Patatin-like domain in its sequence.
Patatin is a family of glycoproteins that accounts for up to 40% of the total soluble protein in potato tubers (86). When we performed a BLAST search with PLA2-like of T. cruzi (TCSYLVIO_003749), many plant species sequences with some similarity to the Patatin domain were retrieved (not shown). Because of this, we questioned whether the Patatin domain in the PLA2-like gene had been acquired by horizontal gene transfer (HGT). We have reported an HGT event of a FYVE domain present only in viruses, such as Acanthamoeba mimivirus, transferred to phosphatidylinositol kinases (PIK) of T. brucei, Leishmania, and T. cruzi (87). The FYVE-PIK architecture is only present in trypanosomatids and viruses, suggesting a horizontal acquisition hypothesis that was supported by Bayesian phylogenetic inference (87). An HGT event has also been demonstrated for some Trypanosoma species in which they acquired Proline-racemase (PRAC) genes, previously identified in a restricted group of bacteria. The PRAC genes act as virulence factors in the highly pathogenic bacteria Clostridium difficile and Pseudomonas aeruginosa (88).
Thus, to test the hypothesis of PLA2-like patatin domain HGT, we generated a phylogenetic gene tree based on PLA2like amino acid sequences. While, "gene trees" represent the evolutionary history of the genes studied and can be incongruent with species trees, "species trees" are usually constructed with orthologous genes, aiming to recover the genealogy of taxa (89). The reason for the incongruence can be: (1) differences in rates of evolution, (2) occurrence of gene loss and/or gene duplication, (3) recombination between neighboring regions, and (4) horizontal gene transfer (HGT) (90). The use of phylogeny can indicate HGT events by revealing incompatibility between the gene and species evolutionary histories exposed by reconstruction of the phylogenetic gene and species trees, using the species tree as a reference (89). Differences between gene trees and species trees can be suggestive of HGT events. Gene trees can provide evidence for genetic processes as well as HGT. If two species are connected in the same branch of a gene tree but are evolutionarily distant according to the species tree, a HGT event has possibly occurred (91). Inference of horizontal gene or sequence transfer by phylogenetics is considered more sensitive and specific compared to other approaches (92). We constructed a gene tree based on the patatin protein domain from PLA2 (phospholipase 2) to search for a possible HGT event that could have affected T. cruzi. Using the PLA2 patatin domain protein sequences from the T. cruzi Sylvio strain, we searched for similar sequences using MEGABLAST (31) (search for highly similar sequences). We obtained 4,354 sequences that were extracted from BLAST (21) (Supplementary Table 6) and aligned by Muscle into Seaview software (33). Subsequently, we generated a neighbor joining (NJ) phylogenetic tree (Figure 2A). 
The NJ tree contained three clusters of Trypanosoma species, but no distantly related species or taxon was inserted among them that could indicate a putative HGT event. The three Trypanosoma clades were placed far from each other in the same tree because of variation in the BLAST results (score and E-value). We also conducted a Bayesian phylogenetic inference based on a restricted PLA2 patatin sequence alignment using MrBayes software v.3.2 (35), which confirmed the absence of HGT (Figure 2B).
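The criterion used above (a sequence that groups in the gene tree with taxa that are distant in the species tree is an HGT candidate) can be illustrated with a minimal sketch. It is not the procedure used in this study, which relied on NJ and Bayesian trees inspected against the known taxonomy. The nested-tuple tree encoding, the function names, the leaf labels, and the group assignments are all hypothetical placeholders.

```python
# Toy gene trees encoded as nested tuples; leaves are sequence labels.
# All labels and taxonomic groups below are hypothetical placeholders.

def leaves(node):
    """Return the list of leaf labels found under a node."""
    if isinstance(node, str):
        return [node]
    found = []
    for child in node:
        found.extend(leaves(child))
    return found


def flag_hgt_candidates(tree, group_of):
    """Return (leaf, own_group, sister_group) for every leaf whose sister
    clade consists entirely of one different taxonomic group."""
    flagged = []

    def visit(node):
        if isinstance(node, str):
            return
        children = list(node)
        for i, child in enumerate(children):
            if isinstance(child, str):
                # Collect every leaf in the sibling subtrees of this leaf.
                sister_leaves = []
                for j, other in enumerate(children):
                    if j != i:
                        sister_leaves.extend(leaves(other))
                sister_groups = {group_of[s] for s in sister_leaves}
                own = group_of[child]
                if len(sister_groups) == 1 and own not in sister_groups:
                    flagged.append((child, own, next(iter(sister_groups))))
            visit(child)

    visit(tree)
    return flagged


group_of = {
    "tryp_A": "Trypanosomatida",
    "tryp_B": "Trypanosomatida",
    "plant_A": "Viridiplantae",
    "plant_B": "Viridiplantae",
    "plant_C": "Viridiplantae",
}

# Gene tree congruent with the species tree: each group forms its own clade.
congruent = (("tryp_A", "tryp_B"), ("plant_A", "plant_B"))

# Discordant gene tree: tryp_A is nested inside a plant clade.
discordant = ((("tryp_A", ("plant_A", "plant_B")), "plant_C"), "tryp_B")

print(flag_hgt_candidates(congruent, group_of))
print(flag_hgt_candidates(discordant, group_of))
```

A flagged leaf is only a candidate: as noted above, unequal evolutionary rates, gene loss or duplication, and recombination can produce the same discordance, so any hit would still need support from a more rigorous phylogenetic analysis.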

Hypothetical Hybrid Interactome Revealed Proteins Potentially Involved in Host Immune Modulation
IPA (Ingenuity Pathway Analysis) was used to identify functional and molecular networks connecting the T. cruzi and T. rangeli PSPs with host molecules (Figures 3A,B). The analysis generated two different networks, one for the PSPs of each species. Genes and gene products are represented as nodes, and the relationships between them are represented as edges (lines). All edges are supported by at least one literature citation or by canonical information stored in the database. Molecules were mapped to their corresponding nodes. The network generated from the T. cruzi PSPs (represented as red nodes) contained genes related to IL-17 production, with central nodes including IL-17A and NOS2 (Figure 3A). For T. rangeli, IL-15 production was the main biological function predicted to be related to the genes in the network (Figure 3B).
T. cruzi induces expression of proteins that trigger a decrease in IL-17 production, while T. rangeli leads to interleukin 15 (IL-15) production, as shown in Figures 3A,B, respectively. Although both species are parasites of mammals and are evolutionarily related, T. rangeli infection in humans is promptly eliminated by the host (93)(94)(95)(96). Studies have failed to demonstrate T. rangeli multiplication in vertebrate hosts, and its pathogenicity and virulence are still not well understood (95)(96)(97). Thus, very little information is available regarding induction of the immune response by T. rangeli, since this parasite appears to be harmless to mammals and may be eliminated by the complement system (98). Successful immune responses against pathogens depend upon efficient stimuli and the consequent synthesis and equilibrium of cytokines (99,100). According to the IPA analysis, T. rangeli triggers IL-15 synthesis, which is associated with activation of the nuclear factor (NF)-kB complex (Figure 3B). It has been demonstrated in cerebral epithelial cell lines that nuclear translocation of the p65 subunit of NF-kB is one of the major activators of IL-15 upon stimulation with TNF-α (101). IL-15 promotes B, T, and NK cell differentiation and proliferation and also functions as a danger signal during the inflammatory process (102,103). Our data suggest that the inability of T. rangeli to establish infection may be related to the activation of IL-15. IL-15 signaling begins with the binding of IL-15 to IL-15Rα on the surface of antigen-presenting cells, which triggers trans-presentation to effector cells through its heterotrimeric receptor (IL-15Rα, IL-2R/IL-15Rβ, and the common cytokine receptor γ-chain) and can lead to activation of the immune response by diverse routes (103). IL-15 production is commonly associated with efficient control of the permeability of the blood-brain barrier, a highly selective semipermeable border that separates the blood from the extracellular fluid of the central nervous system (101,104,105).
Trans-activation of the IL-15 complex is a crucial factor in protecting against cerebral injuries in Plasmodium-infected mice through the induction of IL-10-producing NK cells (105). It is therefore plausible that IL-15 enables the host immune system to limit the capacity of T. rangeli for cell entry and to eliminate the protozoan while it persists in the blood.
Conversely, T. cruzi has adopted numerous strategies to manipulate the host immune response in order to survive (106,107). During the acute phase of Chagas disease, in which the parasite is present in the bloodstream, an effective immune response is established to control parasite replication in the tissues (108). Production of interferon-γ, IL-17, and specific antibodies by NK, T, and B cells is crucial for parasite clearance. In mice, IL-17A deficiency impaired the activation of immune cells critical for the killing of T. cruzi, resulting in greater susceptibility to T. cruzi infection (109).
IL-17 impairment as a result of T. cruzi infection may represent a crucial mechanism of parasite immune evasion and consequent cellular infection. IL-17 is a key cytokine, produced mainly by T helper 17 (Th17) cells, that promotes the recruitment of immune cells, including neutrophils and monocytes (110). IL-17A is required for the elimination of bacteria, fungi, and T. cruzi; it was found to be diminished in chagasic patients with established heart disease, especially in patients with congestive heart failure (111). In the predicted network, the PSP DNAJA2 leads to downregulation of IL-17 expression during pathogen invasion (Figure 3A). T. cruzi DNAJA2 shows 50–70% similarity to the human isoform. DNAJA2 is related to the downregulation of RORα, one of the key signal transducers and activators of Th17 cells and of the consequent IL-17 production. Likewise, in vivo and in vitro studies showed that the absence of IL-17RA (IL-17 receptor) signaling resulted in apoptosis of parasite-specific CD8+ T cells, whereas recombinant IL-17A downregulated the pro-apoptotic BAD protein and promoted the survival of activated CD8+ T cells (112). Furthermore, IL-17A production seems to serve an important immunomodulatory and protective role during human chronic Chagas disease, in which it correlates with improved left ventricular function and protection against myocardial damage (111). These effects may also be observed in benznidazole-treated individuals with increased IL-17 blood levels (113).
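The reasoning behind the predicted effect of DNAJA2 on IL-17 can be made explicit with a small worked example: multiplying the signs of the edges along a path gives the net direction of regulation. This is only an illustrative sketch, not the scoring performed by IPA. The two signed edges are taken from the relationships stated above; the node identifiers, the dictionary layout, and the function name are assumptions introduced for illustration.

```python
# Signed edges taken from the relationships described in the text:
#   DNAJA2 downregulates RORα            -> sign -1
#   RORα activates IL-17 production      -> sign +1
# Node identifiers and this data layout are illustrative assumptions.
SIGNED_EDGES = {
    ("DNAJA2", "RORA"): -1,
    ("RORA", "IL17"): +1,
}


def net_effect(path, edges=SIGNED_EDGES):
    """Multiply edge signs along a path of nodes.

    Returns +1 for net up-regulation and -1 for net down-regulation.
    """
    sign = 1
    for source, target in zip(path, path[1:]):
        sign *= edges[(source, target)]
    return sign


effect = net_effect(["DNAJA2", "RORA", "IL17"])
print("down-regulation" if effect < 0 else "up-regulation")
```

One inhibitory step followed by one activating step yields a negative product, which corresponds to the predicted decrease in IL-17 production.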

CONCLUSION
PSPs from T. cruzi and T. rangeli, which are pathogenic and non-pathogenic to humans, respectively, were identified by constructing an in silico secretome. A hypothetical hybrid interactome of PSPs revealed that T. rangeli could enhance the production of IL-15 in association with NF-kB complex activation, ultimately eliciting an immune response, which could explain the inability of T. rangeli to establish human infection. Conversely, T. cruzi could secrete proteins that trigger a decrease in IL-17. IL-17A is required for the elimination of bacteria, fungi, and T. cruzi, and it is diminished in chagasic patients with established heart disease. These results suggest that key PSPs could be used as novel therapeutic targets to minimize the health consequences of Chagas disease by modulating the immune response triggered by T. cruzi infection.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.
Supplementary Table 1 | Annotation of PSPs of both T. cruzi Sylvio and T. rangeli. Hypothetical proteins were annotated manually. Proteins selected for the human interactome are shown in blue, the protein used for the phylogenetic tree in yellow, and Hsp70 proteins found in both species in green. TR04358 is a gene of the T. rangeli genome that has not been deposited in TriTrypDB.
Supplementary Table 4 | Quantitative and qualitative analysis by OrthoMCL. Number of grouped proteins found in each organism and in both species.
Supplementary Table 5 | Identification of PSPs in other T. cruzi strains, compared to PSPs of T. cruzi Sylvio X10/1-2012. T. cruzi Sylvio (TcI) PSPs were used as queries to find orthologs in other strains: Dm28 (TcI), CL Brener Esmeraldo-like (TcVI), CL Brener Non-Esmeraldo-like (TcVI), and TCC (TcVI). A manual BLASTP search was performed using TriTrypDB, and the presence of a signal peptide (SP) and transmembrane domains (TM) was evaluated with the InterPro domain annotation embedded in TriTrypDB. Only proteins whose E-value had a negative exponent (i.e., an E-value lower than 1) were considered. When two or more copies of the same protein were found, the copy with the lowest E-value was considered. SP+ = presence of signal peptide; TM+ = presence of transmembrane helices; SP+ TM- means that the protein is considered a PSP (presence of signal peptide and absence of transmembrane helices); the "-" symbol means that no ortholog was found.
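The selection rules in this caption (keep the copy with the lowest E-value per strain, then call a protein a PSP only when it is SP+ TM-) can be summarized in a short sketch. The searches in this study were performed manually in TriTrypDB, so the code below is only an assumed restatement of those rules. The `Hit` record, the function names, and the example identifiers and E-values are hypothetical, and the E-value cutoff itself is not applied here.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLASTP hit with its domain annotation (hypothetical record)."""
    strain: str
    protein_id: str
    e_value: float
    has_sp: bool  # signal peptide predicted
    has_tm: bool  # transmembrane helices predicted


def best_hit_per_strain(hits):
    """Keep, for each strain, the copy with the lowest E-value."""
    best = {}
    for hit in hits:
        current = best.get(hit.strain)
        if current is None or hit.e_value < current.e_value:
            best[hit.strain] = hit
    return best


def classify(hit):
    """Return the table symbol for a hit, or "-" when no ortholog was found."""
    if hit is None:
        return "-"
    sp = "SP+" if hit.has_sp else "SP-"
    tm = "TM+" if hit.has_tm else "TM-"
    return f"{sp} {tm}"


def is_psp(hit):
    """A protein counts as a PSP only when it is SP+ and TM-."""
    return hit is not None and hit.has_sp and not hit.has_tm


# Hypothetical identifiers and values, for illustration only.
hits = [
    Hit("Dm28", "hypothetical_1", 1e-50, True, False),
    Hit("Dm28", "hypothetical_2", 1e-120, True, True),
    Hit("TCC", "hypothetical_3", 1e-80, True, False),
]

best = best_hit_per_strain(hits)
for strain in ["Dm28", "TCC", "CL Brener Esmeraldo-like"]:
    hit = best.get(strain)
    print(strain, classify(hit), is_psp(hit))
```

In this toy input, the Dm28 copy with the lower E-value carries transmembrane helices, so the strain is reported as SP+ TM+ and not counted as having a PSP ortholog, even though another copy is SP+ TM-.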