ORIGINAL RESEARCH article

Front. Microbiol., 03 April 2017

Sec. Microbial Immunology

Volume 8 - 2017 | https://doi.org/10.3389/fmicb.2017.00540

Comparative Genomics of Glossina palpalis gambiensis and G. morsitans morsitans to Reveal Gene Orthologs Involved in Infection by Trypanosoma brucei gambiense

  • 1. UMR 177, Institut de Recherche pour le Développement-CIRAD, CIRAD TA A-17/G Montpellier, France

  • 2. Centre National de la Recherche Scientifique Unité Mixte de Recherche 5203, Institut de Génomique Fonctionnelle Montpellier, France

  • 3. Institut National de la Santé Et de la Recherche Médicale U661 Montpellier, France

  • 4. Universités de Montpellier 1 and 2, UMR 5203 Montpellier, France

  • 5. Montpellier GenomiX, c/o Institut de Génomique Fonctionnelle Montpellier, France

Abstract

The blood-feeding fly Glossina palpalis gambiensis (Gpg) transmits the single-celled eukaryotic parasite Trypanosoma brucei gambiense (Tbg); the second Glossina/African trypanosome pair is Glossina morsitans/T. brucei rhodesiense. Whatever the T. brucei subspecies, the parasite's developmental program in these zoo-anthropophilic blood-feeding flies begins in the fly midgut and is completed in the fly salivary glands, where a small metacyclic trypomastigote population emerges with features that account for its establishment in mammals, humans included. Because the two Glossina/T. brucei pairs introduced above share a similar parasite developmental program, we sought to map onto the Glossina morsitans morsitans (Gmm) genome the differentially expressed genes (DEGs) that we listed in a previous study. Briefly, these DEG lists were obtained by RNA-seq-based approaches from gut samples collected at days 3, 10, and 20 from Gpg flies that had or had not been fed at day 0 on Tbg-infected mice. Here, after mapping the quality-controlled DEGs onto the Gmm genome, the identified orthologous genes were further annotated and the resulting datasets were compared. Around 50% of the Gpg DEGs were shown to have orthologs in the Gmm genome. Under one of the three Glossina midgut sampling conditions, the number of DEGs was even higher when mapping onto the Gmm genome than initially recorded. Many Gmm genes annotated as "Hypothetical" were mapped and annotated against several distinct databases, allowing some of them to be properly identified. We identified Glossina fly candidate genes encoding (a) a broad panel of proteases, (b) chitin-binding proteins, and (c) antimicrobial peptide production (Pro3 protein, transferrin, mucin, attacin, cecropin, etc.) to be selected for further functional studies. The objective is to probe and validate fly genome manipulations that prevent the onset of the developmental program of one or the other T. brucei spp. stumpy form sampled by one or the other blood-feeding Glossina subspecies.

Introduction

Trypanosomes causing either Human African Trypanosomiasis (HAT, i.e., sleeping sickness) or Animal African Trypanosomiasis (AAT, i.e., Nagana) are transmitted by Glossina spp. (tsetse flies). These hematophagous flies acquire their parasite during a blood meal on an infected host, and transmit the mature form of the parasite to another host during a subsequent blood meal. Two forms of HAT have been reported: a chronic and an acute form (Hoare, 1972; Aksoy et al., 2014; Beschin et al., 2014). The chronic form, spread throughout 24 sub-Saharan countries of West Africa, is caused by Trypanosoma brucei gambiense (Tbg) and is transmitted by Glossina palpalis; this form represents over 90% of all sleeping sickness cases (Welburn et al., 2009). The acute form, endemic to 12 East African countries, is caused by Trypanosoma brucei rhodesiense (Tbr), and is transmitted by Glossina morsitans morsitans (Gmm). Currently the disease persists in sub-Saharan countries (Louis et al., 2002), where more than 60 million people are exposed to the trypanosomiasis risk. Progress in deciphering the mechanisms of host-parasite interactions involves identifying the genes encoding the factors that govern tsetse fly vector competence (Vickerman et al., 1988; Maudlin and Welburn, 1994; Van den Abbeele et al., 1999), which may promote the development of anti-vector strategies that are alternative or complementary to current strategies.

Using a microarray approach, we recently investigated the effect of trypanosome ingestion by G. palpalis gambiensis (Gpg) flies on the transcriptome signatures of Sodalis glossinidius (Farikou et al., 2010; Hamidou Soumana et al., 2014a) and Wigglesworthia glossinidia (Hamidou Soumana et al., 2014b), two symbionts of tsetse flies (Aksoy et al., 2014). The aim of this previous work was to identify the genes that are differentially expressed in trypanosome infected vs. non-infected or self-cured (refractory) flies and that, consequently, can be suspected to positively or negatively control fly infection. Similarly, using the RNA-seq de novo assembly approach, we investigated the differential expression of G. p. gambiensis genes in flies challenged or not with trypanosomes (Hamidou Soumana et al., 2015). Furthermore, transcriptome profiling of T. b. brucei development in Gmm has recently been reported (Savage et al., 2016).

Since the acute form of HAT is caused by the Gmm/Tbr vector/parasite “couple,” the identification of molecular targets common to both Gpg and Gmm (i.e., orthologous genes) deserves further consideration. Indeed, identification of these targets would allow the development of common approaches to fight both forms of HAT. As Gpg and Gmm are two separate Glossina species, their genomes should display some differences between each other. Furthermore, the Gmm genome and the sequences of the Gpg RNA-seq de novo assembled genes have been annotated with reference to two distinct database sets: the first set comprises Drosophila melanogaster, Aedes aegypti, Anopheles gambiae, Culex quinquefasciatus, and Phlebotomus papatasi (International Glossina Genome Initiative, 2014), whereas the second set comprises Ceratitis capitata, Drosophila melanogaster, D. willistoni, D. virilis, D. mojavensis, Acyrthosiphon pisum, Hydra magnipapillata, Anopheles sp., Bombyx sp., Aedes sp., and Glossina morsitans (data that were available before the publication of the whole genome sequence; Hamidou Soumana et al., 2015). This indicates that only the D. melanogaster database was common to the two database sets used to annotate the differentially expressed Gpg genes and the Gmm genome, respectively. Thus, for the present study, it was necessary to map the sequences of the Gpg RNA-seq de novo assembled genes on the Gmm genome and annotate them on the corresponding database. This has been achieved, and the Gpg genes that were previously shown to be differentially expressed (i.e., stimulated vs. non-stimulated flies, and infected vs. non-infected flies; Hamidou Soumana et al., 2015) were annotated on the Gmm database. 
Finally, the data resulting from the best hits annotation, which provide a translation product for each gene (and thus its potential biological function and physiological role), were compared with data resulting from the previous annotation of the same genes on the set of above-mentioned databases. The overall results provide a data platform that can be applied for further identification of candidate genes involved in the vector competence of both fly species. Importantly, these data could represent promising targets in the development of new anti-vector strategies in the fight against the chronic or acute forms of sleeping sickness.

Materials and methods

Ethical statement

All animal experiments in this report were conducted according to internationally recognized guidelines. The experimental protocols were approved by the Ethics Committee on Animal Experiments and the Veterinary Department of the Centre International de Recherche Agronomique pour le Développement (CIRAD; Montpellier, France).

Sample processing, RNA-Seq library preparation, and sequencing

Samples for this study were previously used to identify the differentially expressed genes (DEGs) in Gpg. The different steps are described in the corresponding report (Hamidou Soumana et al., 2015), as well as pro parte in reports related to the differential expression of S. glossinidius and W. glossinidia genes (Hamidou Soumana et al., 2014a,b). Sample processing is summarized in Figure 1.

Figure 1

Preparation and sequencing of the RNA-Seq libraries

The sequential steps consisted of: RNA extraction from the pooled midguts of each biological replicate, resuspension of RNA pellets in nuclease-free water, concentration, RNA quantification, and quality control (to confirm the absence of any DNA contamination).

Generation of RNA-Seq libraries

RNA-seq libraries were generated using the Illumina TruSeq™ RNA Sample Preparation Kit (Illumina; San Diego, USA). The sequential steps consisted of: mRNA purification from 4 μg total RNA using poly-T oligo-linked magnetic beads; fragmentation of RNA using divalent cations under elevated temperature (Illumina fragmentation buffer); first-strand cDNA synthesis using random oligonucleotides and SuperScript II; second-strand cDNA synthesis using DNA Polymerase I and RNase H; conversion of remaining overhangs into blunt ends via exonuclease/polymerase activities and enzyme removal; and adenylation of 3′ ends of cDNA fragments, with ligation of Illumina PE adapter oligonucleotides for further hybridization. Finally, cDNA fragments were selected (preferably 200 bp in length) in which fragments with ligated adaptor molecules on both ends were selectively enriched using Illumina PCR Primer Cocktail, and the products were purified and quantified using the Agilent DNA assay on the Agilent Bioanalyzer 2100 system.

Brief summary of the pipeline for generating quality-controlled reads

A total of 12 RNA-seq libraries were prepared, sequenced, and compared, including two biological replicates for each of the NS3, S3, NI10, I10, NI20, and I20 samples. Clustering of the index-coded samples was performed on a cBot Cluster Generation System using TruSeq PE Cluster Kit-cBot-HS (Illumina). After cluster generation, the library preparations were sequenced on an Illumina HiSeq 2000 platform, and 100-bp paired-end reads were generated. Image analyses and base calling were performed using the Illumina HiSeq Control Software and Real-Time Analysis component. Demultiplexing was performed using CASAVA 1.8.2. The quality of the raw data was assessed using FastQC (Babraham Institute) and the Illumina software SAV (Sequencing Analysis Viewer). Raw sequencing reads from this study were exported in the FASTQ format and were deposited at the NCBI Short Read Archive (SRA) with the accession number SRP046074; aligned BAM files are available on request.

Identification of DEGs after mapping and annotating the reads from the 12 Gpg fly gut RNA-seq libraries on a panel of insect and non-insect genome databases, including Gmm

The RNA-seq reads that satisfied the quality control (i.e., removal of ambiguous nucleotides, low-quality sequences with quality scores <20, and sequences <15 bp in length) were mapped on the G. m. morsitans genome (13,807 scaffolds; International Glossina Genome Initiative, 2014) from VectorBase (www.vectorbase.org) and GenBank (accession no. CCAG010000000). This was achieved via the splice junction mapper TopHat 2.0.13 (Kim et al., 2013) using Bowtie 2.1.0 (Langmead and Salzberg, 2012), to align RNA-seq reads to the Glossina morsitans genome (GmorY1 assembly, release date: January 2014). Final read alignments with more than 12 mismatches were discarded.
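The read-level quality control described above can be sketched as a simple filter. This is a minimal illustration, not the pipeline actually used: the function name `passes_qc` is hypothetical, and reading "low-quality" as a mean Phred score below 20 is an assumption, since the text does not say whether the threshold applies per base or per read.

```python
from typing import Sequence

MIN_QUALITY = 20  # quality-score threshold stated in the text
MIN_LENGTH = 15   # minimum read length (bp) stated in the text


def passes_qc(seq: str, quals: Sequence[int]) -> bool:
    """Return True if a read survives the three filters described above."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality arrays differ in length")
    if len(seq) < MIN_LENGTH:
        return False  # read is too short
    if any(base not in "ACGT" for base in seq.upper()):
        return False  # ambiguous nucleotide (e.g., N)
    # Assumption: "low quality" is read as a mean Phred score below 20.
    return sum(quals) / len(quals) >= MIN_QUALITY
```

A read is kept only if all three checks pass; a per-base trimming strategy would be an equally plausible reading of the same sentence.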

Gene counting (number of reads aligned to each gene) was performed before statistical analysis using HTSeq-count 0.5.3p9 (union mode; Anders et al., 2014). Genes with <10 reads (summed over all analyzed samples) were filtered out. We used the Bioconductor (Gentleman et al., 2004) software package edgeR 3.6.7 (Robinson et al., 2010) to identify genes whose expression profile was modified as a result of fly infection by trypanosomes. Data were normalized using upper-quartile normalization factors (Bullard et al., 2010). Genes with an adjusted p < 0.05, according to the False Discovery Rate (FDR) method of Benjamini and Hochberg (1995), were declared differentially expressed.
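Two of the steps in this paragraph, the low-count filter and the Benjamini-Hochberg adjustment, are simple enough to sketch directly. The code below is an illustrative re-implementation under stated assumptions, not the edgeR code path: the function names are hypothetical, and the normalization and statistical testing performed by edgeR are deliberately left out.

```python
from typing import Dict, List


def filter_low_counts(counts: Dict[str, List[int]],
                      min_total: int = 10) -> Dict[str, List[int]]:
    """Keep genes whose read count, summed over all samples, is >= min_total."""
    return {gene: c for gene, c in counts.items() if sum(c) >= min_total}


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With this sketch, a gene would be declared differentially expressed when its adjusted p-value falls below 0.05.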

Bioinformatics-based approaches to identify DEGs in both Gmm and Gpg once these flies are subverted as T. brucei spp. hosts

Tsetse fly gene orthologs were tentatively identified using BLAST searches (Mount, 2007) with annotation against the NCBI non-redundant (Nr) sequence database, using an E-value cut-off of 10⁻⁵ (E < 0.00001), according to the best hits against known sequences. This was performed to retrieve orthologous genes with the highest sequence similarity to the given unigenes along with putative functional annotations. The official gene symbols of tsetse fly gene orthologs were used for functional annotation. Along with Nr annotations, the “Database for Annotation, Visualization and Integrated Discovery” (DAVID; Dennis et al., 2003) was used to obtain GO annotations of unigenes. The KEGG pathway annotations of tsetse fly gene orthologs were performed using the BLASTX software against the KEGG database (Wixon and Kell, 2000).
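The best-hit selection with an E-value cut-off can be illustrated as follows. This is a sketch under assumptions: the `Hit` record and `best_hits` function are hypothetical names, the hits are assumed to have already been parsed from the BLAST output, and breaking ties by bit score is an illustrative choice that the text does not specify.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    query: str      # unigene identifier
    subject: str    # matched database sequence
    evalue: float
    bitscore: float


def best_hits(hits: Iterable[Hit], max_evalue: float = 1e-5) -> Dict[str, Hit]:
    """Keep, for each query, the hit with the lowest E-value below the cut-off."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.evalue >= max_evalue:
            continue  # the text requires E < 0.00001
        current = best.get(hit.query)
        # Assumption: ties on E-value are broken by the higher bit score.
        if current is None or (hit.evalue, -hit.bitscore) < (current.evalue, -current.bitscore):
            best[hit.query] = hit
    return best
```

Queries with no hit below the cut-off are simply absent from the result, which corresponds to unigenes left without an ortholog assignment.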

The two annotation processes of the Gpg DEGs were analyzed by comparing the list of “best hits” resulting from the Gpg DEG annotation on the Gmm database with the list resulting from the Gpg DEG annotation previously performed on a set of other databases (Ceratitis capitata, Drosophila melanogaster, D. willistoni, D. virilis, D. mojavensis, Acyrthosiphon pisum, Hydra magnipapillata, Anopheles sp., Bombyx sp., Aedes sp., and Glossina morsitans; Hamidou Soumana et al., 2015). The first step consisted of pooling the DEGs identified at the three experimental times (3, 10, and 20 days) and removing the duplicates, so that each recorded DEG was counted only once. The second step consisted of removing the DEGs whose annotation (best hit) resulted in “hypothetical” or “uncharacterized” proteins, as well as those identified only by a numerical identifier, in order to consider only identified and named proteins. Finally, the names of the proteins (best hits) were standardized and sorted alphabetically. This process was performed separately for the DEGs annotated with reference to the Gmm database and for those previously annotated on the above-mentioned set of other databases. The two final listings were then combined (Microsoft Excel software), and their content was arranged in alphabetical order of protein names. This procedure facilitated the detection of the best hits that are common to both annotation processes and of their corresponding genes.
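The list-comparison procedure above was carried out in a spreadsheet; the same logic can be sketched in a few lines of code. The function names, the lower-casing used to "standardize" names, and the pattern used to recognize bare identifiers are all assumptions made for illustration.

```python
import re
from typing import Iterable, List, Set

UNNAMED_TAGS = ("hypothetical", "uncharacterized")
# Assumption: a bare identifier is an optional letter prefix plus >= 5 digits.
BARE_ID = re.compile(r"[a-z]*\d{5,}")


def standardize(name: str) -> str:
    """Collapse whitespace and lower-case a best-hit protein name."""
    return re.sub(r"\s+", " ", name).strip().lower()


def named_hits(*time_point_lists: Iterable[str]) -> Set[str]:
    """Pool best hits from all time points, deduplicate, drop unnamed entries."""
    names: Set[str] = set()
    for hits in time_point_lists:
        for raw in hits:
            name = standardize(raw)
            if not name or any(tag in name for tag in UNNAMED_TAGS):
                continue
            if BARE_ID.fullmatch(name):
                continue
            names.add(name)
    return names


def common_hits(gmm_names: Set[str], other_names: Set[str]) -> List[str]:
    """Alphabetical list of best hits shared by both annotation processes."""
    return sorted(gmm_names & other_names)
```

For example, `common_hits(named_hits(day3, day10, day20), named_hits(previous))` would return the named proteins found by both annotations, where the four input lists are hypothetical placeholders for the actual best-hit columns.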

Results

Mapping of PolyA+ mRNA

A total of 459,555,846 clusters were generated from the 12 RNA-seq libraries. Quality controls were performed to ensure the reliability of the libraries after removal of ambiguous nucleotides, low-quality sequences (quality scores < 20), and sequences <15 bp in length. Finally, 436,979,101 clean clusters were obtained (Table 1). Clean reads had Phred-like quality scores at the Q20 level (i.e., a sequencing error probability of 0.01). These clean sequenced reads with no strand-specificity were mapped to the Gmm reference genome using TopHat (with Bowtie 2) software in order to identify exon-exon splice junctions and to ensure enough sensitivity in mapping reads with polymorphisms.
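The correspondence between a Phred quality score and the sequencing error probability quoted above follows the standard definition Q = -10 log10(P). A minimal sketch (function names are illustrative only):

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Error probability P for a Phred score Q, using P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Phred score Q for an error probability P, using Q = -10 * log10(P)."""
    return -10 * math.log10(p)
```

Under this definition, Q20 corresponds to one expected error per 100 bases, which matches the 0.01 probability given in the text.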

Table 1

| Samples | Number of crude clusters (CC) | Number of clusters after filtering (CAF) | % CAF/CC |
|---|---|---|---|
| NS 3-day sample^a | 36,002,596 | 34,386,734 | 95.51 |
| NS 3-day sample^b | 41,153,580 | 39,330,015 | 95.57 |
| S 3-day sample | 32,726,727 | 31,257,269 | 95.51 |
| S 3-day sample | 33,386,646 | 31,848,385 | 95.39 |
| NI 10-day sample | 33,159,650 | 31,593,962 | 95.28 |
| NI 10-day sample | 30,632,671 | 29,185,036 | 95.27 |
| I 10-day sample | 42,223,049 | 40,108,756 | 94.99 |
| I 10-day sample | 43,418,918 | 41,279,341 | 95.07 |
| NI 20-day sample | 41,882,170 | 39,688,764 | 94.76 |
| NI 20-day sample | 38,192,692 | 36,205,087 | 94.80 |
| I 20-day sample | 40,587,354 | 38,401,915 | 94.62 |
| I 20-day sample | 46,189,793 | 43,693,837 | 94.60 |
| Total | 459,555,846 | 436,979,101 | |
| Mean | 38,296,320 | 36,414,925 | 95.08 |

Assembly quality of Gpg libraries at the three different sampling times.

The superscripts a and b denote the two replicates of the “non-stimulated samples” at day 3. The same applies to the other sampling conditions. S, stimulated; NS, non-stimulated; NI, non-infected; I, infected.
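The totals and the percentage of clusters retained after filtering in Table 1 can be re-derived from the per-library counts. The short sketch below copies the counts from the table; the variable and function names are illustrative.

```python
# (crude clusters, clusters after filtering) for the 12 libraries in Table 1
TABLE_1 = [
    (36_002_596, 34_386_734), (41_153_580, 39_330_015),  # NS, day 3
    (32_726_727, 31_257_269), (33_386_646, 31_848_385),  # S, day 3
    (33_159_650, 31_593_962), (30_632_671, 29_185_036),  # NI, day 10
    (42_223_049, 40_108_756), (43_418_918, 41_279_341),  # I, day 10
    (41_882_170, 39_688_764), (38_192_692, 36_205_087),  # NI, day 20
    (40_587_354, 38_401_915), (46_189_793, 43_693_837),  # I, day 20
]


def percent_retained(crude: int, filtered: int) -> float:
    """Percentage of clusters kept after filtering (% CAF/CC)."""
    return round(100 * filtered / crude, 2)


total_crude = sum(crude for crude, _ in TABLE_1)
total_filtered = sum(filtered for _, filtered in TABLE_1)
```

Summing the two columns reproduces the totals reported in the table and in the text.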

Filtering out genes with <10 mapped reads left 8,286 (stimulated vs. non-stimulated flies; 3 days), 8,032 (infected vs. refractory flies; 10 days), and 8,101 (infected vs. refractory flies; 20 days) Gpg genes mapped on the Gmm reference genome (International Glossina Genome Initiative, 2014). Differential expression (DE) analyses were then performed using HTSeq and the Bioconductor package edgeR (http://www.bioconductor.org/), which runs in the R statistical programming language and is widely accepted for modeling the inherent variation between biological replicates. Figure 2 presents the log2 fold-change (stimulated vs. non-stimulated flies at day 3 post-infected blood meal) against the log2 of the read concentration (log-counts-per-million) for each gene after normalization. The generated cloud shows a log fold-change centered on 0 (ordinate axis), signifying that the libraries are properly normalized. Genes that are differentially expressed between the S and NS samples (p < 0.05) are represented in red. Similar results were obtained for the other experimental conditions.
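The two coordinates plotted for each gene in this kind of figure can be approximated as follows. This is a simplified sketch, not the edgeR computation: edgeR derives its log-counts-per-million and log fold-changes with its own normalization factors and prior counts, whereas the function below uses raw library sizes and a fixed pseudo-count (`prior`, an assumed value) purely for illustration.

```python
import math
from typing import Tuple


def cpm(count: float, library_size: float) -> float:
    """Counts per million for one gene in one library."""
    return count / library_size * 1e6


def ma_point(count_ref: float, count_test: float,
             lib_ref: float, lib_test: float,
             prior: float = 0.5) -> Tuple[float, float]:
    """Return (average log2 CPM, log2 fold-change test vs. reference)."""
    cpm_ref = cpm(count_ref + prior, lib_ref)
    cpm_test = cpm(count_test + prior, lib_test)
    log_fc = math.log2(cpm_test / cpm_ref)
    avg_log_cpm = 0.5 * (math.log2(cpm_ref) + math.log2(cpm_test))
    return avg_log_cpm, log_fc
```

A gene with identical normalized abundance in both conditions lands on the horizontal line at 0, which is why a cloud centered on 0 is taken as a sign of proper normalization.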

Figure 2

Identification of DEGs and functional annotation

The edgeR method identified a total of 284, 139, and 59 Gmm genes corresponding respectively to the Gpg DEG samples S3 vs. NS3 (Supplementary Table S1), I10 vs. NI10 (Supplementary Table S2), and I20 vs. NI20 (Supplementary Table S3), at p < 0.05. Most of these genes were overexpressed regardless of the experimental condition: 229 out of 284 genes (80.6%) in S3 vs. NS3 samples, 119 out of 139 genes (85.6%) in I10 vs. NI10 samples, and 37 out of 59 genes (62.7%) in I20 vs. NI20 samples. Furthermore, a substantial number of DEGs were highly overexpressed (log2 FC > 2) or underexpressed (log2 FC < –2): 97 out of 284 DEGs (34.1%; S3 vs. NS3), 60 out of 139 DEGs (43.1%; I10 vs. NI10), and 19 out of 59 DEGs (32.2%; I20 vs. NI20). These data are summarized in Table 2. Genes exhibiting a highly differential overexpression or underexpression under the different experimental conditions (i.e., S3 vs. NS3, I10 vs. NI10, and I20 vs. NI20) are grouped together in Table 3. Most DEGs encode a wide range of proteases, although 91 DEGs presented in Supplementary Tables S1–S3 could not be properly annotated (i.e., best hit description = “hypothetical”), signifying that the panel of databases used for the annotation process should be enlarged or that the genes may be specific to the Gmm genome. In addition, several of the DEGs were very highly overexpressed. For example, GMOY009756, which encodes a trypsin, had a log2 FC of 7.14 in S3 vs. NS3 samples, and GMOY002278, which encodes the proteinase inhibitor I2, had a log2 FC of 9.47 in I10 vs. NI10 samples. In contrast, some DEGs were underexpressed: GMOY005345, which encodes an aspartic peptidase, had a log2 FC of –6.51 in I20 vs. NI20 samples. Table 3 is presented so as to facilitate comparison of differential expression levels for a given gene across the three sampling times. For instance, the levels (in log2 FC) of GMOY005345 are 3.39 (S3 vs. NS3), 2.70 (I10 vs. NI10), and –6.51 (I20 vs. NI20).
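The counts and percentages reported in this paragraph amount to a simple tally over the log2 fold-changes of the DEGs in each comparison. A minimal sketch of that tally, with a hypothetical function name and a toy input rather than the actual supplementary data:

```python
from typing import Dict, List, Union


def summarize_degs(log2_fcs: List[float],
                   threshold: float = 2.0) -> Dict[str, Union[int, float]]:
    """Tally overexpressed and strongly regulated DEGs from log2 fold-changes."""
    total = len(log2_fcs)
    over = sum(1 for fc in log2_fcs if fc > 0)
    strong = sum(1 for fc in log2_fcs if abs(fc) > threshold)
    return {
        "total": total,
        "overexpressed": over,
        "pct_overexpressed": round(100 * over / total, 1),
        "strong": strong,  # log2 FC > threshold or < -threshold
        "pct_strong": round(100 * strong / total, 1),
    }
```

Applied to the 284 DEGs of the S3 vs. NS3 comparison, such a tally would yield the 229 overexpressed genes (80.6%) quoted above. Note that a log2 FC of 2 corresponds to a four-fold change on the linear scale.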

Table 2

| Experimental conditions | Number of identified genes | Significantly DE genes: overall | Significantly DE genes: overexpressed | Fold-change: log2 FC > 2 or log2 FC < –2 | Fold-change: log2 FC > 3 or log2 FC < –3 |
|---|---|---|---|---|---|
| S vs. NS (3 days) | 8,286 | 284 | 229 (80.6%) | 97 (34.1%) | 44 (15.5%) |
| I vs. NI (10 days) | 8,032 | 139 | 119 (85.6%) | 60 (43.1%) | 35 (25.2%) |
| I vs. NI (20 days) | 8,101 | 59 | 37 (62.7%) | 19 (32.2%) | 6 (10.2%) |

Number of differentially expressed genes in Gpg.

S, stimulated; NS, non-stimulated; NI, non-infected; I: infected.

Table 3

| Genes | Fold-change (log2 FC) | Encoded proteins (best hits) | GO: Biological process | GO: Molecular function | GO: Cellular component |
|---|---|---|---|---|---|
| PROTEASES AND PROTEASE INHIBITORS | | | | | |
| GMOY005345 | 3.39 | Aspartic peptidase | GO:0006508 proteolysis | GO:0004190 aspartic-type endopeptidase activity | No terms assigned |
| GMOY005345 | 2.70 | Aspartic peptidase | GO:0006508 proteolysis | GO:0004190 aspartic-type endopeptidase activity | No terms assigned |
| GMOY005345 | –6.51 | Aspartic peptidase | GO:0006508 proteolysis | GO:0004190 aspartic-type endopeptidase activity | No terms assigned |
| GMOY007305 | 2.09 | Destabilase | No terms assigned | GO:0003796 lysozyme activity | No terms assigned |
| GMOY007305 | 3.00 | Destabilase | No terms assigned | GO:0003796 lysozyme activity | No terms assigned |
| GMOY000103 | 2.64 | Fat body c-type lysozyme | No terms assigned | No terms assigned | No terms assigned |
| GMOY000103 | 2.74 | Fat body c-type lysozyme | No terms assigned | No terms assigned | No terms assigned |
| GMOY002036 | –2.75 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY003273 | 2.30 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY003994 | 4.19 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY006266 | 3.72 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY008964 | 3.32 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY008965 | 3.52 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY008966 | 3.35 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY008966 | 4.41 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY009436 | 2.11 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY009757 | 2.76 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY010768 | 2.73 | Peptidase S1A, chymotrypsin-type | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY010768 | 2.06 | Peptidase S1 | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY002729 | 3.01 | Serine protease 1 | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY000672 | 6.95 | Serine protease 6 | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY000672 | 6.78 | Serine protease 6 | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY009756 | 7.14 | Trypsin | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY009756 | 3.48 | Trypsin | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY008967 | 2.55 | Trypsin | GO:0006508 proteolysis | GO:0004252 serine-type endopeptidase activity | No terms assigned |
| GMOY008967 | 3.37 | Trypsin-like cysteine/serine peptid. domain | No terms assigned | GO:0003824 catalytic activity | No terms assigned |
| GMOY010488 | 6.83 | Immune reactive putative protease inhibitor | No terms assigned | No terms assigned | No terms assigned |
| GMOY010488 | 4.69 | Immune reactive putative protease inhibitor | No terms assigned | No terms assigned | No terms assigned |
| GMOY002277 | 2.31 | Proteinase inhibitor I2, Kunitz metazoa | No terms assigned | GO:0004867 serine-type endopeptidase inhibit. activ. | No terms assigned |
| GMOY002277 | 4.10 | Proteinase inhibitor I2, Kunitz metazoa | No terms assigned | GO:0004867 serine-type endopeptidase inhibit. activ. | No terms assigned |
| GMOY002278 | 6.54 | Proteinase inhibitor I2, Kunitz metazoa | No terms assigned | GO:0004867 serine-type endopeptidase inhibit. activ. | No terms assigned |
| GMOY002278 | 9.47 | Proteinase inhibitor I2, Kunitz metazoa | No terms assigned | GO:0004867 serine-type endopeptidase inhibit. activ. | No terms assigned |
| GMOY008344 | 2.94 | Trypsin Inhibitor-like | No terms assigned | No terms assigned | No terms assigned |
| GMOY008344 | 4.38 | Trypsin Inhibitor-like | No terms assigned | No terms assigned | No terms assigned |
| ESTERASES-HYDROLASES | | | | | |
| GMOY000067 | 3.35 | Alkaline phosphatase | GO:0008152 metabolic process | GO:0016791 phosphatase activity | No terms assigned |
| GMOY000067 | 3.60 | Alkaline phosphatase | GO:0008152 metabolic process | GO:0003824 catalytic activity | No terms assigned |
| GMOY004731 | 2.06 | Alkaline phosphatase-like, alpha/beta/alpha | GO:0008152 metabolic process | GO:0003824 catalytic activity | No terms assigned |
| GMOY006875 | –2.02 | Alkaline phosphatase-like, alpha/beta/alpha | GO:0008152 metabolic process | GO:0003824 catalytic activity | No terms assigned |
| GMOY004236 | 2.36 | Acylphosphatase-like | No terms assigned | GO:0003998 acylphosphatase activity | No terms assigned |
| GMOY006958 | 2.60 | Carboxylesterase | No terms assigned | No terms assigned | No terms assigned |
| GMOY011249 | 2.83 | Carboxylesterase | No terms assigned | No terms assigned | No terms assigned |
| GMOY012368 | 2.53 | Exonuclease | No terms assigned | No terms assigned | No terms assigned |
| GMOY007402 | 3.85 | Extracellular Endonuclease, subunit A | No terms assigned | GO:0016787 hydrolase activity | No terms assigned |
| GMOY012360 | –2.93 | Extracellular Endonuclease, subunit A | No terms assigned | GO:0016787 hydrolase activity | No terms assigned |
| GMOY009375 | 7.56 | Glycoside hydrolase | GO:0005975 carbohyd. metabolic process | GO:0003824 catalytic activity | No terms assigned |
| GMOY012361 | –2.55 | Tsal2 protein precursor | No terms assigned | GO:0016787 hydrolase activity | No terms assigned |
| GMOY004309 | 2.44 | Thiolase-like | GO:0008152 metabolic process | GO:0003824 catalytic activity | No terms assigned |
| GMOY007148 | 2.10 | Thiolase-like | GO:0008152 metabolic process | GO:0003824 catalytic activity | No terms assigned |
| BINDING | | | | | |
| GMOY010194 | 4.12 | Araucan | No terms assigned | GO:0003677 DNA binding | No terms assigned |
| GMOY009525 | 7.53 | Armadillo-type fold | No terms assigned | GO:0005488 binding | No terms assigned |
| GMOY009611 | 2.60 | Barrier-to-autointegration factor, BAF | No terms assigned | GO:0003677 DNA binding | No terms assigned |
| GMOY009394 | –5.56 | Basic-leucine zipper domain | GO:0006355 regulation of transcription | GO:0003700 sequence-specific | No terms assigned |
| GMOY010195 | 5.99 | Caupolican | GO:0006355 regul. of transcrip, DNA-templated | GO:0003677 DNA binding | GO:0005634 nucleus |
| GMOY002708 | 7.89 | Chitin binding | GO:0006030 chitin metabolic process | GO:0008061 chitin binding | GO:0005576 extracel |
| GMOY005278 | 2.93 | Chitin binding domain | GO:0006030 chitin metabolic process | GO:0008061 chitin binding | GO:0005576 extracel |
| GMOY003840 | 4.12 | Chitin binding domain | GO:0006030 chitin metabolic process | GO:0008061 chitin binding | GO:0005576 extracel |
| GMOY011054 | 6.08 | Chitin binding domain | GO:0006030 chitin metabolic process | GO:0008061 chitin binding | GO:0005576 extracel |
| GMOY011810 | 6.55 | Chitin binding domain | GO:0006030 chitin metabolic process | GO:0008061 chitin binding | GO:0005576 extracel |
| GMOY011809 | 8.08 | Pro1 (Chitin related) | GO:0006030 chitin metabolic process | GO:0008061 chitin binding | GO:0005576 extracel |
| GMOY004647 | 4.24 | Cupredoxin | No terms assigned | GO:0005507 copper ion binding | No terms assigned |
| GMOY004364 | 2.90 | Haemolymph juvenile hormone binding | No terms assigned | No terms assigned | No terms assigned |
| GMOY005487 | 3.48 | Lim3 | No terms assigned | GO:0008270 zinc ion binding | No terms assigned |
| GMOY007084 | 2.39 | NAD(P)-binding domain | No terms assigned | No terms assigned | No terms assigned |
| GMOY002356 | 2.31 | Nucleotide-binding | No terms assigned | GO:0000166 nucleotide binding | No terms assigned |
| GMOY002825 | 4.47 | Odorant binding protein 2 | No terms assigned | GO:0005549 odorant binding | No terms assigned |
| GMOY002825 | 2.03 | Odorant binding protein 2 | No terms assigned | No terms assigned | No terms assigned |
| GMOY005548 | 2.99 | Odorant binding protein 7 | No terms assigned | No terms assigned | No terms assigned |
| GMOY001476 | 2.20 | Odorant binding protein 22 | No terms assigned | GO:0005549 odorant binding | No terms assigned |
| GMOY008769 | 4.95 | Small GTPase | GO:0007165 signal transduction | GO:0005525 GTP binding | GO:0016020 membrane |
| GMOY004228 | 5.44 | Transferrin family | GO:0006879 cellular iron ion homeostasis | GO:0008199 ferric iron binding | GO:0005576 extracel |
| GMOY004228 | 2.63 | Transferrin family, iron binding site | GO:0006879 cellular iron ion homeostasis | GO:0008199 ferric iron binding | GO:0005576 extracel |
| GMOY008315 | 2.05 | Winged helix-turn-helix DNA-binding domain | GO:0006355 regul. of transcrip, DNA-templated | GO:0043565 sequence-specific DNA binding transcription factor activity | No terms assigned |
| TRANSPORT/TRANSFERASE ACTIVITY | | | | | |
| GMOY004684 | –2.39 | Cellul. retinaldehyde binding/a-tocopherol transport | GO:0006810 transport | GO:0005215 transporter activity | GO:0005622 intracel |
| GMOY008601 | 2.58 | Fatty acid synthase 3 | GO:0008152 metabolic process | GO:0016740 transferase activity | No terms assigned |
| GMOY008601 | 4.11 | Fatty acid synthase 3 | GO:0008152 metabolic process | GO:0016740 transferase activity | No terms assigned |
| GMOY008602 | 2.02 | Fatty acid synthase 4 | GO:0008152 metabolic process | GO:0016740 transferase activity | No terms assigned |
| GMOY008602 | –2.55 | Fatty acid synthase 4 | GO:0008152 metabolic process | GO:0016740 transferase activity | No terms assigned |
| GMOY005442 | 2.35 | Lipid transport protein | GO:0006869 lipid transport | GO:0005319 lipid transporter activity | No terms assigned |
| GMOY005442 | 2.40 | Lipid transport protein | GO:0006869 lipid transport | GO:0005319 lipid transporter activity | No terms assigned |
| GMOY003490 | 4.50 | Major Facilitator Superfamily transporter | GO:0055085 transmembrane transport | No terms assigned | GO:0016021 integral |
| GMOY003491 | 3.97 | Major Facilitator Superfamily transporter | GO:0055085 transmembrane transport | No terms assigned | GO:0016021 integral |
| GMOY005103 | 2.77 | Major Facilitator Superfamily transporter | GO:0055085 transmembrane transport | No terms assigned | GO:0016021 integral |
| GMOY007627 | 2.09 | Major Facilitator Superfamily transporter | GO:0055085 transmembrane transport | No terms assigned | GO:0016021 integral |
| GMOY005102 | 6.28 | N-acetylgalactosaminyltransferase | GO:0008152 metabolic process | No terms assigned | No terms assigned |
| GMOY011877 | 2.37 | Na+ channel, amiloride-sensitive | GO:0006814 sodium ion transport | GO:0005272 sodium channel activity | GO:0016020 membrane |
| GMOY009903 | 2.75 | Neurotransmitter-gated ion-channel | GO:0006811 ion transport | No terms assigned | GO:0016021 integral |
| GMOY005934 | 2.72 | Pyridoxal phosphate-dependent transferase | No terms assigned | GO:0003824 catalytic activity | No terms assigned |
| GMOY009343 | 8.14 | Sodium:neurotransmitter symporter | GO:0006836 neurotransmitter transport | GO:0005328 neurotransmitter:Na symporter act | GO:0016021 integral |
| GMOY009343 | 6.43 | Sodium:neurotransmitter symporter | GO:0006836 neurotransmitter transport | GO:0005328 neurotransmitter:Na symporter act | GO:0016021 integral |
| GMOY009386 | 2.44 | Sodium:neurotransmitter symporter | GO:0006836 neurotransmitter transport | GO:0005328 neurotransmitter:Na symporter act | GO:0016021 integral |
| GMOY002486 | 2.59 | Two pore domain K channel, TASK family | GO:0071805 K ion transmemb, transport | GO:0005267 potassium channel activity | GO:0016020 membrane |
| GMOY012088 | 4.05 | Tyrosine aminotransferase | GO:0009072 aromatic amino acid family metabolic process | GO:0004838 L-tyrosine:2-oxoglutarate aminotransferase activity | No terms assigned |
| OXIDO-REDUCTION PROCESS | | | | | |
| GMOY001939 | 2.50 | Cytochrome P450-4g1 | GO:0055114 oxidation-reduction process | GO:0016705 oxidoreductase activity | No terms assigned |
| GMOY002598 | 2.15 | Cytochrome P450 | GO:0055114 oxidation-reduction process | GO:0016705 oxidoreductase activity | No terms assigned |
| GMOY006475 | 2.28 | Cytochrome P450-4g1 | GO:0055114 oxidation-reduction process | GO:0016705 oxidoreductase activity | No terms assigned |
| GMOY006761 | 2.42 | Cytochrome P450-4g1 | GO:0055114 oxidation-reduction process | GO:0016705 oxidoreductase activity | No terms assigned |
| GMOY006761 | –2.08 | Cytochrome P450-4g1 | GO:0055114 oxidation-reduction process | GO:0016705 oxidoreductase activity | No terms assigned |
| GMOY007181 | 3.49 | Cytochrome P450 | GO:0055114 oxidation-reduction process | GO:0016705 oxidoreductase activity | No terms assigned |
| GMOY007652 | 3.43 | Cytochrome P450 | GO:0055114 oxidation-reduction process | GO:0016705 oxidoreductase activity | No terms assigned |
| GMOY009767 | 3.96 | Cytochrome P450 | GO:0055114 oxidation-reduction process | GO:0016705 oxidoreductase activity | No terms assigned |
| GMOY009909 | 3.35 | Cytochrome P450 | GO:0055114 oxidation-reduction process | GO:0016705 oxidoreductase activity | No terms assigned |
| GMOY007529 | 4.49 | Dehydrogenase/reductase | GO:0008152 metabolic process | GO:0016491 oxidoreductase activity | No terms assigned |
| GMOY004332 | 2.12 | Fatty acyl-CoA reductase | No terms assigned | GO:0080019 fatty-acyl-CoA reductase activity | No terms assigned |
| GMOY007497 | 6.25 | NADH-cytochrome b-5 reductase 2 | GO:0055114 oxidation-reduction process | GO:0016491 oxidoreductase activity | No terms assigned |
| GMOY010446 | 2.40 | 2-oxoglutarate dioxygenase | GO:0055114 oxidation-reduction process | GO:0050353 trimethyllysine dioxygenase activity | No terms assigned |
| HYPOTHETICAL | | | | | |
| GMOY000215 | 4.17 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY000215 | 5.24 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY000257 | 2.90 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY001239 | 2.53 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY002434 | 3.22 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY002933 | 2.07 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY002986 | 2.69 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY003011 | 2.30 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY003030 | 4.38 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY003034 | –2.10 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY003158 | 2.70 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY003197 | 2.04 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY003830 | 2.75 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY003830 | 3.68 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY003974 | 2.28 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
| GMOY003976 | 3.89 | Hypothetical | No terms assigned | No terms assigned | No terms assigned |
GMOY004022–2.06HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0043375.80HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0043376.61HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0050556.08HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0056066.32HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0057976.60HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0057976.49HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0057983.98HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0057986.25HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0057992.24HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0066714.00HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0062762.33HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0071873.59HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0076374.06HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0080164.65HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0080166.67HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0086273.64HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0095392.28HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0095402.19HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0095412.40HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0099513.11HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0102246.87HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0102243.57HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY010232–2.44HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0120699.07HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0120695.33HypotheticalNo terms assignedNo terms assignedNo terms assigned
GMOY0089562.90hypothetical conserved proteinNo terms assignedNo terms assignedNo terms assigned
MISCELLANEOUS
GMOY0084585.74Actin-related proteinNo terms assignedNo terms assignedNo terms assigned
GMOY0083682.22Adipokinetic hormone recept isoform ANo terms assignedNo terms assignedNo terms assigned
GMOY0041474.32Apolipophorin-III superfamilyNo terms assignedNo terms assignedNo terms assigned
GMOY0115622.29CecropinNo terms assignedNo terms assignedGO:0005576 extracel
GMOY0115622.36CecropinNo terms assignedNo terms assignedGO:0005576 extracel
GMOY0115632.77CecropinNo terms assignedNo terms assignedGO:0005576 extracel
GMOY0108823.02Chemosensory protein 3No terms assignedNo terms assignedNo terms assigned
GMOY0074572.47Cytochrome b561/ferric reduct transmembraneNo terms assignedNo terms assignedGO:0016021 integral
GMOY0033542.48Elongase 9No terms assignedNo terms assignedGO:0016021 integral
GMOY0033546.25Elongase 9No terms assignedNo terms assignedGO:0016021 integral
GMOY0088213.10Elongase 4No terms assignedNo terms assignedGO:0016021 integral
GMOY0092773.72Insect cuticle proteinNo terms assignedGO:0042302 structural constituent of cuticleNo terms assigned
GMOY0038762.08Insect cuticle proteinNo terms assignedGO:0042302 structural constituent of cuticleNo terms assigned
GMOY0112162.40Insect cuticle proteinNo terms assignedGO:0042302 structural constituent of cuticleNo terms assigned
GMOY0022582.03Insulin-likeNo terms assignedGO:0005179 hormone activityGO:0005576 extracel
GMOY0039449.22LIM and senesc, cell antigen-like-protein 1GO:0009987 cellular processGO:0005198 structural molecule activityGO:0043226 organelle
GMOY0119973.02Mammalian NeuroPept, Y like receptorNo terms assignedNo terms assignedGO:0016021 integral
GMOY0120523.40Mammalian NeuroPept, Y like receptorNo terms assignedNo terms assignedGO:0016021 integral
GMOY0097452.02Milk gland protein 1No terms assignedNo terms assignedNo terms assigned
GMOY001342–2.40Milk gland protein 2No terms assignedNo terms assignedNo terms assigned
GMOY0013422.14Milk gland protein 2No terms assignedNo terms assignedNo terms assigned
GMOY0121253.57Milk gland protein 3No terms assignedNo terms assignedNo terms assigned
GMOY0013432.50Milk gland protein 6No terms assignedNo terms assignedNo terms assigned
GMOY012016–4.00Milk gland protein 8No terms assignedNo terms assignedNo terms assigned
GMOY0120162.36Milk gland protein 8No terms assignedNo terms assignedNo terms assigned
GMOY0123692.24Milk gland protein 10No terms assignedNo terms assignedNo terms assigned
GMOY0101602.78Mpv17/PMP22No terms assignedNo terms assignedGO:0016021 integral membrane
GMOY009494–3.95Rhodanese-like domainNo terms assignedNo terms assignedNo terms assigned
GMOY0106752.12Single domain Von Willebrand factor type CNo terms assignedNo terms assignedNo terms assigned
GMOY0070782.52Single domain Von Willebrand factor type CNo terms assignedNo terms assignedNo terms assigned

Annotation on the Glossina morsitans morsitans genome of Gpg genes differentially expressed in response to Tbg infection.

Gpg genes that were previously shown to be differentially expressed (DEGs) 3, 10, and 20 days after being challenged with Tbg were mapped on the Gmm genome and annotated on this reference genome. The table presents Gmm genes that are heterologs of the Gpg DEGs, in addition to the annotation results and the gene ontology. Only highly differentially expressed genes (log2FC < –2 or log2FC > 2) have been considered. Black font: genes that are differentially expressed in stimulated vs. non-stimulated flies (at day 3 after fly challenge). Blue and red fonts: genes differentially expressed at day 10 and day 20 after fly challenge, respectively.
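The cut-off stated in the caption above can be written as a simple filter. The following is a minimal sketch, not the pipeline used in the study: the `Deg` record type and the `is_highly_de` function are hypothetical names introduced for illustration, and the three example values are copied from Table 3 and Table 5.

```python
from typing import NamedTuple


class Deg(NamedTuple):
    """One differentially expressed gene (hypothetical record type)."""
    gene_id: str
    log2_fc: float


def is_highly_de(log2_fc: float, threshold: float = 2.0) -> bool:
    """True when log2FC < -threshold or log2FC > threshold (strict bounds)."""
    return log2_fc < -threshold or log2_fc > threshold


# Values copied from the tables of this article.
records = [
    Deg("GMOY005102", 6.28),
    Deg("GMOY006761", -2.08),
    Deg("GMOY002535", 1.5),
]

kept = [record for record in records if is_highly_de(record.log2_fc)]
for record in kept:
    print(record.gene_id, record.log2_fc)
```

A log2FC of 2 corresponds to a 4-fold change in expression, so the filter keeps only genes whose expression changed more than 4-fold in either direction.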

Table 3 also provides the functional annotation data for each gene at each sampling time. To obtain an overview of the functional groups and categories, we used the GO assignment to classify the functions of the unigenes. According to this process, the genes expressed at high levels were classified into three GO groups (Figure 3) and further subdivided into categories: biological process (14 categories), molecular function (22 categories), and cellular component (6 categories). The category “No terms assigned” was predominant across all GO groups at every investigated time.
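The grouping described here amounts to counting, for each of the three GO groups, how often each term occurs. A minimal sketch of such a tally is given below; it is an illustration only and not the software used to produce Figure 3. The function name `tally_by_group` is hypothetical, and the three rows are copied from Table 3.

```python
from collections import Counter

NO_TERM = "No terms assigned"
GO_GROUPS = ("biological process", "molecular function", "cellular component")

# (gene, biological process, molecular function, cellular component),
# copied from Table 3.
rows = [
    ("GMOY011877", "GO:0006814 sodium ion transport",
     "GO:0005272 sodium channel activity", "GO:0016020 membrane"),
    ("GMOY005934", NO_TERM, "GO:0003824 catalytic activity", NO_TERM),
    ("GMOY000215", NO_TERM, NO_TERM, NO_TERM),
]


def tally_by_group(table_rows):
    """Count the occurrences of each term within each GO group."""
    tallies = {group: Counter() for group in GO_GROUPS}
    for _gene, *terms in table_rows:
        for group, term in zip(GO_GROUPS, terms):
            tallies[group][term] += 1
    return tallies


tallies = tally_by_group(rows)
for group in GO_GROUPS:
    print(group, dict(tallies[group]))
```

Applied to the full table, the same tally would show how often “No terms assigned” occurs in each group.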

Figure 3

Comparing Gpg gene annotation on the Gmm genome and on a previously used panel of genomes

The global and detailed results of this comparative approach are presented in Supplementary Table S4. Table 4, which is a refined list of Supplementary Table S4, focuses on the expression of Gmm genes that are similar to Gpg genes previously identified as differentially expressed in response to Tbg infection. The results indicate that a high number of Gpg DEGs have orthologs in the Gmm genome. Furthermore, a large number of Gpg (22) and Gmm (23) genes encoding serine proteases were identified. Similarly, nine Gpg and nine Gmm genes were identified as encoding chitin binding proteins. Finally, whereas 14 Gmm genes encoding a “Major Facilitator Superfamily transporter” were identified, only one such gene was characterized in Gpg.

Table 4

| Glossina palpalis gambiensis genes | Best hit description / name of the encoded proteins | Glossina morsitans morsitans genes |
|---|---|---|
| GLOS_ARP3.1.1 | Actin-related protein [Drosophila melanogaster] | |
| | Actin-related protein | GMOY008458 |
| GLOS_DVIR_GJ17549.1.1 | Acyltransferase - GJ17549 [Drosophila virilis] | |
| | Acyl-CoA N-acyltransferase | GMOY003123 |
| GLOS_LOC101462532.1.1 | Adenylosuccinate lyase-like [Ceratitis capitata] | |
| | Adenylosuccinase | GMOY002461 |
| GLOS_LOC101450467.1.1 | Alkaline phosphatase-like - membrane-bound [Ceratitis capitata] | |
| | Alkaline phosphatase | GMOY000067 |
| GLOS_LOC101455841.1.3 | Alpha-2-macroglobulin - CD109 antigen-like isoform X5 [C. capitata] | |
| | Alpha-2-macroglobulin | GMOY010996 |
| GLOS_LOC101461571.2.2 | Aspartic protease-like (lysosomal) [Ceratitis capitata] | |
| | Aspartic peptidase | GMOY005345; GMOY010103 |
| GLOS_DVIR_GJ18228.1.1; GLOS_FDL.1.2 | Beta-hexosaminidase - GJ18228 [Drosophila virilis/D. melanogaster] | |
| | Beta-hexosaminidase domain 2-like | GMOY001794 |
| GLOS_KCC2A.2.2 | Ca2+/calmodulin-dependent protein kinase type II | |
| | Ca2+/calmodulin-dependent protein kinase | GMOY006719 |
| GLOS_CEC.2.2; GLOS_CECC.1.1; GLOS_CG10252.2.2 | Cecropin [G. m. morsitans/D. yakuba/D. melanogaster] | |
| | Cecropin (anti-microbial peptide) | GMOY011562; GMOY011563 |
| GLOS_DANA_GF24496.1.1; GLOS_DANA_GF24494.3.12 | Chitin binding - GF24496 [Drosophila ananassae] | |
| GLOS_DGRI_GH11353.5.6; GLOS_DGRI_GH14440.1.1 | Chitin binding - GH11353 [Drosophila grimshawi] | |
| GLOS_DMOJ_GI10981.2.2; GLOS_DMOJ_GI13574.3.3 | Chitin binding - GI10981 [Drosophila mojavensis] | |
| GLOS_DWIL_GK11657.1.1; GLOS_DWIL_GK13541.1.5 | Chitin binding - GK11657 [Drosophila willistoni] | |
| GLOS_DPER_GL15114.1.3 | Chitin binding - GL15114 [Drosophila persimilis] | |
| | Chitin binding | GMOY002708; GMOY003840; GMOY005251; GMOY005278; GMOY009806; GMOY009807; GMOY011054; GMOY011809; GMOY011810 |
| GLOS_LOC101462140.1.1 | Chitinase 3-like [Ceratitis capitata] | |
| | Chitinase-like protein Idgf5 | GMOY009161 |
| GLOS_CP305.1.2; GLOS_C4AC3.1.1; GLOS_CP6G1.1.1 | Cytochrome P450 305a1 [D. melanogaster] | |
| GLOS_CP9F2.9.9; GLOS_CP6W1.1.1 | Cytochrome P450 9f2 [D. melanogaster] | |
| | Cytochrome P450-4g1 | GMOY001150; GMOY001939; GMOY006475; GMOY006761 |
| | Cytochrome P450 | GMOY002598; GMOY002627; GMOY005461; GMOY007064; GMOY007181; GMOY007270; GMOY007652; GMOY009767; GMOY009909 |
| GLOS_RN181.1.1 | E3 ubiquitin-protein ligase RNF181 homolog | |
| | E3 ubiquitin-protein ligase SINA like | GMOY002938; GMOY008903 |
| GLOS_ELP2.1.1 | Elongator complex protein 2; D. m. GN = Elp2 PE = 1 SV = 1 | |
| | Elongase 4; 9 | GMOY008821; GMOY003354 |
| GLOS_LOC101461359.1.1 | Endoplasmic reticulum metallopeptidase 1-like isoform X1 [C. capitata] | |
| | Endoplasmic reticulum metallopeptidase 1 | GMOY009845; GMOY010241 |
| GLOS_LOC101454791.1.1 | Enkurin-like [Ceratitis capitata] | |
| | Enkurin | GMOY009600 |
| GLOS_LOC101459623.1.5 | Exonuclease 3′-5′ domain-containing protein 2-like [C. capitata] | |
| | Exonuclease | GMOY012368 |
| GLOS_LOC101459395.1.1 | Gamma-glutamyl hydrolase-like [Ceratitis capitata] | |
| | Gamma-glutamyl hydrolase | GMOY000946 |
| GLOS_GSTT1.2.5; GLOS_GST.1.1 | Glutathione S-transferase 1-1 [L. cuprina/Musca domestica] | |
| | Glutathione S-transferase | GMOY002000; GMOY009373 |
| GLOS_LOC101449625.1.1 | Glyoxalase domain-containing protein 4-like isoform X1 [C. capitata] | |
| | Glyoxylase | GMOY008525 |
| GLOS_DVIR_GJ19325.1.1 | G-protein receptor activity - GJ19325 [Drosophila virilis] | |
| | G-protein coupled receptor | GMOY009447 |
| GLOS_DWIL_GK15016.1.2 | Haemolymph juvenile hormone binding - GK15016 [D. willistoni] | |
| | Haemolymph juvenile hormone binding | GMOY004364 |
| GLOS_H12.1.1 | Histone H1.2 [Drosophila virilis] | |
| | Histone H1 | GMOY002746 |
| GLOS_IMDH.2.2 | Inosine-5′-monophosphate dehydrogenase [D. melanogaster] | |
| | Inosine-5′-monophosphate dehydrogenase | GMOY006458 |
| GLOS_LECA.10.13 | Lectin subunit alpha [Sarcophaga peregrina] | |
| | Lectin-like C-type | GMOY001011; GMOY009274 |
| GLOS_LRRX1.1.4 | Leucine-rich repeat-containing protein [D. discoideum] | |
| | Leucine rich repeat containing protein | GMOY010344 |
| GLOS_LSD1.1.1 | Lipid storage droplets surface-binding protein 1 | |
| | Lipid storage droplet-1 | GMOY007510 |
| GLOS_DWIL_GK13707.1.1 | Lipid transporter - GK13707 [Drosophila willistoni] | |
| | Lipid transport protein | GMOY002410; GMOY005442; GMOY005442 |
| GLOS_LOC101449088.1.1 | lysM - peptidoglycan-binding domain-containing protein 1-like [C. capitata] | |
| | LysM domain | GMOY008891 |
| GLOS_DPER_GL12526.1.1 | Major Facilitator Superfamily-type transporter [D. persimilis] | |
| | Major Facilitator Superfamily transporter | GMOY001742; GMOY003428; GMOY003490; GMOY003491; GMOY003839; GMOY004738; GMOY005103; GMOY005106; GMOY005109; GMOY005501; GMOY007545; GMOY007627; GMOY012075; GMOY012352 |
| GLOS_LOC101462556.1.1 | MD-2-related lipid-recognition protein-like [Ceratitis capitata] | |
| | MD-2-related lipid-recognition domain | GMOY006406 |
| GLOS_MYSN.1.1 | Myosin heavy chain, non-muscle [D. melanogaster] | |
| | Myosin heavy chain | GMOY007533; GMOY008852 |
| GLOS_DMOJ_GI24301.1.1 | Neuropeptide Y receptor - GI24301 [Drosophila mojavensis] | |
| | NeuroPeptide Y like receptor / mammalian (Putative) | GMOY011997; GMOY012052 |
| GLOS_DGRI_GH13991.1.1; GLOS_DVIR_GJ10540.1.1 | Odorant binding - GH13991 [Drosophila grimshawi/Drosophila virilis] | |
| GLOS_OB99B.1.1 | Odorant-binding protein 99b | |
| | Odorant binding protein 1; 2; 7 | GMOY000890; GMOY002825; GMOY005548 |
| | Odorant binding protein 21; 22 | GMOY006418; GMOY001476 |
| GLOS_LOC101453268.1.1 | Period circadian protein-like [Ceratitis capitata] | |
| | Period circadian protein | GMOY012110 |
| GLOS_LOC101458811.1.1 | Pyridoxal kinase-like [Ceratitis capitata] | |
| | Pyridoxal phosphate-dependent transferase | GMOY005488; GMOY005934 |
| GLOS_DMOJ_GI20119.1.1; GLOS_DMOJ_GI16517.2.2 | Serine proteases (see details in Supplementary Table S4) | GMOY000672; GMOY002036; GMOY002729; GMOY003271 |
| GLOS_DSEC_GM17695.1.1; GLOS_LOC101456159.4.4 | | GMOY003273; GMOY003280; GMOY003693; GMOY003994 |
| GLOS_DMOJ_GI18413.1.2; GLOS_DANA_GF15448.2.3 | | GMOY006266; GMOY006369; GMOY006991; GMOY008468 |
| GLOS_AAEL_AAEL007969.1.1; GLOS_LOC101461009.2.2 | | GMOY008469; GMOY008958; GMOY008962; GMOY008964 |
| GLOS_EAST.2.2; GLOS_LOC101457953.1.5 | | GMOY008965; GMOY008966; GMOY009418; GMOY009436 |
| GLOS_LOC101462986.1.1; GLOS_DWIL_GK19454.1.1 | | GMOY009757; GMOY010502; GMOY010768 |
| GLOS_LOC101459895.9.9; GLOS_DWIL_GK24139.1.1 | | |
| GLOS_LOC101455430.10.10; GLOS_LOC101455604.4.10 | | |
| GLOS_DMOJ_GI21244.4.5; GLOS_DMOJ_GI24442.1.1 | | |
| GLOS_DVIR_GJ21497.1.3; GLOS_DVIR_GJ22718.8.10 | | |
| GLOS_DVIR_GJ21498.1.1; GLOS_DVIR_GJ21499.1.1 | | |
| GLOS_DMOJ_GI19420.1.1; GLOS_DVIR_GJ17584.1.1 | Serine protease inhibitor (Serpin) GI19420 [D. mojavensis/D. virilis] | |
| GLOS_DANA_GF14653.1.2; GLOS_DERE_GG24413.1.1 | Serine-type endopeptidase inhibitor [D. ananassae/D. erecta] | |
| GLOS_DWIL_GK10999.1.1; GLOS_LOC101459846.1.2 | Serine-type endopeptidase inhibitor / Metalloendopeptidase [D. willistoni] | |
| | Serine proteinase inhibitors (Kazal domain) | GMOY010058 |
| GLOS_DWIL_GK15974.5.7; GLOS_DMOJ_GI22128.1.1 | Single domain von Willebrand factor type C [D. willistoni] | |
| | Single domain Von Willebrand factor type C | GMOY003774; GMOY005237; GMOY007078; GMOY010675 |
| GLOS_DWIL_GK22031.1.1 | Sulfate transmembrane transporter - GK22031 [Drosophila willistoni] | |
| | Sulfate transporter | GMOY000550 |
| GLOS_LOC101448839.1.1; GLOS_LOC101454308.1.2 | Timeless-like isoform X1 protein (circadian rhythm regulation) [C. capitata] | |
| | Timeless protein | GMOY006112 |
| GLOS_TRF.1.1 | Transferrin [Sarcophaga peregrina] | |
| | Transferrin family, iron binding site | GMOY004228 |
| GLOS_LOC101463325.1.1 | Trypsin-like [Ceratitis capitata] | |
| | Trypsin-like cysteine/serine peptidase domain | GMOY002535; GMOY008308 |
| GLOS_DWIL_GK18237.1.1 | UDP-glucuronosyl/UDP-glucosyltransferase - GK18237 [D. willistoni] | |
| | UDP-glucuronosyl/UDP-glucosyltransferase | GMOY007046 |
| GLOS_LOC101451574.1.1 | WD repeat-containing protein 81-like isoform X1 [Ceratitis capitata] | |
| | WD40/YVTN repeat-like-containing domain | GMOY010092 |
| GLOS_LOC101458997.1.1 | Yellow-like protein (protein of the gelly) [Ceratitis capitata] | |
| | yolk protein 3 | GMOY006227 |

Identification of Gmm gene orthologs of Gpg genes on the basis of their expression products.

The table compares the results of the previous Gpg DEG annotation on a panel of various genomes with the corresponding annotation on the Gmm genome.

A comparison is presented between the Gpg and the Gmm genes on the basis of the identified proteins they encode.

Homologies between identified Gmm genes that are heterologous to Gpg DEGs with genes from other organisms

In order to identify genes previously annotated as “uncharacterized” or “hypothetical,” we used the BLASTx program to identify heterologous genes among various organisms listed in the NCBI databases. Homologies with a cut-off of E < 10⁻⁵ and/or displaying the highest hit score were selected; the minimum accepted homology level was 60%. Table 5 presents the results of the recorded annotation, and Figure 4 presents the species whose genomes gave the best match for the genes to be annotated.
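The selection criteria above can be illustrated with a short filter over BLAST tabular output. This is a sketch under stated assumptions, not the procedure actually run for the study: it assumes the hits are available in the standard 12-column tabular format (`-outfmt 6`), it interprets the “homology level” as percent identity, and it applies both cut-offs together before keeping the highest-scoring hit per query. The function name `best_hits` and all example hits are hypothetical.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(tabular_text, max_evalue=1e-5, min_identity=60.0):
    """Return {query: subject} for the top-scoring hit passing both cut-offs."""
    best = {}
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=COLUMNS, delimiter="\t"
    )
    for row in reader:
        # Assumption: "homology level" is read as percent identity.
        if float(row["evalue"]) >= max_evalue:
            continue
        if float(row["pident"]) < min_identity:
            continue
        score = float(row["bitscore"])
        query = row["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (row["sseqid"], score)
    return {query: subject for query, (subject, _score) in best.items()}


# Hypothetical hits, invented purely for illustration.
example = "\n".join("\t".join(fields) for fields in [
    ("queryA", "subject1", "85.0", "120", "18", "0", "1", "360", "1", "120", "1e-30", "200"),
    ("queryA", "subject2", "92.0", "120", "10", "0", "1", "360", "1", "120", "1e-50", "310"),
    ("queryA", "subject3", "55.0", "120", "54", "0", "1", "360", "1", "120", "1e-80", "400"),
    ("queryB", "subject4", "70.0", "60", "18", "0", "1", "180", "1", "60", "1e-3", "50"),
])

hits = best_hits(example)
print(hits)
```

In this example `subject3` is rejected on identity despite its high score, and `queryB` has no hit below the E-value cut-off, so it would remain unannotated.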

Table 5

| Genes | log2 FC | Homology with other organisms (>60%) |
|---|---|---|
| GMOY000550 | -0.90 | Musca domestica sodium-independent sulfate anion transporter-like (LOC101893700), mRNA |
| GMOY001137 | 1.09 | Musca domestica aminomethyltransferase, mitochondrial-like (LOC101895864), mRNA |
| GMOY001742 | -0.9 | Ceratitis capitata synaptic vesicle glycoprotein 2B-like (LOC101452461), transcript variant X3, mRNA |
| GMOY002004 | -1.78 | Musca domestica putative fatty acyl-CoA reductase CG5065-like (LOC101898308), mRNA |
| GMOY002004 | 1.8 | Musca domestica putative fatty acyl-CoA reductase CG5065-like (LOC101898308), mRNA |
| GMOY002024 | 0.96 | Musca domestica phosphotriesterase-related protein-like (LOC101890186), mRNA |
| GMOY002356 | 2.31 | C. capitata CUGBP Elav-like family member 2-like (LOC101455154), transcript variant X1 to X3, mRNA |
| GMOY002461 | 0.82 | Musca domestica adenylosuccinate lyase-like (LOC101900029), mRNA |
| GMOY002486 | 2.6 | Musca domestica potassium channel subfamily K member 9-like (LOC101895107), mRNA |
| GMOY002535 | 1.5 | Ceratitis capitata serine protease easter-like (LOC101451852), mRNA |
| GMOY002729 | 1.91 | Lucilia sericata clone LScDNA1 putative salivary trypsin mRNA, complete cds |
| GMOY002729 | 3.0 | Musca domestica serine proteinase stubble-like (LOC101890358), mRNA |
| GMOY002938 | -1.10 | Musca domestica uncharacterized LOC101893009 (LOC101893009), mRNA |
| GMOY003158 | 2.70 | Musca domestica uncharacterized LOC101895341 (LOC101895341), mRNA |
| GMOY003161 | 1.44 | Musca domestica thyrotropin receptor-like (LOC101887582), mRNA |
| GMOY003354 | 2.48 | Ceratitis capitata elongation of very long chain fatty acids protein AAEL008004-like (LOC101449680), mRNA |
| GMOY003354 | 6.3 | Musca domestica elongation of very long chain fatty acids protein AAEL008004-like (LOC101893043), mRNA |
| GMOY003443 | 1.27 | Musca domestica uncharacterized LOC101889318 (LOC101889318), partial mRNA |
| GMOY003590 | 1.77 | Musca domestica collagen alpha-1(IV) chain-like (LOC101897761), transcript variant X3, mRNA |
| GMOY003590 | 1.3 | Musca domestica collagen alpha-1(IV) chain-like (LOC101897761), transcript variant X3, mRNA and variant X1, X2 |
| GMOY003830 | 2.75 | Homo sapiens BAC clone CH17-465I15 from chromosome unknown, complete sequence (= hypothetical) |
| GMOY003830 | 3.7 | Volvox carteri f. nagariensis mRNA for pherophorin-dz1 protein |
| GMOY003839 | 0.95 | Musca domestica putative inorganic phosphate cotransporter-like (LOC101889974), mRNA |
| GMOY003949 | 1.01 | M. domestica glutamine-fructose-6-phosphate aminotransferase [isomerizing] 2-like (LOC101889985), transcript |
| GMOY004309 | 2.44 | Musca domestica fatty acid synthase-like (LOC101893120), mRNA |
| GMOY004332 | 2.1 | Drosophila willistoni GK20732 (Dwil\GK20732), mRNA / Fatty acyl-CoA reductase |
| GMOY004337 | 5.8 | D. willistoni GK20950 (Dwil\GK20950), mRNA Bardet-Biedl syndrome 4 protein homolog (= hypothetical on Gmm genome) |
| GMOY004337 | 6.6 | Ceratitis capitata Bardet-Biedl syndrome 4 protein homolog (LOC101449311), mRNA (= hypothetical on Gmm genome) |
| GMOY004589 | 1.09 | Musca domestica muscle M-line assembly protein unc-89-like (LOC101890868), mRNA |
| GMOY004712 | 1.65 | Musca domestica acyl-protein thioesterase 1-like (LOC101890399), mRNA |
| GMOY004738 | -0.78 | Musca domestica facilitated trehalose transporter Tret1-like (LOC101891733), transcript variant X1, mRNA |
| GMOY004873 | 1.6 | Musca domestica transmembrane and TPR repeat-containing protein CG4341-like (LOC101893859), mRNA |
| GMOY005102 | 6.28 | Musca domestica N-acetylgalactosaminyltransferase 4-like (LOC101894376), mRNA |
| GMOY005106 | -1.22 | Drosophila willistoni GK13266 (Dwil\GK13266), mRNA / Major facilitator superfamily transporter |
| GMOY005278 | 2.93 | Musca domestica mucin-5AC-like (LOC101899868), mRNA |
| GMOY005278 | 1.9 | Musca domestica mucin-5AC-like (LOC101899868), mRNA |
| GMOY005345 | -6.5 | Musca domestica lysosomal aspartic protease-like (LOC101894831), mRNA |
| GMOY005487 | 3.48 | Musca domestica LIM/homeobox protein Lhx4-like (LOC101900654), mRNA |
| GMOY005488 | 1.81 | Musca domestica alpha-methyldopa hypersensitive protein-like (LOC101888467), mRNA |
| GMOY005527 | 0.87 | Musca domestica c-1-tetrahydrofolate synthase, cytoplasmic-like (LOC101891351), transcript variant X2, mRNA |
| GMOY005606 | 6.3 | Musca domestica leucine-rich repeat-containing protein 15-like (LOC101899894), mRNA (= hypothetical on Gmm genome) |
| GMOY005934 | 2.72 | Ceratitis capitata cysteine sulfinic acid decarboxylase-like (LOC101455610), mRNA |
| GMOY006111 | 1.1 | Drosophila willistoni GK14673 (Dwil\GK14673), mRNA (Gonadal trypsin) |
| GMOY006205 | 1.2 | Ceratitis capitata DNA replication licensing factor Mcm5-like (LOC101458261), mRNA |
| GMOY006406 | 1.69 | Musca domestica ecdysteroid-regulated 16 kDa protein-like (LOC101898283), mRNA |
| GMOY006406 | 1.7 | Musca domestica ecdysteroid-regulated 16 kDa protein-like (LOC101898283), mRNA |
| GMOY006458 | -0.78 | Musca domestica inosine-5′-monophosphate dehydrogenase-like (LOC101895820), mRNA |
| GMOY006671 | 4.0 | Musca domestica uncharacterized LOC101890025 (LOC101890025), mRNA |
| GMOY006761 | 1.99 | Musca domestica cytochrome P450 CYP4G13v2 mRNA, complete cds |
| GMOY006761 | 2.4 | Musca domestica cytochrome P450 CYP4G13v2 mRNA, complete cds |
| GMOY006761 | -2.1 | Musca domestica cytochrome P450 CYP4G13v2 mRNA, complete cds |
| GMOY006875 | -2.0 | Musca domestica membrane-bound alkaline phosphatase-like (LOC101896753), mRNA |
| GMOY006875 | -1.8 | Musca domestica membrane-bound alkaline phosphatase-like (LOC101896753), mRNA |
| GMOY006979 | 1.08 | Musca domestica phosrestin-2-like (LOC101892743), mRNA |
| GMOY007046 | 2.0 | Ceratitis capitata UDP-glucuronosyltransferase 2B13-like (LOC101462823), transcript variant X2, mRNA |
| GMOY007131 | 0.97 | Musca domestica inositol-3-phosphate synthase-like (LOC101889622), mRNA |
| GMOY007148 | 2.10 | Ceratitis capitata fatty acid synthase-like (LOC101463409), mRNA |
| GMOY007497 | 6.3 | Musca domestica NADH-cytochrome b5 reductase 3-like (LOC101897795), transcript variant X2, mRNA |
| GMOY007523 | 1.44 | Musca domestica collagen alpha-1(IV) chain-like (LOC101895032), transcript variant X3, mRNA |
| GMOY007523 | 1.1 | Musca domestica collagen alpha-1(IV) chain-like (LOC101895032), transcript variant X3, mRNA |
| GMOY007560 | -1.0 | C. capitata polypeptide N-acetylgalactosaminyltransferase 2-like (LOC101448408), mRNA (= hypothetical on Gmm genome) |
| GMOY007584 | 1.1 | Ceratitis capitata synaptotagmin-1-like (LOC101450559), mRNA (= hypothetical on Gmm genome) |
| GMOY008017 | 1.88 | Volvox carteri f. nagariensis mRNA for pherophorin-dz1 protein (= hypothetical when mapped on Gmm genome) |
| GMOY008017 | 1.2 | Volvox carteri f. nagariensis mRNA for pherophorin-dz1 protein (= hypothetical when mapped on Gmm genome) |
| GMOY008266 | -1.3 | Drosophila willistoni GK24772 organic anion transporter (Dwil\GK24772), mRNA |
| GMOY008308 | 1.9 | Drosophila melanogaster easter (ea), transcript variant A, mRNA |
| GMOY008458 | 5.74 | Musca domestica actin, indirect flight muscle-like (LOC101895248), mRNA |
| GMOY008525 | 0.97 | Drosophila willistoni GK21885 (Dwil\GK21885), mRNA |
| GMOY008601 | 2.58 | Musca domestica fatty acid synthase-like (LOC101893120), mRNA |
| GMOY008601 | 4.1 | Drosophila willistoni GK12914 (Dwil\GK12914), mRNA |
| GMOY008602 | 2.02 | Drosophila pseudoobscura pseudoobscura GA26263 (Dpse\GA26263), mRNA |
| GMOY008602 | -2.5 | Drosophila willistoni GK12914 (Dwil\GK12914), mRNA |
| GMOY008852 | 0.93 | Musca domestica myosin heavy chain, non-muscle-like (LOC101892851), transcript variant X1 to X3, mRNA |
| GMOY008966 | 4.4 | Loxodonta africana kallikrein-11-like (LOC100667195), mRNA |
| GMOY008973 | 0.80 | Lucilia cuprina alpha esterase (LcaE7) mRNA, implicated in organophosphate resistance, complete cds |
| GMOY009018 | 1.59 | Ceratitis capitata uncharacterized LOC101448539 (LOC101448539), transcript variant |
| GMOY009018 | 1.8 | Musca domestica uncharacterized LOC101899326 (LOC101899326), mRNA |
| GMOY009079 | 1.23 | Musca domestica fatty acid synthase-like (LOC101893120), mRNA |
| GMOY009375 | 7.56 | Musca domestica uncharacterized LOC101900740 (LOC101900740), mRNA |
| GMOY009394 | -5.56 | Musca domestica CCAAT/enhancer-binding protein-like (LOC101898926), mRNA |
| GMOY009447 | -1.22 | Ceratitis capitata calcitonin gene-related peptide type 1 receptor-like (LOC101462563), mRNA |
| GMOY009600 | 1.20 | Musca domestica enkurin-like (LOC101897351), mRNA |
| GMOY009845 | 0.6 | Musca domestica endoplasmic reticulum metallopeptidase 1-like (LOC101898765), transcript variant X3, mRNA |
| GMOY009903 | 2.75 | M. domestica strain rspin nicotinic acetylcholine receptor beta 3 subunit (nAChRbeta3) gene, nAChRbeta3-C allele, complete cds |
| GMOY009983 | 1.7 | Drosophila grimshawi GH17190 (Dgri\GH17190), mRNA |
| GMOY010224 | 6.9 | Musca domestica uncharacterized LOC101889990 (LOC101889990), partial mRNA |
| GMOY010224 | 3.6 | Musca domestica uncharacterized LOC101889990 (LOC101889990), partial mRNA (= hypothetical) |
| GMOY010241 | 0.8 | Musca domestica endoplasmic reticulum metallopeptidase 1-like (LOC101898765), transcript variant X3, mRNA |
| GMOY010481 | -1.53 | Musca domestica protein Wnt-5-like (LOC101892275), mRNA |
| GMOY010972 | 0.73 | Musca domestica phenoloxidase subunit A3-like (LOC101897997), transcript variant X1 and X2, mRNA |
| GMOY011232 | 1.32 | Drosophila melanogaster PAPS synthetase (Papss), transcript variant A, mRNA variant A to H |
| GMOY011418 | 0.87 | Drosophila willistoni GK13980 (Dwil\GK13980), mRNA / glycogen synthase |
| GMOY011618 | -0.9 | Ceratitis capitata putative fatty acyl-CoA reductase CG5065-like (LOC101456246), transcript variant X2, mRNA |
| GMOY012052 | 3.4 | Drosophila pseudoobscura pseudoobscura GA30114 Neuropeptide Y (Dpse\GA30114), mRNA |
| GMOY012069 | 9.1 | Musca domestica uncharacterized LOC101891108 (LOC101891108), mRNA |
| GMOY012069 | 5.3 | Musca domestica uncharacterized LOC101891108 (LOC101891108), mRNA |
| GMOY012075 | 1.56 | Drosophila melanogaster CG31663 (CG31663), transcript variant B, mRNA (Major facilitator superfamily transporter) |
| GMOY012075 | -1.7 | Drosophila willistoni GK15555 (Dwil\GK15555), mRNA (Major facilitator superfamily transporter) |
| GMOY012352 | 0.94 | Ceratitis capitata monocarboxylate transporter 10-like (LOC101448353), transcript variant X1, mRNA |

Gmm gene heterologs of Gpg DEGs matching genes from other organism databases.

Black fonts, day-3 samples; blue fonts, day-10 samples; red fonts, day-20 samples.

Figure 4

Among the 284 Gmm genes heterologous to the Gpg DEGs from day-3 samples, 54 genes showed significant matches with other organisms in the investigated databases. The top homology matches were Drosophila sp. (11.1%), Ceratitis capitata (13%), and Musca domestica (68.5%). The remaining 7.4% of genes matched with either Homo sapiens (1.8%), Lucilia sericata (3.5%), or Volvox carteri (2.1%). Similarly, among the 139 Gmm genes heterologous to the Gpg DEGs from day-10 samples, 33 genes showed significant matches with other organisms. The top homology matches were Drosophila sp. (18.2%), Ceratitis capitata (21.3%), and Musca domestica (51.5%). The remaining 9% of genes matched with either Loxodonta africana (2.9%) or Volvox carteri (6.1%). Finally, among the 59 DEGs from day-20 samples, 12 DEGs displayed significant matches with Musca domestica (58.7%), Ceratitis capitata (8.3%), or Drosophila sp. (33%).
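Percentages of this kind can be obtained by counting the organism named at the start of each best-hit description. The sketch below is only an illustration of that bookkeeping, not the authors' procedure: the function `genus_shares` is a hypothetical name, it groups hits by genus (the first word of the description), and the four descriptions are a small subset copied from Table 5. Abbreviated names such as “C. capitata” would need to be expanded before counting.

```python
from collections import Counter


def genus_shares(descriptions):
    """Percentage of hits per genus, taking the first word as the genus."""
    genera = Counter(text.split()[0] for text in descriptions)
    total = sum(genera.values())
    return {genus: round(100 * count / total, 1) for genus, count in genera.items()}


# A small subset of best-hit descriptions copied from Table 5.
hits = [
    "Musca domestica sodium-independent sulfate anion transporter-like (LOC101893700), mRNA",
    "Musca domestica aminomethyltransferase, mitochondrial-like (LOC101895864), mRNA",
    "Ceratitis capitata synaptic vesicle glycoprotein 2B-like (LOC101452461), transcript variant X3, mRNA",
    "Drosophila melanogaster easter (ea), transcript variant A, mRNA",
]

shares = genus_shares(hits)
print(shares)
```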

Several trends appear when comparing results from the annotation reported in Table 5 with those reported in Supplementary Tables S1–S3 (or in Table 3, regarding the genes for which the differential expression level was log2 FC < –2 or log2 FC > 2). First, many genes were not annotated; second, for genes that were annotated, the fold-change was identical; and finally, several genes that were annotated as “Hypothetical” when mapped on the Gmm genome could be identified when mapped on other databases. This was the case for the genes GMOY003830 (Pherophorin-dz1 protein, when annotated on Volvox carteri), GMOY004337 (Bardet-Biedl syndrome 4 protein homolog, when annotated on D. willistoni or Ceratitis capitata), GMOY005606 (leucine-rich repeat-containing protein 15-like, when annotated on Musca domestica), GMOY007560 (N-acetylgalactosaminyltransferase 2-like, when annotated on C. capitata), GMOY007584 (Synaptotagmin-1-like, when annotated on C. capitata), and GMOY008070 (Pherophorin-dz1 protein, when annotated on Volvox carteri).

Discussion

The chronic and acute forms of sleeping sickness endemic to sub-Saharan Africa are caused by two Trypanosoma sub-species, Tbg and Tbr, which are, respectively, transmitted to their vertebrate hosts by the Glossina species Gpg and Gmm (Aksoy et al., 2014; Beschin et al., 2014). Nevertheless, the biological cycles, vertebrate transmission processes, and pathogenicity development of the two parasites are similar. Recently, in the context of an anti-vector strategy project to fight the disease, we performed a global transcriptomic analysis of Gpg gene expression associated with fly infection by Tbg. More precisely, we attempted to characterize genes that were differentially expressed according to the status of the fly at several sampling times (i.e., non-infected, infected, or self-cured). This included genes that could be involved in the fly's vector competence, and consequently genes that could possibly be manipulated in order to reduce or even suppress this competence.

The similarities between the Tbg and Tbr life cycles prompted us to determine whether the Gmm genome carried genes that could be heterologous to the Gpg DEGs, which could then allow the development of common molecular approaches. Accordingly, the Gpg sequences resulting from the previous RNA-seq de novo assembly (Hamidou Soumana et al., 2015) were mapped on the Gmm genome, the DEGs were characterized, and the corresponding genes were annotated.

When the Gpg sequences were mapped and annotated on a panel of various databases (C. capitata, D. melanogaster, D. willistoni, D. virilis, D. mojavensis, Acyrthosiphon pisum, Hydra magnipapillata, Anopheles sp., Bombyx sp., Aedes sp., and G. morsitans; Hamidou Soumana et al., 2015) we identified 553 (S vs. NS), 52 (I10 vs. NI10), and 143 (I20 vs. NI20) DEGs. In contrast, we identified 284 (S vs. NS), 139 (I10 vs. NI10) and 59 (I20 vs. NI20) DEGs when sequences were mapped and annotated on the G. m. morsitans database (using its whole genome annotated on the Drosophila melanogaster, Aedes aegypti, Anopheles gambiae, Culex quinquefasciatus, and Phlebotomus papatasi databases; International Glossina Genome Initiative, 2014). The differences in the number of identified DEGs, as well as the high number of “uncharacterized” genes, could be due to differences in the database panels used to annotate Gpg or Gmm. We cannot exclude the possibility that some of the Gpg DEGs do not have heterologous genes in Gmm, or that some of them could be specific to either Gpg or Gmm and consequently cannot be annotated yet. Nevertheless, regarding I10 vs. NI10 sampling (and in contrast to the two other experimental conditions), the number of recorded DEGs was more than 2-fold higher when the Gpg transcripts were mapped on the Gmm genome, prompting questions of how this is possible. However, at this stage of our research we cannot offer a satisfactory explanation.
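The comparison of DEG counts in the preceding paragraph can be checked with a few lines of arithmetic. The snippet below is a worked example using only the counts quoted in the text; the variable names are hypothetical.

```python
# DEG counts quoted in the text: (database panel, Gmm genome) per comparison.
deg_counts = {
    "S vs. NS": (553, 284),
    "I10 vs. NI10": (52, 139),
    "I20 vs. NI20": (143, 59),
}

# Ratio of DEGs found on the Gmm genome to DEGs found on the database panel.
ratios = {
    condition: round(gmm / panel, 2)
    for condition, (panel, gmm) in deg_counts.items()
}

for condition, ratio in ratios.items():
    print(f"{condition}: Gmm/panel = {ratio}")
```

The ratio exceeds 2 only for I10 vs. NI10, whereas for the two other comparisons roughly half as many DEGs were recorded on the Gmm genome as on the database panel.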

We examined the potential influence of database panel composition by annotating the Gmm DEGs on a separate set of databases that included D. melanogaster as an internal control. The results (Table 5) clearly demonstrate the validity of the annotation process: all Gmm genes (GMOY, etc.) received the same annotation (best hit description and fold-change) on the novel set of databases as on the former set (Table 3). In addition, several genes could be identified thanks to their annotation primarily on the Volvox carteri or Mus musculus databases, which had never been used before, even though these organisms (an alga and the mouse) are genetically distant from the tsetse fly.
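The consistency check described here (same best hit description on the former and novel database panels, plus genes identified only on the novel panel) can be sketched as a comparison of two gene-to-annotation mappings. The following Python sketch is an illustration under stated assumptions: the function name, the case-insensitive matching rule, and the toy gene identifiers and descriptions are all invented for the example and are not taken from Table 3 or Table 5.

```python
def compare_annotations(former, novel):
    """Classify gene IDs by how their best-hit descriptions compare across two panels."""
    consistent, changed, newly_annotated = [], [], []
    for gene_id, new_desc in novel.items():
        old_desc = former.get(gene_id)
        if old_desc is None:
            # Gene had no annotation on the former panel.
            newly_annotated.append(gene_id)
        elif old_desc.strip().lower() == new_desc.strip().lower():
            # Same description, ignoring case and surrounding whitespace.
            consistent.append(gene_id)
        else:
            changed.append(gene_id)
    return consistent, changed, newly_annotated


# Toy input, invented for illustration only.
former_panel = {"gene_1": "Transferrin", "gene_2": "serine protease"}
novel_panel = {
    "gene_1": "transferrin",
    "gene_2": "Serine protease",
    "gene_3": "chitin binding protein",
}

consistent, changed, newly_annotated = compare_annotations(former_panel, novel_panel)
print(consistent, changed, newly_annotated)
```

In this toy case the first two genes are reported as consistent and the third as newly annotated; a real comparison would also need to compare fold-change values, which this sketch leaves out.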

The most important observation regarding our objective is that almost all of the Gpg genes previously considered to be potentially involved in tsetse fly vector competence (cf. Hamidou Soumana et al., 2015) had a “counterpart” (i.e., heterologous genes) in the Gmm genome, despite the fact that not all of the Gpg DEGs matched Gmm genes. This was the case for the large array of genes encoding peptidases, especially serine peptidases (represented by more than 20 genes), identified in the genomes of both fly species. The same was observed for ~10 genes present in both genomes that encode chitin binding proteins, a relevant finding because chitin metabolism is involved in the ability of tsetse flies to host trypanosomes (Maudlin and Welburn, 1994; Welburn and Maudlin, 1999), and for cecropin (an antimicrobial peptide), among others (Weiss et al., 2014).

Here, we were particularly interested in determining whether genes with a reported role in the immunity of tsetse flies or other organisms (Weiss et al., 2014) were present. Genes encoding Pro3 protein (GMOY009756, GMOY000672) and transferrin (GMOY004228) were identified. Pro3 has a potential function as a serine protease (trypsin-like) and is specifically produced by the proventriculus, an organ that plays an important role in the tsetse immune response. This protein could be involved in the immune response via activation of the prophenol oxidase cascade and melanization (Jiang et al., 1998). Moreover, the gene GMOY0010488 was identified as encoding an “immuno reactive putative protease inhibitor” that is overexpressed in trypanosome-stimulated or infected Gpg flies. The transferrin gene was overexpressed in both stimulated and infected Gpg flies; this result is in agreement with Geiser and Winzerling (2012), who reported on the role of transferrin in the immune response of insects, as well as its role in iron transport. By reducing oxidative stress in the tsetse fly gut, transferrin may promote the survival of trypanosomes. Guz et al. (2012) observed transferrin overexpression after challenge with bacteria, at an even higher level than is typically observed in the case of infection by trypanosomes.

The gene GMOY011809 encodes Pro 1 peritrophin, which is a constituent of the peritrophic membrane (PM). The PM is established after the fly takes its first blood meal, and it is continuously renewed by the proventriculus (Moloo et al., 1970; Tellam et al., 1999). The PM primarily functions to envelop the blood meal and protect the intestinal epithelium against abrasion by ingested matter, although it can also represent an obstacle to the passage of ingested parasites into the ectoperitrophic space (Lehane, 1997; Hegedus et al., 2009). The gene GMOY005278 encodes mucin, which, together with peritrophin, is a component of the PM.

We also identified genes encoding antimicrobial peptides: in Supplementary Table S1, GMOY01052 through GMOY010524 encode attacin, whereas GMOY0011562 and GMOY0011563 encode cecropin. Furthermore, both attacin and cecropin are overexpressed in trypanosome-stimulated or infected Gpg flies.

Our work is the first comparison of its kind between the two Glossina species. This is primarily due to the fact that the different scientific teams working on HAT commonly focus on investigating either Gmm (and the acute form of trypanosomiasis) or Gpg (and the chronic form of trypanosomiasis), but not both together. Indeed, one of our most relevant findings is the observation that Gmm has the same genes at its disposal that Gpg may use to control its vector competence. Importantly, this comparison will assist future studies in revealing common molecular targets to increase the refractoriness of either fly species to infection by trypanosomes.

Statements

Author contributions

Conceived and designed the experiments: IH, AG. Performed the experiments: IH, BT, SR, HP. Analyzed the data: IH, SR, HP, AG. Wrote the paper: AG.

Acknowledgments

The authors thank the “Région Languedoc-Roussillon—Appel d'Offre Chercheur d'Avenir 2011” and the “Institut de Recherche pour le Développement” for their support.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The reviewer CGFdL and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Supplementary material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmicb.2017.00540/full#supplementary-material

References

  • 1

    Aksoy, S., Weiss, B. L., and Attardo, G. M. (2014). Trypanosome transmission dynamics in tsetse. Curr. Opin. Insect Sci. 3, 43-49. doi: 10.1016/j.cois.2014.07.003

  • 2

    Anders, S., Pyl, P. T., and Huber, W. (2014). HTSeq - A Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166-169. doi: 10.1093/bioinformatics/btu638

  • 3

    Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 57, 289-300.

  • 4

    Beschin, A., Van Den Abbeele, J., De Baetselier, P., and Pays, E. (2014). African trypanosome control in the insect vector and mammalian host. Trends Parasitol. 30, 538-547. doi: 10.1016/j.pt.2014.08.006

  • 5

    Bullard, J. H., Purdom, E., Hansen, K. D., and Dudoit, S. (2010). Upper quartile: evaluation of statistical methods for normalization and differential expression in mRNASeq experiments. BMC Bioinformatics 11:94. doi: 10.1186/1471-2105-11-94

  • 6

    Dennis, G. Jr., Sherman, B. T., Hosack, D. A., Yang, J., Gao, W., Lane, H. C., et al. (2003). DAVID: database for annotation, visualization, and integrated discovery. Genome Biol. 4:P3. doi: 10.1186/gb-2003-4-5-p3

  • 7

    Farikou, O., Njiokou, F., Mbida Mbida, J. A., Njitchouang, G. R., Djeunga, H. N., Asonganyi, T., et al. (2010). Tripartite interactions between tsetse flies, Sodalis glossinidius and trypanosomes-an epidemiological approach in two historical human African trypanosomiasis foci in Cameroon. Infect. Genet. Evol. 10, 115-121. doi: 10.1016/j.meegid.2009.10.008

  • 8

    Geiser, D. L., and Winzerling, J. J. (2012). Insect transferrins: multifunctional proteins. Biochim. Biophys. Acta 1820, 437-451. doi: 10.1016/j.bbagen.2011.07.011

  • 9

    Gentleman, R. C., Carey, V. J., Bates, D. M., Bolstad, B., Dettling, M., Dudoit, S., et al. (2004). Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 5:R80. doi: 10.1186/gb-2004-5-10-r80

  • 10

    Guz, N., Kilincer, N., and Aksoy, S. (2012). Molecular characterization of Ephestia kuehniella (Lepidoptera: Pyralidae) transferrin and its response to parasitoid Venturia canescens (Hymenoptera: Ichneumonidae Gravenhorst). Insect Mol. Biol. 21, 139-147. doi: 10.1111/j.1365-2583.2011.01129.x

  • 11

    Hamidou Soumana, I., Klopp, C., Ravel, S., Nabihoudine, I., Tchicaya, B., Parrinello, H., et al. (2015). RNA-seq de novo assembly reveals differential gene expression in Glossina palpalis gambiensis infected with Trypanosoma brucei gambiense vs. non-infected and self-cured flies. Front. Microbiol. 6:1259. doi: 10.3389/fmicb.2015.01259

  • 12

    Hamidou Soumana, I., Loriod, B., Ravel, S., Tchicaya, B., Simo, G., Rihet, P., et al. (2014a). The transcriptional signatures of Sodalis glossinidius in the Glossina palpalis gambiensis flies negative for Trypanosoma brucei gambiense contrast with those of this symbiont in tsetse flies positive for the parasite: possible involvement of a Sodalis-hosted prophage in fly Trypanosoma refractoriness? Infect. Genet. Evol. 24, 41-56. doi: 10.1016/j.meegid.2014.03.005

  • 13

    Hamidou Soumana, I., Tchicaya, B., Simo, G., and Geiger, A. (2014b). Comparative gene expression of Wigglesworthia inhabiting non-infected and Trypanosoma brucei gambiense-infected Glossina palpalis gambiensis flies. Front. Microbiol. 5:620. doi: 10.3389/fmicb.2014.00620

  • 14

    Hegedus, D., Erlandson, M., Gillott, C., and Toprak, U. (2009). New insights into peritrophic matrix synthesis, architecture, and function. Annu. Rev. Entomol. 54, 285-302. doi: 10.1146/annurev.ento.54.110807.090559

  • 15

    Hoare, C. A. (1972). The Trypanosomes of Mammals, A Zoological Monograph. Oxford: Blackwell Scientific Publications.

  • 16

    International Glossina Genome Initiative (2014). Genome sequence of the tsetse fly (Glossina morsitans): vector of African trypanosomiasis. Science 344, 380-386. doi: 10.1126/science.1249656

  • 17

    Jiang, H., Wang, Y., and Kanost, M. R. (1998). Pro-phenol oxidase activating proteinase from an insect, Manduca sexta: a bacteria-inducible protein similar to Drosophila easter. Proc. Natl. Acad. Sci. U.S.A. 95, 12220-12225. doi: 10.1073/pnas.95.21.12220

  • 18

    Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., and Salzberg, S. L. (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14:R36. doi: 10.1186/gb-2013-14-4-r36

  • 19

    Langmead, B., and Salzberg, S. (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357-359. doi: 10.1038/nmeth.1923

  • 20

    Lehane, M. J. (1997). Peritrophic matrix structure and function. Annu. Rev. Entomol. 42, 525-550. doi: 10.1146/annurev.ento.42.1.525

  • 21

    Louis, F. J., Simarro, P. P., and Lucas, P. (2002). Sleeping sickness: one hundred years of control strategy evolution. Bull. Soc. Pathol. Exot. 95, 331-336.

  • 22

    Maudlin, I., and Welburn, S. C. (1994). Maturation of trypanosome infections in tsetse. Exp. Parasitol. 79, 202-205. doi: 10.1006/expr.1994.1081

  • 23

    Moloo, S. K., Steiger, R. F., and Hecker, H. (1970). Ultrastructure of the peritrophic membrane formation in Glossina Wiedemann. Acta Trop. 274, 378-383.

  • 24

    Mount, D. W. (2007). Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007:pdb.top17. doi: 10.1101/pdb.top17

  • 25

    Robinson, M. D., McCarthy, D. J., and Smyth, G. K. (2010). EdgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140. doi: 10.1093/bioinformatics/btp616

  • 26

    Savage, A. F., Kolev, N. G., Franklin, J. B., Vigneron, A., Aksoy, S., and Tschudi, C. (2016). Transcriptome profiling of Trypanosoma brucei development in the tsetse fly vector Glossina morsitans. PLoS ONE 11:e0168877. doi: 10.1371/journal.pone.0168877

  • 27

    Tellam, R. L., Wijffels, G., and Willadsen, P. (1999). Peritrophic matrix proteins. Insect Biochem. Mol. Biol. 29, 87-101. doi: 10.1016/S0965-1748(98)00123-4

  • 28

    Van den Abbeele, J., Claes, Y., van Bockstaele, D., Le Ray, D., and Coosemans, M. (1999). Trypanosoma brucei spp. development in the tsetse fly: characterization of the post mesocyclic stages in the foregut and proboscis. Parasitology 118, 469-478. doi: 10.1017/S0031182099004217

  • 29

    Vickerman, K., Tetley, L., Hendry, K. A., and Turner, C. M. (1988). Biology of African trypanosomes in the tsetse fly. Biol. Cell 64, 109-119. doi: 10.1016/0248-4900(88)90070-6

  • 30

    Weiss, B. L., Savage, A. F., Griffith, B. C., Wu, Y., and Aksoy, S. (2014). The peritrophic matrix mediates differential infection outcomes in the tsetse fly gut following challenge with commensal, pathogenic, and parasitic microbes. J. Immunol. 193, 773-782. doi: 10.4049/jimmunol.1400163

  • 31

    Welburn, S. C., and Maudlin, I. (1999). Tsetse-trypanosome interactions: rites of passage. Parasitol. Today 15, 399-403. doi: 10.1016/S0169-4758(99)01512-4

  • 32

    Welburn, S. C., Maudlin, I., and Simarro, P. P. (2009). Controlling sleeping sickness-a review. Parasitology 136, 1943-1949. doi: 10.1017/S0031182009006416

  • 33

    Wixon, J., and Kell, D. (2000). The Kyoto encyclopedia of genes and genomes-KEGG. Yeast 17, 48-55. doi: 10.1002/(SICI)1097-0061(200004)17:1<48::AID-YEA2>3.0.CO;2-H

Summary

Keywords

human African Trypanosomiasis, Glossina palpalis gambiensis, Glossina morsitans morsitans, Trypanosoma brucei gambiense, differentially expressed genes, heterologous genes

Citation

Hamidou Soumana I, Tchicaya B, Rialle S, Parrinello H and Geiger A (2017) Comparative Genomics of Glossina palpalis gambiensis and G. morsitans morsitans to Reveal Gene Orthologs Involved in Infection by Trypanosoma brucei gambiense. Front. Microbiol. 8:540. doi: 10.3389/fmicb.2017.00540

Received

01 December 2016

Accepted

14 March 2017

Published

03 April 2017

Volume

8 - 2017

Edited by

Alexandre Morrot, Federal University of Rio de Janeiro, Brazil

Reviewed by

Celio Geraldo Freire De Lima, Federal University of Rio de Janeiro, Brazil; Elisangela Oliveira De Freitas, University of Oxford, UK; Geneviève Milon, Institut Pasteur (INSERM), France

*Correspondence: Anne Geiger

This article was submitted to Microbial Immunology, a section of the journal Frontiers in Microbiology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
