Characterization of the Complete Mitochondrial Genomes of Two Sibling Species of Parasitic Roundworms, Haemonchus contortus and Teladorsagia circumcincta

Haemonchus contortus and Teladorsagia circumcincta are among the two most pathogenic internal parasitic nematodes infecting small ruminants, such as sheep and goats, and are a global animal health issue. Accurate identification and delineation of Haemonchidae species is essential for development of diagnostic and control strategies with high resolution for Trichostrongyloidea infection in ruminants. Here, we describe in detail and compare the complete mitochondrial (mt) genomes of the New Zealand H. contortus and T. circumcincta field strains to improve our understanding of species- and strain-level evolution in these closely related roundworms. In the present study, we performed extensive comparative bioinformatics analyses on the recently sequenced complete mt genomes of the New Zealand H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP field strains. Amino acid sequences inferred from individual genes of each of the two mt genomes were compared, concatenated and subjected to phylogenetic analysis using Bayesian inference (BI), Maximum Likelihood (ML), and Maximum Parsimony (MP). The AT-rich mt genomes of H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP are 14,001 bp (A+T content of 77.4%) and 14,081 bp (A+T content of 77.3%) in size, respectively. All 36 of the typical nematode mt genes are transcribed in the forward direction in both species and comprise of 12 protein-encoding genes (PCGs), 2 ribosomal RNA (rrn) genes, and 22 transfer RNA (trn) genes. The secondary structures for the 22 trn genes and two rrn genes differ between H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP, however the gene arrangements of both are consistent with other Trichostrongylidea sequenced to date. Comparative analyses of the complete mitochondrial nucleotide sequences, PCGs, A+T rich and non-coding repeat regions of H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP further reinforces the high levels of diversity and gene flow observed among Trichostrongylidea, and supports their potential as ideal markers for strain-level identification from different hosts and geographical regions with high resolution for future studies. The complete mt genomes of H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP presented here provide useful novel markers for further studies of the meta-population connectivity and the genetic mechanisms driving evolution in nematode species.

Haemonchus contortus and Teladorsagia circumcincta are among the two most pathogenic internal parasitic nematodes infecting small ruminants, such as sheep and goats, and are a global animal health issue. Accurate identification and delineation of Haemonchidae species is essential for development of diagnostic and control strategies with high resolution for Trichostrongyloidea infection in ruminants. Here, we describe in detail and compare the complete mitochondrial (mt) genomes of the New Zealand H. contortus and T. circumcincta field strains to improve our understanding of species-and strain-level evolution in these closely related roundworms. In the present study, we performed extensive comparative bioinformatics analyses on the recently sequenced complete mt genomes of the New Zealand H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP field strains. Amino acid sequences inferred from individual genes of each of the two mt genomes were compared, concatenated and subjected to phylogenetic analysis using Bayesian inference (BI), Maximum Likelihood (ML), and Maximum Parsimony (MP). The AT-rich mt genomes of H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP are 14,001 bp (A+T content of 77.4%) and 14,081 bp (A+T content of 77.3%) in size, respectively. All 36 of the typical nematode mt genes are transcribed in the forward direction in both species and comprise of 12 protein-encoding genes (PCGs), 2 ribosomal RNA (rrn) genes, and 22 transfer RNA (trn) genes. The secondary structures for the 22 trn genes and two rrn genes differ between H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP, however the gene arrangements of both are consistent with other Trichostrongylidea sequenced to date. Comparative analyses of the complete mitochondrial nucleotide sequences, PCGs, A+T rich and non-coding repeat regions of H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP further reinforces the high levels of diversity and gene flow observed among Trichostrongylidea, and supports their potential as ideal markers for strain-level identification from different hosts and geographical regions with high resolution for future studies. The complete mt genomes of H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP presented here provide useful novel markers for further studies of the meta-population connectivity and the genetic mechanisms driving evolution in nematode species.
Keywords: Haemonchus contortus, Teladorsagia circumcincta, mitochondrial genome, helminth, parasite, anthelmintic-susceptible HIGHLIGHTS -In-depth comparative and phylogenomic analyses of the recently published complete mitochondrial (mt) genomes of the New Zealand H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP field strains to improve our understanding of species-and strain-level evolution in roundworms. -Analysis of the A+T rich and non-coding repeat regions of the complete Haemonchus and T. circumcincta mt genomes supports their potential as ideal markers for easy speciesand possibly strain-level identification with high resolution for future studies. -Phylogenetic relationships of complete mt genomes for all Trichostrongyloidea species currently available inferred from Bayesian, Maximum Likelihood and Maximum Parsimony analyses of mitochondrial genes, revealed high levels of intraspecies diversity and a panmictic structure observed among Trichostrongylidea. -Our findings indicate that a three-pronged approach incorporating phylogenetic inertia, pangenome structure/features and environmental data are needed in order to understand the mitochondrial genome evolution.

INTRODUCTION
H. contortus (barber's pole worm) and T. circumcincta (brown stomach worm), are the most economically important pathogenic nematodes infecting small ruminants (sheep and goats) worldwide. These blood-feeding strongylid nematodes are orally transmitted via contaminated pasture to the host where they infect the fourth stomach (abomasum) reducing animal production and causing anemia, edema, and associated complications often leading to death (Vlassoff et al., 2001;Sutherland and Scott, 2010). Although these parasites can be managed using existing prophylactic drugs (anthelmintics), their remarkable natural tendency to develop resistance combined with the diminishing efficacy of compounds used, threatens the global livestock industry (Kaplan, 2004;Kaplan and Vidyashankar, 2012). While these monoxenous and obligately sexual species have identical life-cycles (Veglia, 1915), the gene flow and evolutionary relationships observed among the Trichostrongylidae superfamily, and the Strongylida suborder as a whole, remain unclear.
Parasitic roundworms belonging to the Haemonchidae family are able to infect a range of ruminant hosts. For example and on communal pastures in particular, H. placei is primarily a cattle parasite but can also infect small ruminants (Amarante et al., 1997;Gasser et al., 2008;Nunes et al., 2013), and vice versa for H. contortus (Jacquiet et al., 1998). Given the increase in availability of genomic information resources (Palevich et al., 2018), it is crucial for the management of parasitic diseases that readily accessible molecular tools are developed to enable end-users to easily determine which species is infecting which animal host.
The development of low cost, high-throughput, nextgeneration sequencing technologies have led to a recent focus on whole-genome sequencing (WGS) of complete nuclear genomes of parasitic roundworms (Nematodes) Hu et al., 2007;Jex et al., 2008;Palevich et al., 2019a,b), instead of earlier markers such as cox1, cox2, and nad genes. Owing to the substantial coverage following WGS (Wit and Gilleard, 2017;Palevich et al., 2019c) and advances in bioinformatics pipelines, further complete mitochondrial genome resources will be available to facilitate detailed comparative phylogenetic analyses across many species complexes and major taxonomic groups of nematodes (Hu et al., 2004;Gasser et al., 2016). In general, complete nematode mt genomes are ∼12-21 kbp circular-DNA molecules . Nematode mt genomes can provide rich sources of informative markers and typically feature (Wolstenholme, 1992;Boore, 1999;Saccone et al., 1999;Blouin, 2002): 12 protein-coding genes (PCGs) and lack of the atp8 gene [except for T. spiralis (Lavrov and Brown, 2001)], 2 ribosomal RNA (rrn) genes (Gutell et al., 1993), and 22 transfer RNA (trn) genes (Wolstenholme et al., 1987); unidirectional transcription (usually in forward direction) with particularly unique initiation codons; and highly variable gene arrangements . The gene sequences, and in particular the 12 PCGs, are relatively conserved are useful phylogenetic markers that have been used for examining phylogenetic relationships within the phylum Nematoda (Hu et al., 2004). The mitochondrial genetic code of nematodes follows translation (transl_table = 5) 1 of National Center for Biotechnology Information (NCBI). In addition, nematode mt genomes also possess transposition genotypes of tRNAs that lack either a TΨ C or DHU stem in the trn secondary structures (Wolstenholme et al., 1994;Yokobori, 1995). In the pursuit of improving the phylogenetic resolution within the phylum Nematoda, mt genomes serve as excellent markers for investigation of species delineation and the genetics of population evolution. Future efforts should focus on the availability of more complete mt genomes across all nematode species, and especially for conspecific strains.
For these reasons, this study interrogates the complete mitochondrial genome sequences of the anthelmintic-susceptible H. contortus NZ_Hco_NP (Palevich et al., 2019a) and T. circumcincta NZ_Teci_NP (Palevich et al., 2019b) derived from pasture-grazed New Zealand sheep. In this study, we performed numerous comparative and phylogenomic analyses to directly compare the molecular characteristics including the nucleotide composition and codon usage profiles of PCGs, as well as the secondary structures of each identified tRNA and rRNA gene within the two species, but also among other closely related trichostrongyloid nematodes. The phylogenetic position of H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP among Trichostrongylidae and of other species of socio-economically important parasites within the phylum Nematoda, are investigated based on mitochondrial nucleotide sequences and PCGs. This research provides insights into the mt genome evolution that may facilitate the development of molecular tools to differentiate between these two sibling species of parasitic roundworms at the strain level.

Sample Collection and DNA Extraction
The acquisition of parasite samples, DNA extraction, and genome sequencing technologies used to generate the mt genomes used for analysis in this study have recently been outlined in the form of brief reports (Palevich et al., 2019a,b). To fill any knowledge gaps associated with the above-mentioned reports, here we describe in detail the methodologies, sequencing technologies, and annotation software used to generate the H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP mt genomes. H. contortus and T. circumcincta were recovered from sheep infected with pure strains of parasites in Palmerston North, New Zealand, and total high molecular weight genomic DNA was extracted separately from multiple worms using a modified phenol:chloroform protocol, as previously described (Choi et al., 2017;Palevich et al., 2019a,b,c). DNA was deposited, stored and available upon request from AgResearch Ltd., Grasslands Research Center.

Mitochondrial Genome Sequencing and Annotation
The complete mitochondrial genome sequences of H. contortus NZ_Hco_NP (Palevich et al., 2019a,c) and T. circumcincta NZ_Teci_NP (Palevich et al., 2019b) were obtained using next-generation sequencing technologies. The H. contortus NZ_Hco_NP (BioProject ID: PRJNA517503, SRA accession number SRP247265) whole genome shotgun paired-end (PE) and Single-Molecule, Real-Time (SMRT) long-read sequencing (SRA Runs: SRR11022845 and SRR11022846) were generated using the Illumina HiSeq2500 and Pacific Biosciences (PacBio) platforms, as previously described (Palevich et al., 2019a,c). The T. circumcincta NZ_Teci_NP (BioProject ID: PRJNA72569, SRA accession number SRP007648) whole genome shotgun library (SRA Run: SRR3145390) was generated as previously described (Tang et al., 2014;Palevich et al., 2019b), and sequenced by the Genome Center at Washington University (WUGSC, School of Medicine, St. Louis, MO, USA) using the Illumina MiSeq platform with the strategy of 150 bp paired-end sequencing mode. A total of 605 million raw reads were generated and made available in FASTQ format. The quality of the raw sequence reads was evaluated using the software package [http:// www.bioinformatics.babraham.ac.uk/projects/fastqc (Andrews, 2010)]. The software Trimmomatic v.0.36 (Bolger et al., 2014) was used for removal of adapter, contaminant, low quality (Phred scores <30), and short (<50 bp) sequencing reads. The remaining high-quality sequencing reads were assembled de novo using the NOVOPlasty pipeline v.3.1 with default parameters and based on a kmer size of 39 following the developer's suggestions (Dierckxsens et al., 2016). The assembled mt genomes, tRNA and rRNA annotations were checked using the MITOS web server based on translation (transl_table = 5) 1 (Invertebrate Mitochondrial) of NCBI (http://mitos.bioinf.unileipzig.de). Repeat-rich subsets of the mt genome assemblies were identified with varying degrees of complexity using the mDUST v.2006.10.17 (Morgulis et al., 2006) software tools with default settings. The sequence regions containing tandem repetitive elements were identified using Tandem Repeat Finder software online server (http://tandem.bu.edu/trf/trf.html). The coordinates of the found repeats were matched with coding sequences and the proportion of repeats overlapping coding regions was calculated. tRNA genes were further corroborated using the software SPOT-RNA (Singh et al., 2019) and the tRNA secondary structures were predicted using the ViennaRNA v.2.4 (Lorenz et al., 2011) andVARNA v.3.93 (Darty et al., 2009) packages. tRNA secondary structures and complete mt genomes were visualized using the Geneious Prime v.2019.1.1 (Kearse et al., 2012). The PCGs were manually curated by searching for ORFs (employing genetic code 5 as above) and alignment against the available H. contortus and T. circumcincta reference genomes in Geneious Prime v.2019.1.1 (Kearse et al., 2012).

Mitochondrial Pangenome Analysis
The entire nucleotide sequences alignments using Mauve (Darling et al., 2004) were performed using Geneious Prime v.2019.1.1 (Kearse et al., 2012). Sequence differences among the Haemonchus and Teladorsagia species as well as between the respective strains was done using the H. contortus NZ_Hco_NP (Palevich et al., 2019a,c) and T. circumcincta NZ_Teci_NP (Choi et al., 2017;Palevich et al., 2019b) as the reference sequences, respectively. Repeated sequence elements of Haemonchus and Teladorsagia mt genomes were visualized in the form of Dotplots and generated based on pair-wise alignments with chaining coverage using PipMaker (Schwartz et al., 2000). Gene synteny was investigated using BLASTX alignments of sequences and genetic similarity mapping was visualized using the CGView Comparison Tool (http://stothard.afns.ualberta.ca/downloads/ CCT) (Grant et al., 2012). For analysis of codon usage and amino acid composition, codon usage counts were extracted from each protein using Geneious Prime version 2019.1.1 (Kearse et al., 2012). These were then compiled and loaded into R version v.3.6.1 (R Core Team, 2013) where a clustered image map was produced using the "cim" function from the mixOmics package v.6.8.5 (Rohart et al., 2017).

Phylogenomics Analysis of Complete mt Genomes
The phylogenetic positions of H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP among other species of nematodes available in the GenBank database were examined. The nucleotide sequences and PCGs of published Trichostrongyloidea mitochondrial genomes were retrieved from the National Center for Biotechnology Information, USA (http://www.ncbi.nlm.nih.gov/). Caenorhabditis elegans N2 (JF896456) was used as the outgroup species in this study. All coding sequences (CDS) used in this analysis had correct start and stop codons based on translation (transl_table = 5) 1 of NCBI (Invertebrate Mitochondrial).
Considering the high degree of intraspecific variation in nucleotide sequences of mt genes of nematodes, we used the deduced amino acid sequences of mt proteins for phylogenetic analyses. Amino acid sequences from the 12 mt protein-coding genes of the complete mitochondrial genomes were deduced, aligned individually and concatenated into a single alignment using MAFFT v.7.450 (Katoh and Standley, 2013). Phylogenetic analyses were conducted using three methods: Bayesian inference (BI), Maximum Likelihood (ML), and Maximum Parsimony (MP) methods. BI analysis with the program MrBayes v.3.15265 (Huelsenbeck and Ronquist, 2001) and the MtInv model of amino acid evolution (Le et al., 2017) was selected as the most suitable model of evolution. However, since the MtInv model is not implemented in the current version of MrBayes, an alternative model, MtREV (Adachi and Hasegawa, 1996), was used in BI analysis. Four independent Markov chains were run for 1,000,000 metropolis-coupled MCMC generations, sampling a tree every 100 generations. The first 2,500 trees represented burn-in, and the remaining trees were used to calculate Bayesian posterior probabilities (PP). The evolutionary history was further inferred by using the ML and MP methods. ML analysis was performed with the MtREV General Reversible Mitochondrial + Freq. model (Adachi and Hasegawa, 1996) and MP was performed with PAUP * v.4.0b10 (http://paup.csit. fsu.edu/) (Swofford, 2001). Both methods were calculated using 1,000 bootstrap replicates with bootstrapping frequencies (BF) were calculated. Phylograms were drawn using the software MEGA X (Kumar et al., 2018).

RESULTS AND DISCUSSION mt Genome General Features and Characteristics
The reconstruction of partial and complete mitochondrial genomes is becoming increasingly cost-effective with the advent of next-generation sequencing technology and recent advances in molecular parasitology tools, in particular for species identification and population genetics studies Jex et al., 2009). Despite these advances, only 12 species of Trichostrongyloid mt genomes are currently available in GenBank with only half of these represented by multiple strains (dos Santos et al., 2017). For these reasons, the complete mt genomes of New Zealand H. contortus NZ_Hco_NP (Palevich et al., 2019a,c) and T. circumcincta NZ_Teci_NP (Choi et al., 2017;Palevich et al., 2019b) field strains were sequenced and compared in detail to investigate the strainlevel genetic differences of these two sibling Trichostrongyloidea species. Recently, complete mt genomes of NZ_Hco_NP and NZ_Teci_NP were generated using various sequencing platforms and assembled from 582 and 5.6 million (PE) reads with an average coverage depth per nucleotide of 13,293× and 2,097×, respectively.
The smallest Trichostrongyloidea mt genome published to date belongs to Dictyocaulus sp. cf. eckerti (Gasser et al., 2012) at 13,296 bp with the largest 15,221 bp mt genome belonging to M. digitatus (Jex et al., 2009). The NZ_Hco_NP and NZ_Teci_NP mt genomes are 14,001 and 14,081 bp in length, respectively (Figure 1) and are thus within the expected range. The reported mt genomes differ in size by ∼16-54 bp than other H. contortus and T. circumcincta characterized strains, except for the 13.7 kb H. placei MHpl1 mt genome (dos Santos et al., 2017). The size discrepancies among the different strains primarily relate to A+T-rich control region expansions and longer non-coding regions between numerous transfer RNA genes.
The total GC content of the H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP mt genomes were 21.1 and 22.8% and contained an A+T bias with an overall base composition of A = 32.9 and 31.2%, T = 44.5 and 46.1%, C = 6.3 and 7.3%, and G = 14.8 and 15.5%, respectively ( Table 1). The overall A+T bias of 77.4 and 77.3% observed in H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP mt genomes, respectively, are within the typical range reported for nematode mitochondrial genomes (Hu et al., 2004;Hu and Gasser, 2006). Compared with the others, the A+T content of all T. circumcincta strains are within 0.1% of each other with H. contortus strains being ±0.5 of NZ_Hco_NP. Low-complexity regions (LCRs) are extremely abundant in eukaryotic proteins that are typically represented by amino acid sequences containing repeats of single amino acids or short amino acid motifs, and may facilitate the formation of novel coding sequences. H. contortus NZ_Hco_NP had 23 LCRs with 15 (65%) in coding regions, while T. circumcincta NZ_Teci_NP had 26 repeats with 21 (81%) found in coding regions (Figure 2), however determining the functional role of these repeats is a focus for future work. Comparative analysis of the H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP mtDNAs also revealed five and four tandem repeat (TR) regions, respectively (Figure 2).
To date, mt genomes are available for many species of Trichostrongyloidea and the mt genomes of the trichostrongyloid H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP are in agreement, as they both comprise of 12 protein-encoding genes (PCGs), 22 transfer RNA (trn) genes, and 2 ribosomal RNA genes (rrnS or 12S ribosomal RNA and rrnL or 16S ribosomal RNA). The 12 mt PCGs of both H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP were encoded on the heavy strand and transcribed in the same direction. The order of genes observed in H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP are identical to the previously reported H. contortus (Palevich et al., 2018) and T. circumcincta (Jex et al., 2009), respectively (Figure 1). In accordance to the mt genomes of other Trichostrongyloidea, the reported mt genomes also all lack the atp8 gene. All of the 12 PCGs in both mt genomes are predicted to use unique translation initiation codons such as ATT (atp6, cytb, and nad4-6), TTG (nad2-3), and ATA (nad1 and cox1-3) in H. contortus NZ_Hco_NP, with T. circumcincta NZ_Teci_NP predominantly using ATT and ATA (cytb and cox1) ( Table 1). Both mt genomes are predicted to use the complete termination codons TAA. Nevertheless, cox2 had a TAG translation termination codon similar to nad3 of H. placei where not all termination codons were complete stop codons, typical of nematodes (Jex et al., 2008;dos Santos et al., 2017). Of the 22 predicted trn genes, size ranged from 54 to 63 bp in length (Table 1), in H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP mt genome sequences and are similar to those of other nematodes studied to date (Supplementary Figure 1). The characteristics associated with the secondary structures predicted for most tRNA genes include: a 7-8 bp acceptor stem (amino-acyl arm), a 5 bp anticodon stem and a T/U residue preceding. With the exception of T. spiralis (Lavrov and Brown, 2001), these characteristics are consistent with all previously published mt genomes for the Trichostrongyloidea (dos Santos et al., 2017) and Rhabditida [C. elegans (Blaxter et al., 1998)].
The rrnS (12S) and rrnL (16S) genes identified in the mt genomes of H. contortus NZ_Hco_NP (703 and 951 bp) and T. circumcincta NZ_Teci_NP (700 and 962 bp) were A+T biased ( Table 1). Secondary structure models for the rrn genes (Supplementary Figure 2) of H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP were constructed based on models of nematodes that have been previously reported (Hu et al., 2004(Hu et al., , 2007Hu and Gasser, 2006). The rrn genes were particularly A+T rich with 79.8 and 77.0% for rrnS and 82.9 and 83.0% for rrnL in H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP, respectively. The overall base compositions of the rrnS genes were; A = 41.8 and 36.0%, T = 38.0 and 41.0%, C = 5.8 and 7.4%, and G = 13.5 and 15.6%, and that of the rrnL genes were; A = 42.2 and 39.7%, T = 40.7 and 43.3%, C = 5.3 and 5.5%, and G = 10.7 and 11.4%. Therefore, the A+T content of the rrnS and rrnL genes were comparable to the overall A+T content of both whole mt genomes. The rrnL gene in both H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP, is located between trnH and nad3. Similarly, the FIGURE 1 | Mitochondrial genome atlases of H. contortus NZ_Hco_NP (A) and T. circumcincta NZ_Teci_NP (B). Each map is annotated and depicts the 12 protein-coding genes (PCGs, yellow), two ribosomal RNA genes (rrnS and rrnL, red), 22 transfer RNA (trn, blue) genes, and any putative non-coding region if applicable (gray). The innermost circles depict GC (blue) and AT (green) content, respectively, along the genome. Mauve visualization of an alignment of the H. contortus NZ_Hco_NP (top) and T. circumcincta NZ_Teci_NP (bottom) complete mt genomes (C). Mauve alignments of each genome are represented by a horizontal track, with annotated coding regions (white boxes), trn genes (green), and rrn genes (red). Red colored segments represent conserved regions among the two mt genomes. rrnS gene is located between trnE and trnS1 almost 3.5 Kb from the rrnL and between nad4L and nad1 in both mt genomes (Figures 1, 2). The locations, sizes and secondary structure characteristics of the rrnL and rrnS genes are relatively conserved among all Trichostrongyloidea mt genomes analyzed, especially the stem regions of the rrnL genes between H. contortus and T. circumcincta strains.
The predicted lengths of all the genes in H. contortus NZ_Hco_NP are very similar to those of T. circumcincta NZ_Teci_NP (≤1 amino acid difference). The sequence variability between Haemonchus species/strains and T. circumcincta strains of the Trichostrongyloidea superfamily was assessed by performing pairwise comparisons for each PCG of the mt genomes (Figure 2). Among the Haemonchus species/strains, the most variable genes were nad6 and nad2 with 13.8 and 10.2% of sequence divergence, respectively, and the most conserved genes were cox1, atp6, and cox2 with 93.6, 92.2, and 92.5% of sequence identity, respectively. For the T. circumcincta strains, we found that the most variable gene was nad4 with 2.2% sequence divergence, with most genes being conserved at around 98.6% sequence identity. Based on our results, the most variable genes among Trichostrongyloidea nematodes were nad2, nad6, nad3, and nad5 with 56.1, 38.4, 23.1, and 21.3% of sequence divergence, respectively, with the most conserved genes were cox1, atp6, cox2, and cox3 with 87.9, 83.5, 85.3, and 86.9% of sequence identity, respectively. Pairwise comparisons of all the PCGs among Trichostrongyloidea nematodes and it was observed that the nad4 gene was the most variable in terms of both the gene length and the sequence identity (Figure 2), where the predicted lengths differ by exactly 6 amino acids between Haemonchus (1,230 bp for H. contortus and H. placei) and T. circumcincta strains (1,212 bp). In comparison, the nad4 gene appears to be conserved among different Trichostrongyloidea species, for example Trichostrongylus vitrinus and T. axei (1,227 bp), while M. digitatus, C. oncophora, and C. elegans N2 all share 1,230 bp nad4 genes (Blouin et al., 1997;Blouin, 2002;Hu et al., 2004;Jex et al., 2009;Gasser et al., 2012;Xu et al., 2015).
A recent investigation comparing the divergences observed between ITS-1 and ITS-2 genetic nuclear markers with cox1 or nad4 genes (Blouin et al., 1997;Blouin, 2002), showed that mt DNA accumulates substitutions more quickly than nuclear genome markers and that nad4 genes in particular are a superior tool for prospecting species. For example, regarding H. placei and H. contortus mt genomes, a divergence in nuclear ITS-2 rDNA between these two species has previously been reported as 1.3% (Stevenson et al., 1995). Whereas, differences between nucleotide sequences of H. placei and H. contortus mt genome PCGs ranged between 10 and 20% dos Santos et al., 2017). Taken together, our findings are in agreement with previous reports suggesting the use of the most conserved mt cox genes for investigating systematics and speciation in nematodes (Hu et al., 2004;Hu and Gasser, 2006;Zarowiecki et al., 2007), while the variation observed in nad mt genes would be more appropriate for population genetics studies (Jex et al., 2009).
Analysis of codon usage patterns in Trichostrongyloidea and C. elegans N2 revealed that certain codons are predicted to be utilized almost ubiquitously across all mitochondrial PCGs (Figure 3). As expected, the A+T bias reported in the Trichostrongyloidea mt genome sequences also affects the amino acid sequence composition of the predicted proteins. Within each codon family, we found that the non-polar Leu (TTG, 5.0%), basic His (CAC, 0.4%), polar Asn (AAC, 0.5%), and acidic Asp (GAC, 0.3% but except nad3) were highly utilized across all PCGs. Among each codon family, the T-ending (49.9%) codons are used most frequently than codons ending in any of the three other nucleotides, with C-ending (2.7%) codons least used. Similarly, for both Haemonchidae and Trichostrongylidae the average nucleotide content of their mt PCGs are 26.0% (A) and 43.7% (T), compared to 25.4% (A), and 44.2% (T), respectively. In particular the nad4L was the most T-rich at 53.5 and 54.9%, in each family, respectively. Also the most commonly codon observed was TAA as the termination codon with the majority of initiation codons also being AT-rich.
Our pairwise alignments and synteny analysis of the complete mt genome nucleotide sequences for the Haemonchus and Teladorsagia species and strains revealed a common and highly variable A+T-rich non-coding region between 5,500 and 6,100 bp in each mt genome (Figure 2 and Supplementary Figure 3). The A+T content of this region is unusually high in both Haemonchus (89.0%) and Teladorsagia (88.2%) mt genomes, and is located between trnA and trnP (or between genes nad5 and nad6). Among Haemonchus, the A+T-rich region is ∼600 bp in H. contortus (89.7%) and 155 bp in H. placei (87.1%), whereas the

circumcincta
NZ_Teci_NP (B) were used as the reference sequences, respectively. The outermost ring indicates gene arrangement and distribution based on the primary sequence GenBank file. Tandem repeat regions are shown with green blocks (full length) for H. contortus NZ_Hco_NP (circle 2) and light red blocks for all other Haemonchus mt genomes (circle 3). Low complexity repeat regions for H. contortus NZ_Hco_NP are also shown with green blocks (half length) on circle 2. The next two (A) or three (B) rings represent BLAST hits obtained from BLASTX searches performed using a threshold of 1e −5 for protein sequences of genomes described above, in which the query sequences were translated in reading frames 1 and 2. The inner-most rings indicate the GC content information. The contents of the A+T-rich regions have been shown in more detail in expanded 10× zoomed views. nad1-6, NADH dehydrogenase subunits 1-6; cox1-3, cytochrome c oxidase subunits 1-3; atp6, ATPase subunit 6; cytb, cytochrome b. FIGURE 3 | Amino acid and codon usage of Trichostrongyloidea complete mt genomes. Heat map analysis was performed based on relative abundances of protein sequence content determined for all mitochondrial genes. The relative inferred codon abundances per mt gene and specific to each mt genome are shown using a heat color scheme (blue to red), indicating low to high relative abundance. Mt genomes are groped based on previous phylogenomic comparison of concatenated PCGs. C. elegans N2 (GenBank accession number JF896456) was also included for comparison. Amino acid families are colored to represent non-polar in black, basic in red, polar in green, and acidic in blue.
region is 340 bp in Teladorsagia (88.2%). A pairwise alignment of the nucleotide sequences of the A+T-rich regions of H. contortus NZ_Hco_NP (611 bp) and closely related H. contortus McMaster (EU346694, 578 bp) resulted in 91.5% identity. Interestingly, the region in H. contortus is among the largest among other nematodes studied to date with the exception of As. suum (886 bp) (Okimoto et al., 1992;Keddie et al., 1998;Hu et al., 2003).
Comparative analysis of the control (A+T-rich) region of Haemonchus mtDNAs identified three tandem repeat (TR) units (69, 120, and 62 bp) compared to only one (126 bp) found in T. circumcincta strains mt genomes (Figure 2). The number of repeat units within the A+T-rich region was variable depending on the mtDNAs. For example, the copy numbers of the A+T-rich region TRs for T. circumcincta NZ_Teci_NP corresponding to TTATAATTATTATATAATAATTA (23 bp), TATATATATATA AATTAATATAATTATTATTAATAAT (37 bp), and ATATTATA TATTATATATTA (20 bp) consensus units were 4, 2, and 4. In general, TRs of Haemonchus mt genomes were much more abundant and distributed randomly throughout the mtDNAs, whereas TRs were mostly in the intergenic regions of T. circumcincta mtDNA. Some TRs were found within the mitochondrial genes, such as those in cox1, nad5, nad6, and nad4, with also nad4L in T. circumcincta mtDNA. Overall, our analyses show that the number of tandem repeats was highly variable among different species and strains of Haemonchus and Teladorsagia and largely account for the variations among the mtDNA sequences lengths.
Based on our comparative genomics of A+T rich regions of complete Haemonchus species/strains and T. circumcincta strains mt genomes, we were able to differentiate individual strains with high confidence due to the high mutation rate and sequencing coverage for these regions. We propose that such regions can serve as ideal markers for future studies investigating nematode population structure (Jex et al., 2009;Gasser et al., 2012), population genetics (Blouin, 2002;Hu et al., 2004;Gasser et al., 2012), and phylogeny (Blouin et al., 1997;Blouin, 2002;Xu et al., 2015). For future work, we aim to design and test numerous primer pairs targeting the A+T-rich region between the nad5 and nad6 genes to develop accurate PCR-based molecular techniques to differentiate Trichostrongyloidea species members. Such molecular tools would enable easy species-and possibly strain-level identification with sufficient resolution and confidence from fecal samples that would also help prevent animal sacrifice for such studies.

Mitochondrial Phylogenomics
The super-family Trichostrongyloidea includes genera such as Ostertagia, Teladorsagia, Trichostrongylus, Haemonchus, Cooperia, Nematodirus, Dictyocaulus. To determine the phylogenetic relationship of the H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP with other H. contortus and T. circumcincta strains and members of the Trichostrongyloidea nematodes, the concatenated amino acid sequences predicted from 12 mtDNA PCGs were analyzed using BI, ML, and MP methods. Topologies of all trees inferred by the three different distance models and methods were identical (Figure 4), with phylogenetic relationships among the different Trichostrongyloidea species well-resolved with very high nodal support throughout. The presented phylogenomic tree of Trichostrongyloidea mt genomes, is consistent in topology with other published data Jex et al., 2009), and further support the hypothesis that the primary driver of early divergence is actually the site of infection in the host. Overall, rur results corroborate this hypothesis given that each of different species and strains of the Haemonchidae compared with Cooperidae and Trichostrongylidae families, are genetically closer to abomasal than to non-abomasal (intestine) Trichostrongyloidea, respectively. The anomaly to this hypothesis was T. axei that occupies the abomasum as the site of infection oppose to the small intestine of T. vitrinus, which implies the presence of other and yet uncharacterized drivers of genetic species diversity of these closely related parasites. The phylogenomic tree based on the concatenated PCGs of the complete mt genomes (Figure 4), placed H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP in their respective clades with other H. contortus and T. circumcincta strains, which are in agreement with previous studies (Jacquiet et al., 1998;Nunes et al., 2013;Rohart et al., 2017;Palevich et al., 2018). However, our tree is not directly comparable to previously described phylogenetic trees that performed different analyses and compared a broader set of taxa, but mainly because the trees were based on the ITS rDNA nuclear genetic markers (Chilton et al., 2001;Hoberg et al., 2004;Jex et al., 2008). While both Haemoncus and Teladorsagia strains grouped together in their respective families, T. circumcincta NZ_Teci_NP was clearly separated from the other strains. Interestingly, H. contortus NZ_Hco_NP and McMaster (Australia) strains grouped together and originate from nearby geographic locations, but separate from H. contortus ISE (East Africa). This observation corroborates our hypothesis on the impact of environmental constraints and preferences acting as secondary drivers of within-species diversity. Overall, we are in agreement with previous phylogenetic studies suggesting that ITS rDNA nuclear genetic markers alone may not provide sufficient information to reveal higher taxonomic level relationships within the Trichostrongyloidea. Our findings indicate that a three-pronged approach that incorporates phylogenetic inertia, pangenome structure/features and environmental data in order to understand the mitochondrial genome evolution.

CONCLUSIONS
This study compares the recently sequenced complete mitochondrial genomes of sibling species of parasitic roundworms, H. contortus and T. circumcincta, two of the most economically important and common pathogenic nematodes infecting small ruminants worldwide. We explored the mitochondrial pangenome features and phylogenomic relationships to assess species-and strain-level diversity among Trichostrongyloidea nematodes and with available mt genome in public databases. Our analyses corroborate previous studies showing that our H. contortus NZ_Hco_NP and T. circumcincta NZ_Teci_NP strains position in their respective clades. Future work should focus on utilizing our insights on the highly variable regions of Haemoncus and Teladorsagia conspecific strains to develop cost-effective DNA-based approaches for novel parasite management and control strategies. The complete mt genomes of the New Zealand H. contortus and T. circumcincta field strains are important contributions to our understanding of meta-population connectivity as well as species-and strain-level evolution in nematodes. With the continuing improvements in sequencing technology combined with a community effort we may be able to reconstruct the true origins of Haemoncus and Teladorsagia, as well as other parasitic nematodes that are of a global interest.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
NP conceived and designed the project, analyzed and interpreted the data, and wrote the manuscript. PM provided bioinformatics support for the project. MM and Y-JC provided resources and critically revised the manuscript. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS
Special thanks to PM, AgResearch Limited, Grasslands Research Centre, Palmerston North, New Zealand, for his contribution to data analysis. This manuscript has been released as a pre-print at Research Square (Palevich et al., 2020).

circumcincta
NZ_Teci_NP (right) mitochondrial genomes. The two-dimensional predicted layout predicted RNA secondary structures were predicted based on the energy model of Mathews et al. (2004). The minimum free energy (MFE) structure of hairpins is colored according to the base-pairing probabilities (red, high; green, mid; blue, low). Blue and red circles around nucleotides represent the beginning and the end of molecules, respectively.

Supplementary Figure 3 | Dot plots showing
Haemonchus and Teladorsagia species and strain-level synteny. The horizontal and vertical axes represent the entire translated mt genome nucleotide sequences, respectively. Each aligning gap-free segment with more than 50% identity is plotted as a black line or dot. Analysis was performed using chaining coverage with PipMaker (Schwartz et al., 2000).