Whole-Genome Resequencing to Study Brucellosis Susceptibility in Sheep

Brucellosis is a zoonotic disease and a major public health problem. However, the genetic mechanism of brucellosis in sheep remains unclear. In this study, serum samples were collected from 6,358 sheep from the F2 population (Dorper sheep ♂ × Hu sheep ♀), and antibody levels were continuously measured at 14 days and 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 months after administration of brucellosis vaccine. Finally, 19 brucellosis-resistant group (BRG) sheep and 22 brucellosis-susceptible group sheep (BSG) were screened for whole-genome sequencing. Using the fixation index, Fisher’s exact test, and chi-square test, a total of 205 candidate SNP sites were identified. Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis suggested that 138 candidate genes were significantly enriched in adherens junction (CTNNA3, PARD3, and PTPRM), cell adhesion molecules (NLGN1, CNTNAP2, NCAM1, and PTPRM), salivary secretion (LOC101102109, PRKG1, and ADCY2), and hippo signaling pathway (CTNNA3, YAP1, and PARD3). These findings provide valuable molecular markers for brucellosis resistance breeding in sheep and novel insights into the genetic mechanism of brucellosis resistance.


INTRODUCTION
Brucellosis is a common zoonosis (Shakir, 2021). After a brucellosis infection, patients show fever, hyperhidrosis, fatigue, and muscle and joint pain for days or even weeks. Most patients present enlargement of lymph node, liver, spleen, and testicle and other suspicious symptoms and signs that severely affect human health. In addition, it is not easy to differentiate brucellosis from other diseases in the early stage. If not treated in the early stage, there is a high probability that the disease will become chronic and recurrent. Some patients may have the disease for years or even decades. Among animals infected with Brucella, female animals often present abortion and infertility, whereas male animals show orchitis, which seriously affects their production performance and hinders the development of animal husbandry, causing serious economic losses. Certain developed countries such as Canada, Britain, the Netherlands, Australia, and other countries have eliminated brucellosis. However, brucellosis continues to pose an economic burden in countries such as Saudi Arabia (Al Jindan, 2021), India (Dhand et al., 2021), and Kenya (Munyua et al., 2021). In conclusion, brucellosis causes huge economic losses in the animal industry and seriously threatens human health (Zhang et al., 2018).
Therefore, it is of great significance to study the genetic mechanism of brucellosis for the development of animal husbandry and human health. Brucella infection activates cellmediated immune responses in animals (Serre et al., 1987). For example, Brucella abortus is recognized by several Toll-like receptor-associated pathways, which triggers proinflammatory responses that affect both the type and intensity of the immune response. Toll-like receptor 6 is required to trigger innate immune responses against B. abortus in vivo and is required for the full activation of dendritic cells to induce robust proinflammatory cytokine production (de Almeida et al., 2013). Cytokines mediate many of the effector phases of the immune and inflammatory responses. Polymorphisms within the coding and non-coding regions of the cytokine genes may affect the level of cytokine production and regulate the immune response. Polymorphisms of IL-10 and TGF-β1 genes are associated with susceptibility or resistance to brucellosis. The IL-6 (-174) GC genotype may be a risk factor for the development of focal complications of brucellosis, whereas the GG genotype may be a protective factor against brucellosis (Karaoglan et al., 2009;Amjadi et al., 2019). Therefore, gene polymorphisms can contribute to susceptibility, control, and resistance to treatment in different infectious diseases.
In the present study, we performed whole-genome resequencing of 19 brucellosis-resistant and 22 brucellosissusceptible sheep. Fixation index (F ST ) has been widely used in the study of candidate genes between groups, including genes associated with traits such as musical aptitude (Liu et al., 2016) and immune function (Mychaleckyj et al., 2017) in humans, vision in sheep (Wang et al., 2019), and immunity and fat deposition in pigs (Qin et al., 2019). Considering the low credibility of using a single method to identify candidate SNP loci, we also used the Fisher test and chi-square test to improve the credibility and accuracy of filtering results. The SNP sites overlapping between these three screening results were selected as candidate sites with significant differences. We aimed to identify genes associated with brucellosis susceptibility in sheep using comparative genomics analysis. Our study could identify valuable molecular markers for brucellosis resistance breeding and providing an important foundation for the study of sheep resistance to brucellosis.

Ethics Statement
All experiments in this study were carried out in accordance with the approved guidelines from the Regulation of the Standing Committee of Gansu People's Congress. All experimental protocols, including the sample collection protocol, were approved by the Ethics Committee of Gansu Agriculture University (China) under permission no. DK-005.

Animals and Blood Serum
A total of 6,358 F2 sheep (Dorper sheep × Hu sheep ) were selected from a sheep farm in Gansu, China. The experimental sheep were all female adult individuals, and there was no historical selection process. Throughout the research process, all the experimental sheep were provided with the same growth environment, the same feeding management, sufficient pellets, and drinking water to ensure that the breeding environment of all sheep was as uniform as possible. The Brucellosis Vaccine (Live, Strain S2) used in this study was purchased from China Qilu Animal Health Co., Ltd. (Jinan, China) (Supplementary Table 1). After lambing, the Brucellosis Vaccine, Live (Strain S2) was diluted to 1.0 ml per sheep, according to the manufacturer's instructions, and the whole population was subcutaneously inoculated. Blood samples were collected 14 days and 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 months after vaccination and centrifuged at 7,500 rpm for 5 min; serum was separated, and the antibody status was determined according to the brucellosis Rose Bengal plate agglutination test (RBT) using the method of the Chinese national standard "Diagnostic technology for animal brucellosis GB/T18646-2018." The standard serum agglutination test (SAT) simultaneously measures the antibody titer and assesses the trend of antibody production and increase in serum levels in sheep after vaccination.
The following procedure was used for the standard SAT procedure for brucellosis in sheep: (1) normal saline containing 0.5% carbolic acid was used to dilute the serum and SAT antigen to be tested. Six test tubes were prepared to dilute each serum sample to be tested; 1.15 ml diluent was added to the first tube, and 0.5 ml diluent was added to the rest of the tubes. (2) Next, 0.1 ml of serum sample was added to the first tube using a 1-ml pipette and mixed well by sucking up the mixture in the test tube into the pipette and then blowing it into the test tube along the wall of the test tube, three to four times. After mixing, 0.25 ml of the mixed solution was sucked up with a pipette and discarded. Then, 0.5 ml of the mixture in the first test tube was added to the second tube using a clean pipette and mixed well using the above mixing. (3) This dilution method was used to complete the dilution in the third, fourth, fifth, and sixth tubes. Finally, 0.25 ml of the mixture in the sixth tube was discarded. After completing the dilution, the dilution degree of the serum of the first to sixth tubes is 1:12.5, 1:25, 1:50, 1:100, 1:200, and 1:400, respectively. (4) Next, 0.5 ml of 20-fold diluted antigen was added to each diluted serum sample and shaken evenly. Thus, the degree of serum dilution was 1:25, 1:50, 1:100, 1:200, 1:400, and 1:800, respectively. Each test tube was incubated at 37 • C for 24 h, after which the tubes were checked for self-coagulation. (5) Positive and negative control serum samples and antigen control samples were established for each test. The methods used for dilution and antigen addition for positive and negative control serum samples were the same as described above.
With reference to the turbidity of Wheat-than, the clarity of the supernatant and the degree of bacterial aggregation were determined as follows: (1) Complete bacterial agglutination and 100% precipitation with 100% clarity of the supernatant were judged as "++++." (2) Almost complete bacterial agglutination and 75% clarity of the supernatant were judged as "+++." (3) Significant bacterial agglutination and 50% clarity of the supernatant were judged as "++." (4) Presence of agglutination and precipitation and 25% clarity of the supernatant was judged as "+." (5) Absence of precipitation of agglomerates and uniformly turbid supernatant was recorded as "−." RBT and SAT methods to group experimental sheep. RBT is a qualitative method to determine Brucella positivity and negativity, while SAT is a more accurate and quantitative method to determine Brucella positivity and negativity. RBT and SAT results were divided into positive and negative. Top19 test sheep with negative BRT and the smallest SAT antibody titer were selected as BRG, and Top22 test sheep with positive BRT and the largest SAT antibody titer were selected as BSG.

DNA Extraction and Resequencing
After antibody detection, jugular vein blood samples (5 ml) from 19 BRG sheep and 22 BSG sheep were collected for DNA extraction. DNA was extracted using an EasyPure Blood Genomic DNA Kit (TransGen Biotech, Beijing, China). DNA was then dissolved in elution buffer (10 mM Tris hydrochloride, 1 mM ethylenediaminetetraacetic acid; pH 8.0) and stored at 20 • C. After dilution to 100 ng/µl, the 41 genomic DNA samples were used to generate 41 libraries with a mean insertion size of 500 bp. The libraries were sequenced using 150 bp pairedend reads using an Illumina NovaSeq 6000 system (Illumina, San Diego, CA, United States).

Sequence Alignment and SNP Calling
FastQC (version 0.10.1) software was used to control the read quality of the original sequence. Linker sequences and sequences containing the uncalled base N were deleted, and the sequences shorter than 60 bp were discarded. The clean reads were mapped onto the sheep reference genome Oar v.4.0 with the Burrows-Wheeler Aligner (BWA, version 0.7.3a) (Li and Durbin, 2009) using the default parameters (bwa mem ref.fna read1.fq read2.fq > aln-pe.sam). Picard (version 1.140) software was used to sort in ascending order the SAM files according to chromosomes and loci to generate BAM files 1 . The duplicates generated because of excessive amplification during library preparation were labeled by Picard software and were not used as evidence for subsequent mutation detection. The Genome Analysis Toolkit (GATK, version 3.4.0) was used to detect variation information in BAM files and generate variant call format (VCF) files. The software VCFtools (version 0.1.14) was used to filter the detection results of SNPs. The screening criteria of SNPs were as follows: (1) the number of support (coverage depth) of SNPs > 3; (2) missing rate per site < 10%; (3) minimum allele frequency (maf) > 5%; (4) removes indel sites. Finally, a total of 6,643,365 SNPs were identified. SNPs were annotated using the ANNOVAR (K. Wang et al., 2010) based on the sheep reference genome Oar v.4.0, and finally the SNPs were divided into exonic regions (variant overlaps a coding), splicing sites (variant is within 2 bp of a splicing junction), intronic regions (variant overlaps an intron), upstream and downstream regions (variant overlap 1-kb region upstream/downstream of transcription start site), and intergenic regions (variant is in the intergenic region) (Supplementary Table 4).

Statistical Analysis
Three methods, fixation index (F ST ), Fisher's exact test, and chi-square test, were used to detect SNP sites with significant differences in allele frequency between BRG and BSG. The top 1,000 SNPs were defined as the significantly different loci.

Fixation Index
The VCFtools software (version 0.1.14) was used to calculate the fixation index (F ST value) of all SNP sites. F ST quantifies the allele frequency differences between BRG and BSG, and SNP sites with large F ST values may become putative selection sites. The F ST values were calculated for each pair of loci following the methods of Weir and Cockerham (Weir and Cockerham, 1984) and using VCFtools (Danecek et al., 2011) from the VCF file. The F ST values were arranged in descending order, and the SNPs of the first 1,000 sites were defined as important SNPs. These SNP sites will be used for subsequent candidate gene analysis.

Fisher's Exact Test and Chi-Square Test
Association analysis was performed for BRG group versus BSG group with the PLINK (version 1.9) software, using the Fisher's exact test and chi-square test (Purcell et al., 2007). Pearson chisquare values and p-values (chi-square and Fisher's exact test) were calculated using Haploview4.2. The first 1,000 SNPs were defined as significant SNPs, which will be used for subsequent analysis of candidate genes.
Finally, to further identify the credible SNPs, the top 1,000 SNPs detected by the above three methods were intersected to identify the overlapping SNPs.

Gene Annotation and Functional Analysis
The identified SNPs were annotated as the closest genes with a nearest gene distance of 1 kb (Oar_v4.0). The genes were defined as candidate genes. KOBAS software was used to test the statistical enrichment of candidate genes in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis (Xie et al., 2011).

Brucella Antibody Levels
Supplementary Table 1 shows the results of antibody titers in the serum agglutination test of 19 BRG and 22 BSG experimental sheep. Figure 1 shows the levels of Brucella antibodies in BRG and BSG. Results indicated that the Brucella antibody titer levels of BSG were significantly higher than those of BRG for 10 consecutive months after vaccination (p < 0.01).

Whole-Genome Resequencing and SNP Identification
The Illumina NovaSeq 6000 system was used to sequence the DNA of 41 sheep (BRG, n = 19 and BSG, n = 22). The system produced a total of 977.167 Gb of raw data. After inspecting the raw data for quality, we filtered out lowquality sequences and sequences containing joints, resulting in 948.21 Gb of clean data. The effective sequencing rate (clean_data/raw_data) was 97.04% (Supplementary Table 2), indicating that the data had high quality and could be used for further in-depth analysis. Using BWA, the clean reads were compared with the Ovis aries reference genome sequence (Oar v4.0), and the average mapped reads and average depth were about 99.75% and 8.5×, respectively (Supplementary Table 3). We filtered out the low-quality SNPs and annotated the detected genetic variations functionally with ANNOVAR software, and 6.64 million SNPs appeared in the final result of annotation (Supplementary Table 4).

Candidate SNPs and Genes
Three methods were used to identify SNPs and genes associated with Brucella susceptibility in sheep. Manhattan plots of genomewide F ST , Fisher's exact test, and chi-square test analyses are depicted in Figure 2. First, we extracted the top 1,000 SNPs by F ST value (Supplementary Table 5) and -log10(P) value of Fisher's exact test (Supplementary Table 6) and chi-square test (Supplementary Table 7), respectively. The top 1,000 SNPs were annotated to the closest gene (Oar_v4.0), and these genes were defined as candidate genes. The list of candidate genes and    Table 10). Intersection of the three groups of identified genes and SNPs showed that only a small fraction of SNPs could be detected by only one of the three methods, whereas most SNPs overlapped between the three methods; 138 candidate genes and 205 overlapping SNPs were identified (Figure 3 and Supplementary Table 11).

KEGG Enrichment Analysis
The 138 overlapped genes obtained above were analyzed using KEGG pathway enrichment analysis; the top 20 enriched pathways are shown in Figure 4. Subsequently, four significantly enriched KEGG pathways (p < 0.05) were identified: adherens

DISCUSSION
In this study, three methods [fixation index (F ST ), Fisher's exact test, and chi-square test] were used to identify Brucella susceptibility genes in the sheep genome. The experimental sheep were grouped on the basis of the serum levels of Brucella antibodies of different individuals. Sheep with low serum antibody persistence were classified as brucellosis susceptible, and those with high serum antibody persistence were classified as brucellosis resistant. Then, SNP loci showing differences in allele frequency between BRG and BSG were identified as candidate loci closely associated with Brucella resistance (Figure 2). In the end, 205 overlapping SNPs annotated 138 genes, and some of these genes contain multiple SNP sites (Supplementary Table 11); this finding is likely because genes associated with disease phenotypes often contain multiple SNP loci. Genes known to be highly associated with brucellosis contain multiple SNP sites, such as TGF-β (Bravo et al., 2008;Rafiei et al., 2007), IFN-γ (Bravo et al., 2003;Rasouli and Kiany, 2007;Hedayatizadeh-Omran et al., 2010;Eskandari-Nasab et al., 2013), TNF-α (Caballero et al., 2000;Davoudi et al., 2006;Batikhan et al., 2010), IL-15 (Kalani et al., 2011), and TNF-α (Caballero et al., 2000;Kalani et al., 2011).
The 138 genes identified above were used for KEGG pathway enrichment analysis to explore the putative biological functions of these genes. The KEGG pathway analysis identified a total of 12 significant KEGG pathways (Supplementary Table 12), and we mainly focused on the pathways closely related to Brucella infection, which were mainly associated with cell adhesion ( Table 1). Brucella invasion is the first step in which phagocytes play a role; the bacteria first adhere to host cell membrane through lipopolysaccharide receptors and lipid valves. FC receptors play a role in processes such as intrusion of bacteria into the cell to reproduce, transfer through blood, and lymph node invasion of host organs through adhesion, host response, organ inflammation, and organ lesion (Amjadi et al., 2019), such as membranoproliferative glomerulonephritis (Provatopoulou et al., 2018), testicular abscess (Vallianou et al., 2018), and splenic abscess (Sudulagunta et al., 2017).
In pathways associated with adherens junctions and CAMs, genes closely related to cell adhesion-such as CTNNA3, PARD3, PTPRM, NLGN1, CNTNAP2, and NCAM1-play a role. CTNNA3 mediates cell adhesion (Janssens et al., 2001); when CTNNA3 was knocked down, CTNNA3-depleted cells showed abnormal skeleton (Stahn et al., 2016). In esophageal cancer, PARD3 overexpression promotes cell apoptosis, inhibits cell proliferation, and inhibits cell migration and invasion, whereas PARD3 silencing promotes cell proliferation and increases migration and invasion (Wang et al., 2017). PTPRM overexpression in small intestinal neuroendocrine tumor cell lines reduced cell growth and proliferation and induced apoptosis (Barazeghi et al., 2019). Neuroligins are cell differentiation molecules located on the postsynaptic side of the synapse, which interact with their presynaptic partners neurexins to maintain trans-synaptic connection (Südhof, 2008). Mutations in the CNTNAP2 gene are partly responsible for autism and other disorders as the gene regulates neural connections in the frontal lobes (Scott-Van Zeeland et al., 2010). TGF-β1 regulates the adhesion properties of cardiomyocytes through an NCAM1-dependent mechanism and is detrimental to the heart (Ackermann et al., 2017). These genes may be related to the transfer of Brucella in the body and associated pathogenic processes, but these findings need further verification to confirm their role in the pathogenesis of brucellosis.
Adherence of Brucella to human epithelial cells and macrophages is mediated by sialic acid residues (Castañeda-Roldán et al., 2004). In our study, salivary secretion was identified as a significant KEGG pathway; PRKG1 and ADCY2 are involved in cell adhesion in different diseases. For example, PKAG1 mutations drive the relaxation of aortic smooth muscle cells and induce aortic disease (Milewicz et al., 2017), Brucella can cause recurrent episodes of vascular inflammation (Korkmaz et al., 2016). ADCY2 can catalyze the formation of signal molecule cAMP in response to G protein signal transduction, and downstream signal transduction cascade mediates the change in gene expression patterns, leading to increased IL-6 production (Ding et al., 2004), which is involved in the development of the chief complications of brucellosis (Karaoglan et al., 2009).
Functional deletions in the Hippo signaling pathway indicate significant overgrowth in the Hippo signaling pathway owing to increased cell proliferation and reduced cell death (Pan, 2010). The complexity of regulation of YAP1 and TAZ was greatly increased, and more regulatory components were discovered. The Hippo pathway is inter-linked with other cancer-related pathways, especially transforming growth factor-β (TGF-β) and WNT pathways (Moroishi et al., 2015). TGF-β1 may be involved in susceptibility to brucellosis and development of main forms of the disease (Bravo et al., 2008). It is speculated that these genes play an important role in the pathogenesis of brucellosis, but further studies are needed to confirm these findings.
Unfortunately, no SNP loci reached significant levels after p-value correction using FDR and Bonferroni methods (corrected p-value < 0.05) in this study, which may be due to the small sample size. Therefore, in the follow-up study, the sample size should be expanded and the FDR or Bonferroni methods should be used for screen candidate genes and SNP sites.

CONCLUSION
In conclusion, four significant pathways and nine candidate genes related to brucellosis susceptibility in sheep were identified by whole-genome resequencing using brucellosis-susceptible and brucellosis-resistant sheep. These pathways and genes are closely related to the process of cell adhesion. These results provide valuable candidate genes and lay an important foundation for the study of brucellosis in sheep.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm. nih.gov/, SRR11971264-SRR11971304.

ETHICS STATEMENT
The animal study was reviewed and approved by Ethics Committee of Gansu Agriculture University (China). Written informed consent was obtained from the owners for the participation of their animals in this study.

AUTHOR CONTRIBUTIONS
FL and WW designed the experiments. XL and WW analyzed the data. XL wrote the manuscript. CL, DZ, GL, XZ, YKZ, YZ, and ZS contributed to sample collection and prepared biological samples. WW, QW, and XL revised the manuscript. All authors read and approved the final manuscript.