Polymorphisms in TLR4 Gene Associated With Somatic Cell Score in Water Buffaloes (Bubalus bubalis)

Considering the importance of the diseases affecting the productive performance of animals in the dairy industry worldwide, it is necessary to implement tools that help to control and limit the occurrence of such diseases. As increased somatic cell counts (SCC) are a direct expression of the inflammatory process, they are candidates to become the usual parameter for assessing udder health regarding milk quality and for monitoring mastitis incidence. Toll-like receptors are membrane proteins that play a key role in immunity, recognizing pathogens and subsequently activating immune responses. The present study was conducted to identify single nucleotide polymorphisms in the TLR4 gene of buffaloes and to analyze their associations with somatic cell counts. DNA samples of 120 Murrah buffaloes were used. The whole coding region of the TLR4 gene was amplified by polymerase chain reaction and sequenced for polymorphism scanning. A total of 13 polymorphisms were identified in the sequenced regions of TLR4, most of which are in the coding region. The association with the somatic cell score was highly significant (p < 0.001) for all identified polymorphisms of the TLR4 gene (g.54621T>A, g.54429G>T, g.54407T>A, g.46616C>A, g.46613T>G, g.46612A>G, g.46611C>A, g.46609T>G, g.46541C>G, g.46526C>A, g.46516T>C, g.46376C>T, g.46372T>C). Therefore, it is suggested that the polymorphisms of the TLR4 gene can be used as molecular markers for mastitis resistance in buffaloes, due to their association with somatic cell counts.


INTRODUCTION
The water buffalo (Bubalus bubalis) is a species used worldwide as a source of draft power, milk, and meat (1). Milk production is an economically attractive activity, especially because of appreciated milk derivatives, such as "mozzarella," and the remarkable yield efficiency in manufacturing dairy products, which results from the physical-chemical properties of buffalo milk, with its high levels of protein, fat, and minerals (2)(3)(4)(5). This species can be affected by health problems similar to those of cattle, among them mastitis (6), which consists of inflammatory reactions of the mammary gland triggered by the invasion of pathogens or by traumatic events (7). The pathogens most frequently isolated from milk samples of buffaloes with mastitis are gram-positive bacteria, such as Staphylococcus spp. and Streptococcus spp. (6,8), which cause mostly subclinical mastitis (9). However, a low incidence of infection by gram-negative bacteria, such as Escherichia coli and Klebsiella pneumoniae, has also been reported in the species (10,11). Beyond changes in milk composition, the major impacts of mastitis on the dairy industry include the deterioration of animal comfort and welfare, reduced herd performance, increased therapeutic costs, and involuntary culling of cows early in lactation (12). Therefore, it is of paramount importance to develop strategies to reduce the incidence of this disease, such as the implementation of improved management practices and the enhancement of mastitis resistance in dairy herds through selection.
Subclinical mastitis, the most common inflammation of mammary tissue, is asymptomatic. Therefore, its diagnosis and the onset of treatment rely largely on indirect indicators of inflammation, such as the somatic cell count (SCC) test (13). The presence of somatic cells in milk is a normal physiological phenomenon even in healthy animals. Leukocytes and epithelial cells constantly circulate in the mammary gland to guarantee its surveillance and protection against infection (14). However, increased SCC in the milk indicates the recruitment of the first line of cellular defense, particularly neutrophils, from blood to the mammary gland in response to long-term intramammary inflammation caused by gram-positive bacteria (15). As in cattle, an SCC limit of 200,000 cells/ml is usually adopted to define subclinical mastitis in buffaloes (6,8,14,16), whereas values ranging between 11,000 and 171,000 cells/ml have been reported in the milk of healthy cows (11,17). According to Salvador et al. (18) and Cerón-Muñoz et al. (17), the prevalence of subclinical mastitis in Philippine and Brazilian water buffalo herds was 42.8 and 3.2%, respectively.
Beyond being the most common indicator of subclinical mastitis, SCC is widely applied as a selection criterion for enhanced resistance to mastitis, either subclinical or clinical (19). Although there is some controversy regarding the relation between low SCC and clinical mastitis resistance (13), studies in sheep and cattle have demonstrated that animals with higher SCC are more prone to mastitis (20,21). In agreement with these studies, moderate to high genetic correlations between SCC and clinical mastitis (0.59-0.77) have been reported in dairy cattle (22)(23)(24). In addition to its high correlation with clinical mastitis, SCC is an objective measure that can be recorded periodically in large populations, with higher heritability estimates than clinical mastitis itself (25).
Understanding the genomic regions and genes underlying the genetic variation in SCC of buffaloes might be useful for assessing the genetic variability of mastitis resistance in the species, and it may assist the early selection of more robust animals. In addition, it is a pivotal first step toward a better understanding of the biological mechanisms behind mastitis resistance and toward interpreting the relation between clinical and subclinical mastitis, which are mostly associated with the activation of the innate and adaptive immune systems, respectively (15). Toll-like receptor 4 (TLR4) is a pattern-recognition receptor that plays key roles in stimulating the immune system by binding to pathogen-associated molecular patterns (PAMPs), such as lipopolysaccharide, an outer membrane component of gram-negative bacteria, as well as lipoteichoic acid, present in the wall of some gram-positive bacteria (26,27). Although TLR4 is mainly important for inducing an innate immune response, it was overexpressed in the mammary gland of cows (28) and in the milk of buffaloes (29) infected exclusively by gram-positive bacteria. Therefore, TLR4 is an interesting candidate gene for SCC, due to its dual specificity for both gram-negative and gram-positive bacteria, hence its potential role in linking the adaptive and innate immune systems (30). Moreover, Mesquita et al. (31) found a significant association between TLR4 polymorphisms and SCC in Holstein cows, suggesting its potential to improve udder health through marker-assisted selection.
The present study aimed to verify the existence of polymorphisms in the coding region of the TLR4 gene in a Brazilian buffalo population and to verify their potential as molecular markers for infection resistance in the species, by assessing the association between TLR4 polymorphisms and SCC in buffalo milk. In addition, we contrast the predicted protein structures of TLR4 in buffaloes and cattle.

MATERIALS AND METHODS
Animals
The protocol for the present study was based on guidelines of the National Council for Animal Experimentation Control (CONCEA) and approved by the Committee on Ethics in the use of animals (CEUA) (protocol number 014624/17).
The phenotypic data and the biological material used in this study were provided by the Tapuio farm located in Taipu-RN, Brazil, which is part of the buffalo milk-recording program maintained by the Animal Science Department of the São Paulo State University (Unesp). The farm was free of brucellosis, tuberculosis, and leucosis and strictly follows the Brazilian vaccination calendar. The current herd has approximately 688 lactating buffalo dams, with an average milk production of 2,042.11 ± 681.76 kg in up to 270 days of lactation. A total of 120 first-lactation cows that had milk sampled every other month during 2016 for determining the SCC were chosen for genotyping. The 120 cows were chosen to represent the genetic diversity of the herd and to form contemporary groups containing at least three animals. The average SCC in the milk analyses was 36,100 ± 586,000 cells/ml (ranging from 10,000 to 3,258,000 cells/ml). None of the animals presented clinical mastitis; only subclinical mastitis, defined by SCC values exceeding 200,000 cells/ml, was observed. The large variation in SCC makes the studied population a valuable source for an initial evaluation of the association between the candidate gene TLR4 and the udder-health indicator trait SCC.
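As an illustration of how SCC records of this kind are typically handled, the sketch below flags samples above the 200,000 cells/ml threshold cited above and converts SCC to a log2-based somatic cell score. The score formula, SCS = log2(SCC/100,000) + 3, is the convention commonly used in dairy recording and is an assumption here, since the text does not state which transformation was applied. The function names and the list of records are hypothetical.

```python
import math

SUBCLINICAL_THRESHOLD = 200_000  # cells/ml, the cut-off cited in the text


def somatic_cell_score(scc_cells_per_ml: float) -> float:
    """Assumed convention: SCS = log2(SCC / 100,000) + 3."""
    if scc_cells_per_ml <= 0:
        raise ValueError("SCC must be positive")
    return math.log2(scc_cells_per_ml / 100_000) + 3


def is_subclinical(scc_cells_per_ml: float) -> bool:
    """True when the SCC exceeds the 200,000 cells/ml threshold."""
    return scc_cells_per_ml > SUBCLINICAL_THRESHOLD


# Hypothetical test-day records (cells/ml); the first and last values are
# the minimum and maximum of the range reported for the herd.
records = [10_000, 100_000, 200_000, 3_258_000]
scores = [somatic_cell_score(x) for x in records]
flags = [is_subclinical(x) for x in records]
```

On this scale each one-unit increase in the score corresponds to a doubling of the SCC, which compresses the long right tail seen in the raw counts.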
Genomic DNA from the 120 Murrah animals was extracted from tail hair follicles using the commercial extraction kit Macherey-Nagel NucleoSpin® Tissue (Düren, Germany).

Primer Designs and PCR Reactions
The primer pairs used to amplify the coding region of the TLR4 gene (Table 1) were designed using the Primer3 tool (http://bioinfo.ut.ee/primer3-0.4.0/), based on the Bos taurus sequence (GenBank accession number: AC_000165.1). The TLR4 gene consists of 3 exons. Exons 1 and 3 include coding regions and the non-coding regions 5'UTR and 3'UTR, respectively. One pair of primers was designed for exons 1 and 2 each, whereas three further pairs of primers were designed for exon 3.
The amplification of coding regions of the TLR4 gene was performed by polymerase chain reaction (PCR) assays, which were conducted in Bio-Rad S1000 thermal cyclers (Bio-Rad, Hercules, CA, USA) in a final volume of 15 µL, consisting of 100 ng of genomic DNA, 15 pM of each primer and the GoTaq Colorless MasterMix kit (Promega, Madison, WI, USA). The thermal profile consisted of an initial denaturation at 95 °C for 5 min, followed by 34 cycles of denaturation at 95 °C for 30 s, annealing at the melting temperature of each primer pair (Table 1) for 30 s, and extension at 72 °C for 30 s. A final extension at 72 °C for 5 min was performed. Further, 2 µL of PCR products were visualized by electrophoresis on 2% agarose gels stained with the GelRed system (Biotium, Inc., Hayward, CA, USA).

Sequencing of Amplified Fragments
Each amplicon of the 120 samples was purified following the protocol recommended by the Wizard SV Gel and PCR Clean-Up System (Promega) kit. Then, they were sequenced from both primers (Forward and Reverse) by the dideoxynucleotide chain termination technique (ddNTPs) using the ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems) in an automatic ABI 3730 XL sequencer (Applied Biosystems). Then, the obtained DNA sequences were analyzed for identifying the polymorphisms using the CodonCode Aligner program. The structure of the TLR4 buffalo protein was predicted by the Modeler software (32) while the three-dimensional structures of the proteins were generated by the UCSF Chimera (33).

TABLE 1 | Identifying the primers, the TLR4 gene regions where the primers anneal, annealing temperature, and size of the amplified fragment.

Allele and Genotype Frequencies
The allele and genotype frequencies of the polymorphisms detected in the TLR4 gene were estimated for the population by counting. The Hardy-Weinberg equilibrium was verified for each polymorphic site by the chi-square test (χ²) (p < 0.05). The linkage disequilibrium between the pairs of markers was estimated from the r² statistic (34) using the Plink software (35).
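The counting and Hardy-Weinberg steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the genotype counts are hypothetical, the function names are introduced here, and the test is the plain Pearson chi-square with one degree of freedom, compared against the 5% critical value of 3.841.

```python
from collections import Counter


def allele_frequencies(genotypes, allele_a, allele_b):
    """Return (p, q) by gene counting over diploid genotypes such as 'TA'."""
    counts = Counter("".join(genotypes))
    total = 2 * len(genotypes)
    return counts[allele_a] / total, counts[allele_b] / total


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic (1 d.f.) against Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Classes with zero expectation are skipped to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical genotype counts for 120 animals (not data from the study).
genotypes = ["TT"] * 70 + ["TA"] * 40 + ["AA"] * 10
p, q = allele_frequencies(genotypes, "T", "A")
chi2 = hwe_chi_square(70, 40, 10)

CRITICAL_1DF_05 = 3.841  # chi-square critical value, 1 d.f., alpha = 0.05
in_equilibrium = chi2 < CRITICAL_1DF_05
```

With small expected counts in the rare homozygote class, an exact test may be preferable to the chi-square approximation; that choice is outside what the text specifies.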

Association Analysis
The SCC evaluated in the association study refers to the first-lactation records of 120 dairy buffaloes. The association analyses were performed by a mixed model using univariate analyses in the PROC MIXED procedure of the SAS/STAT 9.3 statistical program (SAS Institute, Inc., Cary, NC, USA). The following statistical model was applied:

$$y_{ijkl} = \mu + CG_i + IV_j + T_k + M_l + e_{ijkl}$$

where $y_{ijkl}$ is the SCC determined for the $ijkl$th animal; $\mu$ is the average of the SCC in the population; $CG_i$ is the fixed effect of the $i$th contemporary group (year and season of birth); $IV_j$ is the age of the $j$th dam at birth, considered as a linear and quadratic (co)variable in the model; $M_l$ is the fixed effect of the $l$th genotype for each polymorphism detected in the TLR4 gene; $T_k$ is the random effect of the sire; and $e_{ijkl}$ is the residual random effect associated with the observation $y_{ijkl}$. Since the SCC data do not have a normal distribution, a generalized linear mixed model was fitted assuming a Poisson distribution for the trait. To evaluate the association of each marker with the studied trait, a significance threshold for the P-value was calculated by Bonferroni correction ($\alpha = 0.05/N_{polymorphisms}$). The additive and dominance effects of the significant SNPs were tested after analyzing the associations between the SNPs and SCC. These analyses were performed by orthogonal contrasts using PROC GLM in SAS (SAS 9.2, SAS Institute, Cary, NC, USA). Haplotype probabilities were estimated in the Haploview software to analyze and visualize linkage disequilibrium patterns (36), using the block definition reported by Gabriel et al. (37), in which the region is segmented according to the LD.
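The Bonferroni threshold described above is simple arithmetic, shown here as a short sketch. The number of tests is taken as the 13 polymorphisms reported in the study; the p-values and SNP labels are hypothetical and serve only to show how the threshold is applied.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


N_POLYMORPHISMS = 13  # number of SNPs reported in the study
threshold = bonferroni_threshold(0.05, N_POLYMORPHISMS)  # about 0.0038

# Hypothetical p-values, for illustration only.
p_values = {"snp_1": 0.0001, "snp_2": 0.004, "snp_3": 0.03}
significant = {name: p < threshold for name, p in p_values.items()}
```

Under this correction a nominal p-value of 0.004 would not be declared significant, although it passes the uncorrected 0.05 level.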

RESULTS AND DISCUSSION
A total of 13 SNPs were identified in the studied population (Table 2), three in exon 1 and 10 in exon 3 of the TLR4 gene. The SNP g.54621T>A of exon 1 was the only one located in a non-coding region (5'UTR), whereas the SNP g.46616C>A|T, of exon 3, was the only polymorphism with more than two alternative alleles. The three alleles of this locus lead to three different amino acids in the coded protein, increasing the protein variability level. Moreover, some of the identified polymorphisms were located in the same codon (e.g., g.46611C>A, g.46612A>G and g.46613T>G), which increases the number of possible nucleotide combinations in a codon and also contributes to the protein variability. Almost all SNPs found in the TLR4 gene result in non-synonymous, conservative amino acid substitutions in the protein. A noteworthy variation in the population refers to the possible combination of the alleles A and T of the SNPs g.46612A>G and g.46613T>G, respectively. This allelic combination creates a premature stop codon, which interrupts protein synthesis at 191 amino acids instead of the 841 expected under normal synthesis (41,42). Surprisingly, three of the healthy cows carried two copies of haplotype AT (i.e., AA in SNP g.46612A>G and TT in g.46613T>G), which corresponds to two copies of the premature stop codon and the absence of functional TLR4 protein.
Despite the direct influence of TLR4 on the immune system, this non-sense mutation did not represent a recessive lethal combination, although the truncated protein did not completely encompass the predicted signaling domain of wild-type TLR4 (43). It is questionable whether these animals survived to adult age without being naturally exposed to gram-negative bacterial infections. Therefore, our findings raise a point to be further investigated: the biological mechanisms that compensate for the absence of TLR4 in these buffaloes when it becomes essential. It is noteworthy that the two adjacent SNPs g.46612A>G and g.46613T>G presented the lowest linkage disequilibrium, measured by r², among the SNP pairs in exon 3 (r² = 0.07, Table 3), which might reflect purifying selection frequently acting on the stop-codon-determining haplotype, AT.
Beyond the SNPs within the buffalo species, the sequencing of the amplicons revealed 45 point differences in coding regions between buffaloes and cattle (Supplementary Table 1), 20 of them resulting in amino acid alterations. Some of the across-species missense variations were non-conservative, suggesting that structural and functional differences exist between the species (44). Figure 1 shows the three-dimensional protein structure predicted for cattle and for buffaloes, as well as the predicted structure in the presence of a premature stop codon. It has been postulated that buffaloes are less susceptible to bacterial infection than cattle, due to some anatomic attributes, such as teats that are longer and narrower than those of cattle, which prevents pathogen invasion (6). The structural difference in proteins encoded by the TLR4 genes of the two species might represent an additional physiological mechanism contributing to this difference in mastitis resistance.
The allele and genotype frequencies of each SNP are shown in Table 4. Most SNPs had minor allele frequency (MAF) higher than 0.10, which is favorable for association analyses since low MAF could affect the prediction of SNP effects (45). Most of the SNP frequencies adhered to the expected value of [...] of the factors that affect Hardy-Weinberg Equilibrium, might be acting over this particular locus. SNPs with higher r² tend to segregate together because they are closer. The LD among the g.54621T>A, g.54429G>T, and g.54407T>A SNPs (Table 3) suggested very high concordance between the genotypes of the three loci (0.929, 0.946, 0.982), indicating that their alleles segregate together. Therefore, any of the three SNPs might equally capture the proportion of the genetic variation of the SCC trait that is attributed to a causative mutation close to or among them, or to the block. Likewise, five SNPs of exon 3 (g.46611C>A, g.46609T>G, g.46541C>G, g.46526C>A, and g.46516T>C) also formed a block in high LD. The correlation between the blocks (0.26) and the estimated frequencies of each haplotype are shown in Figure 2, with ACA and GAAAC being the most common haplotypes in exons 1 and 3, with estimated frequencies of 0.75 and 0.34, respectively.
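For readers unfamiliar with the r² statistic used above, the sketch below computes it from haplotype and allele frequencies as r² = D² / (pA(1 - pA) pB(1 - pB)), with D = pAB - pA pB. It assumes phased haplotype frequencies are known, which is a simplification; software such as Plink may estimate the statistic from unphased genotypes. The function name and input frequencies are hypothetical.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical frequencies illustrating the two extremes.
complete_ld = ld_r_squared(0.5, 0.5, 0.5)    # alleles always travel together
independent = ld_r_squared(0.25, 0.5, 0.5)   # haplotype frequency equals product
```

Values near 1, such as those reported for the three exon 1 SNPs, mean that genotyping one marker predicts the others almost perfectly; values near 0, such as the 0.07 between g.46612A>G and g.46613T>G, mean the markers carry largely separate information.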
Adjusted means of SCC were compared according to the genotype classes of each SNP, and significant differences (P < 0.0001) were observed at all loci (Table 5). A significant allele substitution effect was observed for all SNPs, indicating phenotypic differences associated with the number of copies of favorable alleles at each SNP, even if some of them are not the causative mutations. As selection for SCC would target lower values, due to their association with greater mastitis resistance, the alleles related to a reduction in the averages would be favorable, e.g., the allele T of SNP g.54621T>A and the allele A of SNP g.46616C>A. Roughly, for SNP g.54621T>A, where the allele substitution was modeled as the replacement of T alleles by A alleles ("T>A" representing TT > TA > AA), the SNP effect represents the regression coefficient of SCC on the genotypes TT, TA, and AA, hence the substitution effect of the A allele (Table 5). Additionally, some SNPs showed a significant dominance effect, including g.46616C>A, g.46611C>A, g.46541C>G, and g.46526C>A.
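The allele substitution effect described above, a regression of the trait on the number of copies of the substituted allele, can be sketched with ordinary least squares. This is a simplified illustration: it ignores the fixed and random effects of the model used in the study, the data are hypothetical, and the additive and dominance definitions follow the usual textbook parameterization (half the difference between homozygote means, and the deviation of the heterozygote from the homozygote midpoint), which is assumed to correspond to the orthogonal contrasts mentioned in the methods.

```python
from statistics import mean


def allele_substitution_effect(allele_counts, phenotypes):
    """Least-squares slope of phenotype on copies (0, 1, 2) of the substituted allele."""
    x_bar, y_bar = mean(allele_counts), mean(phenotypes)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(allele_counts, phenotypes))
    sxx = sum((x - x_bar) ** 2 for x in allele_counts)
    if sxx == 0:
        raise ValueError("genotypes show no variation")
    return sxy / sxx


def additive_dominance(mean_ref_hom, mean_het, mean_alt_hom):
    """Additive effect a and dominance deviation d from genotype class means."""
    additive = (mean_alt_hom - mean_ref_hom) / 2
    dominance = mean_het - (mean_ref_hom + mean_alt_hom) / 2
    return additive, dominance


# Hypothetical data: copies of allele A (TT = 0, TA = 1, AA = 2) and a trait value.
copies = [0, 0, 1, 1, 2, 2]
trait = [2.0, 3.0, 3.0, 4.0, 4.0, 5.0]
slope = allele_substitution_effect(copies, trait)

# Class means of the same hypothetical data: TT = 2.5, TA = 3.5, AA = 4.5.
a, d = additive_dominance(2.5, 3.5, 4.5)
```

In this toy example the heterozygote lies exactly midway between the homozygotes, so the dominance deviation is zero and the slope equals the additive effect; a non-zero d would indicate the kind of dominance reported for some SNPs.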
To the best of our knowledge, this is the first study of the effect of SNPs in the TLR4 gene on SCC in buffaloes, demonstrating that TLR4 polymorphisms are associated with the SCC in the milk of healthy animals. Only a few studies regarding candidate genes for economically important traits are available in buffaloes (46)(47)(48)(49). For mastitis resistance, they are even scarcer (50). Alfano et al. (51) reported significant associations between SNPs in the TLR4 gene and tuberculosis resistance in dairy buffaloes. The SNP (672A>C) described by these authors is the same reported here as SNP g.46516T>C. This highlights the importance of polymorphisms of this gene in the immune system of ruminants. In cattle, TLR4 polymorphisms have been associated with SCC (31,52). Thus, despite the large number of nonsynonymous divergences between the TLR4 amino acid sequences of buffaloes and cattle, the role of the gene in the immune response of the mammary gland is preserved in both species.
DNA polymorphisms can be used to assist selection for a specific trait when they are associated with that trait. Specific markers can also be used to avoid some congenital recessive defects of genetic origin that affect cattle, such as bovine leukocyte adhesion deficiency (BLAD, 40). Considering the impact of mastitis on milk production, the usefulness of SCC as an indicator of mastitis resistance, and the results found in this study, there is evidence that the TLR4 gene can be used for marker-assisted selection. We emphasize that, despite the promising results, this is a preliminary study with a small population. Further studies involving a larger sample, more generations, and animals with clinical mastitis are encouraged to validate these results. Additionally, the milk cell profile of cows that were homozygous for the premature stop codon, as well as the animals' response to bacterial challenge, are yet to be adequately assessed. The intriguing point behind the stop codon detected in the current study is the well-known importance of TLR4 for the immune system, especially for immune responses to gram-negative bacteria. Non-lethal premature stop codons have been reported for other genes in the literature. For instance, a non-sense mutation was also reported in buffalo JY-1, which encodes an oocyte-specific protein. Additionally, a premature stop codon created a novel favorable allele for muscle development in the GDF8 of cattle (53).
The coding region of the buffalo TLR4 gene differs from the cattle sequence, with many nonsynonymous polymorphisms. The TLR4 gene is highly polymorphic in buffaloes, especially in exon 3. The non-synonymous polymorphisms herein presented show potential as molecular markers for SCC in buffaloes and likely for resistance to clinical mastitis.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: https://www.ncbi.nlm.nih.gov/, NW_005784801.1.

ETHICS STATEMENT
The animal study was reviewed and approved by National Council for Animal Experimentation Control (CONCEA). Written informed consent was obtained from the owners for the participation of their animals in this study.

AUTHOR CONTRIBUTIONS
VR-M, NH-L, GdC, and HT planned the study. VR-M, DC, AdN, and AdF performed the molecular analyses. NH-L, AH, DS, and DSa performed the curation of data and statistical analyses. LA and DC supported the data analysis and contributed to the interpretation of the results. VR-M composed the original draft. All authors read and approved the current draft.