Genomic Region Containing Toll-Like Receptor Genes Has a Major Impact on Total IgM Antibodies Including KLH-Binding IgM Natural Antibodies in Chickens

Natural antibodies (NAb) are antigen binding antibodies present in individuals without a previous exposure to this antigen. Keyhole limpet hemocyanin (KLH)-binding NAb levels were previously associated with survival in chickens. This suggests that selective breeding for KLH-binding NAb may increase survival by means of improved general disease resistance. Genome-wide association studies (GWAS) were performed to identify genes underlying genetic variation in NAb levels. The studied population consisted of 1,628 adolescent layer chickens with observations for titers of KLH-binding NAb of the isotypes IgM, IgA, IgG, the total KLH-binding (IgT) NAb titers, total antibody concentrations of the isotypes IgM, IgA, IgG, and the total antibodies concentration in plasma. GWAS were performed using 57,636 single-nucleotide polymorphisms (SNP). One chromosomal region on chromosome 4 was associated with KLH-binding IgT NAb, and total IgM concentration, and especially with KLH-binding IgM NAb. The region of interest was fine mapped by imputing the region of the study population to whole genome sequence, and subsequently performing an association study using the imputed sequence variants. 16 candidate genes were identified, of which FAM114A1, Toll-like receptor 1 family member B (TLR1B), TLR1A, Krüppel-like factor 3 (KLF3) showed the strongest associations. SNP located in coding regions of the candidate genes were checked for predicted changes in protein functioning. One SNP (at 69,965,939 base pairs) received the maximum impact score from two independent prediction tools, which makes this SNP the most likely causal variant. This SNP is located in TLR1A, which suggests a fundamental role of TLR1A on regulation of IgM levels (i.e., KLH-binding IgM NAb, and total IgM concentration), or B cells biology, or both. This study contributes to increased understanding of (genetic) regulation of KLH-binding NAb levels, and total antibody concentrations.


INTRODUCTION
Modern poultry production is facing high impact changes in production systems and management (1,2). These changes may enhance the risks of infectious disease, because of higher pathogenic pressure. According to FAO statistics, economic losses in poultry production due to diseases were as high as 10-20% of the gross production value in developed countries in 2014. 1 Only few years ago, diseases were controlled by (preventive and abundant) use of antibiotics, but this resulted in an increased risk for antibiotic resistance. New sustainable strategies for obtaining robust populations need to be developed, in addition to vaccination strategies, and feed formulations. Selective breeding for an improved immune system might be an important additional strategy to improve general disease resistance (3).
Selective breeding for an improved immune system requires a trait that is heritable, easy to measure, and preferentially related to general disease resistance (4). Natural antibodies (NAb) might be a good candidate trait. NAb are defined as antigen-binding antibodies present in individuals without a previous exposure to this antigen (5). NAb play an essential role in the innate, and adaptive immunity against self-antigens, and various types of pathogens, and are contributing to healthy aging, and disease resistance (6)(7)(8). Different NAb isotypes are found: predominantly IgM, but also IgA, and IgG (8,9). In earlier studies, high NAb titers (especially IgM) binding the overt antigen keyhole limpet hemocyanin (KLH) at adolescence, but not at later age, were related to lower mortality of commercial layer chickens (10)(11)(12). KLH-binding NAb levels at adolescence were previously estimated to have heritabilities between 0.07 and 0.14 in white layer chickens (4). In addition, some genomic regions underlying genetic variation of NAb titers were identified by using dedicated single-nucleotide polymorphism (SNP) sets with 1,022 SNP (13,14). However, to the best of our knowledge, no genome-wide association studies (GWAS) for NAb levels in chicken have been reported.
This study describes the first GWAS on NAb in chicken. The studied population consisted of 1,628 adolescent purebred White Leghorn chickens with observations for KLH-binding NAb titers of the isotypes IgM, IgA, IgG, and total KLH-binding NAb titers (IgT). In addition, phenotypes on total antibody concentrations of the isotypes IgM [total IgM (tIgM)], IgA (tIgA), IgG (tIgG), and total antibody (tIgT) concentration were collected to investigate similarities/differences between KLH-binding NAb, and total antibody concentration. Identification of associated genomic regions was performed by using 57,636 SNP in a single SNP GWAS. Regions were fine mapped by performing an association study using imputed genotypes based on sequence data with the aim to identify candidate gene(s), and possibly the causal variant(s). The role of the found association(s) in antibody-related immunity in chickens will be discussed.

Ethics Approval Statement
Collection of samples and data was done according to Hendrix Genetics (HG) protocols, under the supervision of HG employees. 1 http://www.fao.org/ag/againfo/themes/en/poultry/animal_health.html.
Samples and data were collected as part of routine data collection in a commercial breeding program for layer chickens in The Netherlands. Samples and data were collected on a breeding nucleus of HG for breeding purposes only and are a nonexperimental, agricultural practice, regulated by the Act Animals, and the Royal Decree on Procedures. The Dutch Experiments on Animals Act does not apply to non-experimental, agricultural practices. An ethical review by the Statement Animal Experiment Committee was therefore not required. No extra discomfort was caused for sample collection for the purpose of this study.

Study Population
Samples and data were obtained from a purebred White Leghorn chicken line (in other work referred to as "WA"), which is a layer chicken line selected mainly for egg production, but also for other production traits, e.g., traits related to egg quality. The studied chicken population comprised 1,628 chickens, of which 696 males and 932 females. The study population originated from 112 sires and 288 dams. 437 chickens in the current study were also included in the study of Berghof et al. (4). Of these 437 chickens, 222 female chickens were also included in the study of van der Klein et al. (15). The chickens were kept according to standard management of breeding nucleus farms of HG. Further details can be found in van der Klein et al. (15).
The chickens received obligatory vaccinations against Marek's disease [ Blood of the study population was collected once between 15 and 22 weeks of age (i.e., adolescence), without anesthesia/analgesia. Samples of the study population were collected in four batches with approximately 1.75 years between the first and last batch. No chickens were killed for sample collection. The blood samples were centrifuged, and plasmas and blood cells were collected separately, and stored at −20°C until use.
positive plasma sample was set to the sixth dilution. A linear regression line of the logit OD against the respective log 2 -dilution values of the averaged duplicate standard positive plasma samples was determined, which resulted in regression coefficient β. Titers of plasma samples per plate were calculated using: were logit OD lpw is the estimated logit OD at the lpw calculated with the estimated linear regression function using the log 2dilution value of that well, logit OD sample is the logit OD calculated of the OD closest to 50% of OD max for a plasma sample of an individual (OD sample ), β is the regression coefficient of the estimated linear regression function of the averaged duplicate standard positive plasma samples, and log 2 (dilution sample ) is the log 2 -dilution value at which OD sample occurred, as described by De Koning et al. (17). The total number of observations was 1,627 for IgM NAb, 1,608 for IgA NAb, 1,623 for IgG NAb, and 1,625 for IgT NAb (see Table 1).
(3) Washing procedure consisted of washing each well two times with 200 μl and subsequently three times with 100 μl wash solution. Plates were emptied, and tapped dry in between, and at the end. (4) 100 μl ELISA blocking solution was added to each well. (5) A minimum of five samples per plate was maintained to ensure proper adjustment of antibody concentrations for plate effects in the statistical analyses. (6) A duplicate of a standard positive TABLE 1 | Descriptive statistics of KLH-binding natural antibody (NAb) titers, and total antibody concentrations (μg/ml) in an adolescent White Leghorn chicken population for KLH-binding IgM NAb (IgM) titer, KLH-binding IgA NAb (IgA) titer, KLH-binding IgG NAb (IgG) titer, total KLH-binding NAb (IgT) titer, tIgM concentration, total IgA (tIgA) concentration, total IgG (tIgG) concentration, and total antibody (tIgT) concentration.

KLH-binding natural antibody titers
Total antibody concentrations (μ μ μg/ml) plasma sample (obtained from Bethyl Laboratories) was used per plate. (7) Enzyme substrate (TMB), and ELISA stop solution were made and applied as specified for "KLH-binding natural antibody titers" (see above). (8) OD were measured with a Multiskan Go. (9) All samples on a plate were corrected for the average background OD of the two blanks. (10) SoftMax ® Pro 7.0 build 226962 (Free Trial) was used for generating a 4-parameter logistic curve fit based on the averages of the two standard ranges, and calculating isotype concentrations per sample. The total concentration (μg/ml) of antibodies (tIgT) was calculated by summing the concentrations for tIgM, tIgA, and tIgG. tIgM, tIgA, tIgG, and tIgT were log 10 -tranformed to normalize the data for statistical analyses. The total number of observations was 1,619 for tIgM, 1,586 for tIgA, 1,615 for tIgG, and 1,573 for tIgT (see Table 1).

Genotypes
DNA was extracted from the collected blood cells. The study population was genotyped for a 2,740 SNP set (n = 488) (Illumina, San Diego, CA, USA), or for a 52,232 SNP set from which 11,173 SNP were used (n = 1,140) (Illumina). Both sets were imputed with Beagle 4.0 (18) (accuracy ≥ 97%) to a 57,636 SNP set, based on approximately 120 key ancestors of this chicken line. The SNP distribution over the chicken chromosomes (based on Gallus_gallus-5.0) can be found in Table S1 in Supplementary Material.
Quality control was applied in four steps: (1) All monomorphic SNP were removed. (2) SNP with at least one genotype class with less than nine chickens with phenotypic observations (~0.5% of the population) were removed. (3) SNP on autosomes with a strong deviation from Hardy Weinberg Equilibrium (χ 2 1 -value ≥ 600) were removed. (4) SNP on the allosome Z with a strong deviation from Hardy Weinberg Equilibrium (χ 2 1 -value ≥ 600) in males were removed for the whole study population. These criteria were chosen to exclude SNP which are likely to have a high rate of genotyping error and to exclude very low frequency SNP from the data set (19).

Genetic Parameters
The additive genetic variances (σ 2 a ), and the maternal environmental variance (σ 2 m ) for KLH-binding NAb titers were previously estimated in a similar, but larger population of this chicken line (4) (see Table 1). The additive genetic variances (σ 2 a ), and heritabilities for tIgM, tIgA, tIgG, and tIgT concentrations were estimated with model 1, in a similar approach as for the NAb titers described by Berghof et al. (4).
Model 1 was where y ij is the tIgM concentration, tIgA concentration, tIgG concentration, or tIgT concentration, μ is the overall mean, P i is the fixed effect of plate on which a sample was analyzed (i = 1-26 for tIgM and tIgA, or i = 1-27 for tIgG and tIgT), Age ij is the covariate describing the effect of age at sampling (15-22 weeks of age) with regression coefficient β 1 , id j is the random additive genetic effect of the jth chicken assumed to be distributed as ∼N(0,A σ 2 a ), and e ij is the residual term assumed to be distributed as ∼N(0,Iσ 2 e ). Assumed (co)variance structures of the random model terms are Aσ 2 a and Iσ 2 e , in which A is the additive genetic relationship matrix, σ 2 a is the additive genetic variance, I is an identity matrix, and σ 2 e is the residual variance. The pedigree used to construct A consisted of 2,537 individuals, and was based on a minimum of three generations of ancestors for each individual in the study population. The estimated plate effect also corrects for (confounded) effects on the samples, such as sex and storage. The estimated regression coefficient of age also corrects (partly) for (confounded) effects of vaccinations at 16 weeks of age.
Model 1 extended with a random maternal environmental effect [for IgM NAb only, see Berghof et al. (4)], were also used to estimate genetic, and phenotypic correlations with bivariate analyses, as described by Berghof et al. (4). The genetic correlations give estimates on the extent to which two traits are influenced by the same genetic variations (i.e., genes). The phenotypic correlations give estimates on the extent to which two traits are influenced by the same genetic variation, and the same environmental variation (e.g., position of cage in the stable, temperature, age). See also Stear et al. (20) for a detailed explanation.

Genome-Wide Association Studies
A single-SNP linear animal model was used for studying KLHbinding IgM, IgA, IgG, and IgT NAb titers and tIgM, tIgA, tIgG, and tIgT concentrations associations.
All traits, except IgM NAb, were analyzed using model 2a: and IgM NAb was analyzed using model 2b: where y ijk is the IgA NAb titer, IgG NAb titer, IgT NAb titer, tIgM concentration, tIgA concentration, tIgG concentration, or tIgT concentration, y ijkl is the IgM NAb titer, μ is the overall mean, P i is the fixed effect of plate on which a sample was analyzed (i = 1-91 for IgM NAb, IgG NAb, and IgT NAb; i = 1-100 for IgA NAb; i = 1-26 for tIgM; and tIgA, or i = 1-27 for tIgG, and tIgT), Age ijk and Age ijkl are the covariates describing the effect of age at sampling (15-22 weeks of age) with regression coefficient β 1 , SNP j is the fixed effect of genotype class j (j = AA, AB, or BB) of the analyzed SNP, id k is the random additive genetic effect of the kth chicken assumed to be distributed as ∼N(0,Aσ 2 a ), d l is the random effect of the lth dam (IgM NAb only) assumed to be distributed as ∼N(0,Iσ 2 m ), and e ijk and e ijkl are the residual terms assumed to be distributed as ∼N(0,Iσ 2 e ). Assumed (co)variance structures of the random model terms are Aσ 2 a , and Iσ 2 e (for all), and Iσ 2 m (for IgM NAb only), in which A is the additive genetic relationship matrix, σ 2 a is the additive genetic variance, I are identity matrices, σ 2 e is the residual variance, and σ 2 m is the maternal environmental variance (for IgM NAb only).
The variances for random factors were fixed to values obtained from analyses from models without a SNP effect: the additive genetic variances (σ 2 a ) and the maternal environmental variance (σ 2 m ) for NAb titers were fixed to values estimated by Berghof et al. (4) in a similar population of this chicken line. These estimates are more accurate, because the studied population was bigger. The additive genetic variances (σ 2 a ) for tIgM, tIgA, tIgG, and tIgT, concentrations were estimated with model 1.
The pedigree used to construct A consisted of 2,537 individuals, and was based on a minimum of three generations of ancestors for each individual in the study population. The estimated plate effect also corrects for (confounded) effects on the samples, such as sex and storage.
p-values of SNP effects were checked for inflation (22), which was expressed as inflation factor λ (23). λ was estimated using the "estlambda()" function of the R package "GenABEL" (24). Genomic control was applied when λ ≥ 1.1 [Wellcome Trust Case Control Consortium (25), taken from Duijvesteijn et al. (26)] by dividing the test statistics (F-values) by λ, and subsequently recalculating the p-values. A genome-wide false discovery rate (FDR) was estimated based on observed p-values using the R package "qvalue" (27). A FDR ≤ 0.05 was used to indicate significant associations, and a FDR ≤ 0.20 was used to indicate suggestive associations. The lead SNP [SNP with highest −log 10 (p-value)] variance as a percentage of the additive genetic variance within an associated region was estimated as σ 2 SNP /σ 2 a × 100%, where σ 2 SNP was calculated based on estimated SNP effects from model 2a for all, except IgM NAb, or model 2b for IgM NAb and the genotypic frequencies. Significant differences between genotype classes were tested with a T-test using the "!TDIFF" statement in ASReml ® 4.1 (21). A p-value ≤0.05 was considered to be significant, and a p-value ≤0.10 was considered to be suggestive.

Fine Mapping, and Identification of Possible Causal Variant
The chickens were imputated to full genome sequence in two steps based on the full genome sequences of 70 key ancestors of this chicken line. Imputation was done with Beagle 4.0 (18) (accuracy ≥ 88%) by imputing the 57,636 SNP set (for the 2,740 SNP set), or the 52,232 SNP set to full genome sequence with an average sequence depth of 12.4 (and SD of 2.1).
Associated genomic regions were selected for fine mapping, if at least 1 SNP was significant (FDR ≤ 0.05). Fine mapping regions were defined based on the last significant SNP on the border of significantly associated regions. The fine mapping regions consisted of the first SNP outside the significantly associated regions −5 and +5 Mbp.
Quality control of the imputed SNP that were used in the fine mapping studies, was applied in four steps as described above. Statistical analyses for the selected regions were performed as described above. Genomic control was applied using λ values from previous analyses, to be able to compare between analyses.
Single-nucleotide polymorphism of candidate genes in the fine mapped regions were checked for amino acid substitutions and predicted changes in protein functioning as a result of the nucleotide change in coding regions by using the Ensembl Variant Effect Predictor (28), including the integrated SIFT option (29), and by using PolyPhen-2 (30).

RESULTS
In total 1,628 chickens, of which 696 males, and 932 females, were in the study population. Descriptive statistics and genetic parameters are shown in Table 1. Mean titers for KLH-binding NAb were 5.8 for IgM, 5.4 for IgA, 5.5 for IgG, and 5.6 for total KLH-binding NAb (IgT). Mean concentrations for total antibody concentrations were 355 μg/ml for tIgM, 314 μg/ml for tIgA, 7,856 μg/ml for tIgG, and 8,484 μg/ml for total antibody concentration (tIgT). Heritabilities for total antibody concentrations were estimated to be 0.23 for tIgM, 0.22 for tIgA, 0.06 for tIgG, and 0.08 for tIgT. Maternal environmental effects were not significant for any of the total antibody concentrations. Age was not significant for tIgA, and tIgT but was kept in the model to be consistent with the other analyses. Heritabilities and maternal environmental variances (σ 2 m ) for NAb were previously estimated in a similar, but larger population of this chicken line (see Table 1) (4).
Genetic, and phenotypic correlations between the corresponding types of KLH-binding NAb titers and total antibody concentrations are shown in Table 2. Genetic correlations were: 0.91 for IgM NAb/tIgM, 0.38 for IgA NAb/tIgA, −0.61 for IgG NAb/tIgG, and −0.27 for IgT NAb/tIgT. Phenotypic correlations were: 0.41 for IgM NAb/tIgM, 0.26 for IgA NAb/tIgA, 0.08 for IgG NAb/tIgG, and 0.03 for IgT NAb/tIgT. In addition, genetic and phenotypic correlations between all total antibody concentrations were estimated (see Table S2 in Supplementary Material).
Quality control of the SNP resulted in removal of 37,053 SNP, because they were monomorphic (step 1), and an additional removal of 5,005 SNP, because of a low number of observations per genotype class or a strong deviation from Hardy-Weinberg equilibrium (steps 2-4) (see Table S1 in Supplementary Material). In total, 15,579 SNP were used for GWAS, though the actual number of used SNP per trait varied between 15,431 and 15,578 SNP due to missing phenotypes. Figure 1 shows the "Manhattan plots" of the GWAS. Results shown for KLH-binding IgM NAb,  Table S3 in Supplementary Material). For KLH-binding IgM NAb, and tIgM several SNP were significantly associated (see Table S3 in Supplementary Material). For KLH-binding IgG NAb, tIgA, tIgG, and tIgT no significant or suggestive associations were detected. Tables 3 and 4 characterize the significantly, and suggestively associated genomic regions based on the GWAS results. The associated regions, and additional information can be found in Table 3. The lead SNP and additional information can be found in Table 4. Table S3 in Supplementary Material gives an overview of associated genotyped and imputed SNP.  One genomic region showed an association with KLH-binding IgT NAb, and tIgM and showed the strongest association for KLHbinding IgM NAb. The genomic region was located on chromosome 4 around 70 M base pairs (bp), and consisted of 3 suggestive SNP for IgT NAb, 4 significant, and 1 suggestive SNP for tIgM, and 35 significant SNP for IgM NAb. The genomic region on chromosome 4 had its lead SNP for IgT NAb on 69,587,709 bp (rs313004783), for tIgM on 69,814,286 bp (rs313437715), for IgM NAb on 69,702,481 bp (rs15614874). Remarkably, the heterozygous genotype class was not significantly different from one of the homozygous genotype classes, indicating full dominance. The lead SNP variance as a percentage of the additive genetic variance was approximately 16.0% for KLH-binding IgT NAb, 14.1% for tIgM, and 57.6% for KLH-binding IgM NAb, illustrating that the genomic region on chromosome 4 has a major effect on the genetic variation of KLH-binding IgT NAb, tIgM, and especially on KLH-binding IgM NAb.
Two genomic regions were detected for KLH-binding IgA NAb. The first region was located on chromosome 9 around 12 Mbp and consisted of two suggestive SNP. The lead SNP was located at 12,261,658 bp (rs15969591). The lead SNP variance was approximately 13.5% of the additive genetic variance for IgA NAb. The second region was located on chromosome 18 around 10 Mbp. This region consisted of three suggestive neighboring SNP with equal p-values. The region showed complete dominance. The variance of the region was approximately 0.2% of the additive genetic variance for IgA NAb.
The 3,153 SNP in this region were checked for amino acid substitutions in coding regions, and consequential changes in protein functioning as a result of the nucleotide change. The Ensembl Variant Effect Predictor predicted 28 SNP to result in different amino acids. All 28 SNP could be predicted with SIFT, PolyPhen-2 or both. SIFT predicted amino acid substitutions to be "tolerated" for 11 SNP, and "deleterious" for 2 SNP. The remaining 15 SNP could not be predicted. PolyPhen-2 predicted amino acid substitution to be "benign" for 22 SNP, "possibly damaging" for 1 SNP, and "probably damaging" for 1 SNP. SIFT and PolyPhen-2 gave both the maximum impact score ("deleterious, " and "probably damaging") to 1 SNP (69,965,939 bp), which is considered the most likely candidate as causal variant for the major effects on KLH-binding IgT NAb, tIgM, and especially on KLH-binding IgM NAb. This SNP consists of a cytosine (C)/guanine (G) polymorphism, and results in a phenylalanine (F)/leucine (L) amino acid substitution in TLR1A at protein position 126. The C-variant had an allele frequency of 0.45, and the G-variant had an allele frequency of 0.55. Based on Figure 1 from Keestra et al. (33), the SNP is located in leucine-rich repeat (LRR) 4.
Significance and effects of the predicted causal variant on all measured traits can be found in Table 5. KLH-binding IgM NAb, KLH-binding IgG NAb, KLH-binding IgT NAb, and tIgM showed a significant effect of the predicted causal variant. For these traits (except IgG NAb), the GG genotype class had significantly lower phenotypic values than the CG and CC genotype classes. The CG and CC genotype classes were not significantly different from each other, showing full dominance. The estimated genetic variance explained by the predicted causal variant was between 7.3 and 15.8% for KLH-binding IgG NAb, KLH-binding IgT NAb, and tIgM, and was 63.5% for KLH-binding IgM NAb. KLH-binding IgA NAb, tIgA, tIgG, and tIgT were not significantly associated.

DISCUSSION
Relatively few GWAS have focused on detecting genomic regions associated with immunity in chicken (3,34,35). Though two KLH-binding NAb association studies with dedicated SNP sets were conducted across chicken lines (13,14). To the best of our knowledge, this study is the first genome-wide association study on (KLH-binding) NAb levels in chickens. None of previously associated genes (13,14) could be confirmed in this study, likely because of the high genetic uniformity in our data set as a result of a within line study instead of an across line study. No other study found the associated region on chromosome 4 reported in this study, which might indicate that this association is unique for this chicken line. This is also the first study that investigated heritabilities of total antibody concentrations or performed GWAS on all total antibody isotype concentrations in chickens. 57,636 SNP were used to perform the GWAS. 73.0% of the SNP were non-informative, mainly because the SNP were fixed (62.2%). This is in agreement with previously reported genetic uniformity in White Leghorn layer chicken lines (36). In total 15,580 SNP passed quality control. For five out of eight analyzed traits population stratification was found to be present, which suggests family structures within the studied population that are not captured by the pedigree information. Other factors, like heterogeneous variance, or model misspecification, likely also contributed to this, since inflation factors were not of equal size among the traits. Especially for KLH-binding IgM NAb, the population stratification was high. After correcting the test statistics for the inflation factor λ, the new inflation factor indicated that KLHbinding IgM NAb was overcorrected (λ = 0.88). Thereby being possibly too conservative on the associations with KLH-binding IgM NAb.
The heritability of tIgM was low, but twice the heritability of KLH-binding IgM NAb (4). KLH-binding IgM NAb have been found to be influenced by maternal environmental effects (4). In case maternal effects are not accounted for in genetic analyses, heritabilities will be overestimated as was seen for KLH-binding IgM NAb (4). The size of the present data set did not allow detection of maternal environmental effect for KLH-binding IgM NAb. The estimated heritability (without maternal environmental effect) for IgM NAb in this study population was 0.31 (data not shown), which is similar to the heritability of tIgM. In addition, the genetic correlation between KLH-binding IgM, and tIgM was very high. This suggests that maternal environmental effects for tIgM are present, but could not be estimated, and resulted in overestimation of tIgM heritability.
One significant genomic region was identified for KLH-binding IgM NAb, and tIgM on chromosome 4 around 70 Mbp. When correcting KLH-binding IgM NAb, and tIgM data for the estimated effects, the genomic region disappeared (data not shown). This shows that only one genomic region (i.e., one gene with one causal variant) is responsible for the observed effects.
All of the candidate genes in the IgM-associated region on chromosome 4 have been related to organism development, cell cycle, cell proliferation, metabolism, cancer development, viral infections, or a combination (mostly based on homologues in human and mouse studies), except for SMIM14, TMEM156, FAM114A1 (no literature available) (31). This implies that all candidate genes have a potential to influence IgM antibodies. However, some genes are reported in association with antibodies, or B cells, which makes them more likely candidates: KLF3 knockout mice were reported to have impaired B cell differentiation, including NAb-producing B1 B cells (37). TLR, including TLR1/TLR6 [homolog/paralog to TLR1A/TLR1B (38)], are well-known for their essential role in antibody production after stimulation of B cells, including B1 B cells (39)(40)(41). TBC1D1 has been reported to have a potential downstream role of silencing of pre-BCR signaling in acute lymphoblastic leukemia in human (42). However, none of these studies reported an effect on IgM without deliberate stimulation of B cells in healthy individuals, as is the case in this study.
Fine mapping of the associated region allowed to predict changes in protein functioning as a result of nucleotide changes in coding regions. A nucleotide change within TLR1A was predicted to have severe impact on protein folding/functioning, making this SNP the most likely candidate for the major effects on KLHbinding IgM NAb, tIgM, and KLH-binding IgT NAb. TLR1A, also known as TLR1-like a (TLR1La), TLR1.1, TLR1/6/10, and TLR16 (43), is one of the 10 known chicken TLR [see Brownlie and Allan (44) for a review on avian TLR]. TLR are among the most studied immune receptors. They are a family of transmembrane proteins that recognize conserved molecular patterns (pathogen-/microbeassociated molecular patterns; PAMP/MAMP), and are conserved in evolution (45). TLR1A dimerizes with TLR2 (either TLR2A or TLR2B), and recognizes peptidoglycans and related structures, including FSL-1, and PAM 3 CSK 4 (33,46), though some anomalies exist between studies (44). TLR1A contains 19 LRRs (33), of which LRR6-16 are required for ligand specificity (33). The region close to the C-terminal end of TLR1A is required for dimerization with TLR2 (in humans) (47,48). The candidate SNP is located in LRR4 near the N-terminal end, suggesting not a direct influence on ligand recognition, or dimerization. Instead, it is likely influencing the tertiary protein structure, or mediation of coreceptors with TLR [see Van Bergenhenegouwen et al. (49) for a review], or both. Also, only coding regions were checked for amino acid substitutions, but not, for example, promotor regions influencing expression of genes. Based on the data in this study, no conclusion can be drawn on whether the most likely causal variant results in a complete loss of function, or in reduced functioning. Nevertheless, full dominance was observed for this TLR1A variant, suggesting that one functional copy of the gene results in a sufficient expression of functional TLR1A heterodimer and enhancement by the coreceptors for activation. Future studies should confirm the predicted TLR1A variant and its functionality.
The most likely causal variant has an effect on KLH-binding IgM NAb, and tIgM. The same region was found for IgM binding the self-antigen cardiolipin, and IgM binding the self-antigen ovalbumin (Bao et al., submitted). It is tempting to speculate that this TLR1A variant has a role on maturation/proliferation/survival of naive (NAb-producing) B cells, (KLH-binding) IgM production, or a combination of these. The most likely causal variant can affect the three stages of the (avian) B cell development: prebursal (only before hatch), bursal, and postbursal (50). Considering TLR1A as the most likely candidate, the effect on B cells, and IgM antibody levels is expected after exposure to PAMP, i.e., after hatch. TLR1A is expressed on various types of cells, including macrophages, and B cells, and in various organs, including thymus, spleen, liver, and bursa of Fabricius (44,51). The effect of the TLR1A variant on IgM (NAb) can therefore be either directly on B cells or indirectly on B cells via for example macrophages, or both. An effect of the TLR1A variant is also expected on other cells of the immune system, both of the innate and adaptive part. This TLR1A variant is of particular interest, because it might explain part of the association of especially IgM NAb at adolescence with survival in chicken (11,12). Yilmaz et al. (52) suggested that the TLR1 and TLR2 subfamilies are under positive selection (favoring polymorphisms) in chicken, while all other TLR are under negative/purifying selection (favoring conserved structures) (52). Huang et al. (38) suggested that only certain sites of TLR1 might be under positive selection, but that TLR1 is mostly under purifying selection (38). Nevertheless, the results imply that TLR1A is not essential for healthy chickens, but that it does influence IgM levels.
The identified region on chromosome 4 has previously been associated with many other characteristics in chickens (layer, broiler, or both) [based on Animal QTLdb (53)]. Only one of these are directly related to disease susceptibility: for susceptibility to Marek's disease virus (54). Though the association with Marek's susceptibility was stronger around TLR3 (also located on chromosome 4, around 62 Mbp), which is involved in the recognition of viruses, including Marek's disease virus infections (55). Other characteristics associated with these region were for example age at first egg (56), eggshell breaking strength (57), eggshell color (58,59), egg weight (57,60,61), feed intake (60), growth/body weight at several ages (56,(62)(63)(64)(65)(66)(67)(68), liver weight (69), and spleen weight (70). To investigate these associations, production traits of 222 females of the study population were previously collected [described in van der Klein et al. (15)], and could therefore be tested for their association with the most likely causal variant. Of the investigated traits, only the number of laid eggs between 17 and 24 weeks of age (i.e., adolescence) tended (p = 0.09) to a significant association with the TLR1A variant, where the genotype associated with lower IgM levels was associated with higher egg production (data not shown). Such a beneficial effect of the TLR1A variant on one of the most important production traits in layer chickens, in combination with a reduced natural selection pressure (due to high sanitary barriers on the breeding nucleus), might explain why the variant is present in the study population, even though it might impair (humoral) immunity.
The heritability of tIgA was twice as large as the heritability of KLH-binding IgA NAb. Interestingly, similar concentrations for tIgA, and tIgM were found, which is the first time to be reported. Genetic, and phenotypic correlations between KLHbinding IgA NAb and tIgA were positive, and low to moderate. This suggests that variation in KLH-binding IgA NAb, and tIgA are mainly regulated by different mechanisms. The role of blood IgA antibodies is not well understood so far. It has been suggested to be a response isotype with a proinflammatory function, similar to IgG antibodies (41,71,72). However, in chicken most of the intestinal IgA is secreted via the bile (73,74), which means that intestinal IgA first has to be transported through the blood. IgA in blood might therefore represent the intestinal health status in chickens.
Two suggestively associated regions were identified for KLHbinding IgA NAb. These regions were not confirmed by tIgA concentration, even though a positive genetic correlation was estimated. This is the first time an association study on IgA (either NAb or total antibody concentration) in healthy individuals has been done. Observed results on KLH-binding IgA NAb were suggestive, and therefore these results first need to be confirmed in independent studies before further investigation of these associations.
Heritabilities of tIgG, and tIgT antibody concentrations were low, but comparable to heritabilities of KLH-binding IgG NAb titers, and KLH-binding IgT NAb titers (4). This indicates a strong environmental influence on IgG antibodies, and consequently on IgT, since IgT mainly consists of IgG antibodies: all genetic, and phenotypic correlations were higher than 0.8 [this study and Berghof et al. (4)]. However, KLH-binding IgT NAb, but not tIgT, was suggestively associated with the same region of KLH-binding IgM NAb, and tIgM. This confirms the high genetic correlation between KLH-binding IgT NAb, and KLH-binding IgM NAb (4), and the moderate genetic correlation between tIgT, and tIgM (this study).
Remarkably the genetic correlations between KLH-binding IgG NAb, and tIgG, and KLH-binding IgT NAb, and tIgT were negative. This suggest that KLH-binding IgG NAb, and tIgG, and KLH-binding IgT NAb, and tIgT are genetically regulated in a similar way, but with opposing effects. However, phenotypically there were very weak correlations between KLH-binding IgG NAb, and tIgG, and KLH-binding IgT NAb, and tIgT. In addition KLH-binding IgG NAb, and KLH-binding IgT NAb titers were significantly associated with the TLR1A variant, but tIgG, and tIgT concentrations were not. Meaning that a genetic predisposition for low (or high) levels of KLH-binding IgG NAb, and KLH-binding IgT NAb, or different genotypes for the TLR1A variant do not negatively (or positively) affect the tIgG antibody concentrations, and tIgT antibody concentrations. This implies that the humoral adaptive immune response, and memory formation are not negatively influenced by low NAb levels, or the TLR1A variant.
Natural antibodies are defined as antigen binding antibodies present in individuals without a previous exposure to this antigen (5). Layer chickens receive an extensive number of obligatory vaccinations, that may affect antibody levels in a (nonspecific) fashion. In addition, although thought to be unlikely, KLH-binding antibodies could have been generated due to crossreactivity with a component(s) of the vaccination(s). Also, due to KLH's molecular weight (75), cross-reactivity of KLH-binding NAb with other antigens cannot be excluded. For example, cross reactivity of antibodies binding KLH with antigens of the tropical parasitic worm Schistosoma mansoni has been suggested (76). This might suggest that NAb can be made in the presence of exogenous antigens, which is currently under investigation [e.g., Baumgarth (77)]. Nevertheless, the whole study population received the same vaccinations, and a possible effect of moment of blood sampling (between 15 and 22 weeks of age) was corrected by the regression coefficient of "age" in the statistical model. Therefore, the vaccinations likely do not underlie the observed results.
In this study, we show the potential of using a large database with defined immunological traits to discover underlying genomic regions, and new fundamental components in the development, and functioning of the immune system. Recently in humans, similar approaches have been used to detect genetic variation for several immune traits, not only measuring baseline immunity but also functional immunological assays (78,79). GWAS for defined immunological traits offer exiting, and promising opportunities for unraveling genetic variation, and fundamental mechanisms in the immune system in human, and animal species.
In summary, in this GWAS for KLH-binding NAb, and total concentrations of antibodies, we identified one significantly associated region on chromosome 4 with KLH-binding IgT NAb, total IgM concentration, and especially with KLH-binding IgM NAb. Further investigation predicted the causal variant on chromosome 4 to be located in TLR1A. As far as we know, this is the first time a TLR has been connected to IgM antibody levels in healthy individuals. This suggests a fundamental role of TLR in KLHbinding IgM NAb production, total IgM production, and possibly in (NAb-producing) B cell development, and proliferation. Future research should confirm the role of TLR1A, its role in KLHbinding IgM production, and total IgM production, and its role in general disease resistance in chickens.

ETHICS STATEMENT
Collection of samples and data was done according to Hendrix Genetics (HG) protocols, under the supervision of HG employees. Samples and data were collected as part of routine data collection in a commercial breeding program for layer chickens in The Netherlands. Samples and data were collected on a breeding nucleus of HG for breeding purposes only, and are a nonexperimental, agricultural practice, regulated by the Act Animals, and the Royal Decree on Procedures. The Dutch Experiments on Animals Act does not apply to non-experimental, agricultural practices. An ethical review by the Statement Animal Experiment Committee was therefore not required. No extra discomfort was caused for sample collection for the purpose of this study.

AUTHOR CONTRIBUTIONS
TB performed the research, analyzed, and interpreted the data, and wrote the manuscript. MV and JA analyzed and interpreted the data, and revised the manuscript. HP, JP, and HB designed the research, interpreted the data, and critically revised the manuscript. AV performed the research and analyzed and interpreted the data.