Common and Rare Variants Genetic Association Analysis of Circulating Neutrophil Extracellular Traps

Introduction Neutrophils contribute to host defense through different mechanisms, including the formation of neutrophil extracellular traps (NETs). The genetic background and underlying mechanisms contributing to NET formation remain unclear. Materials and Methods We performed a genome-wide association study (GWAS) and exome-sequencing analysis to identify common and rare genetic variants associated with plasma myeloperoxidase (MPO)-DNA complex levels, a biomarker for NETs, in the population-based Rotterdam Study cohort. GWAS was performed using haplotype reference consortium(HRC)-imputed genotypes of common variants in 3,514 individuals from the first and 2,076 individuals from the second cohort of the Rotterdam Study. We additionally performed exome-sequencing analysis in 960 individuals to investigate rare variants in candidate genes. Results The GWAS yielded suggestive associations (p-value < 5.0 × 10−6) of SNPs annotated to four genes. In the exome-sequencing analysis, a variant in TMPRSS13 gene was significantly associated with MPO-DNA complex levels (p-value < 3.06×10−8). Moreover, gene-based analysis showed ten genes (OR10H1, RP11-461L13.5, RP11-24B19.4, RP11-461L13.3, KHDRBS1, ZNF200, RP11-395I6.1, RP11-696P8.2, RGPD1, AC007036.5) to be associated with MPO-DNA complex levels (p-value between 4.48 × 10−9 and 1.05 × 10−6). Pathway analysis of the identified genes showed their involvement in cellular development, molecular transport, RNA trafficking, cell-to-cell signaling and interaction, cellular growth and proliferation. Cancer was the top disease linked to the NET-associated genes. Conclusion In this first GWAS and exome-sequencing analysis of NETs levels, we found several genes that were associated with NETs. The precise mechanism of how these genes may contribute to neutrophil function or the formation of NETs remains unclear and should be further investigated in experimental studies.


INTRODUCTION
Neutrophils play important roles in host defense through different mechanisms, including the formation of neutrophil extracellular traps (NETs) (1). NETs are released as a result of a unique form of cell death, where DNA coated with histones and anti-microbial proteins, such as neutrophil elastase and myeloperoxidase (MPO), form a web-like structure (1). Bacteria, fungi, viruses and parasites have been shown to induce NET formation, where they are trapped and killed by NETs, preventing dissemination of pathogens (1). Whereas NETs have a protective role in host defense against pathogens, NETs have also been shown to be involved in the pathogenesis of various diseases including, thrombosis (2), cardiovascular diseases (3,4), auto-immune diseases (5) and sepsis (6). Although the role of NETs in health and disease has been postulated, the molecular mechanisms of NET formation remain elusive.
Genome-wide association studies (GWAS) and exomesequencing analysis have been successfully implemented as approaches to identify genetic variants associated with disease susceptibility. The assessment of genetic variants in association with NETs might help to elucidate potential molecular mechanisms intervening in their formation and their downstream effect on other pathways. Here, we are the first to apply these approaches to ascertain common and rare genetic variants associated with NETs using data from a populationbased cohort study. We performed additional in silico analyses to identify more evidence for the associated variants and genes in relation to the plasma MPO-DNA complex levels.

Study Description and Population
The Rotterdam Study (RS) is a prospective, population-based cohort study of determinants of chronic diseases in older adults (7). The first cohort (RS-I) started in 1990 and included 7,983 inhabitants, aged ≥55 years from Ommoord, a suburb of Rotterdam in the Netherlands. The cohort was further extended in 2000 (RS-II) and 2005 (RS-III), establishing a total of 14,926 participants. Every 4 years participants are interviewed at their homes and they visit the study center for an extensive clinical assessment, including venipuncture and assessment of various cardiometabolic risk factors. The RS was approved by the Medical Ethics Committee of the Erasmus MC and by the Ministry of Health, Welfare and Sport of Netherlands. All participants provided written informed consent to participate in the study in accordance with the Declaration of Helsinki.

MPO-DNA Complex Measurement
Citrated plasma samples were collected at the third visit of RS-I (1997RS-I ( -1999 and the baseline examination of RS-II (2000)(2001), and stored at −80°C. We determined circulating levels of NETs by measuring myeloperoxidase (MPO)-DNA complexes with a capture ELISA as reported earlier (4). Briefly, we adjusted the commercial human cell death ELISA kit (Cell death detection ELISAPLUS, Cat. No 11-774-425-002; Roche Diagnostics Nederland B.V., Almere, the Netherlands) in which we used anti-MPO monoclonal antibody (clone 4A4, ABD Serotec, # 0400-002) as the capturing antibody. Patient plasma was added together with the peroxidase-labeled anti-DNA monoclonal antibody (component No.2 of the commercial cell death detection ELISA kit; Roche, #11-774-425-002). The absorbance at 405 nm wavelength was measured using BioTek Synergy HT plate reader with a reference filter of 490 nm. Values are expressed as milli arbitrary units (mAU/ml). The reference line to define the mAU was composed after isolation and induction of neutrophils from a healthy donor. NET formation was induced by adding phorbol 12-myristaat 13-acetaat (PMA).

Genotyping and Imputation
We implemented two different genetic association studies covering the assessment of common variants and rare variants ( Figure 1). Genomic DNA was extracted from peripheral blood mononuclear cells. Genome-wide single-nucleotide polymorphisms (SNPs) were genotyped from 6,291 participants from RS-I and 2,157 participants from RS-II using the Illumina Infinium II HumanHap550 or 610quad arrays. All genotyped participants were of European ancestry based on their self-report. Before imputation, genotyped SNPs with a call rate of < 98%, a minor allele frequency (MAF) of < 1%, or a Hardy-Weinberg equilibrium p-value of < 1 × 10 −6 , were excluded. In RS-I, a total of 512,849 SNPs remained after filtering and these were used for imputation. In RS-II, a total of 537,405 SNPs were used for imputation. Dosages of 19,537,258 SNPs were imputed in both studies using the Haplotype Reference Consortium (HRC), a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry (8). The imputation was conducted using the Michigan Imputation server (9). The server uses SHAPEIT2 (v2.r790) to phase the data and Minimac 4 for imputation to the HRC reference panel (v1.1). After imputation, SNPs with a MAF < 0.01 or an imputation quality < 0.3 were excluded. The overlap between participants with MPO-DNA complex measurements and genotypes was 3,515 in RS-I and 2,076 in RS-II ( Figure 1).
Whole-Exome sequencing was conducted at the Human Genomics Facility in Erasmus Medical Centre, Rotterdam (10). Exome-SNPs were genotyped using Illumina HumanExome BeadChip v1.0 and was conducted on 2,998 paired-end sequenced samples using the Illumina hiSeq2000 (2×100bp reads). Indels and single nucleotide variants were filtered out and evaluated using GAKs Variant Evaluation; variants with a call rate < 0.97, duplicate samples, duplicate variants and > 5% of missing genotypes were considered as exclusion criteria. A total of 2,628 samples passed through all technical quality control and GATKs Haplotype Caller was used to call SNPs and indels simultaneously. Genotype calling was performed according to the CHARGE joint calling protocols (11). Annovar software tool (12) was used to functionally annotate each genetic variant. Each variant was coded as 0, 1 and 2 representing two reference alleles, one reference allele and one mutated allele and two mutated alleles, respectively.

Common Variant Analysis
GWAS were carried out in RS-I and RS-II under the additive model using HRC-imputed data and RVtest tool (13). Linear regression was modeled using log-transformed MPO-DNA complex levels with adjustment for age, sex, and the first four principal components. EasyQC was used to conduct quality control across cohorts (excluded: MAF < 0.01) to identify file naming errors, erroneous SNP genotype data and false association caused by incorrect analysis models (14). Cleaned results were combined in a joint meta-analysis of linear regression estimates, and standard errors using an inversevariance weighting approach was conducted using METAL (15). We determined the association of each SNP with MPO-DNA complex as the regression slope, its standard error and its corresponding p-value. Variants with a meta-analyzed p-value < 5.0 ×10 −8 were considered significant. The threshold for suggestive associations was set at a p-value < 5.0 ×10 −6 .

Gene Set Enrichment Analysis
A gene-set analysis of GWAS data was done using MAGMA v1.06 (16), implemented by FUMA v1.3.2 (17). Gene-based approach aims to test the joint association of all markers in the gene with MPO-DNA complex. Gene-set analysis aggregates individual genes to groups of genes sharing certain biological, or other functional characteristics, allowing the identification of effects consisting of multiple weaker associations to determine their joint effect. Likewise, gene-set analysis might provide insight into the involvement of specific biological pathways or cellular functions underlying NET formation (16). The analysis was performed using summary-level meta-analysis results. First, a gene-based association analysis was done to identify candidate genes associated with MPO-DNA complex. Second, the genes identified from the gene-based analysis were used to perform a tissue enrichment analysis using gene expression data for 53 tissues from GTEx. For all MAGMA analyses, multiple testing was accounted for Bonferroni correction.

Rare Variant Analysis
Exome-sequencing analysis was performed in a subset of RS-I subjects (N=960) to evaluate the association between rare variants and MPO-DNA complex levels. We used logtransformed MPO-DNA complex levels with adjustment for age, sex, BMI, smoking, SBP, cholesterol, HDL, and four ancestry-informative principal components (PCs), because rare variants are more susceptible to population stratification (18). Estimated regression coefficient for each variant and its standard error (prepScores) were employed on single variant level, to perform score tests for single SNP associations and gene-based test [sequence kernel association test (SKAT)] using the seqMeta package implemented in R (19). SKAT is a bidirectional test and is more powerful when the effect direction of rare variants within a gene varies. Single-variant analysis was done using score tests. We defined a statistical significance threshold of single variant and gene-based exome sequencing based on Bonferroni correction for multiple testing,~606, 583 variants (p-value < 8.2 × 10 − 8 ) and~46,551 genes (p-value < 1.07 × 10 −6 ), respectively.

SNP Heritability and Genetic Correlation
To characterize the extent to which common genetic variants determine MPO-DNA complex levels, and shared genetic etiology with other traits (coronary artery disease, stroke and C-reactive protein), we applied linkage disequilibrium score regression (LDSC) (20) methods for estimating SNP heritability and genetic correlation based on genome-wide sharing between distantly related individuals. LDSC is a summary-statistics-based method which estimates heritability and genetic correlation while accounting for LD and requires only publicly available summary statistics from genetic studies (21). In brief, the cross-product of two GWAS test statistics is calculated at each genetic variant, and this cross-product is regressed on the LD score. The slope of the regression is used for estimating the genetic covariance between two phenotypes. We used the default European LD score file based on the European 1KG reference panel. The analyses were conducted using LDSC (LD SCore) v1.0.1 package running under R (https://github.com/bulik/ ldsc) (20,22).

Functional Annotation and Pathway Analysis
We used HaploReg v4.1 (http://www.broadinstitute.org/ mammals/haploreg/haploreg.php; in the public domain) (23) to retrieve proxy SNPs in high linkage disequilibrium (LD) (R 2 threshold > 0.8, limit distance 100 kb, and population panel CEU) with the associated (suggestive) variants and subsequently to identify the SNP position and effect on protein structure, gene regulation, and splicing. Moreover, the SNPs in LD were scrutinized in the GWAS catalog to ascertain possible association with other traits (24). Additionally, to assess the correlation between the identified SNPs and expression levels of the host transcripts, we used expression quantitative trait loci (cis-eQTL) data from the Genotype-Tissue Expression Project (GTEx portal, Analysis Release V7) (http://www.gtexportal.org/ home/; in the public domain) and from 3,841 whole blood samples from five Dutch biobanks (BIOS-BBMRI database) (http://www.genenetwork.nl/biosqtlbrowser/; in the public domain). GTEx is a platform with available expression data on potential target organs (heart tissue, kidney tissue, brain tissue, aortic endothelial cells, and blood vessels) as well as blood cell types (CD4+ macrophages, monocytes) (25). The gene expression values are shown in TPM (transcripts per million), calculated from a gene model with isoforms collapsed to a single gene. Box plots are shown as median and 25 th and 75 th percentiles, outliers are displayed as dots if they are above or below 1.5 times the interquartile range (25).
We used GeneMANIA Cytoscape plugin to identify the most related genes to a query gene set using a guilt-by-association approach. The platform employs a large database of functional interaction networks from multiple organisms and each related gene is traceable to the source network used to make the prediction (26).We additionally conducted a core analysis, implemented in QIAGEN's Ingenuity Pathway Analysis Software (IPA, http://www.ingenuity.com), to determine enriched pathways and to interpret the role of these genes in the context of biological processes, pathways and molecular networks. IPA is a knowledge database generated from peerreviewed scientific publications that enables the discovery of highly represented biological mechanisms, pathways or functions most relevant to the genes of interest from large, quantitative datasets. We uploaded all target genes and performed a core IPA analysis with default setting. IPA uses right-tailed Fisher exact test to identify enriched canonical pathways and diseases associated to these genes.

Characteristics of Study Participants
The baseline characteristics of participants from RS-I and RS-II included in the GWAS and exome-sequencing analysis are shown in Table 1. Data on GWAS and MPO-DNA complex levels were available in 3,514 participants of RS-I and in 2,076 participants of RS-II. The mean age of all the participants was 68.8 ± 8.5 years and the majority of the study participants were females (56.6%). Participants in RS-I were older (mean age 72.5 ± 7 years) than participants in RS-II (mean age, 64.9 ± 8 years). The median MPO-DNA complex level in RS-I was 52 mAU/ml (25 th to 75 th percentile 41-82) and 56 mAU/ml  in RS-II. Moreover, exome sequencing analysis for MPO-DNA complex levels was performed in 960 participants from RS-I. The mean age of these participants was 70.7 ± 6.1 years and the majority of

Association Study With Common Variants
We first performed a GWAS in each RS cohort. Fixed-effect metaanalysis of the summary statistics from RS-I and RS-II showed no significant associations between common variants and MPO-DNA complex levels passing the GWAS threshold of 5.0 × 10 −8 (Supplementary Table S1). The lowest p-value was found for rs289078 located near a long intergenic non-protein coding RNA 2501 (LINC02501; chr4:31416764). The minor allele T at this region was associated with a decrease in MPO-DNA complex levels (b= 0.1, p-value= 3.5 × 10 −7 ). We additionally queried association results between rs289078 and over 778 traits assessed in 452264, as implemented in Gene Atlas (http://geneatlas.roslin.ed.ac.uk/) (27). At nominal significance level, rs289078 was found to be associated with postpartum hemorrhage, deep venous thrombosis, among others (Supplementary Table S2). Moreover, we found 26 additional SNPs, most of them located in intergenic regions. Five of these SNPs were annotated to 4 genes (KIF26B, CDK19, CATSPERB, AC027119.1) that were suggestively associated with MPO-DNA complex levels (p-value < 5.0 × 10 −6 ) (Supplementary Table S1). Manhattan plot depicting the association of all common SNPs with MPO-DNA complex levels is shown in Figure 2. QQ-plot is shown in Supplementary Figure S1.

Gene Set Enrichment Analysis
We performed gene-based association analyses using MAGMA to identify tissues and pathways relevant to NET formation. Input SNPs were mapped to 18889 protein coding genes establishing a genome wide significance p-value = 0.05/18884 = 2.6 × 10 −6 . There were no genes associated with MPO-DNA complex levels at genome-wide significance (Supplementary Table S3). The lowest p-value was found for a set of 40 SNPs located in TAS1R1 (Taste 1 Receptor Member 1) (p-value=2.7 × 10 −5 ). Tissue specificity analysis for TAS1R1, evaluated across 53 tissue types from GTEx project, showed no significant differential expression. Likewise, gene set analysis revealed no significant pathways. Gargalovic response to oxidized phospholipids red up, composed of a set of 15 genes, showed the lowest p-value= 3.2 × 10 −5 (Supplementary Table S4).

Association Study With Rare Variants
In the single-variant analysis, we found one variant corresponds to chr11:117789326, annotated to the TMPRSS13 (Transmembrane Serine Protease 13) gene, significantly associated with MPO-DNA complex levels. The minor allele C has was negatively associated with the plasma MPO-DNA complex levels (b= Additionally, we did a look up for association between common variants in the 10 genes identified by exomesequencing analysis and MPO-DNA complex levels in our GWAS results. A common variant (chr1:32509964) located in KHDRBS1 (KH domain-containing, RNA-binding, signal transduction-associated protein 1) gene was nominally associated with MPO-DNA complex levels (b= 0.17, p-value=0.003). In addition, we checked for rare variants in the 4 suggestive genes identified by GWAS in our exome-sequencing results, and found no significant association.

SNP Heritability and Functional Annotation
Our study includes data from 5,590 participants from two European cohorts. The 1KG intercept was 1.02 (standard error= 0.006). The LD-score calculated SNP Heritability showed an estimate of 7% and the mean chi 2 was below~1.02 (chi 2 = 1). Moreover, either sample size or heritability was not sufficient to determine cross-trait genetic correlation. None of the 27 suggestive SNPs identified by GWAS have nonsynonymous proxy variants in strong LD (R 2 > 0.8) ( Supplementary Table S5). We identified 155 proxy SNPs (R 2 > 0.8) of these GWAS SNPs. We checked GWAS Catalog to see whether the identified variants or their proxies have been reported previously to be associated with any diseases or traits (e.g., cardiovascular disease, Alzheimer's disease, etc.). We found that none of the SNPs were reported to be associated with any outcome in the GWAS Catalog. We further investigated gene expression levels associated with GWAS-SNPs and their proxies (SNPs in LD), as reported by GTEx. Among these, proxies of rs10457220 and rs112514818, located in CDK19, were associated to gene expression levels of their host gene (Supplementary Tables S6 and S7). We also investigated gene expression levels across different tissues of the genes identified by exome-sequencing analyses, as reported by GTEx (Supplementary Figures S2-S7). Although the expression levels of these genes have not been reported for neutrophils, the data show some of them to be expressed in whole blood.

DISCUSSION
The present study is the first genetic association study to identify genetic variants and potential loci affecting plasma MPO-DNA complex levels, a marker for NETs. Through GWAS, we found suggestive associations between MPO-DNA complex levels and common SNPs annotated to 4 genes. Through exome-sequencing analysis, we further identified rare exonic variants in 11 genes to be associated with circulating MPO-DNA complex levels. Many of the identified genes are involved in pathways that may be relevant for the function of NETs (e.g., molecular transport and cell-to-cell signalling and interaction). Cancer was the top disease linked to the NET-associated genes found in this study.
In the GWAS analysis, the most significant association was found for a SNP located near a long intergenic non-protein coding RNA 2501 (LINC02501). Its function is unknown, and it is yet unclear how this gene contributes to NET formation. We also found one suggestive SNP that was located in the KIF26B (Kinesin Family Member 26B) gene. One study that used proteomics identified a protein formed by the KIF26B gene that is located on the secretory vesicle membrane of human neutrophils (28). This gene is also associated with different forms of cancer (29,30). In hepatocellular carcinoma (HCC), KIF26B activation was suggested to promote cancer progression by activating the phosphatidylinositol 3-kinase (PI3K) and Akt/Protein Kinase B (PI3K/Akt) signaling pathway (30). The PI3K/Akt signaling pathway is known to be involved in autophagy and has been described as a pathway for NET formation (31). Another interesting suggestive SNP was located in the CDK19 (Cyclin Dependent Kinase 19) gene. This gene has a role in innate immunity as it regulates NF-kBmediated gene expression. When stimulated with TNF-alpha, NF-kB heterodimers and CDK19 promote expression of IL8, CXCL2, and CXCL3 (32). Different cytokines, including IL-8 have been described to induce NET formation (33). As all these genes were suggestively associated with MPO-DNA complex levels, future studies with larger sample sizes are needed.
In the exome-seq analysis, we found a rare variant in the TMPRSS13 gene to be associated with MPO-DNA complex levels. Genes found through exome sequencing analysis that are associated with MPO-DNA complex levels.
The significance threshold for gene-based analysis set at p-value < 1.07 × 10 −6 . † Genes with a suggestive association with MPO-DNA complex levels. MAF, minor allele frequency; SNP, single-nucleotide polymorphism.
This gene is highly expressed in the skin and the mucosa of the esophagus and plays a role in activation of membrane fusion activity of the hemagglutinin (HA) of influenza viruses (34). The family of type II transmembrane serine proteases have been described to play an important role in the HA process. It has also been shown previously that NETs are released in response to several viruses (35,36). In patients with severe influenza infections, high MPO-DNA complex levels correlate with poor prognosis (37). In contrast, NETs released by mouse neutrophils incubated with Chikungunya virus, resulted in virus capture and neutralization (38). Although NETs may be necessary for the neutralization of specific viruses, the cytotoxic effects of NETs affect survival of the host. From the genebased analysis, we additionally identified ten genes associated with MPO-DNA complex levels. The gene with the most significant association was OR10H1 (Olfactory Receptor Family 10 Subfamily H Member 1). OR10H1 is involved in the olfactory signaling pathway, where adenylyl cyclase mediates cAMP increase and influx of calcium into the cell (39). OR10H1 is predominantly expressed in the testis and urinary bladder and was shown to be highly expressed in bladder cancer cell lines (40). The direct effect of this gene on NET formation is not yet clear. However, neutrophils from patients with various malignant diseases demonstrate an increased capacity to release NETs (41,42). This suggests that the association of OR10H1 with MPO-DNA complex levels could be driven by the presence of malignancy. The second gene identified in exome-seq analysis is the KHDRBS1 (KH RNA Binding Domain Containing, Signal Transduction) gene. KHDRBS1 encodes SAM68 (Src-associated-in-mitosis-68-KD), which is an RNA binding protein and Src kinase substrate that has a function in RNA metabolism and signal transduction. It also contributes to TNFinduced NF-kB activation and apoptosis (43) and has been implicated in cancer (44)(45)(46). Also, SAM68 is expressed in human neutrophils and is involved in signal transduction when stimulated by phagocytic agonists (47). We also identified one gene that demonstrates an effect in viral transcription and infection, namely ZNF200 (Zinc Finger Protein 200). This gene exhibits an antiviral function in Herpes simplex virus 1 infection (48). As mentioned earlier, NETs play an important role in viral infections but can be both harmful and beneficial at the same time. Several Zinc Finger Proteins have been identified in activated neutrophils (49). However, ZNF200 has not yet been shown to be expressed in neutrophils. Moreover, we showed that MPO-DNA complex levels are associated with a set of rare variants annotated to long non-coding RNAs (lncRNAs), namely RP11-461L13.5, RP11-24B19.4, RP11-461L13.3, RP11-395I6.1, RP11-696P8.2. It has become increasingly evident that lncRNAs contribute to normal physiology and development of diseases (50). LncRNAs mainly regulate gene expression through modulation of the activity of transcriptional regulators, chromatin remodeling and post-transcriptional modifications (51). The RP-11 lnRNAs we found in our study have not yet been described to be associated with any specific diseases. However, several other RP-11 lncRNAs have previously been shown to play a role in the progression of different types of cancer, including colorectal cancer (52), HCC (53), ovarian cancer (54) and gastric cancer (55). Given cancer was the top disease linked to the NET-associated genes in our pathway analysis, future studies may investigate the potential link between these lncRNAs, NETs and cancer. Also, several lncRNAs have demonstrated to be involved in the regulation of the innate and adaptive immune response (56). We also found a number of genes that showed a suggestive association with MPO-DNA complex levels in the exomesequencing analysis, that are worth mentioning. The first gene is ZDHHC19 (Zinc Finger DHHC-Type Palmitoyltransferase 19), which has a function in inflammation as it regulates STAT3 (57). STAT3 is an important regulator of immunity, inflammation and tumorigenesis (58). Two other suggestive genes we identified in this study, are known to play a role in the PI3K/Akt and mTORC1 (mammalian target of rapamycin activation complex 1) signaling pathway. FOXC1 (Forkhead Box C1) has been shown to upregulate PI3K/Akt signaling, which was inflammation-dependent in fibroblast-like synoviocytes (59). Also, in HCC cell lines, FOXC1 promotes inflammation via PI3K/Akt signaling (60). In a previous study, both ZDHHC19 and FOXC1 were found to be upregulated more than five times in neutrophils from patients with acute respiratory distress syndrome (ARDS) than in neutrophils from healthy volunteers (61). TMEM55B (transmembrane protein 55b), a lysosomal protein of unknown molecular function, acts downstream of PI3K/Akt and interacts with many proteins that participate in mTORC1 (62). mTOR suppresses autophagy and has also been reported to be important for NET formation (63).
To our knowledge, this is the first genetic association study that investigated the genetic determinants of NET formation. Our study has several strengths and limitations. The main strength of this study is that we used data from a large population-based study. Moreover, we performed our analysis with both common and rare variants. The gene-based analysis of rare variants was another strength of this study, due to the combined effect of multiple SNPs (19). Furthermore, we measured MPO-DNA complex levels as a marker for the presence of NETs in plasma. MPO-DNA complex is currently considered the most specific, objective and quantitative assay for detecting NETs (64). MPO-DNA complex levels represent the presence of NETs, which are formed through the protein-arginine deiminase type 4 (PAD4)-dependent pathway as well as the autophagy pathway (64).
The findings of this study should be considered in light of some limitations. Since this is the first genetic association study for MPO-DNA complex levels, we were unable to replicate our findings in a secondary cohort, which may impede the identification of additional loci. We have included data from European populations; therefore, other studies are needed to assess the generalizability of our findings to other ancestries. Nevertheless, with the results of this study, we have provided potential genetic and molecular pathways leading to NET formation. This may lead to a better understanding of NETassociated diseases. To validate the effect of the genes mentioned in this study, experimental trials and functional experiments are needed.
To conclude, our study found rare variants in 11 genes (TMPRSS13, OR10H1, RP11-461L13.5, RP11-24B19.4, RP11-461L13.3, KHDRBS1, ZNF200, RP11-395I6.1, RP11-696P8.2, RGPD1, AC007036.5) that were significantly associated with plasma MPO-DNA complex levels. Moreover, we found suggestive associations between common SNPs annotated to four genes (KIF26B, CDK19, CATSPERB, AC027119.1) and MPO-DNA complex levels. These findings may indicate that NET formation is a multifactorial process which is regulated by genes involved in different biological processes. Also, a link between NET-associated genes and cancer was observed. The precise mechanism of how these genes may contribute to neutrophil function or the formation of NETs remains unclear. Future functional studies, investigating the translational potential of these observations is warranted to fully characterized the molecular mechanisms regulated by the genes described in this study.

DATA AVAILABILITY STATEMENT
The original contributions presented in this study are publicly available. This data can be found here: https://www.ebi.ac.uk/gwas/ accession number: GCST90013658.

ETHICS STATEMENT
The Rotterdam Study was approved by the Medical Ethics Committee of the Erasmus MC and by the Ministry of Health, Welfare and Sport of Netherlands. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
MG, MD, and FL were responsible for the design and supervision of this project. FR, FV, and MI, collected data. SD and MD were responsible for laboratory measurement of MPO-DNA complex. EP, SA, and MG performed the GWAS, exome sequencing, and in silico analyses. SD and EP interpreted the data and contributed equally in writing this article. All authors contributed to the article and approved the submitted version.