A Genetic Association Test Accounting for Skewed X-Inactivation With Application to Biotherapy Immunogenicity in Patients With Autoimmune Diseases

Despite being assayed on commercialized DNA chips, the X chromosome is commonly excluded from genome-wide association studies (GWAS). One of the reasons is the complexity to analyze the data taking into account the X-chromosome inactivation (XCI) process in women and in particular the XCI process with a potentially skewed pattern. This is the case when investigating the role of X-linked genetic variants in the occurrence of anti-drug antibodies (ADAs) in patients with autoimmune diseases treated by biotherapies. In this context, we propose a novel test statistic for selecting loci of interest harbored by the X chromosome that are associated with time-to-event data taking into account skewed X-inactivation (XCI-S). The proposed statistic relies on a semi-parametric additive hazard model and is straightforward to implement. Results from the simulation study show that the test provides higher power gains than the score tests from the Cox model (under XCI process or its escape) and the Xu et al.'s XCI-S likelihood ratio test. We applied the test to the data from the real-world observational multicohort study set-up by the IMI-funded ABIRISK consortium for identifying X chromosome susceptibility loci for drug immunogenicity in patients with autoimmune diseases treated by biotherapies. The test allowed us to select two single nucleotide polymorphisms (SNPs) with high linkage disequilibrium (rs5991366 and rs5991394) located in the cytoband Xp22.2 that would have been overlooked by the Cox score tests and the Xu et al.'s XCI-S likelihood ratio test. Both SNPs showed a similar protective effect for drug immunogenicity without any occurrence of ADA positivity for the homozygous females and hemizygous males for the alternative allele. To our knowledge, this is the first study to investigate the association between X chromosome loci and the occurrence of anti-drug antibodies. We think that more X-Chromosome GWAS should be performed and that the test is well-suited for identifying X-Chromosome SNPs, while taking into account all patterns of the skewed X-Chromosome inactivation process.

Despite being assayed on commercialized DNA chips, the X chromosome is commonly excluded from genome-wide association studies (GWAS). One of the reasons is the complexity to analyze the data taking into account the X-chromosome inactivation (XCI) process in women and in particular the XCI process with a potentially skewed pattern. This is the case when investigating the role of X-linked genetic variants in the occurrence of anti-drug antibodies (ADAs) in patients with autoimmune diseases treated by biotherapies. In this context, we propose a novel test statistic for selecting loci of interest harbored by the X chromosome that are associated with time-to-event data taking into account skewed X-inactivation (XCI-S). The proposed statistic relies on a semiparametric additive hazard model and is straightforward to implement. Results from the simulation study show that the test provides higher power gains than the score tests from the Cox model (under XCI process or its escape) and the Xu et al.'s XCI-S likelihood ratio test. We applied the test to the data from the real-world observational multicohort study set-up by the IMI-funded ABIRISK consortium for identifying X chromosome susceptibility loci for drug immunogenicity in patients with autoimmune diseases treated by biotherapies. The test allowed us to select two single nucleotide polymorphisms (SNPs) with high linkage disequilibrium (rs5991366 and rs5991394) located in the cytoband Xp22.2 that would have been overlooked by the Cox score tests and the Xu et al.'s XCI-S likelihood ratio test. Both SNPs showed a similar protective effect for drug immunogenicity without any occurrence of ADA positivity for the homozygous females and hemizygous males for the alternative allele. To our knowledge, this is the first study to investigate the association between X chromosome loci and the occurrence of anti-drug antibodies. We think that more X-Chromosome GWAS should

INTRODUCTION
Despite the widespread recognition that genes play a role in many complex diseases, it is puzzling that one of the most important biological characteristics, the sex which is determined by the sex chromosomes, is often overlooked in genome-wide association studies (GWAS) (1)(2)(3). In practice, most of the GWAS discard this information whereas commercialized genotyping chips include thousands of Single Nucleotide Polymorphisms (SNPs) on the X chromosome. Even for autoimmune diseases that show strong sex differences in prevalence, the analyses are often restricted to the autosomes, thus neglecting X-linked information. Some potential reasons explaining this lack of interest for X chromosomes as compared to autosomes include lower coverage of chromosome X, technical issues regarding genotype calling and imputation and non-standard statistical analyses (4). In the latter case, the methodological problem is due to the fact that the statistical methods should take into account the X-chromosome inactivation (XCI) process on female X-chromosome loci (5,6).
The main feature that makes the X chromosome different from the autosomal chromosomes is obviously the fact that, except for the pseudo-autosomal regions resulting from the divergence of evolution of sex chromosomes (X and Y), women have two copies while men have only one copy of each gene. This dosage imbalance is in part compensated by inactivation of one X chromosome in females through XCI. In each female cell, one copy of the X chromosome is inactivated. X-chromosome inactivation occurs at random (paternal or maternal), very early in embryonic life and is inherited by all daughter cells through mitosis (5). Females are mosaic, each cell having either the paternal or maternal X-chromosome inactivated. Such mosaic states can also be imposed in the case of gonosome aneuploidies. While the random inactivation process results in roughly a symmetrical (50:50) distribution in most females, skewing of X chromosome inactivation (XCI-S) is observed in some women, leading to a majority of either paternal or maternal X-chromosome inactivation. This skewing might be due either to selective pressure (negative selection) or a stochastic process (random selection in an embryonic stage where a limited number of cells give rise to the different tissues). Moreover, some genes may escape X-chromosome inactivation and remain biallelically expressed (XCE) (7).
In recent years, biopharmaceutical products (BPs) such as therapeutic monoclonal antibodies have become increasingly used in clinical practice and have led to a critical step forward in the treatment of many severe auto-immune diseases. However, for some patients these drugs activate the immune system, leading to the formation of Anti-Drug Antibodies (ADA). The mechanisms leading to drug immunogenicity can either be patient-related (genetic background, immunological status, prior exposure, prior disease) or treatment-related (drug characteristics and formulations, route, dose, frequency of administration) (8,9). While some genomewide studies have investigated the genetic factors associated with the immunogenic potential of biotherapies (10), to our knowledge, none has investigated the X-chromosome.
To analyze the X chromosome in association studies, several test statistics have been proposed and implemented for casecontrol studies that consider either the XCI process or its escape (XCE). To take into account these two underlying biological processes, two genotype coding schemes are commonly used. One corresponds to the assumption of the XCI process as proposed by Clayton (11) while the other corresponds to the XCE process, as implemented in the classical PLINK software (12). For XCI, the proposed coding values are the same for homozygous females and hemizygous males while the heterozygous females fall midway between two homozygous, mimicking the fact that about 50% of cells have the minor (or alternative) allele active while the other 50% of cells have the reference allele active due to random XCI. Using this coding, Clayton derived a one degree-offreedom score test statistic for case-control studies (11). For XCE, the coding implemented in PLINK codes female genotypes as 0, 1, or 2 copies of the minor allele and male genotypes as 0 or 1 copies of the minor allele. This genotype coding assumes that variants on both copies of the X chromosome are expressed in females. In this XCE setting, Zheng et al. (13) proposed a series of association tests that use different combinations of tests for male and female samples and rely either on genotypic counts or allelic counts in cases and controls. In practice, most of the case-control GWAS investigating X chromosome consider these two coding schemes with a logistic regression model, ignoring the XCI-S process. For time-to-event analysis, the same strategy can be considered by using the classical Cox model but with the same drawback for the XCI-S process. In a recent work, Xu et al. (1) and Han et al.
(2) proposed a penalized partial likelihood approach based on the Cox model with a subject-specific random effect that takes into account the XCI-S process. However, the method is quite complex to implement and computationally burdensome for GWAS. We therefore developed a simpler genetic association test accounting for skewed X-inactivation that we used to investigate the role of X-linked genetic variants in the occurrence of ADAs in patients with autoimmune diseases treated by biotherapies.
In this paper, we first present a novel test statistic for selecting interesting loci of the X chromosome in time-toevent data investigation taking into account the XCI-S process that relies on a semi-parametric additive hazard model. It is based on a score-like test evaluated at the null hypothesis that is straightforward to implement. It avoids to compute the complex log-partial likelihood for the random effect Cox model that requires to approximate the integration over the random effects (1,2). Then, we apply it to the data from the realworld observational multicohort study set-up by the IMI-funded ABIRISK consortium (10) to identify susceptibility loci for drug immunogenicity.

Material
The study population consists of 469 patients with genotyping information from the ABIRISK consortium real-world observational prospective multicenter cohort who suffered from multiple sclerosis, rheumatoid arthritis, and inflammatory bowel diseases and who were treated by biotherapies (10). These patients were naive for the biotherapies they were given during the study, which included tumor necrosis factor (TNF) inhibitors, interferon (IFN)-beta, anti-CD20 (Cluster of Differentiation 20) and anti-interleukin 6 (IL6) receptor monoclonal antibodies.
The patients were followed up for 12 months. Clinical data were recorded in an electronic Case Report Form. DNA samples and serum samples were collected for genetic analyses and ADA testing, respectively. Serum samples for ADA testing were collected at baseline before starting BP and subsequently at each study visit thereafter. Anti-drug antibodies were detected by specific validated assays for each BP and analyzed in central ABIRISK laboratories [more information can be found in Hässler et al. (10)]. The outcome was the time between the date of the first dose of biotherapy and the first detection of the occurrence of anti-drug antibodies. Patients without ADA occurrence were censored at the date of their last clinical visit. Among the 469 subjects, 129 (27.5%) developed ADA during the 1-year follow-up.
The DNA polymorphism analysis was performed with Infinium OmniExpress-24 v1.2 BeadChip. Genotype calling was performed by Genome Studio software 2011.1 with Genotyping module v1.9 (Illumina). Genotypes were called by comparing the generated data with those in the supplied cluster file. The final report for genotype data was based on GRch38/hg38. Quality checks were applied for each sample using autosomal SNPs and removing samples with a call rate (percentage of SNPs genotyped by samples) lower than 95%, excessive observed level of heterozygosity (deviated by more than 3 standard deviations from the mean heterozygosity of the sample), ambiguous sex (genotypic sex different from phenotypic sex from the eCRF), genotyping completeness less than 99%, and non-European ethnicity admixture detected as outliers from a principal component analysis of a linkage-disequilibrium-pruned data set (with a deviation of at least 6 SDs from the mean of at least one of the first 10 principal components). For the quality control specific to the sex chromosomes, we plotted the X chromosome heterozygosity rates and the call rates for SNPs harbored on the Y chromosome. We clustered the individuals using the k-means clustering algorithm and thus eliminated four individuals. As we used the genotyping information and not the information in the measured intensities of X and Y chromosomes, we did not eliminate potential sex-chromosome aneuploidies such as Trisomy X. A total of 457 genotyped individuals were retained.
Then, we extracted 17,565 genotyped SNPs harbored on the X chromosome and conducted further quality control filtering for these SNPs. In practice, we removed samples with a call rate lower than 95% for these SNPs. The SNPs with deviation from Hardy-Weinberg equilibrium test with p < 10 −5 in females, with minor allele frequency less than 5% for both males and females were removed. Finally, a total of 456 genotyped individuals with 12,976 X-chromosome SNPs were considered for subsequent analyses.

Notation
Let T denote the failure time (here the time-to-ADA detection) and C the censoring time. We assume that T and C satisfy the condition of independent and non-informative censoring (14). For each subject i (i = 1, ..., n), X i = min(T i , C i ) denotes the observed time of follow-up and δ i = 1 (X i =T i ) the indicator of failure (ADA detection) where the function 1 (.) is the indicator function whose value is 1 if the argument is true and 0 otherwise. We also denote Y i (t) = 1 (t≤X i ) the at-risk process and N i (t) = 1 (X i ≤t;δ i =1) the counting process, given at time t. Let G be the genotype for a di-allelic SNP on the X chromosome and denote the two alleles as A and a with A as the rare (or alternative) allele and a as the common (or reference) allele. The genotypes are

for female subjects and G = ([a], [A]) for male subjects.
Under the unknown underlying XCI process with a potentially skewed pattern, we consider the following genotype coding variable : (i) females : [Aa] and [AA], respectively; (ii) males : W = (0, 1) for [a] and [A], respectively. Here, the variable U is an unobserved (latent) subject-specific continuous random variable lying in the interval [−1/2, 1/2]. The values of −1/2, 0 and 1/2 represent skewed XCI toward the common allele, random XCI, or skewed XCI toward the rare (or minor) allele, respectively. In the following, we use the rewritten coding : W = Z + U × 1 (Z=1/2) with Z = 0, 1/2, 1 for females and Z = 0, 1 for males. For each patient i, the data consists of (X i , δ i , Z i , U i ).

Survival Model
In this work, we consider a semi-parametric additive hazard model with a latent variable (15,16). The hazard function for the failure time T of individual i takes the form: where β is the unknown regression coefficient of interest, λ 0 (t) is an unknown and unspecified baseline hazard function and U is the latent (unobserved) variable. Then, the individual-specific (conditional) survival distribution is such that: In the following, we assume that the U i are independent and identically raised cosine distributed random variables with parameters µ = 0 and γ = 1/2 (17). We recall that a continuous random variable U is said to have raised cosine distribution with parameters E(U) = µ and γ if its probability density In this work, U lies in the interval [−1/2, 1/2] and its expectation is equal to zero. Based upon this latter assumption, when marginalized over U, the unconditional (or marginal) survival function and hazard function are given by: where sinh and coth are the hyperbolic sine function and hyperbolic cotangent function, respectively.

Test Statistic
In this section, a statistic accounting for skewed X-inactivation is derived for testing the null hypothesis H 0 : β = 0 (same survival distribution across genotypes) against H 1 : β = 0 (different survival distribution across genotypes).
Following Lin and Ying (16), under the marginal additive hazard model introduced just above, a simple semiparametric estimating function for β is constructed and a score-type function is obtained under the null hypothesis (H 0 : β = 0). Here, the intensity function for N(t) is given by: .
By the Doob-Meyer decomposition (14), the counting process N(t) can be uniquely broken down into the sum of a martingale and a predictable process, such that: Under our model, we can estimate the cumulative hazard function by:ˆ Then, following Lin and Ying (16) (Equation 2.7), a simple estimating function (or score-like function) for β can be written as: Using L'Hopital's rule, we know that the limit of x × coth(x) as x approaches zero is equals to 1. Thus, under H 0 : β = 0: Under the null hypothesis H 0 , the random vector n −1/2 U β (β = 0) converges weakly to a normal with mean zero and with a variance which can be consistently estimated by n −1 B(β = 0) with : Thus, the (score-like) statistic : is asymptotically distributed under H 0 as a chi-square with one degree of freedom. A stratified version of the test over k strata can be constructed by calculating U β (β = 0), and its estimated variance, separately in each stratum. Both are then summed over strata. The final stratified test is then calculated in exactly the same way presented just above.

Simulation Study
A simulation study was conducted to assess the size and power of the proposed test (herein called "Score-like") as compared to three test statistics: (i) the score test of the null hypothesis under the Cox model (herein called "Cox-XCI") using the XCI coding (aa (0), aA (0.5) and AA (1) for females and a (0) and A (1) for males); (ii) the score test of the null hypothesis under the Cox model (herein called "Cox-XCE") using the XCE coding (aa Data were simulated under various scenarios assuming a locus undergoing XCI-S. The simulated variables were sex (females K = 2 and males K = 1), SNP genotype (females: Z = 0 for aa, Z = 1/2 for Aa and Z = 0 for AA; males: Z = 0 for a or Z = 1 for A), the individual skewness parameter and the time-to-event. In practice, genotype information for females was generated by combining the values of two Bernoulli variables (B p [A] ) independently drawn and for males from only one Bernoulli variable with mean: 10%, 20% and 30% for both males and females. This value corresponds to a pseudominor allele frequency (MAF), i.e., the proportion of [A] allele in the simulated population. The ratio between the female and male rate was set to 1:1. Failure times were generated from an additive hazard model with a protective effect of the minor allele such that the individual-specific hazard rate was: where λ 0 (t) = 5; β = 0, −1.5, −1.75, −2, −2.25, −2.5. The baseline hazard rates were λ 0(k=1) (t) = λ 0 (t) for males and λ 0(k=2) (t) = λ 0 (t)η for females with η = 1.2. Three distributions for the latent variable U were investigated: (i) U was generated independently and identically from a raised cosine distribution with parameters µ = 0 and γ = 1/2; (ii) U was generated independently and identically from a Beta distribution with E(U) = 1/2 (shape parameters equal to 2); (iii) U was generated independently and identically from a truncated normal distribution ranging from −0.5 to 0.5 with mean zero and standard error of 0.18. We investigated no censoring, 20% and 40% type I censoring (administrative censoring). The total number of subjects was chosen to be 400. For each configuration of parameters, 1,000 replications were performed and the levels and powers of the tests were estimated with a 0.05 significance level.

Simulation Study
As seen from the simulation results, the estimated level of the proposed test under the null hypothesis for a threshold of 0.05 fell within the binomial range [0.0365 − 0.0635] for all the studied scenarios. This is not the case for the Xu-Hao test, which gave slightly inflated type I error.
For XCI-S with a raised cosine distribution for the skewness parameter (Table 1), the power of the proposed test was always higher than the Xu-Hao test, with a difference varying from 1 to 6% and for each percentage of censoring. Higher power gains where observed for larger MAF. As expected, the Xu-Hao test gave higher power gains than the Cox-XCI test. The Cox-XCE test gave the worst power results. Results observed for XCI-S with a Beta distribution ( Table 2) were quite similar. For XCI-S with a truncated Normal distribution ( Table 3), for small and moderate MAF (10 and 20%), the power of the proposed test was always higher than the Xu-Hao test, with gains between 1 and 7%. For higher MAF (30%), the power results of the proposed test and the Xu-Hao test were close, sometimes to the slight advantage of the Xu-Hao test.
We first computed the p-values obtained with the stratified version (by the sex) of the score-like test. Then, to identify Xchromosomal loci associated with ADAs, we performed an FDRbased genome-wide analysis. Controlling the FDR at nominal level of 5% (19), we selected 24 associated signals. Results obtained using unstratified tests were similar. Among these association signals, two signals had p-values lower than 10 −6 : rs5991366 (p = 3.56 * 10 −8 stratified test, p = 4.58 * 10 −8 unstratified test) and rs5991394 (p = 3.74 * 10 −7 stratified test, p = 3.63 * 10 −7 unstratified test). Both SNPs were located in the cytoband Xp22.2 near the gene chromobox 1 pseudogene 4 (CBX1P4) and the gene REPS2 (RALBP1 Associated Eps Domain Containing 2). For rs5991366, the frequency for the minor allele was 9.1% for females and 8.8% for males with no significant difference. For rs5991394, the frequency for the minor allele was 9.9% for females and 10.2% for males with no significant difference. This pair of SNPs are in very high linkage disequilibrium (R 2 = 0.85) (20). Figure 1 displays the Kaplan-Meier survival curves for the SNP rs5991366. No event occurred among the 13 hemizygous males and 4 homozygous females for the alternative allele, whereas among the hemizygous males and homozygous females for the reference allele, 26.1% (35/134) and 27.6% (71/257) developed ADA positivity, respectively. Among the heterozygous females, 14.6% (7/48) developed ADA positivity. Figure 2 displays the Kaplan-Meier survival curves for the SNP rs5991394. No event occurred among the 15 hemizygous males and 3 homozygous females for the alternative allele, whereas among the hemizygous males and homozygous females for the reference allele, 26.5% (35/132) and 27.5% (69/251) developed ADA positivity, respectively. Among the heterozygous females, 16.4% (9/55) developed ADA positivity. Figures 3, 4 display the Manhattan plot of the X Chromosome genome-wide association results obtained with the score-like test (Figure 3) with a zoom in on the genomic region 150,000,000 bp   to 20,000,000 bp (Figure 4). Figures 5-7 display Manhattan plots of the X Chromosome genome-wide association results obtained with the (stratified) Cox-XCI test (Figure 5), the (stratified) Cox-XCE test (Figure 6) and the Xu-Hao test (Figure 7). Looking at the Manhattan diagrams in the distal Xp22.2 region, the association signals obtained from the Cox-XCI, Cox-XCE and FIGURE 4 | Manhattan plot of X Chromosome genome-wide association results (Score-like test) with zoom in on area 15,000,000-20,000,000 with SNP rs5991366 (red) and SNP rs5991394 (green). Xu-Hao tests were weaker than those obtained from the scorelike test.

DISCUSSION
Our aim was to investigate the association of loci located on the X chromosome with drug immunogenicity in auto-immune diseases, while taking into account different X-chromosome inactivation models: XCI (random inactivation), XCI-S (skewed inactivation) and XCE (inactivation escape). To date, few strategies have been proposed for analyzing time-to-event data, taking into account the complexity of the X-chromosome inactivation biological process. In practice, one can use statistical tests derived from the classical Cox model (with XCI or XCE coding) or a more complex and computationally burdensome test based on a random effect Cox model (1,2).
We propose a new score-like test that takes into account for an unknown underlying XCI process with a potentially skewed pattern. We assumed a semi-parametric additive hazard  model with a latent factor that provides an easy and meaningful interpretation of the skewed X-inactivation process. Results from the simulation study show that the proposed test provides higher power gains than the score tests from the Cox model (XCI and XCE coding) and to the likelihood ratio test proposed by Xu et al. (1) and Han et al. (2). With the latter test, some caution should be taken when interpreting its power results as the type I error rate is slightly inflated. For the distribution of the latent factor, we considered a raised cosine distribution that mimicks the unknown skewed X-inactivation process and leads to a closed form for the marginal survival distribution. Other choices are possible. However, as shown by the simulation study, the proposed test performs quite well even with other distributions such as the Beta and truncated Normal distributions. In the present, the additive hazard model that represents the effect of the genetic locus on the hazard rate as a linear form, serves as an alternative to the usual proportional hazards model and benefits from several useful mathematical properties. The model involves a straightforward simple testing procedure and a stratified version of the score-like test is easily obtained. However, if the genetic effect is not confounded with sex, there is no need to stratify by sex in the analysis. Note that our main objective was not to find the best genetic model over various biological processes but to propose a novel statistic for testing the effect of a loci under an XCI-S process. The statistic could nevertheless be used for model selection, although this would require further works.
By investigating the association between the genes located on the X chromosome with anti-drug immunogenicity, the proposed test allowed us to select two SNPs with high linkage disequilibrium (rs5991366 and rs5991394) located in the cytoband Xp22.2 that would have been overlooked by the score tests from the Cox model (XCI and XCE coding) and the Xu et al.'s likelihood ratio test. Both SNPs show a similar protective effect for drug immunogenicity without any occurrence of ADA positivity for homozygous females and hemizygous males for the alternative allele. In contrast, almost 30% of the homozygous females and hemizygous males for the reference allele experienced ADA positivity. Note that for both SNPs, the frequencies of the alternative allele observed for both males and females were not significantly different from the estimates obtained in European samples (21).
The region containing the two X-linked SNPs associated with ADA occurrence is conserved between primates (99% identity) and also within mammals (70% identity with mus musculus) (22). Moreover, the SNP rs59913394 is in the proximity of a regulatory variant (rs113772781) in LD (r2=1 in Europeans), but no gene expression in immune cell types is regulated by this variant (23). In the genomic neighborhood of these two SNPs, the closest gene, CBX1P4 (Chromobox 1 Pseudogène 4), is a pseudogene followed by the gene REPS2 (RALBP1 Associated Eps Domain Containing 2). The REPS2 gene (RALBP1 Associated Eps Domain Containing) encodes for a protein which is part of a protein complex that regulates the internalization of growth factors receptors such as EGF and insulin receptors and may have an inhibitory effect on growth factor cell signaling. It is downregulated in prostate cancer progression and that this downregulation is accompanied by upregulation of NF-κB activity (24,25). No direct effect of REPS2 on auto-immune disease has been recognized to date. However, the REPS2 gene is widely expressed in several human tissues, including white blood cells and lymph nodes. Its biological targets (growth factors and NF-κB signaling) are also widely expressed and have a major role in inflammation and immunity. Dysregulation of the IGFs pathway has an important role in autoimmune diseases (26). In particular, IGF stimulates Treg proliferation and has a protective effect in autoimmune disease models. Further in the Xp22.2 locus, several genes have a role in immunity including ACE2, TLR7, and TLR8. ACE2 (Angiotensin I converting enzyme 2) is a key actor of the renin-angiotensin system acting as a homeostatic regulator of the vascular function (blood pressure regulation). Recently it attracted much attention for its major role in SARS-CoV-2 infection through its affinity for the viral spike factor, raising the question of a possible sex bias in this disease (27). The genes TRL7 and TLR8 are members of the Toll-like receptor (TLR) family which plays a major role in activation of innate immunity. Studies have shown that some immunity genes such as TLR7, CD40LG, and CXCR3 may escape XCI in several lymphoid cells (B cells, T cells, plasmacytoid cells), especially in some autoimmune diseases such as Systemic Lupus Erythematosus due to a dysregulation of the XIST long non-coding RNA involved in XCI (28)(29)(30). Here, we investigated the genes that are located in the vicinity of the SNPs of interest, but other genes that are further from the SNPs (e.g., MNG2 multinodular goiter 2) could be relevant. Additional studies are required to strengthen our findings.
In the search for common susceptible loci, we analyzed together the time to ADA of patients treated with eight different drugs for different autoimmune diseases. This strategy, which uses information concerning various therapies and autoimmune diseases, is likely to provide significant gain in power in finding loci not associated with specific therapies or autoimmune diseases. Nevertheless, it would not be suitable for searching for loci that are specific to a particular therapy or disease.
In conclusion, we think that more X-Chromosome GWAS should be performed and that the proposed test, which is easy to implement with standard softwares, is well-suited for identifying X Chromosome SNPs, while taking into account all possibilities of the skewed X-Chromosome inactivation process.

DATA AVAILABILITY STATEMENT
The data analyzed in this study were collected in the context of the ABIRISK project by ABIRISK partners. Access to the minimal dataset underlying the findings can be obtained upon request to the ABIRISK Sustainability Scientific Committee by submission of an analysis plan. The analysis plan should explain the purpose of the use of the data and confirm the intention to use the data only for replication studies concerning anti-drug inhibitors, since this is the limitation of the ethical permission on how this data can be used. The contact person of the ABIRISK Sustainability Scientific Committee to whom the requests should be sent is MP (marc.pallardy@inserm.fr).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Medical

AUTHOR CONTRIBUTIONS
PB coordinated the project and developed the proposed statistical procedure. SH, SC-B, and PB participated in writing the original draft. SH and PB analyzed the data. SH, MA, FD, AF-H, XM, MP, and PB participated in the data collection. MP coordinated the ABIRISK project. All authors read and approved the final manuscript.