The Significance of RHD Genotyping and Characteristic Analysis in Chinese RhD Variant Individuals

Background RhD is the most important and complex blood group system because of its highly polymorphic and immunogenic nature. RhD variants can induce an immune response through allogeneic transfusion, organ transplantation, and fetal immunity. Transfusion strategies differ for RhD variants formed by different alleles. Therefore, extensive investigation of the molecular mechanism underlying RhD variants is critical for preventing immune-related blood transfusion reactions and fetal immunity. Methods RhD variants were collected from donors and patients in Zhejiang Province, China. The phenotypes were classified using the serologic method. The full coding regions of the RHD gene were analyzed using the PCR-SBT method. The multiplex ligation-dependent probe amplification (MLPA) assay was used to analyze the genotype and gene copy number. SWISS-MODEL and PyMOL software were used to analyze the 3D structures of RhD encoded by the variant alleles. The effect of non-synonymous substitutions was predicted using the Polymorphism Phenotyping algorithm (PolyPhen-2), Sorting Intolerant From Tolerant (SIFT), and Protein Variation Effect Analyzer (PROVEAN) software. Results Among the collected RhD variants, 28 distinct RHD variant alleles were identified, including three novel variant alleles. The RH-MLPA assay is advantageous for determining the copy number of the RHD gene. 3D homology modeling predicted that protein conformation was disrupted, which may explain the differential expression of RhD epitopes. A total of 14 non-synonymous mutations were predicted to be detrimental to the protein structure. Discussion We revealed the diversity of RHD alleles present in eastern Chinese RhD variants. The bioinformatics analysis of these variant alleles extended our knowledge of RhD variants, which is crucial for evaluating their impact, guiding transfusion support, and avoiding immune-related blood transfusion reactions.


INTRODUCTION
The Rh blood group system plays a pivotal role in blood transfusion and in severe hemolytic disease of the fetus and newborn because of its high polymorphism and strong immunogenicity (1)(2)(3). In addition to the common RhD-positive and RhD-negative phenotypes, populations exhibit numerous variants, including weak D, partial D, and Del types. The differential expression of antigen on the red cell surface caused by RhD variants may trigger an immune response during allogeneic transfusion therapy, with serious consequences such as severe hemolytic transfusion reaction and organ transplant rejection. Clinical practice usually relies on the phenotypic results of conventional RhD typing. However, most RhD variants cannot be precisely identified using serologic methods, which leaves a risk of immune responses following transfusion of RhD variant blood. In practice, RhD-negative recipients may develop anti-D antibodies following transfusion of some RhD variant red blood cells (4)(5)(6). Genotyping of RhD variants is therefore of great significance for avoiding immune-related blood transfusion reactions.
Most RhD variants can be traced back to DNA variation, including single nucleotide variation (SNV), multiple nucleotide variants (MNV), base insertions or deletions, and hybrid alleles (7)(8)(9). So far, over 400 RHD alleles have been registered and named by the International Society of Blood Transfusion (ISBT), which divides them into four categories (https://www.isbtweb.org/working-parties/red-cell-immunogenetics-and-blood-group-terminology). The genetic alterations of RHD alleles differentially influence the attributes and quantity of RhD epitopes, resulting in variants with potentially distinct immune responses. However, because reagents covering all immunogenic epitopes are not available in the laboratory, serological phenotyping of RhD variants is often inconclusive. Notably, the ability to test antigens without serologic reagents is a major medical achievement; it helps locate compatible donor units and can be lifesaving. It is therefore both urgent and necessary to perform RHD genotyping and determine the phenotypes of RhD variants in practice.
Current research on the distribution of variant RHD alleles has mostly focused on European donor and patient cohorts (10)(11)(12)(13)(14). There are also some reports on the diversity of RHD alleles in African and Asian populations (15). Knowledge about the overall characteristics of RhD variants in China remains relatively limited. Because genetic characteristics differ between populations, and antigen and antibody immune responses are allele-specific, we aimed to obtain further knowledge of RhD variant genotypes and to explore the genetic resources of the RHD gene in the Chinese population. Herein, RhD variants collected over a specific period were tested using different typing assays for RHD allele sequence and copy number variation analysis. Moreover, bioinformatics analysis was performed in silico, which could enhance our understanding of the influence of genetic variants on differential antigen expression and their implications for immune-related transfusion reactions and fetal immunity.

Study Specimens
The study specimens were collected during routine work. These samples were initially suspected to be RhD variants or anti-D negative and were sent to our reference laboratory for further identification. Some were drawn from blood donors at blood services in Zhejiang Province, while the remainder were drawn from patients in hospitals of Hangzhou City in Zhejiang Province, China. Zhejiang Province is located in eastern China. These individuals provided informed consent, and the research was approved by the ethics committee of Blood Center of Zhejiang Province. Genomic DNA was extracted using a commercial MagDNA pure LC DNA isolation kit (RBC Bioscience Corporation, Taiwan, China) according to the manufacturer's instructions.

Serological Test
RhD phenotypes were determined using conventional serological kits. Blood donors were selected from routine preliminary screening of RhD types by microplate testing with anti-D reagent (IgM, Clone LDM1, Alba Bioscience Ltd, UK). Donor samples that were negative or weakly reactive on preliminary RhD screening, and patient samples from hospitals, were further tested using a tube test (IgM/IgG blend, Clone TH-28, and MS-26, Merck Millipore Ltd, Livingston, UK). Agglutination reactions were graded on a scale of 0 to 4+, and phenotypes were classified as RhD negative (0), weak RhD (±, 1+, and 2+), or RhD positive (3+ and 4+). RhD-negative and weak RhD specimens were further tested by indirect antiglobulin test (IAT) using two of the following anti-D reagents according to reagent supply (1-IgM/IgG blend, Clone RS1126, and MS-26, Lorne Laboratories Ltd, Twyford, UK; 2-IgM/IgG blend, Clone P3X61, P3X21223B10, P3X290, and P3X35, Diagast Ltd, Loos, France; 3-IgM/IgG blend, Clone TH-28, and MS-26, Merck Millipore Ltd, Livingston, UK; 4-IgG blend, clone MS-26, Shanghai Blood Biomedicine Co., Ltd. Shanghai, China). RhD variants were defined as RBCs that agglutinated with anti-D reagent but with an agglutination intensity of less than 2+, or that agglutinated with only a few anti-D reagents, or that were not agglutinated by the tube method but reacted with anti-D by IAT.
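The grading rule above can be written as a small lookup. The following is a minimal sketch, not the laboratory's software: the function names `classify_tube_grade` and `needs_iat` are hypothetical, and the grade strings are assumed to be recorded exactly as "0", "±", "1+" to "4+".

```python
# Illustrative mapping of tube-test agglutination grades to the phenotype
# classes described in the text. Names and string encoding are assumptions.
WEAK_GRADES = {"±", "1+", "2+"}
POSITIVE_GRADES = {"3+", "4+"}


def classify_tube_grade(grade: str) -> str:
    """Return the phenotype class for one agglutination grade (0 to 4+)."""
    g = grade.strip()
    if g == "0":
        return "RhD negative"
    if g in WEAK_GRADES:
        return "weak RhD"
    if g in POSITIVE_GRADES:
        return "RhD positive"
    raise ValueError(f"unrecognized agglutination grade: {grade!r}")


def needs_iat(grade: str) -> bool:
    """RhD-negative and weak RhD specimens proceed to IAT confirmation."""
    return classify_tube_grade(grade) in {"RhD negative", "weak RhD"}


if __name__ == "__main__":
    for g in ["0", "±", "2+", "4+"]:
        print(g, classify_tube_grade(g), needs_iat(g))
```

Unknown grades raise an error rather than defaulting to a class, so a transcription mistake cannot silently become a phenotype call.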

RHD Genotyping Using PCR-SBT
PCR was used to amplify the sequences of the 10 exons of the RHD gene. The amplification and sequencing primers were selected according to our prior report (16). The amplification conditions were optimized to a single reaction program as follows: 95°C 5 min; 95°C 30 s, 64°C 30 s, 72°C 90 s, 30 cycles; 72°C 10 min. The content of the reaction system is identical to that described previously (16), except that the DNA polymerase was changed to 0.5 U La-Taq (TaKaRa, Dalian, China).
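As a quick worked check of the program above, the programmed hold time can be summed directly. This sketch counts only the hold steps stated in the text; ramp times between temperatures depend on the thermal cycler and are not included. The function name `program_hold_seconds` is hypothetical.

```python
def program_hold_seconds(initial_s, cycle_steps_s, cycles, final_s):
    """Sum the programmed hold times of a PCR program, ignoring ramping."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


# 95°C 5 min; 30 cycles of (95°C 30 s, 64°C 30 s, 72°C 90 s); 72°C 10 min
total_s = program_hold_seconds(
    initial_s=5 * 60,
    cycle_steps_s=(30, 30, 90),
    cycles=30,
    final_s=10 * 60,
)

if __name__ == "__main__":
    print(total_s, "s =", total_s / 60, "min")  # 5400 s = 90.0 min
```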

Genotype and Copy Number Analysis Using MLPA Method
The copy number of the RHD gene was analyzed by the RH-MLPA method using a commercial blood group genotyping kit (MRC-Holland, Amsterdam, the Netherlands), which includes three sets of MLPA probe mixtures (P401, P402, and P403) as well as reaction system reagents. The three sets of probes contain 44 RH allele wild-type and mutant probes. The presence and proportion of probe signals were used to determine the existence of wild-type and/or variant alleles as well as the corresponding copy number (18). The MLPA procedure and data analysis were conducted following the manufacturer's instructions.

Protein 3D Structure Analysis
3D structure models of wild-type and prematurely terminated RhD protein were generated with SWISS-MODEL (https://swissmodel.expasy.org/interactive) (20), using the crystal structure of RhCG protein (PDB code: 3HD6) (19) as the template. RhCG has the structure most similar to RhD, and no crystal structure is yet available for wild-type RhD. The structures of novel variant RhD proteins carrying amino acid substitutions were analyzed using PyMOL software (21).

Preliminary Serological Results
Following preliminary RhD blood type screening and further confirmation using a tube test and the IAT method, 59 specimens were designated as RhD variants for follow-up research. These samples exhibited no agglutination or weak agglutination with IgM antibody in the tube test, or +~3+ agglutination strength in the IAT. All samples showed similar agglutination strength in the IAT with the two reagents containing IgG antibodies. Anti-D antibodies were found in the serum of two blood donors.
One pregnant woman with an RhD-negative phenotype was also included in this study. Her sample was subjected to subsequent analysis after anti-D antibody was detected in the serum and a suspected result was obtained using the PCR-SSP method in the hospital.

Genotype Distribution of D Variants
The ten exons of the RHD gene were analyzed using the PCR-SBT method. Variant sites in the coding region were found in 52 specimens. Seven specimens with no variants were identified as the normal RHD*01 genotype, and one further specimen was confirmed to carry an RHD-negative allele. RHD*DVI.3 homozygotes and RHD*weak partial 15 homozygotes were the two most prevalent genotypes, accounting for 26.9% and 19.2% of RhD variants in this study, respectively. The two donors who produced antibodies had the RHD*DVI.3 homozygous genotype. Using the MLPA assay, nine specimens were found to be heterozygous, either with RHD*01 or for two different variant alleles. The detailed genotyping results are shown in Table 1.

Molecular Characterization of Variant RHD Alleles, Including Novel Alleles
A total of 28 variant RHD alleles were identified in these RhD variants, including 20 single nucleotide variants (SNV), 1 multiple nucleotide variant (MNV), 1 insertion, and 6 RHD-CE-D hybrid alleles (Figure 1A). According to the ISBT nomenclature classification, these variant alleles comprise 14 weak D alleles, 9 partial D alleles, 1 Del allele, 1 D negative allele, and 3 novel variant alleles. The number and proportion are shown in Table 2 and Figure 1B. Among our test specimens, the most prevalent variant allele was the partial D allele RHD*DVI.3, in which exons 3-6 are exchanged with RHCE. RHD*weak partial 15 also had a relatively high frequency. The remaining variant alleles were rare, each being detected in only one or two specimens. RHD*18A carries a synonymous mutation. The affected codon encodes an amino acid at the N-terminus of the protein, and the change may affect RNA splicing. Specific allele information, including amino acid mutations, rs number, GenBank no., and membrane localization, is listed in Table 2.
Two novel single-nucleotide missense variants (c.538G>C and c.782C>T) and one novel insertion variant (c.210_211insG) were detected in three specimens (Table 2). All novel RHD alleles were found to be hemizygous. The RHD*538C allele carried a c.538G>C transversion in exon 4 encoding a p.G180R substitution, whereas RHD*782T carried a c.782C>T transition in exon 5, resulting in a p.P261L substitution. Both of them corresponded to amino acid substitution in TM. RHD*210_211insG inserted a G between positions 210 and 211, resulting in a frameshift that creates a premature termination codon at position 158 of the RhD protein.
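The relationship between the coding-DNA positions and the affected residues can be checked with simple arithmetic. The sketch below assumes 1-based coding sequence numbering starting at the A of the start codon; the helper names are hypothetical and no RHD sequence data are used. It also classifies each substitution: G>C swaps a purine for a pyrimidine (a transversion), whereas C>T stays within the pyrimidines (a transition).

```python
from typing import Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def codon_of(cds_pos: int) -> Tuple[int, int]:
    """Return (codon number, position within codon 1-3) for a 1-based CDS position."""
    if cds_pos < 1:
        raise ValueError("CDS positions are 1-based")
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    pair = {ref.upper(), alt.upper()}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def first_codon_altered_by_insertion(left_pos: int) -> int:
    """First codon whose bases change when a base is inserted after left_pos."""
    return left_pos // 3 + 1


if __name__ == "__main__":
    print(codon_of(538), substitution_class("G", "C"))  # (180, 1) transversion
    print(codon_of(782), substitution_class("C", "T"))  # (261, 2) transition
    print(first_codon_altered_by_insertion(210))        # 71
```

Position 210 is the last base of codon 70, so an insertion between positions 210 and 211 leaves codon 70 intact and shifts the frame from codon 71 onward.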

Genotyping and Copy Number Analysis Using RH-MLPA Assay
Further genotyping and copy number analysis of 49 RhD variants were performed using the MLPA assay (three specimens with no available DNA were excluded). The results are compared with those of PCR-SBT in Table 1. Specimens with commonly known alleles were accurately identified, consistent with the PCR-SBT results. RHD*01N.04 alleles were found in two specimens using the RH-MLPA assay but were missed by the PCR-SBT method (Table 2). The MLPA assay was unable to detect specimens with rare variants (including novel mutations). However, copy number variation remains a problem in RhD typing, and the RH-MLPA assay has a clear advantage in analyzing the copy number of alleles. Based on the signal ratio of wild-type alleles, nine specimens were found to have two copies. The remaining 40 specimens had only one copy of the RHD allele, indicating that they were hemizygous (RHD/-). The copy number results from the MLPA assay were consistent with the zygosity test except for one specimen (RHD*03.02).
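The step from a normalized probe signal ratio to a copy number call can be sketched as follows. This is an illustration only and not the kit's analysis procedure: it assumes the ratio has already been normalized against a two-copy reference so that 1.0 corresponds to two copies, and the tolerance of 0.25 copies is an arbitrary illustrative value. The function names are hypothetical.

```python
from typing import Optional


def call_copy_number(ratio: float, reference_copies: int = 2,
                     tolerance: float = 0.25) -> Optional[int]:
    """Convert a normalized probe ratio to an integer copy number.

    Returns None when the estimate is too far from any integer to call.
    The tolerance is an illustrative assumption, not a kit threshold.
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    estimate = ratio * reference_copies
    nearest = round(estimate)
    if abs(estimate - nearest) > tolerance:
        return None
    return nearest


def describe_zygosity(copies: Optional[int]) -> str:
    """Translate an RHD copy number into the notation used in the text."""
    labels = {0: "RHD deleted (-/-)", 1: "hemizygous (RHD/-)", 2: "two copies"}
    return labels.get(copies, "undetermined")


if __name__ == "__main__":
    for r in (0.0, 0.52, 0.75, 0.98):
        c = call_copy_number(r)
        print(r, c, describe_zygosity(c))
```

Returning `None` for ambiguous ratios keeps borderline signals out of the hemizygous and two-copy groups until they are repeated or checked by another method.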

Model 3D Structure of RhD Mutant Protein
The wild-type RhD protein structure was modeled using SWISS-MODEL based on the crystal structure of the RhCG protein. The amino acid substitutions were introduced using PyMOL software. The positions of the amino acid substitutions in the overall structure of RhD protein are displayed in Figure 2A. The known mutation c.143A>G, which results in the p.Y48C substitution (26), and the two novel missense mutations, which lead to the p.G180R and p.P261L substitutions, were selected for 3D structure analysis. These three amino acid substitutions were located in the α2 helix, the α6 helix, and loop ICL3, respectively. The connections between the replaced amino acid and other amino acids, as well as the changes in hydrogen bonds, are depicted in Figures 2B1-D1 and Table 3. In the wild-type RhD protein, amino acid residue Y48 was hydrogen-bonded to four amino acids (L44, 2.8 Å; G51, 3.0 Å; Q52, 2.9 Å; and S222, 2.7 Å), as indicated in Figure 2B2. Following mutagenesis, the hydrogen bond to S222 disappeared (Figure 2B2). In the p.G180R mutant protein, the atoms and positions forming hydrogen bonds changed significantly (Figure 2C2). Two additional Ser residues located in the α4 helix were bound to R180 (S122, 3.1 Å; S126, 2.7 Å, and 1.2 Å) because Arg, unlike Gly, carries a long side chain. The hydrogen bond lengths remained unchanged for the other three connected amino acids (A176, A177, and A184) (Table 3). The amino acid at position 261 was located in the intracellular loop. According to the PyMOL analysis, this residue has no interaction with other amino acids. Figures 2D1-D2 show that the position of the loop was slightly changed by the p.P261L mutation. For the RHD*210_211insG allele, a frameshift occurred, resulting in a C-terminally truncated RhD protein that formed only two complete α-helices (Figure 2E1). Relative to the wild-type protein, the model comprised 81 amino acids, including 14 frameshift amino acids at positions 71 to 85. Figure 2E2 depicts the last 14 altered amino acids.
Table footnotes: These three mutation sites can be found in the SNP database of NCBI, but they have not been named by ISBT. / No corresponding membrane localization, ISBT terminology, or rs number is available.

Predicted Effect of Non-Synonymous Mutations In Silico
Three bioinformatics software packages (SIFT, PolyPhen-2, and PROVEAN) were used to predict the influence of non-synonymous mutations on structural alterations. To qualify as deleterious, a mutation had to be predicted as damaging by at least two of the three programs. A total of 14 non-synonymous mutations were predicted to be deleterious to the protein structure, whereas the remaining mutations were predicted to be neutral (Table 4). The novel non-synonymous mutations were predicted to be deleterious to the protein.
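The two-of-three rule can be expressed as a simple majority vote. The sketch below assumes each program's output has already been reduced to a yes/no "damaging" call; how each tool's score is thresholded is left outside the example, and the function name `is_deleterious` is hypothetical.

```python
def is_deleterious(sift_damaging: bool, polyphen_damaging: bool,
                   provean_damaging: bool, min_votes: int = 2) -> bool:
    """Apply the consensus rule: damaging in at least min_votes of three tools."""
    votes = sum([bool(sift_damaging), bool(polyphen_damaging),
                 bool(provean_damaging)])
    return votes >= min_votes


if __name__ == "__main__":
    # Hypothetical calls, for illustration only (not results from Table 4).
    print(is_deleterious(True, True, False))   # True: two of three agree
    print(is_deleterious(True, False, False))  # False: only one tool
```

Requiring agreement between two independent predictors reduces the chance that a single tool's false positive is reported as a deleterious variant.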

DISCUSSION
In this study, the molecular characteristics of RhD variants were investigated in Zhejiang Province, eastern China. Genotyping is extensively applied to identify RhD variants (27). Rh is the most important and complex clinical blood group system, and serological testing is hampered by many factors and cannot clearly identify the various RhD variants. Notably, distinct phenotypes and genotypes of RhD variants have specific clinical significance and require specific blood transfusion strategies (28,29). An RhD blood type mismatch between donor and recipient may cause an acute, severe immune response resulting in neonatal hemolytic disease, hemolytic transfusion reaction, and autoimmune hemolytic disease (26,30). Therefore, genotyping is required to determine the molecular characteristics of RhD variant donors or recipients, which benefits disease treatment. As an alternative to routine serological testing, genotyping has significant potential in clinical practice. RHD allele distribution varies widely between ethnic populations. In the present study, 28 distinct variant RHD alleles, including three novel alleles, were characterized in our tested specimens. RHD*DVI.3 and RHD*weak partial 15 were the most prevalent RhD variant alleles in Zhejiang Province. This result is consistent with the distribution of RhD variant alleles in other regions of China (9, 16, 31), but it differs markedly from Caucasian populations, in which their prevalence is less than 5% (32,35). There was no significant similarity or clustering at a particular site between the other rare RhD variant alleles identified in our study and those reported previously (3,9). In this study, the frequencies of variant RHD alleles could not be accurately calculated because the specimens were not derived from a unified large-scale screening of random populations. However, their frequency was clearly low, with only one or two specimens detected for each allele.
This study identified three novel D variant alleles, which adds to the knowledge of RHD alleles and enriches the genetic resources of the Rh blood group in the Chinese population. The c.538G>C mutation is similar to the c.538G>A mutation reported by Fichou et al. (12): the two are different base changes at the same position that cause the same amino acid substitution. The novel RHD*210_211insG allele, which predicts a premature termination codon, was derived from the pregnant woman with the RhD-negative phenotype. However, the RhD-negative phenotype requires confirmation using an absorption and elution test. Subsequent in vitro expression experiments are required to elucidate the effect of the mutation on the antigen.
The Del phenotype caused by RHD*1227A is of particular importance in Asian populations, where it is referred to as "Asian Del". In clinical practice, RhD-negative individuals have been confirmed to produce primary and secondary immune responses following transfusion of Asian Del blood (33,34). On the other hand, pregnant women with "Asian Del" do not produce IgG antibodies that cause hemolytic disease of the newborn (HDN) (35), indicating that RHD genotyping could avoid unnecessary administration of RhIg. Notably, some variants may provoke a strong immune response. The variant arising from RHD*D-CE(9)-D was reported to produce a high-titer anti-D and cause severe hemolytic disease in newborns (36). To improve blood transfusion safety and the early prevention of Rh-related neonatal hemolytic disease, the use of Rh blood group genotyping should be broadened, including to RhD-negative individuals, which could reduce alloimmunization. RhD genotyping has been widely implemented in European countries to control blood product supply (37)(38)(39).
All mutation sites appeared to be randomly distributed over the CDS region. Variant alleles affecting the transmembrane region were relatively prevalent among these rare RhD variant alleles. How these variant alleles affect the antigen is not yet clear, because different variant alleles exhibit distinct phenotypic and clinical characteristics (25,28). Knowledge of RHD alleles and their clinical significance therefore contributes both to the management of individual cases and to the understanding of blood group genetics. Much more practical research on the Rh blood group is encouraged in China and in other populations.
The difficulty in investigating the molecular mechanism of the Rh blood group lies not only in sequence analysis but also in identifying RHD-CE recombination and RH gene copy number variation (40). However, few accurate detection methods are available in this area. The copy number of the RHD gene directly affects the expression of its surface antigen, which is of great significance for prenatal diagnosis and the prevention of hemolytic disease of the fetus and newborn (HDFN) (41). Therefore, detecting RHD gene copy number is critical. At present, the MLPA assay and the hybrid Rhesus box test are relatively suitable methods for heterozygosity detection. In this study, the MLPA method was employed to perform genotyping and copy number variation analysis of RhD variants concurrently. However, the results were inconsistent with the hybrid Rhesus box test for one specimen, because hybrid Rhesus box detection may be affected by differences in breakpoints or additional variations in the Rhesus box area (42,43). In terms of detecting point mutations, the MLPA assay is weaker than PCR-SBT, particularly for weak D variants. Because it covers only a few well-established mutation sites, it cannot detect most rare mutation sites, and detecting novel mutations is even more challenging. On the other hand, the PCR-SBT method cannot determine copy number variation of the RHD gene, and the hybrid Rhesus box PCR-SSP test has a relatively high error rate (43). These three technologies therefore have their own advantages and disadvantages and complement each other. The method for Rh blood group genotyping should be chosen according to the clinical requirement at hand.
According to the simulated 3D protein structure, the amino acid changes caused by the missense mutations p.Y48C and p.G180R affect α-helices, a secondary structure element of the protein. Tyr and Cys are both polar hydrophilic amino acids; however, the p.Y48C substitution eliminated the aromatic ring and removed the hydrogen bond connected to S222 in another α-helix. Therefore, replacing this amino acid may disrupt the interaction between α-helices, resulting in decreased antigen expression or changes in antigenic properties. In the wild type, G180 is the smallest hydrophobic amino acid, making it a superior helix-forming residue (44). Compared with Gly, Arg has a large, hydrophilic, charged side chain, which affects tertiary interactions and RhD protein stabilization. Hydrogen bonds were also affected by the side-chain group. The G180R substitution increased the interaction with other amino acids by adding three hydrogen bonds. However, additional hydrogen bonds may not be required for correct helix folding. Instead, they may destroy the inherently stable structure of the protein and impair its membrane insertion (45). The reduction of RhD antigen density caused by the p.P261L substitution may be explained by the disruptive effect of Leu interfering with correct folding of intracellular loop ICL3. The insertion mutation RHD*210_211insG caused a frameshift at amino acid 71 and premature termination, resulting in a truncated RhD polypeptide. It was therefore speculated that the product of this insertion allele could comprise only a part of the RhD epitope. The 3D structural simulation revealed that the spatial conformation of the protein was destroyed. Amino acid substitutions altered the interactions and stability of the RhD protein structure, whereas the frameshift mutation resulted in partial structure loss. All of them affected the normal assembly of the tertiary structure, leading to changes in RhD antigen characteristics. The bioinformatics analysis of RhD protein could help us grasp the impact of RHD gene mutations on antigen differences and guide subsequent blood transfusion strategies.
No corresponding variant alleles could be detected in seven specimens defined as D variants, which may have another molecular basis. RHD-specific microRNA was reported to regulate the expression of Rh antigen protein (46). However, the molecular mechanism underlying the differential expression of RhD antigen remains largely unknown. Additional research is required to determine the impact of other regulatory elements on RhD variants.
In summary, this study identified and characterized various RHD alleles, including 25 reported weak D, partial D, DEL, and RhD-negative alleles and three novel variant alleles. These findings provide a brief overview of the variant D phenotypes found in eastern China. Additionally, we performed an in silico bioinformatics analysis of these variant alleles. These findings extend our knowledge of RhD variants in blood donors and clinical transfusion recipients. Such knowledge is critical for precision blood transfusion treatment, as it reduces the risk of alloimmunization in RhD-negative recipients and thereby supports blood transfusion safety, organ transplantation success, and the prevention of fetal incompatibility.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: GenBank, accession numbers from MZ782891 to MZ782914.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics committee of Blood Center of Zhejiang Province. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
YY performed research, analyzed data, and wrote the paper. JZ, XH, and XX performed research. JH collected the samples. FZ designed the research. All authors contributed to the article and approved the submitted version.