Functional Characterization of the Effects of N-acetyltransferase 2 Alleles on N-acetylation of Eight Drugs and Worldwide Distribution of Substrate-Specific Diversity

Variability in the enzymatic activity of N-acetyltransferase 2 (NAT2) is an important contributor to interindividual differences in drug responses. However, there is little information on functional differences in N-acetylation activities according to NAT2 phenotypes, i.e., rapid, intermediate, slow, and ultra-slow acetylators, between different substrate drugs. Here, we estimated NAT2 genotypes in 990 Japanese individuals and compared the frequencies of different genotypes with those of different populations. We then calculated in vitro kinetic parameters of four NAT2 alleles (NAT2∗4, ∗5, ∗6, and ∗7) for N-acetylation of aminoglutethimide, diaminodiphenyl sulfone, hydralazine, isoniazid, phenelzine, procaineamide, sulfamethazine (SMZ), and sulfapyrizine. NAT2∗5, ∗6, and ∗7 exhibited significantly reduced N-acetylation activities with lower Vmax and CLint values of all drugs when compared with NAT2∗4. Hierarchical clustering analysis revealed that 10 NAT2 genotypes were categorized into three or four clusters. According to the results of in vitro metabolic experiments using SMZ as a substrate, the frequencies of ultra-slow acetylators were calculated to be 29.05–54.27% in Europeans, Africans, and South East Asians, whereas Japanese and East Asian populations showed lower frequencies (4.75 and 11.11%, respectively). Our findings will be helpful for prediction of responses to drugs primarily metabolized by NAT2.

NAT2 metabolizes isoniazid (INH), a first-line antituberculosis (TB) drug, to N-acetyl INH (Mthiyane et al., 2020). High plasma levels of hydrazine, generated by non-enzymatic conversion of INH, are thought to cause anti-TB drug-induced liver injury (ATDILI) (Brewer et al., 2019). Because patients with TB harboring NAT2 * 5, * 6, and * 7 show much slower N-acetylation than patients homozygous for NAT2 * 4 (Mthiyane et al., 2020), accumulation of INH and hydrazine occurs in slow acetylators (SAs), leading to a higher risk of liver injury (Brewer et al., 2019). These phenotypes may be the most efficient pharmacogenomics biomarkers for predicting the risk of ATDILI (Azuma et al., 2013; Mushiroda et al., 2016; Wattanapokayakit et al., 2016; Yuliwulandari et al., 2016; Suvichapanich et al., 2019) and may lead to identification of a cost-effective treatment for TB (Rens et al., 2020). Therefore, categorization of patients as rapid acetylators (RAs), intermediate acetylators (IAs), and SAs may be useful for treatment with NAT2 substrates (Rens et al., 2020). Although many studies have reported strong associations of individual NAT2 genotypes, such as NAT2 * 6A/ * 6A, with the risk of ATDILI (Suvichapanich et al., 2018; Nicoletti et al., 2020; Yuliwulandari et al., 2021), little information is available regarding the associations between NAT2 genotype and the risk of adverse reactions induced by other NAT2 substrate drugs. For example, NAT2 phenotypes, such as SA, IA, and RA, have been shown to be associated with hydralazine-induced adverse reactions because hydralazine is primarily metabolized by NAT2 (Spinasse et al., 2014; Rens et al., 2020). Thus, the substrate specificity of N-acetylation in relation to categorization of NAT2 genotypes into phenotypes has also not been reported (Zang et al., 2007).
As first reported by Ruiz et al. (2012), ultra-slow acetylators (USAs; also known as very-slow acetylators) are defined as individuals with the NAT2 * 6A/ * 6A genotype. Thereafter, Selinski et al. (2017) reported the existence of USAs based on an association with urinary bladder cancer risk. In addition, we previously demonstrated the existence of USAs (NAT2 * 6A/ * 6A, * 6A/ * 7B, and * 7B/ * 7B) by comparing the effects of each NAT2 genotype on the risk of developing liver injury induced by isoniazid in a trans-ethnic meta-analysis (Suvichapanich et al., 2018). Recently, studies of Indian and European cohorts also concluded that the metabolic effects of NAT2 * 6 and * 7 differ from those of NAT2 * 5 (Nicoletti et al., 2020). Therefore, we attempted to clarify the substrate-specific diversity of NAT2 phenotypes between different populations by categorizing NAT2 genotypes into USA, SA, IA, and RA phenotypes based on the results of in vitro N-acetylation of eight drugs.
In this study, we conducted in vitro metabolic experiments of aminoglutethimide (AGT), diaminodiphenyl sulfone (DDP), hydralazine (HLZ), isoniazid (INH), phenelzine (PZ), procaineamide (PA), sulfamethazine (SMZ), and sulfapyrizine (SP) using HEK293 cells transiently expressing NAT2 * 4, * 5, * 6, and * 7 to elucidate the substrate specificity profiles of the N-acetylation of each allele. Additionally, we categorized NAT2 genotypes into phenotypes, i.e., RAs, IAs, SAs, and USAs, based on activity scores calculated by in vitro intrinsic clearance (CLint) for N-acetylation of the substrate drugs. Moreover, many studies have demonstrated the frequencies of NAT2 phenotypes, but not genotypes, without considering the effects of each drug on N-acetylation activity in worldwide populations (Li et al., 2011;Mortensen et al., 2011;Sabbagh et al., 2011). Therefore, we could not categorize NAT2 genotypes into USA, SA, IA, and RA phenotypes based on the effects of different drugs on in vitro N-acetylation activities. Accordingly, we also summarized the worldwide distributions of NAT2 phenotypes in a large-scale study of Japanese individuals and 26 different populations collected by the 1000 Genomes Project (1KGP) based on information on categorization of NAT2 phenotypes.

Participants and Data Collection
Nine hundred ninety Japanese individuals (343 patients with epilepsy or bipolar disorder, 454 patients with schizophrenia, 65 patients with breast cancer, 83 patients with colorectal cancer, and 45 patients with malignant melanoma) provided informed consent for participation in this study in accordance with the Declaration of Helsinki. The study was approved by the ethics committee of National Cancer Center Research Institute, Fujita Health University Hospital, and RIKEN Center for Integrative Medical Sciences. Targeted resequencing of 100 pharmacokinetics-related genes, including NAT2, was performed as reported elsewhere (Fukunaga et al., 2020). Based on the information on NAT2 SNVs, we estimated individual NAT2 genotypes and registered this information in the NBDC Human Database. We also obtained individual genotype data for 2,504 samples from 26 ethnic populations collected by the 1KGP. These datasets consisted of high-coverage whole-genome and whole-exome sequencing data from diverse ethnic groups. Using individual genotypes in these datasets, NAT2 genotypes were determined. The individual genomes from 26 ethnic populations were divided into five major ethnic populations, i.e., Africans (AFRs), Admixed Americans (AMRs), East Asians (EASs), Europeans (EURs), and South Asians (SASs). The AFR population consisted of African Caribbeans in Barbados (ACB); Americans of African ancestry in the southwest United States

Expression of NAT2 * 4, * 5, * 6, and * 7
The cDNAs of NAT2 * 4, * 5, * 6, and * 7 were synthesized by Integrated DNA Technologies (Coralville, IA, United States) and cloned into the EcoRV site of pcDNA3.1 (+) Mammalian Expression Vectors (Thermo Fisher Scientific, Waltham, MA, United States) using an In-Fusion HD Cloning Kit (Takara, Shiga, Japan). The locations of variants in the NAT2 alleles are shown in Figure 1A.
The constructs were transformed into Escherichia coli JM109 competent cells (Takara), and the sequences of the inserts in a few colonies were then confirmed by Sanger sequencing. After obtaining the constructs carrying each allele, the constructs were cloned and purified with a Qiagen Midi Plasmid Kit (Qiagen, Valencia, CA, United States). The sequences of the clones carrying each allele were confirmed by Sanger sequencing again. The concentration and quality of DNA were determined using a NanoDrop 1000 UV-Vis Spectrophotometer (Thermo Fisher Scientific).
HEK293 cells were seeded into 10-cm collagen-coated culture dishes (IWAKI, Tokyo, Japan) in Dulbecco's modified Eagle's medium (Sigma-Aldrich, St. Louis, MO, United States) containing 10% fetal bovine serum (FBS; Sigma-Aldrich), 100 mM sodium pyruvate (Thermo Fisher Scientific), and nonessential amino acid solution (Thermo Fisher Scientific). When the cells were approximately 80% confluent, vectors carrying each allele were transfected into the cells using Lipofectamine 3000 Transfection Reagent (Thermo Fisher Scientific) according to the manufacturer's instructions. The optimal transfection efficiency and cell viability were obtained with 10 µg DNA/dish and 30 µL Lipofectamine 3000 Transfection Reagent. Forty-eight hours after transfection, cells were treated with 0.05% trypsin-ethylenediaminetetraacetic acid (EDTA; Thermo Fisher Scientific) and washed once with 100 mM potassium phosphate buffer (pH 7.4). The washed cell pellet was then lysed with occasional mixing in 1 mL of Mammalian Protein Extraction Buffer (GE Healthcare, Waukesha, WI, United States) containing EDTA-free protease inhibitor cocktail (Takara). The homogenate was centrifuged at 20,000 × g and 4 °C for 30 min. The resulting supernatant was transferred to another tube and stored at −80 °C until subsequent analysis.

Measurement of NAT2 Protein Expression Levels
Total protein concentrations of lysates were measured using a Pierce BCA protein assay kit (Thermo Fisher Scientific) according to the manufacturer's instructions. Equal amounts of protein were used for enzyme-linked immunosorbent assay (ELISA). The lysates (0.5 mg protein) were coated onto 96-well microplates for 20 h at 4 °C. After washing the plates three times by filling the wells with 200 µL phosphate-buffered saline (PBS), the remaining protein-binding sites in the coated wells were blocked by adding 200 µL of 5% FBS (Merck, Darmstadt, Germany) in PBS. After the plates were washed three times, 100 µL of primary anti-NAT2 monoclonal antibody (1:100; cat. no. sc-134399; Santa Cruz Biotechnologies, Dallas, TX, United States) was added to each well, and plates were then incubated for 1 h. After washing three times, 100 µL of HRP-Rabbit Anti-Mouse IgG (H + L) Conjugate (1:100,000; Thermo Fisher Scientific) was added to each well, and plates were incubated for 30 min. After washing three times, we added 100 µL of an ELISA POD substrate TMB kit (Nacalai Tesque, Kyoto, Japan) to each well and incubated the plates for 10 min. An equal volume of 1 M HCl was added as a stopping solution, and the optical density at 450 nm was measured using a microplate reader (ARVOmx; PerkinElmer, Waltham, MA, United States).
UPLC-MS/MS analysis was performed on a Waters ACQUITY UPLC system coupled to a TQ Detector (Waters, Milford, MA, United States). Chromatographic separation was achieved on an ACQUITY UPLC BEH Amide column (2.1 mm × 150 mm, 1.7 µm; Waters) equipped with a Vanguard pre-column (ACQUITY UPLC BEH Amide, 2.1 mm × 5 mm, 1.7 µm; Waters). The column temperature was kept at 45 °C, and samples in the autosampler were maintained at 7 °C. The mobile phases (flow rate: 0.4 mL/min) were 0.1% formic acid (FA) in water/acetonitrile (20:80, v/v), isopropanol (IPA)/0. … were performed using MassLynx 4.1 (Waters). The total run time of the analyses was 3 min. The linearity of the assay for each metabolite was confirmed using serial dilutions of the positive control sample after a 60-min incubation with NAT2 * 4. Because authentic standards were not commercially available for the N-acetyl conjugates of all substrates, relative peak areas (the ratio of the analyte signal to that of the internal standard) were used to measure metabolite levels. We determined the limit of detection (LOD) based on the analyte peaks with a signal-to-noise (S/N) ratio of 10 for each N-acetyl conjugate. The values of the S/N ratio were 145.8, 180.5, 167.6, 90.3, 10.1, 94.7, 104

RESULTS
A partial schematic diagram of NAT2 alleles is shown in Figure 1A. NAT2 genotypes in 990 Japanese individuals were estimated based on information on individual SNVs (Table 1).
The detailed genotype frequencies are shown in Supplementary Table 1. The highest and lowest frequencies of the NAT2 * 4 allele in the seven African subpopulations were 0.523 in YRI and 0.352 in ASW, respectively. Accordingly, the sums of the frequencies of the NAT2 * 5, * 6, and * 7 alleles in the YRI and ASW populations were the lowest (0.473) and highest (0.648), respectively. In the three Chinese subpopulations (CDX, CHB, and CHS), the NAT2 * 4 allele frequencies also differed markedly (0.457, 0.607, and 0.524, respectively), as they did among the African subpopulations.
Recombinant NAT2 proteins were transiently expressed in HEK293 cells, and the lysates were used for in vitro metabolic studies. As shown in Figure 1B, all NAT2 proteins were immunodetectable using an anti-NAT2 monoclonal antibody, and there were no differences in the expression levels of recombinant proteins. The catalytic activities of NAT2 * 4, * 5, * 6, and * 7 proteins were evaluated using the eight substrate drugs. Michaelis-Menten plots of the four NAT2 proteins are shown in Figure 1C, and the estimated kinetic parameters (Km, Vmax, and CLint) are summarized in Table 2. The three variant proteins, i.e., NAT2 * 5, * 6, and * 7, exhibited significantly reduced Vmax (2.20-25.09% that of NAT2 * 4) and CLint (2.62-72.40% that of NAT2 * 4) values for all drugs. Although most Km values for the variant proteins were comparable to or higher than that of NAT2 * 4, NAT2 * 7 showed significantly lower Km values in the N-acetylation of DDP (5.3% that of NAT2 * 4), SMZ (14.3% that of NAT2 * 4), and SP (15.7% that of NAT2 * 4) (Table 2). When the CLint values of NAT2 * 4 were set at 100%, the relative clearance values of NAT2 * 5, * 6, and * 7 were 12.4-30.0%, 3.9-18.7%, and 2.6-70.2%, respectively (Figures 1D-F). The P values for differences in relative clearance between the eight drugs are summarized in Supplementary Table 2. When comparing the relative clearance value of NAT2 * 7 for each drug, the values for DDP, SMZ, and SP were higher than those for AGT, HLZ, INH, PZ, and PA (Figure 1F).
Each value represents the mean ± SEM of four independent experiments. Tukey's multiple comparison test; a P < 0.05 compared with NAT2 * 4, b P < 0.05 compared with NAT2 * 5, c P < 0.05 compared with NAT2 * 6. The data were reported previously (Suvichapanich et al., 2018).
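The relationship among the kinetic parameters discussed above can be illustrated with a minimal Python sketch. It assumes the conventional definition CLint = Vmax/Km and expresses each allele's CLint as a percentage of that of NAT2 * 4. The function names and all numerical values are hypothetical and are not taken from Table 2; they were chosen only to show how a lower Km can partly offset a lower Vmax, as observed here for NAT2 * 7 with some substrates.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten velocity at substrate concentration s."""
    return vmax * s / (km + s)


def intrinsic_clearance(vmax, km):
    """In vitro intrinsic clearance, assuming CLint = Vmax / Km."""
    return vmax / km


def relative_clearance(params, reference="NAT2*4"):
    """CLint of each allele as a percentage of the reference allele."""
    ref = intrinsic_clearance(*params[reference])
    return {
        allele: 100.0 * intrinsic_clearance(vmax, km) / ref
        for allele, (vmax, km) in params.items()
    }


# Hypothetical (Vmax, Km) pairs in arbitrary units; NOT values from this study.
example_params = {
    "NAT2*4": (100.0, 50.0),  # reference allele
    "NAT2*5": (20.0, 50.0),   # lower Vmax, unchanged Km
    "NAT2*7": (10.0, 10.0),   # much lower Vmax, but also lower Km
}

example_relative = relative_clearance(example_params)
for allele, pct in example_relative.items():
    print(f"{allele}: {pct:.1f}% of reference CLint")
```

In this made-up example, the allele with the lowest Vmax retains the higher relative clearance of the two variants because its Km is also lower.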
Using the CLint value of the RA allele, i.e., NAT2 * 4, as a base of 1, we obtained activity scores for NAT2 * 5, * 6, and * 7 according to the CLint values for different drugs (a value of 0.5 indicated a 50% reduction in the CLint). The sum of the activity scores of both alleles was taken as the activity score of the NAT2 genotype for each drug (Table 3). For the classification of NAT2 genotypes into phenotypes, such as RAs, IAs, SAs, and USAs, dendrograms were generated to visualize the relationships between activity scores and genotypes by hierarchical agglomerative clustering (Figure 2A). The optimal number of clusters was three or four, according to analyses using the NbClust R package. In the current study, 10 NAT2 genotypes were classified as USAs, SAs, IAs, or RAs, based on the activity scores and results of clustering (Figure 2A and Table 3). The NAT2 * 4/ * 4 genotype was categorized into the RA category for N-acetylation of all drugs. Ten NAT2 genotypes were categorized into similar clusters for N-acetylation of AGT, HLZ, INH, PZ, and PA, whereas the numbers of clusters for AGT/INH/PA and HLZ/PZ were three and four, respectively. In the case of INH, all NAT2 genotypes were divided into three phenotypes, i.e., SAs, IAs, and RAs, consistent with previous studies (Naidoo et al., 2019; Mthiyane et al., 2020). Owing to the higher activity scores and lower Km of the NAT2 * 7 allele, the clustering patterns of 10 NAT2 genotypes for N-acetylation of DDP, SMZ, and SP were different from those of other drugs. Although USAs were observed for N-acetylation of HLZ, PZ, and SMZ, the clustering patterns of 10 NAT2 genotypes for SMZ were different from those of HLZ and PZ.
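The scoring and clustering steps described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis actually performed: the allele scores are hypothetical, the linkage criterion (average linkage) is an assumption because the linkage method is not specified here, and the number of clusters is passed in by hand instead of being selected with NbClust. The function names are likewise hypothetical.

```python
from itertools import combinations_with_replacement

# Hypothetical allele activity scores for ONE drug (NAT2*4 fixed at 1).
allele_score = {"*4": 1.0, "*5": 0.2, "*6": 0.1, "*7": 0.05}


def genotype_scores(scores):
    """Sum the two allele scores for every unordered allele pair."""
    return {
        f"{a}/{b}": scores[a] + scores[b]
        for a, b in combinations_with_replacement(sorted(scores), 2)
    }


def cluster_genotypes(scores, k):
    """Agglomerative clustering of genotypes on a single score.

    Average linkage is assumed. For one-dimensional data sorted by score,
    the average-linkage distance between neighbouring clusters equals the
    difference of their means, so only neighbours need to be compared.
    """
    clusters = [[g] for g in sorted(scores, key=scores.get)]
    while len(clusters) > k:
        means = [sum(scores[g] for g in c) / len(c) for c in clusters]
        gaps = [means[i + 1] - means[i] for i in range(len(means) - 1)]
        i = gaps.index(min(gaps))  # closest pair of neighbouring clusters
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters  # ordered from lowest to highest activity


scores_by_genotype = genotype_scores(allele_score)
three_clusters = cluster_genotypes(scores_by_genotype, 3)
for label, members in zip(["SA", "IA", "RA"], three_clusters):
    print(label, sorted(members))
```

Four alleles give 10 unordered genotypes, matching the number of genotypes classified in this study. With these made-up scores and three clusters, the genotypes separate according to the number of NAT2 * 4 alleles carried; different scores for another drug could produce a different partition.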
The genetic distribution of the predicted NAT2 phenotypes based on the activity scores of Japanese, AFR, AMR, EAS, EUR, and SAS populations in the 1KGP is presented in Figure 2B and Supplementary Figure 1. RAs for NAT2 were present at high frequencies, particularly in Japanese (48.48-62.02%) and EAS (31.15-52.38%) populations, but were present at low frequencies in EUR (6.76-7.95%) and SAS (5.93-8.79%) populations. According to the results of in vitro metabolic experiments using SMZ as a substrate, the frequencies of USAs were much higher (29.05-54.27%) than those of other drugs, except in Japanese and EAS populations (4.75 and 11.11%, respectively).
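To make the step from genotype counts to substrate-specific phenotype frequencies concrete, the following Python sketch tallies phenotype percentages for one population under two different genotype-to-phenotype maps. The genotype counts, both maps, and the function name are hypothetical and are not data from this study; the sketch only shows why the same population can yield different phenotype distributions depending on the substrate-specific categorization.

```python
from collections import Counter


def phenotype_frequencies(genotype_counts, genotype_to_phenotype):
    """Return phenotype frequencies (%) from per-genotype individual counts."""
    totals = Counter()
    for genotype, n in genotype_counts.items():
        totals[genotype_to_phenotype[genotype]] += n
    n_total = sum(totals.values())
    return {phenotype: 100.0 * n / n_total for phenotype, n in totals.items()}


# Hypothetical genotype counts for one population (NOT data from this study).
counts = {"*4/*4": 50, "*4/*6": 30, "*6/*6": 15, "*6/*7": 5}

# Hypothetical categorizations for two drugs: one with three clusters,
# one with four clusters that splits out ultra-slow acetylators (USAs).
map_three_clusters = {"*4/*4": "RA", "*4/*6": "IA", "*6/*6": "SA", "*6/*7": "SA"}
map_four_clusters = {"*4/*4": "RA", "*4/*6": "IA", "*6/*6": "USA", "*6/*7": "USA"}

freq_three = phenotype_frequencies(counts, map_three_clusters)
freq_four = phenotype_frequencies(counts, map_four_clusters)
print(freq_three)
print(freq_four)
```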

DISCUSSION
NAT2 genotypes show marked geographic and ethnic differences. In order to clarify the substrate-specific diversity of NAT2 phenotypes between different populations, we categorized NAT2 genotypes into USA, SA, IA, and RA phenotypes in 26 populations based on the results of in vitro N-acetylation of eight drugs. Our analyses revealed dramatic genetic variability between populations, including phenotypic consequences at the level of N-acetylation profiles. In particular, we observed lower frequencies of RAs, who show normal NAT2 activity, for N-acetylation of all substrates in EUR and SAS populations, suggesting that lower dosages of NAT2 substrate drugs in EUR and SAS populations may be more appropriate than a one-size-fits-all approach. Thus, our findings provide useful information for population-adjusted genotype-guided therapy for NAT2.
The worldwide distribution of NAT2 phenotype diversity, i.e., SAs, IAs, and RAs, has been reported (Walker et al., 2009; Sabbagh et al., 2011), and genetic differentiation patterns have been shown to be related to geography (Sabbagh et al., 2008). However, no reports have described the diversity of NAT2 phenotypes for USAs, which were recently identified based on combined * 6/ * 6, * 6/ * 7, and * 7/ * 7 genotypes (Selinski et al., 2013;Selinski et al., 2015;Suarez-Kurtz et al., 2016). Our study showed that genotypes could be categorized into the USA phenotype in N-acetylation of HLZ, PZ, and SMZ by hierarchical clustering analysis. In a previous work, the pharmacokinetics of oral HLZ were found to be dependent on the NAT2 genotype during pregnancy (Han et al., 2019), and SA status was shown to be associated with clinical blood pressure and 24-h blood pressure after HLZ treatment in patients with resistant hypertension (Garces-Eisele et al., 2014). Therefore, the * 6/ * 6, * 6/ * 7, and * 7/ * 7 genotypes must be clearly distinct from the SA group, and this categorization may further improve individualization of HLZ treatment.
In this study, we focused only on alleles ( * 5, * 6, and * 7) with frequencies equal to 5% or higher in 2,504 individuals of the 1KG project. Although the minor allele frequency (MAF) of rs1801279, which defines NAT2 * 14 (which confers an SA phenotype), was 2.78% in all individuals of the 1KG project, the African subpopulations show MAFs higher than 5%. Indeed, the MAFs of ACB, ASW, ESN, GWD, LWK, MSL, and YRI populations in Africa were 7.8, 7.4, 12.6, 14.6, 9.1, 7.1, and 11.6%, respectively. In the seven African subpopulations, the non-synonymous variants defining NAT2 * 22 (0.98%) and * 24 (2.34%) were also detected. Therefore, the frequency of SA in the African population in our study may be underestimated. Further studies focusing on NAT2 * 14, * 22, * 24 and other rare alleles are needed.
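The threshold argument above can be checked with a few lines of Python using the MAFs quoted in the text. The carrier-fraction calculation is an added illustration that assumes Hardy-Weinberg proportions, which are not tested here, and the variable and function names are hypothetical.

```python
# MAFs (%) of rs1801279 (NAT2*14) quoted in the text above.
maf_percent = {
    "ACB": 7.8, "ASW": 7.4, "ESN": 12.6, "GWD": 14.6,
    "LWK": 9.1, "MSL": 7.1, "YRI": 11.6,
}
overall_maf_percent = 2.78   # all 2,504 individuals
THRESHOLD_PERCENT = 5.0      # allele inclusion cutoff used in this study

above_threshold = sorted(
    pop for pop, maf in maf_percent.items() if maf >= THRESHOLD_PERCENT
)


def carrier_fraction(q):
    """Expected fraction carrying at least one minor allele.

    Assumes Hardy-Weinberg proportions (an assumption, not a result
    of this study): 1 - (1 - q)^2.
    """
    return 1.0 - (1.0 - q) ** 2


print("Overall MAF passes cutoff:", overall_maf_percent >= THRESHOLD_PERCENT)
print("Subpopulations at or above cutoff:", above_threshold)
for pop in above_threshold:
    pct = 100.0 * carrier_fraction(maf_percent[pop] / 100.0)
    print(f"{pop}: about {pct:.1f}% expected carriers")
```

An allele that fails the cutoff in the pooled sample can therefore still be carried by a sizeable fraction of individuals in specific subpopulations, which is the basis for the possible underestimation noted above.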
In previous in vitro metabolic experiments using SMZ as a substrate, NAT2 * 5, * 6, and * 7 showed lower Vmax values than that of NAT2 * 4, but only NAT2 * 7 showed higher affinity for SMZ with lower Km compared with NAT2 * 4 (Olivera et al., 2007; Garces-Eisele et al., 2014), consistent with our current study. Additionally, NAT2 * 5 and * 6 alleles result in lower N-acetyltransferase activities toward SMZ compared with NAT2 * 7 (Walraven et al., 2008), and carriers of NAT2 * 5 and * 6 were categorized as USAs in the current study. Moreover, the NAT2 SA phenotype is associated with the pharmacokinetics of a different sulfonamide drug, sulfamethoxazole, in renal transplant recipients (Kagaya et al., 2012) and with adverse reactions to sulfamethoxazole, such as toxic epidermal necrolysis, Stevens-Johnson syndrome, and increased serum alanine aminotransferase levels in patients with systemic lupus erythematosus (Soejima et al., 2007). For SMZ and structurally related drugs, such as sulfamethoxazole, the categorization of carriers of NAT2 * 5 and * 6 alleles as USAs may be useful for genotype-guided dosing.
In summary, we defined NAT2 phenotypes based on the activity score for each drug and determined the worldwide distribution of NAT2 phenotype diversity according to this new categorization method. Because limited information on NAT2 genotypes has been published, our current frequency data for the large-scale Japanese population and 26 different populations collected by the 1KGP should be valuable. Our findings will therefore be useful for future studies aiming to predict responses to drugs primarily metabolized by NAT2. To verify the present findings, case-control association studies that predict the risk of adverse drug reactions and drug responses should be conducted. Once the benefit of using NAT2 phenotype information specific to each drug and population has been verified, NAT2 phenotypes can be implemented as pharmacogenomics biomarkers.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of National Cancer Center Research Institute, Fujita Health University Hospital, and RIKEN Center for Integrative Medical Sciences. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
KF and TM conceived the study and designed the experiments. KK, TO, TY, TS, HZ, MI, and NI supplied all the genomic DNA samples. KF performed the experiments, analyzed the data, and wrote the manuscript. All authors reviewed the manuscript and approved the final version to be published.