Genetic association and transferability for urinary albumin-creatinine ratio as a marker of kidney disease in four Sub-Saharan African populations and non-continental individuals of African ancestry

Background: Genome-wide association studies (GWAS) have predominantly focused on populations of European and Asian ancestry, limiting our understanding of genetic factors influencing kidney disease in Sub-Saharan African (SSA) populations. This study presents the largest GWAS for urinary albumin-to-creatinine ratio (UACR) in SSA individuals, including 8,970 participants living in different African regions and an additional 9,705 non-resident individuals of African ancestry from the UK Biobank and African American cohorts.
Methods: Urine biomarkers and genotype data were obtained from two SSA cohorts (AWI-Gen and ARK), and two non-resident African-ancestry studies (UK Biobank and CKD-Gen Consortium). Association testing and meta-analyses were conducted, with subsequent fine-mapping, conditional analyses, and replication studies. Polygenic scores (PGS) were assessed for transferability across populations.
Results: Two genome-wide significant (P < 5 × 10−8) UACR-associated loci were identified: one in the BMP6 region on chromosome 6, in the meta-analysis of resident African individuals, and another in the HBB region on chromosome 11, in the meta-analysis of non-resident SSA individuals as well as in the combined meta-analysis of all studies. Replication of previous significant results confirmed associations in known UACR-associated regions, including THBS3, GATM, and ARL15. PGS estimated using previous studies from European ancestry, African ancestry, and multi-ancestry cohorts exhibited limited transferability across populations, with less than 1% of observed variance explained.
Conclusion: This study contributes novel insights into the genetic architecture of kidney disease in SSA populations, emphasizing the need for conducting genetic research in diverse cohorts. The identified loci provide a foundation for future investigations into the genetic susceptibility to chronic kidney disease in underrepresented African populations. Additionally, there is a need to develop integrated scores using multi-omics data and risk factors specific to the African context to improve the accuracy of predicting disease outcomes.


Introduction
Chronic kidney disease (CKD) is a leading risk factor for years of life lost and premature mortality, with a 41.5% relative increase in mortality worldwide from 1990 to 2017 (GBD Chronic Kidney Disease Collaboration et al., 2020; Kovesdy, 2022). The estimated global prevalence of CKD is 9.1% and, while predicted to be higher in Sub-Saharan Africa (SSA), the true prevalence and associated risk factors remain understudied (Kaze et al., 2018; GBD Chronic Kidney Disease Collaboration et al., 2020). The Africa Wits-INDEPTH partnership for Genomic Studies (AWI-Gen) cohort, which included ~12,000 participants from four SSA countries in West, East, and Southern Africa, reported overall CKD prevalence as 10.7% (95% confidence interval [CI]: 9.9-11.7), with notable geographic regional differences. The most important risk factors for CKD in SSA were older age, female sex, diabetes, hypertension, and human immunodeficiency virus (HIV) infection (George et al., 2019).
While the majority of kidney disease-associated risk loci have been identified in studies on participants of European and East Asian ancestry, and the African diaspora (Lee et al., 2018), few have focused on participants living in SSA (Böger et al., 2011; Pattaro et al., 2012; Lin et al., 2019; Morris et al., 2019). Recently, a study of genetic associations of estimated glomerular filtration rate (eGFR) in a Ugandan population-based cohort (Fatumo et al., 2020) replicated the association between eGFR and the GATM locus.
Replication and transferability of GWAS signals across populations of different ancestries, and specifically with African ancestry populations, tend to be poor despite regional replication often identifying shared associated genomic regions (Pattaro et al., 2012). This may be due to differences in linkage disequilibrium (LD) with the causal variant, allele frequency differences between the populations, underlying population structure, and variability in environmental exposures. African populations, with their great genetic diversity and deep evolutionary roots, represent an opportunity for genetic discovery to identify and fine-map disease-associated risk variants (Gomez et al., 2014; Pereira et al., 2021).
Polygenic scores (PGS) are used to quantify and stratify populations according to genetic risk. A PGS based on 63 eGFR-associated alleles showed significant association with kidney disease-related phenotypes, such as chronic kidney failure and hypertensive kidney disease, in the Million Veteran Study (US) on 192,868 white and non-Hispanic individuals (Hellwege et al., 2019). A PGS based on 64 UACR-associated alleles was significantly associated with CKD (Teumer et al., 2019). Further analysis revealed positive associations of the PGS with an increased risk of hypertension (HT) and diabetes. However, PGS often translate poorly across different ancestries (Martin et al., 2017; Kamiza et al., 2022; Kachuri et al., 2023). Since most published GWAS for kidney disease and kidney function markers are based on European ancestry populations, the predictive accuracy of models developed from these studies is expected to be significantly diminished for African populations (Adam et al., 2022; Choudhury et al., 2022; Kamiza et al., 2023; Majara et al., 2023).
In this study, we present a GWAS for UACR conducted within resident Sub-Saharan African individuals. This population cross-sectional study includes 8,970 individuals from four SSA countries from the AWI-Gen study (Ali et al., 2018) and the African Research on Kidney Disease (ARK) study (Kalyesubula et al., 2020), together with 9,705 individuals of African ancestry from the UK Biobank (UKB) and African American participants from the CKD-Gen Consortium (Teumer et al., 2019). The primary objectives are to: (1) identify genetic loci associated with UACR as a marker of kidney disease in individuals from SSA and of African ancestry; (2) explore the replication of findings identified in previous GWAS; and (3) compare PGS derived from non-African and multi-ancestry population studies and evaluate their transferability to African populations.

Study participants
Africa Wits-INDEPTH partnership for genomic research (AWI-Gen)
The study participants are a subset of the population cross-sectional AWI-Gen study (Ramsay et al., 2016; Ali et al., 2018). The study recruited adults primarily between the ages of 40 and 80 years from six SSA study sites in West Africa (Nanoro, Burkina Faso and Navrongo, Ghana), East Africa (Nairobi, Kenya) and South Africa (Bushbuckridge, hereinafter referred to as Agincourt, Mpumalanga Province; Dikgale, Limpopo Province; and Soweto, Gauteng). All participants were of self-identified black ethnicity. Data collection was described in detail previously (Ali et al., 2018; George et al., 2019). Detailed demographic data, health-related questionnaire data, and anthropometric measurements were collected. Peripheral blood samples and urine samples were collected for biomarker assays (the relevant assays are described below). DNA was extracted from peripheral blood-derived buffy coat samples and used for genotyping. Urine albumin was measured using a colorimetric method on the Cobas© 6000/c501 analyzer, and urine creatinine was measured by the modified Jaffe method (Craik et al., 2023). This study was approved by the Human Research Ethics Committee (Medical), University of the Witwatersrand, South Africa (M121029, M170880) and the ethics committees of all participating institutions. All participants provided written informed consent following community engagement and individual consenting processes.

African research on kidney disease (ARK)
The African Research on Kidney Disease (ARK) study is a well-characterised population-based cohort study of 2,021 adults (20-80 years) of self-identified black ethnicity from Agincourt (Mpumalanga, South Africa), with demographic data, health-related questionnaire data, and anthropometric measurements collected at enrolment (Fabian et al., 2022). Blood and urine were collected for biomarker assays (the relevant assays are described below). DNA was extracted from buffy coat samples and used for genotyping. Urine albumin was measured using a colorimetric method on the Cobas© 6000/c501 analyzer, and urine creatinine was measured by the modified Jaffe method (Craik et al., 2023). This study was approved by the Human Research Ethics Committee (Medical), University of the Witwatersrand, South Africa (M160939). All participants provided written informed consent following community engagement and individual consenting processes. The geographical area of recruitment overlaps with the Agincourt sub-cohort of AWI-Gen, but there is no overlap in participants.

UK-Biobank (UKB)
Individuals of self-reported Caribbean and African ancestry from the UKB were identified for this study. Of this subset of UKB individuals, those with both genotyping and UACR data were retained for the analysis. UACR was derived using urinary levels of albumin and creatinine. In the UKB, albumin was measured using the immuno-turbidimetric analysis method (Randox Biosciences, UK), while creatinine was measured using the enzymatic analysis method (Beckman Coulter, UK) (Casanova et al., 2019).

Phenotype generation and harmonization
UACR was calculated for the AWI-Gen and ARK studies using urinary levels of albumin and creatinine as previously described (George et al., 2019; Fabian et al., 2022). Participants with missing values for albumin and creatinine were excluded from this study. We applied filtering criteria similar to those employed by the CKD-Gen consortium (Köttgen and Pattaro, 2020). Where the values for urine albumin or urine creatinine fell outside the lower and upper limits of detection, they were replaced with the respective limit: for AWI-Gen and ARK, the range was 3-400 mmol/L for urine creatinine and 3.75-475 mg/L for urine albumin. For the UKB dataset, the upper limit was 6.7 mg/L for urine albumin. Albuminuria was defined as UACR > 3.0 mg/mmol.
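The harmonization rule above can be illustrated with a minimal sketch. It assumes UACR is the simple ratio of urine albumin (mg/L) to urine creatinine (mmol/L), giving mg/mmol, and uses the AWI-Gen and ARK detection limits quoted in the text; the function and constant names are illustrative and are not taken from the study's own scripts.

```python
# Detection limits quoted in the text for AWI-Gen and ARK.
ALBUMIN_LIMITS_MG_L = (3.75, 475.0)
CREATININE_LIMITS_MMOL_L = (3.0, 400.0)
ALBUMINURIA_THRESHOLD = 3.0  # mg/mmol, as defined in the text


def clamp(value, lower, upper):
    """Replace values outside [lower, upper] with the nearest limit."""
    return max(lower, min(upper, value))


def uacr_mg_per_mmol(albumin_mg_l, creatinine_mmol_l):
    """Clamp both analytes to their detection limits, then take the ratio."""
    albumin = clamp(albumin_mg_l, *ALBUMIN_LIMITS_MG_L)
    creatinine = clamp(creatinine_mmol_l, *CREATININE_LIMITS_MMOL_L)
    return albumin / creatinine


def has_albuminuria(uacr):
    """Albuminuria is defined as UACR strictly above 3.0 mg/mmol."""
    return uacr > ALBUMINURIA_THRESHOLD
```

For example, an albumin reading of 1.0 mg/L is below the lower limit and is treated as 3.75 mg/L before the ratio is taken.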

UK-Biobank
Genotyping was performed by Affymetrix on two closely related purpose-designed arrays. Approximately 50,000 participants were genotyped using the UK BiLEVE Axiom array (Resource 149600) and the remaining ~450,000 were genotyped using the UK Biobank Axiom array (Resource 149601). The dataset is a combination of results from both arrays. A total of 805,426 markers were released in the genotype data. We extracted individuals with self-reported African ancestry (Data-Field 21000), split between African (UKB-African) and Caribbean (UKB-Caribbean) origins, from the raw dataset (Casanova et al., 2019).
In our final QC step, we identified and excluded outliers, admixed individuals and related individuals using smartpca, a program in the EIGENSOFT package (Price et al., 2006); the Admixture software (Alexander et al., 2009) with AGV (Gurdasani et al., 2015) and 1000 Genomes Project data (Auton et al., 2015); and PLINK (Version 1.9) (Purcell et al., 2007; Chang et al., 2015). More detail on the filter parameters for each software can be found in Supplementary Table S1.

Imputation
Genotype imputation was performed on each dataset separately (AWI-Gen, ARK, and UKB) using the Sanger Imputation Server with the African Genome Resources reference panel (https://www.sanger.ac.uk/tool/sanger-imputation-service/). EAGLE2 was used for pre-phasing and the PBWT algorithm was used for imputation (Loh et al., 2016). After imputation, poorly imputed SNPs with info scores less than 0.3 and with a HWE p-value less than 1 × 10−04 were removed. The genomic positions were mapped to GRCh37p11.

Phenotype transformation for association testing
For the AWI-Gen, ARK, and UKB datasets, UACR was transformed to the natural logarithm scale. A linear regression was fitted with covariates in R (Version 3.6): ln(UACR) ~ age + sex + genetic principal components (PCs) 1-5. Residuals were extracted and transformed using a rank-based inverse normal transformation to ensure that they were normally distributed (Casanova et al., 2019). PCs were calculated in PLINK (Version 1.9) (Purcell et al., 2007; Chang et al., 2015) using a subset of LD-pruned pre-imputed SNPs. The subset was derived by LD pruning in PLINK with an LD (r2) threshold of 0.2, windows of 50 kb and a step size of 10 kb.
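As an illustration of the final step, the sketch below applies a rank-based inverse normal transformation to a list of regression residuals. It assumes the Blom offset (c = 3/8) and average ranks for ties; the text does not state which offset or tie rule was used, so both are assumptions, and the function name is illustrative.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=3.0 / 8.0):
    """Map values to normal quantiles via their ranks.

    Uses the quantile (rank - c) / (n - 2c + 1); c = 3/8 is the Blom
    offset (an assumption here). Tied values share their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]
```

The output preserves the ordering of the residuals and is symmetric around zero, which is what the association model downstream relies on.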

Association testing
Mixed model association testing was performed with imputed genotype probabilities using GEMMA (Version 0.98.1) (Zhou and Stephens, 2012). GEMMA uses a relatedness matrix to account for genetic structure and relatedness between individuals. The relatedness matrix was built with the subset of pre-imputed SNPs described above.
Mixed model association testing was performed independently on each dataset. A total of nine datasets were tested: six for AWI-Gen (AWI-Agincourt, AWI-Dikgale, AWI-Nanoro, AWI-Nairobi, AWI-Navrongo and AWI-Soweto), one for ARK (ARK-Agincourt), and two for the UK Biobank (UKB-Caribbean and UKB-African). For each dataset, quantile-quantile plots (QQ-plots) were generated, and inflation factors were calculated using SNPs with MAF > 0.01 to verify that the association signals were not inflated by unaccounted population sub-structure. The genome-wide significance level for novel discovery was set at P < 5 × 10−08.
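The inflation check can be sketched as follows. The snippet assumes the common definition of the genomic inflation factor (lambda) as the median of the observed 1-df chi-square statistics divided by the expected median under the null; the text does not spell out the formula used, so this definition and the function name are assumptions.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Estimate lambda from two-sided p-values (each must be in (0, 1])."""
    std_normal = NormalDist()
    # A two-sided p-value p corresponds to z = Phi^-1(p / 2); z^2 is chi-square(1).
    chi2 = [std_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Uniformly distributed p-values give a lambda close to 1, while systematically small p-values push it above 1.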
Briefly, CKD-Gen-AA is a meta-analysis based on 7 studies with African American participants. For each study, genotyping was performed using genome-wide arrays, followed by application of study-specific quality filters prior to phasing, imputation, and association analysis [a description of the software can be found in Supplementary Tables 1, 2 from (Teumer et al., 2019)]. Meta-analysis was performed using fixed-effects inverse-variance weighted meta-analysis of the study-specific GWAS result files with imputation quality (IQ) score > 0.6, MAC > 10, effective sample size ≥ 100, and a beta < 10, using METAL [for more details see (Teumer et al., 2019)].

Meta-analysis
Fixed-effect meta-analyses were conducted using the METASOFT software (Han and Eskin, 2011). The first meta-analysis (Meta SSA) used the GWAS summary statistics generated from individual-level data from resident SSA populations. This included AWI-Agincourt, AWI-Dikgale, AWI-Nanoro, AWI-Nairobi, AWI-Navrongo, AWI-Soweto and ARK-Agincourt. The second meta-analysis (Meta NONRES) included data from individuals of African ancestry who are not residing in SSA. We used the GWAS summary statistics generated from individual-level data from the UK Biobank (UKB-African and UKB-Caribbean) and the CKD-Gen African American subset (CKD-Gen-AA). The third meta-analysis (Meta ALL) pooled the summary statistics of all studies: AWI-Agincourt, AWI-Dikgale, AWI-Nanoro, AWI-Nairobi, AWI-Navrongo, AWI-Soweto, ARK-Agincourt, UKB-African, UKB-Caribbean and CKD-Gen-AA. Figure 1 outlines the meta-analysis workflow. As a secondary analysis, heterogeneity between cohorts from different regions of origin was investigated by performing separate meta-analyses for residents of Southern Africa (AWI-Agincourt, AWI-Dikgale, AWI-Soweto, and ARK-Agincourt) and residents of West Africa (AWI-Nanoro, AWI-Navrongo). To take into account potential heterogeneity between study sites, we also performed a random-effects meta-analysis of all datasets (Meta ALL RE) using the random-effects model from METASOFT (Han and Eskin, 2011) (Borenstein et al., 2010; Nikolakopoulou et al., 2014). The genome-wide significance level for novel discovery was set at P < 5 × 10−08.
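For readers unfamiliar with fixed-effect pooling, the following minimal sketch shows the standard inverse-variance weighted estimator for a single SNP, together with Cochran's Q as a simple heterogeneity statistic. It is a textbook illustration written for this purpose, not the METASOFT implementation, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def fixed_effect_meta(betas, ses):
    """Pool per-study effect sizes with inverse-variance weights.

    Returns (pooled beta, pooled SE, z-score, two-sided p-value, Cochran's Q).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = sqrt(1.0 / total)
    z = beta / se
    p = 2.0 * NormalDist().cdf(-abs(z))
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return beta, se, z, p, q
```

Two studies with effects 0.1 and 0.3 and equal standard errors of 0.1 pool to an effect of 0.2 with a standard error of about 0.071.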

Post association analysis
Plotting
QQ-plots and Manhattan plots were generated using the FastMan library (Paria et al., 2022) (available at https://github.com/kaustubhad/fastman) and the Hudson library (available at https://github.com/anastasia-lucas/hudson). These visualizations were created using SNPs with a MAF threshold of 0.01 or more.
FIGURE 1 Study design showing data sources, the analysis strategy and post-GWAS analysis approach.
Frontiers in Genetics (frontiersin.org), Brandenburg et al., 10.3389/fgene.2024.1372042
For regional plots, we utilized the standalone version of the LocusZoom software (Pruim et al., 2010).

Genetic LD reference
For conditional and joint (COJO) analysis, clumping, and fine-mapping, three LD reference panels were constructed using genotype data from the appropriate datasets. For resident SSA dataset comparisons, the LD reference panel (LD SSA) was constructed using AWI-Gen and ARK individual-level genotype data. For non-resident SSA dataset comparisons, the LD reference panel (LD NONRES) was constructed using UKB individual-level genotype data. For the combined datasets comparison, the LD reference panel (LD ALL) was constructed using AWI-Gen, ARK, and UKB individual-level genotype data.

Fine-mapping and lead SNPs
For each locus with a lead SNP with a p-value below 5 × 10−08, fine-mapping was conducted using the H3ABioNet H3AGWAS pipeline, implementing a stepwise model selection procedure through GCTA (Yang et al., 2011; 2012; Brandenburg et al., 2022a) to identify independently associated SNPs. Subsequently, we used the FINEMAP software (Version 1.4) (Benner et al., 2016), assuming one causal variant and using a stochastic approach, to define the credible set with 99% confidence.
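The idea of a credible set can be sketched in a few lines. Assuming a single causal variant, so that the per-SNP posterior probabilities in a locus sum to 1, the set is the smallest group of top-ranked SNPs whose cumulative posterior reaches the target coverage. The snippet only illustrates this definition; it is not FINEMAP code, and the SNP labels in the usage note are hypothetical.

```python
def credible_set(posteriors, coverage=0.99):
    """Return SNP IDs, highest posterior first, until coverage is reached.

    `posteriors` maps SNP ID to posterior probability of being causal.
    """
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    selected = []
    cumulative = 0.0
    for snp, prob in ranked:
        selected.append(snp)
        cumulative += prob
        if cumulative >= coverage:
            break
    return selected
```

With hypothetical posteriors of 0.90, 0.095 and 0.005 for three SNPs, the 99% credible set contains the first two.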

Conditional analyses (GCTA)
Conditional analyses used the GCTA software implemented within the H3AGWAS pipeline, with summary statistics obtained from the meta-analyses as input. In these analyses, the lead SNPs identified in each meta-analysis were conditioned upon lead SNPs found in previously published studies. A change in the p-value of the lead SNP after conditioning, whether an increase or a decrease in significance, was taken to confirm a relationship between the two SNPs.

Replication of previous findings
Replication was performed according to the following criteria:
1) Exact replication: a genome-wide significant lead SNP found in CKD-Gen-EA or CKD-Gen-MA was considered replicated if it reached statistical significance (p < 0.05) in Meta SSA, Meta NONRES or Meta ALL after Bonferroni correction and had the same direction of effect. A total of 60 independent lead SNPs were identified in the CKD-Gen datasets, of which 55 lead SNPs were from CKD-Gen-EA and 57 lead SNPs were from CKD-Gen-MA.
2) LD window replication: for a given genome-wide significant SNP found in the CKD-Gen datasets, SNPs in LD with that CKD-Gen lead SNP were extracted from Meta SSA, Meta NONRES and Meta ALL. LD pruning used the clump procedure in PLINK (Version 1.9) (r2 = 0.1, window size 1000 kb, P1 = 5 × 10−08, P2 = 0.1). The lowest p-value(s) from SNPs within the given LD window were extracted, and the LD window was considered statistically significant if the p-value was less than 5 × 10−04 in both datasets. Additionally, the direction of effect between the CKD-Gen and Meta-datasets (Meta SSA, Meta NONRES and Meta ALL) had to be consistent. Conditional analyses were performed between the genome-wide significant SNP(s) in CKD-Gen and the lead SNP in our meta-analyses to confirm the replication.
For replication, the findings from Meta SSA were compared to CKD-Gen-MA and CKD-Gen-EA, and the findings from Meta NONRES and Meta ALL were only compared to CKD-Gen-EA to avoid sample overlaps within the CKD-Gen datasets (as CKD-Gen-AA is contained within CKD-Gen-MA).
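The two replication criteria can be written as simple predicates. The sketch below assumes that effect alleles have already been harmonized between the discovery and target datasets, and that the Bonferroni correction multiplies the target p-value by the number of lead SNPs tested (for example, 55 for CKD-Gen-EA); the exact number of tests used for the correction is an assumption, and the function names are illustrative.

```python
def exact_replication(discovery_beta, target_beta, target_p, n_tests, alpha=0.05):
    """Criterion 1: same direction and Bonferroni-adjusted p below alpha."""
    same_direction = discovery_beta * target_beta > 0
    adjusted_p = min(1.0, target_p * n_tests)
    return same_direction and adjusted_p < alpha


def window_replication(discovery_beta, discovery_p, target_beta, target_p,
                       threshold=5e-4):
    """Criterion 2: best SNP in the LD window passes the threshold in both
    datasets, with a consistent direction of effect."""
    same_direction = discovery_beta * target_beta > 0
    return same_direction and discovery_p < threshold and target_p < threshold
```

Under these assumptions, a lead SNP with a target p-value of 0.0005 tested among 55 SNPs has an adjusted p-value of 0.0275 and would count as an exact replication if the effect directions agree.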

Annotation and expression quantitative trait locus (eQTL) analysis
Functional annotation of genome-wide significant SNPs found in Meta SSA, Meta NONRES and/or Meta ALL was done using the ANNOVAR software (Wang et al., 2010). eQTL analysis was performed using the database of cis-eQTLs in both glomerular and tubulointerstitial tissues, derived from participants in the Nephrotic Syndrome Study Network (NEPTUNE), using SNPs with false discovery rate (FDR) < 0.05 (Han et al., 2023). In this analysis, a 1000 kb window was defined around each genome-wide significant locus, and an eQTL was considered significant if the LD (r2) between the lead SNP and the significant eQTL was ≥ 0.01. LD was computed using genetic data from the African populations of the 1000 Genomes Project (v5a, hg19) (Auton et al., 2015; Sudmant et al., 2015).

Polygenic scores
PGS were computed for each dataset independently (AWI-Agincourt, AWI-Dikgale, AWI-Nanoro, AWI-Nairobi, AWI-Navrongo, AWI-Soweto, ARK-Agincourt, UKB-African, and UKB-Caribbean). The effect sizes from three previous studies were used: CKD-Gen-AA, CKD-Gen-MA, and CKD-Gen-EA. PRS-CS (Ge et al., 2019), software that estimates posterior SNP effect sizes by implementing continuous shrinkage (CS) priors, was used to calculate the PGS. As external LD references are required for this analysis, the African LD data derived from the 1000 Genomes Project by the PRScs project was used for this purpose (accessible at https://github.com/getian107/PRScs). To assess the performance of the PGS, the PGS values were regressed against the residualized UACR value in a linear regression model that adjusted for age, sex, and the first five principal components.
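The scoring and evaluation steps can be sketched as follows. The first function computes a PGS as the weighted sum of effect-allele dosages; the second reports the squared Pearson correlation between PGS and phenotype, which equals the variance explained by a simple one-predictor regression. This is a simplification: the model described above also adjusts for age, sex and five PCs, and the weights would come from PRS-CS. The SNP labels in the test are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Sum of effect-allele dosage times weight over SNPs present in both."""
    return sum(dosages[snp] * w for snp, w in weights.items() if snp in dosages)


def variance_explained(scores, phenotype):
    """Squared Pearson correlation between PGS values and the phenotype."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_y = sum(phenotype) / n
    cov = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, phenotype))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_y = sum((y - mean_y) ** 2 for y in phenotype)
    return cov ** 2 / (var_s * var_y)
```

SNPs that are missing from an individual's genotype data are skipped rather than imputed, which is a design choice of this sketch rather than something stated in the text.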

Study participants and phenotype data
Genomic and phenotypic data were accessible for 7,959 individuals in the AWI-Gen datasets, 1,011 individuals in the ARK dataset, and 2,916 individuals in the UK Biobank dataset, with 1,205 individuals and 1,711 individuals in UKB-African and UKB-Caribbean respectively (Supplementary Figure S1). CKD-Gen-AA was a meta-analysis of 7 studies including 6,795 individuals in total. Overall, there was a higher prevalence of albuminuria (17.9%; median UACR 1.01 mg/mmol) among individuals from the UKB with African and Caribbean ancestry compared to individuals residing in SSA, where notable regional differences were observed. The highest prevalence of albuminuria occurred in AWI-Agincourt, South Africa (14.1%; median UACR 0.59 mg/mmol), while the lowest prevalence occurred in AWI-Nanoro, West Africa (4.5%; median UACR 0.35 mg/mmol) (Table 1).

Meta-analysis
Meta-analyses were conducted to investigate the genetics of UACR in resident Sub-Saharan African datasets (Meta SSA ) (Figure 2A), nonresident Sub-Saharan African datasets (Meta NONRES ) (Figure 2B) and all African ancestry datasets (Meta ALL ) (Figure 2C).
No genomic inflation was observed for the individual-dataset association testing performed on the 9 datasets. All genomic inflation factors (lambda) were below 1.1. This was visually confirmed on the dataset-specific QQ-plots and Manhattan plots (Supplementary Figure S2; Supplementary Figures S3A-I). Dataset-specific significant findings are reported in Supplementary Table S2; Supplementary Figures S4A-C.
One genome-wide significant locus with the lead SNP rs9505286 (p = 4.3 × 10−08) was identified in Meta SSA on chromosome 6 (Figure 3A). One genome-wide significant locus with the lead SNP rs73404549 was identified on chromosome 11 in Meta NONRES (p = 5.6 × 10−11) and Meta ALL (p = 7.7 × 10−13) (Table 2; Figures 3B, C). SNP rs9505286 (chr6:7820353) is located in the intronic region of BMP6. Two SNPs were identified in the 95% credible set using FINEMAP (Figure 3A; Supplementary Table S4). eQTLs in the region were found to be associated with the expression of two genes, RREB1 and BMP6 (Table 2; Supplementary Figure S5; Supplementary Table S3; Supplementary Table S5).
SNP rs73404549 (chr11:5320654) is located near the HBE1, OR51B4, and HBB genes. This signal is primarily driven by results from West African ancestry datasets in Meta NONRES and Meta ALL (Figures 3B, C; Supplementary Figure S6). Notably, this SNP is monomorphic in the Southern African and East African datasets. Furthermore, rs73404549 is in LD with rs334 (r2 = 0.52; 72,422 bp apart), the SNP that defines the sickle cell mutation (HbS). SNP rs334 was also significant in Meta ALL (P ALL = 8.55 × 10−9).
In the 1000 kb window around rs73404549, SNPs colocalized with gene expression of TRIM6 and STIM1 in glomerular and tubulointerstitial tissues (Table 2; Supplementary Table S5). Seven and two SNPs were identified in the 99% credible set using FINEMAP in the Meta NONRES and Meta ALL results, respectively (Supplementary Table S3; Supplementary Figure S6).

Replication of previous findings
Replication analysis confirmed associations in three previously identified regions: THBS3, SPATA5L1/GATM, and ARL15 (Supplementary Table S3).
In the THBS3 region, rs370545 was the lead SNP in the Meta ALL meta-analysis, with a p-value of 1 × 10−04. However, a conditional analysis using rs2974937 (lead SNP in CKD-Gen-EA) resulted in a decrease in significance level (P conditional_analysis = 0.85). This suggests that the association in the THBS3 region was driven by rs2974937 in Meta ALL even though it was not the lead SNP in this region (Supplementary Table S4; Supplementary Figure S7).
In the ARL15 region, a statistically significant association signal was observed in Meta SSA (rs1664781, p = 1.8 × 10−04). Conditional analysis using rs1694068 (lead SNP in CKD-Gen-EA) revealed a reduction in significance for rs1664781 (P conditional_analysis = 0.87), suggesting that rs1694068 and rs1664781 are in LD, thus confirming the association in this region (Supplementary Table S4; Supplementary Figure S8).
FIGURE 2 Manhattan plots of the UACR GWAS in the (A) Meta SSA, (B) Meta NONRES and (C) Meta ALL datasets using the fixed-effect model. Lead genome-wide significant SNPs (P < 5 × 10−08) and gene annotations are highlighted.
In the SPATA5L1/GATM region, the Meta ALL meta-analysis identified rs1694067 as the lead SNP, with a p-value of 7.0 × 10−05. Furthermore, the lead SNP rs1153847 identified in CKD-Gen-EA was present in our dataset, and its association was replicated (P Bonferroni_adjusted = 0.04). For the window-based replication, a conditional analysis using rs2467858 (genome-wide significant SNP in CKD-Gen-EA) reduced the significance of rs1694067 (P conditional_analysis = 0.87), confirming that rs1694067 and rs2467858 are in LD and that the CKD-Gen signal was replicated (Supplementary Table S4; Supplementary Figure S9).

Polygenic score analyses
The variance in UACR residuals explained by the PGS was between 0% and 0.82%. PGS constructed using the betas from CKD-Gen-EA and CKD-Gen-MA performed better for the non-SSA resident datasets, particularly UKB-African, which showed the best predictive performance (% variance: 0.82, p = 1 × 10−04) and a statistically significant correlation between the PGS and the UACR residual (Figure 4; Supplementary Table S6).

Discussion
This study is the first GWAS for UACR conducted in Sub-Saharan African populations. Two genomic regions were identified to be significantly associated with UACR among 8,970 participants from West, East, and Southern Africa and among 9,705 non-resident African-ancestry participants from the UK Biobank and CKD-Gen Consortium.
For the first locus, the SNP rs9505286 reached genome-wide significance in resident African individuals (Meta SSA) and is located in the intronic region of BMP6. eQTLs in LD with rs9505286 were found to be associated with expression of two genes, namely bone morphogenetic protein 6 (BMP6) and ras-responsive element binding protein 1 (RREB1). Both genes are plausibly linked with kidney disease. BMP6 encodes a secreted ligand of the transforming growth factor (TGF-beta) superfamily of proteins, of which TGF-β1 is one of the most important regulators of kidney fibrosis, the pathological hallmark of irreversible loss of kidney function in CKD (Dendooven et al., 2011; Jenkins and Fraser, 2011). TGF-β1 is highly expressed in various fibrotic kidney diseases, including diabetic nephropathy (DN), hypertensive nephropathy, obstructive kidney disease, autosomal dominant polycystic kidney disease, immunoglobulin A nephropathy, crescentic glomerulonephritis, and focal segmental glomerulosclerosis. Because of its pivotal role in mediating kidney fibrosis, TGF-β1 is a potential target for drug discovery, and these results point towards similar potential in African populations for further exploration. RREB1, initially identified as a repressor of the angiotensinogen gene, is associated with type 2 diabetes in African Americans with end stage kidney disease (Bonomo et al., 2014). RREB1 polymorphisms have been shown to interact with APOL1, and are implicated in fat distribution and fasting glucose, a potential explanation for the association with type 2 diabetes. As obesity and type 2 diabetes prevalence emerge in many African communities undergoing rapid sociodemographic transition, these findings must inform future work (Bonomo et al., 2014). Unfortunately, neither of the eQTLs has strong LD support with the lead SNP rs9505286 (see Supplementary Table S5).
FIGURE 3 Regional plot using LocusZoom of genome-wide significant SNPs found in meta-analyses using the fixed effect model, (A) rs9505286 from the result of Meta SSA, (B) rs9966824 from the result of Meta NONRES, (C) rs9966824 from the result of Meta ALL.
Variability of kidney function, confounding factors and allele frequency differences between datasets may explain why the rs9505286 signal was not replicated in Meta ALL or Meta NONRES (Marigorta et al., 2018).
For the second locus, the SNP rs73404549 was found to be statistically significant in non-resident individuals with African ancestry (Meta NONRES) and overall (Meta ALL), but not in Sub-Saharan African individuals (Meta SSA). This can be explained by the fact that the variant allele of rs73404549 is extremely rare or absent in East and South African populations. This SNP was found to be in LD with rs334, the sickle cell trait mutation (HbS) in the HBB gene. The HbS mutation has been linked to malaria resistance among heterozygotes, with differences in allele frequency attributed to variations in selection pressures between Bantu-speaking populations in West and South/East Africa (Gurdasani et al., 2019; Choudhury et al., 2020). Notably, sickle cell trait and rs334 have been associated with various kidney function (eGFR) and kidney disease traits, including albuminuria, and chronic and end-stage kidney disease in African, African American and US Hispanic/Latino populations (Naik et al., 2014; Gurdasani et al., 2019; Fatumo et al., 2020; Masimango et al., 2022). Furthermore, an interaction between APOL1 high-risk genotypes and the sickle cell trait enhances the risk for low eGFR (Masimango et al., 2022).
In addition to the HBB region, our GWAS revealed transferability of three previously identified signals. Of the 60 UACR-associated loci identified in European and multi-ancestry studies, only three were replicated, including variants in GATM. This region was also associated with eGFR in a Ugandan population (Fatumo et al., 2020). We also replicated the association with ARL15 in the region of chromosome 1. ARL15 is a regulator of Mg2+ transport, promoting the complex N-glycosylation of cyclin M proteins (CNNM 1-4), and could play a role in the pathogenesis of hypertension mediated via altered tubular handling of magnesium in the kidney (Zolotarov et al., 2021).
Allelic heterogeneity is high in African ancestry populations, as demonstrated by the high genetic diversity in our study (Supplementary Figure S1). However, analysis of regional subgroups using meta-analysis (residents of South, West, or East Africa) did not reveal significant population-specific signatures (p < 5 × 10−8), likely due to small sample sizes within these subgroups (Supplementary Figures S3I, 10A, B). Interestingly, meta-analysis under a random-effects model that allows for heterogeneity in allelic effects between regions (Meta ALL RE) did not improve the detection of specific signals already observed with the fixed-effects methods for HBB (Supplementary Figure S11). Consequently, the heterogeneity observed might be explained primarily by variations in LD or environmental factors rather than by the effect of a specific allele, such as the presence or absence of sickle cell trait (Nikolakopoulou et al., 2014; Kuchenbaecker et al., 2019; Choudhury et al., 2020).
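To make the contrast between the two meta-analysis models concrete, the following is a minimal Python sketch for a single variant. It assumes inverse-variance weighting for the fixed-effects estimate and the DerSimonian-Laird estimator for the between-study variance in the random-effects estimate. These choices are illustrative assumptions and not necessarily the software or estimator used in this study, and the per-region effect sizes in the usage lines are hypothetical.

```python
import math


def fixed_effect(betas, ses):
    """Inverse-variance-weighted fixed-effects estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def random_effects(betas, ses):
    """DerSimonian-Laird random-effects estimate (an assumed estimator).

    Returns (beta, se, tau2, q), where tau2 is the between-study variance
    and q is Cochran's heterogeneity statistic.
    """
    weights = [1.0 / se ** 2 for se in ses]
    beta_fe, _ = fixed_effect(betas, ses)
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Adding tau2 to each study variance flattens the weights across studies.
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    total = sum(w_re)
    beta = sum(w * b for w, b in zip(w_re, betas)) / total
    return beta, math.sqrt(1.0 / total), tau2, q


def two_sided_p(beta, se):
    """Two-sided p-value from a normal approximation to beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


# Hypothetical per-region effect sizes and standard errors for one SNP.
betas, ses = [0.30, 0.00, -0.10], [0.05, 0.05, 0.05]
beta_fe, se_fe = fixed_effect(betas, ses)
beta_re, se_re, tau2, q = random_effects(betas, ses)
print(two_sided_p(beta_fe, se_fe), two_sided_p(beta_re, se_re), tau2)
```

When the regional effects agree, the estimated between-study variance is zero and the two models coincide. When they disagree, the random-effects standard error widens, so a random-effects analysis does not by itself recover signals that the fixed-effects analysis misses.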
The transferability of PGS developed using the effect sizes quantified in three previous association studies in European ancestry, African ancestry and multi-ancestry populations showed limited predictability, explaining less than 1% of the variability in UACR. PGS in resident African populations (AWI-Gen and ARK) explained between 0.58% and 0.60% of the variance of UACR compared to UKB-African, where the best prediction was observed (0.80%). The poor predictability of UACR using summary statistics derived from African Americans was likely due to the small sample size of the discovery dataset. Unfortunately, there have been few studies on PGS approaches to compare findings with, and the genetic heritability of UACR is relatively low, estimated at 4.3% (Teumer et al., 2019).
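As an illustration of how a percent-variance-explained figure of this kind can be obtained, the sketch below builds a PGS as a weighted sum of effect-allele dosages and reports the squared Pearson correlation between the score and a phenotype that has already been residualised on covariates (age, sex and 5 PCs in this study). The SNP identifiers, weights, dosages and phenotype values are hypothetical, and the residualisation step is assumed to have been done upstream. This is a simplified sketch and not the scoring pipeline used in the study.

```python
import math


def polygenic_score(dosages, weights):
    """Sum of effect-allele dosages weighted by discovery-GWAS effect sizes.

    SNPs missing from the weights table are skipped.
    """
    return sum(weights[snp] * d for snp, d in dosages.items() if snp in weights)


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def percent_variance_explained(scores, residual_phenotype):
    """Return (100 * r^2, r); the sign of r flags a negative relationship."""
    r = pearson_r(scores, residual_phenotype)
    return 100.0 * r * r, r


# Hypothetical weights and genotype dosages (0-2 copies of the effect allele).
weights = {"snpA": 0.5, "snpB": -0.2}
people = [
    {"snpA": 2, "snpB": 1},
    {"snpA": 0, "snpB": 2},
    {"snpA": 1, "snpB": 0},
    {"snpA": 1, "snpB": 1},
]
scores = [polygenic_score(p, weights) for p in people]
residuals = [0.4, -0.1, 0.3, -0.2]  # hypothetical covariate-adjusted UACR
print(percent_variance_explained(scores, residuals))
```

Because the score is compared against residuals, the resulting value reflects variance explained beyond the covariates, which is why it can be read against the heritability estimate as an approximate upper bound.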
The limited transferability of PGS and previous GWAS signals across ancestral groups could be due to differences in genetic architecture and/or pleiotropic effects. Different demographic histories and genetic selection pressures between European and African populations could modify the ability to replicate previous GWAS results due to differences in allele frequencies between non-African and African populations, with generally lower LD in African genomes. Environmental factors and variability in the prevalence and aetiology of kidney disease and related risk factors such as diabetes and hypertension (Fatumo et al., 2020) could also influence the genetic architecture of kidney disease in African populations (Limou et al., 2014; Teumer et al., 2019; Brandenburg et al., 2022b). Selection pressures have increased the frequencies of APOL1 kidney risk variants and HbS due to their protective properties in areas of Africa where trypanosomiasis and malaria are endemic. This may have contributed to shaping genetic susceptibility to kidney disease in African individuals. In our study, the APOL1 gene region did not exhibit significant associations with UACR. The indel rs71785313 was not imputed using the African Sanger reference for imputation, and a specific study had previously been published to describe APOL1 variant distribution in the AWI-Gen dataset using other imputation panels, but the locus did not reach genome-wide significance (5 × 10−8) for association with eGFR and UACR (Brandenburg et al., 2022b).
While the burden of CKD in SSA is high, it is noteworthy that no prior GWAS on UACR has been conducted on the continent. Despite its uniqueness, our study is limited by its relatively modest sample size, which reduces statistical power to detect small-effect associations at genome-wide significance thresholds. Kidney disease markers were measured at a single time point, and spot urine albumin and creatinine levels are sensitive to incident infections and other environmental factors that could affect the prevalence of albuminuria.
It is important to note that our study populations are mainly treatment naïve in relation to kidney disease and other cardiometabolic conditions, which may be an advantage in detecting genetic associations (Pereira et al., 2021). Other studies, based on lipid-associated loci, attributed non-transferability of associated loci to pleiotropic effects, gene-environment interactions, and also to variability in allele frequencies and LD patterns (Kuchenbaecker et al., 2019; Choudhury et al., 2022), as we hypothesize for UACR.
In conclusion, this study describes genetic associations with UACR in a unique SSA cohort and non-resident individuals with African ancestry. CKD in African populations remains understudied, but from available data, hypertension rather than diabetes is the most commonly associated risk factor, and in some regions up to 60% of people with CKD do not have an associated "traditional" risk factor common to high-income settings, suggesting alternate underlying molecular pathways or aetiologies for CKD (Kalyesubula et al., 2018; Nakanga et al., 2019; Muiru et al., 2020). Our study identified two novel SNPs associated with UACR in populations of African ancestry. We further replicated three known UACR-associated loci. Regional genetic diversity due to different selection pressures appears to play a role in the genetic aetiology of CKD across the African continent. These factors likely contribute to the limited transferability of previous association signals and the poor transfer of polygenic scores developed in non-African populations to African populations. Larger genomic studies are necessary to better understand the genetic architecture of kidney function and chronic kidney disease across different African populations and inform region-specific kidney risk profiles. As demonstrated in this study, the low genetic heritability of UACR limits the predictive power of polygenic scores for kidney disease in our setting. It is critical for future research to address these gaps by modelling integrative risk scores that incorporate locally relevant clinical risk factors that are powerful predictors of kidney disease, multiple kidney phenotypes (eGFR cystatin C, eGFR creatinine, eGFR creatinine+cystatin C, albuminuria, blood urea nitrogen), multi-omics data (Eddy et al., 2020), and the impacts of African-specific genetic risk for kidney disease, such as APOL1 high-risk genotypes and sickle cell trait or disease (Naik et al., 2014; Friedman and Pollak, 2016; Brandenburg et al., 2022b).

FIGURE 4
Percent variance (r2) explained between PGS and residual phenotypes computed using age, sex and 5 PCs. Key: the negative relationship between PGS and the phenotype in the result of the linear model; *p < 0.05, **p < 0.01 and ***p < 0.001. Details in Supplementary Table S6.

TABLE 1
Study participants and phenotype data. Participant characteristics for each AWI-Gen study site, ARK-Agincourt and UKB-African and UKB-Caribbean, with phenotype distributions of UACR (median) and covariables used in the study.
1 Data reported as the mean. 2 UACR: urine albumin:creatinine ratio, mg/mmol; reported as median (interquartile range). 3 Albuminuria: UACR > 3.0 mg/mmol. 4 Sub-Saharan Africa. 5 CKD-Gen Consortium: African American ancestry individuals. CKD-GEN-AA data are summary statistics of the meta-analysis from Teumer et al., downloadable at the CKD-GEN consortium website (http://ckdgen.imbi.uni-freiburg.de/); information relative to samples has been reported in the relevant papers. 6 UK Biobank: individuals of self-reported African ancestry; 7 self-reported Caribbean ancestry; 8 self-reported African and Caribbean ancestry. 9 Africa Wits-INDEPTH partnership for Genomic Studies (AWI): individuals of African ancestry from 9,10 West Africa; 11 East Africa; 12-14 South Africa. 15 African Research on Kidney Disease (ARK) Study: South Africa. The bold row displays the combined totals for non-residents, residents, and the overall category.

TABLE 2
Lead genome-wide significantly associated SNPs for sub-Saharan African population meta-analysis (Meta