Phenomic Analysis of Chronic Granulomatous Disease Reveals More Severe Integumentary Infections in X-Linked Compared With Autosomal Recessive Chronic Granulomatous Disease

Background Chronic granulomatous disease (CGD) is an inborn error of immunity (IEI), characterised by recurrent bacterial and fungal infections. It is inherited in either an X-linked (XL) or autosomal recessive (AR) mode. The phenome refers to the entire set of phenotypes expressed, and its study allows us to generate new knowledge of the disease. The objective of the study is to reveal the phenomic differences between XL-CGD and AR-CGD by using Human Phenotype Ontology (HPO) terms. Methods We collected data on 117 patients with genetically diagnosed CGD from Asia and Africa referred to the Asian Primary Immunodeficiency Network (APID network). Only 90 patients with sufficient clinical information were included for phenomic analysis. We used HPO terms to describe all phenotypes manifested in the patients. Results XL-CGD patients had a lower age of onset, referral, clinical diagnosis, and genetic diagnosis compared with AR-CGD patients. The integument and central nervous system were more frequently affected in XL-CGD patients. Regarding HPO terms, perianal abscess, cutaneous abscess, and elevated hepatic transaminase were correlated with XL-CGD. A higher percentage of XL-CGD patients presented with BCGitis/BCGosis as their first manifestation. Among our CGD patients, the lung was the most frequently infected organ, with the gastrointestinal system and skin ranking second and third, respectively. Aspergillus species, Mycobacterium bovis, and Mycobacterium tuberculosis were the most frequently identified pathogens. Conclusion Phenomic analysis confirmed that XL-CGD patients have more recurrent and aggressive infections compared with AR-CGD patients. The phenotypic differences identified can be used as clinical handles to distinguish XL-CGD from AR-CGD based on clinical features.


INTRODUCTION
Chronic granulomatous disease (CGD) is an inborn error of immunity (IEI) that is characterised by recurrent infections caused by catalase-positive bacteria and fungi, such as Staphylococcus aureus and Aspergillus species (1). The prevalence of CGD is estimated at 1 in 250,000 live births among Europeans and Americans (2,3). CGD arises from the loss of function of one of the proteins forming the NADPH oxidase complex, which generates reactive oxygen species, i.e., superoxide radicals and hydrogen peroxide, for the intracellular killing of bacteria and fungi in phagocytes (4). Currently, there are one X-linked (XL) and five autosomal recessive (AR) forms of CGD. The gene responsible for XL-CGD is CYBB, and the five genes responsible for AR-CGD are CYBA, NCF1, NCF2, NCF4, and CYBC1. Frequently affected organs and systems include the lung, skin, lymph node, and liver (3). Patients may suffer from pneumonia and deep and superficial abscesses. CGD patients usually present with lymphadenopathy and hepatosplenomegaly on physical examination (5).
In our study, we focus on the phenomic analysis of CGD. Phenomics is the acquisition of high-dimensional phenotypic data on an organism-wide scale. It is usually studied together with genomics and the environment so that the various factors which might influence the complex traits displayed can be identified. Compared with the genome, the phenome is much more complex and much more difficult to characterise (6). For this study, we use Human Phenotype Ontology (HPO) terms to describe the phenomic abnormalities. The HPO project was published in 2008 and provides an ontology of annotations (7), i.e., HPO terms, to describe phenotypic abnormalities encountered by clinicians. The HPO currently contains over 13,000 terms arranged in a simple class-subclass relationship such that each specific term is a subclass of a parent term. The aim of the HPO system is to offer a computational bridge between genome biology and clinical medicine, and to enable the integration of phenotypic information across various scientific tools for clinical diagnosis and research purposes (7). With advances in technology, more external tools have become available for genomic discovery projects and diagnostic research. For genomic projects, HPO terms are used to filter the list of candidate genes obtained from whole genome sequencing using Exomiser, Phevor, or PhenIX (8,9). External algorithms such as Phenomizer and Phenolyzer can process clinical phenotype data written in HPO terms to generate possible differential diagnoses. Phenomizer, for example, yields a list of differential diagnoses for a patient based on the HPO data entered by a clinician (10).
However, for most IEI included in the genotypic classification of the International Union of Immunological Societies, full HPO phenomic data are still lacking, and compiling them requires contributions from clinical immunologists (11,12). Therefore, it is paramount to generate a phenome of each IEI as a reference for differential diagnosis tools. Such a reference allows these tools to help diagnose future patients from the HPO terms entered by medical practitioners.
Numerous case series of CGD patients have been published, but none has used HPO terms to represent the phenome. The phenomic data on CGD patients stored in the Phenomizer database may therefore be inaccurate owing to insufficient phenomic analysis. The main aim of this project is thus to observe and create a phenome for the CGD patients in our case series. We also attempted to identify the main differences in phenotypic data between XL-CGD and AR-CGD patients. The phenomes of XL-CGD and AR-CGD patients may also be submitted to external tools that use HPO terms for analysis, such as Phenomizer and Exomiser, to provide a reference for the diagnosis and genetic analysis of suspected CGD patients. Significant differences are listed to help clinicians differentiate between the two forms clinically.

METHODS
Patient Source and Selection
The APID network is an IEI referral network established in 2009 by The University of Hong Kong, which acts as a platform to offer e-consultation and free genetic testing for IEI. There are over 100 member centres across Asia and Africa. From 2003 to 2017, 117 CGD patients referred from 18 centres were genetically diagnosed and included in this case series.

Data Collection
Clinical records and laboratory results of CGD patients, provided by their referring doctors at the time of request for genetic testing, are archived in the APID network. Patient data, including demographics, family history, age of clinical milestones, infection, and genetic results were recorded.

Phenomic Data
HPO terms (October 2020 version), which describe individual phenotypic abnormalities in a hierarchical framework of organs, were used for the phenomic analysis. Two researchers first reviewed the laboratory reports and clinical notes and then selected suitable HPO terms from the HPO browser http://www.human-phenotype-ontology.org to describe the phenotypic abnormalities displayed. An HPO phenotypic profile was thus generated for each CGD patient, consisting of all HPO terms that could be identified from the clinical records. Negated HPO terms, i.e., terms recording that a specific phenotype was absent from the clinical record, were not used in our study. Discrepancies in the final HPO phenotypic profiles were discussed and the profiles modified accordingly. Only the highest class of HPO terms in the hierarchical framework, i.e., the systems affected, and the most specific HPO terms were tested for significant correlation between individual phenotypic abnormalities and mode of inheritance.
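The two levels of analysis described above (systems affected and most specific terms) can be sketched as a walk over the ontology's parent links. The following is a minimal illustration only: the `PARENT` map is a hypothetical toy hierarchy keyed by term labels, the placement of each term is assumed for illustration and is not taken from the HPO release, and each term is given a single parent, whereas real HPO terms may have more than one.

```python
# Toy child -> parent map. Labels stand in for HPO identifiers; the placement
# of each term is illustrative and is not copied from the HPO release.
ROOT = "Phenotypic abnormality"
PARENT = {
    "Abnormality of the integument": ROOT,
    "Abnormality of the respiratory system": ROOT,
    "Cutaneous abscess": "Abnormality of the integument",
    "Perianal abscess": "Cutaneous abscess",
    "Pneumonia": "Abnormality of the respiratory system",
}


def ancestors(term):
    """Return every term above `term`, up to and including ROOT."""
    found = set()
    current = term
    while current in PARENT:
        current = PARENT[current]
        found.add(current)
    return found


def most_specific(profile):
    """Drop terms that are ancestors of another term in the same profile."""
    covered = set()
    for term in profile:
        covered |= ancestors(term)
    return {term for term in profile if term not in covered}


def affected_systems(profile):
    """Roll each term up to the class directly below ROOT."""
    systems = set()
    for term in profile:
        current = term
        while PARENT.get(current, ROOT) != ROOT:
            current = PARENT[current]
        systems.add(current)
    return systems


# Hypothetical patient profile, not taken from the case series.
profile = {"Perianal abscess", "Cutaneous abscess", "Pneumonia"}
specific_terms = most_specific(profile)
systems = affected_systems(profile)
```

In this toy profile, "Cutaneous abscess" is dropped from the most specific terms because it is an ancestor of "Perianal abscess", while both abscess terms roll up to the same system.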

Genetics Data
Genetic analysis was performed using genomic DNA extracted from peripheral blood. Genomic DNA was sent to the research laboratory of the Department of Paediatrics and Adolescent Medicine, The University of Hong Kong, where the candidate genes, i.e., CYBB, CYBA, NCF1, NCF2, and NCF4, were tested by Sanger sequencing on the basis of clinical likelihood. Pathogenicity of each targeted gene mutation was re-evaluated in accordance with the diagnostic interpretation guidelines published by the American Academy of Allergy, Asthma & Immunology (AAAAI) PID working group in 2020 (13).

APID Network Questionnaire
A questionnaire was also distributed online to APID network member centres across Asia and Africa. Questions regarding the availability of care for CGD patients in APID network member centres, i.e., diagnosis, laboratory tests, and treatment of CGD patients were asked. A total of 20 responses were recorded.

Ethics Approval
This research project was approved by The University of Hong Kong/Hospital Authority Hong Kong West Cluster Institutional Review Board.

Statistical Analysis
For descriptive statistics, all ages at clinical milestones were expressed as median and range in years. Univariate analysis was performed with the independent-samples Mann-Whitney U test to evaluate the difference between XL-CGD and AR-CGD. First manifestations, HPO terms, and systems affected are presented as heat maps and expressed in percentages. Fisher's exact test was used to determine the correlation between categorical phenotypes and the genotype.
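As an illustration of the Mann-Whitney U comparison described above, the following sketch computes the U statistic by pairwise counting and a two-sided p-value from the normal approximation. It is a simplified stand-in, not the software used in the study: no tie or continuity correction is applied, and the ages in `xl_ages` and `ar_ages` are hypothetical values chosen only to show the mechanics.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value by normal approximation).

    Simplified sketch: ties contribute 0.5 to U, and no tie or
    continuity correction is applied to the variance.
    """
    n1, n2 = len(x), len(y)
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u1, p_value


# Hypothetical ages of onset in years, not the study data.
xl_ages = [0.1, 0.2, 0.2, 0.3, 0.5]
ar_ages = [0.4, 0.6, 1.0, 2.5]

u_stat, p_value = mann_whitney_u(xl_ages, ar_ages)
```

For samples as small as this toy example, an exact test would normally be preferred over the normal approximation; the approximation is used here only to keep the sketch short.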

RESULTS
Demographics Data
The demographic data of our CGD case series are displayed in the Sankey diagram in Figure 1. Of the 117 patients, 104 (88.9%) were male and 13 (11.1%) were female. XL-CGD was seen in 87 (74.4%) patients, while the remaining 30 belonged to the AR-CGD group. Of the 30 AR-CGD patients, 9 (30.0%) had mutations in CYBA identified by Sanger sequencing, 13 (43.3%) had the GT deletion in NCF1 identified by GeneScan®, and 9 (30.0%) had mutations in NCF2 identified by Sanger sequencing; one patient had concurrent CYBA and NCF2 mutations. More than half of the patients (56.7%) came from mainland China, with India (17.3%) and Hong Kong (17.3%) ranking joint second. The other patients came from either South-east Asia or South Africa. Details about consanguinity and family history were available for only 92 of the 117 patients: 90 (97.8%) had no family consanguinity and 2 (2.2%) had family consanguinity. Of these 92 patients, 69 (75.0%) had no suggestive family history and 23 (25.0%) had a suggestive family history. Only 1 had an elder brother with known CGD.
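The counts above can be reconciled with a short arithmetic check. The sketch below assumes, as one reading of the text, that the patient with concurrent CYBA and NCF2 mutations is counted in both gene groups, which is why the three gene counts sum to 31 rather than 30; the variable names are illustrative.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


total = 117
male, female = 104, 13
xl = 87
ar = total - xl

# Gene counts among AR-CGD patients as reported in the text.
gene_counts = {"CYBA": 9, "NCF1": 13, "NCF2": 9}
concurrent_cyba_ncf2 = 1  # assumed to be counted in both CYBA and NCF2

# Inclusion-exclusion: subtract the patient counted twice.
distinct_ar = sum(gene_counts.values()) - concurrent_cyba_ncf2

sex_pct = (pct(male, total), pct(female, total))
xl_pct = pct(xl, total)
gene_pct = {gene: pct(n, ar) for gene, n in gene_counts.items()}
```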

Genetics Data
The genetic mutations of the 117 CGD patients, with 3 patients reporting 2 unphased variants, are shown in Table 1. The commonest gene implicated was CYBB, and no patient was diagnosed with NCF4 CGD. In total, 87 patients had mutations identified in the CYBB gene. Of the remaining 30 AR-CGD patients, 20 had homozygous mutations while only 1 was confirmed to be compound heterozygous. Four had 2 unphased heterozygous mutations in a recessive gene and five had 1 heterozygous mutation in a recessive gene.
All AR-CGD patients with NCF1 mutations had a documented GT deletion. Four patients in our CGD case series had a large deletion in CYBB. There were 24 novel mutations identified in our patients, including 23 mutations in CYBB and 1 mutation in NCF2. Pathogenicity of these unreported mutations was determined using the 2020 AAAAI guidelines (13). Among the 117 CGD patients, 99 had pathogenic variants, 13 had likely pathogenic variants, and 5 had variants of uncertain significance.

Clinical Characteristics of XL and AR-CGD
The ages at clinical milestones of our CGD patients are displayed in Table 2. Ages of onset, referral, clinical diagnosis, and genetic diagnosis all differed significantly by mode of inheritance. The median age of onset was lower in XL-CGD (0.2 years) than in AR-CGD (0.4 years) (p = 0.01). The median age of referral to an immunology unit was also significantly lower in XL-CGD (0.8 years) than in AR-CGD (3.5 years) (p = 0.009). The median age at clinical diagnosis was younger in XL-CGD (1.4 years) than in AR-CGD (4.8 years) (p = 0.017). The same pattern was seen for the age at genetic diagnosis (p = 0.004), with XL-CGD patients showing a lower median age (2.2 years) than AR-CGD patients (4.8 years). The first manifestations of XL-CGD and AR-CGD patients in our case series are displayed in Figure 2. Only the 5 most common first manifestations were included in the heat map, namely fever, BCGitis/BCGosis, pneumonia, cough, and lymphadenopathy; generalized lymphadenopathy, cervical lymphadenopathy, and lymphadenopathy with no specified location were all grouped under lymphadenopathy. As shown in the figure, there was no significant association between the mode of inheritance and any first manifestation. However, a higher percentage of XL-CGD patients (18%) had BCGitis/BCGosis as their first manifestation compared with AR-CGD patients (4%), while a higher proportion of AR-CGD patients (22%) had lymphadenopathy as their first manifestation compared with XL-CGD patients (10%).

Infection Profile
Results of microbiological testing ordered on clinical suspicion of infection were tallied. The microorganisms cultured from various infections and the 3 most common locations from which they were cultured are reported in Figure 3. However, the infectious etiology could not be established in every CGD case, as culture of the pathogen was not always performed or the culture reports were negative.

Phenomic Analysis Between XL and AR-CGD
For the phenomic analysis, we included only 90 of the 117 patients, as only these 90 patients had sufficient clinical information. The affected systems of XL-CGD and AR-CGD patients are displayed in Figure 4 in the form of a heat map. In general, the systems were affected in a higher proportion of XL-CGD patients than AR-CGD patients. The immune system was not shown in the heat map because all CGD patients had their immune system affected. The most frequently affected systems in both XL-CGD and AR-CGD patients were, in order, the respiratory system, homeostasis system, and digestive system. A univariate analysis using Fisher's exact test was performed to determine the correlation between each system and the genotype. Involvement of the integumentary system was significantly associated with the mode of inheritance (p = 0.0153), with more XL-CGD patients (57%) affected compared with AR-CGD patients (26%). In addition, 13% of XL-CGD patients had their nervous system affected, whereas no AR-CGD patient showed such a manifestation. Although this difference was not statistically significant, it is worth reporting. Phenotypic abnormalities of CGD patients are displayed in Figure 5 in the form of a heat map. More than 200 HPO terms describing phenotypic abnormalities were recorded in our CGD case series, but only HPO terms manifested by more than 10% of patients were included in the heat map. In general, more HPO terms were displayed in XL-CGD patients compared with AR-CGD patients. Among all the HPO terms recorded, recurrent fever and pneumonia were the most frequent, with more than 70% of XL-CGD patients and 50% of AR-CGD patients showing these phenotypic abnormalities. Other common phenotypic abnormalities included hepatosplenomegaly, cutaneous abscess, and anaemia.
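The integumentary comparison can be illustrated with a two-sided Fisher's exact test on a 2x2 table. The sketch below is an assumption-laden reconstruction rather than the authors' computation: the group sizes (67 XL-CGD and 23 AR-CGD) are inferred from the 90 analysed patients and the 23 AR-CGD cases with sufficient information mentioned in the Discussion, and the affected counts are back-calculated from the rounded percentages (57% and 26%). The function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one.
    return sum(
        p
        for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Assumed group sizes, inferred from the text (not given as a table there).
n_xl, n_ar = 67, 23
# Affected counts back-calculated from the rounded percentages.
xl_affected = round(0.57 * n_xl)
ar_affected = round(0.26 * n_ar)

p_integument = fisher_exact_two_sided(
    xl_affected, n_xl - xl_affected, ar_affected, n_ar - ar_affected
)
```

Under these reconstructed counts (38 of 67 versus 6 of 23), the sketch gives a p-value of about 0.015, in line with the reported p = 0.0153.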
A total of 5 HPO terms were shown to be correlated with XL or AR inheritance by Fisher's exact test with more XL-CGD patients showing that specific phenotypic abnormality. Perianal abscess, cutaneous abscess and elevated hepatic transaminase are strongly correlated with the mode of inheritance (p < 0.001). Bronchitis and cough are correlated with the mode of inheritance as well (p < 0.05).
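The heat-map construction, in which only terms seen in more than 10% of patients are kept and their frequency is reported per group, can be sketched as follows. The sketch assumes the threshold is applied to the whole cohort, which the text does not specify, and the patient profiles are hypothetical.

```python
from collections import Counter


def term_prevalence(profiles):
    """Percentage of patients whose profile contains each term."""
    counts = Counter(term for profile in profiles for term in set(profile))
    return {term: 100 * n / len(profiles) for term, n in counts.items()}


def heat_map_rows(xl_profiles, ar_profiles, threshold=10.0):
    """Map each retained term to (XL percentage, AR percentage).

    Assumption: a term is retained when its prevalence across the
    combined cohort exceeds `threshold` percent.
    """
    overall = term_prevalence(xl_profiles + ar_profiles)
    xl = term_prevalence(xl_profiles)
    ar = term_prevalence(ar_profiles)
    kept = sorted(term for term, p in overall.items() if p > threshold)
    return {term: (xl.get(term, 0.0), ar.get(term, 0.0)) for term in kept}


# Hypothetical profiles: 6 XL-CGD and 4 AR-CGD patients.
xl_profiles = (
    [{"Pneumonia", "Cutaneous abscess"}] * 3
    + [{"Pneumonia"}] * 2
    + [{"Choroid plexus cyst"}]
)
ar_profiles = [{"Pneumonia"}] * 2 + [{"Recurrent fever"}] * 2

rows = heat_map_rows(xl_profiles, ar_profiles)
```

In this toy cohort, "Choroid plexus cyst" appears in exactly 10% of patients and is therefore excluded, since the criterion is strictly more than 10%.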

APID Network Questionnaire Regarding the Care of CGD Patients
The results of the questionnaire, to which 20 APID network members responded, are shown in Figure 6. Sixteen centres had diagnosed CGD in their clinics, and most of them diagnosed their first CGD patients between the 1990s and the 2010s. Only 9 APID network members had performed the nitroblue tetrazolium (NBT) test since their establishment. APID network members initially diagnosed CGD using the NBT test, with all 9 of these clinics performing their first NBT test before the 2010s. From the 1990s onwards, however, APID network members also started to use the dihydrorhodamine (DHR) cytometry assay. In total, 11 APID network members performed the DHR cytometry assay in their clinics between the 1990s and the 2010s.

DISCUSSION
Our study revealed that XL-CGD and AR-CGD patients had some phenotypic differences through phenomic analysis. XL-CGD patients had their integument and the central nervous system more frequently affected. XL-CGD patients were shown to have perianal abscess, cutaneous abscess, and elevated hepatic transaminase more often as well. More XL-CGD patients presented with BCGitis/BCGosis as their first manifestation.
In our study, the most significant finding is that the integumentary system is more frequently affected among XL-CGD patients than AR-CGD patients. This finding is explained by the more frequent perianal abscess, perianal rash, and cutaneous abscess reported among XL-CGD patients in the phenomic analysis, as observed in previous publications (3,50-52). However, reports from India and China found no statistical difference between XL-CGD and AR-CGD patients in episodes of superficial abscess (30,41). Another interesting observation in our case series is that 13% of XL-CGD patients, but none of the AR-CGD patients, had a central nervous system (CNS) abnormality. Common CNS abnormalities included upper motor neuron dysfunction, headache, spinal cord compression, choroid plexus cyst, and unusual CNS infection, including CNS aspergillosis. CNS aspergillosis accounts for less than 5% of overall infections, and previous literature has shown no significant association between genotype and CNS aspergillosis (53,54). Further investigations are needed to determine whether these CNS and integumentary abnormalities are primary defects, complications of CGD, or unrelated to CGD. Nevertheless, these new findings might be useful clinical handles for clinical immunologists to distinguish between XL-CGD and AR-CGD.
Whenever clinicians observe red flags, i.e., more frequent cutaneous or perianal abscesses or CNS abnormalities among CGD patients, they should suspect XL-CGD and perform targeted gene Sanger sequencing and DHR as soon as possible to confirm the diagnosis.
FIGURE 5 | A heat map describing the percentages of CGD patients in our case series whose clinical records displayed certain phenotypic abnormalities. Fisher's exact test was used for statistical analysis (only HPO terms recorded in more than 5% of patients are shown). HPO, Human Phenotype Ontology. Note: **p < 0.01; *p < 0.05. This graph was created using the app Datawrapper.
In addition to the more frequent perianal and cutaneous abscesses seen in XL-CGD patients, a higher frequency of elevated hepatic transaminase was noted among XL-CGD patients in the phenomic analysis. Previous literature has shown that abnormal liver enzyme levels are common among CGD patients, with at least one episode occurring in 73% of patients (55). However, no study has examined the correlation between the mode of inheritance and elevated hepatic transaminase. It has been hypothesised that XL-CGD patients more often have hepatosplenomegaly, liver abscesses, and BCGosis, which previous literature has shown to be common causes of abnormal liver enzymes (55). Therefore, clinicians can use the frequency of elevated hepatic transaminase to help differentiate between XL-CGD and AR-CGD.
Another notable finding in this case series is that more XL-CGD patients presented with BCGitis/BCGosis as their first manifestation compared with AR-CGD patients. It has been documented that CGD patients are more prone to disseminated or local BCG infection owing to a defective intracellular mycobacterial killing mechanism (18,56). BCG-related disease has been documented as a common first sign of CGD, but no in-depth comparison between CGD genotypes has been made. BCGitis is seen more commonly in countries and regions where BCG vaccination is included in the universal vaccination programme, such as mainland China, Iran, and Latin America, as shown in Table 3 (67). It has been hypothesised that patients with XL-CGD have poorer control of BCG than those with AR-CGD, so that physicians recognise BCG-related disease more often as their first manifestation.
The main limitation of our study is that the clinical data provided to the APID network might be insufficient. Some of the CGD patients in our study were genetically diagnosed 20 years ago, when awareness and understanding of CGD were still inadequate in many countries, leading to underreporting of CGD patients with atypical features. The authors do not have access to the complete medical records of the CGD patients, and hence some major phenotypic data in our case series may have been missed. Microbiological culture tests were not performed in some cases, leading to omissions in our infection profile. Staphylococcus aureus, for example, has been reported to be the most common pathogen causing skin abscesses in previous CGD case series but was not shown in our study. In addition, the clinical data provided to us extended only up to the time when the patient was clinically diagnosed with CGD, and hence no follow-up clinical data could be analysed. As a result, survival and death analysis could not be done. As NBT or DHR assays were not always available and our cases came from many centres with different testing methodologies (68), the functional phenotype of residual reactive oxygen species production could not be analysed in our case series. Since there were only 23 AR-CGD cases with sufficient clinical information, the analysis may have been underpowered, resulting in false-negative results in our phenomic comparison between AR-CGD and XL-CGD.
In conclusion, more severe integumentary infections, CNS abnormalities, and hepatic enzyme abnormalities were observed in XL-CGD patients compared with AR-CGD patients. A summary of key findings regarding the differences between previous case series and this case series is presented in Table 3. Whenever clinicians identify such phenomic features among children suspected to have IEI, they should suspect a diagnosis of XL-CGD and perform DHR as soon as possible. This can help speed up the diagnostic process, allowing prophylactic treatment to be started and targeted genetic testing to be offered.

DATA AVAILABILITY STATEMENT
The datasets presented in this article are not readily available because of ethical restrictions. Requests to access the datasets should be directed to lauylung@hku.hk.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Hospital Authority Hong Kong West Cluster-University of Hong Kong Institutional Review Board. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
YL conceptualised the study. YL and DL designed the study. K-WC and C-YW performed genetic study. TC, HY, K-WC, and DL curated mutations. TC and DL phenotyped the patients, analysed data, and penned the manuscript. Other authors referred patients and provided clinical care and clinical data. All authors critically reviewed the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
The work is funded by the Society for Relief of Disabled Children and Jeffrey Modell Foundation.