Airway epithelium respiratory illnesses and allergy (AERIAL) birth cohort: study protocol

Introduction Recurrent wheezing disorders including asthma are complex and heterogeneous diseases that affect up to 30% of all children, contributing to a major burden on children, their families, and global healthcare systems. It is now recognized that a dysfunctional airway epithelium plays a central role in the pathogenesis of recurrent wheeze, although the underlying mechanisms are still not fully understood. This prospective birth cohort aims to bridge this knowledge gap by investigating the influence of intrinsic epithelial dysfunction on the risk for developing respiratory disorders and the modulation of this risk by maternal morbidities, in utero exposures, and respiratory exposures in the first year of life. Methods The Airway Epithelium Respiratory Illnesses and Allergy (AERIAL) study is nested within the ORIGINS Project and will monitor 400 infants from birth to 5 years. The primary outcome of the AERIAL study will be the identification of epithelial endotypes and exposure variables that influence the development of recurrent wheezing, asthma, and allergic sensitisation. Nasal respiratory epithelium at birth to 6 weeks, 1, 3, and 5 years will be analysed by bulk RNA-seq and DNA methylation sequencing. Maternal morbidities and in utero exposures will be identified on maternal history and their effects measured through transcriptomic and epigenetic analyses of the amnion and newborn epithelium. Exposures within the first year of life will be identified based on infant medical history as well as on background and symptomatic nasal sampling for viral PCR and microbiome analysis. Daily temperatures and symptoms recorded in a study-specific Smartphone App will be used to identify symptomatic respiratory illnesses. Discussion The AERIAL study will provide a comprehensive longitudinal assessment of factors influencing the association between epithelial dysfunction and respiratory morbidity in early life, and hopefully identify novel targets for diagnosis and early intervention.

Introduction: Recurrent wheezing disorders including asthma are complex and heterogeneous diseases that affect up to 30% of all children, contributing to a major burden on children, their families, and global healthcare systems.It is now recognized that a dysfunctional airway epithelium plays a central role in the pathogenesis of recurrent wheeze, although the underlying mechanisms are still not fully understood.This prospective birth cohort aims to bridge this knowledge gap by investigating the influence of intrinsic epithelial dysfunction on the risk for developing respiratory disorders and the modulation of this risk by maternal morbidities, in utero exposures, and respiratory exposures in the first year of life.Methods: The Airway Epithelium Respiratory Illnesses and Allergy (AERIAL) study is nested within the ORIGINS Project and will monitor 400 infants from birth to 5 years.The primary outcome of the AERIAL study will be the identification of epithelial endotypes and exposure variables that influence the development of recurrent wheezing, asthma, and allergic sensitisation.Nasal respiratory epithelium at birth to 6 weeks, 1, 3, and 5 years will be analysed by bulk RNAseq and DNA methylation sequencing.Maternal morbidities and in utero exposures will be identified on maternal history and their effects measured through transcriptomic and epigenetic analyses of the amnion and newborn epithelium.Exposures within the first year of life will be identified based on infant medical history as well as on background and symptomatic nasal sampling for viral PCR and microbiome analysis.Daily temperatures and symptoms recorded in a study-specific Smartphone App will be used to identify symptomatic respiratory illnesses.Discussion: The AERIAL study will provide a comprehensive longitudinal assessment of factors influencing the association between epithelial dysfunction and respiratory morbidity in early life, and hopefully identify novel targets for diagnosis and early intervention.

Introduction
Recurrent wheezing disorders in children are extremely common and impose significant burdens on children, families, and communities, with 20%-30% of all children developing recurrent wheezing and 10% receiving an asthma diagnosis (1,2).These early-life disorders also have a significant impact on healthcare systems worldwide, representing one of the most common reasons for hospital presentation and contributing billions of dollars to direct and indirect economic and healthcare costs (2,3).While significant advances have been made in our understanding of disease pathophysiology, endotypes, and triggers, there are still major knowledge gaps leading to an ongoing lack of optimal diagnostic, disease prevention, and treatment options (4,5).
The pathogenesis of recurrent wheezing represents a complex interplay between intrinsic genetic, cellular, and structural factors, which are then modulated by external pathogen and environmental exposures.While deficient immune responses and interferon signalling are well characterised in these conditions (6), the airway epithelium has been increasingly recognised as a key contributor to this disease process (7), but the exact mechanisms are still unclear.As the first contact point for inhaled pathogens and particles, the airway epithelium plays an essential role in barrier function, mucociliary clearance, and immune/inflammatory responses in the airway (8).To interrogate disease pathogenesis and susceptibility mechanisms in the airway epithelium, obtaining airway tissues is essential, but limited by ease of access (9).While there exists differences in epithelial cell composition and gene expression between nasal, tracheal, and bronchial regions (10), strongly conserved gene signatures between nasal and tracheal samples have been described for both global gene expression and wheeze-/atopyspecific biomarkers (9).This supports the nose as a minimally invasive site for interrogating global epithelial function throughout the airway.Primary airway epithelial cells isolated from both the upper and lower airways of young children with asthma have been shown to exhibit essential functional characteristics with compromised barrier function (11,12), wound repair (13)(14)(15)(16), and innate immune responses (17), referred to as the "vulnerable epithelium" (18).These observations of an intrinsically impaired epithelium are supported by genome-wide associations studies that have highlighted several genes variants that are expressed in epithelial cells as risk factors for asthma development (19,20).
Several key variables have been identified that further modulate this intrinsic epithelial risk.Maternal morbidities such as asthma, have been shown to be associated with an increased risk for childhood asthma development (21).In utero exposures have also been linked to the development of recurrent wheezing/ asthma in early-life including infection, allergen, medication/ antibiotic, and pollutant exposures, with maternal smoking typically showing the strongest association (22)(23)(24).The amnion provides a readily available, non-invasive tissue source to detect and measure the influence of maternal morbidities and in utero exposures after birth (25)(26)(27).In support of this, transcriptional and epigenetic changes in the placenta and amnion have been described following in utero exposures such as infection or smoking (25)(26)(27).The amnion is also a source of epithelial cells that have been exposed to similar factors as the developing respiratory epithelium of the foetus/newborn (25)(26)(27).
In addition to maternal morbidities and in utero exposures, early-life exposures to respiratory viruses and bacteria, as well as environmental allergens, pollutants, and cigarette smoke have all shown an association with subsequent recurrent wheezing and asthma development (28)(29)(30)(31).In particular, the airway microbiome appears to play an important role in the development and phenotypic manifestations of asthma and allergy (32).Distinct microbial profiles have been described in children with and without asthma (33), as well as profiles that associate with asthma symptom control and exacerbations (34,35).One mechanism through which exposures and the airway microbiome appear to modulate future respiratory outcomes is through epigenetic reprogramming of host cells, with distinct methylation profiles being observed in respiratory epithelial cells from healthy, atopic, and asthmatic children (36).
While published studies have focused on individual aspects of these disease mechanisms, there remains a need for more integrated studies investigating multiple factors both in combination and longitudinally.We have therefore designed this prospective longitudinal birth cohort study as an integrated platform to interrogate key factors influencing the development of recurrent wheezing, asthma, and allergic sensitisation across the first 5 years of life.We hypothesise that intrinsic epithelial vulnerability is a precursor to adverse respiratory outcomes in childhood, with this predisposition being further modulated by maternal morbidities, in utero exposures, and exposures within the first year of life.
The specific aims of this cohort study are: 1. To establish whether the vulnerable respiratory epithelium is detectable at birth, epigenetically regulated, and can be used to predict respiratory outcomes within the first 5 years of life.2. To determine whether maternal morbidities and in utero exposures influence the development of the vulnerable epithelium at birth. 3. To determine whether the amnion can be used as a surrogate marker for antenatal exposures and/or vulnerable respiratory epithelium of the newborn.4. To determine the role of the microbiome and respiratory virus exposures within the first year of life on the development of respiratory outcomes in those with vulnerable epithelium.
are invited to be part of the AERIAL study during an antenatal appointment at Joondalup Health Campus, a public/private partnership hospital servicing a culturally and socioeconomically diverse region in Perth, Western Australia (37,38).Recruitment into the AERIAL Study commenced in August 2020 during the COVID-19 pandemic and associated public health restrictions in Western Australia.The study is required to adhere to the guidelines set by the Western Australia Government and Health Department for participant interactions and sample collection.The primary influence of these changes was an enforced reduction in direct participant contact.To adhere to these public health restrictions and ensure staff and participant safety, our sample and swab collection protocols were adjusted to allow parent collection in addition to study staff attending home visits.

Participants and recruitment
The AERIAL study will recruit 400 families from the ORIGINS Project, during a routine antenatal visit.Mothers will be consented into the study prior to delivery, while newborns will be consented as active participants after birth.Inclusion criteria for the AERIAL study is sufficient understanding of written/spoken English to complete consent/study procedures and access to a smartphone that can utilise the study App.To maintain diversity in the cohort, the only exclusions for this study are participant births with a gestational age less than 32 weeks, any significant genetic anomalies, and any significant perinatal complications.

Participant and consumer involvement
AERIAL is an interactive longitudinal study which requires a high level of participation from the families, particularly within the first year of their child's life, with daily temperature checks, recording of symptoms, and collection of four background nasal/ throat swabs (3, 6, 9, 12 months), as well as repeated symptomatic swabs during illness.To ensure high data and sample integrity, the study consulted extensively with consumers in two consumer reference groups (one which included ORIGINS participants) to develop the protocol which prioritised reduced burden as well as maximised adherence and retention.All consumers insisted on simplicity and ease of monitoring, communication, and daily recordings.This led to the development of a Smartphone app AERIAL TempTracker App enabling real time viral detection in the AERIAL cohort.

Outcome measures
The primary outcomes of the AERIAL study will be assessed at 1, 3, and 5 years of age.We will characterise the association between transcriptomic and epigenetic markers of the vulnerable epithelium with: 1. Recurrent wheezing 2. Allergic Sensitisation 3. Asthma The presence of these endpoints will be determined independently of downstream analyses and will be based on serial review of each participants medical history during both AERIAL and ORIGINS study visits.Recurrent wheezing (at 1, 3, and 5 years of age) will be defined as the presence of two or more episodes of parent-reported or medically documented wheezing (39).Allergic sensitisation will be defined as a positive skin-prick test result [at 1, 3, and 5 years as part of their ORIGINS Project clinic visit (37,38)], according to Australasian Society of clinical Immunology and Allergy guidelines (40).Asthma (at 3 and 5 years) will defined based on a diagnosis made by the participant's treating physician or according to national guidelines (41) as chronic signs and symptoms suggestive of asthma with a documented treatment response.
Secondary outcomes include: 1. Exploring the association between nasal transcriptomic/ epigenetic markers and other allergic disease (food allergies, allergic rhinitis, and eczema).2. Investigating the use of the amnion as a surrogate marker for a child's respiratory epithelium at birth and as a tool for classifying the epigenetic/transcriptomic effects of maternal morbidities and in utero exposures at birth. 3. Characterising the influence of epithelial endotypes and symptomatic viral infections on the development and maintenance of the microbiome in early life.4. Identifying microbial signatures associated with protection or susceptibility to viral infections.

Data collection, processing, and analyses
The samples and data to be collected at each timepoint are presented in Figure 1 and Table 1.

Clinical data collection
Participant clinical data used in the AERIAL Study will be collected as part of the ORIGINS Project (38) and includes: 1. Maternal demographics, medical history (including asthma/ atopy), in utero exposures (including smoking exposure), and pregnancy data.2. Infant demographics, medical history (i.e., respiratory illnesses, wheezing/asthma, allergy, eczema, rhinitis, hospitalisations, other diagnoses), medications, body composition, and growth/developmental assessments.3. Skin prick testing against common allergens (egg, milk, wheat, tuna, peanut, cashew nut, ryegrass, house dust mite, cat dander) at 1, 3, and 5 years of age.
The AERIAL study will also collect additional information on respiratory illness, wheezing, allergies, and relevant medication use (including asthma relievers/preventers) at scheduled study visits as well as at symptomatic illness events.

Symptomatic respiratory illnesses
The AERIAL study aims to capture all respiratory illnesses within the first year of life.Data will be captured in a designed-for-purpose Smartphone App, the AERIAL TempTracker App that will enable real-time data collection, automated alerts, and reminders to families.The AERIAL App will also be used to monitor and encourage participant adherence with study procedures.
Parents will measure daily temperatures using an infrared forehead thermometer and enter data into the AERIAL App.For each entry, parents will record: (1) measurement time; (2) temperature; (3) symptoms present from a pre-defined list; (4) medications used from a pre-defined list; and optionally (5) other free-text symptoms/medications (Supplementary Table S1).AERIAL study schema.Produced using BioRender.com.
Symptomatic respiratory illnesses requiring a nasal swab will be defined as a temperature >37.5 °C and the presence of at least one respiratory symptom (runny/blocked nose, wet/dry cough, sneeze, headache, myalgia, chills, rigours, tiredness, sore throat, dyspnoea, loss of taste/smell, poor feeding, diarrhoea) OR temperature <37.5 °C and the presence of at least two respiratory symptoms.Only one symptomatic swab per illness will be collected, with at least 2 weeks between subsequent swabs.
Parents will also be asked to fill out a symptomatic questionnaire at each illness, which will capture additional information on the nature and duration of symptoms (runny/ blocked nose, sneezing, cough, wheeze), medication use (e.g., salbutamol), and healthcare attendance/hospitalisation.

Background visits
The AERIAL study will perform quarterly background sampling at 3, 6, 9, and 12 months of age.Background study visits will be delayed up to 1 month to capture children when they are asymptomatic (symptom free for more than 2 weeks).If a suitable window cannot be identified within this timeframe, the symptomatic sampling will be used in place of the background visit, and additional sampling will not be performed.
Parents will also be asked to fill out a background questionnaire at each visit, which will capture symptoms, medications, and healthcare attendance/hospitalisation from the preceding 3 months.

Amnion sampling
Amnion for transcriptomic and epigenetic analyses will be collected from consenting mothers.After delivery, placentas will be examined by the delivery personnel according to Joondalup Health Campus standard operating procedures, placed in plastic bags, and stored at 4 °C for transport to Telethon Kids Institute (Perth).Biopsy samples from each amniotic membrane will be taken within 72 h of delivery, snap frozen, and stored at −80 °C until batch processing.Our pilot data has confirmed that this processing method provides RNA/DNA of suitable quality and quantity for sequencing.

Nasal epithelial sampling
Nasal epithelial sampling for transcriptomic and epigenetic analyses will be performed at birth to 6 weeks home visit), and at 1, 3, and 5 years of age (clinic visits), by a trained study team member.The nasal epithelium was chosen as a minimally invasive surrogate tissue for the lower airways based on previous demonstration of highly conserved global and wheeze/allergyspecific gene expression between nasal and tracheal epithelium (9).
For newborns, each nostril will be sampled using a Dent-O-Care 620 brush (Dent-o-Care, United Kingdom), as previously described (42), and placed into a FluidX TM tube (Azenta Life Sciences, USA) containing DNA/RNA Shield TM (Zymo research, USA).For 1-, 3-, and 5-year nasal sampling, only one nostril will be used.All collected samples will be store at 4 °C for transport and then stored at −80 °C until batch RNA and DNA extraction.

Nasal swabs for virus and microbiome analyses
During the first year of life, nose/throat swabs for respiratory virus identification and microbiome analysis will be collected from separate nostrils during symptomatic respiratory illnesses and at quarterly background checks (as described above).
For respiratory virus identification, the throat and one nostril (43) will be sampled with a swab (FLoQSwabs, Copan Group, Italy),placed into viral testing medium (eSwab, Copan Group, Italy), and stored at 4 °C until transport to the commercial laboratory (Western Diagnostics Pathology) for testing of 8 common respiratory viruses (influenza "A", influenza "B", respiratory syncytial virus, human metapneumovirus, parainfluenza, rhinovirus, adenovirus, SARS-CoV-2) by multiplex PCR.For microbiome analyses, the other nostril will be sampled, the swab placed into DNA/RNA stabilisation medium (eNAT, Copan Group, Italy), transported at 4 °C, and frozen at −80 °C until batch processing.

Additional biological samples
Additional samples including maternal and infant blood, urine, saliva, and stool samples are collected at various timepoints and bio-banked as part of the ORIGINS Project (38) and are available for secondary analysis as required during the AERIAL study.Participants of both the AERIAL and ORIGINS Project will be asked to consent to allow these secondary analyses on bio-banked samples.

Transcriptomic profiling
RNA from amnion and nasal epithelial samples will be extracted using Zymo Quick-RNA Microprep Kits (Zymo Research, USA), according to manufacturer's instructions, and stored at −80 °C.Total RNA quantity, quality and integrity will be assessed by Qubit (ThermoFisher, USA) and the Agilent TapeStation (Agilent Technologies, USA).RNA samples will be sent on dry ice to a genomics core facility for wholetranscriptome library preparation and sequencing.

Epigenetic profiling
For epigenetic analysis, DNA from amnion and nasal epithelial samples will be extracted using Chemagic DNA Blood Kits (PerkinElmer, USA), according to the manufacturer's instructions, and stored at −80 °C until analysis in bulk.DNA quantity will be assessed by Qubit (I ThermoFisher, USA).Capture DNA methylation sequencing (methyl-seq) using enzymatic conversion and target enrichment will be conducted at a genomics core facility using the TWIST Human Methylome Panel, covering more than 3.2 million CpGs.This method also allows extraction and analysis of DNA variant/small nuclear polymorphisms from the methylation sequencing data.

Microbiome profiling
Microbial DNA will be extracted using QIAamp DNA kit (QIAGEN, location).Positive extraction controls in the form of spike-in internal control samples and mock communities from ZymoBIOMICS (Zymo Research, USA) will be included.Bacterial load in DNA extracts will be quantified using TaqMan qPCR (ThermoFisher, USA) (44).The bacterial biota will be profiled at strain level resolution by amplifying and sequencing the fulllength 16S rRNA gene.

Long-term follow-up plans
While the initial AERIAL cohort study plans to follow children from birth to 5 years of age, the long-term aim is to extend the cohort throughout childhood to allow longitudinal monitoring of health trajectories and to enable older age-appropriate investigations such as pulmonary function testing at age 6 or 7 to better clarify respiratory phenotypes.

Statistical analysis 2.6.1 Analytical methods
Analyses will be performed on longitudinally acquired data, using generalised linear mixed effects models with random subject effects and best fitting covariance structure.Primary endpoints of recurrent wheezing, asthma, and allergic sensitisation will be assessed as binary outcomes (yes/no).Number of wheezing episodes will also be assessed as a discrete outcome.All models will be adjusted for interactions and confounders as required (including demographic and socioeconomic variables).
Transcriptomic, methylation, and microbiome data will be preprocessed, quality controlled, and analysed using our established in-house protocols and pipelines (9,44,45).As markers of the vulnerable epithelium have not previously been assessed at birth, a paired analysis approach was chosen.Firstly, previously identified markers of a vulnerable epithelium will be used to classify children as vulnerable epithelium "yes" or "no", with these endotypes then associated with the development of clinical outcomes at 1, 3, and 5 years of age.Secondly, differential gene expression and methylation analyses will be used to identify differences between children with and without clinical outcomes.These markers will be associated with markers of epithelial dysfunction.

Sample size
A total of 400 participants will be recruited into AERIAL to be followed for 5 years and will be randomly divided into discovery (300 participants) and validation (100 participants) cohorts for analysis.Given the primary analysis aims to explore the association between transcriptomic and epigenetic markers of the vulnerable epithelium and clinical outcomes at 1, 3, and 5 years of age, sample size calculations are based on our published RNA-Seq data (9).R package RNASeqPower (46) was used to determine sample sizes for the discovery cohort with an estimated outcome prevalence of 20%, library depth of 15 million, biological coefficient of variation of 0.5, power of 80%, a = 0.05, and a range of fold difference effect sizes.The outcome prevalence was chosen based on well-documented rates of recurrent wheezing in young children (∼30%) and our observations of a vulnerable epithelium in approximately 70% of these children (15,16).We will recruit 300 children as a discovery cohort to ensure retention of 260-270 children with adequate diary information and viral ascertainment of ∼65% (47, 48).The validation cohort will be used to assess the ability of target biomarkers from the discovery cohort to predict the main respiratory outcomes of interest, using penalised logistic regression and receiver operating characteristic curves (AUC).100 children will give us an approximate power of 0.95 [a = 0.05, R package pROC (49)] to detect a difference in AUC between 0.75 and 0.5 for recurrent wheezing, asthma, and atopic sensitisation endpoints.

Discussion
The Airway Epithelium Respiratory Illnesses and Allergy study (AERIAL) will longitudinally characterise the determinates of recurrent wheezing, asthma, and allergic sensitisation in early life.While individual aspects of these disease processes have been interrogated previously, the key strengths of the AERIAL cohort are its longitudinal design, multi-omics approach, and robust collation of critical exposures both in utero and through the first year.The use of a Smartphone App will allow for reduced burden on participant families and study coordinators and support adjustments to evolving pandemic restrictions.Through an improved understanding of both intrinsic epithelial vulnerability and the external factors modulating its phenotypic manifestation, we hope to provide a foundation for new insights into disease pathogenesis unlocking novel opportunities for diagnostic and therapeutic development.

Ethics and dissemination
The study is being conducted in accordance with the Helsinki Declaration and was approved by the Ramsey Health Care HREC WA-SA (#1908).Participating families may withdraw consent for the study at any time and privacy and confidentiality will be provided by assigning a unique AERIAL study ID at the time of consent.All specimens, reports, and data collected by AERIAL are identified by ID number only.
The de-identified results of the analysis and data will be made available and disseminated through open-access peer-reviewed manuscripts, local, national and international conference presentations, and through different media channels to the consumers, AERIAL/ORIGINS families, and the wider community.

TABLE 1
AERIAL study schedule of assessments.