Cerebrovascular Reactivity Measurement Using Magnetic Resonance Imaging: A Systematic Review

Cerebrovascular reactivity (CVR) magnetic resonance imaging (MRI) probes cerebral haemodynamic changes in response to a vasodilatory stimulus. CVR closely relates to the health of the vasculature and is therefore a key parameter for studying cerebrovascular diseases such as stroke, small vessel disease and dementias. MRI allows in vivo measurement of CVR but several different methods have been presented in the literature, differing in pulse sequence, hardware requirements, stimulus and image processing technique. We systematically reviewed publications measuring CVR using MRI up to June 2020, identifying 235 relevant papers. We summarised the acquisition methods, experimental parameters, hardware and CVR quantification approaches used, clinical populations investigated, and corresponding summary CVR measures. CVR was investigated in many pathologies such as steno-occlusive diseases, dementia and small vessel disease and is generally lower in patients than in healthy controls. Blood oxygen level dependent (BOLD) acquisitions with fixed inspired CO2 gas or end-tidal CO2 forcing stimulus are the most commonly used methods. General linear modelling of the MRI signal with end-tidal CO2 as the regressor is the most frequently used method to compute CVR. Our survey of CVR measurement approaches and applications will help researchers to identify good practice and provide objective information to inform the development of future consensus recommendations.


INTRODUCTION
Cerebrovascular reactivity (CVR) reflects the ability of the blood vessels to dilate in order to match tissue blood supply to increased demand and can be investigated by measuring the change in cerebral blood flow (CBF) or cerebral blood volume (CBV) that vasodilation induces. It is a valuable tool for assessing vascular health in pathologies, including steno-occlusive diseases (Mandell et al., 2008b), while more subtle CVR impairments have been found in Alzheimer's disease (Chen, 2018) and cerebral small vessel disease (Wardlaw et al., 2019). The measurement of CVR relies on three key elements: the vasodilatory stimulus, the signal acquisition and the processing method.

Vasodilatory Stimulus
Vasodilation occurs naturally as a mechanism of CBF auto-regulation, but can also be triggered by exogenous stimuli inducing extracellular and intracellular acidosis. The resulting decrease in pH relaxes smooth muscle cells lining the arteries and arterioles, thereby increasing their diameter. Common stimuli include changes in arterial CO 2 partial pressure (PaCO 2 ) induced by voluntary modulations of the breathing pattern, including breath-holding, hyperventilation and paced breathing (Petersson and Glenny, 2014;Urback et al., 2017;Liu et al., 2019) or by inhalation of CO 2 -enriched gas (Fierstra et al., 2013;Liu et al., 2019). As PaCO 2 cannot easily be measured in vivo, end-tidal CO 2 (EtCO 2 ), the most recent maximal exhaled CO 2 partial pressure, is often used as a surrogate and can be measured by recording the CO 2 level in the exhaled gas using a gas monitor. Several approaches exist to manipulate PaCO 2 : inhalation of gas with fixed CO 2 concentration (e.g., CO 2 -enriched air or carbogen), rebreathing the exhaled gas, EtCO 2 targeting manually or using a computer-controlled device (Fierstra et al., 2013). Vasodilation can be induced without modulating the composition of the inhaled gas or breathing pattern by injection of acetazolamide (ACZ), a carbonic anhydrase inhibitor that causes acidosis (Vagal et al., 2009).

Signal Acquisition
Several imaging methods can assess haemodynamic changes induced by the vasodilatory stimulus. Positron emission tomography (PET), single-photon emission computed tomography (SPECT) (Ogasawara et al., 2003) and computed tomography (CT) (Marion and Gerrit, 1991) have all been used to measure CVR, but involve ionising radiation and have low temporal resolution. Transcranial Doppler ultrasound is a practical alternative, but has a limited field of view that allows blood velocity measurements only in parts of single large vessels, which do not necessarily reflect local changes in tissue blood supply (Purkayastha and Farzaneh, 2012;McDonnell et al., 2013). Magnetic resonance imaging (MRI) is a non-invasive, non-ionising technique which allows CVR mapping using contrasts related to CBF and/or CBV. Arterial spin-labelling (ASL) and phase-contrast (PC) MRI measure CBF in tissue and large vessels, respectively (Valdueza et al., 1997;Noth et al., 2008), while vascular space occupancy (VASO) MRI measures CBV (Donahue et al., 2009). Dynamic susceptibility contrast (DSC)-MRI measures both CBF and CBV (Taneja et al., 2019) by monitoring the T 2 or T 2 * -weighted signal following intravenous injection of a gadolinium-based contrast agent. Blood Oxygen Level Dependent (BOLD) imaging, using a T 2 or T 2 * -weighted sequence, can also measure CVR due to its sensitivity to a combination of CBF and CBV.

Processing Method
The signal change due to the vasodilatory stimulus must be converted into a quantitative or semi-quantitative measurement of CVR using one of several methods. Pre-vs.-post-stimulus subtraction of the MRI signal relies on the computation of the absolute or relative signal difference before and after the stimulus has been applied (Donahue et al., 2013;Wu et al., 2017). Often, the pre-and post-values are calculated by taking the average of the MRI volumes acquired during each period respectively, discarding volumes that are acquired during the transition period. Linear regression is a method that investigates the linear relationship between the dependent variable (in this case the MRI signal or derived CBF) and independent variables (e.g., EtCO 2 , to reflect the vasodilatory stimulus; time, to model a linear signal drift) (Thrippleton et al., 2018;Liu et al., 2019), allowing the MRI time course to be modelled using multiple predictors simultaneously. Cross-correlation quantifies the similarity between two signals (e.g., the MRI signal and EtCO 2 ) as a function of their relative time delay (Donahue et al., 2016) and has been used as a measure of CVR. Non-linear fitting involves modelling the MRI signal as a non-linear function (Ziyeh et al., 2005;. It requires some initial estimate of the CVR and other parameters such as CVR delay, and can be more challenging to implement than linear regression, but has the advantage that any models can be used to fit the MRI signal. Some models (e.g., calibrated fMRI models) also allow quantitative estimation of CVR and other parameters that can be of interest such as cerebral metabolic rate of oxygen (CMRO 2 ). Frequency-based analysis includes transfer function (Duffin et al., 2015) and Fourier (Blockley et al., 2011) analyses. In both methods, the signals of interest (e.g., the MRI signal and EtCO 2 ) are transformed into the frequency domain. The magnitude of the signal at the stimulus frequency is then defined as the CVR. Finally, the standard deviation of the MRI signal (Kannurpatti et al., 2014;Jahanian et al., 2017) can be computed as a metric of CBF change due to natural vasodilation and vasoconstriction.

Aims of the Review
Since many combinations of the above stimuli, imaging methods and analysis techniques are possible, there are potentially many different ways to measure CVR in-vivo, resulting in a high degree of methodological diversity in the literature. Previous reviews described common CVR-MRI experiments (Fierstra et al., 2013;Pillai and Mikulis, 2015;Moreton et al., 2016;Urback et al., 2017;Liu et al., 2019) or CVR data analysis . However, as far as we are aware, there are no systematic reviews detailing the breadth of CVR-MRI acquisition techniques, processing methods and applications that have been presented and used in the literature.
We conducted a systematic literature review of papers reporting the use of CVR-MRI techniques. We present an overview of the different aspects of the CVR-MRI experiment reported and applied in the literature, describing the most common methods and clinical research applications. We classified and systematically analysed reports of the MRI techniques, vasodilatory stimuli, data processing methods and study populations. Based on these findings we identified recent practises, trends, technical findings and evidence from clinical studies to inform future application and standardisation of CVR-MRI protocols.

Search Strategy
We systematically reviewed the EMBASE and MEDLINE databases from 1980, until June 2020 using Ovid. The search strategy combined terms relating to: "Cerebrovascular reactivity, " "MRI, " "BOLD, " "ASL, " "PC, " "hypercapnia, " "acetazolamide, " and "CO 2 ." We manually added relevant articles from the authors' libraries. The search was not constrained to English-language literature. Full details of the search strategy are provided as Supplementary Information.

Eligibility Criteria
We included all studies that investigated changes in cerebral blood flow or cerebral blood volume using MRI due to vasodilation or vasoconstriction in humans. We excluded reviews, conference abstracts, editors' notes, and case reports (single-subject studies focussed on methodological aspects of CVR were included). We removed studies that did not investigate induced vasodilation in the brain or used another imaging modality (e.g., CT, PET) to measure CVR. Studies that measured the change in the BOLD signal in response to a functional task and hypercapnia but did not compute a CVR metric were also excluded.

Data Extraction
One author (E.S.) screened the titles and abstracts of all potentially eligible publications to exclude duplicates and assess eligibility against the inclusion criteria before reading the full text of the remaining articles to determine eligibility. Eligibility and data extraction were discussed with other authors where queries around inclusion or exclusion, or data extraction arose.
We extracted population characteristics, including pathology, sample size, age, and gender. We recorded MRI acquisition parameters including magnetic field strength, type of pulse sequence and sequence parameters (e.g., TR, TE, spatial resolution, field-of-view). We recorded the type of vasodilatory stimulus, measurement of EtCO 2 and/or end-tidal O 2 (EtO 2 ), stimulus paradigm and, where available, information on tolerability, number and reason for any excluded or failed scans. Finally, we extracted information on the pre-processing steps, delay correction/computation methods and CVR processing methods applied, reported grey and white matter CVR values in healthy volunteers and relevant findings.

Search Results
We identified 732 articles, 176 of which were removed as duplicates (Figure 1). Of the remaining 556 papers, 317 were excluded on review of the title and abstract due to a lack of analysable data or insufficient detail [n = 192: conference abstracts (n = 131), reviews (n = 34), and case reports and notes to the editor (n = 27)], inaccessibility (n = 1), only reporting rodent studies (n = 2), using other modalities (e.g., PET, TCD, CT, SPECT) (n = 71) and not measuring CVR (n = 51). After full text review an additional 14 papers were removed because they used other imaging modalities to measure CVR (n = 6) or did not measure CVR (n = 8). Additionally, 24 articles were added from the authors' libraries. We included 235 papers in the review. Summary data extracted from each study is included in the Supplementary Material.

Population Characteristics
The studies included 5,369 unique participants. 36 subjects were excluded before CVR due to contraindication to MRI (n = 6) or ACZ (n = 3), claustrophobia in the MRI scanner (n = 5), too large to fit in the MRI scanner (n = 1), anxiety during pre-testing of the stimulus (n = 1) and intolerance of the stimulus (n = 20). The remaining 5,333 unique participants who had a CVR scan comprised 2,394 patients and 2,939 healthy participants. All studies reported a sample size, with a mean sample size of 35 (median: 19, range: 1-536). Forty-five studies had fewer than 10 subjects whereas 9 included more than 100 subjects. Twelve papers did not report any age information, a further 18 papers reported only the age range. The mean age, computed as the mean of the mean or median ages, was 44.3 (1.4-92) years. The median gender distribution was 43% females and 57% males, excluding the 18 studies not reporting gender distribution.
Information on tolerability of the CVR experiment ranged from information regarding subject withdrawal to subjective rating of tolerability and was reported in 51/235 (22%) studies (1,162/5,333 unique subjects). Overall, the CVR experiment in these 51 studies was mostly described as well-tolerated. One article studied the tolerability of 434 CVR (294 subjects) scans acquired with EtCO 2 targeting BOLD MRI and concluded that it was well-tolerated (Spano et al., 2013). Six studies reported subjective tolerability: the experiment was rated as tolerable to very tolerable with minimal discomfort on average in each study. Twenty-three studies detailed complaints of discomfort: 11 studies reported no complaints or adverse effects whereas 12 did. These 12 studies (618 subjects) reported 120 complaints transient to the CVR scan: respiratory symptoms due to gas inhalation such as breathing resistance and shortness of breath (n = 77), anxiety and/or claustrophobia (n = 16), dizziness and/or headache (n = 10), narrowness of head coil with gas apparatus (n = 4), tachycardia (n = 3), paraesthesia (n = 3), chest tightness (n = 1), conjunctive erythema (n = 1), tremor (n = 1), hand weakness (n = 1), nausea, confusion, and blurred vision (n = 1) and no details of the complaints (n = 2). No long-lasting symptoms were reported and no studies using acetazolamide injection detailed complaints or adverse effects. In 17 studies, 79 scans were defined as untolerable by the subject due to: anxiety (n = 21), claustrophobia (n = 16), discomfort related to gas apparatus in the scanner (n = 9), position in the head coil (n = 2) and no details (n = 31).
Three studies (n = 18) found BOLD-derived CVR values were lower at lower magnetic field strengths (Driver et al., 2010;Triantafyllou et al., 2011;Peng et al., 2020), two of which (n = 9) reported a linear relationship between BOLD-derived CVR and the field strength (Driver et al., 2010;Triantafyllou et al., 2011). In one study (n = 16), ASL-derived CVR did not differ at different field strengths (Noth et al., 2006). One study (n = 8) reported longer post-labelling delay results in lower baseline CBF and ASL-CVR measurements (Inoue et al., 2014). Use of EPI with parallel imaging compared to spiral imaging, reduced signal loss due to susceptibility-induced magnetic field gradients in BOLD-CVR measurements without affecting sensitivity, which was defined as the CVR t-statistic (n = 5) (Winter et al., 2009). Furthermore, one study (n = 5) showed that using simultaneous multi-slice acceleration of factor 2 and 3, can reduce scan duration by at least a half compared to conventional EPI while maintaining the CVR sensitivity (Ravi et al., 2016a). Compared to single-echo ASL or BOLD EPI, a multi-echo (four echoes) EPI acquisition followed by T2 * fitting of the signal decay had higher inter-scan repeatability of breath-hold CVR analysed across voxels, CVR sensitivity and test-retest reliability analysed using the intra-class correlation coefficient (n = 14) (Cohen and Wang, 2019).
One study (n = 4) found BOLD response to EtCO 2 is 60 times higher than to EtO 2 , but demonstrated that during hypercapnic CVR-BOLD experiments, EtO 2 should be controlled if the change in EtCO 2 is small compared to the change in EtO 2 (Prisman et al., 2008). One study (n = 9) demonstrated that carbogen should not be used with BOLD or ASL to measure CVR due to a lack of correlation between both MRI techniques as opposed to CVR measurements using CO 2 -enriched air with BOLD or ASL (Hare et al., 2013). Another study (n = 20) found that, for a gas challenge, an effect of at least 2 mmHg EtCO 2 change is required to detect haemodynamic impairment using BOLD at 3 T (De Vis et al., 2018). RS-BOLD was found to give CVR results that were associated with fixed-inspired CO 2 BOLD (n = 48, Liu et al., 2017a) and RespirAct BOLD (n = 13, Golestani et al., 2016) measurements. One study (n = 8) reported differences in response amplitude and onset time depending on whether BH was performed before and after expiration (Leoni et al., 2008). For BOLD-BH, one study (n = 6) demonstrated that the fraction activation volume saturated for breath-hold durations of 20 s and above; thus recommended using breathhold durations of 20 s to give sufficient sensitivity to BOLD signal changes to detect impaired CVR (Liu et al., 2002).

CVR Data Processing Methods
Common pre-processing steps that were reported (Figure 4) were sequence-dependent and included motion correction (167/235 studies, 71%), spatial smoothing (107/235, 46%), registration of functional volumes to MNI or subject space (96/235, 41%), region-of-interest or whole brain delay correction (93/235, 40%), drift removal/modelling (79/235, 34%), voxelwise delay correction (62/235, 26%), and discarding transient MRI volumes to consider only those where steady-state signal   interest can reduce vascular contamination of CVR due to larger responses to CO 2 in blood vessels than in tissues (Thrippleton et al., 2018). T1 correction was recommended for CVR-ASL (some studies used more than one package in combination). Only one in-house Matlab script (for pre-processing BOLD and EtCO 2 data) reported to be publicly available .
Of the six classes of CVR calculation methods identified, linear regression is the most common method overall (149/235 studies, 63%) and in recent publications. However, several newer methods are under development including frequencybased analysis (Duffin et al., 2015). The main reference signal used to compute linear regression or cross-correlation is the EtCO 2 (89/235 studies, 38%). An HRF was incorporated in the MRI signal model in 14% of the studies (32/235), with the single or double gamma function being the most common choice (22/235 studies, 9%). A relatively new method to find an appropriate regressor is RIPTiDe (Regressor Interpolation at Progressive Time Delays), which derives the reference signal from the MRI data by iteratively applying principal component analysis on aligned MRI time courses until convergence of the regressor (Tong et al., 2011;Donahue et al., 2016). Twenty one studies did not clearly describe the CVR processing method, of which two included no information, these were excluded from the summary of CVR processing method ( Figure 5A).
Dynamic aspects of CVR (e.g., lung-to-brain delay, response time) were computed in 128/235 studies (54%) using different methods (Figure 5B), however some studies used different MRI techniques and multiple associated delay processing methods. Fourteen of the delay computation methods were not clearly described and 8 of which could not be included in Figure 5B. Cross-correlation-and linear regression-based methods can be used to compute CVR delay by determining the time shift that gives the best correlation between the BOLD signal and a reference signal (e.g., EtCO 2 ). The most common delay computation methods are cross-correlation-or, equivalently, linear regression-based approaches (84/128 studies, 66%) and pre-defined delay, e.g., from literature, voxel examination, (29/128, 23%). The delay between two signals can be found using linear regression or cross-correlation by determining the time shift giving the best correlation between these two signals. As with CVR computation, delay computation is an evolving area and new methods are arising including obtaining the delay directly from the HRF between the BOLD signal and EtCO 2 (Atwi et al., 2019). One study corrected the hypercapnic delay for delay due to the vasculature (i.e., the delay it takes for the blood and CO 2 to travel from the lungs to the brain tissues) by using the BOLD delay from a hyperoxia challenge as a surrogate of vasculature delay and assuming no vasodilation due to hyperoxia (Champagne et al., 2019a). This correction can distinguish between delay due to vasculature and delay due to vasodilation.
CVR values in whole brain, grey and white matter of healthy volunteers are summarised in Table 3. The associated processing methods were linear regression (72/104, 69%), pre-vs.-post stimulus value comparison (17/104 values, 16%), non-linear signal modelling (13/104, 13%) and frequency-based analysis (2/104, 2%). CVR in grey matter was higher than CVR in white matter. Moreover, measuring white matter CVR using ASL is not common, probably due to the fact that ASL suffers from low contrast-to-noise ratio (CNR) .

The Relationship Between BOLD Response and PaCO2
The healthy BOLD signal response to CO 2 was found to be sigmoidal in two studies (n = 18) (Tancredi and Hoge, 2013;Bhogal et al., 2014). The sigmoid model of the BOLD response to CO 2 was used in a further three studies (n = 65) (Bhogal et al., 2015(Bhogal et al., , 2016De Vis et al., 2018). In four studies, vasodilatory resistance to CO 2 was modelled using the BOLD response (Sobczyk et al., 2014;Duffin et al., 2017Duffin et al., , 2018McKetton et al., 2019). The relationship between resistance and CO 2 was assumed sigmoidal due to the limited ability of the blood vessels to constrict and dilate (n = 141). One study (n = 32) suggested that steal phenomenon associated with some pathologies could alter the sigmoid relationship between CO 2 and vasodilatory resistance (Sobczyk et al., 2014).
Negative CVR clusters correspond to MRI responses anticorrelated to the stimulus. In some cases this might simply reflect long CVR delays if they are not appropriately modelled. Negative CVR could also reflect the steal phenomenon, where tissues with high CVR "steal" blood flow from other regions due to flow redistribution (Shiino et al., 2003;Mandell et al., 2008a;Han et al., 2011a,b;Sobczyk et al., 2014;Poublanc et al., 2015;Fisher et al., 2017;Para et al., 2017;McKetton et al., 2018;Venkatraghavan et al., 2018;Hartkamp et al., 2019). However, they usually appear in the deep white matter (Mandell et al., 2008a), near and in the ventricles (Blockley et al., 2011). Therefore, others have suggested that they may result from low CNR in the white matter tissues leading to spurious CVR values (Blockley et al., 2011), or from reduction in cerebrospinal fluid (CSF) partial volume due to vasodilation (Thomas B. P. et al., 2013;Bright et al., 2014;Ravi et al., 2016b). The latter effect can be diminished by shortening TE (Ravi et al., 2016b).

DISCUSSION
We identified 235 papers using MRI to measure CVR including 5,333 subjects, which covered several different acquisition and analysis methods. Stimuli, paradigm and duration, sequences used for acquisition and processing methods varied considerably. We found several papers, which investigated specific aspects of the CVR-MRI experiment such as processing methods or reproducibility of CVR measurement, but sample sizes were often low, and validation studies remain limited. Reporting was also inconsistent.

Clinical Populations
CVR was measured in several pathologies including steno-occlusive diseases, stroke, small vessel disease, brain injuries, and dementia. Patients generally had lower CVR compared to healthy participants (Krainik et al., 2005;Donahue et al., 2009;da Costa et al., 2016;Hartkamp et al., 2018;Thrippleton et al., 2018;McKetton et al., 2019), though in obstructive sleep apnoea findings were mixed. Delays were longer in steno-occlusive, small vessel disease and dementia patients than in healthy controls, but were not reported in other pathologies. CVR metrics have been associated with cerebrovascular dysfunction, disease severity, and response to interventions (including revascularisation surgery for steno-occlusive diseases). CVR is therefore a promising biomarker of haemodynamic impairment and changes with broad applicability.

Acquisition
Most CVR studies used a 3 T scanner (178/235, 74%) and 2D GE-EPI BOLD sequence (118/235, 50%) for acquisition. While several different sequences can measure CVR, all have limitations. BOLD signal results from a complex interaction between CBF, CBV, haemoglobin concentration, oxygen extraction fraction, cerebral metabolic rate of oxygen consumption and arterial O 2 partial pressure . Changes in any of these parameters can alter the BOLD signal; however, there is evidence that CBV and CBF change together during hypercapnia (Chen and Pike, 2010;Hoge, 2012) and that CVR-BOLD is well-correlated with CVR-ASL (Mandell et al., 2008b;Hare et al., 2013;. Cerebral metabolic rate of oxygen consumption might change during hypercapnia; however it is thought that these changes are small for low levels of CO 2 stimulus (Hoge, 2012). ASL allows direct measurement of CBF and is also widely used (41/235, 17%), but suffers from low CNR ; differences in labelling duration and efficiency, and bolus arrival time can also potentially affect CVR estimation. Calibrated fMRI (9/235, 4%) using dual-echo BOLD/ASL allows simultaneous quantification of CVR and cerebral metabolism parameters (e.g., rate of oxygen consumption and oxygen extraction fraction) (Germuska et al., 2016Merola et al., 2017Merola et al., , 2018. However, calibrated fMRI models depend on the initialisation values of model parameters, model assumptions such as the oxygen metabolism not being altered during hypercapnia and hyperoxia stimuli (Germuska and Wise, 2019), and are more complex to implement. PC-MRI (12/235, 5%) measures CVR at the large-vessel level and generally provides limited spatial coverage; although 4D phasecontrast flow imaging (Miller et al., 2019;Morgan et al., 2020) is developing rapidly, the long scan duration currently limits applicability for measuring CVR in patients. Several different paradigms were used, which varied in duration and number of repetitions. EtCO 2 targeting (81/235, 34%) and fixed CO 2inhalation (69/235, 29%) are the most widely used vasodilatory stimuli with a block paradigm (212/235, 90%) with a median paradigm duration of 9 min. Fixed CO 2 -inhalation is easier to set up than EtCO 2 targeting but the change in EtCO 2 associated with a specific inspired CO 2 concentration may vary between subjects. EtCO 2 targeting allows precise control over the EtCO 2 and paradigm but requires expensive, specialist equipment. 75% of respiratory challenge studies (160/207) measured ETCO 2 . However, in patients with lung diseases, using EtCO 2 is not a direct surrogate for PaCO2 (Petersson and Glenny, 2014).

Processing Methods
CVR was mainly computed using linear regression (149/235, 63%). Few studies described why a particular processing method or regressor was used, and comparisons between different methods are lacking (Bright et al., 2017). 40% of the studies (93/235) calculated a whole brain or single region-of-interest delay that was applied to all voxels. While this method may be relatively robust against noise, delay is known to vary between and within tissue types (Thrippleton et al., 2018;Atwi et al., 2019). However, only 26% of studies (62/235) accounted for voxel-wise delays. An HRF between the stimulus and MRI signal was used in only 14% of the studies (32/235). This might be because the CVR HRF is unknown and may vary between stimuli, paradigms and pathologies Sam et al., 2016a). Assuming a non-delta-function HRF allowed delay and steady-state CVR to be investigated in parallel Donahue et al., 2016), but can be more complex to implement and computationally demanding.

Validation
CVR measurements and detection of CVR impairment using MRI and other imaging modalities [e.g., BOLD-CVR to TCD-CVR (Ziyeh et al., 2005), BOLD-CVR to SPECT-CVR (Shiino et al., 2003), DSC-CVR to PET-CVR (Grandin et al., 2005)] were well-correlated, validating the CVR-MRI experiment. Furthermore, biological validation such as results from studies comparing CVR in patients with steno-occlusive diseases and healthy controls, also supports use of CVR as a biomarker (Ziyeh et al., 2005;Bokkers et al., 2011;Uchihashi et al., 2011;De Vis et al., 2015b). Preclinical CVR imaging is also a fast-growing field which has been applied in preclinical models of stroke, cancer and Alzheimer's disease (Wells et al., 2015;Lake et al., 2016;Gonçalves et al., 2017). Preclinical CVR studies predominantly follow similar approaches to human studies but involve additional considerations such as the effect of anaesthetic agents on resting CBF (Stringer et al., 2021). Isolated vessel preparations (Seitz et al., 2004;Joutel et al., 2010), laser speckle imaging (Lynch et al., 2020), and multiphoton microscopy (Joo et al., 2017;Kisler et al., 2017) can also assess CVR preclinically and may help improve understanding of how impaired vasoreactivity develops and further direct validation of CVR-MRI as a clinical biomarker of cerebrovascular health (Stringer et al., 2021). CVR measurements using MRI techniques showed lower repeatability between-days than within-days (Dengel et al., 2017;Merola et al., 2018). CVR measurements were also less repeatable in white matter than in grey matter due to a lower CNR (Kassner et al., 2010;Thrippleton et al., 2018). Studies with higher sample sizes and investigating reproducibility in different pathologies would be helpful to further validate the CVR-MRI experiment.

Definition and Interpretation of CVR
The definition and units of CVR vary across studies depending on choice of sequence, stimulus, paradigm and analysis methods. However, CVR is most commonly reported as the relative change in BOLD signal (110/235, 47%) or CBF (36/235, 15%) per unit change in EtCO 2 as %/mmHg. Several aspects influence CVR values beyond the vasodilatory capacity of vessels, which must be considered in interpreting the results. The CVR steal phenomenon has been proposed as a systemic mechanism governing the cerebrovascular system by prioritising blood supply to specific regions and potentially leading to local deficits elsewhere. Low or negative CVR values may also result from low CNR or blood vessel dilation near the ventricles shrinking the CSF space and artificially decreasing the BOLD signal due to differences in CSF and blood signal intensities (Thomas B. P. et al., 2013;Bright et al., 2014;Ravi et al., 2016b). Excluding voxels that contain CSF or using a shorter TE (e.g., 21 ms for a TR of 1,500 ms at 3T) can reduce negative artefacts in CVR data (Ravi et al., 2016b). Other physiological factors can affect the MR signal, including resting CBF and oxygen extraction fraction. Finally, as blood vessels have a limited vasodilation capacity, the linearity of the MRI response to the vasodilatory stimulus has a restricted range. Indeed, the shape of the MRI response to the stimulus and baseline parameters including resting CBF and EtCO 2 can influence CVR values (Bhogal et al., 2014(Bhogal et al., , 2016van Niftrik et al., 2018;Hou et al., 2019). Despite some gaps in current knowledge, CVR has a proven validity and utility in several diseases as described above.

Definition and Interpretation of CVR Delay
Delay in the MRI response to a stimulus can lead to inaccurate CVR values if it is not accounted for, and could give further information on vascular health. Voxel-wise or ROI delay should be favoured as opposed to whole brain delay to better account for differences in tissue response and distance from blood vessels. Artificially high or low delay values can be obtained when the noise level is high, i.e., low CNR. Definitions of delay were inconsistent in distinguishing between lung-to-brain delay and duration of the vasodilation process (Thomas et al., 2014). For example, one study computed the lung-to-brain delay, assuming instantaneous MRI response, as the shift in the EtCO 2 that gives the lowest residual when regressed against the MRI time course: the delay in grey and white matter were approximately 15 and 35 s, respectively (Thomas et al., 2014). Another study computed the response time using a mono-exponential fit of the MRI signal: they found response time constants between a few seconds in grey matter up to 100 s in white matter .

Harmonisation of the CVR-MRI Experiment
Variability in the implementation of CVR experiments, including the choice of sequence and MRI parameters, such as TR and TE for BOLD MRI and post-labelling delay for ASL MRI (Inoue et al., 2014), causes heterogeneity in the CVR values, making it challenging to interpret findings across studies and conduct meta-analyses. CVR measurements are highly dependent on MRI sequence (e.g., BOLD, ASL, PC, and DSC), since each measures a different quantity as an estimate or surrogate of CBF, which are not directly comparable .
Harmonisation of acquisition and processing methods would allow more uniform definitions of CVR, delay and HRF, enhancing inter-study comparability, although specific techniques may be better suited to some pathologies and patient groups. Such efforts may also find consensus on the optimal paradigm duration to ensure that CO 2 reaches the region of interest and the MRI response reaches the steady state. As many groups have developed in-house software to process CVR data, making these publicly available, as a step towards development of validated, community-driven open-source software, would also promote reproducibility and harmonisation.
While little consensus currently exists, our review reveals evidence of convergence in some aspects of the CVR-MRI experiment: the use of BOLD at 3T with a block paradigm for the acquisition and definition of CVR as the relative change in BOLD signal per unit change in mmHg (%/mmHg). Early attempts to establish a framework for reaching consensus have recently been initiated (Bright et al., 2017). Further work is needed to reach consensus regarding signal processing and CO 2 delivery methods. CVR is also highly dependent on the image analysis methods used, including the erosion of regions of interest to avoid signal contamination from neighbouring tissues, or region of interest vs. voxel-wise analysis.

Considerations for Future Studies
Detailed reporting of methods and results is essential for interpretation and inter-study comparison of CVR data. Future publications should give sufficient detail to allow processing to be reproduced and, where possible, authors should release their software in version controlled open-source repositories. Results should preferably be reported in relative signal units to allow inter-study comparisons. Accurate recording and reporting of tolerability and reasons for excluding CVR scans is also important to facilitate clinical translation.
Non-linearity due to the limited vasodilation capacity of the blood vessels, is a consideration when interpreting CVR values. In this case, CVR reflects both the maximum response as well as the sensitivity to CO 2 . Research is needed to identify the aspects of the CVR response (e.g., maximum response, MRI response vs. EtCO 2 slope) that are most sensitive and specific in key pathologies. Accounting for voxel-wise lung-to-brain delay would allow direct comparison of the BOLD signal and EtCO 2 and should improve the accuracy of CVR values. Drift in the MRI signal can be significant and should be controlled for during signal processing .
Finally, there are age-related changes in CVR values: CVR is lower with increasing age in grey and white matter (Thomas et al., 2014;Leung et al., 2016c;Leoni et al., 2017). Statistical analyses should account for such key covariates, which requires larger sample sizes or matched study design. CVR is also associated with vascular risk factors including hypertension, diabetes, hypercholesterolemia and smoking (Haight et al., 2015;Tchistiakova et al., 2015;Sam et al., 2016b;Blair et al., 2020).

Strengths and Weaknesses
This review included foreign language papers (5/236), though one such paper was inaccessible. Most but not all of the required information was obtained during the data extraction. This might have added a bias to the results of this review: for example, description of the CVR processing and delay computation methods were not clear in 9% and 11% of the studies, respectively. Furthermore, the sample size of many studies was low (mean sample size: 35, 45/235 studies ≤10 participants), particularly in studies investigating repeatability and reproducibility of CVR values (mean: 16). This review was also restricted to human studies; therefore it does not provide a detailed description of preclinical CVR methods, although the main processing techniques are similar.

CONCLUSION
To our knowledge, this is the first systematic review to summarise and describe the diverse acquisition and analysis techniques used to measure CVR using MRI, and their applications in health and disease. While CVR-MRI is a relatively new and evolving technique we identified applications in several clinical populations including steno-occlusive and small vessel disease, highlighting the value of CVR measurements in medical research. However, acquisition techniques, analysis methods and definitions of CVR all varied substantially. Future work should target consensus recommendations to facilitate more reliable and harmonised CVR measurement for use in clinical research and trials of new therapies.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
ES performed the search, analysed the data, and prepared the manuscript. MT, JW, MS, and IM contributed to the work by discussing the eligibility and data extraction of some papers and by reviewing the manuscript. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
We would like to thank Xiaodi (Dillys) Liu, Tetiana Poliakova, and Manuel Deckart for their help in extracting the relevant information from foreign language papers.