Image-to-image generative adversarial networks for synthesizing perfusion parameter maps from DSC-MR images in cerebrovascular disease

Stroke is a major cause of death or disability. As imaging-based patient stratification improves acute stroke therapy, dynamic susceptibility contrast magnetic resonance imaging (DSC-MRI) is of major interest in image brain perfusion. However, expert-level perfusion maps require a manual or semi-manual post-processing by a medical expert making the procedure time-consuming and less-standardized. Modern machine learning methods such as generative adversarial networks (GANs) have the potential to automate the perfusion map generation on an expert level without manual validation. We propose a modified pix2pix GAN with a temporal component (temp-pix2pix-GAN) that generates perfusion maps in an end-to-end fashion. We train our model on perfusion maps infused with expert knowledge to encode it into the GANs. The performance was trained and evaluated using the structural similarity index measure (SSIM) on two datasets including patients with acute stroke and the steno-occlusive disease. Our temp-pix2pix architecture showed high performance on the acute stroke dataset for all perfusion maps (mean SSIM 0.92–0.99) and good performance on data including patients with the steno-occlusive disease (mean SSIM 0.84–0.99). While clinical validation is still necessary for future studies, our results mark an important step toward automated expert-level perfusion maps and thus fast patient stratification.

Stroke is a major cause of death or disability. As imaging-based patient stratification improves acute stroke therapy, dynamic susceptibility contrast magnetic resonance imaging (DSC-MRI) is of major interest in image brain perfusion. However, expert-level perfusion maps require a manual or semimanual post-processing by a medical expert making the procedure timeconsuming and less-standardized. Modern machine learning methods such as generative adversarial networks (GANs) have the potential to automate the perfusion map generation on an expert level without manual validation. We propose a modified pix pix GAN with a temporal component (temppix pix-GAN) that generates perfusion maps in an end-to-end fashion. We train our model on perfusion maps infused with expert knowledge to encode it into the GANs. The performance was trained and evaluated using the structural similarity index measure (SSIM) on two datasets including patients with acute stroke and the steno-occlusive disease. Our temp-pix pix architecture showed high performance on the acute stroke dataset for all perfusion maps (mean SSIM . -. ) and good performance on data including patients with the steno-occlusive disease (mean SSIM . -. ).

. Introduction
Ischemic stroke is a leading cause of death or disability worldwide. In such a situation, time is brain (1). This requires rapid decision-making in the clinical setting to ensure an optimal outcome for an affected patient. Standard treatment strategies include recanalization by mechanical or pharmacological intervention, or a combination of both (2,3). In this context, the eligibility of patients for treatment is mainly based on large cohorts of interventional trials that implement few imaging information (4,5). However, this means that some patients will not receive treatment that would be beneficial for them, and conversely, some patients will be subjected to futile treatment attempts (6). An alternative approach to improve outcomes is individualized patient stratification based on specific patient characteristics (7, 8). One of the most important techniques for this approach is perfusion-weighted imaging, a special imaging technique used in both computed tomography (CT) and magnetic resonance imaging (MRI) (9). It provides highly relevant information about (patho)physiological blood flow in and around the ischemic brain tissue (10). In MRI, the most commonly used perfusion imaging technique is dynamic susceptibility contrast (DSC) MRI (11). It measures brain perfusion by injecting a gadolinium-based contrast agent into the patient's blood (11), followed by a series of T2-or T2*weighted MRI sequences that record the flow of the contrast agent through the brain. The resulting four-dimensional image is deconvolved voxel-wise with an arterial input function (AIF) (12). The tissue concentration curve and the deconvolved curve result in interpretable perfusion parameter maps, such as the cerebral blood flow (CBF), cerebral blood volume (CBV), mean transit time (MTT), time-to-maximum (Tmax), and time-topeak (TTP) (12). These maps are different representations of the information encoded in the time-intensity curve for each voxel. For all except TTP, to derive robust and valid parameter maps, the time-intensity curve must be deconvolved with an AIF (12). Ideally, the AIF is derived for each voxel separately, but in the clinical setting, the calculation of a global AIF is preferred (12).
WHO EMRO Stroke, Cerebrovascular Accident | Health Topics.
The gold standard is the manual selection of several-usually 3 or 4-AIFs in the hemisphere contralateral to the stroke, from segments of the middle cerebral artery (12). The manual selection of AIFs is a tedious and time-consuming process that can only be performed after training (12). Therefore, automated methods whose results are subsequently reviewed by an expert are preferred in clinical practice (12).
However, the correct shape of the AIF has to be visually confirmed by an expert user. Otherwise, incorrect perfusion maps are generated, which can result in a serious diagnostic error (13). Therefore, there is a great clinical demand for novel automation approaches that provide validated, expertlevel perfusion maps without the necessity for any expert oversight. This is especially important in smaller institutions where no expert is readily available and yet a quick decision must be made whether to treat a patient with an acute stroke or even transfer them for further endovascular therapy.
One possible solution is the application of modern artificial intelligence (AI) methods based on machine learning and in this study, particularly, deep learning approaches. These have shown great promise for solving medical imaging problems in the past few years (14,15). This includes problems in stroke, such as stroke time onset prediction (16), lesion segmentation (17), and patient outcome prediction (18,19). Among deep learning applications, generative adversarial networks (GANs) are particularly promising for the generation of expert-level perfusion maps. For example, GANs can be presented both with an original image and a processed image and learn to generate the processed image from the original. This is achieved by the special architecture of GANs, and they consist of two neural networks that try to fool each other (20). One network, the generator, synthesizes a data sample such as an image, whereas the other network, the discriminator, decides whether the sample looks like a real sample or not. At the end of the training, the generated sample should resemble the original as closely as possible. For image-to-image translations, GANs are considered to be state of the art in the medical field (21,22), and a conditional GAN, such as the pix2pix GAN, can be applied (23). For example, pix2pix GANs have been successfully applied to transform MR images to CT images (cross-modal) or to transform 3T MR images to 7T MR images (intramodal) (24,25).

FIGURE
Workflow of the study. Our GAN is trained on expert-level perfusion maps. The resulting model is able to synthesize perfusion maps from unseen data without the need of manual AIF selection, at the same expert level that was present in the training data.
Given that the translation of a time series of perfusion information from source images to a single perfusion map can be seen as a highly similar medical image-to-image translation problem, GANs are a highly promising method for this use case. Preliminary work on GANs for the translation of time-series in dynamic cine applications has been published (26). Yet, to the best of our knowledge, no study has investigated the generation of DSC perfusion images from perfusion source data so far. The use of GANs would also present a new advantage: Since we use the final perfusion map for training, the GAN would not simply copy the map generation algorithm but would merge the map generation algorithm and the optimal AIF placement information into one algorithm. The GAN system would thus be able to generate perfusion maps even on images where manual AIF placement is not possible, such as due to motion artifacts, which are quite common in acute stroke (13).
Thus, we propose a modified slice-wise pix2pix GAN with a temporal component (temp-pix2pix-GAN) to account for the time dimension in DSC source perfusion imaging. Our GAN model automatically generates perfusion parameter maps in an end-to-end fashion. We train our model on expert-level perfusion parameter maps (see Figure 1). The performance of our temp-pix2pix-GAN model is compared to a standard pix2pix GAN without a temporal component. We train and test our approach on two different datasets including patients with acute stroke and those chronic cerebrovascular disease, and perfusion data with motion artifacts.
. Materials and methods

. . Data
In total, 276 patients were included in this study. Of which, 204 patients from a study performed at Heidelberg University Hospital suffered from acute stroke. Imaging was performed with a T2*-weighted gradient-echo EPI sequence with fat suppression TR = 2,220 ms, TE = 36 ms, flip angle 90 • , field of view: 240 x 240 mm 2 , image matrix: 128 x 128 mm, and 25-27 slices with ST of 5 mm and was started simultaneously with bolus injection of a standard dose (0.1 mmol/kg) of an intravenous gadolinium-based contrast agent on 3 Tesla MRI systems (Magnetom Verio, TIM Trio and Magnetom Prisma; Siemens Healthcare, Erlangen, Germany). In total, 50-75 dynamic measurements were performed (including at least eight prebolus measurements). Bolus and prebolus were injected with a pneumatically driven injection pump at an injection rate of 5 ml/s. The study protocol for this retrospective analysis of our prospectively established stroke database was approved by the Ethics Committee of Heidelberg University, and patient informed consent was waived.
A total of 72 patients with steno-occlusive disease were included in the PEGASUS study (27). A total of 80 whole-brain images were recorded using a single-shot FID-EPI sequence (TR = 1,390 ms, TE = 29 ms, and voxel size: 1.8 x 1.8 x 5 mm 3 ) after automatic and synchronized injection of 5 ml Gadovist (Gadobutrol, 1 M, Bayer Schering Pharma AG, Berlin) followed .

FIGURE
Rescaling of the DSC source images. The DSC source images need to be rescaled to the same dimension to be suitable for a machine learning-based analysis. For this, all images that consisted of less than time points were rescaled by copying the image of the last time point, where the contrast agent had already left the brain tissue. Essentially, this simply prolongs the baseline and has no e ect on the parts of the time series that contain relevant information. by 25 ml saline flush by a power injector (Spectris, Medrad Inc., Warrendale PA, USA) at a rate of 5 ml/s. The acquisition time was 1:54 min. All patients gave their written informed consent, and the study has been authorized by the Ethical Review Committee of Charité -Universitatsmedizin Berlin. DSC post-processing was performed blinded to clinical outcome.
For the acute stroke data from Heidelberg, DSC data were post-processed with Olea Sphere R (Olea Medical, La Ciotat, France) in-house at the stroke center in Heidelberg, and automatic motion correction was applied. Raw DSC images were used to calculate perfusion maps of time-to-peak (TTP) from the tissue response curve. Maps of cerebral blood flow (CBF), cerebral blood volume (CBV), mean transit time (MTT), and time-to-maximum (Tmax) were created by deconvolution of a regional concentration-time curve with an arterial input function (AIF). Block-circulant singular value decomposition (cSVD) deconvolution was applied. The arterial input function (AIF) was detected automatically. All AIFs were visually inspected by a neuroradiology expert (MAM, with over 6 years of experience in perfusion imaging), and only in two cases, the automatically detected AIF needed to be manually corrected.
For PEGASUS patients, DSC data were post-processed with PGui software (version 1.0, provided for research purposes by the Center for Functional Neuroimaging, Aarhus University, Denmark). Motion correction was not available. Raw DSC images were used to calculate perfusion maps of TTP from the tissue response curve. Maps of CBF, CBV, MTT, and Tmax were created by deconvolution of a regional concentration-time curve with an AIF. Parametric deconvolution was applied (28). For each patient, an AIF was determined by a junior rater (JB, with 2 years of experience in perfusion imaging) by manual selection of three or four intravascular voxels of the MCA M2 segment contralateral to the side of stenosis minimizing partial volume effects and bolus delay. The AIF shape was visually assessed for peak sharpness, bolus peak time, and amplitude width (12, 29). The AIFs were inspected by a senior rater (VIM, with over 12 years of experience in perfusion imaging).
The acute stroke data were resized to 21 horizontal slices each containing 128 x 128 voxels. To make the DSC source images suitable for our machine learning models, they were rescaled to 80 time points by copying the last slice until there were 80 slices in total (see Figure 2). All images of one parameter map and the DSC source images were normalized between −1 and +1 to ensure the stability of the GAN and then split into horizontal slices.
The post-processed data were split into training (acute stroke data: 142 patients, PEGASUS: 50 patients), validation (acute stroke data: 20 patients, PEGASUS: eight patients), and test (acute stroke data: 41 patients, PEGASUS: 12 patients) sets. Since the patient cohorts were different for the acute stroke dataset and the PEGASUS dataset consisting of patients with steno-occlusive disease, separate models were trained on the datasets. For optimization, hyperparameters such as the learning rate were selected based on visual inspection of the generated .
/fneur. . images and the performance on the validation test (for details, see Section 2.4). The generalizable performance was estimated by the performance of the test set.

. . General methodological approach
We utilized a special type of AI model, which was developed for generating an image based on the input of another image: the pix2pix GAN (23). A pix2pix GAN consists of two neural networks that try to mislead each other. The first network, the generator, aims to produce realistic looking images based on another image (e.g., produce a CT based on an MR image), whereas, the second network, the discriminator, tries to distinguish between the generated image and real image. Based on the discriminator's feedback, both networks get better in their respective tasks.
Typically, the input and output to a pix2pix GAN generator are a 2D image. For this use case, we modified the pix2pix GAN to take a 3D image (time sequence of the 2D DSC source image) as an input and synthesize the corresponding 2D perfusion map slice (e.g., Tmax). In this work, we implemented two different generator architectures. The first architecture, the classical pix2pix GAN, took in the 3D input image without accounting for the temporal relation between the images. In contrast to that, the second architecture, the temp-pix2pix GAN, was designed to first extract the temporal relation between the images followed by the transformation to the output image (see Figure 3). Both architectures were first trained on the acute stroke dataset. The resulting models served as weight initialization for the models trained on the PEGASUS dataset. In the following, the technical details of the two approaches are described in depth.

. . Network architecture
The GAN architecture was adapted from the pix2pix GAN (23). In our first architecture, we utilized the original U-Net generator as proposed in the paper with the time steps being .
represented in the channels. For the second architecture, we modified the U-Net by adding 3D temporal convolutions before feeding the result into the U-Net in the generator (see Figure 3). Both GAN architectures consisted of two neural networks: the generator G and the discriminator D. On the one hand, the generator's task was to synthesize perfusion parameter maps such as Tmax or CBF from the DSC source image. The discriminator, on the other hand, learned to distinguish between the real DSC source image together with the real perfusion parameter map and the real DSC with the generated perfusion parameter map.
In general, the objective function of a conditional GAN such as the pix2pix GAN is as follows: (1) Where x is the input image (DSC source in our case) and y is the output image (for example, Tmax) and z is a noise vector, which is implicitly implemented as a dropout in the generator architecture (23). The generator tries to maximize the objective which is achieved when the discriminator outputs a high probability of the generated image pair being real and a low probability for the real image pair, respectively. In contrast to that, the discriminator tries to minimize this objective, and identify the real input images. The pix2pix GAN does not directly incorporate the noise vector z but introduces noise in the network using dropout in the generator.
The loss of the generator consisted of two parts. The first part was the adversarial loss, which took into account the feedback of the discriminator as described earlier. In addition, a reconstruction loss directly penalized deviation from the original image using the L1 norm. This second loss was added to the adversarial loss, and both were weighted by 1 after testing different weightings.
Two different GAN variants, the pix2pix GAN and the temp-pix2pix GAN, were implemented. They both differed in the generator's architecture and the generator's input representation.
The pix2pix generator was a 2D U-Net with six downsampling and upsampling layers (see Figure 3B). One DSC source slice at a time was fed as an input to the generator. The different time points of the DSC were concatenated in the channel dimension leading to 3D input data (channel, image height, and image width). Each downsampling layer consisted of a convolutional layer, batch normalization layer, and a LeakyReLU with slope 0.2, and the upsampling layers of ConvTranspose-layers, batch normalization, and a ReLU activation. After the last convolution, a tanh was applied.
In contrast to that, the generator of the temp-pix2pix GAN took one slice of the DSC source at all time points as an input. Here, the time points were represented in a separate time dimension leading to a 4D input (channel, image height, image width, and time). The time sequence of slices was then fed through six 3D convolutions over the time dimension iteratively reducing this dimension to 1. Each convolutional layer was followed by a batch normalization layer and a LeakyReLu with slope 0.2. After the temporal path, the output is fed into a 2D U-Net with convolutions over the spatial dimensions with six downsampling and upsampling layers in an early fusion approach as shown in Figure 3C.
The discriminator adapted the architecture of the discriminator from the PatchGAN as suggested by Isola et al. (23). It consisted of three convolutional layers with batch normalization and a LeakyReLU activation function followed by another convolutional layer and a sigmoid activation function (see Figure 3D). For both the generator and discriminator, the kernel size was four with strides of two.

. . Training
For each architecture, five GANs were trained on the acute stroke dataset from Heidelberg for each of the five parameter maps (CBF, CBV, MTT, Tmax, and TTP). The models were trained with a learning rate of 0.0001 for both the generator and discriminator using the Adam optimizer with β 1 = 0.5 and β 2 = 0.999. The networks were trained for 100 epochs at which point convergence of the two networks was achieved. The batch size was 4 and the dropout was 0. As the PEGASUS dataset was smaller, the models trained on the acute stroke dataset served as a weight initialization for the PEGASUS models and were then only fine-tuned for additional 50 epochs. Thus, in total, 10 models were trained per architecture.
All hyperparameters mentioned earlier were tuned and selected according to visual inspection of the generated images and the performance on the validation set. Due to the computational limitations, an automated search was not feasible. The code was implemented in PyTorch and is publicly available . The models were trained on a TESLA V100 GPU (NVIDIA Corporation, Santa Clara, CA, USA).

. . Performance evaluation
The generated images were first visually inspected. In addition, four metrics were applied: the mean absolute error (MAE) or L1 norm of the error, the normalized root mean squared error (NRMSE), the structural similarity index measure (SSIM), and the peak-signal-to-noise-ratio (PSNR). These metrics are standard performance measures for pairwise image comparison and were selected to better compare with existing studies (30).
The MAE is defined voxel-wise and measures the average absolute of the error between the real image y and the generated https://github.com/prediction /DSC-to-perfusion Frontiers in Neurology frontiersin.org . /fneur. . imageŷ: The NRMSE is defined as the root mean squared error normalized by average euclidean norm of the true image y: The SSIM is defined as a combination of luminance, contrast, and structure and can be summed up as follows: Where µ y and µŷ are the average values of y andŷ, respectively, σ y is the variance, and σ yŷ is the covariance. c 1 and c 2 are constants for stabilization and defined as c 1 = (k 1 L) 2 and c 2 = (k 2 L) 2 , with L being the dynamic range of the pixel values and k 1 , k 2 ≪ 1 small constants. The higher the SSIM, the more similar are the two images with 1 denoting the highest similarity. The PSNR is defined as follows: MAX I is the maximal possible pixel/voxel value. It describes the ratio between the maximal possible signal power and noise power contained in the sample.
The performance metrics of each trained temp-pix2pix GAN were compared to the pix2pix GAN performance using the paired Wilcoxon signed-rank test. For p < 0.05, the difference between performances was considered statistically significant.

. Results
Visual inspection of the results of the acute stroke dataset showed that the perfusion parameter maps generated by the temp-pix2pix GAN looked similar to the ground truth (see Figure 4). For the pix2pix model, on the contrary, only CBF, .

FIGURE
Mean performance metrics for evaluating the similarity between the ground truth and the synthesized parameter maps generated by the pix pix GAN (green) and the temp-pxi pix GAN (blue) on the acute stroke dataset. (A, B) Show the mean absolute error (MAE) and normalized mean root squared error (NRMSE), respectively (the lower, the better). (C, D) Show the structural similarity index measure (SSIM) and the peak-signal-to-noise-ratio (PSNR) (the higher, the better). For all parameter maps, the temp-pix pix architecture shows a better or comparable performance compared with the pix pix GAN. For the time-dependent parameter maps such as Tmax and TTP, the di erence between the pix pix and temp-pix pix GAN performance is larger than the other three maps. The error bar represents the standard deviation.
CBV, and MTT were of sufficient quality, whereas the timedependent parameters TTP and Tmax did not consistently resemble the ground truth (also Figure 4). The quantitative analysis in the acute stroke dataset revealed for all parameter maps a high SSIM ranging from 0.92 to 0.99 for the temp-pix2pix model ( Figure 5). In contrast to this, the pix2pix GAN showed a comparable or worse SSIM ranging from 0.86 to 0.98. A performance difference between the pix2pix and temp-pix2pix models was especially prominent for Tmax and TTP (SSIM 0.92 vs. 0.86 and 0.95 vs. 0.91, respectively). The temp-pix2pix model showed a significantly better performance than the pix2pix model (p < 0.05) for all metrics and perfusion maps, except for the MAE of CBV (MAE of 0.009 for both models, p = 0.53).
For the PEGASUS dataset, the perfusion maps generated by both the fine-tuned pix2pix and temp-pix2pix GAN look similar to the ground truth (see Figure 6). For both networks, MTT appeared to be the least well-reconstructed parameter map, which is also reflected in the metrics (Figure 7). Furthermore, the high intensities of Tmax were not well-captured by the pix2pix GAN ( Figure 6). The performance metrics of the pix2pix and temp-pix2pix GAN and the ground truth for the PEGASUS dataset showed a low error and high SSIM and PSNR for CBF, CBV, and Tmax. Here, for most metrics, the temp-pix2pix GAN achieved a slightly better performance in contrast to the pix2pix GAN. For MTT and TTP, the temp-pix2pix showed a better performance compared with the pix2pix GAN (SSIM 0.84 vs. 0.78 and 0.86 vs. 0.82, respectively). Overall, the metrics of the synthesized MTT and TTP maps obtained a worst performance compared with the other parameter maps. Again, the temp-pix2pix model showed a significantly better performance than the pix2pix model (p < 0.05) for most metrics and perfusion maps, except for the PSNR and NRMSE of Tmax (PSNR of 37.28 vs. 37.83, p = 0.29; NRMSE of 0.032 vs. 0.031, p = 0.10). Figure 8A shows two patients whose generated parameters showed the worst performance. For the acute stroke dataset, these are two Tmax maps ( Figure 8A, first and second columns), whereas, the generated Tmax in the first column did not capture the high intensities well, the generated map in the second column visually looked well. For the PEGASUS models, MTT performed the worst ( Figure 8A, third and fourth columns). In the third column, the generated MTT appears less noisy than the ground truth. In contrast to that, in the fourth column, the generated MTT map looked noisier compared with the ground . /fneur. . truth. Figure 8B shows that the Tmax maps generated by the temp-pix2pix and pix2pix GAN for four patients for which an AIF could not be placed.

. Discussion
In the present study, we propose a novel pix2pix GAN variant with temporal convolutions-coined temp-pix2pix-to generate expert-level perfusion parameter maps from DSC-MR images in an end-to-end fashion for the first time. The temp-pix2pix architecture showed high performance in a dataset of patients with acute stroke and good performance on data of patients with the chronic steno-occlusive disease. Our results mark a decisive step toward the automated generation of expertlevel DSC perfusion maps for acute stroke and their application in the clinical setting.
This requires rapid decision-making in the clinical setting to ensure an optimal outcome for an affected patient. While automated methods for parameter map generation have shown inconclusive results in the literature (26,(31)(32)(33)(34)(35), they are successfully used in acute stroke to identify stroke-affected tissue. In our sample, this was confirmed as the automatically derived arterial input functions only required expert adjustment in 2 out of 204 patients in the acute stroke set. Nevertheless, this approach still requires a manual check resulting in a time delay of a few minutes per patient before patient stratification. As a consequence, there is a major clinical need for automated methods that provide final perfusion parameter maps without any manual input. Here, we chose a GAN AI approach, as presenting this methodology expert-level perfusion maps would lead to a model after training that could then generate expertlevel perfusion maps, implicitly encoding the choice of AIFs within ∼ 1.8 s per patient (speed of our GAN). Our exploratory results show that this approach was successful.
This may have a positive impact on the clinical setting. First, it would eliminate the need for a manual review of AIFs. This would reduce the time needed to calculate perfusion parameter maps and also reduce resource requirements as radiologists and neurologists would no longer need to be trained on how to identify optimal AIFs. Second, as we have shown, it is even possible to calculate parameter maps for patients who currently have to be excluded due to motion artifacts that make it impossible for the standard software to calculate the parameter .

FIGURE
Mean performance metrics for evaluating the similarity between the ground truth and the synthesized parameter maps generated by the pix pix GAN (green) and the temp-pxi pix GAN (blue) on the PEGASUS dataset. (A, B) Show the mean absolute error (MAE) and normalized mean root squared error (NRMSE), respectively (the lower, the better). (C, D) Show the structural similarity index measure (SSIM) and the peak-signal-to-noise-ratio (PSNR) (the higher, the better). For most metrics and parameter maps, the temp-pix pix architecture shows a better performance compared with the pix pix GAN. In terms of the metrics, the generated MTT maps showed the worst performance. The error bar represents the standard deviation.
maps. At this point, it is important to emphasize that our study is exploratory, and the generated model is only used for internal research purposes. This is due to the fact that the generative AI has fundamentally learned to approximate the non-AI algorithm that was originally used to calculate the perfusion parameter maps. To maximize clinical impact, we thus encourage the developers and vendors of relevant clinically used perfusion software to consider adding GAN-based automated perfusion calculation modules to their products. To facilitate this process, we have made our code publicly available. One of the most important contributions of our approach was the consideration of the temporal dimension of the time series input. Not surprisingly, the temp-pix2pix architecture performed better than the pix2pix GAN without a temporal component in both datasets. This was particularly noticeable in the acute stroke dataset for parameters directly related to the correct order of the time-intensity curve, namely TTP and Tmax. Maps of CBF, CBV, and MTT (derived by the central volume theorem as CBV/CBF) also performed quite well in the baseline architecture without a temporal component, as for these maps, the order of input is not relevant. This is because CBV corresponds to the area under the time-intensity curve and CBF is calculated based on the height of the slope, which are indifferent to the order. In the chronic stroke dataset, the temp-pix2pix also outperformed the baseline GAN without a temporal component. However, the difference in performance was not as pronounced as in the acute stroke dataset. This could be due to the fact that patients with acute vascular obstruction usually have significantly higher delays than patients with chronic stenoocclusive disease, and the performance advantage of temp-pix2pix increases with increasing delay. It is noteworthy that in contrast to the patients with acute stroke in the chronic stenoocclusive cohort, MTT and TTP maps performed worse than the other parameter maps. This might be related to the more complex perfusion pathophysiology in chronic steno-occlusive disease. Whereas in acute stroke, delay is the main contributor to blood flow abnormalities, and in chronic steno-occlusive disease, it is the sum of delay and considerable dispersion due to vessel abnormalities (36). This could pose particular difficulties for neural networks to learn the relationships required to create parameter maps: MTT is a parameter that depends on two other parameters (CBV and CBF) in the original software solutions, which are likely to have greater variability in chronic stenoocclusive disease. In addition, TTP delays are attributable to .
/fneur. . Columns three and four show MTT for two PEGASUS patients. While the generated image in the third column shows less noise than the ground truth, the GAN introduced noise in the fourth column in the synthesized image. (B) Four Tmax maps generated by temp-pix pix (upper row) and pix pix (lower row) for cases from the acute stroke data for which no AIF could be computed, and thus with conventional methods not imaging would be available. To get an estimate of the true lesion, h CT is plotted in the lower row. Note that for the CT images, only D visualizations were available. Thus, they are not aligned with the perfusion parameter map. Since motion artifacts a ect the quality of the time series, in these cases the baseline pix pix performs better than the temp-pix pix.
both delay and dispersion, with varying weights in individual patients leading again to a larger variability (this effect is much less pronounced in Tmax parameter maps due to the deconvolution procedure). Such increased variability might lead to less stable models and thus increased noise in the generated maps.  (41). In a study by Hess et al. (40), they reported the performance in terms of MAE with clipping to not account for noise. The generated Tmax achieved an MAE with clipping of 0.524 compared to our approach showing an MSE of 0.016. These differences compared to our study might be due to the novel use of the GAN method and the fact that our model considered whole slices instead of patches to better account for the spatial dimension. Of course, these comparisons need to be additionally evaluated as they might depend on differences in training and test data. It is also worth noting that AIFs are known to exhibit considerable variation (42,43) and the predictive performance of perfusion image maps can be influenced by the AIF shape (44). We performed a cursory visual inspection of the AIFs of the patients in the test set of our study but could not find any striking correlation between the AIF shape and the performance metrics of a generated image. However, all AIFs in our study were selected and reviewed by well-trained and experienced staff, so differences between AIFs could be rather subtle. Therefore, a potential relationship between the shape of the AIF and the performance of the generated image can probably only be detected in a quantitative analysis where the AIFs are parameterized and numerically compared with performance metrics. This is beyond the scope of our study but is a very interesting approach for further research.
Our study has several limitations. First, our network was based on 2D slices instead of the full 3D volumes due to computational restrictions. It is likely that the results could be improved further using the full 3D images. Second, our study is an exploratory hypothesis generating study. Its results need to be clinically validated in a future study before integrating into clinical practice would be possible. This includes clinical evaluation metrics beyond voxel-wise comparison such as comparing manually segmented lesion volumes. Furthermore, due to data availability, this analysis was only performed on MR data and did not include CT perfusion images. Lastly, our approach so far is a black-box approach. It could be extended with explainable AI to generate insights into which areas in the source images are particularly relevant for the creation of different perfusion parameter maps. This could further elucidate the causes of the performance differences between the maps that we identified and could guide the way for further improvements.

. Conclusion
We generated expert-level perfusion parameter maps using a novel GAN approach showcasing that AI approaches might have the ability to overcome the need for oversight by medical experts. Our exploratory study paves the way for fully-automated DSC-MR processing for faster patient stratification in acute stroke. In the clinical setting where time is crucial for patient outcome, this could have a big impact on standardized patient care in acute stroke.

Data availability statement
The datasets presented in this article are not readily available because data protection laws prohibit sharing of the PEGASUS and acute stroke datasets at the current time point. Requests to access these datasets should be directed to the Ethical Review Committee of Charite Universitätsmedizin Berlin, ethikkommission@charite.de.

Ethics statement
The studies involving human participants were reviewed and approved by Ethics Committee of Heidelberg University and Ethical Review Committee of Charité-Universitätsmedizin Berlin. The patients/participants provided their written informed consent to participate in this study.

Author contributions
TK designed and conceptualized study, implemented the networks and ran the experiments, interpreted the results, and drafted the manuscript. VM collected data, designed and conceptualized study, interpreted the results, and drafted the manuscript. MM collected and processed data, interpreted the results, and revised the manuscript. AHe and KH advised on methods and revised the manuscript. JB processed data and revised the manuscript. CA advised on conceptualization and revised the manuscript. AHi revised the manuscript. JS and MB collected data and revised the manuscript. DF acquired funding, advised on conceptualization, and revised the manuscript. All authors contributed to the article and approved the submitted version.