Assimilation of SEVIRI Water Vapor Channels With an Ensemble Kalman Filter on the Convective Scale

Numerical weather prediction (NWP) systems at the convective scale are operated to gain reliable forecasts for diverse atmospheric variables at high spatial resolution. Especially for the prediction of small-scale weather phenomena such as deep convection including the associated precipitation patterns and wind gusts the high-resolution models provide additional benefit over coarser scale models. In this context the distribution of atmospheric humidity plays an important role, however conventional observations of atmospheric humidity are sparse in space and time. The present work aims at the assimilation of water vapor channel radiances of the satellite instrument SEVIRI in an operational framework based on a Local Ensemble Transform Kalman Filter (LETKF) and a convection permitting NWP model. This article describes all the essential elements for a successful incorporation of this kind of data into the system, from the application of a cloud filtering technique over bias correction and vertical localization of the radiance observation. Data assimilation experiments over two 4-week periods show a neutral to slightly positive impact of SEVIRI radiances on upper-air relative humidity and wind speed forecasts.

Numerical weather prediction (NWP) systems at the convective scale are operated to gain reliable forecasts for diverse atmospheric variables at high spatial resolution. Especially for the prediction of small-scale weather phenomena such as deep convection including the associated precipitation patterns and wind gusts the high-resolution models provide additional benefit over coarser scale models. In this context the distribution of atmospheric humidity plays an important role, however conventional observations of atmospheric humidity are sparse in space and time. The present work aims at the assimilation of water vapor channel radiances of the satellite instrument SEVIRI in an operational framework based on a Local Ensemble Transform Kalman Filter (LETKF) and a convection permitting NWP model. This article describes all the essential elements for a successful incorporation of this kind of data into the system, from the application of a cloud filtering technique over bias correction and vertical localization of the radiance observation. Data assimilation experiments over two 4-week periods show a neutral to slightly positive impact of SEVIRI radiances on upper-air relative humidity and wind speed forecasts.

INTRODUCTION
Weather forecasts around the world are based on numerical weather prediction (NWP). National weather services operate different NWP systems with different data assimilation (DA) methods. Widespread DA techniques in operational frameworks are variational techniques, i.e., 3DVar and 4DVar (Rabier et al., 2000;Fischer et al., 2005) and, more recently, ensemble Kalman filters (Bonavita et al., 2010;Schraff et al., 2016). These techniques merge short-range forecasts from an atmospheric model ("first guess") and a large number of different observations to gain an optimal initial state for the atmospheric state variables of the forecast model.
The observations used are on the one hand in situ observations of pressure, temperature, horizontal wind or humidity measured close to the surface, e.g., by weather stations or buoys, or in the upper atmosphere by radiosondes and air planes. These observations have a specific spatial location. They are complemented by radiances measured by satellite instruments. These are nonlocal observations, since they depend on the meteorological conditions in the atmosphere and are measured at the top of the atmosphere as an integral measure over the atmospheric column. Over the last two decades, the use of satellite data has evolved significantly and nowadays they are a main contributor to the quality of global forecasts (Bauer et al., 2010(Bauer et al., , 2011Buehner et al., 2015;Geer et al., 2018;In-Hyuk et al., 2018).
Limited area weather prediction systems for regional weather forecast often make less use of satellite observations. Polarorbiting satellites provide a large set of observations on the global scale whereas they visit each earth region twice a day only (Montmerle et al., 2007). For this reason data from geostationary satellites are of particular interest to operational regional NWP (Szyndel et al., 2005;Stengel et al., 2009;Guedj et al., 2011) as they provide observations at high spatial and temporal resolution.
The aim of the present work is to introduce the assimilation of geostationary satellite radiances from the Spinning Enhanced Visible and Infrared Imager (SEVIRI) instrument on Meteosat Second Generation (MSG) into the operational convective scale NWP system at Deutscher Wetterdienst (DWD). It is based on an ensemble Kalman filter and covers Germany and neighboring countries. With seamless prediction (Blahak et al., 2018) being a main focus of current developments, the representation of convective initiation and the positioning of convective cells with the greatest possible accuracy from the initialization of the forecast becomes increasingly important. The accurate analysis of the pre-convective environment requires to capture the large-scale moisture and temperature distribution from the boundary layer to the upper troposphere. Hence the present work focuses on the SEVIRI infrared water vapor channels providing a high horizontal and temporal density allowing for a better high-resolution constraint of the mid-and uppertropospheric moisture distribution in addition to the rather sparsely distributed radiosonde humidity information used up to now.
An important issue for radiance assimilation is the treatment of cloudy areas. While advances are being made to incorporate cloud information in the assimilation (Stengel et al., 2013;Geer et al., 2018;Gustafsson et al., 2018) many issues remain, especially in the infrared spectrum linked to inaccuracies and biases which are caused by deficits in the model parameterization and the representation of clouds in radiative transfer models. These are known to lead to errors and biases in the model equivalent radiances. To this end, in this work we follow a clear-sky approach which neglects cloud-affected observations. We have applied a new clear-sky method to screen out cloud-affected observations (Harnisch et al., 2016). Several previous studies have investigated the assimilation of cloud information using satellite derived products (Schomburg et al., 2015) or a bias correction scheme (Otkin and Potthast, 2019) in a similar data assimilation setting. While those studies were done in a research context and run over a short time period, the present study targets an operational implementation involving longer periods covering variable weather situations.
An additional issue investigated here is linked to the fact that the implementation of the Local Ensemble Transform Kalman Filter (LETKF)  requires the vertical localization of radiance observations, which are integrals over the atmosphere and thus inherently non-local. This problem is tackled using a new localization approach with observation-dependent localization radii.
To our best knowledge, the present work is the first to show a successful satellite data assimilation within an LETKF in a regional convective scale model over land in an operational setting including experiments in variable weather situations occurring during longer trial periods. The work points out which steps are necessary to gain a positive impact on forecasts. Section 2 provides an overview on the operational regional data assimilation setup at DWD and describes the cloud screening and bias correction of the satellite data. The results section 3 derives and applies an improved vertical localization method and describes the results of the assimilation trials. The data assimilation has been applied to clear-sky satellite data selected by the new method. The final discussion in section 4 puts the gained results into context and wraps up the work.

METHODS
This section introduces the data assimilation elements, such as the data assimilation techniques, the NWP model, the observations used and the weather situations under study.

Kilometre-Scale ENnsemble Data Assimilation (KENDA) Framework
The KENDA system (Schraff et al., 2016) is operational at DWD on a regional scale. It currently consists of the limited-area model COSMO (COnsortium for Small-scale Modeling, Baldauf et al., 2011), a 4D implementation of the Local Ensemble Transform Kalman Filter (LETKF, Hunt et al., 2007) and Latent Heat Nudging (LHN, Stephan et al., 2008) of radarderived precipitation. The present study employs KENDA with the COSMO-DE configuration (code version 5.4d 1 ) covering Germany and parts of neighboring countries as depicted in Figure 6. It has a 2.8 km horizontal grid spacing and 51 hybrid vertical layers up to 22 km (at ∼40 hPa). Different to Schraff et al. (2016), the COSMO-DE model is embedded by oneway nesting into the larger-scale European ICOsahedral Nonhydrostatic model (ICON-EU, Reinert et al., 2019) at 6.5 km (deterministic run) and 20 km (ensemble run) grid spacing. ICON-EU is two-way nested into the global ICON model and has the double spatial resolution of the global model configuration. At the top of the atmosphere the embedded COSMO-DE fields relax to the ICON-EU fields in a sponge layer from the top down to a height of 11 km (∼235 hPa).
The DWD runs operationally L = 40 ensemble members together with a deterministic run at a regional and global scale (Reinert et al., 2019). The COSMO-DE model receives the hourly ensemble and the deterministic model fields from the ICON-EU as lateral boundary conditions run every 3 h. The LETKF data assimilation step with a 1-hourly cycle incorporates in-situ observations from radiosondes (temperature, relative humidity, wind direction and wind speed typically available at every 6 or 12 h), air planes (temperature, humidity and horizontal wind frequently available), surface weather stations (pressure and horizontal wind available hourly) and wind profiler (horizontal wind available every 15-30 min). During the COSMO model runs between the 1-hourly LETKF steps, LHN allows the assimilation of radar-derived precipitation rates, which are available every 5 min. Spin-up time is estimated to ∼1 h but is not detrimental to the forecasts. The data assimilation considers observations from the surface to a maximum height at 300 hPa. Until today, no satellite observations are assimilated operationally in the KENDA system.
The LETKF estimates the background error covariance P b ∈ R N×N in the N−dimensional model space by backgroundensemble perturbations of number L, i.e., Herex b is the background ensemble mean. After the coordinate transformation from physical space to ensemble space x = x b + X b w the analysis minimizes the weighted distance to the background and to observations in ensemble space in the new coordinate w ∈ R L . Here denotes the observation error covariance matrix with the number of observations S, y ∈ R S are observations,ȳ b ∈ R S are the model equivalent mean of the background ensemble members in observation space. The matrix Y b ∈ R S×L is the equivalent of X b in observation space. The model equivalents in observation space are computed from the model field variables by applying the observation operator The minimization of the cost function (2) yields the analysis ensemble meanw a and a square-root filter-ansatz yields the analysis ensemble member perturbations {w a,l } . Then the analysis ensemble members in physical space read see Hunt et al. (2007) and Schraff et al. (2016) for more details. It is noted that the computation of the analysis in ensemble space, i.e., the transform vectorsw a and w a,l , requires only the application of the full non-linear observation operators and not of any tangent-linear and adjoint operators, and it makes use of the 4-dimensional ensemble background error covariances in observation space. The control vector, i.e., the model variables finally updated by Equation (3), can be chosen to comprise the complete set or only a sub-set of the prognostic variables.
Operationally and also in the present study, the control vector consists of the 3-dimensional wind vector, temperature, pressure, water vapor, cloud water, and cloud ice, but does not include rain, snow, graupel, or surface and soil variables. In addition to the computation of the ensemble, KENDA computes an unperturbed deterministic atmospheric analysis field with the Kalman gain matrix Since the Kalman gain matrix depends on X b the ensemble influences the evolution of the deterministic run. The deterministic run can evolve on a grid with higher spatial resolution by interpolation with an operator T, see Equation (4), but this is not adopted in the present study.
In the ensemble Kalman filter, the background covariance matrix P b is rank-deficient due to L ≪ N in Equation (1) which leads to spurious correlations in P b . Spatial localization in the ensemble Kalman filter has been found to be beneficial (Hamill et al., 2001;Greybush et al., 2011;Perianez et al., 2014;Houtekamer and Zhang, 2016;Necker et al., 2020). Since the LETKF estimates the analysis in ensemble space and not in physical space, it is necessary to perform the localization in observation space. Hunt et al. (2007) proposed to localize by increasing the observation error in the matrix R dependent on the distance between analysis grid point and observation. Our implementation follows this approach (Schraff et al., 2016) and employs adaptive horizontal localization for insitu observations (Perianez et al., 2014). Since satellite data outnumber by far the number of in-situ observations, they would dominate the few in-situ observations. To avoid this effect, we have fixed the radius of the horizontal localization for satellite data to the relatively small value of 35 km, while the adaptive localization for in-situ observations remains unchanged. The vertical localization of satellite data obeys a different reasoning and is discussed in section 2.4. We point out, that the analysis update is computed in the localization box only. Hence only the observations present in the box are considered in the analysis update. This aspect is important to remember when extending the vertical localization radius by a new method discussed in section 3.1.
The SEVIRI water vapor channels exhibit an intrinsic signalto-noise ratio yielding a technical measurement error of ∼0.5 K. The observation error covariance matrix R takes into account such intrinsic measurement errors, but also comprises errors in the radiation transfer model, i.e., the observation operator H, and representativeness errors. These latter components are in fact larger than the measurement error. Our various data assimilation studies with KENDA have resulted in optimal observation error settings of 2.0 K for channel 6.2 µm and 3.0 K for channel 7.3 µm.

Additive Covariance Inflation
The ensemble Kalman filter underestimates the forecast error covariance matrix due to limited ensemble size or model errors (Anderson and Anderson, 1999). This problem is often addressed by covariance inflation (Hamill and Whitaker, 2011;Luo and Hoteit, 2013;Houtekamer and Zhang, 2016;Zeng et al., 2019). In addition to the covariance inflation described in Schraff et al. (2016), i.e., multiplicative covariance inflation and relaxation to prior perturbations, the present work implements additive covariance inflation (Mitchell and Houtekamer, 2000;Zeng et al., 2019). The analysis ensemble x a in (3) is modified by additive noise with covariance f add B. Here f add > 0 is a scalar inflation factor and B is the climatological error covariance matrix used in the global operational data assimilation at DWD and interpolated to the regional model grid. Consequently the analysis ensemble reads The matrix B 1/2 is the square root of B with B = B 1/2 (B 1/2 ) t . The vector ξ denotes random Gaussian-distributed noise with vanishing mean and unity variance. In the employed implementation, the inflation factor is f add = 0.01 for an assimilation cycle with hourly updates.

Weather Situation in Study Periods
The proposed assimilation improvements have been tested for two different periods of several weeks representing a range of weather conditions over Germany and neighboring regions. The first is a summer period in the year 2016 from May 26, 2016 to June 30, 2016. It exhibited extreme weather situations ranging from highly convective situations with strong precipitation events including flash floods and thunder storms primarily from end of May for 3 weeks to hot and dry weather primarily toward end of June 2016 (Piper et al., 2016;Necker et al., 2018). The cloud mask analysis in Figure 7 gives further details on the occurrence of cloud types during the summer period. The second period is a winter period in 2016 from December 1 to December 31 with primarily high-pressure areas with low-level inversions yielding several days with fog and low stratus in local areas. Most days were sunny but cold otherwise, and December was the driest month in the year (Friedrich and Breyer, 2016;Hoffmann, 2017).

SEVIRI Satellite Data and NWCSAF Cloud Mask
We have employed data observed by the SEVIRI instrument on the geostationary Meteosat-10 MSG satellite. SEVIRI is a line-byline scanning radiometer and provides image data in 12 channels in the visible, near-infrared and infrared spectral range. The instrument has a baseline repeat cycle of 15 min and a spatial imaging sampling distance of about 5 km over Germany (at nadir 3 km). The present work considers data from the two water vapor channels at wavelengths 6.2 and 7.3 µm at a 1-hourly temporal resolution. In this spectral range water vapor distributed in the atmosphere absorbs radiation and hence reduces radiation from lower levels observed at the top of the atmosphere. The water vapor channels are also highly sensitive to clouds.
Satellite observations at a small horizontal spatial scale may not reflect well the model dynamics on a larger scale. Bierdel et al. (2012) have shown that the COSMO-DE model features an effective horizontal resolution of 11 − 14 km.
This stipulates observations on that scale. We have performed superobbing (averaging) of the satellite data of 5 × 5 pixels. This leads to averaged observations on the scale of ∼20 km over central Europe. Moreover, our numerical implementation of the LETKF currently assumes uncorrelated observation errors, i.e., a diagonal observation error covariance matrix. To reduce horizontal observation error correlations, we have applied an additional thinning yielding a final spatial resolution of satellite observations at a scale of 36 − 48 km.
The LETKF and flavors of the ensemble Kalman filter applies a localization scheme that diminishes spurious correlations in the background error covariance, cf. section 2.1. In situ observations are local since they have a single spatial location and their distance to grid points is rather well defined. In contrast, satellite observations are non-local since they are observed from outside the atmosphere and represent the integral radiation along the path through the atmosphere. At a first glance, their assimilation in the ensemble Kalman filter seems not possible due to the missing specific location. This problem has attracted much attention in atmospheric research (Miyoshi and Yamane, 2007;Bishop and Hodyss, 2009;Campbell et al., 2010;Leng et al., 2013;Lei and Whitaker, 2015;Nadeem and Potthast, 2016;Bishop et al., 2017;Farchi and Boquet, 2019).
A first good approximation for an atmospheric location of a clear-sky radiance observation is the vertical location of the maximum channel sensitivity to atmospheric temperature and humidity (Houtekamer et al., 2005;Miyoshi and Sato, 2007). However, it is known that sensitivity functions of channels may exhibit multiple maxima and hence this choice may not be optimal. To this end, we locate radiances at the expectation value of a normalized sensitivity function w(p), which is a weighted sum of two Jacobian functions of the radiance transfer model. Our implementation applies the Radiative Transfer for TIROS Operational Vertical sounders (RTTOV) observation operator (version 12.0, Saunders et al., 2018) and the two Jacobians J q [q(p), T(p)] and J T (q(p), T(p)) with respect to the specific humidity q and the temperature T at pressure level p, respectively. The Jacobians are computed for the deterministic run only since their computation for each ensemble member currently would be too time-consuming in an operational framework. Then the normalized sensitivity function reads n = 1, . . . , N with discrete pressure levels p n , N = 54 vertical RTTOV-levels, the weight α T = 2 K and the normalization constant W. The sensitivity is dimensionless with N n=1 w(p n ) = 1. We mention that Equation (5) is heuristic and other definitions are possible, cf. Mitchell et al. (2018). The COSMOmodel and the RTTOV operator use the specific humidity as prognostic atmospheric variable, whereas our implementation of the LETKF considers relative humidity. To utilize the RTTOVimplementation of the Jacobian computation, we choose α q = q which leads to a rough approximation of a weighting function with respect to relative humidity. Then the vertical location of the radiance is p 0 = N n=1 p n w(p n ). The vertical localization function about this radiance location is a Gaspari-Cohn function (Gaspari and Cohn, 1999) with radius . Conventional implementations fix this radius to a flowindependent and horizontally constant value that is proportional to the pressure of the radiance p 0 . We choose = 0.3p 0 . Section 3.1 shows the derivation of a novel flow-and space-dependent localization radius.
The satellite sensitivity function describes how deep into the atmosphere the satellite captures radiation. In clear-sky conditions, the SEVIRI-channel 6.2 µm peaks at about 300 hPa, i.e., high in the atmosphere, whereas channel 7.3 µm peaks at about 500 hPa with a vertical width of several hundreds hPa. Hence, channel 7.3 µm can be sensitive to the Earth surface in high terrain. The surface emissivity and skin temperature in the model can deviate considerably from the truth and thus the model equivalents of the first guess in these regions do contain information on the atmospheric conditions and are contaminated by errors in the description of the surface conditions. This can trigger spurious convective cells (observed in own experiments, not shown). Hence the current implementation excludes satellite data over regions with orography exceeding 1000 m of altitude.
The SEVIRI-data allows to derive various cloud parameters, e.g., by the Satellite Application Facility on Nowcasting (NWC-SAF) (Derrien and Le Gléau, 2005). For validation purpose, we employ the NWC-SAF cloud type (CT) product based on SEVIRI observation data and model input from DWD global model ICON 2 The corresponding original classification of CT comprises 21 cloud categories. For simplicity, we introduce 5 major classes that merge subsets of the original categories: • clear: cloud free over land and over sea, • low: very low and low cumuliform and stratiform clouds, • medium: medium stratiform and cumuliform clouds, • high: high cumuliform clouds and high stratiform clouds (h.strat), very high cumuliform clouds and very high stratiform clouds (vh. strat), High Semi-Transparent (HST) very thin cirrus (vthin. cirr.), HST thin cirrus (thin cirr.), HST thick cirrus (th. cirr.) and HST cirrus above low or medium level clouds (cirr. alt.), • all: sum of cloud classes low, medium and high.

Zero Cloud Impact
To use radiances in NWP, it is essential to employ a radiative transfer model (RTM) as an observation operator H that describes well the radiance transfer through the atmosphere. Today powerful RTMs exist, however there are very few RTMs that provide good estimates of radiances in the presence of clouds (Aumann et al., 2018). This difficulty originates from the fact that clouds affect radiative transfer and too little is known about the micro-physical mechanisms in clouds. Hence clouds make radiation measured at the top of the atmosphere by satellite instruments much harder to interpret in terms of atmospheric parameters (Kidder and Vonder Haar, 1995;Stephens and Kummerow, 2007;Kurzrock et al., 2018).
To deal with this problem, typically one attempts to detect clouds that are present in a certain Field of View (FOV) of a satellite measurement. If a cloud is present, 2 For more details, see http://www.nwcsaf.org/ct_description.  6) with respect to brightness temperature BT. (B) Examples of histograms for all data BT as (allsky) and selected data for which BT(n) > −0.1 K (RTTOV clearsky in red color). The panels show data for SEVIRI infrared channels 7.3µm and 6.2µm in the time window June 6, 2016-June 10, 2016. Here, the estimated limit brightness temperature is BT lim = 256.3 K (7.3 µm) and BT lim = 234 K (6.2 µm).
the measurement is discarded for the subsequent analysis of atmospheric temperature and humidity. To this end, various cloud detection techniques have been proposed (Kurzrock et al., 2018) for various satellite instrument types. For instance, the McNally-Watts method (McNally and Watts, 2003) is usually applied to hyperspectral infrared sounders, while the NWC-SAF cloud mask product is based on the measurements from the MSG-SEVIRI instrument.
A recently developed symmetric technique (Harnisch et al., 2016) estimates the impact of clouds on infrared SEVIRI observations based on model and observations. The corresponding cloud impact factor for a FOV quantifies the impact of clouds on radiation in this FOV as a reduction of the corresponding brightness temperature (BT) compared to a limit brightness temperature BT lim . If BT in a FOV exceeds BT lim , then the cloud impact factor is zero. The present work selects FOVs with zero cloud impact (ZCI), i.e., FOVs where BT≥BT lim for both RTM output and observations. This criterion implies that neither the selected data in the corresponding FOV nor the model profile are affected by clouds and may be interpreted to be cloud-free.
To determine the limit brightness temperature BT lim , we compare the brightness temperature computed from RTTOV in clear sky-mode and all sky-mode BT cs and BT as , respectively (Harnisch et al., 2016). To this end, we consider the distribution of all values BT as in a data set and bin them with bin size 1 K and N bins. Then, for each bin n the difference BT k (n) = BT as k (n) − BT cs k (n) quantifies the difference between clear sky and all sky radiation at a certain FOV k. The mean difference between clear sky and all sky brightness temperature in bin n is with number of FOVs F. This results in a distribution of the type shown in Figure 1A. We observe an increasing difference between clear-sky and all-sky RTTOV output with lower brightness temperatures, whereas larger BT as exhibit more similar brightness temperatures. For BT n = 0 K, all-sky and clear-sky radiances are identical reflecting an absence of cloud impact (if the RTTOV model is assumed to be correct). The limit brightness temperature BT lim is set to the minimum BT as for which BT n > −0.1 K. Figure 1B shows typical distributions of RTTOV output under all sky conditions and for ZCI-data. Figure 2 illustrates the temporal variation of BT lim in a short time period in summer 2016. We observe a variation of ∼±2.5 K about the temporal mean in both water vapor channels. In the present study, we define the final limit brightness temperature by the temporal mean over the time period under study.
Summarizing, ZCI-data comprise observations at each FOV for which BT obs ≥ BT lim and BT model ≥ BT lim holds with the observation BT obs and the model (all-sky) equivalent BT model . In the present implementation, we use the deterministic run for the model equivalent BT model . The climatological values used for the limiting brightness temperature are BT lim = 233 K (6.2 µm) and BT lim = 256 K (7.3 µm) for the summer period and BT lim = 230 K (6.2 µm) and BT lim = 255 K (7.3 µm) for the winter period.
It should be noted that the all-sky equivalents BT l of some of the ensemble members l may be cloud affected (i.e., BT l < BT lim ) even for ZCI FOVs since the definition of ZCI only considers the deterministic model run. This tends to increase the ensemble spread and hence the first guess error estimate used in the LETKF may result in an increased weight given to the ZCI data in such situations.
Furthermore we stress that the definition and occurrence of clear sky as given here by ZCI differs in principle from the traditional definition for clear sky. The latter implies the absence of cloud in the atmospheric column and is adopted e.g., in the NWC-SAF cloud type product. In contrast, clear-sky as given by ZCI relates to a specific satellite channel, and various types of clouds may be present in the atmospheric columns of ZCI FOV's as long as these clouds are low or thin enough to have no or negligible influence on the brightness temperature for that channel. Thus, clear-sky by ZCI is often cloudy by NWC-SAF, as will be seen in section 3.2. The reverse is also not excluded by definition, i.e., FOV's diagnosed as clear-sky by NWC-SAF may have a brightness temperature slightly smaller than BT lim . However this should occur more rarely.

Bias Correction
Satellite instruments are imperfect and prone to errors, such as intrinsic random fluctuations at the radiation detector. Systematic measurement errors (including calibration errors), biases in the radiative transfer, and biases in the numerical model induce a bias between the observations and the model. Assimilating observations, that are biased in this sense, violates basic assumptions of the ensemble Kalman filter to provide an optimal solution and thus degrades forecasts. Satellite observations are used in most meteorological centers and different observation bias correction schemes have been developed, such as variational bias correction schemes or static schemes (Eyre, 2016;Hashemi et al., 2017;Otkin and Potthast, 2019). We employ the latter type (Harris and Kelly, 2001) as a temporally adaptive scheme (Auligné et al., 2007) and apply it to ZCI-data.
We note that all these schemes are based on observation minus model departures and hence correct for the relative bias between observation and model state, rather than the observation bias itself. Unless there is either priori knowledge about the model bias, or there is (observation minus model statistics from) a number of other satellite data (channels) with similar sensitivity to the atmosphere available for cross-validation, it is difficult to separate the observation from the model bias.
The subsequent description assumes satellite instrument channels of number C and observations Z ∈ R S×C at spatial locations of number S. The bias-corrected observations with the bias correction B c ∈ R S×C are estimated by linear regression with predictors of number E, i.e., P ∈ R S×E , and corresponding coefficients β ∈ R E×C . Previous studies considered different predictors, such as the thickness between two pressure levels, the integrated water vapor or surface skin temperature (Montmerle et al., 2007). As predictors we have chosen the model thickness between 50 and 250 hPa and between 300 and 850 hPa (E = 2). These predictors tend to result in robust bias estimates, while including model surface skin temperature or integrated water vapor model may lead to model biases that are detrimental to bias corrections applied to the observations. The same predictors are applied to both SEVIRI channels (C = 2). The optimal coefficients arē β = P t P −1 P t D, where D = Z − H(x) is the background departure, i.e., the difference between observations and model equivalents of the first guess state x.
In the case of an offline scheme, the coefficientsβ are chosen optimally for a certain time period, e.g., 1 month. Such a scheme assumes that the systematic observation bias does not change over time. The temporally adaptive online implementation employed in the current work computes U(t n ) = P t P and V(t n ) = P t D at a certain analysis time t n and averages them with equivalent terms at previous analysis time t n−1 weighted by a scalar factor f ≥ 0. This yields the online bias correction coefficient Choosing f = exp(− t/τ ), Equation (7) represents a temporal average over an exponentially decaying window of width τ . The interval t is the time between two analyses. In the current implementation, t = 1h. Several own preliminary numerical studies (not shown) have indicated that the bias correction time scale should be several times the typical synoptic time scale and we choose τ = 30 days. Hence the correction coefficients are time-dependent and adapt gradually to the atmospheric dynamics. For analysis steps in situations with many clouds and only very few ZCI data in the limited-area domain, the update by Equation (7) will generally lead to very small changes in the adaptive bias correction coefficientsβ adap . This is because the weighted averaging between current and previous values with factor f is applied to U and V separately instead of β = U −1 V. Each element in U(t n ) and V(t n ) being a sum over observations will remain relatively small in general if the number of observations is small. After an initial spin-up phase (possibly over several weeks), this renders the adaptive bias correction scheme robust and stable even for small model domains.
In the online bias correction scheme, the initial bias correction coefficient plays an important role due to the long time scale τ . To gain a good choice for the initial coefficientβ adap (t 0 ), we start with U(t 0 ) = 0, V(t 0 ) = 0 and run the data assimilation cycle from the beginning of the time period (section 2.3) for 2 weeks. Then the final bias correction terms are applied as initial terms for a second iteration of data assimilation cycle over the same 2 weeks. Eventually, the last bias correction terms of this second iteration are used as initial bias correction terms for our data assimilation experiments. This procedure is performed separately for the summer and winter period 2016.

Improved Vertical Localization of Satellite Data
The vertical model pressure levels of RTTOV are not equidistant and their layer widths depend on pressure. Figure 3A shows the pressure values as a function of the level index for two example profiles. Their corresponding pressure layer widths are defined as the half-level distances between neighboring levels. They differ slightly between two columns primarily at very small and large pressures, i.e., close to the surface ( Figure 3B). The RTTOV-Jacobians with respect to T and q and subsequently the weighting function w(p n ) depend on the width of the pressure levels since thicker layers may provide larger contributions to the radiance than more narrow levels. To correct this intrinsic scaling, we scale the sensitivity function w(p n ) by the pressure level width p n w(p n ) = w(p n ) p n /W yielding a dimensionless quantityw with N n=1w (p n ) = 1 and normalization constantW. Figure 3C compares scaled and nonscaled sensitivity functionsw and w, respectively, for both spatial locations from Figures 3A,B for both SEVIRI-channels. While the scaling decreases slightly the sensitivity between 300hPa and 600hPa, the sensitivity is increased in lower and higher layers. Interestingly, the sensitivity function in column #600 is bi-modal illustrating that the conventional vertical location of the radiance at the maximum peak does not reflect well the radiation source distribution.
Since the pressure layer scaling affects the sensitivity magnitude, we expect an effect on the vertical radiance location p 0 . Figure 4A shows a typical distribution of level pressures p 0 . The pressure layer scaling moves the radiances in channel 6.2 µm to lower pressures, i.e., moves them up in the atmosphere due to the increased sensitivity at very high levels. Moreover, radiations in channel 7.3 µm are moved slightly to larger pressures and  down in the atmosphere due to the increased sensitivity below 600 hPa.
As a second modification, we take up the approach of Miyoshi and Sato (2007) and choose the vertical localization radius adaptive to the vertical sensitivity function. To this end, we consider the standard deviation σ ofw(p n ) as a reasonable localization radius. Then the final choice is with the conventional radius , see section 2.4. This guarantees that the localization radius is at least as low as for local observations. Figure 4B shows typical distributions of σ . We observe that channel 6.2 µm exhibits a much narrower sensitivity function than channel 7.3 µm. Since the conventional localization radius is proportional to the pressure level, we introduce the relative level width l = σ/p 0 . The distribution of l ( Figure 4C) reveals that most sensitivity functions are broader than assumed by the conventional radius , especially for channel 7.3 µm. Moreover, Figures 4B,C reveal that pressure layer scaling renders the sensitivity function wider for both channels and increases the number of radiances with increased localization radii σ . Hence we expect a prominent effect by both pressure layer scaling and adaptive vertical localization.

Cloud Types in Summer Period 2016
To assess our cloud criterion (section 2.5), we evaluate our results using the NWC-SAF cloud types. Here we present results for the summer period in 2016 (Figure 5) for the ZCI data. As expected the overall number of observations fulfilling the ZCI-criterion is considerably larger for the higherpeaking channel 6.2 µm and most of these observations are either clear or over low clouds. Altogether, the number of cloud-affected ZCI observations exceeds the number of clearsky observations in channel 6.2 µm, while there are more clear-sky observations in channel 7.3 µm than cloud-affected observations ( Figure 5A). In both channels, the majority of cloud-affected observations belong to low clouds with some medium-level and high-level clouds. A more detailed look at high clouds ( Figure 5B) reveals that most ZCI-FOVs with high clouds are semi-transparent thin cirrus clouds. Consequently, this evaluation supports that either ZCI-observations are wellclassified as clear-sky or the clouds are too low or too thin to affect the observed radiation. Figure 6 compares the spatial distribution of ZCI data and cloud types at a single date. Most ZCI-data are classified clearsky (yellow color) and some are located at low and high cloud locations. Figure 7 reveals how the number of ZCI-data varies over time and how this number depends on the channel and the cloud types. At first we observe that the high-peaking channel 6.2 µm (Figure 7A) contributes, as expected, much more ZCI-data to the data assimilation than the mediumpeaking channel 7.3 µm (Figure 7B). This is consistent with results shown in Figure 5A. Moreover, the plot distinguishes cloudy from cloud-free days which agree with the weather situation, cf. section 2.3. Large numbers of ZCI-data correlate with large numbers of clear-sky observations. The mean observation-minus-first guess departure, i.e., the radiance bias after online bias correction, fluctuates around zero with rare large spikes for medium and high cloud observations. This low radiance bias indicates a good performance of the online bias correction applied. Moreover, the fact that the bias correction is very similar for all cloud types (bottom rows in Figure 7) indicates that the cloud impact scheme proposed in the current work indeed is not significantly affected by clouds. Second row: number of ZCI-observations for FOV's diagnosed by NWC-SAF as clear-sky (black), low cloud (green), medium cloud (orange), and high cloud (blue). Third row: mean satellite observation-first guess departure (after bias correction) for each cloud type. Bottom row: online bias correction of observations for each cloud type. The date is given in format mmddhh with month mm, day dd and hour hh. Data shown from second to bottom row are based on ZCI-data.

Forecasts in Summer Period 2016
To study the proposed methods, we employ a data assimilation cycle comprising the LETKF yielding the model analysis and the COSMO-DE model evolution step. We employ an hourly assimilation cycle. The NO-SEVIRI experiment considers in situobservations but no satellite data and represents the control experiment. The SEVIRI-CONV experiment adds SEVIRI data from both water vapor channels with conventional vertical localization and the experiment SEVIRI-NEW extends SEVIRI-CONV by the improved vertical localization. Figure 8 compares the deterministic first guess departures for radiosonde relative humidity (RH) and radiosonde wind speeds (WIND). We observe that SEVIRI-CONV (red line) retains the number of observations compared to NO-SEVIRI, increases slightly the RHbias and WIND-bias and yields standard deviations close to NO-SEVIRI results (black line). Moreover, the improved vertical localization in experiment SEVIRI-NEW retains the number of radiosonde observations and reduces slightly the RH-bias. This indicates that the model first guess fields are in better correspondence with the radiosondes when the SEVIRI water vapor radiances are assimilated with the improved localization. However, the WIND-bias increases in magnitude. For both RH and WIND, the standard deviations are close to the other experiments. The radiosonde temperature first guess departure does not show a prominent effect of satellite data (not shown). Moreover, we point out that a sole application of either the pressure layer width scaling or the adaptive localization radius does yield worse results than the application of both improvements (not shown). Since SEVIRI-NEW improves the first guess departure of NO-SEVIRI, we focus on SEVIRI-NEW forecasts in the following.
In the following, we consider deterministic forecasts with lead time of 6h and 12h and initial times 00, 06, 12, and 18 UTC. Figure 9 shows scores of 6 and 12 h-forecasts of upper-air relative humidity, temperature and wind. Even though the humidity mean error (ME) is slightly decreased around 400−500 hPa and increased around 700 hPa (these heights are in the range of enlarged sensitivity of channels 6.2 µm and 7.3 µm), the impact on the bias of the forecast is generally very small. In terms of root mean square error (RMSE), however, the humidity error is clearly reduced at ∼400 hPa. The impact of SEVIRI assimilation with improved localization on the temperature forecasts is neutral, whereas wind speed is slightly improved between 600 hPa and the surface.
The RMSE score differences in Figure 9 are small for most pressure levels (except for upper-level humidity). To further evaluate statistically the score differences of the humidity forecasts, we choose single initial times, compute the RMSE and estimate the statistical significance of experiment differences from bootstrap confidence intervals (Efron and Tibshirani, 1993) with 1000 bootstrap samples. We observe a prominent significant positive impact on relative humidity of SEVIRI-NEW at ∼400 hPa (Figure 10), where sensitivity of the channel 6.2 µm exhibits its maximum.
As precipitation is a key forecast quantity for a high resolution model, we also assess precipitation rate forecasts by Fraction Skill Scores (FSS) (Roberts and Lean, 2008) against radar-derived precipitation. The FSS ranges from 0 to 1 and the larger the FSS  [WIND, panel (B)] in summer period 2016 for all used radiosonde observations. The left panels give the number of active observations; the bias is the mean difference between model equivalent first guess and observations (center panel), the standard deviation (right panels) is defined accordingly. The colors decode the NO-SEVIRI run (black), the SEVIRI-CONV run (red) and the SEVIRI-NEW run (green).
FIGURE 9 | Forecast verification statistics for relative humidity (RH), temperature (T) and wind speed (WIND) vs. radiosonde observations for the summer period 2016 with lead times 6 h (black) and 12 h (red). Results of the experiment NO-SEVIRI are shown by dashed lines and SEVIRI-NEW results by solid lines. The statistic ME is the forecast-observation bias, RMSE is the root-mean square error of forecasts to observations. The forecast statistics are averages over all initial times and considers the same observations to verify both experiments. the better is the forecast. Figure 11 shows fractional skill scores for two initial times. The corresponding statistical significance is estimated from bootstrap confidence intervals for differences between experiments based on 10 4 bootstrapped samples. One observes significantly improved precipitation forecasts for lead times between 7h and 11h both in 0UTC and 6UTC forecast runs, FIGURE 10 | Forecast verification of radiosonde relative humidity for the summer period 2016 for different initial times and lead times. Results are subject to initial times (06 and 00 UTC shown in left panel and 018 and 12 UTC shown in right panel) with lead times 6 h (black) and 12 h (red) for the experiments SEVIRI-NEW (solid line) and NO-SEVIRI (dashed). Observations are present at 00 h and 12 h. The stars denote statistically significant differences in the scores between experiments (significance level is α = 0.05). All forecast statistics consider the same observations. FIGURE 11 | Fraction skill scores (FSS) of hourly precipitation forecasts against radar-derived precipitation as a function of time of day for summer period 2016. The plot shows forecasts of experiments NO-SEVIRI (dashed) and SEVIRI-NEW (solid) for lead times from 1 to 12 h, colors encode initial time 00UTC (black) and 06UTC (green). The stars denote significantly different scores between experiments (significance level is α = 0.10) based on bootstrap confidence interval. The precipitation threshold is 1.0 mm and scale diameter is chosen to 11 grid points (∼30 km).
while most shorter SEVIRI-NEW forecasts are not significantly better. FSS for initial times 12UTC and 18UTC are not statistically different between NO-SEVIRI and SEVIRI-NEW.
The verification for forecasts of near surface parameters vs. SYNOP observations (not shown) shows heterogeneous scores. For instance, relative humidity and temperature at 2 m height exhibit a neutral/slightly negative impact on bias and root-mean square error, whereas the dew point shows improved scores. Such a heterogeneous impact may result from the complex response of boundary layer physics with strong vertical mixing and turbulent dynamics to changes in the analysis. These preliminary results demand closer investigations and we refer the reader to future work.

Cloud Types and Forecasts Winter Period 2016
The question arises whether the gained results are specific to the summer period. To this end, we evaluate a winter period in addition, cf. section 2.3. Figure 12 presents the cloud classification of the ZCI data in this winter period. We observe that the large majority of ZCI data are either clear-sky, exhibit low clouds or high semi-transparent cirrus clouds, what is consistent with the results obtained for the summer period. Since we do not expect strong cloud effects on radiances in any of these classes, ZCI-data are well-suited for data assimilation.
Forecasts of relative humidity exhibit a minor bias increase from 300 − 700 hPa but clear reduction of RMSE around 400 hPa and near the surface (Figure 13). Temperature scores are hardly affected by our current ZCI SEVIRI satellite data assimilation. For wind speed there is an overall small but encouraging reduction of the RMSE, whereas the wind bias is slightly increased (not shown).
The findings for relative humidity are confirmed in Figure 14 for single initial times. We observe statistically significantly reduced RMSE in forecasts at the top of the atmosphere and close to the surface, cf. Figure 14. Precipitation forecasts exhibit statistically significant improved fraction skill scores against radar-derived precipitation rates for very short lead times, but significantly worse or neutral impact for larger lead times (Figure 15).
The corresponding scores for SYNOP observations (not shown) are heterogeneous with slightly negative impact on relative humidity and temperature and a slightly better dew point. This result is difficult to interpret, similar to the summer period and requires further attention in future studies.

DISCUSSION
The present work introduces a novel combination of techniques which allows to exploit infrared water vapor channel satellite radiances at the convective scale in an operational NWP setting using an ensemble Kalman filter. Convective-scale regional models are subject to the difficulties of radiance assimilation over land and in cloudy regions. Moreover short-range forecast models at the convective scale are prone to many small-scale nonlinear processes and feedback processes, which can lead easily to a negative impact of a sub-optimal setup when trying to assimilate formerly unused observations. Hence it is essential to show that no harm is done to the system when introducing new observations. In general it can be difficult to prove a beneficial impact of new observations in a limited area model as upper air verification is based mainly on a few radiosonde sites, and standard scores are often unsuitable for cloud and precipitation verification due to the double penalty effect.
This implementation adresses several technical and physical issues, such as the question of cloud effects, bias correction and localization in the Ensemble Kalman Filter. The impact achieved with the current first implementation indicates that the SEVIRI water vapor channels can be beneficially used as an additional constraint for upper tropospheric moisture to complement the sparse radiosonde information in the LETKF assimilation. However, further analysis over longer time periods and additional tuning will likely allow to achieve a more consistent positive impact across all variables.

Cloud Detection
We have implemented the idea of a cloud impact on radiances (Harnisch et al., 2016) and interpreted the cloud-free case as a filter criterion for radiances. A comparison of the ZCIdata to the cloud classification from the independent NWC-SAF cloud product affirms the good quality of the data and hence of the criterion. For the large majority of data, the cloud classification is consistent with the assumption that the (possibly cloud-affected) observations selected have zero or small impact from clouds on the radiances, cf. Figures 5, 6, 12. This is affirmed, inter alia, by the similar bias correction and the resulting quasizero bias of the ZCI data for all cloud types present in ZCI-data selection (Figure 7).
Although the ZCI classification and the NWC-SAF cloud mask are consistent, the former appears to be better suited for data assimilation purposes than the latter. The NWC-SAF product utilizes the background state of the global ICON-model and data in all SEVIRI-channels. In the operational framework, the cloud mask computation represents an additional timeconsuming processing step after the computation of the global model computation step. In addition, ZCI classifies observations for each channel separately and in general allows for excluding data from active use only for those channels that are affected by clouds. This is not possible with the NWC-SAF cloud mask that does not distinguish channels. Hence ZCI is channel-dependent and possibly increases the number of assimilated radiances compared to the NWC-SAF product.
In sum, the cloud impact classification is a useful tool to select SEVIRI observation data for use in a clear-sky assimilation. It also permits to then extend gradually the definition of cloud impact from ZCI to non-zero cloud impact (Harnisch et al., 2016). The corresponding average cloud impact factor C a (ZCI for C a = 0 K) gives the average deviation of observation and first guess model equivalent to the limiting brightness temperature. Future work may consider different classes of data up to a certain value of C a . Then this degree of cloud impact is a tuning parameter filtering out certain cloud classes. In this context, it will be necessary to take a closer look at the observation bias correction scheme applied. Our online observation bias correction scheme based on the method of Harris and Kelly (2001) assumes that the bias is independent of the satellite observations and assumes model variables as predictors. It is under debate whether the satellite observation bias diagnosed as average deviation between observation and model values represents a model error or an observation error (Auligné et al., 2007), but it is likely to include both components. Since the bias correction scheme applied is valid for cloud-free observations only, occurring clouds are likely to introduce a cloud-dependent bias and may demand new cloud-dependent predictors (Otkin and Potthast, 2019).

Vertical Localization of Satellite Observations
The proposed scaling of the sensitivity function by the pressure layer width and the adaption of the vertical localization radius to the sensitivity function (section 3.1) distinctly improves shorttime forecasts. Only the combination of both improvements yields the best results. This improvement may result from the increased width of the vertical sensitivity by pressure layer scaling and the accompanying enlarged vertical localization radius. Previous studies have indicated that a larger vertical localization radius may be beneficial (Lei and Whitaker, 2015;Houtekamer and Zhang, 2016). Motivated by this successful extension, future work will focus on further improved vertical localization. Our approach estimates a single vertical location for the radiance and the vertical localization function is a Gaspari-Cohn function with radius parameter partially derived from the satellite sensitivity function, cf. section 3.1. Fertig et al. (2007) have shown that it may be beneficial not to choose a single vertical location but a set of vertical regions where the satellite sensitivity function exceeds a certain threshold. Other ensemble Kalman filter methods estimate the localization function from the correlations of a radiance and a state variable at each grid point (Anderson, 2007;Lei and Anderson, 2014). Relaxing the condition to localize in observation space, a model space localization of non-local satellite radiances may improve forecasts (Lei et al., 2018;Farchi and Boquet, 2019). However The plot shows forecasts of experiments NO-SEVIRI (dashed) and SEVIRI-NEW (solid) for lead times from 1 to 12 h, colors encode initial time 00 h (black) and 06 h (green). FSS for initial times 12 and 18 h are not statistically different between NO-SEVIRI and SEVIRI-NEW. The stars denote statistically significant different scores between experiments (significance level is α = 0.10). Bootstrap method identical to the method applied in Figure 11. The precipitation threshold is 1.0 mm and scale diameter is chosen to 11 grid points. the approach's superiority to observation space localization is still under debate (Lei and Whitaker, 2015).

Major Insights
To determine the optimal combination of model and data assimilation parameters, we have run a large number of data assimilation experiments. In the following, we report on the major insights from these experiments indicating future perspectives of research.
First of all, we point out that verification results depend heavily on the length of time window in the time period under study. We have explored parameters and techniques primarily in the heterogeneous summer period 2016. For experimental periods longer than 3 weeks, statistical verification results have been rather stable. We have also found that the brightness temperature limit BT lim necessary to define ZCI-data should represent a climatological estimate gained from a time period of at least 3 weeks. Shorter adaptive estimations of BT lim are possible in principle, but preliminary studies have indicated a detrimental effect on short-time forecasts. Similar effects have been observed for the observation bias correction. An online time scale shorter than 21 days yields worse bias correction and worse first guess estimates.
Major influencing factors are the satellite observation selection of ZCI-data, together with the improved vertical localization, the online observation bias correction and the additive covariance inflation. A neglect of one of these elements worsens the first guess departure error clearly.
In addition to these essential elements, the optimal choice of the observation error for each channel contributes significantly to forecast error reductions. We have found in tuning experiments that the observation error of the lower channel 7.3 µm should exceed the observation error of the upper channel 6.2 µm. A possible reason might be that there are many rather cloudy days in the experiment period and only rather few data for the lowerpeaking channel e.g., to base the determination of BT lim and the bias correction on. Given a larger uncertainty in BT lim and in the bias correction, larger (remaining) observation errors should be expected. In general, the optimal choice of the observation error matrix R is a topic of major research interest Zhu et al., 2016;Minamide and Zhang, 2017).
The SEVIRI-instrument provides observations of high resolution in space and time. The data cover atmospheric columns at a resolution of ∼5 km over the heterogeneous landscape of Germany and neighboring countries. The surface exhibits strong variability in orography, from mostly flat regions close to the sea in the north to the Alps in the south in France, Switzerland and Austria. Since the surface in the Alps is much closer to the satellite and the water vapor channel 7.3 µm may observe surface-affected radiation, it has been beneficial to limit the observation data to a low orography. Moreover, it has been shown to be beneficial to limit the spatial sampling by superobbing and thinning. This down-sampling reduces the spatial correlations between used observations which is motivated by our data assimilation implementation with a diagonal observation error matrix. In addition to the spatial sub-sampling, it has turned out that temporal sub-sampling of the radiance observations provides better short-time forecasts. Instead of using data available every 15 min, we found that the radiation observed closest to the analysis time (every hour) yields the best impact. Similar results with the KENDA system have been found recently (Bick et al., 2016) for the assimilation of radar reflectivity data. This is possibly due to unaccounted temporal observation error correlations between the data sets at 15 min intervals, or due to the ratio of conventional to satellite (or radar) data becoming too small for an effective assimilation of the conventional data. Furthermore, the first guess departures for the observations valid 15 min after the previous analysis time might be affected by spin-up effects in the 15-min forecast.
At last, the large number of satellite data may cause problems when considering adaptive localization techniques. For instance, choosing the horizontal localization radius by fixing the number of observations to be considered may reduce strongly the localization radius if large numbers of radiance observations are present. Then the satellite data would strongly reduce the influence of the few in situobservations in the vicinity of a grid point. To circumvent this, we have fixed the horizontal localization radius for the radiance data without changing the adaptive localization for the in-situ observations. This yields a better impact on short-time forecasts.

Limitations and Outlook
The aim of our study has been the implementation of satellite data assimilation in the existing regional NWP framework at DWD. This framework has put constraints to possible methodological extensions. For instance, the current regional data assimilation implementation at DWD considers a diagonal radiance observation error matrix assuming uncorrelated errors between both water vapor channels and between spatial locations. Since the two water vapor channels overlap spectrally and their radiance errors correlate strongly (Waller et al., 2016), future work will implement non-diagonal radiance observation error matrices. This will likely permit to increase the number of used observations. A corresponding implementation has become operational recently in the data assimilation of DWD's global NWP system.
Moreover the ZCI classification of observations is instantaneous at each analysis step, whereas the underlying limit brightness temperature BT lim is a climatological estimate and has to be determined offline. We have seen in section 2.5 that the limiting brightness temperature BT lim differs in the summer and winter period. In order to apply this approach to an operational context in the future, seasonal or monthly climatological values can be derived beforehand and applied, or preferably, BT lim can be estimated adaptively from a sliding window in a similar way as is done for the bias correction in section 2.6.
The ZCI-data include quite a few observations that are diagnosed as cloudy by the NWC-SAF cloud product, cf. Figures 5, 12. This may appear to be contradictory to the application of a clear-sky bias correction in the presence of cloudy FOVs, cf. Figure 7. In fact this apparent contradiction results from the definition of clear-sky. The NWC-SAF product defines cloud classes and thus serves for a definition of clear-sky and all-sky, whereas not all cloud types affect the radiance observations. For instance, low clouds are below the vertical range of the sensitivity of the 6.2 µm channel and often even of channel 7.3 µm. Hence, observations from FOVs classified as low clouds can be considered to be unaffected by clouds and may represent clear-sky observations. In this sense, the NWC-SAF and the ZCI classification are consistent in most cases. The fact, that the bias correction attains very similar values for the different NWC-SAF cloud types in Figure 7 indicates that the ZCI data are not affected by cloud for all these cloud types (or they would need to be affected by cloud in exactly the same way which is extremely unlikely).
The ZCI data represent a reasonable data subset for weather situations without clouds or weather situations with low clouds or high cirrus clouds. For some weather situations, like the highly convective regime present in the summer period used for the current work, ZCI data may include very few radiance observations for prolonged periods of time resulting in limited impact. This strong reduction in the number of observations is caused by frequent thick clouds causing low satellite brightness temperatures and consequently a high rejection rate in observations. Future work will extend the ZCI criterion to more cloud-affected observations by a non-zero cloud impact classification (cf. Harnisch et al., 2016) and at the same time adjusting e.g., the observation errors associated to these cloudy observations. In this way, the data classification approach studied here could lead to the assimilation of cloudy radiances, the socalled all-sky approach which is under development at several centers (Geer et al., 2019).

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

AUTHOR CONTRIBUTIONS
AH and CS have conceived the study. All authors have contributed to the research results and have written the manuscript.