Comparison of Fourteen Reference Evapotranspiration Models With Lysimeter Measurements at a Site in the Humid Alpine Meadow, Northeastern Qinghai-Tibetan Plateau

Evapotranspiration is a key component in the terrestrial water cycle, and accurate evapotranspiration estimates are critical for water irrigation management. Although many applicable evapotranspiration models have been developed, they are largely focused on low-altitude regions, with less attention given to alpine ecosystems. In this study, we evaluated the performance of fourteen reference evapotranspiration (ET0) models by comparison with large weight lysimeter measurements. Specifically, we used the Bowen ratio energy balance method (BREB), three combination models, seven radiation-based models, and three temperature-based models based on data from June 2017 to December 2018 in a humid alpine meadow in the northeastern Qinghai–Tibetan Plateau. The daily actual evapotranspiration (ETa) data were obtained using large weighing lysimeters located in an alpine Kobresia meadow. We found that the performance of the fourteen ET0 models, ranked on the basis of their root mean square error (RMSE), decreased in the following order: BREB > Priestley-Taylor (PT) > DeBruin-Keijman (DK) > 1963 Penman > FAO-24 Penman > FAO-56 Penman–Monteith > IRMAK1 > Makkink (1957) > Makkink (1967) > Makkink > IRMAK2 > Hargreaves (HAR) > Hargreaves1 (HAR1) > Hargreaves2 (HAR2). For the combination models, the FAO-24 Penman model yielded the highest correlation (0.77), followed by 1963 Penman (0.75) and FAO-56 PM (0.76). For radiation-based models, PT and DK obtained the highest correlation (0.80), followed by Makkink (1967) (0.69), Makkink (1957) (0.69), IRMAK1 (0.66), and IRMAK2 (0.62). For temperature-based models, the HAR model yielded the highest correlation (0.62), HAR1, and HAR2 obtained the same correlation (0.59). Overall, the BREB performed best, with RMSEs of 0.98, followed by combination models (ranging from 1.19 to 1.27 mm day−1 and averaging 1.22 mm day−1), radiation-based models (ranging from 1.02 to 1.42 mm day−1 and averaging 1.27 mm day−1), and temperature-based models (ranging from 1.47 to 1.48 mm day−1 and averaging 1.47 mm day−1). Furthermore, all models tended to underestimate the measured ETa during periods of high evaporative demand (i.e., growing season) and overestimated measured ETa during low evaporative demand (i.e., nongrowing season). Our results provide new insights into the accurate assessment of evapotranspiration in humid alpine meadows in the northeastern Qinghai–Tibetan Plateau.

Evapotranspiration is a key component in the terrestrial water cycle, and accurate evapotranspiration estimates are critical for water irrigation management. Although many applicable evapotranspiration models have been developed, they are largely focused on low-altitude regions, with less attention given to alpine ecosystems. In this study, we evaluated the performance of fourteen reference evapotranspiration (ET 0 ) models by comparison with large weight lysimeter measurements. Specifically, we used the Bowen ratio energy balance method (BREB), three combination models, seven radiationbased models, and three temperature-based models based on data from June 2017 to December 2018 in a humid alpine meadow in the northeastern Qinghai-Tibetan Plateau. The daily actual evapotranspiration (ET a ) data were obtained using large weighing lysimeters located in an alpine Kobresia meadow. We found that the performance of the fourteen ET 0 models, ranked on the basis of their root mean square error (RMSE), decreased in the following order: BREB  . For radiation-based models, PT and DK obtained the highest correlation (0.80), followed by Makkink (1967) (0.69), Makkink (1957) (0.69), IRMAK1 (0.66), and IRMAK2 (0.62). For temperature-based models, the HAR model yielded the highest correlation (0.62), HAR1, and HAR2 obtained the same correlation (0.59). Overall, the BREB performed best, with RMSEs of 0.98, followed by combination models (ranging from 1.19 to 1.27 mm day −1 and averaging 1.22 mm day −1 ), radiation-based models (ranging from 1.02 to 1.42 mm day −1 and averaging 1.27 mm day −1 ), and temperaturebased models (ranging from 1.47 to 1.48 mm day −1 and averaging 1.47 mm day −1 ).

INTRODUCTION
Evapotranspiration is a key parameter in the simultaneous processes of heat and water transfer to the atmosphere via transpiration and evaporation in the soil-plant-atmosphere system (Sentelhas et al., 2010), thereby playing an important role in water balance calculations, water allocation, and water irrigation management. Thus, accurate estimates of evapotranspiration could improve water management strategies and promote efficient water resource use, especially in regions with water shortages (Sun et al., 2011).
To date, direct evapotranspiration measurements have been achieved by a variety of methods, such as the Bowen ratio energy balance (BREB) system (Irmak et al., 2003(Irmak et al., , 2005Si et al., 2005;Irmak and Irmak, 2008), lysimeters (Jia et al., 2006;Valipour, 2015), and the eddy covariance technique (Novick et al., 2009;Zhang et al., 2018). Alternatively, evapotranspiration can be indirectly assessed by applying various reference evapotranspiration (ET 0 ) equations. Several ET 0 models have been widely used for evapotranspiration calculation and can be classified into three types, namely, radiation-based models (Hargreaves and Samani, 1985), temperature-based models (Trajkovic et al., 2019), and combination models (Penman, 1963). While the development of these models has undoubtedly benefited the calculation of evapotranspiration, it remains difficult to choose the optimal one because of the availability of observed data, and most models have not been evaluated against lysimeter measurements across a range of regions and climates (Liu et al., 2017;Kiefer et al., 2019). To select the best-performing models, many studies have been conducted to assess model performance under various climates. For instance, the Food and Agriculture Organization of the United Nations (FAO) recommends the Penman-Monteith FAO-56 (PM-56) as the standard equation for estimating models ET 0 (Allen et al., 1998), and this has been widely used worldwide when compared with other equations (Cai et al., 2007). The advantage of the PM equation is that it does not require any local calibration because it incorporates both physiological and aerodynamic parameters, and it has been well tested based on lysimeter data (Trajkovic, 2009).
Although many models have been widely used to estimate ET, most previous models have only been evaluated with respect to PM-56 (Cao et al., 2015;Liu et al., 2017), with few being tested against lysimeter measurements. Furthermore, the application of the PM-56 equation requires many meteorological inputs, such as wind speed, temperature, humidity, and solar radiation, which are often not available in regions with harsh environments (Tabari et al., 2012;Martel et al., 2018). Thus, it is essential to develop a relatively accurate ET 0 model that requires fewer meteorological parameters to allow more simplified estimates of evapotranspiration than those of PM-56, applicable across a range of climatic conditions (Tabari and Talaee, 2011). For example, Tabari (2010) assessed four ET 0 models in an arid climate and found that the Turc model performed the best. Meanwhile, the Hargreaves equation performed best in semiarid regions (Sabziparvar and Tabari, 2010). Liu et al. (2017) compared sixteen models for ET 0 against weighing lysimeter measurements and found that the combination models performed best for estimating evapotranspiration in semiarid regions. Overall, most previous studies have been conducted in low-humidity conditions at low altitudes (i.e., arid and semiarid regions) (Sentelhas et al., 2010;Liu et al., 2017), with few studies in humid climates, particularly in alpine ecosystems. Compared with arid and semiarid regions in low altitudes, the interaction soil-plant-atmosphere condition change in the humid alpine meadow was more affected by net radiance (Dai et al., 2021), thus, the ET 0 models in arid and semiarid regions might not be suitable for humid alpine meadow. It was urgent to improve the accuracy of evapotranspiration observations in alpine regions by comparing reference evapotranspiration models with lysimeter measurements in the humid alpine meadow.
The Qinghai-Tibetan Plateau (QTP), with an average altitude of 4,000 m, is the world's highest alpine ecosystem and is also known as the "Asian tower, " playing an important role in ensuring the safety of water resources in China and southeast Asia (Zou et al., 2017;Dai et al., 2019b). Furthermore, the permafrost and seasonally frozen ground account for 50-56% of the total Qinghai-Tibet Plateau area, and the alpine ecosystem was more sensitive to global warming compared with other ecosystems owing to its high altitude (Zou et al., 2017). Therefore, an accurate estimation of evapotranspiration in an alpine ecosystem could not only provide new insights into the water cycle but also benefit the formulation of water resource management strategies. Furthermore, given the uncertainty and confusion in the selection of evapotranspiration equations across different regions and climates, it is critical to thoroughly understand the performance of various models in the humid alpine meadow (Zhang et al., 2018). In this study, we compared fourteen ET 0 models against lysimeter measurements on the northern Tibetan Plateau, with the aim of selecting the best fit model over a humid alpine meadow on the northeastern QTP to estimate evapotranspiration.

Study Area
This study was conducted at the Haibei National Field Research Station, Qinghai, China (37 • 37 ′ N, 101 • 19 ′ E), which is located on the Northeastern QTP at an elevation of 3,200 m MASL (Figure 1). This area is characterized by a plateau continental monsoon climate, with well-developed seasonally frozen ground. The average annual air temperature is −1.7 • C, with the maximum monthly temperature in July (10.1 • C) and the minimum monthly temperature in January (−15 • C). The annual precipitation is ∼580 mm, of which 80% falls in the growing season (i.e., from May to September), leading to high water content (close to field capacity) in the soil during the growing season, thus, the evapotranspiration during the growing season was not limited by soil water content (Dai et al., 2021). The average annual pan evaporation is ∼1,191.4 mm (Zhang et al., 2018). The soil type around the lysimeter system is classified as Mat-Gryic Cambisol, which belongs to clay loam, and has a thickness of ∼60-80 cm (Dai et al., 2019a), and the basic soil property is shown in Table 1. The grass crop is dominated by perennial sedge and graminoid species, including Kobresia humilis, Stipa aliena, and Elymus nutans, with ∼8-15 cm in height, which together constitutes 60-80% of plant cover around the lysimeter system (Dai et al., 2021).

Actual Evapotranspiration Measurement and Data Quality Control
The actual evapotranspiration (ET a ) was measured using largescale weighing lysimeters (height 2 m, diameter 1 m, and resolution 10 g) installed in the central alpine Kobresia meadow of the Haibei National Field Research Station. Lysimeters are vessels containing both monolites and repacked soils and are embedded completely in the soil at their top level with the soil surface. The topsoil, including the root, was composed of monolites, while deep soil, not including roots, comprised repacked soils. The soil was cut off from the parent soil at the base of the lysimeters, and the lower lysimeter boundary was exposed to atmospheric pressure. Measurements of ET a were based on changes in weight, with measurements at 30min intervals recorded by a data logger (CR1000, Campbell, USA) and converted to daily values. To ensure data quality, all negative or abnormal ET a values caused by falling soil particles were discarded; outliers that differed more than three times from average ET a , which yielded 393 days of data spanning June 2017-December 2018. According to in situ phenological observations of the dominant plant foliage, we defined the growing season as the period from May 1 to September 30, while the period from October 1 to April 30 of the following year was defined as the nongrowing season (Zhang et al., 2018). Given that the soil moisture of the root region was more than the field capacity (Table 1), thus, the ET a in this study was mainly not limited by surface moisture. Furthermore, the plant height was 8-12 cm, actively growing, and completely shading the ground, which satisfies the definition of reference evapotranspiration (defined as 8-15 cm high, uniform, actively growing, green grass that completely shades the soil and not limited by soil water availability), and we, thus, could assume that the ET a in this study could be considered as reference crop evapotranspiration or potential evaporation. To verify the assumption that ET a in this study was not limited by soil water availability, our previous studies showed that the annual Priestley-Taylor coefficient values (ratio of ET a to ET eq ) ranged from 1.08 to 1.14 (ET eq represents the minimum possible evapotranspiration from a moist surface and depends only on air temperature and available energy), and the annual decoupling coefficient Ω ranged from 0.43 to 0.48 (Zhang et al., 2018), which provided direct evidence that ET a in this study occurred under strongly energy-limited conditions rather than soil water availability constraints.

Meteorological and Soil Water Content Data Collection
All meteorological variables needed to calculate ET 0 using various models were obtained or estimated from the weather station at Haibei Station, including precipitation (PPT), relative humidity (RH), wind speed (WS), net radiation (R n ), total radiation (R s ), soil heat flux (G), maximum air temperature (T max ), vapor pressure deficit (VPD), minimum air temperature (T min ), and mean air temperature (T a ). PPT was collected using a PPT gauge (52203, RM Young, USA) at 0.5 m height. Radiation was measured by four radiometers (CNR4, Kipp & Zonen, Netherlands) at 1.5 m height; RH, WS, and T were measured at 1.5 m height (HMP45C, Vaisala, Finland), and WS was converted to 2 m height (u 2 = u z 4.87 ln(67.82−5.42) ) for calculating ET 0 . The G was measured using three heat flux plates (HFT-3, Campbell, USA), which were buried 5 cm beneath the surface. Half-hourly means of meteorological data were stored using a data logger (9210 XLITE, Sutron, USA). The BREB method parameter was determined using an eddy covariance system installed near a lysimeter system, which included a three-dimensional ultrasonic anemometer (CSAT3, Campbell, USA) and an open-path infrared CO 2 /H 2 O gas analyzer (LI-7500A, LI-Cor, USA), and 30-min fluxes were calculated and logged with a SMARTFLUX system (LI-COR, USA) (Zhang et al., 2018). The soil water content was measured using the oven-drying method at depths of 0-10, 10-20, 20-30, 30-40, and 40-50 cm through a soil auger in July because the evapotranspiration reaches its maximum in July with its maximum water demand.

Evaluation Criteria
In this study, the ET a measured by the large-scale weighing lysimeters and the performances of the ET 0 models were compared with these lysimeter system estimates on a daily basis. Pairwise comparisons were conducted using a general linear regression. For further comparison, the root means squared error (RMSE), percentage error of estimate (PE), mean absolute error (MAE), and coefficient of determination (R 2 ) were used to evaluate the ET 0 models. The RMSE, PE, MAE, and R 2 are defined as follows:
where P i is the predicted value; O i is the observed value; P and O are the averages of P i and O i , respectively; and n is the total number of data points.

Statistical Analysis
To achieve the best comparison between the models and measurements, we need to select the dominant meteorological factors affecting the measured ET a . Given that it may not be appropriate to explore results based solely on the coefficient of independent variables in multiple regression analysis, owing to the strong collinearity and nonlinearities among meteorological factors, we adopted a boosted regression tree (BRT) model to quantitatively evaluate the relative influences of meteorological variables on measured ET a . BRT is a machine-learning method based on the classification regression tree algorithm (CART). This method generates multiple regression trees through random selection and self-learning methods, which can improve model stability and prediction accuracy. During operation, some data are randomly selected many times to analyze the influence of independent variables on dependent variables and to quantitatively evaluate the relative effect of independent variables on dependent variables, while the remaining data are used to test the fitting results (Elith et al., 2008). Most importantly, the BRT can evaluate the relative influence of an independent variable on a dependent variable, without transformations, and can cope well with nonlinear relationships. Furthermore, the BRT exhibits good performance in dealing with stronger collinearity and nonlinearities. Thus, the BRT was adopted to evaluate the individual influences of the controlling factors on the measured evapotranspiration. All statistical analyses were performed using R software version 3.03 (R Development Core Team, 2006), and all figures were plotted using Origin 9.0.

Seasonal Variation in ET a and Environmental Variables
The measured ET a showed a clear seasonal pattern, and the growing season ET a was significantly higher than that in the nongrowing season (P < 0.05) (Figure 2a). The average measured daily ET a during the study period was 2.33 mm day −1 , with an average daily measured ET a of 4.14 and 0.65 mm day −1 during the growing season and nongrowing season, respectively (Figure 2a). Environmental variables showed a similar seasonal pattern, with maximum and minimum values in the growing and nongrowing seasons (Figure 2). The average daily R n , R s , T a , VPD, and RH during the growing season were 9.53 MJ m −2 , 18.50 MJ m −2 , 10.15 • C, 0.33 kPa, and 74.88%, respectively (Figure 2). The average daily R n , R s , T a , VPD, and RH during the nongrowing season were 3.28 MJ m −2 , 12.65 MJ m −2 , −3.67 • C, 0.23 kPa, and 57.16%, respectively.

Comparison of Daily Evapotranspiration Between Reference Evapotranspiration Models and Lysimeter Measurements Throughout the Study Period
A comparison of fourteen evapotranspiration equations against the lysimeter measurements is presented in Figure 3, Table 3, showing that the relationship between daily ET 0 calculated by the However, the radiationbased models (except PT and IRMAK1) and temperature-based models generally overestimated the ET a values, with MAEs ranging from −0.14 to 0.50 mm day −1 for radiation-based models and 0.40-0.59 mm day −1 for temperature-based models. Throughout the study period, the RMSE of the BREB method was 0.98, and the RMSE of combination models ranged from 1.19 to 1.27 mm day −1 and averaged 1.22 mm day −1 . Furthermore, the RMSE of FAO-56 PM increased from 1.22 to 1.29 mm day −1 as aerodynamic resistance (r s ) changed from 20 to 60 s m −1 (Figure 4). The RMSE for radiation-based models ranged from 1.02 to 1.42 mm day −1 and averaged 1.27 mm day −1 , and the RMSE for temperature-based models ranged from 1.47 to 1.48 mm day −1 and averaged 1.47 mm day −1 . Based on the RMSE, the performance of the fourteen evapotranspiration models decreased in the following order: BREB (0.  Overall, for the entire study period, the BREB yielded the best performance, followed by the combination, radiation-based, and temperaturebased models.

Comparison of Daily Evapotranspiration Between Reference Evapotranspiration Models and Lysimeter Measurements During the Growing Season
During the growing season, the daily ET 0 calculated by fourteen evapotranspiration models was significantly correlated with the lysimeter measurements ET a (P < 0.01), with R 2 ranging from 0.32 to 0.64 ( Figure 5, Table 4). Of the combination models, PM-56 had the highest R 2 (0.60), followed by FAO-24 Pen (0.59) and Pen-63 (0.58). Of the radiation-based models, PT and DK obtained the highest R 2 (0.59), followed by IRMAK1 (0.50), Makkink (1967) Makkink (1967) and Makkink (1957) had the same R 2 (0.47). Of the temperature-based models, HAR, HAR1, and HAR2 obtained the same R 2 values (0.32).
Interestingly, all models (except HAR1 and HAR2) generally underestimated ET a during the growing season, values of MAE ranged from −1.10 to −0.15 mm day −1 , with PM-56 having the largest underestimation (26.59%) and HAR the minimum underestimation (3.56%) ( Table 4). The RMSE of BREB was 1.31, the RMSE for combination models ranged from 1.38 to 1.58 mm day −1 and averaged 1.47 mm day −1 , the RMSE for radiation-based models ranged from 1.19 to 1.55 mm day −1 and averaged 1.40 mm day −1 , and the RMSE for temperature-based models ranged from 1.42 to 1.43 mm day −1 and averaged 1.42 mm day −1 (  Evidently, the best DK was 25% more accurate than the poorest (FAO-56). Overall, for the growing season period, the BREB yielded the best performance, followed by the radiation-based, temperature-based, and combination models.

Comparison of Daily Evapotranspiration Between Reference Evapotranspiration Models and Lysimeter Measurements During the Nongrowing Season
During the nongrowing season, the daily ET 0 calculated by the fourteen evapotranspiration equations was also significantly correlated with the lysimeter measurements ET a (P < 0.01), but with lower coefficients of determination (R 2 ) ranging from 0.17 to 0.64 (Figure 6, Table 5). Of the combination models, FAO-24 Pen obtained the highest R 2 (0.28), followed by Pen-63 (0.25) and PM-56 (0.21). Of the radiation-based models, PT and DK obtained the highest R 2 (0.34), followed by Makkink (1967) (0.27), Makkink (1957) Makkink (1967) yielded the largest underestimate (by 185.83%) and PT the minimum underestimate (by 61.06%) ( Table 5).
The RMSE of BREB was 0.52, the RMSE for combination models ranged from 0.86 to 1.00 mm day −1 and averaged 0.92 mm day −1 , the RMSE for radiation-based models ranged from 0.80 to 1.44 mm day −1 and averaged 1.12 mm day −1 , and the RMSE for temperature-based models ranged from 1.47 to 1.54 mm day −1 and averaged 1.50 mm day −1 . Based on the RMSE, the ET 0 model performance decreased in the  HAR2 (1.54). Evidently, the best model was 66.24% more accurate than the poorest model (HAR2). Overall, for the nongrowing season period, the BREB yielded the best performance, followed by the combination, radiation-based, and temperature-based models.

Comparison of Monthly Averaged Daily Evapotranspiration Between Reference Evapotranspiration Models and Lysimeter Measurements
The fourteen evapotranspiration model estimations were consistent with the pattern observed in the lysimeter measurements (Figure 7), with a peak in July. As already noted, the combination models and BREB underestimated the measurements from May to September and overestimated those in the other months (Figures 7A,B). The radiationbased models underestimated the measurements from June to September and overestimated those in the other months ( Figure 7C). However, the temperature-based models generally overestimated the measured evapotranspiration during most months (except for July and August) ( Figure 7D). Overall, all models tended to underestimate the measured ET a during the growing season (with larger evaporative demand) and overestimated ET a during the nongrowing season (with reduced evaporative demand).

Performance Comparison of Combination Models Against Lysimeter Measurements
Given that the evapotranspiration in our study was not limited by soil water conditions and the plant height was 8-12 cm, actively growing, and completely shading the ground, which is close to the definition of ET 0 , we thus assumed that ET a was comparable to ET 0 during the growing season, with the aim of selecting the best fit model over a humid alpine meadow on the northeastern QTP to estimate ET 0 . Previous studies have shown that the Penman family models are generally the most  accurate when evaluating ET 0 across various climate scenarios and regions (Liu et al., 2017). Of the Penman models for ET 0 , the PM-56 has been considered the standard equation for estimating evapotranspiration (Allen et al., 1998). For instance, Yoder et al. (2005) found that PM-56 displayed the best performance in the humid southeast United States. López-Urrea et al. (2006) tested seven evapotranspiration equations using lysimeter observations in a semiarid climate and found that the PM-56 equation was the most precise method compared with other evapotranspiration equations. Unlike previous studies, the PM-56 model was not FIGURE 8 | The relative influence of meteorological factors on ET a during whole study period (a), growing season (b), and non-growing season (c). WS, wind speed; TD, the difference value between maximum air temperature and the minimum air temperature; G, soil heat flux; R a , extraterrestrial solar radiation; R s , total radiation; VPD, vapor pressure deficit; RH, relatively humid; T, mean air temperature; R n , net radiation.
the best in our study; we found that Pen-63 and FAO-24 Pen were more accurate (Pen-63 and FAO-24 Pen had smaller RMSE than PM-56 throughout the study period). Similar results have been reported in other studies (Berengena and Gavilán, 2005;Martel et al., 2018). A more recent study also reported the poor performance of PM-56 when compared with data from 20 FLUXNET towers (Ershadi et al., 2014). These results suggest that PM-56 might not be the only standard model for evaluating ET 0 because it did not yield better accuracy than the other Penman models. Given the better performance of Pen-63 and FAO-24 than PM-56 in this study, we may apply old Penman family models to our study region, especially considering that the PM-56 requires many meteorological inputs, which limits its use in areas with sparse data, especially in harsh environments (Tabari et al., 2012). Overall, the poor performance of PM-56 may be attributed to the higher aerodynamic resistance (r s ), and there is increasing evidence indicating that PM-56 underestimation is related to the fixed r s = 70 s m −1 in the equation, which may be too large (Liu et al., 2017). This result was also confirmed by our results that the values of RMSE increased from 1.22 to 1.29 mm day −1 as r s changed from 20 to 60 s m −1 , and the RMSE was nearly unchanged (from 1.12 to 1.13 mm day −1 ) when r s varied between 0 and 20 s m −1 (Figure 4). Therefore, reducing the value of r s from 70 to 0-20 s m −1 can improve daily PM-56 estimates.
Other studies also found that r s should be a variable rather than a fixed value (Allen et al., 2006;Liu et al., 2017). For instance, r s should be smaller when the result is underestimated and should be larger when it is overestimated (Ventura et al., 1999).

Performance Comparison of Radiation and Temperature Models Against Lysimeter Measurements
For the performance comparison of radiation models against lysimeter measurements, we found that the PT models yielded the best performance of the radiation-based models, which was consistent with a previous study conducted in humid areas where the PT method yielded a good accuracy estimate for ET 0 (Ershadi et al., 2014). There is increasing evidence indicating that the input parameters are the dominant factors affecting their performance (Lang et al., 2017), and we conclude that the better performance of the PT models might be associated with the most important meteorological factors affecting ET, such as R n , which is supported by our results (Figure 8). Compared with the PT method, other radiation models that use R s as the main driving variable may overestimate ET 0 due to the reduction in R s through atmospheric reflection in this region (Zhang et al., 2018). Furthermore, each model was developed based on its specific underlying surface and climatic conditions. For instance, the PT model was established in a humid climate condition, which is suitable for a humid alpine meadow in the northeastern QTP. Most importantly, the PT models require fewer meteorological inputs than the combination models. However, given that the ET a was mainly limited by available radiation (i.e., R n -G), the structures of the PT and DK models both included available radiation items. Combining these factors, we recommend the PT and DK models for use in humid alpine meadows in the northeastern QTP, especially when considering the difficulty in obtaining evapotranspiration in this harsh climate. For the performance comparison of temperature models against lysimeter measurements, a previous study reported that the Hargreaves equation is one of the simplest empirical methods used for ET 0 estimation because of its lower meteorological data input, including some meteorological data required in the standard PM-56 model (Jensen et al., 1990). To further select the best Hargreaves version equation, we compared the performance of the original (HAR) and two modified versions (HAR1 and HAR2) and found that the original HAR model had the lowest error (RMSE = 1.47 mm day −1 , MAE = 0.40 mm day −1 , and PE = 17.37%), which was consistent with previous studies conducted in humid regions (Tabari, 2010) but contrasted to studies conducted in arid regions in which the modified Hargreaves equation displayed a more accurate estimation of evapotranspiration (Ravazzani et al., 2012). Overall, the temperature models displayed poor performance compared with radiation models because the Hargreaves method was established in semiarid areas (Tabari, 2010). Furthermore, the structure of temperature models missing the most important parameter (i.e., R n ) results in poor performance compared with radiation models. Therefore, a local calibration is required to improve the Hargreaves method accuracy in nonarid regions.

Performance Comparison of All Models
By comparing the four model types, we found that the BREB yielded the best performance, followed by the combination, radiation-based, and temperature-based models (Table 3). Overall, most radiation-based models underestimated the measured ET a throughout the study period, whereas the temperature-based models tended to overestimate ET a . This was consistent with previous studies in which the Makkink and PT models generally underestimated ET a (Priestley and Taylor, 1972;Xu and Singh, 2002), while the Hargreaves equations often overestimated ET a in cold-humid conditions and required local calibration (Berti et al., 2014). Given that our study region was a humid alpine meadow, ET a tended to be overestimated. An alternative explanation for the poor Hargreaves model performance in humid regions may be that the Hargreaves method was established in semiarid areas (Tabari, 2010), and the R a parameter used in the Hargreaves model structure was based on the maximum possible radiation value and does not consider the atmospheric transmissivity. However, the atmospheric transmissivity in humid regions is affected by many factors, such as atmospheric moisture; thus, the solar radiation reaching the surface is significantly reduced because of the high atmospheric moisture content (Temesgen et al., 1999), resulting in the overestimation of solar radiation, ultimately leading to an overestimation of evapotranspiration using the Hargreaves method.
Furthermore, there were common features in all four groups of models. All the models tended to underestimate the measured ET a during the growing season (with larger evaporative demand) and overestimated ET a during the nongrowing season (with reduced evaporative demand), which was consistent with a previous study conducted in a semiarid region (Liu et al., 2017). The underestimated measured ET a during the growing season may be related to the ET a in alpine meadows under strongly energy-limited conditions rather than soil water content during the growing season; thus, higher solar radiation could lead to a higher ET a during the growing season (Zhang et al., 2018). Therefore, both the Hargreaves equations and other models require further local or regional calibration before being applied to a given region (Xu and Singh, 2002). It should also be noted that the data used in this study were obtained from 1 year and a single weather station, which may be insufficient to represent the whole humid climate or the alpine ecosystem but represent only a humid alpine meadow in the northeastern QTP. Thus, a longer period and more lysimeter systems should be used in the alpine ecosystem in the future to obtain more accurate estimates of evapotranspiration over humid alpine meadows in the northeastern QTP.

CONCLUSION
This study is the first to document information on the comparison of fourteen evapotranspiration models against lysimeter measurements in a humid alpine meadow, northeastern Qinghai-Tibetan Plateau, and we found that the BREB method performed the best, followed by combination models, radiation-based models, and temperature-based models. Furthermore, all models tended to underestimate ET a during the growing season and overestimate ET a during the nongrowing season, suggesting that these models should be calibrated or modified by local lysimeter data when extrapolated to other regions. Besides, the 1963 Penman and FAO-24 Penman models demonstrated better performances than recommended FAO-56 PM, suggesting that older Penman equations may superior to the standard FAO-56 PM model. Given the outstanding performance of Priestley-Taylor model owing to its most important factors affecting ET a (R n ), which require few meteorological inputs, we thus recommend that these two models can be used in the humid alpine meadow on the northeastern Qinghai-Tibetan Plateau. Our result could provide new insights for the accurate assessment of evapotranspiration in the alpine ecosystem.

AUTHOR CONTRIBUTIONS
LD and XG performed the research, analyzed data, and wrote the manuscript. RF, ZZ, ZH, and YD analyzed data. GC conceived the study. All authors contributed to the article and approved the submitted version.