<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Remote Sens.</journal-id>
<journal-title>Frontiers in Remote Sensing</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Remote Sens.</abbrev-journal-title>
<issn pub-type="epub">2673-6187</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">871942</article-id>
<article-id pub-id-type="doi">10.3389/frsen.2022.871942</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Remote Sensing</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A Multiscale Spatiotemporal Approach for Smallholder Irrigation Detection</article-title>
<alt-title alt-title-type="left-running-head">Conlon et al.</alt-title>
<alt-title alt-title-type="right-running-head">Multiscale Spatiotemporal Irrigation Detection</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Conlon</surname>
<given-names>Terence</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1597011/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Small</surname>
<given-names>Christopher</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1017516/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Modi</surname>
<given-names>Vijay</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Department of Mechanical Engineering</institution>, <institution>Columbia University</institution>, <addr-line>New York City</addr-line>, <addr-line>NY</addr-line>, <country>United States</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Lamont Doherty Earth Observatory</institution>, <institution>Columbia University</institution>, <addr-line>Palisades</addr-line>, <addr-line>NY</addr-line>, <country>United States</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1560635/overview">Sudipan Saha</ext-link>, Technical University of Munich, Germany</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1691426/overview">Jonathan Prexl</ext-link>, Technical University of Munich, Germany</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1691432/overview">Parth Naik</ext-link>, University of Trento, Italy</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Terence Conlon, <email>terence.conlon@columbia.edu</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Image Analysis and Classification, a section of the journal Frontiers in Remote Sensing</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>14</day>
<month>04</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>3</volume>
<elocation-id>871942</elocation-id>
<history>
<date date-type="received">
<day>08</day>
<month>02</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>18</day>
<month>03</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Conlon, Small and Modi.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Conlon, Small and Modi</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>In presenting an irrigation detection methodology that leverages multiscale satellite imagery of vegetation abundance, this paper introduces a process to supplement limited ground-collected labels and ensure classifier applicability in an area of interest. Spatiotemporal analysis of MODIS 250&#xa0;m enhanced vegetation index (EVI) timeseries characterizes native vegetation phenologies at regional scale to provide the basis for a continuous phenology map that guides supplementary label collection over irrigated and non-irrigated agriculture. Subsequently, validated dry season greening and senescence cycles observed in 10&#xa0;m Sentinel-2 imagery are used to train a suite of classifiers for automated detection of potential smallholder irrigation. Strategies to improve model robustness are demonstrated, including a method of data augmentation that randomly shifts training samples, and an assessment of the classifier types that produce the best performance in withheld target regions. The methodology is applied to detect smallholder irrigation in two states in the Ethiopian Highlands, Tigray and Amhara, where detection of irrigated smallholder farm plots is crucial for energy infrastructure planning. Results show that a transformer-based neural network architecture allows for the most robust prediction performance in withheld regions, followed closely by a CatBoost model. Over withheld ground-collection survey labels, the transformer-based model achieves 96.7% accuracy over non-irrigated samples and 95.9% accuracy over irrigated samples. Over a larger set of samples independently collected <italic>via</italic> the introduced method of label supplementation, non-irrigated and irrigated labels are predicted with 98.3 and 95.5% accuracy, respectively. The detection model is then deployed over Tigray and Amhara, revealing crop rotation patterns and year-over-year irrigated area change. Predictions suggest that irrigated area in these two states has decreased by approximately 40% from 2020 to 2021.</p>
</abstract>
<kwd-group>
<kwd>spatiotemporal modeling</kwd>
<kwd>irrigation detection</kwd>
<kwd>multiscale imagery</kwd>
<kwd>machine learning</kwd>
<kwd>Ethiopia</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Between 1970 and 2008, global irrigated area increased from 170 million to 304 million hectares (<xref ref-type="bibr" rid="B31">Vogels M. F. A. et al., 2019</xref>). In sub-Saharan Africa, however, as little as 4&#x2013;6% of cultivated area is irrigated, given the lack of electric grid infrastructure and the high cost of diesel (<xref ref-type="bibr" rid="B34">Wiggins et al., 2021</xref>). Locating isolated irrigation identifies areas that can support higher quality energy provision services&#x2014;e.g., a grid connection or minigrid installation&#x2014;as these sites can sustain higher energy demands and the attendant electricity costs (<xref ref-type="bibr" rid="B7">Conlon et al., 2020</xref>). Facilitated through informed planning, irrigation expansion has a direct impact on poverty reduction: In Ethiopia, one study found that the average income of irrigating households was double that of non-irrigating households (<xref ref-type="bibr" rid="B10">Gebregziabher et al., 2009</xref>).</p>
<p>In data-poor locations, satellite imagery provides a source of detailed synoptic observations of irrigated agriculture (<xref ref-type="bibr" rid="B23">Pervez et al., 2014</xref>). A previous irrigation mapping effort in Ethiopia used three 1.5&#xa0;m resolution SPOT6 images to distinguish between large-scale and smallholder irrigation in the Ethiopian rift (<xref ref-type="bibr" rid="B31">Vogels M. F. A. et al., 2019</xref>). This approach was then adapted to intake timeseries of 10&#xa0;m Sentinel-2 imagery to predict irrigation presence across the Horn of Africa (<xref ref-type="bibr" rid="B30">Vogels M. et al., 2019</xref>). While both studies demonstrated high accuracies over collected observations, limited labels precluded a more rigorous performance assessment over the entire area of interest. Other studies have used multiscale imagery to detect irrigation, including one that fuses MODIS and Landsat imagery to identify irrigated extent, frequency, and timing in northwestern China (<xref ref-type="bibr" rid="B6">Chen et al., 2018</xref>). Here, unique advantages of satellite imagery products at different resolutions are exploited: 250&#xa0;m MODIS imagery is valuable for characterizing vegetation over large areas (<xref ref-type="bibr" rid="B11">Huete et al., 1999</xref>), while decameter resolution imagery from Landsat or Sentinel-2 missions can better discern plot extent (<xref ref-type="bibr" rid="B19">Phiri et al., 2020</xref>).</p>
<p>Deep learning techniques have become widely used for land process classification, as they uncover intricate structures in large, complex datasets (<xref ref-type="bibr" rid="B14">Lecun et al., 2015</xref>) and provide a robust method of handling phenological variability (<xref ref-type="bibr" rid="B36">Zhong et al., 2019</xref>). However, despite increasing availability of remotely sensed imagery, computing resources, and advanced algorithms for information extraction, high-quality labels remain scarce and expensive to acquire. Methods of overcoming label scarcity generally fall into one of four categories: 1) using pretrained networks; 2) unsupervised and self-supervised learning; 3) data augmentation; or 4) additional label collection (<xref ref-type="bibr" rid="B15">Li et al., 2018</xref>). Even as networks pretrained on datasets like ImageNet (<xref ref-type="bibr" rid="B8">Deng et al., 2009</xref>) are highly effective for true-color image classification, these networks&#x2019; weights do not translate to tasks that intake multispectral or hyperspectral imagery (<xref ref-type="bibr" rid="B29">Tao et al., 2022</xref>). Unsupervised learning techniques, including those that ensemble different clustering methods&#x2014;e.g., <xref ref-type="bibr" rid="B2">Banerjee et al. (2015)</xref>&#x2014;have been shown to effectively organize unlabeled imagery. Existing work has also demonstrated that training a Generative Adversarial Network (GAN)&#x2014;itself a type of unsupervised learning&#x2014;can improve change detection performance on multispectral imagery, e.g., <xref ref-type="bibr" rid="B22">Saha et al. (2019)</xref>. For data augmentation, three techniques are often implemented: image translation, rotation, and flipping (<xref ref-type="bibr" rid="B35">Yu et al., 2017</xref>; <xref ref-type="bibr" rid="B27">Stivaktakis et al., 2019</xref>); however, these techniques do not have obvious analogues for pixel-based classification.</p>
<p>In assessing the impact of training dataset size on land cover classification performance, <xref ref-type="bibr" rid="B21">Ramezan et al. (2021)</xref> recommends investigating multiple classifier types, as the performance of specific classifiers is highly dependent on the number of training samples. A number of other studies have introduced methods for obtaining training samples, including collection <italic>via</italic> hand-engineered rules (<xref ref-type="bibr" rid="B1">Abbasi et al., 2015</xref>); normalized difference vegetation index (NDVI) thresholding (<xref ref-type="bibr" rid="B3">Bazzi et al., 2021</xref>); finding neighboring pixels that are highly similar to labeled pixels (<xref ref-type="bibr" rid="B17">Naik and Kumar, 2021</xref>); and visual inspection of high-resolution (<xref ref-type="bibr" rid="B30">Vogels M. et al., 2019</xref>) and decameter resolution (<xref ref-type="bibr" rid="B16">Wu and Chin, 2016</xref>) imagery. Lastly, while larger training datasets generally yield better model performance, condensing input samples <italic>via</italic> dimensionality reduction has been demonstrated to increase land cover classification accuracy (<xref ref-type="bibr" rid="B28">Stromann et al., 2020</xref>; <xref ref-type="bibr" rid="B24">Sivaraj et al., 2022</xref>).</p>
<p>Another lingering issue in land process mapping is determining the conditions under which a model can be utilized in locations beyond where it was trained. Site-specific methods may not be easily transferable to other places or climes (<xref ref-type="bibr" rid="B18">Ozdogan et al., 2010</xref>; <xref ref-type="bibr" rid="B4">Bazzi et al., 2020</xref>), and the performance of transferred models can often only be assessed <italic>after</italic> full implementation in a novel setting (<xref ref-type="bibr" rid="B20">de Lima and Marfurt, 2020</xref>). Therefore, processes that yield insights about model transferability <italic>before</italic> training and inference offer benefits to researchers seeking to understand the maximum spatial applicability of their approaches.</p>
<p>As current methods primarily focus on already well-understood areas of interest with existing datasets, new techniques and products need to be developed for parts of the world lacking labeled data. In the realm of irrigation detection, new methodologies and mapping products can help identify locations for further energy system planning and investment, as these areas contain latent energy demands that can make higher quality energy services cost-effective and increase incomes. To this end, the following paper presents a multiscale methodology that leverages 250&#xa0;m MODIS imagery for regional phenological characterization and 10&#xa0;m Sentinel-2 imagery for irrigation detection on smallholder plots. In applying this approach to the 205,000&#xa0;km<sup>2</sup> Ethiopian Highlands, the paper introduces a novel method of label collection; an evaluation of different classifier architectures and training strategies that ensure model applicability within the area of interest; and an assessment of irrigated area in the Tigray and Amhara states of Ethiopia for 2020 and 2021.</p>
</sec>
<sec id="s2">
<title>2 Background</title>
<p>Identification of dry season greening as potentially irrigated agriculture must take into account spatiotemporal variations in native vegetation phenological cycles. The complex topography of the Ethiopian Highlands and East African rift system, combined with the latitudinal movement of the InterTropical Convergence Zone (ITCZ) and seasonal upwelling of the Somali current in the Arabian Sea, produces a diversity of rainfall patterns that control annual vegetation phenological cycles in the study area<xref ref-type="fn" rid="fn1">
<sup>1</sup>
</xref>. In order to provide phenological context with which to identify anomalous dry season greening, a regional vegetation phenology map is derived from spatiotemporal analysis of timeseries of vegetation abundance maps. Using the spatiotemporal characterization and temporal mixture modeling approach of <xref ref-type="bibr" rid="B26">Small (2012)</xref>, applied to timeseries of MODIS enhanced vegetation index (EVI) maps, four temporal endmember (tEM) phenologies are identified that bound the temporal feature space of all vegetation phenology cycles observed in the East African Sahel. These four tEM phenologies form the basis of a linear temporal mixture model that can be inverted to provide tEM fraction estimates for each pixel&#x2019;s vegetation phenology. <xref ref-type="fig" rid="F1">Figure 1</xref> presents a spatiotemporal phenological characterization for the country, created from 16-day 250&#xa0;m MODIS EVI imagery between 1 June 2011 and 1 June 2021.</p>
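<p>As a minimal illustration of the inversion step, a linear temporal mixture model can be solved per pixel with ordinary least squares. The tEM curves, shapes, and the function name <monospace>unmix_timeseries</monospace> below are synthetic stand-ins, not the authors&#x2019; implementation:</p>

```python
import numpy as np

def unmix_timeseries(pixel_ts, tems):
    """Invert a linear temporal mixture model for one pixel.

    pixel_ts: (T,) EVI timeseries; tems: (K, T) temporal endmembers.
    Returns the K tEM fractions and the unmixing RMS error.
    """
    # Solve pixel_ts ~= tems.T @ fractions in the least-squares sense.
    fractions, *_ = np.linalg.lstsq(tems.T, pixel_ts, rcond=None)
    rms = np.sqrt(np.mean((tems.T @ fractions - pixel_ts) ** 2))
    return fractions, rms

# Synthetic tEMs on a 16-day annual grid (23 timesteps):
t = np.linspace(0.0, 1.0, 23)
single_cycle = np.exp(-((t - 0.7) ** 2) / 0.01)      # one Sep/Oct peak
evergreen = np.full_like(t, 0.6)                      # perennial vegetation
double_cycle = 0.15 * (np.sin(4 * np.pi * t) + 1.0)   # two cycles per year
non_vegetated = 0.05 + 0.02 * t                       # barren background
tems = np.stack([single_cycle, evergreen, double_cycle, non_vegetated])

# A pixel that is half single-cycle cropland, half evergreen:
fractions, rms = unmix_timeseries(0.5 * single_cycle + 0.5 * evergreen, tems)
```

For this noiseless mixture the recovered fractions are 0.5/0.5 on the two contributing tEMs; with real EVI timeseries, the RMS residual provides the per-pixel unmixing error mapped in the Supplementary Figures.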
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Continuous endmember fraction map derived from a temporal mixture model of 250&#xa0;m MODIS enhanced vegetation index (EVI) timeseries. Smooth gradients and abrupt transitions in phenology are primarily related to topography and variations in precipitation. Region names showing locations of labeled polygons are italicized: The region containing ground collection (GC) labels is delineated in gold; the regions containing visual collection (VC) labels are delineated in blue.</p>
</caption>
<graphic xlink:href="frsen-03-871942-g001.tif"/>
</fig>
<p>The four tEMs extracted for Ethiopia are as follows: a <italic>single cycle</italic> tEM, representing a single annual vegetation cycle per year that peaks in September/October; an <italic>evergreen</italic> tEM, representing perennial vegetation; a <italic>double cycle</italic> tEM, representing semiannual vegetation cycles observed on the Somali peninsula; and a <italic>non-vegetated</italic> tEM, representing barren or non-existent vegetation. The ensuing phenology map in <xref ref-type="fig" rid="F1">Figure 1</xref> contains unmixing root mean square (RMS) error less than 10% for 90% of the pixels; additional unmixing error statistics and the locations of the extracted tEMs in principal component (PC) feature space are shown in <xref ref-type="sec" rid="s11">Supplementary Figures S1, S2</xref>.</p>
<p>
<xref ref-type="fig" rid="F1">Figure 1</xref> divides roughly into four quadrants. In the northeast quadrant, Afar appears as dark green, indicating that none of the four tEMs contribute significantly to phenologies in this part of the country: The vegetation that does exist in this mostly barren area is represented by low levels of evergreen tEM abundances. In the southeast quadrant, dominated by Somali and a portion of Oromia, vegetation patterns cycle twice annually. This is an area with bimodal rainfall but low total annual precipitation that results in the <italic>double cycle</italic> tEM containing peak vegetation abundances lower than those of the <italic>single cycle</italic> and <italic>evergreen</italic> tEMs. It follows that southeast Ethiopia is more pastoral with sparser vegetation than other parts of the country.</p>
<p>The southwest quadrant&#x2014;covering Southern Nations, Nationalities, and Peoples&#x2019; (SNNP) Region, Sidama, and the western portion of Oromia&#x2014;contains significant amounts of evergreen vegetation, as is demonstrated by its bright green hue. Here, evergreen vegetation is supported by bimodal rainfall with higher levels of annual precipitation than in eastern Ethiopia. In contrast, the northwest quadrant of the phenology map contains red-dominant color gradients, indicating phenologies similar to the <italic>single cycle</italic> tEM. This portion of the country, known as the Ethiopian Highlands and comprising Amhara and Tigray, is highly agricultural; the main cropping season lasts from June to October and coincides with the primary <italic>kiremt</italic> rains, with some secondary cropping following the lighter <italic>belg</italic> rains from March to May. Accordingly, cropping that occurs during the dry season between November and March is likely to be irrigated.</p>
<p>In presenting a map of dominant vegetation phenologies in Ethiopia, <xref ref-type="fig" rid="F1">Figure 1</xref> provides a guide for land cover classification applicability within the country. For instance, a dry season irrigation detector trained in Amhara will perform poorly in SNNP, as phenological patterns differ significantly across these states, and dry season crop cycles exhibit different vegetation signatures. In contrast, a dry season irrigation detector developed across Amhara can be transferred to Tigray or Benishangul-Gumuz, due to regional phenological similarities.</p>
<p>The named, italicized outlines in <xref ref-type="fig" rid="F1">Figure 1</xref> represent the eight areas containing labels used in this paper, referred to as <italic>regions</italic>: The gold outline indicates a region where labels were collected <italic>via</italic> a ground survey, and the blue outlines indicate regions where labels were collected by means of visual interpretation and timeseries inspection. Full information on the labeled data collection process is presented in <xref ref-type="sec" rid="s3">Section 3</xref>.</p>
</sec>
<sec id="s3">
<title>3 Materials and Methods</title>
<p>The data collection portion of this paper&#x2019;s methodology consists of pairing Sentinel-2 imagery with labeled polygons to train an irrigation detector. Here, a pixel timeseries paired with a binary irrigation/non-irrigation label constitutes a sample. Irrigation is defined as follows: A pixel is irrigated if its phenology includes at least one non-perennial vegetation cycle during the dry season, 1 December to 1 April for the Ethiopian Highlands. Conversely, a pixel is non-irrigated if its phenology demonstrates only vegetation growth that can be attributed to the area&#x2019;s known rainy seasons. Irrigated areas are only of interest if they contain dry season vegetation cycles; this strict definition of irrigation excludes supplemental irrigation practices and perennial crops that may be consistently irrigated throughout the year.</p>
<sec id="s3-1">
<title>3.1 Sentinel-2 Imagery Collection</title>
<p>The following analysis uses bottom-of-atmosphere corrected (processing level L2A) Sentinel-2 temporal stacks&#x2014;four-dimensional arrays created by stacking a fixed spatial extent of imagery bands over multiple timesteps&#x2014;accessed <italic>via</italic> the Descartes Labs (DL) platform, a commercial environment for planet-scale geospatial analysis. Images are collected at a 10-day time resolution. To focus on the 2020 and 2021 dry seasons, the time period of interest is defined as between 1 June 2019, and 1 June 2021. Given the 10-day timestep, 72 image mosaics are collected&#x2014;36 per year. Additional information on the imagery download process is available in the <xref ref-type="sec" rid="s11">Supplementary Material</xref>.</p>
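<p>For reference, EVI can be computed over such a stack with the standard coefficients (G&#xa0;=&#xa0;2.5, C1&#xa0;=&#xa0;6, C2&#xa0;=&#xa0;7.5, L&#xa0;=&#xa0;1); the choice of Sentinel-2 bands B8/B4/B2 and the 0&#x2013;1 reflectance scaling below are conventional assumptions rather than details given in this paper:</p>

```python
import numpy as np

def evi(nir, red, blue):
    """Enhanced vegetation index from surface reflectance on a 0-1 scale.

    For Sentinel-2 L2A at 10 m these are typically B8 (NIR), B4 (red),
    and B2 (blue).
    """
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

# A temporal stack here is (timesteps, height, width, bands); with the
# 10-day timestep, two years give 72 mosaics:
stack = np.random.default_rng(0).uniform(0.0, 0.5, size=(72, 4, 4, 3))
evi_stack = evi(stack[..., 0], stack[..., 1], stack[..., 2])  # (72, 4, 4)
```

Collapsing the band axis in this way yields one EVI map per timestep, the form in which timeseries are used throughout the rest of the methodology.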
</sec>
<sec id="s3-2">
<title>3.2 Label Collection</title>
<p>Two types of labeled data are leveraged for irrigation mapping: <italic>ground collection</italic> (GC) labels, acquired <italic>via</italic> an in-person survey; and <italic>visual collection</italic> (VC) labels, acquired <italic>via</italic> visual identification of dry season vegetation from Sentinel-2 imagery using the DL platform and subsequent cleaning <italic>via</italic> timeseries clustering. The locations of these GC and VC regions are shown in italics in <xref ref-type="fig" rid="F1">Figure 1</xref>, with all labels collected for the 2021 dry season. A description of the ground collection survey is presented in the <xref ref-type="sec" rid="s11">Supplementary Material</xref>. As the GC labels constitute our highest quality irrigation observations, verified by <italic>in situ</italic> visits to individual plots, we do not use them for training during the model sensitivity analysis, instead reserving them for validation of classifier performance.</p>
<sec id="s3-2-1">
<title>3.2.1 Visual Label Collection</title>
<p>To supplement the GC labels located in Tana, visually collected labels are acquired for seven separate regions <italic>via</italic> a three-step process of 1) visual inspection, 2) EVI timeseries confirmation, and 3) cluster cleaning. Each of these steps is described in the correspondingly titled subsection below.</p>
<sec id="s3-2-1-1">
<title>3.2.1.1 Visual Inspection</title>
<p>The first step in the VC labeling process involves drawing polygons around locations that either: 1) present as cropland with visible vegetation growth (for the collection of irrigated samples), or 2) present as cropland with no visible vegetation growth (for the collection of non-irrigated samples), based on dry-season, false-color Sentinel-2 imagery presented on the DL platform. Sub-meter resolution commercial satellite imagery from Google Earth Pro is also used to confirm the existence of cropland in the viewing window. For the collection of non-irrigated labels, polygons are restricted to areas that contain non-perennial cropland; however, because only phenologies that contain dry season vegetation cycles are considered irrigated, non-irrigated polygons occasionally overlap other types of land cover&#x2014;e.g., perennial crops, fallow cropland, or areas with human settlement&#x2014;with any overlap likely to improve training robustness.</p>
</sec>
<sec id="s3-2-1-2">
<title>3.2.1.2 Enhanced Vegetation Index Timeseries Confirmation</title>
<p>After drawing a polygon around a suspected irrigated or non-irrigated area, the second step in the VC label acquisition process entails inspection of the median Sentinel-2 EVI timeseries of all pixels contained within the polygon; this step is shown in the plot windows of <xref ref-type="fig" rid="F2">Figure 2</xref>. Here, all available Sentinel-2 imagery with less than 20% cloud cover between 1 June 2020, and 1 June 2021 is retrieved; a cubic spline is then fit to all available data to generate continuous EVI timeseries. For potential irrigated polygons, if the EVI timeseries shows a clear peak above 0.2 during the dry season, it is confirmed as irrigated. Similarly, for potential non-irrigated polygons, an EVI timeseries that demonstrates a single vegetation cycle attributable to Ethiopia&#x2019;s June to September rains is taken as confirmation of a non-irrigated VC polygon. However, if the EVI timeseries does not confirm the expected irrigated/non-irrigated class, or if the plotted EVI error bars (representing &#xb1;one standard deviation of the EVI values at that timestep) indicate a level of signal noise within the polygon that prevents the identification of a clear vegetation phenology, the polygon is discarded.</p>
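<p>The irrigated-confirmation rule above can be sketched as follows; SciPy&#x2019;s <monospace>CubicSpline</monospace> stands in for the spline fit, and the day axis, example values, and function name are illustrative assumptions:</p>

```python
import numpy as np
from scipy.interpolate import CubicSpline

def dry_season_peak(days, median_evi, dry_start, dry_end, thresh=0.2):
    """Fit a cubic spline to a polygon's median EVI observations and test
    for a clear dry-season peak above `thresh`; a polygon whose timeseries
    contradicts its expected class would be discarded."""
    spline = CubicSpline(days, median_evi)
    dense = np.linspace(dry_start, dry_end, 200)
    return bool(spline(dense).max() > thresh)

# Days counted from 1 June; the dry season (1 Dec-1 Apr) spans roughly
# days 183-304 on this axis:
days = np.array([0, 40, 80, 120, 160, 200, 240, 280, 320, 360])
rainfed = np.array([0.1, 0.3, 0.5, 0.3, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
irrigated = np.array([0.1, 0.3, 0.5, 0.3, 0.1, 0.15, 0.4, 0.45, 0.2, 0.1])

dry_season_peak(days, rainfed, 183, 304)    # no dry-season cycle
dry_season_peak(days, irrigated, 183, 304)  # clear second greening cycle
```

The analogous non-irrigated check (a single rainy-season cycle and no dry-season peak) is the negation of the same test over the dry-season window.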
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Example of the visual collection (VC) labeling process in Koga using the Descartes Labs platform. Blue polygons denote areas determined to be irrigated; red polygons are determined to be non-irrigated. Background imagery is a false-color Sentinel-2 image taken in March 2021: Red, near-infrared, and blue bands are presented in the RGB channels, respectively. In <bold>(A)</bold>, the Sentinel-2 enhanced vegetation index (EVI) timeseries is shown for the drawn purple rectangle in the middle of the window; in <bold>(B)</bold>, the Sentinel-2 EVI timeseries is shown for the drawn pink, semi-octagonal polygon in the top left of the window. Both timeseries present the median EVI values for all pixels contained within the drawn polygon; the error bars show one standard deviation of these values above and below the median. In both figures, the drawn polygons are confirmed as VC labels, since they meet the definitions of irrigation/non-irrigation, respectively.</p>
</caption>
<graphic xlink:href="frsen-03-871942-g002.tif"/>
</fig>
<p>
<xref ref-type="fig" rid="F2">Figure 2A</xref> demonstrates an example of irrigated VC label collection in the Koga region&#x2014;here, the double vegetation peak present in the EVI timeseries confirms the purple polygon in the center of the window as irrigated (blue polygons indicate areas already saved as irrigated VC labels). <xref ref-type="fig" rid="F2">Figure 2B</xref> demonstrates the same process for non-irrigated VC labels, also in Koga: The single EVI peak in October 2020 confirms the pink polygon in the top left of the window as non-irrigated (red polygons indicate areas already saved as non-irrigated VC labels).</p>
</sec>
<sec id="s3-2-1-3">
<title>3.2.1.3 Cluster Cleaning</title>
<p>The third step in the VC label acquisition process involves bulk verification of the collected timeseries by means of cluster cleaning. For each VC region, all pixels that reside within labeled polygons are collected and split based on the irrigated/non-irrigated class labels of the polygons. Fifteen-component Gaussian mixture models are fit to each class&#x2019;s data to extract the dominant phenologies contained within the region&#x2019;s samples; the EVI timeseries representing the cluster centroids are then plotted, with the plot legend displaying the number of samples per cluster. <xref ref-type="fig" rid="F3">Figure 3A</xref> presents the results of this initial clustering for the Koga region.</p>
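<p>A minimal sketch of the per-class mixture fit, using scikit-learn&#x2019;s <monospace>GaussianMixture</monospace>; the synthetic data, diagonal covariance choice, and variable names are assumptions for illustration:</p>

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# X stands in for one region/class: N pixel EVI timeseries of T timesteps.
rng = np.random.default_rng(0)
X = rng.normal(loc=0.2, scale=0.05, size=(600, 36))

# Fifteen-component Gaussian mixture; its means are the candidate dominant
# phenologies whose centroid timeseries are plotted for inspection.
gmm = GaussianMixture(n_components=15, covariance_type="diag",
                      random_state=0).fit(X)
centroids = gmm.means_                              # (15, 36) EVI curves
counts = np.bincount(gmm.predict(X), minlength=15)  # samples per cluster
```

The centroid curves and per-cluster counts are exactly what appears in the legend-annotated plots of Figure 3.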
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Clustered enhanced vegetation index (EVI) timeseries before and after cluster cleaning for the Koga visual collection (VC) region. Before and after cleaning, pixels are grouped into one of 15 randomly indexed clusters. In <bold>(A)</bold>, Clusters 3, 6, and 13 of the irrigated samples are discarded due to either (6, 13) not containing a clear EVI peak above 0.2 during the dry season (December 1<sup>st</sup> to April 1<sup>st</sup>); or (3) not containing successive EVI values below 0.2. All non-irrigated clusters display a single vegetation peak aligned with the main rainy season, and the irrigated clusters after cleaning <bold>(B)</bold> all display a vegetation cycle during the dry season.</p>
</caption>
<graphic xlink:href="frsen-03-871942-g003.tif"/>
</fig>
<p>From the initial cluster timeseries, an iterative process begins to ensure that all cluster timeseries align with the specified class label. For an irrigated cluster timeseries to be kept, it must contain multiple successive EVI values above and below 0.2, and it must contain a clear EVI peak above 0.2 during the dry season. Analogously, non-irrigated cluster timeseries are discarded if they display a clear dry-season EVI peak above 0.2. If these conditions are not met&#x2014;as is the case for Clusters 3, 6, and 13 of the Koga irrigated samples, which do not contain a clear EVI peak above 0.2 between 1 December 2020 and 1 April 2021 (Clusters 6 and 13) or do not senesce below an EVI threshold of 0.2 for successive timesteps (Cluster 3)&#x2014;all pixel timeseries associated with that cluster are discarded from the labeled data. This process is repeated until all 15 clusters for both classes demonstrate EVI signals that meet the non-irrigated/irrigated class definitions. The final, cleaned cluster timeseries for the Koga region are shown in <xref ref-type="fig" rid="F3">Figure 3B</xref>.</p>
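<p>The keep/discard rules can be expressed as simple checks over each cluster centroid timeseries. In this sketch, &#x201c;multiple successive&#x201d; is interpreted as at least two consecutive timesteps, and the function names and window indices are assumptions:</p>

```python
import numpy as np

def has_run(mask, n=2):
    """True if boolean `mask` has at least `n` consecutive True values."""
    run = 0
    for m in mask:
        run = run + 1 if m else 0
        if run >= n:
            return True
    return False

def keep_irrigated(centroid, dry, thresh=0.2):
    """Keep an irrigated cluster only if its centroid cycles through the
    threshold (successive values above AND below 0.2) and peaks above 0.2
    inside the dry-season window `dry` (a boolean timestep mask)."""
    cycles = has_run(centroid > thresh) and has_run(centroid < thresh)
    return cycles and centroid[dry].max() > thresh

def keep_nonirrigated(centroid, dry, thresh=0.2):
    """Discard a non-irrigated cluster if it peaks in the dry season."""
    return centroid[dry].max() <= thresh

# 36 ten-day steps from 1 June; steps 18-30 roughly span 1 Dec-1 Apr:
dry = np.zeros(36, dtype=bool)
dry[18:31] = True
steps = np.arange(36)
rainy_peak = np.where((steps >= 8) & (steps <= 14), 0.4, 0.1)
double_peak = rainy_peak.copy()
double_peak[20:25] = 0.35  # adds a dry-season greening cycle
```

Under these rules, <monospace>double_peak</monospace> survives as irrigated, while a rainy-season-only centroid survives as non-irrigated and an always-green centroid is discarded from the irrigated class (it never senesces below 0.2, as with Cluster 3 in Koga).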
<p>Cluster-cleaning is performed for all regions&#x2019; labeled data, including labeled data collected from the GC region, Tana. For increased visibility into the labeled data collected and used for training, these regions&#x2019; clusters before and after cleaning are included in <xref ref-type="sec" rid="s11">Supplementary Appendix SA</xref> of the <xref ref-type="sec" rid="s11">Supplementary Material</xref>.</p>
<p>A summary of the number of collected polygons and cleaned pixel timeseries samples in each region is shown in <xref ref-type="sec" rid="s11">Supplementary Tables S2, S3</xref>: In total, 1,207,233 non-irrigated samples and 907,887 irrigated samples are used, taken from 1702 and 750 labeled polygons, respectively. For model training and evaluation, data are divided among training, validation, and test splits<xref ref-type="fn" rid="fn2">
<sup>2</sup>
</xref>. Here, polygons in each labeled region are split according to a 70/15/15 training/validation/test ratio; this method ensures that highly similar pixels from within the same polygon do not appear in more than one split, a division of data that would artificially inflate model performance for the task of predicting irrigation over pixel timeseries unseen by the model. All training, validation, and testing are performed pixelwise (i.e., having removed the spatial relationships of samples).</p>
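<p>The polygon-level split can be sketched as below; the function name, seed, and use of integer polygon identifiers are illustrative assumptions:</p>

```python
import random

def split_polygons(polygon_ids, ratios=(0.70, 0.15, 0.15), seed=0):
    """Assign whole polygons to train/validation/test so that pixels from
    a single polygon never land in more than one split (avoiding leakage
    between near-identical neighboring pixel timeseries)."""
    ids = list(polygon_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(ratios[0] * len(ids))
    n_val = round(ratios[1] * len(ids))
    return {
        "train": set(ids[:n_train]),
        "val": set(ids[n_train:n_train + n_val]),
        "test": set(ids[n_train + n_val:]),
    }

# Every pixel sample then inherits the split of its parent polygon.
splits = split_polygons(range(100))
```

Splitting by polygon rather than by pixel is the key design choice: pixels within one plot share nearly identical timeseries, so a pixel-level split would leak test information into training.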
<p>The <xref ref-type="sec" rid="s11">Supplementary Material</xref> contains additional information about the labeled data distributions, including a statistical evaluation of the similarity of labeled samples across region and class (<xref ref-type="sec" rid="s11">Supplementary Tables S4, S5</xref>).</p>
</sec>
</sec>
</sec>
<sec id="s3-3">
<title>3.3 Prediction Admissibility Criteria</title>
<p>Given that irrigated phenologies exist over a small fraction of the total land area of the Ethiopian Highlands, and that there are many types of land cover that do not fall within this paper&#x2019;s non-irrigated/irrigated cropland dichotomy, a set of criteria is imposed to exclude pixel phenologies that are not cropland or are highly unlikely to be irrigated. <xref ref-type="table" rid="T1">Table 1</xref> presents the five criteria that must all be met for a pixel timeseries to be considered potentially irrigated, along with the motivation behind each.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Prediction admissibility criteria. All criteria need to be satisfied for a prediction to be admitted as irrigated.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Admissibility Criteria</th>
<th align="left">Motivation</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">10th percentile of EVI timeseries <inline-formula id="inf1">
<mml:math id="m1">
<mml:mo>&#x3c;</mml:mo>
</mml:math>
</inline-formula> 0.2</td>
<td align="left">Remove evergreen pixels</td>
</tr>
<tr>
<td align="left">90th percentile of EVI timeseries <inline-formula id="inf2">
<mml:math id="m2">
<mml:mo>&#x3e;</mml:mo>
</mml:math>
</inline-formula> 0.2</td>
<td align="left">Remove barren/non-vegetated pixels</td>
</tr>
<tr>
<td align="left">Maximum of the EVI timeseries during the dry season (Dec. 1&#x2013;Apr. 1) <inline-formula id="inf3">
<mml:math id="m3">
<mml:mo>&#x3e;</mml:mo>
</mml:math>
</inline-formula> 0.2</td>
<td align="left">Remove pixels with no vegetation growth in the dry season</td>
</tr>
<tr>
<td align="left">Ratio of the 90th:10th percentile of the EVI timeseries <inline-formula id="inf4">
<mml:math id="m4">
<mml:mo>&#x3e;</mml:mo>
</mml:math>
</inline-formula> 2</td>
<td align="left">Remove evergreen pixels</td>
</tr>
<tr>
<td align="left">Shuttle Radar Topography Mission slope measurement <inline-formula id="inf5">
<mml:math id="m5">
<mml:mo>&#x3c;</mml:mo>
</mml:math>
</inline-formula> 8%</td>
<td align="left">Remove pixels in highly sloped settings where cropping is impractical</td>
</tr>
</tbody>
</table>
</table-wrap>
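The five admissibility criteria in Table 1 can be sketched as a single predicate over one pixel. This is an illustrative reconstruction, not the authors' implementation; in particular, the guard against a non-positive 10th percentile in the ratio test is an assumption.

```python
import numpy as np

def is_admissible(evi, slope_percent, dry_mask):
    """Check the five admissibility criteria from Table 1 for one pixel.

    evi: 1-D array, the pixel's EVI timeseries.
    slope_percent: SRTM-derived slope at the pixel, in percent.
    dry_mask: boolean array marking dry-season (Dec. 1 - Apr. 1) timesteps.
    """
    p10, p90 = np.percentile(evi, [10, 90])
    ratio = p90 / max(p10, 1e-6)       # guard against p10 <= 0 (assumption)
    return bool(
        p10 < 0.2                      # remove evergreen pixels
        and p90 > 0.2                  # remove barren/non-vegetated pixels
        and evi[dry_mask].max() > 0.2  # require dry-season vegetation growth
        and ratio > 2                  # remove evergreen pixels
        and slope_percent < 8          # remove highly sloped settings
    )
```

A prediction is admitted as irrigated only if this predicate returns `True`; the same predicate doubles as the reference non-machine-learning classifier described below.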
<p>These vegetation-specific criteria are informed by the EVI distributions of labeled irrigated samples for all label collection regions: <xref ref-type="sec" rid="s11">Supplementary Figure S3</xref> contains cumulative distribution functions (CDFs) for the 10th and 90th EVI timeseries percentiles, the 90th:10th EVI timeseries percentile ratio, and the maximum EVI value during the dry season. CDFs are presented for all regions&#x2019; irrigated samples, including for a set of polygons collected over evergreen land cover areas.</p>
<p>The criteria in <xref ref-type="table" rid="T1">Table 1</xref> are also used to create a reference irrigation classifier that does not rely on machine learning. For this reference classifier, if all five conditions are met, the sample is deemed irrigated; if any of the conditions is not satisfied, the sample is deemed non-irrigated.</p>
</sec>
<sec id="s3-4">
<title>3.4 Model Training</title>
<sec id="s3-4-1">
<title>3.4.1 Model Architectures</title>
<p>Five separate classifier types are compared to determine the model architecture with the most robust irrigation detection performance across regions. The first two classifiers are decision tree-based: A random forest with 1000 trees (<xref ref-type="bibr" rid="B5">Breiman, 2001</xref>); and a CatBoost model that uses gradient boosting on up to 1000 trees (<xref ref-type="bibr" rid="B9">Dorogush et al., 2017</xref>). The other three classifiers are neural networks (NN): A baseline network, a long short-term memory (LSTM)-based network, and a transformer-based network. For comparability, these three classifier architectures are designed to have similar structures, based on the strong baseline model structure proposed by <xref ref-type="bibr" rid="B33">Wang et al. (2017)</xref>; as seen in <xref ref-type="fig" rid="F4">Figure 4</xref>, they differ only in the type of encoding blocks used.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Neural network (NN) model architectures tested as irrigation detection classifiers. Model architectures are consistent by design; only encoding blocks differ across networks.</p>
</caption>
<graphic xlink:href="frsen-03-871942-g004.tif"/>
</fig>
</sec>
<sec id="s3-4-2">
<title>3.4.2 Model Training Strategy</title>
<p>The implemented model training strategy addresses two potential pitfalls in the training process: 1) Imbalanced samples across region and class; and 2) high similarity among samples within a region that may not reflect the sample distributions across all regions. Consistent with best practices for imbalanced data, the first issue is addressed with 1) class-balancing weights specific to each region, based on the &#x201c;balanced&#x201d; heuristic inspired by <xref ref-type="bibr" rid="B12">King and Zeng (2001)</xref>; and 2) a region-specific weight equal to the ratio of the maximum number of samples in any region to the number of samples for the region in question. Both class-balancing and region-balancing weights are used in all training configurations.</p>
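A sketch of how the combined weights might be computed follows; the exact per-region form of the "balanced" heuristic is an assumption based on the description above.

```python
def sample_weights(counts):
    """Compute per-(region, class) training weights.

    counts: {region: {class_label: n_samples}}.
    Combines the 'balanced' class heuristic (region total divided by
    n_classes times the class count) with a region-balancing factor
    (max region total divided by this region's total)."""
    region_totals = {r: sum(c.values()) for r, c in counts.items()}
    max_total = max(region_totals.values())
    weights = {}
    for region, by_class in counts.items():
        n_classes = len(by_class)
        region_w = max_total / region_totals[region]   # region balancing
        for label, n in by_class.items():
            class_w = region_totals[region] / (n_classes * n)  # class balancing
            weights[(region, label)] = class_w * region_w
    return weights
```

Each training sample would then carry the weight of its (region, class) pair in the loss.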
<p>To address potential redundancy and time-specificity among samples within a region, random shifts are applied to all input timeseries. The sizes of these random shifts vary between &#x2212;3 and &#x2b;3 timesteps (corresponding to between &#x2212;30 and &#x2b;30&#xa0;days), with an equal probability of all 7 possible shifts occurring (including a shift by 0 timesteps). Random shifts are applied to all samples in the training and validation sets and differ for each sample every time it is seen by the model. No shifts are applied to the samples in the testing sets.</p>
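The random temporal shift can be sketched as follows. How the exposed edge is filled after shifting is not specified in the text, so the edge-value padding here is an assumption; `rng` is expected to behave like `numpy.random.default_rng()`.

```python
import numpy as np

def random_shift(evi, rng, max_shift=3):
    """Shift a timeseries by a uniform random offset of -max_shift to
    +max_shift timesteps (i.e., -30 to +30 days at a 10-day cadence),
    with all 2 * max_shift + 1 offsets equally likely.
    The exposed edge is padded with the edge value (assumption)."""
    k = rng.integers(-max_shift, max_shift + 1)
    if k == 0:
        return evi.copy()
    out = np.empty_like(evi)
    if k > 0:                # shift right: repeat the first value
        out[:k] = evi[0]
        out[k:] = evi[:-k]
    else:                    # shift left: repeat the last value
        out[k:] = evi[-1]
        out[:k] = evi[-k:]
    return out
```

In training, a fresh shift would be drawn every time a sample is seen; test samples are passed through unshifted.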
<p>The primary metric for performance evaluation is the F<sub>1</sub> score on the test datasets of regions withheld from training. Accordingly, performance is assessed in a manner that prioritizes classifier robustness&#x2014;i.e., performance in regions unseen during training&#x2014;and not in a manner that could be inflated by close similarity of samples within a region. For reference, the F<sub>1</sub> score balances prediction precision and recall, and is calculated per <xref ref-type="disp-formula" rid="e1">Eq. 1</xref>.<disp-formula id="e1">
<mml:math id="m6">
<mml:msub>
<mml:mrow>
<mml:mi>F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mfrac>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
</mml:math>
<label>(1)</label>
</disp-formula>
</p>
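Expressed in code, Eq. 1 computes directly from true positive (TP), false positive (FP), and false negative (FN) counts:

```python
def f1_score(tp, fp, fn):
    """F1 score per Eq. 1: the harmonic mean of precision and recall,
    written as TP / (TP + (FP + FN) / 2)."""
    return tp / (tp + 0.5 * (fp + fn))
```
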
<p>The training strategy differs for the tree-based classifiers and for the neural network-based classifiers. As training the tree-based classifiers occurs across a single batch with no iteration across epochs, there is no need for a separate validation dataset: The training and validation datasets of all included regions are therefore combined to create a single training dataset. After training on this combined dataset, performance is evaluated across the test datasets.</p>
<p>In contrast, training neural network-based models takes place by batch across epochs, and a validation set is required to guide the training process. For a given training step, one batch from each region is concatenated, with the combined output shuffled before model intake. After each epoch, performance is assessed on the validation set of each region included in training. If the minimum F<sub>1</sub> score among all regions&#x2019; validation sets has increased from its previous maximum, the model weights are saved; if it has not, the model weights are discarded. The minimum F<sub>1</sub> score across all validation regions is selected as the weight-update criterion to ensure model robustness: Consistent performance across the entire area of interest is desired, not high performance in one set of regions and poor performance in another. Training concludes once the minimum validation region F<sub>1</sub> score has not improved for 10 training epochs, or after 30 epochs have been completed. After training, model weights are loaded from the epoch with the highest minimum validation region F<sub>1</sub> score; performance of this model on the test datasets of all regions is then reported. For all training runs, a binary cross-entropy loss, a learning rate of 1e-4, and an Adam optimizer (<xref ref-type="bibr" rid="B13">Kingma and Ba, 2015</xref>) are specified. Inputs are standardized to a mean of 0 and standard deviation of 1 using statistics from the entire set of labeled samples.</p>
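The checkpointing logic described above can be sketched as follows. The `model` interface (`fit_one_epoch`, `evaluate`, `get_weights`, `set_weights`) is hypothetical, standing in for whatever training framework is used.

```python
def train_with_min_f1_early_stopping(model, regions, max_epochs=30,
                                     patience=10):
    """Checkpoint on the minimum validation F1 across regions; stop
    after `patience` epochs without improvement or at `max_epochs`.
    Returns the best minimum validation F1 and restores its weights."""
    best_min_f1, best_weights, stale = -1.0, None, 0
    for _ in range(max_epochs):
        model.fit_one_epoch()
        # Robustness criterion: the worst validation region drives saving.
        min_f1 = min(model.evaluate(region) for region in regions)
        if min_f1 > best_min_f1:
            best_min_f1, best_weights, stale = min_f1, model.get_weights(), 0
        else:
            stale += 1
            if stale >= patience:
                break
    model.set_weights(best_weights)  # reload the best checkpoint
    return best_min_f1
```
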
</sec>
</sec>
</sec>
<sec id="s4">
<title>4 Results</title>
<sec id="s4-1">
<title>4.1 Model Sensitivity</title>
<p>
<xref ref-type="fig" rid="F5">Figure 5</xref> presents withheld VC region test dataset F<sub>1</sub> scores for three different types of model input&#x2014;one that includes all spectral bands for all timesteps; one that includes only the EVI layer for all timesteps; and one that includes only the EVI layer for all timesteps with the random sample shift applied. Here, the performance of models trained on all combinations of VC regions is evaluated; these results are organized along the <italic>x</italic>-axis by the number of VC regions included during training. Each <italic>x</italic>-axis tick label also includes in parentheses the number of withheld VC region test dataset evaluations, <italic>n</italic>, for all models trained on <italic>x</italic> included VC regions<xref ref-type="fn" rid="fn3">
<sup>3</sup>
</xref>. Mean and 10th percentile values of the <italic>n</italic> performance evaluations are displayed for each <italic>x</italic> between 1 and 6. All results are presented for the transformer model architecture; however, these findings are agnostic to the classifier architecture selected.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Withheld region test dataset performance for different types of model input, organized along the <italic>x</italic>-axis by the number of regions included during training. <bold>(A)</bold> presents mean F<sub>1</sub> score over the withheld regions; <bold>(B)</bold> presents the 10th percentile F<sub>1</sub> score over the withheld regions. Results indicate that model inputs of randomly shifted enhanced vegetation index (EVI) timeseries yield the best classifier performance. F<sub>1</sub> scores from classification based on the prediction admissibility criteria are presented for reference.</p>
</caption>
<graphic xlink:href="frsen-03-871942-g005.tif"/>
</fig>
<p>
<xref ref-type="fig" rid="F5">Figure 5</xref> demonstrates that models trained on samples containing only EVI timeseries outperform those that include all spectral bands at all timesteps, both on average (<xref ref-type="fig" rid="F5">Figure 5A</xref>) and in low-performing regions (<xref ref-type="fig" rid="F5">Figure 5B</xref>). The 10th percentile of withheld regions&#x2019; F<sub>1</sub> scores is shown in order to understand the low end of model performance without accounting for outliers. For reference, classifier performance based on the prediction admissibility criteria is also included. <xref ref-type="fig" rid="F5">Figure 5</xref> shows that explicitly feeding classification models information about samples&#x2019; vegetation content&#x2014;i.e., feature engineering&#x2014;yields better performance than models that ingest the 10 Sentinel-2 L2A spectral bands containing ground information. Introducing a random temporal shift to the EVI timeseries further increases performance; by increasing the sample variance seen by the model, randomly shifting the input timeseries improves model transferability. <xref ref-type="sec" rid="s11">Supplementary Figure S4</xref> provides additional evidence of the benefits of this training strategy: A gradient class-activation map shows that a classifier trained on randomly shifted timeseries better identifies dry season vegetation as predictive of irrigation presence.</p>
<p>Taken together, randomly shifted EVI timeseries increase withheld region F<sub>1</sub> scores by an average of 0.22 when only 2 VC regions are included in the training data, compared to models that use all spectral bands. As performance begins to plateau with 4 or more VC regions included in the training data, this gap shrinks to an improvement of 0.10. Similar results can be seen in <xref ref-type="fig" rid="F5">Figure 5B</xref> for the low-end of performance: Extracting and randomly shifting EVI timeseries increase the 10th percentile of withheld region F<sub>1</sub> scores by 0.40 when 2 VC regions are included in the training data, a difference that shrinks to approximately 0.14 with 5 or more VC regions in the training data. Two additional findings are gleaned from the results for the models trained on randomly shifted EVI timeseries (i.e., the grey curve). First, a classifier trained on data from 2 VC regions or more outperforms the pixel filtering baseline. Second, increasing the number of VC regions included in the training set improves withheld region prediction performance up until 4 VC regions before tapering off.</p>
<p>
<xref ref-type="fig" rid="F6">Figure 6</xref> displays the mean (<xref ref-type="fig" rid="F6">Figure 6A</xref>) and 10th percentile (<xref ref-type="fig" rid="F6">Figure 6B</xref>) F<sub>1</sub> scores for all combinations of VC regions included in training for the 5 classification models tested, along with the reference classifier based on the prediction admissibility criteria. <xref ref-type="fig" rid="F6">Figure 6</xref> demonstrates that the transformer architecture is the most robust for all combinations of VC training regions, followed closely by the CatBoost architecture for all training configurations with 2 or more VC regions. Moreover, for models with 5 or 6 VC regions included in training, mean and low-end F<sub>1</sub> scores for these two architectures are practically indistinguishable at 0.97 and 0.92, respectively. The <xref ref-type="sec" rid="s11">Supplementary Material</xref> contains further comparisons between transformer and CatBoost performance (see <xref ref-type="sec" rid="s11">Supplementary Table S6</xref>), showing that when each model is trained on all 7 VC regions&#x2019; training data, the two models demonstrate an average regional prediction alignment of 98.9%. Additionally, an ablation study on training dataset size finds that reducing the proportion of polygons in the training set from 70 to 15% has minimal impact on prediction performance (<xref ref-type="sec" rid="s11">Supplementary Figure S5</xref>). Lastly, <xref ref-type="fig" rid="F6">Figure 6</xref> shows that the LSTM architecture does not noticeably improve performance compared to the baseline neural network, and that the trained random forest models yield the worst performance in withheld regions.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Withheld region test dataset performance for different classifier models, organized along the <italic>x</italic>-axis by the number of regions included during training. <bold>(A)</bold> presents mean F<sub>1</sub> score over the withheld regions; <bold>(B)</bold> presents the 10th percentile F<sub>1</sub> score over the withheld regions. Results indicate that the transformer based classifier yields the best performance, followed closely by the CatBoost model. F<sub>1</sub> scores from classification based on the prediction admissibility criteria are presented for reference.</p>
</caption>
<graphic xlink:href="frsen-03-871942-g006.tif"/>
</fig>
<p>Next, prediction performance over the unseen ground-collected samples in Tana is assessed. As the transformer model demonstrates the most robust performance over withheld regions&#x2019; samples, it is selected for prediction, achieving 95.9% accuracy over irrigated samples (88,128/91,898) and 96.7% accuracy over non-irrigated samples (33,954/35,121) for an F<sub>1</sub> score of 0.932. It is again worth noting that these high accuracies are achieved over the GC samples without the classification model seeing any labeled data from the Tana region during training.</p>
</sec>
<sec id="s4-2">
<title>4.2 Model Inference</title>
<p>For model inference, the transformer architecture is trained on the randomly shifted EVI timeseries of the labeled data from the 7 VC and 1&#xa0;GC regions. The trained model is then deployed over Tigray and Amhara for the 2020 and 2021 dry seasons (using imagery collected between 1 June 2019 and 1 June 2020; and between 1 June 2020 and 1 June 2021, respectively). Two post-processing steps are then taken: 1) The prediction admissibility criteria are applied, and 2) contiguous groups of predicted irrigated pixels smaller than 0.1&#xa0;Ha are removed in order to ignore isolated, outlier predictions.</p>
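The second post-processing step can be sketched with `scipy.ndimage`; at Sentinel-2&#x2019;s 10 m resolution, 0.1 Ha corresponds to 10 pixels. The 4-connectivity implied by `ndimage.label`&#x2019;s default structuring element is an assumption.

```python
import numpy as np
from scipy import ndimage

def drop_small_components(irrigated, min_pixels=10):
    """Zero out contiguous groups of predicted-irrigated pixels smaller
    than `min_pixels` (10 pixels of 100 m^2 each = 0.1 Ha)."""
    labels, n_components = ndimage.label(irrigated)  # 4-connected labeling
    if n_components == 0:
        return irrigated.copy()
    # Pixel count of each labeled component (labels start at 1).
    sizes = ndimage.sum(irrigated, labels, index=np.arange(1, n_components + 1))
    keep_ids = np.flatnonzero(sizes >= min_pixels) + 1
    return np.isin(labels, keep_ids)
```

The admissibility criteria would be applied first, then this filter, yielding the final irrigation map.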
<p>During inference, another step is taken to verify the accuracy of irrigation predictions. Here, five additional enumerators collect 1601 labeled polygons for the 2020 and 2021 dry seasons&#x2014;1082 non-irrigated polygons covering 3,807&#xa0;Ha and 519 irrigated polygons covering 582&#xa0;Ha&#x2014;across the extent of Amhara <italic>via</italic> the same labeling methodology used to collect the training, validation, and testing data. The locations of these independently labeled polygons are shown in <xref ref-type="sec" rid="s11">Supplementary Figure S6</xref>. After cluster cleaning and applying the prediction admissibility criteria, these polygons yield 361,451 non-irrigated samples and 48,465 irrigated samples. An F<sub>1</sub> score of 0.917 is achieved over these samples&#x2014;98.3% accuracy over non-irrigated samples and 95.5% accuracy over irrigated samples&#x2014;performance that remains in line with the reported test dataset metrics from <xref ref-type="fig" rid="F6">Figure 6</xref> and accuracies over the withheld Tana ground-collected labels.</p>
<p>Due to formatting constraints, <xref ref-type="fig" rid="F7">Figures 7</xref>, <xref ref-type="fig" rid="F8">8</xref> present bitemporal irrigation maps at a resolution far coarser than their native 10&#xa0;m. The full-resolution, georeferenced irrigation maps are available from the corresponding author upon request.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Bitemporal irrigation map for Tigray. Figure inset contains example EVI timeseries predicted as irrigated in either 2020 or 2021. A predominance of red indicates that many parts of Tigray contain irrigation detected in 2020 but not in 2021.</p>
</caption>
<graphic xlink:href="frsen-03-871942-g007.tif"/>
</fig>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>Bitemporal irrigation map for Amhara. Figure inset contains example predictions around Choke Mountain displaying interannual irrigation patterns. A predominance of red indicates that many parts of Amhara contain irrigation detected in 2020 but not in 2021.</p>
</caption>
<graphic xlink:href="frsen-03-871942-g008.tif"/>
</fig>
<sec id="s4-2-1">
<title>4.2.1 Tigray</title>
<p>
<xref ref-type="fig" rid="F7">Figure 7</xref> presents predicted irrigated areas in Tigray for 2020 and 2021, with 2020 irrigation predictions in red and 2021 irrigation predictions in cyan. To better understand the nature of changing vegetation phenologies across this time period, the inset of <xref ref-type="fig" rid="F7">Figure 7</xref> contains example timeseries that produced an irrigation prediction in only one of 2020 and 2021. These example timeseries show that a second crop cycle with vegetation growth peaking in January is associated with a positive irrigation prediction; in contrast, the absence of this cycle is associated with a non-irrigated prediction. <xref ref-type="table" rid="T2">Table 2</xref> displays the total predicted irrigated area for Tigray for 2020 and 2021, along with the total land area, organized by zone. Between 2020 and 2021, <xref ref-type="table" rid="T2">Table 2</xref> quantifies a 39.8% decline in irrigated area in Tigray.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Predicted irrigated area statistics in Tigray for 2020 and 2021, organized by zone.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Zone</th>
<th align="center">Irrigated Ha., 2020</th>
<th align="center">Irrigated Ha., 2021</th>
<th align="center">Total Ha.</th>
<th align="center">Percent Change, 2020 to 2021</th>
<th align="center">Change in Irrigated Area as Percent of Total Area, 2020 to 2021</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Central</td>
<td align="center">3,710</td>
<td align="center">3,554</td>
<td align="center">954,616</td>
<td align="char" char=".">&#x2212;4.2%</td>
<td align="char" char=".">0.0%</td>
</tr>
<tr>
<td align="left">Eastern</td>
<td align="center">3,068</td>
<td align="center">2,863</td>
<td align="center">635,670</td>
<td align="char" char=".">&#x2212;6.7%</td>
<td align="char" char=".">0.0%</td>
</tr>
<tr>
<td align="left">Mekelle</td>
<td align="center">556</td>
<td align="center">397</td>
<td align="center">52,313</td>
<td align="char" char=".">&#x2212;28.5%</td>
<td align="char" char=".">&#x2212;0.3%</td>
</tr>
<tr>
<td align="left">North Western</td>
<td align="center">7,439</td>
<td align="center">2,062</td>
<td align="center">1,246,715</td>
<td align="char" char=".">&#x2212;72.3%</td>
<td align="char" char=".">&#x2212;0.4%</td>
</tr>
<tr>
<td align="left">South Eastern</td>
<td align="center">2,658</td>
<td align="center">2,301</td>
<td align="center">533,334</td>
<td align="char" char=".">&#x2212;13.4%</td>
<td align="char" char=".">&#x2212;0.1%</td>
</tr>
<tr>
<td align="left">Southern</td>
<td align="center">16,474</td>
<td align="center">8,064</td>
<td align="center">506,151</td>
<td align="char" char=".">&#x2212;51.1%</td>
<td align="char" char=".">&#x2212;1.7%</td>
</tr>
<tr>
<td align="left">Western</td>
<td align="center">2,278</td>
<td align="center">2,557</td>
<td align="center">1,331,652</td>
<td align="char" char=".">12.3%</td>
<td align="char" char=".">0.0%</td>
</tr>
<tr>
<td align="left">Total</td>
<td align="center">36,181</td>
<td align="center">21,799</td>
<td align="center">5,260,451</td>
<td align="char" char=".">&#x2212;39.8%</td>
<td align="char" char=".">&#x2212;0.3%</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4-2-2">
<title>4.2.2 Amhara</title>
<p>
<xref ref-type="fig" rid="F8">Figure 8</xref> presents a bitemporal irrigation map for Amhara, also with 2020 irrigation predictions in red and 2021 irrigation predictions in cyan. This map contains large clusters of irrigated predictions around Lake Tana in the zones of Central Gondar, South Gondar, and West Gojjam, an intuitive finding given the availability of water from Lake Tana and the rivers that extend off it. Irrigation is also detected in the portions of Amhara&#x2019;s easternmost zones that fall within the Main Ethiopian Rift (MER); as the valley formed by the MER extends north into Tigray, irrigation predictions in the North Wello, Oromia, and North Shewa zones align with irrigation predictions in the Southern zone of Tigray shown in <xref ref-type="fig" rid="F7">Figure 7</xref>. <xref ref-type="table" rid="T3">Table 3</xref> displays the total predicted irrigated area for Amhara for 2020 and 2021, along with the total land area, organized by zone. From 2020 to 2021, <xref ref-type="table" rid="T3">Table 3</xref> quantifies a 41.6% decline in irrigated area in Amhara.</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Predicted irrigated area statistics in Amhara for 2020 and 2021, organized by zone.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Zone</th>
<th align="center">Irrigated Ha., 2020</th>
<th align="center">Irrigated Ha., 2021</th>
<th align="center">Total Ha.</th>
<th align="center">Percent Change, 2020 to 2021</th>
<th align="center">Change in Irrigated Area as Percent of Total Area, 2020 to 2021</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Awi</td>
<td align="center">27,443</td>
<td align="center">20,547</td>
<td align="center">906,682</td>
<td align="char" char=".">&#x2212;25.1%</td>
<td align="char" char=".">&#x2212;0.8%</td>
</tr>
<tr>
<td align="left">Central Gondar</td>
<td align="center">73,450</td>
<td align="center">50,954</td>
<td align="center">2,095,018</td>
<td align="char" char=".">&#x2212;30.6%</td>
<td align="char" char=".">&#x2212;1.1%</td>
</tr>
<tr>
<td align="left">East Gojjam</td>
<td align="center">44,975</td>
<td align="center">33,888</td>
<td align="center">1,405,689</td>
<td align="char" char=".">&#x2212;24.7%</td>
<td align="char" char=".">&#x2212;0.8%</td>
</tr>
<tr>
<td align="left">North Gondar</td>
<td align="center">7,381</td>
<td align="center">3,367</td>
<td align="center">684,247</td>
<td align="char" char=".">&#x2212;54.4%</td>
<td align="char" char=".">&#x2212;0.6%</td>
</tr>
<tr>
<td align="left">North Shewa (AM)</td>
<td align="center">62,933</td>
<td align="center">21,362</td>
<td align="center">1,622,197</td>
<td align="char" char=".">&#x2212;66.1%</td>
<td align="char" char=".">&#x2212;2.6%</td>
</tr>
<tr>
<td align="left">North Wello</td>
<td align="center">21,367</td>
<td align="center">8,250</td>
<td align="center">1,110,856</td>
<td align="char" char=".">&#x2212;61.4%</td>
<td align="char" char=".">&#x2212;1.2%</td>
</tr>
<tr>
<td align="left">Oromia</td>
<td align="center">30,875</td>
<td align="center">5,285</td>
<td align="center">380,773</td>
<td align="char" char=".">&#x2212;82.9%</td>
<td align="char" char=".">&#x2212;6.7%</td>
</tr>
<tr>
<td align="left">South Gondar</td>
<td align="center">72,682</td>
<td align="center">43,046</td>
<td align="center">1,406,698</td>
<td align="char" char=".">&#x2212;40.8%</td>
<td align="char" char=".">&#x2212;2.1%</td>
</tr>
<tr>
<td align="left">South Wello</td>
<td align="center">28,215</td>
<td align="center">16,302</td>
<td align="center">1,849,812</td>
<td align="char" char=".">&#x2212;42.2%</td>
<td align="char" char=".">&#x2212;0.6%</td>
</tr>
<tr>
<td align="left">Wag Hamra</td>
<td align="center">447</td>
<td align="center">698</td>
<td align="center">890,004</td>
<td align="char" char=".">56.4%</td>
<td align="char" char=".">0.0%</td>
</tr>
<tr>
<td align="left">West Gojjam</td>
<td align="center">97,206</td>
<td align="center">71,052</td>
<td align="center">1,348,157</td>
<td align="char" char=".">&#x2212;26.9%</td>
<td align="char" char=".">&#x2212;1.9%</td>
</tr>
<tr>
<td align="left">West Gondar</td>
<td align="center">6,180</td>
<td align="center">1,342</td>
<td align="center">1,529,197</td>
<td align="char" char=".">&#x2212;78.3%</td>
<td align="char" char=".">&#x2212;0.3%</td>
</tr>
<tr>
<td align="left">Total</td>
<td align="center">473,155</td>
<td align="center">276,093</td>
<td align="center">15,229,329</td>
<td align="char" char=".">&#x2212;41.6%</td>
<td align="char" char=".">&#x2212;1.3%</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The inset of <xref ref-type="fig" rid="F8">Figure 8</xref> presents interannual irrigated cropping patterns for an area southwest of Choke Mountain. Interlocking red and cyan plots indicate the spatial rotation of irrigated crops from 2020 to 2021; no white plots (which would signify dry season crop growth in both years) are observed.</p>
</sec>
</sec>
</sec>
<sec id="s5">
<title>5 Discussion</title>
<p>This paper makes a set of contributions to the literature for learning from limited labels. First, it demonstrates a process of collecting training data to supplement ground-collected labels that improves on previous methods of sample collection&#x2014;such as using imagery from a single timestep or simple vegetation content heuristics&#x2014;as it verifies the existence or non-existence of full vegetation cycles during the dry season. Second, an evaluation of inputs, classifier architectures, and training strategies is presented for achieving irrigation classifier applicability to a larger area. Results indicate that enhanced vegetation index (EVI) timeseries outperform a full set of spectral bands as inputs; that randomly shifting input timeseries prevents classifier models from overfitting to region-specific input features; and that a transformer-based neural network produces the highest prediction accuracies in unseen target regions. Due to the close similarity of performance metrics and alignment of predictions, the faster-training, more easily interpretable CatBoost architecture is also shown to be a suitable alternative for irrigation mapping efforts.</p>
<p>Prediction results indicate strong classifier performance over sample timeseries from regions not seen during training. On data from withheld target regions, transformer-based classifiers achieve mean F<sub>1</sub> scores above 0.95 when four or more regions&#x2019; data are included during training; using labels from all 7 visual collection (VC) regions, the transformer-based classifier achieves an F<sub>1</sub> score of 0.932 on the ground collection (GC) labels around Lake Tana. Over an independently collected set of more than 400,000 samples collected for performance assessment, the same classifier achieves 98.3% accuracy over non-irrigated samples and 95.5% accuracy over irrigated samples, demonstrating strong performance throughout the entire Ethiopian Highlands.</p>
<p>Deploying a transformer-based classifier trained on samples from all 8 label collection regions yields insight into changing irrigation patterns. Results suggest that from 2020 to 2021, irrigation in Tigray and Amhara decreased by roughly 40%. In Tigray, this decline was most precipitous in the North Western and Southern zones, which saw percent changes in irrigated area of &#x2212;72.3% and &#x2212;51.1%, respectively. The Western zone of Tigray was the only zone to see an increase in irrigated area from 2020 to 2021; even so, this increase amounted to 279&#xa0;Ha in a zone with a total area of 1,331,652&#xa0;Ha. Amhara is predicted to have had similar decreases in irrigated area: Apart from the Wag Hamra zone, which was predicted to have less than 0.08% of its area irrigated in 2020 or 2021, all zones in Amhara experienced a change in irrigated area between &#x2212;24.7% and &#x2212;82.9%. The largest declines by area occurred in North Shewa (&#x2212;41,571&#xa0;Ha), South Gondar (&#x2212;29,636&#xa0;Ha), and West Gojjam (&#x2212;26,154&#xa0;Ha). Combined, results for Tigray and Amhara predict severe reductions in dry season crop growth from 2020 to 2021, findings that align with recent reports of food insecurity following the eruption of civil conflict in Ethiopia in late 2020.</p>
<p>Although the presented performance metrics indicate high prediction accuracy, the proposed methodology has a few limitations worth noting. First, the study area is limited to the Ethiopian Highlands, a highly agricultural, climatologically consistent area that is dominated by rainfed cropped phenologies. As the irrigation classifiers are only trained to separate dry season crop cycles from rainfed vegetation cycles&#x2014;associating identified dry-season cropping with irrigation presence&#x2014;they will perform poorly in settings with different rainfall and phenological patterns. Relatedly, the trained irrigation classifiers do not identify irrigation used to supplement rainy season precipitation, irrigation of perennial tree crops, evergreen vegetation in riparian areas, or irrigation that supports continuous cropping, as the phenological signatures of these types of vegetation are difficult to distinguish from evergreen, non-cropped signatures. This discrimination task is left for future work. Lastly, classifiers are trained only on cropped phenologies, which constitute a portion of the vegetation signatures that exist in the area of interest. To manage the other phenologies present at model inference, prediction admissibility criteria are implemented. Nevertheless, these criteria are imperfect: There are surely irrigated pixels which have been mistakenly assigned a non-irrigated class label, along with non-cropped pixels which have evaded the admissibility criteria.</p>
<p>While the presented methodology is applied here only to the task of irrigation identification in the Ethiopian Highlands, the strategy of regional phenological characterization&#x2014;providing context for geographically informed selection of training samples and assessment of model applicability&#x2014;transfers to a broad range of land process mapping objectives. The suitability of this approach for machine learning with limited labels is supported by the results comparing classifier architectures and hyperparameter choices, which address the question of result uniqueness that overshadows all land cover classifications. As discussed by <xref ref-type="bibr" rid="B25">Small (2021)</xref>, what is presented as <italic>the</italic> map is often just <italic>a</italic> map&#x2014;one of many different products that can be obtained from the same set of inputs with different classifiers and hyperparameter settings. By assessing multiple classifier architectures and quantifying prediction sensitivity, this approach demonstrates consistency in results and indicates the uncertainty that can be expected of the resulting irrigation maps; as such, it provides a process for building robust classifiers in settings with scarce labeled data.</p>
</sec>
</body>
<back>
<sec id="s6">
<title>Data Availability Statement</title>
<p>The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>TC and VM conceived of the study, which was led by VM. TC developed and implemented the methodology, analyzed the results, and produced the data visualizations. CS introduced the concept of multiscale phenological context and devised the spatiotemporal mixture model. VM consulted in all steps of these processes. TC is the primary author of the paper, which was prepared with editorial assistance from CS and VM.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>Partial support for this effort was provided by the National Science Foundation (INFEWS Award Number 1639214), Columbia World Projects, the Rockefeller Foundation (eGuide Grant 2018POW004), OPML United Kingdom (DFID), and Technoserve (BMGF).</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>The authors are grateful to Jack Bott, Yinbo Hu, Hasan Siddiqui, and Yuezi Wu for their assistance in labeling. The authors would like to thank Gunther Bensch (RWI), Andrej Kveder (OPML), Abiy Tamerat (EthioResource Group), Yifru Tadesse (ATA Ethiopia), and Esther Kim (Technoserve) for their assistance with field data collection efforts; Rose Rustowicz for guidance in using the Descartes Labs platform; and colleagues Jay Taneja (UMass Amherst), Markus Walsh (AfSIS), and Edwin Adkins (Columbia) for their continued stimulating discussions and guidance.</p>
</ack>
<sec id="s11">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/frsen.2022.871942/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/frsen.2022.871942/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="DataSheet1.pdf" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<fn-group>
<fn id="fn1">
<label>1</label>
<p>See <xref ref-type="bibr" rid="B32">Wakjira et al. (2021)</xref> for a fuller discussion of rainfall patterns in Ethiopia.</p>
</fn>
<fn id="fn2">
<label>2</label>
<p>In splitting the labeled data, the training/validation/testing terminology standard in machine and deep learning literature is adopted.</p>
</fn>
<fn id="fn3">
<label>3</label>
<p>An example helps explain the calculation of <italic>n</italic> values: Given <italic>x</italic> &#x3d; 2 VC regions included in training, there remain 5 VC regions unseen by the classifier. As there are <inline-formula id="inf6">
<mml:math id="m7">
<mml:mfenced open="(" close=")">
<mml:mfrac linethickness="0">
<mml:mrow>
<mml:mn>7</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>21</mml:mn>
</mml:math>
</inline-formula> ways to select 2 VC regions from the full set of 7, and each of these combinations leaves 5 withheld VC regions for performance evaluation, <italic>n</italic> &#x3d; 105 when <italic>x</italic> &#x3d; 2.</p>
</fn>
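The calculation in footnote 3 generalizes to any number <italic>x</italic> of training VC regions: each of the C(7, <italic>x</italic>) training combinations is evaluated on each of its 7 &#x2212; <italic>x</italic> withheld regions. A minimal illustrative script (not part of the article's codebase; the function name is hypothetical) checks this:

```python
from math import comb

def n_evaluation_pairs(x, total_regions=7):
    """Count (training-combination, withheld-region) pairs:
    choose x of the 7 VC regions for training, then evaluate
    once on each of the (total_regions - x) withheld regions."""
    return comb(total_regions, x) * (total_regions - x)

# For x = 2: C(7, 2) = 21 combinations, each leaving 5 withheld regions.
print(n_evaluation_pairs(2))  # prints 105, matching the footnote
```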
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abbasi</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Arefi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Bigdeli</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Roessner</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Automatic Generation of Training Data for Hyperspectral Image Classification Using Support Vector Machine</article-title>. <source>Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci.</source> <volume>XL-7/W3</volume> (<issue>7W3</issue>), <fpage>575</fpage>&#x2013;<lpage>580</lpage>. <pub-id pub-id-type="doi">10.5194/isprsarchives-XL-7-W3-575-2015</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Banerjee</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Bovolo</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Bhattacharya</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Bruzzone</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Chaudhuri</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Mohan</surname>
<given-names>B. K.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>A New Self-Training-Based Unsupervised Satellite Image Classification Technique Using Cluster Ensemble Strategy</article-title>. <source>IEEE Geosci. Remote Sensing Lett.</source> <volume>12</volume> (<issue>4</issue>), <fpage>741</fpage>&#x2013;<lpage>745</lpage>. <pub-id pub-id-type="doi">10.1109/LGRS.2014.2360833</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bazzi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Baghdadi</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Amin</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Fayad</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Zribi</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Demarez</surname>
<given-names>V.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>An Operational Framework for Mapping Irrigated Areas at Plot Scale Using sentinel-1 and sentinel-2 Data</article-title>. <source>Remote Sensing</source> <volume>13</volume> (<issue>13</issue>), <fpage>2584</fpage>&#x2013;<lpage>2612</lpage>. <pub-id pub-id-type="doi">10.3390/rs13132584</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bazzi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Baghdadi</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Fayad</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Zribi</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Belhouchette</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Demarez</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Near Real-Time Irrigation Detection at Plot Scale Using sentinel-1 Data</article-title>. <source>Remote Sensing</source> <volume>12</volume> (<issue>9</issue>), <fpage>1456</fpage>. <comment>ISSN 20724292</comment>. <pub-id pub-id-type="doi">10.3390/RS12091456</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Breiman</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Random Forests</article-title>. <source>Mach. Learn.</source> <volume>45</volume> (<issue>1</issue>), <fpage>5</fpage>&#x2013;<lpage>32</lpage>. <pub-id pub-id-type="doi">10.1023/A:1010933404324</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Pokhrel</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Deb</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>J.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Detecting Irrigation Extent, Frequency, and Timing in a Heterogeneous Arid Agricultural Region Using MODIS Time Series, Landsat Imagery, and Ancillary Data</article-title>. <source>Remote Sensing Environ.</source> <volume>204</volume> (<issue>2017</issue>), <fpage>197</fpage>&#x2013;<lpage>211</lpage>. <comment>ISSN 00344257</comment>. <pub-id pub-id-type="doi">10.1016/j.rse.2017.10.030</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Conlon</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Small</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Siddiqui</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Adkins</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Modi</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2020</year>). &#x201c;<article-title>A Novel Method of Irrigation Detection and Estimation of the Effects of Productive Electricity Demands on Energy System Planning</article-title>,&#x201d; in <source>AGU Fall Meeting Abstracts</source> (<publisher-name>AGU</publisher-name>), <fpage>GC034</fpage>&#x2013;<lpage>08</lpage>. </citation>
</ref>
<ref id="B8">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Deng</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Dong</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Socher</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>L-J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Fei-Fei</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2009</year>). &#x201c;<article-title>ImageNet: A Large-Scale Hierarchical Image Database</article-title>,&#x201d; in <conf-name>2009 IEEE Conference on Computer Vision and Pattern Recognition</conf-name> (<publisher-loc>Miami, FL, USA</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>248</fpage>&#x2013;<lpage>255</lpage>. <pub-id pub-id-type="doi">10.1109/cvprw.2009.5206848</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Dorogush</surname>
<given-names>A. V.</given-names>
</name>
<name>
<surname>Gulin</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Gusev</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Kazeev</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Prokhorenkova</surname>
<given-names>L. O.</given-names>
</name>
<name>
<surname>Vorobev</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2017</year>). <source>Fighting Biases with Dynamic Boosting</source>. <publisher-name>Computing Research Repository</publisher-name>, <comment>abs/1706.09516</comment>. <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1706.09516">http://arxiv.org/abs/1706.09516</ext-link>. </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gebregziabher</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Namara</surname>
<given-names>R. E.</given-names>
</name>
<name>
<surname>Holden</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Poverty Reduction with Irrigation Investment: An Empirical Case Study from Tigray, Ethiopia</article-title>. <source>Agric. Water Manag.</source> <volume>96</volume> (<issue>12</issue>), <fpage>1837</fpage>&#x2013;<lpage>1843</lpage>. <comment>ISSN 03783774</comment>. <pub-id pub-id-type="doi">10.1016/j.agwat.2009.08.004</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huete</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Justice</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Van Leeuwen</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>1999</year>). <article-title>MODIS Vegetation Index (MOD13) Algorithm Theoretical Basis Document</article-title>. <source>Earth Observing Syst.</source> <volume>3</volume> (<issue>213</issue>), <fpage>295</fpage>&#x2013;<lpage>309</lpage>. </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>King</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Logistic Regression in Rare Events Data</article-title>. <source>Polit. Anal.</source> <volume>9</volume> (<issue>2</issue>), <fpage>137</fpage>&#x2013;<lpage>163</lpage>. <comment>ISSN 15487660</comment>. <pub-id pub-id-type="doi">10.1093/oxfordjournals.pan.a004868</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Kingma</surname>
<given-names>D. P.</given-names>
</name>
<name>
<surname>Ba</surname>
<given-names>J. L.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Adam: A Method for Stochastic Optimization</article-title>,&#x201d; in <conf-name>3rd International Conference on Learning Representations, ICLR 2015-Conference Track Proceedings</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>1</fpage>&#x2013;<lpage>15</lpage>. </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lecun</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Bengio</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Hinton</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Deep Learning</article-title>. <source>Nature</source> <volume>521</volume> (<issue>7553</issue>), <fpage>436</fpage>&#x2013;<lpage>444</lpage>. <comment>ISSN 14764687</comment>. <pub-id pub-id-type="doi">10.1038/nature14539</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Xue</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>Q.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Deep Learning for Remote Sensing Image Classification: A Survey</article-title>. <source>Wires Data Mining Knowl Discov.</source> <volume>8</volume> (<issue>6</issue>), <fpage>1</fpage>&#x2013;<lpage>17</lpage>. <comment>ISSN 19424795</comment>. <pub-id pub-id-type="doi">10.1002/widm.1264</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Liew</surname>
<given-names>S. C.</given-names>
</name>
</person-group> (<year>2016</year>). &#x201c;<article-title>A Simplified Training Data Collection Method for Sequential Remote Sensing Image Classification</article-title>,&#x201d; in <conf-name>4th International Workshop on Earth Observation and Remote Sensing Applications, EORSA 2016-Proceedings</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>329</fpage>&#x2013;<lpage>332</lpage>. <pub-id pub-id-type="doi">10.1109/EORSA.2016.7552823</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Naik</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2021</year>). <source>A Stochastic Approach for Automatic Collection of Precise Training Data for a Soft Machine Learning Algorithm Using Remote Sensing Images</source>. <publisher-loc>Singapore</publisher-loc>: <publisher-name>Springer</publisher-name>, <fpage>285</fpage>&#x2013;<lpage>297</lpage>. <pub-id pub-id-type="doi">10.1007/978-981-16-2712-5_24</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ozdogan</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Allez</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Cervantes</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Remote Sensing of Irrigated Agriculture: Opportunities and Challenges</article-title>. <source>Remote Sensing</source> <volume>2</volume> (<issue>9</issue>), <fpage>2274</fpage>&#x2013;<lpage>2304</lpage>. <comment>ISSN 20724292</comment>. <pub-id pub-id-type="doi">10.3390/rs2092274</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Phiri</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Simwanda</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Salekin</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Nyirenda</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Murayama</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ranagalage</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Sentinel-2 Data for Land Cover/use Mapping: A Review</article-title>. <source>Remote Sensing</source> <volume>12</volume> (<issue>14</issue>), <fpage>2291</fpage>. <comment>ISSN 2072-4292</comment>. <pub-id pub-id-type="doi">10.3390/rs12142291</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pires de Lima</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Marfurt</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Convolutional Neural Network for Remote-Sensing Scene Classification: Transfer Learning Analysis</article-title>. <source>Remote Sensing</source> <volume>12</volume> (<issue>1</issue>), <fpage>86</fpage>. <comment>ISSN 20724292</comment>. <pub-id pub-id-type="doi">10.3390/rs12010086</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ramezan</surname>
<given-names>C. A.</given-names>
</name>
<name>
<surname>Warner</surname>
<given-names>T. A.</given-names>
</name>
<name>
<surname>Maxwell</surname>
<given-names>A. E.</given-names>
</name>
<name>
<surname>Price</surname>
<given-names>B. S.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Effects of Training Set Size on Supervised Machine-Learning Land-Cover Classification of Large-Area High-Resolution Remotely Sensed Data</article-title>. <source>Remote Sensing</source> <volume>13</volume> (<issue>3</issue>), <fpage>368</fpage>&#x2013;<lpage>395</lpage>. <pub-id pub-id-type="doi">10.3390/rs13030368</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Saha</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Solano-Correa</surname>
<given-names>Y. T.</given-names>
</name>
<name>
<surname>Bovolo</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Bruzzone</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Unsupervised Deep Learning Based Change Detection in Sentinel-2 Images</article-title>,&#x201d; in <conf-name>2019 10th International Workshop on the Analysis of Multitemporal Remote Sensing Images, MultiTemp</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>0</fpage>&#x2013;<lpage>3</lpage>. <pub-id pub-id-type="doi">10.1109/Multi-Temp.2019.8866899</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shahriar Pervez</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Budde</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Rowland</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Mapping Irrigated Areas in Afghanistan over the Past Decade Using MODIS NDVI</article-title>. <source>Remote Sensing Environ.</source> <volume>149</volume>, <fpage>155</fpage>&#x2013;<lpage>165</lpage>. <comment>ISSN 0034-4257</comment>. <pub-id pub-id-type="doi">10.1016/J.RSE.2014.04.008</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sivaraj</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Koti</surname>
<given-names>S. R.</given-names>
</name>
<name>
<surname>Naik</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Effects of Training Parameter Concept and Sample Size in Possibilistic C-Means Classifier for Pigeon Pea Specific Crop Mapping</article-title>. <source>Geomatics</source> <volume>2</volume> (<issue>1</issue>), <fpage>107</fpage>&#x2013;<lpage>124</lpage>. <pub-id pub-id-type="doi">10.3390/geomatics2010007</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Small</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Grand Challenges in Remote Sensing Image Analysis and Classification</article-title>. <source>Front. Remote Sens.</source> <volume>1</volume> (<issue>4</issue>), <fpage>1</fpage>&#x2013;<lpage>4</lpage>. <pub-id pub-id-type="doi">10.3389/frsen.2020.605220</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Small</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Spatiotemporal Dimensionality and Time-Space Characterization of Multitemporal Imagery</article-title>. <source>Remote Sensing Environ.</source> <volume>124</volume>, <fpage>793</fpage>&#x2013;<lpage>809</lpage>. <comment>ISSN 00344257</comment>. <pub-id pub-id-type="doi">10.1016/j.rse.2012.05.031</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stivaktakis</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Tsagkatakis</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Tsakalides</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Deep Learning for Multilabel Land Cover Scene Categorization Using Data Augmentation</article-title>. <source>IEEE Geosci. Remote Sensing Lett.</source> <volume>16</volume> (<issue>7</issue>), <fpage>1031</fpage>&#x2013;<lpage>1035</lpage>. <comment>ISSN 15580571</comment>. <pub-id pub-id-type="doi">10.1109/LGRS.2019.2893306</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stromann</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Nascetti</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Yousif</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Ban</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Dimensionality Reduction and Feature Selection for Object-Based Land Cover Classification Based on Sentinel-1 and Sentinel-2 Time Series Using Google Earth Engine</article-title>. <source>Remote Sensing</source> <volume>12</volume> (<issue>1</issue>), <fpage>76</fpage>. <pub-id pub-id-type="doi">10.3390/RS12010076</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tao</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Remote Sensing Image Scene Classification with Self-Supervised Paradigm under Limited Labeled Samples</article-title>. <source>IEEE Geosci. Remote Sensing Lett.</source> <volume>19</volume>, <fpage>1</fpage>&#x2013;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.1109/LGRS.2020.3038420</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vogels</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>de Jong</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sterk</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Douma</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Addink</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2019b</year>). <article-title>Spatio-temporal Patterns of Smallholder Irrigated Agriculture in the Horn of Africa Using GEOBIA and Sentinel-2 Imagery</article-title>. <source>Remote Sensing</source> <volume>11</volume> (<issue>2</issue>), <fpage>143</fpage>. <comment>ISSN 20724292</comment>. <pub-id pub-id-type="doi">10.3390/rs11020143</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vogels</surname>
<given-names>M. F. A.</given-names>
</name>
<name>
<surname>de Jong</surname>
<given-names>S. M.</given-names>
</name>
<name>
<surname>Sterk</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Addink</surname>
<given-names>E. A.</given-names>
</name>
</person-group> (<year>2019a</year>). <article-title>Mapping Irrigated Agriculture in Complex Landscapes Using SPOT6 Imagery and Object-Based Image Analysis - A Case Study in the Central Rift Valley, Ethiopia -</article-title>. <source>Int. J. Appl. Earth Observation Geoinformation</source> <volume>75</volume> (<issue>2018</issue>), <fpage>118</fpage>&#x2013;<lpage>129</lpage>. <comment>ISSN 1872826X</comment>. <pub-id pub-id-type="doi">10.1016/j.jag.2018.07.019</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wakjira</surname>
<given-names>M. T.</given-names>
</name>
<name>
<surname>Peleg</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Anghileri</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Molnar</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Alamirew</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Six</surname>
<given-names>J.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Rainfall Seasonality and Timing: Implications for Cereal Crop Production in Ethiopia</article-title>. <source>Agric. For. Meteorology</source> <volume>310</volume>, <fpage>108633</fpage>. <comment>ISSN 01681923</comment>. <pub-id pub-id-type="doi">10.1016/j.agrformet.2021.108633</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Oates</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Time Series Classification from Scratch with Deep Neural Networks: A strong Baseline</article-title>. <source>Proc. Int. Jt. Conf. Neural Networks</source> <volume>2017</volume>, <fpage>1578</fpage>&#x2013;<lpage>1585</lpage>. <pub-id pub-id-type="doi">10.1109/IJCNN.2017.7966039</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Wiggins</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Glover</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Dorgan</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2021</year>). <source>Agricultural Innovation for Smallholders in Sub-saharan Africa</source>. <publisher-loc>London, United Kingdom</publisher-loc>: <publisher-name>Technical Report</publisher-name>. </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Deep Learning in Remote Sensing Scene Classification: a Data Augmentation Enhanced Convolutional Neural Network Framework</article-title>. <source>GIScience &#x26; Remote Sensing</source> <volume>54</volume> (<issue>5</issue>), <fpage>741</fpage>&#x2013;<lpage>758</lpage>. <comment>ISSN 15481603</comment>. <pub-id pub-id-type="doi">10.1080/15481603.2017.1323377</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhong</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Tao</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Deep Learning Based winter Wheat Mapping Using Statistical Data as Ground References in Kansas and Northern Texas, US</article-title>. <source>Remote Sensing Environ.</source> <volume>233</volume>, <fpage>111411</fpage>. <comment>ISSN 0034-4257</comment>. <pub-id pub-id-type="doi">10.1016/J.RSE.2019.111411</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>