Enhancing Engagement of Citizen Scientists to Monitor Precipitation Phase

Recent literature has highlighted how citizen science approaches can engage volunteers, expand scientific literacy, and accomplish targeted research objectives. However, there is limited information on how specific recruitment, retention, and engagement strategies enhance scientific outcomes. To help fill this important information gap, we detail the use of various approaches to engage citizen scientists in the collection of precipitation phase data (rain, snow, or mixed). In our study region, the Sierra Nevada and Central Basin and Range of California and Nevada near Lake Tahoe, a marked amount of annual precipitation falls near freezing. At these air temperatures, weather forecasts, land surface models, and satellites all have difficulty correctly predicting and observing precipitation phase, making visual observations the most accurate approach. From January to May 2020, citizen scientists submitted timestamped, geotagged observations of precipitation phase through the Citizen Science Tahoe mobile phone application. Our recruitment strategy included messaging to winter, weather, and outdoor enthusiasts combined with amplification through regional groups, which resulted in over 199 citizen scientists making 1,003 ground-based observations of rain, snow, and mixed precipitation. We enhanced engagement and retention by targeting specific storms in the region through text message alerts that also allowed for questions, clarifications, and training opportunities. We saw a high retention rate (88%) and a marked increase in the number of observations following alerts. For quality control of the data, we combined various meteorological datasets and compared to the citizen science observations. We found that 96.5% of submitted data passed our quality control protocol, which enabled us to evaluate rain-snow partitioning patterns. 
Snow was the dominant form of precipitation at air temperatures below and slightly above freezing, with both ecoregions expressing a 50% rain-snow air temperature threshold of 4.2°C, a warmer value than what would be incorporated into most land surface models. Thus, the use of a lower air temperature threshold in these areas would produce inaccuracies in event-based rain-snow proportions. Overall, our high retention rate, data quality, and rain-snow analysis were supported by the recruitment strategy, text message communication, and simplicity of the survey design. We suggest other citizen science projects may follow the approaches detailed herein to achieve their scientific objectives.


INTRODUCTION
Successful citizen science projects require user-submitted data, but there is limited information on how to maximize the submission of high-quality observations (Robson et al., 2013; Eveleigh et al., 2014; Crall et al., 2017; De Moor et al., 2019). Broad recruitment and retention of volunteer citizen scientists are critical to data collection, but can also be the biggest challenges for a citizen science project (Andow et al., 2016; Crall et al., 2017; Frensley, 2017; De Moor et al., 2019). It is typically necessary to deploy recruitment and communication strategies that engage a broad base of individuals and effectively use available resources (Crall et al., 2017; De Moor et al., 2019). While many studies have shown that a few participants generally contribute the bulk of the citizen science observations (Crall et al., 2017), recruiting a large pool is necessary for sustainability of the citizen science project and continued recruitment (De Moor et al., 2019). Recruitment approaches include email, social media campaigns, social networking, and press (Robson et al., 2013; Crall et al., 2017). Crafting messaging that targets volunteer motivations (Clary and Snyder, 1999) and addresses community-identified problems (Davis et al., 2020) has been shown to motivate participants and enhance recruitment (Uchoa et al., 2013); however, volunteer motivations have been shown to be variable (Crall et al., 2017). Robson et al. (2013) showed that targeting local groups with interest in the topic was effective for increasing data collection, and Crall et al. (2017) showed increased recruitment by connecting to similar volunteer-based projects. These findings suggest that targeting volunteer motivations and collaborating with regional groups with similar interests are effective approaches for recruiting citizen scientists when combined with social media, emails, and press.
Retaining volunteer citizen scientists is also a challenge, with reported retention rates ranging between 15 and 92% (Andow et al., 2016; Reges et al., 2016; Crall et al., 2017). Similar to recruitment, citizen scientist retention strategies come in a variety of styles; however, retention practices that keep participants informed, allow for feedback, and create a sense of community are typically the most effective (Crall et al., 2017; Davis et al., 2020). In addition, the structure of the program itself can also aid with retention, for example, with gamification (e.g., Strobl et al., 2019). Higher retention rates reduce the need for continued recruitment and training, and potentially increase data quality (Andow et al., 2016). Consistent training and feedback also improve the quality of data submitted by citizen scientists (Druschke and Seltzer, 2012; Andow et al., 2016; Crall et al., 2017). The wide range of observed retention rates and potential impacts to data quality suggest the need for a quantitative approach to evaluating engagement and retention strategies.
While citizen scientists can markedly increase the number of observations compared to a single team of researchers, projects also need clearly defined scientific objectives (Robinson et al., 2018). In our case, we were interested in how a network of citizen scientists could be used to monitor the phase of precipitation (rain, snow, and mixed) in the Sierra Nevada of California and Nevada. In this region and much of the western US, households, agriculture, and industry rely on mountain snowmelt for their water resources (Bales et al., 2006). However, with anthropogenic climate change, the proportion of precipitation falling as snow has decreased (Knowles et al., 2006), reducing the ability of mountain snow to act as a reservoir (Barnett et al., 2005; Mankin et al., 2015), with resultant effects on downstream streamflow timing and volume (Stewart et al., 2005; Stewart, 2009). Continued changes in precipitation phase (Klos et al., 2014; Safeeq et al., 2016) are expected to affect the human and natural systems that rely on those water resources (Stewart et al., 2004).
These observed and projected water resource impacts necessitate the effective monitoring and modeling of precipitation phase in mountain areas, particularly those near the rain-snow transition zone. Due to a limited number of in situ networks reporting precipitation phase in western US mountains, models are typically used to fill in the gaps (Lundquist et al., 2019). However, many land surface models still use spatially uniform air temperature thresholds or ranges to partition rain and snow despite work showing that these values vary spatially (Ding et al., 2014; Froidurot et al., 2014; Harpold et al., 2017; Jennings et al., 2018). Using the wrong rain-snow air temperature threshold results in biases in snow water equivalent, snow depth, and snow duration (Fassnacht and Soulis, 2002; Wen et al., 2013; Harder and Pomeroy, 2014; Jennings and Molotch, 2019). Similarly, both ground-based radars and satellites struggle to predict precipitation phase at air temperatures near freezing as a result of snow scattering properties, difficulties in parsing rain and snow signals at various intensities, and snow cover on the Earth's surface (Skofronick-Jackson et al., 2015; Ebtehaj and Kummerow, 2017; Harpold et al., 2017; Lundquist et al., 2019). These issues are particularly concerning in the Sierra Nevada, where a significant proportion of winter precipitation falls near 0°C. Enhanced monitoring of rain and snow for model validation would greatly improve predictions of the precipitation that accumulates in mountain snowpacks and the precipitation that runs off.
Previous weather- and hydrology-focused citizen science work has shown how to integrate crowdsourced observations into research design. For example, the Meteorological Phenomena Identification Near the Ground (mPING) project uses crowdsourced data to assess weather forecast models (Elmore et al., 2014, 2015). The mPING mobile phone application allows users to upload observations of weather phenomena and natural hazards of various types (Elmore et al., 2014, 2015). The Community Collaborative Rain Hail and Snow Network (CoCoRaHS) is a network of volunteers who measure daily precipitation and other meteorological quantities (Cifelli et al., 2005). The data generated from CoCoRaHS have resulted in numerous publications (e.g., Cifelli et al., 2005; Reges et al., 2016), demonstrating the value of such a spatially distributed network of observations and temporally consistent submissions. Likewise in hydrology, CrowdWater engages volunteers in verifying water level class data (Strobl et al., 2019), and Stream Tracker pairs observations by volunteers with those of streamflow sensors and remote sensing data (Puntenney et al., 2017). Community Snow Observations engages citizen scientists to measure snow depth with avalanche probes in undersampled areas in order to improve estimates of snow depth. While these and other programs represent an advancement in citizen science, they are limited to observations at a specified location at a specified time, are complicated, or require significant training.
For our project, Tahoe Rain or Snow, we employed a citizen science approach to collect ground-based observations of precipitation phase during winter and spring storms in the Sierra Nevada region. Our objectives were to evaluate engagement strategies and produce a robust, quality-controlled dataset of precipitation phase. The former will contribute to the ongoing study of how to best deploy citizen science projects, while the latter serves as a critical validation source for land surface models and satellite remote sensing products. We also detail an initial assessment of the rain-snow temperature relationship within our study region. The citizen science approach presented here is potentially applicable to other weather-based studies for which data collection over spatial and temporal extents is required.

STUDY SITE
Our study area included the Lake Tahoe Region of the Sierra Nevada, and the urban/suburban areas of Reno, Sparks, and Carson City, Nevada, collectively referred to as the Tahoe Basin and Truckee Meadows. The Lake Tahoe Region is characterized by Lake Tahoe (1,878 m elevation) and the surrounding mountain peaks (reaching elevations of 3,318 m). The Lake Tahoe and Sierra Nevada climate is characterized by a wet season from November to April and a dry season from May to October (Null et al., 2010). Most of the wet season precipitation falls as snow. The Truckee Meadows region (elevation ∼1,300 m) is located on the eastern side of the Sierra Nevada and experiences a similar seasonal climate pattern, with less annual precipitation overall. The study area includes two Level III ecoregions: the Sierra Nevada and the Central Basin and Range.

METHODS
Tahoe Rain or Snow, launched in 2019, is a contributory citizen science project (Shirk et al., 2012) where the research team designed the study and community members contributed the precipitation phase data. From the outset of the project, both scientists and education/engagement specialists collaborated on project design, as recommended by Druschke and Seltzer (2012). The research goals and study design informed our survey design, recruitment, and retention strategies. Following data submission, we assessed the data for quality and evaluated the results for rain-snow partitioning patterns. These approaches were used to meet our goals to: 1) recruit and retain citizen scientists, and 2) collect a sufficient number of quality data points for precipitation phase analysis.

Tahoe Rain or Snow Survey
To effectively crowdsource precipitation phase data, we designed the Tahoe Rain or Snow precipitation phase survey to be included on the existing Citizen Science Tahoe (CST) mobile phone app platform (Figure 1). Zach Lyon Creative developed the CST mobile phone app in 2015 in collaboration with three regional groups: University of California Davis Tahoe Environmental Research Center, the League to Save Lake Tahoe, and the Desert Research Institute (DRI). In 2020, the CST app hosted six different citizen science surveys used by the CST partners with a total network of 1,970 registered users.
We designed the Tahoe Rain or Snow survey to make data collection simple and fast (i.e., the survey could be completed in <30 s), allowing participants to easily submit frequent observations, even while out of cell service. The survey was also designed to facilitate participation from those who have time constraints or who do not have technical knowledge of meteorology, thus requiring minimal training. This is in line with previous research recommending that citizen science platforms be designed to anticipate possible technology and time constraints with flexibility in both to encourage participation (Eveleigh et al., 2014;Davis et al., 2020).
The Tahoe Rain or Snow survey includes a series of screens on which the user submits observations. The first screen reminds the user of study goals and provides basic training info (Figure 1B), enabling those that did not sign up through the text message service (described below) to submit accurate data. The user then selects the precipitation type (rain, snow, or mixed precipitation) on the following screen (Figure 1C). Next, the user is asked to verify their location (Figure 1D) and may manually move the location pin if it is incorrect. The user is then asked to check the accuracy of all parameters and submit the observation. The app collects the following data: observer name, date, time, latitude, longitude, a unique identifier for each observation, and the weather observation (rain, snow, or mixed precipitation), all of which are accessible to the researchers via a web-based portal. Surveys that are submitted while the user is out of cell service are cached on the local device and uploaded once the user returns to a service area. If location services were 'turned off' and the observer did not change the location, the default location was set to 39.0968°N, 120.032°W, in the center of Lake Tahoe, allowing for easy identification of these data points. A small group of scientists and educators tested the app prior to the launch for data quality and user experience, and we incorporated their feedback in the design process.

Recruitment
The first strategic step for citizen scientist recruitment was to create content that connected the opportunity to be involved in Tahoe Rain or Snow to individuals' interests and motivations related to winter recreation, weather, curiosity for science, and/or their connection to mountain regions. In this context, Tahoe Rain or Snow's focus on improving the ability to estimate water resources in mountain regions locates the project within a broad matrix of environmental and natural resource issues and interests, i.e., a "complex problem domain" (Hano et al., 2020). Therefore, our communication strategy included links between the science, natural resources, and places of local interest and value.
The second step was to reach a large audience to increase the likelihood of collecting sufficient, high-quality data. In the context of this work, this corresponds to approximately 200 observations taken between −8 and 8°C to compute rain-snow probability curves (e.g., Jennings et al., 2018). We pursued this strategy so that even low conversion rates (i.e., the percentage of users who engage with the project following outreach/promotion) would result in an adequate amount of data collected (Crall et al., 2017). For targeted recruitment of individuals with overlapping interests, we engaged various local weather forecasters and nonprofits with aligned interests to "amplify" the request to participate in Tahoe Rain or Snow through social media and email (Uchoa et al., 2013; Crall et al., 2017). We then reached out to our amplifiers, requesting they post the content to their social media platforms or email lists. Amplifier requests included the National Weather Service Reno, the CoCoRaHS network in Nevada and California, Protect Our Winters (POW), the Sierra Avalanche Center, Snowlands Network, and the League to Save Lake Tahoe. We also gave presentations to local recreation, outdoors, and education groups and conducted flyer handouts at ski resorts and ski-related events. In addition, we sent email invitations to all participants on the Citizen Science Tahoe platform as well as all participants in Stories in the Snow (another citizen science project based at DRI), as individuals that have participated in other citizen science projects are more likely to persist (Frensley, 2017).
Recruitment efforts also focused on teaching participants how to submit quality data. Citizen scientists were asked to sign up to a text messaging service, SimpleTexting, by texting a keyword to a number ("To join Tahoe Rain or Snow, text WINTER to 855-909-####"). The text messaging service compiled citizen scientists' phone numbers into a database. Upon joining the text message service, the citizen scientists were automatically sent three text messages at 24-h intervals. The three drip campaign messages contained 1) instructions to download the CST app through a web link, 2) background information to understand the goals of the study, and 3) a succinct training module on when and how to submit observations. This text message service enabled the citizen scientists to opt in to messaging, allowing continued communication throughout the season in the form of scheduling and pushing alerts to all users, text message communications with individual citizen scientists, and ongoing training and education.

Retention
Retention of volunteers is critical to project success but represents a major challenge in citizen science. To maintain participation throughout the winter and spring, push alerts were sent through the text messaging system when a storm was approaching to bring awareness to the participants. This was preferable to social media or email, which introduce a time lag in communication and require participants to pull the information. The ability to answer citizen scientist questions via the same text messaging system allowed for clarification about when and how to sample, potentially improving participant understanding of the project.
At the end of the sampling season in late May, we provided a report-back to the community as these have been shown to enhance engagement (Druschke and Seltzer, 2012) and to encourage continued involvement in the project (Tweddle et al., 2012). For our report, we created a summary of preliminary data with plain English descriptions of the patterns in the data collected by citizen scientists (Tahoe Rain or Snow, 2020) and shared the web link via the text message alert service and DRI's social media outlets.

Evaluating Patterns in Citizen Science Observations
To assess our recruitment and retention strategies, we examined the number of sign-ups over time and quantified requests to subscribe to and unsubscribe from the text message system. We also evaluated the timing and number of reports following our text message notifications to analyze their effectiveness in encouraging participation. To learn more about the reporting behavior of our citizen scientists, we assessed the timing and spatial patterning of observations based on the day and time of report along with the associated location and elevation.

Quality Assurance and Quality Control
Effective use of citizen science data requires robust quality assurance (QA) and quality control (QC) protocols. Our primary QA mechanism was designing the app-based survey to allow for little subjective interpretation when reporting observations (i.e., by providing only three options: rain, snow, mixed). Additionally, the app performed all timestamping and geolocating automatically, eliminating the possibility that a user could report an incorrect time and/or location. All observations for users with location services turned off were reported in the center of Lake Tahoe, allowing for removal of these erroneous data as the first step of the QC protocol.
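The first QC step described above can be sketched as follows. This is a minimal Python illustration, not the project's code: the `Observation` field names and the coordinate tolerance are assumptions, and only the default coordinates (39.0968°N, 120.032°W) come from the survey description.

```python
from dataclasses import dataclass

# Default pin assigned by the app when location services were off
# (coordinates taken from the survey description).
DEFAULT_LAT, DEFAULT_LON = 39.0968, -120.032


@dataclass
class Observation:
    # Hypothetical field names; the app's export schema is not given in the text.
    obs_id: str
    lat: float
    lon: float
    phase: str  # "rain", "snow", or "mixed"


def has_default_location(obs, tol=1e-4):
    """True when a report sits on the default pin (tolerance is an assumption)."""
    return abs(obs.lat - DEFAULT_LAT) < tol and abs(obs.lon - DEFAULT_LON) < tol


def drop_default_locations(observations):
    """Keep only reports whose location differs from the default pin."""
    return [o for o in observations if not has_default_location(o)]
```

Because the default pin lies in the middle of the lake, a simple coordinate match is enough to isolate these reports without any external spatial data.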
After spatially filtering the data, the phase observations were compared to air temperature measurements. In general, we found existing gridded air temperature products to be insufficient for the temporal and spatial scale of our project. For example, hourly meteorological data can be accessed from phase 2 of the North American Land Data Assimilation System (NLDAS-2), but the grid cell resolution of 0.125° is too coarse for our point-scale observations (Xia et al., 2012). Conversely, data from the Parameter-elevation Regressions on Independent Slopes Model (PRISM) are available at an 800 m grid spacing, but only at a daily time step (Daly et al., 2008). We needed both high temporal resolution (hourly or subhourly) and precise geolocation. We therefore accessed several networks of meteorological stations (Table 1) that recorded air temperature data at an hourly time step or finer to ensure as broad a spatial and climatic coverage as possible. These included the Snowpack Telemetry (SNOTEL) network, Remote Automatic Weather Stations (RAWS), multiple state and national networks distributed through the Hydrometeorological Automated Data System (HADS), and the Automated Weather Observing System (AWOS). In total, we identified 66 stations with hourly or finer air temperature in our study domain. The number of stations reporting valid data for each citizen science observation ranged between 57 and 66. As an initial quality control check, we relied on routines from the respective datasets and we also filtered out air temperature values outside of the range of −30 to 45°C.
Few of our citizen science observations were recorded directly next to a meteorological station, necessitating that we distribute air temperature from a station or multiple stations to the point of interest. To do this, we tested four methods inside an air temperature distribution model (IDWconst, IDWvar, Nearestconst, and Nearestvar, defined in step 6 below): [...] 2014) and MicroMet (Liston and Elder, 2006). As we did not need gridded values of air temperature, we recoded these methods to predict air temperature at a given observation point using parallel processing in the R computing language. The air temperature distribution model included the following steps:
1. We computed the observer's elevation by extracting the value from the 10 m digital elevation model (DEM) from the USGS's National Elevation Dataset (Gesch et al., 2002) using the submitted latitude-longitude coordinates.
2. We identified all air temperature measurements within ±1 h of the citizen science observation.
3. We removed all air temperature measurements except those closest in time to the observation.
4. If a single meteorological station reported two air temperature values (i.e., the time gaps before and after the citizen science observation were equal), we took the arithmetic mean of the two.
5. We computed the distance between the citizen science observation and each meteorological station.
6. We calculated air temperature at the observation location using the four previously introduced methods:
   a. IDWconst was calculated by correcting all air temperature values to sea level based on station elevation and a constant lapse rate of −0.005°C m−1 (Girotto et al., 2014) and computing normalized weights for each station based on its distance to the citizen science observation. Sea level air temperature at the observation point was calculated as a function of the station weights and the sea level air temperature for each station, and was then lapsed from sea level to the elevation of the citizen science observation.
   b. IDWvar followed the same steps as IDWconst, but used a lapse rate calculated per time step. The variable lapse rate was predicted as the slope of an ordinary least squares regression model fit to station elevation (independent variable) and air temperature (dependent variable).
   c. Nearestconst was calculated by identifying the station nearest to each citizen science observation and correcting the air temperature value from the station to the observation based on the difference in elevation between the two and the −0.005°C m−1 constant lapse rate.
   d. Nearestvar followed the same steps as Nearestconst, but the lapse rate was calculated per time step as in IDWvar.
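Step 6 can be sketched in Python as below. The authors implemented their model in R, so this is only an illustration under stated assumptions: the dictionary keys (`dist_km`, `elev_m`, `temp_c`) are hypothetical, distances are taken as precomputed, and the inverse-distance exponent (`power=2`) is an assumed value because the text does not specify how distance maps to weight.

```python
CONST_LAPSE = -0.005  # degC per m, the constant lapse rate given in the text


def ols_lapse_rate(stations):
    """Variable lapse rate: OLS slope of temperature on station elevation."""
    n = len(stations)
    mean_z = sum(s["elev_m"] for s in stations) / n
    mean_t = sum(s["temp_c"] for s in stations) / n
    sxy = sum((s["elev_m"] - mean_z) * (s["temp_c"] - mean_t) for s in stations)
    sxx = sum((s["elev_m"] - mean_z) ** 2 for s in stations)
    return sxy / sxx


def idw_temperature(stations, obs_elev_m, lapse=CONST_LAPSE, power=2.0):
    """IDW estimate: reduce to sea level, weight by distance, lapse back up."""
    # Sea-level temperature for each station: T0 = T - lapse * z
    sea_level = [s["temp_c"] - lapse * s["elev_m"] for s in stations]
    # Normalized inverse-distance weights (exponent is an assumption)
    raw = [1.0 / max(s["dist_km"], 1e-6) ** power for s in stations]
    total = sum(raw)
    sea_level_at_obs = sum(w / total * t for w, t in zip(raw, sea_level))
    # Lapse from sea level to the observer's elevation
    return sea_level_at_obs + lapse * obs_elev_m


def nearest_temperature(stations, obs_elev_m, lapse=CONST_LAPSE):
    """Nearest-station estimate corrected for the elevation difference."""
    nearest = min(stations, key=lambda s: s["dist_km"])
    return nearest["temp_c"] + lapse * (obs_elev_m - nearest["elev_m"])
```

Passing `lapse=ols_lapse_rate(stations)` to either function gives the variable-lapse-rate variants, while the default argument gives the constant-lapse-rate variants.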
To evaluate which air temperature distribution method performed best, we randomly removed 5,000 observations from the 576,435 observations in our aggregated air temperature dataset. We then ran the steps detailed above to predict the air temperature value for each randomly removed measurement to cross-validate the four methods. Overall, there was high agreement in the ability of the different methods to predict air temperature, and IDWvar had the lowest bias and highest r² value (Table 2). Of note, the average variable lapse rate was −0.0056°C m−1, close to the constant value of −0.005°C m−1, suggesting that the extra processing step provides a marginal improvement even though the lapse rate difference is small.
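The hold-out comparison can be sketched as follows. The text does not define the bias or r² formulas, so this Python illustration assumes mean bias as the average of predicted minus measured values and r² as the squared Pearson correlation; the function names and the fixed seed are hypothetical.

```python
import random


def hold_out_indices(n_total, n_holdout, seed=0):
    """Randomly choose which measurements to withhold for cross-validation."""
    return random.Random(seed).sample(range(n_total), n_holdout)


def mean_bias(predicted, measured):
    """Average of predicted minus measured air temperature (assumed definition)."""
    return sum(p - m for p, m in zip(predicted, measured)) / len(measured)


def r_squared(predicted, measured):
    """Squared Pearson correlation between predictions and measurements."""
    n = len(measured)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov ** 2 / (var_p * var_m)
```

Each of the four distribution methods would be scored on the same withheld measurements so that their bias and r² values are directly comparable.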
Once we had computed an air temperature value using IDWvar for each citizen science observation, we followed a two-step process to flag suspicious reports. First, we checked the air temperature against previously reported rain-snow air temperature ranges (Kienzle, 2008; Jennings et al., 2018). Next, we checked daily precipitation reports from PRISM and active AWOS stations to flag whether precipitation occurred on the observation date. We used both datasets because although PRISM provides spatially continuous coverage, it tended to not identify small and trace precipitation amounts within the study region.
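A minimal Python sketch of this two-step flagging is given below. The flag names and function signature are hypothetical. The default temperature bounds (rain below 0°C, snow above 5°C) are the ones reported with the QC results in this paper, used here as an assumption about how the published ranges were applied.

```python
def flag_report(phase, air_temp_c, prism_mm, awos_mm,
                rain_min_c=0.0, snow_max_c=5.0):
    """Return a list of QC flags for one report; an empty list means no flags."""
    flags = []
    # Step 1: phase versus air temperature range
    if phase == "rain" and air_temp_c < rain_min_c:
        flags.append("rain_below_range")
    if phase == "snow" and air_temp_c > snow_max_c:
        flags.append("snow_above_range")
    # Step 2: flag only when BOTH precipitation datasets report 0 mm that day
    if prism_mm == 0 and awos_mm == 0:
        flags.append("no_precip_recorded")
    return flags
```

Requiring agreement between the two precipitation datasets reflects the point made above that PRISM alone tended to miss small and trace amounts.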

Precipitation Phase Partitioning Patterns
Although this paper focuses on our outreach efforts and data collection, we include an initial assessment of the precipitation phase observations to improve understanding of rain and snow patterns in our study area. First, we evaluated the proportion of each precipitation phase by elevation using the DEM-derived values identified in Step 1 of the air temperature distribution model above. We next created histograms to show the number of reports corresponding to each phase type by elevation. Then, we summarized the data in 500 m elevation bins, computing the percentage of observations corresponding to each phase and evaluating how the dominant phase of precipitation changed by elevation.
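The elevation summary can be illustrated with a short Python sketch. It assumes each report is an `(elevation_m, phase)` pair and that bins start at multiples of 500 m; the bin edges are an assumption, since the text gives only the bin width.

```python
from collections import Counter, defaultdict


def phase_percent_by_elevation(reports, bin_width_m=500):
    """Percentage of each phase per elevation bin, keyed by the bin's lower edge."""
    counts = defaultdict(Counter)
    for elev_m, phase in reports:
        lower_edge = int(elev_m // bin_width_m) * bin_width_m
        counts[lower_edge][phase] += 1
    result = {}
    for lower_edge, phase_counts in sorted(counts.items()):
        total = sum(phase_counts.values())
        result[lower_edge] = {ph: 100.0 * n / total for ph, n in phase_counts.items()}
    return result


def dominant_phase(percentages):
    """Phase with the largest share in each elevation bin."""
    return {edge: max(p, key=p.get) for edge, p in percentages.items()}
```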
We next used the air temperature and phase data to produce conditional snow probability curves at the scale of Level III Ecoregions from the US Environmental Protection Agency (US EPA, 2015). Our study area included the Sierra Nevada and Central Basin and Range ecoregions in California and Nevada (Figure 2). The curves for each ecoregion display the probability that precipitation at a given air temperature value will be snowfall. To do this, we grouped the precipitation phase observations into 0.5°C air temperature bins and divided the number of snowfall reports by the number of total precipitation reports. Once snow probability curves had been created, we calculated the 50% rain-snow air temperature threshold for each ecoregion. This is the optimized, spatially explicit air temperature that can be used to partition rain and snow in model-based or observational studies instead of a spatially uniform threshold (e.g., the common 0°C rain-snow split).
Here, we followed the approach of Dai (2008) by fitting a hyperbolic tangent to the snowfall probability curve and calculating the air temperature at which the curve crosses 50%. Finally, we compared the derived threshold per ecoregion to two generic rain-snow air temperature thresholds (0 and 2°C) to evaluate the proportion of snow that would be misidentified as rain. First, we plotted the distribution of air temperature data and computed the mean and median values from all of our citizen scientist reports. Next, we calculated the total number of reports as well as the number of snow observations given at air temperatures between the two generic rain-snow thresholds and the thresholds we computed per ecoregion in the steps above. We then assumed that each observation reported as snow in this air temperature range would be incorrectly identified as rain by a land surface model.
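The binning, curve fitting, and misidentification count can be sketched in Python as below. This is an illustration under several assumptions rather than the authors' procedure: it fits a simplified two-parameter hyperbolic tangent, not the full functional form of Dai (2008); it uses an unweighted grid search in place of a proper optimizer; and the search ranges, function names, and `(air_temp_c, phase)` report format are hypothetical.

```python
import math
from collections import defaultdict


def snow_probability_by_bin(reports, bin_width_c=0.5):
    """Snow fraction per temperature bin, keyed by bin center (degC)."""
    snow = defaultdict(int)
    total = defaultdict(int)
    for temp_c, phase in reports:
        idx = math.floor(temp_c / bin_width_c)
        total[idx] += 1  # rain and mixed reports count toward the denominator
        if phase == "snow":
            snow[idx] += 1
    return {(i + 0.5) * bin_width_c: snow[i] / total[i] for i in sorted(total)}


def tanh_snow_probability(temp_c, t50, steepness):
    """Simplified curve: equals 0.5 exactly at temp_c == t50."""
    return 0.5 * (1.0 - math.tanh(steepness * (temp_c - t50)))


def fit_t50(prob_by_bin):
    """Least-squares grid search for the 50% threshold and curve steepness."""
    best = None
    for i in range(-20, 81):        # candidate thresholds: -2.0 to 8.0 degC
        t50 = i / 10.0
        for j in range(1, 31):      # candidate steepness: 0.1 to 3.0 per degC
            b = j / 10.0
            sse = sum((p - tanh_snow_probability(t, t50, b)) ** 2
                      for t, p in prob_by_bin.items())
            if best is None or sse < best[0]:
                best = (sse, t50, b)
    return best[1], best[2]


def snow_misread_as_rain(reports, generic_c, fitted_c):
    """Snow reports warmer than the generic threshold but not the fitted one."""
    return sum(1 for t, ph in reports
               if ph == "snow" and generic_c < t <= fitted_c)
```

In use, `fit_t50(snow_probability_by_bin(reports))` would be run once per ecoregion, and `snow_misread_as_rain` would then be called with each generic threshold (0 and 2°C) and the fitted value.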

Survey, Recruitment and Retention
The Tahoe Rain or Snow survey was launched on January 7, 2020 and the final data export occurred on May 31, 2020 (spring rain and snow storms are not unusual in the Sierra Nevada). For recruitment, we generated content for social media outreach, which was sent to amplifiers who committed to communicating the request to their subscribers either through social media or email. Two increases in enrollment in the text message service were observed: January 13-15, 2020 (n = 93) and January 28 to February 4, 2020 (n = 61) (Figure 3). These two pulses represent amplification by regional groups in mid-January (e.g., National Weather Service Reno amplified the project to their social media on January 13, 2020) and amplification by the CoCoRaHS network in the study area at the end of January. The recruitment methods resulted in 199 individuals signing up for the text message service (Figure 3). Because the text message service and mobile phone app were different platforms, individuals could submit observations anonymously or submit without subscribing to the text message service; therefore, the true number of individual participants is difficult to determine. It is estimated that users submitted on average 12 observations, with several "super users" who reported >35 observations. The retention strategy included the text messaging notification system. Thirteen text message storm alerts were sent to the subscriber list from January to April 2020 (Figure 3, Supplementary Table S1). Following the thirteen text alerts, a total of 115 response texts were received from citizen scientists in the form of questions or comments (excluding requests to opt in to or opt out of the text message service). From January to April, there were 24 unsubscribe requests to the text message alert service, resulting in an 88% retention rate.
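The retention figure can be checked with a short Python sketch. It assumes retention is the share of text message sign-ups who did not unsubscribe, which is consistent with the counts reported above; the function name is hypothetical.

```python
def retention_rate(sign_ups, unsubscribes):
    """Percentage of subscribers who did not unsubscribe (assumed definition)."""
    return 100.0 * (sign_ups - unsubscribes) / sign_ups
```

With 199 sign-ups and 24 unsubscribe requests, `retention_rate(199, 24)` rounds to the 88% stated in the text.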
Figure 3 shows the dates that text alerts were sent to citizen scientists compared to precipitation accumulation from two SNOTEL sites (Mount Rose, elevation 2,682 m; Truckee, elevation 1,984 m). There was a precipitation event between January 14th and January 15th with a total of 84 observations made. No text message had been sent to participants; however, we note that most individuals signed up in the few days prior to the storm event, with the "drip campaign" messaging released 24 and 48 h after sign-up. This contrasts with the March 1st snow event, which was similar in precipitation amount to the January 15th event but had no text message notification and fewer observations (n = 18). By comparison, the events on February 2nd and March 7th both had minimal precipitation accumulation, had text message notifications, and had many observations (n = 121 and 78, respectively) (Figure 3). Figure 4 shows the number of observations compared to the number of days since the last text message notification. We note a decrease in the number of observations ∼20 days after the last notification.
The majority of the observations were collected during the months of March (n = 496) and January (n = 220), with fewer observations submitted in the months of February (n = 138), April (n = 121), and May (n = 34). The low number of observations in February is potentially driven by little precipitation accumulation from February 5th to March 3rd. This was followed by a series of snow and rain storms through mid-April; however, few notifications (April n = 1, May n = 0) may have resulted in the lower number of observations made (Figure 3). Most of the observations were submitted during daylight hours (10:00-16:00) and the most popular day of the week to make an observation was Sunday, followed by Thursday and Saturday (Figure 5). The winter and spring of 2020 were also unique for non-weather-related reasons. In response to the COVID-19 pandemic, the State of Nevada closed all nonessential businesses on March 17th, 2020, and the State of California issued a statewide stay at home order on March 20th, 2020. Schools began distance learning, and non-essential workers began working from home. Many ski resorts in the Sierra Nevada suspended operations as well. These factors may have affected citizen scientists' behavior and availability to participate in the study. At the end of the season, we created a report-back for our citizen scientists. The web link to the report-back was sent via text message to the participants (Supplementary Table S1). In the week following its release on June 10th, there were 86 unique views of the Tahoe Rain or Snow report-back webpage. While it is not possible to determine if hits to this page came solely from Tahoe Rain or Snow participants, we estimate this is an ∼43% conversion rate as a result of this alert.

QA and QC
During the study period (January 7 to May 31, 2020), citizen scientists reported 1,039 precipitation phase observations. After elimination of erroneous locations, there were a total of 1,009 observations, 97.1% of the initial dataset. We then removed six observations from outside the study area because there was an insufficient number of observations to perform a quantitative analysis, leaving a total of 1,003 valid reports. We then compared citizen science observations to meteorological data. Precipitation phase reports generally fell within the air temperature bounds of previous research. Observers reported only two instances of rain at air temperatures less than 0°C and 27 instances of snow at air temperatures above 5°C, representing 0.2 and 2.7% of the dataset, respectively. These data were flagged but not removed from the final dataset. We also flagged observations when both the PRISM and AWOS datasets indicated 0 mm of precipitation on the day of observation. In total, this represented 12 observations, or 1.2% of the total dataset. Removing these observations had little effect on snow probability calculations, so they were included in the analysis but are flagged in the final dataset. In total, 96.5% of all submitted data passed quality control.
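The flagging rules and the pass-rate arithmetic above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's processing code: the function name, argument names, and flag labels are hypothetical, while the thresholds (0°C, 5°C, 0 mm) and the counts come from the text.

```python
def qc_flags(phase, air_temp_c, prism_mm, awos_mm):
    """Return a list of QC flags for one observation (flagged, not removed)."""
    flags = []
    if phase == "rain" and air_temp_c < 0.0:
        flags.append("rain_below_0C")
    if phase == "snow" and air_temp_c > 5.0:
        flags.append("snow_above_5C")
    # Flag only when BOTH precipitation datasets report no precipitation
    if prism_mm == 0.0 and awos_mm == 0.0:
        flags.append("no_precip_in_PRISM_and_AWOS")
    return flags

# Counts reported in the text
submitted = 1039
after_location_check = 1009
valid = after_location_check - 6  # six reports outside the study area

pct_after_location_check = round(100 * after_location_check / submitted, 1)
pct_passing_qc = round(100 * valid / submitted, 1)
```

With the reported counts, `pct_after_location_check` evaluates to 97.1 and `pct_passing_qc` to 96.5, matching the percentages stated above.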

Precipitation Phase Partitioning Patterns
After elimination of data submitted with erroneous locations or from outside the study area, there were a total of 1,003 observations, with 705 observations of snow, 163 observations of rain, and 135 observations of mixed phase precipitation (Figure 3). Most of the observations were from the targeted communities surrounding Lake Tahoe, the Truckee Meadows (inclusive of the cities of Reno/Sparks), and Carson City, Nevada (Figure 2). The distributions of all phase types by elevation are bimodal (Figure 6), which may be indicative of our study region's topography. Truckee, CA and most of the towns around Lake Tahoe are located near or above 1,800 m in the Sierra Nevada ecoregion, while Reno, NV and Carson City, NV are both located below 1,500 m in the Central Basin and Range ecoregion. Most of our observations come from population centers and nearby road corridors (Figure 2). As such, we have relatively few reports from the sparsely populated areas between the higher and lower elevation bands.
Unsurprisingly, the proportion of snowfall in the citizen science observations increased with elevation (Figure 6). Snowfall was the dominant precipitation phase type in each elevation band, only dropping below 50% at elevations <1,250 m. Above 2,250 m, 92.7% of reports were snowfall during our study period. The greatest percentages of rain and mixed precipitation were reported at elevations <1,250 m at 26.2% and 31.0%, respectively (Figure 7). The latter value was more than twice the second highest amount of mixed precipitation reported at any other elevation band.
At air temperatures near freezing, snowfall probability was near 100% according to citizen science reports in both ecoregions (Figure 8). Snowfall probability declined with increasing air temperature, nearing 0% as air temperature approached 10°C. The similar snowfall probability curves produced identical 50% rain-snow air temperature thresholds of 4.2°C in the Sierra Nevada and Central Basin and Range ecoregions. This threshold is markedly warmer than the 0°C used as a default value in some land surface models.
Frontiers in Earth Science | www.frontiersin.org February 2021 | Volume 9 | Article 617594
During our study period, 57.0% of observations were reported at air temperatures between 0 and 4.2°C, the latter being the optimized 50% rain-snow air temperature threshold for our ecoregions (Figures 8, 9). The distribution of air temperature data corresponding to our reports was unimodal, with mean and median values of 1.9 and 2.0°C, respectively. These findings indicate that the majority of precipitation phase reports given by our observers were in the air temperature range of greatest rain-snow uncertainty. Of the 572 reports given between 0 and 4.2°C, 410 were snowfall, meaning a 0°C rain-snow air temperature threshold would have misidentified as rain the 71.7% of observations in this range that were actually snow. Even using a warmer threshold of 2°C would lead to a 60.6% rain vs. snow misidentification rate for the 317 reports given between 2 and 4.2°C.
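To make the threshold and misidentification figures concrete, the sketch below bins reports into 0.5°C air temperature bins, computes the fraction reported as snow in each bin, and finds where that fraction falls through 50%. The linear interpolation between bin centers is an assumption made for illustration; the text does not specify how the optimized threshold was computed. The function names and sample observations are hypothetical. Mixed reports count as not snow, consistent with the treatment of mixed precipitation as liquid described later in the article.

```python
import math
from collections import defaultdict

def snow_probability_by_bin(observations, bin_width=0.5):
    """Map bin center (deg C) -> fraction of reports in that bin that were snow."""
    counts = defaultdict(lambda: [0, 0])  # bin index -> [snow count, total count]
    for temp_c, phase in observations:
        idx = math.floor(temp_c / bin_width)
        counts[idx][0] += 1 if phase == "snow" else 0  # mixed counts as not snow
        counts[idx][1] += 1
    return {(idx + 0.5) * bin_width: snow / total
            for idx, (snow, total) in sorted(counts.items())}

def threshold_50(prob_by_bin):
    """Temperature where snow probability first falls through 0.5 (assumed linear)."""
    centers = sorted(prob_by_bin)
    for lo, hi in zip(centers, centers[1:]):
        p_lo, p_hi = prob_by_bin[lo], prob_by_bin[hi]
        if p_lo >= 0.5 > p_hi:
            return lo + (p_lo - 0.5) / (p_lo - p_hi) * (hi - lo)
    return None

# Hypothetical observations (air temperature in deg C, reported phase)
sample = [(3.6, "snow"), (3.7, "snow"), (3.8, "snow"), (3.9, "rain"),
          (4.1, "snow"), (4.2, "mixed"), (4.3, "rain"), (4.4, "rain")]
probs = snow_probability_by_bin(sample)
threshold = threshold_50(probs)

# Misidentification arithmetic using the counts reported in the text
pct_snow_called_rain = round(100 * 410 / 572, 1)
```

The final line reproduces the 71.7% figure: with a 0°C threshold, every one of the 410 snow reports among the 572 reports between 0 and 4.2°C would be classified as rain.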

Citizen Science Recruitment, Retention, and Engagement
Various strategies have been proposed in the literature for maximizing citizen science recruitment (Robson et al., 2013; Eveleigh et al., 2014; Andow et al., 2016; Reges et al., 2016; Crall et al., 2017); however, key to those strategies is retaining volunteers and ensuring high data quality. Here we show a high rate of retention (88%) and high quality of data (96.5% passing strict quality controls), which we attribute to recruitment, retention, and engagement methodologies that maximized the submission of high-quality data for the study of precipitation phase.
Lessons for other citizen science programs can be taken from the effectiveness of the strategies we describe above. We implemented a recruitment strategy of targeted messaging to recruit citizen scientists with values aligned with the study, followed by communicating the opportunity to participate through social media and connections to regional groups. We recommend that science leads of citizen science projects consult with engagement specialists or social scientists to consider which values are aligned with their programs and to identify partner "amplifier" groups. Messaging amplified through known organizations in the community proved effective in introducing Tahoe Rain or Snow, as sign-ups increased after National Weather Service-Reno and CoCoRaHS circulated the call to participate.
FIGURE 8 | Snowfall probability plotted against air temperature for the Central Basin and Range and Sierra Nevada ecoregions. Snowfall probability is equal to the percent of citizen science observations reported as snow per 0.5°C air temperature bin. The gray dashed line corresponds to 50% snowfall probability.
FIGURE 9 | Kernel density plot showing the distribution of all precipitation phase reports by air temperature. The 50% rain-snow air temperature threshold value is marked with the black dashed vertical line. The mean and median air temperature values from our citizen scientist reports were 1.9 and 2.0°C, respectively.
While the project engaged >199 individuals, recruitment of a dedicated super-user group who submitted large numbers of observations was also important for the project's success and was dependent on recruiting a larger base of volunteers. We suggest that future projects consider making sign-up or opt-in for citizen science programs simple and straightforward, and that mechanisms for two-way communication with volunteers be integrated. The text message notification system retained 88% of individuals who signed up, and the continued reporting throughout the study period (Figure 4) suggests that this was an effective approach for retaining individuals. The results shown in Figure 4 suggest that the number of observations increases with notifications, and such continued engagement may also have aided continued observations even with the outbreak of the COVID-19 pandemic during the winter of 2020. The number of observations may also be influenced by the amount of precipitation, as events with a greater amount of precipitation may persist for longer. The continued response texts received from citizen scientists throughout the winter also support the effectiveness of the text message system in engaging volunteers, building community with the observers, and improving data quality. For example, participants sent questions about appropriate locations to make observations, requests for clarification on weather types, pictures of the precipitation at their location, and proud messages about how many observations they had submitted through the season. The text message system and the simplicity of the survey design potentially aided in retaining 'dabblers', or those with an intermittent approach to participation (Eveleigh et al., 2014). Many dabblers are motivated to continue to participate in citizen science projects and can help broaden the reach of the project (Eveleigh et al., 2014), making them an important group to engage. In addition to these factors, the simplicity of the survey and continued education through the text messaging system may also have contributed to the high quality of data. Finally, we encourage all citizen science programs to dedicate energy and time to meaningful report-back of the results to volunteers.
The 43% conversion rate from the report-back text message alert is high in comparison to conversion rates reported in other studies, such as 4% from email or 10% from social media (Crall et al., 2017). A comparison of conversion rates between text alerts and email newsletters can be made with a similar citizen science project also led by DRI, called Stories in the Snow, which uses email notifications for users and is conducted in the same region as Tahoe Rain or Snow. For Stories in the Snow email newsletters sent in the 2019-2020 season to ∼1,246 users, the mean percentage of emails opened was 32%, and the mean conversion rate was only 4.7% (based on data from the email campaign management platform). The high conversion rate of the Tahoe Rain or Snow report-back also supports the effectiveness of text messages to engage individuals in comparison to email messaging for a similar user group in the same region.
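The conversion-rate comparison reduces to simple arithmetic, sketched below. The 86 unique views and the 4.7% email rate come from the text. Using 199 participants as the number of text recipients is an assumption made here because it is consistent with the ∼43% estimate; the function name is hypothetical.

```python
def conversion_rate(conversions, recipients):
    """Percentage of recipients who took the action (here, viewed the page)."""
    if recipients <= 0:
        raise ValueError("recipients must be positive")
    return 100.0 * conversions / recipients

# 86 unique page views (from the text); 199 recipients is an ASSUMPTION
text_rate = conversion_rate(86, 199)

# Mean email conversion rate reported for Stories in the Snow
email_rate = 4.7

# How many times higher the text-alert rate is than the email rate
ratio = text_rate / email_rate
```

Under that assumption the text-alert rate is roughly nine times the email rate, although the two figures come from different campaigns and are not a controlled comparison.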

Applicability to Precipitation Phase Studies
A threshold of 0°C is still used to partition rain and snow, likely because the freezing point seems to be a logical temperature at which to split precipitation phase, and it is thus a default method in some land surface models such as the widely used Variable Infiltration Capacity (VIC) model (Liang et al., 1994). Another common assumption in many studies is the use of a spatially uniform rain-snow air temperature threshold, where precipitation phase is partitioned at the same air temperature across wide spatial extents (Harpold et al., 2017). This is also incorrect. To date, there has been a significant body of literature highlighting the inadequacy of 0°C thresholds, the spatial variability in rain-snow partitioning, and the negative effects of incorrectly determining precipitation phase (Marks et al., 2013; Ye et al., 2013; Ding et al., 2014; Jennings et al., 2018; Jennings and Molotch, 2019). In our project, we did not calculate a difference in the 50% rain-snow air temperature threshold between the two adjacent ecoregions, but at 4.2°C the value is markedly warmer than what would be incorporated into any land surface model. This value is also warmer than the spatially variable thresholds of 1.8-2.6°C predicted in Jennings et al. (2018) for the four grid cells with the most citizen science observations. Thus, the use of a common lower air temperature threshold in these areas would produce inaccuracies in annual and event-based rain-snow proportions.
Despite the utility of our citizen science data, it is also important to consider potential limitations, including a lack of spatial and temporal variability. Most of the observations were made during daylight hours (Figure 5), and many of the observations were near population centers (Reno, Carson City, and Truckee) and along roads (Figure 2). Similar limitations have been observed in other direct-observation approaches (Harpold et al., 2017). Future efforts will encourage observation at all times of day as well as backcountry and remote observations. However, even with these limitations, the citizen science dataset resulted in a sufficient number of quality data points for precipitation phase validation for both EPA Level III Ecoregions in the study area (numbers 5 and 13) (Figure 2). Another source of error is the air temperature predicted by our simple model. Although the high r² and low mean bias suggest reasonable accuracy, there are tradeoffs in combining multiple data sources from varied locations with different measurement protocols. We also came up against the same problem with mixed precipitation as other observational studies (e.g., Jennings et al., 2018), in that observers do not report rain-snow proportions and therefore we cannot validate precipitation phase partitioning methods that predict a liquid-solid gradient from one air temperature threshold to another. In this context, we considered mixed precipitation to be liquid when computing 50% rain-snow air temperature thresholds, similar to how it is treated in NASA's Global Precipitation Measurement (GPM) mission (Huffman et al., 2019). This shortcoming suggests the need for additional quantitative studies in areas that receive significant amounts of mixed precipitation (Yuter et al., 2006; Avanzi et al., 2014; Wayand et al., 2017).
An alternative method to visual reports and model output is the use of satellite and ground-based remote sensing data. GPM, for example, provides estimates of precipitation phase in its Integrated Multi-satellite Retrievals for GPM (IMERG) and Dual-frequency Precipitation Radar (DPR) products, both of which have their own shortcomings. Precipitation phase in IMERG is, in essence, a reanalysis product where rain and snow are partitioned using a wet bulb temperature threshold (Sims and Liu, 2015). DPR, in contrast, provides precipitation phase estimates from Ka and Ku band radar retrieval algorithms (Iguchi et al., 2018). Despite technological advances, large errors in DPR falling snow observations persist partly as a result of a lack of validation data (Skofronick-Jackson et al., 2018). In this context, citizen science precipitation phase data, particularly observations collected over a larger spatial extent, could provide much-needed validation data and provide pathways for improvement in GPM output.

CONCLUSION
Success, for the purpose of this study, is defined as meeting our goals to: 1) recruit and retain citizen scientists, and 2) collect a sufficient number of quality data points for precipitation phase analysis. Here we have shown the successful application of citizen science for ground-based precipitation phase observations. Recruitment through messaging targeting winter, weather, and outdoor enthusiasts, coupled with amplifiers to reach group members with similar interests, created a pool of citizen scientists comprising both super-users and 'dabblers'. The high retention rate (88%) was potentially supported by the text message system and two-way communication. Continued engagement throughout the winter and spring through the text messages encouraged all participants to continue to submit observations, provided further training opportunities, and enabled two-way communication with the observers. In addition, the simple, easy-to-use design of the survey may have encouraged retention and high-quality data. Combined, these factors contributed to the high data quality and high retention rate.
Ground-based observations have important applications for validation of modeling and remote sensing of precipitation phase. For example, Jennings and Molotch (2019) showed that in the lower elevations of the Cascades and Sierra Nevada (e.g., warmer snow areas), incorrectly identifying a rain-snow air temperature threshold can produce significant errors in modeled snow accumulation and melt. Here we show that citizen science offers an approach to collect data of high quality and spatial and temporal variability, with an engagement, recruitment and retention program applicable to other studies. We recommend that other citizen science projects consider implementing a targeted communication and recruitment strategy, continued communication and feedback for citizen scientists to encourage engagement, and an easy-to-use data reporting system for quality assurance.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: The precipitation phase observations generated for this study can be found in the Mendeley Data repository https://data.mendeley.com/datasets/gfkkvvm7w8/1. SNOTEL data can be found at https://www.wcc.nrcs.usda.gov/snow/ (accessed July 2020). Code used to process and analyze the data can be found at: https://github.com/SnowHydrology/Tahoe_RainOrSnow.

AUTHOR CONTRIBUTIONS
MA, MC, and KJ designed the experiment and collected and interpreted the data. MA, MC, and KJ wrote the article and approve the content of the work.