The Waiting and Mating Game: Condition Dependent Mate Sampling in Female Gray Treefrogs (Hyla versicolor)

Strong sexual selection by receivers can lead to the evolution of elaborate courtship behaviors in signalers. However, the process by which receivers sample signalers and execute mate choice under complex signaling conditions, and thus the realized strength of sexual selection, is poorly understood. Moreover, receivers can vary in condition, which can further influence mate sampling strategies. Using wild female frogs, we tested two hypotheses at the intersection of these important problems: that some of the individual variation in mate sampling is explained by (1) the reproductive urgency hypothesis, which predicts that receivers in a more urgent reproductive state will sample mates less, and/or (2) the reproductive investment hypothesis, which predicts that receivers that have invested less in the current reproductive effort will sample mates less. Eastern gray treefrogs, Hyla versicolor, were collected in amplexus and repeatedly tested for phonotaxis behavior using a dynamic playback assay. To evaluate whether hormonal mechanisms explained variation in mate sampling, three steroid hormones (estradiol, progesterone, and corticosterone) were collected using a noninvasive water-borne hormone assay, validated for this species in the present study. Finally, we measured clutch size (investment) and the time required for each female to oviposit after being reunited with her male mate (urgency). We found repeatability in many of the behaviors, including mate sampling. We found that females with higher concentrations of estradiol and corticosterone made quicker choices, and that females with higher progesterone sampled mates more. We also found that female frogs in a more urgent reproductive state had lower concentrations of progesterone and estradiol, thereby providing the first evidence of a relationship between gonadal hormones and reproductive urgency. Collectively, we found some support for the reproductive urgency hypothesis but not the investment hypothesis.
Thus, even though a female frog's reproductive readiness is a highly transient life history stage, fine-scale variation in her reproductive timeline could mitigate the strength of directional selection.


INTRODUCTION
The strategies that sexually reproducing animals use to sample and choose mates have important implications for the nature and strength of sexual selection (Andersson, 1994). Mate choice preferences have been studied extensively, and there is ample evidence that signal preferences can provide direct and indirect benefits to the choosier sex (Ryan, 1985; Welch et al., 1998; Head et al., 2005). In many species, females exhibit strong preferences for static and dynamic features of male advertisement signals (Gerhardt, 1991; Gerhardt et al., 1996), and thus a female's sensory preferences and mate selection behavior are two phenotypic levels of interest for understanding the evolution of communication systems (Cotton and Small, 2006).
Female mate choice can be both context- and condition-dependent. Preferences can vary with external factors, such as predation risk and energetic and time costs (Cotton and Small, 2006), and with internal factors, such as a female's condition (Cotton and Small, 2006). For example, age (Moore and Moore, 2001; Coleman et al., 2004), diet, resource availability (Vitousek, 2009), early developmental stress (Woodgate et al., 2010), body condition (Baugh and Ryan, 2009), and social rank and dominance (Owens et al., 1994; Bro-Jørgensen, 2002; reviewed in Cotton and Small, 2006) have all been shown to impact female mate choice in a variety of taxa.
The context- and condition-dependent nature of mate sampling has primarily been studied using static mate choice designs, which employ invariant stimulus presentations. For example, in the barking treefrog (Hyla gratiosa), females sample mates simultaneously, with the number of calls sampled in phonotaxis experiments ranging from 4 to 8 (Murphy, 2012), as opposed to simply selecting the first mate that is detectable or that satisfies a threshold criterion (Murphy and Gerhardt, 2002). Although such studies are valuable for understanding the basic rules of mate sampling, the process of executing a mate choice can also be temporally dynamic. For example, studies in túngara frogs (Physalaemus pustulosus; Ryan, 2009, 2010a,b,c) and gray treefrogs (Hyla versicolor; Gerhardt et al., 1996) have demonstrated among-individual variation in how female receivers execute mate choices in real time during dynamic playbacks, in which stimulus presentation varies temporally. Such dynamic designs provide an additional means by which to study sampling strategies and choosiness. Costs incurred by choosier females can include loss in body mass (Wikelski et al., 2001), opportunity costs, and competition (Lindström and Lehtonen, 2013).
One approach to understanding how such costs shape female mate sampling strategies is to examine variation in sampling behavior at the individual level. Behavior can vary at the within- and among-individual levels. The former largely represents transient variation due to internal (e.g., body condition) and external factors (e.g., abiotic or social environment), and the latter represents individual differences in a phenotype (i.e., repeatability; see Dingemanse and Dochtermann, 2013). There is evidence for variation at both levels in mate choice studies (reviewed in Jennions et al., 1995; Jennions and Petrie, 1997; Bell et al., 2009). Condition-dependent variables offer an opportunity to evaluate the internal contributions to mate sampling. For example, reproductive urgency and reproductive investment may covary with female behavior. Reproductive urgency, the small and transient window of opportunity a female has to find a mate and copulate, has been suggested as a determinant of sampling. In fiddler crabs, Uca annulipes, females are more selective at the beginning of their mate sampling period, when they face fewer temporal constraints, than at the end of this time window, when the cost of sampling becomes high (Backwell and Passmore, 1996). The same pattern was suggested in female túngara frogs, wherein females with higher residual body mass (and thus presumably further along in egg clutch maturation) exhibited reduced choosiness under dynamic playback conditions (Baugh and Ryan, 2009). Reproductive investment, such as the size of a clutch or gonad, could also be a source of variance in choosiness, though evidence in support of this idea is mixed. A study in house crickets by Gray (1999) found that reproductive investment had no effect on mate choice, while another study demonstrated that, through cryptic mate choice, females vary their investment based on the quality of the mate (Reyer et al., 1999).
In exploring how reproductive urgency and investment impact mate choice behavior, we must also investigate the physiological mechanisms serving as a substrate for such natural variation, including the gonadal and adrenal steroid hormones that are known modulators of vertebrate reproductive behavior. In the rhesus monkey, Macaca mulatta, where females associate primarily with males during the breeding season and with females in the nonbreeding season, ovariectomies drastically attenuate a female's preference to affiliate with males during courtship, and injection with estrogen restores affiliative behavior, but only during the breeding season (Michael and Zumpe, 1993; Adkins-Regan, 1998). In meadow voles, Microtus pennsylvanicus, females prefer male odors during the spring and summer, when photoperiods are long, but prefer female odors during the winter, when photoperiods are shorter. In addition to depending on longer photoperiods, the preference for males is modulated by estrogen: ovariectomies reverse preferences for male odors, while estradiol (E2) treatments restore the preference (Ferkin and Zucker, 1991; Adkins-Regan, 1998). Similarly, in songbirds, E2 elevates copulation solicitation displays in females (reviewed in Maney and Pinaud, 2011). Similar patterns are present in anurans. In the túngara frog, females are most receptive to advertisement calls when found in amplexus, coinciding with elevated concentrations of progesterone (PROG) and E2 that decline after copulation. Experimentally elevating E2 in female túngara frogs using E2 injections (Chakraborty and Burmeister, 2009) or human chorionic gonadotropin injections (Chakraborty and Burmeister, 2009) increases a female's receptivity prior to copulation.
In female gray treefrogs, injections with PROG and prostaglandins increase receptivity (Gordon and Gerhardt, 2009; Ward et al., 2015), and females with higher concentrations of PROG and E2 are more receptive (Gordon and Gerhardt, 2009).
Complementing the gonadal steroid studies, recent evidence has shown that interrenal hormones, such as glucocorticoids, modulate aspects of mate choice in female vertebrates (Vitousek, 2009; Vitousek and Romero, 2013; Davis and Leary, 2015). For example, elevated concentrations of corticosterone (CORT) in the green treefrog, Hyla cinerea, led to a decrease in discrimination between conspecific male advertisement calls broadcast at various call rates (Davis and Leary, 2015). Because the gonadal and adrenal/interrenal steroids might interact antagonistically (reviewed in Toufexis et al., 2014), it is important to evaluate the role of both the hypothalamic-pituitary-gonadal (HPG) and hypothalamic-pituitary-adrenal/interrenal (HPA/I) axes when investigating hormonal mechanisms of mate choice.
Here we test two hypotheses on how reproductive urgency and investment contribute to a female's mate sampling behavior using wild-caught eastern gray treefrogs (Hyla versicolor).
(1) Reproductive urgency hypothesis: females with a shorter time horizon prior to oviposition will exhibit less mate sampling (greater commitment to an initial mate choice and shorter latencies). We assumed that females that oviposit sooner are in a more urgent reproductive state and predicted that they would exhibit shorter choice latencies and be more likely to commit to an initial mate preference. (2) Reproductive investment hypothesis: females that have invested more resources into a current bout of reproduction have more at stake and will thus be choosier. We predicted that females with larger clutch masses would exhibit longer choice latencies and sample more mates. Because these two hypotheses are not mutually exclusive, we also explored the potential relationship between these factors. Lastly, we investigated steroid hormone mechanisms using a non-invasive water-borne assay that was conducted alongside the behavioral assays with a protocol that has been effective in fish (Fischer et al., 2014) and amphibians (Gabor and Grober, 2011; Gabor et al., 2013; Baugh et al., 2018).

Study Species
Male H. versicolor produce pulsatile advertisement calls to attract females during the breeding season (Wells, 2007). Females prefer calls with a greater number of pulses and higher call rates. Research has shown that selecting males with longer call durations (more pulses) provides indirect benefits, as call duration is an indicator of the male's genetic quality (Welch et al., 1998), and that gonadal hormones are important determinants of female proceptivity (Gordon and Gerhardt, 2009; Ward et al., 2015). Most previous work has used conventional static phonotaxis assays, in which a female is allowed to move freely toward speakers producing a repeated train of advertisement calls or alternative sounds. Recent studies, however, have shown that the process of mate choice in receivers is subject to real-time changes in signaler outputs. Because receivers can update their decision-making in real time, dynamic playbacks offer an opportunity to evaluate female choosiness (cf. fickleness; Gerhardt et al., 1996; Ryan, 2009, 2010a,b,c) and mate sampling behaviors in greater detail.

Animals
We collected breeding frogs from Glassboro Wildlife Management Area in Glassboro, NJ (39.68° N, −75.07° W, elevation: 39 m) during May and June of 2016 between the hours of 2100 and 2300. Almost all females were collected in amplexus with a male. A few females were found approaching a calling male; we collected and paired them and allowed them to enter amplexus. Collecting females in amplexus and testing them prior to oviposition ensures that they are in reproductive condition, and the fact that they have already selected a mate does not appear to influence their choosiness (Murphy and Gerhardt, 1996). Mated pairs were placed in small plastic containers with 2 cm of water and held at 4 °C for 0-3 days. This holding method has been used extensively to maintain female gray treefrogs prior to testing (Ward et al., 2015; Tanner et al., 2017). In 2015 we conducted a pilot study on the same population of frogs using a similar dynamic phonotaxis assay and found that female phonotaxis was not influenced by this holding method (latency to exit the origin: t(12) = 0.05, p = 0.85; latency to choice: t(12) = 0.43, p = 0.59; probability of choice: binomial exact test, p = 0.06). Likewise, this holding method does not impact circulating concentrations of CORT (t(43) = 0.22, p = 0.83) or E2 (t(43) = 0.13, p = 0.90); potential effects on PROG are unknown. Before testing, female frogs were placed in room temperature frog water in an incubator for ca. 10 min until they reached 20 °C (Fluke 62 Max+ IR thermometer, Everett, WA). The males were placed in an oviposition chamber with room temperature frog water and were reunited with the female at the conclusion of the female's testing. Frogs were tested at Swarthmore College and all methods were approved by its Institutional Animal Care and Use Committee.

Stimuli and Experimental Design
We used synthetic advertisement calls created with custom software (J. Schwartz, Pace University at Pleasantville, NY, U.S.A). The software used mean call parameters estimated from our population (Supplemental Materials S1) to synthesize a single pulse, and this pulse was replicated for the number of pulses needed in a given call. In all experiments, an 18-pulse call was used as the more attractive stimulus and a 10-pulse call was used as the less attractive stimulus; these values represent the mean pulse number and two standard deviations below the average pulse number, respectively, from the field site population (Supplemental Materials S2).
Subjects were tested in a sound attenuating acoustic chamber (Industrial Noise Control, North Aurora, IL) under infrared lighting (Figure 1). Before testing, both speakers used to broadcast the stimuli were calibrated to 80 dB SPL (re 20 µPa) at the center ("origin") of the chamber using a SoundTrack LXT sound pressure level meter (Larson Davis, Provo, UT; Figure 1). Two PCs were used during testing; one that controlled the acoustic playback using SIGNAL (Version 5, Berkeley, CA), and one that was connected to a ceiling mounted IR camera (Ikegami ICO-49, Japan) that monitored the frog's movement using Ethovision XT (Version 9, Noldus, Wageningen, NL).
FIGURE 1 | Phonotaxis arena illustrating decision boundaries and an example trial with the pre-manipulation (initial) and post-manipulation (final) stimuli. The radius (from the center of the speaker) of the approach boundary was 80 cm and that of the choice boundaries was 10 cm. When a female crossed the approach boundary of the 18-pulse call, the stimuli were altered such that the initially approached speaker began to broadcast the less attractive 10-pulse call and the opposing speaker broadcast the 18-pulse call. A hypothetical non-reversal (solid black line) and reversal (solid gray line) phonotaxis path are shown. Internal length and width dimensions are indicated.

We used a dynamic two-choice design similar to that conducted in Ryan (2009, 2010a,b,c). At the beginning of each trial, the subject was placed under a mesh cone at the origin with two speakers placed at opposite sides of the chamber, antiphonally broadcasting one of the two stimuli at a call rate of one call every 4 s per speaker. In order to minimize any potential side bias in the chamber or first caller preference (Bosch and Márquez, 2002), we randomly assigned the order of the stimuli (before and after manipulation) as well as the location of the stimuli. After 60 s of playback, the cone was lifted remotely and phonotactic behavior was observed live and digitally accessioned using Ethovision. If the frog approached the more attractive stimulus by crossing the "approach boundary" (Figure 1), then the observer pressed the spacebar on the computer controlling the playback, leading to the activation of a custom program in SIGNAL that introduced a 500-ms delay and, depending on the experimental condition, modifications to stimulus presentation. In all cases except the control condition, the 18-pulse call was changed to a 10-pulse call, and the 10-pulse call was simultaneously changed to an 18-pulse call. This process of call alteration was perceptually seamless to the frog: testers only pressed the spacebar during the silent interval between calls in order to prevent artificially truncating a call, because interrupting a call has been shown to decrease its attractiveness (Henderson and Gerhardt, 2013). If the frog continued to approach the speaker now broadcasting the 10-pulse call and crossed the choice boundary, then the trial was terminated and recorded as a "non-reversal" choice. In contrast, if the frog reversed and approached the speaker now broadcasting the 18-pulse call and crossed that choice boundary, then the trial was terminated and recorded as a "reversal" choice (Figure 1).
In addition to reversals, for each trial we measured and estimated the repeatability of the following behavioral measures: (1) latency to exit the origin circle; (2) latency to cross the approach boundary (excluding origin exit latency); (3) pause duration after stimuli alteration; (4) latency to choice after stimuli alteration; and (5) overall latency to choice (entire trial). Path length to approach boundary, path length to choice, and path length to choice after alteration were estimated from the recorded file using Ethovision's automated tracking functions.
We tested each subject under three experimental conditions for a total of seven trials: (1) Experimental single switch: five repeated measures trials were conducted per female wherein the 18-pulse call was changed to the 10-pulse call if the female initially crossed the approach boundary toward the 18-pulse call; (2) Experimental looped switch: one trial per female wherein the 18-pulse call became a 10-pulse call if the female initially crossed the approach boundary toward the 18-pulse call, and then this cycle of stimulus alteration was repeated every time a female crossed the approach boundary toward the current 18-pulse call; (3) Control: one trial per female wherein the playback was interrupted for 500 ms after the female crossed the approach boundary toward the 18-pulse call, but then stimuli were rebroadcast from their initial locations (i.e., no change of stimulus location). Trials in which frogs failed to leave the origin after 5 min, were stationary for 2 min after leaving the origin, or took longer than 10 min to make a choice were recorded as "fouls." Frogs that fouled twice were given a three-minute "time out," during which they were placed in Tupperware full of water in an incubator (20 °C), and were then tested again. If the frog continued to foul, then that frog was no longer tested, though all biometrics, oviposition data, and hormone data were collected. A frog could also commit a foul by initially approaching and then choosing the lower pulse number call (i.e., no stimuli alteration) because in these trials, females did not have the opportunity to dynamically resample mates; in these instances the frog was still tested until the frog was unresponsive or successfully finished all tests.
In these types of fouls, because the stimuli were never altered, the trials could not be used in the analysis of latency to decision boundary, pause duration, latency to choice or latency to choice after switch; however, they were included in analysis of latency to exit the origin.
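The scoring rules for experimental switch trials can be summarized in a short sketch. This is only an illustration of the criteria described above, not the software used in the study: the `Trial` fields, the `classify` function, and the outcome labels are hypothetical names, and control trials (where no switch occurs) are not covered.

```python
from dataclasses import dataclass
from typing import Optional

# Time limits stated in the text, in seconds.
MAX_ORIGIN_EXIT_S = 5 * 60   # frog must leave the origin within 5 min
MAX_STATIONARY_S = 2 * 60    # frog must not be stationary for 2 min after leaving
MAX_CHOICE_S = 10 * 60       # frog must make a choice within 10 min


@dataclass
class Trial:
    """One experimental switch trial. All field names are hypothetical."""
    origin_exit_s: Optional[float]        # None if the frog never left the origin
    longest_pause_s: float                # longest stationary bout after leaving
    choice_s: Optional[float]             # None if no choice boundary was crossed
    first_approach_pulses: Optional[int]  # 18 or 10: call approached first
    chosen_call_pulses: Optional[int]     # call playing at the chosen speaker at choice


def classify(trial: Trial) -> str:
    """Return 'foul', 'foul_no_switch', 'reversal', or 'non-reversal'."""
    if trial.origin_exit_s is None or trial.origin_exit_s > MAX_ORIGIN_EXIT_S:
        return "foul"
    if trial.longest_pause_s >= MAX_STATIONARY_S:
        return "foul"
    if trial.choice_s is None or trial.choice_s > MAX_CHOICE_S:
        return "foul"
    if trial.first_approach_pulses == 10:
        # The stimuli were never altered, so resampling could not be scored.
        return "foul_no_switch"
    # The frog approached the 18-pulse call, so the stimuli were switched.
    if trial.chosen_call_pulses == 18:
        return "reversal"      # frog turned to the speaker now playing 18 pulses
    return "non-reversal"      # frog stayed with the speaker now playing 10 pulses
```

Keeping the "no switch" fouls as a separate label reflects the text: those trials are excluded from post-alteration measures but still contribute to the analysis of latency to exit the origin.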

Hormone Sampling and Reproductive Investment
We sampled each female frog for both blood- and water-borne (excreted) hormone concentrations and estimated reproductive investment. Concentrations of excreted steroid hormones were measured using a water-borne hormone method (Gabor et al., 2013; Baugh et al., 2018). After phonotaxis testing, frogs were weighed and placed in 100 mL of room temperature "frog water" for 30 min (hereafter "pre-oviposition water bath") while housed in a dark holding room with a playback (FoxPro Wildfire) of a recording from a natural chorus at the collection site (88 dB SPL at 1 m). Frog water was prepared by dissolving 1.2 g CaCl2, 1.38 g MgSO4, 1.08 g KHCO3, and 0.038 g of commercial trace-element mix in 30 L of reverse osmosis water (Baugh et al., 2018). After this 30-min bath, females were transferred to an oviposition chamber (Exo Terra 30 × 30 × 43 cm, with 1 inch of water) with their original male partner. Each pair was monitored under darkness for 14 h using an IR camera (Bell & Howell, Rogue Night Vision) to determine the latencies for the frogs to reengage in amplexus and oviposit. Afterwards, frogs were placed in a water bath for 1 h (hereafter "post-oviposition water bath"). Following both water baths, we collected the water samples and stored them at −80 °C until extraction. We then collected blood using a non-lethal cardiac puncture method (Gordon and Gerhardt, 2009; Davis and Leary, 2015) and recorded the handling time of blood collection (time elapsed from collecting the frog to completing the bleed). The frog's snout-vent length (SVL) was also measured to determine residual body mass (RBM) from a length-mass linear regression. Lastly, we counted the eggs the female laid in the oviposition chamber and measured the diameter of a haphazardly selected subsample of 50 eggs to determine average egg volume.
The eggs from one female were used to determine the density of an egg, and from this, clutch mass was calculated for all females by multiplying this value by average egg volume. Measuring clutch mass has been used previously to measure reproductive investment in vertebrates (Schoenle et al., 2017;Zhang et al., 2018).
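As a rough illustration of these two derived measures, the sketch below computes RBM as the residuals of an ordinary least-squares regression of mass on SVL, and clutch mass from egg density and mean egg volume. Several details are assumptions not stated in the text: eggs are treated as spheres, the per-egg mass is scaled by the egg count, and all function names and units are hypothetical.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_body_mass(svl_mm, mass_g):
    """Observed mass minus the mass predicted from SVL, one value per frog."""
    slope, intercept = linear_fit(svl_mm, mass_g)
    return [m - (intercept + slope * s) for s, m in zip(svl_mm, mass_g)]


def sphere_volume_mm3(diameter_mm):
    """Volume of a sphere from its diameter (assumes spherical eggs)."""
    return math.pi * diameter_mm ** 3 / 6


def clutch_mass_mg(egg_diameters_mm, egg_count, density_mg_per_mm3):
    """Egg density x mean egg volume x number of eggs (count scaling is assumed)."""
    volumes = [sphere_volume_mm3(d) for d in egg_diameters_mm]
    mean_volume = sum(volumes) / len(volumes)
    return density_mg_per_mm3 * mean_volume * egg_count
```

A positive RBM indicates a female heavier than expected for her length; by construction the residuals sum to zero across the sample.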

Water-Borne Hormones
Water samples from the pre- and post-oviposition baths were thawed and filtered with VWR grade 417 filter paper (primed with 3 mL reverse osmosis water). Following Gabor et al. (2013), steroids were extracted using solid phase columns (Sep-Pak C18 500 mg cartridges; Waters Corp. Milford, MA) activated with 4 mL of methanol (ACS grade) and equilibrated with 4 mL of reverse osmosis water. This method has been used to extract water-borne steroid hormones, such as CORT, E2, and PROG, in other amphibians (Gabor et al., 2013; Baugh et al., 2018). The water samples were processed through the cartridges under 15 bar of vacuum pressure using a 24-port manifold system (United Chemical Technologies, LLC, Bristol PA). An additional 4 mL of reverse osmosis water was processed through each cartridge to evacuate the sample. We then eluted the cartridges using 4 mL of methanol (HPLC grade) into borosilicate vials (Gabor and Grober, 2011; Gabor et al., 2013). The samples were then dried using one of two methods: a subset of the samples was dried using an evap-o-rack manifold (Cole-Parmer, Bunker CT) with nitrogen gas in a 37 °C water bath, and a different subset was partially dried using this method and then finished using diethyl ether and nitrogen. Samples were then reconstituted in 5% methanol and 95% assay buffer solution for a total reconstitution volume of 400 µL.

Plasma Hormones
Prior to the cardiac punctures, 1 µL of heparin was added to the microcentrifuge tubes to prevent coagulation. The blood was then centrifuged at 7500 RPM for 10 min in a cold room (Denville Scientific, Inc. Denville 260D). The plasma fraction of the blood was then stored at −80 °C. Approximately 20 µL of plasma from each female was used for the enzyme immunoassay (EIA) and diluted to 1:25 for PROG and E2 and 1:50 for CORT in the commercial assay buffer (see Parallelism and Recovery Determination). Samples were extracted using diethyl ether, dried under nitrogen gas, and reconstituted overnight in 400 µL of assay buffer.

Enzyme Immunoassay
We used commercial EIA kits from Arbor Assays (DetectX® kits, Ann Arbor, MI) to estimate hormone concentrations. We followed the manufacturer's protocol for CORT (catalog number: K014, Donkey anti-Sheep IgG), PROG (catalog number: K025, Goat anti-Mouse IgG), and 17β-estradiol (catalog number: K030, Donkey anti-Sheep IgG) kits for the water samples using 50 µL of reconstituted solution for each well, and the 17β-estradiol serum (catalog number: KB30, Goat anti-Rabbit IgG) kit for the plasma samples using 100 µL of reconstituted solution for each well. Optical densities for the plates were read using a VersaMax microplate reader with SoftMax Pro software with a correction at 450 nm (Molecular Devices, Sunnyvale CA). We then corrected for dilution factor and, for the water samples, for body length by dividing the sample values by SVL (Gabor et al., 2013; Baugh et al., 2018).
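The final correction step can be written as a one-line calculation. The sketch below is only an illustration under assumed units (pg/mL for the plate reading, mm for SVL); the function and parameter names are hypothetical and do not come from the kit protocol.

```python
from typing import Optional


def corrected_concentration(assay_pg_per_ml: float,
                            dilution_factor: float,
                            svl_mm: Optional[float] = None) -> float:
    """Scale a plate reading by its dilution factor and, optionally, by SVL.

    For plasma, pass only the dilution factor (e.g., 25 for a 1:25 dilution).
    For water samples, also pass SVL so the value is expressed per mm of body length.
    """
    if dilution_factor <= 0:
        raise ValueError("dilution_factor must be positive")
    value = assay_pg_per_ml * dilution_factor
    if svl_mm is not None:
        if svl_mm <= 0:
            raise ValueError("svl_mm must be positive")
        value /= svl_mm
    return value
```

For example, a plasma reading of 100 pg/mL at a 1:25 dilution corresponds to 2500 pg/mL in the undiluted sample.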

Parallelism and Recovery Determination
We performed tests for parallelism for all three hormones in plasma and used a comparison of slopes by t-test to statistically compare our standard curves and dilution curves for pooled samples of plasma. From this we determined that a 1:25 dilution was optimal for PROG and E2 and a 1:50 dilution for CORT. We also conducted recovery efficiency estimates for both media types. To do this, we stripped plasma of endogenous hormones by adding 70 mg of dextran-coated charcoal per 1 mL of plasma (Delehanty et al., 2015) and stripped water samples by adding 7 mg of coated charcoal per 1 mL of water. Samples were vortexed, incubated (37 °C) for 4 h, and then repeatedly centrifuged, with the supernatant transferred to a new tube to ensure that no residual charcoal remained in the sample. Water was then spiked with 187.5 µL of commercial CORT (from kit) diluted to 10 pg/µL, 187.5 µL of stock solution of PROG diluted to 3.2 pg/µL, and 93.8 µL of stock solution of E2 diluted to 10 pg/µL. Plasma was spiked with 25 µL of the commercial PROG (32 pg/µL), CORT (100 pg/µL), and E2 (2.4 pg/µL). These stripped and spiked samples were processed using the same extraction and assay procedures as the unknown water and plasma samples, allowing us to calculate the percent recovery. Each plate had a minimum of two stripped and spiked samples distributed evenly across the plate for estimation of inter- and intra-assay coefficients of variation (CV). We accepted the average of duplicate wells. We assayed stripped and unspiked samples in order to evaluate the thoroughness of the stripping method, as well as blank wells to verify that our process was free from contamination. The assays have detection limits and sensitivities, respectively, of 16.9 and 18.6 pg/mL for CORT, 2.05 and 2.21 pg/mL for E2 in plasma, 26.5 and 39.6 pg/mL for E2 in water, and 47.9 and 52.9 pg/mL for PROG.
The cross-reactivity of the antiserum for each kit is as follows: CORT: 100% for corticosterone, 12.3% for desoxycorticosterone, 0.62% for aldosterone, 0.38% for cortisol; E2 in plasma: 100% for E2, 3.2% for estrone sulfate, and 2.5% for estrone; T: 100% for T, 56.8% for 5α-dihydrotestosterone, and 0.27% for androstenedione; E2 in water: 100% for estradiol and 0.73% for estrone; PROG: 100% for progesterone, 172% for 3β-hydroxy-progesterone, 188% for 3α-hydroxy-progesterone, and 147% for 11 -hydroxy-progesterone.
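The recovery and CV calculations described in this section follow standard definitions, which the sketch below illustrates. It is not the authors' analysis script: the function names and the data layout (a dictionary mapping plate labels to replicate readings of the same spiked sample) are hypothetical, and computing the intra-assay CV as the mean of per-plate CVs is one common convention rather than something the text specifies.

```python
from statistics import mean, stdev


def percent_recovery(measured: float, expected: float) -> float:
    """Measured concentration of a stripped-and-spiked sample as % of the spike."""
    return 100.0 * measured / expected


def cv_percent(values) -> float:
    """Coefficient of variation: sample SD divided by the mean, as a percentage."""
    return 100.0 * stdev(values) / mean(values)


def intra_assay_cv(plates: dict) -> float:
    """Mean of the within-plate CVs of the replicate control samples."""
    return mean(cv_percent(readings) for readings in plates.values())


def inter_assay_cv(plates: dict) -> float:
    """CV of the plate means of the control samples across plates."""
    return cv_percent([mean(readings) for readings in plates.values()])
```

Each plate needs at least two control readings for the within-plate CV to be defined, which matches the minimum of two stripped and spiked samples per plate.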

High Performance Liquid Chromatography-Mass Spectrometry
One pooled plasma sample (post-oviposition females) and one water sample (a pre-oviposition female) were processed through the drying stage and then shipped on dry ice to the core services at West Coast Metabolomics (University of California-Davis) for HPLC-MS analysis. Dried samples were resuspended in 100 µL of a 1:1 solution of HPLC grade methanol and acetonitrile and processed using HPLC-MS (Waters Acquity/SciEx QTrap 6500) using a targeted metabolite and steroid panel designed to identify 30 steroid species. Four internal standards were used for calibration: 17-hydroxyprogesterone, E2, PROG, and testosterone.

Statistics
We conducted statistical analyses using SPSS (version 21, IBM) and the statistics package R (R Development Core Team, 2008). Correlations between plasma-borne hormones and water-borne hormones were conducted in SPSS. Because the residuals between plasma-borne hormones and water-borne hormones after oviposition were not normal for CORT and E2, these data were log10-transformed, which improved the normality of the residuals. The residuals between plasma-borne PROG and water-borne PROG after oviposition were also not normal, but log10-transforming the data did not improve the normality of the residuals, so the PROG data remained untransformed throughout data analysis. Pearson's correlations were conducted between the different media types (water, plasma) for CORT and E2, and a Spearman's correlation was conducted between the media types for PROG. Significance for all correlations and t-tests was determined in SPSS.
Intra-class correlation coefficients (ICC) were calculated in SPSS by dividing the resulting among-group variance by the sum of within-group and among-group variance. An ANOVA with subject as the independent variable was used to calculate the treatment and error variance estimates and determine statistical significance (Lessells and Boag, 1987). Repeatability estimates were also derived from linear mixed-effects models in R using the lme4 and rptR packages, with subject used as a random effect. Repeatability (i.e., intraclass correlation coefficient) was calculated as the among-individual variance divided by the sum of within-(i.e., residual) and among-individual error terms. The 95% confidence intervals were estimated in R using parametric bootstrapping of the data with 1000 cycles (see Dingemanse and Dochtermann, 2013).
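For readers unfamiliar with the ANOVA-based estimate, the following sketch shows the calculation from Lessells and Boag (1987) in plain Python. It is an illustration only, not the SPSS or rptR workflow used here; the function name and the dictionary layout (subject identifier mapped to that subject's repeated measurements) are hypothetical.

```python
from statistics import mean


def anova_repeatability(groups: dict) -> float:
    """Repeatability r = s2_among / (s2_within + s2_among) from a one-way ANOVA.

    groups maps each subject to a list of its repeated measurements.
    Unequal numbers of measurements per subject are handled through n0.
    """
    a = len(groups)                                   # number of subjects
    sizes = [len(v) for v in groups.values()]
    total_n = sum(sizes)
    grand_mean = sum(sum(v) for v in groups.values()) / total_n

    ss_among = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

    ms_among = ss_among / (a - 1)
    ms_within = ss_within / (total_n - a)             # within-individual variance

    # Effective group size when sample sizes differ among subjects.
    n0 = (total_n - sum(n * n for n in sizes) / total_n) / (a - 1)

    var_among = (ms_among - ms_within) / n0           # among-individual variance
    return var_among / (var_among + ms_within)
```

Values near 1 indicate that most of the variance lies among females; the estimate can fall below 0 when within-individual variance exceeds among-individual variance.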
Linear mixed effects models (LMM) for each of the phonotaxis measures were created in R using the lme4 and lmerTest packages. Null models were created for each behavior by including only subject as a random effect. Subsequent models were created using hormone data, latency to oviposition, and clutch mass. Models were selected through two stepwise methods: backward elimination and forward selection. In backward elimination, all three hormones, latency to oviposit, and clutch mass were initially added to the model for a given behavioral measure. The variable with the highest p-value was then eliminated from the model, and a new model was made until all remaining variables had significant effects (p < 0.05). For forward selection, all five variables were included in separate models, and if a model had a significant effect, another variable was added until the added variables had non-significant effects. These model selection procedures were conducted instead of testing a set of predetermined candidate models because the relationships among hormones, reproductive urgency, and investment have not been investigated previously. These stepwise selection approaches are an effective way to address collinearity through iteratively removing or adding correlated predictor variables (Smith et al., 2009). All models had subject as a random effect. Models from both selection methods were then compared to each other (if different) and to a null model that was specified with only subject as a random effect by selecting the model with the lowest AICc value (Supplemental Materials S3). The model for latency to exit the origin included trials in which the female made a choice (regardless of whether it was a reversal, non-reversal, or a foul) or crossed the decision boundary toward the more attractive stimulus before the stimuli were altered because these behaviors indicate that the female was performing phonotaxis when leaving the origin.
The models for latency to approach the decision boundary and for pause duration included trials in which the female crossed the decision boundary toward the more attractive stimulus because these were trials in which the stimuli were altered, and could therefore be used to determine the female's behavior before the stimulus alteration (latency to approach boundary) and immediately after the stimulus alteration (pause duration). The models for latency to choice and latency to choice after switch included only trials in which the female approached the more attractive stimulus and did not reverse because reversals resulted in significantly longer latencies to choices. Significance for fixed effects and AICc values for the LMMs were determined in R. Both conditional and marginal R 2 values for the LMMs were calculated in R using the MuMIn and lme4 packages (Nakagawa and Schielzeth, 2012).

Repeatabilities were calculated using intraclass correlation (ICC) and by linear mixed effects models (LMM). Statistical significance (p-values) was calculated from the ICC estimates.
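The AICc-based comparison of candidate models with the null model can be illustrated as follows. This is a sketch under stated assumptions: the helper names `aicc` and `best_model`, the model labels, and all log-likelihood values are hypothetical, and the sample size `n` is taken here as the number of observations, which is one common convention for mixed models but is not specified in the text.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC.

    log_likelihood: maximized log-likelihood of the fitted model
    k: number of estimated parameters
    n: sample size (assumed here to be the number of observations)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def best_model(candidates, n):
    """Return the name of the lowest-AICc model and all scores.

    candidates: dict mapping model name -> (log_likelihood, k)
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fits: a null model and a model with two hormone predictors.
candidates = {
    "null": (-110.0, 3),
    "cort_e2": (-100.0, 5),
}
name, scores = best_model(candidates, n=40)
print(name)
```

The correction term grows as the number of parameters approaches the sample size, so AICc penalizes richer models more heavily than AIC when data are limited.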

Behavior
No females reversed in the control condition, and no female reversed more than once in the experimental looped switch condition. We therefore treated the experimental looped switch trial as though it were a sixth replicate of the experimental single switch condition. Thirty-two females successfully completed at least one experimental switch trial, and 11 of these females reversed in at least one trial (number of females that reversed: 0 reversed in all 6 trials; 0 reversed in 5 of 6 trials; 1 reversed in 4 of 6 trials; 2 reversed in 3 of 6 trials; 1 reversed in 2 of 6 trials; 7 reversed in 1 of 6 trials; 21 reversed in 0 of 6 trials). Females that reversed did not differ significantly from those that did not in SVL (t 35 = 0.358; p = 0.723), mass (t 35 = 1.907; p = 0.065) or clutch mass (t 35 = 0.393; p = 0.697). Most of the mate choice behaviors exhibited significant repeatability (Table 1). Reversal and nonreversal trials differed significantly in a set of other behavioral measures (Supplemental Materials S4): (1) the total duration of trials was longer for reversal trials (LMM: t 170.1 = 6.875, p < 0.001); and (2) the total path length was longer for reversal trials (LMM: t 169.8 = 10.73, p < 0.001). There were no significant differences between reversal and non-reversal trials in latency to exit the origin (LMM: t 34.8 = 0.38, p = 0.71) or in pause duration (LMM: t 29.1 = 1.1, p = 0.28). Lastly, date of testing did not significantly explain variance in any of the behavioral measures (all p > 0.05).

Validation, Parallelism, and Recovery
There was a significant positive correlation between concentrations in water and plasma for all three hormones. Drying method did not affect the estimated concentrations and was thus omitted from the final models. Average recovery efficiencies (mean ± CV) were similar for plasma and water-borne CORT (plasma: 45.1 ± 28.1%; water: 44.9 ± 5.6%) and E 2 (plasma: 37.9 ± 13.2%; water: 49.4 ± 18.9%), but were high for PROG (plasma: 139.1 ± 8.4%; water: 178.0 ± 9.1%), probably due to the high cross-reactivity with multiple metabolites. Inter- and intra-assay coefficients of variation were generally low with the exception of the inter-assay CV for plasma CORT (20.4%) (see Supplemental Materials S9).
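Recovery efficiency and the coefficient of variation are simple ratios, and a short sketch makes the "mean ± CV" notation concrete. The text does not spell out the calculation, so this example assumes recovery is the measured amount of a known hormone spike divided by the amount added, and that CV is the sample standard deviation divided by the mean. The function names and input values are hypothetical.

```python
from statistics import mean, stdev


def recovery_percent(measured, expected):
    """Percent of a known spiked amount that the assay recovered (assumed definition)."""
    if expected <= 0:
        raise ValueError("expected amount must be positive")
    return 100.0 * measured / expected


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical replicate recoveries for one hormone (not data from the study).
replicates = [recovery_percent(m, 10.0) for m in (4.0, 5.0)]
print(mean(replicates), round(coefficient_of_variation(replicates), 1))
```

Recoveries well above 100%, as reported for PROG, mean the assay detected more than was added, which is consistent with the cross-reactivity explanation given in the text.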

Hormones, Oviposition, and Clutch Mass
To analyze how gonadal and glucocorticoid hormones were related to reproductive urgency and reproductive investment, we performed correlations between each of the steroids and paired t-tests comparing hormone concentrations before and after oviposition within females. Both E 2 concentrations and PROG concentrations were positively correlated with latency to oviposit, suggesting they are related to reproductive urgency (E 2 : R 2 = 0.23; N = 28, p = 0.01; PROG: R 2 = 0.28; N = 28, p = 0.005; Figure 3). For the correlation between PROG and latency to oviposit, one point was removed due to its high Cook's distance. Removal of this point did not influence the overall significance of the correlation, but did increase the R 2 value (R 2 = 0.40; N = 27; p = 0.035). When this outlier was excluded, the best fit model for latency to oviposit included both of these hormones (R 2 = 0.40; N = 27; p = 0.002; main effects: PROG, p = 0.015; E 2 , p = 0.039), but not CORT. With the outlier included, the overall model was significant, though the main effect for PROG was no longer significant and the overall fit of the model was reduced (omnibus: R 2 = 0.32; N = 28; p = 0.008; PROG main effect: p = 0.07; E 2 main effect: p = 0.02).
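The influence diagnostic used to flag the PROG outlier can be illustrated for a simple one-predictor regression. This is a minimal sketch rather than the analysis run in the study: the function name `cooks_distances` and the data are hypothetical, and the text does not state which cutoff was used to judge a value as high.

```python
def cooks_distances(x, y):
    """Cook's distance for each point in a simple linear regression of y on x."""
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    if n != len(y) or n <= p:
        raise ValueError("x and y must have equal length greater than 2")

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary")

    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in residuals) / (n - p)

    # Leverage of each point (diagonal of the hat matrix).
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    return [
        (e * e / (p * mse)) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


# Hypothetical data: the last point has an extreme x value and a poor fit.
hormone = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
latency = [1.0, 2.0, 3.0, 4.0, 5.0, 0.0]
d = cooks_distances(hormone, latency)
print(d.index(max(d)))  # 5
```

A point earns a large Cook's distance only when it combines high leverage with a large residual, which is why a single female with unusually high PROG can dominate a fitted relationship.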

Hormones and Behavior
To evaluate the association between hormone concentrations and phonotaxis behaviors, an LMM was first fitted to overall choice latency for trials in which frogs exhibited non-reversal choices (because reversals resulted in higher latencies compared to non-reversals). Using our model selection procedure, the best model explaining latency to choice was the one that only included CORT and E 2 (Figure 4; AICc = 1591.9, CORT: t 26.0 = −2.40, p = 0.002, coefficient = −127.23; E 2 : t 25.4 = 2.33, p = 0.028, coefficient = 66.42; R 2 marginal = 0.09, R 2 conditional = 0.33; Null model AICc = 1667.5). We were interested in analyzing certain behaviors within the process of a female making a choice, such as how long a female sampled mates before and after the stimuli were altered, because this would provide more detailed information about the female's mate sampling behavior. We therefore analyzed the components of the female's mate choice behavior with significant and non-zero repeatabilities. The best model for latency to approach boundary was also the model including only CORT and E 2 (Supplemental Materials S10; AICc = 2103.37, CORT: t 26.8 = 2.84, p = 0.0008, coefficient = −93.41, E 2 : t 26.6 = 2.43, p = 0.02, coefficient = 100.43, R 2 marginal = 0.07, R 2 conditional = 0.16; Null model AICc = 2160.12). The best overall model for pause duration, one of the behavioral proxies for mate sampling, in all trials in which the frog crossed the approach boundary was the model that only included PROG (Figure 5; AICc = 1499.0, t 23.6 = 2.83, p = 0.009, coefficient = 0.21, R 2 marginal = 0.05, R 2 conditional = 0.08; Null model AICc = 1504.1). Note that this statistical correlation was driven by a female that exhibited high PROG; adding a fixed effect in the model for "outlier" resulted in a significant effect (t = 2.1, p = 0.04) and that model then yielded a non-significant effect of the predictor variable PROG (t = 0.30, p = 0.76).
PROG was the factor present in the best overall models for latency to choice after stimulus alteration in nonreversal trials (Supplemental Materials S11; AICc = 1381.42, t 26.4 = 2.63, p = 0.014, coefficient = −0.69, R 2 marginal = 0.08, R 2 conditional = 0.26; Null model AICc = 1385.6). Note that this statistical correlation was driven by a female that exhibited high PROG; adding a fixed effect in the model for "outlier" resulted in a significant effect (t = 3.9, p = 0.001) and that model then yielded a non-significant effect of the predictor variable PROG (t = 0.94, p = 0.35). The best model for latency to exit the origin, another behavior that may be associated with mate sampling, was the one that only included CORT and PROG in reversal and non-reversal trials (Figure 6; AICc = 2749.04, CORT: t 25.2 = −2.644, p = 0.00139, coefficient = −53.95; PROG: t 29.8 = 2.09, p = 0.044, coefficient = −0.86, R 2 marginal = 0.11, R 2 conditional = 0.27; Null model AICc = 2813.3). The finding that PROG was in the models associated only with the behaviors associated with mate sampling and not all of the behaviors or latency to choose overall suggests that the effects of PROG are related to mate sampling.

Oviposition, Clutch Mass and Behavior
To evaluate the association between a female's reproductive state (urgency and investment) and behavior, we considered latency to oviposit and clutch mass for the models for all of the mate choice behaviors. However, neither of these factors was included in the best overall models for any of the behaviors as described above. None of the hormones had significant or non-zero repeatabilities, indicating that hormonal state is highly variable among frogs during the transition from pre- to post-oviposition. In addition, none of the hormones were correlated with residual body mass.

FIGURE 4 | Scatterplots depicting the relationships between (A) pre-oviposition water-borne estradiol (log 10 transformed) and mean choice latency, and (B) pre-oviposition water-borne corticosterone (log 10 transformed) and mean choice latency. The y-axis represents the residuals of the linear mixed effects model for latency to choice and the other hormone as the only fixed effect, and subject as a random effect. Solid lines depict significant relationships (p < 0.05). Error bars depict standard deviations.

DISCUSSION
We show that eastern gray treefrogs dynamically sample mates: approximately one-third of females exhibit this temporal updating behavior, similar to what has been reported in túngara frogs (30%; Baugh and Ryan, 2010a), though lower than that reported in another population of gray treefrogs (67%). We confirmed that this temporal updating strategy comes with direct costs in terms of increased time and locomotive burdens, as dynamic females take longer and move greater distances compared to females that commit to their initial choice. It is presently unknown whether such costs have fitness consequences, but given that predation risk is often elevated at anuran leks (Ryan et al., 1981), these time and locomotive costs may expose females to higher risk. We also show differences in mate sampling related to reproductive condition and hormone status. Although these factors did not impact whether females reversed their initial mate choice decision (our prediction), we found, for example, that females with higher levels of PROG had more delayed oviposition (i.e., less urgent) and exhibited a longer pause following stimulus manipulation (i.e., sampled mates longer). Note that this finding for PROG and pause hinges on the inclusion of a female that had high PROG concentrations, and there is debate regarding whether hormone outliers should be included or excluded from analyses given the often non-linear nature of hormone-behavior relationships (Williams, 2008). To complement these condition-dependent contributions to female sampling, we also demonstrated that females exhibit significant repeatability (i.e., the proportion of phenotypic variance accounted for by among-individual variance) in a variety of mate choice behaviors, indicating the relevance of simultaneously measuring both within- and among-individual variation (reviewed in Jennions and Petrie, 1997; Bell et al., 2009).

FIGURE 6 | Scatterplots illustrating the relationships between (A) pre-oviposition water-borne corticosterone (log 10 transformed) and mean latency to exit the origin and (B) pre-oviposition water-borne progesterone and mean latency to exit the origin. Each hormone is represented on the x-axis, while the y-axis represents the residuals of the linear mixed effects model for latency to exit the origin with the other hormone as the only fixed effect and frog identification as a random effect. Solid lines indicate a significant relationship. Error bars represent standard deviations.

Support for the Reproductive Urgency Hypothesis
Overall our results provide partial support for the reproductive urgency hypothesis, and our findings suggest this aspect of female condition is uncorrelated with reproductive investment. There was no direct relationship between a female's reproductive urgency and her mate sampling behavior; however, we demonstrated (1) that females vary in their urgency (oviposition latency) and (2) found evidence of an indirect relationship between urgency and mate choice: there was a negative correlation between both gonadal hormones (PROG and E 2 ) and reproductive urgency (i.e., females with higher PROG and E 2 were in a less reproductively urgent state) and there was a positive correlation with both gonadal hormones and several of the mate choice/sampling behaviors. For example, females that have higher concentrations of E 2 make slower choices overall, suggesting that they may be less motivated to make a choice immediately, potentially because of being in a less urgent reproductive state. When decomposing choice latency into component parts, a similar relationship is present between E 2 and the latency to reach the approach boundary, as before the stimuli are altered, females with higher E 2 are slower to approach the more attractive stimulus. These results suggest that E 2 may be a key hormone involved in modulating female motivation during mate choice, at least in a static mate choice environment. While E 2 had an effect on how quickly a female reached the decision boundary and made an overall choice, PROG seemed to have an effect on behaviors more related to mate sampling. Females with higher PROG took longer to exit the origin, suggesting that at this initial timepoint, they might be sampling mates more thoroughly before executing a choice. Likewise, females with higher PROG paused for longer after the stimuli were altered, potentially to resample the dynamic mate environment. 
It is unknown whether females continue to sample while moving toward a male, but estimating pause duration immediately after stimulus manipulation offers a potentially informative estimate of sampling. Because females with higher concentrations of PROG may be in a less urgent reproductive state (e.g., they have longer latencies to oviposition), one interpretation of the positive relationship between PROG and pause duration is that less urgent females are less motivated. Alternatively, these females may be more discriminating. The latter interpretation appears more likely given that all females in this sample exhibited robust positive phonotaxis toward the more attractive stimulus, a strong indication of sexual motivation in female anurans (Ryan, 1985).
The finding that females farther from oviposition had higher concentrations of E 2 and PROG may appear somewhat surprising because previous research in frogs has shown that E 2 and PROG are relatively low prior to amplexus, are high during amplexus, and decline after amplexus (Chakraborty and Burmeister, 2009; Gordon and Gerhardt, 2009), suggesting that the concentrations of these gonadal hormones might peak at oviposition. Similar patterns have been observed in rats (Sakuma, 2008) and songbirds (Maney and Pinaud, 2011). However, the finer temporal patterns in hormone concentrations within individuals have not been explored previously, and these may not map onto the coarser timelines at the population level studied previously. Here we report that this temporal pattern may be more complicated, with E 2 and PROG apparently peaking before oviposition and declining sharply after oviposition. The sharp declines in all three steroids after oviposition may also erode among-individual differences across this life history transition, thus explaining our lack of repeatability in these hormone concentrations.
Another unexpected result given the relationship between the gonadal hormones and reproductive urgency was the relationship between PROG and E 2 . In the present study concentrations of E 2 and PROG were both positively correlated with latency to oviposition and thus negatively related with reproductive urgency. However, water-borne E 2 and PROG were not positively correlated with each other before oviposition, which contrasts with the findings for plasma in gray treefrogs (Gordon and Gerhardt, 2009). One potential reason for this difference could be that the two hormones are influenced independently by factors other than reproductive urgency. A study in female túngara frogs found that E 2 concentrations were sensitive to the social environment, with concentrations increasing in response to hearing advertisement calls. Therefore, E 2 could be dependent on multiple factors, including urgency and social environment. Thus, the lack of correlation between E 2 and PROG in this experiment could be because there are a variety of internal and external factors influencing these hormones individually rather than in concert.
Differences among studies might also arise from comparing excreted hormone concentrations, as done in this study, and circulating hormone concentrations, which may capture different aspects of an animal's endocrine state. For example, it is possible that clearance rates, and thus excreted concentrations, may themselves vary as a function of reproductive status and circulating concentrations, though this requires empirical study.

Mate Choice Behavior and Glucocorticoids
A surprising result from the present study was the relationship that CORT had with oviposition status and mate choice behavior. Similar to the gonadal hormones, CORT concentrations were elevated before oviposition and declined after oviposition, and females with elevated CORT initiated mate choice sooner. Little is known about the relationship between CORT and oviposition, though one study in scincid lizards, Bassiana duperreyi, did find that injections with CORT led to premature oviposition of eggs (Radder et al., 2008), suggesting that natural CORT concentrations may be elevated as females advance to oviposition. Behaviorally, a study by Davis and Leary (2015) found that CORT-injected female frogs were less discriminating of male vocalizations. A possible explanation for these findings is that females with higher CORT are in a state in which they are close to ovipositing and therefore might benefit from less choosy behavior. However, there are two potential problems with this hypothesis. First, although preoviposition frogs in our study did have higher CORT, there was no relationship between latency to oviposit and CORT, suggesting that this CORT profile may not be specific to reproductive urgency. Second, the frogs were handled differently prior to sample collection. In the preoviposition water bath, frogs were handled frequently between trials for up to several hours during their mate choice testing, possibly inducing a stress response leading to an increase in CORT, while in the post-oviposition water bath, the frogs were not handled during the 14 h beforehand. A study measuring CORT at different time points along a female's reproductive timeline with minimal handling could verify whether this hormone is naturally elevated prior to oviposition. An alternate explanation for why CORT is higher in females before oviposition is that CORT indicates metabolic needs, which could be higher before oviposition.
A study in captive zebra finches that measured CORT and metabolic rate at higher and lower temperatures found that there was a positive quadratic relationship between CORT concentrations and metabolic rate (Jimeno et al., 2017). Thus, it is possible that females may have a higher metabolic demand before oviposition, leading to an increase in CORT concentrations.
Nevertheless, CORT was correlated with overall choice latency, latency to reach the decision boundary, and one of the behaviors implicated in mate sampling, namely how quickly the frog initiated phonotaxis. Previously, CORT has been shown to influence female discrimination; for example, Davis and Leary (2015) showed that female green tree frogs with low CORT preferred calls broadcast at high call rates while females with high CORT did not display this preference. In the present study there was no relationship with discrimination, but females with higher CORT initiated phonotaxis faster and overall selected mates faster, which suggests reduced mate sampling. One important distinction between these two studies is that Davis and Leary experimentally manipulated CORT in their intact animals, while the current study was correlational. An alternate explanation for the reduced latency to exit the origin and make a choice is that the increase in CORT led to an increase in locomotor activity. For example, in white-crowned sparrows, experimentally increasing plasma CORT elevates locomotor activity (Breuner et al., 1998). Thus, female gray treefrogs with higher levels of CORT could have left the origin more quickly due to a stimulatory effect of CORT. Notably, because CORT was not correlated with PROG before oviposition, it appears that the effects on mate sampling that were found in relation to reproductive urgency are separate from this CORT-related effect.

Validation of Water-Borne Hormone Measurements
Our validation of the water-borne hormone method demonstrates that a 1-h water bath captures biologically meaningful estimates of circulating steroid concentrations, yielding concentrations correlated with plasma levels. Thus, we demonstrate that this assay permits repeated, non-invasive measurement of hormones throughout a female's reproductive timeline (e.g., throughout a life history transition), thus avoiding possible impacts of hormone sampling on behavior. This method has recently been validated in other amphibians (Gabor et al., 2013; Baugh et al., 2018) and could offer a practical method for field and conservation biologists.

CONCLUSIONS
The present study provides evidence that acute and transient variation in gonadal and glucocorticoid hormones is associated with individual differences in mate choice behaviors. Females that were in a more urgent state had lower concentrations of gonadal hormones, thus linking individual differences in reproductive state and gonadal hormones. Further experimental manipulations are needed to evaluate if these relationships are causal. For example, the directionality of the relationship between hormones and mate sampling behaviors could be evaluated through experimental studies that manipulate the concentrations of gonadal and glucocorticoid hormones and measure effects on behavior.
Overall, the results suggest that individual differences in reproductive urgency could serve as an important source of diversifying inter-sexual selection. Given the observed natural variation in female urgency and its consequences on mate sampling, it is conceivable that under natural conditions, more urgent females will sample mates less thoroughly, thereby reducing the strength of selection on male traits. We have shown here that this functional implication may be underpinned proximately by gonadal hormones, with more urgent females having lower PROG concentrations and sampling mates less. Additionally, glucocorticoids appear to have the opposite effect on mate sampling, with higher CORT females sampling mates less before initiating phonotaxis. This may indicate that hormones produced by the HPA/I axis (CORT) and HPG axis (PROG and E2) have antagonistic effects on mate sampling (Toufexis et al., 2014). This and other studies show female mate choice depends on the female's condition in addition to the context (e.g., attractiveness and availability of mates). Females are not just the agents of selection; they also face selective pressures to find an attractive mate. Thus, individual variation in mate choice behavior may be viewed as balancing the indirect benefits associated with reproducing with preferred mates (Welch et al., 1998) and the direct benefits (time savings) associated with sampling, which may be constrained by acute reproductive condition. Hence, variation among and within females in physiological condition may impact the strength of selection on male traits and should be included in models of sexual selection (Jennions and Petrie, 1997; Hunt et al., 2005; Cotton and Small, 2006).

AUTHOR CONTRIBUTIONS
AB conceived of the project. BB and AB wrote the manuscript. BB, GF, FG, JM, CS-P, DP, CY, and AB participated in data collection and analysis.