<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychol.</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2021.674815</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>The Effect of a Dilemma on the Relationship Between Ability to Identify the Criterion (ATIC) and Scores on a Validated Situational Interview</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Latham</surname> <given-names>Gary P.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<xref ref-type="author-notes" rid="fn002"><sup>&#x02020;</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Itzchakov</surname> <given-names>Guy</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1226778/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Rotman School of Management, University of Toronto</institution>, <addr-line>Toronto, ON</addr-line>, <country>Canada</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Human Services, University of Haifa</institution>, <addr-line>Haifa</addr-line>, <country>Israel</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Aharon Tziner, Netanya Academic College, Israel</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Kevin Murphy, University of Limerick, Ireland; Edna Rabenu, Netanya Academic College, Israel; Sylvia Roch, University at Albany, United States; Milton Hakel, Bowling Green State University, United States</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Gary P. Latham <email>Latham&#x00040;rotman.utoronto.ca</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Organizational Psychology, a section of the journal Frontiers in Psychology</p></fn>
<fn fn-type="other" id="fn002"><p>&#x02020;The first author has cross-appointments in the Departments of Psychology, Nursing, and Industrial Relations</p></fn></author-notes>
<pub-date pub-type="epub">
<day>27</day>
<month>07</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>12</volume>
<elocation-id>674815</elocation-id>
<history>
<date date-type="received">
<day>02</day>
<month>03</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>08</day>
<month>06</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2021 Latham and Itzchakov.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Latham and Itzchakov</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>Four experiments were conducted to determine whether participants&#x00027; awareness of the performance criterion on which they were being evaluated results in higher scores on a criterion valid situational interview (SI) where each question either contains or does not contain a dilemma. In the first experiment there was no significant difference between those who were or were not informed of the performance criterion that the SI questions predicted. Experiment 2 replicated this finding. In each instance the SI questions in these two experiments contained a dilemma. In a third experiment, participants were randomly assigned to a 2 (knowledge/no knowledge provided of the criterion) X 2 (SI dilemma/no dilemma) design. Knowledge of the criterion increased interview scores only when the questions did <italic>not</italic> contain a dilemma. The fourth experiment revealed that including a dilemma in a SI question attenuates the ATIC-SI relationship when participants must identify rather than be informed of the performance criterion that the SI has been developed to assess.</p></abstract>
<kwd-group>
<kwd>situational interview</kwd>
<kwd>employee selection</kwd>
<kwd>recruitment</kwd>
<kwd>human resource management</kwd>
<kwd>assessment</kwd>
</kwd-group>
<counts>
<fig-count count="2"/>
<table-count count="2"/>
<equation-count count="0"/>
<ref-count count="39"/>
<page-count count="12"/>
<word-count count="10292"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>The employment interview has long been known to be a deeply flawed method for selecting individuals (Wagner, <xref ref-type="bibr" rid="B37">1949</xref>; Ulrich and Trumbo, <xref ref-type="bibr" rid="B36">1965</xref>). In many instances, it amounts to little more than an unstructured conversation between two or more individuals. The result is a selection technique with low reliability and validity.</p>
<p>In the 1980s, two structured interview techniques were developed that overcame these issues. All job applicants are asked the same job-related questions derived from a systematic job analysis. The result is two reliable, valid methods for interviewing candidates. These two methods are the situational interview (SI; Latham, <xref ref-type="bibr" rid="B24">1989</xref>) and the patterned behavior description interview (PBDI; Janz, <xref ref-type="bibr" rid="B16">1989</xref>). The premise underlying the SI is that intentions predict behavior. Applicants are asked to respond to questions derived from a job analysis by explaining what they would do in sundry situations. The premise underlying the PBDI is that among the best predictors of future behavior is an individual&#x00027;s past behavior. A meta-analysis of the research on the effectiveness of these two interview techniques revealed that the SI has higher overall mean criterion-related validity (<italic>M</italic> = 0.23) compared to the PBDI (<italic>M</italic> = 0.18) for predicting an individual&#x00027;s job performance (Culbertson et al., <xref ref-type="bibr" rid="B5">2017</xref>). Similarly, Levashina et al. (<xref ref-type="bibr" rid="B29">2014</xref>), in a review of the literature, found that past behavior interview questions had lower group differences than situational interviews (<italic>d</italic> = 0.10, <italic>d</italic> = 0.20, respectively). Hence, the present research focused on the SI and the extent to which knowing, that is, being informed of vs. identifying, the job performance criterion or criteria the SI was developed to assess improves an individual&#x00027;s performance in this interview.</p>
<sec>
<title>Ability to Identify Criteria (ATIC)</title>
<p>Kleinmann (<xref ref-type="bibr" rid="B20">1993</xref>) and colleagues (e.g., Ingold et al., <xref ref-type="bibr" rid="B13">2015</xref>) found that the ability to correctly identify the job criterion that is being predicted in a criterion valid SI increases both an individual&#x00027;s score on the SI and subsequent performance on the job. They also made this claim for assessment centers (e.g., Jansen et al., <xref ref-type="bibr" rid="B14">2011</xref>). In short, they concluded it is ability to identify criteria (ATIC) that not only increases an individual&#x00027;s performance in these two selection procedures, but predicts performance on the job as well. This is because ATIC is said to enable job candidates to &#x0201C;provide evaluation relevant answers in the interview, as well as demonstrate evaluation relevant behaviors on the job&#x0201D; (Ingold et al., <xref ref-type="bibr" rid="B13">2015</xref>, p. 389). However, with regard to assessment centers, it is noteworthy that ratings of performance on non-transparent dimensions were shown to be more criterion valid than ratings from assessment centers with transparent dimensions (Ingold et al., <xref ref-type="bibr" rid="B12">2016</xref>). Based on their research, K&#x000F6;nig et al. (<xref ref-type="bibr" rid="B22">2007</xref>) similarly concluded that selection interviews should not be made transparent.</p>
<p>Findings from research on ATIC have far-reaching implications for human resource management. A downside is that the research suggests that, similar to self-report personality tests, the SI is susceptible to applicants &#x0201C;faking&#x0201D; their responses. Faking may be especially problematic regarding ATIC if applicants take the time to discover an organization&#x00027;s values/culture, strategy, and desired job competencies prior to applying for a job, as this would increase their likelihood of being able to identify the criterion or criteria on which they will be assessed.</p>
<p>The upsides of an individual&#x00027;s ability to identify the job performance criterion being assessed arguably outweigh this downside. ATIC is an individual difference variable. Hence, ATIC is advantageous for some job applicants because, as noted earlier, those who score high on this measure &#x0201C;are more likely to discern criteria for success both in the SI and on the job&#x0201D; (Ingold et al., <xref ref-type="bibr" rid="B13">2015</xref>, p. 389). This, in turn, not only enables applicants to perform well in a SI, but it also enables them to &#x0201C;demonstrate evaluation-relevant behaviors on the job&#x0201D; (p. 389). Ingold et al. gave the example of an individual who recognizes the importance of cooperativeness as a performance criterion and then emphasizes cooperation when responding to a SI question and subsequently making &#x0201C;efforts to cooperate (rather than compete) with coworkers on the job&#x0201D; (p. 389).</p>
<p>ATIC has been defined as a form of context-specific social effectiveness. Tangential evidence supporting the Ingold et al. (<xref ref-type="bibr" rid="B13">2015</xref>) finding that ATIC affects the relationship between the SI and job performance can be found in the study by Sue-Chan and Latham (<xref ref-type="bibr" rid="B34">2004</xref>). They found that emotional intelligence mediated the relationship between the SI and teamwork skills. This finding is consistent with Griffin&#x00027;s (<xref ref-type="bibr" rid="B8">2014</xref>) assertion that social understanding, as noted above, predicts ATIC scores.</p>
<p>Melchers et al. (<xref ref-type="bibr" rid="B31">2012</xref>) stated that there are two possible reasons why ability to identify the performance criterion that is being assessed in an interview results in better performance. The first possibility is that some candidates have the ability to provide more accurate ideas than others. A second possibility is that some candidates merely generate more ideas in general regarding the performance dimensions that are being assessed. Melchers et al.&#x00027;s (<xref ref-type="bibr" rid="B31">2012</xref>) analysis revealed that it is the first possibility, namely, the ability to provide more <italic>accurate</italic> ideas of what is being assessed that predicts better performance in the job interview.</p>
<p>It might be argued, based on Kleinmann et al.&#x00027;s studies (e.g., Kleinmann et al., <xref ref-type="bibr" rid="B21">2011</xref>), that ATIC is a proxy for general mental ability (GMA). Given that GMA is among the strongest single predictors of job performance, it might affect an individual&#x00027;s ability to infer what is being assessed by a SI. However, in a criterion-related validity study involving managers, where the criterion was an assessment of teamwork skills, the correlation between the SI and GMA was not significant (Sue-Chan and Latham, <xref ref-type="bibr" rid="B34">2004</xref>). This finding is consistent with a series of meta-analyses conducted by Cortina et al. (<xref ref-type="bibr" rid="B4">2000</xref>). They found that highly structured interviews have incremental validity beyond cognitive ability. Furthermore, a meta-analysis revealed a weak relationship (<italic>r</italic> = 0.09) between the SI and cognitive ability (Culbertson et al., <xref ref-type="bibr" rid="B5">2017</xref>).</p>
<p>There are reasons to question the findings on ATIC as an explanation for the criterion related validity of an SI. In the domain of training, supervisors who were given the learning points that they were asked to demonstrate performed no better than those in the control group where this information was not provided (Latham and Saari, <xref ref-type="bibr" rid="B25">1979</xref>). In short, knowledge alone was not sufficient for bringing about a desired change in behavior.</p>
<p>In response to Kleinmann&#x00027;s (<xref ref-type="bibr" rid="B20">1993</xref>) and Griffin&#x00027;s (<xref ref-type="bibr" rid="B8">2014</xref>) call for research on ATIC under both transparent and non-transparent performance criterion conditions, the purpose of the present research was to examine the possibility that the alleged benefit of ATIC for answering SI questions is based on inappropriate research methodology, namely, the failure to include a dilemma in each SI question. To do so, we first briefly discuss the correct development of a SI. We then discuss the methodology used by Ingold et al. (<xref ref-type="bibr" rid="B13">2015</xref>) and Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>) to develop an SI. Finally, we present the results of our four experiments. In the first two, individuals were informed of the performance criterion that the SI predicted. In the third experiment, interview scores on SI questions that did vs. did not contain a dilemma were examined. In the fourth experiment, a direct measure of ATIC was employed. We did so to expand previous work by testing whether ATIC increases scores on SI questions that include a dilemma relative to SI questions that do not include a dilemma, as was found in previous work (e.g., Ingold et al., <xref ref-type="bibr" rid="B13">2015</xref>; Oostrom et al., <xref ref-type="bibr" rid="B32">2016</xref>). The direct measure of ATIC was consistent with extant ATIC procedures (e.g., Oostrom et al., <xref ref-type="bibr" rid="B32">2016</xref>).</p>
</sec>
<sec>
<title>The Situational Interview</title>
<p>Consistent with Campion, Palmer and Campion&#x00027;s (<xref ref-type="bibr" rid="B3">1997</xref>) typology, the SI is a structured interview in that the questions are based on a job analysis, the same questions are asked of each interviewee, prompting an individual is not allowed, notes are taken by two or more interviewers, and the same interviewers are used across interviewees. The interviewers use a predetermined scoring guide to evaluate each interviewee&#x00027;s answer to an interview question.</p>
<p>The premise of the SI is that intentions predict behavior (Latham et al., <xref ref-type="bibr" rid="B26">1980</xref>; Latham, <xref ref-type="bibr" rid="B24">1989</xref>). Intentions are &#x0201C;a representation of a future course of action to be performed&#x02026;a proactive commitment to bringing them (future actions) about&#x0201D; (Bandura, <xref ref-type="bibr" rid="B1">2000</xref>, p. 5). Intentions are generally viewed as the direct motivational instigator of behavior (Klehe and Latham, <xref ref-type="bibr" rid="B19">2006</xref>; Locke and Latham, <xref ref-type="bibr" rid="B30">2013</xref>).</p>
<p>The SI (Latham, <xref ref-type="bibr" rid="B24">1989</xref>; Latham and Sue-Chan, <xref ref-type="bibr" rid="B27">1999</xref>) has five distinct features. First, as noted earlier, it is based on a systematic job analysis, typically the critical incident technique (Flanagan, <xref ref-type="bibr" rid="B7">1954</xref>). Consistent with Wernimont and Campbell&#x00027;s (<xref ref-type="bibr" rid="B38">1968</xref>) argument to develop predictors consistent with the performance criteria, the performance criteria (e.g., Behavioral Observation Scales/BOS; Latham and Wexley, <xref ref-type="bibr" rid="B28">1977</xref>) and the SI questions are developed from the same job analysis.</p>
<p>Second, the context, behavior, and outcomes described in a critical incident are turned into a question: &#x0201C;What would you do in this situation?&#x0201D; Each SI question contains a dilemma.</p>
<p>In a valid SI, the dilemma confronting an individual is having to choose between two or more mutually exclusive courses of action (Latham and Sue-Chan, <xref ref-type="bibr" rid="B27">1999</xref>; Levashina et al., <xref ref-type="bibr" rid="B29">2014</xref>). The purpose of the dilemma is to &#x0201C;force&#x0201D; applicants to state their actual intentions rather than offer socially desirable responses (Latham, <xref ref-type="bibr" rid="B24">1989</xref>; Sue-Chan and Latham, <xref ref-type="bibr" rid="B34">2004</xref>).</p>
<p>An example of a question that only assesses a future intention is: &#x0201C;As you are crossing a busy street, your aging parent, who is nearing the middle of the road, calls out to you for assistance. What would you do in this situation?&#x0201D; Note that the question does not contain a dilemma.</p>
<p>In contrast, an example of a SI question that contains a dilemma is as follows: &#x0201C;As you are crossing a busy street your aging parent, who is nearing the middle of the road, calls out to you for assistance. As you turn in her direction, a gust of wind blows the lottery ticket worth a million dollars out of your hand down the street. What would you do in this situation?&#x0201D;</p>
<p>The presentation of a dilemma to interviewees, in this instance helping your mother vs. going after the lottery ticket, is critical to the development of a SI because, as noted earlier, the underlying premise is that an individual&#x00027;s intentions predict behavior. If an SI question does not contain a dilemma, the answer to an interview question may be a response to what the interviewee infers the interviewer hopes to hear. Hence the dilemma is a core aspect of the SI (Levashina et al., <xref ref-type="bibr" rid="B29">2014</xref>)<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref>. In short, the importance of a dilemma in differentiating an SI question from a non-SI question cannot be over-emphasized (Latham, <xref ref-type="bibr" rid="B24">1989</xref>; Latham and Sue-Chan, <xref ref-type="bibr" rid="B27">1999</xref>; Klehe and Latham, <xref ref-type="bibr" rid="B18">2005</xref>). Nevertheless, in conducting their meta-analysis, Taylor and Small (<xref ref-type="bibr" rid="B35">2002</xref>) reported a great deal of unexplained variance across SI studies because many studies claiming to use an SI did not include a dilemma:</p>
<p>&#x0201C;We noticed rather heterogeneous approaches to how questions and answer rating scales were developed among primary studies. Examples of situational questions developed by Latham et al. suggest that those authors not only pose questions as hypothetical dilemmas, but that these dilemmas typically involve choices between two competing values. In contrast, other researchers have developed situational questions which neither present dilemmas nor focus on values&#x0201D; (Taylor and Small, <xref ref-type="bibr" rid="B35">2002</xref>, p. 290).</p>
<p>Likewise, in conducting their meta-analysis, Huffcutt et al. (<xref ref-type="bibr" rid="B11">2004</xref>, p. 269) found that &#x0201C;a majority of the situational studies in the current interview literature include questions that do not have a dilemma.&#x0201D; Examples of studies that failed to include a dilemma include Campion et al. (<xref ref-type="bibr" rid="B2">1994</xref>) and Pulakos and Schmitt (<xref ref-type="bibr" rid="B33">1995</xref>)<xref ref-type="fn" rid="fn0002"><sup>2</sup></xref>. Low validity coefficients were obtained in both studies. In addition, Levashina et al. (<xref ref-type="bibr" rid="B29">2014</xref>), in their review of the literature, concluded that although a dilemma is a core aspect of valid situational interview questions (Latham et al., <xref ref-type="bibr" rid="B26">1980</xref>), many researchers have used situational interview questions that did not contain dilemmas.</p>
<p>Third, a behavioral scoring guide is developed by subject matter experts (e.g., supervisors, customers) to aid the interviewers in scoring the response to each question. This is done to minimize interviewer biases in the scoring of responses, and to increase interrater reliability.</p>
<p>Fourth, the scoring of each individual&#x00027;s response to an SI question is conducted by a panel of two or more individuals. Each member of the panel scores each answer independently.</p>
<p>Fifth, a pilot study is conducted to determine whether there is variability in the responses to each SI question. If most people give a correct/incorrect response, the question is discarded. As Guion (<xref ref-type="bibr" rid="B9">1998</xref>, p. 614) commented, &#x0201C;the explicit provision of a pilot study for the SI is noteworthy because people who would never dream of developing written tests without pilot studies do not hesitate to develop interview guides without them. Building a psychometric device without pilot studies displays unwarranted arrogance&#x02014;or ignorance of the many things that can go wrong.&#x0201D; An example of adhering to this guideline can be found in Klehe and Latham (<xref ref-type="bibr" rid="B18">2005</xref>). Specifically, a pilot study (<italic>n</italic> = 31) was conducted to determine if there was variability in the responses to the questions before the criterion validity study was conducted and to determine whether there was interrater reliability when using the behavioral scoring guide.</p>
</sec>
<sec>
<title>Ingold et al. (<xref ref-type="bibr" rid="B13">2015</xref>) Study</title>
<p>Ingold et al.&#x00027;s study was designed to answer the following question: Why do situational interviews predict job performance? Their answer was the &#x0201C;interviewee&#x00027;s ability to identify criteria&#x0201D; (p. 388). In their study, no job analysis was conducted to develop the performance criteria or the SI questions to predict them. Instead, Ingold et al. focused on what they called a management trainee position as the targeted job. Their experiment involved 97 current and prospective University graduates who were employed or had been recently employed. Over half (55%) held a Master&#x00027;s or a comparable degree. Two interviewers, as a panel, conducted a mock interview to assess assertiveness, perseverance, and organizing behavior. Each participant&#x00027;s supervisor assessed the individual&#x00027;s performance, using a 7-point scale, on five items from Williams and Anderson (<xref ref-type="bibr" rid="B39">1991</xref>) and five items from Jansen et al. (<xref ref-type="bibr" rid="B15">2013</xref>) for assessing a general manager. Because the scores on the two scales were highly correlated, a composite score was computed. Examples of items from the two respective scales are: &#x0201C;Demonstrates expertise in all job-related tasks&#x0201D;; &#x0201C;adequately completes assigned duties.&#x0201D;</p>
<p>Rather than develop SI questions, Ingold et al. contacted authors of previous SI studies for permission to adapt SI questions for their study along with the respective behavioral scoring guide for each question. Several questions failed to include a dilemma (see <xref ref-type="boxed-text" rid="Box1">Box 1</xref>). Consequently, only two of the three concurrent validity coefficients with supervisory ratings of job performance were significant: perseverance [<italic>r</italic> = 0.23, <italic>p</italic> &#x0003C; 0.05], organizing behaviors [<italic>r</italic> = 0.30, <italic>p</italic> &#x0003C; 0.01], and assertiveness [<italic>r</italic> = 0.11, <italic>p</italic> = 0.27]. The correlation between ATIC and SI performance was significant [<italic>r</italic> = 0.23, <italic>p</italic> &#x0003C; 0.05]. If Ingold et al. had followed step 5, namely, conducting a pilot study to determine whether there is variability in the responses, and in addition presented only questions that contained a dilemma, all three correlation coefficients might have been significant, and the magnitude of the validity coefficients might have been higher.</p>
<boxed-text id="Box1">
<label>Box 1</label>
<title>SI Question used by Ingold et al. (<xref ref-type="bibr" rid="B13">2015</xref>).</title>
<p>Perseverance: Imagine you&#x00027;re finding the first months at your new job very difficult. The tasks you are assigned are very demanding, and you think your boss isn&#x00027;t entirely satisfied with your work. Please describe briefly how you would behave in this situation.</p>
<p>There is no dilemma in this question.</p>
</boxed-text>
<p>Following the SI, the participants completed a questionnaire where they were told to write the criterion that they believed an SI question was assessing, and to provide a behavioral example. Ingold and a Master&#x00027;s student rated the accuracy of each participant&#x00027;s responses on a 4-point scale (i.e., no fit, limited fit, a moderate fit, fits completely). They then tested whether people with high scores on ATIC performed better on the questions corresponding to a specific dimension, whether ATIC predicted supervisory assessments of job performance, and whether it explained incremental variance in job performance beyond the SI. They obtained supporting evidence in each instance. Finally, Ingold et al. found that the SI did not predict performance on the job when ATIC was controlled. Thus, the hypothesis tested in our first experiment is that individuals who are made aware of the performance criterion obtain a significantly higher score on a criterion valid SI than those who are not informed.</p>
</sec>
</sec>
<sec id="s2">
<title>Overview of Present Experiments</title>
<p>The present research used a predictively valid SI that was developed by Klehe and Latham (<xref ref-type="bibr" rid="B18">2005</xref>) for assessing the teamwork skills of applicants to an MBA program. The uncorrected <italic>r</italic> is 0.41 (<italic>p</italic> &#x0003C; 0.05). The MBA program requires much of the course work to be performed in teams, making teamwork skills a critical prerequisite for a student to receive an MBA degree.</p>
<p>Both the teamwork criterion and the SI questions were derived from a systematic job analysis, the critical incident technique (Flanagan, <xref ref-type="bibr" rid="B7">1954</xref>). Each SI question developed by Klehe and Latham (<xref ref-type="bibr" rid="B18">2005</xref>) contained a dilemma (see <xref ref-type="boxed-text" rid="Box2">Box 2</xref>). As was the case in the Ingold et al. (<xref ref-type="bibr" rid="B13">2015</xref>) study, a behavioral scoring guide was developed for each SI question, and the responses to the questions were scored by a panel (Klehe and Latham, <xref ref-type="bibr" rid="B18">2005</xref>). A pilot study was then conducted with MBA students who had not taken part in the criterion validation study. Questions for which agreement on the scoring could not be reached, as well as questions that revealed a lack of variance in the responses to them, were discarded. This process resulted in nine SI questions. Each answer to an SI question was rated on a Likert-type scale ranging from 1 to 5. In all four of our experiments, the score for each SI question was the average of the two raters&#x00027; scores.</p>
<boxed-text id="Box2">
<label>Box 2</label>
<title>An SI question used by Klehe and Latham (<xref ref-type="bibr" rid="B18">2005</xref>).</title>
<p>Your group is working on a very important project. All of you want to achieve a good grade. You have a tight deadline. One member of your group was especially successful in this area last term. Supported by two other group members, she takes the lead on your group project. She keeps the minutes and controls the flow of information during the discussion. However, you have the strong impression that she only records ideas supportive of her position and makes decisions on issues without consulting with others. What would you do?</p>
<p>The dilemma is between meeting a tight deadline and attaining a high grade versus the importance of ensuring that the input of others on the team is taken into account.</p>
</boxed-text>
<p>In the first three experiments, participants in the experimental conditions were informed of the performance criterion that was being assessed by the SI. This was done to test whether providing knowledge of the performance criterion enables individuals to provide relevant responses to the SI questions and hence receive higher scores than those in the control condition.</p>
<p>In the third experiment, participants were randomly assigned to conditions that did or did not include a dilemma, and did or did not include information about the performance criterion that was being assessed. This was done to determine whether knowledge given to participants of the performance criterion the SI questions assess increases SI scores when a question does or does not contain a dilemma. In the fourth experiment, we conceptually replicated the results of the third experiment. Specifically, we examined whether an individual&#x00027;s ability to identify the performance criteria an SI was developed to predict increases interview scores only when the questions do not include a dilemma.</p>
</sec>
<sec id="s3">
<title>Experiment 1</title>
<p>The hypothesis tested in this experiment is that being informed of the performance criterion that is being assessed leads to higher scores on SI questions that contain a dilemma than the scores in a control condition where this information was not provided.</p>
<sec>
<title>Method</title>
<sec>
<title>Participants</title>
<p>Participants (<italic>n</italic> = 108, <italic>M</italic><sub>age</sub> = 35.09, <italic>SD</italic><sub><italic>age</italic></sub> = 12.09, 45.4% female), recruited through CrowdFlower, an online subject pool platform, responded to the nine SI interview questions on the online data collection tool, Qualtrics. They did so in exchange for a monetary payment. The study lasted &#x0007E;15 min. Twenty-six percent of the participants were in between jobs, 53.3% were employed full-time (i.e., worked more than 35 h per week), and 20.6% held a part-time job. On average, they had 11.15 (<italic>SD</italic> = 10.49) years of job experience. Twenty-seven percent had an Associate Degree, 52.5% held a Bachelor&#x00027;s Degree, 10.9% held either an MA or a Ph.D., and 9.9% had a Professional Degree. Of those currently employed, 5.6% of the participants worked in research and education, 4.6% in banking and insurance, 5.6% in the service industry, 6.5% in sales, 7.4% in the public sector, and 13.9% were employed in other industries. No participant was excluded from the data analysis.</p>
<p>Power analysis using G*Power (Faul et al., <xref ref-type="bibr" rid="B6">2007</xref>) indicated that this sample size has a power of 0.80 to detect a medium effect size, <italic>d</italic> = 0.55.</p>
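The reported power figure can be checked with a standard noncentral-t computation. The sketch below is illustrative only (it is not the authors' G*Power procedure, and the function name is ours); it assumes the reported cell sizes of 53 and 55 and a two-tailed alpha of 0.05.

```python
import math
from scipy.stats import t as t_dist, nct

def two_sample_t_power(d, n1, n2, alpha=0.05):
    """Two-tailed power of an independent-samples t-test
    for a standardized effect size d (Cohen's d)."""
    df = n1 + n2 - 2
    # Noncentrality parameter of the t distribution under the alternative
    nc = d * math.sqrt(n1 * n2 / (n1 + n2))
    t_crit = t_dist.ppf(1 - alpha / 2, df)
    # Probability that |t| exceeds the critical value under the alternative
    return (1 - nct.cdf(t_crit, df, nc)) + nct.cdf(-t_crit, df, nc)

# With the cell sizes from this experiment and d = 0.55,
# the computed power is close to the reported 0.80.
power = two_sample_t_power(0.55, 53, 55)
```
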
</sec>
<sec>
<title>Procedure</title>
<p>Participants were randomly assigned to the experimental (<italic>n</italic> = 53) or the control condition (<italic>n</italic> = 55). Prior to administering the SI questions, only those in the experimental condition were shown the performance criterion, teamwork, and the behavioral items that operationally define it on a BOS. All participants in the two conditions answered the nine predictively valid SI questions on their respective computers. The questions, taken from Klehe and Latham (<xref ref-type="bibr" rid="B18">2005</xref>), included a dilemma. A sample question is: &#x0201C;Your group is working on a very important project. All of you want to achieve a good grade. You have a tight deadline. One member of your group was especially successful in this area last term. Supported by two other group members, she takes the lead on your group project. She keeps the minutes and controls the flow of information during the discussion. However, you have the strong impression that she only records ideas supporting her position and makes decisions on issues without consulting others. What would you do in this situation?&#x0201D; There were no time limits for responding to the questions. The participants were then debriefed and compensated.</p>
<p>All responses to the nine SI questions (<italic>M</italic> = 2.39, <italic>SD</italic> = 0.63) were scored independently by a Ph.D. psychologist and a Ph.D. student in human resource management. Both individuals were blind to the hypothesis and the experimental conditions. The scoring was done using the behavioral scoring guide developed by Klehe and Latham (<xref ref-type="bibr" rid="B18">2005</xref>). These two raters were familiar with the SI&#x00027;s behavioral scoring guide. Nevertheless, they practiced the rating process as a dyad before scoring the actual responses to the SI questions independently.</p>
</sec>
</sec>
<sec>
<title>Results</title>
<p>The <italic>ICC</italic> (3) of the SI was 0.81, indicating adequate inter-rater reliability. The final score for each participant was the average of the scores from the two independent raters. Following the scoring guide by Klehe and Latham (<xref ref-type="bibr" rid="B18">2005</xref>), the rating for each SI question ranged from 1 (<italic>unacceptable</italic>) to 5 (<italic>outstanding</italic>). An independent-samples two-tailed <italic>t</italic>-test revealed that scores on the SI did not differ significantly between the experimental (<italic>M</italic> = 2.45, <italic>SD</italic> = 0.62) and the control (<italic>M</italic> = 2.34, <italic>SD</italic> = 0.64) group [<italic>t</italic><sub>(106)</sub> = 0.89, <italic>p</italic> = 0.37, <italic>d</italic> = 0.17]. We then tested for any effect that an individual&#x00027;s sex, age, years of work experience, number of hours worked per week, and education may have had on this result. No significant effect was found.</p>
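<p>As an illustration, this test can be approximately reconstructed from the summary statistics alone. The Python sketch below uses SciPy&#x00027;s <italic>t</italic>-test from summary statistics; because the reported means and SDs are rounded, the result only approximates the reported <italic>t</italic><sub>(106)</sub> = 0.89:</p>

```python
import numpy as np
from scipy import stats

# Summary statistics as reported for Experiment 1 (means/SDs are rounded)
m1, s1, n1 = 2.45, 0.62, 53   # experimental condition
m2, s2, n2 = 2.34, 0.64, 55   # control condition

t, p = stats.ttest_ind_from_stats(m1, s1, n1, m2, s2, n2, equal_var=True)

# Cohen's d from the pooled standard deviation
sp = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
d = (m1 - m2) / sp
print(round(t, 2), round(p, 2), round(d, 2))  # roughly 0.91 0.37 0.17
```

<p>The small discrepancy in <italic>t</italic> (0.91 vs. 0.89) reflects rounding in the published means and standard deviations; the effect size <italic>d</italic> = 0.17 matches the report.</p>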
</sec>
<sec>
<title>Discussion</title>
<p>The results of this experiment show that having been informed of the performance criterion that the SI was assessing did not enable participants to obtain higher scores on the SI than participants who were not given this information. An arguable limitation of this experiment is that because the participants read the SI questions and wrote their answers to them, the context was not similar to an interview. An additional limitation was that the sample size had low statistical power to detect small effect sizes (<italic>d</italic> &#x0003C; 0.50). Yet the magnitude of the effect size obtained in this experiment, <italic>d</italic> = 0.17, is negligible and thus supports the hypothesis that knowledge of the criterion being assessed has little or no effect on scores on a valid situational interview, namely, one whose questions contain dilemmas.</p>
</sec>
</sec>
<sec id="s4">
<title>Experiment 2</title>
<p>A second experiment was conducted where each participant was interviewed by two interviewers who recorded and scored the answers independently. The purpose of this second experiment was to determine whether the results of the first experiment would be replicated when the participants were actually interviewed.</p>
<sec>
<title>Method</title>
<sec>
<title>Participants</title>
<p>The participants were 100 undergraduate students at a large university in Canada (<italic>M</italic><sub>age</sub> = 20.75, <italic>SD</italic> = 3.79, 59.8% female). Of this number, 64% were unemployed, 35% worked in a part-time job, and 1% worked in a full-time job. Because the initial data collection was limited to the M.B.A. students enrolled in classes taught by the first author, we followed feasibility analysis recommendations to recruit as many participants as possible (Lakens, <xref ref-type="bibr" rid="B23">2021</xref>). A sensitivity power analysis using G&#x0002A;Power (Faul et al., <xref ref-type="bibr" rid="B6">2007</xref>) indicated that this sample size has a power of 0.80 to detect a moderate effect size (<italic>d</italic> = 0.57) in a between-participant design with two groups.</p>
</sec>
<sec>
<title>Procedure</title>
<p>The interviewees were randomly assigned to the experimental (<italic>n</italic> = 50) or the control (<italic>n</italic> = 50) condition. As in the first experiment, the participants in the experimental condition were shown the performance criterion and the behavioral items that define it, whereas those in the control condition did not receive this information. The interviewees answered the same nine predictively valid SI questions from Klehe and Latham (<xref ref-type="bibr" rid="B18">2005</xref>), and the interviewers used the same behavioral scoring guide. The responses to the SI questions (<italic>M</italic> = 2.42, <italic>SD</italic> = 0.62, range 1&#x02013;5) were scored independently by the same two raters as in Experiment 1, namely, a Ph.D. psychologist and a Ph.D. student in human resource management, both of whom were blind to the hypothesis and the experimental conditions.</p>
</sec>
</sec>
<sec>
<title>Results</title>
<p>The inter-rater reliability of the answers to the SI was high, <italic>ICC</italic> (3) = 0.94. As in the first experiment, there was no significant difference in the SI scores between the experimental (<italic>M</italic> = 2.56, <italic>SD</italic> = 0.48) and the control group (<italic>M</italic> = 2.44, <italic>SD</italic> = 0.46); [<italic>t</italic><sub>(98)</sub> = 1.33, <italic>p</italic> = 0.19, <italic>d</italic> = 0.27]. We conducted an additional analysis to examine whether the lack of a main effect on the SI changed when controlling for demographic variables. Specifically, an ANCOVA controlling for an individual&#x00027;s sex, years of employment, and age did not change the conclusion [<italic>F</italic><sub>(1, 92)</sub> = 1.89, <italic>p</italic> = 0.17].</p>
</sec>
<sec>
<title>Discussion</title>
<p>This second experiment replicated the results obtained in the previous experiment and hence provides additional support for the conclusion that providing knowledge of the criterion predicted by the SI does not improve the final score when the SI questions include a dilemma. However, these two experiments did not directly compare the two types of questions, namely, those with and without a dilemma. Consequently, we conducted a third experiment since recent SI studies have omitted the dilemma in the interview questions (Taylor and Small, <xref ref-type="bibr" rid="B35">2002</xref>; Huffcutt et al., <xref ref-type="bibr" rid="B11">2004</xref>).</p>
<p>An additional limitation of experiments 1 and 2 is that they did not have sufficient power to detect effects as small as the average obtained effect size, <italic>d</italic> = 0.22. Nevertheless, this effect size is quite small; it supports the hypothesis that knowledge of the criterion being assessed has little or no effect on an individual&#x00027;s score on a valid situational interview, that is, one whose SI questions include a dilemma.</p>
</sec>
</sec>
<sec id="s5">
<title>Experiment 3</title>
<p>Because experiments 1 and 2 yielded essentially the same results regardless of whether the SI was administered verbally or in written form, we allowed participants in this third experiment to read the nine SI questions, which were presented in the first two experiments, and write their answers to them. There was no time limit for doing so.</p>
<p>The second hypothesis of our research, tested in this third experiment, was that responses to interview questions that do not include a dilemma are scored significantly higher than responses to the same interview questions that do contain a dilemma.</p>
<sec>
<title>Method</title>
<sec>
<title>Participants</title>
<p>We recruited 284 participants through the Prolific Academic platform, an online crowdsourcing platform designed for academic research. This platform includes &#x0007E;12,000 international participants who take part in scientific studies in exchange for cash rewards for themselves or for one of two chosen charities (Save the Children and Cancer Research UK). Participants on this platform can be selected for a study on the basis of their age, fluent language skills, and approval rate in previous studies.</p>
<p>Four participants wrote meaningless words and combinations of letters as answers to the interview questions and hence were excluded from the data analysis. Thus, the final sample was 280 (<italic>M</italic><sub>age</sub> = 34.70, <italic>SD</italic> = 9.13, 51.4% female). Of this number, 94.6% of the participants were employed. Sensitivity analysis indicated that the smallest effect size that this sample size can detect for an interaction with a power of 0.80 in a between-participant 2 X 2 factorial design is <italic>Cohen&#x00027;s f</italic> = 0.20 (Faul et al., <xref ref-type="bibr" rid="B6">2007</xref>).</p>
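<p>A sensitivity analysis of this kind can be sketched numerically. The Python code below is an illustration, not the authors&#x00027; G&#x0002A;Power run: it bisects for the smallest Cohen&#x00027;s <italic>f</italic> whose noncentral-<italic>F</italic> power reaches 0.80, assuming numerator <italic>df</italic> = 1 and denominator <italic>df</italic> = 276 for the interaction. The exact value depends on how the design&#x00027;s degrees of freedom are specified in the software, so it need not match the reported 0.20 precisely:</p>

```python
from scipy import stats

def f_test_power(f, n_total, df1, df2, alpha=0.05):
    """Power of an F test for effect size Cohen's f, via the noncentral F."""
    ncp = (f ** 2) * n_total                 # noncentrality parameter
    f_crit = stats.f.ppf(1 - alpha, df1, df2)
    return 1 - stats.ncf.cdf(f_crit, df1, df2, ncp)

def smallest_detectable_f(power_target, n_total, df1, df2, alpha=0.05):
    """Bisect for the smallest f that reaches the target power."""
    lo, hi = 1e-6, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_test_power(mid, n_total, df1, df2, alpha) < power_target:
            lo = mid
        else:
            hi = mid
    return hi

# Interaction in a 2 X 2 between-participants design, N = 280
# (df1 = 1, df2 = N - 4 = 276 is an assumption about the specification).
f_min = smallest_detectable_f(0.80, n_total=280, df1=1, df2=276)
print(round(f_min, 2))
```

<p>Because power increases monotonically in <italic>f</italic>, the bisection converges to the smallest detectable effect under the stated assumptions.</p>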
</sec>
<sec>
<title>Procedure</title>
<p>Participants were randomly assigned to one of four conditions. Specifically, we crossed the dilemma conditions (yes/no) with knowledge of the criterion conditions (yes/no). Participants in the first condition (<italic>n</italic> = 68) were given the same nine SI questions used in the previous two experiments (<italic>M</italic> = 3.04, <italic>SD</italic> = 0.50). They were informed of the performance criterion that was being assessed by those questions (i.e., dilemma/knowledge of the criterion provided). Participants in the second condition (<italic>n</italic> = 69) were given the nine SI questions that contained a dilemma. They were not informed of the criterion that the questions predicted (i.e., dilemma/no knowledge of the criterion provided). Participants in the third condition (<italic>n</italic> = 72) were given the nine SI questions with no dilemma. They were informed of the performance criterion being assessed (i.e., no dilemma/knowledge of the criterion provided). Participants in the fourth condition (<italic>n</italic> = 71) were given the SI questions that did not include a dilemma. They were not informed of the criterion that the questions predicted (i.e., no dilemma/no knowledge of the criterion). The coders rated the answer to each SI question on a scale ranging from 0 (<italic>unacceptable</italic>) to 5 (<italic>outstanding</italic>). We used the average score for all SI questions across the two ratings.</p>
<p>After the participants responded to the nine SI questions, they answered demographic questions and were compensated $1.60. Two judges independently scored the responses. These judges were M.B.A. students who received training on the coding procedure. The inter-rater reliability of the answers to the SI was high, as indicated by <italic>ICC</italic> (3) = 0.84. Therefore, the score for each SI question was the average rating by the two independent judges, who were blind to the research hypotheses and experimental conditions. The training of the two judges included an explanation of the SI and how to use the behavioral scoring guide. Both judges were experts in human resource management and were highly knowledgeable of SI procedures prior to participating in this experiment.</p>
</sec>
</sec>
<sec>
<title>Results</title>
<p><xref ref-type="table" rid="T1">Table 1</xref> presents the descriptive statistics by experimental condition. There was a significant main effect of the dilemma manipulation on SI scores [<italic>F</italic><sub>(1, 276)</sub> = 85.30, <italic>p</italic> &#x0003C; 0.001, &#x003B7;<sup>2</sup><sub><italic>p</italic></sub> = 0.24]. Participants in the no-dilemma condition (<italic>M</italic> = 3.28, <italic>SE</italic> = 0.04) scored significantly higher on the SI than participants in the dilemma condition (<italic>M</italic> = 2.80, <italic>SE</italic> = 0.04). There was no main effect for the knowledge of the criterion manipulation [<italic>F</italic><sub>(1, 276)</sub> = 1.38, <italic>p</italic> = 0.241, &#x003B7;<sup>2</sup><sub><italic>p</italic></sub> = 0.01]. Those participants who received knowledge of the performance criterion the SI questions were assessing, teamplaying (<italic>M</italic> = 3.07, <italic>SE</italic> = 0.04), did not score significantly higher than the participants who did not have this knowledge (<italic>M</italic> = 3.01, <italic>SE</italic> = 0.04).</p>
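<p>The reported effect size follows directly from the <italic>F</italic> statistic and its degrees of freedom; a quick check (illustrative Python, not part of the original analysis):</p>

```python
def partial_eta_squared(F, df1, df2):
    """Partial eta squared recovered from an F statistic: F*df1 / (F*df1 + df2)."""
    return (F * df1) / (F * df1 + df2)

# Main effect of the dilemma manipulation, F(1, 276) = 85.30
print(round(partial_eta_squared(85.30, 1, 276), 2))  # 0.24
```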
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Experiment 3: descriptive statistics by condition.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>No knowledge</bold></th>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>Knowledge</bold></th>
</tr>
<tr>
<th valign="top" align="left"><bold>Dilemma</bold></th>
<th valign="top" align="center"><bold><italic>Mean</italic></bold></th>
<th valign="top" align="center"><bold><italic>SD</italic></bold></th>
<th valign="top" align="center"><bold><italic>Mean</italic></bold></th>
<th valign="top" align="center"><bold><italic>SD</italic></bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">No</td>
<td valign="top" align="center">3.20</td>
<td valign="top" align="center">0.43</td>
<td valign="top" align="center">3.36</td>
<td valign="top" align="center">0.42</td>
</tr>
<tr>
<td valign="top" align="left">Yes</td>
<td valign="top" align="center">2.82</td>
<td valign="top" align="center">0.43</td>
<td valign="top" align="center">2.78</td>
<td valign="top" align="center">0.45</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>There was a significant Dilemma X Knowledge of criterion interaction [<italic>F</italic><sub>(1, 276)</sub> = 4.13, <italic>p</italic> = 0.043, &#x003B7;<sup>2</sup><sub><italic>p</italic></sub> = 0.02] (see <xref ref-type="fig" rid="F1">Figure 1</xref>). Simple effect analysis using the <italic>LSD</italic> test indicated the source of the interaction. When the SI questions did <italic>not</italic> include a dilemma, the participants who were provided knowledge of the criterion the interview questions had been developed to assess scored higher on the SI than those who did not receive this information (<italic>M</italic><sub>difference</sub> = 0.17, <italic>SE</italic> = 0.07, <italic>p</italic> = 0.024, 95% <italic>CI</italic> [0.02, 0.31]). When the questions did contain a dilemma, there was no significant difference in SI scores between those in the knowledge/no knowledge of criterion conditions (<italic>M</italic><sub>difference</sub> = &#x02212;0.04, <italic>SE</italic> = 0.07, <italic>p</italic> = 0.546, 95% <italic>CI</italic> [&#x02212;0.19, 0.10]).</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Experiment 3: Interaction between the experimental conditions on Situational interview score. Knowledge 0-no knowledge of the criterion, Knowledge 1-knowledge of the criterion, Dilemma 0-no dilemma, Dilemma 1-dilemma.</p></caption>
<graphic xlink:href="fpsyg-12-674815-g0001.tif"/>
</fig>
</sec>
<sec>
<title>Discussion</title>
<p>The results of this third experiment provide support for the research by Kleinmann and colleagues (e.g., Ingold et al., <xref ref-type="bibr" rid="B13">2015</xref>). When SI questions lack a dilemma, as was the case with the majority of questions used in their experiments, knowledge of the criterion that is being assessed improved scores on the SI. However, the results of the present experiment also provide support for the two preceding experiments. When the SI questions contained a dilemma, knowledge of the performance criterion did not affect SI scores.</p>
</sec>
</sec>
<sec id="s6">
<title>Experiment 4</title>
<p>Subsequent to the Ingold et al. (<xref ref-type="bibr" rid="B13">2015</xref>) experiment, Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>) arguably used the most rigorous design to date to examine the ATIC-SI relationship. Consistent with the underlying premise of the SI, they found that there was considerable similarity between what participants said they would do, that is, their intentions, and their subsequent behavior. Consistent with previous research on the ATIC, Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>) also found that differences in the ability to identify the performance criterion that was being assessed explain why the SI has criterion-related validity.</p>
<p>Once again, the majority of the SI questions used in the Oostrom et al. experiment failed to include a dilemma. Consequently, we modified their questions to include dilemmas. Hence, the purpose of this fourth experiment was 3-fold. First, we again examined whether the presence or absence of a dilemma in the questions affected SI scores. Second, in the three preceding experiments, the SI questions were developed to predict a single criterion, teamplaying. Research on the ATIC sometimes uses multiple performance criteria. Hence, we used the SI questions that assessed the multiple criteria used by Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>).</p>
<p>Finally, none of our three preceding experiments included an explicit measure of ATIC. Note, however, that in the fourth cell of the third experiment the SI questions did not contain a dilemma and the participants were not informed of the criterion that was being assessed. Even though they were not asked to identify the performance criterion, they were free to do so. Yet the participants in this condition did not perform better than those in the other three conditions. Consequently, ATIC was assessed in this fourth experiment. This was done because the ability to identify the performance criteria that are being assessed, after being explicitly asked to do so, may be far different from being informed prior to an SI interview of the performance criteria that are being assessed. Hence, the third hypothesis tested in this fourth experiment is that the ability to identify the performance criteria, when asked to do so, results in SI scores that are significantly lower when the SI questions include a dilemma compared to scores for SI questions that do not include a dilemma.</p>
<sec>
<title>Method</title>
<sec>
<title>Participants</title>
<p>We recruited 151 undergraduate students from a college in Israel to participate in this study in exchange for course credit.</p>
<p>Three participants wrote meaningless answers to the SI questions and thus were excluded from the data analysis. The final sample was 148 (<italic>M</italic><sub>age</sub> = 22.27, <italic>SD</italic> = 4.39, 48.6% female). Of this number, 23.0% of the participants worked in the private sector, 19.6% in the public sector, and 6.8% were self-employed. The participants had been working, on average, for 3.76 years (<italic>SD</italic> = 4.55). Statistical power was calculated by converting the interaction effect in the third experiment to a Cohen&#x00027;s <italic>f</italic>, which yielded a value of 0.14. Power analysis using G&#x0002A;Power indicated that a sample size of 148 participants has a power above 80% to detect an interaction in a regression. Moreover, sensitivity analysis indicated that the smallest interaction effect that this sample size can detect with a power of 0.80 is <inline-formula><mml:math id="M1"><mml:msubsup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>h</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>g</mml:mi><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> = 0.047.</p>
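<p>The conversion from partial eta squared to Cohen&#x00027;s <italic>f</italic> is a one-line formula; an illustrative check in Python (not part of the original analysis):</p>

```python
import math

def cohens_f_from_eta2(eta2_p):
    """Convert partial eta squared to Cohen's f: f = sqrt(eta2 / (1 - eta2))."""
    return math.sqrt(eta2_p / (1 - eta2_p))

# The interaction in Experiment 3 had partial eta squared = 0.02
print(round(cohens_f_from_eta2(0.02), 2))  # 0.14
```

<p>This reproduces the value of 0.14 used for the power analysis.</p>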
</sec>
<sec>
<title>Procedure</title>
<p>Participants were informed that they would be taking part in a study about job interviews. To motivate participants to perform well, they were informed that those who received the highest rating would receive an award of 350 NIS (equivalent to $110 US). The experiment, conducted using Qualtrics software, consisted of two parts. In the first part, we manipulated the dilemma.</p>
<p>Participants were randomly assigned to a no-dilemma (<italic>n</italic> = 72) or a dilemma (<italic>n</italic> = 76) condition. Those in the no-dilemma condition received the 10 SI questions taken from Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>). The questions assessed self-control, customer focus, persuasiveness, person-oriented leadership, and task-oriented leadership. Those in the dilemma condition received the same SI questions, except that each question contained a dilemma that was inserted by the present authors. The interview questions were presented orally to the participants.</p>
<p>An example of an SI question in the no dilemma condition was:</p>
<p>&#x0201C;You have submitted an offer to a customer. You know that you are not the only company that is making an offer. The client has demanded more and more work from you when drawing up the offer. Hence, you believe you will receive the assignment.</p>
<p>You are now with the client who says: &#x0201C;Unfortunately, you did not get the job.&#x0201D; What would you do in this situation?&#x0201D;</p>
<p>An SI question with a dilemma was:</p>
<p>&#x0201C;You have submitted an offer to a customer. You know that you are not the only company that is making an offer. The client has demanded more and more work from you when drawing up the offer. Hence, you believe you will receive the assignment. Because this is a big client who demands much of your time, and you were sure you would get the job, you turned down other job opportunities to find the necessary time for this client.</p>
<p>You are now with the client who says: &#x0201C;Unfortunately, you probably won&#x00027;t get the job. However, if you cut the price by 30%, you might get it.&#x0201D; What would you do in this situation?&#x0201D;</p>
<p>In the second part of this experiment, the ATIC was assessed. Consistent with Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>), the participants were presented with the following information:</p>
<p>&#x0201C;During the interview, you probably thought about which skills or characteristics were measured by the different questions. We would like to know, for each question, the skills/characteristics that were being measured. Also, please provide concrete behavioral examples that are related to the skills/properties you identified.&#x0201D;</p>
<p>The participants were then presented with each SI question that had been asked of them. They were requested to write the performance criterion that they believed each question assessed, along with a behavioral example. Finally, the participants answered demographic questions before being debriefed.</p>
<p>Two judges who were blind to the research hypotheses and experimental conditions rated the answers to each SI question on a 1&#x02013;5 Likert-type scale. Specifically, following Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>), the categories were labeled as follows: 1- <italic>Not effective at all</italic>, 2- <italic>Not effective</italic>, 3- <italic>A bit effective</italic>, 4- <italic>Effective</italic>, 5- <italic>Very effective</italic>. The ATIC answers were evaluated on a 0&#x02013;3 scale with the following labels: 0- <italic>No match</italic>, 1- <italic>Matches a bit</italic>, 2- <italic>Matches</italic>, 3- <italic>Completely matches</italic>. Both coders received 6 h of training on the SI and ATIC. One judge was an M.B.A. student specializing in human resource management. The other judge had an M.B.A. degree and worked as an organizational consultant in the field of employee selection.</p>
<p>The training procedure for the two judges is consistent with prior work on the ATIC and SI (e.g., Oostrom et al., <xref ref-type="bibr" rid="B32">2016</xref>). The training began with an introduction to the SI, the ATIC, and the scoring procedures for each one. The judges were given a behavioral scoring guide for determining whether an answer to each SI question was poor, adequate, or highly acceptable. A practice session enabled the judges to become familiar with the rating process for the SI questions and the ATIC. Afterwards, they discussed their ratings with each other, received feedback on their ratings, and were invited to ask questions of the researchers about the rating process before the experiment began.</p>
</sec>
</sec>
<sec>
<title>Results</title>
<p>An intraclass correlation (<italic>ICC</italic>) was calculated in order to determine inter-rater reliability. The <italic>ICC</italic> (3) for the SI questions presented was 0.88, and the correlation between the two judges was 0.79. The <italic>ICC</italic> (3) for the ATIC questions was 0.92, and the correlation between the judges was 0.88. Therefore, both the SI and the ATIC scores were calculated as the mean rating of the two judges.</p>
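<p>For two fixed raters, a consistency ICC of this kind is commonly computed as Shrout and Fleiss&#x00027;s ICC(3,1) from a two-way ANOVA decomposition. The sketch below (illustrative Python with made-up ratings, not the study data) shows the computation:</p>

```python
import numpy as np

def icc3_1(ratings):
    """ICC(3,1): two-way mixed, consistency, single rater (Shrout & Fleiss).
    `ratings` is an (n_subjects, k_raters) array."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Mean squares from the two-way ANOVA decomposition
    ms_rows = k * np.sum((x.mean(axis=1) - grand) ** 2) / (n - 1)  # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)            # between raters
    ss_err = np.sum((x - grand) ** 2) - (n - 1) * ms_rows - ss_cols
    ms_err = ss_err / ((n - 1) * (k - 1))                          # residual
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Hypothetical ratings from two raters for four interviewees
print(round(icc3_1([[4, 5], [2, 1], [3, 3], [5, 4]]), 2))  # 0.8
```

<p>When the two raters agree perfectly, the residual mean square is zero and the ICC equals 1.</p>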
<p><xref ref-type="table" rid="T2">Table 2</xref> presents the descriptive statistics and correlations among the study&#x00027;s variables.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Experiment 4: descriptive statistics and correlations.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center"><bold><italic>M</italic></bold></th>
<th valign="top" align="center"><bold><italic>SD</italic></bold></th>
<th valign="top" align="center"><bold>1</bold></th>
<th valign="top" align="center"><bold>2</bold></th>
<th valign="top" align="center"><bold>3</bold></th>
<th valign="top" align="center"><bold>4</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1. ATIC</td>
<td valign="top" align="center">1.07</td>
<td valign="top" align="center">0.29</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">2. SI</td>
<td valign="top" align="center">2.46</td>
<td valign="top" align="center">0.58</td>
<td valign="top" align="center">0.33<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;&#x0002A;</sup></xref></td>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">3. Work experience (in years)</td>
<td valign="top" align="center">3.76</td>
<td valign="top" align="center">4.55</td>
<td valign="top" align="center">0.01</td>
<td valign="top" align="center">&#x02212;0.03</td>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">4. Age</td>
<td valign="top" align="center">22.27</td>
<td valign="top" align="center">4.39</td>
<td valign="top" align="center">0.01</td>
<td valign="top" align="center">0.07</td>
<td valign="top" align="center">0.84<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;&#x0002A;</sup></xref></td>
<td/>
</tr>
<tr>
<td valign="top" align="left">5. Gender</td>
<td valign="top" align="center">1.49</td>
<td valign="top" align="center">0.50</td>
<td valign="top" align="center">&#x02212;0.02</td>
<td valign="top" align="center">0.15</td>
<td valign="top" align="center">&#x02212;0.07</td>
<td valign="top" align="center">&#x02212;0.01</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="TN1">
<label>&#x0002A;&#x0002A;</label>
<p><italic>p &#x0003C; 0.01; Gender was coded as 1- female, 2- male</italic>.</p></fn>
</table-wrap-foot>
</table-wrap>
<sec>
<title>Main Effects</title>
<p>There was a significant main effect of the dilemma manipulation on SI scores [<italic>t</italic><sub>(146)</sub> = 8.29, <italic>p</italic> &#x0003C; 0.001, <italic>d</italic> = 1.36]. Participants in the no-dilemma condition (<italic>M</italic> = 2.80, <italic>SD</italic> = 0.54) scored significantly higher on the SI than participants in the dilemma condition (<italic>M</italic> = 2.14, <italic>SD</italic> = 0.42). There was no main effect of ATIC between the experimental groups [<italic>t</italic><sub>(146)</sub> = 1.16, <italic>p</italic> = 0.25, <italic>d</italic> = 0.19]. Participants in the dilemma condition (<italic>M</italic> = 1.04, <italic>SD</italic> = 0.24) did not differ on their ATIC score relative to participants in the no-dilemma condition (<italic>M</italic> = 1.09, <italic>SD</italic> = 0.34).</p>
</sec>
<sec>
<title>Moderation Analysis</title>
<p>We conducted a moderation analysis with Model 1 in PROCESS (Hayes, <xref ref-type="bibr" rid="B10">2017</xref>), using 5,000 bootstrapped samples. First, we mean-centered the ATIC scores. The results indicated that the ATIC had a main effect on the SI score, controlling for the other main effect and the interaction [<italic>b</italic> = 0.82, <italic>SE</italic> = 0.16, <italic>t</italic> = 5.22, <italic>p</italic> &#x0003C; 0.001]. The manipulation had a main effect controlling for the ATIC score [<italic>b</italic> = &#x02212;0.63, <italic>SE</italic> = 0.07, <italic>t</italic> = &#x02212;8.59, <italic>p</italic> &#x0003C; 0.001]. In addition, as hypothesized, the ATIC was associated with higher scores on the SI only in the no-dilemma condition. Specifically, the Manipulation X ATIC interaction had a significant effect on the SI, controlling for the main effects of the manipulation and the ATIC [<italic>b</italic> = &#x02212;0.73, <italic>SE</italic> = 0.26, <italic>t</italic> = &#x02212;2.78, <italic>p</italic> = 0.006; <inline-formula><mml:math id="M2"><mml:msubsup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>h</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>g</mml:mi><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> = 0.03, <italic>F</italic><sub>(1, 144)</sub> = 7.76, <italic>p</italic> = 0.006]. As shown in <xref ref-type="fig" rid="F2">Figure 2</xref>, a simple slope analysis indicated that in the no-dilemma condition, the ATIC significantly increased performance on the SI [<italic>b</italic> = 0.82, <italic>SE</italic> = 0.16, <italic>t</italic> = 5.22, <italic>p</italic> &#x0003C; 0.001]. However, the ATIC did not increase SI scores when a dilemma was included in the interview questions [<italic>b</italic> = 0.08, <italic>SE</italic> = 0.21, <italic>t</italic> = 0.38, <italic>p</italic> = 0.70]. Put differently, the association between the ATIC and scores on the SI was strong, positive, and significant only in the no-dilemma condition [<italic>r</italic><sub>(72)</sub> = 0.51, <italic>p</italic> &#x0003C; 0.001], whereas it was not significant in the dilemma condition [<italic>r</italic><sub>(75)</sub> = 0.05, <italic>p</italic> = 0.69].</p>
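<p>The moderation model reported here is a standard interaction regression. The Python sketch below is illustrative only: the data, coefficients, and seed are made up to mimic the reported pattern, and NumPy least squares stands in for PROCESS. It fits SI = b0 + b1&#x000B7;ATIC + b2&#x000B7;Dilemma + b3&#x000B7;ATIC&#x000D7;Dilemma on mean-centered ATIC scores and recovers the simple slope in each condition:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000

# Simulated design: dilemma indicator and mean-centered ATIC scores
dilemma = rng.integers(0, 2, n)   # 0 = no dilemma, 1 = dilemma
atic = rng.normal(0.0, 0.3, n)    # already mean-centered

# True pattern (illustrative): ATIC helps only when there is no dilemma
y = 2.8 - 0.65 * dilemma + 0.80 * atic - 0.72 * atic * dilemma \
    + rng.normal(0.0, 0.4, n)

# Fit the interaction model by ordinary least squares
X = np.column_stack([np.ones(n), atic, dilemma, atic * dilemma])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

slope_no_dilemma = b1        # simple slope of ATIC when dilemma = 0
slope_dilemma = b1 + b3      # simple slope of ATIC when dilemma = 1
print(round(slope_no_dilemma, 2), round(slope_dilemma, 2))
```

<p>With this setup the recovered simple slopes approximate the generating values (a clear positive ATIC slope without a dilemma, a near-zero slope with one), mirroring the qualitative pattern in Figure 2.</p>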
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Experiment 4: Interaction of the relationship between SI and ATIC by experimental condition.</p></caption>
<graphic xlink:href="fpsyg-12-674815-g0002.tif"/>
</fig>
</sec>
</sec>
<sec>
<title>Discussion</title>
<p>The results of this fourth experiment provide additional support for the first three experiments as well as for previous research on the ATIC. Specifically, when SI questions lacked a dilemma, the ATIC increased scores on the SI in this fourth experiment. This finding replicates previous work on this topic (e.g., Ingold et al., <xref ref-type="bibr" rid="B13">2015</xref>; Oostrom et al., <xref ref-type="bibr" rid="B32">2016</xref>). However, when a dilemma was included in each SI question, the ATIC score did not increase scores on the SI. The findings of this fourth experiment enhance the external validity of the three previous experiments in that multiple performance criteria were assessed, the interview questions were the same as those used in Oostrom et al.&#x00027;s (<xref ref-type="bibr" rid="B32">2016</xref>) research, and individuals were explicitly requested to identify the performance criteria that the questions predicted.</p>
</sec>
</sec>
<sec id="s7">
<title>General Discussion</title>
<p>The present four experiments are similar to Ingold et al.&#x00027;s (<xref ref-type="bibr" rid="B13">2015</xref>) in that the participants in the first two experiments and the participants in the Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>) study were, or had been, employed in a variety of sectors. The SI used in our four experiments and the SI used by Ingold et al. and Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>) had criterion-related validity. The inter-rater reliability estimates in the present four experiments are similar to the reliability estimates obtained by both Ingold et al. (<xref ref-type="bibr" rid="B13">2015</xref>) and Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>).</p>
<p>The present experiments differ from the Ingold et al. and Oostrom et al. studies in at least two ways. First, the SI questions in the present research, as shown in the <xref ref-type="supplementary-material" rid="SM1">Appendices</xref>, contained dilemmas, whereas the majority of the questions used by Ingold et al. and Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>) did not. Second, we informed participants in the experimental group in experiments 1&#x02013;3 of the performance criterion that was being assessed, whereas Ingold et al. and Oostrom et al. required participants to guess the performance criterion on which they were being assessed. Our fourth experiment, however, followed the latter procedure: individuals in the experimental condition were asked to identify the criteria that were being assessed.</p>
<p>The results of our first experiment were replicated in the second and third experiments. These results suggest that knowing the performance criterion that is being assessed is not advantageous for attaining higher scores on an SI if the SI questions contain a dilemma. Furthermore, the results of the fourth experiment, which included a measure of ATIC, showed that it is the existence of a dilemma in SI questions that moderates the relationship between the ATIC and the SI. When the SI questions in our fourth experiment did not include a dilemma, the ATIC significantly increased SI scores, thus replicating the findings of both Ingold et al. (<xref ref-type="bibr" rid="B13">2015</xref>) and Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>).</p>
<p>The practical significance of the present four experiments, in addition to casting doubt on the relevance of ATIC to a correctly developed SI, is that they show the necessity of adhering to a critical step in developing an SI, namely, including a dilemma in each question. Had this been done by Ingold et al. (<xref ref-type="bibr" rid="B13">2015</xref>), they would likely have obtained findings similar to those of our first and second experiments. The present findings hopefully shed light on an important element in the development of an SI, namely, &#x0201C;the dilemma.&#x0201D; The dilemma, an element of this technique in its purer form, appears to have been lost in myriad studies. Researchers and practitioners should refrain from what is becoming a common practice (see Taylor and Small, <xref ref-type="bibr" rid="B35">2002</xref>; Huffcutt et al., <xref ref-type="bibr" rid="B11">2004</xref>), namely, treating SI questions with and without a dilemma as interchangeable. If this recommendation is followed, managers can remain confident that the SI is a reliable and valid technique for selecting employees.</p>
</sec>
<sec id="s8">
<title>Limitations and Future Research</title>
<p>The limitations of our research are at least 4-fold. Arguably, the criterion used in three of our four experiments, teamwork, may have been readily discerned from the content of the SI questions. If this argument has merit, the participants in the control condition might have been able to easily guess the criterion that was being assessed. However, if this explanation were correct, Klehe and Latham (<xref ref-type="bibr" rid="B18">2005</xref>) would not have obtained evidence of the predictive validity of the SI, due to restriction of range, as the majority of participants would have been able to give socially desirable answers. Moreover, we would not have found a significant interaction effect in our third experiment. Note too that, consistent with Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>), multiple performance criteria were used in our fourth experiment.</p>
<p>A second arguable limitation of the first three experiments, as noted earlier, is that making people aware of the performance criterion is not the same as what is presumed to be an individual difference variable, namely, ATIC. Nevertheless, the results of the fourth experiment replicated the findings of Ingold et al. in the no-dilemma condition. They also replicated the findings of our third experiment: the ATIC did not enhance SI scores in the dilemma condition.</p>
<p>To further investigate whether ATIC is different from providing individuals with knowledge of the criteria being assessed, future research should use a 2 &#x000D7; 2 factorial design crossing knowledge of the criterion with the presence of a dilemma, along with a measure of ATIC. Such an experiment would provide information about the incremental validity of the ATIC relative to knowledge of the criterion being assessed by valid SI questions, that is, those that contain dilemmas.</p>
<p>Note that the gold standard for adequately questioning the results obtained by other researchers, in this instance, Ingold et al. (<xref ref-type="bibr" rid="B13">2015</xref>) and Oostrom et al. (<xref ref-type="bibr" rid="B32">2016</xref>), involves a two-step process. First, skeptics must show that they can replicate the original findings. Second, they must show that those findings are due to a methodological artifact, in this instance, the absence of a dilemma in an SI question. Only questions that contain a dilemma warrant the designation of SI. These two steps were taken in the present research.</p>
<p>A third limitation concerns statistical power. The third and fourth experiments had adequate power, namely, above 80% to detect a medium effect size. However, previous studies on the ATIC and the SI reported small effect sizes (e.g., Klehe et al., <xref ref-type="bibr" rid="B17">2008</xref>). Hence, future studies should use large sample sizes to further explore the effect of a dilemma on the ATIC-SI relationship.</p>
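The power consideration above can be made concrete with a standard a priori power calculation for a two-group comparison. This is an illustrative sketch using statsmodels (not the authors' own power analysis): it solves for the per-group sample size needed for 80% power at &#x003B1; = 0.05, two-sided, for a medium (Cohen's d = 0.5) vs. a small (d = 0.2) effect.

```python
from statsmodels.stats.power import TTestIndPower

# A priori power analysis for an independent-groups comparison:
# solve for per-group n given effect size, alpha, and desired power.
analysis = TTestIndPower()
n_medium = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
n_small = analysis.solve_power(effect_size=0.2, power=0.8, alpha=0.05)
print(f"per-group n, d = 0.5: {n_medium:.0f}")   # ~64 per group
print(f"per-group n, d = 0.2: {n_small:.0f}")    # ~394 per group
```

The roughly six-fold jump in required sample size when moving from a medium to a small effect illustrates why studies of small ATIC effects need substantially larger samples.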
<p>A fourth arguable limitation is that the difficulty level of the questions in the dilemma vs. no-dilemma conditions was not held constant. If this criticism were appropriate, previous research comparing the criterion-related validity of structured vs. unstructured interviews, the latter often involving little more than a casual conversation, would have to be discounted. Structured interviews that are job-related are typically more difficult for a job applicant than participating in a free-flowing, unstructured discussion. Similarly, including a dilemma in an SI question makes it more difficult to answer than a question that lacks a dilemma. A dilemma is an inherent quality of the SI (Levashina et al., <xref ref-type="bibr" rid="B29">2014</xref>).</p>
<p>In summary, the issue underlying this paper is that many practitioners and researchers have ignored the recommendations of Latham (<xref ref-type="bibr" rid="B24">1989</xref>) for constructing SI questions. The primary finding of the present research is that ability to identify the criteria being assessed by an SI increases an individual&#x00027;s score only when the questions fail to contain a dilemma.</p>
</sec>
<sec sec-type="data-availability-statement" id="s9">
<title>Data Availability Statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="s10">
<title>Ethics Statement</title>
<p>The studies involving human participants were reviewed and approved by University of Toronto. The patients/participants provided their written informed consent to participate in this study.</p>
</sec>
<sec id="s11">
<title>Author Contributions</title>
<p>GL and GI contributed to the conception and design of the studies. GI performed the statistical analysis. The method and results were written jointly by the authors. GL wrote the introduction and discussion. All authors approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s12">
<title>Publisher&#x00027;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ack><p>We thank Paul Green and Tom Janz for their helpful comments on an earlier draft of this paper.</p>
</ack>
<sec sec-type="supplementary-material" id="s13">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fpsyg.2021.674815/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fpsyg.2021.674815/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.docx" id="SM1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bandura</surname> <given-names>A.</given-names></name></person-group> (<year>2000</year>). <article-title>Social cognitive theory: an agentic perspective</article-title>. <source>Ann. Rev. Psychol.</source> <volume>52</volume>, <fpage>1</fpage>&#x02013;<lpage>26</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.psych.52.1.1</pub-id><pub-id pub-id-type="pmid">11148297</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Campion</surname> <given-names>M. A.</given-names></name> <name><surname>Campion</surname> <given-names>J. E.</given-names></name> <name><surname>Hudson</surname> <given-names>J. P.</given-names></name></person-group> (<year>1994</year>). <article-title>Structured interviewing: a note on incremental validity and alternative question types</article-title>. <source>J. Appl. Psychol.</source> <volume>79</volume>, <fpage>998</fpage>&#x02013;<lpage>1002</lpage>. <pub-id pub-id-type="doi">10.1037/0021-9010.79.6.998</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Campion</surname> <given-names>M. A.</given-names></name> <name><surname>Palmer</surname> <given-names>D. K.</given-names></name> <name><surname>Campion</surname> <given-names>J. E.</given-names></name></person-group> (<year>1997</year>). <article-title>A review of structure in the selection interview</article-title>. <source>Pers. Psychol.</source>, <volume>50</volume>, <fpage>655</fpage>&#x02013;<lpage>702</lpage>.</citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cortina</surname> <given-names>J. M.</given-names></name> <name><surname>Goldstein</surname> <given-names>N. B.</given-names></name> <name><surname>Payne</surname> <given-names>S. C.</given-names></name> <name><surname>Davison</surname> <given-names>H. K.</given-names></name> <name><surname>Gilliland</surname> <given-names>S. W.</given-names></name></person-group> (<year>2000</year>). <article-title>The incremental validity of interview scores over and above cognitive ability and conscientiousness scores</article-title>. <source>Pers. Psychol.</source> <volume>53</volume>, <fpage>325</fpage>&#x02013;<lpage>351</lpage>. <pub-id pub-id-type="doi">10.1111/j.1744-6570.2000.tb00204.x</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Culbertson</surname> <given-names>S. S.</given-names></name> <name><surname>Weyhrauch</surname> <given-names>W. S.</given-names></name> <name><surname>Huffcutt</surname> <given-names>A. I.</given-names></name></person-group> (<year>2017</year>). <article-title>A tale of two formats: direct comparison of matching situational and behavior description interview questions</article-title>. <source>Human Resour. Manage. Rev.</source> <volume>27</volume>, <fpage>167</fpage>&#x02013;<lpage>177</lpage>. <pub-id pub-id-type="doi">10.1016/j.hrmr.2016.09.009</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Faul</surname> <given-names>F.</given-names></name> <name><surname>Erdfelder</surname> <given-names>E.</given-names></name> <name><surname>Lang</surname> <given-names>A. G.</given-names></name> <name><surname>Buchner</surname> <given-names>A.</given-names></name></person-group> (<year>2007</year>). <article-title>G&#x0002A;Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences</article-title>. <source>Behav. Res. Methods</source> <volume>39</volume>, <fpage>175</fpage>&#x02013;<lpage>191</lpage>. <pub-id pub-id-type="doi">10.3758/BF03193146</pub-id><pub-id pub-id-type="pmid">17695343</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Flanagan</surname> <given-names>J. C.</given-names></name></person-group> (<year>1954</year>). <article-title>The critical incident technique</article-title>. <source>Psychol. Bull.</source> <volume>51</volume>, <fpage>327</fpage>&#x02013;<lpage>360</lpage>. <pub-id pub-id-type="doi">10.1037/h0061470</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Griffin</surname> <given-names>B.</given-names></name></person-group> (<year>2014</year>). <article-title>The ability to identify criteria: its relationship with social understanding, preparation, and impression management in affecting predictor performance in a high-stakes selection context</article-title>. <source>Human Perform.</source> <volume>27</volume>, <fpage>147</fpage>&#x02013;<lpage>164</lpage>. <pub-id pub-id-type="doi">10.1080/08959285.2014.882927</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Guion</surname> <given-names>R. M.</given-names></name></person-group> (<year>1998</year>). <source>Assessment, Measurement, and Prediction for Personnel Decisions</source>. <publisher-loc>Mahwah, NJ</publisher-loc>: <publisher-name>Lawrence Erlbaum Associates Inc</publisher-name>., Publishers.</citation></ref>
<ref id="B10">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hayes</surname> <given-names>A. F.</given-names></name></person-group> (<year>2017</year>). <source>Introduction to Mediation, Moderation, and Conditional Process Analysis: A Regression-Based Approach</source>. <publisher-name>Guilford Publications</publisher-name>.</citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huffcutt</surname> <given-names>A. I.</given-names></name> <name><surname>Conway</surname> <given-names>J. M.</given-names></name> <name><surname>Roth</surname> <given-names>P. L.</given-names></name> <name><surname>Klehe</surname> <given-names>U. C.</given-names></name></person-group> (<year>2004</year>). <article-title>The impact of job complexity and study design on situational and behavior description interview validity</article-title>. <source>Int. J. Select. Assess.</source> <volume>12</volume>, <fpage>262</fpage>&#x02013;<lpage>273</lpage>. <pub-id pub-id-type="doi">10.1111/j.0965-075X.2004.280_1.x</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ingold</surname> <given-names>P. V.</given-names></name> <name><surname>Kleinmann</surname> <given-names>M.</given-names></name> <name><surname>K&#x000F6;nig</surname> <given-names>C. J.</given-names></name> <name><surname>Melchers</surname> <given-names>K. G.</given-names></name></person-group> (<year>2016</year>). <article-title>Transparency of assessment centers: lower criterion-related validity but greater opportunity to perform?</article-title> <source>Personnel Psychol.</source> <volume>69</volume>, <fpage>467</fpage>&#x02013;<lpage>497</lpage>. <pub-id pub-id-type="doi">10.1111/peps.12105</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ingold</surname> <given-names>P. V.</given-names></name> <name><surname>Kleinmann</surname> <given-names>M.</given-names></name> <name><surname>K&#x000F6;nig</surname> <given-names>C. J.</given-names></name> <name><surname>Melchers</surname> <given-names>K. G.</given-names></name> <name><surname>Van Iddekinge</surname> <given-names>C. H.</given-names></name></person-group> (<year>2015</year>). <article-title>Why do situational interviews predict job performance? The role of interviewees&#x00027; ability to identify criteria</article-title>. <source>J. Business Psychol.</source> <volume>30</volume>, <fpage>387</fpage>&#x02013;<lpage>398</lpage>. <pub-id pub-id-type="doi">10.1007/s10869-014-9368-3</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jansen</surname> <given-names>A.</given-names></name> <name><surname>Lievens</surname> <given-names>F.</given-names></name> <name><surname>Kleinmann</surname> <given-names>M.</given-names></name></person-group> (<year>2011</year>). <article-title>Do individual differences in perceiving situational demands moderate the relationship between personality and assessment center dimension ratings?</article-title> <source>Human Perform.</source> <volume>24</volume>, <fpage>231</fpage>&#x02013;<lpage>250</lpage>. <pub-id pub-id-type="doi">10.1080/08959285.2011.580805</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jansen</surname> <given-names>A.</given-names></name> <name><surname>Melchers</surname> <given-names>K. G.</given-names></name> <name><surname>Lievens</surname> <given-names>F.</given-names></name> <name><surname>Kleinmann</surname> <given-names>M.</given-names></name> <name><surname>Br&#x000E4;ndli</surname> <given-names>M.</given-names></name> <name><surname>Fraefel</surname> <given-names>L.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Situation assessment as an ignored factor in the behavioral consistency paradigm underlying the validity of personnel selection procedures</article-title>. <source>J. Appl. Psychol.</source> <volume>98</volume>, <fpage>326</fpage>&#x02013;<lpage>341</lpage>. <pub-id pub-id-type="doi">10.1037/a0031257</pub-id><pub-id pub-id-type="pmid">23244223</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Janz</surname> <given-names>T.</given-names></name></person-group> (<year>1989</year>). <article-title>The patterned behavior description interview: the best prophet of the future is the past</article-title>, in <source>The Employment Interview: Theory, Research, and Practice</source>, eds <person-group person-group-type="editor"><name><surname>Eder</surname> <given-names>R. W.</given-names></name> <name><surname>Ferris</surname> <given-names>G. R.</given-names></name></person-group> (<publisher-loc>Newbury Park, CA</publisher-loc>: <publisher-name>Sage</publisher-name>), <fpage>158</fpage>&#x02013;<lpage>168</lpage>.</citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Klehe</surname> <given-names>U.-C.</given-names></name> <name><surname>K&#x000F6;nig</surname> <given-names>C. J.</given-names></name> <name><surname>Richter</surname> <given-names>G. M.</given-names></name> <name><surname>Kleinmann</surname> <given-names>M.</given-names></name> <name><surname>Melchers</surname> <given-names>K. G.</given-names></name></person-group> (<year>2008</year>). <article-title>Transparency in structured interviews: Consequences for construct and criterion-related validity</article-title>. <source>Hum. Perform</source>. <volume>21</volume>, <fpage>107</fpage>&#x02013;<lpage>137</lpage>. <pub-id pub-id-type="doi">10.1080/08959280801917636</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Klehe</surname> <given-names>U. C.</given-names></name> <name><surname>Latham</surname> <given-names>G. P.</given-names></name></person-group> (<year>2005</year>). <article-title>The predictive and incremental validity of the situational and patterned behavior description interviews for teamplaying behavior</article-title>. <source>Int. J. Select. Assess.</source> <volume>13</volume>, <fpage>108</fpage>&#x02013;<lpage>115</lpage>. <pub-id pub-id-type="doi">10.1111/j.0965-075X.2005.00305.x</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Klehe</surname> <given-names>U. C.</given-names></name> <name><surname>Latham</surname> <given-names>G. P.</given-names></name></person-group> (<year>2006</year>). <article-title>What would you do &#x02013; really or ideally? Constructs underlying the behavior description interview and the situational interview in predicting typical versus maximum performance</article-title>. <source>Human Perform.</source> <volume>19</volume>, <fpage>357</fpage>&#x02013;<lpage>382</lpage>. <pub-id pub-id-type="doi">10.1207/s15327043hup1904_3</pub-id><pub-id pub-id-type="pmid">14655779</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kleinmann</surname> <given-names>M.</given-names></name></person-group> (<year>1993</year>). <article-title>Are rating dimensions in assessment centers transparent for participants? Consequences for criterion and construct validity</article-title>. <source>J. Appl. Psychol.</source> <volume>78</volume>, <fpage>988</fpage>&#x02013;<lpage>993</lpage>. <pub-id pub-id-type="doi">10.1037/0021-9010.78.6.988</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kleinmann</surname> <given-names>M.</given-names></name> <name><surname>Ingold</surname> <given-names>P. V.</given-names></name> <name><surname>Lievens</surname> <given-names>F.</given-names></name> <name><surname>Jansen</surname> <given-names>A.</given-names></name> <name><surname>Melchers</surname> <given-names>K. G.</given-names></name> <name><surname>K&#x000F6;nig</surname> <given-names>C. J.</given-names></name></person-group> (<year>2011</year>). <article-title>A different look at why selection procedures work: the role of candidates&#x00027; ability to identify criteria</article-title>. <source>Organ. Psychol. Rev.</source> <volume>1</volume>, <fpage>128</fpage>&#x02013;<lpage>146</lpage>. <pub-id pub-id-type="doi">10.1177/2041386610387000</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>K&#x000F6;nig</surname> <given-names>C. J.</given-names></name> <name><surname>Melchers</surname> <given-names>K. G.</given-names></name> <name><surname>Kleinmann</surname> <given-names>M.</given-names></name> <name><surname>Richter</surname> <given-names>G. M.</given-names></name> <name><surname>Klehe</surname> <given-names>U. C.</given-names></name></person-group> (<year>2007</year>). <article-title>Candidates&#x00027; ability to identify criteria in nontransparent selection procedures: evidence from an assessment center and a structured interview</article-title>. <source>Int. J. Select. Assess.</source> <volume>15</volume>, <fpage>283</fpage>&#x02013;<lpage>292</lpage>. <pub-id pub-id-type="doi">10.1111/j.1468-2389.2007.00388.x</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lakens</surname> <given-names>D.</given-names></name></person-group> (<year>2021</year>). <article-title>Sample size justification</article-title>. <source>PsyArXiv Preprints.</source> <pub-id pub-id-type="doi">10.31234/osf.io/9d3yf</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Latham</surname> <given-names>G. P.</given-names></name></person-group> (<year>1989</year>). <article-title>The reliability, validity, and practicality of the situational interview</article-title>, in <source>The Employment Interview: Theory, Research, and Practice</source>, eds <person-group person-group-type="editor"><name><surname>Eder</surname> <given-names>R. W.</given-names></name> <name><surname>Ferris</surname> <given-names>G. R.</given-names></name></person-group> (<publisher-loc>Newbury Park, CA</publisher-loc>: <publisher-name>Sage</publisher-name>), <fpage>169</fpage>&#x02013;<lpage>182</lpage>.</citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Latham</surname> <given-names>G. P.</given-names></name> <name><surname>Saari</surname> <given-names>L. M.</given-names></name></person-group> (<year>1979</year>). <article-title>Application of social-learning theory to training supervisors through behavioral modeling</article-title>. <source>J. Appl. Psychol.</source> <volume>64</volume>, <fpage>239</fpage>&#x02013;<lpage>246</lpage>. <pub-id pub-id-type="doi">10.1037/0021-9010.64.3.239</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Latham</surname> <given-names>G. P.</given-names></name> <name><surname>Saari</surname> <given-names>L. M.</given-names></name> <name><surname>Pursell</surname> <given-names>E. D.</given-names></name> <name><surname>Campion</surname> <given-names>M. A.</given-names></name></person-group> (<year>1980</year>). <article-title>The situational interview</article-title>. <source>J. Appl. Psychol.</source> <volume>65</volume>, <fpage>422</fpage>&#x02013;<lpage>427</lpage>. <pub-id pub-id-type="doi">10.1037/0021-9010.65.4.422</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Latham</surname> <given-names>G. P.</given-names></name> <name><surname>Sue-Chan</surname> <given-names>C.</given-names></name></person-group> (<year>1999</year>). <article-title>A meta-analysis of the situational interview: an enumerative review of reasons for its validity</article-title>. <source>Canad. Psychol/Psychol. Canad.</source> <volume>40</volume>, <fpage>56</fpage>&#x02013;<lpage>67</lpage>. <pub-id pub-id-type="doi">10.1037/h0086826</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Latham</surname> <given-names>G. P.</given-names></name> <name><surname>Wexley</surname> <given-names>K. N.</given-names></name></person-group> (<year>1977</year>). <article-title>Behavioral observation scales for performance appraisal purposes</article-title>. <source>Personnel Psychol.</source> <volume>30</volume>, <fpage>255</fpage>&#x02013;<lpage>268</lpage>. <pub-id pub-id-type="doi">10.1111/j.1744-6570.1977.tb02092.x</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Levashina</surname> <given-names>J.</given-names></name> <name><surname>Hartwell</surname> <given-names>C. J.</given-names></name> <name><surname>Morgeson</surname> <given-names>F. P.</given-names></name> <name><surname>Campion</surname> <given-names>M. A.</given-names></name></person-group> (<year>2014</year>). <article-title>The structured employment interview: narrative and quantitative review of the research literature</article-title>. <source>Pers. Psychol.</source> <volume>67</volume>, <fpage>241</fpage>&#x02013;<lpage>293</lpage>. <pub-id pub-id-type="doi">10.1111/peps.12052</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="book"><person-group person-group-type="editor"><name><surname>Locke</surname> <given-names>E. A.</given-names></name> <name><surname>Latham</surname> <given-names>G. P.</given-names></name></person-group> (eds.). (<year>2013</year>). <article-title>Goal setting theory: the current state</article-title>, in <source>New Developments in Goal Setting and Task Performance</source>, <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Routledge</publisher-name>, <fpage>623</fpage>&#x02013;<lpage>630</lpage>.</citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Melchers</surname> <given-names>K. G.</given-names></name> <name><surname>B&#x000F6;sser</surname> <given-names>D.</given-names></name> <name><surname>Hartstein</surname> <given-names>T.</given-names></name> <name><surname>Kleinmann</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Assessment of situational demands in a selection interview: reflective style or sensitivity?</article-title> <source>Int. J. Select. Assess.</source> <volume>20</volume>, <fpage>475</fpage>&#x02013;<lpage>485</lpage>. <pub-id pub-id-type="doi">10.1111/ijsa.12010</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oostrom</surname> <given-names>J. K.</given-names></name> <name><surname>Melchers</surname> <given-names>K. G.</given-names></name> <name><surname>Ingold</surname> <given-names>P. V.</given-names></name> <name><surname>Kleinmann</surname> <given-names>M.</given-names></name></person-group> (<year>2016</year>). <article-title>Why do situational interviews predict performance? Is it saying how you would behave or knowing how you should behave?</article-title> <source>J. Busin. Psychol.</source> <volume>31</volume>, <fpage>279</fpage>&#x02013;<lpage>291</lpage>. <pub-id pub-id-type="doi">10.1007/s10869-015-9410-0</pub-id><pub-id pub-id-type="pmid">27226697</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pulakos</surname> <given-names>E. D.</given-names></name> <name><surname>Schmitt</surname> <given-names>N.</given-names></name></person-group> (<year>1995</year>). <article-title>Experience-based and situational interview questions: studies of validity</article-title>. <source>Pers. Psychol.</source> <volume>48</volume>, <fpage>289</fpage>&#x02013;<lpage>308</lpage>. <pub-id pub-id-type="doi">10.1111/j.1744-6570.1995.tb01758.x</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sue-Chan</surname> <given-names>C.</given-names></name> <name><surname>Latham</surname> <given-names>G. P.</given-names></name></person-group> (<year>2004</year>). <article-title>The situational interview as a prediction of academic and team performance: a study of the mediating effects of cognitive ability and emotional intelligence</article-title>. <source>Int. J. Select. Assess.</source> <volume>12</volume>, <fpage>312</fpage>&#x02013;<lpage>320</lpage>. <pub-id pub-id-type="doi">10.1111/j.0965-075X.2004.00286.x</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Taylor</surname> <given-names>P.</given-names></name> <name><surname>Small</surname> <given-names>B.</given-names></name></person-group> (<year>2002</year>). <article-title>Asking applicants what they would do versus what they did do: a meta-analytic comparison of situational and past behavior employment interview questions</article-title>. <source>J. Occup. Organ. Psychol.</source> <volume>75</volume>, <fpage>277</fpage>&#x02013;<lpage>294</lpage>. <pub-id pub-id-type="doi">10.1348/096317902320369712</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ulrich</surname> <given-names>L.</given-names></name> <name><surname>Trumbo</surname> <given-names>D.</given-names></name></person-group> (<year>1965</year>). <article-title>The selection interview since 1949</article-title>. <source>Psychol. Bull.</source> <volume>63</volume>, <fpage>100</fpage>&#x02013;<lpage>116</lpage>. <pub-id pub-id-type="doi">10.1037/h0021696</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wagner</surname> <given-names>R.</given-names></name></person-group> (<year>1949</year>). <article-title>The employment interview: a critical summary</article-title>. <source>Pers. Psychol.</source> <volume>2</volume>, <fpage>17</fpage>&#x02013;<lpage>46</lpage>. <pub-id pub-id-type="doi">10.1111/j.1744-6570.1949.tb01669.x</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wernimont</surname> <given-names>P. F.</given-names></name> <name><surname>Campbell</surname> <given-names>J. P.</given-names></name></person-group> (<year>1968</year>). <article-title>Signs, samples, and criteria</article-title>. <source>J. Appl. Psychol.</source> <volume>52</volume>, <fpage>372</fpage>&#x02013;<lpage>376</lpage>. <pub-id pub-id-type="doi">10.1037/h0026244</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Williams</surname> <given-names>L. J.</given-names></name> <name><surname>Anderson</surname> <given-names>S. E.</given-names></name></person-group> (<year>1991</year>). <article-title>Job satisfaction and organizational commitment as predictors of organizational citizenship and in-role behaviors</article-title>. <source>J. Manage.</source> <volume>17</volume>, <fpage>601</fpage>&#x02013;<lpage>617</lpage>. <pub-id pub-id-type="doi">10.1177/014920639101700305</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn id="fn0001"><p><sup>1</sup>Another arguable advantage of including a dilemma in SI questions is that doing so increases their difficulty level for interviewees.</p></fn>
<fn id="fn0002"><p><sup>2</sup>In addition to using interview questions that did not include a dilemma, Pulakos and Schmitt (<xref ref-type="bibr" rid="B33">1995</xref>) also made the performance criterion transparent to the interviewees.</p></fn>
</fn-group>
</back>
</article>
