<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Educ.</journal-id>
<journal-title>Frontiers in Education</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Educ.</abbrev-journal-title>
<issn pub-type="epub">2504-284X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/feduc.2023.1120129</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Education</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>The construct validity of the main student selection tests for medical studies in Germany</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes"><name>
<surname>Levacher</surname>
<given-names>Julie</given-names>
</name><xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="c001" ref-type="corresp"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/2133429/overview"/>
</contrib>
<contrib contrib-type="author"><name>
<surname>Koch</surname>
<given-names>Marco</given-names>
</name><xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1402161/overview"/>
</contrib>
<contrib contrib-type="author"><name>
<surname>Stegt</surname>
<given-names>Stephan J.</given-names>
</name><xref rid="aff2" ref-type="aff"><sup>2</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/2117825/overview"/>
</contrib>
<contrib contrib-type="author"><name>
<surname>Hissbach</surname>
<given-names>Johanna</given-names>
</name><xref rid="aff3" ref-type="aff"><sup>3</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1508270/overview"/>
</contrib>
<contrib contrib-type="author"><name>
<surname>Spinath</surname>
<given-names>Frank M.</given-names>
</name><xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author"><name>
<surname>Escher</surname>
<given-names>Malvin</given-names>
</name><xref rid="aff4" ref-type="aff"><sup>4</sup></xref>
</contrib>
<contrib contrib-type="author"><name>
<surname>Becker</surname>
<given-names>Nicolas</given-names>
</name><xref rid="aff5" ref-type="aff"><sup>5</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1060110/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Individual Differences and Psychodiagnostics, Saarland University</institution>, <addr-line>Saarbrucken</addr-line>, <country>Germany</country></aff>
<aff id="aff2"><sup>2</sup><institution>ITB Consulting GmbH</institution>, <addr-line>Bonn</addr-line>, <country>Germany</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Biochemistry and Molecular Cell Biology, University Medical Center Hamburg-Eppendorf (UKE)</institution>, <addr-line>Hamburg</addr-line>, <country>Germany</country></aff>
<aff id="aff4"><sup>4</sup><institution>Faculty of Medicine, University Heidelberg</institution>, <addr-line>Heidelberg</addr-line>, <country>Germany</country></aff>
<aff id="aff5"><sup>5</sup><institution>Department of Individual Differences and Psychodiagnostics, Greifswald University</institution>, <addr-line>Greifswald</addr-line>, <country>Germany</country></aff>
<author-notes>
<fn id="fn0001" fn-type="edited-by">
<p>Edited by: Michael Sailer, Ludwig Maximilian University of Munich, Germany</p>
</fn>
<fn id="fn0002" fn-type="edited-by">
<p>Reviewed by: Matthias Ziegler, Humboldt University of Berlin, Germany; Florian G. Hartmann, Paris Lodron University Salzburg, Austria</p>
</fn>
<corresp id="c001">&#x002A;Correspondence: Julie Levacher, <email>julie.levacher@uni-saarland.de</email></corresp>
<fn id="fn0003" fn-type="other">
<p>This article was submitted to Assessment, Testing and Applied Measurement, a section of the journal Frontiers in Education</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>28</day>
<month>02</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>8</volume>
<elocation-id>1120129</elocation-id>
<history>
<date date-type="received">
<day>09</day>
<month>12</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>08</day>
<month>02</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2023 Levacher, Koch, Stegt, Hissbach, Spinath, Escher and Becker.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Levacher, Koch, Stegt, Hissbach, Spinath, Escher and Becker</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Standardized ability tests that are associated with intelligence are often used for student selection. In Germany, two different admission procedures are used simultaneously to select students for medical studies: the TMS and the HAM-Nat. Because both are in use at the same time, a detailed analysis of their construct validity is required. Therefore, the aim of this study is the construct validation of both selection procedures using data from 4,528 participants (<italic>M<sub>age</sub></italic>&#x2009;=&#x2009;20.42, <italic>SD</italic>&#x2009;=&#x2009;2.74) who took part in a preparation study under low-stakes conditions. The study compares different model specifications of the correlational structure of intelligence factors and examines the <italic>g</italic>-factor consistency of the admission tests. Results reveal that all subtests are substantially correlated. Furthermore, confirmatory factor analyses demonstrate that both admission tests (and their subtests) are related to <italic>g</italic> as well as to a further test-specific factor. Therefore, from a psychometric point of view, the simultaneous use of both student selection procedures appears to be legitimate.</p>
</abstract>
<kwd-group>
<kwd>student selection</kwd>
<kwd>cognitive ability</kwd>
<kwd>construct validity</kwd>
<kwd>psychometrics</kwd>
<kwd>g-factor</kwd>
</kwd-group>
<contract-num rid="cn1">01GK1801A</contract-num>
<contract-sponsor id="cn1">Ministry of Education<named-content content-type="fundref-id">10.13039/501100002701</named-content></contract-sponsor>
<counts>
<fig-count count="1"/>
<table-count count="4"/>
<equation-count count="0"/>
<ref-count count="26"/>
<page-count count="7"/>
<word-count count="5164"/>
</counts>
</article-meta>
</front>
<body>
<sec id="sec1" sec-type="intro">
<title>1. Introduction</title>
<p>Student selection procedures are usually used when there are more applicants than study places. This is especially the case for some courses of study, such as medicine in Germany.</p>
<p>In this context, specific aptitude tests measuring cognitive abilities and/or specific knowledge have been used as selection criteria for many years. Numerous studies indicate that cognitive abilities predict school performance (<xref ref-type="bibr" rid="ref19">Roth et al., 2015</xref>), educational attainment (<xref ref-type="bibr" rid="ref3">Deary et al., 2007</xref>), training success, job performance (<xref ref-type="bibr" rid="ref20">Schmidt and Hunter, 1998</xref>; <xref ref-type="bibr" rid="ref9">H&#x00FC;lsheger et al., 2007</xref>; <xref ref-type="bibr" rid="ref14">Kramer, 2009</xref>), and success in university studies (<xref ref-type="bibr" rid="ref6">Hell et al., 2007</xref>; <xref ref-type="bibr" rid="ref120">Schult et al., 2019</xref>). In general, intelligence can be defined as a broad cognitive ability that includes the understanding of complex ideas, adaptability to environmental conditions, learning from experience, and problem solving through analysis (<italic>cf.</italic> <xref ref-type="bibr" rid="ref17">Neisser et al., 1996</xref>). Concerning the construct validity of intelligence, <xref ref-type="bibr" rid="ref23">Spearman (1904)</xref> already noted that different indicators of cognitive ability usually show positive intercorrelations (i.e., positive manifold). This led him to the assumption that all intelligence tests are determined by one general factor (<italic>g</italic>) and that <italic>g</italic> in turn can be assessed by every intelligence test (i.e., indifference of indicators).
In current higher-order factor models, <italic>g</italic> is regarded as a factor standing at the apex of a hierarchy of intercorrelated subordinate group factors (<italic>cf.</italic> <xref ref-type="bibr" rid="ref10">Jensen, 1998</xref>; <xref ref-type="bibr" rid="ref16">McGrew, 2009</xref>), and there is considerable evidence that different intelligence tests tap the same general latent factor (<xref ref-type="bibr" rid="ref11">Johnson et al., 2004</xref>, <xref ref-type="bibr" rid="ref12">2008</xref>). Going beyond classical higher-order models, recent studies (<xref ref-type="bibr" rid="ref4">Gignac, 2006</xref>, <xref ref-type="bibr" rid="ref5">2008</xref>; <xref ref-type="bibr" rid="ref1">Brunner et al., 2012</xref>; <xref ref-type="bibr" rid="ref24">Valerius and Sparfeldt, 2014</xref>) argue that these models can be extended by nested factors that account for systematic residual variance not covered by <italic>g</italic>. The results of <xref ref-type="bibr" rid="ref24">Valerius and Sparfeldt (2014)</xref>, for example, show that a nested-factor model fitted better than a higher-order or general-factor model.</p>
<p>Due to the federal structure of the educational system in Germany, universities are free to decide on the selection criteria for their students. The current practice in medical studies is that universities use one of two tests explicitly developed for the selection of medical students (<xref ref-type="bibr" rid="ref22">Schwibbe et al., 2018</xref>): the Hamburger Naturwissenschaftstest (HAM-Nat; en. Hamburg Natural Science Test; <xref ref-type="bibr" rid="ref7">Hissbach et al., 2011</xref>) and the Test f&#x00FC;r medizinische Studieng&#x00E4;nge (TMS; en. Test for Medical Studies; <xref ref-type="bibr" rid="ref13">Kadmon et al., 2012</xref>). Within the scope of a nationwide research project (&#x201C;Studierendenauswahlverbund <italic>stav</italic>&#x201D;; en. student admission research network), the existing tests as well as three additional reasoning tests, which were developed within the stav, were examined under low- and high-stakes conditions. In 2020, the HAM-Nat consisted of four different scales measuring natural science knowledge as well as numerical, verbal, and figural reasoning. The three reasoning scales, which measure fluid intelligence, were added to the original HAM-Nat to enable a broader measurement of cognitive abilities beyond crystallized intelligence. Overall, 2,234 people participated in the 2:15&#x2009;h session at three universities in 2020. The TMS consists of eight specific modules measuring different cognitive abilities and has a total working time of 5:07&#x2009;h. It was used by 37 universities and received 37,092 applications in 2022.</p>
<p>Previous studies showed that the scores of both tests possess predictive validity and that their items have suitable psychometric properties in terms of internal consistency (<xref ref-type="bibr" rid="ref6">Hell et al., 2007</xref>; <xref ref-type="bibr" rid="ref7">Hissbach et al., 2011</xref>; <xref ref-type="bibr" rid="ref13">Kadmon et al., 2012</xref>; <xref ref-type="bibr" rid="ref25">Werwick et al., 2015</xref>; <xref ref-type="bibr" rid="ref120">Schult et al., 2019</xref>). As each of these studies deals with only one of the tests, there is currently no evidence on how the scores of the two tests relate to each other in terms of construct validity. This can be regarded as a research gap for three reasons: (1) With respect to the comparability of the selection procedures, it is important to know whether different universities apply different standards. (2) If both tests assess exactly the same construct, it would be more economical to use only one test. (3) Nested factors that are specific to each of the two tests could explain variance in study aptitude that is not covered by the other one. A combined test could therefore allow a better prediction of study success than either test alone.</p>
<p>This study is a first step toward closing this research gap. We provide first evidence concerning the construct validity of the scores of the two tests by using a large sample of applicants who completed a short version of the HAM-Nat science test plus the three reasoning tests (numerical, verbal, and figural) from the stav project,<xref rid="fn0004" ref-type="fn"><sup>1</sup></xref> as well as four of the eight subtests of the TMS. In the following, we refer to the four TMS modules as &#x201C;TMS&#x201D; and to the combination of the HAM-Nat science test and the three reasoning subtests as &#x201C;HAM-Nat.&#x201D; In doing so, we compared the following models, which are also presented in <xref rid="fig1" ref-type="fig">Figure 1</xref>:</p>
<list list-type="bullet">
<list-item>
<p>g-model: In a first step, we analysed the classical <italic>g</italic>-factor model in the sense of <xref ref-type="bibr" rid="ref23">Spearman (1904)</xref>. Here, all subtests load on a single general factor, and all other variance is regarded as measurement error.</p>
</list-item>
<list-item>
<p>HO-model: Taking higher-order factor models into account (<xref ref-type="bibr" rid="ref10">Jensen, 1998</xref>; <xref ref-type="bibr" rid="ref16">McGrew, 2009</xref>), we inspected a model with separate group factors that represent the shared variance of the subtests within the HAM-Nat or the TMS and give rise to a superordinate <italic>g</italic>-factor.</p>
</list-item>
<list-item>
<p>NF-model: The idea of a nested-factor structure (<xref ref-type="bibr" rid="ref4">Gignac, 2006</xref>, <xref ref-type="bibr" rid="ref5">2008</xref>; <xref ref-type="bibr" rid="ref1">Brunner et al., 2012</xref>; <xref ref-type="bibr" rid="ref24">Valerius and Sparfeldt, 2014</xref>) was evaluated by a model in which variance not bound by <italic>g</italic> is explained by specific factors within the HAM-Nat or the TMS.</p>
</list-item>
<list-item>
<p>TS-model: Following <xref ref-type="bibr" rid="ref11">Johnson et al. (2004</xref>, <xref ref-type="bibr" rid="ref12">2008)</xref>, we analysed a test-specific model with separate <italic>g</italic>-factors for the subtests of the HAM-Nat and the TMS. Shared variance between the tests is represented by the correlation between the test-specific factors.</p>
</list-item>
</list>
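<p>As a hedged illustration, the four specifications listed above can be written down in lavaan-style model syntax. The Python sketch below only assembles the syntax strings. The factor names (g, tms, ham), the helper names, and the way the HO-model is identified (both second-order loadings fixed to 1) are assumptions made for illustration and are not taken from the analyses reported in this article.</p>
<preformat><![CDATA[
```python
# Hypothetical sketch: lavaan-style syntax strings for the four models.
# Subtest abbreviations follow the article; factor names are assumptions.
TMS = ["DT", "TC", "QFP", "BMS"]
HAM = ["Nat", "FM", "APS", "RR"]


def loads(factor, indicators):
    """Return one measurement line, e.g. 'g =~ DT + TC'."""
    return f"{factor} =~ " + " + ".join(indicators)


def build_models(tms=TMS, ham=HAM):
    """Return a dict mapping model labels to lavaan-style syntax strings."""
    groups = [loads("tms", tms), loads("ham", ham)]
    return {
        # g-model: one general factor behind all subtests.
        "g": loads("g", tms + ham),
        # HO-model: two group factors that load on a superordinate g.
        # With only two first-order factors, g needs an extra constraint;
        # fixing both loadings to 1 is an assumption of this sketch.
        "HO": "\n".join(groups + ["g =~ 1*tms + 1*ham"]),
        # NF-model: g plus test-specific factors, all mutually uncorrelated.
        "NF": "\n".join(
            [loads("g", tms + ham)]
            + groups
            + ["g ~~ 0*tms", "g ~~ 0*ham", "tms ~~ 0*ham"]
        ),
        # TS-model: two test-specific factors that are allowed to correlate.
        "TS": "\n".join(groups + ["tms ~~ ham"]),
    }


if __name__ == "__main__":
    for label, syntax in build_models().items():
        print(f"--- {label} ---\n{syntax}\n")
```
]]></preformat>
<p>Each string could be passed to a structural equation modelling package that accepts this syntax. The explicit zero covariances in the NF-model keep <italic>g</italic> and the test-specific factors orthogonal.</p>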
<fig position="float" id="fig1"><label>Figure 1</label>
<caption>
<p>Confirmatory factor analyses for alternative models estimating general and specific factors of both admission tests.</p>
</caption>
<graphic xlink:href="feduc-08-1120129-g001.tif"/>
</fig>
</sec>
<sec id="sec2" sec-type="methods">
<title>2. Methods</title>
<sec id="sec3">
<title>2.1. Sample and procedure</title>
<p><xref rid="tab1" ref-type="table">Table 1</xref> shows the demographic details for the total sample as well as for the samples in the subtests. The total sample consisted of 4,537 participants with a mean age of 20.42&#x2009;years (<italic>SD</italic>&#x2009;=&#x2009;2.74, 16&#x2009;&#x2264;&#x2009;age&#x2009;&#x2264;&#x2009;56). All participants were registered for the high-stakes TMS carried out in 2021. Respondents received a link and completed an online practice test at home in an unsupervised setting.</p>
<table-wrap position="float" id="tab1"><label>Table 1</label>
<caption>
<p>Demographic variables and sample sizes for all subtests.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th><italic>N</italic></th>
<th align="center" valign="top" colspan="2">Age</th>
<th align="center" valign="top" colspan="4">Gender</th>
</tr>
<tr>
<th/>
<th/>
<th align="center" valign="top"><italic>M</italic></th>
<th align="center" valign="top"><italic>SD</italic></th>
<th align="center" valign="top">Female</th>
<th align="center" valign="top">Male</th>
<th align="center" valign="top">Diverse</th>
<th align="center" valign="top">Missing</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Overall</td>
<td align="center" valign="top">4,537</td>
<td align="char" valign="top" char=".">20.42</td>
<td align="char" valign="top" char=".">2.74</td>
<td align="center" valign="top">3,440</td>
<td align="center" valign="top">1,083</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">9</td>
</tr>
<tr>
<td align="left" valign="top">DT</td>
<td align="center" valign="top">3,250</td>
<td align="char" valign="top" char=".">20.36</td>
<td align="char" valign="top" char=".">2.66</td>
<td align="center" valign="top">2,451</td>
<td align="center" valign="top">794</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">1</td>
</tr>
<tr>
<td align="left" valign="top">TC</td>
<td align="center" valign="top">3,419</td>
<td align="char" valign="top" char=".">20.37</td>
<td align="char" valign="top" char=".">2.63</td>
<td align="center" valign="top">2,593</td>
<td align="center" valign="top">821</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">1</td>
</tr>
<tr>
<td align="left" valign="top">QFP</td>
<td align="center" valign="top">3,901</td>
<td align="char" valign="top" char=".">20.39</td>
<td align="char" valign="top" char=".">2.70</td>
<td align="center" valign="top">2,954</td>
<td align="center" valign="top">938</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">4</td>
</tr>
<tr>
<td align="left" valign="top">BMS</td>
<td align="center" valign="top">4,502</td>
<td align="char" valign="top" char=".">20.42</td>
<td align="char" valign="top" char=".">2.74</td>
<td align="center" valign="top">3,416</td>
<td align="center" valign="top">1,073</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">8</td>
</tr>
<tr>
<td align="left" valign="top">Nat</td>
<td align="center" valign="top">3,354</td>
<td align="char" valign="top" char=".">20.37</td>
<td align="char" valign="top" char=".">2.70</td>
<td align="center" valign="top">2,532</td>
<td align="center" valign="top">817</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">3</td>
</tr>
<tr>
<td align="left" valign="top">FM</td>
<td align="center" valign="top">1,532</td>
<td align="char" valign="top" char=".">20.36</td>
<td align="char" valign="top" char=".">2.66</td>
<td align="center" valign="top">1,140</td>
<td align="center" valign="top">390</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">1</td>
</tr>
<tr>
<td align="left" valign="top">APS</td>
<td align="center" valign="top">958</td>
<td align="char" valign="top" char=".">20.41</td>
<td align="char" valign="top" char=".">2.82</td>
<td align="center" valign="top">711</td>
<td align="center" valign="top">245</td>
<td align="center" valign="top">0</td>
<td align="center" valign="top">2</td>
</tr>
<tr>
<td align="left" valign="top">RR</td>
<td align="center" valign="top">1,252</td>
<td align="char" valign="top" char=".">20.22</td>
<td align="char" valign="top" char=".">2.52</td>
<td align="center" valign="top">962</td>
<td align="center" valign="top">289</td>
<td align="center" valign="top">0</td>
<td align="center" valign="top">1</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>N</italic>, sample size; <italic>M</italic>, mean; <italic>SD</italic>, standard deviation; DT, diagrams and tables; TC, text comprehension; QFP, quantitative and formal problems; BMS, basic understanding of medicine and the sciences; Nat, HAM-Nat science test; FM, figural matrices; APS, arithmetic problem solving; RR, relational reasoning.</p>
</table-wrap-foot>
</table-wrap>
<p>The test preparation study consisted of eight different subtests, all of which are used in admission tests for medicine (four of the eight TMS subtests and all four subtests of the HAM-Nat). While the HAM-Nat subtests were presented in random order, the TMS subtests were presented en bloc in the same order as under high-stakes conditions. The reason for this inconsistent approach is that we offered a cost-free preparation study to all participants registered for the TMS in 2021. The incentive to participate in our study was a practice condition as close as possible to the original test format. Therefore, the order of the TMS subtests was identical to that of the real selection test. However, in order to meet our research requirements and good practice in randomisation, we decided to present the individual HAM-Nat scales in random order. In contrast to the high-stakes conditions, all subtests were presented online. Respondents received detailed feedback on their results as a further incentive.</p>
</sec>
<sec id="sec4">
<title>2.2. Materials</title>
<p>Respondents completed eight different subtests. Of these, four (DT, TC, QFP, BMS) are subtests of the TMS, and the remaining four (Nat, FM, APS, RR) belong to the HAM-Nat. All tests are presented as multiple-choice questions.</p>
<p>Diagrams and tables (DT): Respondents are provided with data presented in tables or diagrams (e.g., a figure showing the relationship between blood clotting time and the number of platelets in patients with different diseases and therapies) and have to analyse them to infer specific information not directly presented in the material (e.g., find out whether the blood clotting time can be normal even if the number of platelets is severely reduced).</p>
<p>Text comprehension (TC): This subtest contains four longer scientific texts (e.g., about growth hormones, related control loops and feedback mechanisms) and six questions for each text concerning specific information that can be derived (e.g., An adult patient has an increased concentration of GH. According to the text, what factors can be the cause of it?). All questions can be answered without any prior knowledge.</p>
<p>Quantitative and formal problems (QFP): In this subtest respondents receive descriptions of complex arithmetic relations in a biomedical context and have to understand them in order to answer related questions (e.g., the formula of the energy charge E describing the energetic situation of a cell is explained. Then it must be calculated how the energy charge of a cell with certain proportions of ATP, ADP and AMP changes when the available ADP is converted into AMP).</p>
<p>Basic medical and scientific understanding (BMS): The aim of this subtest is to assess the ability to extract complex and demanding information from a text. Respondents receive texts dealing with medical and scientific topics (e.g., transport mechanisms for small ions) and have to decide which of several statements can be derived from the text (e.g., if a certain pharmaceutical agent inhibits transport of potassium ions into the extracellular space). In contrast to TC-tasks, the presented texts are shorter, and only one question per text has to be answered. Again, no prior knowledge is necessary to answer the questions.</p>
<p>HAM-Nat science test (Nat): The questions of this subtest deal with school knowledge in biology, chemistry, physics, and mathematics at the upper secondary school level relevant to the medical field. The questions can only be solved by using prior knowledge not included in the question (e.g., calculating the molar mass of acetonic acid when only the molecular formula and the molar masses are given).</p>
<p>Figural matrices (FM): The items of this subtest are 3&#x2009;&#x00D7;&#x2009;3 matrices filled with geometric symbols that follow certain design rules (e.g., symbols in the first and second cell of a row add up in the third cell). The last cell of the matrix is left empty, and respondents have to select the symbols which logically complete it.</p>
<p>Arithmetic problem solving (APS): In this subtest respondents receive short descriptions of arithmetic relations (e.g., After a price reduction of 20 percent, product A costs four times as much as product B, which costs 20 euros. How much did product A cost before the price reduction?).</p>
<p>Relational reasoning (RR): Respondents receive a set of premises (e.g., City A is larger than city C; City B is smallest; City D is smaller than city A) and have to integrate them logically to find answers to corresponding questions (e.g., Which is the biggest city?).</p>
</sec>
<sec id="sec5">
<title>2.3. Statistical analysis</title>
<p>All statistical analyses were carried out using R version 3.5.1. We computed Cronbach&#x2019;s alpha (&#x03B1;), item difficulties (<italic>p</italic>), and the part-whole corrected item-total correlations (<italic>r<sub>it</sub></italic>) for each of the eight subtests. Furthermore, all intercorrelations between the mean scores in the subtests were calculated. The construct validity models presented in the introduction were tested by conducting confirmatory factor analyses with the R package lavaan (<xref ref-type="bibr" rid="ref18">Rosseel, 2012</xref>) using the maximum likelihood estimator. We calculated the <italic>&#x03C7;</italic><sup>2</sup> goodness-of-fit statistic, the root mean square error of approximation (RMSEA), the standardized root mean square residual (SRMR), the comparative fit index (CFI), and the Tucker-Lewis Index (TLI). Following <xref ref-type="bibr" rid="ref8">Hu and Bentler (1999)</xref>, CFI values greater than 0.95, TLI values greater than 0.95, RMSEA values close to 0.06, and SRMR values smaller than 0.08 were regarded as indicators of good model fit. Akaike&#x2019;s Information Criterion (AIC) and the Bayesian Information Criterion (BIC) were used to compare the different construct validity models, with lower values indicating a better fit (<xref ref-type="bibr" rid="ref21">Schwarz, 1978</xref>). A difference in RMSEA between two models (&#x0394;RMSEA) greater than 0.015 was regarded as an additional indicator of a difference in model fit (<xref ref-type="bibr" rid="ref2">Chen, 2007</xref>).</p>
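<p>To make the fit criteria concrete, the following Python sketch computes RMSEA, CFI, and TLI from the chi-square values of a fitted model and of the baseline (independence) model, using their common textbook definitions, and then checks the cut-offs named above. The function names, the input values, the use of <italic>N</italic>&#x2009;&#x2212;&#x2009;1 in the RMSEA denominator, and the reading of &#x201C;close to 0.06&#x201D; as &#x201C;at most 0.06&#x201D; are assumptions of this sketch; the computations in lavaan may differ in detail.</p>
<preformat><![CDATA[
```python
import math


def fit_indices(chi2_m, df_m, chi2_b, df_b, n):
    """Compute RMSEA, CFI and TLI from model (m) and baseline (b) chi-squares.

    Assumption: RMSEA uses n - 1 in the denominator; some software uses n.
    """
    d_m = max(chi2_m - df_m, 0.0)  # noncentrality estimate of the model
    d_b = max(chi2_b - df_b, 0.0)  # noncentrality estimate of the baseline
    rmsea = math.sqrt(d_m / (df_m * (n - 1)))
    denom = max(d_m, d_b)
    cfi = 1.0 if denom == 0 else 1.0 - d_m / denom
    ratio_b = chi2_b / df_b
    tli = (ratio_b - chi2_m / df_m) / (ratio_b - 1.0)
    return {"rmsea": rmsea, "cfi": cfi, "tli": tli}


def meets_cutoffs(fit, srmr):
    """Check the cut-offs from the text ('close to 0.06' read as at most 0.06)."""
    return (
        fit["cfi"] > 0.95
        and fit["tli"] > 0.95
        and fit["rmsea"] <= 0.06
        and srmr < 0.08
    )


if __name__ == "__main__":
    # Hypothetical input values, not results of this study.
    fit = fit_indices(chi2_m=60.0, df_m=20, chi2_b=2000.0, df_b=28, n=1000)
    print(fit, meets_cutoffs(fit, srmr=0.03))
```
]]></preformat>
<p>SRMR is passed in separately because it is computed from the residual correlation matrix and not from the chi-square statistic.</p>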
</sec>
</sec>
<sec id="sec6" sec-type="results">
<title>3. Results</title>
<sec id="sec7">
<title>3.1. Item statistics, internal consistency, correlations</title>
<p>The descriptive statistics for the item difficulties and item-total correlations of the subtests as well as the internal consistencies can be found in <xref rid="tab2" ref-type="table">Table 2</xref>. The items of all subtests cover a wide range of difficulties, and the item-total correlations as well as the internal consistencies can be described as acceptable.</p>
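<p>How these item-level statistics are defined can be sketched in a few lines of Python, assuming dichotomously scored items (0&#x2009;=&#x2009;incorrect, 1&#x2009;=&#x2009;correct) without missing responses. The function names and the toy data are hypothetical; the analyses reported here were run in R.</p>
<preformat><![CDATA[
```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation of two equally long sequences (nonzero variance)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_statistics(responses):
    """responses: list of persons, each a list of 0/1 item scores."""
    k = len(responses[0])
    items = [[person[j] for person in responses] for j in range(k)]
    totals = [sum(person) for person in responses]
    # Item difficulty p: proportion of correct responses per item.
    p = [mean(item) for item in items]
    # Part-whole correction: correlate each item with the sum of the other items.
    r_it = [pearson(item, [t - s for t, s in zip(totals, item)]) for item in items]
    # Cronbach's alpha from the item variances and the total-score variance.
    item_var_sum = sum(variance(item) for item in items)
    alpha = k / (k - 1) * (1 - item_var_sum / variance(totals))
    return p, r_it, alpha


if __name__ == "__main__":
    # Toy data for illustration only: six persons, three items.
    toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
    print(item_statistics(toy))
```
]]></preformat>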
<table-wrap position="float" id="tab2"><label>Table 2</label>
<caption>
<p>Number of items of all test parts as well as Cronbach&#x2019;s alpha, item difficulties and item-total correlations.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="center" valign="top" colspan="4">TMS</th>
<th align="center" valign="top" colspan="4">HAM-Nat</th>
</tr>
<tr>
<th/>
<th align="center" valign="top">DT</th>
<th align="center" valign="top">TC</th>
<th align="center" valign="top">QFP</th>
<th align="center" valign="top">BMS</th>
<th align="center" valign="top">Nat</th>
<th align="center" valign="top">FM</th>
<th align="center" valign="top">APS</th>
<th align="center" valign="top">RR</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top"># items</td>
<td align="center" valign="top">24</td>
<td align="center" valign="top">24</td>
<td align="center" valign="top">24</td>
<td align="center" valign="top">24</td>
<td align="center" valign="top">20</td>
<td align="center" valign="top">28</td>
<td align="center" valign="top">16</td>
<td align="center" valign="top">16</td>
</tr>
<tr>
<td align="left" valign="top"><italic>M(p)</italic></td>
<td align="char" valign="top" char=".">0.59</td>
<td align="char" valign="top" char=".">0.60</td>
<td align="char" valign="top" char=".">0.55</td>
<td align="char" valign="top" char=".">0.56</td>
<td align="char" valign="top" char=".">0.45</td>
<td align="char" valign="top" char=".">0.55</td>
<td align="char" valign="top" char=".">0.59</td>
<td align="char" valign="top" char=".">0.66</td>
</tr>
<tr>
<td align="left" valign="top"><italic>SD(p)</italic></td>
<td align="char" valign="top" char=".">0.17</td>
<td align="char" valign="top" char=".">0.14</td>
<td align="char" valign="top" char=".">0.15</td>
<td align="char" valign="top" char=".">0.21</td>
<td align="char" valign="top" char=".">0.11</td>
<td align="char" valign="top" char=".">0.09</td>
<td align="char" valign="top" char=".">0.17</td>
<td align="char" valign="top" char=".">0.18</td>
</tr>
<tr>
<td align="left" valign="top">Range (<italic>p</italic>)</td>
<td align="char" valign="top" char=".">0.19; 0.86</td>
<td align="char" valign="top" char=".">0.30; 0.85</td>
<td align="char" valign="top" char=".">0.29; 0.91</td>
<td align="char" valign="top" char=".">0.24; 0.95</td>
<td align="char" valign="top" char=".">0.24; 0.61</td>
<td align="char" valign="top" char=".">0.40; 0.78</td>
<td align="char" valign="top" char=".">0.26; 0.87</td>
<td align="char" valign="top" char=".">0.24; 0.93</td>
</tr>
<tr>
<td align="left" valign="top"><italic>M(r<sub>it</sub>)</italic></td>
<td align="char" valign="top" char=".">0.32</td>
<td align="char" valign="top" char=".">0.36</td>
<td align="char" valign="top" char=".">0.34</td>
<td align="char" valign="top" char=".">0.29</td>
<td align="char" valign="top" char=".">0.30</td>
<td align="char" valign="top" char=".">0.56</td>
<td align="char" valign="top" char=".">0.37</td>
<td align="char" valign="top" char=".">0.33</td>
</tr>
<tr>
<td align="left" valign="top"><italic>SD(r<sub>it</sub>)</italic></td>
<td align="char" valign="top" char=".">0.05</td>
<td align="char" valign="top" char=".">0.06</td>
<td align="char" valign="top" char=".">0.07</td>
<td align="char" valign="top" char=".">0.08</td>
<td align="char" valign="top" char=".">0.08</td>
<td align="char" valign="top" char=".">0.08</td>
<td align="char" valign="top" char=".">0.09</td>
<td align="char" valign="top" char=".">0.07</td>
</tr>
<tr>
<td align="left" valign="top">Range (<italic>r</italic><sub>it</sub>)</td>
<td align="char" valign="top" char=".">0.21; 0.40</td>
<td align="char" valign="top" char=".">0.20; 0.44</td>
<td align="char" valign="top" char=".">0.22; 0.49</td>
<td align="char" valign="top" char=".">0.14; 0.40</td>
<td align="char" valign="top" char=".">0.04; 0.44</td>
<td align="char" valign="top" char=".">0.39; 0.70</td>
<td align="char" valign="top" char=".">0.27; 0.50</td>
<td align="char" valign="top" char=".">0.27; 0.47</td>
</tr>
<tr>
<td align="left" valign="top">Cronbach&#x2019;s &#x03B1;</td>
<td align="char" valign="top" char=".">0.78</td>
<td align="char" valign="top" char=".">0.82</td>
<td align="char" valign="top" char=".">0.81</td>
<td align="char" valign="top" char=".">0.75</td>
<td align="char" valign="top" char=".">0.74</td>
<td align="char" valign="top" char=".">0.93</td>
<td align="char" valign="top" char=".">0.78</td>
<td align="char" valign="top" char=".">0.73</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>p</italic>, item difficulty; <italic>M(p)</italic>, mean item difficulty; <italic>SD(p)</italic>, standard deviation of the item difficulties; <italic>r<sub>it</sub></italic>, part-whole corrected item-total correlation; <italic>M(r<sub>it</sub>)</italic>, mean item-total correlation; <italic>SD(r<sub>it</sub>)</italic>, standard deviation of the item-total correlations; DT, diagrams and tables; TC, text comprehension; QFP, quantitative and formal problems; BMS, basic understanding of medicine and the sciences; Nat, HAM-Nat science test; FM, figural matrices; APS, arithmetic problem solving; RR, relational reasoning.</p>
</table-wrap-foot>
</table-wrap>
<p>The correlations between the sum scores of the subtests are presented in <xref rid="tab3" ref-type="table">Table 3</xref>. All subtests correlate substantially and significantly with one another (0.25&#x2009;&#x2264;&#x2009;<italic>r</italic>&#x2009;&#x2264;&#x2009;0.67). The highest mean correlation is found among the subtests of the TMS (<italic>M(r)</italic>&#x2009;=&#x2009;0.62), while lower mean correlations are found among the HAM-Nat subtests (<italic>M(r)</italic>&#x2009;=&#x2009;0.46) and between the HAM-Nat and the TMS subtests (<italic>M(r)</italic>&#x2009;=&#x2009;0.38).</p>
<table-wrap position="float" id="tab3"><label>Table 3</label>
<caption>
<p>Correlations between the sum scores of all subtests.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="center" valign="top">DT</th>
<th align="center" valign="top">TC</th>
<th align="center" valign="top">QFP</th>
<th align="center" valign="top">BMS</th>
<th align="center" valign="top">Nat</th>
<th align="center" valign="top">FM</th>
<th align="center" valign="top">APS</th>
<th align="center" valign="top">RR</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">DT</td>
<td/>
<td align="center" valign="top">3,169</td>
<td align="center" valign="top">3,222</td>
<td align="center" valign="top">3,241</td>
<td align="center" valign="top">2,799</td>
<td align="center" valign="top">1,340</td>
<td align="center" valign="top">816</td>
<td align="center" valign="top">1,063</td>
</tr>
<tr>
<td align="left" valign="top">TC</td>
<td align="char" valign="top" char=".">0.62&#x002A;</td>
<td/>
<td align="center" valign="top">3,396</td>
<td align="center" valign="top">3,410</td>
<td align="center" valign="top">2,871</td>
<td align="center" valign="top">1,374</td>
<td align="center" valign="top">821</td>
<td align="center" valign="top">1,077</td>
</tr>
<tr>
<td align="left" valign="top">QFP</td>
<td align="char" valign="top" char=".">0.63&#x002A;</td>
<td align="center" valign="top">0.57&#x002A;</td>
<td/>
<td align="center" valign="top">3,882</td>
<td align="center" valign="top">3,098</td>
<td align="center" valign="top">1,446</td>
<td align="center" valign="top">892</td>
<td align="center" valign="top">1,162</td>
</tr>
<tr>
<td align="left" valign="top">BMS</td>
<td align="char" valign="top" char=".">0.61&#x002A;</td>
<td align="center" valign="top">0.67&#x002A;</td>
<td align="center" valign="top">0.60&#x002A;</td>
<td/>
<td align="center" valign="top">3,331</td>
<td align="center" valign="top">1,526</td>
<td align="center" valign="top">946</td>
<td align="center" valign="top">1,243</td>
</tr>
<tr>
<td align="left" valign="top">Nat</td>
<td align="char" valign="top" char=".">0.40&#x002A;</td>
<td align="center" valign="top">0.41&#x002A;</td>
<td align="center" valign="top">0.49&#x002A;</td>
<td align="center" valign="top">0.37&#x002A;</td>
<td/>
<td align="center" valign="top">1,387</td>
<td align="center" valign="top">865</td>
<td align="center" valign="top">1,118</td>
</tr>
<tr>
<td align="left" valign="top">FM</td>
<td align="char" valign="top" char=".">0.34&#x002A;</td>
<td align="center" valign="top">0.29&#x002A;</td>
<td align="center" valign="top">0.36&#x002A;</td>
<td align="center" valign="top">0.25&#x002A;</td>
<td align="center" valign="top">0.36&#x002A;</td>
<td/>
<td align="center" valign="top">435</td>
<td align="center" valign="top">583</td>
</tr>
<tr>
<td align="left" valign="top">APS</td>
<td align="char" valign="top" char=".">0.51&#x002A;</td>
<td align="center" valign="top">0.43&#x002A;</td>
<td align="center" valign="top">0.53&#x002A;</td>
<td align="center" valign="top">0.40&#x002A;</td>
<td align="center" valign="top">0.55&#x002A;</td>
<td align="center" valign="top">0.50&#x002A;</td>
<td/>
<td align="center" valign="top">570</td>
</tr>
<tr>
<td align="left" valign="top">RR</td>
<td align="char" valign="top" char=".">0.36&#x002A;</td>
<td align="center" valign="top">0.32&#x002A;</td>
<td align="center" valign="top">0.28&#x002A;</td>
<td align="center" valign="top">0.30&#x002A;</td>
<td align="center" valign="top">0.42&#x002A;</td>
<td align="center" valign="top">0.40&#x002A;</td>
<td align="center" valign="top">0.56&#x002A;</td>
<td/>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><sup>&#x002A;</sup><italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; Pearson correlations with pairwise deletion; correlations are shown below the diagonal and sample sizes above the diagonal; DT, diagrams and tables; TC, text comprehension; QFP, quantitative and formal problems; BMS, basic understanding of medicine and the sciences; Nat, HAM-Nat science test; FM, figural matrices; APS, arithmetic problem solving; RR, relational reasoning.</p>
</table-wrap-foot>
</table-wrap>
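<p>As a transparency check, the mean correlations within and between the two subtest groups can be recomputed from the entries below the diagonal of <xref rid="tab3" ref-type="table">Table 3</xref>. The following is a minimal sketch in Python and not part of the original analysis; the names <monospace>R</monospace>, <monospace>TMS</monospace>, <monospace>HAM_NAT</monospace> and <monospace>mean_r</monospace> are illustrative assumptions. Because the rounded two-decimal table entries serve as input, results may differ slightly from values computed on the raw data.</p>
<preformat>
```python
from statistics import mean

# Subtest groups, labelled as in Table 3
TMS = ["DT", "TC", "QFP", "BMS"]
HAM_NAT = ["Nat", "FM", "APS", "RR"]

# Correlations below the diagonal of Table 3, keyed as (row, column)
R = {
    ("TC", "DT"): 0.62,
    ("QFP", "DT"): 0.63, ("QFP", "TC"): 0.57,
    ("BMS", "DT"): 0.61, ("BMS", "TC"): 0.67, ("BMS", "QFP"): 0.60,
    ("Nat", "DT"): 0.40, ("Nat", "TC"): 0.41, ("Nat", "QFP"): 0.49,
    ("Nat", "BMS"): 0.37,
    ("FM", "DT"): 0.34, ("FM", "TC"): 0.29, ("FM", "QFP"): 0.36,
    ("FM", "BMS"): 0.25, ("FM", "Nat"): 0.36,
    ("APS", "DT"): 0.51, ("APS", "TC"): 0.43, ("APS", "QFP"): 0.53,
    ("APS", "BMS"): 0.40, ("APS", "Nat"): 0.55, ("APS", "FM"): 0.50,
    ("RR", "DT"): 0.36, ("RR", "TC"): 0.32, ("RR", "QFP"): 0.28,
    ("RR", "BMS"): 0.30, ("RR", "Nat"): 0.42, ("RR", "FM"): 0.40,
    ("RR", "APS"): 0.56,
}


def mean_r(group_a, group_b):
    """Mean of all tabled correlations pairing a subtest of group_a with one of group_b."""
    values = [
        r for (row, col), r in R.items()
        if (row in group_a and col in group_b) or (row in group_b and col in group_a)
    ]
    return mean(values)


if __name__ == "__main__":
    print("within TMS:     ", mean_r(TMS, TMS))
    print("within HAM-Nat: ", mean_r(HAM_NAT, HAM_NAT))
    print("between tests:  ", mean_r(TMS, HAM_NAT))
```
</preformat>
<p>Here <monospace>mean_r(TMS, TMS)</monospace> averages the six correlations among the TMS subtests, <monospace>mean_r(HAM_NAT, HAM_NAT)</monospace> the six correlations among the HAM-Nat subtests, and <monospace>mean_r(TMS, HAM_NAT)</monospace> the 16 correlations between subtests of different tests.</p>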
</sec>
<sec id="sec8">
<title>3.2. Results of the confirmatory factor analyses</title>
<p>The results of the confirmatory factor analyses are presented in <xref rid="fig1" ref-type="fig">Figure 1</xref>. All factor loadings and (where applicable) latent correlations were significant and substantial. The fit indices of the four tested models as well as McDonald&#x2019;s omega (&#x03C9;) of their latent variables are shown in <xref rid="tab4" ref-type="table">Table 4</xref>. The <italic>&#x03C7;</italic><sup>2</sup> goodness-of-fit statistic was significant for all models. The fit indices (CFI, TLI, RMSEA, SRMR) for the G-model and the HO-model indicate model misfit, whereas they predominantly stayed within the cut-offs for the other two models (TS-model and NF-model), with the NF-model showing an excellent fit. A comparison of the information criteria (AIC, BIC) reveals that they were lowest for the NF-model, followed by the TS-, the HO-, and the G-model. The same pattern is revealed by inspecting the &#x0394;RMSEA (see <xref rid="tab4" ref-type="table">Table 4</xref>). Thus, it seems that both tests are indicators of a general intelligence factor. Intelligence can therefore be inferred from both tests, which supports their construct validity.</p>
<table-wrap position="float" id="tab4"><label>Table 4</label>
<caption>
<p>Fit of the four tested models.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="center" valign="top"><italic>&#x03A7;</italic><sup>2</sup></th>
<th align="center" valign="top"><italic>df</italic></th>
<th align="center" valign="top"><italic>p</italic></th>
<th align="center" valign="top">CFI</th>
<th align="center" valign="top">TLI</th>
<th align="center" valign="top">RMSEA</th>
<th align="center" valign="top">SRMR</th>
<th align="center" valign="top">AIC</th>
<th align="center" valign="top">BIC</th>
<th align="center" valign="top">&#x0394; RMSEA</th>
<th align="center" valign="top" colspan="3">&#x03C9;</th>
</tr>
<tr>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
<th/>
<th align="center" valign="top"><italic>g</italic></th>
<th align="center" valign="top">TMS</th>
<th align="center" valign="top">HAM-Nat</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">G-model</td>
<td align="center" valign="top">2,520.84</td>
<td align="center" valign="top">20</td>
<td align="char" valign="top" char=".">&#x003C;0.001</td>
<td align="char" valign="top" char=".">0.843</td>
<td align="char" valign="top" char=".">0.781</td>
<td align="char" valign="top" char=".">0.166</td>
<td align="char" valign="top" char=".">0.074</td>
<td align="center" valign="top">198,285</td>
<td align="center" valign="top">198,387</td>
<td/>
<td align="char" valign="top" char=".">0.85</td>
<td/>
<td/>
</tr>
<tr>
<td align="left" valign="top">HO-model</td>
<td align="center" valign="top">745.32</td>
<td align="center" valign="top">17</td>
<td align="char" valign="top" char=".">&#x003C;0.001</td>
<td align="char" valign="top" char=".">0.954</td>
<td align="char" valign="top" char=".">0.925</td>
<td align="char" valign="top" char=".">0.097</td>
<td align="char" valign="top" char=".">0.037</td>
<td align="center" valign="top">196,516</td>
<td align="center" valign="top">196,638</td>
<td align="char" valign="top" char=".">0.069</td>
<td/>
<td align="char" valign="top" char=".">0.85</td>
<td align="char" valign="top" char=".">0.73</td>
</tr>
<tr>
<td align="left" valign="top">TS-model</td>
<td align="center" valign="top">745.32</td>
<td align="center" valign="top">19</td>
<td align="char" valign="top" char=".">&#x003C;0.001</td>
<td align="char" valign="top" char=".">0.955</td>
<td align="char" valign="top" char=".">0.933</td>
<td align="char" valign="top" char=".">0.092</td>
<td align="char" valign="top" char=".">0.037</td>
<td align="center" valign="top">196,511</td>
<td align="center" valign="top">196,620</td>
<td align="char" valign="top" char=".">0.005</td>
<td/>
<td align="char" valign="top" char=".">0.85</td>
<td align="char" valign="top" char=".">0.73</td>
</tr>
<tr>
<td align="left" valign="top">NF-model</td>
<td align="center" valign="top">324.68</td>
<td align="center" valign="top">12</td>
<td align="char" valign="top" char=".">&#x003C;0.001</td>
<td align="char" valign="top" char=".">0.980</td>
<td align="char" valign="top" char=".">0.954</td>
<td align="char" valign="top" char=".">0.076</td>
<td align="char" valign="top" char=".">0.022</td>
<td align="center" valign="top">196,104</td>
<td align="center" valign="top">196,259</td>
<td align="char" valign="top" char=".">0.016</td>
<td align="char" valign="top" char=".">0.71</td>
<td align="char" valign="top" char=".">0.26</td>
<td align="char" valign="top" char=".">0.23</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>df, degrees of freedom; CFI, comparative fit index; TLI, Tucker&#x2013;Lewis index; RMSEA, root mean square error of approximation; SRMR, standardized root mean square residual; AIC, Akaike information criterion; BIC, Bayesian information criterion; &#x03C9;, McDonald&#x2019;s omega.</p>
</table-wrap-foot>
</table-wrap>
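<p>The model comparison based on <xref rid="tab4" ref-type="table">Table 4</xref> can be retraced with a few lines of Python. The sketch below is illustrative only and rests on one assumption that is labelled in the code: &#x0394;RMSEA is taken to be the difference between the RMSEA of a model and that of the model listed directly above it, which reproduces the tabled values. The names <monospace>MODELS</monospace>, <monospace>delta_rmsea</monospace> and <monospace>rank_by</monospace> are hypothetical helpers.</p>
<preformat>
```python
# Fit statistics transcribed from Table 4, in the row order of the table
MODELS = [
    ("G-model", {"rmsea": 0.166, "aic": 198285, "bic": 198387}),
    ("HO-model", {"rmsea": 0.097, "aic": 196516, "bic": 196638}),
    ("TS-model", {"rmsea": 0.092, "aic": 196511, "bic": 196620}),
    ("NF-model", {"rmsea": 0.076, "aic": 196104, "bic": 196259}),
]


def delta_rmsea(models):
    """RMSEA difference between each model and the one listed directly above it.

    Assumption: this is how the Delta RMSEA column of Table 4 is defined.
    """
    deltas = {}
    for (_, previous), (name, current) in zip(models, models[1:]):
        deltas[name] = round(previous["rmsea"] - current["rmsea"], 3)
    return deltas


def rank_by(models, criterion):
    """Model names ordered from the lowest (preferred) to the highest criterion value."""
    ordered = sorted(models, key=lambda model: model[1][criterion])
    return [name for name, _ in ordered]


if __name__ == "__main__":
    print(delta_rmsea(MODELS))
    print(rank_by(MODELS, "aic"))
    print(rank_by(MODELS, "bic"))
```
</preformat>
<p>Ranking by <monospace>"aic"</monospace> and by <monospace>"bic"</monospace> yields the same order of models, so both information criteria point to the same preferred model.</p>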
</sec>
</sec>
<sec id="sec9" sec-type="discussions">
<title>4. Discussion</title>
<p>The goal of this study was to provide first insights concerning the construct validity of the scores from the existing admission tests developed for the selection of medical students in Germany.</p>
<p>Our findings are based on a large sample of respondents who completed a broad and representative set of subtests included in the two tests. The basic psychometric properties of the subtests (difficulty, item-total correlation, internal consistency) demonstrate the suitability of the database for further analyses. The intercorrelations between the sum scores of the subtests clearly show a positive manifold. In addition, the correlations among the subtests within the HAM-Nat and within the TMS were higher than the correlations between the two tests. The confirmatory factor analyses reveal a more differentiated picture. For the models comprising only a single general factor (G-model) or a higher-order structure in which test-specific group factors give rise to a general factor (HO-model), the fit indices fell considerably short of the respective cut-offs. The models containing test-specific factors that are independent of a general factor (TS-model, NF-model) showed better fit indices. This is in line with our results concerning the information criteria, which also favour the models with independent test-specific factors.</p>
<p>The results of our study are in line with the previous literature dealing with the construct validity of intelligence. The positive manifold of the subtests of the HAM-Nat and the TMS shows that they share a substantial amount of variance. This corresponds with <xref ref-type="bibr" rid="ref23">Spearman&#x2019;s (1904)</xref> idea of <italic>g</italic> and with higher-order factor models (e.g., <xref ref-type="bibr" rid="ref10">Jensen, 1998</xref>; <xref ref-type="bibr" rid="ref16">McGrew, 2009</xref>) that conceptualize a general intellectual ability that is independent of the test used to assess it. At the same time, the better fit of the models with test-specific factors shows that a considerable amount of systematic variance exists that is not shared by the two tests. Therefore, our results support the recent literature that found evidence for test-specific ability factors beyond <italic>g</italic> (e.g., <xref ref-type="bibr" rid="ref1">Brunner et al., 2012</xref>; <xref ref-type="bibr" rid="ref24">Valerius and Sparfeldt, 2014</xref>).</p>
<p>It should be noted that this study was conducted under low-stakes conditions in an unsupervised setting. Participants may therefore have spent more time on each subtest, used prohibited aids (e.g., calculators, note-taking), or been less motivated overall than under high-stakes conditions. Both tests might thus represent the construct even better under high-stakes conditions, where participants work in a more focused manner, which could reduce error variance and increase the validity of the test scores. On the other hand, people are better prepared under high-stakes conditions, and precisely this preparation could influence the test results and increase error variance. The higher motivation among participants can also influence the results (<xref ref-type="bibr" rid="ref15">Levacher et al., 2021</xref>). To account for this possibility, an attempt was made to increase the motivation of the participants: the study offered an opportunity to prepare for the high-stakes test, and the results were reported back as an incentive. Additionally, all participants in this study were drawn from the database of people registered for the TMS. They may therefore not have prepared for the HAM-Nat, which could have affected their motivation to complete this part of the assessment. For this reason, it was decided in advance to present only very easy items of the Nat, which may also have influenced the results. With regard to the TMS, comparability with the full TMS may be impaired, because only four of the eight subscales were administered and the items used had previously been published in preparation books, so some participants may already have known them.</p>
<p>With respect to the current selection practice for medical students in Germany, it is noteworthy that the large amount of variance shared by the two tests indicates that universities using either the HAM-Nat or the TMS do not apply entirely different standards. Nevertheless, neglecting the specific ability aspects not shared by the two tests could result in a loss of valuable information. It could therefore be reasonable to combine both tests or at least parts of them.</p>
<p>To this end, a further study should examine the test-specific variances in more detail with regard to predictive validity. For this purpose, a regression should be estimated with study success as the criterion and the variance of the <italic>g</italic> factor as well as the two test-specific variances (HAM-Nat and TMS) as predictors. In this way, it can be analysed whether test-specific variance predicts study success in addition to <italic>g</italic>, or whether it merely reflects method variance.</p>
</sec>
<sec id="sec10" sec-type="conclusions">
<title>5. Conclusion</title>
<p>Taken together, both subtest groups under study (TMS and HAM-Nat) seem to measure a very similar cognitive ability, despite being based on different theoretical concepts. It can be assumed that both subtest groups are related to <italic>g</italic> as well as to a further test-specific factor. Even if the two specific factors are correlated, specific non-shared parts remain. Therefore, the non-shared variance of each test should be analysed further by including university grades in our models to investigate its incremental validity. Based on our findings, the parallel use of both procedures for the selection of students seems legitimate from a test-theoretical point of view.</p>
</sec>
<sec id="sec11" sec-type="data-availability">
<title>Data availability statement</title>
<p>The data analyzed in this study is subject to the following licenses/restrictions: Due to data privacy restrictions of the stav, data cannot be shared with external researchers. Requests to access these datasets should be directed to <email>kontakt@projekt-stav.de</email>.</p>
</sec>
<sec id="sec12">
<title>Ethics statement</title>
<p>Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.</p>
</sec>
<sec id="sec13">
<title>Author contributions</title>
<p>JL: conceptualization, data curation, formal analysis, methodology, validation, visualization, and writing &#x2013; original draft preparation. MK and FS: writing &#x2013; review and editing. SS and ME: conceptualization (online preparation study) and writing &#x2013; review and editing. JH: conceptualization, project administration, and writing &#x2013; review and editing. NB: conceptualization, funding acquisition, project administration, supervision, and writing &#x2013; review and editing. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="sec14" sec-type="funding-information">
<title>Funding</title>
<p>This work was partly funded by the Federal Republic of Germany, Federal Ministry of Education and Research (funding code: 01GK1801A).</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>SS is a partner at ITB Consulting GmbH, the organization developing the TMS.</p>
<p>The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="sec100" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ack>
<p>We would like to thank all project partners of the stav who were involved in the test administration. We acknowledge support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) and Saarland University within the &#x201C;Open Access Publication Funding&#x201D; program.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="ref1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brunner</surname> <given-names>M.</given-names></name> <name><surname>Nagy</surname> <given-names>G.</given-names></name> <name><surname>Wilhelm</surname> <given-names>O.</given-names></name></person-group> (<year>2012</year>). <article-title>A tutorial on hierarchically structured constructs</article-title>. <source>J. Pers.</source> <volume>80</volume>, <fpage>796</fpage>&#x2013;<lpage>846</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1467-6494.2011.00749.x</pub-id>, PMID: <pub-id pub-id-type="pmid">22091867</pub-id></citation></ref>
<ref id="ref2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>F. F.</given-names></name></person-group> (<year>2007</year>). <article-title>Sensitivity of goodness of fit indexes to lack of measurement invariance</article-title>. <source>Struct. Equ. Model. Multidiscip. J.</source> <volume>14</volume>, <fpage>464</fpage>&#x2013;<lpage>504</lpage>. doi: <pub-id pub-id-type="doi">10.1080/10705510701301834</pub-id></citation></ref>
<ref id="ref3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Deary</surname> <given-names>I. J.</given-names></name> <name><surname>Strand</surname> <given-names>S.</given-names></name> <name><surname>Smith</surname> <given-names>P.</given-names></name> <name><surname>Fernandes</surname> <given-names>C.</given-names></name></person-group> (<year>2007</year>). <article-title>Intelligence and educational achievement</article-title>. <source>Intelligence</source> <volume>35</volume>, <fpage>13</fpage>&#x2013;<lpage>21</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.intell.2006.02.001</pub-id></citation></ref>
<ref id="ref4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gignac</surname> <given-names>G. E.</given-names></name></person-group> (<year>2006</year>). <article-title>A confirmatory examination of the factor structure of the multidimensional aptitude battery: contrasting oblique, higher order, and nested factor models</article-title>. <source>Educ. Psychol. Meas.</source> <volume>66</volume>, <fpage>136</fpage>&#x2013;<lpage>145</lpage>. doi: <pub-id pub-id-type="doi">10.1177/0013164405278568</pub-id></citation></ref>
<ref id="ref5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gignac</surname> <given-names>G. E.</given-names></name></person-group> (<year>2008</year>). <article-title>Higher-order models versus direct hierarchical models: g as superordinate or breadth factor?</article-title> <source>Psychol. Sci. Q.</source> <volume>50</volume>, <fpage>21</fpage>&#x2013;<lpage>43</lpage>.</citation></ref>
<ref id="ref6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hell</surname> <given-names>B.</given-names></name> <name><surname>Trapmann</surname> <given-names>S.</given-names></name> <name><surname>Schuler</surname> <given-names>H.</given-names></name></person-group> (<year>2007</year>). <article-title>Eine metaanalyse der validit&#x00E4;t von fachspezifischen studierf&#x00E4;higkeitstests im deutschsprachigen raum</article-title>. <source>Empirische P&#x00E4;dagogik</source> <volume>21</volume>, <fpage>251</fpage>&#x2013;<lpage>270</lpage>.</citation></ref>
<ref id="ref7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hissbach</surname> <given-names>J. C.</given-names></name> <name><surname>Klusmann</surname> <given-names>D.</given-names></name> <name><surname>Hampe</surname> <given-names>W.</given-names></name></person-group> (<year>2011</year>). <article-title>Dimensionality and predictive validity of the HAM-Nat, a test of natural sciences for medical school admission</article-title>. <source>BMC Med. Educ.</source> <volume>11</volume>:<fpage>83</fpage>. doi: <pub-id pub-id-type="doi">10.1186/1472-6920-11-83</pub-id>, PMID: <pub-id pub-id-type="pmid">21999767</pub-id></citation></ref>
<ref id="ref8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>L.</given-names></name> <name><surname>Bentler</surname> <given-names>P. M.</given-names></name></person-group> (<year>1999</year>). <article-title>Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives</article-title>. <source>Struct. Equ. Model. Multidiscip. J.</source> <volume>6</volume>, <fpage>1</fpage>&#x2013;<lpage>55</lpage>. doi: <pub-id pub-id-type="doi">10.1080/10705519909540118</pub-id></citation></ref>
<ref id="ref9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>H&#x00FC;lsheger</surname> <given-names>U. R.</given-names></name> <name><surname>Maier</surname> <given-names>G. W.</given-names></name> <name><surname>Stumpp</surname> <given-names>T.</given-names></name></person-group> (<year>2007</year>). <article-title>Validity of general mental ability for the prediction of job performance and training success in Germany: a meta-analysis</article-title>. <source>Int. J. Sel. Assess.</source> <volume>15</volume>, <fpage>3</fpage>&#x2013;<lpage>18</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1468-2389.2007.00363.x</pub-id></citation></ref>
<ref id="ref10"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Jensen</surname> <given-names>A. R.</given-names></name></person-group> (<year>1998</year>). <source>The g Factor: The Science of Mental Ability</source>. <publisher-loc>Westport</publisher-loc>: <publisher-name>Praeger Publishers/Greenwood Publishing Group</publisher-name>.</citation></ref>
<ref id="ref11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johnson</surname> <given-names>W.</given-names></name> <name><surname>Bouchard</surname> <given-names>T. J.</given-names></name> <name><surname>Krueger</surname> <given-names>R. F.</given-names></name> <name><surname>McGue</surname> <given-names>M.</given-names></name> <name><surname>Gottesman</surname> <given-names>I. I.</given-names></name></person-group> (<year>2004</year>). <article-title>Just one g: consistent results from three test batteries</article-title>. <source>Intelligence</source> <volume>32</volume>, <fpage>95</fpage>&#x2013;<lpage>107</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0160-2896(03)00062-X</pub-id></citation></ref>
<ref id="ref12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johnson</surname> <given-names>W.</given-names></name> <name><surname>Te Nijenhuis</surname> <given-names>J.</given-names></name> <name><surname>Bouchard</surname> <given-names>T. J.</given-names></name></person-group> (<year>2008</year>). <article-title>Still just 1 g: consistent results from five test batteries</article-title>. <source>Intelligence</source> <volume>36</volume>, <fpage>81</fpage>&#x2013;<lpage>95</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.intell.2007.06.001</pub-id></citation></ref>
<ref id="ref13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kadmon</surname> <given-names>G.</given-names></name> <name><surname>Kirchner</surname> <given-names>A.</given-names></name> <name><surname>Duelli</surname> <given-names>R.</given-names></name> <name><surname>Resch</surname> <given-names>F.</given-names></name> <name><surname>Kadmon</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Warum der test f&#x00FC;r medizinische studieng&#x00E4;nge (TMS)?</article-title> <source>Z. Evid. Fortbild. Qual. Gesundhwes.</source> <volume>106</volume>, <fpage>125</fpage>&#x2013;<lpage>130</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.zefq.2011.07.022</pub-id>, PMID: <pub-id pub-id-type="pmid">22480896</pub-id></citation></ref>
<ref id="ref14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kramer</surname> <given-names>J.</given-names></name></person-group> (<year>2009</year>). <article-title>Allgemeine intelligenz und beruflicher erfolg in deutschland</article-title>. <source>Psychol. Rundsch.</source> <volume>60</volume>, <fpage>82</fpage>&#x2013;<lpage>98</lpage>. doi: <pub-id pub-id-type="doi">10.1026/0033-3042.60.2.82</pub-id></citation></ref>
<ref id="ref15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Levacher</surname> <given-names>J.</given-names></name> <name><surname>Koch</surname> <given-names>M.</given-names></name> <name><surname>Hissbach</surname> <given-names>J.</given-names></name> <name><surname>Spinath</surname> <given-names>F. M.</given-names></name> <name><surname>Becker</surname> <given-names>N.</given-names></name></person-group> (<year>2021</year>). <article-title>You can play the game without knowing the rules&#x2014;but you&#x2019;re better off knowing them: the influence of rule knowledge on figural matrices tests</article-title>. <source>Eur. J. Psychol. Assess.</source> <volume>38</volume>, <fpage>15</fpage>&#x2013;<lpage>23</lpage>. doi: <pub-id pub-id-type="doi">10.1027/1015-5759/a000637</pub-id></citation></ref>
<ref id="ref16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McGrew</surname> <given-names>K. S.</given-names></name></person-group> (<year>2009</year>). <article-title>CHC theory and the human cognitive abilities project: standing on the shoulders of the giants of psychometric intelligence research</article-title>. <source>Intelligence</source> <volume>37</volume>, <fpage>1</fpage>&#x2013;<lpage>10</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.intell.2008.08.004</pub-id></citation></ref>
<ref id="ref17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Neisser</surname> <given-names>U.</given-names></name> <name><surname>Boodoo</surname> <given-names>G.</given-names></name> <name><surname>Bouchard</surname> <given-names>T. J.</given-names> <suffix>Jr.</suffix></name> <name><surname>Boykin</surname> <given-names>A. W.</given-names></name> <name><surname>Brody</surname> <given-names>N.</given-names></name> <name><surname>Ceci</surname> <given-names>S. J.</given-names></name> <etal/></person-group>. (<year>1996</year>). <article-title>Intelligence: knowns and unknowns</article-title>. <source>Am. Psychol.</source> <volume>51</volume>, <fpage>77</fpage>&#x2013;<lpage>101</lpage>. doi: <pub-id pub-id-type="doi">10.1037/0003-066X.51.2.77</pub-id></citation></ref>
<ref id="ref18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rosseel</surname> <given-names>Y.</given-names></name></person-group> (<year>2012</year>). <article-title>lavaan: an R package for structural equation modeling</article-title>. <source>J. Stat. Softw.</source> <volume>48</volume>, <fpage>1</fpage>&#x2013;<lpage>36</lpage>. doi: <pub-id pub-id-type="doi">10.18637/jss.v048.i02</pub-id></citation></ref>
<ref id="ref19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Roth</surname> <given-names>B.</given-names></name> <name><surname>Becker</surname> <given-names>N.</given-names></name> <name><surname>Romeyke</surname> <given-names>S.</given-names></name> <name><surname>Sch&#x00E4;fer</surname> <given-names>S.</given-names></name> <name><surname>Domnick</surname> <given-names>F.</given-names></name> <name><surname>Spinath</surname> <given-names>F. M.</given-names></name></person-group> (<year>2015</year>). <article-title>Intelligence and school grades: a meta-analysis</article-title>. <source>Intelligence</source> <volume>53</volume>, <fpage>118</fpage>&#x2013;<lpage>137</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.intell.2015.09.002</pub-id></citation></ref>
<ref id="ref20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schmidt</surname> <given-names>F. L.</given-names></name> <name><surname>Hunter</surname> <given-names>J. E.</given-names></name></person-group> (<year>1998</year>). <article-title>The validity and utility of selection methods in personnel psychology: practical and theoretical implications of 85 years of research findings</article-title>. <source>Psychol. Bull.</source> <volume>124</volume>, <fpage>262</fpage>&#x2013;<lpage>274</lpage>. doi: <pub-id pub-id-type="doi">10.1037/0033-2909.124.2.262</pub-id></citation></ref>
<ref id="ref120"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schult</surname> <given-names>J.</given-names></name> <name><surname>Hofmann</surname> <given-names>A.</given-names></name> <name><surname>Stegt</surname> <given-names>S. J.</given-names></name></person-group> (<year>2019</year>). <article-title>Leisten fachspezifische Studierf&#x00E4;higkeitstests im deutschsprachigen Raum eine valide Studienerfolgsprognose?</article-title> <source>Z. Entwicklungspsychol. Padagog. Psychol.</source> <volume>51</volume>, <fpage>16</fpage>&#x2013;<lpage>30</lpage>. doi: <pub-id pub-id-type="doi">10.1026/0049-8637/a000204</pub-id></citation></ref>
<ref id="ref21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwarz</surname> <given-names>G.</given-names></name></person-group> (<year>1978</year>). <article-title>Estimating the dimension of a model</article-title>. <source>Ann. Stat.</source> <volume>6</volume>, <fpage>461</fpage>&#x2013;<lpage>464</lpage>. doi: <pub-id pub-id-type="doi">10.1214/aos/1176344136</pub-id></citation></ref>
<ref id="ref22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwibbe</surname> <given-names>A.</given-names></name> <name><surname>Lackamp</surname> <given-names>J.</given-names></name> <name><surname>Knorr</surname> <given-names>M.</given-names></name> <name><surname>Hissbach</surname> <given-names>J.</given-names></name> <name><surname>Kadmon</surname> <given-names>M.</given-names></name> <name><surname>Hampe</surname> <given-names>W.</given-names></name></person-group> (<year>2018</year>). <article-title>Medizinstudierendenauswahl in Deutschland</article-title>. <source>Bundesgesundheitsbl. Gesundheitsforsch. Gesundheitsschutz</source> <volume>61</volume>, <fpage>178</fpage>&#x2013;<lpage>186</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s00103-017-2670-2</pub-id>, PMID: <pub-id pub-id-type="pmid">29294180</pub-id></citation></ref>
<ref id="ref23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Spearman</surname> <given-names>C.</given-names></name></person-group> (<year>1904</year>). <article-title>&#x201C;General intelligence,&#x201D; objectively determined and measured</article-title>. <source>Am. J. Psychol.</source> <volume>15</volume>, <fpage>201</fpage>&#x2013;<lpage>293</lpage>. doi: <pub-id pub-id-type="doi">10.2307/1412107</pub-id></citation></ref>
<ref id="ref24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Valerius</surname> <given-names>S.</given-names></name> <name><surname>Sparfeldt</surname> <given-names>J. R.</given-names></name></person-group> (<year>2014</year>). <article-title>Consistent g- as well as consistent verbal-, numerical- and figural-factors in nested factor models? Confirmatory factor analyses using three test batteries</article-title>. <source>Intelligence</source> <volume>44</volume>, <fpage>120</fpage>&#x2013;<lpage>133</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.intell.2014.04.003</pub-id></citation></ref>
<ref id="ref25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Werwick</surname> <given-names>K.</given-names></name> <name><surname>Winkler-Stuck</surname> <given-names>K.</given-names></name> <name><surname>Hampe</surname> <given-names>W.</given-names></name> <name><surname>Albrecht</surname> <given-names>P.</given-names></name> <name><surname>Robra</surname> <given-names>B.-P.</given-names></name></person-group> (<year>2015</year>). <article-title>Introduction of the HAM-Nat examination &#x2013; applicants and students admitted to the medical faculty in 2012-2014</article-title>. <source>GMS Z. Med. Ausbild.</source> <volume>32</volume>:<fpage>Doc53</fpage>. doi: <pub-id pub-id-type="doi">10.3205/zma000995</pub-id>, PMID: <pub-id pub-id-type="pmid">26604995</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn id="fn0004"><label>1</label>
<p>The stav project examines different subtests within the framework of the Studierendenauswahl-Verbund (stav; English: Student Selection Network). It covers the HAM-Nat, to which three subtests were added, and the TMS. One aim of stav is to evaluate these subtests in order to determine how the current student admission procedures in medicine could be improved.</p>
</fn>
</fn-group>
</back>
</article>