Secondary School Leaving Examinations: The Impact of Expectancies, Values, and Dimensional Comparisons on Male and Female Students’ Science-Oriented Choices

In Germany, secondary school students have to choose at least one STEM subject (mathematics, biology, chemistry, and physics) for their Secondary School Leaving Examinations. In a representative sample of students in grade 13 in one federal state in Germany, we explore male and female students’ subject choices in an expectancy-value as well as dimensional comparison framework by considering prior performance, ability self-concept, and values in the chosen subject. We extend previous research by including dimensional comparisons that students make between the varying subjects they have to choose from. We discriminate between two opposing groups. One group shows a science-avoidance choice pattern by selecting only one science subject: biology (n = 439). The other group shows a science-oriented choice pattern by selecting either physics or chemistry or two STEM subjects of which one is physics or chemistry (n = 248). We measured achievement test scores, relative and absolute midterm grades, ability self-concepts, as well as attainment and utility values in chosen and non-chosen subjects and calculated logistic regressions as well as multigroup models. Our results showed that science-oriented final exam choices depended on two mechanisms. Within the expectancy-value framework, a science-oriented choice pattern was predicted by ability self-concept in mathematics for male and female students. However, attainment and utility values appeared to be irrelevant for this specific choice. Within the dimensional comparison framework, the relative mathematics-English midterm grade was relevant, but only for male students. Our findings raise the question whether male and female students should be encouraged differently in order to stay in the STEM pipeline and how structural conditions may shape pathways into or out of this pipeline.


INTRODUCTION
Low student enrollment in STEM subjects appears to be a widespread problem (e.g., for the United States: McFarland et al., 2018; for Europe: OECD, 2018). Female students in particular are largely underrepresented in these study fields (OECD, 2018), with the gender asymmetry being especially pronounced in physics, computer, and engineering studies (OECD, 2018; Statistisches Bundesamt, 2018, 2019; National Science Foundation, 2019). Considering the consequences of this situation for society as a whole, exploring the reasons for this phenomenon is a matter of great importance for governments, industry, and educators (Bøe et al., 2011). Furthermore, it seems crucial to find ways of encouraging enrollment in STEM subjects with a special focus on female students.
Numerous studies investigated course selection in an expectancy-value framework (e.g., Nagy et al., 2006; Bøe, 2012; Wang et al., 2013), establishing that it is already during the late years of secondary school that students' choices of a subject major at the university level are channeled (e.g., Trusty, 2002; Guo et al., 2015b; Perez-Felkner et al., 2017). Hence, it is important to concentrate on the analysis of academic choices and their determinants in the late school career when trying to identify reasons for low enrollment rates and gender differences in enrollment rates in STEM university studies (Guo et al., 2015b). Against this background, it might be a fruitful endeavor to investigate how regulations regarding subject choices in the final school exams encourage academic decisions in favor of or against STEM subjects. Countries vary in the strictness of their regulations on the subject coverage of final exams. For example, in the United States, students have to take science as a subject in their school leaving examination in 13 states but not in 11 others (Snyder et al., 2019). In England, students are free to choose from a wide variety of subjects without including science or mathematics (Cuff, 2017).
Former studies based on the expectancy-value theory (EVT; Eccles et al., 1983) indicated that gender differences in expectancies and values concerning STEM fields might contribute to gender-related academic choices and career paths (e.g., Gaspard et al., 2015). In Germany, as in many other countries, students do not only have to decide which courses to select during upper secondary level education, but also in which STEM courses to take their final exams. Previous studies predicting academic choices in an expectancy-value framework have, however, typically focused on students choosing a single subject or on science vs. non-science course choices (e.g., Dickhäuser et al., 2005; Nagy et al., 2006, 2012; Palmer et al., 2017). Moreover, the majority of single subject studies investigated mathematics choices only (e.g., Guo et al., 2015b; Perez-Felkner et al., 2017). We extend prior studies by distinguishing between two choice patterns for the final exam subject choice: science-orientation vs. science-avoidance.

Final School Exam Choice: Science-Oriented vs Science-Avoidance Patterns
In the German federal state Schleswig-Holstein, where we conducted our study, students have to pass four exams at the end of secondary school (Abiturprüfung). Regulations on subject choice ensure that the chosen exams cover a wide range of academic domains. To that end, two of the four final exams have to be chosen out of the three core subjects: German, a foreign language, and mathematics. So, possible combinations of the first two exams are: German and a foreign language, German and mathematics, a foreign language and mathematics. To arrive at the total of four exam subjects, any of these three choices will have to be combined with two further subjects. If students choose the combination German and a foreign language, thus avoiding mathematics, one of the science subjects, such as biology, chemistry, or physics, has to be among the additional two subjects. Under certain restrictions students may also choose two final exams in mathematics and a science subject.
The purpose of these guidelines is to guarantee that students choose at least one subject from the STEM domain (mathematics, physics, chemistry, and biology) in each case. Guo et al. (2017) found that within the sciences students perceive biology as being more closely related to the verbal domain and chemistry and physics as being more closely related to the mathematical domain. We therefore predicted that students who intend to avoid the natural sciences as much as possible choose biology as the one compulsory subject to be chosen from the natural sciences (science-avoidance choice pattern). In contrast, we defined a science-oriented choice pattern as selecting either (a) physics, (b) chemistry, or (c) two STEM subjects, with one of the two being either chemistry or physics. An example of a science-oriented choice pattern is the selection of physics, English, mathematics, and history. An example of a science-avoidance pattern is the selection of biology, English, German, and religion. Within the restrictions for final exam choice, students have to weigh advantages and disadvantages of mathematics versus the different science subjects as well as mathematics and the science subjects versus the two core subjects German and foreign language. Among the foreign languages, the vast majority of students choose English in their final exams. In our study, we investigated whether the decision in favor of or, under the described restrictions, against STEM subjects differed according to students' gender.

Predicting Academic Choices From Prior Performance, Expectations, Values, and Dimensional Comparisons
Academic choices are typically investigated in the framework of the EVT of achievement motivation (Wigfield and Eccles, 2000; for a comprehensive overview on theories of beliefs and values, see Eccles and Wigfield, 2002). In EVT, choices are regarded as the result of a complex interplay between a student's ability on the one hand and psychological variables (expectancies of success and subjective values) on the other. EVT states that achievement-related choices depend on individuals' expectations of success and their valuing regarding a given task (Eccles et al., 1983). Values consist of four components: intrinsic value (enjoyment), utility value (usefulness for own goals), attainment value (importance of doing well), and cost value (subjective cost of engagement; Eccles et al., 1983; Wigfield and Eccles, 2000). In our study, we focused on expectations of success students have toward different school subjects, as well as the attainment and utility values they associate with these subjects (Nagy et al., 2006; Watt et al., 2012; Wang et al., 2013; Lazarides and Laumann, 2019).

Prior Performance and Dimensional Comparisons
Prior achievement is one of the main determinants of academic choices. The better individuals perform in a given subject, the more likely it is that they will select it in the future, for example by enrolling in extra courses or university studies (e.g., Köller et al., 2000; Watt, 2006; Watt et al., 2006; Guo et al., 2015b) or by choosing that subject in their final exam. Our study considers both achievement test scores and midterm grades as indicators of students' prior performance. Scores in achievement tests are an objective, criterion-referenced measure of ability that is unbiased by reference group effects (Duckworth et al., 2012). In contrast, school grades reflect objective performance as well as individuals' relative position within the respective reference group and therefore more directly represent the social feedback a student receives in class with respect to his or her ability (e.g., Zeidner and Schleyer, 1998; Südkamp and Möller, 2009). Grades also reflect other personal characteristics such as motivation or conscientiousness (e.g., Südkamp et al., 2012; Kaiser et al., 2013; Brookhart et al., 2016). While we considered grades for the three core subjects mathematics, German, and English as a foreign language, achievement test scores were available for mathematics and for science (i.e., a test measuring competences in biology, chemistry, and physics).
As students in Germany can choose in which STEM subject to take their final exams, we took into account the dimensional comparisons (DC) they make between different subjects when predicting their choices. Dimensional comparison theory (DCT; Möller and Marsh, 2013) states that students compare their absolute and relative abilities across subjects in what is termed DC (e.g., Dickhäuser et al., 2005; Wang et al., 2017). The theory is based on two main assumptions. The first assumption is that students compare their performance within and across domains (Möller and Marsh, 2013). The second assumption is that domains can be placed on a continuum with the end poles mathematical and verbal (Helm et al., 2016). Along this continuum, domains with which students compare their performance in a given domain are either perceived as being rather similar (near domains) or rather dissimilar to this given domain (far domains). Thus, high achievement in a given subject has a positive effect on students' self-concept in the respective subject. High achievement in subjects that are perceived to be similar to the given subject also has a positive effect on students' self-concept in the respective domain. This phenomenon is termed assimilation. At the same time, the reverse process describes how high performance in subjects that are perceived to be rather dissimilar to the given subject leads to lower self-concept in the given domain (Helm et al., 2016).
Applied to final exam subject choice, high achievements in STEM subjects should coincide with strong self-concepts in STEM subjects but at the same time negatively impact self-concepts in non-STEM subjects (German, foreign language; cf. Möller et al., 2006). These dimensional evaluations appear to have an impact on students' future academic choices. Previous studies substantiated this mechanism. For example, students with high math and verbal abilities at grade 12 appear to be less likely to pursue STEM careers, compared to students with high math abilities only (Wang et al., 2013). Uerz et al. (2004) investigated DC in secondary school. They found female students to perform relatively higher in language (mother tongue) as compared to mathematics and the reverse for male students. Students' DC between their performances in these two subjects predicted the number of science courses they selected. In order to examine effects of DC on exam subject choices, we included the comparison between the three core subjects students have to choose from: mathematics and the two non-STEM subjects (German and English as a foreign language). Going beyond the study by Uerz et al. (2004), we also considered the relationship between gender and DC in the prediction of science course taking.
In accord with prior research on DC, we expected students' choice of their exam subject to be influenced not only by their absolute prior performance in science and mathematics but also by their prior performance in these subjects in relation to other subjects. We predicted that high absolute midterm grades and high achievement test scores in mathematics and science would result in science-oriented choice patterns for the final exams. We also predicted that a relatively high grade in mathematics, as compared to grades in the non-STEM core subjects German and foreign language (English), would furthermore encourage a science-oriented choice pattern. To test this assumption, we calculated relative midterm grades by subtracting the grade in German from the grade in mathematics and by subtracting the grade in English from the grade in mathematics.

Domain-Specific Ability Self-Concepts as Indicators of Expectations of Success
Within EVT, expectations of success are often measured via the ability self-concept (Eccles and Wigfield, 2002). Ability self-concept is defined as "one's knowledge and perceptions about one's academic ability" (Marsh and Seaton, 2013, p. 62). Even after controlling for the impact of actual prior performance, domain-specific self-concepts predict academic choices (e.g., Watt, 2006; Nagy et al., 2008; Wang et al., 2013). One study even revealed that ability self-concept fully moderates the effect of achievement on course selection in mathematics (Köller et al., 2000). In our study, we used domain-specific ability self-concept as a measure of expectations of success. We measured students' ability self-concepts in mathematics and science (comprising biology, physics, and chemistry) as well as in English and in German. DCT states that the better individuals perceive their performance to be in one subject, the weaker they perceive their achievements in a different subject, and vice versa (e.g., Marsh and Craven, 2006; Möller and Marsh, 2013). The negative effect of high achievement in one subject on ability self-concept in a different subject is stronger the less similar the subjects are regarded to be (contrasting DC). As a result, while achievements are typically positively correlated across subjects, people's ability self-concepts in distal subjects are not (Möller et al., 2020). Mathematics and science are perceived as rather similar to each other, while mathematics and science on the one hand and languages on the other hand are perceived as less similar (Chiu, 2008). For instance, Perez-Felkner et al. (2017) found a positive impact of mathematical ability beliefs on science course taking. Owing to the fact that these two domains are perceived as rather similar, prior high performances in these two domains will lead to high ability self-concepts in the respective domains.
Lazarides and Laumann (2019) point out that DC do not solely apply to the formation of self-concepts but also affect outcomes such as career choices or course selection. Accordingly, we expected that students' strong ability self-concepts in mathematics and science would predict science-oriented final exam choice patterns whereas their ability self-concepts in the core non-STEM subjects (German and English) would not.

Values
Academic choices do not only depend on previous performance and ability self-concept, but also on values individuals associate with different options and choices (Wigfield and Eccles, 2000). Values are referred to as "the quality of the task [or the school subject] that contributes to the increasing or decreasing probability that an individual will select it" (Eccles, 2005, p. 109). Values can be further differentiated into (among others) attainment values and utility values (Eccles and Wigfield, 2002). Attainment values focus on the importance a person attaches to succeeding in a task and on the fit of a subject or choice with the person's identity (e.g., I highly value learning a lot in science classes). Utility values indicate the match between the task and external goals of that person (e.g., A good grade in mathematics will be helpful for my future). Various studies showed that both values predict choices in favor of STEM careers including choice of courses and study fields (e.g., Stokking, 2000; Guo et al., 2015b). In our study, we measured students' values with respect to the two STEM domains mathematics and science. Since both subjects are perceived as rather similar as well as close to or at the mathematical end pole of the DC continuum, we expected that final exam subject choice would be positively predicted by attainment and utility values in both science and mathematics.

Gender Differences
We expected to find gender differences in students' science-oriented final exam choice patterns as well as in the predictors for these choice patterns. Previous studies suggest that male students are more inclined to choose STEM subjects than female students (e.g., National Research Council, 2012; Luttenberger et al., 2019). Among the STEM subjects, male students are more likely than female students to choose mathematics, physics, or chemistry (e.g., Van De Werfhorst et al., 2003; Uerz et al., 2004; van Langen et al., 2006). Female students, in contrast, tend to favor biology (Rennie and Parker, 1993; Hill et al., 2010). At the university level, women are especially underrepresented in physics and the engineering studies (DESTATIS, 2019). In view of this situation, we expected male students to show higher rates of science-oriented final exam choice patterns and female students to show higher rates of science-avoidance final exam choice patterns.
Regarding mathematics abilities, some studies revealed small differences in favor of male as compared to female students, measured by achievement test scores (e.g., Nagy et al., 2008; Else-Quest et al., 2010; Guo et al., 2015b). However, other studies found no gender difference. The latest nationwide regular school performance study for Germany established a small but significant advantage of male students in mathematics (d = 0.08) and a small advantage of female students in science (average d = 0.13), mainly due to female students' higher achievement in biology (d = 0.24 for content knowledge, d = 0.22 for scientific inquiry; Schipolowski et al., 2019). Further studies showed gender differences in mathematics in favor of male students in upper secondary level when assessed through achievement tests but not when assessed through grades (Köller et al., 2000). Since our study was conducted in Germany, we hypothesized that male students would outperform female students in mathematics in an achievement test, while we did not expect a gender difference in mathematics midterm grades or in attainments in a science achievement test.
On the subject of ability self-concepts, many studies found that female students rate their abilities lower in mathematics and higher in language-related domains than male students do, even after controlling for the impact of actual performance (e.g., Marsh and Seaton, 2013; Parker et al., 2014; Guo et al., 2015b). Results with respect to science are inconsistent. While some studies found male students to exhibit higher science-related ability self-concepts (Debacker and Nelson, 1999; Wilkins, 2004; Taskinen et al., 2013), other studies found no difference between male and female students (Leibham et al., 2013). Accordingly, we expected male students to indicate a higher ability self-concept in mathematics and lower ability self-concepts in German and English than female students but refrained from specifying a hypothesis for the science-related ability self-concept.
Gender differences in value beliefs have typically been investigated regarding the subject mathematics (e.g., Gaspard et al., 2017). In this respect, findings are inconsistent. Regarding attainment values, some studies found male students to assign greater importance to mathematics while other studies found no gender difference (see Gaspard et al., 2015 for an overview). For utility values, some studies revealed that male students perceive mathematics as more useful than female students (e.g., Steinmayr and Spinath, 2010; Bøe, 2012; Wang, 2012; Watt et al., 2012; Gaspard et al., 2015, 2017; Guo et al., 2015b). Other studies did not find a gender difference regarding the perceived usefulness of mathematics (e.g., Köller et al., 2000; Watt, 2006; Bøe, 2012). Against the background of these findings, we hypothesized that male students would indicate higher attainment and utility values with respect to mathematics but refrained from specifying a hypothesis regarding gender differences in values toward science.
Surprisingly, there are only a few studies that focus on gender differences in DC and consequent academic choices (Lazarides and Laumann, 2019). According to these studies, female students' lower self-concept in mathematics could be due to DC processes (Wang et al., 2013). Lazarides and Laumann (2019) demonstrated in a longitudinal study that girls were less inclined to make mathematics career plans due to their lower mathematics self-concept. Therefore, we also had differing predictions regarding the impact of ability self-concept, DC, and prior performance on male and female students' final exam choice. We assumed that lower prior performance and a lower ability self-concept in mathematics would have a higher impact on girls' than on boys' science-oriented final exam choice. Due to inconclusive EVT results, we refrained from specifying a hypothesis regarding gender differences for the expectancy-value variables.

RESEARCH HYPOTHESES
Our first set of hypotheses concerns descriptive differences between male and female students. We anticipated that compared to female students, male students would more often opt for science-oriented final exam choice patterns (H1a). We also assumed that male as compared to female students would show higher values in all mathematics-related predictors except for absolute midterm grades in mathematics (H1b; test scores in mathematics, mathematics ability self-concept as well as mathematics attainment and utility values). However, we expected no gender difference in science achievement (H1c) and, due to inconsistent results, refrained from specifying hypotheses regarding gender differences in science self-concept, science attainment values, or science utility values.
In our second set of hypotheses, we focus on predictive patterns for science-oriented and science-avoidance final exam choices. We assumed that science-oriented final exam choice patterns would be positively predicted by mathematics and science achievement test scores (H2a and H2b), mathematics midterm grades (H2c), mathematics and science self-concept (H2d), as well as mathematics and science attainment and utility values (H2e). Science-oriented final exam choice patterns would not be predicted by English and German self-concept (H2f and H2g). Regarding absolute and relative midterm grades, we assumed that the effect of the absolute grade would be explained by DC between STEM and non-STEM midterm grades. Therefore, we expected that science-oriented final exam choice patterns would be positively predicted by high mathematics midterm grades in comparison to English (H2h) and German (H2i) midterm grades.
Our third set of hypotheses concentrates on gender differences in the predictive patterns for science-oriented and science-avoidance final exam choices. We anticipated that science and mathematics self-concepts and relative midterm grades (DC) would interact with students' gender (H3a and H3b). We assumed that mathematics self-concept would have a stronger impact for female than for male students. Consequently, female students would opt for a science-oriented final exam pattern when they reported a lower mathematics self-concept whereas male students would not. We also expected that female students would drop out of science once they performed equally well in the verbal and math domains, which offered them a wider range of subjects to choose from: female students avoid sciences when their other-domain grade is higher than or equal to their mathematics grade. When confronted with the same situation, male students will stay in science. We are not aware of studies on gender differences in the prediction of course choice and therefore did not formulate hypotheses on possible gender interactions regarding the prediction of science-oriented choices in final exams by attainment and utility values.

METHOD
We used data from the final measurement point of the longitudinal study Educational Outcomes of Students from Vocational and Academic Upper Secondary Schools (LISA 6 Study), which was conducted in the federal state Schleswig-Holstein in spring 2013. This is the most recent study providing data on school leaving exam choices available in Germany, as the graduating classes of secondary schools are not included in the nationwide regular school performance studies.

Sample
Our sample consists of 3,639 German students in grade 13 (1,995 females and 1,644 males) who filled out achievement tests in mathematics and science. Grade 13 is the last year of upper secondary education, at the end of which students receive their higher education entrance certificate (Abitur). Students' mean age was 20 years (SD = 0.89) and their families' mean socioeconomic background (HISEI) was 60.55 (SD = 18.79), which corresponds to occupations like business services salesman or clerical supervisors. Students came from 17 academic schools and 27 vocational upper secondary schools (N = 44) in Schleswig-Holstein.

Procedure and Measures
Between April and May 2013, students filled in the achievement tests and questionnaires during regular class hours. Participation in the achievement test was mandatory, while the questionnaires were filled in on a voluntary basis. Overall, study participation took five hours, including two breaks of 30 min each, during one school day. Students first had to work on an English achievement test (the results of which are not reported in this manuscript), followed by a science and a mathematics test taken from the National Educational Panel Study (NEPS; for science Hahn et al., 2013; for mathematics Neumann et al., 2013). Both achievement tests have been developed as part of the NEPS. NEPS is a nationwide panel study in which competencies of the test persons are measured in different subject domains. Documentation on test frameworks, items, and scaling can be found on the webpage of the study. At the end, students filled in a questionnaire on psychological and background variables. For the scale documentation of all measures, see Kampa et al. (2020).

Science-Oriented and Science-Avoidance Group
Our core dependent variable, namely whether a student belongs to the science-oriented group or the science-avoidance group, was calculated as follows. In June 2013, the students of our sample passed their school-leaving examination (Abitur). We were therefore able to use the information on which subjects each student had taken their exams in and what grade they had achieved in each exam. Students who had chosen only one STEM subject and opted for biology as that subject were considered as science-avoiding (n = 439). Students who had chosen only one STEM subject and opted for physics or for chemistry as that subject as well as students who had chosen two STEM subjects of which at least one was either physics or chemistry were considered as science-oriented (n = 248). The remaining 2,952 students could not be considered further in our analyses because they were not covered by this classification.
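The grouping rule described above can be illustrated with a minimal sketch. The function name, the lower-case subject labels, and the return values are hypothetical choices made for this illustration only; they are not taken from the study's data files or scripts.

```python
from typing import Iterable, Optional

# Assumed labels for the four STEM exam subjects named in the text.
STEM = {"mathematics", "biology", "chemistry", "physics"}
PHYSICS_OR_CHEMISTRY = {"physics", "chemistry"}


def classify_choice(exam_subjects: Iterable[str]) -> Optional[str]:
    """Return the choice pattern for one student's exam subjects.

    Returns None for students not covered by either definition.
    """
    chosen_stem = STEM & {subject.lower() for subject in exam_subjects}
    # Exactly one STEM subject, and that subject is biology.
    if chosen_stem == {"biology"}:
        return "science-avoidance"
    # Exactly one STEM subject, and it is physics or chemistry.
    if len(chosen_stem) == 1 and chosen_stem <= PHYSICS_OR_CHEMISTRY:
        return "science-oriented"
    # Two STEM subjects, at least one of them physics or chemistry.
    if len(chosen_stem) == 2 and chosen_stem & PHYSICS_OR_CHEMISTRY:
        return "science-oriented"
    return None


print(classify_choice(["physics", "English", "mathematics", "history"]))
print(classify_choice(["biology", "English", "German", "religion"]))
```

Under this sketch, a student whose only STEM exam is mathematics, or who combines mathematics with biology, falls into neither group and would be among the students excluded from the analyses.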

Mathematical Achievement
The NEPS mathematics test consists of 20 multiple-choice (MC) items, which are based on the concept of the German Educational Standards in Mathematics (Kultusministerkonferenz, 2003). Students could score between 0 and 20 points. Each item can be classified according to its content domain and the cognitive processes involved in solving it. Item contents related to quantity (4 items), space and shape (3), change and relationships (6), or data and chance (7). Cognitive processes referred to technical abilities and skills (9), modeling (1), problem solving (4), using representational forms (5), and communication (1). Detailed information on the framework of the test as well as example items can be found in Neumann et al. (2013; sample item 3) as well as in Schnittjer and Duchhardt (2015; sample items 3 and 4). In order to calculate the mathematics achievement estimate, we followed the NEPS scaling procedure (Gerken and Schnittjer, 2017). We retrieved the five plausible values (PVs) per student for achievement in mathematics. The PVs were generated on the basis of a one-dimensional Rasch model with a background model (Embretson and Reise, 2000), and the metric of the PVs was set to a mean of 500 and a standard deviation of 100 (for details on our specific scaling procedure see Leucht and Köller, 2016). The EAP-PV reliability was 0.92.
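To make the reporting metric concrete, the following sketch shows one way a set of ability estimates could be linearly transformed to a mean of 500 and a standard deviation of 100, and how a statistic could be averaged over several plausible-value sets. It is an illustration under assumptions, not the NEPS or LISA scaling code: the function names, the use of the population standard deviation, and the example numbers are all hypothetical.

```python
from statistics import fmean, pstdev


def rescale(values, target_mean=500.0, target_sd=100.0):
    """Linearly transform values to the given mean and standard deviation.

    Assumption: the population SD (pstdev) defines the metric.
    """
    mean = fmean(values)
    sd = pstdev(values)
    return [target_mean + target_sd * (v - mean) / sd for v in values]


def pooled_mean(pv_sets):
    """Average the sample mean over several plausible-value sets."""
    return fmean([fmean(pv_set) for pv_set in pv_sets])


# Hypothetical logit-scale estimates for four students.
logits = [-1.0, 0.0, 1.0, 2.0]
scaled = rescale(logits)
print(round(fmean(scaled), 6), round(pstdev(scaled), 6))
```

Analyses with plausible values are usually run once per PV set and then combined; `pooled_mean` shows only the simplest case of combining a point estimate.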

Science Achievement
The NEPS science test (Hahn et al., 2013) consists of 30 items: 17 multiple-choice (MC) and 13 complex MC items. The items refer to knowledge about science (KAS) or knowledge of science (KOS). KAS includes the content domains scientific inquiry and scientific reasoning (5 items each). KOS covers the content domains matter (7 items), systems (5), development (4), interactions (6), data, and measurement error (3). Students could score between 0 and 30 points. Detailed information on the test framework and an example item can be found in Hahn et al. (2013; sample item 3) as well as in Hahn et al. (2014; Figure A-4). The scaling procedure was the same as for mathematics achievement. We again retrieved the five PVs per student, which were generated in the same way as the PVs for mathematics achievement. The EAP-PV reliability was 0.86.

Absolute and Relative Midterm Grades
The absolute midterm grades in mathematics, English (first foreign language), and German from the first semester of grade 13 were reported by the students. In this context, a 15-point grading scale is used with 15 (very good) being the highest and 1 (very poor) being the lowest grade. Midterm grades in the remaining science subjects biology, chemistry, and physics could not be included. During upper secondary education, a considerable number of students drop out of these subjects. Since this dropout reflects students' interest in or aversion to subjects such as biology, chemistry, and physics, it is not random. Moreover, midterm grades can only be obtained from students enrolled in classes in the specific subjects. Any calculation based on the subsamples of students with midterm grades in these three subjects would therefore rest on highly selective and non-comparable groups, so we did not consider midterm grades in these three subjects. To map midterm grades in comparison to grades in other subjects (DC), we calculated relative midterm grades by subtracting the English as well as the German midterm grade from the midterm grade a student had obtained in mathematics.
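The subtraction described above can be illustrated with a minimal Python sketch. The class and function names and the example grades are hypothetical and serve only to show the direction of the difference: positive values mean that the mathematics grade is higher than the grade in the comparison subject.

```python
from dataclasses import dataclass


@dataclass
class MidtermGrades:
    """Self-reported midterm grades on the 15-point scale (hypothetical container)."""
    mathematics: int
    english: int
    german: int


def relative_grades(grades: MidtermGrades) -> dict:
    """Return mathematics minus English and mathematics minus German."""
    return {
        "math_minus_english": grades.mathematics - grades.english,
        "math_minus_german": grades.mathematics - grades.german,
    }


# Hypothetical student: stronger in mathematics than in English,
# slightly weaker in mathematics than in German.
example = MidtermGrades(mathematics=12, english=9, german=13)
print(relative_grades(example))
```

For this hypothetical student, the relative mathematics-English grade is positive and the relative mathematics-German grade is negative.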

Ability Self-Concept
Students' ability self-concept in mathematics was measured by three items taken from the Programme for International Student Assessment (PISA) 2000 study (Kunter et al., 2002; e.g., Math is one of my best subjects). Science-related ability self-concept was measured by four items adapted from the TOSCA-Repeat study (Trautwein et al., 2007; e.g., I am good in science subjects). Ability self-concept in English and German was assessed with three items each from PISA 2000 (Kunter et al., 2002; e.g., I have always been good in English/German). Likert scales were used throughout the questionnaire with an answering format ranging from 1 (strongly disagree) to 4 (strongly agree). The scales possessed a high reliability (α mathematics = 0.91; α science = 0.86; α English = 0.89; α German = 0.89).

Expectancy-Value Variables
Items on utility values in mathematics (three items, e.g., I will need good abilities in math for my future life; Cronbach's α = 0.87) and on attainment values in mathematics (nine items, e.g., I would like to have more lessons in math; Cronbach's α = 0.95) were taken from the TOSCA-Repeat study (Trautwein et al., 2007). The wording of the items was adapted to also measure utility values (three items; Cronbach's α = 0.91) and attainment values in science (nine items; Cronbach's α = 0.95).
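For readers less familiar with the reliability coefficient reported for these scales, the following sketch computes Cronbach's α from item-level responses using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The function name and the response data are hypothetical; the study's weighting and imputation are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: one list per item, each holding that item's responses
    from the same respondents in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sum score per respondent across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses of five students to a three-item scale (1 to 4).
items = [
    [1, 2, 3, 4, 4],
    [1, 2, 3, 4, 3],
    [2, 2, 3, 4, 4],
]
print(round(cronbach_alpha(items), 2))
```

Items that rank respondents in nearly the same order, as in this made-up example, yield values close to 1.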

Statistical Analyses
Weighted data and imputed missing data were used in all analyses. We imputed the missing values in Mplus (50 datasets) using all considered variables as well as school type and social background in a background model.
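The text does not state how estimates were combined across the 50 imputed datasets. A common approach, assumed here purely for illustration, is pooling by Rubin's rules: the pooled estimate is the mean of the per-dataset estimates, and its variance combines the average within-dataset variance with the between-dataset variance. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, standard_errors):
    """Pool one parameter across m imputed datasets with Rubin's rules.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need matching estimates and standard errors from >= 2 datasets")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in standard_errors])  # average sampling variance
    between = variance(estimates)                       # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical logit coefficients and standard errors from three imputed datasets.
estimate, se = pool_rubin([0.70, 0.74, 0.66], [0.20, 0.21, 0.19])
print(round(estimate, 2), round(se, 2))
```

The same logic applies when analyses are repeated over the five plausible values of an achievement score.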

Descriptive Analyses
We extracted a group with a science-avoidance final exam choice pattern and a group with a science-oriented final exam choice pattern from the representative sample. For the first, the science-avoidance group, we identified students who selected only one science subject, biology, for their final exams (n = 439). For the second, the science-oriented group, we selected students who chose at least physics or chemistry, including students who chose two STEM subjects of which at least one was physics or chemistry (n = 248). First, we checked the distribution of male and female students in the science-avoidance and science-oriented choice groups. Second, to test our hypotheses on descriptive differences between male and female students, we performed multigroup analyses for male and female students on all predictors in our study. We applied the Wald test to assess statistical significance of mean differences. Finally, we calculated the effect size Cohen's d for each difference.
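As an illustration of the effect size used here, the sketch below computes Cohen's d as the mean difference divided by the pooled standard deviation. This is one common variant; the text does not specify which standardizer was used, and the sketch ignores the sampling weights and imputation applied in the study. The function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: (mean of group_a - mean of group_b) / pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical self-concept scores for two small groups.
print(cohens_d([2, 4, 6], [1, 3, 5]))
```

With these made-up scores, the groups differ by one point and the pooled standard deviation is two points, which gives d = 0.5.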

Science-Oriented and Science-Avoidance Final Exam Taking
We ran two sets of consecutive logistic regressions with the science-oriented and science-avoidance exam taking groups as the criterion variable in Mplus 7 (Muthén and Muthén, 1998-2010). In the first set, we focused on the hypotheses regarding the impact of achievement tests, expectancy value variables, and absolute midterm grades. Model 1 introduced the science predictors science achievement test scores, science ability self-concept, and science attainment as well as science utility values. Model 2 introduced the mathematics predictors mathematics achievement test scores, mathematics ability self-concept, and mathematics attainment as well as utility values. Model 3 combined Model 1 and Model 2 and entered the indicators for both domains simultaneously. In Model 4, we added midterm grades in mathematics, English, and German. The final Model 5 included ability self-concept in English and German. In Models 1b to 5b, we added the gender interaction term for the considered predictors of the respective model.
In the second set of consecutive logistic regressions, we investigated the impact of DC regarding midterm grades on subject choice in the final exam and focused on the hypotheses on relative midterm grades. Each of the models was calculated without gender interaction terms (Models 1a-2a) and with gender interaction terms (Models 1b-2b). In order to more specifically investigate relevant interaction terms between gender and a considered construct, we calculated the same regression without interaction terms in multigroup analyses for male and female students separately.
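To make the role of a gender interaction term concrete, consider a logistic model of the form logit(p) = b0 + b1·x + b2·gender + b3·x·gender with gender coded 0/1. Under this assumed coding, the odds ratio of predictor x is exp(b1) in the reference group and exp(b1 + b3) in the other group. The sketch below uses hypothetical coefficients and is not a reproduction of the Mplus models; odds ratios from separately estimated multigroup models need not match these values exactly.

```python
from math import exp, log


def group_odds_ratios(b_predictor, b_interaction):
    """Odds ratios of a predictor in both gender groups.

    Assumes gender is dummy coded (0 = reference group, 1 = other group)
    and that the model contains the product term predictor * gender.
    """
    return {
        "reference_group": exp(b_predictor),
        "other_group": exp(b_predictor + b_interaction),
    }


# Hypothetical coefficients: no effect in the reference group,
# doubled odds per unit of the predictor in the other group.
print(group_odds_ratios(b_predictor=0.0, b_interaction=log(2.0)))
```

A nonsignificant interaction coefficient b3 corresponds to odds ratios that do not differ reliably between the two groups.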

RESULTS
The distribution of male and female students across the science-oriented and science-avoidance groups showed the expected pattern. Only 35% of the students who chose biology as their only science final exam were male; 65% were female. The reverse pattern could be observed for the group that chose at least physics or chemistry, including students who chose two STEM subjects of which one was either physics or chemistry: in this group, 73% were male and 27% were female.
In a next step, we report the findings of the multigroup analyses with respect to gender differences (see Table 1). For descriptive differences between the science-oriented and science-avoidance groups, we refer to Table A1 in the Appendix.
Contrary to our prediction, female students scored lower than male students in the science achievement test (gender difference 70.57 points). We further explored the remaining science predictors. While male students had higher science-related ability self-concept and attainment values (narrowly missing significance at p < 0.05), they did not differ from female students regarding science-related utility values. As predicted, male students also had higher values in all mathematics predictors except mathematics midterm grades. So, despite having lower mathematics achievement test scores than male students, female students on average received mathematics midterm grades equal to those of male students. Female students showed higher values in all non-STEM predictors with the exception of the English midterm grade.

Prior Achievement and Expectancy Value Variables as Predictors of Male and Female Students' Science-Oriented Final Exam Choices
For the prediction of science-oriented final exam choice patterns, we assumed that mathematics and science predictors but not the non-STEM predictors would impact this choice. We also anticipated an interaction between gender and science-related as well as mathematics-related ability self-concept; and an interaction between gender and the relative mathematics-English and mathematics-German midterm grades. Table 2 shows the results for the consecutive regression analyses conducted on students' science-oriented vs. science-avoidance choice pattern.
Correlations between the predictors can be found in the Appendix (see Table A2).
Model 1, focusing on science predictors, confirms our hypotheses insofar as science-oriented choice patterns were predicted by higher science achievement scores. Contrary to our hypotheses, self-concept, attainment, and utility values did not predict this choice. This picture did not change once we introduced the gender interaction terms into the model (Model 1b). As expected, no gender interaction became significant. In Model 2, in which we only considered the mathematics predictors, we see a similar picture. Science-oriented choice patterns were predicted by mathematics achievement test scores and ability self-concept but not by mathematics attainment and utility values. Once gender interaction terms were introduced, neither the direct effect of mathematics achievement test scores nor, as expected, the interaction terms reached significance. Unexpectedly, the prediction by both STEM self-concepts did not interact with gender.
In Model 3 science-oriented choice patterns were predicted by the science and mathematics predictors simultaneously. While the science predictors did not reach statistical significance once the mathematics predictors were controlled for, mathematics achievement test scores and ability self-concept still impacted science-oriented choice patterns. This is an imbalanced pattern that we did not anticipate. The gender interaction terms, which as expected did not become significant, again moderated the direct effect of mathematics achievement test scores on the exam choice. The increase in explained variance as compared to Model 1 and Model 2 further underlines the importance of mathematics achievement test scores and mathematics ability self-concept.
In Models 4 and 5 we introduced the non-STEM variables. We investigated the effect of German and English midterm grades. The German midterm grade negatively predicted science-oriented choice patterns. Once the gender interaction terms were introduced, this direct effect disappeared and the negative interaction effect between English midterm grade and gender became significant. A multigroup analysis comparing the effect of the English midterm grade for male and female students after controlling for the remaining predictors in the model showed that while this grade was not relevant for female students' choices (OR female students = 0.96), male students opted for a science-oriented choice pattern when their English midterm grade was lower (OR male students = 0.50). In our final Model 5 we introduced non-STEM self-concepts. Besides the stable effect of mathematics-related self-concept across models, German ability self-concept negatively predicted science-oriented choice patterns. However, the latter effect disappeared once the (non-significant) gender interaction terms were introduced.
To sum up our results, contrary to our hypotheses only mathematics predictors contributed to science-oriented choice patterns and only achievement test scores and ability self-concept in this subject were relevant across multiple models. The science predictors were moderated by the mathematics predictors. In line with our hypotheses, the non-STEM predictors did not predict science-oriented choice patterns. We also found one of the expected gender interaction effects: English midterm grade interacted with gender but in a reverse direction. While male students were in the science-oriented choice group when they had a relatively lower English than mathematics midterm grade, this relative grade was irrelevant for female students' choices. So, against our expectations, female students did not drop out of science when they had a wider array of options. On the other hand, in contrast to female students, male students appeared to choose science-oriented final exams when their options narrowed because they performed lower in non-STEM subjects.

The Dimensional Comparison Process -Relative Midterm Grades as Predictors of Final Exam Subject Choice
Our second set of consecutive regression analyses investigated the impact of DC on science-oriented choices. We performed the same regressions for final exam choice, this time including mathematics grade and mathematics grade in relation to the two other non-STEM grades in German and English. The results are displayed in Table 3.
In Model 1 we included all control variables except non-STEM ability self-concepts. Contrary to our hypothesis, the absolute mathematics midterm grade became significant and, in line with our hypothesis, the relative mathematics-German midterm grade became a significant predictor as well. The relative mathematics-English midterm grade did not contribute to science-oriented exam choices. When introducing the gender interaction terms, these effects persisted and, as hypothesized, a gender effect emerged for the relative mathematics-German midterm grade. In Model 2 we controlled these effects for ability self-concepts in English and German. This model showed that the effect of the absolute and relative midterm grades was moderated by the ability self-concept in the respective domain. Both effects disappeared in this model and a lower ability self-concept in German predicted science-oriented choices. In the final model, in which we also considered gender interaction terms for these ability self-concepts, the direct effect of German ability self-concept disappeared again.
So in the final model, in line with our hypotheses, the absolute grade in mathematics did not contribute to the prediction of science-oriented choices. Of the relative grades, the expected relative mathematics-German as well as mathematics-English midterm grades were relevant and these effects were moderated by gender. Taking a closer look at the interaction between gender and the relative mathematics-German midterm grade, a multigroup model showed that neither the odds ratio for female students (OR = 1.01) nor the one for male students (OR = 0.88) became significant. For the relative mathematics-English midterm grade, the multigroup model revealed greater differences between male and female students. While this comparison was not relevant for female students (OR = 1.04), male students were more likely to be in the science-oriented choice group the higher their mathematics midterm grade was, compared to their English midterm grade (OR = 2.06).
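To give a sense of what an odds ratio of this size means in terms of probabilities, the following sketch applies an odds ratio to a baseline probability. The odds ratio of 2.06 is taken from the text; the baseline probability of 0.30 is purely hypothetical, and the sketch assumes that the odds ratio refers to a one-unit increase in the predictor, whose scaling the text does not specify.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability after multiplying the baseline odds by an odds ratio."""
    if not 0 < baseline_probability < 1:
        raise ValueError("baseline_probability must lie strictly between 0 and 1")
    baseline_odds = baseline_probability / (1 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


# Hypothetical baseline: a male student with a 30% probability of a
# science-oriented choice; OR = 2.06 is the value reported for male students.
print(round(apply_odds_ratio(0.30, 2.06), 2))
```

Under these assumptions, the probability of a science-oriented choice rises from 0.30 to roughly 0.47 per unit of the relative grade, whereas an odds ratio close to 1, as reported for female students, leaves it nearly unchanged.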

DISCUSSION
Our study extends existing research on academic choices in four ways. First, we investigated the choice that students have to make before leaving school: the decision in which subjects to be tested in their final exams (Abitur). Second, we took into account a wide array of predictors relating to mathematics, science, and non-STEM subjects. When predicting students' final exam subject choices, we simultaneously considered the relative impact of prior performance and expectancy-value variables (ability self-concept and values). By doing so, we can give a more differentiated insight into the motives of male and female students for staying in and opting out of STEM careers. Third, we integrated achievement test scores as well as midterm grades into our models. This approach makes it possible to compare the relevance of a more objective measure (achievement test scores) with a measure that is given in the social frame of reference of the classroom (school grades; Zeidner and Schleyer, 1998). Fourth, regarding midterm grades, we considered the DC that students make between varying subjects as a basis for their decision.
Our main hypotheses were (a) that male students would show higher values in most mathematics predictors, (b) that achievement and midterm grades as well as expectancy value variables in mathematics and science would predict science-oriented choice patterns, (c) that non-STEM midterm grades and ability self-concept in English and German would not predict science-oriented final exam choice, (d) that the effect of absolute mathematics midterm grade would be moderated by the relative mathematics midterm grade compared to the non-STEM grades, as well as (e) gender interactions for these relative grades and for the impact of the non-STEM self-concepts.

TABLE 2 | Logistic regression for the prediction of science-oriented vs. science-avoidance final exam choice patterns by prior performance, expectancy value variables in mathematics, science, English, and German (odds ratios).

Differences Between Male and Female Students' Science-Oriented and Science-Avoidance Final Exam Choice
We distinguished between groups of students who either showed a science-oriented choice pattern (choice of two STEM subjects) or avoided the sciences as far as rules permitted when selecting their final exam subjects (choice of only biology). We found that more male than female students opted for science-oriented exams. This finding is consistent with previous research reporting similar patterns for high school students' science course selections (e.g., Dickhäuser et al., 2005; Nagy et al., 2008) and university students' selections of STEM subjects (e.g., Cerinsek et al., 2013; National Science Foundation, 2019). Future research is needed to understand how students' choices of final exam subjects translate into selection of university majors. Male students achieved higher test scores and indicated stronger ability self-concepts in science and mathematics as well as higher science attainment and mathematics attainment and utility values than female students. Despite differences in achievement test scores, male and female students did not differ in their mathematics midterm grade. This finding is in line with the results of previous studies (Köller et al., 2000). Female students indicated higher non-STEM ability self-concepts and higher German midterm grades than male students.

Gender Differences in Prediction of Science-Oriented Final Exam Choices
Consistent with prior research within the expectancy-value framework (Wigfield and Eccles, 2000), our results indicate that academic choices can be traced back to prior performance, ability self-concepts, and values which dynamically interact (e.g., Köller et al., 2000; Guo et al., 2015a). We went beyond the assumptions of EVT by incorporating DC and including non-STEM subjects students have to weigh against each other when deciding which science subjects to select for their final exam. Hence, we also looked at the (relative) impact of grades and ability self-concepts in other, non-STEM domains. Our hypotheses were partly supported. Not all mathematics and science-related variables contributed to the prediction of final exam subject choices. First, the mathematics-related variables were stronger predictors than the science-related variables. After controlling for the mathematics-related variables, the science-related variables no longer predicted final exam subject choices. While we had refrained from specifying hypotheses regarding the relative impact of science vs. mathematics-related predictors of final exam choices, this result was rather surprising. One possible interpretation is that mathematics is perceived by students as the overarching discipline which provides the knowledge base for all scientific subjects. This can, for instance, imply that a student who receives low grades in mathematics or has a low mathematics-related ability self-concept may also not be very attracted to the science subjects. An additional possible explanation is that students attend a much higher number of lessons in mathematics than in any of the science subjects they may choose from. Hence, they also receive relatively more feedback on their mathematics-related competence. This may explain why, when deciding which examination subjects to choose, they rely more on their assessment of their mathematics-related skills than on their perception of how well they perform in science. Second, across our models, only achievement test scores and ability self-concepts turned out to be relevant. Because we incorporated predictors from various domains on the one hand and, within each domain, predictors from EVT, DC, and achievement test scores on the other, we revealed several moderations. While mathematics self-concept robustly predicted science-oriented choice patterns, mathematics achievement disappeared as a predictor once the gender interaction terms were introduced in Models 3-5. Third, contrary to expectations, relative grades had only a small effect on final exam choices. In particular, the effects of the relative mathematics-German midterm grade disappeared once we controlled for the impact of the non-STEM ability self-concepts.
We had anticipated that the results of comparisons of performances across STEM and non-STEM subjects would impact final exam choices in male and female students differently. This assumption could be confirmed for the DC regarding the mathematics-German and mathematics-English midterm grades, but not for the ability self-concepts. As expected, no gender interactions occurred for any of the other variables. It seems that male and female students have the same motives when deciding for science-oriented exams. However, the impact of DC between subjects was more pronounced for male students. Only for them did relatively lower midterm grades in English as compared to mathematics matter for the prediction of exam subject choices.
Overall, the pattern of predictors of male and female students' subject choices for their final exams was quite similar. Both male and female students were more likely to choose STEM subjects when they showed higher mathematics test scores and a higher mathematics ability self-concept. So our descriptive findings showed that the absolute number and percentages of final exam choices differed between male and female students, but the motives for these choices seemed to be quite similar for both genders.
One of the questions we wanted to answer with our study was whether the possibility of choosing between different subjects for the final examination promoted gendered choice patterns. Our results show little evidence to support this assumption. It seems that comparisons made by students about the midterm grades they received in different subjects have little influence on their choice decisions. We did find evidence that male students were more likely to opt for the science-oriented choice pattern the poorer their midterm grades in English were compared to those in mathematics. However, we did not find evidence that the likelihood for female students to fall into the science-avoidance choice pattern increased in proportion to the extent to which their midterm grades in the non-STEM subjects surpassed those in mathematics. Thus, the DC processes we investigated in our study do not seem to contribute to an explanation of why female students in particular rarely choose science-oriented choice patterns. Von Keyserlingk et al. (2019) recently showed that the effect of social comparisons on choosing a math-intensive university major is not gendered either. So the existing evidence seems to indicate that gender differences in academic choices toward STEM might not be provoked by comparison processes. This issue will need further research. One approach could be to compare the percentages of girls and boys selecting STEM subjects in educational systems that differ in how strictly they regulate the possibility to opt out of STEM. Results of such research could show gender-related consequences of academic choices offered to students and thus provide evidence on which to base structural reforms.

Limitations
There are a number of limitations to this study that must be acknowledged. Our research concentrated on one federal state of Germany with a specific system for choosing subjects for final exams. Comparative studies incorporating multiple federal states and/or countries need to reveal how systems with different regulations support students and especially female students to stay in STEM education.
Within the restrictions of final exam subject choices, students can only decide to take as few science or mathematics subjects as possible; they cannot opt out of STEM entirely. Hence, in the given system, we could not create a distinct science opt-out group. Taking an exam in biology could be an attempt to avoid other science subjects, but it could also be the expression of a special interest in this subject. So in this group, we might also have identified students with this interest and mixed them with students who actually did choose this subject as their final exam to opt out of science as much as possible. Therefore, our distinction should be seen as a proxy for science orientation in exam choice. However, our descriptive results showed that we succeeded in discriminating these two groups. The already proposed comparative approach could further shed light on these issues.
Moreover, we could not shed light on predictors of opting for specific subjects, for example predictors of taking a biology exam vs. predictors of taking a chemistry exam. Students can already drop science courses before they decide on their final exams. Therefore, focusing on these three subjects means investigating selective samples of students, so results across subjects cannot be compared.
Lastly, we could only rely on cross-sectional data and cannot make assumptions as to whether science-oriented choice patterns for final exams lead to choosing STEM study fields or occupations in this area. This issue could be addressed by comprehensive longitudinal studies incorporating psychological, sociological, and structural perspectives in order to trace STEM pathways of female and male students.
Our study considered several psychological variables simultaneously. However, sociological, structural, and environmental variables, such as teacher expectations, social background, or support from parents, may also impact academic choices (e.g., Neugebauer and Schindler, 2012; Cerinsek et al., 2013). We are not aware of a study that incorporates all these perspectives, but such an approach would shed light on the impact which each of these variables has on academic choices in favor of STEM.

Conclusion
Our results indicate that female and male students have similar motives when choosing science-oriented final exams. However, we did find one gender difference: While male students tended to opt for science-oriented final exams when their mathematics midterm grade was better than their English midterm grade, relative mathematics grade did not matter for female students' choice patterns. This finding suggests that female and male students need to be encouraged differently regarding the DC aspect. Male students might profit from a direct intervention, instructing them how to judge in a way that is more independent from their performance in other non-STEM subjects. A study looking at gender interactions of relative grades for a variety of subjects would show whether this intervention could also be fruitful for final exam taking in other subjects.
By incorporating STEM and non-STEM subjects (German and English), we found only weak evidence that DC influence students' exam choices. Possibly, stronger effects can be shown in future studies simultaneously investigating multiple and diverse domains (e.g., STEM, English, or history), and comparisons across different science subjects (e.g., physics, chemistry, and biology; cf., Möller and Marsh, 2013). Future research should target the processes of academic choices and the choice of final exam subjects in longitudinal studies for a broader array of domains, and capture their importance for choice of study fields at university level. Such an approach would lead to deeper insights into the mechanisms of academic choices in favor of STEM.

DATA AVAILABILITY STATEMENT
The research team submitted the data of the entire study to the online repository of the Research Data Center at the Institute for Educational Quality Improvement (IQB) in January 2020. The data are published on their webpage (https://www.iqb.hu-berlin.de/fdz/studies/LISA_6/).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Data Protection Officer of the Federal State of Schleswig-Holstein. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
NK wrote the first draft of the manuscript and conducted the analyses. SK and BH made substantial, direct, and intellectual contributions to the revision of the manuscript. All authors approved the publication of the manuscript.

FUNDING
This study was funded by the Leibniz-Foundation, Germany.