AUTHOR=Voigtmann Sophia, Speyer Augustin TITLE=Information Density and the Extraposition of German Relative Clauses JOURNAL=Frontiers in Psychology VOLUME=Volume 12 - 2021 YEAR=2021 URL=https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2021.650969 DOI=10.3389/fpsyg.2021.650969 ISSN=1664-1078 ABSTRACT=This paper aims to find a correlation between Information Density (ID) in the sense of Shannon (1948) and the extraposition of relative clauses (RCs) in Early New High German. ID can be defined as the “amount of information per unit comprising the utterance” (Levy and Jaeger 2007, 1). Since surprisal is linked to processing difficulty (Hale 2001), frequent combinations with low surprisal values burden working memory less than rare combinations with higher surprisal values (Levy and Jaeger 2007, Hale 2001). To improve text comprehension, producers therefore distribute information as evenly as possible across a discourse (“Uniform Information Density Hypothesis (UID)”, Levy and Jaeger 2007). Extraposed RCs are accordingly expected to have higher surprisal values than embedded RCs. We seek evidence for this idea in RCs taken from scientific texts from the 17th to 19th centuries. We built a corpus of tokenized, serialized, lemmatized and normalized articles about theology and medicine from the 17th and 19th centuries, manually determined the RC variants, and trained a skip-gram language model (Guthrie et al. 2006) to compute the 2-skip-bigram surprisal of every word in the relevant sentences. A logistic regression (lme4 package in R, Bates 2004) over the summed surprisal values yields a significant result, which indicates a correlation between surprisal values and extraposition. For these periods, RCs are thus more likely to be extraposed when they have a high summed surprisal value. The influence of surprisal values also appears stable over time: a comparison of the analyzed language periods shows no significant change.
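
The surprisal measure mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that surprisal is the negative base-2 logarithm of a word's probability given the immediately preceding word, that this probability is estimated from 2-skip-bigram counts, and that add-one smoothing is applied. The function names, the smoothing choice, and the toy corpus are all hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def skip_bigrams(tokens, k=2):
    """Yield (context, target) pairs with at most k tokens skipped between them."""
    for i, target in enumerate(tokens):
        for gap in range(k + 1):
            j = i - 1 - gap
            if j >= 0:
                yield tokens[j], target


def train(sentences, k=2):
    """Count k-skip-bigrams over tokenized sentences (hypothetical helper)."""
    pair_counts = Counter()
    context_counts = Counter()
    vocab = set()
    for sent in sentences:
        vocab.update(sent)
        for ctx, tgt in skip_bigrams(sent, k):
            pair_counts[(ctx, tgt)] += 1
            context_counts[ctx] += 1
    return pair_counts, context_counts, vocab


def surprisal(ctx, tgt, model):
    """Surprisal in bits of tgt given ctx; add-one smoothing is an assumption."""
    pair_counts, context_counts, vocab = model
    p = (pair_counts[(ctx, tgt)] + 1) / (context_counts[ctx] + len(vocab))
    return -math.log2(p)


def summed_surprisal(tokens, model):
    """Sum the surprisal of each word given its predecessor.

    The first word has no context and is skipped (an assumption of this sketch).
    """
    return sum(surprisal(prev, cur, model) for prev, cur in zip(tokens, tokens[1:]))


# Toy corpus, invented for illustration only.
toy_corpus = [["a", "b", "c"], ["a", "c"]]
model = train(toy_corpus, k=2)
frequent = surprisal("a", "c", model)  # pair seen twice
rare = surprisal("a", "b", model)      # pair seen once
total = summed_surprisal(["a", "b", "c"], model)
```

In this toy model the more frequent pair receives the lower surprisal, which mirrors the abstract's contrast between frequent and rare combinations. A summed value of this kind, computed per relative clause, is the sort of predictor the abstract describes entering into the logistic regression.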