BRIEF RESEARCH REPORT article

Front. Artif. Intell.
Sec. AI for Human Learning and Behavior Change
Volume 7 - 2024 | doi: 10.3389/frai.2024.1412710

The impact of text topic and assumed human vs. AI authorship on competence and quality assessment

Provisionally Accepted

Sebastian Proksch1, Julia Schühle1, Elisabeth Streeb1, Finn Weymann1, Teresa Luther2, Joachim Kimmerle1,2*
  • 1University of Tübingen, Germany
  • 2Leibniz-Institut für Wissensmedien (IWM), Germany

The final, formatted version of the article will be published soon.


The aim of this study was to investigate how texts with moral or technological topics, allegedly written either by a human author or by ChatGPT, are perceived. In a randomized controlled experiment, n=164 participants read six texts, three of which had a moral and three a technological topic (predictor text topic). The alleged author of each text was randomly labeled either "ChatGPT" or "human author" (predictor authorship). We captured three dependent variables: assessment of author competence, assessment of content quality, and participants' intention to submit the text in a hypothetical university course (sharing intention). We hypothesized interaction effects, that is, we expected ChatGPT to score lower than alleged human authors for moral topics and higher than alleged human authors for technological topics. In contrast to the hypotheses, we did not find any significant interaction effects. However, ChatGPT was consistently devalued compared to alleged human authors across all dependent variables: There were main effects of authorship on the assessment of author competence, β=.78, t(326)=7.36, p<.001, d=0.81; on the assessment of content quality, β=.30, t(326)=3.31, p<.001, d=0.37; as well as on sharing intention, β=.85, t(326)=5.07, p<.001, d=0.56. There was also a main effect of text topic on the assessment of text quality, β=.21, t(326)=2.35, p=.019, d=0.26. These results are more in line with previous findings on algorithm aversion than with algorithm appreciation. We discuss the implications of these findings for the acceptance of the use of LLMs for text composition.

Keywords: large language models, ChatGPT, competence, quality assessment, morality, technology, algorithm aversion

Received: 05 Apr 2024; Accepted: 14 May 2024.

Copyright: © 2024 Proksch, Schühle, Streeb, Weymann, Luther and Kimmerle. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Mx. Joachim Kimmerle, Leibniz-Institut für Wissensmedien (IWM), Tübingen, Germany