Evaluation of Science Communication: Current Practices, Challenges, and Future Implications

Scientifically substantiated evaluations are pivotal to ensuring the effectiveness and improvement of the growing number of science communication projects. Yet current evaluation practices are still lacking in various respects. Based on a systematic review of evaluation reports, an online survey of science communication practitioners in the German-speaking countries, and discussion rounds with these practitioners, we discuss three main challenges of science communication evaluation: (1) There is a conflation of impact goals and measurable project objectives as well as a lack of precise definitions of objectives and target groups, which complicates the assessment of the projects' success. (2) Although many evaluations highlight the impact-oriented interest of those responsible, the methods chosen rarely allow scientifically valid evaluations of effects. The lack of comparative reference points and the partially unsuitable use of self-report measures are key issues in this regard. (3) The fact that few evaluation processes are made transparent and that formative evaluation designs are a rarity indicates a tendency to understand evaluations as the final ‘success story’ of a project rather than a learning process. This stands in the way of a constructive discussion of the actual impact of science communication. Our exploratory insights contribute to an understanding of the weaknesses of science communication evaluation and needs in the field. They also provide impulses for future improvements in the field for the stakeholders in practice, research, funding, and science management.


INTRODUCTION
For those dedicated to science communication, 2020 will probably be remembered as the year their fields took on new significance in the public eye. Science communication has already changed profoundly in recent years and has become increasingly institutionalized and diversified: New types of actors like the Science Media Centre (2012) have entered the field, and the networks for exchange in the science communication community are growing (e.g., European Citizen Science Association, 2021; European Science Engagement Association, n.d.). Apart from that, the variety of science communication activities and channels increases as new online communication services emerge and offer novel ways for interaction with audiences (Schäfer, 2017, p. 52). This trend can also be observed in Germany, where more and more science communicators on analyses and exchanges with various stakeholders, especially the following:
• An online survey with 109 German science communicators (Impact Unit, 2019), focusing on the goals of science communication, their evaluation experiences and routines, their perceptions of the quality of evaluation, and the needs they identify for better evaluation.
• A systematic review of 55 evaluation reports (Ziegler and Hedder, 2020) of German-speaking science communication projects, focusing on the projects' goals, objectives, and target groups, as well as motives and methods for evaluation.
• Several informal discussion rounds on challenges and needs (2019-2020), with stakeholders from science communication research, funding, and practice in Germany. These included practitioners with varying experiences in evaluation.
This article's claims are subject to several constraints. Our analyses are mostly focused on the German case and rely on small sample sizes or only on publicly available sources (such as evaluation reports). Nevertheless, we have observed three challenges that come up consistently throughout all our analyses and exchanges. Based on our extensive reflection, we believe these to be central when working toward a better evaluation practice for impactful science communication. In the following, we outline these challenges, before discussing the roles of researchers, practitioners, and other relevant stakeholders within the academic system in overcoming them.

STRATEGIC APPROACH TO PROJECT DESIGN
Clear expectations of what a project is supposed to accomplish, and why, are necessary criteria for a strategic project design and an informative evaluation of its effectiveness (Spicer, 2017, p. 21 f.). Strategic communication differentiates between goals, meaning general guidelines or end results, and objectives, which are the concrete communication outcomes desired (Hon, 1998, p. 105) that contribute to reaching the goals (Hallahan, 2015, p. 247). For science communication to be strategic, this implies "choosing one's goal for communication, determining interim communication objectives [...], and then selecting tactics that have a realistic chance of meeting those objectives" (Besley et al., 2018, p. 709). But there are doubts whether science communication projects appropriately do so. Some scholars question whether the choice of activities and tactics is in line with the project initiators' communication objectives (Stilgoe et al., 2014, p. 6), while others see a disconnect between objectives and evaluated outcomes (Phillips et al., 2018). Looking at the German case, we see similar issues reflected in the way objectives and target groups are defined. For one, the phrasing of objectives lacks precision: A project might state a wish to raise awareness of an issue without defining what it means 'to be aware.' Other projects might strive to 'encourage' people to think about scientific topics or to gain 'more' visitors, without giving reference points. This cautious phrasing lowers the bar to meet expectations but complicates the judgment of the success of a project or an activity's potential. Furthermore, broadly formulated objectives put the focus on detecting any effect instead of the size of specific effects (Ziegler and Hedder, 2020, p. 19 f.).
This room for interpretation might reflect a wish to maintain flexibility when it comes to managing expectations or even an uncertainty about where to actually set the bar, especially when exploring new formats or experimenting. According to our community survey, 73% of the participants stated that their projects are born mostly out of curiosity about a new activity and new ideas rather than chosen based on their fit to achieve predefined objectives (n = 94; Impact Unit, 2019, p. 19).
Part of the problem seems to be the process of breaking down goals into concrete objectives. Our review of evaluation reports shows that the practitioners are experienced in explaining their long-term missions (Ziegler and Hedder, 2020, p. 16 ff.) and positioning their projects within the big picture. Discussions with the practitioners left the impression that difficulties occur when they need to pick apart the mission and identify those puzzle pieces which are measurable within a time-limited activity, an issue that has also been brought up by Weitkamp (2015) and King et al. (2015).
But this is not the only obstacle: Once objectives have been derived from goals, suitable tactics and activities need to be found and tailored to a specific target group. However, in our review of evaluations, target groups are mostly described in broad terms by referring to basic sociodemographic characteristics, prominently age and gender. More concrete descriptions of the desired audience are rare. Even when more specific demographics are defined, using terms like 'main target group' opens a backdoor to include others (Ziegler and Hedder, 2020, p. 19). Examples of this are the frequently mentioned target groups 'school children' or 'the general public.' Members of both groups are defined by a small set of indicators they have in common: being young and in school, or being part of the public. However, this misses a chance of appropriately addressing the multiple subgroups they contain. As Schäfer and Metag point out, another look at the differences within, especially regarding science attitudes, can be informative for planning communication activities (Schäfer and Metag, 2021, p. 300) and, consequently, their evaluation. This does not mean that comprehensive target groups cannot be of interest, but it is advisable that their diversity is considered.
We believe it is important that practitioners recognize the value of a strategic mindset when planning their activities. Objectives should not serve as low hurdles that can be easily overcome but as motivation and orientation for what is important within the project. Similarly, target groups can help navigate the wide choice of communication activities when their special preferences and peculiarities are considered. With this in mind, defining goals, objectives, and target groups can offer the opportunity for reflection on a project and how it can be meaningfully evaluated.

CHOICE OF METHODS AND STUDY DESIGN
Many characteristics of the evaluations in our review, like their summative evaluation designs, posed research questions, and chosen data sources, indicate that the examination of effects is a key motivator (Ziegler and Hedder, 2020). Whenever effects are the focus of an evaluation and elaborate designs are necessary, a lack of precision of objectives and target groups can complicate the choice of study design and methods. Accordingly, the methodological flaws mentioned by Jensen (2014) also apply in our context: To gather insights into effects, reference points for comparison are essential. After all, no change, for better or for worse, can be determined with only one data point. Credible procedures to provide such comparisons are repeated measures, as in pre- and post-designs, and the use of control groups during evaluation. Looking at current evaluation practices, such comparisons are rare. Both the community survey and the evaluation report review show that control groups are seldom used in science communication evaluations. Pre- and post-designs come up more regularly, in roughly a third of the cases (Impact Unit, 2019, p. 22; Ziegler and Hedder, 2020, p. 24). Consequently, for the remaining evaluations interested in effects, these can only be judged based on insufficient data as they rely on self-report, meaning survey participants' memory and ability to reflect and compare their feelings, judgments, and thoughts. This is exacerbated when third parties like teachers are asked to judge the effects of an activity on the target group (e.g., school students). Overvaluing these sources that can only offer indirect information increases the risk of redesigning formats while missing the real target groups' interests (Jensen, 2014, p. 2).
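To illustrate why reference points matter, the following minimal sketch compares a pre- and post-measurement for participants with the same measurement in a control group. It is an illustration under stated assumptions, not a procedure taken from the reviewed evaluations: the function names, the 1-5 interest scale, and all scores are hypothetical, and a real evaluation would also need adequate sample sizes and significance testing.

```python
from statistics import mean


def mean_change(pre, post):
    """Average within-person change between two measurement points."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain paired scores")
    return mean(after - before for before, after in zip(pre, post))


def difference_in_differences(part_pre, part_post, ctrl_pre, ctrl_post):
    """Change among participants minus change in the control group."""
    return mean_change(part_pre, part_post) - mean_change(ctrl_pre, ctrl_post)


# Hypothetical interest ratings (1 = low, 5 = high); invented for illustration.
participants_pre = [2, 3, 3, 4, 2]
participants_post = [4, 4, 3, 5, 3]
control_pre = [2, 3, 4, 3, 3]
control_post = [3, 3, 4, 4, 3]

# A post-only average has no reference point and says nothing about change.
print("Post-only mean:", mean(participants_post))

# The pre/post comparison shows change, but not whether the activity caused it.
print("Participant change:", mean_change(participants_pre, participants_post))

# The control group shows how much change occurred without the activity.
print("Control change:", mean_change(control_pre, control_post))

# Only the difference between both changes approximates the activity's effect.
print(
    "Difference in differences:",
    difference_in_differences(
        participants_pre, participants_post, control_pre, control_post
    ),
)
```

In this invented example, part of the participants' increase also appears in the control group, so attributing the whole pre/post change to the activity would overstate its effect.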
Since we did not witness the decision-making processes during these evaluations, we were not able to reconstruct the choices that were made. However, looking back on discussion rounds with practitioners, we felt that short-term planning seems to be a central factor. Choosing the right methods, defining suitable data sources, scheduling repeated measures, and preparing instruments require early evaluation planning. In reality, it is often too late for many of these decisions once practitioners (can) start planning evaluations. In such cases, they might inevitably turn to what is well-known, seemingly cost-efficient, and presumably easy to conduct. Limited knowledge about possible methods and data sources might result in evaluations being planned around what data one knows how to collect, instead of what information is of actual interest.
We are aware that measuring effects is ambitious. If it cannot be done properly, practitioners are better off focusing on examinations of descriptive findings that enable an informed reflection. However, methodological rigor is indispensable, no matter the interest of the evaluation. To make sure that appropriate conclusions are drawn, evaluations need to be systematically planned, starting with clear questions that lead to the data of interest, to the most valid data sources and, finally, to the best-fitting methods and time frames for data collection. Practitioners not only need time and resources to undergo this process but also the relevant information to base their decisions on.

UNDERSTANDING OF EVALUATION
According to our survey, 36% of the science communicators in Germany agree that projects are evaluated often if not always (n = 96; Impact Unit, 2019, p. 21). Unfortunately, this does not mean that these evaluations are open to everyone to learn from. Our own search for accessible best practices in the German-speaking community demonstrated how difficult it is to find benchmarks in comparable contexts. Our examination of the first 50 findings of each of the 68 keyword combinations we searched for (Ziegler and Hedder, 2020, p. 36 ff.) yielded a relatively small number of 55 science communication evaluation reports. This is not surprising though: As the community survey shows, evaluations are mostly used to reflect upon a project within the team (79%) and to improve future projects (64%), and their findings are commonly passed on to supervisors and/or funders (65%). Sharing findings for research purposes is not as established (18%; n = 72; Impact Unit, 2019, p. 26). Also, the examples we found online were mostly reports of summative evaluations. Formative evaluations that would allow a deeper understanding of how a project is developed, reflected, and improved are scarce.
These observations may be related to a persistent framing of evaluations as 'telling success stories.' Following this logic, the evaluation process is not as valuable for outsiders as its results. A further reason for not making evaluations accessible is that it might invite criticism. Therefore, failed attempts or mediocre results, which could still stimulate learning, are not disclosed. In our discussion rounds, the practitioners expressed a worry about their work being assessed negatively by others, especially when evaluations are closely linked to the justification of budgets or funding.
In contrast to this, a constructive approach to evaluation needs to be based on curiosity about a project's potential and openness to learning from failures. Certainly, wanting to shift the idea of evaluation in a more productive direction where honest reflections and transparency are encouraged is not a controversial standpoint (e.g., Jensen and Gerber, 2020). Practitioners, researchers, and institutional stakeholders would agree that issues like time and resources pose a greater challenge than motivation. Difficulties arise when it comes to determining the practical implications and assigning roles and responsibilities within the science communication community in this process.

IMPLICATIONS FOR FUTURE PRACTICE
It has become clear that evaluations in science communication are still lacking in central aspects. In order to make evaluation a deliberately planned learning process that builds on existing knowledge, delivers insights into the impact of science communication, and thereby allows evidence-based decisions concerning its development and funding, profound changes need to be made. This will only be possible through the contributions of all the stakeholders in the field. Practitioners can contribute decisively by strategically planning activities and allocating resources within projects. Their work needs to be based on a regular critical reflection and a motivation to apply the latest knowledge in the field. But practitioners should not be expected to do the same work as researchers; therefore, meaningful cooperation between research and practice is key. Even if practitioners are equipped with the right information and tools, social scientists' expertise will remain relevant to measuring impact and developing strategies for effective science communication. The contribution of scientists researching science communication includes not only enabling access to scientific results but also communicating findings that are especially relevant to practice. Moreover, the stakeholders at the management level of scientific organizations and research institutes, as well as the funders of and the policymakers for science communication, need to be clear about their science communication goals so that the practitioners are able to derive their project objectives accordingly. By providing the wider context, they become part of the conversation about appropriate goals of science communication.
Further training for practitioners plays an important role in improving evaluations. Consequently, there should be opportunities and support for learning within organizations and funding schemes, for example, in the form of training programs on evaluation and strategic project planning. Learning opportunities are also central to addressing methodological shortcomings in evaluation practice. Experts from social sciences and evaluation research can be of help by making instruments, measures, and scales more readily available. This allows practitioners to use scientifically sound examples as orientation, instead of designing their own instruments from scratch. Of course, this will not solve the need for guidelines and quality standards in evaluation, including minimum requirements concerning methodological rigor for a wide spectrum of methods and study designs. This task requires scientific expertise and, ideally, an international exchange but cannot succeed without funders and executives as a driving force to accept and implement these standards.
However, it is undeniable that elaborate evaluation designs cannot be conducted 'on the side.' Even though evaluation practice should embrace quality standards, it will not replace academic impact research. There needs to be a discussion of what can be expected from meaningful evaluations conducted by practitioners, at which point external experts or researchers are appointed, and where we draw the line between evaluation and research. Finally, we encourage the stakeholders from the management level, the funders, and the policymakers to demand meaningful and reasonable evaluation planning early on but also to provide sufficient resources for it. For practitioners to evaluate honestly and with enthusiasm, these stakeholders must show interest in a project's learning opportunities, not only in its final results.
Even though resolving these issues will take time, we are convinced that our field will benefit from a better understanding of how specific activities of science communication work, when to use them, and where to invest resources to actually make a difference.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found at: https://zenodo.org/record/4608091#.YGLT7mhCRhA.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
RZ conceived the concept for this perspective. IH contributed to the final conceptualization and wrote the first draft of the manuscript. LF, IH, and RZ revised the manuscript and contributed to the critical editing and finalizing of the manuscript. All authors approve the submitted version.

FUNDING
The project on which this report is based was funded by the German Federal Ministry of Education and Research under Grant No. 0150862. The responsibility for the content of this publication lies with the authors.

ACKNOWLEDGMENTS
We would like to thank Philipp Niemann (National Institute for Science Communication) and Markus Weißkopf (Wissenschaft im Dialog) for their valuable comments on a draft version of our manuscript. Further thanks go to our reviewers and our editor whose ideas have increased the value of the contribution immensely. We would also like to thank the practitioners and the funders in our workshops and roundtables for the insightful conversations about their experiences in evaluation, as well as the researchers in the field of science communication for the inspiring discussions that contributed to our work.