Quality Assessment in Co-developing Climate Services in Norway and the Netherlands

Climate services, and research on climate services, have developed together over the past 20 years, with quality assessment a central issue for orienting both practitioners and researchers. However, quality assessment is becoming more complex as the field evolves, the range and types of climate services expand, and there is an increasing appeal to co-production of climate services. Scholars describe climate services as emerging from complex knowledge systems, where information moves through institutions and actors attribute various qualities to these services. Seeing climate services' qualities as derived from and activated in knowledge systems, we argue for comprehensive assessment conducted with an extended peer community of actors from the system: co-evaluation. Drawing inspiration from Knowledge Quality Assessment and post-normal science traditions, we develop the Co-QA assessment framework, a checklist-based framework for the co-creation of criteria to assess the quality of climate services. The Co-QA framework is a deliberation support tool for critical dialogue on the quality of climate services within a co-construction collective. It provides a novel, structured, and comprehensive way to engage an extended peer community in the process of quality assessment of climate services. We demonstrate how we tested the Co-QA (through interviews, focus groups and desktop research) in two co-production processes of innovative climate services: an ex post evaluation of the "Klimathon" in Bergen, Norway, and an ex ante evaluation for designing place-based climate services in Dordrecht, the Netherlands. These cases reveal the challenges of assessing climate services in complex knowledge systems, where many concerns cannot be captured in straightforward metrics, and they show the utility of the Co-QA in facilitating co-evaluation.


INTRODUCTION
The field of climate services is establishing itself as important for "the provision of climate information in ways that supports decision-making through engagement with the users of that information" (Bruno Soares and Buontempo, 2019, p. 4). The past 15 years have seen a rush of climate service-labeled initiatives, both public and private, to translate and transfer scientific climate information for use in various institutions worldwide (Vaughan et al., 2018), from French utility giant EDF (Bruno Soares and Dessai, 2015) to small groups of African farmers (Tall et al., 2018). One important challenge remains how to assess the quality of this information, where "quality" is related to "both the different types of uncertainty in knowledge and the intended functions of the information" (Funtowicz and Ravetz, 1993, p. 740). When scientific knowledge is used for informing societal decision-making, its quality should thus be assessed not only according to the internal epistemic norms of the scientific community but also according to its external "fitness for function" (Craye et al., 2005). Indeed, efforts to better link science production and use have seen a multitude of initiatives to "co-produce" climate services [e.g., Bremer et al. (2019c) and Vincent et al. (2018)], leading to a more fluid situation around who or what the producers, users, forms, and purposes of climate services might be.
The challenge is how to recognize and appraise quality in uncertain and malleable information, which travels through various institutions and is interpreted toward different ends (implying changing functions of knowledge use) along the way. In a research institution, the quality of "normal" disciplinary science is established by a bounded, somewhat stable, and largely agreed set of epistemic norms and criteria, through standards of good scientific practice and peer review procedures. But deploying scientific climate information "outside the lab," to support climate-related decisions characterized by uncertainty, plurality, high stakes, and urgency, opens up fundamentally new norms of quality (Funtowicz and Ravetz, 1993). Knowledge quality criteria become unbounded, highly unstable, and contentious.
Notwithstanding these challenges, scholars argue it is important to assess climate services quality in order to: (i) develop information that is fitted to institutions' functions and problems; (ii) demonstrate the particular outcomes, impacts, and added value for an institution; (iii) justify public and private investment; and (iv) distill lessons for climate services scholarship and practice, including lessons on evaluation itself (Tall et al., 2018; Bruno Soares and Buontempo, 2019; Vaughan et al., 2019b; Lemos et al., 2020). Reviews show that evaluation is becoming more commonplace in climate services initiatives but that there are varying levels of commitment and no commonly accepted approaches or frameworks, with the consequence that many evaluations adopt a narrow perspective on quality that assesses only a subset of qualities (Vaughan et al., 2019a, b; Tall et al., 2018). Very often this sees a division between assessing either information's scientific rigor ("getting the science right") or some measure of its use ("getting the science used"), something which arguably reinforces a disconnect between science and policy/practice and reifies dichotomous and simplified categories of science "providers" and "users." It also creates a blind spot around other relevant qualities of climate services: cultural, social and ethical.
Here we offer a fresh perspective and approach to the challenge of assessing climate services' quality, as distinct from the work on value (Ford et al., 2013; Vaughan and Dessai, 2014; Meadow et al., 2015; Vogel et al., 2017; Wall et al., 2017; Vaughan et al., 2019a). We adopt a perspective that climate services emerge from and travel through context-specific "knowledge systems" of institutions and actors (Buizer et al., 2016), accumulating diverse characteristics or qualities in the process, from scientific rigor to practical usefulness, political legitimacy or cultural appropriateness, for instance. These characteristics are bundled in unique configurations, and politically contested, by actors in institutions appraising the quality (or fitness) of climate services for particular functions. From this point of departure, our research was steered by the question: how can we comprehensively identify the characteristics associated with a climate service which determine its quality for particular functions in a particular context? This question in turn translates into the two aims of our research and this paper: to (i) develop a framework for identifying climate services' characteristics in order to collaboratively assess their quality; and (ii) test the framework through cases to study how it supports climate service assessment.
Section Assessing the Quality of Climate Services starts from our argument that climate service assessment tends to focus either on products' inherent scientific quality conferred in the lab, or relative to the various standards of use that differ across institutional spheres. We join others (Meadow et al., 2015; Vincent et al., 2018) in recommending more comprehensive and rounded assessment of the constitutive qualities of products, in collaboration with an "extended peer community" of actors in a knowledge system. Section Knowledge Quality Assessment and the Co-QA Assessment Framework suggests that the field of Knowledge Quality Assessment offers insights into comprehensive and collaborative assessment, and goes on to present the novel Co-QA (Collaborative Quality Assessment) framework. Section Case Studies and Methods demonstrates how we implemented the Co-QA framework in two case studies of different climate services: an ex post evaluation of the "Klimathon" in Bergen, Norway, and an ex ante evaluation for designing place-based climate services in Dordrecht, the Netherlands. Both cases were conducted in the context of the European Research Area for Climate Services project "Co-development of place-based climate services for action" (CoCliServ). Section Results: Assessing Climate Services and the Co-QA Framework presents the findings of these evaluations, including an appraisal of how the framework performed in each case study, before Section Discussion finishes with some commentary on the framework, and on the wider importance of comprehensive and bottom-up "co-evaluation."

ASSESSING THE QUALITY OF CLIMATE SERVICES
In conceptually framing our research we adopted a perspective held by some climate service scholars who see climate information as emerging from and traveling through complex and heterogeneous "knowledge systems" (Kirchhoff et al., 2013; Bruno Soares and Dessai, 2015). Buizer et al. (2016, p. 4598) discuss knowledge systems as "networks of linked actors, organizations and objects that perform a number of knowledge-related functions [...] involved in linking knowledge and know-how with action." This echoes the classic work of Star and Griesemer (1989), who described such systems as ecologies of intersecting institutions, or social worlds, wherein actors attribute different meanings and uses to scientific information and variously appraise its qualities. An example of one such climate knowledge system is the Norwegian flooding simulation described by Bremer et al. (2019c), which was commissioned by a utilities company, derived from data of the Water and Energy Directorate, produced by a consultancy, and deployed in public fora as part of a municipality policy process.
Within knowledge systems, Star and Griesemer (1989, p. 388) noted, "scientific actors face many problems in trying to ensure integrity of information in the presence of such diversity." Information is re-interpreted and re-packaged as it travels and is translated to the particular institutional rules, norms and cultures that it passes through [see Scott (2014)]. These problems of knowledge quality are amplified when knowledge systems face "wicked" (Rittel and Webber, 1973) or "post-normal" (Funtowicz and Ravetz, 1993) problems like climate adaptation. Under conditions of high uncertainty and high stakes, quality is not universally agreed or inherent to information products. At best, quality is contingent on knowledge's fitness for particular functions, opening up nearly infinite possible quality criteria, always in flux as our understanding of the problem evolves (Funtowicz and Ravetz, 1990). The status of climate services can change as they travel in a knowledge system; they may remain "information" or become interpreted and enacted as knowledge, or as more diffuse understandings.
Under these conditions, what does it mean to talk about the quality of climate information? There is an enduring tradition of appraising knowledge as "justified true belief," but as post-normal science scholars (Funtowicz and Ravetz, 1990, 1993) point out, where knowledge faces high stakes problems characterized by significant uncertainties, like the future of climatic change, its approximation to the "truth" ceases to be a universal standard of quality. This opens up a plurality of more context-specific standards (political, cultural, practical and so on) by which knowledge's quality can be appraised and trusted, relative to the problem at hand. As such, the post-normal science perspective sees knowledge quality as determined via plural standards, but with a common concern for its fitness for the purpose of addressing a problem. Deciding which standards of quality should be deployed in assessing a climate service is then a highly political choice of which characteristics of knowledge or information are most important for supporting climate adaptation: is it their conforming to rigorous scientific methods? Their political expediency? Their practical implications? We see that scholars and practitioners have adopted three broad approaches to determining climate services' quality.
In one set of articles, a climate service's quality mainly corresponds to its scientific robustness as determined by normal disciplinary peer review and widely accepted standards of good scientific practice (epistemic norms), and typically discussed as data pedigree and predictive skill. Here quality is determined by the logics of scientific disciplines and their standards of what constitutes rigorous methods and data collection, upheld by recognized scientists in those fields. When a product like a seasonal forecast is deemed scientifically robust, the main concern then is that this information is not "distorted" as it moves through a knowledge system (Vaughan et al., 2019a). Like "immutable mobiles" (Latour, 2005), large, centralized climate information providers such as the European Center for Medium Range Weather Forecasts (Bruno Soares and Dessai, 2015), or Copernicus Climate Change Service (Perrels, 2020) issue "standardized packages" (see Fujimura, 1992; also Kirchhoff et al., 2013) of data, information and tools. Quality is thus attached to scientific standards, and travels with the information (Vaughan et al., 2019b).
A second set of articles sees a climate service's quality corresponding to its plasticity for being adopted and used across different institutional settings, like a "boundary object" (Kirchhoff et al., 2013; Meadow et al., 2015; Buizer et al., 2016). For Star and Griesemer (1989, p. 393) boundary objects satisfy disparate information requirements in different institutional settings, "plastic enough to adapt to local needs [...] yet robust enough to maintain a common identity." Seen this way, quality is assessed relative to whether people recognize a climate service as useful and useable according to the particular standards of use in each institution (Dilling and Lemos, 2011). For instance, a seasonal forecast's use will be differently appraised by meteorologists than by an insurance company calculating its losses, or a farmer timing her harvest [see e.g., Tall et al. (2018), Vaughan et al. (2019a,b), and Bouroncle et al. (2019)]. From this standpoint, users are in the best position to determine quality, either through their voiced preferences or through other metrics of "impact," like the uptake of a climate service. But by focusing on what each group makes of climate information, such assessment arguably misses a more holistic appreciation of a product's provenance and the diverse qualities it has inherited from the knowledge system, which need to be considered and weighed together. For instance, many studies evaluate a product's scientific qualities separately from its use and impact (Vaughan et al., 2019a), though the importance of having widely used products based on robust data is obvious. Assessment in this tradition is not totally siloed though.
Inspired by Lemos and Morehouse's (2005) ideas of co-production as "iterative interaction," there is work to improve climate services' use through collaboration between actors in a knowledge system, ranging from loose feedback loops and consultation on "what works," to tight-knit efforts for co-creating services tailored to particular groups (Vaughan and Dessai, 2014).
A third set of articles seeks a more comprehensive and rounded account of the diverse qualities accumulatively attributed to climate services in a knowledge system, integrating a broad suite of criteria (Cash et al., 2006; Vaughan and Dessai, 2014; Meadow et al., 2015; Vincent et al., 2018). This perspective distinguishes between the different types of qualities bound up in a product (see e.g., the distinctions between credibility, legitimacy, and salience of Cash et al., 2003) and recommends considering these qualities together. Resembling approaches to post-normal science, quality assessment becomes a process of weighing imperfect information's various characteristics, including its scientific rigor and practical use, in determining its fitness for certain functions. This is because high-quality climate services are more than just scientifically robust or flexible in use. They fit institutional logics (Harjanne, 2017), connect with institutions' risk perception (Bremer et al., 2019b), nurture relationships (Haines, 2019), empower vulnerable groups (Daly and Dilling, 2019; Turnhout et al., 2020), facilitate social learning, link up with histories and identities (Krauß, 2020; Marschütz et al., 2020), and appreciate climate as part of other pressing concerns of communities, to name a few characteristics. From this standpoint, a number of authors have assembled frameworks comprising criteria of the inputs, process, outputs, outcomes, and impacts of climate services (Meadow et al., 2015; Vogel et al., 2017; Wall et al., 2017), with others linking categories of context, process, products and value (Vaughan and Dessai, 2014).
Most of these frameworks (e.g., Ford et al., 2013) are filled with quality criteria drawn "top-down" from the scholarship, but other scholars have argued that comprehensive quality assessment is best conducted in collaboration with actors in a knowledge system, voicing their own "bottom-up" quality criteria specific to their context (Cash et al., 2006; Meadow et al., 2015; Vincent et al., 2018) as an "extended peer community" (Funtowicz and Ravetz, 1993). This can be intertwined with co-designing research with peer communities; quality questions are often a recurring theme in putting together citizen science initiatives, for instance (Bremer et al., 2019a; Wildschut and Zijp, 2020).
Adopting the perspective that climate services' qualities are derived from and activated in complex knowledge systems, we see that climate services can have different types of qualities, and argue with others that these ought to be comprehensively "co-evaluated" by actors of the knowledge system. But Vincent et al. (2018) and others have noted that there are few examples of such co-evaluation to date. We aimed to develop a framework for unpacking climate services' characteristics for co-evaluation and turned to the field of Knowledge Quality Assessment as a guide.

KNOWLEDGE QUALITY ASSESSMENT AND THE CO-QA ASSESSMENT FRAMEWORK
Knowledge Quality Assessment (KQA) offers frameworks and approaches for more comprehensive co-evaluation of climate services. KQA is an emerging field of practice at the interface between knowledge and action that seeks to systematically reflect on the strengths and limitations of knowledge in relation to its fitness for function (Clark and Majone, 1985; van der Sluijs et al., 2008). A function can be, for instance, informing a local climate adaptation decision-making process. KQA comprises systematic analysis of, and critical reflection on, uncertainty, assumptions and dissent in scientific assessments in their societal and institutional contexts, that is, in knowledge systems (van der Sluijs et al., 2008; Haque et al., 2017). It includes critical analysis of underlying methods and implicit and explicit narratives in scientific assessments (Saltelli et al., 2020b). The goal of KQA is to enhance societies' capacity to deal with uncertainties surrounding knowledge production and knowledge use in the management of complex sustainability issues (van der Sluijs et al., 2008).
In their seminal paper "The Critical Appraisal of Scientific Inquiries with Policy Implications," Clark and Majone (1985) presented one of the first comprehensive frameworks for quality assessment at the science-policy interface. The framework acknowledges that each actor with a stake in quality control in a knowledge system has a different role in the process of critical evaluation. For instance, scientists will emphasize different criteria in quality control than policy-makers will. Their taxonomy distinguishes three general modes of critical appraisal: the input, the output and the process by which inquiry is conducted. Input refers to data, methods, people, competence, and the (im)maturity of the field, for instance. Output relates to questions such as whether the problem is solved and the hypothesis tested. Process concerns issues such as good scientific practice, procedures for review, and documentation.
Other well-developed KQA tools and frameworks in the literature include the Numeral Unit Spread Assessment Pedigree (NUSAP) notational system for qualifying quantities (Funtowicz and Ravetz, 1990; van der Sluijs, 2017); the six reflective lenses framework for auditing narratives of sustainability (Saltelli et al., 2020b); the five principles for responsible use of models in policy support (mind the assumptions, hubris, framing, consequences and unknowns; Saltelli et al., 2020a); and the checklist for systematic critical reflection on uncertainty and quality in scientific assessments implemented at the Netherlands Environmental Assessment Agency (Janssen et al., 2005; van der Sluijs et al., 2008; Petersen et al., 2011, 2013). The latter systemizes critical reflection on uncertainty and quality in six crucial phases in the process of mobilizing knowledge for action: problem framing, stakeholder involvement, indicator selection, appraisal of the knowledge base, mapping and assessment of relevant uncertainties, and communication of uncertainty information. Because none of these existing frameworks is fully fit for application in a setting of co-production of climate services, in this paper we present a new tool for knowledge quality assessment, the Collaborative Quality Assessment (Co-QA) framework. Co-QA extends Clark and Majone's original comprehensive framework and is tailored for deliberation support in the co-production of climate services in extended peer communities. The tool is documented in more detail in a scientific report (Van der Sluijs and Bremer, 2019). The framework assists in the co-production of relevant criteria to assess knowledge's quality (its fitness for purpose) relative to particular climate service projects, or instances when climate knowledge is used for responding to a discrete problem, question or task. It is not suited to a general assessment of climate knowledge, at a national scale for instance.
Knowledge quality, as employed here, takes as its reference point the particular and contingent purpose or function for which climate knowledge is mobilized.
Co-QA is an open framework, which is collaboratively filled out by actors interested in a climate service during a focus group. Alternatively (or in combination), actors can be interviewed individually to elicit quality criteria that are important from their perspective. The resulting framework is ultimately completed in cooperation with others, as a way of bridging knowledge quality expectations across all actors in a knowledge system. Inspired by Clark and Majone's (1985) framework, it distinguishes critical roles and critical modes. The roles, referring to the ways different actors interact with a climate service, can vary from case to case and may include, for instance, scientist, peer group, policy maker, funder, and public interest group. It distinguishes the same critical modes as the Clark and Majone framework (input, process, output), but we have added a fourth critical mode, use, because our framework should address not only the step of the co-creation of climate services but also quality appraisal of their use in institutions in a knowledge system. This creates a two-dimensional matrix with critical roles heading the rows, and critical modes the columns. In filling out the framework (Table 1), actors discuss and register in the matrix cells their perspective on important quality criteria at each critical mode, or phase, of producing and using a climate service. Put another way, it dynamically unpacks the qualities that are layered on a climate service as information travels through and is used in institutions in a knowledge system. When used in a focus group, actors justify quality criteria before they are recorded, and challenge others on their criteria. The completed matrix is a product of negotiation, not a collage.
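As an illustration only, the matrix described above can be thought of as a small data structure. The following sketch is not part of the Co-QA tool itself: the class name `CoQAMatrix`, its methods, and the example roles and criteria are hypothetical, chosen purely to show how criteria might be registered per critical role and critical mode.

```python
from collections import defaultdict

# The four critical modes named in the text (the columns of the matrix).
MODES = ("input", "process", "output", "use")


class CoQAMatrix:
    """Hypothetical container for a Co-QA table: roles as rows, modes as columns."""

    def __init__(self):
        # Maps (role, mode) -> list of quality criteria voiced for that cell.
        self._cells = defaultdict(list)

    def register(self, role, mode, criterion):
        """Record one quality criterion in the cell for a given role and mode."""
        if mode not in MODES:
            raise ValueError(f"unknown critical mode: {mode!r}")
        self._cells[(role, mode)].append(criterion)

    def row(self, role):
        """Return all criteria voiced from one role, keyed by critical mode."""
        return {mode: list(self._cells.get((role, mode), [])) for mode in MODES}

    def roles(self):
        """Return the roles that have registered at least one criterion."""
        return sorted({role for role, _ in self._cells})


# Illustrative entries only; these criteria are invented for the example.
matrix = CoQAMatrix()
matrix.register("scientist", "input", "documented data provenance")
matrix.register("policy maker", "use", "fits the planning cycle")
matrix.register("policy maker", "output", "clear statement of uncertainty")
```

A structure like this records only what was voiced; the justification and challenge that take place in a focus group remain a matter of deliberation and are not captured by the code.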
In a final step, the researchers and the actors involved jointly assess ("co-evaluate") the quality of a climate service using the resulting set of jointly developed or co-produced knowledge quality criteria, i.e., a filled-out version of Table 1. This assessment step can be done either in a group interview or in one-on-one interviews.
Having developed the Co-QA framework for comprehensively unpacking and assessing climate services' diverse qualities (our first research aim), we sought to test this framework in two cases.

CASE STUDIES AND METHODS
In this section we expand on how we implemented the Co-QA tool for comprehensive knowledge quality assessment in two different cases; testing out the tool together with actors in knowledge systems associated with on-going (in Bergen) and planned (in Dordrecht) climate services. These two cases were chosen to study quality assessment and the Co-QA tool, as our second research aim.
The application of Co-QA in the cases can show to what extent the framework captures the diversity of ways that actors involved in co-developed climate services relate to "quality" and supports the assessment of climate services' fitness for purpose. The cases are unique but comparable. Both cases involve highly developed networks of climate scientists and users of climate information, practical experiences with climate services, and a growing focus on co-development of climate services. The selection of cases targeted novel experimental approaches to this co-development. They highlight the widening interpretation in such co-development processes of what climate services are, and how they are developed [see e.g., Boon et al. (2021)]. The Klimathon in Bergen is an example of the widening interpretation of what a climate service is: less focused on tools and data and more on engagement and reflection between actors involved in climate adaptation. The place-based climate service design in Dordrecht is an example of changing approaches to climate service design, with local experiences and views on quality as a starting point of co-design. The Bergen case is an ex post assessment, and the Dordrecht case ex ante. This is notable because traditional approaches generally focus on ex post assessment only, while co-evaluation could be important for co-development of climate services at a much earlier stage.
In both case studies our two-step method started with the first step of mapping specific quality criteria using the Co-QA table in interviews (see Table 1). Here we conducted individual semi-structured interviews with actors connected with the climate service in different ways, trying to include diverse roles and perspectives among the group of interviewees. In the interviews we first discussed what the interviewee considered to be the main function(s) of the climate service; we then proceeded to fill in the Co-QA table with quality criteria for each critical mode, relative to the stated function (one interview and actor thus making up one row in the table). Following the cohort of interviews, the second step reconvened a group of those same interviewees in focus group sessions for collaboratively assessing the climate service according to an agreed-upon short list of quality criteria. These focus groups started by jointly discussing the main function(s) of the climate service, with a sheet of anonymized interview statements as points of departure. They went on to discuss a filled-out Co-QA table, which assembled all quality criteria elicited from the interviews, and worked toward agreeing on the most important criteria fitted to the function(s) of the climate service. The focus groups finished by conducting an assessment of the climate service according to the short list of criteria. This two-step process is designed to enable both a comprehensive mapping of specific quality criteria from different points of view, roles and modes, and a peer review process in which different perspectives are presented to different parties and quality criteria are discussed, agreed upon and anchored, bridging quality expectations across different actors.
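The two-step process can be sketched in code, purely as an illustration. In the study itself the short list was reached through deliberation in the focus group, not through counting; the sketch below substitutes a simple tally of endorsements as a stand-in for that discussion. The function names, the data layout, and the example criteria are all hypothetical.

```python
from collections import Counter

# The four critical modes of the Co-QA table.
MODES = ("input", "process", "output", "use")


def assemble_table(interviews):
    """Step 1: combine interviews into one table, one row per interviewee."""
    table = {}
    for interviewee, criteria_by_mode in interviews.items():
        # Every row has all four modes, even if no criterion was voiced.
        table[interviewee] = {m: list(criteria_by_mode.get(m, [])) for m in MODES}
    return table


def shortlist(table, endorsements, size):
    """Step 2 (stand-in): keep the `size` most endorsed criteria from the table."""
    voiced = {c for row in table.values() for crits in row.values() for c in crits}
    # Ignore endorsements for criteria that no interviewee actually voiced.
    tally = Counter(c for c in endorsements if c in voiced)
    # Sort by descending count, then alphabetically so the result is deterministic.
    ranked = sorted(tally.items(), key=lambda kv: (-kv[1], kv[0]))
    return [c for c, _ in ranked[:size]]


# Invented example data, not taken from the case studies.
interviews = {
    "interviewee A": {"input": ["diverse participants"], "process": ["open dialogue"]},
    "interviewee B": {"process": ["open dialogue"], "use": ["ideas taken home"]},
}
table = assemble_table(interviews)
picked = shortlist(
    table,
    endorsements=["open dialogue", "ideas taken home", "open dialogue", "not voiced"],
    size=2,
)
```

The tally is a deliberate simplification: it makes the data flow from interviews to table to short list visible, but it cannot represent the justification and negotiation through which the focus groups anchored their criteria.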

Ex Post Assessment of the Klimathons in Bergen, Norway
The Klimathon is a collaborative, "hackathon"-inspired¹ seminar with participants from different fields, competences and specialties, sharing an interest in local climate adaptation. Participants are divided into "interdisciplinary and intersectoral groups [...] to design practical and strategic solutions to the challenges of planning and implementing climate adaptation at the local level" (Kolstad et al., 2019, p. 1424). As we write, the Bergen Klimathon has been held twice, as comprehensive "live" events, gathering 73 participants in 2018 and 98 participants in 2019 for two full days (Kvamsås et al., 2021). Many of those involved in the development and implementation of the Klimathons, a group of local practitioners and researchers, co-wrote an essay that might be seen as the Klimathon "origin story," titled "Trials, Errors, and Improvements in Co-production of Climate Services" (Kolstad et al., 2019), with the introductory statement: "An honest reflection on experiences in a climate service project is provided, with concrete recommendations on how to put ideas of co-production into practice" (Kolstad et al., 2019, p. 1). The Klimathon was developed to remedy some of the "errors" and is one of the "concrete recommendations." The Klimathon developed from several years of cooperation between climate researchers and local municipalities and county administrators in different research and climate service projects² focusing on local climate adaptation in and around Bergen, with a "co-production" ambition (Kvamsås and Stiller-Reeve, 2018; Kolstad et al., 2019; Neby, 2020; Kvamsås et al., 2021). A recurrent experience and discussion concerned the challenges of communication and different problem framings, and a lack of understanding of each other's worlds (Kvamsås and Stiller-Reeve, 2018; Kolstad et al., 2019; Neby, 2020; Kvamsås et al., 2021).
The Klimathon was an effort to create a new format and forum for dialogue, to address some of these challenges so that future processes for co-producing climate services for adaptation might run more smoothly. It is also a research method in itself, producing knowledge on local climate adaptation governance (Kvamsås et al., 2021). The main focus of the Klimathon is not to produce a climate service product like a scientific report or a concrete solution to a problem, though these are anticipated spin-offs. The focus is on developing insights and ideas on how to work successfully on climate adaptation governance. It aims to stimulate local-scale initiatives that bridge disciplines, and for the participants who are there to experience and reflect upon the challenges, and potential solutions, of working with climate adaptation; a complex problem at the interface of science and politics. For our purposes here it is an interesting case because it is difficult to assess according to either traditional criteria of scientific robustness or plasticity of use alone.
We facilitated a quality assessment of the Klimathon using the Co-QA framework as a guideline, following the two-step approach detailed above. We first conducted individual semi-structured interviews with eight Klimathon organizers and participants with different backgrounds, focusing on the goals of the Klimathon, and quality criteria according to the four critical modes. The individual interviews lasted approximately 1 hour, using the Co-QA deliberative tool as the interview guide (see Supplementary Material), eliciting overlapping perspectives on the Klimathon functions, and a list of 30 quality criteria (Table 2). All interviews were conducted between January and May 2020 and were recorded; all but one were conducted face-to-face. We then invited these interviewees (four could attend) back for a 3-hour, face-to-face focus group session in June 2020, for discussing and validating the functions and criteria that came up in individual interviews and agreeing on a set of criteria for co-evaluating the Klimathons.
¹ [...] competitive element, but takes with it the elements of working in interdisciplinary groups, intensely and focused, on solving concrete problems in creative ways.
² Hordaklim: https://www.bjerknes.uib.no/hordaklim, R3-Relevant, Reliable and Robust local scale climate projections for Norway: https://www.norceresearch.no/prosjekter/relevant-reliable-and-robust-local-scale-climate-projections-fornorway. Hordaflom: https://www.norceresearch.no/prosjekter/hordaflom-bedrebeslutningsgrunnlag-for-risikostyring-i-flomsoner-i-hordaland.
This focus group agreed on six criteria that they saw as best fitted to assessing the Klimathons according to their three main functions (see Table 3). This was a process of identifying both the criteria the group found most important and the criteria they found interesting to discuss further. So, for instance, while "sufficient funding" was deemed central, it was not a topic that needed much discussion and it had not been a limitation so far; thus it was not one of the criteria brought forward into the assessment part of the focus group session. Finally, the group assessed the Klimathons according to the six peer-reviewed quality criteria (Table 3).
Recruitment of interviewees went through the organizers and snowballing.3 This led to a group of interviewees where most had somehow been involved in organizing the Klimathons or had given input to the organizing process. In the group of interviewees there was a mix of natural scientists, social scientists and both municipal and county level administrators. These are the main groups represented at the Klimathon, and therefore the groups we wanted represented among our interviewees. The size and composition of the group of interviewees is a weakness of this case. A larger and more diverse group, with more actors who were "only" participants in the event (had not had a role in its organization), would have been desirable so as to get a more varied and less biased group, especially concerning the assessment of the events. Still, we find that the research material gives valuable insights into identifying the Klimathon goals and quality criteria, and works well for the purpose of testing out the Co-QA tool. Also, reports have been written from two of the Klimathons (Kvamsås and Stiller-Reeve, 2018; Neby, 2020), where results from the discussions and an online evaluation survey among participants, carried out the day after the event, are discussed. We used these reports to substantiate results from our own study.

Ex Ante Assessment for the Co-production of Climate Services in Dordrecht
The city of Dordrecht, the Netherlands, has been exploring climate-proofing and the co-production of policy and knowledge, together with a variety of neighborhood, local, regional, and national actors. The city is surrounded by rivers, close to the coast, and faces soil subsidence, groundwater issues, periods of heavy river discharge from the hinterland, heavy local rain showers, and heat stress, as well as various non-climatic issues such as socio-economic challenges, demographic change, and increasing demand for housing. Over the past 3 years, the CoCliServ project developed a bottom-up approach to climate service co-development, with Dordrecht as one of its case studies. The Dutch team involved the Municipality of Dordrecht, Utrecht University, KNMI Royal Netherlands Meteorological Institute (Dutch met office), CAS Climate Adaptation Services (climate service developer), and Studio Lakmoes (knowledge communication and design bureau). The case focused on the Vogelbuurt neighborhood, a low-lying area with much social housing that is scheduled for large scale urban renewal. Researchers collected narratives of local and regional policy actors as well as neighborhood residents on how they experienced weather, climate, and other changes. These narratives were used in a co-design workshop with 12 residents, policymakers, and researchers to draft future visions and scenarios for the neighborhood (Wardekker et al., 2020, p. 13-30), which in turn provided a basis to reflect on what climate services might be most useful to support "climate proofing" the area.
3 Lists of Klimathon participants were not available to us.
This process of designing novel climate services is currently ongoing. The Netherlands already has a significant infrastructure related to climate knowledge and climate services, but while very detailed and high-resolution, these are primarily focused on the national and regional level (Meinke et al., 2019, p. 33). The aim of the Dutch project team was to develop locally-specific, "place-based" climate services, based on local knowledge needs. An initial inventory of such knowledge needs was conducted during the co-design workshop. Currently, the project team is designing a concept for a local service that meets some of these needs. For the present paper, we argued that a reflection on knowledge quality criteria, before designing this new climate service, may be beneficial to guide this design process.
Here again our study was guided by the Co-QA framework, and following the two-step method detailed above. We built on the initial inventory of local knowledge needs developed during the co-design workshop with 12 residents, policymakers, and researchers (Wardekker et al., 2020, p. 13-30). We conducted semi-structured interviews, in individual and duo interview formats, with six actors who either participated in or helped organize the co-design workshop. These interviews aimed at eliciting knowledge quality criteria, associated with the four critical modes, which might guide the design of new place-based climate services. Following these interviews, an online discussion was held with ten of the co-design workshop participants (five of the interviewees plus five other workshop participants and co-organizers), focusing on starting the design process, with the knowledge needs and quality criteria in mind.
A key goal of the knowledge quality exercise was to inventory a relatively broad set of criteria that might be used as design guidelines in developing a novel climate service, rather than to evaluate existing services. The exercise took place before decisions had been made on the nature or audience of the service. Therefore, we present the full, uncondensed set of criteria. We focused the interviews and discussion on the CoCliServ partner organizations, as these were designing the new climate service and already included the key user, the Municipality of Dordrecht. Interviewees included local policymakers, climate service specialists (public and private), academics, and a design bureau. Interview questions roughly followed the interview guide used in Bergen, aiming at eliciting the potential goals and target audiences of a novel service, their relation with inventoried local knowledge needs, and the implications of these for potential quality criteria. We used the tool implicitly to guide the initial questions, and explicitly in the inventory of quality criteria.
The individual interviews lasted between 1 and 1.5 h. All interviews were recorded and they were transcribed into a synthesis document. The online discussion lasted 1.5 h. General notes were taken to document aspects relevant to this paper. The synthesis document of the interviews on quality criteria provided the starting material for the discussion. The knowledge needs from the co-design workshop provided input for both interviews and discussion. Both interviews and the discussion were conducted online, as physical meetings were not possible due to the COVID-19 pandemic.

Table 3

| Key assessment criteria | Qualitative peer assessment |
| --- | --- |
| Group composition: diverse, interdisciplinary, and intersectoral; both among the groups of Klimathon participants, and also among the group of organizers. | Good diversity of participants from within the "target groups," based on the Klimathons' focus on the use of climate information for planning. Diversity was seen in terms of the different "roles" represented, and the geographic spread of attendees. Politicians were one under-represented group. |
| Continuity: for building trust and well-functioning networks, and for maintaining and updating knowledge. | Good continuity in that the Klimathon has become a regular event, and many attendees came back for the second event. Another consideration is whether conversations triggered at the Klimathon continue in the different organizations that attended, and there is some evidence they do. Also, new initiatives have come out of the Klimathon, like the "Rent-a-researcher" initiative and one other local "-athon" event. |
| Infrastructure for discussion: acoustic comfort, thoroughly thought through topics for discussion, capable moderators. | Infrastructure was in place that made group work go smoothly. There was some feedback about the balance between having open or structured group work, and this saw the format change from the first to the second Klimathon. Attendees who returned the second year were also more familiar with the concept, which further facilitated discussion. |
| An equal meeting ground: that participants are equal independent of background, and all have legitimate concerns for climate adaptation. | It is difficult to assess whether the Klimathon created a neutral meeting place for attendees, because there was no work to deliberately assess this. But the Klimathon was designed to assemble highly diverse groups, which went some way to creating "neutral ground." There were also "informal" facilitators to ensure all felt they had a voice, though this varied between different groups. The pyramid discussion technique worked well to give all a voice, and a positive atmosphere was reported. |
| Social learning: about each other and how to overcome differences. Learn about each other's roles and work life. Understand each other's challenges. Shared experience with working on cross-disciplinary problem solving. | Attendees have learned from each other, as appraised from immediate feedback at the event, through the continuity of discussions between attendees, and in the enthusiasm to take part in the event, with more applications to attend than available places. However, much of what was learned is background information that sits in the back of people's minds, so a follow-up survey could be an important way of assessing what was learned. |
| Concrete outputs: reports, policy briefs or a summary that can be brought back to municipalities and further "up the system" so that insights from the event can have an impact. | There was a report following the first Klimathon, but the anticipated "policy brief" was not delivered, nor the report following the second Klimathon. The Klimathon products have not been well-delivered because of a lack of time and money, and a change in personnel. There need to be more resources for follow-up, with deadlines and clear responsibility. In future, a final product in the shape of a report or similar should be delivered immediately following the event. |

RESULTS: ASSESSING CLIMATE SERVICES AND THE CO-QA FRAMEWORK
In this section we present a comprehensive knowledge quality assessment of the two case study climate services facilitated by the Co-QA framework, as a demonstration of Co-QA "in action" (the second aim). We finish here by comparing experiences in both cases.

Applying the Co-QA Framework to the Klimathon
The Co-QA framework provided a focus for interviewees and focus group participants to reflect on the qualities of the Klimathon and appraise the event by these qualities. Though the framework is designed to elicit diversity, in using it participants were seen to converge on the main functions and related quality criteria of the Klimathons. Differences were mostly found in which goals and criteria the participants emphasized most. While agreement was strong across participants, the two participants from municipalities differed slightly from the rest of the group on some points. The small number of interviewees does not allow for generalizations, but the organizers also mentioned having encountered some of the same comments elicited in this process, in other evaluations and feedback.

Goals, Criteria, and Assessment
The main functions or goals of the Klimathon, discussed and agreed upon through the two-step process, were:
(i) Knowledge development; developing common understandings of climate adaptation and climate services, and understanding of each other
(ii) Enact a framework for co-production
(iii) Develop concrete ideas for climate services and map information needs
Another recurrently mentioned function of the Klimathon was to build supportive networks for participants. Though the goals are limited and largely agreed, they do demonstrate the multiple functions attached to climate services in a knowledge system, and the corresponding diversity of quality criteria, some of which are difficult to capture in metrics, like "common understandings." Participants were seen to broadly divide goals or functions into two main categories, one related more to outputs, and one focusing more on the process itself and the experiences of the participants. In the focus group this was at one point discussed as "the concrete" and "the abstract" goals, where abstract meant the learning, experience and knowledge development and the process itself, and concrete meant outcomes like a report or a concrete new idea or solution, like "rent a researcher."4 The group didn't necessarily find this vocabulary to be satisfactory, and it was used mostly as a shorthand in the discussion, but it is interesting for our purposes here to see that the group was searching for and trying out different vocabulary to express what they experienced as valuable in the Klimathon. A recurrent comment was also that the things they found valuable with the Klimathon were difficult to assess. We found support within the group for the need to develop new ways to discuss and assess climate services, with Co-QA as one option.
The goal "knowledge development" encompassed aspects like taking part in something challenging that broadens one's understanding, experiencing cross-disciplinary work, and understanding one's role in a larger knowledge system. It embraces a broad understanding of knowledge, including to "gain experience" and "develop understanding." Relative to the second goal, comments related to testing, developing and encouraging others to use the Klimathon framework or something similar, to nurture productive dialogue and prepare the ground for well-functioning collaboration on climate information and action; future co-production. The third goal, to come up with concrete solutions, tools or procedures, reports and documents, was one of the goals most stressed by the municipalities. While all goals were given importance, it was the topic of knowledge development that generated most excitement and enthusiasm during discussions and was most talked about, explained, and nuanced, in the interviews and focus group.
With the agreed upon goals in mind, we presented the focus group participants with a Co-QA framework "filled out" with the 30 different quality criteria assembled from the interviews (see Table 2). Stemming from the three goals, the group agreed on six over-arching quality criteria. Then the Klimathons were assessed according to these six co-produced criteria. We moved systematically, discussing each criterion in turn, with participants offering their own appraisal of the Klimathon supported by evidence for each criterion, and adding to or challenging the appraisals tabled by others. In this way, we arrived at a broad consensus about the Klimathon according to each of the six criteria, as presented in Table 3. Please note that since this focus group, a report has been released about the second Klimathon.
4 "Rent a researcher" was a concrete idea that came up during the first Klimathon and that was actually carried through. A climate researcher working with the Klimathon and its associated projects spent two weeks at a small municipality, giving guidance and advice on climate services.

Reflection on the Co-QA Tool
Both the topic of how to assess climate services generally, and the Co-QA tool specifically, came up in most interviews without the interviewer prompting on this. This might indicate that it is a topic that climate service practitioners are concerned about. In addition, we directly asked for reflections on the Co-QA tool at the end of both the focus group and the individual interviews. One participant saw Co-QA as important for "... thinking about co-production more thoroughly," in order to draw lessons for future practice; "there are some generalizable lessons there I think." Many of the participants raised what they felt to be a challenge: while they did feel the goals of the Klimathon were important and valuable, these goals were also fundamentally difficult to document and assess. In this sense, a qualitative tool that gave space for nuanced and peer-reviewed qualitative appraisals was seen as a good fit. One participant said, "I'm not so fussed about having [a framework with] 50 metrics and red, green, yellow lights (...) we have talked a little bit about the need for standardization (...) [but] I don't know that that is necessarily possible, there might be too many local factors." Some also mentioned they felt it was a general problem that climate services were rarely evaluated, and that discussing and developing ways of assessing climate services should be an important issue for the climate service community.
There was positive feedback on the Co-QA tool for enabling mapping of different perspectives and for generating fruitful discussion about climate services. Many of the participants also reported that participation in the interviews and focus group was very useful for their own reflections, about both the Klimathon and climate services in general. It was particularly commented on as being a valuable tool in a co-production context. Still, most interviewees also found that this particular group was too biased and involved, not big and varied enough, and missing an outside perspective, for a totally comprehensive and meaningful assessment of the Klimathons to be made.

Applying the Assessment Framework in Dordrecht
Guided by the Co-QA framework, interviewees reflected on the potential target groups and roles for a novel service and quality criteria that could guide its design. Answers were highly similar; most aspects were mentioned by three or more interviewees, with differences in emphasis or details.

Audiences and Goals
Three key potential audiences were defined for new climate services: residents, municipal policymakers, and municipal operations staff.
Residents include the "general public," but for a specific location or neighborhood. They include resident organizations, with a distinction made between residents who have little knowledge or interest regarding this topic, and residents who are highly interested and well-informed. For residents, climate services should provide practical information. They may also focus on environmental communication, awareness raising and improving sense of urgency. This increases people's capacity and knowledge, allowing them to see connections (e.g., between local heat, flooding and green space), and highlighting the relevance of local adaptation to their lifeworld. A key role for climate services is to provide perspectives for action. Limiting services to raising awareness is insufficient if the service should lead to change.
Municipal policymakers focus on more strategic aspects. They use knowledge to design new plans and policies, and to inform the public. Information may be focused on current and future policy challenges, expected trends, and development options for the neighborhood toward the future. General educative material can also be useful for policymakers to enter discussions with residents and policymakers in other departments. Services might help people think about what information they need in relation to a specific issue, including connections to other issues, and information related to the current climate. Similar to residents, providing perspectives for action is a key role of a climate service.
Municipal operations staff, such as "neighborhood managers," municipal project managers, and implementation staff, have a more practical focus. They require information on what's currently going on in a neighborhood, monitoring, detection/alerts in case of problems, or information related directly to the work that's being implemented. Climate services might have a "signal function" (for managers) or a "quick lookup or reference" that can be used in the field (for implementation staff). An action perspective was again mentioned as a key role.
Other potential audiences included: water boards (regional water management agencies), municipal health services, provinces, companies and industrial areas, environmental agencies, housing corporations, and project developers. Many climate services experience widening target audiences, whether horizontal (more actors) or vertical (higher/lower scale levels). This may or may not lead to different requirements for the climate service.

Quality Criteria
Application of the Co-QA tool yielded a list of quality criteria deemed relevant by the interviewees for different potential target groups of the novel climate service (Table 4).
For residents, information's accessibility and ease of use are paramount. This can involve, for example, the use of language, types of visualization, or the assumed level of background knowledge. Details are not always necessary; it is important to develop a good basis of understanding first. Credible developers (trusted as conducting their work appropriately and rigorously) are important, for example national science institutes, well-known consultancies, or municipal health services. The information should be clearly actionable. For most, that would focus on practical aspects at the scale of the home, street or neighborhood. One interviewee pointed to TEDx and a Dutch TV talkshow ("De Wereld Draait Door"), which tell "informative stories told in an enthusiastic way, which tickle people's curiosity." Recognizing diverse social groups in neighborhoods, services may need to be tailored to people's perspectives, or coupled to local topical issues to be meaningful. Finally, climate services should be open to user input (interactivity), especially when focused on specific neighborhoods.
For municipal policymakers, credibility of both the developer and the input data and data provider is highly important: "We should be able to count on it that this is the best there is. Not something someone cobbled together in his attic." A related point is that climate services sometimes involve biases or debatable assumptions because of the way they use, transform or combine data [see e.g., van der Sluijs and Wardekker (2015)], or how they present and visualize results. Climate services' biases, assumptions, and uncertainties should be made transparent and discussed relative to policymakers' use. Providing actionable information is again a very important criterion. Services should relate to policymakers' fields of work and current and future challenges. Continuity of service is important. Interviewees referred to experiences where services were developed in a research project, and then became defunct a few years later, when the project was over. This is detrimental to their practical applicability. Finally, the interviewees stressed that while it is important that services meet user needs, there can be differences between users' stated needs and what they would actually need for their purpose, emphasizing the importance of iteratively co-developing climate services through continuous conversation.
Criteria for municipal operations staff were similar to policymakers, with differences in the details. Specific aspects were the involvement of sectoral knowledge institutes (e.g., RIONED, STOWA), providing technical know-how and serving as a recognizable "quality label" for these users. Practical aspects related to the use of services, e.g., coupling to existing management systems or approaches, or usability on smartphones, were also mentioned.
Finally, discussions were raised on other qualities of climate services. For example, should climate services have emotional impact? Might they be "slightly scary," or would that hamper legitimacy or backfire (e.g., induce denial)? How prominently should a municipality's ambitions be embedded in a service? And should climate services also include things that municipalities have little knowledge on, or that are highly uncertain? This information may be relevant, but may backfire if people overinterpret it or consider it as "the truth."
Overall, the tool was easy to use in the context of designing quality criteria that might (pre-emptively) guide the co-design of novel climate services. The emphasis on different target groups matched participants' own experiences with quality in the sense of fitness for use. Several mentioned this even before being shown the framework. The phases were recognizable to participants when they were shown and briefly introduced. We had prepared a list of example criteria, and referred to this in two cases to stimulate discussion. Participants used the framework easily, to typify past experiences (good and bad) with quality of climate services and to identify quality criteria. One respondent remarked that the criteria did remind them of those in the theoretical literature. However, the resulting criteria were more rooted in respondents' experiences and practice.

Use in Developing Novel Services
CoCliServ participants decided to focus on a warning service for Vogelbuurt and neighboring residents for approaching heavy rain showers. They discussed aspects related to the quality criteria multiple times; often implicitly, but most aspects were covered. There are several local low-elevation points that experience problems during intense rain. Two recent extreme showers led to local flooding and considerable damage. Insurers refuse future compensation and the issue cannot be solved structurally. Both the Municipality and residents see this as an urgent issue. The Municipality wants to set up something that helps residents take timely action, for instance through temporary barriers or sandbags, but these will need to be staged over time. The climate service might provide timely warning and show potential actions and locations of materials. This relates directly to the criteria "perspectives for action," "timely information," and "aimed at house/street/neighborhood scale." A challenge is that such showers are very local and difficult to predict. Residents may be warned several times without being hit by the shower ("reliable"). Discussion will be held with residents on how many unnecessary warnings are acceptable. The format might be a smartphone app, with a warning and further information ("tangible," "accessible"). KNMI was seen as an ideal provider of the warning, or the information on which it is based, as people see the national met office as a trusted source ("credibility"). Another challenge is that KNMI, as a public institute, cannot provide services that could also be provided commercially, for legal/competition reasons. Commercial meteorological bureaus and consultants may be involved. The CoCliServ co-design workshops allowed for developing an initial sketch or demo. The issue of "consideration of different types of people" and diverging needs was also raised. Layered provision of information in the app may be possible (warning plus different levels of detail). An interviewee also questioned whether the app would be free to download. CoCliServ partners are now developing proposals for further development. These referred explicitly to "user-friendly," "visual," "local/neighborhood level," and "interactive." Alternatives were also suggested by multiple partners: linking up with existing fora, including e-mail and WhatsApp groups or a neighborhood app ("accessible," "form matches needs/lifeworld/recognizable"), where residents and (KNMI-)experts ("credible") might exchange information. The service should avoid information overload ("accessible," "form matches needs") and be developed in conversation between actors ("co-created"). Participants showed a high degree of awareness of the quality criteria in their discussions.

Comparison
The Bergen and Dordrecht cases applied the Co-QA tool to co-evaluate climate services in different situations. The Bergen case was ex post and aimed at a relatively novel type of climate service; a workshop series intending to stimulate reflection among relevant climate service actors. The Dordrecht case was ex ante and aimed at designing a more typical (though novel) climate service; an app on heavy precipitation events for local residents. In each case, we observed that the Co-QA tool allowed for both individual and collaborative reflection on quality. Both uses of the framework resulted in explicit discussion of the intended audiences and purposes of climate services, and how quality related to these. Diverse sets of quality criteria were generated, and while many of these bear similarities with theory-based criteria in the literature on knowledge quality, they were more detailed and better rooted in the daily realities and experiences of providers, users, and other actors with a stake in climate services. In both cases, participants observed that the exercise in itself, while qualitative, stimulated them to explicitly take quality considerations into account in their work on and with climate services.
Some differences were also observed between the cases. In Bergen, the resulting quality criteria placed emphasis on process-related aspects. Given that the Klimathon is a workshop series where the process itself is indeed key, we argue that the Co-QA tool adequately picked up on this as a key aspect of quality for this specific service. A traditional quality assessment may have retained a focus on the data, models and other tools that would be used in the workshops, rather than on the more nebulous factors that play a role in making the service a success. In Dordrecht, we observed a broad discussion along input, process, output and use of climate services, with the latter two in particular emphasized by policymakers and tool developers, and the input primarily highlighted by climate scientists. This shows that it is important to include multiple types of actors in the co-evaluation, as different actors will have different "sensitivity" to each stage due to their background and experiences. We also observed that participants found the explicit discussion of such aspects useful and that they took it into account in designing novel climate services.
Jointly, the case studies show that in co-development of climate services in a knowledge system, a wide-ranging set of functions and goals, potential users and uses, and contextual factors play a role in determining quality; many of which are difficult to measure, evaluate and communicate using traditional quality assessment. Co-evaluation using the Co-QA framework helped actors in our cases to make these quality aspects more tangible and to intentionally include them in different phases of climate service design, refinement and use.

Co-QA and Climate Service Co-evaluation
We aimed to: (i) develop a new tool, the Co-QA, for comprehensively assessing the qualities of co-developed climate services relative to their particular functions; and (ii) test the Co-QA for assessing two climate services; an ex-post assessment of the Klimathons in Bergen, and an ex-ante assessment for guiding the design of a novel climate service (heavy precipitation warnings) in Dordrecht.
To the first aim, we developed the novel Collaborative Quality Assessment (Co-QA) framework. This framework builds on Knowledge Quality Assessment (KQA) scholarship, extending the long-established framework of Clark and Majone (1985) and contributing to the stock of empirical evidence of how KQA is implemented in practice, and to what effect. To climate services scholarship, Co-QA provides a new tool for the comprehensive co-evaluation of services, in recognition that quality means different things to different people in complex knowledge systems, and that these diverse criteria need to be collaboratively weighed in any assessment. Co-QA resembles existing frameworks [see e.g., Wall et al. (2017)], but we see novelty in its (i) simplicity of use; (ii) low resource demands; (iii) openness to including diverse quality criteria of different types; (iv) ability to capture, "bottom-up," actors' contingent concerns in a particular knowledge system; and (v) facilitating discussion, justification and weighing of criteria in collaborative fora, as "extended peer review." Though climate services have seen a paradigm shift toward collaboratively producing (co-producing) information with interested actors, this collaboration has rarely extended to their assessment, which remains narrowly and technically defined. Co-QA shows what co-evaluation can look like.
To the second aim, we tested Co-QA and found it to be effective in both case studies. These cases were chosen because they departed from classic climate models, tools and datasets, and were paradigmatic of the wave of co-developed and creative approaches to climate services, where lines between providers and users are blurred, and new actors are becoming involved. The openness and flexibility of Co-QA meant that it could be deployed for assessing very different services. Whether for an ex-post assessment of a learning process like the Klimathon, or an ex-ante assessment of an information product like a flood warning app, it is the actors themselves who decide which criteria are most relevant for assessing a particular climate service and its functions. As one participant in Bergen said, a qualitative framework like the Co-QA could be a better fit here than a pre-configured quantitative list of metrics, which may not capture what is actually experienced as important. This much noted, both cases also included (natural) climate scientists and elicited criteria that related to more technical and quantifiable scientific standards (e.g., checked for biases, transparency, etc.). In this way, the Co-QA framework is also suitable for more "traditional analysis" of scientific quality; it could for instance be used to map quality criteria across different disciplines and foster a cross-disciplinary dialogue on epistemic quality norms.
The categories in the tool, focusing on multiple actors and modes, were recognizable to participants, and allowed us to start a dialogue on quality criteria based in participants' experiences and needs, rather than from theoretical work. This meant that the tool was fairly quick and easy to use, compared to some of the more formal KQA methods. Together the cases showed that Co-QA can work for comprehensively assessing co-developed climate services, but more case studies are needed before we can state this any more strongly.

Three Broader Lessons for Climate Service Assessment Scholarship and Practice
These cases revealed the importance of studying climate services as emerging from and traveling through complex knowledge systems, and a corresponding need for comprehensive assessment that accounts for these diverse qualities. Accepting that these were atypical climate services, they did nonetheless show two instances of information products that came about through configurations of "linked actors, organizations and objects," operating across institutions with different logics. Indeed, one of the Klimathon's main goals was to mutually understand the complex landscape of public administrations and research organizations engaged in climate adaptation, by physically gathering these networks of actors and organizations in one venue. The Klimathon product was a self-reflexive understanding of the very system producing the Klimathon. Likewise, in Dordrecht, the process for developing a neighborhood-scale flood-warning product implicated various research and municipal institutions, including different professions, and a local community that is itself far from homogeneous.
Both cases demonstrated the multitude of possible qualities attached to a climate service in a knowledge system, with 30 criteria distilled in Bergen, and 51 in Dordrecht. Looking at how these criteria emerge differently in different institutional settings, the Dordrecht case teased out the different qualities important for different target groups: residents, municipality policy-makers, and municipality operations staff. There the credibility of product developers was a proxy for scientific robustness, and various quality conditions related to "use" were laid out. But beyond scientific quality and plasticity, other qualities arose. Examples include information that diverse local communities can make sense of and that maps onto their identities and daily lives, or a service that is developed over a long time horizon, since it can be through use that we understand what our needs are. There were also issues raised around how numerous and diverse quality criteria can cohere in a single product. On one hand, by discussing qualities in an ex ante design process like in Dordrecht, a product becomes an explicit composite of all these expectations. On the other hand, in Bergen, participants saw tensions between "concrete" and "abstract" expectations.
Distinguishing between the "concrete and abstract" raised another issue around how to put into words some of the qualities attributed to climate services. In Bergen, participants saw that what they found to be important and valuable aspects of the Klimathon were difficult to measure and to communicate to funders, public administrators and the research community. Both cases saw participants try out, or develop, new vocabulary to talk about what they find to be important goals and valuable aspects of climate services for them. Many criteria in our case studies might be interpreted under the classic headings of "credible, legitimate, salient, useful." However, because we approach these bottom up with knowledge system actors, the criteria are much more diverse, specific, place-based and purpose-specific, as well as much more recognizable to users who utilize them in the co-evaluation. For us, this shows the necessity of more comprehensive assessment that can accommodate the tangible and the less tangible understandings of climate service quality.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Marianne Høgetveit Myhren and Lis Tenold from NSD-Norwegian Data Protection Services (Project No. 60380). The participants provided their written informed consent to participate in this study. has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement 804150).