An evaluation of a computational technique for measuring the embeddedness of sustainability in the curriculum aligned to AASHE-STARS and the United Nations Sustainable Development Goals

Introduction: SDG 4.7 mandates university contributions to the United Nations (UN) Sustainable Development Goals (SDGs) through their education provisions. Hence, universities increasingly assess their curricular alignment to the SDGs. A common approach to the assessment is to identify keywords associated with specific SDGs and to analyze for their presence in the curriculum. An inherent challenge is associating the identified keywords as used in the diverse set of curricular contexts to relevant sustainability indicators; hence, the urgent need for more systematic assessment as SDG implementation passes its mid-cycle.
Method: In this study, a more nuanced technique was evaluated with notable capabilities for: (i) computing the importance of keywords based on the term frequency-inverse document frequency (TF-IDF) method; (ii) extending this computation to the importance of courses to each SDG; and (iii) correlating such importance to a statistical categorization based on the Association for the Advancement of Sustainability in Higher Education (AASHE) criteria. Application of the technique to analyze 5,773 modules in a university's curriculum portfolio facilitated categorization of the modules/courses to be “sustainability-focused” or “sustainability-inclusive.” With the strategic objective of systematically assessing the sustainability content of taught curricula, it is critical to evaluate the precision and accuracy of the computed results, in order to attribute text with the appropriate SDGs and level of sustainability embeddedness. This paper evaluates this technique, comparing its results against a manual and labor-intensive interpretation of expert informed assessment of sustainability embeddedness on a random sample of 306 modules/courses.
Results and discussion: Except for SDGs 1 and 17, the technique exhibited a reasonable degree of accuracy in predicting module/course alignment to SDGs and in categorizing them using AASHE criteria. Whilst limited to curricular contexts from a single university, this study indicates that the technique can support curricular transformation by stimulating enhancement and reframing of module/course contexts through the lens of the SDGs.

KEYWORDS: AASHE-STARS, curriculum, education for sustainable development, higher education, sustainability lexica, sustainable development goals, TF-IDF, validation

Introduction
Education for Sustainable Development (ESD) is geared toward the transformation of educational systems (Leal Filho et al., 2019). Universities have historically recognized ESD as an enabler for addressing pressing sustainability challenges (Cortese, 2003; Wals, 2014; Thomas, 2015). ESD also plays an important role in building sustainable, inclusive and resilient societies and has emerged as a precursor to solving some of the most complex sustainability challenges by equipping learners with the necessary competencies (Rieckmann, 2017; Kioupi and Voulvoulis, 2019). Explicitly stated as a stand-alone goal in the 2030 Agenda for Sustainable Development, SDG 4 addresses quality education as a means of empowering students to learn about and for sustainability challenges and their interlinkages (U.N. DESA, United Nations Department of Economic and Social Affairs, 2016). In particular, SDG 4.7 states that: "By 2030, ensure that all learners acquire knowledge and skills needed to promote sustainable development, including, among others . . . sustainable lifestyles, human rights, gender equality, promotion of a culture of peace and nonviolence, global citizenship and appreciation of cultural diversity and of culture's contribution to sustainable development." Competencies refer to the knowledge, skills, attitudes and values gained by learners (Rychen and Salganik, 2003). ESD seeks to develop competencies, such as critical thinking, systems thinking, collaborative decision-making and assuming responsibility for future generations, which empower learners to reflect on the sustainability impact of their actions (Rieckmann, 2018). Increasingly, sustainability literature advocates pedagogies that develop transversal competencies to address the wicked nature of sustainability problems (Wiek et al., 2011; Lozano et al., 2017; Hull et al., 2018).
These transversal competencies have become the basis for reference frameworks that provide guidance to learners and educators on how to improve the knowledge, skills and attitudes needed to live, work and behave in a manner caring for our planet (e.g., Bianchi et al., 2022). Yet, despite many initiatives to integrate ESD in higher education curricula (e.g., Behan et al., 2022), the extent to which such initiatives have been successful in transforming higher education is unclear, partially due to the lack of transparent measures for an objective assessment of ESD effectiveness, and partially due to the lack of clarity in universities on their role in driving the sustainability agenda (Sylvestre et al., 2014; Harvey et al., 2021). These factors are further complicated by the nebulous nature of sustainability itself (Stough et al., 2018), and by differing expectations of sustainability outcomes in higher education (Palma and Pedrozo, 2015). Other factors, such as significant time constraints, poor signposting or limited access to quality learning resources, limited educator understanding of sustainability, disciplinary silos and fixed curricular structures, are cited as major barriers to ESD integration in the curriculum (e.g., Cebrián et al., 2015; Freeman et al., 2021).
In a unique attempt to address the contexts outlined above, Technological University Dublin's (TU Dublin) strategic intent to 2030, written through the lens of the SDGs, specifies a key performance indicator for all its programmes to include sustainability as a learning outcome (TU Dublin, 2019). In recognizing programme delivery as an important channel for developing sustainability competencies, this research aims to assist lecturers in mapping sustainability elements to their academic programmes and constituent modules. In this paper, "module/course" refers to a discrete, standalone element of study; structured together, such elements constitute an academic programme. For example, a five-credit European Credit Transfer and Accumulation System (ECTS) module/course in Economics may be part of a Bachelor of Business degree programme. In acknowledging that a small proportion of modules may be focused primarily on sustainability, whereas others may contribute indirectly to SDGs, TU Dublin considers the Association for the Advancement of Sustainability in Higher Education's Sustainability Tracking and Assessment Rating System (AASHE-STARS) to be a useful interpretive framework for categorizing its modules (AASHE, 2019). Yet, interpretive validity relies on expertise and norming. Whilst robust qualitative techniques could be used to achieve a conclusive categorization, they would require significant stakeholder engagement and would be subject to interpretive bias.
A more systematic approach could deploy natural language processing (NLP) techniques to interrogate curricular texts based on a lexicon of keywords or phrases associated with the SDGs (e.g., Varshney and Mojsilovic, 2019; Buzaboon et al., 2021; Yang and Cormican, 2021) and then apply machine learning to iterate better search techniques (Hajikhani and Suominen, 2022). The central challenge with this approach is in bridging the gap between keywords and meaning during the mapping process. Yet, automating the mapping could facilitate a more systematic evaluation of educational programmes with a single click of a button, freeing up educators to internalize the mapping for curricular enhancement purposes. One NLP technique, known as Term Frequency-Inverse Document Frequency (TF-IDF), is deployed regularly to identify keywords in documents (Munot and Govilkar, 2014). Lemarchand et al. (2022) recognized the potential utility of TF-IDF for measuring sustainability in the curriculum, as it aims to attribute a weighted value of terms or document frequency to evaluate the relevance of a keyword (Mishra and Urolagin, 2019). TF-IDF works well for large documents with voluminous texts, such as scientific research and policy documentation (Matsui et al., 2022). However, module/course descriptors tend to be much more abridged documents, so the text needs to be much more judiciously written if TF-IDF is to capture its alignment to the SDGs. Lemarchand et al. (2022) use the term frequency component of TF-IDF to identify the relative importance of root keywords associated with each SDG, providing a basis for evaluating the "sustainability importance" of modules/courses. Using the "sustainability importance" concept, 5,773 TU Dublin module/course descriptors are then interrogated to identify potential SDG mappings and to categorize the modules based on AASHE-STARS criteria.
The research objective addressed in this paper is to evaluate the effectiveness of this computational technique (Lemarchand et al., 2022) by comparing its results to those based on expert interpretation of a random sample of 306 of these modules/courses, in which each module/course descriptor is forensically reviewed to identify which SDGs have been addressed, and to what extent. The objective is underpinned by two research questions:
RQ1: How effective is the TF-IDF based technique in assisting with mapping of modules/courses to the SDGs?
RQ2: How effective is the TF-IDF based technique in determining the embeddedness of sustainability in modules/courses based on the AASHE-STARS criteria?
The paper begins with a brief summary of the SDGs as a contemporary framework for sustainability and explores the state of the art with respect to ESD provision in higher education. Various approaches to SDG integration in the curriculum, and how these approaches can be evaluated, are then outlined. Lemarchand et al.'s (2022) computational technique for evaluating SDG integration, and the results yielded when it was applied to TU Dublin module/course descriptors, are then summarized. The evaluation procedure is then detailed, in which the results of the computational technique are compared with those from a manual review (based on expert interpretation) of the module/course descriptors. Finally, in seeking to reconcile discrepancies between both sets of results, opportunities for the further design science research needed to enhance the efficacy of the technique as a basis for informing curricular enhancement are prioritized.
Materials and equipment
Utilizing the SDGs as a framework for sustainability
Some scholars (e.g., Steffen et al., 2015) claim that we live in the Anthropocene epoch, an era in which population growth, affluence and technological advances have yielded an unprecedented human capability to irrevocably alter the natural environment (Rockström et al., 2009). With emerging societal consensus on the urgent requirement to address the potentially catastrophic issues of climate change, biodiversity loss and threats to the natural world, Generation Z is spearheading a quiet revolution, elevating sustainable development from desirable to essential. This is reflected in the 2015 United Nations plan for achieving a better future, at the heart of which are the SDGs, a universal call to action to eliminate poverty, to protect the planet and to ensure peace and prosperity.
The SDGs are structured into 17 goals, 169 targets and 244 indicators as part of a comprehensive and integrative framework, reflecting interwoven environmental, social and economic challenges. All 17 SDGs are interconnected, with both synergies and trade-offs, in that success in achieving one goal requires addressing issues more commonly associated with other goals. Attempts have been made to identify keywords associated with each of the goals, thereby providing the basis for a sustainability lexicon that could be used to map curricular contributions to the SDGs (e.g., Mu and Kang, 2021; Rajabifard et al., 2021).
Although most countries have begun reporting implementation of the SDGs through voluntary national reviews, measuring progress toward achieving the SDGs is a complex matter, with performance seemingly dependent on the chosen method of measurement (Allen et al., 2019). Notwithstanding this complexity, the SDGs are increasingly being used as a guiding framework for reporting progress in implementing a transformation agenda aligned to sustainability ideals in a variety of contexts, including higher education, business and government (Kaur and Lodhia, 2019; Caputo et al., 2021).
Sustainability curriculum as a catalyst for achieving the SDGs
Sachs et al. (2019) identify education as an important area of societal transformation in order to achieve the SDGs. In particular, the higher education sector is expected to drive social and technological progress by equipping young people to be change-makers and emerging leaders (Zamora-Polo and Sánchez-Martín, 2019). Hence, universities can play an important role in helping the global community to make sense of challenges and opportunities posed by the SDGs; to formulate and test solutions; to articulate transformation pathways; and to track progress toward achieving the goals. SDG 4.7 requires learners to develop sustainability competencies and is measured by the extent to which ESD is mainstreamed in the education provision. Despite a plethora of ESD publications (e.g., QAA Advance HE, 2021), there are limited examples of practicable, university-wide sustainability initiatives oriented toward curricular reform (Mori Junior et al., 2019). In an attempt to address this shortfall, professional accreditation bodies have mandated programme teams in universities to initiate SDG integration in their curricula. A growing number of case studies on how university curricular contributions to the SDGs are tracked reveal a common pattern of mapping used to identify gaps, opportunities and alignment to the SDGs in learning, teaching and assessment activities (Leal Filho, 2017). Yet, many academic faculties, particularly those without sufficient knowledge of sustainable development, face challenges in establishing links between the subject matter content of their respective disciplines and sustainability concepts, which hinders the integration of sustainability in existing curricula (Rajabifard et al., 2021). Hence, much of the current literature has focused on how sustainability is embedded in specific degree programmes or in their constituent modules/courses (e.g., Palacin-Silva et al., 2018).
Many reporting frameworks for communicating sustainability contributions, across the various aspects of a university's core business, consider curriculum to be an essential component (Kosta, 2018). Currently, one of the more balanced tools used to measure sustainability contributions across university functions is AASHE-STARS. The AASHE (2019) technical manual provides a detailed methodology for measuring and reporting the contributions of a college or university toward the SDGs. Table 1 shows the institutional reporting requirements for six categories, with the total number of points varying with an institution's context. The highest score possible is 209 points. To achieve the highest rating, platinum, an institution must score at least 85 points. AASHE currently awards this rating to a mere 11 of 335 rated institutions, illustrating the stringent nature of the awarding process. Out of the five scored categories, the Academic (AC) category represents 58 points, of which the Curriculum sub-category scores up to 40 points. With the points for each sub-category within Operations (OP) and Planning and Administration (PA) varying between 4 and 10 points (not shown in Table 1), the Curriculum sub-category is consequently the most important sub-category of all. The Curriculum sub-category is, itself, divided into 8 criteria: Academic Courses (AC1), the equivalent in TU Dublin being Academic Modules; Learning Outcomes (AC2); Undergraduate Programmes (AC3); Graduate Programmes (AC4); Immersive Experience (AC5); Sustainability Literacy Assessment (AC6); Incentives for Developing Courses (AC7); and Campus as a Living Laboratory (AC8). Completing an inventory of an institution's sustainability module/course offerings necessitates appropriate association of modules/courses to SDGs and their classification in terms of sustainability importance (SI).
This process is required to score AC1 to AC6 based on textual detail extracted from module/course descriptors. It also requires tracking the number of students and staff taking and teaching sustainable modules/courses and programmes. It further enables universities to develop and score new sustainable offerings (AC7). Credits to AC8 are attributed to institutions "utilizing its infrastructure and operations as a living laboratory." Applied learning projects would contribute to understanding or advancing sustainability in at least one of the AASHE-STARS sub-categories other than Curriculum and Research. As such, should a module/course descriptor include the living laboratory experience, AC8 can similarly be associated to SDGs and a sustainability importance classification. AASHE-STARS requires modules/courses to be classified into three categories, namely, "sustainability-focused," "sustainability-inclusive," and "non-sustainable." Sustainability-focused modules/courses must contain significant content with explicit reference to sustainability or a focus on a major sustainability challenge. Sustainability-inclusive modules/courses might have no explicit focus on sustainability, but they must at least incorporate a component that indicates the requisite sustainability-related knowledge. The third category of "non-sustainable" modules/courses represents those that do not satisfy either of the two conditions above. AASHE-STARS requires the method of categorization to be specified clearly, but categorization of a large number of modules/courses across a university's curricular portfolio is a major challenge. By its very nature, this challenge requires significant stakeholder engagement with faculty and students to construct a detailed understanding of how each module/course maps to sustainability principles.
Therefore, a systematic method that could facilitate such categorization of modules/courses based on AASHE-STARS criteria could help to reduce interpretive bias and free up faculty to focus on curricular enhancement toward addressing the SDGs. Where there is flexibility of choice of academic pathways, articulating the contribution of a module/course toward addressing specific SDGs could arguably inform the learner's decision to enroll in it.
[...] keywords can yield entirely irrelevant hits and (ii) keywords can yield duplicate hits due to one keyword being present inside another. Both issues can yield false positives, notwithstanding that weighting the keywords for relevancy could yield accuracy improvements (Adams et al., 2020). The University of Auckland used machine learning techniques to publish its lexicon of SDG keywords from Elsevier's SDG search query along with additional search terms added from documentation provided by the Sustainable Development Solutions Network (SDSN). The Monash and Auckland examples represent two types of initiatives, (i) ontologies and (ii) machine learning, which both seek to contribute to developing a robust approach to categorizing textual data to the SDGs, but with their own respective limitations. Ontologies, whilst high-quality, lack comprehensiveness in that they cannot capture the expanse of the SDG-related discourse. The principal output of ontologies is a hierarchical set of keywords linked to a high-level concept, in this case an SDG. Machine learning techniques are typically trained on small homogeneous corpora and hence struggle with out-of-sample cases. Machine learning techniques trained, for example, on engineering curricula might struggle with business curricula, simply due to the different contextualized meaning of keywords. Furthermore, machine learning techniques trained on different datasets are difficult to integrate.
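The duplicate-hit issue noted above, in which one keyword is present inside another, can be illustrated with a minimal Python sketch. The descriptor text, the keyword list and the function names `substring_hits` and `whole_word_hits` are hypothetical and are not taken from any of the cited lexica.

```python
import re


def substring_hits(text, keywords):
    """Naive matching: count every substring occurrence, case-insensitively."""
    lowered = text.lower()
    return {kw: lowered.count(kw.lower()) for kw in keywords}


def whole_word_hits(text, keywords):
    """Stricter matching: count a keyword only when it stands as a whole word."""
    return {
        kw: len(re.findall(rf"\b{re.escape(kw)}\b", text, flags=re.IGNORECASE))
        for kw in keywords
    }


# Hypothetical descriptor text and keywords, invented for illustration only.
sample = "The module covers wastewater treatment and water quality."
keywords = ["water", "wastewater"]

naive = substring_hits(sample, keywords)    # "water" is also counted inside "wastewater"
strict = whole_word_hits(sample, keywords)  # each keyword is counted once
print(naive)
print(strict)
```

Whole-word matching is only one possible remedy, and it would not suit truncated root keywords, which are intended to match inside longer words; it is shown here solely to make the duplicate-hit effect visible.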
To produce a list of keywords that addresses the issues of relevancy and duplicate hits, Lemarchand et al. (2022) detail a systematic framework that identifies root keywords (RKs) extracted from the 169 targets and 247 indicators describing the SDGs. In summary, an RK is the stem of a word that can be morphologically inflected. In other words, RKs are truncated keywords that remain after removing prefixes and suffixes. For example, "sustainab" is a root keyword of "sustainable" and "sustainability." To attribute the relative importance of RKs to each SDG, the TF-IDF method was adapted to calculate a sustainability importance ($RK_{SI}$) score, as shown in Equation 1 below.

$$RK_{SI} = \frac{\text{No. of times the RK appears in the targets and indicators}}{\text{No. of SDGs in which the RK appears}} \tag{1}$$

Lemarchand et al. (2022) also applied additional filtering to avoid RKs being associated with too many SDGs and hence diluting their meaningful attribution. Equation 2 defines the minimum number of times an RK must occur in an SDG to be associated with that SDG. This filtering significantly reduces the number of SDGs attributed to a module/course, offering greater focus and clarity to lecturers in considering how their modules/courses are positioned in the SDG framework.

Min. No. of occurrences for an RK to appear in an SDG to be associated to it (2)

Figure 1A plots the sum of the $RK_{SI}$ values for RKs associated to each SDG, applying the filter. The uneven distribution of the plots illustrates that the lexicon of RKs is richer for some SDGs than for others. Whilst it is not conclusive, $\Sigma RK_{SI} > 20$ was found to be a reasonable threshold to effectively map modules/courses to their most significant SDGs (Lemarchand et al., 2022).
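As a minimal sketch of Equation 1, the ratio can be computed by counting a root keyword across the target and indicator text of each SDG. The function name `rk_si` and the toy SDG texts are assumptions made for illustration; plain substring counting is also an assumption about how RKs are matched, and the additional filter of Equation 2 is not implemented here.

```python
def rk_si(root_keyword, sdg_texts):
    """Equation 1: total occurrences of the RK divided by the number of SDGs containing it.

    sdg_texts maps an SDG number to the concatenated text of its targets and
    indicators. Substring counting is an assumption of this sketch.
    """
    rk = root_keyword.lower()
    counts = {sdg: text.lower().count(rk) for sdg, text in sdg_texts.items()}
    total_occurrences = sum(counts.values())
    sdgs_with_rk = sum(1 for count in counts.values() if count > 0)
    # An RK that never appears is given a score of zero (assumption).
    return total_occurrences / sdgs_with_rk if sdgs_with_rk else 0.0


# Toy texts invented for illustration; these are not the official SDG wording.
toy_sdg_texts = {
    6: "Ensure access to water and sanitation; improve water quality; water-use efficiency.",
    14: "Conserve marine resources; reduce marine pollution.",
    15: "Protect freshwater ecosystems.",
}

print(rk_si("water", toy_sdg_texts))    # 4 occurrences spread over 2 SDGs
print(rk_si("marin", toy_sdg_texts))    # 2 occurrences in a single SDG
print(rk_si("poverty", toy_sdg_texts))  # RK absent from the toy texts
```

In this toy case, "water" and "marin" receive the same score even though "water" occurs twice as often, because its occurrences are spread over two SDGs; this is the inverse-frequency effect that the adapted TF-IDF ratio is intended to capture.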

Mapping module descriptors to the SDGs
A recent revision of the Malaysian Qualifications Framework, outlining standards for masters and doctoral degree programmes, requires that these programmes challenge graduates to attain competences that would enable them to contribute to the achievement of the Sustainable Development Goals (Malaysian Qualifications Agency, 2021). Whilst there are several case studies outlining the application of the SDGs as indicators to evaluate academic programmes (e.g., Gough and Longhurst, 2018; Brugmann et al., 2019), they are not directly comparable, as either their methodologies are interpretive or they use different reporting tools. In considering the number of SDGs in a module/course to be indicative of sustainability focus, Lemarchand et al. (2022) calculated the "Sustainability Importance" (SI) of each module, first for each SDG and then in total (for all 17 SDGs). When an RK was found during the mapping process, its $RK_{SI}$ value was added to the running value for each SDG with which the RK is associated (Equation 3). The distribution of $SI_{\text{Module SDG}}$ scores for the entire set of 5,773 module/course descriptors facilitated the categorization of modules/courses into nine sustainability importance ranges (IR) over a distribution using Equation 4, where f is a scaling factor from −4 to 4. Adjusted for an additional RK "sustain," Equation 4 facilitated the creation of a statistical definition of the AASHE-STARS categories for "sustainability-focused" and "sustainability-inclusive" modules/courses. Lemarchand et al.'s (2022) analysis of 5,773 TU Dublin module/course descriptors suggested that 315 modules (5.44%) were considered sustainability-focused and a further 848 (14.69%) were considered to be sustainability-inclusive. A probability rating (low, medium or high) was also attributed to modules/courses based on the average number of distinguishable RKs for each module/course.
This analysis suggested that 286 modules (5%) could be defined as sustainability-focused with a medium to high level of confidence and 769 modules (13.32%) could be defined as sustainability-inclusive with a medium to high level of confidence. Based on the $SI_{\text{Module SDG}}$ scores, Figure 1B depicts a summary mapping of TU Dublin's modules to each SDG. Lemarchand et al. (2022) provide a systematic articulation of the computational technique as well as its results from applying it to 5,773 TU Dublin modules/courses. The section below focuses on an evaluation of this technique, based on expert interpretation of a random sample of modules/courses across the AASHE-STARS categories.
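The accumulation of RK SI values into per-SDG module scores described in this section can be sketched as follows. This is an illustrative reading rather than the published implementation: the function names, the toy RK table and its SI values are hypothetical, each occurrence of an RK is assumed to add its value once more, and applying the threshold of 20 to a module's per-SDG score is likewise an assumption.

```python
from collections import defaultdict


def module_si(descriptor, rk_table):
    """Accumulate RK SI values per SDG for one module/course descriptor.

    rk_table maps a root keyword to (RK SI value, set of associated SDGs).
    Assumption of this sketch: every occurrence of an RK adds its SI value.
    """
    text = descriptor.lower()
    scores = defaultdict(float)
    for rk, (si_value, sdgs) in rk_table.items():
        occurrences = text.count(rk)
        if occurrences == 0:
            continue
        for sdg in sdgs:
            scores[sdg] += si_value * occurrences
    return dict(scores)


def significant_sdgs(scores, threshold=20.0):
    """Keep the SDGs whose accumulated score exceeds the threshold (assumed usage)."""
    return sorted(sdg for sdg, score in scores.items() if score > threshold)


# Hypothetical RK table: the SI values and SDG associations are invented.
toy_rk_table = {
    "infrastructur": (12.0, {9, 11}),
    "water": (15.0, {6}),
    "energy": (9.0, {7}),
}
toy_descriptor = (
    "Sustainable infrastructure: water, energy, transport and structures. "
    "Infrastructure planning."
)

scores = module_si(toy_descriptor, toy_rk_table)
print(scores)
print(significant_sdgs(scores))
```

With these invented values, only the SDGs linked to the twice-occurring RK "infrastructur" pass the threshold, which shows how a richer RK lexicon for one SDG can dominate a module's mapping.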

Evaluation procedure
To validate the computational (TF-IDF based) technique, a manual review of a random sample of modules/courses (n = 306) from the 5,773 modules/courses analyzed was conducted. A statistical calculation of the minimum sample required is provided in Annex 1 in the Supplementary material. Each module/course descriptor was forensically examined by a research assistant with acquired knowledge of the SDGs and their targets, indicators and metadata, in consultation with the respective module/course lecturers, to explore the embeddedness of sustainability and to identify the SDGs addressed. The review was organized in three stages based on the three sections inherent in each module/course descriptor, namely: (i) module/course description; (ii) learning outcomes; and (iii) module/course aims. For each stage, the textual data was manually interpreted in two steps: (i) first, the module/course was classified using the AASHE-STARS criteria; and (ii) second, the module/course's association with specific SDGs was identified. In each step, results from manual interpretation were compared and contrasted with those from the computational technique. Details of the manual review are provided in Annex 2 in the Supplementary material.
Modules/courses that were interpreted to have text indicating an explicit reference to, or focus on, sustainability were categorized as "sustainability-focused." Modules/courses that were interpreted to have no explicit focus on sustainability, but to incorporate a component indicating the presence of sustainability-related knowledge, were counted as "sustainability-inclusive." The third category of "non-sustainable" modules/courses represents those modules/courses that did not satisfy either of the two conditions above. In the second step, the review involved the manual identification of SDGs associated with modules/courses categorized as "sustainability-focused" or "sustainability-inclusive." During this association, careful consideration was given to understanding the context to which a specific SDG refers. Thus, a key aim of the manual SDG association was to identify gaps in the TF-IDF based technique vis-à-vis the context in which SDGs are assigned. An example from two module descriptions (below) illustrates the significance of this challenge.
Module Code CIVL9000 "Introduction to Sustainable Infrastructure." "This module introduces the student to the main infrastructural topics which will be addressed in the programme and highlights their economic, social and environmental impacts. Key tools for estimating and understanding these impacts are outlined before issues of sustainability are discussed for each infrastructural group: water; energy; transport; and structures.
The module highlights the global, national and regional contexts for the sustainability debate and presents current state of the art and possible future trends in each area. Material is addressed at a high level, with more detailed design and planning left to individual modules." From this text, the computational technique associated the module/course with SDG 1 "no poverty" but considering the context, SDG 9 "industry, innovation and infrastructure" and SDG 11 "sustainable cities and communities" were assigned through manual association.
Module Code CBEH3003 "Environmental Engineering." "The module entails the design and analysis of environmental systems and the principles that underly them. The module is delivered so that the student is introduced to basic theory, the application of that theory to environmental engineering problems and the solution of these problems. The students demonstrate competence through project assessment and exam." From this text, the computational technique associated the module/course with SDG 1 but considering the context, SDG 11 "sustainable cities and communities" was assigned through manual association. This signals how important it is to understand the context and to identify the correct SDG association.
A complicating factor is the interlinkage between SDGs: more than one SDG can be associated with a module/course. The number of manual SDG associations ranged from 1 to 4. These associations were then compared with those from the computational technique, where the SDGs were assigned based on $\Sigma RK_{SI}$ scores. If the SDG number assigned manually to a module/course matched that from the computational technique, the result from the computational technique was considered a "correct association"; otherwise, it was considered a "wrong association." In this manner, a list of all SDGs that were wrongly associated by the computational technique was constructed, helping to identify the shortcomings of the method. The frequency of wrong associations for all SDGs was also calculated within each AASHE-STARS category. Through the computational technique, all 306 modules/courses were searched for RKs to classify them as "sustainability-focused," "sustainability-inclusive," or "non-sustainable." Further classification within these three categories attributed the probability of being correct based on the number of RKs (denoted by "N") found in the text. The following tags were selected for sub-category classification: high confidence (N > 13); medium confidence (6 ≤ N ≤ 13); and low confidence (N < 6). Among the selected sample of 306 modules, 108 modules/courses fell into the category of "sustainability-focused," which were further divided into the sub-categories of high confidence (53 modules/courses), medium confidence (37 modules/courses) and low confidence (23 modules/courses). Table 1 summarizes results from the computational technique for the selected sample of 306 modules/courses.
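The confidence tags and the correct/wrong association rule described above can be sketched in a few lines of Python. The function names are hypothetical, the low-confidence bound (N < 6) is inferred from the other two bands, and the "missed" list is an addition of this sketch rather than part of the procedure; the example sets reproduce the CIVL9000 case reported earlier.

```python
def confidence_tag(n_rks):
    """Tag a classification by the number of RKs (N) found in the text."""
    if n_rks > 13:
        return "high"
    if n_rks >= 6:
        return "medium"
    return "low"  # inferred band: N < 6


def compare_associations(manual, computed):
    """Compare manually assigned SDGs with those from the computational technique."""
    return {
        "correct": sorted(computed & manual),  # SDGs assigned by both
        "wrong": sorted(computed - manual),    # assigned computationally only
        "missed": sorted(manual - computed),   # extra bookkeeping added in this sketch
    }


# CIVL9000: the technique assigned SDG 1, the manual review assigned SDGs 9 and 11.
result = compare_associations(manual={9, 11}, computed={1})
print(result)
print(confidence_tag(14), confidence_tag(13), confidence_tag(5))
```

Tallying the "wrong" lists across the sample, grouped by AASHE-STARS category, would give the frequency of wrong associations per SDG referred to above.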
Whilst the computational technique used the RK search to classify modules/courses into three categories, with further sub-categories of high, medium and low confidence, Table 2 presents the results from the three stages of manual review, which classified courses more broadly as "sustainability-focused," "sustainability-inclusive" or "non-sustainable." The numbers in the table represent the number of modules/courses in each category; the three columns inside the manual review column represent the stages of the review. One contributing reason for differences between Tables 1 and 2 could be the use of the RK "sustainab" in the search by the computational technique. Any module/course containing the RK "sustainab" in its description, learning outcomes or aims was automatically classed as "sustainability-focused" by the computational technique.
[Table row as extracted, computational technique, number of associations per SDG: 19, 5, 10, 2, 3, 5, 7, 2, 2, 0, 0, 3, 0, 0, 4, 0]
[Table caption fragment: ... considered as "sustainable" with and without the RK "sustainab."]
Whilst the computational technique categorized 61 modules/courses containing "sustainab" as "sustainability-focused," each manual review stage identified one third or less of those modules/courses as "sustainability-focused," implying that the computational approach yielded a number of false positives.
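The root-keyword rule described above can be sketched as a plain substring test, which illustrates why context is lost. This is a minimal illustration only: the function name and the descriptor fragments are hypothetical and are not taken from the TU Dublin data or from the authors' implementation.

```python
def flagged_by_root_keyword(text, root="sustainab"):
    """True if the root keyword occurs anywhere in the text, ignoring case."""
    return root in text.lower()


# Hypothetical descriptor fragments, written only for illustration.
examples = [
    "Students evaluate sustainable urban drainage systems.",
    "The module covers building a financially sustainable business.",
    "Introduction to organic reaction mechanisms.",
]
flags = [flagged_by_root_keyword(t) for t in examples]
print(flags)
```

The second fragment is flagged even though "sustainable" is used in a purely financial sense, which is the kind of false positive that a manual review would be expected to catch.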

Discussion
Figure 2 compares the results from each stage of the manual review of modules/courses with those from the computational technique, based on a search of the RK "sustainab" within the "sustainability-focused" category of modules/courses. It shows that the manual review of course aims identified fewer "sustainability-focused" modules/courses containing the word "sustainab" than the computational technique did. The implication is that the presence of the RK "sustainab" in course aims might not be sufficient, on its own, to establish the sustainability-focused credentials of a module/course using the computational technique. This suggests that the presence of RKs associated with specific SDGs is also required for meaningful categorization of modules/courses using the AASHE-STARS categories. The highest success rate using the computational technique was observed in "sustainability-focused" modules/courses with high confidence, i.e., in 67% of cases, during Stage 1 analysis. The lowest success rate for the word "sustainab" to correctly identify a course as "sustainability-focused" using the computational technique was 5%, which occurred in the category of "sustainability-focused modules/courses with medium confidence" during Stage 2. It appears that the presence of the word "sustainab" leads to the identification of a greater number of sustainability-focused modules/courses using the computational technique, as the context of the written text could not be identified through the RK search. From these observations, it can be argued that the sole presence of the word "sustainab" cannot suffice for a module/course to be deemed sustainability-focused. This suggests possible inadequacy of keywords besides "sustainab" to articulate module/course mappings to relevant SDGs in all stages.
Compared with Stage 1, the computational technique yields a lower success rate for "high confidence" sustainability-focused modules/courses in Stages 2 and 3. Whilst this is partially due to module/course descriptors with missing text, it also suggests that there may be inadequate RK coverage of sustainability in learning outcomes and course aims. This suggests that either better articulation of the SDGs in the text, using RKs, is needed or that the technique is simply more difficult to apply to learning outcomes and course aims, which are relatively abbreviated texts in their own right. In a second step, during each stage of the manual review, SDGs were assigned to modules/courses categorized as "sustainability-focused" or "sustainability-inclusive." The maximum number of manually associated SDGs assigned to a module/course was 4, identifying the most relevant SDGs in descending order of SI score. The frequency of occurrence of each SDG within each category was calculated for both the computational and manual SDG associations, allowing for comparison. This was undertaken (i) to compare the frequency of coverage of SDGs across the modules/courses, and (ii) to find the relative difference in the frequency of occurrence of each SDG. Table 4 compares the number of associations per SDG from the computational technique with the manual review for "sustainability-focused" modules/courses with high confidence, during each stage of the review. SDGs 1 and 17 exhibit disproportionately high numbers of computational SDG associations compared with the SDG associations in each stage of manual review. This suggests that the lexica of RKs for SDGs 1 and 17 need to be refined or that more evolved techniques may be required to correctly attribute these SDGs to modules/courses.
The opposite is the case for SDGs 9 and 11, suggesting that either the written texts are not informative enough for these SDGs to be recognized from the module/course descriptors, or that the computational technique lacks some relevant RKs. SDGs 10, 11, 13, 14, 16 are not linked to any module/course by the computational technique but are linked to modules/courses during the manual review.
Likewise, SDG 13 (Climate Action) is not assigned to any of the modules/courses by the computational technique. However, in the manual review, it was assigned to 4, 7 and 2 modules/courses during Stage 1 (course description), Stage 2 (learning outcomes) and Stage 3 (course aims), respectively. It was disappointing that the computational technique did not assign SDG 13 to any of these modules/courses, given that climate action is a central tenet of the current sustainability narrative. The least occurring SDG in the manual review was SDG 14, with a frequency of occurrence of 0, implying that no module/course was linked to this SDG. More purposive sampling of modules/courses mapped (by the computational technique) to SDG 14 is required to evaluate its precision in mapping to this SDG. The challenge in enhancing the computational technique appears to center around two core issues: (1) for some SDGs (e.g., SDG 13) there are insufficient RKs, leading to an under-mapping of modules/courses, and (2) for other SDGs (e.g., SDGs 1 and 17) the lexica of RKs need to be refined to prevent over-mapping.
The frequencies of both the correct and the incorrect associations for each of the SDGs were then calculated in each AASHE-STARS category. A "correct SDG association" means that the SDG assigned to a module/course in the manual review matches that assigned by the computational technique. If the manually assigned SDG number differs from the SDG assigned by the computational technique, then it implies a "wrong SDG association." The greater the number of correct SDG associations, the greater the accuracy of the computational technique. On the other hand, greater numbers of incorrect SDG associations (false positives) would imply a need for a revised RK search. Figure 3 illustrates the frequencies of wrong association for each of the 17 SDGs for "sustainability-focused courses with high confidence." In the sample set, 53 modules/courses belonged to the category "sustainability-focused courses with high confidence." As apparent from Figure 3, the majority of the incorrect associations in this category occurred for SDG 1 and SDG 17. The numbers of modules/courses linked incorrectly to SDGs 1 and 17 alone are 19 and 20, respectively, in each stage of the manual review process, lending further credence to the view that RKs for these SDGs need to be refined.
To evaluate the effectiveness of the computational technique, the success rate was then calculated for two scenarios: (a) SDG context success and (b) AASHE-STARS context success. The SDG success rate was calculated as the number of times the SDG association by the computational technique matched the manual (correct) association, divided by the frequency of its occurrence in the tool-based association, times 100. For example, the frequency of computational technique occurrences of SDG 3 was 10, of which the number of correct associations was 5 during Stage 1, implying a success rate of 50%. Figure 4 shows the results from the three stages of the manual review in terms of the relative difference in the occurrences of SDGs (between computational and manual approaches) and the success rate of correct SDG association for "sustainable courses with high confidence." A general trend was that the success rate tended to drop off progressively through each stage of the manual review. Figure 4 shows that only SDG 4 returned a 100% success rate (Stage 1), implying that the associations captured by the computational technique accurately matched the manual associations. Yet, SDG 4 was assigned to four modules/courses in the manual review, but its TF-IDF based occurrence was two. This implies that the tool missed two modules/courses where SDG 4 should have been assigned. Thus, a +2 relative difference can be seen in the graph for Stage 1. One can conclude that the RK search for SDG 4 gave only partially accurate results. Thus, a 100% success rate for an SDG does not necessarily imply that the RK search criteria are fully effective in linking the SDG accurately. A negative difference in occurrences indicates a greater number of associations by computation than by manual association. Negative differences in occurrences were recorded for 7 to 10 of the 17 SDGs across all three stages, implying that the number of RK search associations exceeded the number of manual associations, yielding false positives. Positive differences in occurrences were recorded for 5 to 9 of the 17 SDGs, indicating a greater number of manual associations and highlighting a potential inadequacy of the computational technique. Only SDG 14 had no difference between its computational and manual associations (across all three stages of the manual review), as it was not associated with any module/course by either the computational technique or the manual review. SDGs 1 and 17 had the maximum differences in their occurrences across all three stages of the manual review. The success rates of these two SDGs were also among the lowest, with SDG 1 having zero successful associations from the computational technique. SDGs 1 and 17 were essentially outliers with significant deviations. Given that association to SDGs 1 and 17 appears particularly problematic, a deeper dive into these SDGs is provided in Figure 5 for all AASHE-STARS categories of modules/courses. Figure 5 shows zero success for SDG 1 in all categories except in sustainability-focused modules/courses with medium confidence.

FIGURE
Overall frequencies of wrong SDG associations for all categories.
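The SDG success rate and the relative difference in occurrences defined above can be written out as a short Python sketch. The function names are hypothetical; the example values are the Stage 1 figures quoted in the text for SDG 3 and SDG 4, with the number of correct SDG 4 associations (two) inferred from its 100% success rate.

```python
def sdg_success_rate(correct, computational_total):
    """Percentage of computational associations confirmed by the manual review."""
    if computational_total == 0:
        return None  # SDG never assigned by the tool, so no rate can be computed
    return 100.0 * correct / computational_total


def relative_difference(manual_total, computational_total):
    """Positive: the tool missed associations. Negative: the tool over-assigned."""
    return manual_total - computational_total


print(sdg_success_rate(5, 10))    # SDG 3, Stage 1
print(sdg_success_rate(2, 2))     # SDG 4, Stage 1
print(relative_difference(4, 2))  # SDG 4, Stage 1
```

Reading the two quantities together avoids over-interpreting a high success rate: SDG 4 scores 100% while still showing a relative difference of +2, i.e., two manual associations that the RK search missed.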

FIGURE
Overall success rate for all three categories.
This indicates the below-par output of the computational technique based on an RK search for SDG 1. Whilst a slightly higher success rate for SDG 17 is depicted, it remains under 30%. To calculate success within each category, the number of modules/courses with 0, 1, 2, 3 and 4 correct associations was calculated. For categories 1 (sustainability-focused) and 2 (sustainability-inclusive), the success rate was calculated by dividing the "number of modules/courses with at least one correctly associated SDG" by the "total number of modules/courses identified in that category." The key phrase here is "at least," as 100% success does not imply that the RK search results are comprehensive, i.e., that the search identifies all SDGs in the manual review. It merely implies that a given SDG was assigned to the same module/course by both the RK search and manual review at least once. Within the non-sustainable category of modules/courses, the success rate was calculated as the "number of modules/courses with no correct association" divided by the "total number of modules/courses in the category." Figure 6 presents the success for each category. The success rates within the categories give the percentage of modules/courses in that category with at least one correct SDG association. For example, from 53 modules/courses identified as sustainability-focused, 54% of 28 modules/courses have one or more SDGs correctly associated by the computational technique. For all modules/courses with sustainability content, the tool's efficiency in assigning relevant SDGs to the content within these categories ranges between 25% and 75%. The lowest percentage (25%) was observed for sustainability-focused modules/courses with low confidence. To form an overall perspective of the manual review of module/course descriptions, learning outcomes and module/course aims, results were compiled from the three stages and compared with the results from the computational technique.
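The category-level success rate described above can be sketched as follows. This is an illustrative reading of the definition, not the authors' code: the function name is hypothetical, and the input is assumed to be a list holding the number of correct SDG associations (0 to 4) for each module/course in the category.

```python
def category_success_rate(correct_counts, sustainable=True):
    """Success rate (%) for one AASHE-STARS category.

    sustainable=True: share of modules/courses with at least one correct SDG.
    sustainable=False: share of modules/courses with no correct association.
    """
    if not correct_counts:
        return None  # empty category, no rate can be computed
    if sustainable:
        hits = sum(1 for c in correct_counts if c >= 1)
    else:
        hits = sum(1 for c in correct_counts if c == 0)
    return 100.0 * hits / len(correct_counts)


# Hypothetical counts of correct associations for four modules/courses.
print(category_success_rate([0, 1, 2, 0]))
print(category_success_rate([0, 0, 1, 0], sustainable=False))
```

Because a module/course counts as a success as soon as one SDG matches, the measure says nothing about how many of the manually assigned SDGs were recovered.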
Table 5 shows the results of the manual review from each of the stages. While classifications for module description and learning outcomes aligned closely, there were noticeably fewer classifications based on module aims, primarily due to missing content. Within each category, the results of SDG associations from the three stages were combined to obtain a list of all manually associated SDGs. These were then compared with the computational SDG associations to obtain the frequencies of occurrence of each SDG, the wrong SDG associations (non-matching SDGs) and the success rates of correct associations. There are some notable differences in SDG associations between those from the computational technique and those from the manual review. In terms of SDG coverage across the modules/courses, the highest occurring SDGs from the computational association were SDGs 1 (No Poverty) and 17 (Partnerships for the Goals), linked to 126 and 125 modules/courses, respectively. This contrasts with the manual SDG associations, where SDG 9 (Industry, Innovation and Infrastructure) has the greatest coverage, with associations to 31 modules/courses. The least coverage is seen for SDG 4 (Quality Education) for the computational technique, whereas the manual review records SDG 14 (Life Below Water) with the least (no coverage) overall.
To help identify the gaps in the tool's accuracy in capturing SDGs, Figure 7 depicts the number of wrong SDG associations for the three categories. SDG 10 is the exception with zero incorrect associations, but this is due to SDG 10 not being linked to any module/course by the computational technique. Yet, it was clear from our review that SDG 10 was assigned manually, which implies the computational technique's inadequacy in associating SDG 10. So, further work is required to evaluate the mapping accuracy of the technique to this SDG. Figure 7 implies that computational based associations to SDGs 1 and 17 are particularly problematic in that the RKs used for these two SDGs have not been appropriately contextualized. Figure 8 presents the overall success rates for each SDG. The success for correct association is highest in the sustainability-focused category. Overall, there appears to be a relatively good representation of RKs for identifying modules/courses with sustainable components in this category. Only SDG 4 has a 100% success rate with sustainability-focused modules/courses. A zero success rate for the non-sustainable category indicates the tool's inability to distinguish non-sustainable courses from others. Table 6 provides a comprehensive analysis of the success rates, manifesting the degree of accuracy of the TF-IDF based technique for each SDG. Blank cells indicate that the SDG was not assigned to any course by the tool. Hence, there is insufficient information to be able to associate a success rate with these SDGs. The overall results show that there needs to be a refinement in the keyword search for SDGs to be associated correctly with modules/courses with a sustainability component, particularly SDGs 1 and 17. For each SDG, the required focus on improvement in the keyword search criteria was calculated by the following formula:
Improvement rate = [Absolute difference in occurrence / Total number of pre-review and post-review occurrences] × 100
Figure 9 highlights the percentage of RK refinement required for correct SDG association in the sustainability-focused and sustainability-inclusive categories of modules/courses, providing a prioritization for future iterations of the computational technique. It signals that the improvement required for RK enhancement is highest for SDGs 10 and 13. Likewise, the computational technique yields relatively low success rates for SDGs 1 and 17.
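A minimal sketch of the improvement-rate formula is given below. It assumes that "pre-review occurrences" are the associations made by the computational technique and "post-review occurrences" are those from the manual review; this reading, the function name and the example values are assumptions made for illustration.

```python
def improvement_rate(pre_review, post_review):
    """Improvement rate (%) = |difference in occurrence| / total occurrences * 100."""
    total = pre_review + post_review
    if total == 0:
        return None  # SDG absent from both sources, so the rate is undefined
    return 100.0 * abs(post_review - pre_review) / total


# Illustrative values only: an SDG never assigned by the tool but assigned
# four times manually, and an SDG assigned twice by the tool and four times manually.
print(improvement_rate(0, 4))
print(round(improvement_rate(2, 4), 1))
```

Under this reading, an SDG that the tool never assigns but reviewers do reaches the maximum of 100%, which is consistent with SDGs 10 and 13 showing the highest improvement required.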
The overall results of the manual review point to the computational technique's limitations in considering the context in which SDGs should be assigned. Hence, the key focus for improvement should be to refine the RK search for these SDGs. The limitations suggest that the technique, whilst useful, should not be used in isolation. In the first instance, the interwoven nature of the SDG framework, with competing and synergistic attributes, exacerbates the challenge of correctly mapping modules/courses to individual SDGs. In addition, whilst training predictions, e.g., using artificial intelligence, would enhance the precision of the computational technique, particularly with respect to outliers such as SDGs 1 and 17, the real benefit in the application of the technique is in the robust identification of sustainability-focused and sustainability-inclusive modules/courses, which can be "marketed" in creative ways to students. Arguably, an additional benefit of the technique lies in engagement with faculty to enhance the sustainability components of their modules/courses, whilst giving judicious thought to how material SDGs are best narrated in the texts of the module/course descriptors. Indeed, the very act of undertaking a manual evaluation can be used to simultaneously improve both the technique and the modules/courses to which the technique has been applied. As this study is only focused on interpreting the accuracy of the computational technique developed, further validation work is required. Ontological approaches, such as OSDG (Pukelis et al., 2022), offer bases for benchmarking. There is scope to design in closed-loop feedback in the form of supervised machine learning to enhance predictive accuracy. However, the most significant learning may have come from engaging with faculty in the process underpinning manual interpretation of the results. Occasionally, the computation led to anomalous results, suggesting, for example, that an Organic Chemistry module/course was sustainability-focused due to the presence of keywords such as "carbon" and "energy," which may not be used in sustainability contexts. Perhaps the opportunity for enhancement lies in statistically informed Kappa studies to develop a collective interpretive intelligence in the academic community. Furthermore, statistical categorization of modules/courses to AASHE-STARS criteria was idiosyncratic to TU Dublin data. Similar work at other higher education institutions, e.g., University of Galway (Adams et al., 2020), Lincoln University (Obroh, 2020) and the National University of Kaohsiung (Chang and Lien, 2020), offers a basis for a comparative study. Lemarchand et al. (2022) found circa 5% of TU Dublin modules/courses analyzed to be sustainability-focused. As the first Irish university to receive the "Gold" AASHE-STARS rating, University College Cork also reported circa 5% of their modules/courses to be sustainability-focused in their 2018 self-assessment submission (Kirrane, 2018), based on a keyword search.

FIGURE
Improvement rates required for sustainability-focused and sustainability-inclusive categories.

Concluding remarks
Evidence from the literature indicates that sustainability lexica can be used to identify sustainability components in texts. This approach has been extended to computational methods, such as natural language processing (NLP). Such methods have been used to detect discourse on the SDGs in large documents with voluminous texts, such as scientific research and policy documentation. Two approaches offer potential to bridge keywords to meaningful attribution of SDGs, namely ontological models and machine learning. Yet, these suffer inherent drawbacks, namely their lack of comprehensive coverage and their inability to cope with documentation outside the scope of the training data, respectively. Academic staff with limited knowledge of SDGs might, understandably, have difficulty in mapping their curricula to the broader SDG context. Yet, this very task is increasingly mandated at the highest levels of our education systems as universities seek to support societal transformation. The application of SDG keyword searches to module/course descriptors is particularly challenging given that module/course descriptors are, themselves, abridged documents. This implies that we need to be judicious in our choice of keywords to extract both meaningful mapping to the SDGs and the evaluation of sustainability embeddedness (in this case, using AASHE-STARS criteria). One way to address this challenge is to attribute relative importance of root keywords to each of the SDGs. Lemarchand et al. (2022) seek to achieve this with the well-known NLP technique, term frequency-inverse document frequency (TF-IDF), used to identify keywords in large documents. These keywords are then used to evaluate the sustainability importance of individual modules/courses, categorizing them based on criteria set out by AASHE-STARS. Our evaluation of this approach suggests that the technique of Lemarchand et al. (2022) has developed a lexicon sufficient to recognize most of the 17 SDGs.
As SDGs 1 and 17 are more "generic," with vocabulary significantly attributed to other SDGs, the computational technique disproportionately tends to map too many module/course descriptors to these SDGs, yielding false positives. On the other hand, the lexicon must be enriched for the RK search to better attribute mappings to SDG 11, amongst others. Moreover, the evaluation suggests that the computational technique produces a reasonably robust method for identifying "sustainability-focused" and "sustainability-inclusive" modules/courses. All the RK search has to do is to identify modules/courses with at least one correctly associated SDG. The pertinent phrase "at least" essentially means that there is quite a high probability of categorizing modules/courses correctly within the statistically defined RK count bands for each AASHE-STARS category.
The evaluation exercise was limited to a random sample of 306 modules/courses from a population of 5,773 TU Dublin modules/courses to which the computational technique was applied. There is, naturally, some scope for improvement of the technique itself, and we suggest that it offers a basis for a complementary toolkit for engaging with academic staff to better understand their education provisions through the lens of the SDGs. In essence, we see it as an anchor for stimulating curricular transformation through various formal and informal review and enhancement processes. As such, we recommend that future research be focused on the coevolution of this computational technique with regular and ongoing learning, teaching and assessment initiatives to enhance the curriculum. A particularly challenging aspect of the manual review process was the varying quality of module/course descriptor documentation. An online curriculum management system with standardized templates and facilitation for explicitly expressing the SDGs to which a module/course contributes would go a long way toward providing choice awareness for students who might wish to chart their educational journey based on the "green" credentials of the modules/courses on offer.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.