- 1Learning Research and Development Center, University of Pittsburgh, Pittsburgh, PA, United States
- 2Broadening Engagement in Science, Technology, Engineering, and Mathematics (STEM) Center, Dietrich School of Arts and Sciences, University of Pittsburgh, Pittsburgh, PA, United States
Networked Improvement Communities (NICs) are a promising strategy for professional learning and educational change. However, limited empirical research has quantitatively examined how network participation cultivates educators' improvement science capacities. This study draws on four years of longitudinal data from the STEM PUSH Network—a NIC designed to broaden participation in STEM through collaboration with leaders of pre-college STEM programs. We investigate how deliberate practice and social learning, two central components of the NIC model, shape the development of improvement science confidence and skill. Our findings show that social learning is most strongly associated with educators' confidence in using improvement tools, while meaningful participation in structured improvement groups predicts demonstrated skill. We also find that network tenure—longer-term engagement and sustained exposure to improvement work—is critical for both outcomes. Together, these findings offer the first quantitative and longitudinal evidence of the distinct and nuanced ways NIC participation shapes practitioners' confidence and skill in improvement science. We conclude by discussing implications for future research and practice in the design and facilitation of Networked Improvement Communities.
1 Introduction
Improvement networks have gained increasing prominence in K-12 education as a promising strategy to foster innovation and scale and sustain change (Reed et al., 2025). Over the past two decades, the use of improvement science and networked improvement communities (NICs) to address persistent educational challenges has expanded considerably (Bryk et al., 2015; Langley et al., 2009; Bush-Mecenas, 2022). This growth has been accompanied by a proliferation of tools, frameworks, and resources—such as published guides, templates, and tested change packages—that support collaborative learning and structured inquiry in both school-based and out-of-school contexts (Crow et al., 2019; Hinnant-Crawford, 2020). By convening educators, researchers, and organizational leaders to participate in iterative cycles of testing, data use, and reflection, NICs aim to identify, validate, and scale evidence-based practices through shared inquiry, disciplined improvement, and collaborative learning.
Sustaining improvement over time is both a strategic goal and a design consideration for NICs (Joshi et al., 2021). Achieving this requires intentionally cultivating strong network infrastructure, such as shared routines, structures, and norms, as well as building the capacity of network participants to carry the work forward. This includes creating conditions that support ongoing inquiry, adaptation, and distributed leadership beyond the network's formal lifespan (LeMahieu et al., 2017). Within these networks, a central coordination body, often referred to as the “hub” or network leadership team, plays a critical role in initiating and shaping early improvement efforts (Peurach et al., 2020, 2025). The hub typically leads the development of a shared theory of improvement with a diverse group of stakeholders, following principles outlined by Bryk et al. (2015). The hub designs a driver diagram, which integrates the theory of improvement and articulates various changes that the network aims to achieve (Bryk et al., 2015). Additionally, the hub leads efforts to identify, vet, and select outlets and environments conducive to replicating a common organizational model across large numbers of outlets (Peurach et al., 2016). Beyond this initial knowledge-building work, the hub team also plays a key role in fostering the skills, habits, and dispositions participants need to independently lead inquiry and change efforts. These processes lay the groundwork for sustained, practitioner-driven improvement across contexts.
Understanding how educators develop the capacity to lead improvement work is critical for sustaining improvement efforts. Prior studies have identified self-efficacy as an important predictor of scaling ongoing professional development efforts among teachers (Weißenrieder et al., 2015; Chong and Kong, 2012). While much of the existing literature has examined how NICs function as systems for collective inquiry and organizational learning (Bryk et al., 2015; Anderson et al., 2024; Feygin et al., 2020; Reed et al., 2016; Peurach et al., 2025; Stosich, 2024), fewer studies have investigated what contributes to practitioners' confidence in using improvement science tools and their ability to apply these tools in practice to lead change (Richardson and Rees, 2025; Bonney et al., 2025). Since continuous improvement is a complex process that can be difficult for educators to learn, particularly in the midst of multiple competing priorities in schools, districts, and universities (Gallagher and Cottingham, 2019), we hypothesize in this study that repeated exposure to continuous improvement practice and longer-term engagement in an improvement network would be central to developing such capacity.
Research from learning sciences posits that structured collaborative learning and shared knowledge construction can also enhance learners' self-efficacy (Poellhuber et al., 2008). Although this research stems from pedagogical contexts, the mechanisms of collaborative inquiry are also applicable to educators' professional learning in NICs (De Simone, 2020; Martin et al., 2008). When educators engage in improvement work alongside peers, they encounter opportunities for mastery experiences, vicarious learning, and social encouragement–three well-established sources of self-efficacy (Bandura, 1997). In the context of networked improvement communities, these processes are further reinforced by opportunities to contribute meaningfully to the advancement of shared knowledge and practice, which scholars have identified as central to building epistemic agency and professional confidence (Scardamalia et al., 2002; Wenger, 1998). Thus, we posit that deliberate practice combined with collaborative engagement in improvement science enhances both practitioners' competence in using improvement tools and their confidence in applying them to drive change in their local contexts.
However, existing scholarship has largely offered theoretical accounts and qualitative descriptions of how NICs may build educator capacity. While this body of work is valuable, there remains very little quantitative evidence—particularly longitudinal evidence—that models how NIC participation cultivates such growth. To our knowledge, this study provides the first multi-year analysis that systematically tracks educators' development of confidence and applied improvement science skills through NIC participation. We center our investigation on the STEM PUSH Network—a NIC of more than 40 Pre-College STEM Programs (PCSPs) committed to advancing racial and ethnic equity in postsecondary STEM access and success. Guided by literature on deliberate practice and social learning, we draw on four years of survey data and improvement cycle documentation from network members to examine factors that shape participants' improvement science confidence and skills. We conclude by discussing the implications of these findings for sustaining the impact of NICs and designing future networks that intentionally cultivate practitioner expertise and leadership.
1.1 Background
1.1.1 Networked improvement communities in education
A Networked Improvement Community (NIC) is a research-practice partnership that applies improvement science to address persistent educational problems through iterative, collaborative learning across diverse contexts (Proger et al., 2017; Reed et al., 2025). NICs bring together educators, researchers, and organizational leaders to co-design, test, and refine strategies using disciplined inquiry and local data (Bryk et al., 2015). This approach transforms professional development into an inquiry-driven process where knowledge is co-constructed, skills are developed, and participants apply learning in their local settings while sharing insights with the broader network, fostering the spread of effective practices and deepening collective learning (Russell et al., 2020). The literature underscores the importance of capacity-building strategies that help members gain confidence in leading change in their practice (Cannata et al., 2017; Durrant and Holden, 2005). However, there is limited quantitative evidence to substantiate which strategies are effective, making this a central question of our study.
Unlike traditional professional development, NICs emphasize sustained engagement with shared protocols such as Plan-Do-Study-Act (PDSA) cycles (Langley et al., 2009), peer coaching, and structured feedback loops that make professional learning actionable and context specific (Noble et al., 2021; Stone-Johnson and Hayes, 2021). For example, in the Better Math Teaching Network, math teachers across five states met regularly to discuss classroom experiments, share successes and challenges, and refine practices, building a repository of knowledge grounded in real-world application (Russell et al., 2021). Such collaborative cycles have been shown to strengthen pedagogical understanding and enhance educators' ability to apply improvement principles to local challenges (Prenger et al., 2019). They also create space to analyze current practices, identify positive deviants, draw on research and peer expertise, and conduct rapid tests of change ideas (Russell et al., 2021).
A fundamental NIC question is “what works, for whom, and under what conditions” (Bryk et al., 2015, p. 172). Top-down mandates often fail by overlooking practitioners' expertise, whereas NICs position them as co-designers, increasing both the relevance of innovations and the likelihood of sustained adoption. Effective networks set measurable aims, test multiple small changes (Park et al., 2013), and equip educators with methods, peer support, and reflective space–conditions that increase the odds of both implementing at scale and sustaining meaningful change. Long-term educational improvement depends on enabling educators to integrate new processes adaptively within their unique contexts, and NICs also should prepare teachers to adapt strategies appropriately (Gutiérrez and Penuel, 2014).
Examining how NICs build educators' capacity for improvement science is thus essential. Qualitative evidence suggests NIC participation can empower educators to experiment, interpret evidence from practice, and make adaptive decisions in local contexts. Lewis (2015) showed in case studies of Japanese math teachers that educators can integrate disciplinary knowledge with organizational processes such as setting shared aims, identifying key drivers, and conducting PDSA cycles. More recently, Stosich (2024) found that educational leaders are able to engage in and learn to use continuous improvement approaches as a lever for equity-focused school reform. Complementary quantitative evidence from over 2,000 educators in 34 networks found that more frequent and sustained participation was linked to greater reported benefits, including collective efficacy, commitment, and problem-solving capacity—especially in long-standing, higher education-led networks (Perlman et al., 2025).
However, a critical gap in the literature is the absence of empirical evidence demonstrating that NICs operate as theorized. Although prior work qualitatively described how participants' self-efficacy may grow through social support and shared practice (Cannata et al., 2017), no study to our knowledge has documented measurable growth in educators' improvement science confidence and skills. This study addresses that gap by providing the first longitudinal, quantitative evidence linking specific aspects of NIC participation to the development of improvement science confidence and skills among practitioners. We hypothesize that several factors may impact educators' measurable growth in improvement science confidence and skills over time: (1) deliberate practice–educators who engage more meaningfully in structured PDSA cycles will demonstrate greater improvement in confidence and skills; (2) social learning–educators who report higher levels of collaboration within improvement groups will demonstrate greater growth in confidence and skills; (3) time in network–educators with repeated exposure to continuous improvement practice for a longer time in an improvement network will exhibit stronger confidence and skills; (4) perceived challenges–educators who report fewer challenges engaging in improvement inquiry may gain more confidence.
1.1.2 The role of deliberate practice in skill acquisition
Mastery is not achieved through experience alone, but rather through deliberate practice—a structured process in which learners pursue specific goals, engage in focused, intentional practice, and refine their performance based on feedback (Ericsson et al., 1993). Professional learning models grounded in deliberate practice frequently incorporate elements such as coaching, guided observation, and structured feedback—features that reflect a cognitive apprenticeship model (Grossman et al., 2009). In such models, novice practitioners learn by working alongside more experienced peers, trying out new techniques in authentic settings, and refining their practice through reflection. This process supports the internalization of expert strategies and helps educators build adaptive expertise over time.
Improvement science reflects the deliberate practice model through PDSA cycles, a core process for driving iterative, evidence-based change. The PDSA framework prompts educators to engage in rapid cycles of inquiry, guided by three central questions: What are we trying to accomplish? How will we know a change is an improvement? What change can we make that will result in improvement? (Langley et al., 2009). In practice, educators “Plan” a small-scale change, “Do” the test in their setting, “Study” the data collected, and “Act” on what was learned by refining, abandoning, or scaling the change. This structured process not only builds habits of inquiry but also encourages teams to critically examine their programs or practices with the goal of achieving measurable improvement. Research suggests that engaging educators in these kinds of reflective, iterative processes can lead to deeper learning for educators themselves and have a downstream impact on student outcomes (Jones, 2017). Importantly, educators see the most value when they can connect new learning to prior PDSA experiences and receive training that supports meaningful application in their own context (Cohen-Vogel et al., 2015). Additionally, their level of engagement is linked to the perceived benefits of participating in the network–educators who are more invested and participate for longer periods tend to report greater benefits (Perlman et al., 2025). This finding suggests that the amount of investment in deliberate practice is associated with perceived benefit. However, we aim to examine whether this invested engagement is also linked to objective measures, such as demonstrated increases in skills, rather than relying solely on self-reported perceptions. Importantly, such improvement work rarely happens in isolation. When educators regularly test new innovations at a local level and use data to refine their practice, these activities are deeply embedded within social learning processes. This highlights the need to consider how social learning dynamics, such as collaboration, feedback, and shared reflection further shape educators' development and the effectiveness of improvement efforts.
1.1.3 The social nature of learning
In Situated Learning Theory, Lave and Wenger (1991) promoted the concept of legitimate peripheral participation, which describes how newcomers gain expertise through increasing involvement in a community of practice. Learning is not merely the acquisition of abstract knowledge, but “a process of social participation” within a community. Empirical studies have shown that such collaborative environments can bolster educators' confidence and support instructional change. For example, Kelley et al. (2020) found that middle school teachers engaged in a year-long community of practice developed stronger self-efficacy. Teachers described how receiving peer assessments and customized feedback helped them improve their pedagogical approaches for math teaching. These findings highlight that participation in a community of practice is impactful when supported by relational conditions that enable reflection and growth.
Creating effective collaborative learning communities requires intentional cultivation of psychological safety and peer accountability. Psychological safety, a term coined by Amy Edmondson, is the shared belief that a group is safe for interpersonal risk-taking (Edmondson, 1999; Edmondson and Lei, 2014). In such environments, members feel free to ask questions, admit mistakes, and offer new ideas without fear of embarrassment or punishment. Alongside safety, peer accountability is equally essential. In contrast to top-down, hierarchical accountability, peer accountability arises from members' commitment to each other and to shared goals (Wenger, 1998, 2000). Wenger notes that participation in a true community of practice creates “very strong horizontal accountability among members through a mutual commitment to collective learning” (Wenger, 2011, p. 109). This means individuals hold each other responsible for upholding group norms, contributing knowledge, and working toward the community's learning goals.
Networked Improvement Communities develop participation structures and community norms that foster psychological safety and peer accountability. At their core, NICs are research-practice partnerships that bring together educational professionals, researchers, and designers to address persistent, high-leverage problems in education (Bryk et al., 2015; Dolle et al., 2013). These partnerships center practitioner expertise and embed improvement efforts within a supportive social architecture designed to accelerate the field's capacity to learn how to improve (Russell et al., 2017). Nielsen (2012) asserts that networks are effective at accelerating innovation because they amplify collective participation and intelligence by leveraging the cognitive diversity of collaborators, reducing barriers to entry through encouraging small contributions, and modularizing collaboration by breaking overall tasks into smaller subtasks that can be addressed independently. Within this social structure, NICs create spaces where educators can surface and refine practice-based knowledge, engage critically with one another's approaches, and build trust and mutual accountability for sustained collaborative inquiry (Katz et al., 2009; Kallio and Halverson, 2020).
Beyond participation structure, network leaders also attend to the enculturation of members into the network by establishing clear community norms. Network leaders at STEM PUSH developed a set of community norms informed by scholarship and practice (Bragg and McCambly, 2018; Jacobs et al., 2024). These norms are intended to guide how the network engages with diverse partners in such a way that educators feel valued, supported, and empowered to take initiative, take risks, and ask for help when needed in pursuing collective improvement efforts (Iriti et al., 2024). In the next section, we describe how the STEM PUSH Network brings these commitments into action through its structure and routines, and how we empirically examine which aspects of NIC participation support educators' confidence and skills in leading improvement work.
1.2 Context
The STEM Pathways for Underrepresented Students to Higher Education (STEM PUSH) Network was an Alliance funded by the National Science Foundation (NSF) through the Eddie Bernice Johnson INCLUDES (Inclusion across the Nation of Communities of Learners of Underrepresented Discoverers in Engineering and Science) program. NSF INCLUDES Alliances are national collaborative efforts designed to broaden participation in science, technology, engineering, and mathematics (STEM) by improving access, preparation, and opportunity (NSF INCLUDES, 2018, 2020). As part of this alliance, STEM PUSH aims to elevate the role of pre-college STEM programs in higher education admissions processes. PCSPs are out-of-school time programs–often run by universities, museums, or nonprofit community organizations–that engage high school students in STEM learning and exploration, helping them prepare for college and STEM careers. Launched in 2019, the STEM PUSH Network brought together over 40 PCSPs nationwide and introduced a networked improvement approach to support program quality and strengthen connections to higher education pathways.
1.2.1 Opportunities for deliberate practice and confidence building
The STEM PUSH Network applies a theory of practice improvement grounded in iterative, collaborative cycles of change to build educators' capacity and confidence while advancing systemic changes. Each pre-college STEM program identifies a focused area for improvement that aligns with both the network's shared goals and the individual program's needs, guided by a common driver diagram. Through methodologically rigorous processes–such as PDSA cycles–participants test changes in their programs, including efforts to broaden recruitment or strengthen alumni engagement (Iriti et al., 2024). This structured, step-by-step approach not only supports organizational improvement but also cultivates educators' understanding in improvement science and deepens their confidence in leading change.
Twice per year, participating programs join small improvement groups to test and refine change ideas. These groups, composed of programs working on similar challenges and approaches, follow a PDSA process facilitated by the Network hub. Monthly virtual meetings during each improvement cycle provide a collaborative space where PCSP leaders share their plans for testing changes, define what they aim to learn, determine appropriate measures, analyze the results, and decide whether to adapt, adopt, or abandon the changes based on evidence. This structured approach helps ensure that improvement efforts are both contextually grounded and continuously refined through shared learning. PCSPs use the PDSA template to plan, enact, and reflect on their use of the change idea and receive support from PCSP peers in their group and from their hub facilitator. Meetings are organized to support PCSP leaders at each stage of the PDSA cycle, and facilitators use similar meeting agendas and support approaches. At the conclusion of each cycle, program leaders complete a written summary of their test; the summaries are then incorporated into the Change Idea Summary Booklet to support collaborative learning within the network.
The PDSA cycle (Figure 1) offers more deliberate practice than traditional workshops or training. A core design principle of the NIC is that participants learn improvement science by doing it in a social structure–i.e., the improvement group. Participants co-design and test change ideas within their own programming contexts, creating structured, recurring opportunities to apply strategies while receiving timely feedback from peers and hub facilitators (Iriti et al., 2024). These iterative cycles provide repeated, apprenticed practice in framing problems, generating context-specific solutions, collecting data, and making evidence-based decisions–key components of improvement science.
Figure 1. Visual diagram of a PDSA cycle (Bryk et al., 2015).
1.2.2 Opportunities for social learning
Building on the foundation of improvement groups, the STEM PUSH Network intentionally cultivates a multi-layered social learning environment (Table 1). While improvement groups offer structured peer collaboration within smaller clusters, the network also creates expansive opportunities for cross-program learning and reflection. Every six months, an in-person convening is held to gather PCSP leaders, researchers, and hub leaders to share change ideas, examine data, and reflect on progress toward collective goals. Between convenings, monthly virtual whole-network meetings sustain engagement by spotlighting emerging insights, fostering cross-program learning, and surfacing common challenges and successes. The network utilizes a digital communication platform to enable threaded discussions, share materials and resources, and maintain asynchronous interactions. These routines of interaction create fertile ground for what Horn and Little (2010) describe as collaborative public reasoning about problems of practice. Educators in the STEM PUSH Network make their thinking visible, draw on collective wisdom, and co-construct new understandings. Importantly, this culture of social learning is anchored in psychological safety.
In the STEM PUSH Network, community norms that guide interpersonal interactions emphasize shared goals, critiquing ideas rather than people, sharing the space equitably, and being open to changing one's mind. These community norms also honor acknowledging non-closure, allowing space for emotion and reflection, and taking responsibility to repair harm. These norms are consistently honored, maintaining trust over time. Crucially, the network embraces failure as a natural and necessary part of learning. Participants are supported to take risks, try new approaches, and make mistakes without fear of judgment–because these experiences are treated not as setbacks, but as valuable sources of insight. By regularly revisiting and reinforcing these expectations, STEM PUSH fosters an environment where both trust and accountability exist, where participants feel respected, heard, and supported to fully engage in the iterative work of improvement.
Lastly, the improvement cycles generate shared artifacts and tools that support ongoing learning within the network. At the end of each cycle, PCSPs summarize their testing processes, outcomes, and learning in brief narratives, producing individual change summary documents. These are compiled by the hub into a Change Summary Booklet, which organizes entries by change idea topic and synthesizes findings across all tested areas. The booklet highlights what was learned about each routine, its impact, and the key tools and resources that supported its effectiveness (Iriti et al., 2024). In addition to these booklets, the hub develops “improvement packages” for high-leverage changes. These packages detail the tested routine(s), offer guidance for adaptive use based on network-generated evidence, and include practical tools to support implementation and monitor impact—such as sample measures and usage tips (Iriti et al., 2024). Technological platforms like Basecamp and Google Drive facilitate this network-wide exchange by organizing resources and enabling asynchronous collaboration across geographical boundaries. A curated Resource Library further supports access by housing toolkits, templates, and related materials in one accessible location. Collectively, these artifacts document local adaptations, outcomes, and lessons learned, serving as boundary objects–shared tools or resources that help people from different roles or contexts coordinate their work and understanding (Star and Griesemer, 1989). These structures are intended to enable knowledge to travel across diverse contexts and to facilitate shared learning within the network.
Together, these structures cultivate a safe social learning environment, a space where educators can take risks, share failures, and learn from one another without fear of judgment (Kallio and Halverson, 2020). As participants engage in repeated cycles of inquiry and reflection, they not only deepen their understanding of improvement science but also strengthen relational trust–a foundational condition for collective learning and innovation. Hub leaders design for members to potentially shift from peripheral to central roles in the network, taking on more leadership as they increase engagement over time. This progression reflects the trajectory of increasing participation found in communities of practice (Wenger, 1998) and illustrates how NICs may scaffold both professional identity development and collective capacity building.
After five years, the STEM PUSH Network offers a rich context for investigating how participation in a networked improvement community supports educators' growth. This study is guided by a central research question: How does participation in the STEM PUSH Network contribute to educators' development of improvement science confidence and skills? To address this question, we investigate the mechanisms through which learning and growth occur within the network. Drawing from prior literature, we hypothesize two primary pathways that contribute to increased confidence and improvement science skills over time. The first is deliberate practice–structured learning and application of improvement science tools through participation in improvement groups. To capture this, we include both the quantity and quality of participants' involvement in improvement groups as proxies for intentional practice. The second pathway is social learning, or the extent to which participants learn through collaboration, interaction, and knowledge exchange with other STEM PUSH members.
We highlight social learning as an important component because it has been difficult to study in the context of NICs. Establishing a genuine, vibrant network where professionals are willing and able to engage in ongoing, collaborative improvement work is a vision for many NICs. However, in practice, making this happen proves challenging. Kruse and Louis (1993) laid out several essential conditions for forming productive working relationships and networks. These factors range from logistical participation structures, such as time to meet, to sustained efforts toward building individual autonomy, relational trust, and shared norms. All of these require significant time and effort to develop. What makes STEM PUSH unique is that it built and sustained a robust, engaged network of pre-college STEM program leaders. This not only enhances opportunities for social learning but also provides a powerful context for investigating the factors that contribute to practitioners' confidence and their capacity to apply improvement science skills in their work.
In addition to these core mechanisms, we include other indicators of engagement in the network, such as participants' tenure in the network, the roles they take on, and their perceived challenges in participating in STEM PUSH. These variables, described further in the next section, allow us to explore a more nuanced understanding of how different dimensions of engagement relate to the development of improvement science capacity.
2 Methods
In this section, we outline the data sources and analytic strategies used to examine patterns of participation and learning within the STEM PUSH Network. Participants in the network pursued a range of improvement aims, including enhancing recruitment and retention in precollege STEM programs and strengthening partnerships with higher education institutions. Concrete examples of these focus areas are provided in the Supplementary materials. We begin by describing the data collected across the network and then detail the analytic strategies employed to investigate how engagement with improvement science tools and routines related to key indicators of educators' learning and practice.
2.1 Participants
The network participants are pre-college STEM program leaders or staff members from across the United States. To be eligible for network membership, programs had to meet the following criteria: (1) Engage high school students in rigorous STEM-centric curricula; (2) Provide at least 100 student contact hours per year; (3) Have explicit goals to serve Black, Latine, and/or Indigenous students; (4) Engage students in doing STEM via hands-on experiences, laboratory work, and/or mentored research; (5) Have operated for 3 or more years; (6) Expose students to STEM college pathways and careers. Once a program was selected for membership, it identified one staff member to serve as a STEM PUSH liaison to attend all network meetings and participate in all network activities. Some programs chose to engage additional staff members based on roles in their program. Programs receive a stipend of $7,000 per year for participation, as well as full costs for one staff member to attend the twice-yearly in-person network convenings.
Over the five years of STEM PUSH activity, the network has seen a dynamic flow of participants, with individuals joining and leaving at different points. We collected data from 138 unique individuals who have participated in the network. In the early years of the STEM PUSH Network, we did not systematically collect participants' demographic information, which resulted in missing race and gender data for some individuals. Among participants who reported gender, race, and ethnicity information, Table 2 provides the breakdown of frequencies in each category for all members in the network between 2020 and 2024. A total of 115 participants reported their birth year. Using 2025 as an anchor, the average age of participants is 42 years, with the youngest at 24 and the oldest at 84. At the program level, the analytic sample includes 46 programs for modeling IS confidence and 41 programs for modeling improvement science skills. Table 3 shows the disciplines represented by these programs and their locations within the United States. The discrepancies in the total number of programs by region and discipline occur because regional data are available for most programs, whereas discipline data rely on participants' self-reports, and some participants did not complete that portion of the survey.
2.2 Data
This study draws on a longitudinal dataset collected from the STEM PUSH Network between 2021 and 2024. Figure 2 shows the data collection timeline. The primary data source is the annual Network Health Survey (Bryk et al., 2025), which is a validated instrument designed to assess the health and development of improvement networks. This survey includes self-reported measures assessing participants' continuous improvement confidence. To assess participants' development of improvement science skills, the hub team designed and applied a rubric to evaluate submitted PDSA artifacts. The rubric scores the quality of participants' learning and execution across the four core phases of the PDSA cycle. In this study, we focus on the Act phase, as detailed in the section below.
Key predictive variables in our analysis include (1) Improvement Science engagement, which encompasses both attendance at improvement meetings and hub leader-assessed quality of participation; (2) Social learning, reflecting the extent to which participants report learning from their colleagues within the network; (3) Inquiry challenge, indicating participants' understanding of the expectations associated with engaging in improvement inquiry; and (4) Time in network, denoting the duration of participants' exposure to improvement science practices. Together, these variables allow us to examine how participation in NIC structures and routines relates to educators' development in improvement science confidence and applied skills over time. We detail the construction of each variable in the sections that follow.
2.2.1 Measurement of outcome variables
2.2.1.1 Improvement science (IS) confidence
Improvement Science (IS) confidence was measured in the Network Health Survey administered during the fall each year. Participants were asked to rate their confidence in using core improvement science tools on a scale from 1 (not at all confident) to 5 (very confident). The prompt read: “How confident do you feel using the following improvement science tools?” The measure comprises three subitems: (1) PDSA cycles or other inquiry routines, (2) driver diagrams or other visual representations of a theory of improvement, and (3) using data to determine whether changes are leading to improvement. While the original survey also asked about confidence in using process maps and fishbone diagrams, these tools were less widely implemented across the network, and participants had limited exposure to them. Therefore, they were excluded from the present analysis. A composite IS confidence score was calculated by averaging the three relevant items. Internal consistency of the scale was assessed using Cronbach's alpha to confirm that the items formed a reliable composite measure.
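As a concrete illustration, the following minimal Python sketch shows how such a composite and its internal consistency can be computed; the column names and response values are hypothetical, not the study's actual data or codebook.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of the summed score)."""
    items = items.dropna()
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical responses to the three retained confidence items
# (1 = not at all confident, 5 = very confident).
survey = pd.DataFrame({
    "conf_pdsa":           [4, 3, 5, 2, 4],
    "conf_driver_diagram": [4, 2, 5, 3, 4],
    "conf_data_use":       [3, 3, 4, 2, 5],
})

items = ["conf_pdsa", "conf_driver_diagram", "conf_data_use"]
survey["is_confidence"] = survey[items].mean(axis=1)  # composite IS confidence score
alpha = cronbach_alpha(survey[items])                 # internal consistency of the three items
```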
2.2.1.2 Improvement science (IS) skills
The rating of PDSA change summaries served as a proxy for participants' applied improvement science skills in using core improvement science principles and tools. The hub structured a template for change idea summaries to scaffold reporting for the following dimensions: (1) Tested change; (2) Relevant program context information; (3) Adaptations to the change idea during implementation; (4) What was learned; (5) Action steps program will take; (6) Resources and tips. Seven improvement science experts and hub IS group facilitators independently rated subsets of the submissions, with each expert evaluating a portion of the total submissions using a standardized rubric developed for this study. Interrater reliability was established by jointly coding a set of pilot summaries, aligning scoring decisions, and holding weekly meetings throughout the process to check consistency and calibrate codes.
Although the rubric assessed all four PDSA components, this analysis focuses specifically on the Act items related to measurement use. This decision was motivated by both conceptual and analytic considerations. Conceptually, measurement-related items capture what we regard as the most critical indicator of applied IS skills: the ability to collect and interpret data to determine whether a change led to improvement and then to use that evidence to act (adopt, adapt, or abandon the change). Cycle documentation that demonstrates a deep understanding of the use of measurement for improvement is at the core of improvement science–and the most difficult aspect of the cycle to achieve. It involves collecting meaningful data that is connected to the tested change, making sense of that data, and using that evidence to make a decision about what the next step is on one's improvement journey. Analytically, we tested several alternative grouping methods that incorporated all four PDSA components using composite or weighted scoring rules. These more complex approaches yielded classification patterns and model results similar to those produced using only the Act score. As such, we opted for a more parsimonious approach. This method allowed us to categorize participants' submissions into clear quality levels while still capturing meaningful differences in understanding and application of improvement science. Importantly, our robustness checks support the stability of this decision, suggesting that the observed relationships are not an artifact of how we operationalize quality. Specifically, participants' change summary submissions were categorized into three levels of quality based on this criterion:
• High quality: Their work demonstrated a deep understanding of using measurement for improvement.
• Medium quality: Their work showed some understanding of the concept but lacked full clarity or consistency in application.
• Low quality: Their work showed little to no understanding of how to use measurement for improvement.
Examples of change summaries are provided in the Supplementary material, along with a detailed description of our scoring process, to illustrate how we assessed the level of IS skills demonstrated in the example work. It is important to note that participants completed their PDSA submissions at the program level, with scores assigned by raters to the overall submission rather than to individuals. For the purposes of individual-level analysis, each participant listed on a given submission was assigned the corresponding program score. We recognize that this approach may not fully capture individual variation in contribution, particularly in programs involving multiple participants. However, given the collaborative design of the STEM PUSH improvement groups, members are expected to co-construct and reflect together on their change work. We view the program-level score as a reasonable proxy for individual exposure to and engagement with improvement science practices.
2.2.2 Measurement of predictive variables
2.2.2.1 Social learning
Social learning was measured using a survey item that asked participants to rate their agreement with the statement: “I learn new skills and knowledge from collaborating with my STEM PUSH colleagues”, using a 5-point Likert scale ranging from 1 (do not agree) to 5 (strongly agree).
To better understand the multidimensional nature of social learning, we also examined several exploratory indicators. These additional indicators were conceptually informative but were not included in the analysis models due to concerns about multicollinearity, potential redundancy, and model overfitting given the modest sample size. Instead, we retained the primary social learning item as a theoretically central and parsimonious measure, while using the other indicators descriptively to enrich our understanding of participants' collaborative learning experiences. The first supplementary variable was drawn from the Network Health Survey administered in the fall, and the second was drawn from a survey administered in the spring. They include: (1) Perceived usefulness of collaboration—Participants rated the usefulness of various network activities on a scale from 1 (not at all useful) to 4 (very useful), including the item: “Collaborating with other network members in improvement groups”. (2) Receptiveness—Participants were asked to list the individuals from whom they had gained ideas. The original prompt asks, “Please reflect on your engagement with other STEM PUSH members in the network and identify which network members have had an influence on your thinking or your work”. We operationalized this measure as the total number of names provided, reflecting the extent of participants' active uptake of ideas from others.
2.2.2.2 Improvement group engagement
Improvement Group Engagement (hereafter IS Group Engagement) captures the extent to which a participant's presence in improvement group meetings translates into meaningful contributions across the total set of possible meetings. It is calculated as a composite score by multiplying Attendance (proportion of meetings attended) by Quality of Participation (rating of contribution when present). The resulting variable reflects overall effective engagement, integrating both the depth of engagement when present and the consistency of participation over time. Low scores indicate shallow engagement—either frequent attendance with minimal contribution, or strong contributions made only sporadically—patterns that are unlikely to foster IS skills and may limit the participant's overall impact on collaborative inquiry progress.
We first documented each participant's Attendance over time. This variable represents the proportion of meetings a participant actually attended out of the total number they were expected to attend in a given cycle. For participants who exited the network partway through a cycle, their expected attendance was calculated as the number of meetings held before their departure. Attendance scores range from 0 to 1, with values closer to 1 indicating higher attendance.
The Quality of Participation was rated by improvement group facilitators after each meeting on the following scale: 0 = low or no participation, 1 = regular participation, and 2 = substantive participation that deepened the discussion. For each participant, we summed these ratings across all meetings attended and divided by the number of meetings they attended to obtain their average participation quality score. Finally, this average quality score was multiplied by the participant's attendance score to yield the IS Group Engagement. Higher values indicate consistent, high-quality contributions across the full set of expected meetings.
Since the change summary outcome is aligned with the improvement group cadence, no adjustment was needed for that measure. However, the other outcome variable, IS confidence, is collected annually. To maintain consistency in the time frame of predictors, we created a separate IS Group Engagement variable for predicting this outcome. Specifically, if a participant engaged in more than one improvement group within the same calendar year, we calculated the average of their IS Group Engagement scores across the two improvement periods. If a participant participated in only one group during the year, that score was retained for analysis.
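The sketch below illustrates this construction with a hypothetical meeting log (Python; column names and ratings are illustrative, not the network's actual records), including the annual averaging used when predicting IS confidence.

```python
import pandas as pd

# Hypothetical meeting log: one row per participant per expected meeting in an improvement cycle.
meeting_log = pd.DataFrame({
    "participant_id": [1, 1, 1, 1, 1, 1],
    "cycle":          ["2022_spring"] * 3 + ["2022_fall"] * 3,
    "attended":       [1, 1, 0, 1, 1, 1],        # 1 = attended, 0 = missed
    "quality":        [2, 1, None, 1, 2, 2],     # facilitator rating when present (0-2)
})

# Proportion of expected meetings attended, per participant and cycle.
attendance = meeting_log.groupby(["participant_id", "cycle"])["attended"].mean()
# Mean participation-quality rating over attended meetings only.
avg_quality = (
    meeting_log[meeting_log["attended"] == 1]
    .groupby(["participant_id", "cycle"])["quality"].mean()
)
engagement = (attendance * avg_quality).rename("is_group_engagement").reset_index()

# For the annual IS confidence models, average the cycle scores within the same calendar year.
engagement["year"] = engagement["cycle"].str.split("_").str[0].astype(int)
annual_engagement = (
    engagement.groupby(["participant_id", "year"])["is_group_engagement"].mean().reset_index()
)
print(engagement)
print(annual_engagement)
```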
2.2.2.3 Challenges to inquiry
Challenges to Inquiry (referred to as Inquiry Challenge hereafter) was measured using five items from the Network Health Survey, administered each fall. The survey prompt read: “Please indicate the extent to which you find the following aspects of the improvement cycles challenging”. The items assessed specific cognitive and technical aspects of engaging in improvement science, including: (1) Identifying or selecting a change to test; (2) Collecting enough data to assess the change; (3) Collecting the right kind of data to judge whether the change worked; (4) Making predictions about how a change will lead to improvement; (5) Using data to make decisions (e.g., to adapt, adopt, or abandon a change) rather than relying on intuition. Participants rated each item on a 5-point Likert scale ranging from 1 (Not at all challenging) to 5 (Very challenging). An overall inquiry challenge score was computed by averaging responses across the five items, with higher scores indicating greater perceived difficulty engaging in core inquiry practices.
2.2.2.4 Role
Within the STEM PUSH Network, participants hold one of three roles, reflecting the unique structure wherein multiple individuals from the same program may be involved. Primary members are individuals who either represent their program alone or take on the main responsibilities within the network. Co-leads or Secondary members share their involvement with another colleague from their program and typically carry fewer responsibilities compared to Primary members. Finally, individuals with minimal engagement in the network are classified as “No role” or “Peripheral”. While participants' roles are generally stable, they can shift over time—particularly in programs with multiple representatives. As such, role is treated as a dynamic variable and recorded at each six-month improvement cycle. We quantified participants' role as a numeric ordinal variable to reflect the increasing centrality and responsibility of their involvement. Specifically, “No role” or “Peripheral” was coded as 0, “Co-lead/Secondary” as 1, and “Primary” as 2. This approach aligns with our theoretical assumption that more central roles are associated with higher levels of confidence and greater quality in PDSA work. Treating role as an ordered numeric variable allowed us to test for a linear trend in outcomes across role levels, while reducing model complexity and preserving degrees of freedom.
2.2.2.5 Time in network
Participants' total time in the STEM PUSH Network reflects the tenure of each participant in the network and is expressed in years. It was calculated dynamically by subtracting each participant's entry date from the date of their survey response. This value was then expressed in fractional years to reflect accumulated experience at each measurement point.
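A minimal sketch of this calculation, assuming hypothetical entry_date and response_date fields:

```python
import pandas as pd

# Hypothetical membership records; dates are illustrative.
members = pd.DataFrame({
    "participant_id": [1, 2],
    "entry_date":     pd.to_datetime(["2020-09-01", "2022-01-15"]),
    "response_date":  pd.to_datetime(["2023-10-01", "2023-10-01"]),
})

# Tenure at each measurement point, expressed in fractional years.
members["time_in_network"] = (members["response_date"] - members["entry_date"]).dt.days / 365.25
```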
2.2.2.6 Gender and ethnicity
We included gender and race as control variables in an additional model. Due to small sample sizes in several race categories (see Table 2), we collapsed individuals identifying as Mixed Race, Native American/Indigenous, and Not Described into a single category. This decision was made to address data sparsity and prevent estimation instability and convergence failure in the cumulative link mixed model. Given the overall small sample size, reducing the number of race categories also helped stabilize model estimates and mitigate class imbalance.
2.3 Analytic approach
We first conducted a correlation analysis among the predictors and the outcome variables to explore their bivariate relationships prior to fitting the model. This step allowed us to assess the strength and direction of associations between variables, providing an initial understanding of which predictors may be meaningfully related to the outcome. Second, given our relatively small sample size, examining correlations helped inform our modeling decisions by identifying potential issues of multicollinearity and reducing the risk of overfitting. By prioritizing predictors with stronger correlations to the outcome and minimal redundancy with one another, we aimed to construct a more parsimonious and stable model.
To examine predictors of improvement science (IS) confidence, we employed a Multilevel Linear Mixed-Effects model, appropriate for the longitudinal structure of the data, where repeated time points are nested within individuals. This approach allowed us to simultaneously model both within-person change over time and between-person differences in IS confidence, while accounting for individual heterogeneity in baseline levels. By including a random intercept for each participant, the model accommodates the non-independence of observations arising from repeated measurements within individuals and adjusts for unobserved, person-specific factors that may influence confidence trajectories. This modeling strategy is particularly well-suited to our goals, as it enables us to assess how time-varying experiences in the network (e.g., social learning, group engagement, and evolving roles) relate to confidence growth, while also incorporating stable individual characteristics (e.g., gender and race) as between-person predictors. The model was fitted using restricted maximum likelihood (REML), with Satterthwaite's method used for estimating degrees of freedom and p-values. The model is specified as follows:
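$$
\mathrm{IS\ Confidence}_{it} = \beta_0 + \beta_1\,\mathrm{Time\ in\ Network}_{it} + \beta_2\,\mathrm{Social\ Learning}_{it} + \beta_3\,\mathrm{IS\ Group\ Engagement}_{it} + \beta_4\,\mathrm{Inquiry\ Challenge}_{it} + \beta_5\,\mathrm{Role}_{it} + \beta_6\,\mathrm{Gender}_{i} + \beta_7\,\mathrm{Race}_{i} + \mu_{0i} + \varepsilon_{it}
$$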
In this model, IS Confidence_it represents the self-reported improvement science confidence for individual i at time t. β_0 is the fixed intercept. The term Time in Network_it is a time-varying covariate indicating how long an individual has been part of the network at time t. Social Learning_it captures the extent to which individuals report learning from peers or others in the network. IS Group Engagement_it reflects individuals' participation in improvement group activities. Inquiry Challenge_it reflects the degree of difficulty individuals report in engaging in improvement inquiry. Role_it denotes their network role or position at time t. Gender_i and Race_i are non-time-variant demographic characteristics. The model includes a random intercept, μ_0i, to account for unobserved heterogeneity across individuals, and a residual error term, ε_it, capturing within-person variance over time.
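As an illustration of this specification, the following is a minimal sketch in Python using statsmodels with synthetic data and illustrative variable names (not the study's actual dataset); note that statsmodels reports Wald z-tests rather than the Satterthwaite degrees-of-freedom adjustment used in the reported results.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic long-format data: one row per participant per measurement occasion.
rng = np.random.default_rng(0)
n = 60  # e.g., 30 participants observed twice
df = pd.DataFrame({
    "participant_id":      np.repeat(np.arange(30), 2),
    "time_in_network":     rng.uniform(0, 4, n),
    "social_learning":     rng.integers(1, 6, n),
    "is_group_engagement": rng.uniform(0, 2, n),
    "inquiry_challenge":   rng.uniform(1, 5, n),
    "role":                rng.integers(0, 3, n),
    "gender":              rng.choice(["man", "woman"], n),
    "race":                rng.choice(["White", "Black", "Asian"], n),
})
df["is_confidence"] = (
    3.0 + 0.2 * df["time_in_network"] + 0.3 * df["social_learning"]
    - 0.2 * df["inquiry_challenge"] + rng.normal(0, 0.5, n)
)

# Linear mixed-effects model with a random intercept per participant, estimated by REML.
model = smf.mixedlm(
    "is_confidence ~ time_in_network + social_learning + is_group_engagement"
    " + inquiry_challenge + role + C(gender) + C(race)",
    data=df,
    groups=df["participant_id"],
)
result = model.fit(reml=True)
print(result.summary())
```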
To examine participants' improvement science (IS) skill development, we used an Ordinal Mixed-Effects model to account for the ordered nature of the outcome variable—the change summary score, which captures PDSA quality ratings at the group level. This score reflects the degree to which participants demonstrated understanding of measurement for improvement. The outcome is categorized into three ordered levels: low, medium, and high. As described previously, each individual was assigned the group's score for the submission they contributed to, acknowledging the collaborative nature of the work and its documentation. Given the categorical and ordinal nature of the outcome, we fitted a Cumulative Link Mixed Model, a specific type of ordinal regression model that is used when the outcome variable is ordinal. This model estimates the probability of a participant's submission falling into a higher quality category, while appropriately accounting for the nested structure of the data–specifically, repeated observations within individuals. By including a random intercept for each participant, the model accounts for individual differences that are not directly observed but may affect how each person expresses their IS skills over time. These individual differences might include factors such as openness to adopting new practices, prior experience with improvement science, or cognitive and emotional dispositions. By accounting for these individual-level variations, the model allows us to isolate the specific effects of the predictors we are studying, even in the presence of individual differences. We note that the random intercepts and residual variance estimates are not findings to be interpreted directly. Instead, they function as statistical adjustments that allow the model to account for individual differences. The substantive interpretations in this study come from the estimated coefficients of the predictors. This modeling approach is well-aligned with both the data structure and theoretical framing, allowing us to assess how participants' evolving network experiences (e.g., role, engagement, and social learning) relate to changes in the quality of their applied improvement work. It also accommodates the clustered and collaborative nature of submissions, where multiple participants may contribute to a single outcome.
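In general form, a cumulative link mixed model with a participant random intercept and a logit link (the conventional default; the specific link function is an assumption here) estimates, for each ordered category k, the probability that a submission falls at or below that quality level:

$$
\Pr(\mathrm{Quality}_{it} \le k) = \operatorname{logit}^{-1}\!\left(\theta_k - \left[\beta_1\,\mathrm{Time\ in\ Network}_{it} + \beta_2\,\mathrm{Social\ Learning}_{it} + \beta_3\,\mathrm{IS\ Group\ Engagement}_{it} + \beta_4\,\mathrm{Role}_{it} + \mu_{0i}\right]\right), \quad k \in \{\mathrm{low}, \mathrm{medium}\}
$$

Here the θ_k are category thresholds; under this parameterization, larger values of the linear predictor shift probability mass toward higher quality categories. Gender and race enter the linear predictor only in the extended models described below.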
We fitted extended models that included gender and race as control variables for prediction of both outcomes. Because demographic data were not required, these models were estimated on a reduced subset of the sample with complete demographic information. Although the inclusion of these variables introduced additional complexity in the context of a modest sample size, we believed it was important to account for potential structural differences in participants' experiences and baseline levels of IS confidence. As such, we report results from the full model that includes demographic covariates alongside the core predictors. In an effort to retain a larger analytic sample, we also fitted models that included participants with missing demographic data by coding them into an additional category. Results of these models are provided in the Supplementary material. In regression models, categorical predictors require the selection of a reference category for interpretation. For consistency across analyses, we coded gender with men as the reference group and race with White participants as the reference group. Coefficients for other categories should be interpreted relative to these reference categories.
3 Results
3.1 Correlation outcomes
We examined Pearson correlations among all study variables to assess bivariate relationships prior to multilevel modeling. As shown in Tables 4, 5, the two outcome variables, IS confidence and change summary score, were significantly positively correlated (r = 0.22, p < 0.05). This suggests a meaningful association between participants' confidence in using improvement science tools and their demonstrated application of those tools in practice, captured in the change summary artifact. This relationship supports our decision to examine both outcomes in parallel, as they capture related but distinct dimensions of improvement science capacity.
Among the predictor variables, Time in Network (r = 0.33, p < 0.01) and Social Learning (r = 0.39, p < 0.01) were positively associated with IS confidence, whereas IS Group Engagement was significantly correlated only with Change Summary Score (r = 0.22, p < 0.01). These patterns suggest that some predictors are more strongly related to participants' perceived confidence, while others are more closely tied to the demonstrated application of improvement science tools.
The overall pattern of correlations showed minimal evidence of multicollinearity, supporting the inclusion of these predictors in the subsequent multilevel models. In the model predicting IS confidence, we include all the predictors presented here, given the modest correlations they exhibit with that outcome. We retained Role despite its nonsignificant correlation, as we consider it conceptually important: the nature of one's role may still signal participants' commitment to the network or opportunities to engage. For the model predicting Change Summary Score, we retained the same set of predictors, except for Inquiry Challenge, which demonstrated minimal correlation with the outcome and was thus excluded from the model.
To further probe the multidimensional nature of social learning, we examined two supplementary indicators: participants' perceived usefulness of collaboration in improvement groups and their reported receptiveness to ideas from others. Perceived usefulness was positively correlated with confidence (r = 0.22, p = 0.05), whereas receptiveness showed a negative correlation (r = −0.24, p = 0.01). One possible explanation is that participants with lower confidence may be more open to absorbing ideas from others, as reflected in a higher number of reported interactions; simply gaining ideas, however, does not necessarily translate into greater confidence. As noted above, these exploratory indicators were not included in the final models due to concerns about multicollinearity and the risk of overloading the model given the modest sample size. Still, these patterns indicate that social learning encompasses multiple facets: not only the quantity of interactions but also the quality of those exchanges, as reflected in the social learning variable and in perceived usefulness, in supporting skill and knowledge acquisition.
3.2 Improvement science confidence
Table 6 presents the results of our linear mixed-effects model examining predictors of participants' confidence in using improvement science (IS) methods (N = 105 observations from 63 individuals). The table shows that participants with longer network tenure, greater social learning, and fewer perceived inquiry challenges reported higher confidence levels. Specifically, time in network was a significant positive predictor of IS confidence, β = 0.17, SE = 0.06, t(88.31) = 2.66, p = 0.009. This coefficient is interpreted on the original scale of the outcome variable, indicating that for each additional unit of time (e.g., year) in the STEM PUSH Network, a participant's self-reported confidence in using improvement science tools increased by approximately 0.17 points on average, holding all other variables constant; sustained participation in the network is thus associated with greater confidence in applying IS methods. Social learning was also positively associated with IS confidence, β = 0.36, SE = 0.09, t(84.09) = 3.88, p < 0.001. This coefficient indicates that for each one-unit increase in perceived social learning—measured by participants' self-reports of how much they learn from their colleagues—IS confidence increased by approximately 0.36 points on average, holding other factors constant. In other words, participants who felt they learned more from their peers tended to report greater confidence in applying improvement science methods. By contrast, inquiry challenge was a significant negative predictor of IS confidence, β = −0.22, SE = 0.11, t(94.89) = −2.05, p = 0.043, suggesting that participants who perceived greater challenges in engaging with inquiry work tended to feel less confident. Neither IS group engagement (β = 0.07, SE = 0.12, t(84.55) = 0.55, p = 0.581) nor network role (β = −0.15, SE = 0.13, t(80.55) = −1.17, p = 0.244) emerged as a significant predictor.
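For readers who want to see the shape of such a specification, the following is a minimal, self-contained sketch using synthetic data (variable names, values, and the REML fit are assumptions for illustration, not our analysis code):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic long-format data: two observations per participant.
rng = np.random.default_rng(0)
n_people, n_waves = 60, 2
df = pd.DataFrame({
    "participant_id": np.repeat(np.arange(n_people), n_waves),
    "time_in_network": np.tile(np.arange(1, n_waves + 1), n_people),
    "social_learning": rng.normal(3, 1, n_people * n_waves),
    "is_group_engagement": rng.normal(3, 1, n_people * n_waves),
    "inquiry_challenge": rng.normal(2, 1, n_people * n_waves),
})
person_effect = np.repeat(rng.normal(0, 0.7, n_people), n_waves)  # person-level intercepts
df["is_confidence"] = (
    2.0
    + 0.17 * df["time_in_network"]
    + 0.36 * df["social_learning"]
    - 0.22 * df["inquiry_challenge"]
    + person_effect
    + rng.normal(0, 0.4, len(df))                                  # residual noise
)

# Linear mixed-effects model with a random intercept per participant,
# mirroring the structure described in the text.
model = smf.mixedlm(
    "is_confidence ~ time_in_network + social_learning + "
    "is_group_engagement + inquiry_challenge",
    data=df,
    groups=df["participant_id"],
)
print(model.fit(reml=True).summary())
```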
These results highlight the positive association of accumulated time in the network, with its ongoing exposure to improvement work, and of peer-based learning with practitioners' confidence in improvement science. They also underscore that the relational and collaborative aspects of participation play an influential role in shaping participants' confidence. Interestingly, although we expected that the quality of engagement in structured improvement group activities would be a predictor of both confidence and skills, it was not significantly associated with increases in confidence. This suggests that confidence may be more closely tied to participants' social integration and perceived learning within the broader network context than to their direct involvement in formal improvement group routines. By contrast, inquiry challenges appear to act as a barrier to IS confidence, indicating that participants who found the work more challenging tended to feel less confident in their ability to apply improvement science tools.
The results for the extended model that included demographic variables are presented in Table 7. After controlling for demographic characteristics, network tenure and social learning remained the strongest predictors of IS confidence. The random intercept variance (σ² = 0.51) and residual variance (σ² = 0.18) indicate that a substantial portion of the variability in IS confidence is attributable to differences between individuals: variation in IS confidence is largely due to individuals starting from different baseline levels of confidence rather than to fluctuations within individuals over time. This also suggests that personal factors, such as professional experiences, likely contributed to the varying baseline confidence levels across individuals. Consistent with the earlier model, time in network (β = 0.18, p = 0.014) and social learning (β = 0.35, p < 0.001) remained statistically significant predictors. These findings reinforce the idea that both sustained participation in the NIC and perceived opportunities to learn from peers are associated with greater confidence in using improvement science tools. Inquiry challenge, however, was not significantly associated with IS confidence in the extended model, and none of the demographic variables emerged as significant predictors. The addition of these demographic controls did not substantively alter the pattern or strength of the core predictors, suggesting that the associations between time in network, social learning, and IS confidence are robust to demographic variability.
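Expressed as an intraclass correlation (an illustrative calculation from these reported variance components, not a quantity we report in the tables), this implies

$$\mathrm{ICC} = \frac{\sigma^{2}_{\text{intercept}}}{\sigma^{2}_{\text{intercept}} + \sigma^{2}_{\text{residual}}} = \frac{0.51}{0.51 + 0.18} \approx 0.74,$$

that is, roughly three quarters of the variance in IS confidence lies between individuals rather than within individuals over time.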
Table 7. Fixed effects from linear mixed-effects model predicting improvement science confidence (extended).
Contrary to our expectations, the lack of—and in fact negative—correlation between IS group engagement and IS confidence was initially puzzling. However, we hypothesize that this pattern may reflect a manifestation of the Dunning-Kruger effect (Kruger and Dunning, 1999), whereby individuals with lower levels of competence tend to overestimate their abilities, while those with greater expertise are more likely to recognize the complexity of a domain and assess themselves more modestly. In the context of our study, participants who are newer to improvement science might be less familiar with its conceptual and technical demands, and may therefore report inflated levels of confidence. By contrast, those with more improvement group exposure and deeper engagement may develop greater awareness of the complexity of improvement science tools, leading to more tempered, or even reduced, self-assessments of confidence. To explore this possibility, we tested for a nonlinear relationship between IS group engagement and confidence using a quadratic model. While the quadratic term did not reach statistical significance, we observe that confidence remains relatively stable at low to moderate levels of engagement, but shows a slight decline at the highest levels. This trend may reflect a growing awareness among experienced participants of the depth and rigor involved in improvement science, which may result in lower confidence the more they are exposed to the practice. A visualization of this pattern is included in the Supplementary material.
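Continuing the illustrative sketch above (same synthetic data frame and assumed variable names), the quadratic specification simply adds a squared engagement term through the formula interface:

```python
# Quadratic term added with patsy's I(); everything else mirrors the earlier sketch.
quad_model = smf.mixedlm(
    "is_confidence ~ is_group_engagement + I(is_group_engagement**2) + "
    "time_in_network + social_learning + inquiry_challenge",
    data=df,
    groups=df["participant_id"],
)
print(quad_model.fit(reml=True).summary())
```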
3.3 Improvement science skills
Both Tables 8, 9 highlight IS group engagement as the strongest predictor of IS skill ratings, suggesting that participants' active engagement in their improvement groups was most closely linked to the quality of their demonstrated improvement science practice. Table 8 shows the results for predicting IS skills, measured by expert ratings of the Act component in participants' PDSA submissions. As discussed in the methods section, these scores reflect the degree to which participants demonstrated understanding and appropriate use of measurement for improvement. Among all predictors, only IS group engagement emerged as a statistically significant predictor of skill development (β = 0.87, z = 2.06, p = 0.039). This finding suggests that participants who were more regularly and meaningfully engaged in improvement group meetings were more likely to produce PDSA summaries rated as demonstrating stronger measurement-based reasoning and action. Given that improvement groups provide structured support for participants through each stage of the PDSA cycle, this result underscores the value of deliberate practice. Participating in these scaffolded routines offers repeated exposure to key tools, opportunities for peer and facilitator feedback, and embedded accountability, all of which support deeper learning and higher-quality application of improvement methods.
In the extended model (Table 9), we observe a pattern largely similar to that of the model without participants' gender and race as control variables. IS group engagement continued to be a significant predictor of IS skills (β = 0.94, p = 0.025), and time in network showed a marginal association with skill development (β = 0.44, p = 0.052). This suggests that once we account for demographic factors, the unique contribution of time in network to IS skills becomes more apparent, approaching conventional significance levels after being nonsignificant in the previous model. Neither social learning nor role was significant in either model. Among the demographic variables, gender (woman; β = 0.99, p = 0.041) was statistically significant, gender (nonbinary; β = 3.00, p = 0.050) was at the threshold of significance, and race (Native American/Mixed Race/Not Described) approached significance (β = −1.70, p = 0.064); the remaining race categories did not meet conventional thresholds. These patterns may warrant further exploration in future research, particularly with a larger sample. Overall, the results from both models reinforce the importance of direct participation in improvement structures for developing applied IS skills and suggest that this relationship holds even when accounting for individual background characteristics.
While both time in the network and social learning showed positive associations with IS skill ratings, neither reached statistical significance. These results suggest that spending more time in the network or learning from peers may not be sufficient for demonstrating high-quality use of measurement in change work. General exposure to the network and relational learning experiences may benefit participants' knowledge exchange and motivation, but they do not necessarily translate into observable gains in carrying out improvement work. Unlike confidence, which may be more sensitive to self-perception, social interaction, and exposure to ideas, developing applied IS skills appears to require hands-on engagement with tools and feedback-rich routines. Participants' roles within the network were not predictive of skill level, indicating that what matters most is how individuals engage with the structured improvement processes, not the position they hold.
4 Discussion
Our study examined a central question: How does participation in the STEM PUSH Network contribute to educators' development of improvement science confidence and skills? Our results show that consistent with our first hypothesis on deliberate practice, participants' meaningful engagement in improvement cycles predicted higher applied IS skills. However, contrary to expectations, improvement group engagement did not significantly predict IS confidence. Partly supporting our second hypothesis, social learning predicted growth in IS confidence but not in IS skills, suggesting that these two forms of learning contribute differently to practitioners' development rather than exerting uniform effects across outcomes. In line with our third hypothesis regarding time in network, educators with repeated exposure to continuous improvement practice demonstrated stronger IS confidence and skills. Sustained participation in the network appears to provide ongoing opportunities for practice, peer collaboration, and shared reflection that cumulatively strengthen practitioners' capacity. Finally, regarding our fourth hypothesis on inquiry challenges, participants who reported fewer challenges engaging in improvement inquiry also demonstrated higher confidence. Although this effect weakened after controlling for demographic variables, it underscores the importance of clear expectations and scaffolds for participants learning improvement science methods. Taken together, these findings emphasize that NICs should intentionally balance relational learning structures that build confidence with structured, feedback-rich PDSA practice that develops skill, supported by clear guidance and expectations throughout the inquiry process.
Our findings provide empirical support for core propositions in the NIC literature, which posit that bringing together educators, researchers, and organizational leaders to co-design, test, and refine strategies through disciplined inquiry and local data use strengthens collective capacity for quality improvement in education (Bryk et al., 2015; Russell et al., 2017; Noble et al., 2021). Consistent with the theoretical models (LeMahieu et al., 2017; Reed et al., 2025), we find that network participants develop meaningful improvement science capacity through collaborative, inquiry-based participation in the STEM PUSH Network. They not only acquire conceptual understanding of improvement methods but also learn to apply measurement and evaluation tools to inform change in their local contexts.
Beyond offering empirical confirmation to the theoretical models, our study also substantiates and extends existing qualitative analyses that have documented how iterative inquiry, shared problem-solving, and collaborative interactions build educators' relational and improvement science capacity (Cannata et al., 2017; Dolle et al., 2013; Stosich, 2024; Russell et al., 2021). By leveraging robust, replicable statistical models, we extend the evidence base from primarily qualitative and static accounts to provide quantitative evidence of how these learning processes unfold and are related to valued outcomes over time within the STEM PUSH Network.
Most importantly, our study advances the NIC literature by identifying and quantifying two distinct mechanisms underlying educators' capacity development—deliberate practice and social learning—that have been theorized but not previously distinguished empirically. Prior research has emphasized the importance of both structured inquiry and the social dimensions of NIC participation (e.g., Bryk et al., 2015; Kallio and Halverson, 2020; Stone-Johnson and Hayes, 2021), but it often implies that these two facets contribute in parallel to the development of both confidence and skill, as did our own hypotheses. Instead, our findings reveal that they operate through differentiated pathways: deliberate, feedback-rich engagement in PDSA cycles primarily builds applied skill, whereas collaborative exchange and peer learning are most strongly tied to confidence. By clarifying this distinction, this study is the first to add this critical nuance to the understanding of how improvement capacity is cultivated in a NIC context. This insight also carries practical implications, which we discuss in more detail in the following section.
4.1 Implications for research and practice
For practitioners and network facilitators, our findings encourage future NIC organizers and leaders to be more intentional about how different components of network participation can be designed to foster distinct dimensions of professional learning. Improvement groups with PDSA inquiry cycles can serve as structured environments for guided practice, where participants engage in clear, scaffolded inquiry processes—designing measurement and data collection in their own contexts, receiving targeted feedback, and refining their strategies. These spaces are where improvement skills are cultivated and developed. At the same time, collaborative exchange in the broader network context, such as regular monthly meetings, convenings, and peer exchange, can play a crucial role in building confidence, particularly for members who may not have the resources or capacity to engage intensively in structured improvement work. For network organizers, cultivating multiple pathways for participation means creating opportunities for members to learn and grow through both structured group work and more flexible, informal forms of engagement.
Our results also suggest that practitioner growth in the network may not follow a linear path. As participants become more deeply engaged, they may begin to reassess their own confidence in light of new awareness. For instance, what may appear as a drop in confidence could actually signal the emergence of more realistic self-assessment and a deeper understanding of the work. Facilitators should anticipate and normalize this pattern, offering reassurance and scaffolding to sustain engagement through this critical developmental shift.
Importantly, our findings highlight the tension between perceived and demonstrated learning. Confidence can emerge through relational experiences and exposure to others' work, while skill requires deeper engagement with complex tools and routines. This distinction suggests that relying solely on self-reported measures may present an incomplete picture of practitioner capacity. NICs could incorporate complementary assessments of learning—such as peer ratings or reflective exercises that document network members' learning processes—to capture a fuller range of continuous improvement development. At the same time, it is important to recognize that program goals and contexts will shape which forms of assessment are most appropriate.
Additionally, developing effective measures to evaluate IS skills can be challenging in networked improvement communities. When developing evaluation tools, network leaders should anticipate not only their immediate use but also their potential for future analyses, including the feasibility of quantitative modeling and the labor required for implementation. In our case, the original rubric for assessing PDSA submissions was nuanced but ultimately proved overly complex and resource-intensive to score. Although we were able to adapt this measure and make it suitable for analysis, building evaluation tools with such foresight from the outset could yield more efficient processes and stronger interpretability for future networks.
Moreover, measuring IS skills is especially difficult in networks like STEM PUSH, where the goals of improvement are often more complex and context-specific than in traditional instructional settings. Unlike networks centered on measurable academic outcomes, NICs such as STEM PUSH that focus on out-of-school STEM programs must account for variation in program models, local challenges, and the types of data that matter to communities. These realities make it harder to apply standardized rubrics or define uniform success criteria. Our experience suggests a need for better, more consistent ways of capturing applied improvement science skills. Such methods should be practical and honor the diverse work practitioners bring to the table. As NICs continue to evolve, a shared effort to improve how we define and measure improvement skills rooted in authentic practice could greatly enhance both evaluation of network health and educators' professional growth.
Finally, these insights reinforce the need for hub leaders to design inclusive learning structures that create access to both relational and experiential learning, particularly for those in peripheral or emerging roles. Supporting practitioners across varying levels of experience means ensuring that everyone–not just those in formal leadership positions–has meaningful opportunities to practice, reflect, and contribute. By intentionally fostering both psychological safety and structured challenge, NICs can cultivate environments where all participants grow in their ability and confidence to lead improvement work.
4.2 Limitations
There are several limitations to our study. One concern is self-report bias, particularly in the confidence outcome. Participants may overestimate their ability to use improvement science tools, especially in contexts where improvement science confidence could be conflated with social competence or leadership. For instance, we observed that some highly visible individuals, who were less engaged in structured improvement groups but active in convenings and cross-program discussions, nonetheless reported high confidence. This pattern may suggest an inflation of confidence due to personality, perceived status, or social visibility within the network. Future networks and studies should therefore consider developing measures that distinguish between confidence rooted in social capital and confidence grounded in the use of improvement science tools and knowledge, in order to better capture the multiple dimensions of improvement science capacity.
A further limitation concerns the measurement of improvement science skills through cycle documentation. Written documentation such as PDSA does not always fully capture participants' understanding or the quality of their improvement work. For example, in STEM PUSH we observed certain instances where network members shared thoughtful insights in improvement group meetings that were not reflected in their submitted change summaries. While the hub team streamlined documentation requirements to reduce burden for network participants—focusing on a single change summary rather than multiple artifacts—this approach inevitably captured a partial view of practitioners' applied skills. This challenge highlights the difficulty of balancing meaningful assessment with feasibility in dynamic networks, where members vary in capacity and time to produce high-quality artifacts. Future work may explore more robust approaches for capturing skill development, particularly in mature or long-running NICs with established routines, or in networks where improvement results are more narrowly defined and measurable. Additionally, future research could build on this work by incorporating more fine-grained measures of role within group products–for example, by asking participants to disclose their contributions and responsibilities, and then weighting individual contributions or distinguishing among types of contribution such as idea generation versus data collection. Such approaches may provide deeper insight into how collaborative participation is associated with the improvement science skills demonstrated in the work.
Another limitation pertaining to the analysis of IS skills is the inherent level mismatch between the predictors and the outcome. Specifically, while our key predictors (e.g., social learning, perceived usefulness, engagement) are measured at the individual level, the outcome variable—the PDSA submission score—is assessed at the group level, and shared by all members. This mismatch reflects the reality of group work and collaborative inquiry, where not all individuals contribute equally, and where individual experiences or dispositions may not directly translate into group-level outcomes. For instance, some individuals may approach the assignment with greater rigor, take on leadership or coordinating roles, or contribute more extensively to the quality of the product, while others may be less involved. As a result, individual engagement or learning may not uniformly map onto the group score, and variation in contribution intensity is not captured by the outcome variable. This may explain why certain theoretically relevant predictors—such as individual Improvement Science engagement—do not strongly predict group performance in the model. It is possible that in highly collaborative groups, the efforts of a few highly engaged individuals can carry the group's performance, thereby masking the influence of others' lower engagement levels. Conversely, in less coordinated groups, even high engagement by one or two members may not translate into strong group outcomes. We chose not to average individual-level predictors at the group level, as this would assume equal contribution across members, and potentially obscure meaningful variance from key contributors. Instead, we retain individual-level predictors and interpret the results with caution, acknowledging that their association with the group outcome reflects a partial and asymmetric mapping of individual experiences onto shared performance.
5 Conclusion
Our study moves the field beyond theoretical and qualitative evidence by offering the first quantitative and longitudinal evidence that participation in a Networked Improvement Community contributes to educators' improvement science confidence and skills. Through studying the STEM PUSH Network, we found that practitioners' improvement science confidence develops primarily through social learning and collaborative interactions within the network, whereas applied skill grows through deliberate, hands-on practice within structured improvement groups. These insights clarify how NIC participation builds practitioners' long-term improvement capacity and offer implications for designing future networks that intentionally cultivate both confidence and competence. As NICs continue to gain traction as a model for scalable educational change, these findings underscore the importance of building network structures that recognize and support the varied ways practitioners learn. Supporting both technical mastery and reflective, socially embedded learning will be essential to fostering educators' long-term capacity to lead improvement efforts in complex, real-world settings.
Looking forward, future research should further unpack the temporal dynamics between confidence and skill acquisition. For example, sequential modeling could help clarify whether increases in confidence further deepen engagement in improvement work. Researchers could also test whether participants with deeper engagement, in turn, adjust and calibrate their confidence as they confront the complexity and challenges of improvement work. Additionally, future studies should examine how structural features of NICs—such as facilitation practices, peer feedback mechanisms, or leadership roles—influence the pathways through which learning unfolds. Lastly, future research should look at multiple improvement networks simultaneously to examine patterns across diverse contexts. This could provide more robust and generalizable evidence of how different participation structures shape educator learning and capacity building.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Author contributions
YL: Validation, Methodology, Conceptualization, Writing – review & editing, Writing – original draft, Data curation, Investigation, Visualization, Formal analysis. JI: Supervision, Funding acquisition, Writing – review & editing, Conceptualization, Project administration, Methodology, Validation, Investigation. JS: Methodology, Validation, Conceptualization, Investigation, Writing – review & editing. DL: Writing – review & editing, Conceptualization, Investigation, Methodology, Validation, Data curation. CM: Writing – review & editing, Investigation, Data curation, Validation, Methodology, Conceptualization. AL: Writing – review & editing, Supervision, Funding acquisition.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This study was conducted with the support of the National Science Foundation's Eddie Bernice Johnson Inclusion across the Nation of Communities of Learners of Underrepresented Discoverers in Engineering and Science (Award # 1930990). The content is a result of the research conducted by the authors and does not reflect the position of the National Science Foundation. Funds for open access publication fees will come from investigator research funds.
Acknowledgments
We are grateful for the partnership and collaboration of the precollege STEM programs that make up the membership of the STEM PUSH Network. Their persistent and deep commitment to dismantling systemic barriers for Black, Latina/o/e, and Indigenous students and their investment in the improvement of their own programs are at the core of the work shared in this article. In addition to the authors, Talia Stol and Disan Davis contributed to the overall conceptual design and implementation of the network and PDSA scoring. Colleagues Lori Delale O'Connor and David Boone contributed to network implementation and PDSA scoring.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that Gen AI was used in the creation of this manuscript. Gen AI was used to support language-related revisions, including improving sentence flow, polishing grammar and syntax, enhancing clarity and ensuring readability of writing. It was not used to generate original ideas, conduct analyses, or draw conclusions. All intellectual contributions, interpretations, and claims are solely those of the author(s).
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/feduc.2025.1691044/full#supplementary-material
References
Anderson, E., Cunningham, K. M., and Richardson, J. W. (2024). Framework for implementing improvement science in a school district to support institutionalized improvement. Educ. Sci. 14:770. doi: 10.3390/educsci14070770
Bonney, E. N., Yurkofsky, M. M., and Capello, S. A. (2025). EdD students' sensemaking of improvement science as a tool for change in education. J. Res. Leadersh. Educ. 20, 3–31. doi: 10.1177/19427751231221032
Bragg, D., and McCambly, H. (2018). Equity-Minded Change Leadership. Seattle, WA: Bragg & Associates, Inc.
Bryk, A. S., Gomez, L. M., Grunow, A., and LeMahieu, P. G. (2015). Learning to Improve: How America's Schools Can Get Better at Getting Better. Cambridge, MA: Harvard Education Press.
Bryk, A. S., Li, A. Y. L., Luppescu, S., and Bui, M. A. (2025). Examining the validity of practical measures of improvement network health and development. Peabody J. Educ. 100, 28–47. doi: 10.1080/0161956X.2025.2444840
Bush-Mecenas, S. (2022). Racial equity in continuous improvement: A review of equity issues in improvement science. Educ. Policy 36, 215–244. doi: 10.1177/08959048211000676.
Cannata, M., Cohen-Vogel, L., and Sorum, M. (2017). Partnering for improvement: Improvement communities and their role in scale up. Peabody J. Educ. 92, 569–588. doi: 10.1080/0161956X.2017.1368633
Chong, W. H., and Kong, C. A. (2012). Teacher collaborative learning and teacher self-efficacy: the case of lesson study. J. Exp. Educ. 80, 263–283. doi: 10.1080/00220973.2011.596854
Cohen-Vogel, L., Tichnor-Wagner, A., Allen, D., Harrison, C., Kainz, K., Socol, A. R., et al. (2015). Implementing educational innovations at scale: transforming researchers into continuous improvement scientists. Educ. Policy 29, 257–277. doi: 10.1177/0895904814560886
Crow, R., Hinnant-Crawford, B. N., and Spaulding, D. T. (2019). The Educational Leader's Guide to Improvement Science: Data, Design and Cases for Reflection. Cambridge, MA: Harvard Education Press.
De Simone, J. J. (2020). The roles of collaborative professional development, self-efficacy, and positive affect in encouraging educator data use to aid student learning. Teach. Dev. 24, 443–465. doi: 10.1080/13664530.2020.1780302
Dolle, J. R., Gomez, L. M., Russell, J. L., and Bryk, A. S. (2013). More than a network: Building professional communities for educational improvement. Nat. Soc. Study Educ. Yearbook 112, 443–463. doi: 10.1177/016146811311501413
Edmondson, A. (1999). Psychological safety and learning behavior in work teams. Adm. Sci. Q. 44, 350–383. doi: 10.2307/2666999
Edmondson, A. C., and Lei, Z. (2014). Psychological safety: the history, renaissance, and future of an interpersonal construct. Annu. Rev. Organ. Psychol. Organ. Behav. 1, 23–43. doi: 10.1146/annurev-orgpsych-031413-091305
Ericsson, K. A., Krampe, R. T., and Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychol. Rev. 100, 363–406. doi: 10.1037//0033-295X.100.3.363
Feygin, A., Nolan, L., Hickling, A., and Friedman, L. (2020). Evidence for Networked Improvement Communities.
Gallagher, H. A., and Cottingham, B. W. (2019). Learning and practicing continuous improvement: Lessons from the CORE Districts. Palo Alto: Policy Analysis for California Education. Retrieved from: https://edpolicyinca.org/sites/default/files/%20R_Gallagher_Oct19.pdf (Accessed November 19, 2025).
Grossman, P., Hammerness, K., and McDonald, M. (2009). Redefining teaching, re-imagining teacher education. Teach. Teach. 15, 273–289. doi: 10.1080/13540600902875340
Gutiérrez, K. D., and Penuel, W. R. (2014). Relevance to practice as a criterion for rigor. Educ. Res. 43, 19–23. doi: 10.3102/0013189X13520289
Hinnant-Crawford, B. N. (2020). Improvement Science in Education: A Primer. Gorham, ME: Myers Education Press.
Horn, I. S., and Little, J. W. (2010). Attending to problems of practice: routines and resources for professional learning in teachers' workplace interactions. Am. Educ. Res. J. 47, 181–217. doi: 10.3102/0002831209345158
Iriti, J., Delale-O'Connor, L., Sherer, J. Z., Stol, T., Davis, D., Matthis, C., et al. (2024). Adapting improvement science tools and routines to build racial equity in out-of-school time stem spaces. Front. Educ. 9:1434813. doi: 10.3389/feduc.2024.1434813
Jacobs, J., Burns, R. W., Haraf, S., and McCorvey, J. (2024). Identifying key features of equity-centered professional learning. J. School Leadersh. 34, 122–150. doi: 10.1177/10526846231187567
Jones, M. (2017). “Improving a school-based science education task using critical reflective practice,” in Reflective Theory and Practice in Teacher Education (Singapore: Springer Singapore), 179–204.
Joshi, E., Redding, C., and Cannata, M. (2021). In the NIC of time: how sustainable are networked improvement communities? Am. J. Educ. 127, 369–397. doi: 10.1086/713826
Kallio, J. M., and Halverson, R. R. (2020). Designing for trust-building interactions in the initiation of a networked improvement community. Front. Educ. 4:154. doi: 10.3389/feduc.2019.00154
Katz, S., Earl, L. M., and Jaafar, S. B. (2009). Building and Connecting Learning Communities: The Power of Networks for School Improvement. Thousand Oaks, CA: Corwin Press.
Kelley, T. R., Knowles, J. G., Holland, J. D., and Han, J. (2020). Increasing high school teachers' self-efficacy for integrated STEM instruction through a collaborative community of practice. Int. J. STEM Educ. 7:14. doi: 10.1186/s40594-020-00211-w
Kruger, J., and Dunning, D. (1999). Unskilled and unaware of it: how difficulties in recognizing one's own incompetence lead to inflated self-assessments. J. Pers. Soc. Psychol. 77:1121. doi: 10.1037//0022-3514.77.6.1121
Kruse, S. D., and Louis, K. S. (1993). “An emerging framework for analyzing school-based professional community,” in American Educational Research Association, Paper Presented at the Annual Meeting of the American Educational Research Association (Atlanta, GA). Retrieved from: https://files.eric.ed.gov/fulltext/ED358537.pdf (Accessed November 19, 2025).
Langley, G. J., Moen, R. D., Nolan, K. M., Nolan, T. W., Norman, C. L., and Provost, L. P. (2009). The Improvement Guide: a Practical Approach to Enhancing Organizational Performance. Singapore: John Wiley & Sons.
Lave, J., and Wenger, E. (1991). Situated Learning: Legitimate Peripheral Participation. Cambridge: Cambridge University Press.
LeMahieu, P. G., Grunow, A., Baker, L., Nordstrum, L. E., and Gomez, L. M. (2017). Networked improvement communities: the discipline of improvement science meets the power of networks. Qual. Assur. Educ. 25, 5–25. doi: 10.1108/QAE-12-2016-0084
Lewis, C. (2015). What is improvement science? Do we need it in education? Educ. Res. 44, 54–61. doi: 10.3102/0013189X15570388
Martin, J. J., Mccaughtry, N., Hodges-Kulinna, P., and Cothran, D. (2008). The influences of professional development on teachers' self-efficacy toward educational change. Phys. Educ. Sport Pedag. 13, 171–190. doi: 10.1080/17408980701345683
Nielsen, M. (2012). Reinventing Discovery: the New Era of Networked Science. Princeton, NJ: Princeton University Press.
Noble, C. E., Amey, M. J., Colón, L. A., Conroy, J., De Cheke Qualls, A., Deonauth, K., et al. (2021). Building a networked improvement community: lessons in organizing to promote diversity, equity, and inclusion in science, technology, engineering, and mathematics. Front. Psychol. 12:732347. doi: 10.3389/fpsyg.2021.732347
NSF INCLUDES (2018). NSF INCLUDES: Report to the Nation. Available online at: https://eric.ed.gov/?q=NSF+INCLUDES+(2018).+Report+to+the+Nation.&id=ED661738 (Accessed November 19, 2025).
NSF INCLUDES (2020). Special Report to the Nation II – Building Connections: shared vision, partnerships, goals and metrics, leadership and Communication, expansion, sustainability, and Scale. Available online at: https://eric.ed.gov/?q=NSF+INCLUDES+(2020).+special+Report+to+the+Nation.&id=ED661711 (Accessed November 19, 2025).
Park, S., Hironaka, S., Carver, P., and Nordstrum, L. (2013). Continuous Improvement in Education. Palo Alto, CA: Carnegie Foundation for the Advancement of Teaching.
Perlman, H., Bryk, A. S., and Russell, J. L. (2025). Measuring educators' perceived benefits of participation in educational improvement networks. Peabody J. Educ. 100, 82–99. doi: 10.1080/0161956X.2025.2444844
Peurach, D. J., Glazer, J. L., and Winchell Lenhoff, S. (2016). The developmental evaluation of school improvement networks. Educ. Policy 30, 606–648. doi: 10.1177/0895904814557592
Peurach, D. J., Jones, E. S., Duff, M., Sherer, J. Z., and Matthis, C. (2025). The practice and contexts of hub and district leadership: New directions in research on educational improvement networks. Peabody J. Educ. 100, 117–132. doi: 10.1080/0161956X.2025.2444846
Peurach, D. J., Russell, J. L., Sherer, J. Z., McMahon, K., and Parkerson, E. (2020). The Work and Complications of Hub Leadership in Educational Improvement Networks.
Poellhuber, B., Chomienne, M., and Karsenti, T. (2008). The effect of peer collaboration and collaborative learning on self-efficacy and persistence in a learner-paced continuous intake model. Int. J. E-Learn. Dist. Educ. 22, 41–62. Retrieved from: https://www.ijede.ca/index.php/jde/article/view/451 (Accessed November 19, 2025).
Prenger, R., Poortman, C. L., and Handelzalts, A. (2019). The effects of networked professional learning communities. J. Teach. Educ. 70, 441–452. doi: 10.1177/0022487117753574
Proger, A. R., Bhatt, M. P., Cirks, V., and Gurke, D. (2017). Establishing and Sustaining Networked Improvement Communities: Lessons from Michigan and Minnesota. Minnesota, OH: Regional Educational Laboratory Midwest.
Reed, J. E., Antonacci, G., Armstrong, N., Baker, G. R., Crowe, S., Harenstam, K. P., et al. (2025). What is improvement science, and what makes it different? an outline of the field and its frontiers. Front. Health Serv. 4:1454658. doi: 10.3389/frhs.2024.1454658
Reed, J. E., Davey, N., and Woodcock, T. (2016). The foundations of quality improvement science. Future Hosp. J. 3:199. doi: 10.7861/futurehosp.3-3-199
Richardson, A., and Rees, J. (2025). Validation and application of a tool to assess self-confidence to do improvement. BMJ Open Qual. 14:1. doi: 10.1136/bmjoq-2024-003130
Russell, J. L., Bryk, A. S., Dolle, J. R., Gomez, L. M., Lemahieu, P. G., and Grunow, A. (2017). A framework for the initiation of networked improvement communities. Teach. Coll. Rec. 119, 1–36. doi: 10.1177/016146811711900501
Russell, J. L., Correnti, R., Stein, M. K., Bill, V., Hannan, M., and Schwartz, N. (2020). Learning from adaptation to support instructional improvement at scale: Understanding coach adaptation in the TN mathematics coaching project. Am. Educ. Res. J. 57, 148–187. doi: 10.3102/0002831219854050
Russell, J. L., Sherer, J. Z., Iriti, J., and Long, C. (2021). The Better Math Teaching Network Year One: Developmental Evaluation Report. Quincy, MA: Nellie Mae Education Foundation. ERIC.
Scardamalia, M. (2002). Collective cognitive responsibility for the advancement of knowledge. Liberal educ. Knowl. Soc. 97, 67–98. Available online at: https://web.eecs.umich.edu/~mjguz/csl/home.cc.gatech.edu/allison/uploads/4/scardamalia2002.pdf (Accessed November 19, 2025).
Star, S. L., and Griesemer, J. R. (1989). Institutional ecology, ‘translations’ and boundary objects: amateurs and professionals in Berkeley's Museum of Vertebrate Zoology, 1907-39. Soc. Stud. Sci. 19, 387–420. doi: 10.1177/030631289019003001
Stone-Johnson, C., and Hayes, S. (2021). Using improvement science to (re) design leadership preparation: exploring curriculum change across five university programs. J. Res. Leadership Educ. 16, 339–359. doi: 10.1177/1942775120933935
Stosich, E. L. (2024). Working toward transformation: educational leaders' use of continuous improvement to advance equity. Front. Educ. 9:1430976. doi: 10.3389/feduc.2024.1430976
Weißenrieder, J., Roesken-Winter, B., Schueler, S., Binner, E., and Blömeke, S. (2015). Scaling CPD through professional learning communities: Development of teachers' self-efficacy in relation to collaboration. ZDM 47, 27–38. doi: 10.1007/s11858-015-0673-8
Wenger, E. (1998). Communities of Practice: Learning, Meaning, and Identity. Cambridge: Cambridge University Press.
Wenger, E. (2000). Communities of practice and social learning systems. Organization 7, 225–246. doi: 10.1177/135050840072002
Keywords: networked improvement community, improvement science, professional learning community, research practice partnership, improvement science skills, improvement science confidence, social learning
Citation: Lin Y, Iriti J, Sherer JZ, Lowry D, Matthis C and Legg AS (2025) Building educators' confidence and skill in improvement science: a longitudinal study of the STEM PUSH Network. Front. Educ. 10:1691044. doi: 10.3389/feduc.2025.1691044
Received: 22 August 2025; Revised: 05 November 2025;
Accepted: 11 November 2025; Published: 02 December 2025.
Edited by:
Simona Sava, West University of Timişoara, Romania
Reviewed by:
Rachel Funk, University of Nebraska-Lincoln, United States
Emily Weiss, University of California, Berkeley, United States
Copyright © 2025 Lin, Iriti, Sherer, Lowry, Matthis and Legg. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Yiwen Lin, yil552@pitt.edu