How do participants collaborate during an online hackathon? An empirical, quantitative study of communication traces

Starting as niche programming events, hackathons have since become a popular form of collaboration. Events are organized in various domains across the globe, aiming to foster innovation and learning, create and expand communities and tackle civic and environmental issues. While research around such events has grown in recent years, most studies are based on observations of a few individuals during an event and on post-hoc interviews during which participants report their experiences. Such studies are helpful but somewhat limited in that they do not allow us to study how individuals communicate at scale using technology. To address this gap, we conducted an archival analysis of communication traces of teams during a 48-h event. Our findings indicate that teams scaffold their communication around the design of an event, influenced by milestones set by the organizers. The officially selected communication platform's main use was to organize the event and the teams and to facilitate contact between participants and hackathon officials. We further investigated the balance of intra-team communication on the given platform and the potential use of additional communication tools.


1. Introduction
Starting as intensive, often competitive, programming events during the early 2000s, hackathons have since become a popular form of collaboration in various contexts (Taylor and Clarke, 2018). During such events, participants form teams and collaborate on a project of interest to them. Hackathons are organized for various aims, which can include developing (innovative) technology (Briscoe and Mulligan, 2014), fostering learning (Gama et al., 2018; Porras et al., 2019), tackling civic and environmental issues (Hou and Wang, 2017; Hope et al., 2019), and initiating or expanding communities (Huppenkothen et al., 2018; Nolte et al., 2020b).
To answer these questions, we conducted an archival analysis of communication traces of teams during a 48-h hackathon.
The contribution of this paper is twofold. First, we provide an in-depth insight into how communication unfolds during an intensive, time-bounded collaboration event. Second, we explored an approach that can potentially be utilized to study future events and develop means for organizers to monitor their events and direct resources to struggling teams. We envision that documenting how hackathon participants communicate during these events and the tools they choose to facilitate their communication is an important step that will provide fundamental knowledge regarding collaborative practices, successful outcomes, and potentially impromptu learning opportunities (Chounta et al., 2017).

2. Background
2.1. Collaboration in hackathons
Hackathons are collaborative events at their core (Taylor and Clarke, 2018; Falk Olesen and Halskov, 2020). Collaboration during such events, however, does not involve all participants collaborating with one another. Instead, collaboration mainly takes place in small teams typically formed before or at the beginning of a hackathon (Trainer et al., 2016). Team members often have diverse backgrounds and interests (DePasse et al., 2014), and they might or might not be familiar with each other before an event (Pe-Than et al., 2022). Teams often remain stable for the duration of an event, but there might be fluctuation in that individuals can leave or join a team while a hackathon is ongoing (Day et al., 2017). Apart from individuals potentially participating in multiple teams, teams are independent of one another, and every team organizes its own collaboration (Jones et al., 2015; Trainer et al., 2016). Collaboration in these teams revolves around a self-chosen project idea (Briscoe and Mulligan, 2014). During a hackathon, teams commonly go through a process similar to design thinking in that they follow a sequence of divergent and convergent phases to develop a presentable artifact (Gama et al., 2022). During this process, one or multiple team members typically assume a position similar to a project or team leader, while others assume roles more akin to designers or developers (Gama, 2017; Karlsen and Løvlie, 2017).
Teams are independent, but they still participate in the same event, which provides a scaffolding for how participants engage with each other. This scaffolding includes a hackathon theme to which projects commonly need to be related as well as approaches for ideation and team formation at the beginning of an event (Nolte et al., 2020c) and presentations of team outcomes at the end (Komssi et al., 2014). These aspects, among others, can affect the way individuals engage with each other and with their project (Olesen et al., 2018). Once teams have been formed, organizers often also deploy means of facilitation to keep them on track (Taylor and Clarke, 2018). These can include different approaches such as regular checkpoints held at different times during an event (Brereton, 2020; Powell et al., 2021) or hackathon staff such as mentors keeping close contact with teams and guiding them (Nolte et al., 2020b). While scaffolding during in-person events can be flexible, online events require a more rigid structure (Powell et al., 2021). This structure commonly revolves around communication channels that are set up by hackathon organizers and that often include different channels for asynchronous and synchronous communication (Bertello et al., 2022). Commonly used tools are Zoom (Braune et al., 2021) for synchronous communication and Slack (Bertello et al., 2022) or Discord (Fowler et al., 2020) for asynchronous communication. Recent work in the context of online hackathons has, however, revealed that teams often use different tools than the ones provided by event organizers for their internal communication (Mendes et al., 2022). Moreover, the related work shows that teams exhibit very different ways of communicating and organizing their work. While some mainly organize via text with occasional synchronous sessions in between, others remain on permanent Zoom calls (Mendes et al., 2022).

2.2. Collaboration using Slack
Slack (https://slack.com/) is a communication platform that aims to facilitate and support workplace interactions by offering messaging features (for example, persistent chat and direct messaging), video call support, and integration of external services and bots. Slack was initially adopted by software engineering and development communities for a wide range of purposes: personal, team-wide, and community-wide (Storey et al., 2014; Lin et al., 2016).
Subsequently, the use of Slack extended to support teams working together in diverse contexts, among others, education, civic engagement, journalism, and IT enterprises. Along these lines, the use of Slack as a collaboration tool was studied mainly from three perspectives:
• Communication patterns. Regarding communication and social interactions, research showed that communication in Slack channels mirrors the dynamics of in-person meetings. In other words, online discussions tend to be dominated by the same people who attend in-person events and meetings (Azarova et al., 2022). At the same time, this can potentially lead to unbalanced communication in that few users are responsible for most of the messages exchanged via the platform (Stray et al., 2019). Furthermore, Slack can increase informal communication (Stray and Moe, 2020), while Slack users also favor direct messaging and informal Slack channels over public ones (Vazquez et al., 2020).
• Coordination. In terms of coordination, messaging activity is driven by important milestones, events, and deadlines related to the users' activity, whether Slack is used in an educational (Van de Zande and Wallace, 2018), work-related (Azarova et al., 2022), or civic-engagement (McInnis et al., 2018) context. This pattern of teams' coordination around milestones was also reported for other online communication platforms and collaborative settings such as maker spaces (Chounta et al., 2017).
• Collaborative outcomes. Studies suggest that Slack can support engagement in different domains and thus potentially impact outcomes of collaborative practices. For example, Slack has been used to support student engagement with a positive impact on course outcomes (Vazquez et al., 2020). Similarly, Tuhkala and Kärkkäinen (2018) suggested that students using Slack were able to solve practical and technical problems via platform communication. The use of Slack appears to promote transparency and awareness among distributed teams, which can be critical aspects of effectiveness and efficiency (Stray and Moe, 2020). Finally, related research explored the relationship between communication styles in Slack channels and performance, suggesting that certain communication features (such as active time span, that is, the temporal duration from the first to the last message) may be predictive of team success (Wang et al., 2022).

2.3. Related research
Studies in hackathons or hackathon-related contexts commonly focus on teams as their unit of analysis, considering how teams communicate or how their communication affects their experience. This can be expected since collaboration in small teams is at the core of most hackathons (Trainer et al., 2016). However, such approaches have inherent limitations. First, the findings are based on the perception of team members in the case of post-hoc interviews (for example, Nolte et al., 2020b), on the perception of researchers in the case of observational studies (for example, Olesen et al., 2018), or on both (for example, Pe-Than et al., 2022). Moreover, insights are limited by the ability of researchers to observe team processes and document them, which can be difficult because teams might split into subteams, or by the ability of participants to accurately remember, after an event, how their collaboration unfolded. Second, these approaches do not scale well. Larger hackathons that are attended by a large number of teams would require a large team of researchers to be able to conduct the required observations and interviews. Moreover, observations would not even be possible, for example, in an online context. Third, insights gained from these approaches are limited to post-event analyses. Gaining insights during an event about how teams collaborate is, however, crucial for organizers to be able to provide appropriate support to struggling teams (Powell et al., 2021). The recent proliferation of online hackathons, especially during the COVID-19 pandemic (Vermicelli et al., 2021), provides an opportunity to study collaboration as it unfolds due to the availability of communication traces, for example, in the form of chat messages. Studies in other areas such as psychology (Arrow et al., 2005) and organizational sciences (Anders, 2016) have shown the viability of such approaches to not only provide further evidence for existing insights but also to discover patterns and advance our understanding of team communication. We aim to contribute to research on communication during hackathons using a data-informed approach.

3. Empirical method
To address our main study objective, we used a mixed-method approach combining quantitative (descriptive analysis, time series, and statistical testing) and qualitative methods (content analysis of written texts). This approach is appropriate because we aimed to study how participants used communication tools that were provided by hackathon organizers (RQ1), which tools they mentioned in addition to those (RQ3), and to discover usage patterns (RQ2). In the following, we will elaborate on our study setting and data collection (Section 3.1). Then we will describe the data we collected in detail (Section 3.2) and discuss our analysis approach (Section 3.3).

3.1. Hackathon setting and data collection
The hackathon we studied took place online in April 2020 and was organized independently of this paper. It focused on developing ideas for technology use in crisis response. The organizers used Slack, Facebook live, Zoom, and YouTube for this event.
Slack was used as a workspace to facilitate communication with event officials and participants. The workspace was organized into channels of two types: general-purpose channels for organization, introduction, and finding team members or mentors, and individual per-team channels for intra-team and team-specific discourse. All of these channels were set to public, so theoretically everyone could join, read, and write on every channel. The general-purpose channels can be divided into the following subcategories: channels to find team members, mentoring channels, helpdesks, channels that housed a subset of teams (batch channels), and miscellaneous more general channels. The purpose of the different channels is described further in Section 3.2.2. Since the hackathon was mainly aimed at participants in one specific country, the milestones followed the local time zone, and all times given here refer to that time zone as well. The timeline of the hackathon is visualized in Figure 1. The event started with a Facebook live opening session on the first day at 15:30. It continued with checkpoints on Zoom at 18:00 on the same day and at 10:00 and 18:30 the next day. The submission deadline for the final product on YouTube was at 9:00 on the third day, followed by a final webinar at 19:30 on Zoom.
To explore how hackathon participants communicated via Slack, we extracted the Slack data after the end of the hackathon. This dataset consists of one file per channel in JSON format. Each file contains channel metadata and a list of the messages sent in the channel, each message being a JSON object. Each message contains a message text, a timestamp, and a user ID. In addition to user-created messages, the list also includes system messages, for example, indicating that a user has joined the channel. The channel metadata contains the name, topic, and purpose of a channel, information about who created the channel, and a list of channel members.
The dataset before filtering consisted of 317 channels with 1,286 users and 14,624 messages. Of those channels, 26 are general-purpose channels, and 291 are team channels.
From the original dataset, we removed channels without interaction, that is, channels with fewer than two users who had actively written messages. Messages created by the system (i.e., Slack), such as "[user] has joined the channel", were not taken into account. Additionally, we removed channels that had fewer than three messages in total: two messages would account for one message per user, and one could argue that this would not signify interaction or communication between the users. Finally, we removed three channels that were named "test," "testest," and "test2," respectively.
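The filtering rules above can be illustrated with a short Python sketch. It is not the authors' code: the `subtype` values used to recognize system messages, the in-memory layout (channel name mapped to a list of message objects), and the toy data are all assumptions made for illustration.

```python
# Assumption: join/leave notices are marked with these "subtype" values.
SYSTEM_SUBTYPES = {"channel_join", "channel_leave"}
# Channel names that the text reports as removed by name.
TEST_CHANNELS = {"test", "testest", "test2"}


def user_messages(messages):
    """Drop system messages; keep messages actively written by users."""
    return [m for m in messages if m.get("subtype") not in SYSTEM_SUBTYPES]


def keep_channel(name, messages):
    """Apply the three filter rules described in the text to one channel."""
    written = user_messages(messages)
    active_users = {m["user"] for m in written}
    return (
        name not in TEST_CHANNELS
        and len(active_users) >= 2   # at least two users wrote something
        and len(written) >= 3        # at least three user messages in total
    )


# Hypothetical toy data: channel name -> list of message objects.
channels = {
    "team-a": [
        {"user": "U1", "subtype": "channel_join", "text": "joined"},
        {"user": "U1", "text": "hi"},
        {"user": "U2", "text": "hello"},
        {"user": "U1", "text": "shall we start?"},
    ],
    "team-b": [{"user": "U1", "text": "anyone?"}] * 3,   # only one active user
    "team-c": [{"user": "U1", "text": "a"}, {"user": "U2", "text": "b"}],
    "test": [{"user": "U1", "text": "a"}, {"user": "U2", "text": "b"},
             {"user": "U1", "text": "c"}],
}
kept = [name for name, msgs in channels.items() if keep_channel(name, msgs)]
print(kept)
```

In this sketch the three-message threshold is applied after system messages are discarded, which is one reading of the description; counting system messages toward the threshold would keep more channels.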

3.2. Data description
After filtering, the dataset consisted of 253 channels with 1,212 users and 14,324 messages. Next, we describe the channels, user roles, and messages contained in the dataset.

3.2.1. User roles in the Slack data
The dataset did not contain information about assigned roles for the different users, but we were informed by the hackathon organizers about the possible roles users could have. This list consisted of "lead mentor," "mentor," "organizer," "participant," "project manager," and "support mentor". The "participant" role describes users who participate in the hackathon in a non-official capacity, in contrast to all other roles. The "organizer" role is reserved for users who organized the hackathon event as a whole. "Project manager," "lead mentor," and "support mentor" were official roles in supervising and organizing their assigned teams; they were also responsible for running the checkpoints during the hackathon. The "mentors" were users who supported teams with their respective areas of expertise, such as design, management, or healthcare. In addition, we used messages sent by users to classify them (for example, one person referring to themselves as a "mentor"). In this classification, the role of "participant" serves as the default, so it is possible that users who are tagged as "participants" may, in reality, have been mentors or organizers. The opposite is not possible (for example, users classified as mentors who were, in reality, participants). Consequently, users who were not classified as participants ("non-participants") were event organizers, project managers, lead mentors, support mentors, or mentors. Table 1 provides an overview of the classification and distribution of users per role.

FIGURE 1
Timeline of milestones during the hackathon.

3.2.2. Channel types
Additionally, we manually classified the Slack channels according to their main purpose. Out of the 253 channels, 228 were team channels, and 25 were general-purpose channels. Team channels were set up for each team to support team communication and allow team-specific support from mentors. General-purpose channels were used to facilitate communication between hackathon officials (i.e., organizers or mentors) and participants. The general-purpose channels can be further divided according to their specific purpose. The hackathon teams were assigned to different batches so that the mentoring was easier to organize. There were 8 batch channels in which project managers, lead mentors, and support mentors could stay in contact with their respective teams collectively and share general information. Three general-purpose channels were dedicated to finding team members with specific expertise in design, development, or other skills, respectively. One general mentoring channel was for mentors to stay in contact, and 5 channels were meant to facilitate participants contacting mentors with expertise in business, design, government, health care, and tech. Finally, there were 3 helpdesk channels and 5 other general channels ("general," "intros," "photos," "random_stuff," and one named after the hackathon event itself). An overview of the number of messages and users for the two channel types (team and general-purpose) is presented in Table 2.
3.2.3. Messages in the dataset
As previously mentioned in Section 3.1, the messages were saved as JSON objects within the JSON files per channel and can be divided into user-generated messages (i.e., messages or files that users actively sent in a channel) and system messages (such as "[user] has joined the channel"), which are automatically sent by Slack and which we did not count as active contributions to conversations.
To answer our three main research questions, we used the following information: the message text, the user who composed the message, and the timestamp of the message. We used Python to read the JSON files into a data table. The table contained the Slack messages as rows and the following indicators as columns: channel, user, message text, whether a tool was mentioned (with one column per tool indicating whether that tool was mentioned), whether it is a system message indicating that the user joined the channel, whether it is the first or last message of this user in this channel, and whether there is an attachment and, if so, what file type it is.
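As a rough illustration of this preparation step, the following Python sketch flattens one channel file into table rows. It is a minimal sketch under stated assumptions, not the authors' script: the key names (`name`, `messages`, `user`, `text`, `ts`, `subtype`, `files`, `filetype`) and the one-file-per-channel layout are assumed, and the tool-mention columns are left out.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def channel_to_rows(channel):
    """Flatten one decoded channel file into a list of row dictionaries."""
    rows = []
    for msg in channel.get("messages", []):
        files = msg.get("files", [])
        rows.append({
            "channel": channel["name"],
            "user": msg.get("user"),
            "text": msg.get("text", ""),
            # Assumption: "ts" holds seconds since the epoch as a string.
            "time": datetime.fromtimestamp(float(msg["ts"]), tz=timezone.utc),
            "is_join": msg.get("subtype") == "channel_join",
            "has_attachment": bool(files),
            "file_types": [f.get("filetype") for f in files],
        })
    rows.sort(key=lambda row: row["time"])
    # Mark each user's first and last row within this channel
    # (this sketch counts system messages as rows, too).
    first, last = {}, {}
    for index, row in enumerate(rows):
        first.setdefault(row["user"], index)
        last[row["user"]] = index
    for index, row in enumerate(rows):
        row["is_first"] = first[row["user"]] == index
        row["is_last"] = last[row["user"]] == index
    return rows


def load_export(directory):
    """Read every *.json file in a directory (hypothetical layout)."""
    rows = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            rows.extend(channel_to_rows(json.load(handle)))
    return rows
```

Keeping the flattening logic (`channel_to_rows`) separate from file access (`load_export`) makes it possible to check the row construction on a small hand-written channel object before running it on a full export.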

3.3. Analysis
For our analysis and most of the preparation, we used R (https://www.r-project.org/). To analyze the usage of the official communication tool, Slack, we used descriptive analysis and time series (see Section 3.3.1). With the Gini coefficient, we investigated the symmetry of conversations (see Section 3.3.2). Lastly, we examined alternative communication tools with descriptive analysis (see Section 3.3.3). This is further detailed in the following subsections.

3.3.1. Analysis of teams' communication through Slack
To investigate how team members use the official communication tool (RQ1), we carried out a descriptive analysis looking into the number of channels and the respective message counts. Additionally, we analyzed the temporal distribution of Slack messaging activity using time series. In particular, we were interested in potential changes in activity around checkpoints compared to other times, as there could be additional communication and organizational effort approaching checkpoints and also planning for the next steps after the checkpoints. To that end, we took into account all messages that fell into the time frame around the hackathon event, that is, within 24 h leading up to the event and up to the end of the last day. Then, we split the activity of channels into time frames of 1 h. For each time frame, we calculated the number of messages that were sent per channel. If no messages were sent, the message count was set to zero. We created two visualizations based on this approach. One included the average number of messages in general-purpose and team channels in that time frame. For the second, we ordered the team channels based on the number of messages. Then, we picked the 10 channels with the most messages and 10 channels around the third quartile (that is, with minimal but not completely absent communication).
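The hourly binning can be sketched in a few lines of Python. This is an illustrative sketch only (the analysis in the paper was done in R); the function names, the date, and the toy timestamps are hypothetical.

```python
import math
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)


def hourly_counts(timestamps, start, end):
    """Count messages per 1-h bin in [start, end); empty bins stay at zero."""
    n_bins = math.ceil((end - start) / HOUR)
    counts = [0] * n_bins
    for ts in timestamps:
        if start <= ts < end:
            counts[(ts - start) // HOUR] += 1
    return counts


def mean_per_bin(per_channel):
    """Average the per-channel counts bin by bin."""
    series = list(per_channel.values())
    return [sum(column) / len(series) for column in zip(*series)]


# Hypothetical example: two channels observed over a three-hour window.
start = datetime(2020, 4, 1, 15, 0)
end = start + 3 * HOUR
per_channel = {
    "team-a": hourly_counts(
        [datetime(2020, 4, 1, 15, 10), datetime(2020, 4, 1, 15, 50),
         datetime(2020, 4, 1, 17, 5)], start, end),
    "team-b": hourly_counts([datetime(2020, 4, 1, 16, 30)], start, end),
}
print(mean_per_bin(per_channel))
```

Filling silent hours with zero, rather than dropping them, matters for the averages: a channel that is quiet for most of the event then pulls the mean down instead of being ignored.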

3.3.2. Analysis of communication patterns
To explore the communication patterns in terms of symmetry within the hackathon teams (RQ2), we calculated the Gini coefficient of messaging activity. The Gini coefficient is a measure of symmetry of a distribution (Dorfman, 1979; Martinez et al., 2011). In our case, we used it to measure how symmetrical or balanced the communication was in terms of message count. The Gini coefficient can range from 0 to 1, with 0 indicating a symmetric and 1 indicating a non-symmetric distribution. As an example, if we had three people in a channel, with each one sending 2 messages, the Gini coefficient would be 0. If only one of these three people sent 2 messages and the other two did not send any messages, the Gini coefficient would be 1.
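The two worked examples can be reproduced with a small Python sketch based on the mean absolute difference between all pairs of users. One assumption is worth stating loudly: the value of 1 for the second example is obtained with the small-sample correction (a factor of n/(n-1)); without it, three users where one sends all messages yield 2/3. Whether the authors applied this correction is not stated in the text, so the `corrected` flag below is an assumption.

```python
def gini(counts, corrected=True):
    """Gini coefficient of message counts per user.

    Uses the sum of absolute differences over all ordered pairs.
    With corrected=True the small-sample factor n / (n - 1) is applied
    (an assumption; see the lead-in). Returns None when the coefficient
    is undefined (fewer than two users or no messages at all).
    """
    n = len(counts)
    total = sum(counts)
    if n < 2 or total == 0:
        return None
    abs_diffs = sum(abs(a - b) for a in counts for b in counts)
    denominator = 2 * (n - 1 if corrected else n) * total
    return abs_diffs / denominator


print(gini([2, 2, 2]))                   # three users, two messages each
print(gini([2, 0, 0]))                   # one user sends everything
print(gini([2, 0, 0], corrected=False))  # same data, uncorrected form
```

Returning `None` for a single-user channel keeps the undefined case explicit, so a caller can decide how to treat it.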
For calculating the Gini coefficient, we need to establish who the "communication contributors" are, that is, which users should be considered as communicating in a team channel. There are essentially three (overlapping) ways to define who is part of a conversation in a channel:
1. by using the Slack corpus metadata that lists all users of a channel at the time the dataset was saved;
2. by extracting the users' IDs from the messages sent in a team's channel;
3. by using system messages indicating that a user has joined a channel.
We combined all of the above in order to acquire a comprehensive and complete list for the following reasons: (a) users can theoretically leave a channel and therefore not show up in the metadata; (b) users can be in a channel and follow a discussion without sending messages; and (c) system messages can be disabled and therefore fail to show when a user joins a channel. For this work, we do not count system messages as messages sent by a user. Consequently, a user who sent no messages would have a zero message count, indicating that they are passive consumers of a conversation but not actively contributing to it. This process results in a list of users per team channel that can be considered "members" of this channel. These users can have any role and are not necessarily participants; for example, they can be hackathon organizers or mentors. Therefore, they should not be equated with team members.
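Combining the three sources amounts to a set union followed by a per-user message count, which might look as follows in Python. The field names and the `channel_join` subtype are assumptions about the export format, and the function name is hypothetical.

```python
from collections import Counter

# Assumption: join/leave notices are marked with these "subtype" values.
SYSTEM_SUBTYPES = {"channel_join", "channel_leave"}


def channel_members(metadata_members, messages):
    """Return {user: number of user-written messages} for one channel.

    Members are the union of (1) the metadata member list, (2) authors of
    user-written messages, and (3) users named in join notices. System
    messages never count toward a user's message total.
    """
    joined = {m["user"] for m in messages
              if m.get("subtype") == "channel_join"}
    written = Counter(m["user"] for m in messages
                      if m.get("subtype") not in SYSTEM_SUBTYPES)
    members = set(metadata_members) | joined | set(written)
    return {user: written.get(user, 0) for user in sorted(members)}


# Hypothetical toy channel: U2 only appears in the metadata, U3 only joined,
# U4 wrote a message but is missing from the metadata (for example, left).
counts = channel_members(
    ["U1", "U2"],
    [
        {"user": "U3", "subtype": "channel_join", "text": "joined"},
        {"user": "U1", "text": "hi"},
        {"user": "U1", "text": "anyone here?"},
        {"user": "U4", "text": "yes"},
    ],
)
print(counts)
```

The resulting counts, including the zeros for silent members, are the kind of input a Gini calculation over all channel members would need.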

3.3.3. Analysis of other communication tool mentions
To find out what other communication tools the hackathon participants may have used (RQ3), we created a list of popular communication tools. Then, we used this list to check for the number of occurrences (that is, how many times a tool was mentioned) in the dataset. Next, to explore whether tool mentions had an impact on the communication of teams, we examined the number of messages before and after tool mentions in the same channel, focusing on the messaging activity for 10 min before and after the tool mention. Such a change could be due to the users discussing an issue and then deciding to take the discussion to a different communication tool, which is briefly mentioned before the conversation temporarily ends within Slack. Finally, we explored whether potential differences were significant.
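A minimal Python sketch of the two steps (detecting tool mentions and counting messages in the 10 min on either side) is given below. The tool list is an illustrative subset, whole-word matching is an assumption about how mentions were detected, and excluding the mentioning message itself from both windows is likewise an assumption.

```python
import re
from datetime import datetime, timedelta

# Illustrative subset; the full list of tools is an assumption here.
TOOLS = ["zoom", "skype", "whatsapp", "telegram"]


def mentioned_tools(text, tools=TOOLS):
    """Return the tools named in a message (case-insensitive, whole words)."""
    lowered = text.lower()
    return [tool for tool in tools
            if re.search(rf"\b{re.escape(tool)}\b", lowered)]


def window_counts(times, mention_time, minutes=10):
    """Count messages in the windows before and after a tool mention."""
    window = timedelta(minutes=minutes)
    before = sum(1 for t in times
                 if mention_time - window <= t < mention_time)
    after = sum(1 for t in times
                if mention_time < t <= mention_time + window)
    return before, after


# Hypothetical channel timeline around a mention at 12:00.
mention = datetime(2020, 4, 1, 12, 0)
times = [mention + timedelta(minutes=offset)
         for offset in (-12, -5, -1, 0, 3, 11)]
print(mentioned_tools("Let's just discuss this in Zoom, that's quicker"))
print(window_counts(times, mention))
```

Each mention then yields one (before, after) pair, which is the paired structure a before/after comparison needs.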

4. Findings
4.1. Descriptive analysis of teams' communication through Slack (RQ1)
To answer RQ1, we aimed to investigate how hackathon participants used the official Slack workspace. Overall, there are 1,212 users in the final dataset, consisting of two organizers, five project managers, seven lead mentors, two support mentors, eight mentors, and 1,188 participants (Table 1). The dataset consisted of 228 team channels and 25 general-purpose channels. The latter can be further divided into 8 batch channels, 3 channels to find team members with specific expertise, 6 mentoring channels (one general channel for mentors and 5 channels to find mentors with specific expertise), 3 general helpdesk channels, and 5 other more general channels, as described in Section 3.2.2. In Table 2, we present the distribution of messages in the team and general-purpose channels. The number of messages per channel varied widely for both of these channel types. In general-purpose channels, on average 179 messages were sent (SD = 211.79), and in team channels, on average 34 messages were sent (SD = 78.58). This low average number of messages in team channels, also compared to the standard deviation, indicates a large number of channels with few messages, which could be because of teams dropping out of the hackathon as well as because of alternative tools being used. Meanwhile, the high average number of messages for the general-purpose channels suggests that participants engaged in the discussions in general-purpose channels, which also aligns with the higher number of users per channel compared to the team channels.
To have a closer look at the participants' activity, especially around checkpoints, we analyzed the temporal distribution of Slack activity, as described in Section 3.3.1. Again, we differentiated between general-purpose and team channels, and we focused on the activity around the hackathon event, which started with a kick-off opening session on the first day at 15:30 and ended with a final webinar on the third day at 19:30 (Figure 2). The Slack log files recorded messaging activity before the start of the event, mostly in the general-purpose channels. Around the checkpoints, we saw that the number of messages increased, especially in the general-purpose channels. This could be attributed to reminders that were sent mainly by lead and support mentors to team members for joining the respective online sessions. We do not see the number of messages decreasing between the opening session and the first checkpoint in either channel type. This could be an indication that participants used this time to organize and strategize. Between the first and second checkpoints, there was a gradual decrease in messages followed by 8 h of low activity with fewer than one message per hour on average in the chats, coinciding with nighttime at the origin location of most participants. This was followed by an increase in messages approaching the second checkpoint. Between the second and third checkpoints, messaging activity (in terms of volume) appeared to be steady.
After the third checkpoint, the number of messages in general-purpose channels slowly decreased again, reaching a low on the third day before rising rapidly toward the submission deadline. We were not able to establish the same or a similar pattern for team channels due to the low number of messages within that time frame. However, the gray area of the standard deviation indicates that a few channels had many messages.
The curve of the average number of messages is more pronounced for the general-purpose channels than for the team channels, which we attributed to the generally low average number of messages exchanged in the team channels. To further explore this difference, we analyzed a subset of 20 team channels: the 10 team channels with the most messages and 10 channels around the third quartile in a ranking by the number of messages per channel. The descriptive data for these selected channels is shown in Table 3. We then visualized the messages of these subsets (Figure 3). Our results showed that the 10 team channels with the most messages demonstrated a similar pattern to the general-purpose channels: messaging activity decreased after the first and third checkpoints down to very few messages, while it increased toward the second checkpoint and the submission deadline. In contrast to the general-purpose channels, the messaging activity in the top 10 team channels increased between the opening session and the first checkpoint. The third quartile channels demonstrated messaging activity around the opening and first checkpoint, then after the second checkpoint with minimal activity stretching beyond the third checkpoint, and wrapping up with some messages at and after the submission deadline until shortly after the closing webinar.

FIGURE 3
Messages over time in top team channels vs. team channels around the third quartile, limited to the time of the hackathon itself with an additional day before. The solid line indicates the average (mean) number of messages; the gray area shows the standard deviation.

4.2. Communication patterns in team channels (RQ2)
To investigate whether communication in team channels followed a symmetrical pattern, we calculated the Gini coefficient for each team channel as explained in Section 3.3.2. The resulting distribution of Gini coefficients is illustrated in Figure 4. In Figure 4A, we took all users into consideration, that is, hackathon participants and non-participants (i.e., users that are not participants, see Section 3.2.1) such as mentors who help the team or remind them of checkpoints. Figure 4B excludes non-participant users from the calculation of the Gini coefficient, looking only at participants to approximate within-team communication. It should be mentioned that this would also include participants who were part of a different team, i.e., a member of team A may write in the channel of team B and contribute to that team's discussion.
We found that there were 52 team channels where only one participant wrote messages, conversing with non-participants such as mentors and project managers. We looked at the messages of a random sample of 10 channels and could identify two groups: (a) teams that dropped out and either told a mentor explicitly in a message (n = 1) or just stopped messaging (n = 5), and (b) teams that worked outside of Slack and had only one member keeping in contact with mentors and organizers via Slack, and that seem to have finished the hackathon based on explicit messages about their submission (n = 3) or continued contact with mentors up to the third checkpoint (n = 1). Presumably, the latter are teams that were in contact before and maintained a different communication channel before and during the hackathon. For these channels, the Gini coefficient could not be calculated for participants only (because there was only one participant) and was therefore set to 1, indicating non-symmetrical conversation.
Frontiers in Computer Science, frontiersin.org
The Gini coefficients suggest that conversations in team channels are imbalanced regardless of who participates in these conversations (that is, team members only or team members and other users, such as mentors).

4.3. Other communication tools mentioned in Slack channels (RQ3)
In Table 4, we present the descriptive statistics of mentions of other communication tools in the various Slack channels. In total, Zoom was mentioned (n = 647) almost as many times as all other tools combined. YouTube and Facebook were next with 209 and 208 mentions, respectively, followed by Slack (n = 123) and Google (n = 81). Hangouts, Instagram, Skype, Telegram, Trello, Twitter, and WhatsApp each had fewer than 20 mentions. The high number of mentions for Zoom in particular, but also for Facebook and YouTube, could be due to the fact that the organizers introduced these tools to support other activities of the hackathon: Facebook live was used for the kick-off opening session, Zoom was the online venue for the checkpoints, and YouTube was used for sharing the teams' final product presentations.
Additionally, we were interested in whether the volume of messages sent on the channel changed before and after the tool-mention. For each tool-mention, we investigated how many messages were sent in that same channel in the 10 min leading up to the tool-mention and how many in the 10 min after it. In Figure 5, we present, for each tool and user, how many messages were sent on average in the 10 min before (-1) and after (1) a tool-mention in the channel where the tool was mentioned.
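The before/after comparison can be sketched as follows. The function name `messages_around` and the boundary handling (the mentioning message itself is excluded, and the window edges are half-open) are assumptions for illustration; the text only specifies the 10-min window on each side.

```python
from datetime import datetime, timedelta
from typing import Sequence, Tuple


def messages_around(
    timestamps: Sequence[datetime],
    mention_time: datetime,
    window: timedelta = timedelta(minutes=10),
) -> Tuple[int, int]:
    """Count messages in one channel before and after a tool-mention.

    `timestamps` holds the send times of all messages in the channel.
    Assumption: a message sent exactly at `mention_time` (the mention
    itself) belongs to neither side.
    """
    before = sum(
        1 for t in timestamps if mention_time - window <= t < mention_time
    )
    after = sum(
        1 for t in timestamps if mention_time < t <= mention_time + window
    )
    return before, after
```

Because the window is a parameter, the same function could be rerun with, say, `timedelta(hours=1)` to check how sensitive the comparison is to the chosen time frame.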
To explore whether the volume of messages before and after tool-mentions changed statistically significantly, we used the non-parametric paired-sample Wilcoxon signed-rank test. For the sum of tool-mentions (that is, the sum of messages that mentioned some other communication tool), we found no significant difference (p = 0.382) between the average number of messages before (M = 2.05, SD = 4.68) and after tool-mentions (M = 2.03, SD = 3.92). Furthermore, we explored the tool-mentions divided by user role, by tool, and by user role and tool combined. We found no significant difference in the number of messages for any of the roles. By tool, we only found a statistically significant difference for Google, where the average number of messages before the mentions (M = 2.98, SD = 5.41) was significantly larger than after the mentions (M = 2.23, SD = 3.79; p = 0.035). Looking at user role and tool combined, we found that for Google mentions by participants the average number of messages before the mentions (M = 2.97, SD = 5.59) was significantly larger than after (M = 2.14, SD = 3.85; p = 0.040). For Facebook mentions by participants, we found that the average amount
With this analysis, we wanted to see whether tool-mentions indicate a concurrent switch to that tool. We assumed that a scenario for this would be a discussion with a large volume of messages followed by a tool-mention (for example, the fictitious message "let's just discuss this in zoom, that's probably quicker") and a consequently low number of questions. The fact that we did not find significant changes in the number of messages before and after tool-mentions can be due to different factors: (a) the time frame might not have been appropriate: for example, the teams might not always have used Slack as a synchronous communication tool and reactions may have been slower, so that the discussion may have taken place over a time frame longer than 10 min; (b) message count might be an inappropriate metric given different texting styles, that is, some people write many short messages instead of one long message; and (c) the messages mentioning a tool might not actually call for an immediate tool change but instead point to requests to use the tool at a later time.
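For readers unfamiliar with the test, the following is a simplified sketch of a paired Wilcoxon signed-rank test on before/after message counts, one pair per tool-mention. It assumes Wilcoxon's original treatment of zero differences (they are dropped) and a plain normal approximation without tie or continuity correction, so its p-values are only rough for small samples; the helper names are hypothetical. In practice, a library routine such as `scipy.stats.wilcoxon` would normally be used instead.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(
    before: Sequence[float], after: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (W+, W-, two-sided p) for paired samples."""
    if len(before) != len(after):
        raise ValueError("samples must be paired")
    # Zero differences carry no sign and are dropped.
    diffs = [b - a for b, a in zip(before, after) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    # Normal approximation of W+ under the null hypothesis.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_two_sided
```

Message counts in 10-min windows are small integers, so many differences will be zero or tied; how these are handled can noticeably affect the result, which is worth keeping in mind when interpreting p-values close to 0.05.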

. Discussion
To investigate how hackathon participants collaborate in an online space, we studied the communication of participants of a hackathon event that took place in April 2020 and aimed to develop ideas for technology use in crisis response. Based on the results, we answered our three research questions to gain insights into patterns of collaboration within the teams and into the organization of online hackathons.
Looking at the official team channels (RQ 1), we see that they are often not used as the teams' primary means of communication. These Slack channels contain few chats between participants that relate to collaborative work. We see that teams use their communication tools for intra-team communication when they actively participate in the event. However, the official communication tool, Slack, served as the primary platform for the event, and the activity in the general-purpose channels suggests that team members use the official communication tool for organizational purposes, which answers the first research question. This is in line with findings by Mendes et al. (2022), who reported that teams mainly used official event platforms to communicate with organizers and other hackathon officials while setting up their own communication channels.
Regarding the patterns that can be identified in the team members' communication activities (RQ 2), we see an increase in messages around hackathon checkpoints. This increase in activity around milestones is a pattern that has previously been reported in research on software engineering (Azarova et al., 2022), education (Van de Zande and Wallace, 2018), and civic engagement (McInnis et al., 2018), where team members become more active as critical deadlines approach. It is also in line with prior research on hackathons that suggests that event scaffolding affects team activity, especially in online events (Brereton, 2020;Powell et al., 2021). At the same time, our findings suggest that communication between the last checkpoints is sustained, potentially indicating that participants use the time between checkpoints to organize and strategize their work. Furthermore, discussions in team channels are imbalanced, or asymmetrical, whether or not mentors are present in these discussions. This finding suggests that the messaging activity in the Slack team channels is dominated by one team member, possibly one who assumes the team leader or organizer role.
Hackathon participants mention other communication tools during their Slack discussions, and one could argue that this indicates that they are switching communication channels (RQ 3). This would support prior findings in hackathon-related studies that report teams using tools of their own choice to communicate rather than those suggested by hackathon organizers (Powell et al., 2021). However, these tool mentions do not appear to impact the number of messages significantly. The analysis of the discourse in the different channels, which could give more meaningful insights into the participants' collaborative work patterns, is beyond the scope of this study. Additionally, the 10-min time frame used for this analysis needs further investigation: a larger window (for example, 1 h or even longer) might be more appropriate given the asynchronous character of the communication tool. It is possible that participants use Slack differently with regard to reply times, so reply times should be investigated as well.

. . Theoretical and practical implications
Our findings have a number of implications for theory and practice. They indicate that active teams use the communication tool provided by the organizers. Analyzing the communication during an event might thus allow organizers to spot inactive teams and provide support. Moreover, we found increased activity in team channels around checkpoints indicating that event scaffolding can guide team communication and collaboration. In addition, our findings revealed that communication within channels is unbalanced with one team member commonly being the most active. This can help event organizers to identify individuals to direct resources to.
On the other hand, our findings indicate, in line with related work (Powell et al., 2021;Mendes et al., 2022), that even with a complete dataset from the communication tool provided by the organizers of a hackathon, the investigation of intra-team communication and interaction is complicated because teams use different communication tools without researchers or hackathon organizers having access to these tools or the data traces they produce. The plurality and diversity of communication media may hinder data-informed approaches to investigating hackathons.
Our work thus provides actionable insights for online hackathon organizers regarding event scheduling and planning, suggesting the need for supporting team coordination in the early phases of the event, improving checkpoints' logistics, and integrating feedback. Additionally, our findings highlight the importance of maintaining team channels to provide necessary information and support to participants. From a research perspective, the paper contributes to existing research on remote work and collaboration and on the challenges of analyzing only the online communication channels. Finally, we demonstrated the feasibility and potential challenges of analyzing communication traces to create actionable insights for hackathon organizers, mentors, and participants.

. . Limitations and future work
There are some limitations associated with the design of our study and the methods we employed. One of the limitations is the scarcity of data in the team channels and the lack of information regarding other communication channels the teams might have used. The method we used (archival analysis) and the study setup did not allow us to follow up and clarify whether, for teams that were inactive on Slack, communication was absent altogether or whether they were communicating via other media, and how. In other words, we were not able to form a clear picture of the communication of teams or its absence.
Another limitation is that we mainly employed descriptive analysis as an analytical approach. One may argue that communication processes, being complex and multidimensional, are hard to capture with descriptive statistics. We acknowledge this limitation. While in this paper we focused, as a first step, on quantitative information regarding messages, we envision triangulating our findings with contentual information (that is, insights derived from the content of exchanged messages) and contextual information (for example, statements of hackathon participants, mentors, and hackathon organizers regarding teamwork and hackathon communication).
The hackathon outcomes (for example, the prototypes or the videos that teams produced during their work) may also offer insights into the progress and the achievement of teams. However, this was not possible here since we did not have access to these artifacts.
Finally, the geographical context of the online hackathon and its targeted focus (health crisis) are specific to this event, so our findings cannot readily be generalized to other settings. Therefore, further research is required to explore whether similar communication patterns appear in different contexts or whether cultural or goal- and topic-related factors come into play.

Data availability statement
The data analyzed in this study is subject to the following licenses/restrictions: the raw data contains potentially personally identifiable information and is thus subject to GDPR. Requests to access these datasets should be directed to AN, alexander.nolte@ut.ee.

Author contributions
The study conceptualization, data preparation, and analysis were carried out by CS. The first draft of the manuscript was authored by CS and revised by AN, DS, and I-AC. All authors contributed to the article and approved the submitted version.

Funding
We acknowledge support by the Open Access Publication Fund of the University of Duisburg-Essen.