ORIGINAL RESEARCH article

Front. Comput. Neurosci., 29 January 2026

Volume 20 - 2026 | https://doi.org/10.3389/fncom.2026.1767724

AI-driven audience clustering in sport media: a human–computer interaction approach using ‘CoPE-DEC’

  • Department of Sport Media, Kyung Hee University, Yongin-si, Gyeonggi-do, Republic of Korea

This study investigates the characteristics and underlying patterns of sports media audiences from a human–computer interaction (HCI) perspective using artificial intelligence–based deep learning analysis, with the aim of providing foundational data for the sports media industry. To this end, a novel unsupervised clustering framework, the Column-conditioned Prototype-Enhanced Deep Embedded Clustering (CoPE-DEC) technique, was employed to model and analyze multidimensional viewer experience data derived from sports media consumption contexts. The analysis identified three distinct audience clusters with differentiated behavioral, attitudinal, and value-oriented characteristics. The first cluster, labeled “Sports Value Orientation,” was characterized by enhanced concentration during sports viewing, promotion of cooperative skills, motivation for health and exercise, vicarious satisfaction, aesthetic appreciation of sports movements, and admiration for athletes’ professional and economic success. The second cluster, termed “Sports Consumption Culture Orientation,” exhibited a strong preference for sports broadcasts over entertainment content, frequent consumption of online sports media, active engagement with preferred sports, participation in sports-related tourism and activities, acquisition of sports skills through media, and consumption of sports-related products. The third cluster, identified as “Sports Attitude Orientation,” reflected predominantly social and emotional dimensions of sports viewing, including improved social adaptation, relationship formation, group cohesion, stress relief, psychological stabilization, healthy competitive attitudes, and enhanced overall wellbeing. These findings demonstrate that AI-driven deep learning approaches, particularly the CoPE-DEC framework, are effective in uncovering latent audience typologies and preference structures in sports media consumption environments. 
By integrating HCI principles with advanced clustering techniques, this study offers a methodological contribution to audience analysis research and provides practical implications for audience segmentation, personalized content design, and strategic decision-making in the sports media industry. Future research is encouraged to extend this approach by incorporating diverse AI methodologies and multimodal data sources to further advance interdisciplinary insights at the intersection of HCI, artificial intelligence, and sports media studies.

1 Introduction

As digital transformation accelerates across media ecosystems, understanding user behavior and experience has become a central challenge in both academia and industry. In this context, Human–Computer Interaction (HCI) has emerged as a core disciplinary framework for explaining how users perceive, interpret, and engage with digital systems, providing theoretical and methodological foundations for user-centered design and experience optimization (Card et al., 1983; Norman, 2013; Preece et al., 2022). Recent advances in artificial intelligence (AI), particularly deep learning, have further expanded the analytical capacity of HCI research by enabling the modeling of complex, high-dimensional user data and uncovering latent behavioral patterns that are difficult to capture using traditional approaches.

The convergence of HCI and deep learning–based clustering techniques is especially meaningful in contemporary media environments characterized by hyper-personalization, platform diversification, and continuous user interaction. Unlike conventional analytic frameworks that rely on predefined categories or linear assumptions, representation-based deep clustering methods allow researchers to identify emergent user typologies directly from data, thereby supporting more precise and adaptive user segmentation (Xie et al., 2016; Li X. et al., 2022; Saito and Yamamoto, 2023). Among these approaches, Column-conditioned Prototype-Enhanced Deep Embedded Clustering (CoPE-DEC) offers a promising framework by integrating item-aware embeddings, prototype alignment, contrastive learning, and DEC-based refinement, enabling both interpretability and structural clarity in unsupervised clustering tasks.

This methodological advancement is particularly valuable in the sports media industry, where audience behavior has become increasingly complex and heterogeneous. The contemporary sports media environment is shaped by the convergence of over-the-top (OTT) platforms, interactive live streaming, real-time data-driven commentary, and immersive technologies such as augmented and virtual reality (AR/VR) (Funk et al., 2018; Ratten, 2021). Viewers are no longer passive recipients of broadcast content; rather, they actively engage with sports media across multiple platforms, participate in online communities, generate user-created content, and interact with athletes, teams, and other fans in real time. As a result, audience experiences are formed through dynamic interactions between human users and computational systems, making an HCI-oriented analytical perspective indispensable.

Recent trends in sports consumption—particularly among the MZ generation—further underscore the limitations of traditional analytical models. Contemporary audiences increasingly favor immediacy (real-time and short-form content), identity expression (community participation and social signaling), and reciprocal interaction (comments, voting, memes, and challenges) (Wohn and Freeman, 2020; Lim and Park, 2023). However, existing research on sports media audiences has largely relied on regression-based models, satisfaction–intention frameworks, or psychologically driven causal analyses that assume linear relationships and predefined constructs (Kim and Hwang, 2019; Lim and Lim, 2024). While these approaches offer valuable insights into specific relationships, they face structural constraints in capturing high-dimensional, nonlinear, and heterogeneous audience patterns across platforms and interaction contexts.

Moreover, prior studies often focus on post hoc explanations of viewing satisfaction or continued usage intention, rather than on the ex-ante identification and generalization of audience typologies. This limitation has become increasingly problematic as sports media data environments grow more complex, incorporating multimodal and high-frequency data from live broadcasts, OTT services, social media interactions, and search behaviors. In such environments, viewer behavior exhibits nonlinearity, temporal dynamics, and network diffusion effects that exceed the explanatory capacity of conventional models (Bunker and Susnjak, 2022; Horvat and Job, 2020).

Against this backdrop, we identify three critical limitations in existing approaches. First, many studies rely on regression-based or psychologically grounded models that presuppose linearity and stable variable relationships. Second, these methods struggle to capture the multidimensional and heterogeneous nature of contemporary sports media audiences. Third, there remains a lack of HCI-based, unsupervised, representation-driven audience typology frameworks capable of generalizing viewer characteristics across platforms and interaction contexts. These gaps point to the need for advanced AI-driven methodologies that can model audience behavior more holistically and flexibly.

To address these limitations, the present study applies the Column-conditioned Prototype-Enhanced Deep Embedded Clustering (CoPE-DEC) framework to sports media audience analysis from an HCI perspective. CoPE-DEC directly responds to the identified gaps by (1) employing column-conditioned embeddings to preserve item-specific semantics, (2) enhancing interpretability through prototype alignment, and (3) improving cluster sharpness and stability via contrastive learning and DEC fine-tuning. By leveraging these mechanisms, the proposed approach enables the extraction of latent audience typologies that reflect not only behavioral patterns but also value orientations and attitudinal dimensions of sports media consumption.

Accordingly, the objective of this study is to predict and classify sports media viewer characteristics using an AI-based deep learning framework grounded in HCI principles. By segmenting audiences based on empirical viewing behavior, this study aims to provide foundational data for market segmentation, personalized content design, and strategic decision-making in the sports media industry. Furthermore, the study seeks to contribute to the development of an AI-based knowledge system for predictive sports media research and to establish an integrated benchmark for evaluating both cultural relevance and market efficiency through data-driven audience classification.

2 Theoretical background

2.1 Sports media viewing behavior theory

Sports media viewing behavior, particularly in live streaming environments, has been extensively examined through multiple theoretical perspectives, most notably Uses and Gratifications Theory (UGT), Flow Theory, and Parasocial Interaction Theory. These frameworks provide complementary explanations for why and how audiences engage with sports media content in increasingly interactive and technology-mediated contexts.

Uses and Gratifications Theory conceptualizes media users as active and goal-oriented agents who selectively engage with media to satisfy specific psychological, social, and informational needs rather than as passive recipients of content (Katz et al., 1973). In contemporary sports media environments, this perspective has been extended to explain viewers’ motivations for consuming live sports streams, including entertainment, real-time information acquisition, social interaction through chat functions and online communities, and emotional engagement with teams and athletes (Sjöblom et al., 2019; Lim and Park, 2023). Recent studies further suggest that the interactivity embedded in digital sports platforms enhances users’ perceived autonomy and involvement, reinforcing active media selection and sustained engagement (Wohn and Freeman, 2020).

Advancements in real-time feedback mechanisms—such as live comments, emojis, and audience reactions—have transformed sports streaming platforms into collective emotional spaces, where viewers experience shared excitement, solidarity, and collective cheering. These developments align closely with Flow Theory, which explains the immersive psychological state experienced when individuals perceive an optimal balance between challenge, skill, and engagement (Csikszentmihalyi, 1990). In sports live streaming contexts, flow emerges when high-quality audiovisual content, real-time interaction, and emotionally charged competition converge, leading to temporal distortion, deep concentration, and heightened affective involvement (Hamari and Sjöblom, 2017). Empirical research indicates that such flow experiences significantly increase re-watching intentions, viewing duration, and long-term platform loyalty (Chen et al., 2022).

Parasocial Interaction Theory offers an additional explanatory lens by focusing on the quasi-social relationships that viewers develop with media figures, commentators, or streamers, whom they perceive as familiar and emotionally accessible despite the absence of reciprocal interaction (Horton and Wohl, 1956). In sports streaming environments, parasocial bonds are strengthened through commentators’ narrative styles, humor, empathy, expertise, and nonverbal expressions, fostering emotional intimacy and trust (Frederick et al., 2022). Recent studies demonstrate that parasocial interaction plays a critical role in shaping viewer satisfaction, perceived credibility, and continued engagement, including long-term subscription behavior and community participation (Tukachinsky, 2021; Wohn and Freeman, 2020).

From a broader social and cultural perspective, engagement with sports media content influences viewers’ socialization processes and emotional development. Sports viewing has been shown to promote cooperative values, respect for rules, social cohesion, and collective identity formation, although these effects may vary depending on the values and norms conveyed through media representations of sport (Funk et al., 2018; Ratten, 2021). Sports consumption is also deeply embedded within contemporary consumer culture, functioning as a symbolic and identity-driven practice that reflects broader social meanings, lifestyle aspirations, and group affiliations (Belk, 2013; Giulianotti, 2022). In this sense, sports media consumption extends beyond leisure activity to participate in the reproduction of cultural norms and market-oriented structures within society.

Recent empirical research has further classified sports consumption culture into multiple domains, including media-based information consumption, sport-related product purchases, experiential participation such as tourism and events, and digitally mediated skill acquisition (Zhang et al., 2021; Kim et al., 2024). Identifying distinct viewer types within this framework enables a more nuanced understanding of differences in preferences, motivations, and behavioral patterns across sports media audiences. Sports values and attitudes are thus conceptualized as expectation-based cognitive and affective orientations, shaped by prior experiences, social influence, and informational cues provided by media platforms (Oliver, 2014; Chiu et al., 2017).

Taken together, these theoretical perspectives indicate that the study of sports media viewing behavior extends beyond explaining satisfaction with isolated media products or services. Instead, sports media consumer theory functions as a strategic analytical framework for understanding long-term engagement, loyalty formation, and value co-creation in interactive media environments. By integrating motivational, psychological, social, and cultural dimensions, contemporary sports media research provides critical insights for enhancing user experience design, strengthening fan relationships, and sustaining competitive advantage in rapidly evolving digital sports ecosystems (Zeithaml et al., 2020; Pizzo et al., 2023).

2.2 Classification of characteristics of sport media content viewers

Market segmentation has long been recognized as a fundamental strategy for understanding heterogeneous consumer populations and for designing effective marketing and communication strategies, particularly in highly competitive industries such as sports. Classical marketing literature conceptualized segmentation as the process of dividing markets into relatively homogeneous groups based on shared characteristics, including demographic, geographic, psychographic, and behavioral variables (Kotler and Keller, 2022). In the context of sports consumption, these variables have traditionally been used to explain differences in attendance, media usage, merchandise purchasing, and fan loyalty.

Early segmentation studies in sports marketing primarily relied on demographic or single-dimension behavioral indicators, such as age, gender, income, or frequency of consumption. However, subsequent research has demonstrated that such approaches provide only limited explanatory power, as sports consumers often exhibit complex combinations of motivations, values, lifestyles, and engagement patterns that transcend demographic boundaries (Funk et al., 2016; Yoshida et al., 2017). As a result, contemporary segmentation research increasingly emphasizes the integration of psychological, attitudinal, and behavioral variables to capture the multidimensional nature of sports media audiences (Trail et al., 2019).

With the rapid digitization of sports media and the proliferation of OTT platforms, social media, and interactive live streaming services, the characteristics of sports media viewers have become more fragmented and dynamic. Viewers engage with sports content across multiple platforms, exhibit varying levels of emotional involvement and social interaction, and differ substantially in their consumption of related products and experiences (Ratten, 2021; Lim and Park, 2023). Consequently, no single segmentation variable can be considered universally superior; rather, the selection of segmentation criteria must align with the analytical objective and the media context in which consumption occurs (Wedel and Kannan, 2016).

In parallel with these conceptual developments, research on predicting sports media viewership has evolved from qualitative, experience-based forecasting toward quantitative and data-driven approaches. While qualitative methods—drawing on expert judgment from broadcasters, league officials, and event organizers—remain valuable for contextual interpretation, their reliance on subjective assessment limits scalability and predictive precision in rapidly changing media environments. The accumulation of large-scale digital trace data, including viewing logs, interaction records, and platform analytics, has highlighted the advantages of quantitative modeling in reducing forecast uncertainty and improving generalizability (Chae and Kim, 2021; Popp et al., 2020).

Recent scholarship has increasingly applied machine learning and deep learning techniques to sports-related prediction and segmentation tasks, including audience clustering, engagement prediction, and consumption pattern analysis. These approaches are particularly well suited to handling high-dimensional, nonlinear, and heterogeneous data structures that characterize contemporary sports media environments (Horvat and Job, 2020; Bunker and Susnjak, 2022). Deep learning–based models enable the extraction of latent representations of consumer behavior, allowing for the identification of viewer segments that reflect underlying value orientations, participation patterns, and attitudinal tendencies rather than surface-level characteristics alone (Chen, 2024; Li X. et al., 2022).

From a sports media perspective, AI-driven audience classification offers several strategic advantages. First, it supports proactive estimation of viewer preferences prior to content release or event broadcasting. Second, it facilitates the empirical identification of diffusion pathways and ripple effects across platforms and communities. Third, it enables differentiated content design, targeted marketing, and personalized interface development tailored to distinct viewer segments. By complementing the limitations of traditional qualitative and rule-based segmentation approaches, AI-based classification provides a systematic and scalable foundation for understanding sports consumers’ decision-making processes in digital environments.

Taken together, recent advances in market segmentation theory and AI-based analytics underscore the necessity of adopting integrated, data-driven frameworks for classifying sports media content viewers. Such frameworks are essential not only for academic understanding of audience heterogeneity but also for practical applications aimed at revitalizing the sports media industry, enhancing fan engagement, and informing evidence-based policy and marketing strategies in an increasingly competitive and technology-driven marketplace.

3 Research questions

Building upon the theoretical background and the literature review on sports media consumption and audience behavior, this study seeks to address the following research questions:

Research Question 1 (RQ1): How can sports media audiences be segmented using deep learning-based analytical methods?

Research Question 2 (RQ2): What are the distinctive characteristics of the audience groups identified through deep learning-based segmentation?

These questions aim to explore both the methodological approach to segmenting sports media audiences and the behavioral, attitudinal, and cultural characteristics that differentiate the resulting clusters. By addressing these questions, the study intends to provide actionable insights for sports media management, marketing strategies, and the application of artificial intelligence in audience analysis.

4 Methods

4.1 Data collection and analysis

The data for this study were obtained from the publicly available dataset, “Sports Live Streamer: Viewer Experience Survey Data,” hosted on the Mendeley Data platform. This dataset consists of survey-based behavioral and cognitive measures collected from a large number of sports content consumers, including viewers of YouTube streams, sports reactions, sports analysis, and sports broadcast jockeys (BJs). Unlike simple viewing history logs, the dataset encompasses a wide range of psychological, cognitive, attitudinal, and behavioral factors, making it particularly suitable for research on sports media consumption, consumer behavior, and fandom.

For this study, the dataset was utilized as the primary source to analyze sports media audiences. The data were processed and analyzed using Column-conditioned Prototype-Enhanced Deep Embedded Clustering (CoPE-DEC), an artificial intelligence-based deep learning method. This approach enabled the extraction of latent audience patterns and characteristics, providing a robust foundation for segmenting viewers based on multidimensional behavioral and cognitive attributes.

4.2 Variables and measures

The dataset employed in this study was designed to comprehensively capture the psychological and behavioral characteristics of sports live streaming users. Most items were assessed using a 5-point Likert scale (1 = not at all, 5 = very much). The key variables in this study included streamer image, flow (immersion), satisfaction, rewatching intention, and demographic characteristics. Streamer Image measures users’ perceptions of a specific sports streamer. It encompasses dimensions such as attractiveness, expertise, credibility, warmth, and quality of commentary. These factors are critical predictors of viewers’ attitudinal and behavioral responses, including trust, liking, and rewatching intention. In sports media, expert commentary serves as an essential criterion for evaluating content quality.

Flow captures the psychological experiences of users while engaging with live streaming content. This construct includes levels of immersion, time distortion, concentration, psychological engagement, and hedonic enjoyment. Sports live streaming is particularly conducive to flow due to its dynamic and real-time interactive nature. High flow experiences are significant predictors of viewer retention and positive attitudes toward the content. Satisfaction represents viewers’ overall evaluation of both the streamer and the content. It was measured through satisfaction with the content, satisfaction with the streamer, appropriateness of delivery, and suitability of the live environment. Satisfaction is a pivotal variable in media research, directly influencing repeated usage behavior and loyalty, and plays a similar role in sports streaming contexts.

Rewatching Intention refers to the users’ intention to rewatch the same streamer’s live broadcast, explore additional content from the streamer, and recommend the streamer to others via word-of-mouth. These indicators serve as key measures of behavioral persistence and loyalty within online media environments. Demographic Characteristics included gender, age, level of sports interest, primary viewing platform (e.g., YouTube, AfreecaTV, Twitch, Tving), viewing frequency, viewing time zone, and preferred content type. These variables can be used to examine differences across segments or serve as control variables in analytical models, thereby enhancing the explanatory power of the research framework. Overall, the dataset allows for a multifaceted analysis of psychological and behavioral responses to sports live streaming. It is suitable for advanced statistical techniques, including structural equation modeling (SEM) and mediation and moderation analyses. Moreover, as the data were collected from Korean sports viewers, they accurately reflect the domestic sports media and streamer environment, providing valuable insights for platform strategy development, content format comparisons, and user experience research.

4.3 AI deep learning ‘CoPE-DEC’ data processing pipeline

This study proposes an unsupervised clustering pipeline, Column-conditioned Prototype-Enhanced Deep Embedded Clustering (CoPE-DEC), designed for survey items where all variables are categorical. The pipeline integrates item-context-aware embedding modulation, prototype alignment, contrastive learning, and DEC fine-tuning. Given that all responses are categorical, identifier-like columns with high unique-value ratios and constant columns were removed. Additionally, items with responses representing less than 1% of the total respondents or with fewer than three occurrences were consolidated into a single sparse-response category to avoid an increase in dimensionality and model complexity during training.
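
The preprocessing rules described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `__RARE__` label, and the 0.9 unique-value cutoff for identifier-like columns are assumptions introduced here, whereas the 1% share and three-occurrence thresholds come from the text.

```python
from collections import Counter


def drop_uninformative_columns(rows, max_unique_ratio=0.9):
    """Drop constant columns and identifier-like columns.

    rows: list of dicts mapping column name -> categorical response.
    max_unique_ratio is a hypothetical cutoff; the text only says "high".
    """
    n = len(rows)
    keep = []
    for col in rows[0].keys():
        n_unique = len({row[col] for row in rows})
        if n_unique <= 1:                     # constant column
            continue
        if n_unique / n > max_unique_ratio:   # identifier-like column
            continue
        keep.append(col)
    return [{col: row[col] for col in keep} for row in rows]


def consolidate_rare(values, min_share=0.01, min_count=3, rare_label="__RARE__"):
    """Merge rare response options of one item into a single sparse category.

    An option is rare if it covers less than min_share of respondents
    or occurs fewer than min_count times.
    """
    counts = Counter(values)
    n = len(values)
    rare = {v for v, k in counts.items() if k / n < min_share or k < min_count}
    return [rare_label if v in rare else v for v in values]
```

Applying `consolidate_rare` column by column keeps the number of categories per item bounded, which is the stated motivation for the consolidation step.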

To facilitate the analysis of respondents’ answer flow across items, response options were organized for each question like a dictionary, and sequential numerical identifiers were assigned to each option. The same word appearing in different items was assigned distinct identifiers to preserve item-specific context. These numbered sequences were then input into the embedding and clustering stages of the model, allowing for pattern tracking while maintaining context distinctions.
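
One way to realize this item-specific numbering is sketched below, under the assumption that identifiers are keyed by the (item, option) pair so that identical wording in two items never shares an identifier. The function names are hypothetical.

```python
def build_token_ids(rows, columns):
    """Assign sequential ids to (column, option) pairs.

    The same option text in two different columns receives two
    different ids, which preserves item-specific context.
    """
    token_ids = {}
    for col in columns:
        for option in sorted({row[col] for row in rows}):
            token_ids[(col, option)] = len(token_ids)
    return token_ids


def encode(rows, columns, token_ids):
    """Convert each respondent's answers into a sequence of ids."""
    return [[token_ids[(col, row[col])] for col in columns] for row in rows]
```

For example, with two items that both offer the option "agree", the pair `("q1", "agree")` and the pair `("q2", "agree")` map to different identifiers.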

The model first constructs column-conditioned embeddings (Randrianantoanina, 2022). For each item $c$, a category embedding matrix $E_c$ is established whose rows are $d_b$-dimensional category embeddings, and a FiLM modulator $\Phi: \mathbb{R}^{d_b} \to \mathbb{R}^{2d_b}$ generates modulation parameters as defined in Equation 1. For sample $i$, the raw embedding is $e_{i,c} = E_c[y_{i,c}]$, and the column-conditioned modulation is then defined as in Equation 2.

$$(\gamma_c, \beta_c) = \Phi(u_c), \quad \gamma_c = \tanh(\gamma_c) \tag{1}$$
$$\tilde{e}_{i,c} = (1 + \gamma_c)\, e_{i,c} + \beta_c \tag{2}$$
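
As an illustration of Equations 1 and 2, the sketch below applies the modulation to a single embedding vector. It assumes that the scale and shift act element-wise and takes the raw outputs of the modulator as given inputs; the function name and argument names are hypothetical.

```python
import math


def film_modulate(e, gamma_raw, beta):
    """Column-conditioned modulation of one raw embedding.

    e:         raw category embedding e_{i,c} (list of floats)
    gamma_raw: unsquashed scale produced by the modulator for column c
    beta:      shift produced by the modulator for column c
    Assumes element-wise scaling; tanh bounds the scale to (-1, 1),
    so the effective multiplier (1 + gamma) stays within (0, 2).
    """
    gamma = [math.tanh(g) for g in gamma_raw]
    return [(1.0 + g) * x + b for g, x, b in zip(gamma, e, beta)]
```

With zero scale and shift the embedding passes through unchanged, so the modulation starts as a small perturbation around the identity.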

Each token $h_{i,c}$ is obtained by projecting onto a shared latent dimension $d_m$ through $W_c \in \mathbb{R}^{d_m \times d_b}$, such that $h_{i,c} = W_c \tilde{e}_{i,c}$. The tokens are then concatenated to form $H_i = [h_{i,1}; \ldots; h_{i,C}] \in \mathbb{R}^{C \times d_m}$, vectorized (with dropout), and passed through the MLP encoder $f_\theta$ to yield the latent representation, as defined in Equation 3 (Lijing et al., 2022). Subsequently, each item is associated with an independent classification decoder $g_{\phi,c}: \mathbb{R}^{d} \to \mathbb{R}^{|\mathcal{V}_c|}$, which produces a predictive distribution over the response categories of item $c$ using the softmax function, as described in Equation 4.

From a theoretical perspective, the use of a softmax-based decoder ensures that each column-specific prediction forms a valid probability distribution, enabling stable gradient propagation and facilitating alignment between latent representations and column-level semantics. This design choice is particularly important in the proposed column-conditioned framework, as it allows each feature group to retain semantic independence while sharing a common latent embedding space.

In addition to theoretical considerations, we conducted preliminary sensitivity analyses during the pilot experimentation phase to examine the robustness of the decoder-related hyperparameters. The results indicated that moderate variations in the decoder configuration did not lead to substantial changes in clustering stability or assignment consistency. This suggests that the proposed parameter settings are not overly sensitive and provide a reliable operating range for model optimization.

$$z_i = f_\theta(\mathrm{vec}(H_i)) \in \mathbb{R}^{d} \tag{3}$$
$$\hat{p}_{i,c} = \mathrm{softmax}\left(g_{\phi,c}(z_i)\right) \tag{4}$$
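
The decoder step in Equation 4 can be sketched for a single item as follows. The linear form of the decoder head is an assumption made for brevity (the text does not specify the decoder architecture), and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def decode_column(z, weight, bias):
    """Hypothetical linear decoder head for one item.

    z:      latent vector of length d
    weight: one row of length d per response category of the item
    bias:   one bias per response category
    Returns a probability distribution over the item's categories.
    """
    logits = [sum(w * x for w, x in zip(row, z)) + b
              for row, b in zip(weight, bias)]
    return softmax(logits)
```

Because each item has its own head, the output length differs per item while all heads read the same latent vector.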

To enhance interpretability and separability, a latent prototype is defined for each (item, category) pair. These prototypes are formulated by projecting the base embeddings into the latent space, such that $P_{c,v} = P_c E_c[v] \in \mathbb{R}^{d}$. The response-based mean prototype for sample $i$, as defined in Equation 5, serves as an interpretative reference point for the latent representation, guiding the alignment with $z_i$.

$$\bar{p}_i = \frac{1}{C} \sum_{c=1}^{C} P_{c,\, y_{i,c}} \tag{5}$$
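
Equation 5 amounts to averaging the prototypes of the options a respondent actually chose. A minimal sketch follows, assuming the prototypes are stored as one lookup table per item; this storage layout and the function name are hypothetical.

```python
def mean_prototype(responses, prototypes):
    """Average the prototypes of the options chosen by one respondent.

    responses:  chosen category id for each item, in item order
    prototypes: prototypes[c][v] is the latent prototype (list of floats)
                for category v of item c
    """
    dim = len(prototypes[0][responses[0]])
    total = [0.0] * dim
    for c, v in enumerate(responses):
        for k, value in enumerate(prototypes[c][v]):
            total[k] += value
    return [t / len(responses) for t in total]
```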

The proposed model jointly optimizes noise reconstruction and prototype alignment without requiring labels. Data augmentation is implemented in a straightforward manner: some responses are intentionally masked, and, occasionally, responses are swapped across columns within the same batch. Under this scheme, the model is required to estimate the masked values using only the remaining item information, thereby learning inter-item relationships autonomously (Le Besnerais, 2008).
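
The augmentation scheme can be sketched as below. Several details are assumptions: the mask and swap probabilities, the sentinel `MASK` id, and the reading of "swapped" as replacing a response with another respondent's answer to the same item within the batch.

```python
import random

MASK = -1  # hypothetical sentinel id for a masked response


def corrupt(batch, mask_prob=0.15, swap_prob=0.05, seed=0):
    """Return a corrupted copy of the batch and a mask indicator.

    batch: list of respondents, each a list of category ids per item.
    With probability mask_prob a response is replaced by MASK; with
    probability swap_prob it is replaced by the answer of a randomly
    drawn respondent (possibly the same one) to the same item.
    """
    rng = random.Random(seed)
    n = len(batch)
    corrupted = [row[:] for row in batch]
    mask = [[0] * len(row) for row in batch]
    for i in range(n):
        for c in range(len(batch[i])):
            r = rng.random()
            if r < mask_prob:
                corrupted[i][c] = MASK
                mask[i][c] = 1
            elif r < mask_prob + swap_prob:
                j = rng.randrange(n)
                corrupted[i][c] = batch[j][c]
    return corrupted, mask
```

Calling `corrupt` twice with different seeds yields two views of the same batch, which is the form of input that a contrastive objective requires.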

By attempting to reconstruct these corrupted samples, the model naturally captures the statistical dependencies among items. Specifically, the cross-entropy between the predicted distribution $\hat{p}_{i,c}$ at the masked positions and the original response $y_{i,c}^{\mathrm{orig}}$ is computed. The mask reconstruction loss, as defined in Equation 6, is then obtained by averaging this cross-entropy over all positions where the mask indicator $m_{i,c} = 1$.

$$\mathcal{L}_{\mathrm{mask}} = \frac{1}{\sum_{i,c} m_{i,c}} \sum_{i=1}^{N} \sum_{c=1}^{C} m_{i,c}\, \mathrm{CE}\left(\hat{p}_{i,c},\, y_{i,c}^{\mathrm{orig}}\right) \tag{6}$$

However, because some mini-batches may contain few randomly masked responses, we stabilize convergence by using a global restoration term that adds a lightly weighted cross-entropy over all positions (Busoniu et al., 2010). This term is simply calculated as the average across all positions, as in Equation 7.

$$\mathcal{L}_{\mathrm{full}} = \frac{1}{NC} \sum_{i=1}^{N} \sum_{c=1}^{C} \mathrm{CE}\left(\hat{p}_{i,c},\, y_{i,c}^{\mathrm{orig}}\right) \tag{7}$$
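
The two reconstruction terms in Equations 6 and 7 differ only in which positions are averaged. The sketch below computes both from the same cross-entropy table; the function names and the nested-list layout are assumptions.

```python
import math


def cross_entropy(probs, target):
    """Cross-entropy of one predicted distribution against a category id."""
    return -math.log(max(probs[target], 1e-12))


def reconstruction_losses(pred, y_orig, mask):
    """Return (masked loss, full loss) for a batch.

    pred[i][c]:   predicted distribution for respondent i, item c
    y_orig[i][c]: original category id
    mask[i][c]:   1 if the position was masked, else 0
    """
    n, n_items = len(y_orig), len(y_orig[0])
    ce = [[cross_entropy(pred[i][c], y_orig[i][c]) for c in range(n_items)]
          for i in range(n)]
    n_masked = sum(sum(row) for row in mask)
    masked_sum = sum(mask[i][c] * ce[i][c]
                     for i in range(n) for c in range(n_items))
    l_mask = masked_sum / max(n_masked, 1)   # guard against empty masks
    l_full = sum(sum(row) for row in ce) / (n * n_items)
    return l_mask, l_full
```

The guard in the masked average reflects the situation the text describes: a batch with no masked positions contributes nothing through the masked term, so the full term keeps the learning signal alive.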

In addition, each item-choice combination is assigned a representative vector (prototype) in the latent space, and the average $\bar{p}_i$ of the prototypes of the choices actually selected by a respondent is regarded as that respondent’s typical response pattern. The prototype alignment loss, defined in Equation 8, pulls the latent vector $z_i$ generated by the encoder toward this typical pattern, which establishes an interpretable reference point in the latent space and sharpens the cluster centers. In short, the structure between items is learned through a restoration task that scores only the masked responses; a lightly weighted full restoration is added so that the learning signal is not interrupted even in batches with few masked positions; and the latent representation is aligned to the response-based representative pattern to secure separability and interpretability simultaneously (Hang et al., 2020). In actual optimization, these three terms are combined with appropriate weights and minimized simultaneously.

$$\mathcal{L}_{\mathrm{proto}} = \frac{1}{N} \sum_{i=1}^{N} \left\lVert z_i - \bar{p}_i \right\rVert_2^2 \tag{8}$$
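
Equation 8 is a mean squared Euclidean distance between each latent vector and its mean prototype. A minimal sketch with a hypothetical function name:

```python
def prototype_alignment_loss(z, p_bar):
    """Mean squared Euclidean distance between latents and mean prototypes.

    z:     list of latent vectors, one per respondent
    p_bar: list of mean prototypes, aligned with z
    """
    total = 0.0
    for zi, pi in zip(z, p_bar):
        total += sum((a - b) ** 2 for a, b in zip(zi, pi))
    return total / len(z)
```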

In the representation learning phase, the model learns to treat two augmented views of the same sample as a positive pair and the other samples in the same batch as negatives. The contrastive objective used here is the InfoNCE loss, which increases the cosine similarity between the two latent vectors $z_i^{(a)}$ and $z_i^{(b)}$ of sample $i$ while decreasing their similarity with other samples (Yiwei et al.). The scale is controlled by the temperature hyperparameter $T$, and the final form is expressed as Equation 9. In Equation 9, sim denotes the cosine similarity; the loss arranges individuals with similar features closer together and individuals with dissimilar features farther apart in the latent space.

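$$\mathcal{L}_{\mathrm{contr}} = -\frac{1}{2N} \sum_{i=1}^{N} \left[ \log \frac{\exp\left(\mathrm{sim}(z_i^{(a)}, z_i^{(b)}) / T\right)}{\sum_{j=1}^{N} \exp\left(\mathrm{sim}(z_i^{(a)}, z_j^{(b)}) / T\right)} + \log \frac{\exp\left(\mathrm{sim}(z_i^{(b)}, z_i^{(a)}) / T\right)}{\sum_{j=1}^{N} \exp\left(\mathrm{sim}(z_i^{(b)}, z_j^{(a)}) / T\right)} \right] \tag{9}$$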
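
A symmetric InfoNCE loss of the kind described in Equation 9 can be sketched as follows. The temperature value of 0.5 is a placeholder (the text does not report the value used), the negatives are taken from the opposite view only, and the vectors are assumed to be non-zero so that the cosine similarity is defined.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(za, zb, temperature=0.5):
    """Symmetric InfoNCE over two views of the same batch.

    za[i] and zb[i] are the latent vectors of the two augmented views
    of sample i; all other samples in the opposite view are negatives.
    """
    n = len(za)

    def one_direction(x, y):
        total = 0.0
        for i in range(n):
            logits = [cosine(x[i], y[j]) / temperature for j in range(n)]
            m = max(logits)
            log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
            total += logits[i] - log_denom   # log-probability of the positive
        return total

    return -(one_direction(za, zb) + one_direction(zb, za)) / (2 * n)
```

The loss is small when each sample's two views are the most similar pair in the batch and grows when a view is closer to another sample than to its own counterpart.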

This contrastive loss is combined with the previously described mask restoration, full restoration, and prototype alignment to form a single pre-training objective. The weights are adjusted appropriately, but are generally set to 1.0 for mask restoration, 0.3 for full restoration, 0.5 for prototype alignment, and 0.5 for contrastive learning, as shown in Equation 10:

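$$\mathcal{L}_{\text{pre}} = \lambda_{\text{mask}}\mathcal{L}_{\text{mask}} + \lambda_{\text{full}}\mathcal{L}_{\text{full}} + \lambda_{\text{proto}}\mathcal{L}_{\text{proto}} + \lambda_{\text{contr}}\mathcal{L}_{\text{contr}} \tag{10}$$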

The latent representations z_i obtained through pre-training are first partitioned using K-means, and the cluster boundaries are then refined using DEC fine-tuning (Lin et al., 2020). The soft probability q_ik that sample i belongs to the cluster with center μ_k is calculated first (reflecting the ambiguity near the boundary); the target distribution p_ik is then created by further “sharpening” these probabilities, and the encoder and centers are jointly updated to reduce the difference between the two distributions. The soft probability is defined using the Student-t kernel as shown in Equation 11.

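$$q_{ik} = \frac{\left(1 + \lVert z_i - \mu_k \rVert^2/\alpha\right)^{-\frac{\alpha+1}{2}}}{\sum_{k'=1}^{K}\left(1 + \lVert z_i - \mu_{k'} \rVert^2/\alpha\right)^{-\frac{\alpha+1}{2}}} \tag{11}$$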
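
A minimal plain-Python sketch of the Student-t soft assignment in Equation 11 is given below. The function name and the default of alpha = 1.0 are assumptions made for illustration; the paper does not state the value of α used.

```python
from typing import List, Sequence


def soft_assign(latents: Sequence[Sequence[float]],
                centers: Sequence[Sequence[float]],
                alpha: float = 1.0) -> List[List[float]]:
    """Return q[i][k], the soft probability that sample i belongs to cluster k."""
    q = []
    for z in latents:
        kernel = []
        for mu in centers:
            squared_distance = sum((a - b) ** 2 for a, b in zip(z, mu))
            kernel.append((1.0 + squared_distance / alpha) ** (-(alpha + 1.0) / 2.0))
        normalizer = sum(kernel)
        q.append([value / normalizer for value in kernel])
    return q
```

For example, a sample lying exactly on one center and at unit distance from a second center receives soft probabilities of 2/3 and 1/3 when alpha = 1.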

Here, the target distribution is constructed as in Equation 12, which sharpens the assignments by making already large probabilities larger and small probabilities smaller; the encoder and centers are then jointly updated to minimize the KL divergence in Equation 13 (Finlayson, 2000).

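$$p_{ik} = \frac{q_{ik}^2 / \sum_{i'} q_{i'k}}{\sum_{k'}\left(q_{ik'}^2 / \sum_{i'} q_{i'k'}\right)} \tag{12}$$
$$\mathcal{L}_{\text{DEC}} = \sum_{i=1}^{N}\sum_{k=1}^{K} p_{ik}\log\frac{p_{ik}}{q_{ik}} \tag{13}$$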
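
Equations 12 and 13 can be sketched in plain Python as follows. This is an illustrative sketch only: the function names are assumptions, and the gradient step that updates the encoder and centers is omitted.

```python
import math
from typing import List, Sequence


def target_distribution(q: Sequence[Sequence[float]]) -> List[List[float]]:
    """Sharpen soft assignments: square q_ik, divide by the cluster's soft size, renormalize."""
    n_clusters = len(q[0])
    soft_size = [sum(row[k] for row in q) for k in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[k] ** 2 / soft_size[k] for k in range(n_clusters)]
        normalizer = sum(weights)
        p.append([w / normalizer for w in weights])
    return p


def dec_loss(p: Sequence[Sequence[float]], q: Sequence[Sequence[float]]) -> float:
    """KL divergence between the target distribution p and the soft assignments q."""
    return sum(p_ik * math.log(p_ik / q_ik)
               for p_row, q_row in zip(p, q)
               for p_ik, q_ik in zip(p_row, q_row)
               if p_ik > 0.0)
```

With q = [[0.7, 0.3], [0.4, 0.6]], the dominant entry of each row increases in p, which is the sharpening effect described in the text.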

Intuitively, this process rearranges the latent space so that the soft probabilities q computed by the model are aligned with the more confident targets p. As a result, similar patterns become more tightly clustered, while dissimilar patterns move farther apart, sharpening the cluster boundaries. Once training converges, the final label for each sample is determined as k̂_i = arg max_k q_ik.

The final output merges these labels into the original table as a single variable and reports, for each cluster, the top-ranked categories by lift(k, c, v) = Pr(v∣k)/Pr(v), which identifies the responses that define the clusters at the column-category level. Quality is assessed with intrinsic metrics (Silhouette with Hamming distance, Davies–Bouldin, Calinski–Harabasz). The number of clusters is constrained to be at least 3; a reasonable candidate set (e.g., 3–7) is iterated over, and the value with the lowest Davies–Bouldin index is selected (Firman et al., 2023). Starting from the initial K-means centers, Deep Embedded Clustering (DEC) is applied to fine-tune the boundaries. DEC jointly updates the encoder and centers by reducing the KL divergence, aligning the soft assignments computed with the Student-t kernel with the target distribution. As a result, the latent space is rearranged so that similar response patterns are closer together and different patterns are further apart, yielding the final clusters (Figure 1).
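
The selection of the number of clusters by the lowest Davies–Bouldin index can be sketched as below. This is a plain-Python illustration under assumptions: distances are Euclidean, the cluster labels for each candidate K are assumed to have been computed beforehand, and the function names are hypothetical. Clusters whose centroids coincide would cause a division by zero and are not handled.

```python
import math
from typing import Dict, List, Sequence, Tuple


def davies_bouldin(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Davies-Bouldin index: lower values indicate compact, well-separated clusters."""
    members: Dict[int, List[Sequence[float]]] = {}
    for x, k in zip(points, labels):
        members.setdefault(k, []).append(x)
    centroids = {k: [sum(col) / len(m) for col in zip(*m)] for k, m in members.items()}
    scatter = {k: sum(math.dist(x, centroids[k]) for x in m) / len(m)
               for k, m in members.items()}
    keys = list(members)
    worst_ratios = [
        max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in keys if j != i)
        for i in keys
    ]
    return sum(worst_ratios) / len(keys)


def select_k(points: Sequence[Sequence[float]],
             labelings: Dict[int, Sequence[int]]) -> Tuple[int, Dict[int, float]]:
    """Pick the candidate K whose labeling has the lowest Davies-Bouldin index."""
    scores = {k: davies_bouldin(points, labels) for k, labels in labelings.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores
```

In practice, `labelings` would hold one labeling per candidate K (for example, K = 3 to 7), each produced by the same clustering procedure on the same embedding.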

Figure 1
Overall architecture of the CoPE-DEC Framework is depicted. It processes input data using several modules: Column-Conditioned Encoder for feature transformation, Latent Representation Layer for encoding, Prototype-Enhanced Clustering for aligning with prototypes, Soft Assignment for cluster refinement, and Joint Optimization for embedding objectives. The output is distinct viewer clusters, designed for high-dimensional sports media data analysis.

Figure 1. Overall architecture of the CoPE-DEC framework.

4.4 Data preprocessing and augmentation

To ensure reproducibility and methodological rigor, the data preprocessing and augmentation procedures were explicitly defined and systematically applied prior to model training. First, identifier-like variables (e.g., response IDs, timestamps, or metadata not related to viewer behavior) were removed, as these features do not contribute meaningful semantic information and may introduce spurious correlations. In addition, categorical variables with extremely low frequency—defined as categories appearing in fewer than a predefined threshold of samples—were excluded to mitigate sparsity and reduce noise in the representation space.

For categorical features with sparse response distributions, a category consolidation rule was applied. Specifically, infrequent categories were grouped into an “other” class to preserve statistical reliability while maintaining semantic coherence. This procedure prevented the fragmentation of category-specific embeddings and ensured stable learning during column-conditioned encoding. All remaining features were encoded using a column-aware encoding strategy designed to preserve feature-level semantics. Each feature column was independently transformed according to its data type, ensuring that numerical and categorical variables retained their intrinsic meaning prior to projection into the shared latent space. This design aligns with the column-conditioned embedding mechanism employed in the proposed framework, enabling meaningful correspondence between input variables and learned representations.
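
The category consolidation rule can be illustrated with a short plain-Python sketch. The threshold `min_count` and the label "other" used here are placeholders; the paper does not report the actual threshold.

```python
from collections import Counter
from typing import List, Sequence


def consolidate_rare(values: Sequence[str], min_count: int,
                     other_label: str = "other") -> List[str]:
    """Replace categories observed fewer than min_count times with a shared label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]
```

For instance, with `min_count=2`, a column containing three "tv" responses, two "web" responses, and a single "radio" response keeps "tv" and "web" and maps "radio" to "other".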

To enhance model robustness and prevent overfitting, data augmentation strategies were incorporated during training. Specifically, a column-wise masking strategy was applied by randomly masking a subset of feature values within each batch, encouraging the model to learn redundant and robust representations. Additionally, a within-batch swapping strategy was employed, in which feature values were exchanged among samples within the same batch under controlled conditions. This augmentation approach increases data diversity while preserving the underlying marginal distributions and column-level semantics. Collectively, these preprocessing and augmentation steps ensure that the input data are semantically consistent, statistically stable, and suitable for deep clustering. Explicitly documenting each stage of the preprocessing pipeline supports experimental reproducibility and strengthens the scientific validity of the proposed analysis.
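
The two augmentation strategies can be sketched as follows in plain Python. This is one possible reading of the description, not the authors' implementation: the mask token, the masking and swapping rates, and the function names are all assumptions. Swapping is implemented as a permutation of values within a single column, which is why each column's marginal distribution is preserved.

```python
import random
from typing import List, Optional, Sequence

MASK = None  # hypothetical mask token


def mask_columns(batch: Sequence[Sequence[str]], mask_rate: float,
                 rng: random.Random) -> List[List[Optional[str]]]:
    """Randomly replace individual feature values with the mask token."""
    return [[MASK if rng.random() < mask_rate else v for v in row] for row in batch]


def swap_within_batch(batch: Sequence[Sequence[str]], swap_rate: float,
                      rng: random.Random) -> List[List[str]]:
    """Permute values within each column among a random subset of rows."""
    out = [list(row) for row in batch]
    for c in range(len(batch[0])):
        chosen = [r for r in range(len(batch)) if rng.random() < swap_rate]
        donors = chosen[:]
        rng.shuffle(donors)
        for receiver, donor in zip(chosen, donors):
            out[receiver][c] = batch[donor][c]  # value stays in its own column
    return out
```

Two augmented views of a batch, such as those needed for a contrastive objective, could then be obtained by applying these functions twice with independent random draws.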

4.5 Comparison with state-of-the-art (SOTA) deep clustering methods

While the baseline methods employed in this study were deliberately selected to ensure interpretability and compatibility with categorical, survey-based sports media data, we acknowledge the importance of situating the proposed framework within the broader landscape of recent state-of-the-art (SOTA) deep clustering research. Recent advances in deep clustering—particularly contrastive clustering, self-supervised representation learning, and transformer-based frameworks—have demonstrated remarkable performance on large-scale vision, text, and multimodal benchmarks (Caron et al., 2021; Li et al., 2023; Zhou et al., 2022). These approaches primarily focus on maximizing representation quality and clustering accuracy through end-to-end learning from high-dimensional, homogeneous, and often continuous input data.

However, the direct applicability of such SOTA models to column-structured, semantically heterogeneous survey data remains limited. Many recent frameworks implicitly assume feature homogeneity, spatial or temporal continuity, or modality-aligned embeddings, which are rarely satisfied in audience survey datasets composed of discrete, conceptually independent items (Xu et al., 2022; Abid et al., 2021). Moreover, end-to-end representation learning without explicit variable-level conditioning can obscure item semantics, thereby constraining interpretability, an essential requirement in human–computer interaction (HCI)–oriented audience research. To address this methodological gap, baseline selection in this study was guided by the data-type compatibility, semantic transparency, and interpretability requirements intrinsic to sports media viewer analysis. In contrast to many SOTA approaches that treat clustering accuracy as a primary objective, the proposed Column-conditioned Prototype-Enhanced Deep Embedded Clustering (CoPE-DEC) framework is explicitly designed to support human-centered analytical goals, where preserving item-level meaning and enabling theoretically grounded interpretation of viewer typologies are equally critical.

In addition, we introduce a conceptual comparison between CoPE-DEC and recent SOTA deep clustering frameworks, highlighting complementary strengths and inherent trade-offs. While contemporary models excel at learning abstract representations from unstructured data (Caron et al., 2021; Van Gansbeke et al., 2021), CoPE-DEC introduces several methodological innovations tailored to structured audience data: (1) column-conditioned embeddings that preserve feature-specific semantics, (2) prototype-aligned clustering mechanisms that enhance interpretability and stability, and (3) joint optimization strategies that balance representation learning with cluster refinement under semantic constraints. These design principles reflect a deliberate emphasis on HCI-aligned explainability and analytical robustness, rather than on marginal gains in clustering accuracy alone. Accordingly, the novelty of CoPE-DEC lies not in outperforming existing methods on generic benchmarks, but in providing a domain-adaptive, interpretable, and HCI-oriented clustering framework capable of uncovering latent viewer typologies within complex sports media environments. This positioning clarifies the contribution of the proposed method within contemporary deep clustering research, particularly for applications where semantic fidelity and interpretability are paramount.

Finally, to further strengthen empirical validation, future research will incorporate systematic benchmarking against emerging SOTA clustering models under comparable experimental conditions, including matched data structures, evaluation criteria, and interpretability constraints. Such extensions will enable a more comprehensive assessment of performance trade-offs and support the continued methodological refinement of AI-driven audience analysis frameworks in sports media and HCI research.

5 Results

In this study, the proposed CoPE-DEC model was compared with three alternative embedding-based clustering approaches to evaluate the performance of the final clusters. The baseline embedding models included Truncated SVD, Multiple Correspondence Analysis (MCA), and UMAP. For all models, K-means clustering was applied to the resulting embeddings to generate cluster assignments and facilitate comparison.

For the SVD embedding, a sparse matrix was constructed using one-hot encoding (Peiliang, 1998). Truncated SVD (e.g., d = 32, random_state = 42) was then fit to the entire dataset to obtain linear low-dimensional embeddings. K-means clustering (K = 3, k-means++, n_init = 20, fixed seed) was applied to this embedding space to assign cluster labels. Evaluation metrics, including Silhouette (Hamming), Davies–Bouldin Index (DBI), and Calinski–Harabasz (CH) scores, were computed in the one-hot space to ensure consistency. Cluster size, imbalance, and Lift (Top-3) were also reported. For visualization, the same seed was used to project embeddings into 2D using PCA or UMAP.

In the MCA embedding, all survey items were treated as categorical, and the original responses were used to fit the MCA model, yielding factor scores (embeddings) for each respondent (Kareem et al., 2012). K-means clustering was then applied to the MCA embedding space to determine cluster labels. To maintain comparability, Silhouette (Hamming), DBI, CH, cluster size/imbalance, and Lift (Top-3) were calculated in the one-hot space. Visualization of the MCA embeddings was performed by projecting to 2D using PCA, displaying results as points (left panel) or points with centroids and convex hulls (right panel).

For the UMAP embedding, the dataset was first transformed into one-hot format, and UMAP was fit to generate nonlinear embeddings (Allaoui et al., 2020). K-means clustering was subsequently applied to the UMAP embeddings to obtain cluster labels. Evaluation followed the same procedure as the previous models, with metrics computed in the one-hot space and cluster size, imbalance, and Lift (Top-3) reported. For visualization consistency, a single 2D coordinate projection was generated for all models, with identical colors and legends applied to left and right panels.
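
The Silhouette (Hamming) score computed in the one-hot space can be sketched as follows. This is a plain-Python illustration under assumptions: the Hamming distance is normalized by the vector length, singleton clusters are assigned a score of 0 by convention, at least two clusters are required, and the function names are hypothetical.

```python
from typing import List, Sequence


def one_hot(rows: Sequence[Sequence[str]]) -> List[List[int]]:
    """One-hot encode categorical rows, one indicator per (column, level) pair."""
    levels = [sorted(set(col)) for col in zip(*rows)]
    return [[1 if value == level else 0
             for col_levels, value in zip(levels, row)
             for level in col_levels]
            for row in rows]


def hamming(u: Sequence[int], v: Sequence[int]) -> float:
    """Fraction of positions at which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def silhouette_hamming(vectors: Sequence[Sequence[int]], labels: Sequence[int]) -> float:
    """Mean silhouette coefficient using the Hamming distance."""
    clusters = sorted(set(labels))
    scores = []
    for i, x in enumerate(vectors):
        dists = {k: [] for k in clusters}
        for j, y in enumerate(vectors):
            if j != i:
                dists[labels[j]].append(hamming(x, y))
        own = dists[labels[i]]
        if not own:  # singleton cluster
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance to the sample's own cluster
        b = min(sum(d) / len(d) for k, d in dists.items() if k != labels[i] and d)
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)
```

Because the score is computed on the one-hot vectors rather than on any model's embedding, it can be compared across CoPE-DEC, SVD, MCA, and UMAP labelings of the same respondents.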

The results for the four models are shown in Table 1.

Table 1. Comparison of clustering results of models.

Cluster extraction was therefore performed with the finally selected CoPE-DEC model, and the over-represented categories are shown in Table 2. Over-representation is quantified by lift, a value indicating how much more frequently a category appears in a specific cluster than in the data as a whole, as defined in Equation 14 (Premji et al., 2010). In Equation 14, #(c = v in k) is the number of samples in cluster k for which variable c takes the value v, #(k) is the total number of samples in cluster k, #(c = v) is the number of samples in the entire data for which c takes the value v, and N is the total number of samples. Lift is therefore the ratio of the within-cluster proportion to the overall proportion, where # denotes a count.

$$\mathrm{Lift}(c = v, k) = \frac{P(c = v \mid \text{cluster} = k)}{P(c = v)} = \frac{\#(c = v \text{ in } k)\,/\,\#(k)}{\#(c = v)\,/\,N} \tag{14}$$
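
A minimal plain-Python sketch of the lift in Equation 14 for a single variable c is shown below. The function names are assumptions, and the Top-3 default mirrors the Lift (Top-3) reporting mentioned earlier.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def lift_table(values: Sequence[Hashable],
               clusters: Sequence[int]) -> Dict[Tuple[int, Hashable], float]:
    """Lift of each observed (cluster, value) pair for one variable."""
    n = len(values)
    overall = Counter(values)            # #(c = v)
    cluster_size = Counter(clusters)     # #(k)
    joint = Counter(zip(clusters, values))  # #(c = v in k)
    return {(k, v): (count / cluster_size[k]) / (overall[v] / n)
            for (k, v), count in joint.items()}


def top_lift(lifts: Dict[Tuple[int, Hashable], float], cluster: int,
             top_n: int = 3) -> List[Tuple[Hashable, float]]:
    """Return the top_n most over-represented values in the given cluster."""
    ranked = sorted(((v, l) for (k, v), l in lifts.items() if k == cluster),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]
```

A lift above 1 means the category is over-represented in the cluster relative to the whole sample, and a lift below 1 means it is under-represented.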

Table 2. Clustering element results of CoPE-DEC.

6 Discussion

This study aimed to identify latent viewer typologies in AI-driven sports media environments by applying the Column-conditioned Prototype-Enhanced Deep Embedded Clustering (CoPE-DEC) framework from a human–computer interaction (HCI) perspective. Extending beyond descriptive segmentation, the discussion provides an attribution-based interpretation of cluster formation by linking observed audience patterns to underlying HCI mechanisms embedded in contemporary digital sports media ecosystems (Figure 2 and Table 3).

Figure 2
Four scatter plots compare different embedding methods. Top-left: CoPE-DEC with blue, orange, and green clusters. Top-right: SVD with similar cluster colors. Bottom-left: UMAP with distinct positioning for clusters. Bottom-right: MCA with compact clusters. The bottom image replicates the CoPE-DEC plot with a dark background and clear cluster labels.

Figure 2. Clustering results of models.

Table 3. Clustering element features of CoPE-DEC embedding.

6.1 Attributional interpretation of viewer clusters through HCI mechanisms

The emergence of three distinct viewer clusters can be explained by differentiated interaction affordances, psychological engagement mechanisms, and media-mediated social processes inherent in modern sports media platforms. In particular, immersion, parasocial interaction, and interface-mediated participation function as core explanatory mechanisms shaping viewer behavior and orientation.

6.1.1 Cluster 1: sports value orientation and immersive cognitive engagement

Viewers classified under the “Sports Value Orientation” cluster demonstrated heightened concentration, aesthetic appreciation, admiration for athletes, and intrinsic motivation toward health and physical activity. These patterns can be attributed to deep immersive engagement, a central concept in HCI and media psychology. Advanced broadcast technologies—such as high-definition visuals, slow-motion replays, and data-enhanced commentary—facilitate cognitive absorption and attentional focus, enabling viewers to process sports content not merely as entertainment but as meaningful symbolic experiences (Shin and Biocca, 2018; Riva et al., 2022). From an attributional perspective, repeated exposure to immersive sports media reinforces internalized sports values by aligning sensory stimulation with cognitive and emotional appraisal processes. Recent studies suggest that immersive sports media experiences foster long-term value internalization, influencing physical activity intention, moral evaluation, and aesthetic judgment, particularly among digitally native audiences (Wei et al., 2025; Zhang and Zhang, 2023). However, commercialization and algorithm-driven highlight curation may simultaneously reshape value hierarchies by prioritizing spectacle and celebrity narratives over educational or ethical dimensions of sport (Gantz and Wenner, 2020; Lee et al., 2020). Cluster 1 thus reflects a tension between intrinsic sports values and mediated representations shaped by platform design.

6.1.2 Cluster 2: sports consumption culture and interface-mediated participation

The “Sports Consumption Culture” cluster is characterized by frequent media consumption, sports-related tourism, merchandise purchases, and skill acquisition through digital platforms. This pattern can be causally linked to interface-mediated participation mechanisms, wherein platform architectures actively encourage transactional and experiential engagement. Features such as subscription models, recommendation algorithms, in-stream advertising, and integrated e-commerce transform viewers into active participants within a sports consumption ecosystem (Trail et al., 2021; Ratten, 2021). HCI research emphasizes that interfaces do not merely transmit content but structure user behavior by shaping choice architectures and participation pathways (Norman, 2013; Sundar, 2020). In this context, sports media platforms function as consumption orchestrators, lowering participation barriers and reinforcing habitual engagement cycles. The attribution of Cluster 2’s behavior lies in the synergistic interaction between user agency and system-level affordances, which collectively normalize continuous consumption and experiential diversification. This cluster exemplifies how sports fandom evolves into a cultural practice sustained by digital infrastructures rather than by event-centric attendance alone (Popp et al., 2019).

6.1.3 Cluster 3: sports attitude orientation and parasocial–social integration

The “Sports Attitude Orientation” cluster reflects social adaptation, emotional regulation, stress relief, and group solidarity derived from sports viewing. These outcomes can be explained through parasocial interaction and social presence mechanisms, which are increasingly amplified in interactive sports media environments. Real-time chat, social media integration, and influencer-style commentary foster perceived intimacy and social connectedness, enabling viewers to experience sports as shared emotional events (Lim et al., 2024; Tukachinsky, 2021). From an HCI standpoint, these mechanisms reduce psychological distance and promote affective bonding, positioning sports media as a tool for emotional coping and identity reinforcement. Empirical evidence suggests that such digitally mediated social engagement enhances wellbeing, resilience, and sustained participation, particularly in post-pandemic media consumption contexts (Filo et al., 2015; Kim and Hwang, 2019). Cluster 3 thus represents an affective–social orientation shaped by the convergence of parasocial cues and community-based interaction interfaces.

6.2 Broader implications for HCI and the sports media industry

Collectively, the findings illustrate that contemporary sports media consumption is not driven by a single motivational logic but by distinct HCI-mediated pathways that produce differentiated viewer typologies. The integration of AI-driven clustering with HCI theory enables a more nuanced understanding of how interface design, interaction patterns, and psychological mechanisms jointly shape audience behavior. Practically, these insights have significant implications for sports media strategy. Content personalization, interface customization, and engagement design can be optimized by aligning platform features with the dominant HCI mechanisms associated with each viewer type. Moreover, the application of HCI extends beyond media consumption to athlete training, performance analysis, and fan engagement. Emerging technologies—such as AR/VR/MR, wearable sensors, and real-time feedback systems—facilitate immersive training environments, data-driven coaching, and enhanced fan experiences, reinforcing the convergence of performance analytics and user-centered design (Funk et al., 2018; Gürses et al., 2020; Riva et al., 2022).

6.3 Theoretical contributions and future directions

The present study contributes to sports media research by demonstrating how HCI-aligned deep clustering frameworks can move beyond descriptive segmentation toward explanatory audience modeling. By attributing cluster formation to specific interaction mechanisms, this study advances theoretical integration between AI-based audience analytics and HCI theory. Future research should further investigate causal pathways through longitudinal designs, multimodal data integration, and experimental manipulation of interface features. Additionally, benchmarking CoPE-DEC against emerging SOTA models under comparable conditions will strengthen generalizability while preserving interpretability and domain relevance.

7 Conclusion

This study proposed and empirically validated an HCI-oriented deep clustering framework—Column-conditioned Prototype-Enhanced Deep Embedded Clustering (CoPE-DEC)—to identify latent viewer typologies within contemporary AI-driven sports media environments. By moving beyond conventional descriptive segmentation, the study demonstrated that sports media audiences can be meaningfully classified according to distinct interaction-driven mechanisms that reflect how users cognitively, emotionally, and socially engage with digital sports content.

The findings revealed three interpretable viewer clusters—Sports Value Orientation, Sports Consumption Culture, and Sports Attitude Orientation—each shaped by different HCI-mediated processes, including immersive cognitive engagement, interface-mediated participation, and parasocial–social integration. Importantly, these clusters did not merely represent differences in viewing frequency or preference, but reflected deeper attributional mechanisms through which digital interfaces, media affordances, and psychological engagement jointly structure sports media consumption. This supports the argument that audience heterogeneity in digital sports media is fundamentally interactional rather than purely demographic or behavioral.

From a theoretical perspective, this study contributes to the literature by bridging AI-based audience analytics with human–computer interaction theory. While prior sports media research has largely treated segmentation as an outcome-oriented classification task, the present study reframes viewer typologies as emergent results of interaction mechanisms embedded in platform design and media systems. In doing so, it advances a mechanism-based understanding of sports media consumption that integrates immersion theory, parasocial interaction, and interface affordance theory within a unified analytical framework.

Methodologically, the study extends deep embedded clustering research by introducing a column-conditioned and prototype-aligned architecture tailored to structured, semantically heterogeneous survey data. Unlike many state-of-the-art deep clustering models optimized for unstructured or homogeneous inputs, CoPE-DEC prioritizes interpretability, semantic alignment, and stability—features that are critical for HCI-driven audience analysis and applied sports media research. This positions CoPE-DEC as a domain-adaptive alternative to accuracy-centered clustering approaches, particularly in contexts where explainability and human-centered interpretation are essential.

From a practical standpoint, the results offer actionable insights for the sports media industry. Understanding viewer typologies through HCI mechanisms enables media organizations to design differentiated content strategies, personalized interfaces, and engagement pathways aligned with specific audience orientations. Moreover, the implications of this framework extend beyond media consumption to adjacent domains, including fan engagement, athlete training, performance analytics, and immersive experience design, where HCI technologies such as AR/VR/MR, wearables, and real-time feedback systems are increasingly influential.

Despite these contributions, several limitations should be acknowledged. The cross-sectional nature of the data constrains causal inference, and the reliance on self-reported survey measures may not fully capture dynamic interaction behaviors. Future research should therefore employ longitudinal designs, multimodal data sources, and experimental manipulation of interface features to more rigorously test causal pathways. Additionally, systematic benchmarking against emerging state-of-the-art deep clustering models under matched conditions will further clarify performance trade-offs and enhance generalizability.

In conclusion, this study demonstrates that integrating deep learning with an HCI perspective enables a more explanatory, interpretable, and human-centered understanding of sports media audiences. By revealing how interaction mechanisms shape distinct viewer typologies, the proposed CoPE-DEC framework offers both a methodological advancement and a theoretical foundation for future AI-driven audience research in complex digital media environments.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors upon reasonable request.

Author contributions

Y-SJ: Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that Generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Abid, A., Zhang, M. J., Bagaria, V. K., and Zou, J. (2021). Exploring patterns enriched by interpretable machine learning. Nat. Mach. Intell. 3, 1–9. doi: 10.1038/s42256-021-00304-8

Allaoui, M., Kherfi, M. L., and Cheriet, A. (2020). “Considerably improving clustering algorithms using UMAP dimensionality reduction technique: a comparative study” in International conference on image and signal processing (Cham: Springer), 317–325.

Belk, R. W. (2013). Extended self in a digital world. J. Consum. Res. 40, 477–500. doi: 10.1086/671052

Bunker, R., and Susnjak, T. (2022). The application of machine learning techniques for predicting match results in team sport: a review. J. Artif. Intell. Res. 73, 1285–1322. doi: 10.1613/jair.1.13493

Busoniu, L., Babuska, R., De Schutter, B., and Ernst, D. (2010). Reinforcement learning and dynamic programming using function approximators. Boca Raton, Florida (USA): CRC Press.

Card, S. K., Moran, T. P., and Newell, A. (1983). The psychology of human-computer interaction. CRC Press.

Caron, M., Misra, I., Mairal, J., Goyal, P., Bojanowski, P., and Joulin, A. (2021). “Unsupervised learning of visual features by contrasting cluster assignments” in Advances in neural information processing systems, (Red Hook, New York, USA: Curran Associates, Inc) vol. 34, 9912–9924.

Chae, J.-M., and Kim, E.-H. (2021). A sales prediction model for apparel products using machine learning: focusing on outerwear. Journal of the Korean Society of Apparel Industry 23, 480–490.

Chen, Y. (2024). Deep learning–based audience segmentation for digital media markets. Expert Syst. Appl. 237:121543. doi: 10.1016/j.eswa.2023.121543

Chen, Z., Xu, H., and Whinston, A. (2022). Moderating effects of real-time interaction on live streaming engagement. Inf. Syst. Res. 33, 1235–1256. doi: 10.1287/isre.2022.1103

Chiu, C. M., Hsu, M. H., Sun, S. Y., Lin, T. C., and Sun, P. C. (2017). Usability, quality, value and e-learning continuance decisions. Comput. Educ. 45, 399–416.

Csikszentmihalyi, M. (1990). Flow: The psychology of optimal experience. Harper & Row.

Filo, K., Lock, D., and Karg, A. (2015). Sport and social media: a review. Sport Management Review 18, 166–181. doi: 10.1016/j.smr.2014.11.001

Finlayson, O. (2000). Information theory and clustering. J. Classif. 17, 1–22. doi: 10.1007/s003579900001

Firman, M. I., Nugroho, L. E., and Hidayatno, A. (2023). Evaluation of clustering validity indices for categorical and mixed-type data. Expert Syst. Appl. 213:118901. doi: 10.1016/j.eswa.2022.118901

Frederick, D. A., Tylka, T. L., Rodgers, R. F., Convertino, L., Pennesi, J. L., Parent, M. C., et al. (2022). Pathways from sociocultural and objectification constructs to body satisfaction among men: the U.S. Body Project I. Body Image 41, 84–96. doi: 10.1016/j.bodyim.2022.01.018

Funk, D. C., Lock, D., Karg, A., and Pizzo, A. D. (2016). Sport consumer behaviour research: improving our game. Sport Manage. Rev. 19, 113–116. doi: 10.1016/j.smr.2016.01.003

Funk, D. C., Pizzo, A. D., and Baker, B. J. (2018). eSport management: embracing eSport education and research opportunities. Sport Manage. Rev. 21, 7–13. doi: 10.1016/j.smr.2017.07.008

Gantz, W., and Wenner, L. A. (2020). Media, sports, and society. J. Sport Soc. Issues 44, 379–390. doi: 10.1177/0193723520948522

Giulianotti, R. (2022). Sport, globalization, and culture. Cambridge, United Kingdom: Polity Press.

Gürses, E., Aydın, A., and Tokgöz, H. (2020). Wearable technology and human–computer interaction in sports training: opportunities and challenges. International Journal of Sports Science & Coaching 15, 757–769.

Hamari, J., and Sjöblom, M. (2017). The rise of motivational information systems: a review of gamification research. Int. J. Inf. Manag. 45, 191–210.

Hang, S., Kim, J., and Park, S. (2020). Prototype-based representation learning for interpretable clustering. Pattern Recogn. 107:107494. doi: 10.1016/j.patcog.2020.107494

Horton, D., and Wohl, R. (1956). Mass communication and Para-social interaction. Psychiatry 19, 215–229.

Horvat, T., and Job, J. (2020). The use of machine learning in sport outcome and consumption prediction. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 10:e1380. doi: 10.1002/widm.1380

Kareem, E. I. A., Alsalihy, W. A. A., and Jantan, A. (2012). Multi-connect architecture associative memory. Intell. Autom. Soft Comput. 18, 279–296.

Katz, E., Blumler, J. G., and Gurevitch, M. (1973). Uses and gratifications research. Public Opin. Q. 37, 509–523.

Kim, B., and Hwang, H. (2019). The relationship between satisfaction and approval ratings of volunteers in PyeongChang 2018 winter Olympics. Korean Journal of Physical Education 58, 49–64.

Kim, J.-E., Park, J.-M., and Park, J.-C. (2024). Application of machine learning to predict positions and outcomes in soccer. Journal of Physical Education Research 42, 13–21.

Kotler, P., and Keller, K. L. (2022). Marketing management. 16th Edn. London, United Kingdom: Pearson.

Le Besnerais, G. (2008). Data augmentation strategies for machine learning. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1563–1575. doi: 10.1109/TPAMI.2007.70826

Lee, S.-M., Jeong, H.-C., So, W.-Y., and Youn, H.-S. (2020). Mediating effect of sports participation on adolescent health. Int. J. Environ. Res. Public Health 17:6744.

Lijing, L. (P.), Hu, Y., and Li, J. (2022). Deep embedded clustering with data augmentation and self-supervision. Neurocomputing 500, 235–246.

Li, J., Zhou, P., Xiong, C., and Hoi, S. C. H. (2023). Prototypical contrastive learning of unsupervised representations. IEEE Trans. Pattern Anal. Mach. Intell. 45, 1036–1051. doi: 10.1109/TPAMI.2021.3139751

Lim, J., and Park, S. (2023). Platform-based interaction effects on sports media engagement. Sport Manage. Rev. 26, 310–325. doi: 10.1016/j.smr.2022.06.004

Lim, J., Lim, S., and Lee, J. (2024). Digital sports media engagement and psychological well-being. Comput. Hum. Behav. 146:107773. doi: 10.1016/j.chb.2023.107773

Lim, S.-Y., and Lim, S.-J. (2024). Viewing intentions toward breaking at the 2024 Paris Olympics. Journal of the Korean Society of Sport Industry Management 29, 1–17.

Lin, T.-E., Xu, H., and Zhang, H. (2020). Discovering new intents via constrained deep adaptive clustering with cluster refinement. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence. Association for the Advancement of Artificial Intelligence. pp. 8360–8367.

Li, X., Hu, Y., and Li, J. (2022). Deep embedded clustering with data augmentation and self-supervision. Neurocomputing 500, 235–246. doi: 10.1016/j.neucom.2022.05.012

Norman, D. A. (2013). The design of everyday things. Revised Edn. New York, NY, USA: Basic Books.

Oliver, R. L. (2014). Satisfaction: a behavioral perspective on the consumer. 2nd Edn. Routledge.

Peiliang, X. (1998). Truncated SVD methods for ill-posed problems. Geophys. J. Int. 135, 505–514.

Pizzo, A. D. (2023). Hypercasual and hybrid-casual video gaming: a digital leisure perspective. Leis. Sci. 47. doi: 10.1080/01490400.2023.2211056

Popp, A. L., Scheidegger, A., Moeck, C., Brennwald, M. S., and Kipfer, R. (2019). Integrating Bayesian groundwater mixing modeling with on-site helium analysis to identify unknown water sources. Water Resour. Res. 55, 10602–10615. doi: 10.1029/2019WR025677

Popp, D. (2020). Promoting clean energy innovation at the state and local level. Agric. Resour. Econ. Rev. 49, 360–373. doi: 10.1017/age.2020.15

Preece, J., Rogers, Y., and Sharp, H. (2022). Interaction design: beyond human–computer interaction. 6th Edn. Hoboken, New Jersey, USA: Wiley.

Premji, A., Ziluk, A., Nelson, A. J., et al. (2010). Bilateral somatosensory evoked potentials following intermittent theta-burst repetitive transcranial magnetic stimulation. BMC Neurosci. 11:91. doi: 10.1186/1471-2202-11-91

Randrianantoanina, A. (2022). “Feature-wise linear modulation (FiLM) for conditional representation learning” in Proceedings of the AAAI conference on artificial intelligence, Washington, D.C., USA: AAAI Press (Association for the Advancement of Artificial Intelligence). vol. 36, 4512–4520.

Ratten, V. (2021). Sport technology and innovation. Technol. Forecast. Soc. Change 170:120904. doi: 10.1016/j.techfore.2021.120904

Riva, G., Wiederhold, B. K., and Mantovani, F. (2022). Neuroscience of immersive technologies. Cyberpsychol. Behav. Soc. Netw. 25, 213–220. doi: 10.1089/cyber.2021.29298.gri

Saito, S., and Yamamoto, K. (2023). Robust prototype-based clustering using deep representation learning. IEEE Trans. Pattern Anal. Mach. Intell. 45, 1124–1138. doi: 10.1109/TPAMI.2022.3164589

Shin, D., and Biocca, F. (2018). Impact of social influence and users’ perception of coolness on smartwatch behavior. Soc. Behav. Pers. 46, 881–890. doi: 10.2224/sbp.5134

Sjöblom, M., Hamari, J., and Järvelä, S. (2019). Why do people watch live streams? Comput. Hum. Behav. 93, 181–199. doi: 10.1016/j.chb.2018.12.019

Sundar, S. S. (2020). Rise of machine agency. J. Comput.-Mediat. Commun. 25, 74–88. doi: 10.1093/jcmc/zmz026

Trail, G. T., Robinson, M. J., and Dick, R. J. (2021). The evolving role of fans. J. Sport Manag. 35, 562–577.

Trail, G. T., Robinson, M. J., Dick, R. J., and Gillentine, A. J. (2019). Motives and points of attachment in sport consumption. Sport Manag. Rev. 22, 301–315. doi: 10.1016/j.smr.2018.06.002

Tukachinsky, R. (2021). Parasocial interaction: a comprehensive review. Commun. Theory 31, 183–206. doi: 10.1093/ct/qtz021

Van Gansbeke, W., Vandenhende, S., Georgoulis, S., Proesmans, M., and Van Gool, L. (2021). “SCAN: learning to classify images without labels” in Proceedings of the European conference on computer vision, Cham, Switzerland: Springer Science+Business Media. 268–285.

Wedel, M., and Kannan, P. K. (2016). Marketing analytics for data-rich environments. J. Mark. 80, 97–121.

Wei, X., Aman, M. S., Abidin, N. E. Z., and Qian, Y. (2025). Exploring the relationship between sports media use, sports participation behavior, and sport commitment: a mixed-methods study using structural equation modeling and qualitative insights. BMC Psychol. 13:636. doi: 10.1186/s40359-025-02964-x

Wohn, D. Y., and Freeman, G. (2020). Live streaming, parasocial relationships, and community engagement. New Media Soc. 22, 1154–1173. doi: 10.1177/1461444819861855

Xie, J., Girshick, R., and Farhadi, A. (2016). “Unsupervised deep embedding for clustering analysis” in Proceedings of the 33rd international conference on machine learning, Proceedings of Machine Learning Research. 478–487.

Xu, Y., Zhong, Z., Zhang, J., and Luo, P. (2022). “Deep clustering with sample assignment invariance prior” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, Los Alamitos, California, USA: IEEE Computer Society. 1594–1603.

Yoshida, Y., Yoshida, Y., Isogai, E., Hayase, T., Nakamura, K., Saito, M., et al. (2017). Level of perception of technical terms regarding the effect of radiation on the human body by residents of Japan. Environ. Health Prev. Med. 22:73. doi: 10.1186/s12199-017-0679-7

Zeithaml, V. A., Bitner, M. J., and Gremler, D. D. (2020). Services marketing: integrating customer focus across the firm. 7th Edn. McGraw-Hill.

Zhang, J., Li, Y., and Wang, S. (2021). Flow experience and continuous use of media platforms. Comput. Hum. Behav. 123:106868. doi: 10.1016/j.chb.2021.106868

Zhang, Y., and Zhang, H. (2023). Self-transcendence values and sports consumption. Sustainability 15:10938. doi: 10.3390/su151910938

Zhou, J., Wang, J., Wen, X., and Zhang, Q. (2022). “Deep clustering via dual contrastive learning” in Proceedings of the AAAI conference on artificial intelligence, vol. 36, 4156–4164.

Keywords: artificial intelligence, CoPE-DEC technology, deep learning, human–computer interaction, sports media viewer characteristics

Citation: Jang Y-S (2026) AI-driven audience clustering in sport media: a human–computer interaction approach using ‘CoPE-DEC’. Front. Comput. Neurosci. 20:1767724. doi: 10.3389/fncom.2026.1767724

Received: 15 December 2025; Revised: 08 January 2026; Accepted: 09 January 2026; Published: 29 January 2026.

Edited by:

Anguo Zhang, Fuzhou University, China

Reviewed by:

Yu Liang Feng, Jiangsu University of Technology, China
Dongming Li, University of Macau, China

Copyright © 2026 Jang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yong-Seok Jang, jangys16@khu.ac.kr

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.