<?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
      <channel>
        <title>Frontiers in Big Data | New and Recent Articles</title>
        <link>https://www.frontiersin.org/journals/big-data</link>
        <description>RSS Feed for Frontiers in Big Data | New and Recent Articles</description>
        <language>en-us</language>
        <generator>Frontiers Feed Generator,version:1</generator>
        <pubDate>Sat, 04 Apr 2026 09:01:32 +0000</pubDate>
        <ttl>60</ttl>
        <item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1814157</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1814157</link>
        <title><![CDATA[A disease potential-driven graph attention model for comorbidity risk prediction of hypertension]]></title>
        <pubDate>Thu, 02 Apr 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Leming Zhou</author><author>Hanshu Qin</author><author>Yanmei Yang</author><author>Gang Huang</author><author>Zhigang Liu</author>
        <description><![CDATA[Hypertension is associated with an increased risk of serious complications. However, current methods for predicting comorbidity risks face the challenge that prediction relying solely on data-driven associations may lead to clinically implausible results and reduce model interpretability. In addition, how to capture fused patient features and identify the differences among patients to facilitate risk prediction needs to be addressed. To overcome these challenges, we propose a Disease Potential-Driven Graph Attention (DP-GA) model for comorbidity risk prediction of hypertension, which rests on three ideas: (a) constructing a fusion mechanism that correlates patients' disease features with the graph structure, thus integrating feature attention and structural attention effectively; (b) introducing a similarity-difference balance mechanism to further identify the relationships among patients; and (c) designing a disease potential-driven attention mechanism that calculates disease potential and constructs masks, thus preserving the effective associations from high-risk patients to low-risk patients. Experimental results demonstrate that the proposed DP-GA model achieves a significant improvement in comorbidity risk prediction for patients with hypertension across three comorbidity datasets collected by the research group, compared with both baseline and state-of-the-art peer methods. We also analyze the comorbidity network to predict hypertension comorbidity risk, thereby improving interpretability and enabling earlier prediction of such comorbidities.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1594374</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1594374</link>
        <title><![CDATA[Toward robust social media sentiment for SMEs: a comparative study of dictionary-based and machine learning approaches with insights for hybrid methodologies]]></title>
        <pubDate>Wed, 01 Apr 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Heru Susanto</author><author>Aida Sari Omar</author><author>Alifya Kayla Shafa Susanto</author><author>Desi Setiana</author><author>Leu Fang-Yie</author><author>Junaid M. Shaikh</author><author>Asep Insani</author><author>Uus Khusni</author><author>Rachmat Hidayat</author><author>Indra Akbari</author><author>Iwan Basuki</author>
        <description><![CDATA[Small and Medium-sized Enterprises (SMEs) increasingly rely on social media to engage customers, promote products, and enhance workplace collaboration. Customer opinions expressed through comments and posts on platforms such as Facebook and Instagram represent valuable insights, yet their informal and context-specific nature—often characterized by slang, misspellings, and bilingual usage—poses challenges for automated sentiment analysis. This study addresses this gap by comparatively evaluating dictionary-based and machine learning approaches to sentiment classification for SMEs' social media content. Data were collected from a diverse set of SMEs across multiple industries, with a substantial volume of customer comments extracted and pre-processed through tokenization, normalization, stop-word removal, and stemming. A customized dictionary was developed to account for local language variations, while Naïve Bayes and Support Vector Machine (SVM) models were employed as supervised classifiers. The findings indicate that dictionary-based methods, while simple and interpretable, struggle with accuracy when processing informal and localized language, whereas machine learning approaches deliver higher overall performance but require extensive preprocessing and tuning. Moreover, the study highlights the potential of hybrid frameworks that combine the interpretability of dictionary-based models with the adaptability of machine learning classifiers. This research contributes both practically and theoretically by (i) demonstrating the limitations of applying generic sentiment analysis tools in localized SME contexts, (ii) proposing a hybrid sentiment analysis framework tailored to SMEs, and (iii) offering empirical evidence to support digital transformation strategies for SMEs in resource-constrained environments. Ultimately, accurate sentiment analysis can enable SMEs to refine business strategies, strengthen customer engagement, and achieve sustainable growth in the digital economy.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1778363</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1778363</link>
        <title><![CDATA[Tree-based machine learning methods for predicting vehicle insurance claim size]]></title>
        <pubDate>Mon, 23 Mar 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Edossa Merga Terefe</author><author>Merga Abdissa Aga</author>
        <description><![CDATA[Vehicle insurance claim severity modeling requires accurate and interpretable methods that can handle skewed and heterogeneous loss data. This study provides a structured empirical comparison between classical parametric regression models and tree-based ensemble learning approaches for predicting claim size conditional on claim occurrence. The analysis is conducted within a cross-sectional conditional severity framework using real-world motor insurance data. We implement and compare ordinary least squares (OLS), a Tweedie generalized linear model (GLM), and three ensemble methods: bagging, random forests (RFs), and gradient boosting. Model performance is evaluated using out-of-sample root mean square error (RMSE), and variable importance measures assess the relative contribution of predictors. The results indicate that tree-based ensemble methods achieve modest improvements in predictive accuracy relative to classical parametric models. The Tweedie GLM remains a competitive, flexible parametric benchmark for skewed positive claim amounts. Variable importance analysis consistently identifies premium and insured value as key determinants of claim severity. Overall, the findings suggest that ensemble learning methods can complement traditional actuarial models, offering additional flexibility in capturing non-linear effects while maintaining comparable predictive performance in moderate-complexity severity data.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1737043</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1737043</link>
        <title><![CDATA[Fairer non-negative matrix factorization]]></title>
        <pubDate>Tue, 17 Mar 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Lara Kassab</author><author>Erin George</author><author>Deanna Needell</author><author>Haowen Geng</author><author>Nika Jafar Nia</author><author>Aoxi Li</author>
        <description><![CDATA[There is a critical need to study fairness and bias in machine learning (ML) algorithms. Since there is clearly no one-size-fits-all solution to fairness, ML methods should be developed alongside bias mitigation strategies that are practical and approachable for the practitioner. Motivated by recent work on “fair” PCA, here we consider the more challenging method of non-negative matrix factorization (NMF), both as a showcasing example and as a method that is important in its own right for topic modeling tasks and for feature extraction for other ML tasks. We demonstrate that a modification of the objective function, using a min-max formulation, may sometimes offer an improvement in fairness for groups in the population. We derive two methods for the objective minimization, a multiplicative update rule and an alternating minimization scheme, and discuss implementation practicalities. We include a suite of synthetic and real experiments showing how the method may improve fairness, while also highlighting that it may sometimes increase error for some individuals; fairness is not a rigid definition, and the choice of method should depend strongly on the application at hand.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1676922</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1676922</link>
        <title><![CDATA[FunduScope: a human-centered, machine learning–based interactive tool for training junior ophthalmologists in diabetic retinopathy detection]]></title>
        <pubDate>Fri, 13 Mar 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Sara-Jane Bittner</author><author>Michael Barz</author><author>Daniel Sonntag</author>
        <description><![CDATA[Interpreting fundus images is an essential skill for detecting eye diseases such as diabetic retinopathy (DR), one of the leading causes of visual impairment. However, the training of junior doctors relies on experienced ophthalmologists, who often lack the time for teaching, or on printed training materials that lack variability in examples. In this work, we present FunduScope, an interactive human-centered learning tool for training junior ophthalmologists, which is based on a pre-trained ML model for classifying DR. In a qualitative pre-study, we investigated the needs of junior doctors and identified gaps in current learning procedures. In the main mixed-methods study, we examined the experience of 10 junior doctors with the tool and its impact on cognitive load, usability, and additional factors relevant to e-learning tools. Despite technical constraints, our results confirm the potential of an ML-based learning tool in medical education, addressing the time constraints of ophthalmologists and providing learning independence for junior doctors. Future work could extend the learning tool with explainable artificial intelligence (XAI) to further support learners' clinical decision-making and could broaden this proof of concept to other ophthalmic diseases.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1752142</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1752142</link>
        <title><![CDATA[Jingdezhen ceramic culture in the digital era: a qualitative inquiry into digital dissemination and platform innovation]]></title>
        <pubDate>Mon, 09 Mar 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Qiuyang Huang</author><author>Zhengjun Chen</author>
        <description><![CDATA[Introduction: Digital platforms have increasingly reshaped the ways in which traditional craft cultures are produced, circulated, and interpreted. While prior research has examined digital heritage broadly, limited attention has been paid to how platform-based dissemination transforms ceramic culture in historically significant craft centers such as Jingdezhen. Methods: This study adopts a qualitative research design, combining semi-structured interviews with 32 ceramic practitioners and digital ethnography of 58 ceramic-related livestreaming sessions on Douyin. Results: The findings reveal three key dynamics: (1) the reconfiguration of craft authority through platform visibility; (2) the emergence of hybrid artisan–educator–entrepreneur identities; and (3) persistent tensions between cultural authenticity and commercial logic in platform-mediated environments. Discussion: By integrating cultural ecology and platform ecosystem theory, this study contributes to scholarship on digital heritage and provides practical insights for cultural practitioners and heritage institutions navigating digital platform ecosystems.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1770989</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1770989</link>
        <title><![CDATA[Spatiotemporal deep learning framework for predictive behavioral threat detection in surveillance footage]]></title>
        <pubDate>Fri, 27 Feb 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Asha Aruna Sheela Matta</author><author>Venkata Purna Chandra Sekhara Rao Manukonda</author>
        <description><![CDATA[Anomaly detection in video surveillance remains a challenging problem due to complex human behaviors, temporal variability, and limited annotated data. This study proposes an optimized spatiotemporal deep learning (DL) framework that integrates a Convolutional Neural Network (CNN) for spatial feature extraction with a Long Short-Term Memory (LSTM) network for temporal dependency modeling. The CNN processes frame-level appearance information, while the LSTM captures sequential motion patterns across video frames, enabling effective representation of anomalous activities. Hyperparameter optimization and regularization strategies are employed to improve convergence stability and generalization performance. The proposed model is evaluated on the DCSASS surveillance dataset, and the experimental results demonstrate that the optimized CNN-LSTM framework achieves an accuracy of 98.1%, with consistently high precision, recall, and F1-score across 3-fold, 5-fold, and 10-fold cross-validation settings. Comparative analysis shows that the proposed method outperforms conventional machine learning models and recent deep learning baselines, highlighting its effectiveness and robustness for practical video-based anomaly detection in surveillance environments.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1779935</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1779935</link>
        <title><![CDATA[GFTrans: an on-the-fly static analysis framework for code performance profiling]]></title>
        <pubDate>Fri, 27 Feb 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Jie Li</author><author>Yunbao Wen</author><author>Jingxin Liu</author><author>Biqing Zeng</author><author>Seyedali Mirjalili</author>
        <description><![CDATA[Improving software efficiency is crucial for maintenance, but pinpointing runtime bottlenecks becomes increasingly difficult as systems expand. Traditional dynamic profiling tools require full build-execution cycles, creating significant latency that impedes agile development. To address this, we introduce GFTrans, a static analysis framework that predicts C program performance without execution. GFTrans utilizes a Transformer architecture with a novel “anchor-based embedding” technique to integrate control flow and data dependencies into a unified sequence. Additionally, a dynamic gating mechanism fuses these semantic representations with 16 handcrafted statistical features to comprehensively capture code complexity. Evaluated on a dataset of real-world GitHub C functions with high-precision runtime labels, GFTrans outperforms baseline models like Random Forest and Code2Vec, achieving 78.64% accuracy. The system identifies potential bottlenecks in milliseconds, enabling developers to perform optimization effectively during the coding phase.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1681382</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1681382</link>
        <title><![CDATA[Federated learning for teacher data privacy protection: a study in the context of the PIPL]]></title>
        <pubDate>Mon, 09 Feb 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Shanwei Chen</author><author>Xiu Zhi Qi</author><author>Xue Hui Han</author><author>Zhao Chen Fan</author><author>Le Le Wang</author>
        <description><![CDATA[Background: The Personal Information Protection Law (PIPL) in China imposes strict requirements on personal data handling, particularly in educational contexts where teacher data privacy is critical. Traditional centralized machine learning approaches pose significant risks of data breaches and non-compliance. Federated Learning (FL) offers a promising decentralized alternative by enabling collaborative model training without sharing raw data. Methods: This study combines quantitative simulations and qualitative compliance analysis to evaluate FL frameworks under PIPL principles, with a focus on Differential Privacy as the primary empirically validated mechanism for noise addition and privacy guarantees. Other techniques, such as Secure Multi-Party Computation (SMC), are analyzed theoretically for their alignment with PIPL requirements such as data minimization, anonymization, and encrypted transmission. Results: Experimental simulations demonstrate that FL effectively reduces data breach risks compared to centralized methods. It achieves principle-level compliance with PIPL through local data processing, differential privacy mechanisms, and secure aggregation, leading to improved privacy preservation while maintaining model performance. Conclusion: FL conceptually supports teacher data privacy protection under the PIPL framework. This study proposes a tailored compliance framework that integrates FL with privacy-enhancing technologies, offering theoretical foundations and practical recommendations for educational institutions and technology implementers deploying privacy-preserving machine learning solutions.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1782461</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1782461</link>
        <title><![CDATA[A genetic algorithm-based framework for online sparse feature selection in data streams]]></title>
        <pubDate>Mon, 09 Feb 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Guanyu Liu</author><author>Jinhang Liu</author><author>Guifan He</author><author>Yifan Liu</author><author>Huabo Bai</author><author>Min Zhou</author>
        <description><![CDATA[High-dimensional streaming data applications commonly rely on online streaming feature selection (OSFS) techniques. In practice, however, incomplete data due to equipment failures and technical constraints often poses a significant challenge. Online Sparse Streaming Feature Selection (OS2FS) tackles this issue by performing missing data imputation via latent factor analysis. Nevertheless, existing OS2FS approaches exhibit considerable limitations in feature evaluation, resulting in degraded performance. To address these shortcomings, this paper introduces a novel genetic algorithm-based online sparse streaming feature selection method (GA-OS2FS) for data streams, which integrates two key innovations: (1) imputation of missing values using a latent factor analysis model, and (2) application of a genetic algorithm to assess feature importance. Comprehensive experiments conducted on six real-world datasets show that GA-OS2FS surpasses state-of-the-art OSFS and OS2FS methods, consistently attaining higher accuracy through the selection of optimal feature subsets.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1697392</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1697392</link>
        <title><![CDATA[Dynamic transfer learning with co-occurrence-guided multi-source fusion for urban spatio-temporal crime prediction]]></title>
        <pubDate>Thu, 05 Feb 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Chen Cui</author><author>Ziwan Zheng</author><author>Hao Du</author><author>Wen Wang</author>
        <description><![CDATA[Spatio-temporal crime prediction is crucial for optimizing police resource allocation, but it faces two challenges: data sparsity, which hinders models from extracting effective patterns and limits robustness, and the underutilization of cross-type crime co-occurrence correlations. To address these issues, we propose a transfer learning approach that explores underlying cross-type relationships, enabling the sharing of spatio-temporal features across crime types and alleviating data sparsity. An adaptive weight updating mechanism is incorporated to enhance the perception of distinct crime categories, while the impacts of points of interest (POIs), meteorological factors, and other features are also analyzed. Experiments on real-world data from a Chinese city show that our model comprehensively captures latent features across crime types, thereby enhancing predictive performance and robustness, particularly for crime types with sparse data. Moreover, it effectively incorporates environmental features, further improving crime prediction performance.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1651290</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1651290</link>
        <title><![CDATA[Depression detection through dual-stream modeling with large language models: a fusion-based transfer learning framework integrating BERT and T5 representations]]></title>
        <pubDate>Wed, 04 Feb 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Na Wang</author><author>Weijia Zhang</author><author>Raja Kamil</author><author>Ian Renner</author><author>Syed Abdul Rahman Al-Haddad</author><author>Normala Ibrahim</author><author>Zhen Zhao</author>
        <description><![CDATA[Millions of people around the world suffer from depression. While early diagnosis is essential for timely intervention, it remains a significant challenge due to limited access to clinically diagnosed data and privacy restrictions on mental health records. These limitations hinder the training of robust AI models for depression detection. To address these challenges, this article proposes a parallel transfer learning framework for depression detection that integrates BERT and T5 through a fusion mechanism, combining the complementary advantages of these two large language models (LLMs). By integrating their semantic embeddings, the method captures a broader range of linguistic cues from transcribed speech. These embeddings are processed through a model with two parallel branches: a one-dimensional convolutional neural network and a dense neural network are used to construct each branch for preliminary prediction, which are then fused for final prediction. Evaluations on the E-DAIC dataset demonstrate that the proposed method outperforms baseline models, achieving a 3.0% increase in accuracy (91.3%), a 6.9% increase in precision (95.2%), and a 1.7% improvement in F1-score (90.0%). The experimental results verify the effectiveness of BERT and T5 fusion in enhancing depression detection performance and highlight the potential of transfer learning for scalable and privacy-conscious mental health applications.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1750906</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1750906</link>
        <title><![CDATA[Algorithmic recourse in sequential decision-making for long-term fairness]]></title>
        <pubDate>Wed, 04 Feb 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Francisco Gumucio</author><author>Lu Zhang</author>
        <description><![CDATA[Long-term fairness in sequential decision-making is critical yet challenging, as decisions at each time step influence future opportunities and outcomes, potentially exacerbating existing disparities over time. While existing methods primarily achieve fairness by directly adjusting decision models, in this work, we study a complementary perspective based on sequential algorithmic recourse, in which fairness is pursued through actionable interventions for individuals. We introduce Sequential Causal Algorithmic Recourse for Fairness (SCARF), a causally grounded framework that generates temporally coherent recourse trajectories by integrating structural causal modeling with sequential generative modeling. By explicitly incorporating both short-term and long-term fairness constraints, as well as practical budget limitations, SCARF generates personalized recourse plans that effectively mitigate disparities over multiple decision cycles. Through experiments on synthetic and semi-synthetic datasets, we empirically examine how different recourse strategies influence fairness dynamics over time, illustrating the trade-offs between short-term and long-term fairness under sequential interventions. The results demonstrate that SCARF provides a practical and informative framework for analyzing long-term fairness in dynamic decision-making settings.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1718710</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1718710</link>
        <title><![CDATA[Modeling household adoption of IoT-based home security in Dhaka: a PLS–machine learning framework]]></title>
        <pubDate>Wed, 04 Feb 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Arif Mahmud</author><author>Ashikur Rahman</author><author>Fahmid Al Farid</author><author>Jia Uddin</author><author>Hezerul Bin Abdul Karim</author>
        <description><![CDATA[Introduction: Despite several strategies, Bangladesh has a poor rate of Internet of Things (IoT) deployment. This study therefore investigates the factors shaping IoT adoption for residential security in Dhaka and analyzes their respective contributions. Methods: This study combined two theories, protection motivation theory (PMT) and the attitude-social influence-self-efficacy (ASE) model, using a hybrid PLS-machine learning approach to identify both linear and nonlinear correlations with high predictive accuracy. A snowball sampling method was used to select 348 valid responses from a survey of household heads. The complete assessment procedure consisted of partial least squares (PLS) analysis followed by artificial neural networks (ANN) and machine learning (ML) classifiers. Results: Severity, vulnerability, response efficacy, response cost, and attitude affected intention, explaining 34.9% of the variance with an accuracy of 74.28%. Vulnerability was the most significant predictor, followed by response cost, attitude, response efficacy, self-efficacy, social influence, and severity. Discussion: The theoretical contribution of this study lies in its novel integration of the PMT and ASE models, offering new insights into their combined effect on technology adoption in emerging markets. The findings also contribute to the literature by increasing public awareness of home security, which can enhance Dhaka's overall state of public order and safety. Moreover, the findings may offer valuable insights for companies and entrepreneurs, as incorporating these factors into marketing strategies and investment initiatives is likely to foster greater consumer adoption.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2026.1775728</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2026.1775728</link>
        <title><![CDATA[Adaptive core-enhanced latent factor model for highly accurate QoS prediction]]></title>
        <pubDate>Mon, 02 Feb 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Siqi Ai</author><author>Peixin Li</author><author>Hao Fang</author><author>Yonghui Xia</author>
        <description><![CDATA[Accurate prediction of Quality of Service (QoS) plays a crucial role in service recommendation and selection across large-scale distributed environments. Latent factor (LF) models have become a mainstream solution for QoS prediction owing to their simplicity and scalability, yet typical formulations struggle to capture complex latent interactions and usually rely on manually tuned regularization, which often limits prediction accuracy. To address these challenges, we propose an Adaptive Core-Enhanced Latent Factor (ACELF) model that integrates a learnable core interaction mechanism with an incremental Proportional-Integral-Derivative (PID)-driven adaptive regularization strategy. Specifically, a learnable core interaction matrix is introduced to model interactions between latent user and service factors, enabling richer representation learning beyond standard bilinear assumptions. To further enhance robustness, we design an incremental PID controller that dynamically adjusts the regularization coefficient of the core interaction matrix according to the training dynamics, allowing the optimization process to automatically balance model expressiveness and overfitting. Extensive experiments on real-world QoS datasets demonstrate that ACELF consistently outperforms several state-of-the-art methods in terms of prediction accuracy.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1679897</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1679897</link>
        <title><![CDATA[Examining the influence of deterrent and enhancement factors on QR-code mobile payment continuance intention: insights from PLS-SEM and IPMA analysis]]></title>
        <pubDate>Thu, 22 Jan 2026 00:00:00 +0000</pubDate>
        <category>Original Research</category>
        <author>Ashikur Rahman</author><author>Fahmid al Farid</author><author>Mohammad Abul Bashar</author><author>Jia Uddin</author><author>Arif Mahmud</author><author>Hezerul Abdul Karim</author>
        <description><![CDATA[Introduction: The rise of contactless payment has made quick response (QR) code mobile payment (QR-MP) platforms increasingly popular among mobile financial service (MFS) users, especially in emerging economies. It has been demonstrated that the ongoing use of QR payments can significantly drive the growth of emerging economies. However, despite its importance, the continued use of this technology has not been satisfactory. Thus, this study explores a modified Unified Theory of Acceptance and Use of Technology 2 (UTAUT2) model, including four additional constructs: amotivation (AM), alternative attractiveness (AA), QR transaction anxiety (QTA), and transaction convenience (TC), to examine MFS users' sustained usage of QR payment. Methods: Data were collected from 247 MFS users in Bangladesh using an online survey and analyzed through PLS-SEM and the non-linear analysis of IPMA. Results: The findings reveal that effort expectancy is the most influential factor, and that both moderator factors, QTA and TC, are significant. However, social influence and hedonic motivation were found to be insignificant. Furthermore, our extended research model explains 76.5% of the variance in continuance intention (CINT) without the moderation effect. Discussion: The IPMA findings help identify the best-performing variables and provide practical insights for this study. Theoretical and managerial implications are provided to enrich the existing literature on information technology, indicating how MFS providers in developing countries can retain their existing users.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1659026</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1659026</link>
        <title><![CDATA[EnDuSecFed: an ensemble approach for privacy preserving Federated Learning with dual-security framework for sustainable healthcare]]></title>
        <pubdate>2026-01-22T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Bela Shrimali</author><author>Jenil Gajjar</author><author>Swapnoneel Roy</author><author>Sanjay Patel</author><author>Kanu Patel</author><author>Ramesh Ram Naik</author>
        <description><![CDATA[Recent advances in Artificial Intelligence have highlighted the role of Machine Learning in healthcare decision-making, but centralized data collection raises significant privacy risks. Federated Learning addresses this by enabling collaborative training across multiple clients without sharing raw data. However, Federated Learning remains vulnerable to security threats that can compromise model reliability. This paper proposes a dual-security Federated Learning framework that integrates Fernet symmetric encryption for secure transmission of model updates and an Intrusion Detection System for detecting anomalous client behavior. Experiments on a publicly available healthcare dataset show that the proposed system enhances privacy and robustness compared to traditional Federated Learning. Among the tested models, including Logistic Regression, Random Forest, and SVC, the ensemble method achieved the best performance with 99% accuracy.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1723155</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1723155</link>
        <title><![CDATA[Big data approaches to bovine bioacoustics: a FAIR-compliant dataset and scalable ML framework for precision livestock welfare]]></title>
        <pubdate>2026-01-16T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Mayuri Kate</author><author>Suresh Neethirajan</author>
        <description><![CDATA[The convergence of IoT sensing, edge computing, and machine learning is revolutionizing precision livestock farming, yet bioacoustic data streams remain underexploited due to computational-complexity and ecological-validity challenges. We present one of the most comprehensive bovine vocalization datasets to date: 569 expertly curated clips spanning 48 behavioral classes, recorded across three commercial dairy farms using multi-microphone arrays and expanded to 2,900 samples through domain-informed data augmentation. This FAIR-compliant resource addresses key Big Data challenges: volume (90 h of raw recordings, 65.6 GB), variety (multi-farm, multi-zone acoustic environments), velocity (real-time processing requirements), and veracity (noise-robust feature-extraction pipelines). A modular data-processing workflow combines denoising, implemented both in iZotope RX 11 for quality control and in an equivalent open-source Python pipeline using noisereduce, with multi-modal synchronization (audio-video alignment) and standardized feature engineering (24 acoustic descriptors via Praat, librosa, and openSMILE) to enable scalable welfare monitoring. Preliminary machine-learning benchmarks reveal distinct class-wise acoustic signatures across estrus detection, distress classification, and maternal-communication recognition. The dataset's ecological realism, embracing authentic barn acoustics rather than controlled conditions, ensures deployment-ready model development. This work establishes the foundation for animal-centered AI, where bioacoustic streams enable continuous, non-invasive welfare assessment at industrial scale. By releasing a Zenodo-hosted, FAIR-compliant dataset (restricted access) and an open-source preprocessing pipeline on GitHub, together with comprehensive metadata schemas, we advance reproducible research at the intersection of Big Data analytics, sustainable agriculture, and precision livestock management. The framework directly supports UN SDG 9, demonstrating how data science can transform traditional farming into intelligent, welfare-optimized production systems capable of meeting global food demands while maintaining ethical animal-care standards.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1753871</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1753871</link>
        <title><![CDATA[Deep learning-enabled hybrid systems for accurate recognition of text in seal images]]></title>
        <pubdate>2026-01-14T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Keke Zhang</author><author>Mingyu Guan</author><author>Chao Wu</author><author>Yutong Li</author><author>Qingguo Lü</author><author>Yi Liu</author><author>Yi Wang</author><author>Wei Wang</author><author>Wei Zhang</author>
        <description><![CDATA[Chinese seals are widely used across Chinese society as a tool for certifying legal documents. However, recognizing the text on these seals is challenging due to background text, high noise levels, and sparse image features. This paper introduces a hybrid model to address these difficulties in Chinese seal text recognition. Our model integrates preprocessing techniques tailored for real seals, a deep learning-based position correction model, a circular text unwrapping model, and OCR text recognition. First, we apply a color-based method to effectively remove the black background text on seals, eliminating redundant information while retaining crucial features for further analysis. Next, we introduce an innovative image denoising algorithm that significantly improves the system's robustness in processing noisy seal images. Additionally, we develop a deep learning-based angle prediction network and create synthetic datasets that mimic real seal scenes, enabling optimal seal image positioning for enhanced text flattening and recognition, thus boosting overall system performance. Finally, polar coordinate transformation is employed to convert the circular seal into a rectangular image for more efficient text recognition. Experimental results indicate that our proposed methods effectively enhance the accuracy of seal text recognition.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1717592</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1717592</link>
        <title><![CDATA[Dynamic patterns of healthy lifestyle awareness after COVID-19: a study using Google Trends and joinpoint regression]]></title>
        <pubdate>2026-01-13T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Zahroh Shaluhiyah</author><author>Shabrina Arifia Qatrannada</author><author>Roshan Kumar Mahato</author><author>Farid Agushybana</author><author>Sri Handayani</author><author>Dzul Fahmi Afriyanto</author><author>Usha Rani</author><author>Dewie Sulistyorini</author>
        <description><![CDATA[Introduction: The COVID-19 pandemic has significantly influenced public interest in health-related behaviors, as reflected in online search trends. Analyzing these trends provides insights into shifting health concerns and informs future public health strategies. This study examined Google Trends data to assess changes in public interest in mental health, healthy diet, sleep, screen time, physical activity, and tobacco smoking before, during, and after the COVID-19 pandemic. Methods: Google Trends data (2019–2023) were analyzed using joinpoint regression to identify statistically significant shifts in relative search volume (RSV) over time. Additionally, the Mann–Whitney U test was conducted to examine differences in mean RSV across time periods. Results: Awareness that increased consistently during and after the pandemic was observed for mental health, particularly anxiety, and sleep patterns. These topics showed significant positive trends in joinpoint regression and higher mean RSVs, with statistically significant differences across time periods (p < 0.05). In contrast, behaviors such as physical activity and screen time drew increased awareness only during the pandemic, and this interest was not sustained afterward. Meanwhile, dietary behavior and smoking either remained stagnant or declined, indicating limited or waning public interest despite their relevance to health outcomes. Conclusion: Digital interest in health behaviors varied during and after COVID-19, with only mental health and sleep showing sustained concern. Spikes in awareness often reflected personally relevant issues, highlighting Google Trends' potential as an early signal for health promotion efforts.]]></description>
      </item>
      </channel>
    </rss>