ORIGINAL RESEARCH article
Front. Nutr.
Sec. Nutrition Methodology
Volume 12 - 2025 | doi: 10.3389/fnut.2025.1610363
ByteTrack: A Deep Learning Approach for Bite Count and Bite Rate Detection Using Meal Videos in Children
Provisionally accepted
- 1 Department of Nutritional Sciences, College of Health and Human Development, The Pennsylvania State University, University Park, Pennsylvania, United States
- 2 Department of Food Science, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States
- 3 Department of Human Development and Family Studies, College of Health and Human Development, The Pennsylvania State University, University Park, Pennsylvania, United States
Assessing eating behaviors at a meal (e.g., eating rate) provides insight into risk for overconsumption and obesity. Traditional sensor-based methods often interfere with natural eating and lack accuracy, while manual video annotation, the current gold standard, is labor-intensive, costly, and does not scale to larger studies. To address these challenges, we developed ByteTrack, a deep learning-based system for automated bite count and bite rate detection built from video-recorded child meals. ByteTrack, one of the first AI-based systems for bite detection, is designed to handle the noise (i.e., blur, low light, camera shake) and occlusions (i.e., hands or utensils blocking the mouth) common in meals recorded with children. Our model was trained on 1,440 minutes of footage from 242 videos obtained from 94 children (ages 7-9 years) who consumed 4 meals (separated by approximately 1 week) at which identical foods were served in varying amounts. ByteTrack processes video frames in two stages: 1) face detection using a hybrid of the Faster Region-based Convolutional Neural Network (Faster R-CNN) and You Only Look Once version 7 (YOLOv7), and 2) bite classification with an EfficientNet convolutional neural network (CNN) and a Long Short-Term Memory recurrent neural network (LSTM-RNN). The model achieved an average precision of 79.4%, recall of 67.9%, and F1 score of 70.6% over a test set of 51 videos. Performance varied with individual movement and occlusions (i.e., the face blocked by hands or objects). Inter-rater reliability with gold-standard manual observational coding was assessed via the intraclass correlation coefficient, which averaged 0.66 (range: 0.16 to 0.99), with lower reliability in videos with high movement or frequent occlusions. This pilot study marks a step toward a scalable, automated bite detection tool, with future work focused on improving robustness across varied populations and recording conditions.
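For readers wanting a concrete picture of the two-stage architecture described above, the sketch below illustrates one way such a pipeline could be wired together in PyTorch: a pretrained detector stands in for the Faster R-CNN/YOLOv7 face-localization stage, and an EfficientNet backbone feeds per-frame embeddings into an LSTM that classifies a short window of face crops as bite versus no bite. This is a minimal illustration, not the authors' implementation; the detector weights (COCO-pretrained, so a face-trained detector would be substituted in practice), the window length, hidden size, and score threshold are all assumptions.

```python
# Minimal sketch (not the authors' code) of a two-stage bite-detection pipeline.
# Stage 1: localize the face in each frame (a COCO-pretrained Faster R-CNN is
#          used here only as a stand-in for a Faster R-CNN/YOLOv7 face detector).
# Stage 2: embed each face crop with EfficientNet and classify a window of
#          embeddings with an LSTM as "bite" vs. "no bite".
# Hidden size, window length, and thresholds are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b0
from torchvision.models.detection import fasterrcnn_resnet50_fpn


class BiteClassifier(nn.Module):
    """EfficientNet per-frame features + LSTM over a window of face crops."""

    def __init__(self, hidden_size: int = 256, num_classes: int = 2):
        super().__init__()
        backbone = efficientnet_b0(weights="IMAGENET1K_V1")
        backbone.classifier = nn.Identity()        # keep the 1280-d pooled features
        self.backbone = backbone
        self.lstm = nn.LSTM(input_size=1280, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, time, 3, H, W) face crops resized to a fixed size
        b, t, c, h, w = clips.shape
        feats = self.backbone(clips.reshape(b * t, c, h, w)).reshape(b, t, -1)
        out, _ = self.lstm(feats)                  # (batch, time, hidden)
        return self.head(out[:, -1])               # logits from the last time step


def best_face_box(detector: nn.Module, frame: torch.Tensor, score_thresh: float = 0.7):
    """Return the highest-scoring detection box for one frame, or None."""
    with torch.no_grad():
        pred = detector([frame])[0]                # frame: (3, H, W), floats in [0, 1]
    if pred["scores"].numel() == 0:
        return None
    idx = pred["scores"].argmax()
    if pred["scores"][idx] < score_thresh:
        return None
    return pred["boxes"][idx]                      # (x1, y1, x2, y2)


if __name__ == "__main__":
    detector = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
    model = BiteClassifier().eval()

    frame = torch.rand(3, 480, 640)                # one dummy video frame
    box = best_face_box(detector, frame)           # detected box or None

    window = torch.rand(1, 16, 3, 224, 224)        # one clip of 16 dummy face crops
    bite_prob = model(window).softmax(dim=-1)[0, 1].item()
    print(f"P(bite) for this window: {bite_prob:.2f}")
```

In a full pipeline of this kind, per-window bite decisions would presumably be aggregated across the meal to yield a bite count and, divided by meal duration, a bite rate.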
Keywords: Bite detection, neural networks, eating behaviors, Childhood Obesity, Dietary Assessment, Automation
Received: 11 Apr 2025; Accepted: 11 Sep 2025.
Copyright: © 2025 Bhat, Keller, Brick and Pearce. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Alaina L Pearce, azp271@psu.edu
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.