- 1Departamento de Ingeniería Eléctrica, Universidad de La Frontera, Temuco, Chile
- 2Magíster en Ciencias de la Ingeniería, Universidad de La Frontera, Temuco, Chile
- 3Facultad de Ciencias Agropecuarias y Medioambiente, Universidad de La Frontera, Temuco, Chile
In this study, we developed a dataset of behaviors associated with lameness in dairy cows. The data collection utilized IoT collars that were placed around the necks of 10 dairy cows. This publicly available dataset contains 441 labeled behaviors, amounting to over 7 h of recording time. It includes acceleration data referenced in both body and world frames, as well as gyroscope signals, which facilitated the extraction of 112 relevant features for classifying key behaviors such as walking, grazing, and resting through machine learning algorithms. To enhance model performance and reduce feature dimensionality, automatic feature selection techniques were applied before classification. The dataset's effectiveness was assessed using various classification models, including Support Vector Machines (SVM), Logistic Regression, Decision Trees, and Random Forests. Results indicated that signals referenced to the body frame yielded better behavior discrimination, achieving a maximum macro F1-score of 0.9625 with the SVM model. This public dataset can facilitate early lameness detection by enabling accurate classification of behavior patterns.
1 Introduction
In contemporary livestock production, the optimization of animal welfare and the efficient management of herds are essential for enhancing the profitability and sustainability of dairy farming operations. Among the most critical challenges that farmers encounter are the timely detection of estrus and the early identification of health issues, such as lameness, both of which significantly affect livestock productivity. Ineffective management of these conditions can result in substantial economic losses, characterized by low reproduction rates, diminished milk yield, and increased veterinary expenses (1–4).
Numerous studies have established the correlation between lameness and cow behavior, even prior to the visual manifestation of lameness (5). Lame cows typically display reduced feeding and walking activity, alongside an increase in lying time when compared to their healthy counterparts (6–9).
Walking, grazing, and resting constitute fundamental behaviors that serve as indicators of dairy cows' overall health and welfare. Variations in these activities represent some of the earliest signs of lameness, often detectable before the appearance of visible clinical symptoms. The act of walking is particularly relevant, as lame cows generally exhibit diminished locomotor activity due to pain or discomfort (8). Similarly, grazing behavior, which is intricately associated with feeding time and frequency, declines significantly in lame cows, reflecting both their decreased appetite and the physical strain associated with foraging (6, 7). In contrast, resting time tends to increase as a compensatory behavior, allowing cows to alleviate pressure on affected limbs (9). Consequently, the continuous and automated monitoring of these three behaviors offers critical insights into early lameness detection and supports timely intervention strategies aimed at safeguarding animal welfare and enhancing farm productivity.
The advancement of emerging technologies, especially the Internet of Things (IoT), has facilitated the deployment of automated solutions for continuously and precisely monitoring animal behavior. Recent reviews have underscored the efficacy of wearable sensors, such as accelerometers and Inertial Measurement Units (IMUs), in classifying livestock behavior. When integrated with machine learning algorithms, these technologies provide robust tools for detecting lameness (10, 11). Daily activity patterns and specific movement behaviors exhibited by cows are closely correlated with their health and reproductive status (12). Additionally, other IoT-based methodologies, such as vision systems, have exhibited substantial potential. For instance, research conducted by Van Hertem et al. (13) utilized automatically registered three-dimensional video data for the early detection of lameness in dairy cows, thereby demonstrating the significance of continuous behavioral analysis.
In dairy farms using grazing systems, where cows feed directly from pastures, detecting lameness and other diseases is challenging due to the vast territory and the large number of livestock. Grazing systems are commonly used in Southern Hemisphere countries such as New Zealand, Australia, Argentina, Uruguay, Chile, and Brazil.
The literature reveals several public datasets dedicated to the detection of cattle movements. One notable example is CowScreeningDB, proposed by Ismail et al. (10), which is a multisensor dataset created from data collected from 43 dairy cows. In addition to making the dataset publicly accessible, the authors present a machine learning technique based on Support Vector Machines (SVM) designed to classify cows as healthy or lame. This methodology achieves an average accuracy rate of 77% in the best cases. However, this dataset does not include specific labels for normal walking or lameness events. Instead, it provides continuous data collection for 7 h, corresponding to the daily activity of healthy and lame cows. Furthermore, data were collected from cows confined to areas of at least 10 square meters.
Haladjian et al. (11) proposed a sensor-based system for detecting lameness in dairy cattle. The dataset used to train the classification algorithm was made available to facilitate developing and validating alternative methods to improve cow welfare. However, the dataset construction did not include naturally lame cows; instead, lameness was simulated by placing a plastic block on the outer claw of the left or right hind hoof. Employing an SVM algorithm, an average accuracy of 91.1% was achieved. Ito et al. (14) also published a dataset that includes triaxial accelerometer measurements with thirteen labeled behaviors. These data were collected using a Kionix KX122-1037 ±2g, 16-bit accelerometer placed on the neck of six Japanese black cows at a farm belonging to Shinshu University in Nagano, Japan. Data were collected over two days, allowing the cows to roam freely in a pasture field and farm pens, while being recorded with Sony FDR-X3000 4K video cameras. A total of 197 minutes of data were labeled, covering thirteen distinct behaviors. This dataset was utilized by Russel et al. (15), who, through a deep learning model, achieved the following performance metrics: in the worst case, an accuracy of 0.98, precision of 0.73, recall of 0.88, and an F1-score of 0.76; in the best case, values of 1.00 accuracy, 1.00 precision, 0.98 recall, and 0.99 F1-score.
A common aspect of the studies by Ismail et al. (10) and Haladjian et al. (11) is that data collection occurs in confined or controlled environments, and both employ methodologies involving measurement devices attached to the cows' legs. Since these studies took place in confined spaces, the results might differ from those obtained in free-grazing systems, where monitoring behavior is particularly challenging due to environmental variability and limited animal visibility (16). On the other hand, while Ito et al.'s (14) dataset was obtained in open spaces and farm pens, it only includes linear acceleration measurements in the body reference frame. Furthermore, behavior labeling was based on manually captured video recordings, a process that may have influenced cow behavior due to the presence of human operators during recording. The dataset also lacks gyroscope measurements, even though Liang et al. (17) found that features extracted from the gyroscope had the highest permutation importance in movement classification.
This highlights the need for a more comprehensive dataset in free-grazing systems, incorporating new variables for behavior identification. Additionally, a more reliable system for recording events during dataset creation would improve the accuracy and robustness of the dataset.
A notably relevant study was conducted in Norway by Versluijs et al. (18), who monitored free-ranging beef cattle in rugged, forested pasture areas using GPS collars equipped with tri-axial accelerometers. Their work is notable not only for the natural grazing context but also for the preprocessing techniques explored, including smoothing window selection and orientation correction using a world reference frame. Despite testing models with and without orientation correction, their findings showed similar performance between both approaches, suggesting that the effect of correction may depend on the collar fit or terrain. This study serves as a reference for the real-world application of accelerometry in open grazing environments, supporting the relevance of our approach.
Building upon these insights, our study seeks to overcome limitations reported in earlier works conducted in confined or semi-controlled environments, where sensor placement and environmental variability are often restricted, in contrast to the natural settings addressed by Versluijs et al. By incorporating acceleration measurements in both body and world reference frames and collecting data in open grazing conditions, we aim to enhance behavior classification and provide a more realistic and generalizable dataset. These latter measurements could improve the detection of behaviors such as walking (19). Additionally, angular velocity measurements are included to identify behaviors such as walking, grazing, and resting. Reliability and reproducibility of the results are ensured through rigorous documentation of the data collection and labeling process. Unlike the methodologies presented in (10) and (11), data collection is conducted on grazing dairy cows using an IoT collar and a high-range Pan-Tilt-Zoom (PTZ) camera with night vision. The camera is mounted on a 9-meter-high pole and, combined with a GPS tracking system, obtains the corresponding videos when cows are in an 80-hectare grazing environment.
To address the challenges of behavior monitoring in free-grazing dairy systems, this work introduces a novel and fully documented dataset of annotated cow behaviors collected in open pasture conditions using wearable sensors and video validation. The dataset aims to support the development and testing of machine learning models for behavior recognition and early detection of lameness-related patterns in real-world environments.
The key contributions of this work are summarized as follows:
1. A publicly available dataset for detecting behaviors associated with lameness, including accelerometer and gyroscope data collected from grazing dairy cows in southern Chile, along with an example script demonstrating how to implement a machine learning algorithm using these data;
2. Inclusion of acceleration and gyroscope data in body and world reference frames;
3. Implementation and testing of various machine learning algorithms using the behavior dataset, demonstrating its utility in classification tasks.
2 Materials and methods
This section describes the proposed methodology, which is illustrated in Figure 1. As shown, it is divided into three main stages, detailed below.
2.1 Data collection
2.1.1 Field details
The data acquisition process was carried out at a dairy farm located in the Maquehue Experimental Farm (see Figure 2), owned by the Universidad de La Frontera in Temuco, Chile. It is situated 17 km south of Temuco, in the Freire municipality (Lat: −38.8379, Long: −72.6938). This farm has a total of 120 Holstein Friesian cows that are raised on open pasture with access to food and water.

Figure 2. Location of the Maquehue experimental field. The white line indicates the boundary of the LoRaWAN network (maps data: Google Earth, © 2025 / Maxar Technologies).
2.1.2 Data and video collection system
Data were collected using IoT collars installed around the animals' necks, as illustrated in Figure 3. Each collar integrates two IMUs, the MPU9250 and the BNO055, both sampling at 10 Hz. The BNO055 was included because it provides orientation in quaternion format, allowing the calculation of acceleration in the world reference frame, while the MPU9250 serves as a low-cost sensor alternative. The collars also include GPS functionality, and both location data and battery status are transmitted via a LoRa network for real-time visualization on a Grafana dashboard. Additional components of each device include a Wireless Stick V3 microcontroller, an SD card slot for local data storage, and a Real-Time Clock (RTC).
Data recorded by the collar are stored in Comma-Separated Values (CSV) files on a microSD card, which is removed at the end of the measurement period to retrieve the stored data. Figure 4A provides a detailed view of the data collection device components.

Figure 4. Data collection system setup. (A) IoT collar components used for data collection. (B) PTZ cameras used for monitoring.
To validate cow behaviors, a system of PTZ cameras (model IP DAHUA 2MP SD6AL230F-HNI) records the pasture throughout the day. This system includes automatic monitoring of general cattle behavior (20). Recent modifications enable the system to individually locate cows equipped with IoT collars via their GPS coordinates (21). Video recordings obtained from these cameras are stored on a Network Video Recorder (NVR) for a limited time before being transferred to a server. Figure 4B shows a photo of the PTZ cameras used for video acquisition.
2.2 Dataset construction
2.2.1 Data processing
At this stage, the relevant columns for the dataset are processed. These correspond to the variables recorded by the BNO055 IMU, which include linear acceleration, angular velocity, and quaternion-based orientation. Additionally, a column is added for acceleration in the world frame (WF). This transformation expresses acceleration relative to the Earth's surface, making both acceleration and orientation measurements independent of the sensor's posture or position on the animal. The procedure to convert accelerations from the body frame (BF) to the world frame using a rotation matrix can be found in Muñoz-Poblete et al. (19). Furthermore, the variables measured by the MPU9250 IMU, including linear acceleration, gyroscope, and magnetometer readings, are also included.
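The body-to-world transformation can be sketched as a quaternion-derived rotation matrix applied to each acceleration sample. This is a generic sketch, not the exact procedure of Muñoz-Poblete et al. (19); a unit quaternion in (w, x, y, z) order is assumed:

```python
import math

def quat_to_rotmat(q):
    """Rotation matrix from a unit quaternion in (w, x, y, z) order."""
    w, x, y, z = q
    return [
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ]

def body_to_world(q, a_body):
    """Rotate one body-frame acceleration sample into the world frame."""
    R = quat_to_rotmat(q)
    return [sum(R[i][j] * a_body[j] for j in range(3)) for i in range(3)]

# Sanity check: a 90-degree rotation about z maps the x axis onto y.
q90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
a_world = body_to_world(q90, [1.0, 0.0, 0.0])
print([round(v, 6) for v in a_world])
```

Applying this rotation to every sample yields acceleration columns that are independent of the collar's orientation on the animal.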
To perform behavior labeling, it is necessary to synchronize the signal timestamps with the video timestamps. For this purpose, a time column (date and time) is generated to facilitate the labeling process. Since the measurements provided by the IMUs are raw values, they must be converted to their corresponding physical units. This is achieved using a linear scaling method: each raw value in the CSV files is multiplied by a scaling factor determined by the sensor's measurement range and resolution. The scaling equations for the accelerometer and gyroscope are as follows:

$$ b_a = a_r \cdot \frac{2 R_a}{2^{n_a}}, \qquad b_\omega = \omega_r \cdot \frac{2 R_\omega}{2^{n_\omega}} $$

where $b_a$ represents the acceleration data in m/s², and $b_\omega$ corresponds to the gyroscope measurements in °/s. $a_r$ is the raw accelerometer value, and $\omega_r$ is the raw gyroscope value. $R_a$ and $R_\omega$ denote the measurement ranges of the accelerometer and gyroscope, respectively, while $n_a$ and $n_\omega$ are their resolutions in bits, so that $2^{n_a}$ and $2^{n_\omega}$ correspond to the number of quantization levels used to digitize the analog signal.
The BNO055 and MPU9250 IMUs were configured with the following parameters: the BNO055 accelerometer was set with a range of ±8g and a resolution of 14 bits, while the MPU9250 accelerometer operated at ±16g with a resolution of 16 bits. For both devices, the gyroscopes were configured with a range of ±2000°/s and a resolution of 16 bits.
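From these settings, the per-LSB scale factors follow directly from each sensor's range and bit resolution. The sketch below assumes g = 9.81 m/s²; the resulting constants are illustrative conversions, not values taken from the dataset files:

```python
G = 9.81  # assumed m/s^2 per g

def scale_factor(full_range: float, bits: int) -> float:
    """Physical units per raw LSB for a symmetric +/-full_range sensor:
    the total span (2 * full_range) divided by 2**bits quantization levels."""
    return 2.0 * full_range / 2.0 ** bits

bno055_acc  = scale_factor(8 * G, 14)    # +/-8 g, 14-bit accelerometer
mpu9250_acc = scale_factor(16 * G, 16)   # +/-16 g, 16-bit accelerometer
gyro_dps    = scale_factor(2000.0, 16)   # +/-2000 deg/s, 16-bit gyroscope

print(bno055_acc, mpu9250_acc, gyro_dps)
```

Multiplying each raw CSV value by the corresponding factor yields measurements in m/s² and °/s.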
2.2.2 Definition of cow behaviors
The behaviors of interest for the dataset are walking, grazing, and resting, as these three are the most indicative of lameness-related changes in dairy cattle. Studies have shown that lame cows typically reduce their walking activity (8), spend less time grazing (6, 7), and increase their resting time as a compensatory strategy (9). Therefore, continuous monitoring of these behaviors provides valuable insight for early lameness detection.
To build a more robust and representative dataset, it is also necessary to label other less frequent behaviors such as standing, licking, shaking, head nodding, scratching, and rising. Including these actions helps prevent misclassification and allows the model to distinguish between major locomotor behaviors and isolated secondary activities. Table 1 provides the definition of each behavior analyzed in this study.
The machine learning models in this work classify only four categories: walking, grazing, resting, and miscellaneous behaviors. The miscellaneous behaviors class groups all remaining behaviors that are not included in the three main categories, following a one-vs-rest strategy commonly used in behavior recognition tasks.
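This grouping can be expressed as a simple label map. The behavior names follow Table 1; the function itself is an illustrative sketch, not code from the project:

```python
# The three target behaviors are kept as-is; every other label collapses
# into a single "miscellaneous" class for the four-class models.
MAIN_CLASSES = {"walking", "grazing", "resting"}

def to_model_class(behavior: str) -> str:
    """Map a fine-grained behavior label to one of the four model classes."""
    return behavior if behavior in MAIN_CLASSES else "miscellaneous"

labels = ["walking", "standing", "licking", "grazing", "rising", "resting"]
print([to_model_class(b) for b in labels])
```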
2.2.3 Behavior labeling
The labeling of behaviors was based on the visual inspection of video recordings obtained from the PTZ cameras installed on the farm. The first step involved visually identifying the cow wearing the IoT collar. Since the cow's local identification number was not visible in the footage, photographs were taken during collar installation to document each cow's unique spot pattern. These spot patterns allowed the labeling team to match the cow observed in the video with the corresponding dataset collected by the collar.
Once the individual cow had been identified, the behaviors it exhibited were annotated by marking the start and end times of each event in an Excel file, along with other relevant information. As cow behavior is naturally variable in duration, the labeled events reflect non-uniform time lengths.
During the labeling process, it was observed that the timestamps of the collar data were not synchronized with the video timestamps; therefore, these discrepancies were identified and corrected through exhaustive event review. The synchronization between the video recordings and the signal data was performed manually by simultaneously analyzing both the video footage and the acceleration signals. This procedure was carried out only after accurately identifying the cow and confirming the specific collar it was wearing, ensuring that the observed behavior in the video corresponded precisely to the signal data recorded by that collar at that moment.
The labeling process was carried out by two research assistants and was supported by a veterinarian, who held periodic meetings with the team to validate each annotated behavior.
2.2.4 Dataset structure
The recorded and labeled data were stored in individual CSV files, each corresponding to a single annotated behavior event. The naming convention of these files follows a standardized format that encodes key metadata, including the event number, behavior type, animal identifier, acquisition date, and start time, as illustrated in Figure 5. This facilitates systematic identification, traceability, and automated parsing of events.

Figure 5. Structured format of CSV file names. The name consists of six elements: event number, event type, cow identifier, event acquisition date (YYYYMMDD), start time (HHMMSS), and file extension (.csv).
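Assuming underscore-separated fields as depicted in Figure 5 (the separator and the example file name below are assumptions for illustration), the encoded metadata can be recovered programmatically:

```python
import datetime

def parse_event_filename(name: str) -> dict:
    """Split an event file name into its six metadata fields.
    Assumes underscore-separated fields; adjust if the actual
    separator in Figure 5 differs."""
    stem, ext = name.rsplit(".", 1)
    event_no, behavior, cow_id, date_str, time_str = stem.split("_")
    start = datetime.datetime.strptime(date_str + time_str, "%Y%m%d%H%M%S")
    return {"event": int(event_no), "behavior": behavior,
            "cow": cow_id, "start": start, "ext": ext}

meta = parse_event_filename("001_walking_cow03_20240510_093000.csv")
print(meta["behavior"], meta["start"])
```

Parsing the file names this way lets the class label and acquisition time be attached to each event automatically during batch processing.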
To further enhance accessibility and organization, the dataset is arranged hierarchically into folders based on behavior categories, as shown in Figure 6. Each folder contains only the CSV files associated with a specific behavior, enabling streamlined filtering during model training.

Figure 6. Hierarchical folder structure of the dataset. Each level represents a logical organization for storing and accessing the data.
Each CSV file contains time-series sensor data collected by the IMU collars. Table 2 provides a summary of the variables included, their units of measurement, and their corresponding column positions. The dataset integrates signals from two types of IMUs (BNO055 and MPU9250), offering redundancy. Variables include linear acceleration in both world and body reference frames, angular velocity, magnetometer data, and orientation quaternions.
This structure was designed to support the practical needs of the project, which involved collecting multi-sensor data from several cows over an extended period in a real grazing environment. The file naming convention and folder organization facilitate systematic access, integration with video annotations, and automation of the preprocessing pipeline. It also ensures that each event is uniquely identifiable and reproducible, which is essential for future expansion of the dataset or replication by other research groups.
2.2.5 Handling labels with outliers
Following the methodology described by Voss et al. (22), a statistical analysis of the labeled data was conducted to identify and correct events that were mislabeled or erroneously recorded. For this purpose, the z-score was used, which indicates how many standard deviations a specific value lies above or below the mean. Labels whose mean acceleration on any axis had a z-score greater than 3 or less than −3 (i.e., beyond three standard deviations from the mean) were considered outliers. Each outlier label was reviewed and either removed or corrected in cases of misclassified movements.
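The z-score criterion can be sketched as follows; the per-label means below are invented for illustration:

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Indices of values lying more than `threshold` standard deviations
    from the mean -- the criterion used to flag suspicious labels."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values)
            if abs((v - mu) / sigma) > threshold]

# Invented per-label mean accelerations on one axis; the last one is anomalous.
label_means = [0.14, 0.16] * 10 + [5.0]
flagged = zscore_outliers(label_means)
print(flagged)  # index of the label to review
```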
2.3 Machine learning algorithms
2.3.1 Definition of the time window
The literature indicates that larger time windows, close to 30 seconds, yield better results in behavior classification (17, 23). However, choosing such a large time window would significantly reduce the dataset size. According to Wang et al. (24), a window of at least 5 s can adequately represent each behavior. Considering this, a time window of 5 s (50 samples) with a 50% overlap was selected for this study to segment the labels of each behavior. Figure 7 shows an example of how a walking label is segmented.
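The segmentation described above can be sketched with a minimal sliding-window function; the event data here are synthetic:

```python
def segment(samples, window=50, overlap=0.5):
    """Fixed-size windows (50 samples = 5 s at 10 Hz) with 50% overlap;
    a trailing fragment shorter than one window is dropped."""
    step = int(window * (1 - overlap))
    return [samples[i:i + window]
            for i in range(0, len(samples) - window + 1, step)]

event = list(range(120))   # synthetic stand-in for 12 s of one labeled event
windows = segment(event)
print(len(windows), [w[0] for w in windows])  # windows start at 0, 25, 50
```

With 50% overlap, consecutive windows share half their samples, which increases the number of training instances extracted from each labeled event.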
2.3.2 Proposed features for classification
Feature extraction from acceleration measurements has been shown to yield good results. However, including gyroscope measurements could significantly improve behavior classification. This is evidenced by Liang et al. (17), where the most relevant features for classification were those derived from gyroscope measurements.
For this reason, in this study, the linear acceleration along the x, y, and z axes in the body reference frame (IoT collar), the linear acceleration in the world reference frame, and the gyroscope measurements in the body reference frame (IoT collar) were used for feature extraction. Additionally, three derived magnitude signals were calculated using the following formulas:

$$ {}^{w}|a| = \sqrt{{}^{w}a_x^2 + {}^{w}a_y^2 + {}^{w}a_z^2}, \qquad {}^{b}|a| = \sqrt{{}^{b}a_x^2 + {}^{b}a_y^2 + {}^{b}a_z^2}, \qquad {}^{b}|\omega| = \sqrt{{}^{b}\omega_x^2 + {}^{b}\omega_y^2 + {}^{b}\omega_z^2} $$

where ${}^{w}a_x$, ${}^{w}a_y$, and ${}^{w}a_z$ represent the linear acceleration in the world reference frame along the x, y, and z axes. Similarly, ${}^{b}a_x$, ${}^{b}a_y$, and ${}^{b}a_z$ represent the linear acceleration in the body reference frame, and ${}^{b}\omega_x$, ${}^{b}\omega_y$, and ${}^{b}\omega_z$ correspond to the gyroscope measurements in the body reference frame. ${}^{w}|a|$ and ${}^{b}|a|$ are the acceleration magnitudes in the world and body reference frames, respectively, and ${}^{b}|\omega|$ is the magnitude of the gyroscope measurements.
Two datasets were defined: the first, called Dataset BF, is based solely on the body reference frame, while Dataset WF includes the linear acceleration data transformed into the world reference frame.
For each dataset (BF and WF), 14 features were computed for each of the 8 variables, resulting in a total of 112 features used for model training and validation. These features are listed in Table 3.
2.3.3 Accessing the dataset
To facilitate the application of the dataset in machine learning pipelines, a practical workflow was designed for reading, processing, and training models with the labeled behavioral data. Each CSV file contains time-series measurements for a single annotated event, and these files are organized by behavior and naming convention to support batch processing (as described previously).
The dataset can be used to generate feature vectors suitable for input into classification algorithms. A typical process involves reading each file, extracting statistical or spectral features from the accelerometer and gyroscope signals, assigning a class label, and constructing the corresponding input matrix.
Below is an example script written in Python that demonstrates how to load a single labeled file, extract basic features (mean and standard deviation), and train a simple classifier using scikit-learn:
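(The script shown here is an illustrative reconstruction: the inline CSV, the column names ax/ay/az, and the two-event training set are placeholders rather than actual dataset contents; the real column layout is given in Table 2 and the full pipeline in the repository.)

```python
import csv
import io
import statistics
from sklearn.svm import SVC

# Inline stand-in for the contents of one labeled event file.
RAW = """ax,ay,az
0.12,-9.71,0.33
0.45,-9.52,0.61
-0.08,-9.90,0.12
0.51,-9.40,0.70
"""

def extract_features(csv_text):
    """Mean and standard deviation of each axis: one feature vector per event."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    feats = []
    for col in ("ax", "ay", "az"):
        values = [float(r[col]) for r in rows]
        feats += [statistics.mean(values), statistics.pstdev(values)]
    return feats

x_walk = extract_features(RAW)          # class label comes from the file name

# Minimal two-event training set; the second vector mimics a "resting" event.
X = [x_walk, [0.0, 0.01, -9.8, 0.02, 0.0, 0.01]]
y = ["walking", "resting"]

clf = SVC(kernel="linear", C=100).fit(X, y)
print(clf.predict([x_walk])[0])
```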
This simplified example can be expanded to include all labeled files, additional features (such as frequency-domain metrics), and more complex models. A more complete implementation, including batch processing, feature extraction, and classifier evaluation, is available in the public repository at https://github.com/WASP-lab/db-cow-walking. The dataset's structure and consistent formatting allow for seamless integration with most machine learning workflows.
2.3.4 Feature selection
Feature selection is a key step in improving the performance of machine learning models. Choosing a smaller, more relevant subset of features can increase accuracy by discarding irrelevant and redundant patterns, and it reduces training time and computational cost because the model works with fewer variables.
Liang et al. (17) present a machine learning approach based on IMU data for recognizing daily behavior patterns in dairy cows. Their study used two feature selection techniques: Permutation Importance and Sequential Backward Selection (SBS).
Permutation Importance is a model-agnostic technique that evaluates the contribution of each feature by measuring the decline in model performance when the feature's values are randomly permuted. A significant drop in the F1-score indicates that the feature plays an important role in classification. Sequential Backward Selection is a greedy algorithm that starts with the complete set of features and removes the least relevant one in each iteration, based on its impact on model performance. These methods are commonly used to reduce dimensionality and improve generalization. In the work by (17), which focused on behavior recognition using IMU signals, both techniques were applied to identify relevant features from an initial set of 70. Their results showed that reducing to 58 features maintained a high F1-score of 0.87, and further reduction to 31 features still yielded 0.8506, demonstrating the effectiveness of these selection strategies.
In our study, after the training process, the trained model was evaluated using a test set (20% of the original dataset) to generate a baseline metric (in this case, the F1-score). Then, a feature from the test set was selected and permuted (its values were randomly rearranged), and predictions were made using the test set with the permuted feature. Feature importance was calculated as the difference between the baseline performance and the performance after permutation. This procedure was repeated for each feature in the test set, generating a list of relative importances. Features were then ranked in descending order of importance.
Finally, the Sequential Backward Selection technique was used. In this process, the algorithm was initially trained with all available features. In each iteration, the least important feature was removed, and the model's performance was evaluated using the F1-score for each resulting subset. At the end, the subset of features providing the best F1-score was selected, optimizing the model.
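The two selection steps can be sketched with scikit-learn's built-in utilities on synthetic data. Note that `permutation_importance` and `SequentialFeatureSelector` stand in for the custom procedure described above (sklearn's backward selector evaluates all removal candidates rather than following the permutation ranking):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # only features 0 and 1 are informative

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)

# Permutation importance: drop in F1-score when one feature's values are shuffled.
imp = permutation_importance(model, X_te, y_te, scoring="f1",
                             n_repeats=10, random_state=0)
ranking = np.argsort(imp.importances_mean)[::-1]
print("features ranked by importance:", ranking.tolist())

# Sequential Backward Selection: start from all features, drop the weakest.
sbs = SequentialFeatureSelector(model, n_features_to_select=2,
                                direction="backward", cv=3).fit(X_tr, y_tr)
print("selected feature mask:", sbs.get_support().tolist())
```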
2.3.5 Training the machine learning algorithms
To evaluate the utility of the dataset, four widely used supervised machine learning algorithms were implemented and trained: Support Vector Machine (SVM), Logistic Regression (LR), Decision Trees (DT), and Random Forest (RF). All models were developed in Python (version 3.11.9) using the scikit-learn library (version 1.5.1), which provides a consistent and efficient framework for model training, evaluation, and hyperparameter tuning.
Support Vector Machine (25) aims to find an optimal hyperplane that separates the data into distinct classes, maximizing the margin between the closest points of each class, known as support vectors. It works with both linearly separable and non-linear data using kernels. There are different ways to handle multiple classes using SVM, such as the One-vs-Rest and One-vs-One methods. This study uses a multi-class SVM with the One-vs-One approach, as it generally provides better performance. This method involves training a binary classifier for each pair of classes, and the results of each classifier are combined using a voting scheme (the class with the most votes is the final prediction).
Logistic Regression (26) uses a linear combination of independent variables and applies the sigmoid (logistic) function to model the probability that an observation belongs to one of two classes. The Decision Tree algorithm (27) works by repeatedly splitting the data into subsets based on the feature that best separates according to a specific criterion, such as information gain or Gini impurity. Tree nodes represent decisions based on features, and the leaves represent the final predictions. Random Forest (28) is a machine learning algorithm consisting of a collection of independently trained decision trees, with predictions made by majority voting or averaging the predictions of individual trees.
For model training and evaluation, the original dataset was randomly divided, with 80% assigned to the training set and 20% to the test set. Hyperparameter tuning for the models was conducted using the Grid Search method with k-fold cross-validation to ensure that overfitting is minimized or avoided altogether.
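A tuning setup of this kind can be sketched as follows. The synthetic data and parameter grid are illustrative; the actual grids used in the study are not reproduced here:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 4))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.5).astype(int)  # non-linear boundary

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)

# Hypothetical grid: each combination is scored with 5-fold cross-validation
# on the training split only, so the test split stays untouched until the end.
grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"], "gamma": ["scale"]}
search = GridSearchCV(SVC(), grid, cv=5, scoring="f1_macro").fit(X_tr, y_tr)

print("best params:", search.best_params_)
print("held-out macro F1:", round(search.score(X_te, y_te), 3))
```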
2.3.6 Performance indices
The models' performance was evaluated using the metrics accuracy, precision, recall, F1-score, and macro-averaged F1-score. Accuracy is the percentage of correct predictions overall, representing how often the model was correct compared to all predictions. Precision is the proportion of correctly identified positive cases relative to all cases classified as positive. Recall is the proportion of positive cases correctly identified by the model relative to all actual positive cases. The F1-score is the harmonic mean of precision and recall, especially valuable for imbalanced datasets. The macro F1-score is the arithmetic mean of all F1 scores for each class, providing a single number to describe the overall performance of the models. These metrics are calculated as follows:

$$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} $$

$$ \text{Precision} = \frac{TP}{TP + FP} \qquad \text{Recall} = \frac{TP}{TP + FN} $$

$$ F1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \qquad \text{Macro } F1 = \frac{1}{N} \sum_{i=1}^{N} F1_i $$

Where TP, TN, FP, and FN represent the number of true positives, true negatives, false positives, and false negatives, respectively, and N is the number of classes.
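As an illustration, these metrics can be computed with scikit-learn; the label vectors below are invented for demonstration and use the four model classes:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score

y_true = ["walking", "grazing", "resting", "misc", "walking", "grazing"]
y_pred = ["walking", "grazing", "misc",    "misc", "walking", "resting"]

print("accuracy:", round(accuracy_score(y_true, y_pred), 3))
print("macro F1:", round(f1_score(y_true, y_pred, average="macro"), 3))
# Per-class precision (zero_division=0 guards classes that are never predicted)
print("precision per class:",
      precision_score(y_true, y_pred, average=None,
                      labels=["walking", "grazing", "resting", "misc"],
                      zero_division=0))
```

Macro averaging weights every class equally, which is why it is reported alongside accuracy for this imbalanced dataset.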
3 Results
3.1 Unprocessed data
3.1.1 Variables
Data collection from the IMUs was carried out between May 10 and October 8, 2024, involving 10 dairy cows. The recorded variables are detailed in Table 2. All data were stored on both Google Drive and a dedicated server, resulting in a total of 10 Runs, each compressed in ZIP format. A Run refers to the complete data collected during the time a collar was attached to an individual cow: from installation through removal and data extraction. Each run folder, representing a specific cow during a given monitoring campaign, contributes to a cumulative total of over 3 GB of raw data.
3.1.2 Recordings
A total of 150 video recordings were collected using PTZ cameras positioned to capture the animals' behavior during collar operation. Each recording lasted approximately one hour, with a file size of 1.7 GB, resulting in a total data volume of around 255 GB. These videos were subsequently used for behavior annotation.
3.2 Dataset
As detailed in Table 4, the dataset consists of 441 labels, totaling 7 h, 34 min, and 2 s of recording. It can be observed that the predominant behavior is walking, with a total of 217 events, while resting is the behavior with the fewest observations, with only 10 events. However, the total time recorded for resting amounts to 3 h, 11 min, and 20 s, significantly surpassing the time recorded for walking, which is 53 min and 2 s. This implies that during time window segmentation, the resting behavior corresponds to the majority class. A detailed breakdown of the behaviors included in the miscellaneous behaviors class is provided in Table 5, showing that the predominant behavior within this class is standing.

Table 5. Number of events, average duration, and total duration (hh:mm:ss) of behaviors classified as miscellaneous behaviors.
Figure 8 shows the linear accelerations in the body reference frame for the x, y, and z axes for the four predominant behaviors. It can be observed that each behavior exhibits a distinct pattern. Activities such as walking and grazing show greater variability and amplitude in accelerations, while static activities such as resting and standing present more stable signals.

Figure 8. Graph of linear accelerations in the body reference frame (x, y, z) versus time for the four main behaviors: walking, grazing, resting, and standing.
3.3 Evaluation of classification algorithms
The results presented in Table 6 were obtained after hyperparameter optimization using the Grid Search method with k-fold cross-validation, reducing the risk of overfitting and ensuring the models were compared at their best settings. Overall, the models achieve better performance for each behavior when trained with the dataset containing acceleration in the body reference frame, indicating that acceleration in the world reference frame does not improve the classification of these behaviors. Furthermore, grazing achieves the best performance, with F1-scores of 0.99 in the best cases (support vector machine and logistic regression models) and 0.98 in the worst cases (decision tree and random forest models).

Table 6. Evaluation results for the different classification models using body and world reference frame accelerations.
Table 7 summarizes the evaluation results for the models, showing the accuracy and macro F1-score metrics for each model and dataset. For models trained with the dataset containing acceleration in the body reference frame, the highest macro F1-score was 0.9625 with the support vector machine model. The random forest and logistic regression models also performed well, achieving macro F1-scores of 0.9539 and 0.9509, respectively. The model trained with the decision tree method presented the lowest performance, with a macro F1-score of 0.9290. Additionally, grazing and resting behaviors achieved the best performance across all evaluation metrics, while walking and others showed the lowest performance.
For machine learning models trained with the dataset including acceleration in the world reference frame, the best macro F1-score was 0.8965 with the SVM model, while the worst performance was achieved by the random forest model, with a macro F1-score of 0.8430.
These results demonstrate the usefulness of the proposed dataset for classifying key dairy cow behaviors under open grazing conditions, paving the way for future detection systems targeting early signs of lameness.
4 Discussion
Several studies have explored the classification of cattle behaviors or the detection of lameness using IMU-based data. For instance, Ismail et al. (10) proposed CowScreeningDB, a multisensor dataset collected from 43 confined cows, with an SVM model that achieved up to 77% accuracy in identifying lame individuals. Similarly, Haladjian et al. (11) proposed a sensor-based system for the same purpose, collecting data from 10 cows and achieving a higher average accuracy of 91.1%. The results obtained by Ismail et al. (10) and Haladjian et al. (11) are not directly comparable to the present work, as this research focuses on developing a dataset of behaviors associated with lameness rather than directly identifying a lame cow.
In addition, Ito et al. (14) presented a dataset including triaxial accelerometer data from six Japanese black cows with 13 labeled behaviors. The dataset was collected under open grazing and pen conditions using neck-mounted sensors and manual video labeling. Russel et al. (15) used this dataset to train deep learning models, achieving up to 0.99 F1-score. Cabezas et al. (29) also obtained classification accuracies above 0.93 using Random Forest models and neck-mounted accelerometers. While these results are strong, those datasets either lacked gyroscope measurements or were not collected continuously in extensive pasture conditions.
Compared to the present study, the work by Versluijs et al. (18) stands out for its use of orientation-dependent features, i.e., accelerations measured in the body reference frame, to classify behaviors of free-ranging beef cattle in forested pastures in Norway. Their best-performing model used a 20-second smoothing window and achieved excellent results: an accuracy of 0.997, precision of 0.961, and recall of 0.985. However, there are key methodological differences. Versluijs et al. recorded behavioral data using handheld Canon cameras, requiring close human presence and potentially influencing animal behavior. In contrast, our system employed PTZ IP cameras mounted on elevated poles with GPS-guided framing, enabling continuous monitoring without humans present during recording and thus preserving natural behavior patterns.
Another major distinction lies in the sensing hardware. Versluijs et al. used commercial virtual fencing collars named “Nofence” with built-in IMUs, whereas our custom-designed collar is intended to be low-cost and modular. It integrates two independent IMUs (BNO055 and MPU9250), allowing redundancy and cross-validation of sensor data. Additionally, our system collects both linear acceleration and angular velocity.
From an ethological standpoint, our study focused on dairy cows in an open grazing system, while Versluijs et al. analyzed the behavior of beef cattle. These differences in management system and breed type may also influence the behavioral repertoire and sensor signal profiles. Despite these contrasts, both studies demonstrated that body-frame accelerations yield high classification performance. Notably, while Versluijs et al. applied a simple manual correction (axis inversion when collars were mounted backward), we implemented continuous orientation correction using quaternion-based sensor fusion with magnetometer and accelerometer inputs, transforming data to a world reference frame. However, this correction did not improve performance in our case, likely due to magnetometer anomalies.
During the study, it was observed that signals captured by the magnetometer exhibited certain anomalies in amplitude, affecting orientation estimates and, consequently, the calculation of accelerations in the world reference frame. This could explain why these accelerations performed worse than those in the body reference frame.
In the best case, using an SVM model trained with body-frame accelerations and gyroscopic data, a macro F1-score of 0.9625 was achieved. This indicates that even with basic classification algorithms, robust results can be obtained when high-quality labeled data is available. Although this value is slightly lower than the 0.99 F1-score achieved by Russel et al. (15), their results were obtained using deep learning techniques on a smaller and less diverse dataset. Notably, the grazing behavior achieved the highest F1-score in our study, reaching 0.99 with the SVM model. This behavior is particularly relevant, as it is directly linked to milk production; thus, accurate monitoring can contribute to improved management decisions and increased economic returns (30).
Although this study does not directly address lameness detection and no lame cows were included in the dataset, the resulting data and findings are relevant from a lameness detection perspective. It is well established that lameness in dairy cows is associated with behavioral changes such as reduced walking activity, decreased grazing time, and increased resting periods. In this context, the classification algorithms developed in this study successfully identified different types of behavior, suggesting that models trained on this dataset could serve as a foundation for monitoring systems aimed at detecting early signs of lameness.
5 Conclusions
This study developed a dataset aimed at identifying behaviors associated with lameness in dairy cows. The dataset contains linear acceleration and angular velocity signals, totaling 441 labels and 7 hours, 34 minutes, and 2 seconds of recordings corresponding to 11 different behaviors. The data were collected from 10 cows grazing in an open pasture system over a period of approximately five months. The entire dataset generation process is fully documented, characterized, and verified, ensuring data reliability and result reproducibility. To test the dataset's adequacy, machine learning algorithms were trained. In general, the models achieved better performance when trained with the dataset composed of body reference frame accelerations along with angular velocity. In the best case, using an SVM model, a macro F1-score of 0.9625 was achieved. Although this performance is lower than the 0.99 F1-score obtained by Russel et al. (15) using deep learning models, only basic classification algorithms were used in this work. This also confirms that accelerations in the world reference frame do not contribute to improving the classification of these behaviors.
A limitation of this study is the low number of cows used to build the dataset. With only 10 individuals, the results may not fully capture the behavioral variability of larger or more diverse dairy populations, which may limit the generalization of the trained models and highlights the need for future studies with larger, more representative samples. The open pasture dairy cow behavior dataset is made publicly available for use in training classification models. It can be further enhanced by increasing the number of labels for each behavior and the number of cows studied, allowing the creation of more robust and representative models by capturing a greater diversity of data, reducing biases, and improving generalization across different contexts. Moreover, a larger dataset would facilitate the use of more advanced algorithms, including deep learning techniques. Increasing the duration of behavior recordings is also important for evaluating the effect of different time window sizes and identifying optimal configurations for classification tasks.
In conclusion, this work contributes a well-documented and publicly available dataset of dairy cow behaviors under open grazing conditions. The data and classification results presented here support the development of reliable, scalable monitoring systems aimed at improving animal welfare and early detection of lameness in real-world pasture environments.
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://github.com/WASP-lab/db-cow-walking.
Ethics statement
The animal studies were approved by Comité Ético Científico Universidad de La Frontera. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent was not obtained from the owners for the participation of their animals in this study because the associated university is the owner of the dairy farm.
Author contributions
DM-V: Writing – review & editing, Supervision, Funding acquisition, Resources, Conceptualization, Investigation, Software, Writing – original draft, Validation, Project administration, Data curation, Formal analysis, Visualization, Methodology. MG-V: Writing – review & editing, Visualization, Investigation, Conceptualization, Supervision, Methodology, Software, Resources, Funding acquisition, Project administration, Formal analysis, Writing – original draft, Data curation, Validation. DI-Q: Data curation, Project administration, Writing – original draft, Validation, Conceptualization, Resources, Investigation, Supervision, Funding acquisition, Methodology, Software, Formal analysis, Writing – review & editing, Visualization. DC-B: Writing – review & editing, Investigation, Methodology, Software, Writing – original draft, Funding acquisition, Supervision, Resources, Conceptualization, Visualization, Data curation, Validation, Formal analysis, Project administration. CM-P: Investigation, Supervision, Resources, Validation, Software, Funding acquisition, Conceptualization, Writing – review & editing, Formal analysis, Project administration, Writing – original draft, Data curation, Methodology, Visualization.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was funded by ANID/FONDECYT Regular N°1220178, and Basal funding for Scientific and Technological Center of Excellence, IMPACT, #FB210024.
Acknowledgments
The authors thank the Maquehue experimental dairy farm, managed by the Faculty of Agricultural Sciences and Environment at the University of La Frontera, Chile.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Gen AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Abbreviations
AAV, Average Acceleration Variation; BF, Body Frame; CSV, Comma-Separated Values; DT, Decision Tree; FP, False Positive; FN, False Negative; GPS, Global Positioning System; IMU, Inertial Measurement Unit; IoT, Internet of Things; LoRa, Long Range; LR, Logistic Regression; NVR, Network Video Recorder; PTZ, Pan-Tilt-Zoom; RF, Random Forest; RTC, Real-Time Clock; SD, Secure Digital; SVM, Support Vector Machine; TN, True Negative; TP, True Positive; WF, World Frame.
References
1. Tsousis G, Boscos C, Praxitelous A. The negative impact of lameness on dairy cow reproduction. Reprod Domest Anim. (2022) 57:33–9. doi: 10.1111/rda.14210
2. Barkema HW, Westrik JD, van Keulen KAS, Schukken YH, Brand A. The effects of lameness on reproductive performance, milk production and culling in Dutch dairy farms. Prev Vet Med. (1994) 20:249–59. doi: 10.1016/0167-5877(94)90058-2
3. Green L, Borkert J, Monti G, Tadich N. Associations between lesion-specific lameness and the milk yield of 1,635 dairy cows from seven herds in the Xth region of Chile and implications for management of lame dairy cows worldwide. Animal Welfare. (2010) 19:419–27. doi: 10.1017/S0962728600001901
4. Huxley JN. Impact of lameness and claw lesions in cows on health and production. Livest Sci. (2013) 156:64–70. doi: 10.1016/j.livsci.2013.06.012
5. Norring M, Häggman J, Simojoki H, Tamminen P, Winckler C, Pastell M. Short communication: lameness impairs feeding behavior of dairy cows. J Dairy Sci. (2014) 97:4317–21. doi: 10.3168/jds.2013-7512
6. Frondelius L, Lindeberg H, Pastell M. Lameness changes the behavior of dairy cows: daily rank order of lying and feeding behavior decreases with increasing number of lameness indicators present in cow locomotion. J Vet Behav. (2022) 54:1–11. doi: 10.1016/j.jveb.2022.06.004
7. Barker ZE, Vázquez Diosdado JA, Codling EA, Bell NJ, Hodges HR, Croft DP, et al. Use of novel sensors combining local positioning and acceleration to measure feeding behavior differences associated with lameness in dairy cattle. J Dairy Sci. (2018) 101:6310–21. doi: 10.3168/jds.2016-12172
8. Walker SL, Smith RF, Routly JE, Jones DN, Morris MJ, Dobson H. Lameness, activity time-budgets, and estrus expression in dairy cattle. J Dairy Sci. (2008) 91:4552–9. doi: 10.3168/jds.2008-1048
9. Solano L, Barkema HW, Pajor EA, Mason S, LeBlanc SJ, Nash CGR, et al. Associations between lying behavior and lameness in Canadian Holstein-Friesian cows housed in freestall barns. J Dairy Sci. (2016) 99:2086–101. doi: 10.3168/jds.2015-10336
10. Ismail S, Diaz M, Carmona-Duarte C, Vilar JM, Ferrer MA. CowScreeningDB: A public benchmark database for lameness detection in dairy cows. Comp Electron Agricult. (2024) 216:108500. doi: 10.1016/j.compag.2023.108500
11. Haladjian J, Haug J, Nüske S, Bruegge B. A Wearable sensor system for lameness detection in dairy cattle. Multimodal Technol Interact. (2018) 2:27. doi: 10.3390/mti2020027
12. O'Leary NW, Byrne DT, Garcia P, Werner J, Cabedoche M, Shalloo L. Grazing cow behavior's association with mild and moderate lameness. Animals. (2020) 10:661. doi: 10.3390/ani10040661
13. Van Hertem T, Schlageter-Tello A, Lokhorst C, Maltz E, Antler A, Romanini CEB, et al. Early lameness detection based on automatically registered 3D-video data of dairy cows. Biosyst Eng. (2018) 173:103–11. doi: 10.1016/j.biosystemseng.2017.08.011
14. Ito H, ichi Takeda K, Tokgoz KK, Minati L, Fukawa M, Chao L, et al. Japanese Black Beef Cow Behavior Classification Dataset. Geneva: Zenodo (2021).
15. Russel NS, Selvaraj A. Decoding cow behavior patterns from accelerometer data using deep learning. J Vet Behav. (2024) 74:68–78. doi: 10.1016/j.jveb.2024.06.005
16. Mancuso M, Bassano B, Peric T, Maione S, Pino S. Cow behavioural activities in extensive farms: challenges of adopting automatic monitoring systems. Sensors. (2023) 23:3828. doi: 10.3390/s23083828
17. Liang HT, Hsu SW, Hsu JT, Tu CJ, Chang YC, Jian CT, et al. An IMU-based machine learning approach for daily behavior pattern recognition in dairy cows. Smart Agricult Technol. (2024) 9:100539. doi: 10.1016/j.atech.2024.100539
18. Versluijs E, Niccolai LJ, Spedener M, Zimmermann B, Hessle A, Tofastrud M, et al. Classification of behaviors of free-ranging cattle using accelerometry signatures collected by virtual fence collars. Front Anim Sci. (2023) 4:1083272. doi: 10.3389/fanim.2023.1083272
19. Muñoz-Poblete C, González-Aguirre C, Bishop RH, Cancino-Baier D. IMU auto-calibration based on quaternion kalman filter to identify movements of dairy cows. Sensors. (2024) 24:1849. doi: 10.3390/s24061849
20. Muñoz C, Huircan J, Huenupan F, Cachaña P. PTZ camera tuning for real time monitoring of cows in grazing fields. In: 2020 IEEE 11th Latin American Symposium on Circuits & Systems (LASCAS). San Jose: IEEE (2020). p. 1–4.
21. Pinilla D, Muñoz C. Design of a location-based targeting system for PTZ cameras in dairy farms. In: 2024 IEEE International Conference on Automation/XXVI Congress of the Chilean Association of Automatic Control (ICA-ACCA). Santiago: IEEE (2024). p. 1–6.
22. Voss S, Zampieri C, Biskis A, Armijo N, Purcell N, Ouyang B, et al. Normative database of postural sway measures using inertial sensors in typically developing children and young adults. Gait Posture. (2021) 90:112–9. doi: 10.1016/j.gaitpost.2021.07.014
23. Smith D, Rahman A, Bishop-Hurley GJ, Hills J, Shahriar S, Henry D, et al. Behavior classification of cows fitted with motion collars: Decomposing multi-class classification into a set of binary problems. Comp Electron Agricult. (2016) 131:40–50. doi: 10.1016/j.compag.2016.10.006
24. Wang J, He Z, Zheng G, Gao S, Zhao K. Development and validation of an ensemble classifier for real-time recognition of cow behavior patterns from accelerometer data and location data. PLoS ONE. (2018) 13:1–19. doi: 10.1371/journal.pone.0203546
25. Cortes C, Vapnik V. Support-vector networks. Mach Learn. (1995) 20:273–97. doi: 10.1007/BF00994018
27. Rokach L, Maimon O. In: Maimon O, Rokach L, editors. Decision Trees. Boston, MA: Springer US (2005). p. 165–192.
29. Cabezas B, Fuentes D, Morales R. Analysis of accelerometer and GPS data for cattle behaviour identification and anomalous events detection. Entropy. (2022) 24:336. doi: 10.3390/e24030336
Keywords: standardized dataset, behaviors, cattle, lameness, animal welfare, Internet of Things (IoT), IMU
Citation: Morales-Vargas D, Guarda-Vera M, Iglesias-Quilodrán D, Cancino-Baier D and Muñoz-Poblete C (2025) A dataset for detecting walking, grazing, and resting behaviors in free-grazing cattle using IoT collar IMU signals. Front. Vet. Sci. 12:1630083. doi: 10.3389/fvets.2025.1630083
Received: 16 May 2025; Accepted: 11 August 2025;
Published: 29 August 2025.
Edited by:
Paulo de Mello Tavares Lima, University of Wyoming, United States
Reviewed by:
Jayakrishnan Nair, Southern Illinois University Carbondale, United States
Murilo Antônio Fernandes, São Paulo State University, Brazil
Copyright © 2025 Morales-Vargas, Guarda-Vera, Iglesias-Quilodrán, Cancino-Baier and Muñoz-Poblete. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Carlos Muñoz-Poblete, carlos.munoz@ufrontera.cl