- Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
Qualitative Behaviour Assessment (QBA) is a method used to assess emotional states in animals, based either on a list of pre-established terms (fixed list; FL) or on terms developed through Free-Choice-Profiling. Although FL QBA was originally developed for welfare assessment of farm animals, it is now also used for various other species and sectors. Reasons for this include the unique ‘whole-animal’ insight into animal experience that QBA contributes, which complements other measures, and its high feasibility, along with a general lack of available indicators of positive emotional state. This has led to a number of different usages and applications of FL QBA, of which an overview (e.g., exact methodology used, statistical analysis, purpose and aim) does not yet exist. The aim of this review is to provide an overview of the studies that have applied FL QBA, the species it has been used for and the corresponding lists, what the QBA was used for (purpose), and the various methodological approaches. Web of Science was searched between October 2023 and February 2024 for all peer-reviewed publications on FL QBA. A total of 193 publications met the inclusion criteria and were included in the final review. The most common aim of use of QBA was on-farm welfare assessment, followed by measuring behavioural/emotional responses to specific events and temperament assessment/behavioural profiling. FL QBA was mainly identified for farmed animals (cattle, pigs, hens, sheep, goats, buffaloes and salmon) but also for working and companion animals (horses and donkeys, dogs and cats) as well as various exotic species in other contexts such as zoological institutions (brown bears, polar bears, elephants, dolphins, gorillas, peccaries and pampas deer). No FL QBA use was identified for laboratory animals. Methodological approaches varied greatly, including differences in term generation, observation methods (e.g., individual- vs. group observation and time spent observing), level of training and inter-observer reliability, and statistical analysis. Moreover, the level of reporting also varied greatly. In sum, this review provides a full overview of the current state of FL QBA, including a list of all FL used, which is important for the future development, refinement and standardisation of the method.
1 Introduction
Qualitative Behaviour Assessment (QBA) is currently one of the relatively few available indicators of positive welfare [see, e.g., Boissy et al. (1) and Keeling et al. (2)] and one of the few methods currently thought as directly inferring an emotional state (3). The key characteristic of QBA is that it addresses the whole dynamic animal, describing and quantifying the emotionally expressive qualities that emerge from the animal’s way of moving around its environment. Qualitative descriptors such as fearful, joyful, or energetic integrate different aspects of an animal’s demeanour and are presumed to reflect an animal’s experience of its surroundings. Thus, QBA postulates that behaviour has observable dynamic expressive qualities open to formal analysis (4).
QBA was first mentioned in literature in 2000. Based on the argument that traditional, quantitative (ethogram-based) behavioural observation methodologies may not capture information on how an animal carries out behaviour (i.e., demeanour), a qualitative approach was explored as a novel methodology for integrative animal welfare assessment (5). Qualitative approaches have been used before to identify personality traits in animal personality research, inferring on underlying constructs that are not only based on which behaviours are performed, but also on how they are performed (6). This means that instead of only quantifying certain behaviours like in traditional ethograms, QBA specifically aims to capture the quality of behaviour, i.e., the style or expressive quality. A first link between this approach and the emotional state of animals was proposed by Wemelsfelder et al. (5). In the original article, as well as in the following years to come [e.g., Wemelsfelder et al. (4, 5) and Rousing and Wemelsfelder (7)], QBA was based on Free-Choice-Profiling (FCP) methodology, in which multiple observers freely generate terms to describe animals’ behavioural expressivity, usually based on video clips. In short, in Free Choice Profiling, observers use their own words to describe the expressive quality of the behaviours they see. A group of observers observes animals (usually from video clips) and then each observer writes down descriptive terms that in his or her opinion describe best the expressive quality of behaviours observed (e.g., descriptors like curious, relaxed, fearful). Then the same observers, using their self-generated descriptor list, rate the expressivity of observed animals on a Visual Analogue Scale (VAS) ranging from ‘minimum’ (expression absent) to ‘maximum’ (expression strongly dominant). Because everyone uses different words, the data is analysed using a statistical method called Generalised Procrustes Analysis (GPA). 
This technique finds common patterns in the ratings, despite the differences in vocabulary (4, 5, 8).
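The alignment idea behind GPA can be illustrated with a short sketch. This is a simplified illustration, not the procedure used in the cited studies: the function name, the synthetic data, the zero-padding of narrower term lists and the stopping rule are all assumptions made for the example. Each observer's score matrix (animals by terms) is centred, scaled to unit size, and rotated onto an iteratively updated consensus.

```python
import numpy as np

def generalised_procrustes(configs, n_iter=50, tol=1e-12):
    """Align several (animals x terms) matrices by centring, scaling and rotation.

    Simplified sketch: isotropic scaling is applied once, not re-estimated.
    """
    width = max(c.shape[1] for c in configs)
    prepared = []
    for c in configs:
        c = c - c.mean(axis=0)                  # centre each term column
        c = c / np.linalg.norm(c)               # scale to unit total sum of squares
        pad = np.zeros((c.shape[0], width - c.shape[1]))
        prepared.append(np.hstack([c, pad]))    # zero-pad to a common width
    consensus = prepared[0]
    previous = np.inf
    for _ in range(n_iter):
        aligned = []
        for c in prepared:
            # Orthogonal Procrustes: rotation R = U V^T from the SVD of c^T * consensus
            u, s, vt = np.linalg.svd(c.T @ consensus)
            aligned.append(c @ (u @ vt))
        consensus = np.mean(aligned, axis=0)
        residual = sum(np.sum((a - consensus) ** 2) for a in aligned)
        if abs(previous - residual) < tol:
            break
        previous = residual
    return consensus, aligned, residual

# Synthetic example: three "observers" whose ratings share one underlying pattern
rng = np.random.default_rng(0)
base = rng.normal(size=(6, 3))                  # 6 animals x 3 hypothetical descriptors
rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))
observers = [base, 2.0 * base @ rotation, base[:, [2, 0, 1]]]
consensus, aligned, residual = generalised_procrustes(observers)
print(f"residual after alignment: {residual:.2e}")
```

Because the three synthetic observers differ only by rotation, scale and column order, the residual after alignment is close to zero; real FCP data would leave a non-zero residual reflecting genuine disagreement.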
In a widely cited literature review on measuring positive emotions in animals (1), QBA is mentioned as one potential methodology to measure positive emotional state of animals and for potential inclusion in welfare assessment protocols, although the authors also highlight the general problem of validating such indicators of positive affect. Moreover, as studies suggested good reliability [(e.g., 4, 9–11)] along with the fact that not many (feasible) indicators for the positive emotional state had been described [(e.g., 1)], QBA was included as a measure of positive emotion within the Welfare Quality project (WQ) (development of feasible on-farm welfare assessment protocols). In order to enhance the feasibility of the QBA method, here, for the first time, the development of QBA fixed lists (FL), as ready-to-use lists of terms, is described (9–11). In the FL approach, the list of terms is pre-established based on existing research (sometimes further developed and refined in further studies), or on consultation with suitable species experts and stakeholders [(e.g., 12)], and is not, as with FCP, chosen freely by the observer(s) who end up using the list. This standardisation means that a FL QBA can be carried out by a single observer.
After inclusion in the WQ, QBA has also been extended to other welfare assessment frameworks and protocols, especially as measure for the criterion ‘positive emotional state’ [(e.g., 13–16)]. Because FL QBA is now a part of various welfare assessment schemes [e.g., WQ, Animal Welfare Indicators (AWIN), Shelter Quality (SQP)] and thus the use of FL QBA is sometimes not obvious in these studies, the exact number of studies that have used FL QBA is unclear. The methods used to develop the QBA lists vary, as does the context in which it has been used. Although some literature reviews on QBA already exist, these have so far focused on its potential use in welfare assessment protocols (17–21) and were focused on a specific group of animal species and/or on the usefulness of QBA as a tool for specific contexts [e.g., inclusion in Australian livestock industry (18) or for zoo animal welfare assessment (19)]. Moreover, these were not systematic reviews. The aim of the present review is to provide a structured overview of the application of QBA in studies using the FL approach, covering all species that a FL QBA has been developed for, as well as the uses (aims) of the method. The focus on FL QBA was chosen because this approach is most relevant for welfare assessment tools (i.e., on-farm/on-site use) due to its higher feasibility compared to FCP. Therein, we did not limit on specific purposes of use of FL QBA, but aimed to provide an overview of use of FL QBA in all areas of current research. The specific research questions we aim to answer are: (1) On which animal species has the FL QBA been carried out so far? (2) How were the FLs developed? (3) What was the aim of studies using the FL QBA? (4) How was the FL QBA applied? (5) How were QBA results analysed statistically? Providing such an overview is useful for guiding further developments of the method; the review’s focus will be on identifying methodological concerns for further discussion and research. 
However, as this is not a comprehensive review of QBA research, it will not address whether the listed QBA FL studies have used QBA successfully or not.
2 Materials and methods
2.1 Search methods
The electronic database Web of Science (WoS) was searched for relevant publications on QBA. This was carried out between October 2023 and February 2024. After initial scoping to detect the best possible search word combination, different searches were carried out for specific species, or groups of animals, to ensure covering the most common species within farmed, companion, experimental and wild animals. Specifically, this included cattle, buffaloes, pigs, poultry, sheep, goats, horses, dogs, cats, fish, experimental animals, as well as wild and exotic animals. For each species or animal group, two searches were applied. The first search was specified by the keywords ‘qualitative + behav* + assessment’ and/or ‘QBA’, supplemented by (i.e., also including) relevant species-specific terms (for example, species-specific terms for horse consisted of ‘horse OR equ* OR pon* OR foal OR filly OR mare OR stallion OR gelding’). The keywords of the second search included ‘welfare + assessment’ and/or ‘Welfare + Quality*’ and/or ‘AWIN’, along with the species-specific terms of the first search. The second search was added because QBA commonly is part of existing welfare assessment schemes such as WQ or AWIN and related publications, which in some cases, were missed by the first search. All search strings were specified to search in ‘topic’ (includes title, abstract and author keywords) with no limitation on publication year. For experimental and wild animals, one broad search was made for each category owing to the large number of species belonging to these categories (for experimental animals, specific searches for rats, mice, hamsters, rabbits, guinea pigs were also included). Finally, the species-specific searches were supplemented by a broad search without species specific terms with the search string ‘qualitative + behav* + assessment OR QBA OR welfare + assessment OR Welfare + Quality* OR AWIN’ to ensure all relevant publications were identified.
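The two-search structure described above can be sketched as a small query builder. This is an illustrative reconstruction only: the helper name, the phrase quoting, and the use of the `TS=` topic field tag are assumptions, since the exact syntax entered into Web of Science is not reported in the text.

```python
# Hypothetical helper; the exact query syntax used by the authors is not given.
def build_topic_query(method_terms, species_terms):
    """Combine method keywords and species keywords into one boolean topic query."""
    method_part = " OR ".join(method_terms)
    species_part = " OR ".join(species_terms)
    return f"TS=(({method_part}) AND ({species_part}))"

# Species-specific terms for horses, as listed in the text
horse_terms = ["horse", "equ*", "pon*", "foal", "filly", "mare", "stallion", "gelding"]

# First search: QBA keywords plus species terms
first_search = build_topic_query(['"qualitative behav* assessment"', "QBA"], horse_terms)

# Second search: welfare assessment protocol keywords plus the same species terms
second_search = build_topic_query(
    ['"welfare assessment"', '"Welfare Quality*"', "AWIN"], horse_terms
)

print(first_search)
print(second_search)
```

Keeping the species terms in one list and reusing them for both searches mirrors the description that the second search used "the species-specific terms of the first search".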
2.2 Inclusion and exclusion criteria
Title and abstract of all publications appearing in the searches were initially screened against inclusion and exclusion criteria (if the information could not be obtained from the title or abstract, the full text was screened). Publications that met all the following inclusion criteria were included in the review: (i) applied QBA as part of the study’s methodology (either focused entirely on QBA or included it as part of a larger objective), (ii) used the FL approach (either exclusively or as a second step to FCP for, e.g., term list development), (iii) published in a peer-reviewed journal, (iv) available in English, and (v) available in full. Any duplicate publications (i.e., publications that were already included) were excluded. Consequently, only original research publications utilising QBA based on the FL approach as defined by Wemelsfelder et al. (9), Wemelsfelder (10), Wemelsfelder et al. (11) (based on the respective authors’ claim and interpretation) were included. In addition, publications that reported using an existing welfare assessment protocol of which QBA is an established part (e.g., WQ, AWIN, SQP) were also included in the review, even if QBA was not specifically mentioned in the text.
2.3 Extraction of information
Selected parameters related to the studies’ methodology and results were extracted from the included publications. These parameters included information on the aim of the QBA, the animals used (e.g., species assessed, number of individuals, and life stages), information on the assessors (e.g., experience with species, QBA training received), the QBA method [e.g., term list development, time spent observing the animals, length of the visual analogue scale (VAS)], statistical methods (e.g., whether data suitability criteria were met, whether principal component analysis (PCA) was carried out and number of extracted principal components). Database searches, initial review of publications against inclusion and exclusion criteria, and extraction of parameters on publication-level were carried out approximately evenly distributed by the three authors. Fourteen randomly selected papers were reviewed independently beforehand to assure sufficient agreement in extraction (100%) between the three authors.
3 Results
The searches resulted in 193 included articles which ranged from the years 2011–2023. The last search without species-specific terms did not result in any additional articles. The studies and key results are presented in two tables: Table 1 presents the species, the setting, the life stage and the aim for which the QBA was used. Table 2 presents the experimental procedure of the same studies, i.e., origin of the QBA list of terms used, observer training, observation method and time, length of the VAS and whether QBA scores are analysed at PC or term level. Supplementary Table 1 contains the terms used in the studies on cattle, pigs and poultry, Supplementary Table 2 contains terms for sheep and goats, Supplementary Table 3 contains the same for horses and donkeys, and finally Supplementary Table 4 contains the same for dogs. The majority of the studies were done on production animals. More than half (54.4%) of the studies were on either cattle (34.7%, mainly dairy cattle) or pigs (19.7%). 11.9% of the studies were done on poultry and another 10.9% on small ruminants. On equids, encompassing working, farmed and companion animals alike, 8.8% of the studies were carried out. 8.2% of the studies were carried out on dogs (7.2%) and cats (1.0%). The remaining studies (5.2%) were done on zoo and aquaria animals or fish. No studies were found on experimental animals.
Table 1. Overview of the identified studies utilising fixed-list (FL) Qualitative Behaviour Assessment (QBA), the species the method was applied to, the setting and life stage of the animals during the study, and the aim of the QBA grouped into four general categories.
Table 2. Information on the experimental procedure of the identified studies, including the origin of the used QBA list of terms, level of observer training, observation method and time, number of animals assessed, length of the VAS and the analysis level of QBA outcomes.
3.1 Aim of the studies and origin of QBA term lists
The aim of the studies was in most cases welfare assessment (144 papers), and QBA was often done as part of the WQ (112 papers) or AWIN protocols (14 papers) (Table 1). Seven studies did not use QBA as an indicator in the area of general welfare assessment but rather as a measure of temperament [e.g., Gois et al. (22)]. The remaining studies’ aims can be summed up as assessment of emotional state independent of general welfare assessment, for example as an evaluation of an animal’s emotional response to specific events or contexts (e.g., disease, sport events etc.).
The greater representation of studies using QBA as part of a welfare assessment protocol is also reflected in the origin of the FL, since the term lists came from either the WQ or the AWIN protocol in 141 out of the 193 studies (see Table 2). However, in many cases the protocol lists were modified by adding one or several new terms [e.g., Andreasen et al. (23) and Sans et al. (24)], by reducing the number of terms overall [e.g., des Roches et al. (25) and da Silva et al. (26)] or, e.g., by exclusively using negatively valenced terms (27). When the FL was not part of an existing welfare assessment protocol (identified as WQ, AWIN or SQP), list development can be categorised as being based on the literature (terms were collected from the literature to form a list), on FCP (a new list was created based on the FCP approach) or on focus groups (terms were generated in a focus group), and was often based on a combination of these. While the studies on cattle, pigs and poultry most often used standardised lists (typically from WQ, Supplementary Table 1), the case is different for goats and especially sheep (Supplementary Table 2). Although AWIN has developed lists for these species (13, 15), several of the identified studies reported using self-developed lists, and it was not always clearly described how the lists had been developed. In the studies providing details on how the lists were developed, the authors often specifically highlight the need for developing alternative lists for specific purposes [(e.g., 28)] or with regard to translation issues when used in different geographical regions [(e.g., 12)]. Another notable point concerning studies using alternative lists is that these lists vary greatly in the number of terms; in small ruminants, for example, some consist of just six (29) and others of 21 (30) terms. For details of the FL used, see Supplementary Tables 1–4.
3.2 Experimental procedure of use of QBA
Again, most of the studies applied the QBA according to the methodology described in the respective welfare assessment protocols. However, some differences can be detected, for example regarding the length of the VAS. Eight studies reported using a VAS of only 100 mm (instead of 125 mm as originally described in WQ, AWIN and SQP), and two studies reported a VAS of more than 125 mm. Further adaptations of the VAS were also found, e.g., survey software formats and categorical or Likert scales [e.g., Menchetti et al. (31), Delfour et al. (32), and Gartland et al. (33)]. Notably, three studies (34–36) used a novel method they termed continuous QBA (c-QBA). C-QBA combines QBA with the “Temporal Dominant Behavioural Expression” methodology (34) and enables recording of shifts in individual QBA descriptors over time, i.e., the description of changes in animal behavioural expression during the observation session.
For the production animals, whole groups of animals were typically observed (following the WQ approach for these species), with only a few exceptions [e.g., Ebinghaus et al. (37, 38) and Gois et al. (22)]. For companion animals (including horses, but not donkeys) as well as for the zoo animals the reverse is true; most of the studies observed the animals at an individual level. The total number of animals included in the studies differed widely, with larger numbers of animals observed in production animals, with studies on hens and broilers including the highest numbers. The time frame observed per animal group was in most studies determined by the respective welfare assessment protocols, although different time frames also can be found (see Table 2).
Most of the studies observed the animals directly, while 29 studies used indirect (video) observation, and 11 studies used both direct and indirect observations. The level of observer training was found to vary greatly across the studies and was often not reported or was poorly described. Most studies provided no information on the level of experience with the relevant species (results not included in tables), while 89 studies reported their observers as experienced, however with large variation in provided details and in level of experience. Concerning the number of observers that performed QBA, 29 studies did not provide details, 49 studies were based on observations by one observer, 24 on observations by two observers and the remaining studies on observations by multiple observers. However, of the studies using more than one observer, only 43 reported that observer agreement was checked before data collection. A large variation is found in how observers were trained and how agreement was reached, checked and reported. This ranges from reporting of simple discussions about terms among observers, reaching an overall consensus of the whole WQ, AWIN or SQP (of which QBA is part), to utilising a few videos or spending up to multiple days or weeks on on-site training. Likewise, the analysis of observer agreement varied from descriptive evaluations to different statistical analyses, in which the level of interpretation also varied.
3.3 Reported statistical analysis
As shown in Table 2, 20 studies analysed the results of QBA outcomes solely on term level, 56 studies used the aggregation system of WQ, and 93 studies used a PCA for analysis (these are reported on in more detail in Supplementary Table 5). Fifteen of these 93 studies provided information on data suitability criteria. Fifty-two of the studies retained two PCs to explain the outcomes of the QBA (as in the WQ protocol), the other studies either retained one component (two studies), three components (20 studies, with 17 studies interpreting the third extracted component further) or four components (eight studies, with five studies interpreting the third and fourth extracted principal component further). In less than half of these 93 studies, information on cut-off values for factor loadings that were used for interpretation of the respective components could be extracted, i.e., most of the included studies did not state what was interpreted as loading highly on a PC and thus which values were used for interpreting/naming a PC. In some cases, this information was not clearly reported in the material and methods section, but could be extracted from the results tables. In about a quarter of the studies, principal component loadings of above 0.4 and below −0.4 were reported as used as cut-offs for this interpretation.
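To make the reporting issue concrete, the sketch below runs a PCA on a small table of QBA term scores and lists the terms whose loadings exceed a stated cut-off. It is a minimal illustration under assumptions: the term names and synthetic scores are invented, the PCA is computed on the correlation matrix, and the ±0.4 cut-off is the value reported by about a quarter of the reviewed studies rather than a recommended standard.

```python
import numpy as np

def pca_loadings(scores, n_components=2):
    """PCA on the correlation matrix of an (observations x terms) score table."""
    corr = np.corrcoef(scores, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]   # largest eigenvalues first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)              # term-to-component correlations
    explained = eigvals / corr.shape[0]                # proportion of variance per PC
    return loadings, explained

def high_loading_terms(loadings, terms, pc, cutoff=0.4):
    """Return the terms whose absolute loading on component `pc` exceeds the cut-off."""
    return [t for t, value in zip(terms, loadings[:, pc]) if abs(value) > cutoff]

# Synthetic VAS scores (0-125 mm) driven by one latent valence dimension
rng = np.random.default_rng(1)
terms = ["relaxed", "content", "fearful", "agitated"]   # hypothetical term list
valence = rng.normal(size=200)
signs = np.array([1.0, 1.0, -1.0, -1.0])
scores = 62.5 + 20.0 * valence[:, None] * signs + rng.normal(scale=5.0, size=(200, 4))
scores = np.clip(scores, 0.0, 125.0)

loadings, explained = pca_loadings(scores)
print(high_loading_terms(loadings, terms, pc=0))
print(np.round(explained, 3))
```

Stating the cut-off explicitly, as the `cutoff` argument does here, is the piece of information that most of the reviewed studies did not report. Note that the sign of each component is arbitrary, so only the relative signs of the loadings are interpretable.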
4 Discussion
Overall, FL QBA has been used in a variety of species, in many different settings and contexts, with various approaches to its methodology and analysis. The majority of FL QBA studies were carried out as part of a welfare assessment protocol for farm animals. While a large variation in species is evident, the literature search yielded no results on experimental animals. This is somewhat surprising, as the method aligns with other qualitative approaches used in experimental animals, such as those included in some forms of pain grimace assessments [e.g., in rabbits: Benato et al. (39)]. Moreover, using the method in experimental animals may aid in substantiating the validity and reliability of QBA, as laboratory settings typically offer more controlled environments than, e.g., farms or zoos [e.g., Calisi and Bentley (40)]. Overall, there is variation in a number of factors that are likely to affect the outcome of a QBA and its meaning. Differences in the conditions under which the animals were observed (e.g., filmed or live), in the choice of terms in the FL and in the statistical analysis make it difficult to compare the results of the current studies, even at the species level.
4.1 Aim of the studies
The original development of QBA was aimed at the evaluation of welfare (4, 5, 20), arguing that its whole-animal expressive information could make a unique contribution to scientific welfare assessment. Multiple validity and reliability studies on the FCP approach were carried out, generally supporting the efficacy of the methodology [reviewed by Wemelsfelder (8)]. Likewise, the high feasibility owing to its rapid assessment and ease of implementation [(e.g., 9–11)], compared to other methods of assessing the emotional state [such as the cognitive bias test; Crump et al. (41)], is a clear advantage of QBA. These advantages likely contribute to QBA being included as a customary part of various welfare assessment protocols. The first FL developments were specifically carried out for inclusion in welfare assessment protocols for farm animals (9–11). Given this development, it is not surprising that by far the most common use of FL QBA in the included studies was identified as general welfare assessment, predominantly as part of the established frameworks of WQ, AWIN and SQP, belonging to the welfare criterion ‘positive emotional state’ [e.g., Botreau et al. (42)]. Also outside such larger protocols, and following some concern that QBA is at risk of subjectivity due to its reliance on human observers (43), it is generally recommended not to use FL QBA as a stand-alone indicator for welfare assessment but to combine and cross-validate it with other indicators; Andreasen et al. (23), for example, could not validate QBA as a stand-alone indicator for welfare assessment.
Despite this focus on general welfare assessment, the FL QBA has by now been used for a variety of aims. In fact, the second-most common use was its application in specific contexts, mainly to assess emotional reactions to certain events, such as intrusive sampling and capture of, e.g., salmon (44) and pampas deer (45), calf-roping events during rodeo (46), agonistic social encounters in pigs (47), dogs’ interaction with humans during canine-assisted interventions (48) and sport events (28, 49). In these studies, FL QBA was mainly applied to investigate potential impacts of such events on animals’ emotional states and how this might affect their welfare. Further aims included temperament assessment. In general, the various aims showcase a broad and flexible usage of the method within the context of assessment of emotional state, as also suggested by Boissy et al. (1). It should specifically be noted that, in comparison to other methods of assessing emotional state, QBA takes the animal’s whole body language into account (4) instead of relying on separate measurement of specific facial expressions, gestures, body postures or behaviours [e.g., ear position, play and allogrooming in cattle (2)].
4.2 Origin of QBA lists
Because the descriptors that constituted the lists developed for WQ and AWIN were not necessarily appropriate or optimal for other types of situations and contexts, alternative lists were developed for other purposes such as the study of sick animals (25), human-animal relationship tests (37, 50), mother-young interactions (51) and sport competitions (in contrast to the evaluation in the normal husbandry environment) (28). Moreover, it should be noted that translation and cultural interpretation issues might arise concerning the descriptors, which might make the development of lists for use in specific geographical regions necessary, as highlighted by Souza et al. (12). These different circumstances, as well as aims other than welfare assessment, justify the use of different lists. There was, however, some variation in how the FL were developed. Not all of these lists were developed with a first step of FCP as originally described, and the creation of a validated FL, the process of FL development, or the justification for the selection of terms was not always clearly described [e.g., Carroll et al. (52), Kaurivi et al. (27), and Harvey et al. (53)]. It should be noted that many of the studies detailing the development and validation of FL specifically pointed out that the terms included should cover many different affective states, and that reducing the number of terms without further validation is thus not recommended [e.g., Arena et al. (54, 55) and Souza et al. (12)]. In principle, this also implies a minimum number of terms that should be used in QBA. This study presents an overview of all FL that have been used to date; however, as pointed out, the level of validation of these lists varies.
4.3 Experimental procedure of use of QBA
Most of the studies used a VAS of 125 mm in length. In the first mention of a VAS in the context of QBA, in Wemelsfelder et al. (4), a length of 12.5 cm was described. Since then, and especially in the first developments of FL QBA for welfare assessment protocols (9–11), this length was most commonly used. The authors of this study are not aware of any published justification for using 12.5 cm in QBA. In human medicine, the most commonly used VAS length is 10 cm (56), which is also the second most common VAS length identified in the present literature review. In a controlled trial on patients’ VAS preferences, Sriwatanakul et al. (57) found 10 cm to be the length of the most preferred type of VAS. Another study by Seymour et al. (58), specifically comparing different VAS lengths, reported a 10 cm continuous scale as the most appropriate and, in general, that lengths from 10 to 15 cm were suitable. Consequently, it is not clear whether 100 mm and 125 mm differ in suitability. The authors of the present study are not aware of any studies that investigate potential effects of VAS length on QBA outcomes. Such knowledge might be beneficial in order to unify the QBA methodology and for comparability across QBA studies.
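As a simple illustration of what comparing scores across VAS lengths would involve, the sketch below expresses a mark as a proportion of the scale length. The function name is hypothetical, and the proportional rescaling rests on the untested assumption that raters use scales of different lengths in the same relative way, which is exactly the open question raised above.

```python
def normalise_vas(mark_mm, scale_length_mm):
    """Express a VAS mark as a proportion (0 to 1) of the scale length.

    Assumption: raters use scales of different lengths proportionally.
    """
    if scale_length_mm <= 0:
        raise ValueError("scale length must be positive")
    if not 0 <= mark_mm <= scale_length_mm:
        raise ValueError("mark must lie on the scale")
    return mark_mm / scale_length_mm

# The same relative position on a 100 mm and a 125 mm scale
score_100 = normalise_vas(50.0, 100.0)
score_125 = normalise_vas(62.5, 125.0)
print(score_100, score_125)
```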
In addition to the differences in VAS lengths, a few alternative measurement techniques used for QBA were identified: for example, Gartland et al. (33) rated various gorilla expressions and activity patterns on qualitative descriptors such as anxiety, curiosity, irritability, cooperation and dominance, using a categorical 1–5 scale (ranging from ‘very low’ to ‘very high’). Moreover, some articles introduced a method referred to as ‘continuous QBA’ (c-QBA). C-QBA works with individual descriptors and is based on the temporal dominance of sensations (TDS) procedure, which allows raters to detect behavioural fluctuations during sessions, as opposed to the classical QBA approach, in which the sum of behavioural expression is considered and rated after a session. C-QBA hence provides information on variation over time in discrete emotions, i.e., shifts in behavioural expressivity over time can be captured. C-QBA was developed for goats (35) and buffaloes (34, 36). These approaches use the same type of qualitative descriptors as QBA, based on whole-animal expressive demeanour, and so require the same type of observational assessment; they were therefore included in this review. However, the format in which such assessments are subsequently processed and analysed differs from that of the original QBA.
Large variation was identified in the experimental setups across studies, which is not surprising given the general variation in the purpose and context of the studies. Hence, some studies observed groups of animals while others focused on individuals. Moreover, the size of the group under observation varied widely and was not always clearly reported. This naturally depends on the species being studied and on feasibility in the given setting (i.e., groups are more likely to be observed in production animals, which is explained by the husbandry environment on farm). However, at this stage, it remains unclear what effect individual- vs. group-level observation has on FL QBA outcomes. To the authors’ knowledge, this has not yet been investigated. Likewise, the majority of the studies used direct rather than video-based observations. It is plausible that video observations may yield more accurate results and improved observer focus, since there may be fewer external disturbances (59–61). On the other hand, assessors may be less involved: their ability to integrate perceived details of behaviour and context and to translate these into descriptors may be limited because not all information is conveyed by video, and observers are also less able to react to, e.g., sudden changes on-site (of which they might not even be aware) (62–64). The question thus arises whether or not observers should be informed about the context or background when using video observations. A further general disadvantage of video observation is the additional cost and time involved (64), which should be taken into account with regard to the feasibility of the assessment. Cooke et al. (65) investigated the difference between direct and video observations of beef cattle and found no difference between the methods for PC1, whereas the response was less pronounced for PC2 for the video observations.
Consequently, the authors of that study did not recommend using video observations for QBA. In contrast, Czycholl et al. (62) found good reliability for QBA when based on video observations, but not for on-farm assessments. A possible explanation for the difference in results is that in the study by Cooke et al. (65) the observers looked at the same animals in both cases, whereas in Czycholl et al. (62) the live observations were carried out in the same section of the farms, but not necessarily on the same animals.
The results of this review further show a large variability in the level of observer training and experience. Tuyttens et al. (66) focused their study on observer bias and the effects of observer training and demonstrated an influence on both quantitative and qualitative methods (specifically including QBA). Likewise, in a QBA-like study, Meyer et al. (67) suggested possible interactions between observers’ experience with dogs and their interpretation of dog behaviour (amongst others). Furthermore, Gronqvist et al. (68) highlighted the importance of experience with a species for correctly interpreting potentially dangerous situations, and Broom and Johnson (69) emphasised that knowledge about the behaviour of a species is important to avoid misinterpretations. Likewise, the initial introduction of FL QBAs for welfare assessment of cattle, pigs and poultry stated that observers need to be trained and experienced in order to use FL QBA (9–11). Accordingly, the most common welfare assessment protocols (of which QBA is a part) all highlight the need for a sufficient level of observer training before the protocols are used. In contrast, the first introductory publication on FCP QBA worked with observers naive to the species, relying on the general human ability to assess qualitative body language signals (5). Thus, although QBA methodology can in principle work with naive observers, species experience and observer training can improve reliability, as is the case for quantitative behavioural observations. Guidelines on the training level and species experience required when using FL QBA could help to unify the literature and thereby enhance comparability, which in turn would allow better conclusions to be drawn about the reliability and validity of the methodology and about the potential influence of specific settings on results.
4.4 Reported statistical analysis
Three main statistical approaches for retrieving FL QBA outcomes were identified: (1) using the mm values at term level, (2) subjecting the QBA scores to a PCA and (3) aggregation following the WQ approach, based on expert opinion and pre-existing data. The latter was usually applied when the FL QBA was used as part of an existing WQ protocol. It should be noted that in the very first publications on QBA (which were carried out as FCP), statistical analysis was carried out by Generalised Procrustes Analysis (GPA) prior to PCA (4, 5). In the first publications concerning the development of a FL approach within the WQ project, results were analysed at term level as well as by PCA (9–11). However, the respective authors argued that a PCA may be the most suitable approach to analysing QBA. Identifying principal components (PCs) on which the terms have a certain loading may support a more valid interpretation with regard to, e.g., observer agreement. Thus, although the analysis at term level was presented in those studies (9–11), the authors argued that conducting a PCA would provide more reliable results. This would mean that the analysis at term level, which occurred in 20 studies, cannot be seen as the most appropriate analysis.
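To illustrate the difference between the term-level values and the PCA outcome, the following is a minimal sketch in Python (NumPy only) of a PCA on the correlation matrix of QBA scores. The terms, the 125 mm scale and all values are hypothetical and chosen for illustration only; the sketch is not the procedure of any specific study in this review, and dedicated statistical software may apply different defaults (e.g., covariance instead of correlation matrix, rotation).

```python
import numpy as np

def pca_on_qba(scores):
    """PCA on the correlation matrix of QBA scores.

    scores: array of shape (n_observations, n_terms), VAS values in mm.
    Returns eigenvalues, loadings (n_terms x n_components) and PC scores.
    """
    scores = np.asarray(scores, dtype=float)
    # Standardise each term so that all terms contribute on the same scale
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    corr = np.corrcoef(scores, rowvar=False)
    # eigh is suited to symmetric matrices; it returns ascending eigenvalues
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Guard against tiny negative eigenvalues caused by rounding
    eigvals = np.clip(eigvals, 0.0, None)
    # Loadings: correlations between each term and each component
    loadings = eigvecs * np.sqrt(eigvals)
    # PC scores: one value per observation and component
    pc_scores = z @ eigvecs
    return eigvals, loadings, pc_scores

# Hypothetical data: 6 observations x 4 terms, in mm on a 125 mm VAS
terms = ["relaxed", "playful", "fearful", "agitated"]
scores = np.array([
    [100, 90, 10, 15],
    [95, 70, 20, 10],
    [30, 20, 90, 100],
    [40, 35, 80, 85],
    [70, 80, 30, 40],
    [20, 10, 110, 95],
])

eigvals, loadings, pc_scores = pca_on_qba(scores)
explained = eigvals / eigvals.sum()  # proportion of variance per PC

for term, row in zip(terms, loadings):
    print(f"{term:>9}: PC1 loading {row[0]:+.2f}, PC2 loading {row[1]:+.2f}")
```

In this sketch the term-level analysis would work directly on the columns of `scores`, whereas the PCA outcome is the set of `pc_scores` together with the `loadings` needed to interpret them. Note that the sign of a component is arbitrary, so only the relative signs of loadings within a component are meaningful.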
Regarding PCA, it should be noted that the data need to meet certain prerequisites, such as sample size requirements and interval-level measurement, for this statistical method to be applied. Textbooks describe that each retained PC should have an eigenvalue >1 (Kaiser-Guttman criterion), that a clear break in eigenvalues should be visible between the retained and the remaining PCs (scree test) and that the extracted PCs should explain a certain amount of the variance in the data set. Additionally, there is the interpretability criterion with regard to variables loading highly on the extracted PCs (70, 71). A further useful parameter for assessing data suitability is the Kaiser-Meyer-Olkin criterion, which was originally developed for factor analysis (72, 73). The results of this review show that only 15 of the 93 studies using PCA actually tested their data against such suitability criteria beforehand. In textbooks on PCA, factor loadings are considered meaningful at >0.4 (70) or even higher (>0.6–0.7) (74, 75). In this review, 19 of the studies that used PCA and interpreted factor loadings applied a cut-off value of >0.4 and three studies applied a cut-off value of >0.6. Whether clear cut-off values for factor loadings or the overall pattern of loadings, showing which descriptors contribute most to the identified PCs (70), is most suitable, and how informative relatively low loadings then are, is probably a matter of study aims. That said, term loadings close to zero should always be treated with caution when interpreting a PC. However, using clear cut-off values might be impossible because exact values may differ depending on the exact model used (and different statistical programmes use different models by default). This is further complicated by the fact that some authors use a factor analysis but interpret it as a PCA [which in some cases may be justifiable (70)].
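As an illustration of two of the checks mentioned above, the following minimal Python sketch computes the overall Kaiser-Meyer-Olkin measure from the correlation and partial correlation matrices and lists the terms whose absolute loading exceeds a chosen cut-off. The simulated data, the term names and the loading values are hypothetical assumptions made for this example; the function names `kmo` and `salient_terms` are likewise illustrative and do not refer to any particular software package.

```python
import numpy as np

def kmo(scores):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy.

    scores: array of shape (n_observations, n_terms). The correlation
    matrix must be invertible (more observations than terms, no term
    being an exact linear combination of the others).
    """
    corr = np.corrcoef(np.asarray(scores, dtype=float), rowvar=False)
    inv = np.linalg.inv(corr)
    d = np.sqrt(np.diag(inv))
    # Partial correlations between each pair of terms, given all others
    partial = -inv / np.outer(d, d)
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)
    p2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + p2)

def salient_terms(loadings, terms, cutoff=0.4):
    """For each PC, list the terms with |loading| above the cut-off."""
    loadings = np.asarray(loadings, dtype=float)
    return [
        [t for t, value in zip(terms, loadings[:, k]) if abs(value) > cutoff]
        for k in range(loadings.shape[1])
    ]

# Simulated data (assumption): 200 observations of 5 terms that share
# one common underlying dimension plus independent noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
scores = 0.8 * latent + 0.6 * rng.normal(size=(200, 5))
kmo_value = kmo(scores)
print(f"KMO = {kmo_value:.2f}")

# Hypothetical loadings of 4 terms on 2 PCs
terms = ["relaxed", "playful", "fearful", "agitated"]
loadings = [[0.85, 0.10], [0.62, 0.45], [-0.80, 0.05], [-0.35, 0.70]]
at_04 = salient_terms(loadings, terms, cutoff=0.4)
at_06 = salient_terms(loadings, terms, cutoff=0.6)
print(at_04)
print(at_06)
```

With these hypothetical loadings, "playful" contributes to the interpretation of PC2 at a cut-off of 0.4 but not at 0.6, which shows how the choice of cut-off can change which descriptors are used to label a component.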
Another noteworthy result of the present literature review is that most of the studies that conducted a PCA extracted two PCs and interpreted those further without explicit reference to general rules for extraction such as the scree plot or the Guttman criterion (70). This may be because studies adopted the methodology of other studies without adjusting it to their own data, which is a known risk and phenomenon in science (76). Data that do not meet the prerequisites for the statistical analysis, incorrect extraction and over-interpretation of relatively low values all clearly carry the risk of misinterpretation (70). That said, PCA is in general a relatively flexible method that has been adapted to a variety of disciplines over the years (77), so it also seems well suited to the analysis of QBA. However, the findings described highlight the need for more advice on how to correctly use and interpret multivariate statistical techniques such as PCA for the analysis of QBA data.
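The extraction rules referred to above can be reported with very little effort. The following minimal Python sketch summarises a set of eigenvalues according to the Kaiser-Guttman criterion, the cumulative proportion of variance explained and a crude numerical stand-in for the scree test. The eigenvalues are hypothetical, and taking the largest drop between consecutive eigenvalues is an assumption made here for illustration; it does not replace visual inspection of a scree plot.

```python
import numpy as np

def retention_summary(eigvals):
    """Summarise common rules for deciding how many PCs to retain."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    explained = eigvals / eigvals.sum()
    # Kaiser-Guttman criterion: retain PCs with an eigenvalue > 1
    n_kaiser = int(np.sum(eigvals > 1.0))
    # Crude scree proxy: number of PCs before the largest drop
    drops = eigvals[:-1] - eigvals[1:]
    n_scree = int(np.argmax(drops)) + 1
    return {
        "n_kaiser": n_kaiser,
        "n_scree": n_scree,
        "cumulative_explained": np.cumsum(explained),
    }

# Hypothetical eigenvalues from a PCA on 8 standardised terms
summary = retention_summary([4.2, 2.1, 0.9, 0.5, 0.3])
print(summary["n_kaiser"])              # PCs with eigenvalue > 1
print(summary["n_scree"])               # PCs before the largest drop
print(summary["cumulative_explained"])  # cumulative variance explained
```

In this hypothetical case the two rules disagree (two PCs by the Kaiser-Guttman criterion, one by the scree proxy), which illustrates why it is informative to state which rule was applied rather than extracting two PCs by default.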
The third type of statistical analysis identified is the aggregation system suggested and published by WQ (used when the FL QBA is part of a larger welfare assessment protocol). This aggregation system has the general aim of aggregating all the different welfare indicators (of which the FL QBA is only one) into one final welfare score of 0–100 (78). While many different aggregation methods are used in total, the FL QBA is essentially aggregated by weighted sums, which were obtained from expert opinion and from PCAs on data sets of FL QBAs that were, by the nature of the studies, limited (9–11, 42). Those data sets were also limited to certain regions [e.g., 17 farms in Germany, three assessors: (9)], so the results obtained from the PCAs on these data sets may not be generalisable. It is well known that small study populations easily lead to over- and under-estimations (79). A possible solution would be a revision of the existing aggregation system, e.g., through joint use of the larger data sets now available, such as WQ data from different working groups and countries. This would also align with the general aim of many welfare assessment protocols (e.g., WQ) to enhance and revise the existing protocol as new knowledge arises (80).
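The principle of a weighted sum can be sketched as follows in Python. The weights, terms and scores are purely hypothetical and are not the published WQ weights; the sketch also stops at the weighted sum and does not reproduce any subsequent steps of the WQ aggregation system.

```python
def weighted_sum_index(term_scores, weights, intercept=0.0):
    """Combine term-level QBA scores into a single index.

    term_scores: dict mapping term -> VAS score in mm
    weights:     dict mapping term -> weight (hypothetical values here)
    """
    missing = set(weights) - set(term_scores)
    if missing:
        raise ValueError(f"Missing scores for terms: {sorted(missing)}")
    return intercept + sum(weights[t] * term_scores[t] for t in weights)

# Hypothetical weights: positive for positively valenced terms,
# negative for negatively valenced terms
weights = {"relaxed": 0.01, "playful": 0.008, "fearful": -0.012, "agitated": -0.01}
farm_scores = {"relaxed": 100, "playful": 50, "fearful": 20, "agitated": 10}

index = weighted_sum_index(farm_scores, weights)
print(round(index, 2))
```

Because the index depends directly on the weights, weights derived from a small or regionally restricted data set carry over any over- or under-estimation into every assessment in which they are applied.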
4.5 Quality of the literature search
The quality of this literature review is supported by the high interobserver reliability between the extractors, by the fact that only studies published after 2011 were extracted [the first developments of FL QBA being described in the Welfare Quality Reports in 2009 (9–11)] and by the fact that the last search without species-specific terms did not yield any further articles.
5 Conclusion
In conclusion, the FL QBA approach has been used across many species, primarily farm animals, but also companion animals and more recently zoo animals. However, no FL QBA developed for experimental animals was identified. FL QBA has been used for a variety of aims, mainly for evaluating emotional state and most often as part of welfare assessment, which is also what the FL QBA was originally developed for. Different aims and settings will call for specifically tailored FLs in order to strengthen the reliability and validity of QBA in those settings. However, if a FL must be standardised as part of larger welfare assessment protocols, it is advisable to clarify the context in which that list of terms can be used. A number of methodological aspects of FL QBA vary across the identified studies, ranging from the length of the VAS to the evaluation of animals at group or individual level, the time used for observation, the training and experience level of observers and several other factors. Moreover, different statistical analyses are used, and the prerequisites for the use of those methods are not always met. These are aspects to consider when gathering knowledge of the current level of reliability and validity of QBA. Future studies should thus address whether there are certain conditions that must be met when applying QBA and what these conditions are, taking into account that studies have different aims and are applied in different settings, which may require a certain flexibility in using and interpreting QBA. This could answer the question of whether clearer guidelines on the construction, use and statistical analysis of FL QBA are necessary and could allow such guidelines to be developed. These could then also include guidance on which FL QBA results need to be presented, and how, to encourage cross-study comparison.
Author contributions
IC: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Validation, Visualization, Writing – original draft. CS: Data curation, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – review & editing. BF: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Validation, Visualization, Writing – review & editing.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Acknowledgments
We thank the reviewers for their valuable time and their constructive feedback, which has helped to improve the clarity and quality of this manuscript.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that Generative AI was not used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fvets.2025.1588346/full#supplementary-material
References
1. Boissy, A, Manteuffel, G, Jensen, MB, Moe, RO, Spruijt, B, Keeling, LJ, et al. Assessment of positive emotions in animals to improve their welfare. Physiol Behav. (2007) 92:375–97. doi: 10.1016/j.physbeh.2007.02.003
2. Keeling, LJ, Winckler, C, Hintze, S, and Forkman, B. Towards a positive welfare protocol for cattle: a critical review of indicators and suggestion of how we might proceed. Front Anim Sci. (2021) 2:753080. doi: 10.3389/fanim.2021.753080
3. Rutherford, KM, Donald, RD, Lawrence, AB, and Wemelsfelder, F. Qualitative behavioural assessment of emotionality in pigs. Appl Anim Behav Sci. (2012) 139:218–24. doi: 10.1016/j.applanim.2012.04.004
4. Wemelsfelder, F, Hunter, TE, Mendl, MT, and Lawrence, AB. Assessing the ‘whole animal’: a free choice profiling approach. Anim Behav. (2001) 62:209–20. doi: 10.1006/anbe.2001.1741
5. Wemelsfelder, F, Hunter, EA, Mendl, MT, and Lawrence, AB. The spontaneous qualitative assessment of behavioural expressions in pigs: first explorations of a novel methodology for integrative animal welfare measurement. Appl Anim Behav Sci. (2000) 67:193–215. doi: 10.1016/S0168-1591(99)00093-3
6. Uher, J, and Asendorpf, JB. Personality assessment in the great apes: comparing ecologically valid behavior measures, behavior ratings, and adjective ratings. J Res Pers. (2008) 42:821–38. doi: 10.1016/j.jrp.2007.10.004
7. Rousing, T, and Wemelsfelder, F. Qualitative assessment of social behaviour of dairy cows housed in loose housing systems. Appl Anim Behav Sci. (2006) 101:40–53. doi: 10.1016/j.applanim.2005.12.009
8. Wemelsfelder, F. How animals communicate quality of life: the qualitative assessment of behaviour. Anim Welf. (2007) 16:25–31. doi: 10.1017/S0962728600031699
9. Wemelsfelder, F, Schulze-Westerath, H, Lentfer, T, Staack, M, and Sandiland, V. Qualitative behaviour assessment In: B Forkman and L Keeling, editors. Welfare quality reports: assessment of animal welfare measures for layers and broilers. Uppsala, Sweden: SLU Service/Reproenheten (2009)
10. Wemelsfelder, F. Qualitative behaviour assessment In: B Forkman and L Keeling, editors. Welfare quality reports: assessment of animal welfare measures for sows, piglets and fattening pigs. Uppsala, Sweden: SLU Service/Reproenheten (2009)
11. Wemelsfelder, F, de Rosa, G, and Napolitano, F. Qualitative behaviour assessment In: B Forkman and L Keeling, editors. Assessment of animal welfare measures for dairy cattle, beef bulls and veal calves. Uppsala, Sweden: SLU Service/Reproenheten (2009)
12. Souza, AP, Wemelsfelder, F, Taconeli, C, and Molento, C. Development of a list of terms in Brazilian Portuguese for the qualitative behaviour assessment of broiler chickens. Anim Welf. (2021) 30:49–59. doi: 10.7120/09627286.30.1.049
16. Barnard, S, Pedernara, C, Velarde, A, and Dalla Villa, P. Welfare assessment protocol for shelter dogs. Milan (Italy): Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise (2014).
17. Cooper, R, and Wemelsfelder, F. Qualitative behaviour assessment as an indicator of animal emotional welfare in farm assurance. Livest. (2020) 25:180–3. doi: 10.12968/live.2020.25.4.180
18. Fleming, PA, Clarke, T, Wickham, SL, Stockman, CA, Barnes, AL, Collins, T, et al. The contribution of qualitative behavioural assessment to appraisal of livestock welfare. Anim Prod Sci. (2016) 56:1569–78. doi: 10.1071/AN15101
19. Rose, P, and Riley, L. The use of qualitative behavioural assessment to zoo welfare measurement and animal husbandry change. J Zoo Aquarium Res. (2019) 7:150–61. doi: 10.19227/jzar.v7i4.423
20. Wemelsfelder, F, and Lawrence, AB. Qualitative assessment of animal behaviour as an on-farm welfare-monitoring tool. Acta Agricult Scand Sect A Anim Sci. (2001) 51:21–5. doi: 10.1080/090647001300004763
21. Wemelsfelder, F, and Mullan, S. Applying ethological and health indicators to practical animal welfare assessment. Rev Sci Tech Off Int Epiz. (2014) 33:111–20. doi: 10.20506/rst.33.1.2259
22. Gois, KCR, Ceballos, MC, Sant'Anna, AC, and Costa, MJRPD. Using an observer rating method to assess the effects of rotational stocking method on beef cattle temperament over time. R Bras Zootec. (2016) 45:501–8. doi: 10.1590/S1806-92902016000900001
23. Andreasen, S, Wemelsfelder, F, Sandøe, P, and Forkman, B. The correlation of qualitative behavior assessments with welfare quality® protocol outcomes in on-farm welfare assessment of dairy cattle. Appl Anim Behav Sci. (2013) 143:9–17. doi: 10.1016/j.applanim.2012.11.013
24. Sans, E, Tuyttens, F, Taconeli, C, Rueda, P, Ciocca, J, and Molento, C. Welfare of broiler chickens reared in two different industrial house types during the winter season in southern Brazil. Br Poult Sci. (2021) 62:621–31. doi: 10.1080/00071668.2021.1908519
25. des Roches, ADB, Lussert, A, Faure, M, Herry, V, Rainard, P, Durand, D, et al. Dairy cows under experimentally-induced Escherichia coli mastitis show negative emotional states assessed through qualitative behaviour assessment. Appl Anim Behav Sci. (2018) 206:1–11. doi: 10.1016/j.applanim.2018.06.004
26. da Silva, PMRS, Ferreira, IC, Da Fonseca Neto, ÁM, Malaquias, JV, de Pinho, GAS, De Oliveira, SAS, et al. Does environmental enrichment consisting of brushing prepartum zebu heifers improve first-lactation behavior? Appl Anim Behav Sci. (2021) 234:105206. doi: 10.1016/j.applanim.2020.105206
27. Kaurivi, YB, Laven, R, Hickson, R, Stafford, K, and Parkinson, T. Identification of suitable animal welfare assessment measures for extensive beef systems in New Zealand. Agriculture Basel. (2019) 9:66. doi: 10.3390/agriculture9030066
28. Jaramillo, FM, Oliveira, TM, Silva, PEA, Trindade, PHE, and Baccarin, RYA. Development of a fixed list of descriptors for the qualitative behavioral assessment of thoroughbred horses in the racing environment. Front Vet Sci. (2023) 10:1189846. doi: 10.3389/fvets.2023.1189846
29. Willis, RS, Fleming, PA, Dunston-Clarke, EJ, Barnes, AL, Miller, DW, and Collins, T. Animal welfare indicators for sheep during sea transport: the effect of voyage day and time of day. Appl Anim Behav Sci. (2021) 238:105304. doi: 10.1016/j.applanim.2021.105304
30. Mialon, M-M, Boivin, X, Durand, D, Boissy, A, Delval, E, Bage, A, et al. Short-and mid-term effects on performance, health and qualitative behavioural assessment of Romane lambs in different milk feeding conditions. Animal. (2021) 15:100157. doi: 10.1016/j.animal.2020.100157
31. Menchetti, L, Righi, C, Guelfi, G, Enas, C, Moscati, L, Mancini, S, et al. Multi-operator qualitative behavioural assessment for dogs entering the shelter. Appl Anim Behav Sci. (2019) 213:107–16. doi: 10.1016/j.applanim.2019.02.008
32. Delfour, F, Monreal-Pawlowsky, T, Vaiceauskaite, R, Pilenga, C, Garcia-Parraga, D, Rödel, HG, et al. Dolphin welfare assessment under professional care: ‘willingness to participate’, an indicator significantly associated with six potential ‘alerting factors’. J Zool Bot. (2020) 1:42–60.
33. Gartland, KN, Bovee, E, and Fuller, G. Impact of alternating overnight housing conditions on welfare measures in a bachelor group of western lowland gorillas (Gorilla gorilla gorilla). Am J Primatol. (2023) 85:e23443. doi: 10.1002/ajp.23443
34. Napolitano, F, de Rosa, G, Serrapica, M, and Braghieri, A. A continuous recording approach to qualitative behaviour assessment in dairy buffaloes (Bubalus bubalis). Appl Anim Behav Sci. (2015) 166:35–43. doi: 10.1016/j.applanim.2015.01.017
35. Napolitano, F, De Rosa, G, Grasso, F, and Wemelsfelder, F. Qualitative behaviour assessment of dairy buffaloes (Bubalus bubalis). Appl Anim Behav Sci. (2012) 141:91–100.
36. Serrapica, M, Braghieri, A, Riviezzi, AM, Bragaglio, A, Carlucci, A, and Napolitano, F. Qualitative assessment of temporal fluctuations on buffalo behaviour. Ital J Agron. (2014) 9:157–62. doi: 10.4081/ija.2014.612
37. Ebinghaus, A, Ivemeyer, S, Rupp, J, and Knierim, U. Identification and development of measures suitable as potential breeding traits regarding dairy cows’ reactivity towards humans. Appl Anim Behav Sci. (2016) 185:30–8. doi: 10.1016/j.applanim.2016.09.010
38. Ebinghaus, A, Ivemeyer, S, and Knierim, U. Human and farm influences on dairy cows responsiveness towards humans–a cross-sectional study. PLoS One. (2018) 13:e0209817. doi: 10.1371/journal.pone.0209817
39. Benato, L, Murrell, J, Knowles, TG, and Rooney, NJ. Development of the Bristol rabbit pain scale (BRPS): a multidimensional composite pain scale specific to rabbits (Oryctolagus cuniculus). PLoS One. (2021) 16:e0252417. doi: 10.1371/journal.pone.0252417
40. Calisi, RM, and Bentley, GE. Lab and field experiments: are they the same animal? Horm Behav. (2009) 56:1–10. doi: 10.1016/j.yhbeh.2009.02.010
41. Crump, A, Arnott, G, and Bethell, EJ. Affect-driven attention biases as animal welfare indicators: review and methods. Animals. (2018) 8:136. doi: 10.3390/ani8080136
42. Botreau, R, Veissier, I, and Perny, P. Overall assessment of animal welfare: strategy adopted in welfare quality®. Anim Welf. (2009) 18:363–70. doi: 10.1017/S0962728600000762
43. Bokkers, EA, de Vries, M, Antonissen, IC, and de Boer, I. Inter-and intra-observer reliability of experienced and inexperienced observers for the qualitative behaviour assessment in dairy cattle. Anim Welf. (2012) 21:307–18. doi: 10.7120/09627286.21.3.307
44. Wiese, TR, Planellas, SR, Betancor, M, Haskell, M, Jarvis, S, Davie, A, et al. Qualitative behavioural assessment as a welfare indicator for farmed Atlantic salmon (Salmo salar) in response to a stressful challenge. Front Vet Sci. (2023) 10:1260090. doi: 10.3389/fvets.2023.1260090
45. Munerato, MS, Marques, JA, Caulkett, NA, Tomas, WM, Zanetti, ES, Trovati, RG, et al. Hormonal and behavioural stress responses to capture and radio-collar fitting in free-ranging pampas deer (Ozotoceros bezoarticus). Anim Welf. (2015) 24:437–46. doi: 10.7120/09627286.24.4.437
46. Rizzuto, S, Evans, D, Wilson, B, and McGreevy, P. Exploring the use of a qualitative behavioural assessment approach to assess emotional state of calves in rodeos. Animals. (2020) 10:113. doi: 10.3390/ani10010113
47. Oldham, L, Arnott, G, Camerlink, I, Doeschl-Wilson, A, Farish, M, Wemelsfelder, F, et al. Once bitten, twice shy: aggressive and defeated pigs begin agonistic encounters with more negative emotions. Appl Anim Behav Sci. (2021) 244:105488. doi: 10.1016/j.applanim.2021.105488
48. Pedersen, H, and Malm, K. Cross-disciplinary method development for assessing dog welfare in canine-assisted pedagogical work: a pilot study. J Appl Anim Welf Sci. (2025) 28:90–103. doi: 10.1080/10888705.2023.2211205
49. Fleming, PA, Paisley, CL, Barnes, AL, and Wemelsfelder, F. Application of qualitative behavioural assessment to horses during an endurance ride. Appl Anim Behav Sci. (2013) 144:80–8. doi: 10.1016/j.applanim.2012.12.001
50. Ebinghaus, A, Ivemeyer, S, Lauks, V, Santos, L, Brügemann, K, König, S, et al. How to measure dairy cows’ responsiveness towards humans in breeding and welfare assessment? A comparison of selected behavioural measures and existing breeding traits. Appl Anim Behav Sci. (2017) 196:22–9. doi: 10.1016/j.applanim.2017.07.006
51. Ceballos, MC, Gois, KCR, Sant’Anna, AC, Wemelsfelder, F, and da Costa, MP. Reliability of qualitative behavior assessment (QBA) versus methods with predefined behavioral categories to evaluate maternal protective behavior in dairy cows. Appl Anim Behav Sci. (2021) 236:105263. doi: 10.1016/j.applanim.2021.105263
52. Carroll, G, Boyle, L, Hanlon, A, Palmer, M, Collins, L, Griffin, K, et al. Identifying physiological measures of lifetime welfare status in pigs: exploring the usefulness of haptoglobin, C-reactive protein and hair cortisol sampled at the time of slaughter. Ir Vet J. (2018) 71:8–10. doi: 10.1186/s13620-018-0118-0
53. Harvey, AM, Morton, JM, Mellor, DJ, Russell, V, Chapple, RS, and Ramp, D. Use of remote camera traps to evaluate animal-based welfare indicators in individual free-roaming wild horses. Animals. (2021) 11:2101. doi: 10.3390/ani11072101
54. Arena, L, Berteselli, GV, Lombardo, F, Candeloro, L, Villa, PD, and Massis, FD. Application of a welfare assessment tool (shelter quality protocol) in 64 Italian long-term dogs’ shelters: welfare hazard analysis. Anim Welf. (2019) 28:353–63. doi: 10.7120/09627286.28.3.353
55. Arena, L, Wemelsfelder, F, Messori, S, Ferri, N, and Barnard, S. Development of a fixed list of terms for the qualitative behavioural assessment of shelter dogs. PLoS One. (2019) 14:e0212652. doi: 10.1371/journal.pone.0212652
56. Heller, GZ, Manuguerra, M, and Chow, R. How to analyze the visual analogue scale: myths, truths and clinical relevance. Scand J Pain. (2016) 13:67–75. doi: 10.1016/j.sjpain.2016.06.012
57. Battini, M, Renna, M, Giammarino, M, Battaglini, L, and Mattiello, S. Feasibility and reliability of the AWIN welfare assessment protocol for dairy goats in semi-extensive farming conditions. Front Vet Sci. (2021) 8:731927. doi: 10.3389/fvets.2021.731927
58. Seymour, RA, Simpson, JM, Charlton, EJ, and Phillips, ME. An evaluation of length and end-phrase of visual analogue scales in dental pain. Pain. (1985) 21:177–85. doi: 10.1016/0304-3959(85)90287-8
59. Schlageter-Tello, A, Bokkers, EAM, Groot Koerkamp, PWG, Van Hertem, T, Viazzi, S, Romanini, CEB, et al. Comparison of locomotion scoring for dairy cows by experienced and inexperienced raters using live or video observation methods. Anim Welf. (2015) 24:69–79. doi: 10.7120/09627286.24.1.069
60. Steinemann, S, Berg, B, Dituillio, A, Skinner, A, Terada, K, Anzelon, K, et al. Assessing teamwork in the trauma bay: introduction of a modified “NOTECHS” scale for trauma. Am J Surg. (2012) 203:69–75. doi: 10.1016/j.amjsurg.2011.08.004
61. van Maarseveen, OEC, Ham, WHW, Van Cruchten, S, Duhoky, R, and Leenen, LPH. Evaluation of validity and reliability of video analysis and live observations to assess trauma team performance. Eur J Trauma Emerg Surg. (2022) 48:4797–803. doi: 10.1007/s00068-022-02004-y
62. Czycholl, I, grosse Beilage, E, Henning, C, and Krieter, J. Reliability of the qualitative behavior assessment as included in the welfare quality assessment protocol for growing pigs. J Anim Sci. (2017) 95:3445–54. doi: 10.2527/jas.2017.1525
63. Jewitt, C. An introduction to using video for research. National Centre for Research Methods, NCRM Working Paper. London, UK: Institute of Education (2012).
64. Martin, P, and Bateson, P. Measuring behaviour: an introductory guide. 3rd ed. Cambridge, Cambridgeshire, UK: University of Cambridge (2007).
65. Cooke, A, Mullan, S, Morten, C, Hockenhull, J, Lee, M, Cardenas, L, et al. V-QBA vs. QBA—how do video and live analysis compare for qualitative behaviour assessment? Front Vet Sci. (2022) 9:832239. doi: 10.3389/fvets.2022.832239
66. Tuyttens, F, De Graaf, S, Heerkens, JL, Jacobs, L, Nalon, E, Ott, S, et al. Observer bias in animal behaviour research: can we believe what we score, if we score what we believe? Anim Behav. (2014) 90:273–80. doi: 10.1016/j.anbehav.2014.02.007
67. Meyer, I, Forkman, B, and Paul, ES. Factors affecting the human interpretation of dog behavior. Anthrozoös. (2014) 27:127–40. doi: 10.2752/175303714X13837396326576
68. Gronqvist, G, Rogers, C, Gee, E, Martinez, A, and Bolwell, C. Veterinary and equine science students’ interpretation of horse behaviour. Animals. (2017) 7:63. doi: 10.3390/ani7080063
69. Broom, DM, and Johnson, KG. Assessing welfare: long-term responses In: DM Broom and KG Johnson, editors. Stress and animal welfare: key issues in the biology of humans and other animals. Cham: Springer International Publishing (2019)
70. O’Rourke, N, and Hatcher, L. A step-by-step approach to using SAS® for factor analysis and structural equation modeling. Cary, NC, USA: SAS Institute Inc. (2013).
71. Stevens, JP. Applied multivariate statistics for the social sciences. New York City, USA: Lawrence Erlbaum Associates (2002).
72. Dziuban, CD, and Shirkey, EC. When is a correlation matrix appropriate for factor analysis? Some decision rules. Psychol Bull. (1974) 81:358–61. doi: 10.1037/h0036316
73. Kaiser, HF. A second generation little jiffy. Psychometrika. (1970) 35:401–15. doi: 10.1007/BF02291817
75. Hair, JF, Anderson, RE, Tatham, RL, and Black, WC. Multivariate data analysis. 5th ed. Upper Saddle River, NJ: Prentice Hall (1998).
76. Ioannidis, JPA. Why science is not necessarily self-correcting. Perspect Psychol Sci. (2012) 7:645–54. doi: 10.1177/17456916124640
77. Jolliffe, IT, and Cadima, J. Principal component analysis: a review and recent developments. Philos Trans R Soc Lond A Math Phys Eng Sci. (2016) 374:20150202. doi: 10.1098/rsta.2015.0202
78. Welfare Quality. Welfare quality assessment protocol for pigs. Lelystad, the Netherlands: Welfare Quality Consortium (2009).
79. Lin, L. Bias caused by sampling error in meta-analysis with small sample sizes. PLoS One. (2018) 13:e0204056. doi: 10.1371/journal.pone.0204056
80. Blokhuis, H, Veissier, I, Jones, B, and Miele, M. The Welfare Quality® vision In: Improving farm animal welfare. Wageningen, The Netherlands: Wageningen Academic Publishers (2013)
81. Adamie, BA, Uehleke, R, Hansson, H, Musshoff, O, and Huettel, S. Dairy cow welfare measures: can production economic data help? Sust Prod Consum. (2022) 32:296–305. doi: 10.1016/j.spc.2022.04.032
82. Andreasen, S, Sandøe, P, Waiblinger, S, and Forkman, B. Negative attitudes of Danish dairy farmers to their livestock correlates negatively with animal welfare. Anim Welf. (2020) 29:89–98. doi: 10.7120/09627286.29.1.089
83. Ostojić-Andrić, D, Hristov, S, Petrovic, MM, Pantelic, V, Niksic, D, Stanojkovic, A, et al. Health and welfare of dairy cows in Serbia. Sci Papers-Ser D-Anim Sci. (2016) 59:233–9. Available at: https://hdl.handle.net/21.15107/rcub_ristocar_497
84. Armbrecht, L, Lambertz, C, Albers, D, and Gauly, M. Assessment of welfare indicators in dairy farms offering pasture at differing levels. Animal. (2019) 13:2336–47. doi: 10.1017/S1751731119000570
85. Barry, C, Ellingson-Dalskau, K, Garmo, RT, Grønmo Kischel, S, Winckler, C, and Kielland, C. Obtaining an animal welfare status in Norwegian dairy herds—a mountain to climb. Front Vet Sci. (2023) 10:1125860. doi: 10.3389/fvets.2023.1125860
86. Brscic, M, Otten, ND, Contiero, B, and Kirchner, MK. Investigation of a standardized qualitative behaviour assessment and exploration of potential influencing factors on the emotional state of dairy calves. Animals. (2019) 9:757. doi: 10.3390/ani9100757
87. Bugueiro, A, Pedreira, J, and Dieguez, FJ. Study on the major welfare problems of dairy cows from the Galicia region (NW Spain). J Anim Behav Biometeorol. (2018) 6:84–9. doi: 10.31893/2318-1265jabb.v6n3p84-89
88. Bugueiro, A, Fouz, R, and Dieguez, FJ. Associations between on-farm welfare, milk production, and reproductive performance in dairy herds in Northwestern Spain. J Appl Anim Welf Sci. (2021) 24:29–38. doi: 10.1080/10888705.2020.1750016
89. Chen, X, Ogdahl, W, Hanna, LLH, Dahlen, CR, Riley, DG, Wagner, SA, et al. Evaluation of beef cattle temperament by eye temperature using infrared thermography technology. Comput Electron Agric. (2021) 188:106321. doi: 10.1016/j.compag.2021.106321
90. Coignard, M, Guatteo, R, Veissier, I, Des Roches, ADB, Mounier, L, Lehebel, A, et al. Description and factors of variation of the overall health score in French dairy cattle herds using the welfare quality® assessment protocol. Prev Vet Med. (2013) 112:296–308. doi: 10.1016/j.prevetmed.2013.07.018
91. Coignard, M, Guatteo, R, Veissier, I, Lehebel, A, Hoogveld, C, Mounier, L, et al. Does milk yield reflect the level of welfare in dairy herds? Vet J. (2014) 199:184–7. doi: 10.1016/j.tvjl.2013.10.011
92. Collins, S, Burn, CC, Cardwell, JM, and Bell, NJ. Evaluating the concept of iceberg indicators for on-farm welfare assessment of dairy cattle by farmers. Cattle Pract. (2015) 23:300.
93. Collins, S, Burn, C, Wathes, CM, Cardwell, JM, Chang, Y-M, and Bell, NJ. Time-consuming, but necessary: a wide range of measures should be included in welfare assessments for dairy herds. Front Anim Sci. (2021) 2:703380. doi: 10.3389/fanim.2021.703380
94. Cooke, AS, Mullan, S, Morten, C, Hockenhull, J, Le-Grice, P, le Cocq, K, et al. Comparison of the welfare of beef cattle in housed and grazing systems: hormones, health and behaviour. J Agric Sci. (2023) 161:450–63. doi: 10.1017/S0021859623000357
95. de Andrade Kogima, P, Diesel, TA, Vieira, FMC, Schogor, ALB, Volpini, AA, Veloso, GJ, et al. The welfare of dairy cows in pasture, free stall, and compost barn management systems in a Brazilian subtropical region. Animals. (2022) 12:2215. doi: 10.3390/ani12172215
96. de Graaf, S, Ampe, B, and Tuyttens, F. Assessing dairy cow welfare at the beginning and end of the indoor period using the welfare quality® protocol. Anim Welf. (2017) 26:213–21. doi: 10.7120/09627286.26.2.213
97. de Rosa, G, di Palo, R, Serafini, R, Grasso, F, Bragaglio, A, Braghieri, A, et al. Different assessment systems fail to agree on the evaluation of dairy cattle welfare at farm level. Livest Sci. (2019) 229:145–9. doi: 10.1016/j.livsci.2019.09.024
98. de Vries, M, Bokkers, E, Van Schaik, G, Botreau, R, Engel, B, Dijkstra, T, et al. Evaluating results of the welfare quality multi-criteria evaluation model for classification of dairy cattle welfare at the herd level. J Dairy Sci. (2013) 96:6264–73. doi: 10.3168/jds.2012-6129
99. de Vries, M, Engel, B, den Uijl, I, Van Schaik, G, Dijkstra, T, de Boer, I, et al. Assessment time of the welfare quality® protocol for dairy cattle. Anim Welf. (2013) 22:85–93. doi: 10.7120/09627286.22.1.085
100. de Vries, M, Bokkers, E, van Schaik, G, Engel, B, Dijkstra, T, and de Boer, I. Exploring the value of routinely collected herd data for estimating dairy cattle welfare. J Dairy Sci. (2014) 97:715–30. doi: 10.3168/jds.2013-6585
101. des Roches, ADB, Veissier, I, Coignard, M, Bareille, N, Guatteo, R, Capdeville, J, et al. The major welfare problems of dairy cows in French commercial farms: an epidemiological approach. Anim Welf. (2014) 23:467–78. doi: 10.7120/09627286.23.4.467
102. Dos Santos, SGCG, Saraiva, EP, Fonseca, VDFC, Saraiva, CAS, Neto, SG, da Silva Fidelis, S, et al. [Evaluation of welfare indicators in pasture-based dairy cows in Northeast Brazil]. Semina Ciênc Agrár. (2020) 41:3225–36. doi: 10.5433/1679-0359.2020v41n6Supl2p3225
103. Ebinghaus, A, Knierim, U, Simantke, C, Palme, R, and Ivemeyer, S. Fecal cortisol metabolites in dairy cows: a cross-sectional exploration of associations with animal, stockperson, and farm characteristics. Animals. (2020) 10:1787. doi: 10.3390/ani10101787
104. Ebinghaus, A, Matuli, K, Knierim, U, and Ivemeyer, S. Associations between dairy herds’ qualitative behavior and aspects of herd health, stockperson and farm factors—a cross-sectional exploration. Animals. (2022) 12:182. doi: 10.3390/ani12020182
105. Ellingsen, K, Coleman, GJ, Lund, V, and Mejdell, CM. Using qualitative behaviour assessment to explore the link between stockperson behaviour and dairy calf behaviour. Appl Anim Behav Sci. (2014) 153:10–7. doi: 10.1016/j.applanim.2014.01.011
106. Garro-Aguilar, Y, Fernandez, R, Calero, S, Noskova, E, Gulak, M, de la Fuente, M, et al. Acute stress-induced changes in the lipid composition of cow’s milk in healthy and pathological animals. Molecules. (2023) 28:980. doi: 10.3390/molecules28030980
107. Gieseke, D, Lambertz, C, and Gauly, M. Relationship between herd size and measures of animal welfare on dairy cattle farms with freestall housing in Germany. J Dairy Sci. (2018) 101:7397–411. doi: 10.3168/jds.2017-14232
108. Grimard, B, des Roches, ADB, Coignard, M, Lehebel, A, Chuiton, A, Mounier, L, et al. Relationships between welfare and reproductive performance in French dairy herds. Vet J. (2019) 248:1–7. doi: 10.1016/j.tvjl.2019.03.006
109. Gutmann, A, Schwed, B, Tremetsberger, L, and Winckler, C. Intra-day variation of qualitative behaviour assessment outcomes in dairy cattle. Anim Welf. (2015) 24:319–26. doi: 10.7120/09627286.24.3.319
110. Hernandez, A, Berg, C, Eriksson, S, Edstam, L, Orihuela, A, Leon, H, et al. The welfare quality® assessment protocol: how can it be adapted to family farming dual purpose cattle raised under extensive systems in tropical conditions? Anim Welf. (2017) 26:177–84. doi: 10.7120/09627286.26.2.177
111. Hernandez, A, Berg, C, Westin, R, and Galina, C. Seasonal differences in animal welfare assessment of family farming dual-purpose cattle raised under tropical conditions. Animals. (2018) 8:125. doi: 10.3390/ani8070125
112. Hulsmann, HLL, Hieber, JK, Yu, H, Celestino, EF, Dahlen, CR, Wagner, SA, et al. Blood collection has negligible impact on scoring temperament in Angus-based weaned calves. Livest Sci. (2019) 230:103835. doi: 10.1016/j.livsci.2019.103835
113. Kirchner, M, Westerath, HS, Knierim, U, Tessitore, E, Cozzi, G, Pfeiffer, C, et al. Application of the welfare quality® assessment system on European beef bull farms. Animal. (2014) 8:827–35. doi: 10.1017/S1751731114000366
114. Kirchner, M, Westerath, HS, Knierim, U, Tessitore, E, Cozzi, G, and Winckler, C. On-farm animal welfare assessment in beef bulls: consistency over time of single measures and aggregated welfare quality® scores. Animal. (2014) 8:461–9. doi: 10.1017/S1751731113002267
115. Krug, C, Haskell, M, Nunes, T, and Stilwell, G. Creating a model to detect dairy cattle farms with poor welfare using a national database. Prev Vet Med. (2015) 122:280–6. doi: 10.1016/j.prevetmed.2015.10.014
116. Lutz, B, Zwygart, S, Thomann, B, Stucki, D, and Burla, J-B. The relationship between common data-based indicators and the welfare of Swiss dairy herds. Front Vet Sci. (2022) 9:991363. doi: 10.3389/fvets.2022.991363
117. Molina, L, Agüera, EI, Perez-Marin, CC, and Maroto-Molina, F. Comparing welfare indicators in dairy cattle under different loose housing systems (deep litter vs cubicle barns) using recycled manure solids for bedding. Span J Agric Res. (2020) 18:e0501. doi: 10.5424/sjar/2020181-15287
118. Popescu, S, Borda, C, Diugan, EA, Spinu, M, Groza, IS, and Sandru, CD. Dairy cows welfare quality in tie-stall housing system with or without access to exercise. Acta Vet Scand. (2013) 55:1–11. doi: 10.1186/1751-0147-55-43
119. Popescu, S, Borda, C, Diugan, EA, Niculae, M, Stefan, R, and Sandru, CD. The effect of the housing system on the welfare quality of dairy cows. Ital J Anim Sci. (2014) 13:2940. doi: 10.4081/ijas.2014.2940
120. Russell, AL, Randall, LV, Kaler, J, Eyre, N, and Green, MJ. Use of qualitative behavioural assessment to investigate affective states of housed dairy cows under different environmental conditions. Front Vet Sci. (2023) 10:1099170. doi: 10.3389/fvets.2023.1099170
121. Sant’Anna, AC, and da Costa, MJP. Validity and feasibility of qualitative behavior assessment for the evaluation of Nellore cattle temperament. Livest Sci. (2013) 157:254–62. doi: 10.1016/j.livsci.2013.08.004
122. Schmitz, L, Ebinghaus, A, Ivemeyer, S, Domas, L, and Knierim, U. Validity aspects of behavioural measures to assess cows’ responsiveness towards humans. Appl Anim Behav Sci. (2020) 228:105011. doi: 10.1016/j.applanim.2020.105011
123. Schulz, F, Wagner, K, Brinkmann, J, March, S, Hinterstoißer, P, Schüler, M, et al. Welfare of dairy cattle in summer and winter—a comparison of organic and conventional herds in a farm network in Germany. J Sustain Org Agric Syst. (2020) 70:83–96. doi: 10.3220/LBF1608034952000
124. Thomann, B, Würbel, H, Kuntzer, T, Umstätter, C, Wechsler, B, Meylan, M, et al. Development of a data-driven method for assessing health and welfare in the most common livestock species in Switzerland: the smart animal health project. Front Vet Sci. (2023) 10:1125806. doi: 10.3389/fvets.2023.1125806
125. Tremetsberger, L, Leeb, C, and Winckler, C. Animal health and welfare planning improves udder health and cleanliness but not leg health in Austrian dairy herds. J Dairy Sci. (2015) 98:6801–11. doi: 10.3168/jds.2014-9084
126. Tremetsberger, L, Winckler, C, and Kantelhardt, J. Animal health and welfare state and technical efficiency of dairy farms: possible synergies. Anim Welf. (2019) 28:345–52. doi: 10.7120/09627286.28.3.345
127. Valente, D, and Stilwell, G. Applying a new proposed welfare assessment protocol to suckler herds from three different autochthonous breeds. Animals. (2022) 12:2689. doi: 10.3390/ani12192689
128. van Eerdenburg, FJ, Hof, T, Doeve, B, Ravesloot, L, Zeinstra, EC, Nordquist, RE, et al. The relation between hair-cortisol concentration and various welfare assessments of Dutch dairy farms. Animals. (2021) 11:821. doi: 10.3390/ani11030821
129. Vucemilo, M, Matkovic, K, Štokovic, I, Kovacevic, S, and Benic, M. Welfare assessment of dairy cows housed in a tie-stall system. Mljekarstvo. (2012) 62:62–7.
130. Wagner, K, Brinkmann, J, March, S, Hinterstoißer, P, Warnecke, S, Schüler, M, et al. Impact of daily grazing time on dairy cow welfare—results of the welfare quality® protocol. Animals. (2017) 8:1. doi: 10.3390/ani8010001
131. Wagner, K, Brinkmann, J, Bergschmidt, A, Renziehausen, C, and March, S. The effects of farming systems (organic vs. conventional) on dairy cow welfare, based on the welfare quality® protocol. Animal. (2021) 15:100301. doi: 10.1016/j.animal.2021.100301
132. Zhitia, E, Leeb, C, Muji, S, and Winckler, C. Welfare of dairy cows in Kosovo and intervention thresholds for selected welfare indicators as suggested by farmers and veterinarians. Anim Welf. (2022) 31:483–93. doi: 10.1017/S0962728600032474
133. Zuliani, A, Romanzin, A, Corazzin, M, Salvador, S, Abrhantes, J, and Bovolenta, S. Welfare assessment in traditional mountain dairy farms: above and beyond resource-based measures. Anim Welf. (2017) 26:203–11. doi: 10.7120/09627286.26.2.203
134. de Rosa, G, Grasso, F, Winckler, C, Bilancione, A, Pacelli, C, Masucci, F, et al. Application of the welfare quality protocol to dairy buffalo farms: prevalence and reliability of selected measures. J Dairy Sci. (2015) 98:6886–96. doi: 10.3168/jds.2015-9350
135. Brandt, P, Hakansson, F, Jensen, T, Nielsen, M, Lahrmann, H, Hansen, C, et al. Effect of pen design on tail biting and tail-directed behaviour of finishing pigs with intact tails. Animal. (2020) 14:1034–42. doi: 10.1017/S1751731119002805
136. Camerlink, I, Peijnenburg, M, Wemelsfelder, F, and Turner, SP. Emotions after victory or defeat assessed through qualitative behavioural assessment, skin lesions and blood parameters in pigs. Appl Anim Behav Sci. (2016) 183:28–34. doi: 10.1016/j.applanim.2016.07.007
137. Cardona, Z, Ceballos, MC, Morales, AMT, Jaramillo, DE, and de Jesus Rodriguez, B. Music modulates emotional responses in growing pigs. Sci Rep. (2022) 12:3382. doi: 10.1038/s41598-022-07300-6
138. Cardona, Z, Ceballos, MC, Morales, AMT, Jaramillo, DE, and de Jesus Rodriguez, B. Spectro-temporal acoustic elements of music interact in an integrated way to modulate emotional responses in pigs. Sci Rep. (2023) 13:2994. doi: 10.1038/s41598-023-30057-5
139. Carreras, R, Mainau, E, Arroyo, L, Moles, X, Gonzalez, J, Bassols, A, et al. Housing conditions do not alter cognitive bias but affect serum cortisol, qualitative behaviour assessment and wounds on the carcass in pigs. Appl Anim Behav Sci. (2016) 185:39–44. doi: 10.1016/j.applanim.2016.09.006
140. Clarke, T, Pluske, JR, and Fleming, PA. Are observer ratings influenced by prescription? A comparison of free choice profiling and fixed list methods of qualitative behavioural assessment. Appl Anim Behav Sci. (2016) 177:77–83. doi: 10.1016/j.applanim.2016.01.022
141. Czycholl, I, Kniese, C, Büttner, K, grosse Beilage, E, Schrader, L, and Krieter, J. Interobserver reliability of the ‘welfare quality® animal welfare assessment protocol for growing pigs’. Springerplus. (2016) 5:1–13. doi: 10.1186/s40064-016-2785-1
142. Czycholl, I, Kniese, C, Büttner, K, grosse Beilage, E, Schrader, L, and Krieter, J. Test-retest reliability of the welfare quality® animal welfare assessment protocol for growing pigs. Anim Welf. (2016) 25:447–59. doi: 10.7120/09627286.25.4.447
143. Czycholl, I, Kniese, C, Schrader, L, and Krieter, J. Assessment of the multi-criteria evaluation system of the welfare quality® protocol for growing pigs. Animal. (2017) 11:1573–80. doi: 10.1017/S1751731117000210
144. Czycholl, I, Kniese, C, Schrader, L, and Krieter, J. How reliable is the multi-criteria evaluation system of the welfare quality® protocol for growing pigs? Anim Welf. (2018) 27:147–56. doi: 10.7120/09627286.27.2.147
145. Duijvesteijn, N, Benard, M, Reimert, I, and Camerlink, I. Same pig, different conclusions: stakeholders differ in qualitative behaviour assessment. J Agric Environ Ethics. (2014) 27:1019–47. doi: 10.1007/s10806-014-9513-z
146. Friedrich, L, Krieter, J, Kemper, N, and Czycholl, I. Test-retest reliability of the ‘welfare quality® animal welfare assessment protocol for sows and piglets’. Part 1. Assessment of the welfare principle of ‘appropriate behavior’. Animals. (2019) 9:398. doi: 10.3390/ani9070398
147. Friedrich, L, Krieter, J, Kemper, N, and Czycholl, I. Animal welfare assessment in sows and piglets—introduction of a new German protocol for farm’s self-inspection and of new animal-based indicators for piglets. Agriculture. (2020) 10:506. doi: 10.3390/agriculture10110506
148. Friedrich, L, Krieter, J, Kemper, N, and Czycholl, I. Iceberg indicators for sow and piglet welfare. Sustainability. (2020) 12:8967. doi: 10.3390/su12218967
149. Friedrich, L, Krieter, J, Kemper, N, and Czycholl, I. Interobserver reliability of measures of the welfare quality® animal welfare assessment protocol for sows and piglets. Anim Welf. (2020) 29:323–37. doi: 10.7120/09627286.29.3.323
150. Friedrich, L, Krieter, J, Kemper, N, and Czycholl, I. Application of principal component analysis of sows' behavioral indicators of the welfare quality® protocol to determine main components of behavior. Front Anim Sci. (2021) 2:728608. doi: 10.3389/fanim.2021.728608
151. Hubbard, C, and Scott, K. Do farmers and scientists differ in their understanding and assessment of farm animal welfare? Anim Welf. (2011) 20:79–87. doi: 10.1017/S0962728600002451
152. Kang, HJ, Bae, S, and Lee, H. Correlation of animal-based parameters with environment-based parameters in an on-farm welfare assessment of growing pigs. J Anim Sci Technol. (2022) 64:539–63. doi: 10.5187/jast.2022.e23
153. Losada-Espinosa, N, Trujillo-Ortega, ME, and Galindo, F. The welfare of pigs in rustic and technified production systems using the welfare quality protocols of pigs in Mexico: validity of indicators of animal welfare as part of the sustainability criteria of pig production systems. Vet Méx. (2018) 4:1–15. doi: 10.21753/vmoa.4.4.521
154. Martin, P, Czycholl, I, Buxade, C, and Krieter, J. Validation of a multi-criteria evaluation model for animal welfare. Animal. (2017) 11:650–60. doi: 10.1017/S1751731116001737
155. Martinez, A, Donoso, E, Hernandez, RO, Sanchez, JA, and Romero, MH. Assessment of animal welfare in fattening pig farms certified in good livestock practices. J Appl Anim Welf Sci. (2024) 27:33–45. doi: 10.1080/10888705.2021.2021532
156. Meyer-Hamme, S, Lambertz, C, and Gauly, M. Assessing the welfare level of intensive fattening pig farms in Germany with the welfare quality® protocol: does farm size matter? Anim Welf. (2018) 27:275–86. doi: 10.7120/09627286.27.3.275
157. Munsterhjelm, C, Heinonen, M, and Valros, A. Application of the welfare quality® animal welfare assessment system in Finnish pig production, part I: identification of principal components. Anim Welf. (2015) 24:151–60. doi: 10.7120/09627286.24.2.151
158. Munsterhjelm, C, Heinonen, M, and Valros, A. Application of the welfare quality® animal welfare assessment system in Finnish pig production, part II: associations between animal-based and environmental measures of welfare. Anim Welf. (2015) 24:161–72. doi: 10.7120/09627286.24.2.161
159. Rocha, L, Velarde, A, Dalmau, A, Saucier, L, and Faucitano, L. Can the monitoring of animal welfare parameters predict pork meat quality variation through the supply chain (from farm to slaughter)? J Anim Sci. (2016) 94:359–76. doi: 10.2527/jas.2015-9176
160. Schmitt, O, O’Driscoll, K, Boyle, LA, and Baxter, EM. Artificial rearing affects piglets pre-weaning behaviour, welfare and growth performance. Appl Anim Behav Sci. (2019) 210:16–25. doi: 10.1016/j.applanim.2018.10.018
161. Schmitt, O, O’Driscoll, K, Baxter, E, and Boyle, L. Artificial rearing affects the emotional state and reactivity of pigs post-weaning. Anim Welf. (2019) 28:433–42. doi: 10.7120/09627286.28.4.433
162. Temple, D, Dalmau, A, De La Torre, JLR, Manteca, X, and Velarde, A. Application of the welfare quality® protocol to assess growing pigs kept under intensive conditions in Spain. J Vet Behav. (2011) 6:138–49. doi: 10.1016/j.jveb.2010.10.003
163. Temple, D, Manteca, X, Velarde, A, and Dalmau, A. Assessment of animal welfare through behavioural parameters in Iberian pigs in intensive and extensive conditions. Appl Anim Behav Sci. (2011) 131:29–39. doi: 10.1016/j.applanim.2011.01.013
164. Temple, D, Manteca, X, Dalmau, A, and Velarde, A. Assessment of test–retest reliability of animal-based measures on growing pig farms. Livest Sci. (2013) 151:35–45. doi: 10.1016/j.livsci.2012.10.012
165. Termatzidou, S-A, Dedousi, A, Kritsa, M-Z, Banias, GF, Patsios, SI, and Sossidou, EN. Growth performance, welfare and behavior indicators in post-weaning piglets fed diets supplemented with different levels of bakery meal derived from food by-products. Sustainability. (2023) 15:12827. doi: 10.3390/su151712827
166. Vitali, M, Santacroce, E, Correa, F, Salvarani, C, Maramotti, FP, Padalino, B, et al. On-farm welfare assessment protocol for suckling piglets: a pilot study. Animals. (2020) 10:1016. doi: 10.3390/ani10061016
167. Vitali, M, Santolini, E, Bovo, M, Tassinari, P, Torreggiani, D, and Trevisi, P. Behavior and welfare of undocked heavy pigs raised in buildings with different ventilation systems. Animals. (2021) 11:2338. doi: 10.3390/ani11082338
168. Wiseman-Orr, M, Scott, E, and Nolan, A. Development and testing of a novel instrument to measure health-related quality of life (HRQL) of farmed pigs and promote welfare enhancement (part 1). Anim Welf. (2011) 20:535–48. doi: 10.1017/S0962728600003171
169. Bassler, A, Arnould, C, Butterworth, A, Colin, L, de Jong, I, Ferrante, V, et al. Potential risk factors associated with contact dermatitis, lameness, negative emotional state, and fear of humans in broiler chicken flocks. Poult Sci. (2013) 92:2811–26. doi: 10.3382/ps.2013-03208
170. Buijs, S, Ampe, B, and Tuyttens, F. Sensitivity of the welfare quality® broiler chicken protocol to differences between intensively reared indoor flocks: which factors explain overall classification? Animal. (2017) 11:244–53. doi: 10.1017/S1751731116001476
171. Chen, Q, Saatkamp, HW, Cortenbach, J, and Jin, W. Comparison of Chinese broiler production systems in economic performance and animal welfare. Animals. (2020) 10:491. doi: 10.3390/ani10030491
172. de Jong, I, Hindle, V, Butterworth, A, Engel, B, Ferrari, P, Gunnink, H, et al. Simplifying the welfare quality® assessment protocol for broiler chicken welfare. Animal. (2016) 10:117–27. doi: 10.1017/S1751731115001706
173. di Marcantonio, L, Marotta, F, Vulpiani, MP, Sonntag, Q, Iannetti, L, Janowicz, A, et al. Investigating the cecal microbiota in broiler poultry farms and its potential relationships with animal welfare. Res Vet Sci. (2022) 144:115–25. doi: 10.1016/j.rvsc.2022.01.020
174. Federici, J, Vanderhasselt, R, Sand, E, Tuyttens, F, Souza, A, and Molento, C. Assessment of broiler chicken welfare in southern Brazil. Rev Bras Cienc Avic. (2016) 18:133–40. doi: 10.1590/18069061-2015-0022
175. Granquist, EG, Vasdal, G, de Jong, IC, and Moe, RO. Lameness and its relationship with health and production measures in broiler chickens. Animal. (2019) 13:2365–72. doi: 10.1017/S1751731119000466
176. He, S, Lin, J, Jin, Q, Ma, X, Lie, Z, Chen, H, et al. The relationship between animal welfare and farm profitability in cage and free-range housing systems for laying hens in China. Animals. (2022) 12:2090. doi: 10.3390/ani12162090
177. Iannetti, L, Neri, D, Santarelli, GA, Cotturone, G, Vulpiani, MP, Salini, R, et al. Animal welfare and microbiological safety of poultry meat: impact of different at-farm animal welfare levels on at-slaughterhouse Campylobacter and Salmonella contamination. Food Control. (2020) 109:106921. doi: 10.1016/j.foodcont.2019.106921
178. Iannetti, L, Romagnoli, S, Cotturone, G, and Podaliri Vulpiani, M. Animal welfare assessment in antibiotic-free and conventional broiler chicken. Animals. (2021) 11:2822. doi: 10.3390/ani11102822
179. Li, H, Wen, X, Alphin, R, Zhu, Z, and Zhou, Z. Effects of two different broiler flooring systems on production performances, welfare, and environment under commercial production conditions. Poult Sci. (2017) 96:1108–19. doi: 10.3382/ps/pew440
180. Muri, K, Stubsjøen, SM, Vasdal, G, Moe, RO, and Granquist, EG. Associations between qualitative behaviour assessments and measures of leg health, fear and mortality in Norwegian broiler chicken flocks. Appl Anim Behav Sci. (2019) 211:47–53. doi: 10.1016/j.applanim.2018.12.010
181. Nenadovic, K, Vucinic, M, Turubatovic, R, Beckei, Z, Geric, T, and Ilic, T. The effect of different housing systems on the welfare and the parasitological conditions of laying hens. J Hell Vet Med Soc. (2022) 73:4493–504. doi: 10.12681/jhvms.27585
182. Plitman, L, Ben-Dov, D, Dolev, S, Katz, R, Miculitzki, M, Nagar, S, et al. Case study: comparing the welfare of broiler chickens in two intensive production systems in Israel. Isr J Vet Med. (2021) 76:156–60.
183. Sans, E, Federici, J, Dahlke, F, and Molento, C. Evaluation of free-range broilers using the welfare quality® protocol. Rev Bras Cienc Avic. (2014) 16:297–306. doi: 10.1590/1516-635x1603297-306
184. Sans, E, Tuyttens, F, Taconeli, C, Rueda, P, Ciocca, J, and Molento, C. Welfare of broiler chickens reared under two different types of housing. Anim Welf. (2021) 30:341–53. doi: 10.7120/09627286.30.3.012
185. Sans, E, Dahlke, F, Federici, JF, Tuyttens, FAM, and Forte Maiolino Molento, C. Welfare of broiler chickens in Brazilian free-range versus intensive indoor production systems. J Appl Anim Welf Sci. (2023) 26:505–17. doi: 10.1080/10888705.2021.1992280
186. Souza, ADO, de Oliveira, S, Sans, E, Müller, B, and Molento, C. Broiler chicken welfare assessment in GLOBALGAP® certified and non-certified farms in Brazil. Anim Welf. (2015) 24:45–54. doi: 10.7120/09627286.24.1.045
187. Tuyttens, F, Federici, J, Vanderhasselt, R, Goethals, K, Duchateau, L, Sans, E, et al. Assessment of welfare of Brazilian and Belgian broiler flocks using the welfare quality protocol. Poult Sci. (2015) 94:1758–66. doi: 10.3382/ps/pev167
188. Vasdal, G, Granquist, EG, Skjerve, E, De Jong, IC, Berg, C, Michel, V, et al. Associations between carcass weight uniformity and production measures on farm and at slaughter in commercial broiler flocks. Poult Sci. (2019) 98:4261–8. doi: 10.3382/ps/pez252
189. Vasdal, G, Muri, K, Stubsjøen, SM, Moe, RO, and Kittelsen, K. Qualitative behaviour assessment as part of a welfare assessment in flocks of laying hens. Appl Anim Behav Sci. (2022) 246:105535. doi: 10.1016/j.applanim.2021.105535
190. Bodas, R, Garcia-Garcia, JJ, Montanes, M, Benito, A, Peric, T, Baratta, M, et al. On farm welfare assessment of European fattening lambs. Small Rumin Res. (2021) 204:106533. doi: 10.1016/j.smallrumres.2021.106533
191. Collins, T, Anthony, UM, Dunston-Clarke, EJ, and Fleming, PA. Feasibility of a sheep welfare assessment tool in the pre-export phase of Australian live export industry. Front Anim Sci. (2021) 2:687162. doi: 10.3389/fanim.2021.687162
192. Diaz-Lundahl, S, Hellestveit, S, Stubsjøen, SM, Phythian, CJ, Oppermann, MR, and Muri, K. Intra- and inter-observer reliability of qualitative behaviour assessments of housed sheep in Norway. Animals. (2019) 9:569. doi: 10.3390/ani9080569
193. Hernandez, RO, Sanchez, JA, and Romero, MH. Iceberg indicators for animal welfare in rural sheep farms using the five domains model approach. Animals. (2020) 10:2273. doi: 10.3390/ani10122273
194. Muri, K, and Stubsjøen, SM. Inter-observer reliability of qualitative behavioural assessments (QBA) of housed sheep in Norway using fixed lists of descriptors. Anim Welf. (2017) 26:427–35. doi: 10.7120/09627286.26.4.427
195. Phythian, C, Michalopoulou, E, Duncan, J, and Wemelsfelder, F. Inter-observer reliability of qualitative behavioural assessments of sheep. Appl Anim Behav Sci. (2013) 144:73–9. doi: 10.1016/j.applanim.2012.11.011
196. Phythian, C, Michalopoulou, E, Cripps, PJ, Duncan, JS, and Wemelsfelder, F. On-farm qualitative behaviour assessment in sheep: repeated measurements across time, and association with physical indicators of flock health and welfare. Appl Anim Behav Sci. (2016) 175:23–31. doi: 10.1016/j.applanim.2015.11.013
197. Stubsjøen, SM, Moe, RO, Mejdell, CM, Tømmerberg, V, Knappe-Poindecker, M, Kampen, AH, et al. Sheep welfare in different housing systems in South Norway. Small Rumin Res. (2022) 214:106740. doi: 10.1016/j.smallrumres.2022.106740
198. Battini, M, Stilwell, G, Vieira, A, Barbieri, S, Canali, E, and Mattiello, S. On-farm welfare assessment protocol for adult dairy goats in intensive production systems. Animals. (2015) 5:934–50. doi: 10.3390/ani5040393
199. Battini, M, Barbieri, S, Vieira, A, Stilwell, G, and Mattiello, S. Results of testing the prototype of the AWIN welfare assessment protocol for dairy goats in 30 intensive farms in northern Italy. Ital J Anim Sci. (2016) 15:283–93. doi: 10.1080/1828051X.2016.1150795
200. Battini, M, Barbieri, S, Vieira, A, Can, E, Stilwell, G, and Mattiello, S. The use of qualitative behaviour assessment for the on-farm welfare assessment of dairy goats. Animals. (2018) 8:123. doi: 10.3390/ani8070123
201. Can, E, Vieira, A, Battini, M, Mattiello, S, and Stilwell, G. On-farm welfare assessment of dairy goat farms using animal-based indicators: the example of 30 commercial farms in Portugal. Acta Agricult Scand Sect A Anim Sci. (2016) 66:43–55. doi: 10.1080/09064702.2016.1208267
202. Can, E, Vieira, A, Battini, M, Mattiello, S, and Stilwell, G. Consistency over time of animal-based welfare indicators as a further step for developing a welfare assessment monitoring scheme: the case of the animal welfare indicators protocol for dairy goats. J Dairy Sci. (2017) 100:9194–204. doi: 10.3168/jds.2017-12825
203. Costa, EDO, Gordiano, LA, Ferreira, FG, Santos, SA, De Carvalho, GGP, De Araujo, MLG, et al. Thermography as an indicator of goat welfare in an intensive production system. Trop Anim Health Prod. (2023) 55:373. doi: 10.1007/s11250-023-03791-1
204. Grosso, L, Battini, M, Wemelsfelder, F, Barbieri, S, Minero, M, Dalla Costa, E, et al. On-farm qualitative behaviour assessment of dairy goats in different housing conditions. Appl Anim Behav Sci. (2016) 180:51–7. doi: 10.1016/j.applanim.2016.04.013
205. Muri, K, Stubsjøen, S, and Valle, P. Development and testing of an on-farm welfare assessment protocol for dairy goats. Anim Welf. (2013) 22:385–400. doi: 10.7120/09627286.22.3.385
206. Muri, K, Leine, N, and Valle, P. Welfare effects of a disease eradication programme for dairy goats. Animal. (2016) 10:333–41. doi: 10.1017/S1751731115000762
207. Czycholl, I, Büttner, K, Klingbeil, P, and Krieter, J. An indication of reliability of the two-level approach of the AWIN welfare assessment protocol for horses. Animals. (2018) 8:7. doi: 10.3390/ani8010007
208. Czycholl, I, Klingbeil, P, and Krieter, J. Interobserver reliability of the animal welfare indicators welfare assessment protocol for horses. J Equine Vet Sci. (2019) 75:112–21. doi: 10.1016/j.jevs.2019.02.005
209. Czycholl, I, Büttner, K, Klingbeil, P, and Krieter, J. Evaluation of consistency over time of the use of the animal welfare indicators protocol for horses. Anim Welf. (2021) 30:81–90. doi: 10.7120/09627286.30.1.081
210. Dai, F, Riva, MG, Dalla Costa, E, Pascuzzo, R, Chapman, A, and Minero, M. Application of QBA to assess the emotional state of horses during the loading phase of transport. Animals. (2022) 12:3588. doi: 10.3390/ani12243588
211. Minero, M, Dalla Costa, E, Dai, F, Canali, E, Barbieri, S, Zanella, A, et al. Using qualitative behaviour assessment (QBA) to explore the emotional state of horses and its association with human-animal relationship. Appl Anim Behav Sci. (2018) 204:53–9. doi: 10.1016/j.applanim.2018.04.008
212. Mullan, S, Szmaragd, C, Hotchkiss, J, and Whay, HR. The welfare of long-line tethered and free-ranging horses kept on public grazing land in South Wales. Anim Welf. (2014) 23:25–37. doi: 10.7120/09627286.23.1.025
213. Popescu, S, Lazar, EA, Borda, C, Blaga Petrean, A, and Mitranescu, E. Changes in management, welfare, emotional state, and human-related docility in stallions. Animals. (2022) 12:2981. doi: 10.3390/ani12212981
214. Rowland, M, Hudson, N, Connor, M, Dwyer, C, and Coombs, T. The welfare of Traveller and Gypsy owned horses in the UK and Ireland. Animals. (2022) 12:2402. doi: 10.3390/ani12182402
215. Ruet, A, Biau, S, Arnould, C, Galloux, P, Destrez, A, Pycik, E, et al. Horses could perceive riding differently depending on the way they express poor welfare in the stable. J Equine Vet Sci. (2020) 94:103206. doi: 10.1016/j.jevs.2020.103206
216. Ruet, A, Arnould, C, LeMarchand, J, Parias, C, Mach, N, Moisan, MP, et al. Horse welfare: a joint assessment of four categories of behavioural indicators using the AWIN protocol, scan sampling and surveys. Anim Welf. (2022) 31:455–66. doi: 10.7120/09627286.31.3.008
217. Dai, F, Dalla Costa, E, Murray, LM, Canali, E, and Minero, M. Welfare conditions of donkeys in Europe: initial outcomes from on-farm assessment. Animals. (2016) 6:5. doi: 10.3390/ani6010005
218. Dai, F, Segati, G, Brscic, M, Chincarini, M, Dalla Costa, E, Ferrari, L, et al. Effects of management practices on the welfare of dairy donkeys and risk factors associated with signs of hoof neglect. J Dairy Res. (2018) 85:30–8. doi: 10.1017/S0022029917000723
219. Gonzalez, FJN, Vidal, JJ, Leon Jurado, JM, McLean, AK, and Delgado Bermejo, JV. Nonparametric analysis of noncognitive determinants of response type, intensity, mood, and learning in donkeys (Equus asinus). J Vet Behav. (2020) 40:21–35. doi: 10.1016/j.jveb.2020.08.003
220. Minero, M, Dalla Costa, E, Dai, F, Murray, LAM, Canali, E, and Wemelsfelder, F. Use of qualitative behaviour assessment as an indicator of welfare in donkeys. Appl Anim Behav Sci. (2016) 174:147–53. doi: 10.1016/j.applanim.2015.10.010
221. Barnard, S, Pedernara, C, Candeloro, L, Ferri, N, Velarde, A, and Dalla Villa, P. Development of a new welfare assessment protocol for practical application in long-term dog shelters. Vet Rec. (2015) 178:18–8. doi: 10.1136/vr.103336
222. Berteselli, GV, Arena, L, Candeloro, L, Dalla Villa, P, and de Massis, F. Interobserver agreement and sensitivity to climatic conditions in sheltered dogs' welfare evaluation performed with welfare assessment protocol (shelter quality protocol). J Vet Behav. (2019) 29:45–52. doi: 10.1016/j.jveb.2018.09.003
223. Berteselli, GV, Messori, S, Arena, L, Smith, L, Dalla Villa, P, and de Massis, F. Using a Delphi method to estimate the relevance of indicators for the assessment of shelter dog welfare. Anim Welf. (2022) 31:341–53. doi: 10.7120/09627286.31.3.007
224. Cuglovici, DA, and Amaral, PIS. Dog welfare using the shelter quality protocol in long-term shelters in Minas Gerais state, Brazil. J Vet Behav. (2021) 45:60–7. doi: 10.1016/j.jveb.2021.06.004
225. Harvey, ND, Moesta, A, Kappel, S, Wongsaengchan, C, Harris, H, Craigon, PJ, et al. Could greater time spent displaying waking inactivity in the home environment be a marker for a depression-like state in the domestic dog? Animals. (2019) 9:420. doi: 10.3390/ani9070420
226. Raudies, C, Waiblinger, S, and Arhant, C. Characteristics and welfare of long-term shelter dogs. Animals. (2021) 11:194. doi: 10.3390/ani11010194
227. Shaw, N, Wemelsfelder, F, and Riley, LM. Bark to the future: the welfare of domestic dogs during interaction with a positively reinforcing artificial agent. Appl Anim Behav Sci. (2022) 249:105595. doi: 10.1016/j.applanim.2022.105595
228. Stubsjøen, SM, Moe, RO, Bruland, K, Lien, T, and Muri, K. Reliability of observer ratings: qualitative behaviour assessments of shelter dogs using a fixed list of descriptors. Vet Anim Sci. (2020) 10:100145. doi: 10.1016/j.vas.2020.100145
229. Stubsjøen, SM, Moe, RO, Johannessen, C, Larsen, M, Madsen, H, and Muri, K. Can shelter dog observers score behavioural expressions consistently over time? Acta Vet Scand. (2022) 64:35. doi: 10.1186/s13028-022-00654-x
230. Heritier, C, Riemer, S, and Gaschler, R. The power is in the word—do laypeople interpret descriptors of dog emotional states correctly? Animals. (2023) 13:3009. doi: 10.3390/ani13193009
231. Travnik, IC, and Sant’Anna, AC. Do you see the same cat that I see? Relationships between qualitative behaviour assessment and indicators traditionally used to assess temperament in domestic cats. Anim Welf. (2021) 30:211–23. doi: 10.7120/09627286.30.2.211
232. Travnik, IC, Machado, DS, and Sant’Anna, AC. Do you see the same cat that I see? Inter- and intra-observer reliability for qualitative behaviour assessment as temperament indicator in domestic cats. Anim Welf. (2022) 31:319–27. doi: 10.7120/09627286.31.3.004
233. Jarvis, S, Ellis, MA, Turnbull, JF, Rey Planellas, S, and Wemelsfelder, F. Qualitative Behavioral assessment in juvenile farmed Atlantic Salmon (Salmo salar): potential for on-farm welfare assessment. Front Vet Sci. (2021) 8:702783. doi: 10.3389/fvets.2021.702783
234. Stagni, E, Brscic, M, Contiero, B, Kirchner, M, Sequeira, S, and Hartmann, S. Development of a fixed list of terms for qualitative behavioural assessment of brown bear (Ursus arctos) in sanctuaries. Appl Anim Behav Sci. (2022) 246:105523. doi: 10.1016/j.applanim.2021.105523
235. Yon, L, Williams, E, Harvey, ND, and Asher, L. Development of a behavioural welfare assessment tool for routine use with captive elephants. PLoS One. (2019) 14:e0210783. doi: 10.1371/journal.pone.0210783
236. Dobrikj, E, Ilieski, V, Ilievska, K, and Kjosevski, M. Using species-specific protocols for the welfare assessment of elephants in the Skopje zoo. Maced Vet Rev. (2022) 45:201–8. doi: 10.2478/macvetrev-2022-0019
237. Skovlund, CR, Kirchner, MK, Contiero, B, Ellegaard, S, Manteca, X, Stelvig, M, et al. Qualitative behaviour assessment for zoo-housed polar bears (Ursus maritimus): intra- and inter-day consistency and association to other indicators of welfare. Appl Anim Behav Sci. (2023) 263:105942. doi: 10.1016/j.applanim.2023.105942
238. Nogueira, SSC, Macedo, JF, Sant’Anna, AC, Nogueira-Filho, SLG, and Paranhos Da Costa, MJR. Assessment of temperament traits of white-lipped (Tayassu pecari) and collared peccaries (Pecari tajacu) during handling in a farmed environment. Anim Welf. (2015) 24:291–8. doi: 10.7120/09627286.24.3.291
239. Sriwatanakul, K, Kelvie, W, Lasagna, L, Calimlim, JF, Weis, OF, and Mehta, G. Studies with different types of visual analog scales for measurement of pain. Clin Pharmacol Ther. (1983) 34:234–9. doi: 10.1038/clpt.1983.159
Keywords: Qualitative Behaviour Assessment, emotional state, fixed list, welfare assessment, animal welfare, positive emotional state
Citation: Czycholl I, Skovlund CR and Forkman B (2026) Literature review of the use of Qualitative Behaviour Assessment with a fixed list of terms. Front. Vet. Sci. 12:1588346. doi: 10.3389/fvets.2025.1588346
Edited by:
Jen-Yun Chou, University of Saskatchewan, Canada
Reviewed by:
Monica Battini, University of Milan, Italy
Francoise Wemelsfelder, Scotland's Rural College, United Kingdom
Copyright © 2026 Czycholl, Skovlund and Forkman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Irena Czycholl
Björn Forkman