# HUMAN-ANIMAL INTERACTION (HAI) RESEARCH: A DECADE OF PROGRESS

EDITED BY: Peggy D. McCardle, Sandra McCune and James A. Griffin

PUBLISHED IN: Frontiers in Veterinary Science, Frontiers in Psychology, Frontiers in Public Health and Frontiers in Pediatrics

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers.

> The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-601-3 DOI 10.3389/978-2-88963-601-3

### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# HUMAN-ANIMAL INTERACTION (HAI) RESEARCH: A DECADE OF PROGRESS

Topic Editors:

Peggy D. McCardle, Haskins Laboratories, New Haven, CT, United States

Sandra McCune, Waltham Centre for Pet Nutrition, United Kingdom

James A. Griffin, National Institutes of Health (NIH), United States


Citation: McCardle, P. D., McCune, S., Griffin, J. A., eds. (2020). Human-Animal Interaction (HAI) Research: A Decade of Progress. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-601-3


## Editorial: Human-Animal Interaction (HAI) Research: A Decade of Progress

Sandra McCune<sup>1</sup>\*, Peggy McCardle<sup>2,3</sup>, James A. Griffin<sup>4</sup>, Layla Esposito<sup>4</sup>, Karyl Hurley<sup>5</sup>, Regina Bures<sup>4</sup> and Katherine A. Kruger<sup>6</sup>

*<sup>1</sup> Waltham Centre for Pet Nutrition, Melton Mowbray, United Kingdom, <sup>2</sup> PM Consulting LLC, Tarpon Springs, FL, United States, <sup>3</sup> Haskins Laboratories, New Haven, CT, United States, <sup>4</sup> Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), Bethesda, MD, United States, <sup>5</sup> Mars Inc, McLean, VA, United States, <sup>6</sup> School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA, United States*

Keywords: human-animal bond, human-animal interaction (HAI), human-animal relationships, child development, animal assisted intervention, animal assisted activities

**Editorial on the Research Topic**

**Human-Animal Interaction (HAI) Research: A Decade of Progress**

In 2008, the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), one of the 27 Institutes and Centers that make up the U.S. National Institutes of Health (NIH), and the WALTHAM® Centre for Pet Nutrition (WALTHAM®), a division of Mars Inc., entered a public-private partnership (henceforth referred to as "the Partnership") to explore the science of human-animal interaction (HAI), specifically as it relates to children's health and overall development. To take stock of existing research and more fully understand research needs in this area, the partners convened a series of workshops, bringing together researchers and practitioners currently working in HAI as well as individuals with expertise in other relevant fields including ethology, developmental, cognitive, clinical and comparative (animal) psychology, pediatric and veterinary medicine, epidemiology, and public health. Based on the research needs identified, the NICHD established a new research program on HAI and child health and development (1).

Those initial meetings also led to the publication of two edited volumes to disseminate the information from the presentations and the rich discussions that took place. The first volume, Animals in Our Lives (2), provided information about HAI research design and methodology and studies of HAI in child development and human health. The second, How Animals Affect Us (3), had a more applied focus, highlighting HAI in family, community, and therapeutic settings and emphasizing the need for evidence-based practice in this relatively young and rapidly growing research area.

Informed by the workshops, NICHD, joined by the NIH's National Institute of Nursing Research, issued the first in a series of Funding Opportunity Announcements (FOAs) (1), partially funded through a gift to the NICHD from Mars, Incorporated under the Partnership. The first FOAs focused on how children perceive, relate to, and think about animals; how pets in the home impact children's social and emotional development and health (e.g., allergies, asthma, mitigation of obesity); and whether and under what conditions therapeutic use of animals is safe and effective. Seven research grants were funded under the initial FOAs, and between 2008 and 2018 the NICHD issued 8 FOAs and funded a total of 27 research grants in HAI (for current information on funded grants and current FOAs see https://www.nichd.nih.gov/about/org/der/branches/cdbb/programs/psad/HAI).

#### Edited and reviewed by:

*Mary M. Christopher, University of California, Davis, United States*

\*Correspondence: *Sandra McCune drsandramccune@gmail.com*

#### Specialty section:

*This article was submitted to Veterinary Humanities and Social Sciences, a section of the journal Frontiers in Veterinary Science*

Received: *11 December 2019* Accepted: *17 January 2020* Published: *18 February 2020*

#### Citation:

*McCune S, McCardle P, Griffin JA, Esposito L, Hurley K, Bures R and Kruger KA (2020) Editorial: Human-Animal Interaction (HAI) Research: A Decade of Progress. Front. Vet. Sci. 7:44. doi: 10.3389/fvets.2020.00044*

As the research investments began to bear fruit, the Partnership continued to evolve and new research topics emerged involving a wider range of scientific fields. Researchers were documenting the impact of HAI on child development, but little was known about the mechanisms underlying these effects. In 2011, a workshop was held on the social neuroscience of HAI, which formed the basis for another edited volume, The Social Neuroscience of Human-Animal Interaction (4), addressing the basic neurobiological mechanisms that underlie the effects observed in HAI. The Partnership also continued to disseminate information regarding new directions in HAI research (5) as well as how new research findings were relevant to clinical applications such as Animal-Assisted Therapy (6, 7) and HAI in school settings (8).

These Partnership workshops, research solicitations, and publications over the past 10 years have promoted the adoption and use of more rigorous designs and methods, raising the bar for the quality of research in HAI. As a result, the field has broadened the scope of studies undertaken, addressing not only correlational studies examining the association between interactions with pets and the subsequent social-emotional benefits generally, but also potential mechanisms for those quantifiable effects, including intervention studies employing randomized controlled trial designs.

In addition to promoting randomized controlled trials, the NICHD-Mars Partnership also has recognized the need to encourage and support more public health research through population representative studies. These efforts have included the development of survey questions/instruments and their inclusion in ongoing large-scale cross-sectional surveys and longitudinal studies that use population representative samples (5).

Embedding pet ownership questions in these studies informs our knowledge of the prevalence of pets and offers a cost-effective way to collect data on the impact of pets on health and development for children, adults, and families for secondary analyses. Because data from these studies are publicly available, this also increases the research resources for the field of HAI. While cross-sectional data help describe the relationships between pet ownership and health and development, the Partnership's goal is, where possible, to encourage longitudinal data collection on the same individuals to allow for the study of changes in pet ownership over time and concomitant changes in owner physical and mental health status and overall well-being.

The Partnership also has fostered an increased focus on the health benefits of interaction with animals, including therapeutic and rehabilitative interventions for children and adults with intellectual, developmental, physical, and mental health-related disabilities and other disorders. These studies frequently examine the effects of HAI interventions on both the humans and the animals involved, exploring how the animals may benefit but also suffer stress as the result of their role in the intervention. Such studies are needed to identify appropriate selection, training, and monitoring protocols to ensure the well-being of the animal and the optimal conditions for the therapeutic interaction with the person receiving the intervention.

There is ample evidence that the field of HAI has grown and matured over the last decade (9), including the increased number of academic positions in research and education, the development of HAI Centers of expertise (e.g., Tufts's Institute for HAI and Purdue's Center for the Human-Animal Bond), and funding and industry support through non-governmental organizations such as the Human-Animal Bond Research Institute (HABRI) and the Horses and Humans Research Foundation (HHRF). Likewise, journals dedicated to HAI research have expanded beyond Anthrozoös (a peer-reviewed journal launched in 1987) to include the HAI Bulletin in 2013, which resulted from the establishment of the Human-Animal Interaction Section of Division 17 (Society of Counseling Psychology) by the American Psychological Association (APA) in 2012, as well as the creation of a special section devoted to HAI research in the journal Applied Developmental Science in 2016.

In recognition of the success of this decade-long collaborative effort, the Research Topic "Human-Animal Interaction: A Decade of Progress" highlights the research jointly funded by the Partnership. This thematic series of original research papers addresses a variety of topics within HAI research. Several papers address children's experiences and relationships with pets (Hart et al.; Hurley and Oakes; Kertes et al.; Meints et al.), one addresses teaching families to read the body language of dogs (Meints et al.), and an Opinion article highlights the potential health benefits of dog walking even for people who do not own the dog (Chen). There are reports on the social-emotional effects of pet ownership (Jacobson and Chang), and intervention studies of behavioral and biological effects of animal-assisted interventions on stress (Pendry et al.), including stress-reduction and adaptive behaviors in children with Autism Spectrum Disorder (ASD) (Gabriels et al.; Pan et al.), Attention Deficit Hyperactivity Disorder (ADHD) (Schuck et al.), and incarcerated youth (Syzmanski et al.). Three papers address new measures and methods (Guérin et al.; MacLean and Hare; Bures et al.). The final paper, by members of the NICHD-Mars Partnership (Griffin et al.), addresses what will be necessary to sustain and accelerate progress in HAI research over the coming decade. We note that this Research Topic does not reflect all of the research of all the investigators funded under the Partnership, as many have already published their findings and others are still in the data collection phase of their projects.

### AUTHOR CONTRIBUTIONS

All co-authors contributed to the design and writing of the paper. SM, PM, and JG served as Editors for the Research Topic. LE, KH, RB, and KK were not Topic Editors of this Research Topic but co-wrote the Editorial.

### ACKNOWLEDGMENTS

We wish to acknowledge some of the many individuals who supported us in establishing the Public-Private Partnership and developing the field of human-animal interaction research: from Mars, Inc., John Lunde, Megan Sibole Rottcher, Harold Schmitz, Cathie Wotecki, Susan Blount, and Kay O'Donnell; from the NIH, Duane Alexander, Lisa Freund, Alan Guttmacher, Yvonne Maddox, and Valerie Maholmes.

## REFERENCES


**Disclaimer:** The views expressed in this chapter are those of the authors and do not necessarily represent those of the National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development, or the U.S. Department of Health and Human Services.

**Conflict of Interest:** SM and KH are employed by Mars Incorporated. KK was supported by a grant from Mars Inc. PM was employed by PM Consulting and as a science and ethics consultant by Mars Inc.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 McCune, McCardle, Griffin, Esposito, Hurley, Bures and Kruger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Infants' Daily Experience With Pets and Their Scanning of Animal Faces

Karinna Hurley<sup>1,2</sup> and Lisa M. Oakes<sup>1,3</sup>\*

<sup>1</sup> Center for Mind and Brain, University of California, Davis, Davis, CA, United States, <sup>2</sup> Human Development, University of California, Davis, Davis, CA, United States, <sup>3</sup> Department of Psychology, University of California, Davis, Davis, CA, United States

Very little is known about the effect of pet experience on cognitive development in infancy. In Experiment 1, we document in a large sample (N = 1270) that 63% of families with infants under 12 months have at least one household pet. The potential effect on development is significant as the first postnatal year is a critically important time for changes in the brain and cognition. Because research has revealed how experience shapes early development, it is likely that the presence of a companion dog or cat in the home influences infants' development. In Experiment 2, we assess differences between infants who do and do not have pets (N = 171) in one aspect of cognitive development: their processing of animal faces. We examined visual exploration of images of dog, cat, monkey, and sheep faces by 4-, 6-, and 10-month-old infants. Although at the youngest ages infants with and without pets exhibited the same patterns of visual inspection of these animals' faces, by 10 months infants with pets spent proportionately more time looking at the region of faces that contained the eyes than did infants without pets. Thus, exposure to pets contributes to how infants look at and learn about animal faces.

#### Edited by:

Peggy D. McCardle, Consultant, New Haven, CT, United States

#### Reviewed by:

Federica Pirrone, Università degli Studi di Milano, Italy Mitsuaki Ohta, Tokyo University of Agriculture, Japan

> \*Correspondence: Lisa M. Oakes lmoakes@ucdavis.edu

#### Specialty section:

This article was submitted to Veterinary Humanities and Social Sciences, a section of the journal Frontiers in Veterinary Science

> Received: 30 April 2018 Accepted: 18 June 2018 Published: 10 July 2018

#### Citation:

Hurley K and Oakes LM (2018) Infants' Daily Experience With Pets and Their Scanning of Animal Faces. Front. Vet. Sci. 5:152. doi: 10.3389/fvets.2018.00152

Keywords: infant development, pets, experience, cognitive development, human-animal interaction, face processing

### INTRODUCTION

Many families with children have pets (1–3), and there has been significant interest in the connection between experience with animals and development in childhood (4–8). However, few studies have considered the impact of exposure to pets on very young infants (9). Instead, the vast majority of work on how exposure to animals influences development has focused on older children and, often, on therapeutic settings (4, 10, 11). The lack of work on the period of infancy is surprising because it is a developmental period profoundly influenced by experience. For example, experience with particular sounds, faces, and objects contributes to infants' rapidly developing abilities in language (12), facial perception (13), and categorization (14). Why has the effect of pets on infants' development been so neglected? One possibility is that, because households without children often have high levels of pet ownership (15, 16), people assume that most families with infants are unlikely to have pets, and thus that there are few opportunities for infant development to be shaped by pets. Another possibility is that research on the effect of pet experience on development has not focused on typical cognitive development, as have the examples given above for the effects of experience on language, face perception, and categorization.

Here we address both of these possibilities. First, we present data on the prevalence of pets in homes with infants between 4 and 12 months of age. These data provide an important context for why researchers should focus on the influence of pets on development during this age range. To preview our findings, we observe that families with infants have companion dogs and cats at similar rates as have been reported for families with older children (17). Thus, there is no reason to assume that infants have less exposure to pets than do children at other developmental stages.

Next, we examine the effect of pet exposure on one aspect of typical cognitive development in infancy: infants' learning about animal faces. Thus, our work will fit in the context of findings that infants' developing face processing is related to their experience with faces of a particular gender, race, or species. For example, infants have a processing advantage for female faces (18–20), perhaps because most infants have female primary caregivers (21), and therefore, in general, have more experience with female faces. By 3 months infants show preferences for own-race faces over those from unfamiliar races (22–24), presumably reflecting, at least in part, their daily experience with faces of a particular (parents') race. In addition, experience shapes the development of infants' face processing. Although 3-month-old infants discriminate between individual faces from both their own (parents') racial group as well as from other less familiar racial groups, 9-month-old infants discriminate faces only from their own racial group (25). Similarly, whereas 6-month-old infants discriminate both individual human and monkey faces, 9-month-old infants discriminate only individual human faces and are unable to discriminate between individual monkey faces (26).

We extend this work to examine the effect of daily exposure to companion dogs and cats on infants' developing processing of animal faces. Providing infants with daily experience with monkey faces between 6 and 9 months helped them maintain the ability to discriminate monkey faces (27, 28), and this effect is particularly robust when that daily experience with each animal emphasized animals as individuals [i.e., looking at pictures of named individuals (28)]. Exposure to a pet in the home, which emphasizes that pet as an individual (i.e., pets are named, they are talked to), may influence infants' perceptual processing of face stimuli similar to that pet. Thus, our results will allow us to generalize the effect of this artificial experimental manipulation to a naturalistic difference that occurs in infants' daily life. Family pets have the potential to have a profound effect on infants' development. Not only do infants with and without pets differ in their amount of exposure to animals, their experience with pets likely differs in other ways given the interactive social nature of domestic animals (29–33) and the fact that pets commonly are considered family members (34–38).

The work presented here builds on previous findings demonstrating that infants who live with indoor pets perceive and learn about images of dogs and cats in the lab differently than infants who do not live with indoor pets (39–43). For example, Kovack-Lesh et al. (41, 43) found that 4-month-old infants with pets responded differently in a categorization task than did infants without pets, at least if they engaged in high levels of looking back-and-forth between the two images during familiarization. Thus, observed differences in infants' responding during test trials appear to have been a function of their past experience. Other work points to differences in how infants actually approach stimuli as a function of their pet experience. Hurley et al. (39) observed that 6-month-old infants with pet experience engaged in more looking and comparison when viewing images of animals than did infants without pet experience, consistent with other findings that infants are more interested in stimuli relevant to their past experience (19, 44). Examinations of eye-movements of 4-month-old infants as they inspected individual images of cats and dogs revealed that infants with pets looked more at the informationally-rich head regions than did infants without pets (40, 42). Thus, experience with dogs and cats in the home appears to have translated into differences in attentional biases when infants processed images similar to that experience. Hurley and Oakes (40) further showed that infants with and without pets did not differ in their visual inspection of human faces and vehicles, suggesting that the effect of such animal experience was specific to images of animals that were similar to the animals common in the everyday experience of infants with pets.

The current work addressed several important unanswered questions. First, none of the previous studies of pet experience examined age-related changes in the effect of pet experience on infants' visual processing of animals. All of the existing work in this area has examined the relation between pet experience and visual processing of animal images in infants at a single age (39–43). We predict from the work on infants' processing of human faces, however, that pet experience will differentially influence how younger and older infants visually process images of animals. Presumably this is both because older infants have more experience than younger infants (and that experience has had more time to influence processing) and because the effect of experience on development at early ages has a cascading effect on later developing skills and abilities. As described in more detail in the General Discussion section, we assume that daily experience with a pet helps to shape the attentional strategies infants adopt when looking at animal images. Thus, we anticipate that there will be differences in how older infants visually explore or scan images of animal faces as a function of pet experience, although there may be few, if any, differences in how younger infants visually explore or scan animal faces. We tested this prediction by observing separate groups of 4-, 6-, and 10-month-old infants' looking at animal faces. These are key ages in the work on changes in infants' processing of human faces.

A second question we addressed in this investigation is whether the effect of experience would be observed for infants' processing of animal faces. All the previous investigations of pet experience on infants' processing of dog and cat images have used representations of whole animals as stimuli. Although this work has shown that infants with pets have a stronger bias to look at the head and face region of these images (40, 42), we do not know whether differences will be observed for how infants process the faces of animals. Previous work suggests head regions are especially informative for infants' processing of animal images (45, 46), and Mareschal et al. (47) established that infants are sensitive to variations in the facial features of cats and dogs. Moreover, if the effects of experience on infants' processing of human faces reflect general processes, then we should see similar effects for the effect of experience on infants' processing of nonhuman faces. For these reasons, in the present investigation we presented infants with images of novel animal faces.

Finally, we asked how infants' pet experience extended to their processing of different types of animals. The previous work has focused on infants' visual processing of images of dogs and cats (39–43). Although Hurley and Oakes (40) showed that the effect of pet experience did not extend to images of human faces and vehicles, we do not know how infants' pet experience influences their processing of other kinds of animals. Therefore, we presented infants with images of dogs and cats, which are likely highly familiar to infants (particularly infants who have pets at home), and images of monkeys and sheep, which are likely relatively unfamiliar to infants.

### EXPERIMENT 1

The goal of Experiment 1 was to document the prevalence of pet ownership in families with infants between 4 and 12 months of age. This Experiment will demonstrate that many infants are exposed to companion dogs and cats in their daily lives, and that there are significant opportunities for pets in the home to shape development in infancy.

Both Experiments 1 and 2 were carried out in accordance with the recommendations of the Institutional Review Board of the University of California, Davis. The protocols were approved by the Institutional Review Board of the University of California, Davis. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

### Methods

### Participants

Between 12/13/2007 and 12/12/2014, we asked the parents of 1,270 infants between 106 and 320 days of age (M = 176.70, SD = 61.13) who were visiting our lab about the companion dogs and cats living in their homes. There were 648 boys and 622 girls. Infants were healthy, typically developing, full-term infants recruited from the greater Sacramento Valley region of Central California.

Names of potential participants were initially obtained from the State Vital Records office. All parents who lived within a ∼30-min drive from the lab were sent informational packets describing our work and a general invitation to participate in studies, and parents who were interested in volunteering contacted us. Infants were recruited for this investigation solely based on age, and any infant in our pool who was born full term and who was healthy and typically developing was recruited to participate in this study via phone call or e-mail (depending on parental stated preference when they volunteered). Parents and infants received a certificate and a t-shirt, toy, or book as a thank-you for participation.

The parents of our sample were highly educated. Of the 1,256 mothers who reported their education, all but 17 completed high school, all but 70 had at least some college, and 847 had earned at least a bachelor's degree. Of the 1,196 parents who reported their infants' race, 852 reported their infant to be White, 36 reported their infant to be Black, 56 reported their infant to be Asian, 232 reported their infants to be of mixed race, and 20 reported their infants to be Native Hawaiian, American Indian, or other. Of the 1,196 parents who reported it, 323 indicated that their infant was Hispanic (165 White, 69 mixed race, 66 with no race reported, and the remaining were Black, Asian, Native Hawaiian, American Indian, or other race). Thus, our infants represented the diversity of the community.

### Procedure

When infants came to our lab to participate in a study of infant cognition, parents completed a questionnaire about family demographics (see Appendix). In this questionnaire, parents reported infant birthdate, due date, sex, race, and mother's education and the age of any older siblings. In addition, they reported on their infants' pet experiences by replying verbally to the following question: "Do you have pets?" If the answer was yes they were asked about the number and type as well as whether the pet(s) lived indoors with the family.

### Results and Discussion

To evaluate whether observed proportions differed from chance, we computed the binomial probability of observing the obtained number of occurrences or more given the sample size. We compared the difference in proportions between two groups (e.g., infants with and without siblings, Hispanic vs. non-Hispanic families) with z-ratios for the difference between two independent proportions. We used two-tailed tests to evaluate these z-ratios. All binomial and z-tests were conducted using vassarstats.net. We compared group means on continuous variables (e.g., age, maternal education) using two-tailed t-tests for independent groups, performed using IBM SPSS Statistics for Mac (Armonk, NY: IBM Corp). Our critical p-value for significance was 0.05, except as noted to correct for multiple comparisons.

Of the 1,270 parents who completed our questionnaire, 804 (63.31%) reported having a pet dog or pet cat (or both), a proportion that was significantly different from chance, binomial probability (804 or more out of 1270), p < 0.001 (see **Table 1**). Of the 1,253 families who reported whether or not their pet lived indoors, 696 (55.55% of the sample) reported having an indoor pet, a proportion that was significantly different from chance, binomial probability (696 or more out of 1253), p < 0.001. The numbers of families who had dogs and cats or both are presented in **Table 1**. Clearly, in our sample more families had dogs than cats; 387 (62%) of the 626 families who had only dogs or cats had only dogs, a proportion that is significantly different from chance, p < 0.001. Thus, most of the families in our sample had one or more pets, and more than half of the infants in our sample had daily exposure to a pet in the home.
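The binomial probabilities reported above can be approximated with a short calculation. The sketch below is not the vassarstats.net implementation. It assumes that "chance" means a probability of 0.5 per family (the text does not state the null value), and the function name `binomial_upper_tail` is hypothetical. It computes the exact probability of observing k or more occurrences out of n.

```python
from math import comb


def binomial_upper_tail(k: int, n: int) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, 0.5).

    Assumption: the null ("chance") probability is 0.5.
    Integer arithmetic is used so that large n does not overflow a float.
    """
    favorable = sum(comb(n, i) for i in range(k, n + 1))
    return favorable / 2 ** n


# Counts taken from the paragraph above.
for label, k, n in [
    ("dog or cat", 804, 1270),
    ("indoor pet", 696, 1253),
    ("only dogs among single-species homes", 387, 626),
]:
    p = binomial_upper_tail(k, n)
    print(f"{label}: {k}/{n} = {k / n:.2%}, upper-tail p = {p:.3g}")
```

Under the 0.5 assumption, all three upper-tail probabilities fall well below 0.001, which is consistent with the values reported in the text.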

To gain a clearer understanding of the frequency and type of pet ownership in this group, we provide in **Table 1** demographics for families who had any dog or cat (indoor or outdoor) and families without any pets. As is clear from this table, the two groups of infants looked very similar; they both had approximately equal proportions of boys and girls and the average age of the samples did not differ. For the infants who had information about siblings reported, the proportion of infants without pets who had siblings was significantly greater than the proportion of infants with pets who had siblings, z = 2.23, p = 0.03.

Overall, maternal education did not differ for families who had pets compared to families who did not have pets, t(1254) = 1.25, p = 0.21, d = 0.07. Mother's education was higher for families who only had cats than families who had no pets, t(701) = 3.71, p < 0.001, d = 0.30, families who had only dogs, t(617) = 4.23, p < 0.001, d = 0.35, and families who had both dogs and cats, t(410) = 2.94, p = 0.003, d = 0.29. Mother's education did not differ between families without pets and families who had only dogs, t(842) = 0.63, p = 0.53, d = 0.04, or who had both dogs and cats, t(635) = 0.21, p = 0.84, d = 0.02. Similarly, maternal education did not differ between families who had only dogs and families who had both dogs and cats, t(551) = 0.70, p = 0.48, d = 0.06. We also evaluated these differences for families who reported having indoor pets, and the patterns were identical.

Next we examined how pet ownership varied according to infant race, which is a proxy for the family race (in our sample, all infants have the same race as their parents; if the parents are of different races, the infant is reported as mixed race). For the present purposes we divided the infants into three groups according to reported race: White and not Hispanic, infants who were rated as neither White nor Hispanic, and infants who were reported as Hispanic regardless of race. The proportion of families in each of these groups that had pets is presented in **Table 2**.

In our sample, the proportion of White/non-Hispanic families with pets was greater than the proportion of non-White/non-Hispanic families with pets, z = 5.67, p < 0.001, and than the proportion of Hispanic families with pets, z = 2.96, p = 0.003. More Hispanic families had pets than did non-White/non-Hispanic families, z = 2.53, p = 0.01. This is not due to the fact that most Hispanic families were White; 165 (51%) of the infants who were reported to be Hispanic were also reported to be White. In addition, 95 (58%) of the White/Hispanic families had pets and 100 (63%) of the non-White/Hispanic families had pets. Thus, the differences appear to reflect a lower rate of pet ownership by families who are non-White and non-Hispanic. However, this finding would need to be replicated in a larger, more representative sample before strong conclusions could be drawn about racial differences in pet ownership by families with infants.
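The z-ratio for the difference between two independent proportions can be sketched as follows. This is an illustration, not the vassarstats.net code: it assumes a pooled standard error, and the function name `two_proportion_z` is hypothetical. The example uses the White/Hispanic and non-White/Hispanic counts from the paragraph above; the denominator of 158 is derived here as 323 minus 165 and is not stated in the text.

```python
from math import erfc, sqrt


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """z-ratio and two-tailed p for two independent proportions.

    Assumption: the standard error uses the pooled proportion.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_tailed = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_two_tailed


# White/Hispanic: 95 of 165 families had pets (from the text).
# Non-White/Hispanic: 100 of 158; 158 = 323 - 165 is derived, not reported.
z, p = two_proportion_z(95, 165, 100, 158)
print(f"z = {z:.2f}, two-tailed p = {p:.2f}")
```

Under these assumptions the two Hispanic subgroups do not differ reliably, which fits the authors' point that the pattern is not driven by the White/Hispanic families.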

What is clear from these data is that many infants have opportunities to learn from household pets, and that this is a naturally occurring difference in experience that could yield different developmental outcomes. Interestingly, infants were not more likely to have a pet and a sibling; more families in our sample with pets had only one child. In addition, although there were no overall differences in maternal education as a function of the presence of a pet, maternal education was higher for families who had only cats than for any other group. These data are the first to our knowledge to describe aspects of the home context of infants under 1 year who do and do not live with pets.

### EXPERIMENT 2

Experiment 1 revealed that ∼63% of the infants between the ages of 4 and 12 months living in our region have daily experience with pets. In Experiment 2, we asked how infants with this experience differed from infants without such experience in their processing of animal faces. Importantly, we examined the effect of animal experience across age, allowing us to determine whether infants with and without pets differed at all points in development, or whether the effect of such experience changed across this developmental period.

### Methods

#### Participants

The final sample included a total of 171 healthy, full-term infants with no known vision problem: 52 infants were 4 months old (M = 125.02 days, SD = 7.46 days; 24 girls and 28 boys), 57 infants were 6 months old (M = 184.91 days, SD = 7.92 days; 27 girls and 30 boys), and 62 infants were 10 months old (M = 304.11 days, SD = 7.89 days; 26 girls and 36 boys). The same self-report questionnaire was used here as in Experiment 1, probing the presence of pets and whether they lived indoors. These parental reports revealed that in our sample 64 infants did not have an indoor pet, 35 had only a cat or cats that lived indoors with the family, 52 had only a dog or dogs that lived indoors with the family, and 20 had both a cat or cats and a dog or dogs that lived indoors with the family; thus the proportion of infants in our sample with pets (63%) was similar to that in Experiment 1. Infants were recruited as described in Experiment 1. We tested 28 additional infants, but excluded their data from the final analyses due to fussiness or inattention (N = 8), equipment or experimenter error (N = 8), ambiguous pet status (i.e., an infant who had a dog for several months and then did not) (N = 1), or failure to provide useable data on the minimum number of trials (N = 11, see Data Processing section below).

In the final sample of 171 infants, 116 infants were reported to be White. The remaining infants were reported to be Asian (N = 4), Black or African American (N = 2), mixed race (N = 38), or other (N = 2); 9 parents did not report the race of their infant. Thirty-seven infants were reported to be Hispanic; of these infants 17 infants were White, 11 infants were mixed race, and 9 infants did not have their race reported. The sample was highly educated; of the 167 mothers who reported their educational background, all but one mother had completed high school, 47 had completed at least some college, and 113 had earned at least a bachelor's degree. Thus, the sample was demographically similar to that in Experiment 1.

#### Stimuli

Stimuli were digitized photographs of 12 different faces from each of four animal categories: cat, dog, monkey, and sheep (see **Figure 1**). Using these four types of images allowed us to compare infants' responding to both relatively familiar and relatively unfamiliar animal faces. Specifically, we selected cats and dogs because they are relatively familiar to infants (even infants who do not have a dog or cat as a pet at home likely see one or both types of animals at the homes of friends and relatives, in the park, walking in their neighborhood, etc.), and we selected sheep and monkeys because they are relatively unfamiliar to infants. Thus, these four face types allow us to determine how general any effect of pet experience is on infants' face scanning: whether it extends only to familiar cats and dogs or even to unfamiliar sheep and monkeys. We selected monkeys and sheep specifically because they varied configurally, with the monkey faces being configurally more similar to cat faces (e.g., relatively large eyes, small noses) and sheep faces being configurally more similar to dog faces (e.g., smaller eyes at the top of the face, prominent snout with larger nose at the bottom of the face). Thus, this allows us to determine if the effect of pet experience extends more to some configurations than to others. Finally, we selected sheep and monkey faces because both species had been used in facial discrimination studies and thus good quality stimulus sets already existed.

Sheep faces came from a photograph stimulus set previously used to study facial discrimination in infants and adults (48) as well as in sheep (49). Monkey faces came from a photograph stimulus set used to understand facial discrimination in monkeys (50, 51). Cat and dog photographs were gathered from breed books and cropped to match in size in Adobe Photoshop. All faces were front-oriented, symmetrical, and similar in breed (dogs were either Golden Labradors or Golden Retrievers), species (all monkeys were tufted capuchins), or coloring and marking (all sheep were white and all cats were brown tabbies). Using Adobe Photoshop, an oval mask was overlaid on the images to make the external contours of the images identical within face type and similar across faces, similar to the mask used in Chien et al. (52). Thus, differences in infants' looking or scanning would not reflect differences in face shape, protrusion of ears, etc., but rather would primarily reflect differences in internal features, such as the prominence of the nose, the top-heaviness, etc. Due to differences in the overall shape of the different faces we created two masks: one mask for the dog and cat faces and another for the sheep and monkey faces. The mask covered the ears of all animals (see **Figure 1**). Images were ∼38 cm × 25 cm in size, subtending ∼21.5 by 14.25 degrees visual angle.

#### Apparatus

A Dell computer was used to present the stimuli and control the experiment. Stimulus images were presented side-by-side in the center of a 37-inch LCD TV monitor (19:9 aspect ratio), and subtended ∼21.5 by 14.25° visual angle at a viewing distance of 100 cm. Eye gaze was recorded using an Applied Science Laboratory (ASL) pan/tilt R6 eye-tracker controlled by a second Dell computer. An eye-camera located at the bottom and center of the monitor focused on the infant's right eye; using the image from this camera, the eye-tracker calculated the location of infants' fixations from the reflections of an infrared light source off the cornea and pupil. A wide-angle camera was affixed to the eye-camera to provide an image of infants' heads and torsos. A sensor, attached to an infant-sized headband, was positioned above the right eye and was used to locate the infant's head in a magnetic field produced by a generator located directly behind the parents' chair. This position was communicated to the eye-tracker, which, if necessary, was used to adjust the camera to refocus on the infant's eye (e.g., if the infant looked at the parent and then back at the screen). A white curtain separated the infants from the observers and equipment.
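The relation between stimulus size, viewing distance, and visual angle can be checked with the standard formula, angle = 2 · atan(size / (2 · distance)). The sketch below is illustrative only: the function name `visual_angle_deg` is hypothetical, and it treats each image as if it were centered on the line of sight, which is a simplification for images presented side by side.

```python
from math import atan, degrees


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Full visual angle subtended by an object centered on the line of sight."""
    return degrees(2 * atan(size_cm / (2 * distance_cm)))


# Image dimensions from the Stimuli section, viewing distance from the Apparatus section.
width_deg = visual_angle_deg(38, 100)   # image extent of ~38 cm
height_deg = visual_angle_deg(25, 100)  # image extent of ~25 cm
print(f"{width_deg:.1f} by {height_deg:.2f} degrees")
```

With these inputs the formula gives values that agree with the reported ∼21.5 by 14.25 degrees.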

#### Procedure

Infants sat on their parents' lap in a dimly lit room ∼100 cm from the monitor and ∼75 cm from the eye-camera. Parents wore occluding glasses in order to reduce any bias their reaction to the stimuli could have on infants' looking. Sessions began with a five-point calibration protocol in which a looming circle was presented at each point: (1) 11.5° above and to the left of the central fixation point, (2) 11.5° above and to the right of the central fixation point, (3) at the central fixation point, (4) 11.5° below and to the left of the central fixation point, and (5) 11.5° below and to the right of the central fixation point. As infants fixated at each point an experimenter pressed a key on the computer to record the relative locations of the corneal and pupil reflections when the infant was fixating on that known location. This information was used to calculate the point-of-gaze (POG) for each data sample during the experiment.

Immediately after calibration the experimenter initiated the experimental paradigm. Each trial began with a geometric colored shape (e.g., a purple diamond, green triangle, yellow star) presented at the central fixation point; the shape continuously loomed for 800 ms (from 0° × 0° visual angle to ∼16° × 16° visual angle) and was accompanied by a randomly selected sound (e.g., buzz, beep, ding). When infants fixated this stimulus (as indicated by cross-hairs superimposed on the stimulus by the ASL eye-tracking system) the experimenter pressed a computer key to initiate the start of an experimental trial.

The experimental trials were 5 s in duration, and on each trial a pair of images from the same category was presented (e.g., two dogs, two cats, two sheep, or two monkeys). We presented two images on each trial; the center-to-center distance was 22° (the center of each image was ∼11.5° to the left or right of midline). Each trial was initiated when infants looked at an attention getter at the center of the monitor; thus when the stimuli were first presented, infants were fixating the center of the monitor and they had to move their eyes from fixation to look at either image. A bias to look at a particular region (e.g., eyes, nose) therefore could not reflect infants simply maintaining fixation in the location where the stimulus happened to be presented; rather, any observed bias will reflect infants' selecting that region and maintaining their attention to it.

We created a custom program in Adobe Director to control stimulus presentation and randomly order image pairs in blocks of four trials. Each block contained one trial with a pair of dogs, one trial with a pair of cats, one trial with a pair of sheep, and one trial with a pair of monkeys. Thus, infants saw a pair of images from each animal category in each block of four trials. On each trial, a randomly selected clip of classical music (Bach, Beethoven, Mozart, Pachelbel, Vivaldi, or Ravel) was played to aid in keeping infants' attention.
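The blocked randomization described above can be sketched in a few lines. This is not the authors' Adobe Director program. It assumes that each unordered pair of the 12 faces within a category is shown at most once, which the text does not state; the names `build_trial_sequence`, `CATEGORIES`, and `FACES_PER_CATEGORY` are hypothetical. Under that assumption there are 66 pairs per category, so the sequence has 264 trials, which matches the maximum number of trials reported in this section.

```python
import random
from itertools import combinations

CATEGORIES = ["cat", "dog", "monkey", "sheep"]
FACES_PER_CATEGORY = 12


def build_trial_sequence(seed=None):
    """Return a list of (category, (face_a, face_b)) trials in blocks of four.

    Assumption: every unordered pair of faces in a category is used at most
    once, in shuffled order; the source does not say how pairs were chosen.
    """
    rng = random.Random(seed)
    pairs = {}
    for category in CATEGORIES:
        category_pairs = list(combinations(range(FACES_PER_CATEGORY), 2))
        rng.shuffle(category_pairs)
        pairs[category] = category_pairs

    n_blocks = len(pairs[CATEGORIES[0]])  # 66 unordered pairs per category
    trials = []
    for block in range(n_blocks):
        order = CATEGORIES[:]
        rng.shuffle(order)  # random category order within each block
        for category in order:
            trials.append((category, pairs[category][block]))
    return trials


trials = build_trial_sequence(seed=1)
print(len(trials))  # 264
print(trials[:4])   # first block: one trial per category
```

Shuffling the category order inside each block, rather than shuffling the whole sequence, is what guarantees one trial per category in every block of four.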

If the infant became uninterested in general and looked away from the monitor, the experimenter could present one of several stimuli to recapture their attention. These stimuli included sequences of randomly chosen clips of children's television shows (Teletubbies, Blues Clues, Sesame Street), a cartoon of animated animals singing, a series of pictures of babies accompanied by classical music, and the calibration stimuli. Key commands in the computer program were used to present the stimuli and allowed the experimenter to present any of the attention-getting stimuli between trials if infants' attention needed to be redirected to the center of the screen. There were a maximum of 264 experimental trials, and trials were presented until infants showed signs of disinterest in looking at the screen (e.g., fussing, looking at the parent, refusing to look at the screen).

FIGURE 2 | An example of one possible pair of stimuli presented on a single (cat) trial. To illustrate how we evaluated infants' looking times, Areas of Interest (AOIs) corresponding to the top and bottom halves of the faces are superimposed on the images.

### Results

#### Data Processing

Data processing was similar to that reported in Hurley et al. (40). The point-of-gaze data were recorded at a rate of 60 Hz, using an online average of 4 samples (the current sample and the 3 previous samples) to minimize noise in the data. In addition, a blink filter was implemented in which pupil loss of fewer than 12 samples was considered a blink. The horizontal and vertical position of the gaze was recorded at each sample with a code to indicate which type of stimulus was presented on each trial. Data were first processed using the software program ASL Results to parse the datastream into trials. Next, we used a custom Matlab routine to determine how many samples fell into prespecified Areas of Interest (AOIs). We evaluated infants' looking in four AOIs: the top and bottom halves of each of the two stimuli presented side-by-side (see **Figure 2**). This approach allowed us to have the same AOIs across faces and species, while having one AOI contain the eye region, known to be important for face processing (53). This approach of dividing the face into upper and lower halves has been used in other studies of face processing (54).
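The two processing steps described above, the 4-sample running average and the assignment of gaze samples to top-half and bottom-half AOIs, can be sketched as follows. This is a Python illustration and not the ASL or Matlab code used in the study. The function names, the coordinate convention (y increasing downward), and the handling of the first three samples are all assumptions.

```python
def running_average(samples, window=4):
    """Average each sample with up to (window - 1) preceding samples.

    Assumption: the first few samples are averaged over however many
    samples are available so far.
    """
    smoothed = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        chunk = samples[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def count_aoi_samples(gaze, boxes):
    """Count gaze samples in the top and bottom half of each image box.

    gaze:  iterable of (x, y) points, with y increasing downward (assumed).
    boxes: dict mapping image name -> (left, top, right, bottom).
    """
    counts = {(name, half): 0 for name in boxes for half in ("top", "bottom")}
    for x, y in gaze:
        for name, (left, top, right, bottom) in boxes.items():
            if left <= x < right and top <= y < bottom:
                half = "top" if y < (top + bottom) / 2 else "bottom"
                counts[(name, half)] += 1
                break  # a sample can fall in at most one image
    return counts


# Hypothetical screen layout and gaze samples, for illustration only.
boxes = {"left": (0, 0, 10, 10), "right": (20, 0, 30, 10)}
gaze = [(5, 2), (5, 7), (25, 4), (15, 5), (25, 9)]
print(running_average([0, 4, 8, 12, 16]))  # [0.0, 2.0, 4.0, 6.0, 10.0]
print(count_aoi_samples(gaze, boxes))
```

Samples that fall outside both images, such as the point between the two boxes in the example, are simply not counted toward any AOI.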

The number of samples in each AOI was converted to duration for analysis. We included in the analyses any trial in which at least 200 ms of looking was recorded; across all infants at all ages the analyses are based on an average of 28.49 trials per infant (Range = 4–105, SD = 14.85). All infants who contributed at least one trial of each type (cat, dog, monkey, and sheep) that met this criterion were included in the final analyses (as described in the Participants section, 11 infants failed to meet this criterion).
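The conversion from sample counts to looking time, and the 200 ms inclusion criterion, amount to simple arithmetic at a 60 Hz sampling rate. The sketch below is illustrative; the function names are hypothetical, and it assumes the criterion applies to looking summed across all AOIs in a trial, which the text does not spell out.

```python
SAMPLE_RATE_HZ = 60
MIN_LOOKING_MS = 200


def samples_to_ms(n_samples: int, rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Convert a count of eye-tracker samples to milliseconds."""
    return n_samples * 1000 / rate_hz


def usable_trials(aoi_sample_counts):
    """Keep trials whose total AOI looking time is at least 200 ms.

    aoi_sample_counts: one list of per-AOI sample counts per trial.
    Assumption: looking is summed over all AOIs before applying the cutoff.
    """
    return [
        counts for counts in aoi_sample_counts
        if samples_to_ms(sum(counts)) >= MIN_LOOKING_MS
    ]


print(samples_to_ms(12))  # 200.0
```

At 60 Hz, the 200 ms cutoff corresponds to 12 samples.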

#### Analysis Plan

We tested our hypotheses by examining differences in infants' preferences for the top half of the faces. To examine how infants' scanning of these faces varied by age and pet status we calculated infants' preference for the top half of each type of face. If infants focus more on the eye region of our faces, as is typical when young infants scan human faces (53), we should see a strong preference for the top half of the faces. If infants scan more broadly, a pattern exhibited by older infants when viewing human faces (55), we should see a weaker top-half preference. To evaluate any effect of age, pet experience, or type of face on infants' top half preference, we conducted Analyses of Variance (ANOVAs) comparing the top half preferences. We conducted follow up comparisons for any significant effects using t-tests, adjusting our criterion of significance to control for multiple comparisons. We also examined infants' preference for the top half by comparing their preferences to chance using one-sample t-tests and Bayes Factors.

#### Analyses

We calculated preference for the top half of faces by dividing the looking to the top half of the face by the looking to the top and bottom half combined. We use infants' median top half preference across trials because the median is less influenced than the mean by extreme values. We entered each infant's median top half preference for each stimulus type (dog, cat, sheep, monkey) into an ANOVA with pet group and age as the between-subjects variables. This analysis revealed a main effect of trial type, F(3, 495) = 14.64, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.08, and a trial type by age group interaction, F(6, 495) = 2.38, p = 0.03, η<sub>p</sub><sup>2</sup> = 0.03. The means for this interaction are provided in **Figure 3**. It can be seen in this figure that, overall, there were few age differences in infants' preference for the top halves of cat, monkey, or sheep faces, but that in general older infants had a weaker preference for the top half of dog faces than did the younger infants.
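The top-half preference score and its per-infant median can be written out directly. The sketch below is a minimal illustration with hypothetical function names and made-up looking times; it is not the authors' analysis code.

```python
from statistics import median


def top_half_preference(top_ms: float, bottom_ms: float) -> float:
    """Proportion of looking directed to the top half (0.5 = no preference)."""
    return top_ms / (top_ms + bottom_ms)


def median_preference(trials):
    """Median top-half preference over (top_ms, bottom_ms) trials."""
    return median(top_half_preference(top, bottom) for top, bottom in trials)


# Hypothetical looking times (ms) for one infant's cat trials.
cat_trials = [(900, 300), (500, 500), (1200, 400), (100, 1900)]
print(median_preference(cat_trials))
```

In this made-up example one trial is dominated by looking at the bottom half; the median moves less than the mean would in response to that trial, which is the rationale the authors give for using it.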

To better understand the age by trial type interaction and how infants' preferences for the top halves of the different face types varied by age, we conducted separate ANOVAs on each age group. The analyses of the top half preferences by 4-month-old infants revealed only a main effect of trial type, F(3, 150) = 7.15, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.13. As is evident in **Figure 3**, 4-month-old infants (open bars) had a weaker top half preference for sheep faces than for the other faces. We confirmed this impression by comparing the mean preference scores for each of the face types, using p ≤ 0.008 as our cut-off for significance to control for multiple comparisons. The preference for top halves of sheep faces was significantly lower than that of cat faces, t(51) = 4.51, p < 0.001, d = 0.63, or monkey faces, t(51) = 3.34, p = 0.002, d = 0.46, and the difference between sheep faces and dog faces was marginal, t(51) = 2.68, p = 0.01, d = 0.37. To provide further insight into infants' top half preferences, we compared each preference score to chance (0.50); these comparisons would confirm whether infants looked at the top half of any of the faces more than expected by chance. These 4-month-old infants significantly preferred the top halves of cat faces, t(51) = 5.23, p < 0.001, d = 0.73, dog faces, t(51) = 3.35, p = 0.002, d = 0.46, and monkey faces, t(51) = 3.92, p < 0.001, d = 0.55. Their preference for the top half of sheep faces did not differ from chance, t(51) = 1.12, p = 0.27, d = 0.16. Thus, in general, at the youngest age infants preferred the top half of all the faces except the sheep faces, which were both relatively unfamiliar and, as can be seen in **Figure 1**, dominated by the nose in the bottom half of the face.
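Two pieces of arithmetic in this paragraph can be made explicit. The text does not say how the 0.008 cut-off was derived; one reading consistent with it is a Bonferroni correction over the six pairwise comparisons among four face types. Likewise, for one-sample and paired t-tests, d = t / √n reproduces most of the reported effect sizes to within rounding. Both are assumptions about the authors' procedure, and the function names below are hypothetical.

```python
from math import comb, sqrt

FACE_TYPES = ["cat", "dog", "monkey", "sheep"]


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison criterion that keeps the familywise rate at alpha."""
    return alpha / n_comparisons


def cohens_d_from_t(t: float, n: int) -> float:
    """Effect size for a one-sample or paired t-test based on n infants."""
    return t / sqrt(n)


n_pairs = comb(len(FACE_TYPES), 2)                 # 6 pairwise comparisons
print(round(bonferroni_alpha(0.05, n_pairs), 4))   # 0.0083
print(round(cohens_d_from_t(4.51, 52), 2))         # 0.63 (sheep vs. cat, n = 52)
```

The same relation can be used as a quick consistency check on the other t and d pairs reported for each age group, bearing in mind that the published t values are themselves rounded.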

The ANOVA on the mean preference for the 6-month-old infants also revealed only a main effect of trial type, F(3, 165) = 8.31, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.13; however, as can be seen in **Figure 3**, the pattern was somewhat different. At this age, infants had a stronger preference for the top half of both cat and monkey faces than dog and sheep faces: cat vs. dog, t(56) = 3.97, p < 0.001, d = 0.53; monkey vs. dog, t(56) = 3.61, p = 0.001, d = 0.48; cat vs. sheep, t(56) = 3.89, p < 0.001, d = 0.52; and monkey vs. sheep, t(56) = 3.70, p < 0.001, d = 0.49. In general, 6-month-old infants seemed to have stronger preferences for the top halves of the faces that are configured with larger eyes toward the top than the faces with longer snouts and relatively large noses at the bottom.

Comparisons of the preference for the top halves to chance corroborated this conclusion. Six-month-old infants had clearly significant preferences for the top halves of cat faces, t(56) = 5.09, p < 0.001, d = 0.67, and monkey faces, t(56) = 5.10, p < 0.001, d = 0.68. They had non-significant preferences for the top halves of sheep faces, t(56) = 2.14, p = 0.04, d = 0.28, and dog faces, t(56) = 1.76, p = 0.08, d = 0.23. Bayes factor analyses confirmed that these preferences were ambiguous, at best. For the sheep faces, Bayes factor analyses with a scale r on effect size of 0.707 revealed a Scaled JZS Bayes Factor in favor of the null of 0.85; the Scaled JZS Bayes Factor in favor of the alternative was 1.18. For the dog faces, Bayes factor analyses with a scale r on effect size of 0.707 revealed a Scaled JZS Bayes Factor in favor of the null of 1.62; the Scaled JZS Bayes Factor in favor of the alternative was 0.62. Thus, neither the t-tests nor the Bayes Factor analyses provided strong support for the conclusion that 6-month-old infants preferred the top halves of dogs and sheep. In general, therefore, these 6-month-old infants preferred the top halves of faces with large eyes in the top halves, but not the top halves of faces that were dominated by long snouts.
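A scaled JZS Bayes factor for a one-sample t-test can be approximated numerically. The sketch below assumes the standard formulation by Rouder and colleagues (2009), which this section does not cite explicitly, and it is not the tool the authors used. The integral over g is rewritten with the substitution g = 1/z², which turns it into an expectation over a half-normal variable, and is then approximated with the midpoint rule. The function name `jzs_bf01` and the integration settings are hypothetical choices.

```python
from math import exp, pi, sqrt


def jzs_bf01(t: float, n: int, r: float = 0.707, steps: int = 20000) -> float:
    """Scaled JZS Bayes factor in favor of the null for a one-sample t-test.

    Assumption: the standard JZS integral, evaluated after substituting
    g = 1 / z**2 and integrating over z with the midpoint rule.
    """
    v = n - 1                                    # degrees of freedom
    null_likelihood = (1 + t * t / v) ** (-(v + 1) / 2)

    upper = 10.0                                 # normal density is negligible beyond this
    dz = upper / steps
    alt_likelihood = 0.0
    for i in range(steps):
        z = (i + 0.5) * dz
        shrink = z * z / (z * z + n * r * r)     # equals 1 / (1 + n * r**2 * g)
        density = 2 * exp(-z * z / 2) / sqrt(2 * pi)  # half-normal density of z
        alt_likelihood += (
            density * sqrt(shrink)
            * (1 + t * t * shrink / v) ** (-(v + 1) / 2) * dz
        )
    return null_likelihood / alt_likelihood


# Sheep faces at 6 months: t(56) = 2.14, so n = 57.
bf01 = jzs_bf01(t=2.14, n=57)
print(f"BF01 = {bf01:.2f}, BF10 = {1 / bf01:.2f}")
```

The Bayes factor in favor of the alternative is the reciprocal of the one in favor of the null, which is why the paired values reported above (0.85 and 1.18, 1.62 and 0.62) multiply to approximately 1. Values near 1 in either direction indicate that the data do not clearly favor either hypothesis.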

The analysis of the top half preference by 10-month-old infants revealed significant main effects of trial type, F(3, 180) = 4.67, p = 0.004, η<sub>p</sub><sup>2</sup> = 0.07, and pet group, F(1, 60) = 4.16, p = 0.046, η<sub>p</sub><sup>2</sup> = 0.07. Comparisons of infants' preferences for the top halves of each type of face revealed that overall 10-month-old infants had weaker preferences for the top halves of dog faces than cat faces, t(61) = 3.25, p = 0.002, d = 0.41, and monkey faces, t(61) = 3.44, p = 0.001, d = 0.44; the difference between the preference for the top halves of dogs and sheep did not reach our adjusted criterion of significance, t(61) = 2.02, p = 0.047, d = 0.26. The pet group main effect reflects the fact that across face types, 10-month-old infants with pets had stronger preferences for the top halves of faces (M = 0.65, SD = 0.27) than did 10-month-old infants without pets (M = 0.50, SD = 0.28). Moreover, comparison of the average top half preference to chance revealed that only infants with pets differed significantly, t(39) = 3.52, p = 0.001, d = 0.56; the average top half preference of infants without pets was not different from chance, t(21) = 0.14, p = 0.89, d = 0.03.

Finally, to provide additional insight into the preferences of 10-month-old infants, we compared the preferences for the top halves of each type of face to chance separately for infants with and without pets (see **Figure 4**). The 10-month-old infants with pets significantly preferred the top halves of cat faces, t(39) = 3.75, p = 0.001, d = 0.59, monkey faces, t(39) = 3.55, p = 0.001, d = 0.56, and sheep faces, t(39) = 2.85, p = 0.007, d = 0.45. Ten-month-old infants without pets did not have preferences for the top halves of any faces that significantly differed from chance; cat faces, t(21) = 0.753, p = 0.46, d = 0.16, dog faces, t(21) = 1.38, p = 0.18, d = 0.29, monkey faces, t(21) = 0.54, p = 0.60, d = 0.11, and sheep faces, t(21) = 0.03, p = 0.98, d = 0.006. Bayes Factor analyses, with an r scale of 0.707, provided modest support for the null hypothesis that the preferences were equivalent to chance for cat faces, Scaled JZS Bayes Factor of 3.47, monkey faces, Scaled JZS Bayes Factor of 3.94, and sheep faces, Scaled JZS Bayes Factor of 4.48. The Bayes Factor analysis did not provide clear support for either the null or the alternative hypothesis for the dog faces. Over time, experience with pets seems to help maintain an infant's interest in the top halves of these animal faces, as 10-month-old infants without pets show a reduced top half preference compared to the other age groups. All of the analyses from the 10-month-old infants lead to the same conclusion: infants with and without pets visually scanned these animal faces differently.

FIGURE 4 | Mean preference for the top half of each face type by age and pet status. Each blue circle represents a single infant; the squares represent the mean of each group.

### GENERAL DISCUSSION

The experiments presented here provide important insight into the role of companion cats and dogs in development in infancy. Experiment 1 revealed that more than half of the infants we sampled lived with one or more pets. Thus, pets have the opportunity to influence development for many infants. Experiment 2 showed that between 4 and 10 months, exposure to a pet in the home was related to how infants visually inspected images of animal faces. Although young infants with and without pets responded in the same way to animal faces, by 10 months infants with and without pets exhibited different patterns of visual inspection when looking at these images. Clearly, therefore, many infants have experience with pets, and that experience seems to influence at least one aspect of their development.

These findings contribute to two separate literatures. First, they address the literature focused on the role of animals in child development (4–6, 8, 9). As described earlier, little research has examined either the prevalence of pet experience during infancy, or the effect of pet exposure on infant development. The present work addresses both gaps in the literature. In Experiment 1, we show that in our sample, ∼63% of families with infants had household cats or dogs. Clearly our sample is not representative of all families, but it does show that the rates of pet ownership in families with infants (at least middle-class families in the Sacramento Valley of California) are similar to those documented in other studies of pet ownership [as reported by the American Pet Products Association (56); http://www.americanpetproducts.org/press\_industrytrends.asp], including in families with children (17). These data confirm that many infants have opportunities to develop in the context of experience with pets, and that this is a real difference in experience between infants. Thus, it is important to understand more about how infants' development is shaped by exposure to and experience with pets.

In addition, the data presented here confirm previously reported findings that exposure to a household cat or dog seems to induce different strategies for learning about images of dogs and cats in laboratory tasks (39–43). That is, we not only documented the prevalence of pet ownership, we also showed how infants' visual investigation of animal faces varied as a function of pet ownership. These results converge with previous findings that infants as young as 4 months visually investigate images of cats and dogs differently as a function of pet ownership (39, 42, 43), and that infants at 4 months learn about images of cats and dogs differently as a function of pet ownership (41). The results we reported here extend this previous work in several ways. First, we showed that these differences as a function of pet ownership hold even when infants are shown only animal faces. The previous work revealed differences when infants were shown images of full bodied animals. Thus, not only does this extend previous work showing that by 4 months infants show a bias to look at the heads of whole body animal images as a function of pet experience (39, 42), it shows that previously reported results about the effect of experience on developmental changes in infants' processing and scanning of human faces (21, 27, 57, 58) may extend to the role of experience on their processing of other kinds of faces. Just as previous work suggested a tuning of human face perception between 4 and 9 months based on infants' experience with faces of a particular race (24, 25, 52, 59, 60), here we show a shift in the specificity of infants' investigation of animal faces as a function of their experience with dogs or cats.

Moreover, the timing of the effects suggests that experience with pets is not a single, unified influence, but rather that exposure to pets may have different effects at different time points. Specifically, previous work showed that pet experience influences young infants' processing of whole body images of animals (39, 42). The current results show that pet experience has an influence on infants' processing of animal faces during the same developmental time period during which infants show shifts in their perception, discrimination, and visual investigation of human faces (24, 25, 52, 54, 59–62).

Importantly, these results also show that pet experience influences not only infants' processing of familiar animals such as cats and dogs, but that daily experience with a companion animal also has an effect on infants' processing of relatively unfamiliar animals such as sheep and monkeys. Thus, the current investigation addresses the specificity of the effect of infants' pet experience on their face processing. Our results suggest that experience with a pet influences infants' inspection of animal faces beyond their specific pet experience, as at 10 months we observed a difference in how infants with and without pets scanned monkey faces. This effect may reflect a mechanism like the one responsible for the effect of pet experience on children's understanding of biology and living kinds (63, 64).

In summary, the results reported here add to both the literatures on the impact of animals in child development and on the effect of experience on infants' processing of visual stimuli. We showed here that pet experience is pervasive in infancy, and that this experience influences one aspect of infant development. Future research on animals and child development should not overlook the important developmental period of infancy.

### AUTHOR CONTRIBUTIONS

KH and LO both contributed to the design and implementation of the research, analysis of the results, and writing of the manuscript.

### ACKNOWLEDGMENTS

This work was made possible by grants R03HD070651 awarded by the National Institutes of Child Health and Human Development and BCS 095158 awarded by the National Science Foundation to LO. We thank Karen Leung, Mielle Setoodehnia, Mirjam Harrison, and all the lab members at the Infant Cognition Lab at the UC Davis Center for Mind and Brain for their help in collecting and coding these data. We also thank Jennifer Pokorny and Keith Kendrick for sharing their animal face stimuli. We are especially grateful to the families who participated and made this research possible. This research was submitted by KH as partial fulfillment of a doctoral dissertation for the University of California, Davis. A version of this article was presented at the International Conference on Infant Studies in Minneapolis in 2012.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fvets.2018.00152/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Hurley and Oakes. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Long-Term Effect of Therapeutic Horseback Riding in Youth With Autism Spectrum Disorder: A Randomized Trial

Robin L. Gabriels<sup>1,2,3</sup>\*, Zhaoxing Pan<sup>1,3</sup>, Noémie A. Guérin<sup>4</sup>, Briar Dechant<sup>2,3</sup> and Gary Mesibov<sup>5</sup>

<sup>1</sup> Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, CO, United States, <sup>2</sup> Department of Psychiatry, University of Colorado Anschutz Medical Campus, Aurora, CO, United States, <sup>3</sup> Children's Hospital Colorado, Aurora, CO, United States, <sup>4</sup> Department of Comparative Pathobiology, Center for the Human-Animal Bond, College of Veterinary Medicine, Purdue University, West Lafayette, IN, United States, <sup>5</sup> Frank Porter Graham Child Development Institute, University of North Carolina, Chapel Hill, NC, United States

#### Edited by:

Peggy D. McCardle, Consultant, New Haven, CT, United States

#### Reviewed by:

Sophie Hall, University of Lincoln, United Kingdom Mitsuaki Ohta, Tokyo University of Agriculture, Japan

\*Correspondence: Robin L. Gabriels robin.gabriels@childrenscolorado.org

#### Specialty section:

This article was submitted to Veterinary Humanities and Social Sciences, a section of the journal Frontiers in Veterinary Science

> Received: 16 April 2018 Accepted: 20 June 2018 Published: 16 July 2018

#### Citation:

Gabriels RL, Pan Z, Guérin NA, Dechant B and Mesibov G (2018) Long-Term Effect of Therapeutic Horseback Riding in Youth With Autism Spectrum Disorder: A Randomized Trial. Front. Vet. Sci. 5:156. doi: 10.3389/fvets.2018.00156

This paper presents 6-month follow-up data from 55% (N = 64/116) of participants (ages 6–16 years) with a diagnosis of autism spectrum disorder who participated in a previously published randomized controlled trial of therapeutic horseback riding (THR) compared to a no-horse contact active control. The objective of this study was to examine whether significant improvements in irritability, hyperactivity, social, and communication behaviors observed in participants randomized to receive a 10-week manual-based THR intervention were sustained 6 months after the conclusion of the intervention. Participants' caregivers from both the THR (n = 36) and active control (n = 28) groups completed a measure of irritability and hyperactivity behaviors (primary outcome variables). Additionally, only the THR group participants completed the full battery of study outcome assessments. Between-group comparisons examining the extended interval from baseline (1-month pre-intervention assessment) to 6 months after the intervention revealed that the THR group maintained reductions in irritability behavior at the 0.1 significance level (effect size = 0.32, p = 0.07), which preserved 73% of the efficacy observed at the primary post-intervention endpoint (within 1 month post-intervention). Hyperactivity behaviors did not sustain this same trend. Comparisons between baseline and 6 months after the intervention revealed that the THR group sustained significant initial improvements made in social and communication behaviors, along with number of words and different words spoken during a standard language sample. This is the first known study to examine and demonstrate the longer-term effects of THR for individuals with ASD and warrants a more thorough evaluation of whether the effects of THR are maintained for at least 6 months after the intervention compared to a control.

Clinical Trial Registration Information: Trial of Therapeutic Horseback Riding in Children and Adolescents with Autism Spectrum Disorder; http://clinicaltrials.gov; NCT02301195.

Keywords: animal-assisted interventions, autism spectrum, therapeutic horseback riding, long-term outcomes, irritability

### INTRODUCTION

Along with the diagnostic social, communication, restricted, and repetitive behavior features of autism spectrum disorder (ASD), this population has particular difficulties with emotion regulation (1). Emotional dysregulation and related aberrant behavior responses (e.g., irritability and aggression) can detrimentally affect the daily social functioning of this population (2). Such issues can also contribute to an increased risk of exhibiting highly inappropriate and unsafe behaviors, the consequences of which can erode the quality of life (QoL) for the child with ASD and caregivers (3–5). The fact that there is no "one-size-fits-all" ASD intervention package (6) fuels a particular interest in seeking complementary and alternative ASD treatment options (7). One increasingly popular practice is the inclusion of animals in interventions to enhance human health, quality of life, or wellbeing, known as animal-assisted intervention (AAI) (8). The use of AAI for individuals with ASD has been hypothesized to provide a unique social partnership experience with the animal, one that can reduce arousal levels (i.e., dampen stressed/anxious states) and can address the unique social, communication, and behavior challenges of individuals with ASD (9).

Emerging evidence for the benefits of animals on the health and well-being of individuals with ASD is highlighted by recent systematic literature reviews (10, 11). Research on AAI for ASD has increased in recent years, from only 14 studies meeting inclusion criteria for empirical research between 1989 and 2012, to 28 studies between 2012 and 2015. Early studies reported improvements in social and communication skills, decreases in ASD symptom severity, amelioration of behavior problems (e.g., aggression), reduced stress, and enhanced quality of life (10). However, a majority of these studies lacked methodological rigor, making these findings difficult to interpret or rely upon. Although diverse methods continue to be employed, the methodological quality of some of the more recent AAI studies has improved (11). In this more recent review of 22 out of 28 AAI studies, social interaction skills were identified as the most consistent outcome reported, with additional outcome indications of improved communication skills, positive emotions, and reduced arousal levels (11). In this same review, equines were the most common animal species included in AAI (55% of studies) (11).

A systematic mapping review of equine-assisted activities and interventions (EAAT) studies with the ASD population revealed a wide variety of intervention methods, ranging from equine-assisted activities (EAA) (e.g., psychoeducational horseback riding, therapeutic riding) involving riding instructors, coaches, or trainers to equine-assisted therapies (EAT) (e.g., hippotherapy, simulated developmental horse-riding) involving therapists (e.g., occupational or physical therapists) and therapeutic riding instructors (12). In 31 of 33 studies reviewed, riding the horse was a key component, but EAA and EAT had different aims. Horsemanship, communication, and social skills were an emphasis of EAAs, with 13 out of 25 involving group sessions. Conversely, the eight EAT studies did not always specify group or individual session type and instead focused on the use of the horse's movement to target physical and sensorimotor functioning. Outcome improvement areas reported by EAA studies included social interactions, communications, sensory processing, movement control, ASD symptom severity, and QoL, whereas EAT studies reported outcome improvements in motor control and adaptive living skills (12).

Although THR appears to be a widespread practice that has become popular for individuals with ASD, few studies have systematically validated the effects of THR for individuals with ASD following recommended guidelines for ASD research of therapeutic interventions (13). Such a practice is necessary to guide consumers' AAI treatment choices and third-party payers' interest in funding evidence-based AAI. In response to this need, this research team conducted the first known large-scale (N = 127) randomized controlled trial (RCT) of THR compared to a no-horse activity control with children ages 6–16 years diagnosed with ASD (14). Results showed a significant medium effect size improvement in participants from the THR group (n = 58) compared to the control (n = 58) on measures of irritability, hyperactivity, social cognition, social communication, and total words and new words spoken during a standardized language sample.

There is a paucity of research examining the longer-term maintenance of AAI benefits for individuals with ASD beyond immediate outcomes. A recent follow-up study by Hall et al. (15) examined the maintenance of the immediate observed improvements in family functioning and stress made after families of children diagnosed with ASD either acquired a dog or did not (16, 17). This two-and-a-half-year follow-up revealed maintenance of family functioning gains in the subset of those who followed up from the intervention group (n = 22) compared to the control (n = 15) (15). One study to date has attempted to prospectively examine residual effects of THR for children with ASD (18). That study used a repeated-measures, interrupted treatment design to evaluate 21 participants with ASD before and after 10 weeks of THR. An unintentional 6-week break from treatment interrupted the study design. Participant outcome assessment following this initial 16-week period indicated that participants showed improvements in sensory processing abilities and autism-related symptoms. However, none of these improvements remained when participants were re-evaluated after discontinuing THR for 6 weeks. After 6 weeks of THR were re-introduced, an immediate follow-up assessment revealed that the initial improvements resumed. Of note, however, this study had several methodological limitations, including reliance on teacher report alone, which limited outcome evaluation to the school environment, lack of a control condition, and an unplanned 6-week break during the initial treatment phase (18). Examining the durability of AAI improvements is an important area of research needed in the growing effort to validate the efficacy of these interventions.

This article presents 6-month follow-up data from a subset of participants who were randomized as part of a previously published RCT to receive either 10-weeks of THR intervention or 10-weeks of a barn activity (BA) control group with no exposure to horses (14). In this study, we examined whether the behavioral, social, and communication improvements remained 6 months after the completion of the THR intervention.

### MATERIALS AND METHODS

The following summarizes the methods of our previously published RCT (14), which was conducted at a Première certified PATH International center that follows industry guidelines to ensure horse welfare. For additional details concerning the participant consent/assent process, inclusion and exclusion criteria, ASD study diagnostic confirmation, screening, randomization, measures, and interventions, please see the discussion in Gabriels et al. (14).

### Participants

At the onset of the RCT (14), all 127 participants ages 6–16 years with a study-confirmed diagnosis of ASD were invited to engage in this 6-month follow-up assessment, approved by the Institutional Review Board (IRB) at the first author's institution. Specifically, participants and caregivers completed an informed consent/assent process, giving consent for three evaluation points: (1) baseline assessment, (2) post-intervention assessment, and (3) a 6-month post-intervention follow-up assessment contingent on agreement to refrain from continuing participation in THR for 6 months after the initial intervention phase. The informed consent process included mention of monetary incentives offered for each assessment period of the study, including the 6-month follow-up. Of the 127 participants enrolled in the RCT (14), only 116 were eligible to complete this third study phase because they had completed RCT baseline (pre-intervention) assessments. However, only 96 of these 116 participants were invited to be included in the follow-up assessment process, because they completed 1-month post-intervention follow-up assessments and chose to refrain from continued participation in THR until after the 6-month follow-up evaluation period. Six months after these 96 participants completed the intervention phase of the RCT (14), the study coordinator contacted them. Of the 96 participants contacted, 64 (66.67%) responded (THR group n = 36; control group n = 28). Of the 32 participants who did not complete this third study phase, two were no longer living with the same caregivers but rather were living in community-based placements, and the remainder were otherwise unable to schedule and/or complete study forms. Participating families received compensation for travel to study visits and for completing and returning the study questionnaire.
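The participant flow described above can be checked with simple arithmetic. The sketch below only restates counts given in the text; the variable names are illustrative and are not taken from the study materials.

```python
# Participant counts as reported in the text.
enrolled = 127    # enrolled in the RCT
eligible = 116    # eligible for the third study phase
invited = 96      # invited to the 6-month follow-up
responded = 64    # completed the 6-month follow-up

thr, control = 36, 28  # responders by group
assert thr + control == responded

response_rate = 100 * responded / invited    # share of invited participants
followup_rate = 100 * responded / eligible   # share of eligible participants
non_completers = invited - responded

print(f"{response_rate:.2f}% of invited participants responded")
print(f"{followup_rate:.1f}% of eligible participants have follow-up data")
print(f"{non_completers} invited participants did not complete follow-up")
```

The second percentage is the proportion of the 116 eligible participants represented in the follow-up sample, which is the relevant figure when judging how well the follow-up sample reflects the original trial.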

### Study Design

Following institutional review board-approved informed consent/assent procedures for the RCT (14), participants were randomized in a 1:1 ratio to either a 10-week THR group or a no-horse barn activity (BA) control group based on an a priori randomization list generated by the study statistician. Randomization was stratified by participants' nonverbal intelligence quotient (NVIQ) standard score (≤85 or >85) measured by the Leiter-R (19). Weekly intervention and control group lessons lasted 45 min, consisted of two to four participants, were led by a THR instructor, followed a consistent routine, and taught horsemanship skills via activities tailored to ASD learning styles outlined in the THR intervention manual (20). Specifically, the THR group participants each had one or more assigned volunteers (one horse leader and up to two side walkers), and the BA group participants each had one assigned volunteer. BA group participants had no contact with horses at the riding center but were able to view horses from a distance. A life-sized stuffed horse was present in the BA group for hands-on learning related to the weekly horsemanship topic.
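A stratified 1:1 allocation list of the kind described above could be generated along the following lines. This is an illustration only and not the study's actual procedure: the permuted-block scheme, the block size of 4, the seed, the stratum labels, and the function name `make_allocation_list` are all assumptions. The text states only that the list was generated a priori by the study statistician and stratified by NVIQ standard score with a cut-point at 85.

```python
import random


def make_allocation_list(n_per_stratum, block_size=4, seed=0):
    """Return {stratum: [arm, ...]} built from permuted blocks (assumed scheme)."""
    rng = random.Random(seed)  # fixed seed so the list is reproducible
    strata = ("NVIQ<=85", "NVIQ>85")  # hypothetical stratum labels
    allocation = {}
    for stratum in strata:
        arms = []
        while len(arms) < n_per_stratum:
            # Each block holds equal numbers of THR and BA slots (1:1 ratio).
            block = ["THR", "BA"] * (block_size // 2)
            rng.shuffle(block)
            arms.extend(block)
        allocation[stratum] = arms[:n_per_stratum]
    return allocation


allocation = make_allocation_list(n_per_stratum=8)
for stratum, arms in allocation.items():
    print(stratum, arms)
```

Blocking within each stratum keeps the two arms balanced on NVIQ even if enrollment stops early, which is one common reason to stratify on a prognostic variable.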

### Six-Month Outcome Measures

Six months following the completion of the intervention phase of the study, the same caregiver who completed baseline (pre-intervention) study forms for the THR (n = 36) and control (n = 28) groups completed the 6-month follow-up study measures. Only the THR group participants (n = 36) were scheduled to come into the study clinic site for a re-administration of all baseline assessment measures (i.e., language assessment and social-communication caregiver report form). The speech therapist, who had conducted the baseline and post-intervention evaluations as a blinded evaluator, was unblinded at the 6-month follow-up because only THR participants received the 6-month follow-up language assessment. However, baseline and post-intervention assessments were not available to the speech therapist during the 6-month evaluation period.

The Irritability and Hyperactivity subscales of the Aberrant Behavior Checklist-Community (ABC-C) (21) were the primary outcome measures for the RCT (14). For this follow-up, the subset of caregivers from both groups (THR and BA) was asked to complete the ABC-C (21). The ABC-C is a 58-item symptom checklist that assesses problem behaviors/self-regulation in children and adults with developmental disabilities (22, 23) and is commonly used as a primary outcome measure in psychopharmacological studies with the ASD population (21). Caregivers received this questionnaire electronically or by mail and were asked to report on participants' current (within the past 4 weeks) observed irritability and hyperactivity behaviors.

The persistence of the previously reported (14) improvements in social communication and social cognition behaviors observed in the THR group was again measured using the Social Responsiveness Scale (SRS) (24), completed by the subset of caregivers of participants who completed the THR intervention. The SRS has high internal consistency and retest temporal stability in males and females with ASD (25).

To evaluate the persistence of word fluency improvements made in the THR group compared to the control from baseline to post-intervention (14), the subset of participants from the THR group was again administered the Systematic Analysis of Language Transcripts (SALT) (26) 6 months after the THR intervention. The SALT (26) consists of a 5 min standardized language sample and includes software to structure the collection, transcription, and analysis of language samples obtained from individuals, including those with ASD.

TABLE 1 | Demographics for participants with follow-up at 6 months' post-treatment and those without follow-up.


<sup>a</sup>p-value for comparisons between THR and Barn group among participants with 6-month follow-up.

<sup>b</sup>p-values for comparisons between participants with 6-month follow-up and those without respectively for THR and Barn participants.

\*\*p < 0.05; \*p < 0.1 (Significant p-values are in bold).

### Statistical Analysis

SAS 9.4 was used for all analyses (SAS Institute Inc.)<sup>1</sup>. Participant characteristics and outcome data were compared between the THR and BA groups using two-sample t-tests or chi-square tests as appropriate. The outcome analysis used a linear mixed effects model (LMM). For irritability and hyperactivity, the fixed effects of the LMM included the classification variables of evaluation time (i.e., baseline, post-THR, and 6-month follow-up) and group indicator (THR or BA) as well as their interaction. A statistical test of the group-by-time interaction was used to examine the significance of efficacy. Cohen's D effect size for efficacy was estimated from this interaction test as 2 times the t value divided by the square root of the degrees of freedom. Other outcome variables collected only for THR participants were also analyzed by LMM. Compound symmetry was the covariance structure used for all LMM analyses.
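The conversion from an interaction t statistic to Cohen's D can be sketched as below, using the form d = 2t / sqrt(DF) given in the Table 2 footnote. The function name and the input values are hypothetical; they are not statistics from this study, and the analyses themselves were run in SAS rather than Python.

```python
import math


def cohens_d_from_t(t_value, df):
    """Approximate Cohen's D from a t statistic as d = 2t / sqrt(df)."""
    if df <= 0:
        raise ValueError("df must be positive")
    return 2.0 * t_value / math.sqrt(df)


# Hypothetical inputs chosen for easy arithmetic, not values from the study.
d = cohens_d_from_t(t_value=2.0, df=100)
print(round(d, 2))  # 0.4
```

Because the degrees of freedom enter through a square root, dividing by the raw degrees of freedom instead would understate the effect size substantially for any realistic sample.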

### RESULTS

### Patient Demographics and Baseline Clinical Data

Summary statistics of demographic and baseline clinical characteristics for the participants who chose to follow up 6 months after the initial THR intervention (n = 64) and those who did not follow up (n = 52) are listed in **Table 1**. There were no statistically significant differences between these two groups with respect to the demographic and clinical data collected, except that participants who did not choose to follow up tended to travel farther to the riding center (see **Table 1**).

### Between Group Efficacy Maintenance at Six-Month Follow-Up

For the subgroup of participants from the RCT (14) who completed the 6-month follow-up assessment (THR group, N = 36; BA group, N = 28), the THR group experienced significantly greater improvement (effect size = 0.44, p = 0.016) on the ABC-C (21) Irritability subscale between pre-intervention and within 1 month post-intervention as compared to the BA group (**Table 2** and **Figure 1**). This efficacy is consistent with the larger RCT study results (14). Over the extended interval from baseline to 6 months after intervention, results favored the THR group on the ABC-C (21) Irritability subscale at the 0.1 significance level (effect size = 0.32, p = 0.07). The observed effect size at 6 months is 73% of that at post-intervention in this sample and 64% of that observed in the larger RCT study. For the ABC-C (21) Hyperactivity subscale, the THR group showed a greater (non-significant) improvement compared to the BA group from pre-intervention to within 1 month post-intervention (effect size = 0.32, p = 0.08), in contrast to the significant finding in the original RCT study (14). There was no significant difference, however, when examining the extended interval from baseline to 6 months after intervention for the hyperactivity subscale (effect size = 0.09, p = 0.61), indicating that the efficacy of THR on hyperactivity was not sustained.

<sup>1</sup> SAS Institute Inc. "SAS" (Cary, NC).
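The "efficacy preserved" percentage for irritability can be reproduced from the effect sizes reported for this follow-up sample. In the sketch below, the function name is illustrative, and the hyperactivity percentage is derived here for comparison only; it is not a figure reported by the authors.

```python
def efficacy_preserved(effect_followup, effect_post):
    """Follow-up effect size as a percentage of the post-intervention effect size."""
    return 100.0 * effect_followup / effect_post


# Effect sizes reported in the text for the follow-up sample.
irritability = efficacy_preserved(0.32, 0.44)
hyperactivity = efficacy_preserved(0.09, 0.32)  # derived, not reported

print(f"irritability: {irritability:.0f}% of the post-intervention effect")
print(f"hyperactivity: {hyperactivity:.0f}% of the post-intervention effect")
```

The irritability ratio rounds to the 73% stated in the text, while the much smaller hyperactivity ratio is consistent with the conclusion that the hyperactivity effect was not sustained.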

### Six-Month Follow-Up of the THR Group

Consistent with the original RCT study (14), significant improvements were observed in the THR group participants who completed the 6-month follow-up data collection on the SRS (24) Social Communication and Social Cognition subscales, along with number of words and different words spoken during the SALT (26), from baseline (within 1 month pre-intervention) to 1 month post-intervention (each p < 0.01). These post-intervention changes were sustained from 1 month post-intervention to the 6-month follow-up period (see **Figure 2**). Specifically, there was significant improvement for each outcome (p < 0.01) between baseline and 6 months post-intervention, while there was no significant difference between 1 month post-intervention and 6 months post-intervention.

### DISCUSSION

This report presents follow-up data from a subset of participants (n = 64) in a previously published RCT (14) of the effects of a 10-week THR group intervention on children and adolescents with ASD compared to a no-horse contact activity control group. Results from this follow-up study show that the subset of THR participants measured retained some of their initial improvements in irritability compared to the BA control. Additionally, an exploratory analysis of only the THR group revealed that they sustained their significant initial improvements in social and communication behaviors, along with number of words and different words spoken during a standard language sample, for at least 6 months following the completion of the THR intervention.

For reference, the previously published results of that RCT reported that after 10 weeks of intervention (THR or BA control), participants in the THR group showed significantly more improvements in irritability and hyperactivity behaviors compared to the BA control (14). The THR group also showed significant improvements on the SRS (24) subscales of social cognition and communication and used a greater number of different words as measured by the SALT (26), while the control group did not show similar improvements (14). The significant irritability and hyperactivity effects began by the fifth week of the THR intervention. Although the BA control showed significant within-group improvements in irritability and hyperactivity behaviors at end of intervention and 6 months after intervention, such improvements could be due to a variety of factors other than the BA group (e.g., placebo response). It would be misleading to draw conclusions about the effects of this active (BA) control intervention due to the absence of a non-intervention control for comparison, a necessary element to test treatment effects as discussed in the literature [e.g., (29, 30)].


<sup>b</sup>Cohen's D effect size for efficacy at EoT or 6 months, estimated by 2 times t value divided by sqrt(DF).

\*\*p < 0.05; \*p < 0.1 (Significant p-values are in bold).

These results suggest that THR may be an effective complementary intervention to improve social and verbal core symptoms of ASD and to reduce irritability behaviors. Given the results of this study, particularly the lingering effects on irritability behaviors, we hypothesize that our THR manual-based approach may induce a reduction in arousal states, dampening stress/anxiety, in youth with ASD. Therefore, an important next step might be to examine whether the physiological regulation mechanisms involved in THR may explain (at least partially) improved outcomes in youth with ASD. Additionally, this study's finding of long-term sustained improvements in irritability behaviors has clinical practice implications for youth with ASD that can add to the current standard of practice of administering anti-psychotic medications (i.e., risperidone and aripiprazole) to reduce symptoms of irritability in this population (ages 5–16 years and 6–17 years) (31–33). For example, it has been proposed that THR might be a safe adjunct intervention to facilitate lowering medication dosages in this population (34). Finally, this study expands the AAI research base by prospectively examining the residual effects of THR for children with ASD.

FIGURE 1 | Profile of irritability (IRR) and hyperactivity (HYP) over three assessment periods. Note, the typical clinical threshold for the ABC-C irritability subscale is >14–16 in psychopharmacology clinical trials for the ASD population [e.g., (27, 28)].

The limitations of this study include its small sample size due to the high dropout rate 6 months following the initial intervention phase. This limits the validity of the results, as findings may not represent the greater population of youth with ASD. Another limitation is that, for those participants who did not choose to follow up, attrition might have occurred because they lived farther away from the study site. This also presents a possible selection bias. Even though we only included participants who refrained from engaging in THR for 6 months following the initial intervention, we did not specifically assess what other contact with horses, if any, was made during that time, which is a limitation. Another limitation is that this study did not assess the efficacy of THR compared to the BA control group on all outcome measures employed in the previously published RCT (14). This limits the validity of these study results and warrants a more thorough evaluation of whether the positive effects of THR can be maintained for at least 6 months after the THR intervention compared to a control.

Despite these limitations, this study provides useful preliminary data to both support and extend the significant findings from our previously published pilot study and RCT (14, 35). This study also provides suggestions for future investigations of the longer-term benefits of THR in children and adolescents with ASD. For example, future investigations should consider the addition of incentives to lower follow-up attrition rates, such as conducting outcome evaluations in closer proximity to participants' residence and providing increased monetary incentives for completing follow-up assessments. Future THR studies may consider collecting follow-up outcome assessments at several time points post-intervention to elucidate the time course over which outcome improvements linger. Finally, additional long-term outcome research will help to establish empirical evidence for THR as a valid intervention for youth with ASD, one that leads to the acquisition and long-term maintenance of behavioral skills that may enhance the quality of life for individuals with ASD and their caregivers.

### AUTHOR CONTRIBUTIONS

RG was the principal investigator of this study and contributed to writing the majority of this manuscript. ZP served as the statistical expert for this study and wrote the results section of this manuscript. GM served as a consultant for this study and provided editorial edits to this manuscript. BD served as the clinical coordinator for this study and provided editorial edits to this manuscript. NG assisted in writing the introduction of this manuscript.


### ACKNOWLEDGMENTS

Portions of the research described in this study have been previously presented at the International Meeting for Autism Research in May 2012 (Toronto, CA) and the Society for Research in Child Development in April 2013 (Seattle, WA). This research was supported by the NIH/National Institutes of Nursing Research grant NR012736-01. Luitpold Pharmaceuticals donated Adequan® medication for the horses in the study in collaboration with the riding center's veterinarian. The authors would like to thank the families and children who participated in this study as well as our colleagues who assisted with this project: John Agnew, Natalie Brim, Laurie Burnside, Tina Farrell, Oren Gordon, Syd Martin, Shana Holderness, Amy Shoffner and the staff of the Colorado Therapeutic Riding Center, particularly Carol Heiden, Jody Howard, Heather McLaughlin, Mary Mitten, Penelope Powell and Sharon Van Boven. We also thank PATH International for supporting this study as well as Stephanie Luallin and James Heathers for their technical assistance formatting sections of this manuscript.


**Conflict of Interest Statement:** RG is a co-author of the book, Growing Up with Autism: Working with School-aged Children and Adolescents (Guilford Press) and the book, Autism from Research to Individualized Practice (Jessica Kingsley Publishers), from which she receives royalties. Current grant funding for RG is provided by the Simons and Lurie Foundations and The Human-Animal Bond Research Institute (HABRI) Foundation.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Gabriels, Pan, Guérin, Dechant and Mesibov. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Children's Relationship With Their Pet Dogs and OXTR Genotype Predict Child–Pet Interaction in an Experimental Setting

Darlene A. Kertes<sup>1,2</sup>\*, Nathan Hall<sup>3</sup> and Samarth S. Bhatt<sup>1</sup>

<sup>1</sup> Department of Psychology, University of Florida, Gainesville, FL, United States, <sup>2</sup> Genetics Institute, University of Florida, Gainesville, FL, United States, <sup>3</sup> Department of Animal and Food Sciences, Texas Tech University, Lubbock, TX, United States

Human–animal interaction (HAI) research has increasingly documented the important role of pet dogs in children's lives. The quality of interaction between children and their pet dogs, however, is likely influenced by individual differences among children as well as their perceived relationship with their pet dog. Ninety-seven children aged 7–12 years and their pet dogs participated in a laboratory protocol during which the child solicited interaction with their dog, from which time petting and gazing were recorded. Children reported on their perceived relationship with the pet dog via interview. Children provided saliva samples, from which a polymorphism in the oxytocin receptor, OXTR rs53576, which has long been implicated in social behavior, was genotyped. The results showed that OXTR genotype and children's perceived antagonism with the pet dog predicted the amount of petting, but not gazing, between children and their pet dogs. This research adds to the growing body of HAI research by documenting individual differences that may influence children's interactions with animals, which is key to research related to pet ownership and understanding factors that may impact therapeutic interventions involving HAI.

Keywords: human–animal interaction, relationships, child, petting, dogs, OXTR, oxytocin receptor gene, oxytocin

### INTRODUCTION

Human–animal interaction (HAI) research has increasingly documented the important role of pet dogs in providing social support to children (e.g., Friedmann et al., 1983; Kotrschal and Ortbauer, 2003; Anderson and Olson, 2006). However, the bulk of this research has been descriptive or correlational in nature, with long-standing concerns about methodological rigor (Griffin et al., 2011) and few well-controlled laboratory experiments that afford greater confidence to the validity of results (Wilson and Barker, 2003). Our group has previously reported that children randomly assigned to experience a standard laboratory stressor accompanied by their pet dog for social support reported feeling less stressed compared to children who completed the stressor with their parent present or with no social support (Kertes et al., 2017). Among the children who underwent the stressful experience with their pet dog, those who naturally solicited their dog to be stroked or petted had lower levels of the stress-sensitive hormone cortisol compared to children who engaged their dog less.

#### Edited by:

Peggy D. McCardle, Peggy McCardle Consulting, LLC and Haskins Laboratories, United States

#### Reviewed by:

Sabrina E. B. Schuck, University of California, Irvine, United States Kristine Ann Kovack-Lesh, Ripon College, United States

> \*Correspondence: Darlene A. Kertes dkertes@ufl.edu

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 24 May 2018 Accepted: 26 July 2018 Published: 05 September 2018

#### Citation:

Kertes DA, Hall N and Bhatt SS (2018) Children's Relationship With Their Pet Dogs and OXTR Genotype Predict Child–Pet Interaction in an Experimental Setting. Front. Psychol. 9:1472. doi: 10.3389/fpsyg.2018.01472

Indeed, petting, and to a lesser extent gazing, have been suggested as potential mechanisms by which HAI affects humans by altering their emotional and physiological state (e.g., Odendaal and Meintjes, 2003; Shiloh et al., 2003). Among adults, petting is associated with reduced perceived stress or anxiety (Shiloh et al., 2003; Barker et al., 2005), increased immunoglobulin A (Charnetski et al., 2004), lower heart rate or blood pressure (Jenkins, 1986; Vormbrock and Grossberg, 1988; Demello, 1999; Handlin et al., 2011), and changes in β-endorphins, prolactin, β-phenylethylamine, oxytocin, cortisol, and dopamine (Odendaal, 2000; Barker et al., 2005). Adult owner-pet gazing has been linked with increased oxytocin levels (Nagasawa et al., 2009, 2015). Among children, petting during child–dog interaction has been associated with lowered cortisol stress response (Kertes et al., 2017) and positive affect (Kerns et al., 2018). The vast majority of research on HAI with children has focused on dog presence, which has been linked with reduced blood pressure (Friedmann et al., 1983) and perceived stress (Kertes et al., 2017), enhanced emotional stability in the classroom (Anderson and Olson, 2006), increased social interaction and decreased aggression and hyperactivity (Kotrschal and Ortbauer, 2003), and reduced distress during a routine medical procedure (Vagnoli et al., 2015).

Its potential benefits notwithstanding, the degree to which children's interactions with their pet dogs spontaneously include petting and gazing may be influenced by individual differences in children's perceived relationship with their pet dog. To date, the majority of research on pet owners' feelings toward their pets has centered on positive emotions (Johnson et al., 1992; Cromer and Barlow, 2013). This area of research has shown that children and adults alike report strong positive feelings toward their pet dogs (Serpell, 1996; Daly and Morton, 2006; Kurdek, 2008). Noticeably absent from most HAI studies is the role of perceived negative aspects of the child–pet relationship, such as feeling annoyed with or hassled by interactions with the pet. A more complete evaluation of the effects of children's feelings toward their pet necessarily includes both positive and negative components of children's perceived relationships (e.g., DeRosier and Kupersmidt, 1991; Furman and Buhrmester, 1992; Shantz and Hartup, 1992; Van Horn and Cunegatto, 2000; Moilanen and Raffaelli, 2010).

Another factor that may contribute to individual differences in children's interactions with their pet is variability within the oxytocinergic system. Oxytocin is a hormone and neuromodulator shown to be involved in a variety of social behaviors (Carter, 2014). Oxytocin is linked with affiliative behavior (Insel, 1992), formation of social bonds (Lim and Young, 2006), and responses to stressful social situations (Neumann et al., 2000; Kirsch et al., 2005).

Endogenous circulating oxytocin effects are influenced by actions at the oxytocin receptor. This receptor is encoded by the gene OXTR, which is variably expressed across individuals. Among the most commonly studied genetic polymorphisms at OXTR is rs53576, involving a guanine (G) to adenine (A) substitution. Genetic variation at this locus has been associated with both prosocial and negative behaviors. For example, A-carriers (i.e., individuals with the AA or AG genotype), compared to those with the GG genotype, have demonstrated lower levels of interpersonal empathy (Rodrigues et al., 2009; Smith et al., 2014) and trust (Krueger et al., 2012), as well as lower self-esteem (Saphire-Bernstein et al., 2011) and higher negative affect and loneliness (Lucht et al., 2009). Among adolescents, A-carriers are reported to be less responsive to parental support (Smearman et al., 2016). Consistent with the differential susceptibility hypothesis (Belsky et al., 2009), it has been suggested that OXTR rs53576 may be one of a set of susceptibility loci in the genome, whereby genetic variation influences an individual's sensitivity to the social environment (Kim et al., 2010).

Notably, the extensive literature examining OXTR rs53576 in relation to social behavior has focused exclusively on human social interaction. To date, there are no published studies examining OXTR genotype with respect to children's interaction with animals. This is notable because interaction with and ownership of lay or trained therapy dogs is increasingly becoming a mainstay of clinical interventions for children with anxiety disorders, autism spectrum diagnoses, or a history of maltreatment, on the assumption that interacting with animals is particularly beneficial for individuals for whom social interactions are challenging (Nimer and Lundahl, 2007). The present report is the first to assess whether the OXTR genotype among children is related to HAI.

This study focused on typically developing children in middle childhood (aged 7–12 years). During this developmental period, the amount of time children spend with parents declines dramatically compared to earlier ages (Lam et al., 2012). Although parents continue to be important social partners, in middle childhood, children begin to rely on a broader network of social support figures compared to earlier ages, including pets (Bryant, 1985).

The purpose of the present study was to test whether children's perceived relationships (including both positive emotional support and negative interactions) along with child genotype at the OXTR rs53576 polymorphism predict directly observable child–pet interaction. To achieve this aim, we assessed two essential elements of child–pet interaction, petting and gazing, via direct behavioral observation in the context of a controlled laboratory environment with minimal distractions. Based on the extant literature linking OXTR rs53576 genotype to social behavior, we expected that children who are A-carriers would differ from those with the GG genotype with respect to the amount of time spent petting and gazing with their pet dogs. Because of the complex associations of the OXTR genotype with respect to social behavior, and the fact that this is the first study to examine the rs53576 polymorphism with respect to HAI rather than human social interactions, we did not specify a priori directional predictions for the OXTR genotype. With respect to children's relationships, we anticipated that a more positive perceived relationship with the pet dog, as reflected in higher levels of perceived support and lower levels of perceived negative interactions, would be associated with higher levels of petting and gazing with the pet dog. Children were also asked about their perceived relationship with the mother, which was included on conceptual grounds because of mothers' key role in social emotional development, and to evaluate whether child–parent relationships were related to child–pet interactions.

### MATERIALS AND METHODS


### Participants

Participants were 97 children (49 boys; 48 girls) accompanied by a parent (81% mother) and pet dog. Participating families were recruited through locally distributed mailings, flyers, and radio and TV advertisements. Interested families contacted the research lab and were screened for eligibility. To be eligible for the study, children could not have any diagnosed physical or behavioral health conditions, and the pet dog must have lived with the family for at least 6 months and have no history of aggression. If multiple dogs resided in the home that met inclusion criteria, the family selected one dog based on the child–pet relationship to accompany the child to the research lab.

Child age ranged from 7 to 12 years (M = 10.3 years, SD = 1.32). Child ethnicity was reported by parents as follows: 11% Hispanic; 89% non-Hispanic. The majority of the sample was White (84%), with the remainder reporting their race as follows: 7% two or more races; 3% Latino; 2% Native American; 2% African American; and 2% Asian.

### Procedure

Procedures were approved by the University of Florida Institutional Review Board and the Institutional Animal Care and Use Committee. All procedures took place in three adjacent rooms – a waiting room, interview room, and experimental testing room – at the research laboratory at the University of Florida. Children were aware of their parent's and dog's location at all times. All rooms were temperature controlled and water was available for the dog.

At the start of the study visit, parents provided written informed consent in the waiting room. The study was also explained to children verbally for purposes of oral assent. A trained dog handler brought the dog to the experimental testing room to familiarize the dog with the room and study personnel. Then, the child accompanied an experimenter to the interview room, which had child-friendly décor, while the dog remained with the parent in the waiting room.

In the waiting room, parents completed questionnaires providing basic demographic information on their child, family, and pet dog. Parents also provided information about the breed of the dog, which was subsequently classified into breed groups. Children's pets included lap/toy dogs (n = 31), sporting breeds (n = 20), herders (n = 18), terriers/ratters (n = 12), bully/fighting breeds (n = 11), and unknown mixes (n = 5). A research assistant was present throughout to answer any parent questions.

### Children's Perceived Relationships

In the interview room, children completed an experimenter-assisted questionnaire about their relationships with their mother and their pet dog using the Network of Relationships Inventory (NRI) (Furman and Buhrmester, 1985). The original NRI, comprising 21 items, was designed and has been validated for assessing relationship qualities across a broad variety of social relationships, including but not limited to parents, teachers, and peers. An example item from this measure is, "How often do you tell this person everything that you are going through?" The measure contains 10 subscales typically collapsed into two broader scales, termed Support and Negative Interactions. We have previously evaluated the NRI among children owning pet dogs to determine the relevance of items for assessing child–pet relationships. With the exception of two subscales, Instrumental Aid and Conflict, the remaining subscales were retained as applicable to child–pet relationships (Hall et al., 2016).

The NRI items tapping the relationship with the mother were scored as recommended by the scale developers (Furman and Buhrmester, 1985). Then, the overarching dimension of Support was created by computing the mean of the items on the subscales Companionship, Intimate Disclosure, Nurturance, Affection, Admiration, Reliable Alliance, and Instrumental Aid. The dimension of Negative Interactions was computed as the mean of the scores on Conflict and Antagonism.

The NRI pet items were subjected to a principal component analysis (PCA) to determine whether a two-dimension solution was appropriate with the more limited set of subscales assessed for child–pet relationships. As described in the Section "Results," a two-dimension solution was deemed appropriate for the data, and therefore the subscale means were computed and averaged into the two broad dimensions as follows: Support (Companionship, Intimate Disclosure, Nurturance, Affection, Admiration, Reliable Alliance) and Negative Interactions (Antagonism).
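As a rough illustration of the pet-relationship scoring described above, the following Python sketch averages item ratings into subscale means and then averages those means into the two dimensions. The function name `score_pet_nri`, the three-items-per-subscale layout, and the 1–5 rating values are assumptions made for the example; they are not taken from the study.

```python
from statistics import mean

# Subscales retained for the child-pet relationship, grouped as in the text.
SUPPORT_SUBSCALES = [
    "Companionship", "Intimate Disclosure", "Nurturance",
    "Affection", "Admiration", "Reliable Alliance",
]
NEGATIVE_SUBSCALES = ["Antagonism"]


def score_pet_nri(item_ratings):
    """Return (support, negative_interactions) for one child.

    item_ratings maps each subscale name to a list of item ratings.
    Each subscale is first reduced to its mean; the dimension score is
    the mean of the relevant subscale means.
    """
    subscale_means = {name: mean(items) for name, items in item_ratings.items()}
    support = mean([subscale_means[name] for name in SUPPORT_SUBSCALES])
    negative = mean([subscale_means[name] for name in NEGATIVE_SUBSCALES])
    return support, negative


# Hypothetical ratings for one child (illustrative values only).
example_child = {
    "Companionship": [4, 5, 3],
    "Intimate Disclosure": [2, 3, 4],
    "Nurturance": [5, 5, 5],
    "Affection": [5, 4, 3],
    "Admiration": [3, 3, 3],
    "Reliable Alliance": [5, 5, 5],
    "Antagonism": [1, 2, 3],
}
support, negative = score_pet_nri(example_child)
```

Because Negative Interactions rests on a single subscale for the pet items, its dimension score is simply the Antagonism mean.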

### Behavioral Assessment of Child–Pet Interaction

The child and pet dog were brought to the experimental testing room for behavioral assessment of interaction between the child and pet. Specifically, this assessment measured the proportion of time the child and dog spent interacting while the child was sitting quietly in a room (4.5 m by 3 m) that contained a chair, desk, and lamp for 10 min. This task was based on components of past sociability tests (e.g., Barrera et al., 2010; Jakovcevic et al., 2012), but was simplified such that the child could implement the protocol with brief instruction. The child sat in a chair in the center of a 1 m radius semi-circle marked with tape on the floor. The dog handler brought the dog to the opposite end of the room, where the dog was able to greet a second observer for approximately 1 min before beginning the task. The child was asked to direct attention to the dog, and call the dog over twice while remaining seated, once at the beginning of the 10 min session and again after the first 5 min. The child was asked to otherwise remain neutral unless the dog entered the semi-circle. If the dog entered the semi-circle, the child was told to interact with it as if they were at home, to capture natural variability in child–pet interaction. During the assessment, the handler and observer stood along the wall opposite where the child was seated for an unobstructed view of the child–pet interaction.

The handler and the observer, both trained in coding dog behaviors, scored each session live on two dimensions: gazing and petting. Each behavior was scored using partial-interval recording by breaking the 10 min session into 120 epochs of 5 s each. If a target behavior occurred at any point during an epoch, that epoch was scored. The percentage of epochs during which a target behavior occurred was averaged across the two scorers. Gazing was defined as the percentage of intervals in which the dog and child were facing each other. Petting was defined as the percentage of intervals in which the child made contact with the dog with their hand. Interclass correlations between the two coders were 0.85 for gazing and 0.99 for petting.
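The partial-interval scoring rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coders in the study scored sessions live, whereas the sketch represents each coder's record as hypothetical `(start, end)` bouts in seconds, and the function names are invented for the example.

```python
import math

EPOCH_SECONDS = 5
SESSION_SECONDS = 10 * 60
N_EPOCHS = SESSION_SECONDS // EPOCH_SECONDS  # 120 epochs


def percent_epochs_scored(bouts):
    """Partial-interval recording for one coder.

    bouts is a list of (start, end) times in seconds, treated as
    half-open intervals. An epoch is scored if any part of a bout
    falls inside it, regardless of how long the behavior lasted.
    """
    scored = set()
    for start, end in bouts:
        first = int(start // EPOCH_SECONDS)
        stop = math.ceil(end / EPOCH_SECONDS)  # exclusive upper epoch index
        scored.update(range(max(first, 0), min(stop, N_EPOCHS)))
    return 100.0 * len(scored) / N_EPOCHS


def average_across_coders(bouts_a, bouts_b):
    """Mean of the two coders' percentages, as described in the text."""
    return (percent_epochs_scored(bouts_a) + percent_epochs_scored(bouts_b)) / 2


# Hypothetical petting records for the two coders (not study data).
handler_bouts = [(0, 12), (300, 330)]   # touches 3 + 6 = 9 epochs
observer_bouts = [(0, 12), (300, 320)]  # touches 3 + 4 = 7 epochs
petting_percent = average_across_coders(handler_bouts, observer_bouts)
```

Note that a 12 s bout scores three epochs here, which is why partial-interval percentages tend to run higher than the true proportion of time spent in the behavior.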

#### Genotype Assessment


Children were asked to provide a 4 mL saliva sample by passive drool into an Oragene-DNA (OGR-500) saliva collection tube (DNA Genotek, Kanata, ON, Canada), which was stored at room temperature until the DNA extraction step. DNA extraction was performed using our lab's standardized protocol. Briefly, 750 µl of the content from the OGR-500 tube was incubated at 50 °C in a GeneMate dry bath (Bioexpress, Kaysville, UT, United States) for 2 h, followed by incubation on ice for 10 min and centrifugation at 21,100 × g for 10 min. The DNA was precipitated by transferring the supernatant to a tube containing 750 µl of ethanol, mixing the content gently, incubating at room temperature for 10 min, and centrifuging at 16,000 × g for 3 min. The DNA pellet was washed using 70% ethanol, dried at room temperature, dissolved in 100 µl TE buffer and stored at −80 °C. The DNA quality and quantity were measured using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, United States). Genotyping was performed using TaqMan Genotyping Master Mix (Applied Biosystems, Foster City, CA, United States) and a TaqMan SNP Genotyping assay (Applied Biosystems, Foster City, CA, United States) for OXTR (rs53576), with the StepOnePlus real time PCR system (Applied Biosystems, Frederick, MD, United States), according to the manufacturer's instructions. PCR was performed using a 10 µl reaction mix with 1.5 ng DNA and the following cycling conditions: 60 °C for 30 s, 95 °C for 10 min, 40 cycles of 95 °C for 15 s and 60 °C for 1 min, and 60 °C for 30 s. Allelic discrimination was performed using StepOne v.2.1 (Applied Biosystems, Frederick, MD, United States).

### Statistical Analysis

Statistical analyses were conducted in R version 3.3.2 (R Core Team, 2016). A hierarchical regression framework with backward selection was utilized to test for significant predictors of two dimensions of child–pet interaction, with predictors of petting and gazing tested in separate models. Tested predictors included demographics, relationship qualities, and child genotype.
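To make the idea of AIC-guided backward selection concrete, the sketch below implements it in plain Python for a small least-squares problem. It is not the authors' analysis code (which was run in R and included a categorical breed predictor); the helper names `ols_rss`, `aic`, and `backward_select`, the toy data, and the use of the common least-squares form AIC = n ln(RSS/n) + 2k (k counting the intercept and slopes) are all assumptions made for illustration.

```python
import math


def ols_rss(rows, y):
    """Residual sum of squares for least squares of y on the columns of rows.

    Solves the normal equations by Gauss-Jordan elimination with partial
    pivoting; adequate for small, non-collinear toy problems only.
    """
    p = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(p):
            if k != col:
                f = a[k][col] / a[col][col]
                a[k] = [u - f * v for u, v in zip(a[k], a[col])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def aic(data, y, predictors):
    """n * ln(RSS / n) + 2k for a linear model with an intercept."""
    n = len(y)
    rows = [[1.0] + [data[name][i] for name in predictors] for i in range(n)]
    return n * math.log(ols_rss(rows, y) / n) + 2 * (len(predictors) + 1)


def backward_select(data, y, predictors):
    """Drop one predictor at a time for as long as doing so lowers AIC."""
    current = list(predictors)
    best = aic(data, y, current)
    while current:
        candidates = [(aic(data, y, [p for p in current if p != d]), d)
                      for d in current]
        candidate_aic, drop = min(candidates)
        if candidate_aic >= best:
            break  # no single removal improves the fit criterion
        best = candidate_aic
        current.remove(drop)
    return current, best


# Toy data (not study data): y depends on x1 plus noise; x2 is irrelevant.
toy = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [1, 1, -1, -1, -1, -1, 1, 1],
}
noise = [1, -1, -1, 1, 1, -1, -1, 1]
y = [2 * x + e for x, e in zip(toy["x1"], noise)]
selected, final_aic = backward_select(toy, y, ["x1", "x2"])
```

In this toy case the irrelevant predictor is removed because dropping it leaves the residual sum of squares unchanged while saving the 2-point penalty for an extra parameter.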

### RESULTS

### Evaluating the NRI to Assess the Child–Dog Relationship

We first calculated the mean scores for each subscale for the child–dog relationship. We then conducted Principal Component Analysis on the scaled scores of each subscale. Two components explained 71% of the variance (55 and 16% respectively). The loadings of each component are shown in **Table 1**. Inspection of the loadings suggests a two-factor solution identical to Support and Negative Interactions used for computing summary scores for the child–mother relationship. This suggested that these two summary scores can be computed similarly for the child–dog relationship.

### Descriptives

Descriptive statistics of child-reported relationship qualities and behavioral observation of child–pet interaction are shown in **Table 2**. Genotype assessment of rs53576 yielded the following genotypes: 41% GG, 53% AG, and 6% AA. These proportions are comparable to other U.S.-based studies (for a review, see Luo and Han, 2014). As is common in analysis of rs53576, the AG and AA genotypes were combined into one genotype group, termed A-carriers, yielding two genotype groups for analysis: GG homozygotes (41%) and A-carriers (59%).
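The collapsing of three genotypes into two analysis groups can be expressed as a small recoding step. The Python sketch below is illustrative only: the function names and the ten example genotype calls are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def to_carrier_group(genotype):
    """Collapse an rs53576 genotype into 'GG' or 'A-carrier' (AG or AA)."""
    alleles = "".join(sorted(genotype.upper()))  # treat "GA" the same as "AG"
    if alleles not in {"AA", "AG", "GG"}:
        raise ValueError(f"unexpected genotype: {genotype!r}")
    return "GG" if alleles == "GG" else "A-carrier"


def group_percentages(genotypes):
    """Rounded percentage of samples falling in each analysis group."""
    counts = Counter(to_carrier_group(g) for g in genotypes)
    total = sum(counts.values())
    return {group: round(100 * n / total) for group, n in counts.items()}


# Hypothetical genotype calls (not study data).
calls = ["GG"] * 4 + ["AG"] * 5 + ["AA"] * 1
percentages = group_percentages(calls)
```

Grouping in this way trades allele-dose information for statistical power, which matters when the AA genotype is as rare as reported here.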

### Analysis of Petting

Predictors of child–dog petting during the child–pet interaction task were determined via regression analyses using backward selection to obtain a reduced model with the strongest predictors. An initial model included all predictors (demographic covariates, child self-reported relationship qualities, and child genotype). The initial model suggested that petting was associated with child age [F(1,84) = 4.18, p = 0.044], OXTR rs53576 genotype [F(1,84) = 6.15, p = 0.015], and Negative Interactions with the dog [F(1,84) = 6.31, p = 0.014]. Specifically, more petting was observed with older child age, OXTR rs53576 A-carrier status, and lower child-reported Negative Interactions with the pet dog (see **Figure 1**). Petting was not associated with child sex [F(1,84) = 0.37, p = 0.54], dog breed group [F(5,84) = 1.47, p = 0.21], child-reported Support from dog [F(1,84) = 1.09, p = 0.300], or either of the two child-reported measures of relationship quality with the mother [F(1,84) = 0.53–1.21, p = 0.27–0.47]. To obtain the most parsimonious model for our dataset, the model was subjected to backward selection using the Step routine, which identifies the most parsimonious model based on Akaike's information criterion (AIC; see **Table 3** for AIC values at each regression step). All three significant predictors – child age, child genotype, and the Negative Interactions dimension of the child-reported relationship measure – were retained in the final model, as shown in **Table 4**.

TABLE 1 | Principal component analysis (PCA) loadings of dog NRI subscales.

Loading absolute values >0.4 are in bold.

TABLE 2 | Descriptive statistics for child self-report relationship qualities and behavioral observation.

FIGURE 1 | The proportion of time spent petting during a child–dog interaction task was independently predicted by child age, child genotype, and children's self-reported relationship with the pet dog. (A) Higher levels of child-reported negative relationships with the pet dog were associated with less petting. The relationship effect is shown averaged across child genotype and illustrates the overall main effect for the child–pet relationship. The regression line shows the final model prediction, with the shaded area indicating standard error of the mean. (B) Older child age is associated with more time spent petting. Lines show reduced model prediction for each genotype and shading indicates standard error of the mean. (C) Children who are A-carriers at the OXTR rs53576 genetic polymorphism, compared to GG homozygotes, engaged in more petting during the child–pet interaction.

TABLE 3 | Akaike's information criterion (AIC) for each step in the regression predicting petting using backward regression.

Values are shown for each variable successively dropped from the model to achieve the best fitting results.

### Analysis of Gazing

A comparable regression model was created for gazing during the child–pet interaction. The initial model including all predictors revealed that none of the demographic, genotype, or relationship quality variables was significantly associated with gazing (F's = 0.01–2.14, p's > 0.15). The backward selection procedure was implemented to yield the most parsimonious model. The final model of gazing during the child–pet interaction task included only child age as a predictor; however, the association was not statistically significant (F = 2.28, p = 0.13; see **Table 5**).

TABLE 4 | Final regression model of significant predictors of child–dog petting during behavioral observation.

### DISCUSSION

The present study was the first to test whether the OXTR genotype and children's perceived relationships with their pet dogs are related to HAI, specifically, petting and gazing. The research design simulated a common, naturally occurring HAI, in which human owners call over their pets, within the context of a controlled laboratory experiment with minimal distractions. On average across children, the total time spent petting was approximately 50% of the 10 min interaction period. The results showed that variation at the OXTR polymorphism rs53576 was associated with the proportion of time spent petting during child–pet interactions. Specifically, A-carriers engaged in more petting than children with the GG genotype. This observation is noteworthy given that OXTR rs53576 has previously been suggested as a genetic locus associated with sensitivity to the social environment. Prior research with typically developing children has demonstrated that A-carrier youth are less responsive to parental support (Smearman et al., 2016) and to social consequences of peer relational aggression (Kushner et al., 2017), and show lower levels of interpersonal empathy (Rodrigues et al., 2009; Smith et al., 2014), trust (Krueger et al., 2012), and self-esteem (Saphire-Bernstein et al., 2011). This may be relevant to the growing trend of incorporating HAI into behavioral therapy with children for whom human social interactions are challenging (Nimer and Lundahl, 2007; Silva et al., 2018). Although little empirical research has been conducted in this area, there is preliminary evidence that dogs may be preferred social partners for such children. Children with autism, a neurodevelopmental disorder in which social deficits are common, prefer to interact with a dog over another person or toy (Prothmann et al., 2009).
Children with anxiety disorders tend to spend long durations interacting with a pet dog but tend to engage in fewer interactions with another person compared to children with other behavioral health problems (Prothmann et al., 2005). Although highly speculative, our results contribute to emerging evidence that pet dogs may be an important source of social interaction for children that have difficulty in other social environments.

The results of this study also demonstrated that children's self-reported negative interactions in the context of their relationship with their pet were related to the proportion of time spent petting the dog. Specifically, children reporting higher levels of antagonism, reflecting reports that they and their dog hassle each other, annoy each other, and "get on each others' nerves," spent less time engaged in petting. Children's perceptions of support, reflecting items tapping aspects of affection, companionship, and other positive features, were not associated with the proportion of time spent petting. Of note, this was the first study that simultaneously assessed both positive and negative components of children's relationships with their pet dog. Psychometric data from the principal components analysis demonstrated that children's responses about positive and negative relationship qualities were distinct measurable aspects of the child–dog relationship that paralleled the relationship dimensions measured for the child–parent relationship. The observation that negative interactions, and not support, were associated with petting speaks to the need to incorporate both positive and negative aspects of relationships in HAI research.

We also observed that older child age was associated with more time spent petting. This observation was of interest in light of the broad consensus in the developmental literature that, beginning in middle childhood, children spend proportionally less of their social time with parents and more time with other social partners (Lam et al., 2012). Research with 7–10 year old children has shown that with age, children broaden their network of social support figures, including pets (Bryant, 1985). With age, intimate disclosure to parents also declines, whereas it rises with other social partners such as peers (Buhrmester and Furman, 1987). Although we did not assess peer relationships in this study, the age effect observed is consistent with the notion that non-parental sources of social interaction and support gain in importance during the course of middle childhood, and highlights the role that pets may play in this important developmental transition.

Values are shown for each variable successively dropped from the model.

There was no evidence in this study that either genotype or relationship quality was associated with gazing. There are at least two possible explanations for this finding. First, the literature on owner-dog gazing has to date been restricted to research with adults, and there may be unknown differences in children's interactions with their pet dogs compared to adult owners. In the absence of any studies comparing adult and child owners' interactions with pets, this possibility cannot be ruled out. Second, in contrast to some studies with adults (e.g., Nagasawa et al., 2015), we did not attempt to manipulate owner-dog gazing, but rather quantified the degree to which such behavior naturally occurred in the context of the child soliciting interaction with the dog. It may be that the amount of naturally occurring gazing (approximately 20% of the total interaction time) was too low in our behavioral paradigm to detect an association with children's individual differences.

The present results should be considered in light of several limitations. First, participants were primarily from non-Hispanic White families and thus the generalizability to a more diverse population warrants further study. Second, the participants in this study were typically developing children prescreened for known health conditions. Whether these findings generalize to clinical populations is unknown; however, the present results may serve as foundational research for application to clinical populations. Third, we did not genotype the pet dogs for variation at the OXTR gene. There is some evidence to suggest that dogs' human-directed behavior is associated with genetic polymorphisms at OXTR (Kubinyi et al., 2017; Oláh et al., 2017) or OXTR methylation (Cimarelli et al., 2017). Genotyping OXTR in both children and the pet dogs may reveal more nuanced associations of OXTR genotype within the context of HAIs. Fourth, this study focused on families who already owned a pet dog. This was intentional to avoid the inherent challenge of interpreting child–dog interaction among a mixed group of dog owners vs. non-owners. Finally, the study was conducted in a research laboratory. It is possible that child–dog interaction in a laboratory environment may not be the same as in more naturalistic environments. This limitation is offset, however, by the benefits of the tightly controlled context of a laboratory, with standardized environmental and testing conditions for maximizing the validity of the observed results and reducing distractions and confounding variables. Moreover, the direct behavioral observation of child–pet interaction lends higher confidence in the validity of the observed empirical associations compared to descriptive or self-report studies.

This study adds to the growing body of literature on HAI by documenting two key factors that predict natural variation in children's interactions with their pet dogs. This knowledge is critical as the field as a whole strives to maximize the potential therapeutic benefits of HAI for clinical populations. A greater understanding of the individual differences that influence children's interactions with familiar animals will also aid in the broader research goal of determining the potential benefits and challenges of pet ownership for children.

### AUTHOR CONTRIBUTIONS

DK developed the study, oversaw its execution, analyzed the data, and wrote the manuscript. NH collected the data, conducted the behavioral coding, and analyzed the data. SB conducted the genotyping and analyzed the data. All authors contributed to the writing of the manuscript and approved the final manuscript.

### FUNDING

This research was supported by the National Institutes of Health grant HD071288 to DK. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funders.

### ACKNOWLEDGMENTS

We would like to thank Clive Wynne, Jingwen Liu, and Natalie Hadad for their contributions to the research from which this study was developed. We gratefully acknowledge the families who participated in this study, without whom this research would not have been possible.

### REFERENCES

Bryant, B. K. (1985). The neighborhood walk: sources of support in middle childhood. Monogr. Soc. Res. Child Dev. 50, 1–122. doi: 10.2307/3333847



variation in the oxytocin receptor gene. Aggress. Behav. 44, 60–68. doi: 10.1002/ab.21724



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Kertes, Hall and Bhatt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Enhanced Selection of Assistance and Explosive Detection Dogs Using Cognitive Measures

Evan L. MacLean<sup>1,2</sup>\* and Brian Hare<sup>3,4</sup>

<sup>1</sup> School of Anthropology, University of Arizona, Tucson, AZ, United States, <sup>2</sup> Department of Psychology, University of Arizona, Tucson, AZ, United States, <sup>3</sup> Evolutionary Anthropology, Duke University, Durham, NC, United States, <sup>4</sup> Center for Cognitive Neuroscience, Duke University, Durham, NC, United States

#### Edited by:

Peggy D. McCardle, Consultant, New Haven, CT, United States

#### Reviewed by:

Mitsuaki Ohta, Tokyo University of Agriculture, Japan

Nathaniel James Hall, Texas Tech University, United States

> \*Correspondence: Evan L. MacLean evanmaclean@email.arizona.edu

#### Specialty section:

This article was submitted to Veterinary Humanities and Social Sciences, a section of the journal Frontiers in Veterinary Science

Received: 18 June 2018; Accepted: 10 September 2018; Published: 04 October 2018

#### Citation:

MacLean EL and Hare B (2018) Enhanced Selection of Assistance and Explosive Detection Dogs Using Cognitive Measures. Front. Vet. Sci. 5:236. doi: 10.3389/fvets.2018.00236

Working dogs play a variety of important roles, ranging from assisting individuals with disabilities to explosive and medical detection work. Despite widespread demand, only a subset of dogs bred and trained for these roles ultimately succeed, creating a need for objective measures that can predict working dog aptitude. Most previous research has focused on temperamental characteristics of successful dogs. However, working dogs also face diverse cognitive challenges both in training and throughout their working lives. We conducted a series of studies investigating the relationships between individual differences in dog cognition and success as an assistance or detection dog. Assistance dogs (N = 164) and detection dogs (N = 222) were tested in the Dog Cognition Test Battery, a 25-item instrument probing diverse aspects of dog cognition. Through exploratory analyses we identified a subset of tasks associated with success in each training program, and developed shorter test batteries including only these measures. We then used predictive modeling in a prospective study with an independent sample of assistance dogs (N = 180), and conducted a replication study with an independent sample of detection dogs (N = 90). In assistance dogs, models using data on individual differences in cognition predicted higher probabilities of success for dogs that ultimately succeeded in the program than for those that did not. For the subset of dogs with predicted probabilities of success in the 4th quartile (highest predicted probability of success), model predictions were 86% accurate, on average. In both the exploratory and prospective studies, successful dogs were more likely to engage in eye contact with a human experimenter when faced with an unsolvable task, or when a joint social activity was disrupted. In detection dogs, we replicated our exploratory findings that the most successful dogs scored higher on measures of sensitivity to human communicative intentions, and two measures of short-term memory. These findings suggest that (1) individual differences in cognition contribute to variance in working dog success, and (2) objective measures of dog cognition can be used to improve the processes through which working dogs are evaluated and selected.

Keywords: cognition, assistance dog, detection dog, canine, behavior

## INTRODUCTION

Working dogs play a wide variety of important roles in human society, performing tasks ranging from assisting people with disabilities, to explosive and medical detection (1). Despite widespread demand, only a subset of dogs bred and trained for these roles are ultimately able to succeed as working dogs (2–4). Attrition from training programs, or failure to succeed after training, have important consequences with respect to public health (e.g., wait lists to receive certified assistance dogs) as well as the financial costs of breeding, training, and placing working dogs (e.g., through investment of resources in dogs that ultimately do not succeed). Therefore, there is an important need for objective measures that can predict whether individual dogs are likely to succeed in diverse types of working dog programs [reviewed in (5, 6)].

To date, most research on predictors of success as a working dog has focused on measures related to temperament and behavior. Studies of temperament have been motivated by the idea that working dogs are often utilized in highly stimulating environments, that these dogs frequently encounter unfamiliar people, other animals, and potentially startling stimuli, and that dogs must be able to remain calm and task-focused in these situations. Similarly, inappropriate behaviors (e.g., excessive barking, scavenging, inappropriate elimination) can cause problems for dog handlers, or compromise a dog's ability to effectively perform his or her role. Studies across the last two decades have developed a wide range of approaches for assessing these characteristics, many of which serve as useful predictors of working dog success (2, 3, 7–17). For example, Wilsson and Sinn (16) found that scores on a principal component relating to a tendency to engage in tug-of-war, chasing, and interest in object retrieval were positively associated with training success in the Swedish armed forces. In assistance dog populations, Duffy and Serpell (3) found that dogs prone to excitability, stranger- and dog-directed aggression, and social and nonsocial fear were less likely to successfully complete training. Lastly, Svobodová et al. (14) found that puppies that were more willing to chase and fetch a ball, and least reactive to noise, were the most likely to pass police dog certification. Therefore, previous studies have identified a range of temperamental and behavioral traits that relate to working dog training outcomes.

However, working dogs also face a variety of cognitive challenges, both in their initial training, and throughout their working lives (2, 18). Therefore, it is possible that individual differences in dog cognition also explain variance in aptitude for working roles (2). The cognitive skills that dogs require in these roles are likely to be diverse, extending beyond the basic learning mechanisms typically emphasized in dog training (e.g., operant and classical conditioning). For example, while animal trainers can shape behavior so that a dog associates the completion of a goal (e.g., retrieve the keys), with a social or food reward, trainers cannot train animals to flexibly and spontaneously respond to barriers that might prevent the completion of a trained goal (e.g., the closest door to the room where the keys are is closed, the keys are on the floor among many objects and are partially occluded by a book). In retrieving the keys, it is cognitive flexibility and not just temperament or trained responses that allows a dog to solve the problem. A dog's cognitive abilities allow her to mentally represent space and infer the need to take a detour (19, 20); to categorize objects as either being keys or not, and to inhibit bringing back incorrect object(s) (21–23); to maintain a mental representation of the referent of the verbal command (keys) in short-term memory even though it is not at first visible (24, 25); and to understand the communicative intention behind a human pointing gesture to infer the location of the occluded keys and finally retrieve them (26, 27).

Relative to studies of temperament, there have been very few investigations of whether individual differences in cognition relate to aptitude as a working dog. Bray et al. (2) recently tested young adult candidate guide dogs with a series of temperament and problem-solving tasks, and found that performance on a multistep problem-solving task was a significant predictor of subsequent success in the guide dog program. However, current work has employed relatively few cognitive measures, and has not explored associations between cognition and working dog outcomes in working roles beyond guide dogs.

The importance of assessing dog cognition broadly is evidenced through work describing the psychometric structure of individual differences in dog cognition. Specifically, individual differences in dogs are best described by multiple factors, reflecting psychological processes such as memory, understanding of communicative intentions, inhibitory control, and social engagement (28, 29). Thus, individual differences in dog cognition vary across multiple cognitive domains, yet we know little about which domains of cognition are most important for working dogs. Moreover, given that different working roles present different sets of job-specific challenges, it is probable that the aspects of cognition associated with working dog success will vary between different working roles. Therefore, the central challenges for this line of research are to (1) measure diverse cognitive processes when assessing links between cognition and working dog performance, and (2) identify the specific links between these aspects of cognition, and success in different working dog roles.

Here, we present a series of studies investigating associations between individual differences in dog cognition and success as an assistance or explosive detection dog. We implemented a similar research approach with both working dog populations. Within each population, we first conducted an exploratory study in which a sample of dogs was tested with a 25 item cognitive test battery, probing diverse aspects of dog cognition. We then identified associations between individual differences on items in this test battery, and measures of success as a working dog. Based on these preliminary findings, we then developed short-format test batteries including the subset of tasks that were most strongly associated with outcomes (specific to each population). Lastly, we implemented predictive models (Experiment 1) or a replication study with an independent sample of dogs (Experiment 2) to validate or confirm the associations between individual differences on the cognitive measures, and measures of success as a working dog.

### GENERAL METHODS (ALL POPULATIONS)

During exploratory studies, large samples from both populations of working dogs were tested in the Dog Cognition Test Battery [DCTB; (28)]. The DCTB consists of 25 problem solving tasks designed to assess skills for reasoning about social and physical problems as well as domain-general cognitive processes. A detailed description of the DCTB, including its factor structure and implementation with the populations described here, is reported by MacLean et al. (28). All tasks in the DCTB are described briefly in **Table 1**, and were conducted by trained experimenters (university students and researchers). Detailed methods for these tasks are provided in the **Supplemental Material**. In all experiments, researchers were blinded to the training outcomes of dogs during testing, and dog trainers were blind to the results of the cognitive tests.

All testing was voluntary, and dogs were free to stop participating at any time. Subjects participated for food and toy rewards, and were not deprived of food or water. All experimental procedures were approved by the Duke University Institutional Animal Care and Use Committee (protocol number: A138-11-06). Inter-rater reliability was assessed for a randomly selected sample of 20% of the data and was excellent for all measures [mean ± SEM: kappa = 0.96; correlation = 0.94 ± 0.02; (28)].

### EXPERIMENT 1

### Exploratory Study

### Methods

#### **Subjects**

Candidate assistance dogs were tested at Canine Companions for Independence (CCI) in Santa Rosa, CA (N = 164; 107 females, 57 males, 19 Labrador retrievers, 4 golden retrievers, 141 Labrador retriever x Golden retriever crosses). CCI raises and places assistance dogs for diverse roles, including service dogs (placed with adults with physical disabilities), hearing dogs (placed with adults who are deaf or hard of hearing), skilled companion and facility dogs (placed with an adult or child with a disability under the guidance of a facilitator, or partnered with a facilitator in a health care, visitation, or education setting). Because hearing dogs are selected for a different behavioral phenotype than the other roles, hearing dogs (N = 21) were excluded from analysis. Dogs who aborted more than 2 tasks in the battery (N = 24) or were released for medical reasons (N = 8), were also excluded from analysis, yielding a final sample of 111 dogs included for exploratory analysis (mean age = 1.98 years, SD = 0.19 years).

#### **Assistance dog outcome measures**

After entering professional training and passing medical clearances, dogs in CCI's program either graduate and are placed in one of the roles described above, or are released for behavioral problems during training (a decision made by professional trainers without input from the researchers or knowledge of performance on cognitive tests). Therefore, success was coded as a categorical variable with two levels (graduate, release). In the sample for the exploratory study, 76 dogs graduated the program and 35 dogs were released. Assistance dogs were tested in the DCTB prior to obtaining an outcome in the program.

#### **Analysis**

For exploratory analysis, we implemented a variety of predictive modeling strategies. The cognitive data were prepared for analysis using a Box-Cox transformation (30) with missing data imputed using a K nearest neighbors approach. Exploratory analysis was conducted using eight different predictive modeling techniques to assess the utility of diverse modeling strategies, as well as consensus across models regarding variable importance. Specifically, we employed the following models from the caret package (31) in the R programming environment (32): (1) generalized linear model [GLM], (2) linear discriminant analysis [LDA], (3) regularized regression [RR], (4) partial least squares [PLS], (5) naïve Bayes classification [NB], (6) multivariate adaptive regression splines [MARS], (7) K nearest neighbors, and (8) random forest.
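The two preprocessing steps named above can be sketched in a few lines. This is a minimal Python illustration, not the authors' R code: the function names, the toy scores, the value of λ (`lam`), and the number of neighbors `k` are all assumptions made for the example. In practice λ would be estimated from the data rather than fixed by hand.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is Euclidean over the columns observed in both rows.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical task scores for four dogs (None marks a missing score).
scores = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, None],
    [1.1, 2.1, 5.0],
    [9.0, 9.0, 9.0],
]
imputed = knn_impute(scores, k=2)
transformed = [[box_cox(v, lam=0.5) for v in row] for row in imputed]
```

Here the missing score is filled from the two most similar dogs before every score is transformed; the order of the two steps in this sketch is itself an assumption.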

Because our aim was to develop a short-format battery using only the cognitive measures most strongly associated with training outcomes, we investigated the relative importance of predictor variables across models. To do so, we extracted the variable importance statistic from each of the training models, which reflects the relative importance of variables in a model (31), with values scaled to a range between 0 (unimportant) and 100 (most important). We also conducted univariate analyses (logistic regression) using each individual cognitive task as a predictor of training outcomes. We ranked the results of these analyses by p-value, and interpreted associations with the smallest p-values as those warranting future investigation. Across these analyses we identified 5 cognitive measures which were implicated as being strongly associated with training outcomes in exploratory analyses (causal reasoning [visual], spatial transpositions, inferential reasoning, cylinder, and social referencing). We further identified an additional 6 measures with more modest associations with training outcomes, or which were important covariates, that warranted further investigation (unsolvable task, odor discrimination, laterality: object manipulation, laterality: first step, arm pointing, and reward preference).
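As a rough illustration of the two screening steps described above, the sketch below rescales raw importance scores to a 0 to 100 range and ranks tasks by p-value. It assumes simple min-max rescaling, which may differ from what the modeling software does internally, and all task values shown are hypothetical placeholders rather than results from the study.

```python
def scale_importance(raw):
    """Min-max rescale raw importance scores so they span 0 to 100."""
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        # All variables equally important: report each as 100.
        return {name: 100.0 for name in raw}
    return {name: 100.0 * (v - lo) / (hi - lo) for name, v in raw.items()}

def rank_by_p_value(p_values):
    """Return task names ordered from smallest to largest p-value."""
    return sorted(p_values, key=p_values.get)

# Hypothetical raw importance scores and univariate p-values.
raw_importance = {"cylinder": 0.2, "arm pointing": 0.6, "social referencing": 1.0}
p_values = {"cylinder": 0.04, "arm pointing": 0.30, "social referencing": 0.01}

scaled = scale_importance(raw_importance)
ranking = rank_by_p_value(p_values)
```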

We then used data from these 11 measures to develop statistical models predicting training outcomes. Models were trained and evaluated using 4-fold cross validation, repeated 100 times (data randomly divided into 4 folds, 3 folds used for model construction, 1 fold used to assess model accuracy, with this process repeated 100 times). To assess model performance, we used the cross-validated accuracy and area under the curve (AUC) from the receiver operating characteristic (ROC), a measure of sensitivity and specificity for a binary classifier. AUC values range between 0.5 and 1, with a value of 0.5 indicating a non-informative model, and a value of 1 indicating a perfectly predictive model.
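The resampling scheme and the AUC metric can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the authors' implementation: the function names are hypothetical, the folds are not stratified by outcome, and AUC is computed with the rank-based formulation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half).

```python
import random

def repeated_kfold(n, k=4, repeats=100, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold cross validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        # Deal shuffled indices into k folds of near-equal size.
        folds = [order[i::k] for i in range(k)]
        for held_out in range(k):
            test = folds[held_out]
            train = [idx for f, fold in enumerate(folds)
                     if f != held_out for idx in fold]
            yield train, test

def auc(labels, scores):
    """Rank-based AUC for binary labels (1 = positive, 0 = negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case from each class")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# 111 dogs, 4 folds, 100 repeats gives 400 train/test splits.
splits = list(repeated_kfold(111, k=4, repeats=100, seed=1))
```

A model would be fit on each `train` set and scored on the matching `test` set, with the per-split accuracies and AUCs averaged to give the cross-validated estimates.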

#### TABLE 1 | Brief descriptions of measures included in the Dog Cognition Test Battery (DCTB).

Detailed methods for all tasks are provided in MacLean et al. (28) and the Supplemental Materials.

Categorical predictions (graduate, release) were made using a probability threshold of 0.5 (i.e., predict release when predicted probability of graduation <0.5; predict graduate when predicted probability of graduation >0.5) but we retained predicted probabilities for additional analyses. Specifically, to assess whether accuracy was higher for observations with the highest and lowest predicted probability of success, we calculated cross-validated model accuracies for dogs with predicted probabilities of success in the 1st and 4th quartiles (at each iteration of the cross validation procedure).
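The thresholding and quartile-restricted accuracy can be illustrated with a short Python sketch. The function names and the toy probabilities are hypothetical, a probability of exactly 0.5 is assigned to "release" here (the text does not specify that case), and the top quartile is approximated by taking the highest-ranked 25% of cases by count.

```python
def classify(prob_graduate, threshold=0.5):
    """Convert a predicted probability of graduation into a categorical call."""
    return "graduate" if prob_graduate > threshold else "release"

def accuracy(probs, outcomes, threshold=0.5):
    """Proportion of cases whose thresholded prediction matches the outcome."""
    correct = sum(classify(p, threshold) == o for p, o in zip(probs, outcomes))
    return correct / len(probs)

def top_quartile_accuracy(probs, outcomes, threshold=0.5):
    """Accuracy restricted to the 25% of cases with the highest probabilities."""
    ranked = sorted(zip(probs, outcomes), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, len(ranked) // 4)
    top = ranked[:n_top]
    return accuracy([p for p, _ in top], [o for _, o in top], threshold)

# Hypothetical predictions for eight dogs.
probs = [0.95, 0.9, 0.7, 0.6, 0.55, 0.45, 0.3, 0.2]
outcomes = ["graduate", "graduate", "release", "graduate",
            "release", "graduate", "release", "release"]

overall = accuracy(probs, outcomes)
top = top_quartile_accuracy(probs, outcomes)
```

In this toy example the overall accuracy is lower than the accuracy within the top quartile, which is the kind of contrast the quartile analysis is designed to detect.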

### RESULTS AND DISCUSSION

The results from predictive models using the 11 candidate cognitive measures in the exploratory study are shown in **Table 2**. The best performing models (partial least squares, k nearest neighbors [n = 7]) yielded cross-validated AUCs of 0.76, and an overall cross-validated accuracy of 72%. However, all models tended to be much more accurate for dogs predicted to have the highest probabilities of graduating (**Table 2**). Specifically, for dogs in the 4th quartile of predicted probability of success (i.e., the 25% of dogs with the highest predicted probability of success), predictions were 87% accurate during cross validation. In contrast, for dogs in the 1st quartile of predicted probability of success, predictions tended to be less accurate (mean = 50%). Thus, we expected that future predictions would be most reliable for dogs with the highest predicted probabilities of success.

#### TABLE 2 | Model statistics from the exploratory phase of Experiment 1.

Accuracy (training data) and AUC (training data) reflect model performance with the training dataset. Metrics denoted as (CV) derive from cross validation with the training dataset (4-fold, 100 repeats). Columns represent the 8 different modeling strategies employed in this study. AUC, Area under the receiver operating characteristic curve; LDA, linear discriminant analysis; GLM, generalized linear model; RR, regularized regression; PLS, partial least squares; NB, naïve Bayes; MARS, multivariate adaptive regression splines; KNN, K nearest neighbors; RF, random forest.

On a descriptive level, the largest differences between graduate and release dogs in the exploratory study were for the following tasks: spatial transpositions, odor discrimination, causal reasoning (visual), unsolvable task (look at experimenter), inferential reasoning, and social referencing. Specifically, graduate dogs scored higher than release dogs on the odor discrimination and inferential reasoning tasks, and made more eye contact with the experimenter during the social referencing and unsolvable tasks. However, release dogs scored higher than graduate dogs on the spatial transpositions task (one-cross condition) and the causal reasoning (visual) task. Therefore, there was no clear pattern of graduates or releases systematically scoring higher across diverse cognitive measures.

### Prediction Study

Following the exploratory study, we designed a shorter test battery consisting of the 11 tasks determined to be potentially promising measures during initial predictive modeling. Because some of these tasks included relatively few trials in their initial format, we added additional trials to assess whether more data from these measures would improve predictive power. Specifically, we implemented changes in the number of trials as follows: causal reasoning [visual]: 4 trials → 8 trials; spatial transpositions [one-cross condition]: 4 trials → 8 trials; inferential reasoning: 6 trials → 10 trials; odor discrimination: 6 trials → 10 trials. The revised test battery consisted of two test sessions (conducted on two consecutive days).

### Subjects

We tested an independent sample of 180 dogs (i.e., none had participated in the exploratory study) in the revised assistance dog battery (115 females, 65 males, 43 Labrador retrievers, 4 golden retrievers, 133 Labrador retriever X golden retriever crosses). Of these, 33 dogs were excluded from analysis because they were transitioned into the hearing dog program (N = 19), released for medical reasons (N = 5), placed in a new program not represented in the training data (N = 7), or were still in training at the time of the analysis (N = 2). Twenty-six additional dogs were excluded from analysis due to missing data on more than two of the cognitive predictor variables. Therefore, our final sample for predictive modeling included 121 dogs (mean age = 1.88 years, SD = 0.23 years).

### Procedure

Testing procedures were identical to those in the exploratory study with the exception that the revised battery included a smaller number of tasks, as well as additional test trials for some tasks, as described above. The order of tasks in the revised test battery for assistance dogs was: Day 1: warmups > causal reasoning (visual) > spatial transpositions > inferential reasoning > cylinder > mutual gaze; Day 2: warmups > unsolvable task > odor discrimination > laterality (object manipulation) > arm pointing > laterality (first step) > reward preference. Inter-rater reliability was assessed for ∼40% of trials in the prediction study (Cohen's κ for discrete measures, Pearson correlation for continuous measures), and was excellent across measures (mean Cohen's κ = 0.98; mean Pearson's R: 0.96).

### Analysis

To assess predictive validity, we used the predictive models in the exploratory study to predict training outcomes for dogs in the independent sample tested on the short-format battery. As in the exploratory study, we assessed model performance via accuracy and area under the curve (AUC) from the receiver operating characteristic (ROC). To assess the effect of including additional test trials in the short-format battery, we initially ran all predictive models both including and excluding data from these additional trials. Model performance was better using data including the additional trials, and we report these analyses below. Based on the results of the exploratory study, we expected that models would be most accurate for the subset of dogs with the highest predicted probability of success. To evaluate this prediction, we calculated accuracy separately for dogs with predicted probabilities of success in the 4th quartile of predicted probabilities.

### RESULTS AND DISCUSSION

At a descriptive level, all but one model (naïve Bayes classifier) predicted higher average probabilities of success for dogs that ultimately did graduate from the program than for dogs that were released from training. However, one-tailed t-tests indicated that only the random forest model yielded predicted probabilities of success that were significantly higher for graduate than release dogs (**Table 3**). The best performing models (generalized linear model, random forest, and regularized regression) yielded AUCs of 0.60–0.61 (accuracy range: 68–74%; **Table 4**). Thus, overall model performance was considerably poorer than expected based on initial cross-validation with the training data set. However, the distribution of training outcomes varied considerably between the exploratory and prediction datasets, an issue that can seriously affect model performance (33). Specifically, in the exploratory dataset, 68% of dogs graduated from the program whereas considerably more dogs did so in the independent sample (77%). In addition, as expected based on the exploratory study, predictions were much more accurate for dogs in the 4th quartile of predicted probabilities of success. On average (across models), outcome predictions for dogs with predicted probabilities of success in the 4th quartile were 86% accurate (**Table 4**). Two models (linear discriminant analysis and random forest) yielded predictions that were 90% accurate for this subset of observations (**Table 4**).

As a further test of the ability to discriminate between dogs most and least likely to succeed, we used one-tailed t-tests to compare the predicted probability of success for graduate and release dogs, restricting our analyses to dogs with predicted probabilities of success in the 1st and 4th quartiles (calculated separately for each model). In these analyses, 5 of 8 models produced significantly higher predicted probabilities of success for graduate compared to release dogs (**Table 3**; **Figure 1**). At a descriptive level, some differences in mean performance between graduate and release dogs were consistent between the exploratory and predictive studies, whereas others were not. Consistent with findings from the exploratory study, in the independent sample, graduate dogs again tended to make more eye contact with the experimenter in the unsolvable and social referencing tasks, and tended to score higher on the inferential reasoning task. However, in contrast to the exploratory study, graduate dogs scored higher on the spatial transpositions task, and scored lower on the odor discrimination task.

#### TABLE 3 | Results from t-tests comparing the predicted probability of success for dogs that were ultimately successful (graduates) or unsuccessful (releases) in the assistance dog training program.

First vs. fourth quartiles reflects this comparison restricting the data to dogs with predicted probabilities of success in the 1st and 4th quartiles. All tests were one-tailed to evaluate the directional hypothesis that predicted probability of success would be higher for graduate than release dogs.

#### TABLE 4 | Model statistics from the prediction study of Experiment 1.

AUC, Area under the receiver operating characteristic curve; LDA, linear discriminant analysis; GLM, generalized linear model; RR, regularized regression; PLS, partial least squares; NB, naïve Bayes; MARS, multivariate adaptive regression splines; KNN, K nearest neighbors; RF, random forest.

Overall, we were able to produce useful predictions regarding training outcomes with an independent sample, although relative to initial cross-validations, predictions for the independent sample tended to be less accurate. One important finding from this study was that predictions were much more accurate for the subset of dogs predicted to have the highest probability of success (with the strongest models performing at 90% accuracy in these cases). Therefore, from an applied perspective, we expect that it will be challenging to produce accurate predictions for all candidate assistance dogs, but that these measures and models may be particularly valuable for identifying the subset of dogs with the most potential for success. Given that these cognitive measures (1) can be collected in <2 h per dog, (2) do not require any training of dog participants, and (3) do not require specialized or costly equipment, these types of measures will provide a useful addition to the existing screening mechanisms employed by assistance dog agencies.

### EXPERIMENT 2

### Exploratory Study

#### Methods

#### **Subjects**

Detection dogs were tested at K2 Solutions Inc. in Pinehurst, North Carolina. All detection dogs were Labrador retrievers (N = 222, 131 male, 91 female, mean age = 3.96 ± 1.66 years). Two hundred and eight dogs completed the DCTB, and partial data were available for an additional 14 dogs.

#### **Detection dog performance measures**

Unlike the assistance dog organization, the detection dog provider did not employ a definitive metric to define success in the program. Therefore, we worked with the detection dog provider to assess diverse training and performance-related records which could be incorporated as proxies for success as a detection dog. These records included weekly training log entries, survey reports from trainers and individuals who had overseen a dog during deployment, standardized post-deployment evaluations, and dog status in the program. For several of these sources, we compiled information about 7 specific subcategories of dog performance, focusing on traits that program staff noted as important for detection work. These subcategories included the following: (1) Handling: ability to respond to directional signals when working off leash; (2) Temperament: nervous or fearful responses to loud noises or unfamiliar people and physical environments; (3) Motivation: eagerness to execute searches and follow verbal and gestural commands; (4) Handler dependence: overreliance on cues from the handler and limited ability to work independently; (5) Odor recognition: consistent detection of trained target odors; (6) Odor response: appropriateness of behavioral response upon detection of a target odor; (7) False responses: tendency to indicate the presence of an odor when the odor was not present at that location. Below, we describe all data sources on dog performance and associated scoring protocols.

#### **Training logs**

Weekly training logs provided prose descriptions of dog behavior during training exercises (written by a dog trainer). Records primarily focused on aspects of performance needing additional attention (e.g., weaknesses rather than strengths). For each training session in the weekly log, researchers documented the occurrence of notes about weaknesses in the 7 subcategories described above. If no deficiencies were noted, the dog received a score of 0 for that category for a given week. If deficiencies were noted, they were assigned a prevalence score of 1–3, denoting the following categories: 1 = rare and extremely minor weaknesses; 2 = multiple, but sporadic weaknesses; 3 = consistent patterns of deficiency. Most weekly logs contained notes about 3 or more days of training during the week. Logs in which 2 or fewer days of training were reported were excluded from analysis due to limited information for these periods. For each dog, we calculated a ratio of total scores in each behavioral category to the number of weekly training logs that were scored. Thus, higher ratios reflected more behavioral problems in training, controlling for the number of records available for analysis. Training log scoring was performed by two coders, with 20% of the sample coded by both individuals to assess inter-rater reliability (correlation). Reliability was excellent for all measures (handling: R = 0.89; temperament: R = 0.93; motivation: R = 0.98; handler dependence: R = 0.96; odor recognition: R = 0.94; odor response: R = 0.97; false response: R = 0.99). Training log data for 162 dogs were available for analysis, with an average of 33 weeks of data per dog (SEM = 1.33 weeks).
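The training log ratio described above amounts to a small calculation per dog. The Python sketch below shows one way it could be computed; the record layout (`days_trained`, `scores`), the function name, and the example logs are hypothetical, chosen only to mirror the scoring rules in the text.

```python
def deficiency_ratios(weekly_logs, categories, min_days=3):
    """Sum of weekly prevalence scores per category divided by logs scored.

    Logs reporting fewer than min_days of training are excluded, matching
    the rule that logs with 2 or fewer training days were not analyzed.
    """
    usable = [log for log in weekly_logs if log["days_trained"] >= min_days]
    if not usable:
        return None  # No scorable logs for this dog.
    return {
        category: sum(log["scores"].get(category, 0) for log in usable) / len(usable)
        for category in categories
    }

# Hypothetical logs for one dog; scores use the 0-3 prevalence scale.
logs = [
    {"days_trained": 4, "scores": {"handling": 2, "motivation": 0}},
    {"days_trained": 2, "scores": {"handling": 3}},  # excluded: too few days
    {"days_trained": 5, "scores": {"handling": 1, "motivation": 3}},
]
ratios = deficiency_ratios(logs, ["handling", "motivation"])
```

Because the denominator is the number of scored logs, dogs with long and short training histories end up on a comparable scale.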

#### **Trainer surveys**

For a subset of dogs in our sample (N = 34) we were able to administer quantitative surveys to the dog's primary trainer. Respondents rated dogs on a 3-point scale (above average, average, below average) relative to other dogs in the training program, with respect to each of the 7 behavioral subcategories described above.

#### **Performance while deployed**

For dogs that had previously deployed, we distributed a quantitative survey (identical to that used with trainers) to the individuals who were responsible for overseeing the dog during deployment. We obtained completed survey data for 62 dogs.

#### **Post-deployment evaluation**

Within 3 weeks of return from deployment, the provider performs a behavioral evaluation assessing temperament, detection and search abilities, obedience, and motivation. Each item on the evaluation is scored (by the provider) on a pass/fail basis. Within each category, we calculated the percent of passed items as the dependent measure. Evaluators also provided free-form comments on the dog's behavior at the time of the evaluation. Using these notes, coders assessed the presence/absence of deficiencies in the 7 behavioral subcategories described above. Data were available for 132 dogs. Coding of free-form comments was performed by two coders with 30% of the sample coded by both individuals to assess inter-rater reliability (Cohen's κ). Reliability was excellent for all measures (handling: κ = 1; temperament: κ = 0.88; motivation: κ = 0.84; handler dependence: κ = 1; odor recognition: κ = 0.93; odor response: κ = 0.78; false response: κ = 0.94).
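For readers unfamiliar with the two quantities used here, the sketch below computes the percent of passed items and Cohen's κ for two coders' presence/absence judgments. It is a generic Python illustration with hypothetical function names and made-up codes, not the scoring software used in the study.

```python
from collections import Counter

def percent_passed(items):
    """Percent of pass/fail items scored as passed (True)."""
    return 100.0 * sum(bool(x) for x in items) / len(items)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Raters must code the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both raters used a single identical category throughout.
    return (observed - expected) / (1.0 - expected)

# Hypothetical presence (1) / absence (0) codes from two coders for 10 dogs.
coder_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
coder_2 = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
kappa = cohens_kappa(coder_1, coder_2)
```

In this example the coders agree on 9 of 10 cases, but κ is lower than 0.9 because some of that agreement would be expected by chance alone.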

#### **Status in program**

The detection dog program assigned dogs a "status" relating to their fitness for future detection work. At the broadest level, dogs were considered serviceable if they were reserved for future use in the program, and unserviceable if they were being released from the program. Excluding dogs being released for non-behavioral reasons (e.g., medical problems), we used program status as a proxy measure for identifying the least and most successful dogs. Status records were available for 83 dogs that were identified as serviceable, or unserviceable due to behavioral reasons.

### **Analysis**

Because detection dog performance could not be summarized using any single measure (the program did not use a definitive outcome), and data availability varied across measures, it was not feasible to build formal predictive models as in Experiment 1. Therefore, for the purpose of exploratory analysis, we conducted univariate analyses assessing associations between each performance measure described above, and individual item scores on the DCTB.

Scores on all outcome measures were discretized into two (training survey, post-deployment evaluation, program status, performance while deployed) or three (training log) quantile categories corresponding to dogs with below and above average scores on each measure (or below average, average, and above average for the measure discretized into 3 categories). For each performance measure, we conducted a t-test (2 category outcomes) or ANOVA (3 category outcomes) to test for differences on cognitive measures as a function of the discretized performance measure.

For exploratory analysis, we treated each analysis yielding a p-value <0.05 as a significant association. Each significant association was then annotated to describe the direction of association between the cognitive and performance measure. For t-tests, these associations were either positive (higher scores on the cognitive measure associated with better performance) or negative (higher scores on the cognitive measure associated with worse performance). For the ANOVAs, we included a third category, "neutral" to annotate cases in which the omnibus test was significant, but there was no clear directional association with the performance measure (e.g., dogs in the above and below average categories performed similarly, whereas dogs in the average category deviated).

For aggregation across analyses, we assigned a score of −1 for each "negative" association, a score of 0 for a "neutral" association, and a score of +1 for each "positive" association. For each cognitive measure, we then added these scores (across analyses with the different performance measures) to derive an aggregate measure of the direction and strength of association between the cognitive and performance measures. For example, a cognitive measure that was significantly associated with 6 performance measures, with all 6 of these associations being positive (higher scores on the cognitive measure corresponding to better performance) would receive an aggregate score of 6. In contrast, a cognitive measure that was significantly associated with 6 performance measures, but with three of these associations being positive, and three being negative, would receive an aggregate score of 0 (−3 + 3 = 0). Thus, while we expected many false positives due to the large number of statistical tests, we predicted that the direction of false positive associations should be random. Consequently, we expected that the cognitive measures with the strongest positive or negative aggregate scores (consistent directional associations) would be those with the most robust and meaningful links to detection dog performance.
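The aggregation rule can be written as a short sketch. The function name and label strings are hypothetical; the two calls reproduce the worked examples given in the text.

```python
def aggregate_score(directions):
    """Sum +1 / 0 / -1 codes over the significant associations of one cognitive measure."""
    codes = {"positive": 1, "neutral": 0, "negative": -1}
    return sum(codes[d] for d in directions)


# Six significant associations, all positive.
all_positive = aggregate_score(["positive"] * 6)

# Six significant associations, three positive and three negative.
mixed = aggregate_score(["positive"] * 3 + ["negative"] * 3)
```

Because positive and negative associations cancel, a measure only reaches a large absolute score when its significant associations point consistently in one direction.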

### RESULTS AND DISCUSSION

Aggregate measures describing the association between cognitive tests and detection dog performance measures are shown in **Table 5** and **Figure 2**. On average, there were 3.2 ± 0.4 significant associations with each cognitive measure. The mean aggregate score was 0.4 ± 0.4, which did not differ significantly from 0, the value expected if false positives were equally likely to be positive or negative (one-sample t-test, t<sub>28</sub> = 0.86, p = 0.40). However, the number of significant associations and the directional consistency of these associations varied widely across cognitive measures.

**Figure 2** depicts the aggregate score for each cognitive measure in the test battery. While some tasks (e.g., transparent obstacle) had many significant associations, the direction of these associations was highly variable, yielding an aggregate score near 0. In contrast, five cognitive measures had four or more significant associations, all of which were positive (i.e., higher scores on the cognitive measure linked to better measures of performance as a detection dog), and two additional measures had five significant associations, with 80% of these being positive. One cognitive measure had 6 associations with performance metrics, all of which were negative. Based on these results, and the aim of developing an approximately 1 h short-format test battery, we retained all measures with an absolute aggregate score of ≥ 3 (marker cue, odor discrimination, arm pointing, causal reasoning [visual], working memory, memory—distraction, and unsolvable task). We opted to retain one additional measure which yielded two negative associations with performance metrics (laterality: object manipulation) due to the simplicity and potential utility of this measure.

The cognitive measures yielding consistent directional associations with detection dog performance included measures of sensitivity to human communication, short-term memory, odor discrimination, causal reasoning, and persistence at an unsolvable task. Several of these tasks index processes that are likely to be important for dogs performing off-leash explosive detection. For example, off-leash detection dogs are required to use gestural communication from a human handler when executing search routes, and individual differences in sensitivity to human communication may be an important determinant of success in this aspect of detection work. Similarly, detection dogs rely on short-term memory in a variety of situations ranging from memory for recent commands, to locations recently searched, and odorants (or the strength thereof) recently encountered. Lastly, detection dogs are required to make olfactory discriminations, and individual differences in spontaneous odor discrimination tasks may predict a dog's potential for employing these skills during trained detection work. Therefore, several of the positive associations from the exploratory study can be intuitively interpreted with respect to the requirements of detection work.

One limitation of this study was that because there was no definitive outcome measure in the detection dog population, it was not possible to develop formal predictive models as we did with the assistance dogs. Because the outcomes we recorded were not available for all dogs, and data availability varied widely between measures, it was similarly not possible to develop a unified outcome measure (e.g., through dimension reduction). However, by relying on a diverse set of outcome measures, it is possible that this type of analysis provides a more sensitive measure of working dog performance than a simple pass/fail type of metric.

### REPLICATION STUDY

To assess the replicability of associations from the exploratory study, we tested an independent sample of detection dogs in a short-format assessment consisting of the measures most strongly associated with detection dog performance in the exploratory study.

## Methods

### Subjects

Ninety Labrador retriever dogs (60 male, 30 females) participated in the replication study. All dogs were from the detection dog population described above, and none of them had participated in the initial exploratory study.

#### Procedure

Testing procedures were identical to those in the exploratory study with the exception that a smaller number of tasks were employed, and tasks were implemented in a novel order. Unlike Experiment 1, we did not include additional test trials for any of the measures in this replication study. The order of tasks in the replication study was: warm-ups > arm pointing > marker cue > odor discrimination > working memory > memory—distraction > unsolvable task > causal reasoning (visual) > laterality: object manipulation. We assessed inter-rater reliability (Cohen's κ for discrete measures, Pearson correlation for continuous measures) for ∼20% of all trials, and reliability was excellent across measures (mean Cohen's κ = 0.97; mean Pearson's R = 0.91).

TABLE 5 | Distribution of positive, negative and neutral associations between cognitive measures and metrics of success as a detection dog from the exploratory study in Experiment 2.

Total indicates the total number of significant (p < 0.05) associations between each predictor variable and the outcome measures. Positive associations reflect cases in which higher scores on the cognitive measure were associated with better performance as a detection dog. Negative associations reflect cases where higher scores on the cognitive measure were associated with worse performance as a detection dog. Neutral associations indicate cases in which the test statistic was significant, but there was no clear directional association with the performance measure (e.g., dogs in the above and below average categories performed similarly, whereas dogs in the average category deviated).

#### Performance Measures

As in the exploratory study, we obtained and scored records to be used as a proxy of success as a detection dog. Our primary performance measure was scoring of weekly training logs, as described above (N = 67 dogs). Two coders rated 20% of observations and inter-rater reliability was excellent for all measures (handling: R = 0.99; temperament: R = 1.0; motivation: R = 0.89; handler dependence: R = 0.99, odor recognition: R = 0.91; odor response: R = 0.97; false response: R = 0.99). For dogs in the replication study, the ratio scores (problems per category to weeks of data) were correlated with the number of weeks of data available. To control for this confound, we used linear models predicting the ratio score as a function of weeks of available data, and extracted residuals from these models as an adjusted measure of performance. Prior to analysis, residuals were multiplied by −1 so that higher values corresponded to better performance in the program.
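The residual adjustment can be sketched as follows, assuming an ordinary least-squares fit of the ratio score on weeks of available data. The function name and the numbers are hypothetical and serve only to illustrate the procedure.

```python
from statistics import mean


def adjusted_performance(ratios, weeks):
    """Residuals of an OLS fit ratio ~ weeks, multiplied by -1.

    After the sign flip, higher values mean fewer noted problems than
    expected for a dog with that many weeks of data.
    """
    mx, my = mean(weeks), mean(ratios)
    sxx = sum((x - mx) ** 2 for x in weeks)
    if sxx == 0:
        raise ValueError("weeks must vary across dogs to fit a slope")
    sxy = sum((x - mx) * (y - my) for x, y in zip(weeks, ratios))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [-(y - (intercept + slope * x)) for x, y in zip(weeks, ratios)]


# Hypothetical data for four dogs.
weeks = [10, 20, 30, 40]
ratios = [0.2, 0.5, 0.4, 0.9]
adjusted = adjusted_performance(ratios, weeks)
```

By construction the adjusted scores sum to zero and are uncorrelated with the number of weeks of data, which removes the confound described above.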

For dogs in the replication study we also gained access to additional electronic records which described (trainer perceptions of) weekly performance for each dog using an ordinal scale ("excellent," "good," "fair," "poor"). These electronic records were obtained for 71 dogs, with a mean of 97.4 records per dog (SEM = 8.9). To quantify these ordinal scores, for each dog we (a) calculated the percent of records achieving each of the different ordinal ratings, (b) multiplied each percentage by the following weightings: excellent = 1, good = 0.66, fair = 0.33, poor = 0, and (c) summed these values to obtain an overall numerical score. Thus, overall numerical scores were bounded from 0 (all ratings = poor) to 100 (all ratings = excellent). Observed overall scores had a mean of 59, and ranged between 24 and 70.
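Steps (a) through (c) can be sketched as below. The weights are those stated in the text; the function name and the example records are hypothetical.

```python
WEIGHTS = {"excellent": 1.0, "good": 0.66, "fair": 0.33, "poor": 0.0}


def overall_score(ratings):
    """Weighted sum of the percent of records at each ordinal rating (0-100)."""
    n = len(ratings)
    if n == 0:
        raise ValueError("at least one record is required")
    total = 0.0
    for label, weight in WEIGHTS.items():
        percent = 100.0 * ratings.count(label) / n  # step (a)
        total += percent * weight                   # steps (b) and (c)
    return total


# Hypothetical dog with four weekly records.
records = ["excellent", "good", "good", "fair"]
score = overall_score(records)  # 25*1 + 50*0.66 + 25*0.33
```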

The other measures of dog performance originally used in the exploratory phase were unavailable for dogs in the replication study, and thus could not be included in analysis.

#### Analysis

To replicate the approach used in the exploratory study, we conducted univariate analyses predicting the performance outcome measures described above as a function of scores on each of the cognitive tasks. All statistical tests were run as linear models with the predictor and outcome variables converted to z-scores to facilitate interpretation of regression coefficients.

For each analysis we recorded the β coefficients describing the relationship between the cognitive predictor variable and the detection dog performance measure as an outcome. To summarize results from these analyses we (1) calculated the mean and standard error of the β coefficients for each predictor variable, and (2) performed a one-tailed, one-sample t-test on the distribution of these β coefficients for each predictor variable, testing the null hypothesis that the β coefficients would have a mean of 0. The direction of the alpha region for the one-tailed t-tests was assigned based on whether we hypothesized a positive or negative association with the cognitive predictor variable, based on the results of the exploratory study. Therefore, our main predictions were that cognitive measures that were positively associated with detection dog performance in the exploratory study would also have positive β coefficients in the replication study, and vice versa for associations determined to be negative in the exploratory study.
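A minimal sketch of this summary procedure is given below, using only the Python standard library. The function names and values are hypothetical. The sketch returns the t statistic and degrees of freedom; the one-tailed p-value would then be read from a t distribution in the hypothesized direction (for example with a statistics package such as SciPy, which is an assumption here rather than the software reported by the authors).

```python
from math import sqrt
from statistics import mean, stdev


def zscore(values):
    """Convert values to z-scores using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def standardized_beta(predictor, outcome):
    """Slope of z(outcome) ~ z(predictor).

    With a single predictor this slope equals Pearson's correlation.
    """
    zx, zy = zscore(predictor), zscore(outcome)
    return sum(a * b for a, b in zip(zx, zy)) / (len(zx) - 1)


def one_sample_t(betas, mu=0.0):
    """t statistic and degrees of freedom for H0: mean(betas) == mu."""
    n = len(betas)
    standard_error = stdev(betas) / sqrt(n)
    return (mean(betas) - mu) / standard_error, n - 1


# Hypothetical betas for one cognitive predictor across three outcome measures.
betas = [0.10, 0.20, 0.30]
t_stat, df = one_sample_t(betas)
```

Standardizing both variables puts all coefficients on a common scale, which is what makes averaging β values across different outcome measures meaningful.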

### RESULTS

The mean and standard error of the β coefficients associated with each cognitive predictor are shown in **Figure 3**. Four of the six measures which were positively associated with detection dog performance in the exploratory study, on average, also had positive β coefficients in the replication study. For two of these measures (memory—distraction, arm pointing) the distribution of β coefficients had a mean significantly >0 (**Table 6**), suggesting consistent positive associations with detection dog performance. However, two cognitive measures which were positively associated with performance in the exploratory study were negatively related to performance in the replication study (**Figure 3**; **Table 6**). In addition, both cognitive measures that were negatively associated with performance in the exploratory study had, on average, positive β coefficients in the replication study. Therefore, the replication study confirmed a subset of findings from the exploratory study, but did not replicate other findings.

As in the exploratory study, multiple measures of short-term memory were positively associated with detection dog performance. Similarly, individual differences in sensitivity to human gestures (arm pointing) were associated with better detection dog outcomes in both the exploratory and replication phases. Although odor discrimination, causal reasoning (visual), and use of an arbitrary communicative marker were all positively associated with performance in the exploratory study, none of these tasks maintained strong associations across the replication. Additionally, the two tasks that were negatively associated with detection dog performance in the exploratory study (laterality: object manipulation, unsolvable task [look at experiment]) were unrelated to detection dog outcomes in the replication.

The use of an exploratory and confirmatory approach illustrates the importance of replication in developing predictive measures. Spurious or weak results are less likely to be upheld across analyses with independent datasets, whereas the most promising measures should yield comparable findings across multiple iterations of behavioral testing and analysis. In the current experiment, it is possible that some initial findings did not replicate because these associations were spurious or relatively weak. However, many of the outcome measures used in our exploratory study were not available for dogs in the replication study, which may also account for limited reproducibility in some cases.

TABLE 6 | Mean and standard error of regression coefficients from the replication study.

β (mean) reflects the mean regression coefficient describing the relationship between a cognitive measure and the outcome variables (metrics of performance as a detection dog). T-test statistics correspond to one-sample t-tests comparing the distribution of β coefficients for each predictor to the null expectation (0).

In sum, these findings indicate that simple measures of short-term memory and sensitivity to human gestural communication are reliably associated with performance as a detection dog, and suggest that these measures may provide a simple and rapid approach for evaluating a dog's potential for this role. The current work identifies a subset of simple cognitive measures that can be easily incorporated into such a prospective study.

### GENERAL DISCUSSION

Across a series of studies with candidate assistance dogs and detection dogs, we assessed associations between individual differences in cognition, and success as a working dog. In both populations we initially used exploratory analyses with a large sample of dogs tested on a broad array of cognitive tasks. We then developed shorter test batteries comprised of only the items most strongly associated with outcomes within each population. Lastly, we collected data on these revised sets of measures with independent samples and used predictive models (assistance dogs) or a replication study (detection dogs) to assess the utility of these cognitive measures for predicting working dog outcomes. In both populations we identified cognitive measures associated with working dog success. In the assistance dog population, predictive models developed in the exploratory study were effective at prospectively predicting training outcomes in an independent sample, with model performance being best for dogs predicted to have the highest (vs. the lowest) probability of success. In the detection dog population, our replication study confirmed positive associations between individual differences in short-term memory, sensitivity to human gesture, and measures of success as a detection dog. Therefore, our findings suggest that measures of dog cognition provide a useful approach for predicting working dog aptitude, and support the hypothesis that individual differences in cognition may be an important determinant of success in these roles (2).

Importantly, the particular aspects of cognition associated with working dog success varied between the two study populations, consistent with the notion that different working roles may require different cognitive skillsets. In the assistance dog population, successful dogs were characterized by a greater tendency to engage in eye contact with a human when faced with an unsolvable task, or when a joint social activity was disrupted (social referencing), as well as higher scores on an inferential reasoning task. Given that assistance dogs work closely with a human partner, and must be highly responsive to this person, it is likely that a natural tendency to attend to the human's face, and seek information from this person, is fundamental to a dog's success in this role. In the detection dog population, we found the strongest associations with measures of short-term memory and sensitivity to human gestural communication. Given that these dogs work off-leash at a distance from a human handler, it is likely that the ability to use human gestural communication provides an important skillset for effective detection work. Similarly, because detection dogs must efficiently search complex physical environments, and maintain verbal commands in memory while executing searches, short-term memory is probably critical for several aspects of successful detection work.

Our findings support the hypothesis that different types of cognition have evolved in a variety of animals—including dogs (28, 29, 34, 35). In our previous study describing the psychometric structure of the DCTB (i.e., the same data used here), measures of sensitivity to communicative intentions, memory processes, and eye contact with humans, all loaded on different factors (28). Therefore, our current findings are consistent with the hypothesis that the cognitive skills linked to working dog success reflect processes in distinct cognitive domains, that can vary independently of one another. This provides evidence that individual variation across these different factors, or types of cognition, is also related to how dogs solve a variety of problems in the real world. In other words, these experimental measures have ecological validity (36). An individual's cognitive profile can increase his or her potential to either succeed or fail in performing trained behaviors effectively—with different profiles being predictive of success with different sets of problems (e.g., assisting people with disabilities vs. explosive detection). This also leads to the prediction that future studies with other working dog populations will identify other aspects of cognition that are important for other working roles. If correct, it is unlikely that a construct such as "general intelligence" will be sufficient for assessing (cognitive) aptitude in candidate working dogs. At a practical level, this suggests that there will not be a single (ideal) cognitive phenotype that can be selected or screened for across all working dog populations.

One important challenge in assessing cognitive predictors of working dog success will continue to be how success is defined and operationalized. In the assistance dog population, training success was independently defined by the dog provider, and operationalized as whether a dog graduated the program [a common metric of success for studies with assistance dogs; (2, 3, 7)]. Although clearly defined, and relevant to the practical challenges that motivate predictive modeling (e.g., identifying dogs most and least likely to complete training), the use of a dichotomous outcome may obscure meaningful differences between dogs within the successful and unsuccessful groups. In the detection dog population there was no single metric available to quantify success, and thus we relied on diverse approaches ranging from scoring training records to surveys with trainers and individuals overseeing dogs during deployment. These data sources likely reflect a large degree of subjectivity. Additionally, many of these data sources were not available for dogs in our study, yielding variance in statistical power across analyses, and precluding the development of a single composite metric of success. Therefore, in addition to continued research on the cognitive and behavioral traits that predict aptitude for working roles, there is also an important need for the development and validation of objective measures that can more robustly quantify success in these roles.

Despite these limitations, our findings speak to the validity of spontaneous, non-verbal cognitive measures in capturing meaningful differences in real world problem solving behavior (34). They suggest that in dogs (1) individual differences in cognition contribute to variance in working dog success, and (2) experimental measures of these individual differences can be used to improve the processes through which working dogs are evaluated and selected. Importantly, we expect that cognitive measures will be useful in addition to, rather than as an alternative to, current methods of dog selection. A wide range of traits, including aspects of physical health, behavior, temperament, and cognition, make important contributions to working dog success. Thus, the development and validation of measures that probe this diverse range of phenotypic characteristics will be critical to enhancing working dog selection. Collectively, our findings contribute to a rapidly growing body of research on working dog selection, and suggest that embracing a broad view of the characteristics required of successful working dogs, including temperamental and cognitive traits as well as the interactions between them (18, 37, 38), will provide a powerful and integrative approach for future research.

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Duke University IACUC, and was approved by the Duke University IACUC (protocol #: A138-11-06).

### AUTHOR CONTRIBUTIONS

EM and BH designed and conducted the research. EM analyzed the data. EM and BH wrote the paper.

### ACKNOWLEDGMENTS

We thank K. Rodriguez, K. Almon, S. Laderman, A. Reinhardt, K. Leimberger, K. Duffy, T. Jones, K. Suchman, W. Plautz, L. Strassberg, L. Thielke, E. Blumstein, R. Reddy, and K. Cunningham for help with data collection and coding. We are grateful to K Levy for her help with the assistance dog studies, and P. Mundell for help with research on both assistance and detection dogs. We thank Canine Companions for Independence and K2 Solutions, Inc. for accommodating research with assistance and detection dogs. This research was supported in part by ONR N00014-17-1-2380, ONR N00014-12-1-0095, NIH 5 R03 HD070649-02, and the AKC Canine Health Foundation. The contents of this publication are solely the responsibility of authors and do not necessarily represent the views of the AKC Canine Health Foundation.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fvets.2018.00236/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 MacLean and Hare. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Role of Animal Assisted Intervention on Improving Self-Esteem in Children With Attention Deficit/Hyperactivity Disorder

Sabrina E. B. Schuck<sup>1</sup>\*, Heather L. Johnson<sup>2</sup>, Maryam M. Abdullah<sup>3</sup>, Annamarie Stehli<sup>1</sup>, Aubrey H. Fine<sup>4</sup> and Kimberley D. Lakes<sup>5</sup>

*<sup>1</sup> Pediatrics, School of Medicine, University of California, Irvine, Irvine, CA, United States, <sup>2</sup> School Psychology, College of Education, University of Utah, Salt Lake City, UT, United States, <sup>3</sup> Greater Good Science Center, University of California, Berkeley, Berkeley, CA, United States, <sup>4</sup> Education, California State Polytechnic University, Pomona, CA, United States, <sup>5</sup> Psychiatry, School of Medicine, University of California, Riverside, Riverside, CA, United States*

#### Edited by:

*Peggy D. McCardle, Consultant, New Haven, CT, United States*

#### Reviewed by:

*Sheffali Gulati, All India Institute of Medical Sciences, India*

*Chris Fradkin, Pontifical Catholic University of Rio de Janeiro, Brazil*

> \*Correspondence: *Sabrina E. B. Schuck sabrina@uci.edu*

#### Specialty section:

*This article was submitted to Children and Health, a section of the journal Frontiers in Pediatrics*

Received: *15 May 2018* Accepted: *25 September 2018* Published: *02 November 2018*

#### Citation:

*Schuck SEB, Johnson HL, Abdullah MM, Stehli A, Fine AH and Lakes KD (2018) The Role of Animal Assisted Intervention on Improving Self-Esteem in Children With Attention Deficit/Hyperactivity Disorder. Front. Pediatr. 6:300. doi: 10.3389/fped.2018.00300*

Attention Deficit/Hyperactivity Disorder (ADHD), the most ubiquitous mental health problem in children, has been associated with poor self-esteem. Psychosocial interventions have aimed to improve self-esteem among this group, with the aim of reducing the development of comorbid depression and anxiety. The present study implemented a randomized control design to examine the possibility of Animal Assisted Interventions (AAI) as a viable approach to improving self-esteem among children with ADHD. Children's self-esteem across multiple domains as measured by the Self-Perception Profile for Children was evaluated (*n* = 80, ages 7–9, 71% male). To test the hypothesis that AAI improves self-esteem, stratified Wilcoxon Signed-Rank Tests (SAS NPAR1WAY procedure) were used to compare pre- to post-treatment ratings. Analyses indicated that scores of children's self-perceptions in the domains of behavioral conduct, social, and scholastic competence were significantly increased from baseline to post-treatment in the AAI group (*z* = 2.320, *p* = .021, *z* = 2.631, *p* = .008, and *z* = 2.541, *p* = .011, respectively), whereas pre-post-treatment differences in self-perceptions were not found for the children in the control group without AAI. Findings suggest that AAI is a viable strategy for improving ratings of self-perceived self-esteem in children with ADHD.

Keywords: Human Animal Interaction, Animal Assisted Intervention, therapy dogs, ADHD, self-esteem, self-awareness, school-based interventions, social skills training

### INTRODUCTION

Attention Deficit/Hyperactivity Disorder (ADHD) is the most prevalent mental health disorder of childhood and despite intervention continues to impair individuals across the lifespan when compared to their typically developing peers (1). Children with ADHD are characteristically impaired by deficits in skills of Executive Function (EF) including attention, working memory, and inhibition—all skills essential to self-awareness and self-regulation. Children with ADHD oftentimes present with associated poor social skills and/or problem behaviors and are at greater risk for developing other mental health disorders by young adulthood (2). Pharmacotherapy (e.g., atomoxetine, methylphenidate) is the mainstay of traditional medical intervention for ADHD, but treatment failures are common (3), there is evidence of slowed growth in individuals who take stimulants for long periods of time (4, 5) and parents of young children are increasingly seeking alternatives to medication treatments. Of note, many parents find alternative therapies, including Animal Assisted Intervention (AAI), to be more acceptable than medication (6).

Early on it was hypothesized that Human Animal Interaction (HAI) and AAI with children contributed to improvements on measures of psychological well-being, such as self-esteem (7, 8) or closely related constructs (9, 10), while others reported no AAI impact (11, 12). The inconsistent findings of these early studies seem likely to have been the product of weak research designs.

Reviews of HAI and AAI indicate an abundance of correlational, hypothesis-generating studies (13–16) and conclude that the field would be further strengthened by rigorous experimental, evidence-based intervention designs (17, 18). While groundbreaking in hypothesis generation, most AAI research has been criticized for the lack of control conditions and specificity of intervention aims (19). Many of these studies describe solely correlational findings. Perhaps one of the most problematic areas of AAI research, plagued by both of these weaknesses, is the examination of the relationship between AAI and self-esteem, a nebulous construct to start with.

The concept of self-esteem has been defined as "the level of global regard that one has for the self as a person" [(20), p. 88]. In typically developing populations, self-esteem has been linked to task persistence, achievement and overall outcomes. Low self-esteem has been linked to poor outcomes, depression and other mental health disorders (20). Considering the frequent adverse social feedback that children with ADHD are likely to experience throughout their development, it seems plausible that these experiences may contribute to low self-esteem.

It is clear that children with ADHD are at greater risk for poor outcomes and the development of comorbid mental health disorders (21). The role of self-esteem in predicting these outcomes is less understood (22). Previous research has examined the role of self-esteem in prognosis and the later development of comorbid mental health disorders in young adulthood (23). Historically, psychosocial interventions have targeted increased self-esteem in children with ADHD (24), with the aim of reducing the development of comorbid depression and anxiety.

Positive Assertive Cooperative Kids (Project P.A.C.K.), a randomized controlled trial, was designed to examine the safety and efficacy of AAI with dogs vs. traditional psychosocial treatment strategies for children with ADHD (25). The P.A.C.K. model reduced symptoms of ADHD and problem behaviors and improved social skills (26). A secondary aim of Project P.A.C.K. was to determine if AAI contributed to an increase in self-esteem as measured by children's self-perceptions of their competence across multiple domains. We hypothesized that the children participating in the therapy with dogs would have greater gains in their self-esteem than those children participating in therapy as usual.

### METHODS

### Participants

This study was approved by the local university Institutional Review Board. Written and informed consent was obtained from parents and written and informed assent was obtained from child participants. Additionally, upon review by the local Institutional Animal Care and Use Committee (IACUC), this study was determined exempt from IACUC review as the participation of the therapy animals was not outside the scope of their normal activity and because they were not the subject of investigation. Eighty-eight children with ADHD, ages 7–9 years (71% male), and their parents participated in Project P.A.C.K., a randomized controlled trial examining the safety and efficacy of traditional "best practice" psychosocial intervention with and without therapy dogs. Of these, 82 completed treatment across 12 weeks and participated in a follow-up assessment 6 weeks later (26). Eighty children and families completed the measures included in this analysis at baseline, end of treatment, and follow-up.

### Screening and Eligibility Criteria

Participants were selected using a multi-gate screening procedure to determine eligibility. Parents completed a family medical and psychosocial history questionnaire. Researchers administered the Kaufman-Schedule for Affective Disorders and Schizophrenia for School-Age Children: Present and Lifetime Version [K-SADS-PL; (27)], a semi-structured clinician-administered interview conducted with parents and children, based on criteria set forth in the Diagnostic and Statistical Manual of Mental Disorders [4th ed., text rev.; DSM-IV-TR; (28)] for psychiatric disorders. Children also completed the Wechsler Abbreviated Scale of Intelligence, Second Edition [WASI-II; (29, 30)]. Eligibility criteria included a primary diagnosis of ADHD, Combined subtype, age 7–9 years, an estimated full scale IQ score of 80 or above, and the ability to complete all screening measures. Exclusionary criteria included current use of medication for ADHD; a diagnosis of a pervasive developmental disorder/autism, depression, anxiety, or epilepsy; and a history of cruelty to animals.

### Randomization Design

All participants were randomly assigned to one of two treatment groups: (a) a cognitive-behavioral group therapy incorporating AAI with therapy dog/handler dyads or (b) a cognitive-behavioral group therapy without therapy dogs (non-AAI). In an effort to establish treatment efficacy for both the AAI and the non-AAI treatment groups, a waitlist condition was implemented to control for the possible influence of time and child development on symptom severity in both groups. Specifically, half of all recruited participants, regardless of treatment group, were consented and assessed and then experienced a waiting period of 12 weeks prior to a subsequent assessment and the start of treatment. The remaining participants began immediate treatment (IT) subsequent to consent and assessment.
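The two-by-two allocation described above (treatment arm crossed with immediate vs. waitlist start) can be illustrated with a brief Python sketch. The function name, the fixed seed, and the balanced round-robin dealing are assumptions made purely for illustration; the text does not specify the actual randomization procedure, and the sketch is not a reconstruction of it.

```python
import random
from collections import Counter

ARMS = ("AAI", "non-AAI")
STARTS = ("immediate", "waitlist")

def allocate(participant_ids, seed=0):
    """Shuffle participants, then deal them round-robin into the four
    arm-by-start cells (hypothetical scheme, for illustration only)."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    cells = [(arm, start) for arm in ARMS for start in STARTS]
    return {pid: cells[i % len(cells)] for i, pid in enumerate(ids)}

# 88 enrolled children, as reported in the Participants section
allocation = allocate(range(1, 89))
cell_counts = Counter(allocation.values())
print(cell_counts)
```

Dealing a shuffled list into cells keeps the arms and the start conditions balanced, which simple coin-flip assignment would not guarantee in a sample of this size.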

### Measures

Children participated in a battery of screening measures, including a brief assessment of cognitive skills, at intake prior to intervention (described above). Subsequent to screening and randomization, they participated in direct assessment which included measures of performance, self-evaluations and structured interviews in the context of a laboratory school setting at three time points: (a) immediately prior to, (b) immediately following, and (c) 6 weeks after the 12-week intervention period. Simultaneously, parents rated their children's social skills competence, severity of ADHD symptoms, and problem behaviors.

#### Intelligence

The Wechsler Abbreviated Scale of Intelligence, Second Edition [WASI-II; (29, 30)] is a brief measure utilizing the four subtests with the highest factor loadings on generalized intelligence (g). The administration takes about 30 min and yields a Verbal Comprehension Index (VCI; Vocabulary and Similarities), a Perceptual Reasoning Index (PRI; Block Design and Matrix Reasoning) and an estimated Full Scale IQ (FSIQ-4). Cronbach's alpha coefficients for all subscales of this measure are high, ranging from the high 0.80s to 0.99, indicating good to excellent reliability in child samples, and the measure has demonstrated acceptable to excellent validity with other established measures utilized for estimating g (31). Participants completed the WASI-II at baseline, no more than 2 weeks prior to participation in the laboratory school day assessments and the intervention phase.

#### Self-Esteem

The Self-Perception Profile for Children [SPPC; (32)] is a 36-item scale measuring perceived global self-worth in children. Subscales in this measure include scholastic competence, social acceptance, athletic competence, physical appearance, and behavioral conduct, and a cumulative measure of global self-worth. Cronbach's alpha coefficients for each domain exceed 0.80, indicating acceptable to excellent reliability. Validity testing resulted in clear factor loadings under basic oblique rotation, with loadings ranging between 0.41 and 0.90, as reported in the most recent update to the SPPC manual (32). This scale was administered to child participants individually by a trained research assistant who read the items aloud, in keeping with recommendations in the most recent manual for children younger than Grade 5. Of note, in consideration of the attention challenges of this special population, the scale was administered with the aid of a pictorial visual analog to assist the child in choosing their rating. The measure was completed on the Saturdays immediately preceding and following the intervention period, and then at a 6-week follow-up in the context of the laboratory school day setting.

#### Social Skills and Problem Behaviors

The Social Skills Improvement System-Parent Form [SSIS-P; (33)] is a 79-item scale measuring social skills and problem behaviors in children as reported by their parents. In the Social Skills domain, the subscales are communication, cooperation, assertion, responsibility, empathy, engagement, and self-control. Subscales in the Problem Behaviors domain include internalizing, externalizing, bullying, hyperactivity/inattention, and autism spectrum. Gresham and Elliott (33) provide an extensive psychometric review of the SSIS in the administration manual and report Cronbach's alpha coefficients for the main domains all exceeding 0.90, indicating very good to extremely high reliability for children between the ages of 5 and 12, with measures of validity reported in the moderate to high range (pp. 65–136). Concurrent with their child's participation in each of the laboratory school days, one parent respondent (91% female) rated their child's social skills and problem behaviors, as measured by the SSIS.

#### ADHD and ODD Symptoms

The Attention-Deficit/Hyperactivity Disorder Rating Scale, Fourth Edition [ADHD-RS-IV; (34)] is an established measure of ADHD symptoms derived from the Diagnostic and Statistical Manual of Mental Disorders 4th Edition [DSM-IV; (28)] with Cronbach's alpha coefficients all exceeding 0.85, indicating extremely high to excellent subscale reliability (34). In addition to the ADHD-RS items, in a similar fashion, parents rated their children on the nine symptoms of Oppositional Defiant Disorder as listed in the DSM-IV. The ADHD-RS and symptoms of ODD scale were completed by parents every 2 weeks during the course of the intervention.

#### Permanent Product Math Test

The Permanent Product Measure of Performance [PERM-P; (35)] is a validated, time-sensitive, skill-adjusted test. The assessment consists of simple math problems that must be completed at multiple time points throughout the simulated classroom sessions of the laboratory school days. This measure has been used extensively in clinical trials examining the effects of pharmaceuticals for children with ADHD, and has been found to be a robust, objective measure of the ability to initiate a task, self-monitor/stay on task, and complete written seatwork (35).

### Intervention

For a period of 12 weeks, each child participant attended an intervention group session twice a week: one weekday evening for 2 hours and Saturday for 2½ hours, for a total of 4½ hours of treatment per week. Parents received 2 hours of group-based behavioral parent training (BPT) once a week, during their child's weekly evening session. The P.A.C.K. intervention curriculum implemented for both groups incorporated strategies based on components from the University of California, Irvine Child Development Center School-based Social Skills model, the Kids Interacting with Dogs Safely program developed by Jane Deming and the American Humane Association (2009), and the Intermountain Therapy Animals' Reading Education Assistance Dogs program (ITA R.E.A.D.® Handbook, 2003–2004).

The AAI group included the participation of three certified therapy dogs, facilitated by their handlers (partners), during each intervention session (see **Figure 1**). The non-AAI group received the same standard treatment curriculum, but utilized toy dogs (realistic puppets/stuffed plush toys) in lieu of live dogs (see **Figure 2**). Of note, written and informed consent for the release of photographic images was obtained from parents of participants in these photos but the faces of the minors are masked to protect their privacy.

### P.A.C.K. Curriculum

The social skills curriculum that was implemented in each of the children's therapeutic group sessions was originally developed for the UC Irvine Child Development Center School Program, a laboratory school environment for children with ADHD. This model combines strategies based in learning and cognitive-behavioral theories and aims to promote adaptive skill acquisition and thereby reduce problem behaviors (36). This curriculum, along with the curriculum of the Summer Treatment Program developed by Pelham et al. (37), contributed to the psychosocial treatment strategies implemented and tested in the Multi-modal Treatment Study of ADHD (MTA Study), the largest NIH-funded longitudinal study of treatment modalities for children with ADHD (38–40). These traditional evidence-based components were complemented with novel strategies (the How to Be a Good Teacher lessons and the How Did I Do? self-assessment) which aimed to increase self-esteem by developing self-competence and self-efficacy [see (25, 26)]. These lessons were inspired by a theoretical premise that interaction with dogs provides a naturalistic, non-verbal "feedback" mechanism unique to AAI. The lessons were administered with all children, with and without the assistance of certified therapy dogs and puppies in training for Canine Companions for Independence. In the AAI group, children were supported in "training" basic commands (come, sit, stay) with certified therapy dogs for the first 9 weeks of the intervention and then with the puppies in the last 3 weeks of the intervention. In the group without dogs, children were instructed and coached to teach their peers how to complete a skill or inform the peers about a specific subject of interest. See the chapter by Gee et al. (41) for a more detailed description of the specific strategies implemented.

FIGURE 1 | Animal Assisted Intervention with certified therapy dogs.

FIGURE 2 | Traditional psychosocial skills training with a dog theme.

### Behavioral Parent Training (BPT)

The parent training component of the intervention consisted of 12 weekly, 2-hour sessions of BPT conducted with six families per treatment group. Sessions were based on a traditional BPT curriculum and adapted from the MTA study (40). Specifically, parents were taught behavior modification techniques and standard directive parenting strategies (e.g., giving effective directions, transitional warnings, problem solving) and how to teach self-regulation strategies, facilitate anger management, and target social skills development for their children. Parent/child shared homework activities (e.g., reading a short dog-themed story together) were assigned to encourage discussions focusing on targeted social skills and/or humane education topics.

### Analysis

To test the hypothesis that AAI increases self-esteem as measured by child self-perceptions of competence, and in consideration of the non-normal distribution of scores on the SPPC for the sample, stratified Wilcoxon Signed-Rank Tests (SAS NPAR1WAY procedure) were used. This test allows for a comparison of pre- to post-treatment differences in a non-normal distribution and results in a z score statistic that can then be tested for significance. Considering the nature of the ranking test, a final transformation of the scores was utilized to best represent the direction of change for those individuals who had higher (improved) self-perception scores at end of treatment when compared to their scores at baseline.
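To make the logic of the signed-rank comparison concrete, the sketch below computes a z statistic for paired pre/post scores using the large-sample normal approximation with a tie correction. It is an illustrative stand-in written in plain Python, not the SAS procedure used in the study; the function name and the example scores are invented for illustration, and no continuity correction is applied. In a stratified analysis such as the one described, a function like this would be run separately within each treatment group.

```python
import math

def wilcoxon_signed_rank_z(pre, post):
    """Return (z, two-sided p) for paired scores via the normal approximation.
    A positive z means post scores tend to exceed pre scores."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48  # tie-corrected variance
    z = (w_plus - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the standard normal
    return z, p

# Invented subscale totals for eight children (not study data)
pre = [14, 18, 12, 20, 16, 19, 13, 17]
post = [17, 19, 16, 19, 20, 19, 18, 21]
z, p = wilcoxon_signed_rank_z(pre, post)
print(f"z = {z:.3f}, p = {p:.3f}")
```

Because the statistic is built from the sum of positive ranks, its sign already conveys the direction of change, which is the property the score transformation described above is meant to preserve.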

Stratified correlation analyses were performed to examine the relationship between SPPC ratings and baseline measures including: ADHD-RS subscales (ADHD, Hyperactivity, Inattention), ODD symptoms, Parent rated SSIS, the WASI (Block Design, Matrix Reasoning, Vocabulary and Similarities, Full Scale IQ), and PERM-P level.
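A stratified correlation analysis of this kind can be sketched as follows: records are split by treatment group and a correlation coefficient is computed within each stratum. The text does not state which coefficient or significance test was used, so the Pearson coefficient and the Fisher z approximation to the p-value below are assumptions, and the field names and values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_p(r, n):
    """Approximate two-sided p-value for r using Fisher's z transformation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def stratified_correlations(records, x_key, y_key, group_key="group"):
    """Return {group: (r, approximate p, n)} computed within each stratum."""
    strata = {}
    for rec in records:
        strata.setdefault(rec[group_key], []).append(rec)
    results = {}
    for group, rows in strata.items():
        x = [row[x_key] for row in rows]
        y = [row[y_key] for row in rows]
        r = pearson_r(x, y)
        results[group] = (r, fisher_p(r, len(rows)), len(rows))
    return results

# Hypothetical records (not study data): parent-rated ODD symptoms and a
# child-rated SPPC domain score, five children per group.
records = (
    [{"group": "AAI", "odd": o, "sppc": s}
     for o, s in zip([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])]
    + [{"group": "non-AAI", "odd": o, "sppc": s}
       for o, s in zip([1, 2, 3, 4, 5], [3, 3, 2, 4, 3])]
)
results = stratified_correlations(records, "odd", "sppc")
for group, (r, p, n) in results.items():
    print(f"{group}: r = {r:.3f}, approx. p = {p:.3f}, n = {n}")
```

The Fisher approximation is rough for strata this small and is used here only to keep the sketch dependency-free; an exact t-based test from a statistics package would be the usual choice.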

### RESULTS

### Prior to Intervention

At baseline, groups were equivalent on self-ratings of each domain of self-esteem as measured by the SPPC, with the exception of the children in the non-AAI group rating themselves more favorably in the domain of physical appearance (z = 4.562, p < .001) and somewhat but not significantly more favorably in athletic competence (z = 1.951, p = .052). Of note, the majority of children across groups rated their competence fairly favorably, resulting in a non-normal, negatively skewed or "J-shaped" distribution of scores both before and after treatment for both groups. For the most part, however, mean scores for this particular sample did not differ significantly from published means for their typically developing, same-age peers, with the exception of this sample rating themselves higher in the domain of physical appearance at baseline (non-AAI z = 11.441, p < .001; AAI z = 9.112, p < .001). As previously reported in an analysis testing equivalency of groups, randomization resulted in no significant differences in demographic characteristics or behavior skills as measured by parent ratings on the ADHD-RS for each subtype (Inattention, H/I, and Combined type) as well as symptoms of ODD (26). Similarly, at baseline no group differences were revealed for cognitive skills, as measured by the WASI-II (Vocabulary, Block Design, Similarities, and Matrix Reasoning) (**Table 1**), or academic skills as measured by the PERM-P (χ<sup>2</sup> = .05, p = .974).

### Post Intervention

As previously reported, all children who participated in the P.A.C.K. study demonstrated significant improvement in symptoms of ADHD and social skills, but improvements were significantly greater in the group that participated in AAI (26). Furthermore, in that study, a group by time interaction was revealed on ratings of problem behaviors and social initiation, suggesting a modest benefit for the AAI group over the non-AAI group. For the present study, stratified Wilcoxon Signed-Rank Tests (SAS NPAR1WAY procedure) revealed that children's self-reported scores of their behavioral conduct, scholastic competence, and social competence were significantly higher at post-treatment than at pre-treatment in the AAI group (z = 2.320, p = .021; z = 2.631, p = .008; and z = 2.541, p = .011, respectively). Pre- to post-treatment differences were not found for the children in the group without AAI.

*AAI = Animal Assisted Intervention; WASI-II = Wechsler Abbreviated Scale of Intelligence, Second Edition; VI = Verbal Intelligence Quotient, PIQ = Performance Intelligence Quotient, FSIQ = Full Scale Intelligence Quotient.*

### Correlational Analyses

Post-intervention analysis in the full sample revealed no significant correlations between cognitive measures and self-esteem subscales, with the exception of children's level of math achievement as measured by the PERM-P, which correlated with self-ratings of scholastic competence as measured by the SPPC (r = .192, p = .046). This small-magnitude correlation did not reach significance in either subgroup (AAI r = .218, p = .110; non-AAI r = .157, p = .267).

Likewise, for the whole sample post-intervention, behavioral measures of ADHD symptoms were not correlated with self-esteem (i.e., scores on the SPPC). Parent-reported symptoms of ODD, however, were positively correlated with child reports in the domains of Scholastic Competence (r = .215, p = .044), Athletic Competence (r = .307, p = .004), and Physical Appearance (r = .214, p = .045). That is, greater impairment from ODD symptoms as rated by the parent was linked to children more favorably perceiving their competence in these domains.

When stratified by group, however, no significant relationships between parent ratings of ODD symptoms and self-esteem were revealed in the AAI group. The finding persisted in the non-AAI group: higher parent ratings of ODD symptoms were related to greater self-perceptions of Scholastic Competence (r = .366, p = .016), Athletic Competence (r = .601, p < .001), Physical Appearance (r = .361, p = .017), and Social Competence (r = .340, p = .026). Furthermore, when stratified, parent ratings of H/I were positively correlated with Scholastic Competence (r = .276, p = .033).

### DISCUSSION

The specific aim of the present study was to determine if AAI improved self-esteem in children with ADHD when compared to more traditional psychosocial interventions. Findings from the present study indicate children's perceptions about their social competence, behavioral conduct and scholastic competence were significantly higher at post-intervention when compared to pre-intervention in the group in which live animals participated (AAI). Conversely, no pre- to post-intervention differences in self-perceptions were found for the children who participated in the intervention without dogs (non-AAI).

### Dogs and Character Development

Leaders in the field of HAI have called for more controlled research designs with more specific aims in an effort to better understand the role of AAI in psychosocial outcomes, including self-esteem. In response, the present study employed a randomized and controlled trial of AAI vs. a "best practice" control condition with a specific aim of investigating children's self-esteem as measured by self-perceived social competence, behavioral conduct, and global self-worth. Central to the development of the treatment protocol was the consideration that elements of humane education about animals are thought to contribute to character development, social communication, and compassion, all thought to be key in the treatment aims for children with ADHD who present with deficits in the executive function skills necessary for self-regulation and self-awareness. The authors proposed that these educational strategies would be enhanced by the presence of a live animal and the integration of canine-assisted intervention activities. Group differences in these findings provide support for the participation of dogs in intervention strategies aimed to improve key elements of self-esteem.

### The Role of Feedback From a Live Animal and Self-Regulation

Children who participated in AAI gained better access to aspects of the intervention, specifically targets of social competence and behavioral conduct. Pre-intervention, all children rated themselves about average across domains when compared with published norms. But in the areas in which they rated themselves a bit lower (social competence and behavioral conduct; presumed weaknesses in this population and specific targets of the intervention) only children in the AAI group rated themselves more favorably post-intervention. Direct interaction with a live animal provides immediate feedback of socially appropriate and compassionate behavior toward the dogs. Results suggest the tailored humane education in the session content, coupled with interaction with live dogs, served as an immediate and non-verbal means of feedback to which the child responded positively when compared to their peers in the non-AAI group.

### The Role of Parent and Child Engagement

As previously reported, parent ratings indicated that all children in the sample responded favorably on aims of symptom reduction and improved social skills, with the children in the AAI group making the strongest gains (26). The behavioral parent training (BPT) component for both intervention groups incorporated parent/child "homework" around humane education, character development, and compassion. It is likely that the increased use of positive parenting strategies signals to the child that he/she is doing better. As such, the child in turn rates himself or herself better. But this supposition should have held true across both groups, as all parents participated in the same BPT. Taken together, the findings suggest that the families in the AAI group responded differentially to the lessons of the BPT.

The research suggests that as a consequence of experiencing direct contact with therapy dogs, children were more motivated to attend and positively engage in the intervention. Several studies have found that incorporating therapy animals into activities can help motivate children to comply with the therapeutic or educational process, and to retain that motivation over time (42–46). In P.A.C.K. sessions, children were "frontloaded" and practiced aspects of the parent/child interaction assignment for that week in the session prior to the parent receiving the "homework." It seems plausible that children who "role-played" with live animals more readily recalled elements of the lesson, or shared them more vividly with their parents, including how they acted and behaved in the animals' presence.

One might also posit that parents of children who participated in the group with therapy dogs found it easier to engage in discussion about the intervention sessions by talking about the animals instead of the content alone. By doing so, they may have directly promoted stronger generalization of the strategies learned by the children in session. It is plausible then to suspect that those parents began to notice improvement in their child's social skills and behavioral conduct more readily and thereby were more likely to express praise and appreciation at increased frequency when compared to parents of children in the non-AAI group. When parents perceive and report behavioral improvement, it is likely that children in turn receive this feedback, directly or indirectly, reinforcing their perceptions of their improvements and increasing their perceptions of their social competence and behavioral conduct.

### Limitations and Future Investigation

### Positive Illusory Bias

Self-perception is a construct largely influenced by psychosocial development, with typical younger children presenting with a lack of self-awareness marked by inflated self-perceptions when compared to older children (32). Theoretically, children with neurodevelopmental disorders marked by deficits in EF may be delayed in the development of this skill when compared to their typically developing peers. The severity of specific cognitive deficits in children with ADHD is associated with positive illusory bias (47) and particularly self-awareness (48). It may be that some children with ADHD are relatively delayed in their ability to attend to, respond to, or benefit from feedback from peers, teachers or caregivers commensurate to their peers. In fact, despite frequent feedback from peers, teachers and caregivers, these children oftentimes demonstrate poor self-awareness of their problems and rate themselves more favorably than their teachers and parents rate their actual performance in academic and social contexts (49). This phenomenon, positive illusory bias, among children with ADHD has been well described (23, 49–51). Furthermore, children with ADHD who also demonstrate positive illusory bias are found to demonstrate greater social impairment when compared to their typically developing peers and their peers with ADHD who do not demonstrate positive illusory bias (52).

Alternative hypotheses have been presented about the nature of positive illusory bias. Some propose that an over-estimation of one's competences acts as a protective mechanism for children who feel bad or embarrassed about their problem behaviors and/or social challenges, or that denial of weaknesses is a defense mechanism (53–55). More important, perhaps, is the finding that the degree of over-estimation of confidence also seems to mediate poor response to feedback (56). This lack of response to feedback is especially relevant to developing interventions for this population.

In the present study, the non-normal distribution of the children's self-perception scores across each domain has implications for interpreting improved scores at post-intervention for the AAI group. The majority of children in the present study, similar to their typically developing, same-age peers, rated their competence fairly favorably. This finding is not surprising given the literature on the influence of development on self-perception and self-awareness. In fact, considering children with ADHD have been found to be relatively delayed in their ability to respond to social feedback and demonstrate susceptibility to positive illusory bias, one might hypothesize that these children would have rated themselves significantly more positively than their same-age, typically developing peers. But in the present study, participating children seemed to perceive their competence about the same as their typically developing peers across each domain.

While not significantly different from norms reported for typically developing children of similar age, the scores indicate that the children in the present study may not accurately perceive their competence when compared to how their parents and peers perceive them. While group differences in increased social and scholastic competence and behavioral conduct may be a function of greater parent-reported improvement, these increases do not necessarily mean that children demonstrated behavior commensurate with their self-estimations.

This finding may suggest a continued lack of self-awareness despite improvement and increased self-regard. Alternatively, it may be simply a function of their age compounded by their diagnosis of a neurodevelopmental delay. Conversely, it may simply mean those in the AAI group felt better about themselves and they reported so. Further investigation of this phenomenon among children participating in AAI and the control condition is indicated and should include an analysis of difference scores between parent/teacher and peer ratings of competence and children's self-ratings of competence to more directly measure the possibility of positive illusory bias in this group. Additional considerations include an examination of the associations between improved self-perceptions of competence and behavioral conduct with baseline reports of externalizing behaviors, observed compliance and self-regulation in the course of psychosocial intervention with and without dogs.
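One way the proposed difference-score analysis might be operationalized is sketched below: self-ratings and informant (e.g., parent) ratings are each standardized within the sample, and the informant's standardized score is subtracted from the child's. This standardized-discrepancy approach, the function names, and the example values are all assumptions offered for illustration; the study itself did not compute such scores.

```python
import statistics

def zscores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def discrepancy_scores(self_ratings, informant_ratings):
    """Standardized self-rating minus standardized informant rating.
    Positive values mean the child rates themselves higher, relative to
    the sample, than the informant does."""
    return [s - o for s, o in zip(zscores(self_ratings), zscores(informant_ratings))]

# Invented ratings for five children (not study data)
self_ratings = [20, 22, 18, 24, 21]
parent_ratings = [15, 20, 17, 14, 19]
scores = discrepancy_scores(self_ratings, parent_ratings)
for child, score in enumerate(scores, start=1):
    print(f"child {child}: discrepancy = {score:+.2f}")
```

Standardizing first puts instruments with different scales on a common metric, so a large positive discrepancy flags a child whose self-view is high relative to peers while the informant's view is comparatively low.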

### Targeting Self-Esteem and Measuring It as an Outcome

Brummelman addressed contemporary criticism that interventions aimed at increasing self-esteem inadvertently cultivate narcissism among children and clarified the distinction between the two constructs as "the belief that one is superior to others vs. the belief that one is worthy" (p. 11) (57). Indeed, in the AAI group, promotion of cross-species value and worthiness was reinforced with intervention activities, adult modeling, and by the positive attention directed to and received from the therapy dogs. Nevertheless, whereas self-esteem has been found to be variable and dependent on outcomes of day-to-day experiences, self-compassion is a relatively consistent way of relating to oneself that is characterized by self-kindness, a sense of common humanity, and mindfulness when encountering a personal challenge (58). Furthermore, factors including aggression and internalizing symptoms may moderate ratings of self-esteem (59). Future work extending the findings of this study should more closely examine internalizing symptoms and focus on self-compassion as an alternative to self-esteem. Neff and Vonk assert that self-compassion may be an "important source of positive self-regard that is . . . less ego reactive and inflated" (p. 44) (58). As such, children with ADHD who encounter day-to-day challenges across contexts may benefit from explicit cultivation of self-compassion as an adaptive coping strategy that fosters well-being and resilience.

### Medication Naïve Sample

To our knowledge, this work is the first randomized controlled trial examining the effectiveness of AAI with a large sample of young children with ADHD. Initial findings provide promise for increasing the modalities of treatment available to this population, but replication of these early findings is called for. Additionally, while many families seek alternatives to pharmacological interventions, evidence for the effectiveness of stimulant medications for reducing symptoms of ADHD is robust. The prescription of stimulants remains the most common practice for treating ADHD in the United States (60) and remains among the main practice parameters of the American Academy of Child and Adolescent Psychiatry (61) and the American Academy of Pediatrics (62). Of note, this study was limited to young children who were medication naïve and whose parents were seeking a non-pharmacological intervention prior to trying stimulant medications. While it was not the aim of this study to determine the effectiveness of AAI compared to stimulant medications for improving outcomes for children with ADHD, that is a viable research question that may warrant future investigation.

### CONCLUSION

When compared to traditional, "best practice" psychosocial intervention, psychosocial intervention with the assistance of therapy dogs (AAI) yielded greater improvements in perceived social competence, behavioral conduct, and scholastic competence among children with ADHD. The findings may lead future researchers to continue investigating which elements of AAI appear to impact self-esteem and how the impact can be sustained over time. AAI is becoming a more recognized complementary therapy and, with more scientific support, it may become more naturally applied in numerous treatment programs serving not only children with ADHD but other children with neurodevelopmental disorders.

### AUTHOR CONTRIBUTIONS

SS was the principal investigator and led the development, interpretation, writing and dissemination of this work. HJ participated in the preparation of the data and contributed to the writing of the manuscript. MA participated in the development and implementation of the intervention, the interpretation and the writing of this manuscript. AS provided the statistical consultation and analyses of the data. AF significantly contributed to the development of the AAI content and strategies and participated in the interpretation of the findings. KL contributed to the interpretation of the results and writing of the manuscript.

### FUNDING

This study was funded by a grant, R01H066593, from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) and Mars-WALTHAM® (S.E.B.S.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the Eunice Kennedy Shriver National Institute of Child Health and Human Development or the National Institutes of Health.

### ACKNOWLEDGMENTS

We would like to acknowledge and thank the children and families that participated in Project P.A.C.K. at UC Irvine as well as all of the fabulous dogs and their amazing handlers, without whom this work would not have been possible. We would like to provide a special thanks to the volunteers from Pet Partners and Canine Companions for Independence for allowing us the opportunity to work with the most amazing therapy dogs, service dogs in training, and their partners! We would also like to acknowledge Michelle Fassyoux, Shivani Amin, and Nicole Kirin for their assistance in the lab.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Schuck, Johnson, Abdullah, Stehli, Fine and Lakes. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Reliability and Validity Assessment of the Observation of Human-Animal Interaction for Research (OHAIRE) Behavior Coding Tool

Noémie A. Guérin<sup>1</sup>, Robin L. Gabriels<sup>2</sup>, Monique M. Germone<sup>2</sup>, Sabrina E. B. Schuck<sup>3</sup>, Anne Traynor<sup>4</sup>, Katherine M. Thomas<sup>5</sup>, Samantha J. McKenzie<sup>6</sup>, Virginia Slaughter<sup>7</sup> and Marguerite E. O'Haire<sup>1</sup>\*

<sup>1</sup> Department of Comparative Pathobiology, Center for the Human-Animal Bond, College of Veterinary Medicine, Purdue University, West Lafayette, IN, United States, <sup>2</sup> Department of Psychiatry, School of Medicine, University of Colorado, Denver, CO, United States, <sup>3</sup> Child Development Center, Pediatrics School of Medicine, University of California, Irvine, Irvine, CA, United States, <sup>4</sup> Department of Educational Studies, College of Education, Purdue University, West Lafayette, IN, United States, <sup>5</sup> Department of Psychological Sciences, College of Health and Human Sciences, Purdue University, West Lafayette, IN, United States, <sup>6</sup> Institute for Teaching and Learning Innovation, The University of Queensland, Brisbane, QLD, Australia, <sup>7</sup> School of Psychology, The University of Queensland, Brisbane, QLD, Australia

Edited by: Peggy D. McCardle, Consultant, New Haven, CT, United States

#### Reviewed by:

Nathaniel James Hall, Texas Tech University, United States; Erika Friedmann, University of Maryland, Baltimore, United States

> \*Correspondence: Marguerite E. O'Haire mohaire@purdue.edu

#### Specialty section:

This article was submitted to Veterinary Humanities and Social Sciences, a section of the journal Frontiers in Veterinary Science

Received: 16 April 2018; Accepted: 08 October 2018; Published: 08 November 2018

#### Citation:

Guérin NA, Gabriels RL, Germone MM, Schuck SEB, Traynor A, Thomas KM, McKenzie SJ, Slaughter V and O'Haire ME (2018) Reliability and Validity Assessment of the Observation of Human-Animal Interaction for Research (OHAIRE) Behavior Coding Tool. Front. Vet. Sci. 5:268. doi: 10.3389/fvets.2018.00268

Frontiers in Veterinary Science | www.frontiersin.org | November 2018 | Volume 5 | Article 268

The Observation of Human-Animal Interaction for Research (OHAIRE) is a coding tool developed to capture the behavior of children when interacting with social partners and animals in naturalistic settings. The OHAIRE behavioral categories of focus are emotional displays, social communication behaviors toward adults and peers, behaviors directed toward animals or experimental control objects, and interfering behaviors. To date, the OHAIRE has been used by 14 coders to code 2,732 min of video across four studies with a total of 201 participants ages 5 to 18 years (M = 10.1, SD = 2.5). Studies involved animal-assisted intervention with three species (i.e., dogs, horses, and guinea pigs) and three populations (i.e., autism spectrum disorder, attention-deficit hyperactivity disorder, and typically developing children) in a school, a therapeutic horseback riding program, a group therapy program, and a hospital setting. We explored the psychometric properties of the OHAIRE through analyses of its inter-rater reliability, intra-rater reliability, convergent and divergent validity, and internal structure, using data from these four human-animal interaction studies. The average inter-rater reliability was excellent (kappa = 0.81), with good reliability in most of the behavioral categories coded. Intra-rater reliability was consistently excellent (0.87 ≤ kappa ≤ 0.96). Internal structure analyses with Cronbach's alpha supported the exploratory use of subscales to measure social communication behaviors toward peers (α = 0.638) and adults (α = 0.605), and interactions with experimental control objects (α = 0.589), and the use of a subscale to measure interactions with animals (α = 0.773). Correlation analyses with multiple questionnaires showed a convergence between positive emotional display and social behaviors as assessed by the OHAIRE and social skills as assessed by the Social Skills Rating System (SSRS) and the Social Communication Questionnaire (SCQ). Little concordance was found between the OHAIRE and the Social Responsiveness Scale (SRS) or the Aberrant Behavior Checklist-Community (ABC). The OHAIRE shows promise for wider use in the field of Human-Animal Interaction, with a need for generalization across more settings and ages.

Keywords: human-animal interaction, animal-assisted intervention, social behaviors, behavior coding, interval coding, psychometrics

### INTRODUCTION

### Background

The notion that animals can affect people's lives and behaviors in many positive ways is investigated in a field of research known as Human-Animal Interaction (HAI). As a relatively recent and interdisciplinary field, HAI is often criticized for its lack of methodological rigor (1, 2). Common HAI research critiques target weak study design, small sample sizes, and the inappropriate use of assessment tools, which limits the field's ability to develop an evidence base for animal-assisted intervention (AAI). Assessment in HAI research has relied heavily on questionnaire data and there has been a call to use more physiological measures and behavioral observation.

Physiological measures and behavioral observation are considered more objective than questionnaires, because they quantify observable physical phenomena rather than mental experiences as perceived and reported by a study participant or caregiver. Yet, while the instruments used to collect physiological data rely on direct physical measures (e.g., heart beats per minute) and assays (e.g., salivary cortisol), thus reducing the influence of human error, the quantification of behavior still requires the direct involvement of a human observer. To assess behavior, a human observer typically watches study participants directly or via a video recording and assigns numerical values to the participants' behaviors based on precise behavior definitions. From the combination of such behavior definitions with sampling and scoring procedures, researchers can develop standardized coding schemes or systems.

Standardized assessment tools are critical to building an empirical base for the HAI field by yielding results that are replicable and comparable across studies. Ultimately, the use of standardized assessment facilitates conducting meta-analyses, which summarize the empirical evidence available in the current literature on a specific topic (3, 4). While the use of standardized behavior observation schemes is common practice in the field of psychology, we are not aware of a published, validated tool that incorporates behaviors relevant to the study of HAI, that is, behaviors directed toward animals.

To address the need for a standardized human behavior coding tool adapted to HAI research, the Observation of Human-Animal Interaction for Research (OHAIRE) was developed. The OHAIRE is a behavior coding tool developed to capture the behavior of humans when interacting with social partners and animals in naturalistic settings. Here, we define naturalistic settings as any setting where participants are not asked to perform specific tasks and are free to interact with each other and with any animal present. We do not recommend the use of the OHAIRE in settings where behaviors are heavily directed (i.e., with a detailed agenda), as we are seeking to capture natural variations in willful social interactions across conditions. Behaviors captured in the OHAIRE coding tool were selected based on common research questions, commonly evaluated outcomes, and the main theories of focus in HAI research.

Four of the main theories applied in HAI research are grounded in evolutionary biology and social psychology (5). The two main evolutionary theories informing HAI research are Biophilia and Neoteny. Biophilia postulates that humans are inherently drawn to the living beings around them (6), while Neoteny refers to the presence of juvenile characteristics (e.g., large eye to head ratio, play behaviors) in adult domesticated animals, encouraging social and nurturing behaviors from humans (7). Both theories hypothesize that human beings naturally display a certain level of behavioral attention (e.g., social and nurturing) toward animals. This direct display of attention sometimes encourages social behaviors directed toward animals and leads to the creation of a human-animal bond.

The human-animal bond has been hypothesized to fit within the psychological theories of social support and attachment (5). In the social support theory framework (8), interactions with companion animals may reduce loneliness and be a source of social support for humans, as well as encourage social interactions with other humans, while attachment theory (9) applied to HAI suggests that human beings may develop attachment bonds to animals, providing emotional safety. Taken together, these theories have shaped the research questions and outcomes evaluated in socio-emotional HAI research.

To accommodate these common research questions and theories, the behavioral categories captured in the OHAIRE include social interactions, interactions with animals and control objects, emotional display, and interfering behaviors. Specific behaviors are captured to address prevalent theories, including attention to humans and animals (Biophilia), prosocial or caring behaviors (Neoteny), social interactions (social support theory), and human-animal bond (attachment theory). The OHAIRE is a timed interval coding tool designed to code behaviors from video data. In this paper, we describe the development process of the OHAIRE and present the results of analyses of its psychometric properties collected over four studies (10–12), including analyses of the OHAIRE's reliability and validity.

Reliability refers to the property of a research tool to yield consistent results when used by different observers or at different times to assess the same situation. Good reliability indicators demonstrate that the tool provides enough details to parse out the subjectivity of the observer. The objectivity of an observer can be compromised by a number of sources of bias, such as the observer's familiarity with the individual whose behavior is being coded, either in the form of a personal relationship between the observer and the individual, or through the knowledge of some characteristics or demographics of an individual (e.g., socioeconomic status, disease, or disorder diagnosis). Another source of observer bias can come from the knowledge of a study's design or hypotheses. In order to minimize the risk of bias, observers should be blinded to as many variables as possible that may influence their judgement, and given clear instructions on how to use the research tool. In this paper, we assess inter-rater reliability to test whether the OHAIRE manual contains precise and clear definitions and whether the training of coders was effective. Intra-rater reliability is assessed to measure the drift of coders' observations over time and the potential need for re-training (13).

Validity refers to the capacity of an instrument to generate data that is representative of the actual behaviors it intends to measure. Validity can be assessed using many different types of evidence. In this paper, we assess convergent and divergent validity of the OHAIRE by evaluating its correlation with standardized questionnaires. We expect that subscales of the OHAIRE will correlate with measures that assess similar constructs. We also explored the internal reliability of the subscales of the OHAIRE, or how coded behaviors from the same subscale relate to each other.

### Development of the OHAIRE Coding System

In an effort to quantify human behaviors theorized to be generated by interacting with animals, the OHAIRE coding system was developed.

#### Behavior Definitions

The choice of the behaviors to include in the OHAIRE was made based on a review of common behaviorally relevant variables reported in the HAI literature. The behaviors included can be observed in any naturalistic setting, whether the investigator is observing interactions between humans and animals in the home or during animal-assisted activities or therapy. In order to encompass common research questions in HAI research, the OHAIRE captures social interactions, interactions with animals, interactions with control objects, facial and verbal emotional display, and interfering behaviors. The list of behaviors is presented in **Table 1**.

Social interactions are a common outcome of interest in HAI research, from studies that evaluate the effect of being accompanied by a companion animal on social interactions with strangers [e.g., (14)], to the effect of animal-assisted intervention on the social skills of children with autism spectrum disorder [e.g., (15)]. The OHAIRE captures six different forms of social interactions, namely talking, looking, gesturing, touching, showing affection, and being prosocial (i.e., purposefully helpful) to others. The OHAIRE identifies the target of social interactions, whether they are directed toward adults or individuals of the same age cohort (i.e., peers) of research participants.

TABLE 1 | List of behaviors included in different versions of the OHAIRE.

V1, Version 1; V2, Version 2; V3, Version 3.

To account for interactions with animals, the OHAIRE captures the same behaviors toward animals. Following a push for more rigorous and controlled research in the field of HAI, more study designs have started to include active or attention control conditions to parse out the effect of the animal in a study. In an active or attention control condition, the participants engage in activities that mimic the amount of time and attention dedicated to participants in the treatment group. As these control conditions often include control objects, such as toys or stuffed animals, the OHAIRE captures the behaviors expressed toward these control objects.

Interacting with animals is also often reported to have a positive effect on mood and emotions [e.g., (16–18)]. To quantify this effect, the OHAIRE captures emotional display in two ways: facial emotional display and verbal emotional display. Facial emotional display refers to facial expressions of happiness, like smiling and laughing, and discontent or sadness, like frowning or crying. Verbal emotional display can be positive or negative, and refers to the valence of the speech of the participants; its coding relies on the actual words pronounced by the participant rather than on the tone of their voice.

Interfering behaviors coded with the OHAIRE encompass behaviors that may impair the individual's ability to participate in and benefit from an activity or interaction, including aggression, overactivity, and isolation. Aggression refers to any potentially harmful behaviors, and is coded along with its target (i.e., to whom or what it is directed). Overactivity is coded when a participant is loud, disruptive, or shows signs of restlessness. Isolation is coded when a participant is socially withdrawn, not engaged in their social environment.

All behaviors captured with the OHAIRE are described extensively in the OHAIRE coding manual. For each behavior, detailed coding tips and multiple examples are provided.

#### OHAIRE Versions

Between its first use in 2013 (11) and the current paper, the OHAIRE has undergone modifications to improve the usability and psychometric properties of the tool. In total, three different versions of the OHAIRE were used over four studies, coded in six coding periods. Between the OHAIRE-Version 1 (OHAIRE-V1) and the OHAIRE-Version 2 (OHAIRE-V2), definitions of negative emotional display were simplified, gestures were added as a social communication behavior, interfering behaviors were simplified, and anxiety was added to the list of interfering behaviors. Between the OHAIRE-V2 and the OHAIRE-Version 3 (OHAIRE-V3), the definition of negative facial emotional display was further simplified, verbal emotional display was re-introduced, and interfering behaviors were re-arranged. The list of behaviors that were recorded for each version of the OHAIRE is available in **Table 1**. The mean, standard deviation, and skew of all behaviors are presented in **Table 4**. Overall, between the OHAIRE-V1 and the OHAIRE-V3, behaviors were added, removed, or merged in the tool, but the definitions of the behaviors were stable over time, which allows us to use data from all four studies coded with the OHAIRE so far for reliability and validity analyses.

### METHODS

### Studies

The OHAIRE was used to assess the behavior of children in four independent HAI studies exploring the effects of animal-assisted intervention. A summary of the main characteristics of each study included in the analyses is presented in **Table 2**. The total combined sample for this paper included 201 children aged 5 to 18 (M = 10.1, SD = 2.5) and 2,732 min of coded video data.

### Study 1—Species: Guinea Pigs, Population: Children With ASD and Typically Developing Children

Study 1 assessed the effects of Animal-Assisted Activities (AAA) with guinea pigs in inclusion classrooms (11). Inclusion classrooms accommodate typically developing (TD) children as well as their peers with Autism Spectrum Disorder (ASD). Participants were recruited from 15 inclusion classrooms within four schools in the area of Brisbane, Australia. Thirty-three groups of three children participated in this program, each pairing one child with ASD with two TD children randomly selected from the same classroom (N = 99). Participants were aged 5 to 12 years old (M = 9.1, SD = 2.3). All groups participated in free-play sessions with toys and AAA sessions with guinea pigs. There were three 10-min free-play sessions with toys: one before an 8-week waitlist control, one after the waitlist and before an 8-week AAA program, and one at the end of the AAA program. The AAA program consisted of bi-weekly 20-min free-interaction sessions with guinea pigs and animal-related materials for 8 consecutive weeks. All sessions were video-recorded, and three toy sessions and three AAA sessions were selected for behavior coding. The first 10 min of each session was selected for coding. Results of this study indicated that children with ASD displayed more social behaviors, more positive affect, and less negative affect in the presence of animals, compared to toys (11). For TD children, results indicated more social behaviors, especially toward adults, and more positive emotional display in the presence of animals, compared to toys (19).

ASD, Autism Spectrum Disorder; TD, Typically-developing; ADHD, Attention-Deficit Hyperactivity Disorder; AAA, Animal-Assisted Activities; CBT, Cognitive-Behavioral Therapy; OHAIRE, Observation of Human-Animal Interaction for Research.

### Study 2—Species: Horses, Population: Children With ASD

Study 2 assessed the effects of a Therapeutic Horseback Riding (THR) program for children with ASD (10). Sixteen participants ages 6 to 16 years (M = 10.2, SD = 3.0) were randomly assigned to a 10-week THR program or a 10-week control program of barn activities. Both conditions offered 45-min, once weekly sessions in small groups (2–4 participants). During the THR group, participants (n = 8) learned horsemanship and riding skills while engaged with a horse. The barn activity group participants (n = 8) learned similar horsemanship skills, but without contact with horses; instead, activities involved a life-size stuffed horse. Participants in this study were filmed for a minimum of 1 min before and after each intervention group (THR and barn activity), and all sessions were included in behavior coding. Participants in the THR group were recorded before the group while waiting to ride, seated on a bench on the side of the riding arena. Barn activity group participants were recorded while waiting for the group to begin, seated at the group table. Participants in both groups were recorded in similar conditions after the groups (i.e., seated at a table with their respective groups engaging with art materials). Because of the timing of the recordings, participants were not taped when interacting with horses or stuffed horses; thus, the results for this study do not include interactions with animals and control objects, but do include all other behaviors normally coded with the OHAIRE (emotional display, social interactions, and interfering behaviors).

### Study 3—Species: Dogs, Population: Children With ADHD

Study 3 evaluated the effect of the inclusion of a dog in a cognitive behavioral therapy (CBT) group program for children with Attention-Deficit Hyperactivity Disorder (ADHD; 12). Thirty-six children ages 7–9 years old (M = 7.9, SD = 0.72) with a diagnosis of ADHD were randomized to groups of six participants that received CBT either in the presence of dogs (n = 18) or with stuffed, plush dogs (n = 18). Participants attended twice-weekly sessions (a total of 4 ½ h per week) for 12 weeks, over 23 sessions. All sessions were video-recorded. Five sessions were selected for behavior coding (sessions 1, 7, 12, 18, and 23), with an attempt to maximize the number of participants present at each of the coded sessions, and to represent different sessions at regular intervals during the length of the intervention.

### Study 4—Species: Dogs, Population: Children With ASD

Study 4 assessed the effect of a dog's presence on the behavior of youth with ASD and co-existing psychiatric diagnoses admitted to a developmental disability specialty psychiatric unit. A total of 76 children and adolescents with ASD aged 6 to 18 (M = 12.4, SD = 3.5) participated in this crossover design, which consisted of 10-min sessions of unstructured activities with either a dog and adult handler or a marble track toy and adult handler. Forty-seven children participated in both types of sessions, 23 children participated in sessions with the dog only, and six children participated in sessions with the marble track toy only. Children participated in the activities in groups of two or three, with an adult supervisor. All sessions were video-recorded and used for behavior coding.

### Ethical Considerations

Written informed parental consent and oral child participant assent were obtained for all participants in the studies used in the present article. The protocols for video transfers between institutions and coding of the videos at the first author's institution were reviewed and approved by the Purdue Institutional Review Board (Approval #1410015340). Study 1 human-related protocols were reviewed and accepted by the University of Queensland's Human Ethics Committee (Approval # 2010001284) and animal-related protocols were reviewed and accepted by the University of Queensland's Animal Ethics Committee (Approval # SPH/057/11). Study 2 and 4 human-related protocols were reviewed and accepted by the Colorado Multiple Institutional Review Board (Study 2, Approval # 07-1148; Study 4, Approval # 15-1227) and animal-related protocols were considered exempt from review by the University of Colorado IACUC as no research was directly performed on the animals. Study 3 human-related protocols were reviewed and approved by the University of California Irvine Institutional Review Board (Approval # 2010-7679) and animal-related protocols were considered exempt from review by the University of California Irvine Institutional Animal Care and Use Committee as no research was directly performed on the animals.

### Behavior Coding Sampling Method

The OHAIRE coding system uses the online data entry system Qualtrics (20) to facilitate coding and reduce data entry error. The OHAIRE relies on the coding of 1-min video segments that are divided into six 10-s intervals. For each 10-s interval, behaviors are described as either present (1) or absent (0). The scores for each interval are summed to create a score out of six for a full minute for each behavior. This type of coding, called one-zero sampling or interval sampling, is an effective way to code large amounts of video data with high inter-rater reliability (21). In one-zero sampling, the behaviors are not rated in intensity, but rather coded as present or absent; thus, this technique is referred to as behavior coding, and the observers as coders. The lack of intensity rating and the coding as present or absent rather than an exact duration measurement are often cited as drawbacks of one-zero sampling, whereas its ease of use, which yields high reliability, and its efficiency are cited as its major strengths [e.g., (22, 23)]. To verify the accuracy of one-zero sampling in our sample, we compared its use with measuring the exact duration of behaviors. To reduce time burden, we selected one behavior for one coder to measure using both one-zero sampling and exact duration measurement in a randomly selected set of 60 1-min videos. We selected the behavior "smiling," because it is common, but varies largely between children and videos. We selected videos from Study 1 to compare one-zero sampling and duration measurement because this study had excellent video quality. Study 1 also included both ASD and TD children, which increased variability. A coder viewed 60 videos of children (30 ASD; 30 TD) from Study 1. Using a Spearman rank correlation to accommodate the ordinal one-zero sampling data, we found an excellent correlation (r = 0.92, p < 0.001) between the two sampling techniques (**Figure 1**). Additionally, the coder went through one-zero sampling faster than duration measurement, and reported feeling more confident with the judging criteria for one-zero sampling than for duration measurement. We concluded that with high reliability, high efficiency, and little loss in information, one-zero sampling is suited for use with the OHAIRE to address the current state of research in HAI, as proof-of-concept is still needed for numerous research questions.
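To make the scoring rule concrete, the following is a minimal sketch of one-zero sampling for a single behavior in a single 1-min segment, alongside the exact-duration measure it was compared against. It is an illustration only, not the OHAIRE software: the function names, the `(start, end)` bout format, and the example "smiling" times are hypothetical.

```python
def one_zero_score(bouts, interval_s=10.0, n_intervals=6):
    """Score one behavior in one video segment with one-zero sampling.

    bouts: list of (start, end) times in seconds during which the behavior
    was observed (hypothetical input format, assumed for this sketch).
    Returns (flags, total): a 0/1 flag per interval and their sum (0 to 6).
    """
    flags = []
    for i in range(n_intervals):
        lo, hi = i * interval_s, (i + 1) * interval_s
        # Present (1) if any bout overlaps this interval, otherwise absent (0).
        present = any(start < hi and end > lo for start, end in bouts)
        flags.append(1 if present else 0)
    return flags, sum(flags)


def total_duration(bouts):
    """Exact-duration measure: total seconds of behavior (bouts assumed non-overlapping)."""
    return sum(end - start for start, end in bouts)


# Hypothetical smiling bouts in one 60-s segment.
smiling = [(2.0, 4.5), (18.0, 23.0), (55.0, 58.0)]
flags, score = one_zero_score(smiling)
print(flags, score)             # [1, 1, 1, 0, 0, 1] 4
print(total_duration(smiling))  # 10.5
```

Note that the bout from 18 to 23 s counts in two intervals because it straddles the 20-s boundary, which is one reason one-zero scores and exact durations are related but not interchangeable.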

#### Training

Each new coder undergoes a standardized training to learn to use the OHAIRE coding system. The training starts with a detailed study of the manual and the viewing of example videos for each behavior. Coders are then taught how to use the online coding system and the video sampling procedure. Next, coders are trained to code with videos from the specific study they will be working on. Since HAI is a broad field with different populations and types of interactions, coders should reach inter-rater reliability on a sample of the specified study's data before starting to code. The trainer and the coders first code a full minute of video together. Then, each coder views and then codes three videos by him or herself. After coding three videos, inter-rater reliability with the trainer is calculated. Differences in coding are discussed, and three more videos are coded. Cycles of coding three videos and subsequently discussing reliability continue until each coder has reached excellent overall inter-rater reliability (Cohen's Kappa > 0.8). This initial phase of training typically takes 3 to 5 h. Training will be made available to a larger public in the Spring of 2019. For more information, please visit http://www.ohairecoding.com.

### Coders

For each study, one primary coder was designated to code the full set of videos. The data obtained from the primary coder was used for the scoring of the OHAIRE and the outcome data analyses. Additionally, one or more secondary coders coded at least 20% of the videos to calculate inter-rater reliability. Videos coded for reliability were selected randomly from the main coding sets with a random number generator. A total of 14 coders were trained in and used the tool. Coders are individually referred to as the letter "C" followed by a number between 1 and 14 for the rest of this article.
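The random selection of reliability videos could be sketched as follows. This is an assumption-laden illustration rather than the procedure actually used: the function name, the video identifiers, and the seed are hypothetical, and the subset size is rounded up so that at least the requested fraction is double-coded.

```python
import math
import random


def select_reliability_subset(video_ids, fraction=0.20, seed=None):
    """Randomly pick at least `fraction` of the videos for secondary coding."""
    rng = random.Random(seed)  # a seeded generator makes the selection reproducible
    n = math.ceil(fraction * len(video_ids))
    return sorted(rng.sample(list(video_ids), n))


# Hypothetical main coding set of 60 videos.
videos = [f"video_{i:03d}" for i in range(1, 61)]
subset = select_reliability_subset(videos, seed=2018)
print(len(subset), subset[:3])
```

Sampling without replacement from the main coding set keeps the reliability subset representative of the videos that enter the outcome analyses.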

### Questionnaires

Each study included standardized informant-report questionnaires. We decided to focus on questionnaires that had been used in at least two studies to explore the convergent and divergent validity of the OHAIRE. Questionnaires included in each study are listed in **Table 3**.

ABC, Aberrant Behavior Checklist; SCQ, Social Communication Questionnaire; SRS, Social Responsiveness Scale; SSRS, Social Skills Rating System.

#### Aberrant Behavior Checklist

The Aberrant Behavior Checklist-Community [ABC-C; (24)] is a 58-item questionnaire developed to assess interfering behaviors in children and adults with intellectual and developmental disabilities. The ABC-C comprises five subscales, including irritability and agitation, lethargy and social withdrawal, stereotypic behavior, hyperactivity and non-compliance, and inappropriate speech. In multiple studies, the ABC-C has shown high internal consistency, good inter-rater reliability, and a consistent five-factor structure [e.g., (25, 26)]. Higher ABC-C scores indicate more aberrant behaviors.

The ABC-C was used in Study 2 and Study 4 for two purposes: as a screening measure for entry in the studies, and as a weekly outcome measure using the irritability subscale (10). For consistency, we used only the first ABC-C score recorded for each child (baseline score) in the present analyses. In both studies, the ABC-C was completed by a caregiver for each child.

#### Social Communication Questionnaire

The Social Communication Questionnaire [SCQ; (27)] is a 40-item questionnaire developed to assess autism-like behavior in individuals of all chronological ages and with a developmental age over 2 years. The SCQ demonstrates good internal consistency, test-retest reliability, and convergent reliability with other ASD diagnostic tools (27). Higher SCQ scores indicate more behaviors characteristic of ASD.

The SCQ-Lifetime was completed in both Study 1 and Study 2 by caregivers of the participants upon entry in the study, as an additional screening measure for ASD.

#### Social Responsiveness Scale

The Social Responsiveness Scale [SRS; (28)] is a 65-item rating scale developed to measure symptoms associated with autism spectrum disorder. The SRS comprises five subscales: social awareness (eight items), social cognition (12 items), social communication (22 items), and social motivation (11 items), which can be summarized in an overall Social subscale score, as well as Restricted and Repetitive Behaviors. The SRS demonstrates high internal consistency and test-retest reliability (29). Its updated version, the SRS-2, enlarges the age range of the intended SRS test-taking population (30). Higher SRS scores indicate more problems in the designated subscale.

Participants' caregivers completed the SRS in Study 2, and the SRS-2 in Study 4. For the age ranges of participants included in this paper, the SRS-2 does not introduce new subscales or items; scores of the SRS and SRS-2 are therefore presented together in the subsequent analyses. In both studies, questionnaires were completed upon entry in the study and after the intervention period. For consistency, we used the SRS and SRS-2 scores of participants at study entry for the validity analyses in this paper.

#### Social Skills Rating System

The Social Skills Rating System [SSRS; (31)] is a 57-item (teacher version) or 55-item (parent version) rating scale developed to measure Social Skills and Competing Problem Behaviors as rated by parents or teachers, and academic competence as rated by teachers in children. The SSRS demonstrates adequate internal consistency and test-retest reliability (31). Its updated version, the Social Skills Improvement System [SSIS; (32)], is a 79-item measure structured similarly, with additional subscales and improved psychometric properties. Because scores on the social skills and problem behavior scales of the SSRS and the SSIS are highly correlated (33), these scores will be presented together in the subsequent analyses.

In Study 1, the SSRS was completed by parents and teachers of participants upon entry in the study, after an 8-week waitlist period, and after an 8-week program of animal-assisted activities. In Study 3, the SSIS was completed by parents of participants upon entry in the study, at the end of the intervention period, and at a 6-week follow-up. For consistency, SSRS and SSIS scores from the time of study entry are used for validity analyses in this paper. Higher scores indicate better skills in the social skills and academic competence subscales of the SSRS and SSIS, while higher scores indicate more problem behaviors in the competing problem behavior subscale.

### Data Analyses

#### Inter-rater Reliability

Ensuring that the observation coding tool was used consistently across coders was important to parse out coders' subjectivity, which may reflect the quality of the training and the precision of the manual. To assess inter-rater agreement, a primary coder coded all (100%) of the videos for each study, and one or two secondary coders coded 20% of the videos or more. We calculated Cohen's kappa (34), an agreement coefficient that corrects for chance agreement. Cohen's kappa values range from −1, indicating complete disagreement, to 1, indicating perfect agreement. In this paper, we base our interpretation of kappa values on recent guidelines, considering values above 0.20 minimal, above 0.40 weak, above 0.60 moderate, above 0.80 strong, and above 0.90 excellent (35).
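The agreement statistic described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names `cohens_kappa` and `interpret_kappa`, the label used for values at or below 0.20, and the example codes are all assumptions made for the example. It computes Cohen's kappa as observed agreement corrected for chance agreement, then applies the cut-offs quoted in the text (35).

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must rate the same non-empty set of units")
    n = len(codes_a)
    # Observed agreement: proportion of units given the same code by both coders.
    p_observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over codes.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_chance = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both coders use one identical code")
    return (p_observed - p_chance) / (1 - p_chance)


def interpret_kappa(kappa):
    """Label a kappa value with the cut-offs quoted in the text (35)."""
    cutoffs = [(0.90, "excellent"), (0.80, "strong"), (0.60, "moderate"),
               (0.40, "weak"), (0.20, "minimal")]
    for threshold, label in cutoffs:
        if kappa > threshold:
            return label
    # Assumption: the text gives no label for values at or below 0.20.
    return "below minimal"


if __name__ == "__main__":
    # Hypothetical interval-by-interval codes from two coders (not study data).
    primary = ["smile", "smile", "none", "none", "smile",
               "none", "none", "smile", "none", "none"]
    secondary = ["smile", "smile", "none", "none", "none",
                 "none", "none", "smile", "none", "smile"]
    kappa = cohens_kappa(primary, secondary)
    print(round(kappa, 2), interpret_kappa(kappa))
```

In this made-up example the coders agree on 8 of 10 intervals, yet kappa is noticeably lower than 0.80 because about half of that agreement would be expected by chance given how often each coder used each code.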

#### Intra-Rater Reliability

Observer drift can arise in the days or weeks following initial inter-rater reliability training and can result in observers coding behaviors with less accuracy (13, 36). To assess the risk of observer drift in the OHAIRE, we calculated intra-rater reliability for a random selection of videos from all four studies included in this paper. Coders were assigned a list of 30 videos to code in 1 week, then again 2 weeks later. We calculated Cohen's kappa between the two coding repetitions for each study. We used McHugh's interpretation of Cohen's kappa for intra-rater reliability (35).

### Convergent and Divergent Validity

We examined potential correlations of the OHAIRE with questionnaire data to provide evidence of convergent and divergent validity. We compared the average OHAIRE score of each participant with the ABC-C, the SCQ, the SRS and SRS-2, and the SSRS and SSIS scores upon entry in studies. For all questionnaires, raw scale and subscale scores were used. OHAIRE behavior scores of facial emotional display, verbal emotional display, and interfering behaviors were included individually in the analyses. OHAIRE scores of social interactions with peers, social interactions with adults, interactions with animals (human-animal bond score), and interactions with objects were included as subscale scores. Pearson's correlations were used because the questionnaires use continuous rating scales and the mean OHAIRE values per participant range in a near-continuous way from 0 to 6. We hypothesized the following correlations:

	- ABC-C:
		- (a) Irritability and Agitation subscale correlated negatively with positive facial and verbal emotional display, and positively with negative facial and verbal emotional display.
		- (b) Lethargy and Social Withdrawal subscale correlated negatively with social interactions with peers and adults, and positively with social isolation.
		- (c) Stereotypy and Hyperactivity subscales correlated positively with overactivity.
	- SCQ:
		- (a) SCQ scores correlated negatively with positive facial expressions (smile, laugh), and social interactions with peers and adults.
		- (b) SCQ scores correlated positively with negative facial expressions and overactivity.
	- SSRS and SSIS:
		- (a) Social skills scale correlated positively with OHAIRE scores of social interactions with peers and adults, and negatively with isolation.
		- (b) Competing problem behaviors scale correlated negatively with OHAIRE scores of social interactions with peers and adults, and positively with OHAIRE scores of aggression, overactivity, and isolation.
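To make the correlational approach concrete, the sketch below computes a Pearson correlation between per-participant mean behavior scores and questionnaire scores, together with the t statistic used to test it against zero. It is an illustration under stated assumptions rather than the analysis pipeline of the studies: the function names and all example values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two samples of equal length with at least 3 values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 df)."""
    if abs(r) >= 1:
        raise ValueError("t is unbounded for a perfect correlation")
    return r * math.sqrt((n - 2) / (1 - r * r))


if __name__ == "__main__":
    # Hypothetical per-participant values (not study data): mean coded
    # smiling score (0 to 6) and a questionnaire total score.
    mean_smile = [0.5, 1.0, 2.0, 3.5, 4.0, 5.5]
    questionnaire_total = [30, 28, 22, 18, 15, 9]
    r = pearson_r(mean_smile, questionnaire_total)
    print(round(r, 2), round(t_statistic(r, len(mean_smile)), 2))
```

Converting the t statistic into a p-value requires the t distribution, which the Python standard library does not provide; a statistics package (for instance `scipy.stats.pearsonr`, which returns both the coefficient and the p-value) would normally be used for that step.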

### Structure

The behaviors coded in the OHAIRE were originally arranged in behavioral categories designed to facilitate ease of coding (i.e., emotional display, interactive behaviors, and interfering behaviors), rather than designed to be used as aggregate subscales. While the behavioral categories "emotional display" and "interfering behaviors" consist of unique behaviors that have distinct functions, behaviors coded in the category "interactive behaviors" refer to the common function of interacting with either a peer, an adult, an animal, or an object. We used Cronbach's alpha (37) to assess the internal consistency of the following subscales for the OHAIRE: social interactions with adults, social interactions with peers, interactions with animals, and interactions with objects. We used average OHAIRE scores for each participant.
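As a minimal sketch of the internal consistency calculation, the code below applies the standard formula for Cronbach's alpha, α = k/(k − 1) × (1 − Σ item variances / variance of the summed score), where k is the number of items. The data layout (one list of per-participant averages for each behavior in a subscale), the function name, and the example values are assumptions for illustration only.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha.

    item_scores: one list per item (here, per coded behavior in a subscale),
    each holding one score per participant, in the same participant order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("a subscale needs at least two items")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("every item needs a score for each of 2+ participants")
    # Sample variance of each item across participants.
    item_variances = [variance(item) for item in item_scores]
    # Summed subscale score for each participant, and its variance.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical average scores for two behaviors across four participants.
    talk = [1, 2, 3, 4]
    gesture = [2, 1, 4, 3]
    print(round(cronbach_alpha([talk, gesture]), 2))
```

Alpha rises when the items vary together across participants, so that the variance of the summed score is large relative to the sum of the individual item variances.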

### RESULTS

The descriptive statistics for behavioral codes of the OHAIRE across all studies, averaged by child and then by study, are presented in **Table 4**.

### Inter-rater Reliability

The number of videos coded by primary and secondary coders for each study, as well as overall Cohen's kappa between pairs of coders for the OHAIRE coding system and for five categories of behaviors, are presented in **Table 5**. Overall, inter-rater reliability was excellent (0.79 < k < 0.88), with differences across behavior categories. Facial and verbal emotional display were coded with moderate to excellent agreement (0.62 < k < 0.99), and interfering behaviors yielded strong to excellent agreement across all studies (0.88 < k < 0.98). Social communication yielded weak to moderate agreement in most studies (0.37 < k < 0.79), with a drop in kappa for the TD sample of Study 1. Interactions with animals and objects yielded moderate to excellent reliability in most studies (0.67 < k < 0.91), except for Study 2 (k = 0.16). The most recent version of the coding system (OHAIRE-V3), used in Studies 3 and 4, yielded moderate to excellent inter-rater reliability in all categories.

### Intra-rater Reliability

Intra-rater reliability for coding occasions separated by 2 weeks was calculated for a subset of 26 to 30 videos per study. Overall intra-rater reliability was excellent, with Cohen's kappa varying between 0.87 and 0.96 (**Table 6**). Intra-rater reliability was moderate to excellent across five behavior categories, with slightly lower reliability for social communication (0.72 < k < 0.88), and excellent agreement for interfering behaviors (0.97 < k < 0.98). Intra-rater reliability varied between coders: one coder (C13, Study 4) performed slightly worse than the others, with a strong kappa of 0.87, compared with excellent kappas (above 0.90) for the other three coders.

TABLE 4 | Descriptive statistics of OHAIRE behavioral codes, averaged by individual.


### Convergent and Divergent Validity

#### Aberrant Behavior Checklist-Community

Pearson's correlations between the OHAIRE behavior scores and the ABC-C scores are summarized in **Table 7**. Contrary to our hypotheses, the Irritability and Agitation subscale did not correlate significantly with positive facial and verbal emotional display. It correlated positively with negative facial display for Study 2 but not for Study 4. Contrary to our hypotheses, the Lethargy and Social Withdrawal subscale was correlated negatively with social interactions with adults only in Study 4, and was not correlated positively with social isolation. Additionally, the ABC-C Lethargy and Social Withdrawal subscale was negatively correlated with interactions with animals and overactivity in Study 2. Contrary to our hypotheses, the Stereotypy and Hyperactivity subscales were not correlated positively with the OHAIRE overactivity scale, but ABC-C Hyperactivity was correlated positively with negative emotional display (r = 0.72, p < 0.001) and interactions with adults (r = 0.67, p = 0.003), and negatively with social isolation (r = −0.52, p = 0.033). Contrary to our hypotheses, the Inappropriate Speech subscale did not correlate positively with aggression.

#### Social Communication Questionnaire

Pearson's correlations between the OHAIRE behavior scores and the SCQ scores are summarized in **Table 8**. Confirming our hypothesis, SCQ scores correlated significantly and negatively with positive facial expressions (smile, r = −0.56, p < 0.001; laugh, r = −0.21, p = 0.049), and with social interactions with peers (r = −0.50, p < 0.001), although not with adults for Study 1. SCQ scores correlated positively with negative facial expressions as hypothesized (r = 0.34, p = 0.001), but contrary to our hypothesis, not with overactivity for Study 1. Trends were overall the same for Study 2, without reaching statistical significance. Additionally, the SCQ correlated positively with interactions with animals (r = 0.43, p < 0.001) and objects (r = 0.42, p < 0.001), and aggression (r = 0.40, p < 0.001), and negatively with isolation (r = −0.52, p < 0.001) in Study 1. It correlated negatively with interactions with objects (r = −0.61, p < 0.001) in Study 2.

#### Social Responsiveness Scale

Pearson's correlations between the OHAIRE behavior scores and the SRS scores are summarized in **Table 9**. Contrary to our hypotheses, no statistically significant correlations were observed between the SRS and OHAIRE behavior scores. Overall tendencies show a possible positive association between the Restricted Interests and Repetitive Behaviors subscale and negative facial emotional display, and a negative association with positive facial emotional display. The Social subscale did not correlate negatively with OHAIRE scores of social interactions with peers and adults, or positively with isolation, and the Restricted Interests and Repetitive Behaviors subscale did not correlate positively with overactivity.

TABLE 5 | Inter-rater reliability results.


ASD, Autism Spectrum Disorder; TD, Typically-developing; all ps < 0.001. Bold face font indicates acceptable inter-rater reliability (k > 0.40).



#### Social Skills Rating System

Pearson's correlations between the OHAIRE behavior scores and the SSRS and SSIS scores are summarized in **Table 10**. In Study 1, the Social Skills scale of the SSRS as rated by parents and teachers was positively correlated with OHAIRE scores of social interactions with peers as hypothesized (parent, r = 0.42, p < 0.001; teacher, r = 0.28, p = 0.006), but, contrary to our hypothesis, it was not correlated with social interactions with adults, and it was positively correlated with isolation (parent, r = 0.39, p < 0.001; teacher, r = −0.44, p < 0.001). Additionally, the Social Skills scale of the SSRS was positively correlated with smiling (parent, r = 0.51, p < 0.001; teacher, r = 0.42, p < 0.001), negatively correlated with negative facial emotional display (parent, r = −0.23, p = 0.031; teacher, r = −0.30, p = 0.003) and negative verbal emotional display (parent, r = −0.19, p = 0.303; teacher, r = −0.35, p = 0.043), and negatively correlated with interactions with animals (parent, r = −0.33, p = 0.001; teacher, r = −0.30, p = 0.003) and objects (parent, r = −0.28, p = 0.006; teacher, r = −0.40, p < 0.001). In Study 3, the Social Skills scale of the SSIS was not correlated with emotional display or social interactions, but was positively correlated with OHAIRE behavior scores of aggression (r = 0.44, p < 0.001).

In Study 1, the Competing Problem Behaviors scale of the SSRS was correlated negatively with OHAIRE scores of social interactions with peers as hypothesized (parent, r = −0.23, p = 0.031; teacher, r = −0.20, p = 0.045), but unexpectedly not with adults. It was also, contrary to our hypotheses, not correlated with overactivity, and positively correlated with aggression (parent, r = 0.24, p = 0.024; teacher, r = −0.44, p < 0.001), and isolation (parent, r = −0.48, p < 0.001; teacher, r = −0.40, p < 0.001). Additionally, the Competing Problem Behaviors scale of the SSRS was positively correlated with interactions with animals (parent, r = 0.24, p = 0.020; teacher, r = 0.23, p = 0.026), and objects (parent, r = 0.36, p < 0.001; teacher, r = −0.37, p < 0.001). In Study 3, the Competing Problem Behaviors subscale was not correlated with OHAIRE behavior scores of facial emotional display or social interactions, but was negatively correlated with interactions with objects (r = −0.38, p = 0.021).

Finally, the Academic Competence subscale of the SSRS was positively correlated with OHAIRE behavior scores of smiling (r = 0.31, p = 0.002), and isolation (r = 0.38, p < 0.001), and negatively correlated with OHAIRE behavior scores of interactions with animals (r = −0.32, p = 0.001) and objects (r = −0.34, p = 0.001), and aggression (r = −0.29, p = 0.004).

### Structure

Cronbach's alphas were moderate for subscales of social interactions with peers (α = 0.638), social interactions with adults (α = 0.605), interactions with animals (α = 0.773), and low for interactions with objects (α = 0.589).



† p < 0.10; \*p < 0.05; \*\*p < 0.01; \*\*\*p < 0.001.

### DISCUSSION

The OHAIRE coding tool was developed to fill a need for a standardized behavior observation method in the field of HAI. In this article, we presented analyses of its reliability and validity, and summarized changes to the tool implemented to improve its psychometric properties in the OHAIRE-V2 and OHAIRE-V3.

Overall, the OHAIRE demonstrated good inter-rater reliability, with variability between behavioral categories and increasing reliability through the versions of the OHAIRE. Intra-rater reliability was excellent but varied slightly between coders. Correlational analyses showed limited concordance between the behaviors coded with the OHAIRE during animal-assisted intervention and questionnaires measuring various aspects of social communication, interfering behaviors, and ASD symptoms. These correlations varied widely across studies and questionnaires. Analyses of subscale internal consistency showed predominantly low to moderate Cronbach's alpha values.

The inter-rater reliability of the OHAIRE was overall excellent but varied with the version of the tool used, and peaked in the latest version of the tool, the OHAIRE-v3. Low inter-rater reliability was interpreted as a lack of precision of the coding manual, and following inter-rater reliability analyses, changes were made to increase its clarity. For example, the notions of initiation and response of interactions that were included in the OHAIRE-v2 led to confusion, and apart from expert raters (RG & MG), they yielded low inter-rater reliability for social communication and interactions with objects or animals. Only the form of interaction (talk, gesture, etc.) was retained for analyses in the current paper, and for the next version of the tool. The latest version of the tool, the OHAIRE-v3, shows improved reliability from previous versions in all behavioral categories.

In addition to imprecisions in the earlier versions of the coding manual, one reason for lower inter-rater reliability may be the personal performance of coders. The calculation of intra-rater reliability indicated how well coders retain their training and whether some behavior definitions are more or less likely to drift over time. While all coders retained excellent reliability over time, one coder scored slightly lower than others in all categories (except for interfering behaviors), despite having received the same training. This difference highlights the need for careful recruitment and in-depth training.

Analyses of the convergence of the OHAIRE with standardized questionnaires showed varying correlations depending on the questionnaire and the sample tested. Overall, our hypotheses as to the direction of correlations between the OHAIRE and the various questionnaires were not supported. One important source of variation in correlations was the study tested. For example, the SCQ and the SSRS show strong correlations with the OHAIRE as used in Study 1, but much less so for Study 2 (SCQ) and Study 3 (SSRS). This difference is likely due to the difference in samples between studies. While Study 1 had a mixed sample of TD children and children with ASD from inclusion classrooms, both Study 2 and Study 3 had samples of participants enrolled in a treatment program for one particular neurodevelopmental disorder (ASD and ADHD, respectively). Specifically, there are strong correlations between the "interactions with animals" subscale of the OHAIRE and the SCQ and SSRS in Study 1, which is consistent with differences in SCQ and SSRS scores between children with ASD and TD children in this sample, and more interactions with animals displayed by children with ASD compared to TD children in this study (11, 19). The lack of correlations between questionnaires and behaviors coded with the OHAIRE may reflect a lower variance in these populations. For example, a minimum SCQ score was required for children with ASD to be able to participate in Study 2. If all children have SCQ scores in a restricted range, it may be expected that we see weaker or no correlations with the OHAIRE.
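The range-restriction argument can be illustrated with a small simulation. The sketch below uses simulated data only, not data from any of the studies, and every name and parameter value in it is an assumption chosen for the example. It draws a questionnaire score and a behavior score with a known correlation, then recomputes the correlation after keeping only participants above a questionnaire cut-off, mimicking a screening criterion.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_range_restriction(n=5000, true_r=0.6, cutoff_quantile=0.7, seed=0):
    """Return (full-sample r, r among participants above the cut-off)."""
    rng = random.Random(seed)
    questionnaire, behavior = [], []
    for _ in range(n):
        q = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        # Behavior score built to correlate with the questionnaire at true_r.
        b = true_r * q + math.sqrt(1 - true_r ** 2) * noise
        questionnaire.append(q)
        behavior.append(b)
    full_r = pearson_r(questionnaire, behavior)
    # Keep only participants at or above the cut-off, as an entry criterion would.
    cutoff = sorted(questionnaire)[int(cutoff_quantile * n)]
    kept = [(q, b) for q, b in zip(questionnaire, behavior) if q >= cutoff]
    restricted_r = pearson_r([q for q, _ in kept], [b for _, b in kept])
    return full_r, restricted_r


if __name__ == "__main__":
    full_r, restricted_r = simulate_range_restriction()
    print(f"full sample r = {full_r:.2f}, screened sample r = {restricted_r:.2f}")
```

The underlying relationship is identical in both computations; only the spread of questionnaire scores differs, which is enough to shrink the observed correlation in the screened sample.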

Another important consideration is that the OHAIRE directly evaluates the behavior of children during interventions. The questionnaires used in correlation analyses were mostly completed by caregivers, asking retrospective questions about the recent behavior of their child. However, behavior can vary widely from one setting to the other (38), and we do expect it to vary when the child is participating in animal-assisted intervention sessions. In the future, correlating behaviors as coded with the OHAIRE with change scores in questionnaires from before to after an intervention might help to explain how a child particularly benefited from a given intervention. The comparison of behavioral data with continuous physiological data, such as electrodermal activity or heart rate variability, may also provide evidence of convergent validity of the OHAIRE with another direct measure.

TABLE 9 | Pearson's correlations between the OHAIRE coding system and the Social Responsiveness Scale.

† p < 0.10; \*p < 0.05; \*\*p < 0.01; \*\*\*p < 0.001.

In addition to observing child behavior, recording the behavior of an animal in animal-assisted intervention may provide a more complete picture of human-animal interaction, including animal welfare. The dyadic analyses of the behavior of a human study participant and an animal may help identify specific activities with the animal or behaviors of the animal that trigger certain responses in a child. The development of animal behavior modules for species often included in animal-assisted intervention (e.g., dogs, horses) is a next step in the development of the OHAIRE.

Analyses of internal consistency with Cronbach's alpha yielded preliminary support for the use of four interaction subscales: social interactions with peers, social interactions with adults, interactions with animals (i.e., human-animal bond score), and interacting with a toy or control object. Specifically, the subscale measuring interactions with animals shows high internal consistency and can be used to quantify the engagement of a study participant with animals. This behavioral human-animal bond score may also be used in the future as a potential moderator of animal-assisted intervention success. For example, future studies may use the behavioral human-animal bond score as a way to explore whether an animal-assisted intervention's success depends on the actual level of engagement of its participants with animals, thereby exploring the active role of animals in animal-assisted intervention. The low Cronbach's alpha value for interactions with objects may stem from the very low frequency of some behaviors (e.g., prosocial behaviors toward objects, which would only have been recorded if a child tried to "help" a toy, by cleaning or repairing it, or otherwise taking care of it). Repeating these analyses in future studies using control objects more likely to receive such attention from children (e.g., dolls or stuffed animals) will allow for further exploration of the internal reliability of this subscale. We currently recommend the use of subscales in the OHAIRE for interactions with animals, and the exploratory use of subscales for social interactions with peers, social interactions with adults, and interactions with objects. We recommend that researchers using these subscales present Cronbach's alphas in future publications for ongoing monitoring. We do not recommend using subscales for presenting behavior results in the behavioral categories of emotional display and problem behaviors. Additionally, while the current sample size did not lend itself to the use of factor analysis, future structure analyses for the OHAIRE may include factor analysis to confirm the suitability of the use of these subscales.

TABLE 10 | Pearson's correlations between the OHAIRE coding system and the Social Skills Rating System and the Social Skills Improvement System.

SSRS, Social Skills Rating System; SSIS, Social Skills Improvement System. † p < 0.10; \*p < 0.05; \*\*p < 0.01; \*\*\*p < 0.001.

Finally, the OHAIRE has been used so far as a measure of behavior in studies of animal-assisted intervention including control groups where participants were not interacting with animals. Previously published results (11, 19) have shown its discriminative capacities, both between situations [e.g., children with ASD were found to smile more often in the presence of animals compared to toys, (11)], and between diagnostic groups [e.g., regardless of the situation, typically developing children smile more often than children with ASD; (39)]. Its use is apt to detect differences in the coded behaviors between situations with or without an animal. While it is not a diagnostic tool, the OHAIRE also shows sensitivity to behavioral differences between typically developing children and children with autism.

### CONCLUSION

The OHAIRE is a behavior coding tool that captures social interactions, emotional display, interfering behaviors, and interactions with animals and control objects. In the evaluated studies, the OHAIRE-v3 reached overall excellent levels of inter- and intra-rater reliability, showed limited correlations with caregiver-report questionnaires of social and interfering behaviors, and presented a reliable human-animal interaction subscale. Its current use is targeted to research teams aiming to examine and quantify children's behavior during animal-assisted intervention and continually monitor the psychometric properties of the coding tool. Its extension to new age ranges and diagnostic populations will evaluate its potential to have an even stronger impact in the field of HAI, as the first standardized behavior observation tool developed specifically for human-animal interaction research.

### AUTHOR CONTRIBUTIONS

NG led the coding of study 1 for typically developing children and of studies 3 and 4, is a co-author of the behavior coding tool, and led reliability and validity analyses. RG was the principal investigator for studies 2 and 4, and is a co-author of the behavior coding tool. MG was an investigator on studies 2 and 4, and is a co-author of the behavior coding tool. RG and MG led behavior coding for study 2. SS was the principal investigator for study 3. AT and KT provided extensive statistical guidance for the reliability and validity analyses. SM and VS provided guidance in the initial development of the coding system and for study 1. MO developed the initial version of the OHAIRE behavior coding tool for study 1 and is its lead author, was the principal investigator for study 1, led the behavior coding for children with ASD in study 1, and provided extensive guidance for the behavior coding of all studies and for the reliability and validity analyses.

### FUNDING

This study was supported by several grants awarded to the authors of this study (initials of the awardee in parentheses) from the following agencies:


### ACKNOWLEDGMENTS

The authors would like to thank all study participants and staff at the University of Queensland, the University of Colorado Denver, the Children's Hospital Colorado, and the University of California Irvine who have made the collection of the data included in this article possible. We would also like to thank the coders who carefully observed and coded behaviors in all the videos used in this article.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Guérin, Gabriels, Germone, Schuck, Traynor, Thomas, McKenzie, Slaughter and O'Haire. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Compatibility of Cats With Children in the Family

Lynette A. Hart <sup>1</sup> \*, Benjamin L. Hart <sup>2</sup> , Abigail P. Thigpen<sup>1</sup> , Neil H. Willits <sup>3</sup> , Leslie A. Lyons <sup>4</sup> and Stefanie Hundenski <sup>1</sup>

*<sup>1</sup> Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis, Davis, CA, United States, <sup>2</sup> Department of Anatomy, Physiology and Cell Biology, School of Veterinary Medicine, University of California, Davis, Davis, CA, United States, <sup>3</sup> Department of Statistics, University of California, Davis, Davis, CA, United States, <sup>4</sup> Department of Veterinary Medicine and Surgery, College of Veterinary Medicine, University of Missouri, Columbia, MO, United States*

Although studies involving pet dogs and cats, and human adults and children, have been reported, the specific interactions between cats and children have not. This study sought information from parents about the cat's role in families that have at least one child 3–12 years of age and at least one cat. Demographic data on cat source, breed, and gender/neuter status were sought, as well as information on adults and children in the families and on affectionate, aggressive, fearful, and playful responses of the cats to children. A convenience sample was recruited via listservs for pet owners and parents. Data were collected with a pilot-tested web survey, and descriptive statistics were based on 865 respondents. Multivariate statistical analyses were conducted on data from 665 respondents with complete responses for all items, including respondents' locations and whether cats were adopted as kittens. Multivariate analyses included consideration of demographic data, geographic region of respondents, behavioral characteristics of the cats, and responses of the children to the cats. From descriptive statistics, cats' affection was more typical with adults than with young children. Neuter status or gender was unrelated to cats' aggression or affection. Being the family's only cat was associated with heightened aggression and reduced affection. Younger cats were more likely to be affectionate. Multivariate analysis revealed three primary factors accounting for children's compatibility with the specified cat: positive interactions of the cat, aggression/fearfulness of cat, and the cat's playfulness and children's reaction to the cats. Positive child-cat relationships were more typical with two or more adults and multiple cats in the home. Old cats were the least satisfactory. A breeder or shelter was a better source than a feral origin, a newspaper ad, or another source. European respondents rated their cats' interactions with children more favorably than did respondents in the U.S./Canada. This difference may reflect the European adoptions more frequently being of kittens, often purebred, assuring more early handling within the family. A noteworthy finding was that all family participants, humans, and pets alike, affect the cat-child relationship, and these results reveal that many variables can play a role in achieving a desirable relationship for a cat and child.

Keywords: cat aggressive behavior, cat affectionate behavior, cat fearfulness, cats and children, cultural differences, human-animal interaction, anthrozoology

#### Edited by:

*Peggy D. McCardle, Consultant, New Haven, CT, United States*

#### Reviewed by:

*Malathi Raghavan, Purdue University, United States Nathaniel James Hall, Texas Tech University, United States*

> \*Correspondence: *Lynette A. Hart lahart@ucdavis.edu*

#### Specialty section:

*This article was submitted to Veterinary Humanities and Social Sciences, a section of the journal Frontiers in Veterinary Science*

Received: *02 July 2018* Accepted: *19 October 2018* Published: *19 November 2018*

#### Citation:

*Hart LA, Hart BL, Thigpen AP, Willits NH, Lyons LA and Hundenski S (2018) Compatibility of Cats With Children in the Family. Front. Vet. Sci. 5:278. doi: 10.3389/fvets.2018.00278*

### INTRODUCTION

Many studies have documented the contributions of pets to children's emotional and physical development (1). The value of cats as pets has been extensively studied over decades, focusing on their interactions with adults (2) and documenting contributions to human health (3). With regard to children's pets, studies often have examined the development of empathy among children who nurture pets. Yet, as revealed in reviews, most of these studies do not treat dogs and cats separately, but rather lump dogs and cats together as companion animals or pets (4, 5), despite evidence that dogs and cats clearly differ (6). Often, dogs are emphasized as a major focus, perhaps because they frequently emerge as the preferred pet, as shown in an early study (7), and in examples from the U.S. (8) and Holland (9). Thus, despite many studies exploring children's interests and engagement with pets, little specific attention has been directed to understanding details of cats' behavioral responses to children and children's relationships with cats.

In assessing children's interest in pets, Brucke (7) evaluated the essays of 7–16-year-old children about pets and noted that more children preferred dogs than cats and that interest in them increased as the children got older. In another study, children 3–5 years of age, in choosing between paired photographs, showed a preference for more infantile over adult cats; similar differences were not found when they were asked to choose between infantile and adult dog photographs (10).

Many studies by Levinson (11) expanded on the importance of pets for children in filling several roles, including as companions and confidants. Focusing on adolescents' loneliness and companion animals, Black (12) presented a conceptual model of the contributions of companion animal attachment that included constituents of caregiving, offering a secure base and safe haven, and proximity seeking and separation anxiety in the absence of the animal.

Studies reported that virtually all children were found to want a pet (13), and children lacking pets often desired one and sought out contact with their neighbors' pets (14). In interviews of Pennsylvania latchkey children (8) with a median of 8 years of age, dogs were the primary pet and the most frequently owned. However, for the children without pets, cats were wished for most, with dogs second.

In evaluating children's drawings of themselves and their other family members, a study found that children who owned pets placed the drawing of themselves significantly closer to their drawings of their pets than to other members of their family (15). The closeness at which they placed themselves to cats vs. dogs did not differ, suggesting that the children experienced the supportive characteristics of cats and dogs similarly and that closeness was based on animals' general characteristics. The same authors (16) also found that 3–7 year-old children's experiences at a petting zoo where they could easily see and touch the animals was associated with forming favorable attitudes toward animals more than when they visited large wildlife exhibits. When they studied toddlers responding to live and toy animals, even very young infants strongly preferred live pets to mechanical animals, and spent significantly more time observing and interacting with the live animals (17). Children 12–18 months old used the animal's species name, and by 24–30 months, the children called the animal by its given name. The 12–30 month-old children preferred dogs to cats, presumably because the dogs were interactive and more likely to approach the children, whereas the cats often walked away, thus limiting reciprocal interaction. As noted by these authors in their studies of parent-child relationships, this reciprocal interaction is a characteristic of attachment (18, 19), and the same may be true in child-pet relationships. Turn-taking of this sort forms a basis for communication, one in which conversational interchange becomes possible (20). Recent studies of human-cat relationships have emphasized that both the cat and the human affect and contribute to the relationship and bond involved (21, 22).

Triebenbacher (23) described the importance of transitional objects to which children become attached as they begin separating from their parents; these soothe and calm the children. While blankets and cuddly toys commonly serve as transitional objects, pets also fill this role. Preschool children articulated their specific emotions characteristic of special relationships, especially that these relationships were reciprocal; they understood that the best way to show love is through affection. Children in grades 2 through 5 most commonly said if their pets could talk they would say, "I love you." (23). As special friends and important family members, pets also provide affection, social interactions, and emotional support.

When considering the most important relationships in their lives, 42% of 9–12 year olds in one study chose a pet, more often or the same as grand parents, aunts, uncles, friends or teachers (24). Similarly, in an earlier study, 7 and 10 year-old children taking a neighborhood walk listed a pet among their special friends (25).

Interest in dogs and cats can change with age (26). Parents reported that kindergarteners were more involved with and more interested in dogs than cats, while the reverse was true of second-graders. Among fifth-graders, there was no difference reported in involvement or interest between dogs and cats. In kindergarten, second- and fifth-grades, questionnaire responses were collected from one parent of each child. Older children and those whose mothers were employed were more attached to their pets. Moreover, ideas about pets and their care appear to generalize beyond the specific type of pet owned. Children who owned dogs but not cats were reported to be just as knowledgeable about cats and their care as were cat-owners. And similarly, cat-owners showed as many ideas about dogs and their care as did dog owners. In a study by Daly and Morton (27), higher empathy was found for children owning both dogs and cats as compared with children owning neither a dog nor a cat, or only one type of pet. Children that were highly attached to their pets were more empathic than those who were less attached.

Various approaches have categorized types of behaviors of cats. In a study of domestic cat personality, adjectives included amiability, which was strongly positively correlated with owner satisfaction, attachment, and bond quality. Amiability included descriptors such as cooperative, warm, peaceful, charming, and faithful (28). Demandingness included: persistent, demanding, needy, persevering, and loud. Dominance included: proud, domineering, serious, independent, and territorial. Nervousness included: nervous, timid, apprehensive, and cautious. Neutered (spayed) females scored higher on demandingness than intact males. Amiability increased with cat age, decreased with owner age, and increased with the number of cats in the home. In a somewhat similar study, personality attributes of cats [(29), p. 157] were used to measure sociability of cats toward humans. In assessing cats for their behavioral tendencies prior to adoption to evaluate systems of seeking, fear, and rage, variables included sociability, boldness, gregariousness, frustration reactivity, and fearfulness.

Yet another approach assessing personality qualities of cats generated factors using a principal components analysis based on observer ratings and behavior codings (21). Four factors involved in social interactions included active, anxious, sociable, and rough. Subtle behavioral indicators of fear vs. engagement were identified in a recent study of cat behavior, showing that a left gaze and head turn reflects fear, whereas a right gaze and head turn reveals engagement (30).

Early experience with being handled during a sensitive period is known to increase friendly and affectionate behavior in cats (2, 31). Kittens that have been handled often more rapidly approach a familiar person, a stranger or a novel object as compared with unhandled kittens (32). When a UK shelter offered enhanced socialization to kittens between 2 and 9 weeks of age, owners interviewed when the cats were ∼1 year of age reported they felt more emotionally supported by their cats; fewer of these cats were fearful with humans, as compared with cats that had received standard handling (33).

In Australia, a study of 488 people with cats found that almost all cats were neutered; only 3% of owners reported having an intact male cat, and 2% reported owning an intact female cat (34).

In New Zealand, a study of children 8–12 years old found that cats were the most frequently owned pet; they were owned by 71% of families (35). A majority of children in the cat-owning families were said to be "owner" of the cat. This was particularly true in families with only one or two children. The child wanted the cat and it was often acquired to teach responsibility. In a UK study, semi-urban and rural householders more often reported cat ownership, as did female respondents (36). In a longitudinal study of UK parents with children up to 10 years of age, cats were the most commonly owned pet, and cats were most common in families with female children (37). In another UK study, with children of all ages, cats were found in the highest number of households. In a Norway study of children and adolescents aged 9 to 15, a majority of rural participants had a cat (38).

A study of UK adolescents who had only one pet reported that 57% of respondents had a dog and 23% a cat (39). For those with multiple pets, a dog and cat was the most frequent combination. Adolescents that lived in single-parent families or stepfamilies more often reported having pet cats, when they were compared with adolescents living with two parents. Also, cat ownership was more often reported by adolescents with siblings than by those who had no siblings. Adolescents reporting a median or higher level of family affluence less often reported having a cat than those reporting a low level of family affluence.

In a study of cats in Italy, the owner's gender influenced the cat's time spent with the owner: cats spent more time with women than men (40). The composition of the family influenced the cat's behavior toward the owner and the time spent with the owner. The more sociable cats lived in small families having no children. The cats that lived with other cats had a higher quality of life, based on care, behavior, and a physical examination, than those cats living alone. Cats generally are not thought to be highly social, but it seems that living with other cats may improve a cat's quality of life. In this study, cat owners who adopted their kittens between 7 and 10 weeks of age were more attached to their cats later on than the owners who had adopted older cats. This young age of adoption appears to be an important time to let the cat socialize with humans and with animals of other species in the household.

The number of human adults in the family seems to play a role in attachment to the cat, as suggested in a study in Switzerland, where families with fewer adults reported higher attachment to the cat than in families with more adults (41). The size of the household was negatively correlated with two attachment scales.

In Japan, families with dogs often considered their pets to be family members, but families with cats less often held this view (42). Compared with dog owners, cat owners scored their pets lower on emotion and intellect. Those cat owners who considered their cats to be family members were more likely to attribute compassion to their cats when compared with owners who regarded their pets as not being family members.

For this study, we hypothesized that a well-mannered cat that can be held by a child could be a valuable companion. While cats typically rest much of the day, at times, cats could be significant companions for children, being a source of calming comfort. Despite abundant evidence that pets matter to children, most reports on children and pets somewhat lump together dogs and cats rather than specifically examining the interactions of cats and children, or the behaviors or preferences that children may have for cats.

The study sought to characterize the interactions of cats with children compared with their interactions with adults in families responding to a general web survey.

Cat breeds differ in their behavioral tendencies, such as their affection and aggressive behaviors toward family members (43); one would expect the breed of cat to be one aspect affecting which cats would provide affection and comfort to a child. For example, a comprehensive telephone-based set of interviews with 80 feline veterinary practitioners covering 15 of the most common cat breeds, found that the Ragdoll is the most affectionate, socially outgoing and least aggressive breed (43). The same survey found that male cats were rated as more affectionate than female cats.

In this study we gathered data from the general public, by means of a web-based survey, to determine factors that would predict, or correlate with, the characteristic of cats being affectionate and non-aggressive with children. Overall, the study focused on cats' behaviors with children to characterize and determine behavioral correlates and attributes of positive relationships with cats. Of particular interest also was the extent to which children in the family valued the relationship with the specified cat.

### METHODS

### Web-Based Survey of Families With a Child, 3–12 Years of Age, and With a Cat

A web-based survey was designed in SurveyMonkey to gain information about cats' characteristics that qualify them as desirable companions for young children. The 25 item survey was directed toward families, most presumably with typically developing children, and launched to appropriate listservs, publicizing and disseminating the link to feline and parent groups and other cat-interested groups. It was required that participants have a child within 3–12 years of age and a cat that was at least 1 year of age. Families with multiple cats were instructed to answer questions pertaining to the cat most interactive with the child/children in the family, to characterize behaviors in the more interactive cat/child relationships, rather than an average cat/child relationship that would have included the more outdoor and fearful cats. The survey was designed to require about 15 min to complete. Numerous published reports of behavioral studies have used similar web-based surveys [e.g., (44)]; such surveys have been found comparable in validity to more traditional survey methods (45). This survey was open for responding October 2010 through January 2012.

Among the 1,000 respondents allowed to complete this survey prior to ending data collection, 865 met the inclusion criteria: having at least one child 3–12 years old; having in the household at least one cat that was at least 1 year of age; and completing all of the 25 questions of the survey. The socio-demographic information gathered included: numbers of adults in the family; ages of children; and information on the numbers of cats and dogs in the household, as well as the age, breed and source of the cat specified as interacting the most with the children in the family. Parents provided specific behavioral ratings for the cat and children's responses to the cat on a five-point scale. For example, the cat's affectionate interactions were categorized as: very affectionate; quite affectionate; moderately affectionate; relatively non-affectionate; and non-affectionate. The cat's aggressive interactions were categorized as: very aggressive; quite aggressive; moderately aggressive; relatively non-aggressive; and non-aggressive. Parents also provided ratings of the children's level of interest in the cat.

### Institutional Review Board Approval

The University of California, Davis, Institutional Review Board approved Protocols #201018447-1 and #284059-2.

## Statistical Analyses

### Descriptive Statistics

Descriptive statistics were used to analyze data from 865 participants, including medians and results of chi-square or Fisher exact tests for significance. Further multivariate analyses from 665 participants included geographic location of respondents based on IP addresses, and whether adopted as kittens when specified in responses. Inclusion criteria for the multivariate analyses required specific answers: some participants' responses were excluded due to responding with the item, "other."

### Multivariate Statistics

For the survey data, thirteen responses were identified as reflecting the quality of interactions between the family children and the focal cat. These included: affection, aggression, friendly behavior, playfulness, and fearfulness toward various age groups of children and adults, as well as the child's reaction to the cat. A principal component analysis (PCA) was run on these variables, and the first three principal components, which explained 62% of the variability in the responses, were used in additional analyses. In this analysis, only subjects that answered all 13 questions were used, reducing the sample size to 665 responses. The factor loadings for the first factor, named cat's positive interactions, were all positive except for two variables reflecting fearfulness, which were negative. The second factor contrasted two negative behaviors (aggression and fearfulness) against more positive behaviors, and the third paired the child's reaction to the cat and positive behaviors like playfulness against the cat's aggression, particularly toward children. The eigenvectors for the principal components, and their values for each of the survey responses, are available in Excel files. Each of the first three factors was used as a dependent variable in several one-way ANOVA models, looking for systematic differences with respect to a series of demographic variables, indicating the global region from which the survey response came, the composition of the family in which the cat lived, the source of the cat, the cat's current age, whether adopted as a kitten, and gender and neuter status of the cat. For this analysis, global regions were consolidated as: U.S./Canada, Europe, other. Cat breeds were consolidated as: mixed (including domestic shorthair and domestic longhair) and purebred. Factors that were statistically significant for one or more of the first three principal components were presented in a biplot of the factor loadings and group differences. Those factors were also used as dependent variables in a conditional inference tree analysis (a form of CART) that used a broader array of explanatory variables. All analyses were run using SAS, version 9.4, except for the inference trees, which were run using R statistical software and the ctree command.
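The PCA step described above can be illustrated with a minimal sketch in Python. This is not the SAS procedure used in the study: the ratings are synthetic, the function names (`correlation_matrix`, `top_components`, `synthetic_ratings`) are hypothetical, and the eigenpairs are obtained by simple power iteration with deflation, which is only one of several ways to extract leading components and converges slowly when eigenvalues are close together.

```python
import math
import random

def correlation_matrix(rows):
    """Pearson correlation matrix for a list of equal-length numeric rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]

def top_components(corr, k, iters=500, seed=0):
    """Leading k (eigenvalue, eigenvector) pairs of a symmetric matrix,
    found by power iteration with deflation."""
    rng = random.Random(seed)
    p = len(corr)
    work = [row[:] for row in corr]
    pairs = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for _ in range(iters):
            w = [sum(work[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for unit vector v.
        lam = sum(v[a] * sum(work[a][b] * v[b] for b in range(p))
                  for a in range(p))
        pairs.append((lam, v))
        # Deflate: remove the component just found before searching again.
        work = [[work[a][b] - lam * v[a] * v[b] for b in range(p)]
                for a in range(p)]
    return pairs

def synthetic_ratings(n=200, p=13, seed=1):
    """Synthetic five-point ratings driven by one shared latent trait
    (an assumption for illustration; not the survey data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        latent = rng.gauss(0.0, 1.0)
        rows.append([min(5, max(1, round(3 + latent + rng.gauss(0.0, 1.0))))
                     for _ in range(p)])
    return rows

corr = correlation_matrix(synthetic_ratings())
components = top_components(corr, 3)
# For a correlation matrix the total variance equals the number of variables.
share = sum(lam for lam, _ in components) / len(corr)
print(f"first three components explain {share:.0%} of the variance")
```

Because the components are taken from the correlation matrix, each item contributes one unit of variance, so the share explained is the sum of the retained eigenvalues divided by the number of items.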

### Data Availability

In compliance with journal policy, datasets used for statistical analyses, including the PCA factor loadings, are publicly available at figshare.com: http://dx.doi.org/10.6084/m9.figshare.7007993.

### RESULTS

### Descriptive Statistics: Family and Pet Demographics

The survey with 865 families meeting inclusion criteria revealed that most of the children resided in households with at least two adults available; 12 percent of families included only one adult.

Regarding the age-ranges of children, 28 percent of households had teenagers 13–19 years of age, 36 percent had children 9–12 years, 31 percent had children 6–8 years, and 40 percent had children 3–5 years. Multi-pet households were a majority of respondents, with 63 percent having multiple cats over 1 year of age. Forty-seven percent had at least one dog, of which 48 percent had multiple dogs.

The survey focused on the cat that interacted with the children the most. Almost one-third of these cats, 31 percent, were 3 years of age or less; 59 percent of cats were 6 years old or less. Male neutered cats comprised 49 percent and female spayed cats 46 percent of the specified cats, with 2 percent being intact males and 4 percent being intact females. With regard to breed, 57 percent were domestic shorthair, 11 percent domestic longhair, and 6 percent Maine Coon.

### Cat's Affectionate, Fearful, or Aggressive Behavior as Related to the Cat's Age and Gender and Children's Ages

While 32 percent of the designated cats were very affectionate toward adults and 27 percent toward 6–12 year olds, just 14 percent were very affectionate toward 3–5 year olds (statistical tests: 3 groups, or 2 pairs, all ps < 0.0001). With regard to the age of the cat, the 78 cats that were very affectionate toward 3–5 year olds were younger, with 58 of 78 (74%) being 6 years of age or less, as compared with 255 of 469 (54%), fewer, of the remaining less affectionate cats being 6 years of age or less (p < 0.001). The cats that were very affectionate consisted of 51 percent neutered males and 42 percent spayed females, with no intact males and 6 percent intact females.

Among cats rated as very affectionate to adults and/or children (n = 360), 239 (66.4%) were 6 years of age or less and 121 (33.6%) were older than 6 years. Among the remaining 505 cats that were not rated as very affectionate, 270 (53.5%) were 6 years of age or less and 234 (46.3%) were older than 6 years. Very affectionate cats were significantly younger than the other cats (p < 0.001).

Among 360 cats rated as very affectionate with children and/or adults, 278 (77.2%) were described as very affectionate to adults, 171 (47.5%) as very affectionate to children 6–12 years of age, and only 78 (21.7%) to children 3–5 years of age (adults vs. children, children vs. children: all ps < 0.0001). Very affectionate cats were more likely to express affection toward adults than toward children; they were least likely to express affection toward the youngest group of children.

Ratings of cats being very fearful were infrequent: among the 78 cats very affectionate with 3–5 year old children, only 2 (3%) were fearful of visiting children. Among the cats rated as anything less than very affectionate with 3–5 year olds, 54 of 469 (11.5%) were described as "very fearful, runs away and stays hidden" with visiting children (p < 0.0001). Further, among the 87 cats described as "definitely does not like being held or carried around" by children ages 3–6, 25 (28.7%) were also described as very fearful.

While 56 of 78 (71.8%) very affectionate cats came from multi-cat households, only 284 of 469 (60.6%) less affectionate cats came from multi-cat households. Interestingly, cats that were very affectionate toward 3–5 year olds were not always affectionate toward adults, as illustrated by our finding that among 78 cats very affectionate toward 3–5 year olds, only 50 cats (64%) were also very affectionate toward adults.

The median age range of cats rated as at least moderately aggressive to adults, children, or other cats (n = 63) was 7–10 years. Of these aggressive cats, 27 (42.9%) were 6 years of age or less, and 35 cats (55.6%) were older. One respondent was too unsure of the cat's age to indicate a range. Among the remaining 802 cats not rated as aggressive, 482 (60.1%) were 6 years of age or less, and 320 (39.9%) were older. Comparative analyses reveal that these aggressive cats were significantly more likely to be older (p < 0.01).

Among these 63 aggressive cats that were at least moderately aggressive, 29 (46%) were spayed females, 2 (3.2%) were intact females, 31 (49.2%) were neutered males, and 1 (1.6%) was an intact male. Among the remaining 802 cats not rated as aggressive, 366 (45.6%) were spayed females, 33 (4.1%) were intact females, 389 (48.5%) were neutered males, and 14 (1.7%) were intact males.

Among 24 cats scored as at least quite aggressive to children in the home, 15 (62.5%) were the only cat in the home and 14 (58.3%) were in homes without dogs; among the other 775 cats, 280 (36.1%) were the only cat and 407 (52.5%) were in homes without dogs. These aggressive cats were significantly more likely to be the only cat in the home (p = 0.01). These data suggest that these 24 quite aggressive cats tended to be isolated from other cats, but not dogs.

### Cat's Affection to Adults and Children

At least moderate affection was shown by 706 of 865 (81.6%) cats in this study to adults, as shown in **Table 1**. A somewhat lesser percentage of cats, but still a majority, was similarly affectionate to children: 429 of 626 (68.5%) for 6–8 year olds and 297 of 547 (54.3%) for 3–5 year olds.

Considering these cats in families of the general public, neuter status or gender was unrelated to the cats' aggression or affection. Being the family's only cat was associated with heightened aggression and reduced affection. Younger cats were more likely to be affectionate.

#### Cat's Behavior Affecting the Child-Cat Relationship

In an open-ended item in the survey, parents had an opportunity to remark on the child's interaction with the designated cat that interacted the most with the child or children in the family. In these responses a vast majority of the children "liked to hold or sit with the designated cat about half the time," or "usually loved to hold or pet the cat," or were "crazy about holding, petting, snuggling and sleeping with the cat." Among 792 parents, 638 (81%) rated their children as being at least moderately responsive to the cat half the time, indicating that most children sought and valued the relationship with the cat.

TABLE 1 | Percentages of cats rated as moderately affectionate to children and adults: web survey of general public.


Among children living with the 63 cats rated as at least moderately aggressive, parents involved with 16 (25.4%) of these cats rated their children as crazy about the cat, and 6 (9.5%) parents described their children as feeling indifferent to the cat. One parent whose child was crazy for the cat wrote that, "the cat is not having it." In comments, parents sometimes described a conflicted situation where the child: "would love to hold and cuddle the cat," but "the cat views any interaction from the child as a potential threat," "will leave if possible," or "the cat hates it."

As a contrast with the aggressive cats, of the 360 cats rated as very affectionate with children and/or adults, parents involved with 173 (48%) of these cats rated their children as crazy about the cat, and only 10 (3%) parents scored their children as indifferent. There was a highly significant level of compatibility of children who had affectionate rather than aggressive cats (p < 0.001). In the 6–12 year age group, parents involved with 190 cats in 488 (39%) households judged their children as crazy about the cat; this was a trend toward a higher percentage of children being crazy about the cat than reported by the parents involved with the 113 cats for younger children, 3–5 years of age, from 347 (33%) households.

Some cultural differences were evident, for 343 out of 776 (44.2%) cats in U.S./Canada were adopted from a shelter, compared to 11 out of 63 (17.5%) in Europe (p < 0.0001). Conversely, 196 out of 776 (25.3%) cats in the U.S./Canada were purebred, compared to 41 out of 63 (65.1%) in Europe (p < 0.0001). Additionally, only 172 out of 776 (22.2%) cats in the U.S./Canada were adopted as kittens, compared to 42 out of 63 (66.7%) in Europe (p < 0.0001).

To summarize, cats in the U.S./Canada were more likely than in Europe to come from shelters, and less likely to be purebred and adopted as kittens.
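The regional comparisons above are 2 × 2 contingency tables, and a short Python sketch shows how such counts can be tested. The text does not state which test (chi-square or Fisher exact) was applied to each comparison, so the Pearson chi-square without continuity correction used here is an assumption, and `chi_square_2x2` is a hypothetical helper name. The counts are taken directly from the paragraph above.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], with its p-value on 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# (label, count in U.S./Canada out of 776, count in Europe out of 63)
comparisons = [
    ("adopted from a shelter", 343, 11),
    ("purebred", 196, 41),
    ("adopted as a kitten", 172, 42),
]
for label, us_canada, europe in comparisons:
    chi2, p = chi_square_2x2(us_canada, 776 - us_canada, europe, 63 - europe)
    print(f"{label}: chi-square = {chi2:.2f}, p = {p:.2g}")
```

Under these assumptions all three comparisons fall below the p < 0.0001 threshold reported in the text.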

### Multivariate Statistics

FIGURE 1 | Biplot of Factor 1 cat's positive interactions and factor 2 cat's fearfulness/aggression (Factor 1 increases to the right; Factor 2 increases going up. Lower right quadrant is the optimal relationship; e.g., intact cats were better for positive interactions than neutered females; cats from a breeder provided more positive interactions than those from an ad; ferals scored low on fearfulness/aggression; cats in Europe scored better on these two factors than those scored in the U.S./Canada). Points plotted in blue represent the variables that were used in the PCA. Points plotted in green, purple or black place subgroups of responses on the graph, corresponding to neuter/gender status, cat source, and miscellaneous categorical predictors, respectively. Points that plot in the same general direction relative to the origin are positively associated, while ones that plot on opposite directions are negatively associated. The strength of an association is related to the distance from the origin, so the points closest to the origin exhibit negligible associations.

The Principal Components Analysis (PCA) revealed three primary factors; PCA eigenvalues of the correlation matrix were: Prin 1, 4.72; Prin 2, 1.99; Prin 3, 1.29. Positive interactions of cats with children play a major role in the first behavioral factor, cat's positive interactions, with loadings (from high to low) for: friendly to visiting kids, affection of the cat to kids 6–8 years old, to kids 3–5 years old, friendly to adults, affectionate to adults, affectionate to children; and with negative loadings for fearful with kids 3–5 years old and fearful with adults. These Prin 1 vectors then were associated in one-way ANOVA models with non-behavioral variables, such as the geographic region of the respondent, number and ages of children, the age and source of the cat, the age at adoption, the cat's breed and gender/neuter status. The second factor, cat's fearfulness/aggression, is dominated by fearfulness with adults, aggression with kids 6–8 years old, aggression with adults, and fearfulness with kids 3–5 years old; this was associated in the ANOVA model with specific non-behavioral variables such as the ages of children, the sex status of the cat, and whether the cat was adopted as a kitten. The third factor, cat's playfulness and child's positive reaction, pertains to: the cat being playful with children 3–5 years old, being playful with adults, the children's positive reactions to the cat, and the cat's playfulness, with the cat's aggression with young children and adults having a substantial negative loading; this factor also was associated in the ANOVA model with the number and ages of children, and the cat's age.
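As a quick arithmetic check, the reported eigenvalues can be related to the share of variance explained. The sketch below assumes only what the text states: the PCA was run on the correlation matrix of 13 survey responses, so the total variance equals 13. The variable names are illustrative.

```python
# Eigenvalues of the correlation matrix as reported in the text.
eigenvalues = {"Prin 1": 4.72, "Prin 2": 1.99, "Prin 3": 1.29}
n_variables = 13  # each standardized variable contributes one unit of variance

shares = {name: value / n_variables for name, value in eigenvalues.items()}
cumulative_share = sum(eigenvalues.values()) / n_variables

for name, share in shares.items():
    print(f"{name}: {share:.1%} of total variance")
print(f"First three components together: {cumulative_share:.1%}")
```

The cumulative share rounds to the 62% of variability stated in the Methods.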

To illustrate these results more specifically, three biplot figures are presented; each is a biplot of two of the principal components, revealing group differences. The biplots highlight and focus on variables found to have significant differences and exclude those with only marginal differences. In plots the cat's age range is indicated by: age1, 1 up to 3 years old; age2, 3 up to 6 years old; age3, 6 up to 10 years old; age4, over 10 years. **Figure 1** plots PCA Factors 1, cat's positive interactions, and 2, cat's fearfulness/aggression; the lower right quadrant is most favorable and the upper left is least favorable for a positive cat-child relationship. A cat living in Europe, or being intact or a neutered male, was associated with the cat's positive interactions and low fear/aggression. A cat being young, an intact female, or from a breeder was associated with the cat's positive interactions. Being adopted as a kitten was somewhat associated with lower fear/aggression. Conversely, a cat being a neutered female or acquired from a newspaper ad was associated with a cat's negative interactions and high fear/aggression. A cat being older, feral, or an intact male, is off to the left of the biplot and associated with a cat's negative interactions to children, but low fear/aggression.

**Figure 2** plots PCA Factors 2, cat's fearfulness/aggression, and 3, cat's playfulness and child's positive reaction; the upper left quadrant is most favorable and the lower right is least favorable for a positive cat-child relationship. A cat living in Europe or being an intact male scored low on fearfulness/aggression and somewhat positive for playfulness and the child's positive reaction. Cat's playfulness and positive reactions from children, but also with heightened fear/aggression, were associated with a cat's young age and being acquired through an ad. A cat being old was associated with somewhat low fear/aggression, as well as low playfulness and negative reactions from children. Kittens, neutered males, and feral cats showed increasingly low levels of fearfulness/aggression.

**Figure 3** plots PCA Factors 1, cat's positive interactions, and 3, cat's playfulness and child's positive reaction; the upper right quadrant is most favorable and the lower left is least favorable for a positive cat-child relationship. A cat being young was associated with positive interactions from the cat and reactions of the child, whereas the cat being old was associated with negative cat and child reactions. The cat living in Europe, being an intact female, or acquired from a breeder was associated with the cat's positive interactions. The cat being acquired from a paper ad was strongly associated with negative cat interactions but positive child reactions. The cat being an intact male, a neutered female, or feral was associated with the cat's negative interactions.

Additionally, three figures depicting conditional inference trees are presented, representing each of the three principal components, Prin 1, 2, and 3. This analysis searches recursively for predictors and threshold values (or dichotomous splits for categorical predictors) that result in a significant response difference, depending on whether the observation in question is above or below the threshold value. **Figure 4** depicts Prin 1, cat's positive interactions, separating cats with high values of Prin 1 from those with lower values; a high score is favorable. Solitary cats as a group had lower scores and younger cats had higher scores. The highest score at Node 23 reflected female cats in Europe living in families with at least 3 cats. Another high-scoring group, Node 18, comprised male cats that lived with one other cat and with a child 9–12 years old, in households with no children 6–8 years old. As examples of cats with low scores, Node 13 represents solitary cats that are at least 6 years of age, and Node 9 includes cats living with no more than one other cat, that are no more than 6 years of age, acquired as feral or from a newspaper ad, and lack any children 6–12 years old.

**Figure 5** depicts Prin 2, fearfulness/aggression, separating cats with high values of Prin 2 from those with lower values; a high score is unfavorable. The highest score at Node 7 included female cats living with 2 or fewer adults and a child 9–12 years old, adopted as a kitten, and living with one dog or less. The lowest score at Node 10 was for female cats living with 3 or more adults.

**Figure 6** depicts Prin 3, cat's playfulness and child's positive reaction; a high score is favorable. The highest value at Node 6 represented a cat no more than middle-aged, living with at least 4 adults. The lowest score at Node 14 includes very old cats acquired from unusual sources with no children aged 9–12 at home, only younger children.
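The recursive search behind such trees can be sketched for a single numeric predictor as follows. This is a simplified illustration and not the procedure or software used for Figures 4–6: it picks the threshold that maximizes a standardized difference in means, tests that split with a Monte Carlo permutation test, and recurses only if the split is significant. It omits the multiplicity adjustments and the predictor-selection step of a full conditional inference tree. All data, function names, and parameter values (`alpha`, `min_size`, `n_perm`) are assumptions made for the example.

```python
import math
import random
from statistics import mean

def split_statistic(left, right):
    """Absolute mean difference, scaled by the group sizes."""
    n_l, n_r = len(left), len(right)
    return abs(mean(left) - mean(right)) * math.sqrt(n_l * n_r / (n_l + n_r))

def permutation_p_value(left, right, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(left) - mean(right))
    pooled = list(left) + list(right)
    k = len(left)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:k]) - mean(pooled[k:])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def best_split(xs, ys, alpha=0.05, min_size=3):
    """Return (threshold, p) for the strongest split, or None if not significant."""
    candidates = []
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if len(left) >= min_size and len(right) >= min_size:
            candidates.append((split_statistic(left, right), t, left, right))
    if not candidates:
        return None
    _, t, left, right = max(candidates, key=lambda c: c[0])
    p = permutation_p_value(left, right)
    return (t, p) if p < alpha else None

def grow(xs, ys, alpha=0.05, min_size=3):
    """Recursively split until no significant split remains."""
    split = best_split(xs, ys, alpha, min_size)
    if split is None:
        return {"n": len(ys), "mean": mean(ys)}
    t, p = split
    left = [(x, y) for x, y in zip(xs, ys) if x <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x > t]
    return {"threshold": t, "p": p,
            "left": grow(*zip(*left), alpha, min_size),
            "right": grow(*zip(*right), alpha, min_size)}

# Hypothetical data: cat age in years and a component score.
ages = list(range(1, 13))
scores = [2.0, 2.1, 1.9, 2.2, 2.0, 1.8, -1.0, -1.2, -0.9, -1.1, -1.0, -0.8]
tree = grow(ages, scores)
print(tree["threshold"])  # age at which the toy sample is split
```

Each terminal dictionary plays the role of a node in the figures: it reports how many observations reached it and their mean score.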

### LIMITATIONS OF THE RESEARCH

The survey pertained to cat ownership, so participants necessarily were aware that the study was about cats. This knowledge may have preferentially attracted people who enjoy their cats and have good relationships with them. People whose cats have been aggressive may have relinquished those cats. The study was not a randomized survey representing all families with cats. In fact, in multi-cat households, respondents were asked to answer the survey with regard to the cat most interactive with the children.

## DISCUSSION

This study focused on cats' affectionate and aggressive interactions with children. The web survey of the general public revealed that relationships of children with cats tend to be less consistently affectionate and are possibly more problematic than relationships cats have with adults. Many of the limitations in the relationships are from the cats' unwillingness to be affectionate.

Cats were generally more likely to be affectionate toward adults in the family than to children ages 3–5 years old. In considering the variables that could predict likelihood for a cat being very affectionate to children, the age of the cat, and fearfulness vs. friendliness toward visiting adults and/or children emerged as prominent factors that may result in conflicted relationships. Aggressiveness is obviously incompatible with being affectionate toward 3–5 year olds. Similar to our previous published study of cats with children who have autism (46), very few cats were reported as aggressive. The parents were asked to choose the cat that interacted with children the most, and answer questions regarding that cat. A strong majority of families had multiple cats, meaning that most children had the option to select a cat for interactions.

Many children would like their cats to be affectionate with them, but the cats may have less interest in a relationship than the children and may be unwilling to be held by a child. Early social habituation of kittens to children could predispose cats to be affectionate with young children. This would take advantage of the sensitive period in the early weeks of cats' lives when friendly, affectionate behavior can be elicited from cats (31–33).

Concerning the interesting result that cats in Europe were described more positively than those in other parts of the world, this may reflect different perceptions or expectations among respondents of what a cat's behavior is or should be, or it could involve unidentified differences based on living situations. With no direct observations of behaviors, it is not possible to know whether these differences reflect contrasts in the cats' behaviors.

Obtaining cats by a newspaper ad stood out as a risk factor in the bi-plots. When a cat is obtained from a shelter or a breeder, the cat is receiving good care and efforts are made to locate a good home for the cat. Perhaps with a newspaper ad, there may be greater urgency to place the cat with less emphasis on the cat's care and welfare throughout the process.

Having at least two adults and multiple cats in the home was associated with more positive child-cat relationships, but as noted, the reason for this relationship is unclear. Old cats were the least satisfactory in child-cat relationships. Obtaining a kitten from a breeder or shelter seemed better than obtaining a cat that was feral, or from a newspaper ad, or other source. As previously highlighted (21, 22), the cat-human relationship is affected by both participants, and these results reveal that many variables can play a role in achieving a desirable relationship for a cat and child.

Cats living in Europe were rated as more interactive and less fearful than those living in the U.S./Canada. The higher rates of adoption of purebred cats and of kittens in Europe, rather than adopting older cats and acquiring cats from shelters as in the U.S./Canada, may help explain this difference.

In this survey of responding parents from the general public, presumably with mostly typically developing children and family cats, the cats varied in their levels of affection expressed to adults and children, with affection to adults being more common than to children, especially young children 3–5 years of age. Based on the responses from parents, children sought affectionate relationships with their cats and frequently enjoyed spending time with them. However, the desired level of a child's compatibility with the cat was often not fulfilled, in that some cats that are friendly and affectionate and provide a rewarding relationship to adults may offer much less to children in the family. This finding underscores the perspective that the cat's behavior is often the limiting factor in the interaction between a pet cat and a child, more than the child's level of interest. Also, these results suggest that with very young children 3–5 years of age, compatible relationships are more likely with younger cats. Risk factors for conflicted relationships include: the cat's age; fearful and aggressive behavior of the cat; and the cat's social context with other companion animals. Although isolated cats scored higher on aggression than those living with other animals, it is unclear whether the aggression led to a particular cat living with no other animals vs. the cat becoming aggressive due to isolation. Suggestions supported by the data in this study for enhancing compatible, affectionate relationships between children and cats are: (1) to assume that cats in the age range of 1 up to 6 years are more likely to be affectionate to very young children than older cats; and (2) not to assume that a cat that is fearless and affectionate toward adults will also be affectionate to young children. 
While there are behavioral differences among breeds of cats that would undoubtedly be important in predicting that a pet cat would be likely to be affectionate and non-aggressive with children (43), there were too few purebred cats in this study to address this issue.

### ETHICS STATEMENT

The study was carried out in accordance with the recommendations of the University of California, Davis, Institutional Review Board as Protocols #201018447-1 and #284059-2. Participants responded anonymously to an online web survey and were informed on the survey of its voluntary nature, and that participation indicated their informed consent.

### AUTHOR CONTRIBUTIONS

LH, BH, and AT conceived and designed all phases of study, collected and analyzed data, and edited manuscript drafts. LH drafted and compiled manuscript. LL conceived and designed study. SH refined dataset entries for multi-variate statistical analyses by categorizing variables, especially for respondents' geographic status, cat acquisition as kittens, and numerical categories for statistical analyses. NW conceived and conducted extended statistical analyses, and drafted text for methods pertaining to figures that resulted, with specific edits. LH, BH, AT, NW, LL, and SH reviewed and edited interim and final draft manuscripts.

### ACKNOWLEDGMENTS

Generous grant support was provided by the Eunice Kennedy Shriver National Institute of Child Health and Human Development-Mars Public Private Partnership: NICHD 1R03HD066594-01. The UC Davis Center for Companion Animal Health provided partial support: #2016-56-FM.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Hart, Hart, Thigpen, Willits, Lyons and Hundenski. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Teaching Children and Parents to Understand Dog Signaling

Kerstin Meints <sup>1</sup> \*, Victoria Brelsford<sup>1</sup> and Tiny De Keuster <sup>2</sup>

*<sup>1</sup> School of Psychology, University of Lincoln, Lincoln, United Kingdom, <sup>2</sup> Department Nutrition, Genetics and Ethology, Faculty of Veterinary Medicine, Ghent University, Ghent, Belgium*

Safe human-dog relationships require understanding of dogs' signaling. As children are at particularly high risk of dog bites, we investigated longitudinally how children from 3 to 5 years and parents perceive and interpret dogs' distress signaling gestures. All participants were then taught how to link their perception of the dog with the correct interpretation of dogs' behavioral signals and tested again. Results show a significant increase in learning for children and adults, with them showing greater understanding of dogs' signaling after intervention. Better learning effects were found with increasing age and depended on the type of distress signaling of the dogs. Effects endured over time and it can be concluded that children and adults can be taught to interpret dogs' distress signaling more correctly. Awareness and recognition of dogs' stress signaling can be seen as an important first step in understanding the dog's perspective and are vital to enable safe interactions.

#### Edited by:

*Peggy D. McCardle, Consultant, New Haven, CT, United States*

#### Reviewed by:

*Aubrey Howard Fine, California Polytechnic State University, United States*

*Esther Schalke, Lupologic GmbH, Germany*

> \*Correspondence: *Kerstin Meints kmeints@lincoln.ac.uk*

#### Specialty section:

*This article was submitted to Veterinary Humanities and Social Sciences, a section of the journal Frontiers in Veterinary Science*

Received: *08 June 2018* Accepted: *28 September 2018* Published: *20 November 2018*

#### Citation:

*Meints K, Brelsford V and De Keuster T (2018) Teaching Children and Parents to Understand Dog Signaling. Front. Vet. Sci. 5:257. doi: 10.3389/fvets.2018.00257*

Keywords: children, adults, dog body language, dog bite prevention, safety intervention

### INTRODUCTION

Benefits of dog ownership include positive effects on human health and well-being and on child development and learning [see (1) for overview; for recent systematic reviews, see (2, 3)]. Dogs function as social facilitators (4), assist in therapy, are used as co-visitors in retirement and care homes, in nurseries and in hospitals (1). Pets are seen as friends, companions and social partners (5–8) and, increasingly, as family members (5, 6). Dogs are among children's favorite pets and children show most attraction to dogs, be it puppies or grown-up dogs, compared to other pets (9, 10).

In the UK around 30% of households own a dog, with regional fluctuations in numbers (21–38%) (11–13), while in the US and in Australia up to about 40% of households own a dog (5, 14). The dog is also the pet of choice in many pet-owning households in Europe and Canada, with even higher figures in Mexico, Argentina and Brazil (15).

However, despite the benefits of dog ownership, there are also risks involved. Hospital data revealed that each year, about 1.5% of the general population suffers a dog bite that requires medical attention (16, 17) and the prevalence of dog bites in children is twice that of other age groups (18–20).

In the UK, a clear increase in the number of people attending a minor injury unit or accident and emergency department for treatment of dog bites and strikes has been observed. Over the ten-year period March 2005 to February 2015 the number of admissions due to dog bites increased 76% from 4,110 per year to 7,227. The latter figure is a 6.5% increase from the 6,783 finished admission episodes recorded in the previous 12 months (21). With the highest rate of dog bite injuries occurring in children (22–24), Schalamon et al. (25) demonstrated that most injuries occur in those under 15 years of age, with rates peaking between the ages of 5–9 years. Recent figures from the National Health Service on dog bites and strikes (21, 24, 26) demonstrate that more serious dog bite injuries requiring admission to hospital are on the increase, with 17% being related to children under the age of 10 years. Furthermore, dog bite rates in most-deprived compared to least-deprived areas are three times as high (21, 24).
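The two percentage increases quoted above follow directly from the admission counts given in the text. A minimal arithmetic check, using only those three figures (the function name is illustrative):

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100

# Admission counts as quoted in the text.
ten_year = percent_change(4110, 7227)   # start of the period vs. final year
one_year = percent_change(6783, 7227)   # previous 12 months vs. final year

print(round(ten_year), round(one_year, 1))
```

Rounded to the precision used in the text, the results agree with the reported 76% and 6.5%.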

However, the above estimate is low, as these figures for adults and children do not include unreported cases where treatment was not required or where injuries were not presented to the medical profession (27, 28). Strikingly, when interviewed directly, about 47% of school children reported they had been bitten (28, 29). In a recent survey in the UK, Westgarth et al. (30) found that a quarter of their local sample of 694 adult respondents had suffered a dog bite.

High dog bite figures are not unique to the UK: the problem of dog bite injuries is a world-wide problem (31) with research from Australia (20), the Netherlands (23), Alaska (32), Belgium (33), Switzerland (18), Canada (34), and Spain (35) highlighting the extent of the issue. A recent study carried out by Quirk (27) estimated that 1,615,426 persons were treated in US emergency departments for non-fatal dog bite-related injuries between 2005 and 2009.

Costs caused by dog bite incidents are estimated at around \$53.9 million for hospital stays only in the US (36), with homeowners' insurance claim payments reaching \$530 million in 2014 (37). Likewise, costs in Australia were estimated around \$7 million (38) and in the UK at around £10 million (39). Medical and veterinary professionals have repeatedly demanded effective prevention [e.g., (40)] and a collaborative (41) and evidence-based strategy (42, 43).

The majority of bite accidents (about 75%) occur in the home environment and involve children bitten by a familiar dog [e.g., (25, 44–47); see also (48) for similar data on adults]. Child-initiated interactions, such as approaching the dog while eating or surprising it while sleeping, seem to trigger up to 86% of accidents at home (44). Recent questionnaire studies also showed that injuries occurred during feeding treats or play (49).

Younger children are more often injured in the face, neck and upper torso (25, 46, 50). It has also been reported that 43% of patients on a maxillofacial ward for treatment after a dog bite were children under the age of 10 (40). Such injuries can lead to life-threatening medical conditions or psychological sequelae like Post-Traumatic Stress Disorder (51, 52). Whilst physical injuries are apparent, the psychological impact is less obvious, and left untreated can have long term consequences, not only for the victim but also their family (52). Seventy percent of all fatal dog bites involve children (53, 54).

Given these high figures, and given that most of the time the child's interaction with a dog triggers the biting incident, there is a clear need to increase parent awareness about home contexts and child actions that may trigger a dog bite (55, 56). There is also a need to improve the child's ability to assess how a dog responds to their action and for them to learn when it is not safe to interact with a dog. For appropriate supervision of children and dogs, it is also important for parents to be aware of the dog's signaling as a reaction to their or their children's interactions with the dog.

Surprisingly, children as well as adults often do not notice dogs' stress signaling or misinterpret dogs' attempts to signal (57–59). When shown images of dogs' facial displays, children often do not understand dogs' facial expressions and can mistake a very angry dog for a friendly and approachable one (60). Without tuition, children do not discriminate dogs' body signals and tend to look mainly at the face instead (61). In adults, dog signaling interpretations vary with experience; however, dog ownership does not predict correct understanding of dogs' behavior [e.g., (62, 63)].

Overall, research has demonstrated that there is little knowledge regarding dog behavior and safety practices for child-dog interactions [see also (64, 65)]. When trying to enable safe human-animal interaction, it is vital to be able to recognize and interpret the animal's distress signaling correctly in order to avoid injury to the person and distress to the animal. Arhant et al. (49) also emphasize the need for a dog bite prevention approach directed at caregivers.

Dog bite prevention programmes exist: some address how to behave in public with unfamiliar dogs [e.g., (18, 66, 67), see (68) for a systematic review], while others teach children and their families to be aware of potential risk situations with a family dog, and how to avoid or de-escalate risk situations [e.g., Blue Dog bite prevention program assessment; see (56, 69)]. However, there is no assessed program so far that teaches children or adults more basic skills, namely how to recognize and interpret specific dog body language. More precisely, currently no intervention has been tested to teach children and adults about dogs' behavioral responses and their stress signals as a response to the child or adult in the context of a dog-directed action.

Humans often perceive petting a dog or hugging a dog as friendly gestures. Especially young children like to hug dogs as a sign of their friendship, not realizing that their (benign) actions might intimidate a dog and induce fear or distress. If a dog freezes and does not move, this may lead parents and teachers to think the dog feels happy with this well-intended attention. Thus, when targeting dog bite prevention in families with children and their pet dog, it is crucial to realize that safe cohabitation is based on mutual understanding of species-specific signaling, social gestures and interactions (70). Research indicates that most of the dog bite accidents with family dogs result from such seemingly benign (from the human perspective) interactions, hence the importance to stimulate awareness in children and parents about how their dog behaves, and which signals the dog presents when being hugged, petted or approached in different situations (55). Recent research has shown that most of children's interactions with dogs fall into this category, and mostly increase in frequency with age (49).

Dogs who feel stressed are likely to present stress- and threat-avoiding signaling (e.g., nose-licking, turning away). When these signs are ignored or misinterpreted, the pet may use other strategies, including aggression [(71–73); see also Mariti et al. (74) for a first systematic empirical investigation of such behaviors in dogs]. Recent studies have shown further evidence that dogs show signals like licking of lips and looking away as appeasement signals in dog-human communication [(75); see also (76)].

Shepherd's "ladder" of distress signals (72, 73) includes conflict-defusing signals on its lower steps (appeasement behavior, calming signals, displacement behavior, e.g., nose-licking, eye-blinking); these are signals to defuse conflict and restore harmony in a social interaction. In the next grouping on the ladder, conflict-avoiding signals are included (e.g., walking away, standing crouched, tail tucked under, creeping). In case a perceived social threat continues, and/or conflict-defusing and conflict-avoiding strategies have failed, dogs may present strategies higher on the ladder such as conflict escalation signals (e.g., staring, growling, biting). For an overview, see **Figure 1**.

It is important to stress that Shepherd's ladder is not to be understood in a strictly hierarchical way as dogs do not necessarily move through these signals in a linear fashion. Depending on how the interaction evolves (e.g., if the approaching human understands the message correctly and stops all interaction with the dog, the dog may be able to relax and return to a state of comfort) and depending on what the dog has learnt (e.g., unpleasant outcome of interactions in the past despite conflict-avoiding signaling), their strategy may change over time, and dogs may move on to a snap or bite action to stop a perceived threat.

It is also vital to be aware that dogs' strategies depend on factors relating to the context (social and environmental triggers) and on factors relating to the dog, e.g., personal history (past experiences) and physical and behavioral health. It is important to stress that factors that are known to reduce a dog's wellbeing will reduce a dog's threshold for stress and arousal and increase the odds of using escalation strategies in a stressful encounter. Well-known examples are sensory deficits, physical illness, chronic pain or dogs suffering from anxiety (45, 47, 77). In addition, other signals may be shown [e.g., (57, 71, 74)].

There is a striking lack of knowledge of dog signaling in the population, and there is also a general lack of knowledge regarding dog behavior and safety practices for child-dog interactions, with owners of dogs often unaware of the factors likely to increase the risk of dog bites to children (64). For example, subtle signals are often not known by dog owners to be stress signals (58). This is a serious knowledge gap, as the safety of young children mainly relies on the perceptual understanding, knowledge and anticipatory guidance of the adults around them (47, 64). The following steps are often named to constitute a more complete process of prevention and action:


Thus, while dogs are rather good at interpreting human signaling [e.g., (79–92)], humans do not seem to be equally equipped to interpret dogs' visual signaling.

Given not only the popularity of dogs as pets, but also the increasing popularity of animal-assisted interventions in educational settings as well as the application of pets in the classroom [(93); for a systematic review, see (2); see also (1, 94–99)], and given the frequency of injury with familiar dogs at home, there is an urgent need to teach adults and children dog body language.

In order for children to interact safely with dogs, they must first have knowledge of dog behavior and awareness of situations which may put them at risk of being bitten. This means that they must know the signals, recognize them, understand that they are the consequence of actions toward the dog, and, if it is their own action, adapt their action. Ultimately, it is crucial that parents also have this knowledge in order to teach and supervise their children when interacting with dogs and to provide anticipatory guidance.

If we can successfully teach children and parents to recognize and interpret dogs' stress signaling correctly, and be aware of the actions that trigger the signaling, and ideally, act upon their knowledge, then all sides will profit: adults and children will understand dogs' distress signaling better, risk situations may be defused and the (family) dog will enjoy more respectful and appropriate treatment.

In the current study, we have addressed the lack of knowledge and lack of systematic intervention with children and adults alike. By teaching participants how to recognize and interpret dog stress signals and by assessing if our intervention works, we are undertaking the first steps toward preventing misunderstandings and risk escalation due to lack of knowledge.

We assessed participants' knowledge of dogs' signaling behaviors before and after a dog body language intervention with a range of video clips of real dogs. We tested both children and parents. In addition, we integrated this into a longitudinal design to monitor the effectiveness of the current intervention by assessing children's developmental progression over 4 time points up to 1 year. Finally, to gain more in-depth knowledge of other potential factors, we used questionnaires to learn about background demographic data, socio-economic status and dog ownership statistics.

### METHODS

### Participants

Children were recruited through schools and nurseries in the county of Lincolnshire, UK. All participants were healthy and had normal or corrected-to-normal vision. No exclusions occurred before testing.

Initial power analysis calculations [G∗Power 3; (100)] showed a necessary sample size of 18 children per age group (3, 4, and 5 years). As attrition rates of about 30–70% do occur in longitudinal studies and often reduce the initial cohort to a vastly smaller size in the final cohort (101), we over-recruited children to be able to cope even with a high drop-out rate. Hence, our initial overall group size at Test 1 contained 124 children for this longitudinal study. However, our attrition rate was very low and we managed to keep 82% (N = 101) of children in the sample after 6 months and we retained 85% (N = 105) of children in the final sample after 1 year, as can be seen in **Table 1**.
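The over-recruitment reasoning can be made concrete with a small calculation. The sketch below uses only the figures stated above (18 children per age group, three age groups, 30–70% attrition, 124 children recruited) and assumes, for simplicity, that attrition is spread evenly across age groups; the function and variable names are illustrative.

```python
REQUIRED_PER_GROUP = 18   # from the power analysis
AGE_GROUPS = 3            # 3-, 4-, and 5-year-olds
RECRUITED = 124           # children at Test 1

def initial_sample_needed(required, attrition_pct):
    """Smallest initial sample that still leaves `required` children
    after losing `attrition_pct` percent (ceiling division)."""
    retained_pct = 100 - attrition_pct
    return -(-required * 100 // retained_pct)

required_total = REQUIRED_PER_GROUP * AGE_GROUPS
needed_low = initial_sample_needed(required_total, 30)
needed_high = initial_sample_needed(required_total, 70)

# Highest attrition (in percent) the recruited cohort could absorb.
tolerable_pct = (1 - required_total / RECRUITED) * 100

print(required_total, needed_low, needed_high, round(tolerable_pct, 1))
```

Under these assumptions the recruited cohort sits between the two bounds, which is consistent with the stated aim of coping with a high, though not the most extreme, drop-out rate.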

Children took part in Test 1, 2, 3, and 4. Reasons for attrition in children are as follows: In Test 2, 3 children who took part in Test 1 did not complete Test 2 on the same day, hence were excluded from analysis. Test 3: Attrition of 20 children due to being ill, having moved school and being on holiday. Test 4: A slight gain of children occurred, as some who had missed Test 3 due to absence were back for Test 4.

Overall, the final sample entered into the data set comprises 88 children who took part in all testing sessions (39 girls and 49 boys overall): 26 3-year-olds (12 female, 14 male; M = 3.4, SD = .32, range 2.8–3.9), 23 4-year-olds (11 female, 12 male; M = 4.6, SD = .24, range 4.0–4.9) and 39 5-year-olds (16 female, 23 male; M = 5.7, SD = .45, range 5.0–6.8). Of this sample, 37% had a dog.

Parents took part in Test 1 and 2 (same day) only. Additional longitudinal parent testing was not possible due to limited funding. However, piloting had shown that adults showed clear improvements as they found the teaching phase to be a real "eye-opener." Error rates dropped once they had realized what the behavior of the dog implied. The current study results confirm this and we have no reason to assume that adults with typical and intact memory capacity would forget this knowledge over time. Of the parents, 27.5% were dog owners; these dog ownership figures for children and adults compare well with the national average of about 30% dog owners. Also, 47.5% of parents had been bitten by a dog; this is very similar to the 47% reported elsewhere [e.g., (29)]. Thus, we can assume our sample is fairly representative concerning these factors.

TABLE 1 | Participant numbers over time.

#### Ethical Approval

This study was carried out in accordance with the recommendations of University of Lincoln, School of Psychology Research Ethics Committee (SOPREC). The protocol was approved by the SOPREC. Written informed consent was gathered in accordance with the Declaration of Helsinki.

### Stimuli

#### Video Clips

The stimuli consisted of sets of 16 short video-clips portraying dogs with the full range of behavioral distress signals described in "Shepherd's ladder" (72, 73). These are as follows: yawning, blinking, nose licking, turning the head away, turning the body away, pawing, walking away, creeping, crouching with tail tucked under, lying down with legs up, stiffening up and staring, growling, snapping and biting. Based on other literature, we also added snarling and walking away with hiding. We also presented four video clips of relaxed dogs. Given research indicating that acoustic input may help children's recognition and correct interpretation (102, 103), those clips that naturally had a sound (snarling and growling) were accompanied by this sound.

Due to ethical considerations we did not show serious bites drawing blood, and parents had the opportunity to view the images beforehand and decide if they allowed their children to take part. Also, children received a thorough debriefing session after testing finished, so we could make sure children clearly understood the dog signaling. None of the children displayed any signs of distress during or after testing, and none of the parents reported any detrimental effects back to the research team.

Having piloted the video clips, we decided that the procedure worked best if we used 2 × 16 videos (2 per distress behavior). We added 4 relaxed (happy) behaviors so that children would not get the impression that dogs are usually distressed; however, these items were not part of the intervention phase and children were not trained to recognize relaxed dogs<sup>1</sup>. We called these "happy" as the language needed to be child-appropriate and previous work has shown that children understood this label well, as they did the terms "ok," "unhappy," and "angry" (60).

Videos were clipped and resized using Bink and Smacker (RAD Video Tools): each video was 6,000 ms in duration, 360 × 240 pixels, and with a frame rate of 25 frames per second.

Video clips were presented centrally on the monitor screen and displayed on a 15% greyscale background. Altogether, we used up to 4 different sets of videos in Test 1 (baseline), Test 2, 3, and 4 (see below). All video stimuli were assessed for their expression and approved by 3 internationally renowned dog behavior specialists.

#### Audio Stimuli

Audio recordings matching each of the visual stimuli were produced in a sound-proof professional audio-recording studio at the University. All recordings were carried out within one session so as to reduce variation in the voice of the speaker. The speaker was female and a native speaker of British English. Audio messages consisted of four features across all trials: an initial "Look" command, followed by a description of the dogs' behavioral signal to steer participants' attention, then a message of how the dog is feeling and lastly a message of safety instruction for the child. We consulted closely with a consultant and dog behavior expert on the appropriate content of the verbal messages. Messages took the following form: (a) Attention getter (Look!), (b) highlighting the dog's signaling behavior, (c) followed by an explanation how to interpret the dog's behavior, (d) then a clear safety instruction for adapting their actions. An example of such a message is as follows: "Look! The dog is blinking its eyes. The dog is worried. You should leave the dog alone." Audio files were cut and manipulated using Audacity version 2.0.1. Files were 1141 kbps, 2 channel and were used in .wav format.
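The four-part structure of the audio messages can be written down as a small data structure. This is only an illustrative sketch of the message format described above, not the software used in the study; the class and field names are hypothetical, and the example text is the blinking message quoted in the paragraph.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SafetyMessage:
    """One audio message: (a) attention getter, (b) signal description,
    (c) interpretation of the dog's feeling, (d) safety instruction."""
    signal: str
    feeling: str
    instruction: str
    attention: str = "Look!"

    def render(self) -> str:
        # Parts are spoken in the fixed order (a) to (d).
        return " ".join([self.attention, self.signal, self.feeling, self.instruction])

blinking = SafetyMessage(
    signal="The dog is blinking its eyes.",
    feeling="The dog is worried.",
    instruction="You should leave the dog alone.",
)
print(blinking.render())
```

Keeping the four parts as separate fields makes it easy to check that every message follows the same order.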

#### Rating Scale

We used a child-appropriate 1–5 rating scale in which symbolic faces expressed very happy (1), happy (2), just ok (3), unhappy/angry (4), and very unhappy/angry (5) emotions. Children had no problems using the scale.

### Procedure

Children were tested in schools and nurseries in a quiet room. Videos were presented on a laptop and the experiment was programmed using the Lincoln Infant Lab Package 1.0 (104). Participants were seated approximately 70 cm from the screen.

Child participants took part in the study longitudinally: they viewed an initial baseline phase of video stimuli (Test 1), immediately followed by a training phase of videos, and were then tested with novel videos (Test 2) to investigate if their knowledge had improved. Participants were then tested again 6 and 12 months later (Tests 3 and 4) without any additional training to see if they had retained their knowledge. Hence, we have an integrated control group with each child being their own control (before and after learning and at the follow-up testing). In addition, we have further integrated controls in that 4-year-olds at the start of testing can be compared with 3-year-olds after 1 year (when they have turned 4 years of age). In the same way, the 5-year-olds at the start of their testing can be compared with the 4-year-olds at the 1-year testing point (when they have turned 5). Adults only took part twice on the same day (Test 1, Training and Test 2) and their results can therefore be compared before and after training.

### Testing Phases

#### **Baseline phase**

Each participant viewed 20 trials. Each trial was made up of a 6,000 ms video displaying dog behavioral signals as described above. These were followed by a fixed choice user/child friendly rating 1–5 scale ranging from "very happy" to "very unhappy/angry." Participant ratings were recorded both electronically and verbally, and the rating scale stayed on the screen until the participant had made their choice. Duration of this phase was between 2 and 5 min.

#### **Training phase**

Participants viewed 32 trials (2 × 16 distress behaviors, one set with dogs seen in Test 1, one set with novel dogs). Each trial was made up of a 1,000 ms blank screen accompanied by the initial "Look" audio. This was followed by a 6,000 ms video displaying dog behavioral signals accompanied by the remainder of the audio sentence highlighting the dogs' behavioral stress. Duration of this phase is about 4–5 min.
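As a rough plausibility check on the reported phase durations, the pure stimulus time per phase can be computed from the trial counts and durations given above. This is a minimal sketch: the function name is hypothetical, and the figures are lower bounds because they exclude rating time and any transitions between trials.

```python
def stimulus_minutes(n_trials: int, ms_per_trial: int) -> float:
    """Total stimulus presentation time in minutes (excludes response time)."""
    return n_trials * ms_per_trial / 60_000


# Baseline: 20 trials, each a 6,000 ms video (rating time comes on top).
baseline = stimulus_minutes(20, 6_000)

# Training: 32 trials, each a 1,000 ms blank screen plus a 6,000 ms video.
training = stimulus_minutes(32, 1_000 + 6_000)

print(f"baseline >= {baseline:.1f} min, training >= {training:.1f} min")
```

These lower bounds (2 min and roughly 3.7 min) sit at or just below the reported ranges of 2 to 5 min and about 4 to 5 min, which is what one would expect once response time and transitions are added.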

#### **Test 2 (same day) and Test 3 and 4 (6- and 12-month intervals)**

Participants were again presented with 20 trials (16 distress behaviors and an additional 4 "happy" dogs). Each trial was immediately followed by the fixed-choice, child-friendly 1–5 rating scale as described above. This phase took between 2 and 5 min. Both children and parents thoroughly enjoyed taking part.

Note: In addition, half of the children always saw novel stimuli at each testing time, and the other half saw the novel set from Test 2 repeated at Tests 3 and 4. This was to explore if children learn differently with items that are novel each time as opposed to items that are novel at Test 2 and then reoccur. However, there was no statistical difference; hence, the results below include both groups of stimuli.

<sup>1</sup> Incidentally, our behavior experts agreed least on "happy" dogs. In order to teach about relaxed dog behavior, we would need to set up a separate study investigating this. For the current research, we analyzed the behaviors that were trained in the intervention to see if we can educate participants on recognizing distress behaviors in dogs.

### RESULTS

### Study 1 With Children

#### Rating Scores Children

We initially calculated a repeated measures ANOVA with Gender (male/female), Dog Ownership (yes/no), Age Group (3, 4, 5 years) and Distress Signal Group (conflict-defusing, conflict-avoiding, conflict-escalating) on the rating scores at the different Testing Times (before training, after training, after 6 months, after 1 year)<sup>2</sup>. This analysis revealed no significant effects of Gender and Dog Ownership; hence, we calculated a repeated measures ANOVA only with Age Group and Distress Signal Group on the rating scores at the different Testing Times.

We found a highly significant main effect of Age [F(2, 85) = 7.84, p < .001, partial η<sup>2</sup> = .16], with older children showing more correct results than younger children. A significant main effect of Distress Signal Group [F(2, 170) = 298.85, p < .001, partial η<sup>2</sup> = .78] also emerged, with children judging conflict-escalating signals as different from conflict-avoiding and conflict-defusing signals, but not distinguishing between conflict-avoiding and conflict-defusing signals in dogs. Post hoc tests with Bonferroni corrections (p < .0166) show that the following differences are highly significant: conflict-escalating vs. conflict-defusing (p < .001) and conflict-escalating vs. conflict-avoiding (p < .0001), while children do not distinguish conflict-defusing vs. conflict-avoiding signals in dogs (p < .05).
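The Bonferroni threshold quoted above follows from dividing a family-wise alpha by the number of pairwise comparisons among the three signal groups. The sketch below assumes a family-wise alpha of .05, which the text does not state explicitly but which matches the reported threshold; the function names and the p-value of .03 are hypothetical.

```python
def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons


def is_significant(p_value: float, alpha: float, n_comparisons: int) -> bool:
    """True if p_value falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_alpha(alpha, n_comparisons)


# Three pairwise comparisons: escalating vs. defusing, escalating vs. avoiding,
# and defusing vs. avoiding. Assumed family-wise alpha: 0.05.
threshold = bonferroni_alpha(0.05, 3)
print(f"corrected threshold = {threshold:.4f}")

# A hypothetical p-value of .03 is below .05 but not below the corrected threshold.
print(is_significant(0.03, 0.05, 3))
```

The exact quotient is .01667; the value .0166 in the text is this number truncated.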

We also found a highly significant main effect for Testing Time [F(3, 255) = 6.93, p = .0002, partial η<sup>2</sup> = .08], with children improving significantly from Test 1 (baseline measure before intervention) to Test 2 after intervention (p < .002). Children also show improved knowledge from Test 1 to Test 3 at 6 months (p < .0026) and from Test 1 to Test 4 after 1 year (p < .0006).

There was also a significant interaction between Age Group and Testing Time [F(6, 255) = 5.11, p = .0001, partial η<sup>2</sup> = .11], which demonstrated that the older the participants, the better they performed. Highly significant interactions of Age by Distress Signal [F(4, 170) = 5.07, p = .0007, partial η<sup>2</sup> = .11], see **Figure 2** below, and of Distress Signal by Testing Time [F(6, 510) = 6.02, p < .0001, partial η<sup>2</sup> = .07] also emerged, as well as a significant three-way interaction between Testing Time, Distress Signal and Age [F(12, 510) = 1.94, p = .028, partial η<sup>2</sup> = .04], showing clear differences between conflict-escalating signals vs. conflict-defusing and conflict-avoiding signals, with children showing better performance with increasing age and improvement over time, especially in the conflict-escalating signal group.

These results show medium to high effect sizes. Results are illustrated in overview in **Figures 2**, **3**.
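For readers who wish to relate the reported effect sizes to the test statistics, partial η<sup>2</sup> can be recovered from an F value and its degrees of freedom with the standard identity η<sup>2</sup><sub>p</sub> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The sketch below applies it to the three main effects on the children's rating scores; the function name is hypothetical, and the text does not say how the effect sizes were obtained (they were presumably reported by the statistics software).

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effects reported for the children's rating scores: (F, df_effect, df_error).
reported = {
    "Age": (7.84, 2, 85),
    "Distress Signal Group": (298.85, 2, 170),
    "Testing Time": (6.93, 3, 255),
}

effect_sizes = {name: round(partial_eta_squared(*stats), 2) for name, stats in reported.items()}
for name, value in effect_sizes.items():
    print(f"{name}: partial eta squared = {value:.2f}")
```

The recomputed values agree with the reported .16, .78 and .08 after rounding to two decimals.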

After the intervention, children improve in their judgments, but even the oldest children do not come close to the correct ratings (e.g., 5 for conflict-escalating signal, 1 for happy).

## Study 2 With Adults

#### Rating Scores Adults

An ANOVA of Gender (male/female) by Dog Ownership (yes/no) by Distress Signal Group (conflict-defusing, conflict-avoiding, conflict-escalating) was calculated for Testing Times before and after intervention on rating scores. Gender and Dog Ownership yielded no significant results; therefore, the analysis was calculated with Distress Signal Group (conflict-defusing, conflict-avoiding, conflict-escalating) and Testing Times before and after intervention. We found a highly significant main effect for Testing Time [F(1, 39) = 243.93, p = .0001, partial η<sup>2</sup> = .86], showing improved understanding after intervention, and a highly significant main effect for Distress Signal Group [F(2, 78) = 291.54, p = .0001, partial η<sup>2</sup> = .88], highlighting differences between Distress Signal groups. **Figure 4** below illustrates this.

After the intervention, adults come close to the ratings that would be suitable for the dog's signaling attempt (5 for conflict-escalating signals, 4-4.5 for conflict-avoiding signals, 4 for conflict-defusing signals).

We also tested if there were effects for parental education, but no significant results existed.

### Studies 1 and 2: Rating Scores Compared Children and Adults

We also found highly significant main effects on differences between the parents' and children's judgments of dogs' behavior, with most mistakes occurring in the conflict-defusing and conflict-avoiding signal groups [F(3, 387) = 251.69; p < .0001].

<sup>2</sup> See Norman (105) and Carifio and Perla (106) for the appropriateness of using Likert-scale data with ANOVAs.

#### Expected vs. Obtained Scores

One-sample t-tests revealed that all age groups significantly underestimate and misinterpret the dogs' real distress signaling (p < .001). Again, younger children make most misinterpretations. Least recognition of different distress signaling is found in 3-year-old children.

### Studies 1 and 2: Correct Answers and Errors

In a further analysis, we calculated correct responses and errors from the original scores. **Table 2** below shows percentages of correct answers and errors per Distress Signal category<sup>3</sup>. Please note the high proportion of errors classed as "happy" by the participants.

### Correlations Between Children's and Parents' Responses

There were no significant correlations between children's and their parents' judgments of the dogs' signaling behaviors before or after training.

### Correct Answers and Errors – Children

We also calculated a repeated measures ANOVA of Gender (male/female) by Dog Ownership (yes/no) by Age Group (3, 4, 5) by Distress Signal Group (conflict-defusing, conflict-avoiding, conflict-escalating) before and after Intervention (Test 1, 2, 3, and 4) on correct answers. As there were no effects of dog ownership or gender, we ran the analysis with Age Group (3, 4, 5) by Distress Signal Group (conflict-defusing, conflict-avoiding, conflict-escalating) at the different testing times (before training, after training, after 6 months, after 1 year). The following main effects were found: a significant main effect for Age [F(2, 148.822) = 6.98, p = .002, partial η<sup>2</sup> = .14] and Distress Signal Group [F(1.772, 84) = 395.36, p = .0001, partial η<sup>2</sup> = .83] as well as Testing Time [F(2.823, 237.156) = 4.72, p = .004, partial η<sup>2</sup> = .053]. Significant interactions were shown for Testing Time by Age [F(6, 84) = 4.94, p = .001, partial η<sup>2</sup> = .11], Distress Signal by Age [F(4, 84) = 4.298, p = .002, partial η<sup>2</sup> = .93] and Testing Time by Distress Signal [F(5.643, 473.980) = 4.70, p = .001, partial η<sup>2</sup> = .53]. Overall, children distinguish conflict-escalating signals better than conflict-avoiding and conflict-defusing signals. They show more correct answers with increasing age and improve after intervention, specifically in the conflict-escalating signal group. In this group, improvements are stable over time (up to 1 year). The 5-year-olds also improve in the conflict-avoiding signal group from before to after intervention; however, this effect is not enduring over time. Interestingly, despite the same rating categories 4 and 5 being accepted for conflict-avoiding and conflict-escalating signals, children distinguished conflict-avoiding and conflict-escalating signals clearly (p < .0001). Overall, these results show significant differences over time and for the different distress groups, with older children giving more correct answers than younger children. See **Figure 5** below for an overview of the results.

<sup>3</sup> For the purpose of scoring % correct, we have scored "unhappy/angry" (4) as correct for conflict-defusing distress, and have accepted both "unhappy/angry" (4) and "very unhappy/very angry" (5) as correct for conflict-avoiding distress and conflict-escalating distress. In a stricter analysis below, we have only accepted "very unhappy/very angry" (5) as correct for highly distressed dogs. Here, we have accepted both 4 and 5 for highly distressed dogs (instead of just accepting 5s) due to adults' known reluctance to give extreme measures for emotional stimuli [e.g., (107)].

*Based on 114 children overall and 40 adults.*

To address the question of whether children simply learn over time or whether results are due to our intervention, we compared results of children at 4 and 5 years (4-year-olds at initial test act as control group to 3-year-olds at testing after 1 year when they are 4; 5-year-olds at initial test act as control group to 4-year-olds at testing after 1 year when they are 5). The children who started at 3 years show significantly more correct answers after 1 year (66%) than 4-year-olds before intervention (55% correct, p < .044). Similarly, 4-year-olds after 1 year, when they had turned 5, demonstrate 76% correct answers vs. 64% correct answers in 5-year-olds before the start of the intervention (p < .025). These significant differences between the control and intervention groups indicate that the intervention is successful and causes a significant increase in learning.

#### Correct Answers and Errors – Adults

We calculated a repeated measures ANOVA of Gender (male/female) by Dog Ownership (yes/no) by Distress Signal Group (conflict-defusing, conflict-avoiding, conflict-escalating) for Testing Times before and after intervention on percentage of correct answers. Gender and Dog Ownership yielded no significant results; therefore, the analysis was calculated with Distress Signal Group (conflict-defusing, conflict-avoiding, conflict-escalating) and Testing Times (before and after intervention) on percent correct responses. We found a highly significant main effect for Testing Times [F(1, 39) = 311.49, p = .0001, partial η<sup>2</sup> = .89], with better results overall after intervention, and a highly significant main effect for Distress Signal Group, showing that differences between distress signal groups are perceived [F(2, 78) = 173.73, p = .0001, partial η<sup>2</sup> = .82]. A highly significant interaction between Testing Time and Distress Signal also emerged [F(2, 78) = 26.01, p = .0001, partial η<sup>2</sup> = .40], demonstrating rates of correct answers rising with higher distress, from conflict-defusing via conflict-avoiding to conflict-escalating signals, and all scores being higher after intervention. Results are shown in overview in **Figure 6** below.

Interestingly, if we calculate results on a stricter criterion, i.e., only count as correct for conflict-escalating those answers that said "very unhappy/very angry," all main effects and interactions stay intact; however, performance of adults drops in the conflict-avoiding category to 40%, and in children to 35%, 51%, and 60% for 3-, 4-, and 5-year-olds, respectively, after intervention.
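The lenient and strict scoring rules can be sketched as two answer keys applied to the same ratings. This is an illustration only: the names are hypothetical, the four responses are made up, and the strict key assumes that the stricter criterion changes only the conflict-escalating category, as the sentence above states (the footnote's "highly distressed dogs" may be intended more broadly).

```python
# Ratings accepted as correct per category under the lenient scoring rule.
LENIENT = {
    "conflict-defusing": {4},
    "conflict-avoiding": {4, 5},
    "conflict-escalating": {4, 5},
}

# Assumption: the strict rule differs only for conflict-escalating signals.
STRICT = {**LENIENT, "conflict-escalating": {5}}


def percent_correct(responses, answer_key):
    """Percentage of (category, rating) pairs whose rating the key accepts."""
    hits = sum(rating in answer_key[category] for category, rating in responses)
    return 100.0 * hits / len(responses)


# Made-up responses for illustration; not data from the study.
responses = [
    ("conflict-escalating", 5),
    ("conflict-escalating", 4),
    ("conflict-avoiding", 4),
    ("conflict-defusing", 2),
]

lenient_score = percent_correct(responses, LENIENT)
strict_score = percent_correct(responses, STRICT)
print(lenient_score, strict_score)
```

Expressing the criterion as a lookup table keeps the scoring function unchanged when the acceptance rule is tightened, so the two analyses differ only in the key that is passed in.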

### Additional Observations–Children's Initial Perceptions

In addition to the quantitative data described above, we also would like to provide some additional observations. While we were working with the children, they often commented on the videos. The quotes below give an impression of children's thinking and reflect the most frequent comments, see **Table 3** below.

These comments were frequently made and show that children often anthropomorphise dogs and try to find an explanation that would be appropriate to explain human behavior, but unfortunately does not fit the dog's signaling intentions.

We would furthermore like to report comments from parents, so far only anecdotal, stating that they had frequently provoked distress-signaling behaviors, for example lip/nose-licking, in their dog because the family found it funny. However, having learned about dogs' distress signaling in the intervention, the adults were upset that they and their children might have caused their dog distress and commented that they will change their (and their children's) behavior, thus contributing to a safer home environment for all and to dogs' welfare. Further research will need to be carried out to investigate this systematically.

TABLE 3 | Dog signaling behaviors and children's perceptions and interpretations.

### DISCUSSION

Results show that children and adults profit from the intervention and improve their knowledge of dogs' stress signaling significantly. When performing analyses over time we found that, overall, learning effects are still highly significant in children after 6 months and 1 year despite no training taking place in the meantime–thus, the intervention works successfully, even over the duration of 1 year.

A closer look at the error results shows us the areas in which the intervention has worked most successfully, and also the areas in which we need to invest more training with children and parents alike. We have very good success teaching all age groups of children, even young children of 3 years, and parents the meaning of conflict-escalating distress signals. They learn to understand, recognize and correctly interpret the signals, and the learning success is still evident after 1 year. This is an important success as dogs showing their teeth, snarling or biting pose a significant risk to children if they approach dogs displaying such signaling. We have good to moderate success in training especially older (5-year-old) children and parents on conflict-avoiding distress signals. However, the data also show that all participants, including adults, find the more subtle signals of dogs' distress hardest to judge. Here, after intervention, only adults show excellent improvements. Future studies should examine these signals, and how they are perceived, in more detail.

One could also question whether children's increase in knowledge is due to general learning and increase in maturity. However, the results of the 4- and 5-year-olds clearly contradict this, as children who have taken part in the intervention (3- and 4-year-olds tested after 1 year when they had turned 4 and 5, respectively) show significantly better results than the 4- and 5-year-olds at the start of the study (before intervention). Thus, our intervention has clearly improved their knowledge over time compared to the control group. To investigate the role of the intervention in light of children's learning and general maturity over time, it could be useful to devise larger studies with independent control groups, which would require significantly larger funding sources.

Overall, it becomes evident from this data that it is possible to educate adults and children to understand dogs' distress signaling. Adults profited from the intervention throughout all distress categories and show clear and significant learning effects. Thus, it is advisable to teach dog signaling to parents, dog owners, dog trainers, veterinary students and the wider public. The short intervention is easy to use and leads to significant improvements in knowledge, recognition and interpretation straight away and with enduring effect.

It has also become clear which areas need further attention and research: while our intervention works very well with adults and also with older children, it has to be adapted to improve the younger children's understanding, especially of the more subtle distress signals in dogs. Further research will need to explore how children process the signals and how best to teach them.

Our background measures of dog ownership and SES/parental education showed no effects of any of these factors. In other words, neither children's nor parents' performance was better if, for example, they owned a dog or had a higher SES/education. Instead, performance was independent of these factors.

There was also no difference between children seeing novel stimuli in all test phases or the same stimuli again. This is useful to know for the future creation of interventions as we can now be confident that we do not need to increase the amount of novel stimuli to be shown in order to train and assess children on dog body signaling.

Finally, children's utterances illustrated how they perceived, and misinterpreted, dogs' body language. Further quantitative as well as qualitative research in this area is warranted and could help develop additional dog bite prevention tools.

By assessing whether our intervention works, we have undertaken the first step toward preventing misunderstandings and risk escalation by addressing the current lack of knowledge and replacing it with knowledge that is stable over time. In the case of conflict-escalating signals, all participants showed significant improvements in knowledge over time.

Beyond teaching children and parents Knowledge of stress signaling (step 1) and Recognition and correct interpretation of stress signaling in context (step 2), the next step is to Adapt the action: having created awareness of the situation, insight to act accordingly should follow (step 3). Finally, Repeat recognition of future triggers and contexts and avoidance of risk (step 4) needs to follow to effectively implement the taught knowledge. Further research will have to assess how best to achieve these aims.

In particular, future studies should address how best to implement the above so that, beyond recognizing and understanding the signals, the specific human actions and contexts in which the dog presents these signs are recognized. It will also be useful to investigate if parents, or other educators, can guide and educate children to be aware of specific risk contexts. Concerning parental supervision, it would be interesting to find out to what extent parents supervise child and dog and stop children from engaging in risky contexts with their dog in the first place. Furthermore, it would be beneficial to follow up on the extent to which the welfare of family and dog is compromised after escalations have happened, as well as to investigate the role of professional help from a veterinary behavioral specialist.

Finally, and importantly, we assume that dogs will benefit from children and adults having been taught how to read their distress signals. This increased understanding will mean that dogs are better understood, and if humans apply their knowledge appropriately this will lead to greater wellbeing of the dog living within a family household.

### CONCLUSION

This project is the first to offer an intervention to enhance children's and adults' abilities to interpret dog signaling correctly.

We investigated how children perceive and categorize dogs' body language and interpret their signals and we then trained them and were able to improve their knowledge, recognition and interpretation skills.

We showed very good results in correcting the potentially very dangerous misunderstandings of dogs' conflict-escalating distress and threat signals. For example, the misinterpretation of a snarling dog showing its teeth as a happy dog, which was common among children, can now be corrected: children showed significant improvements that were stable over time. We have shown successfully that we can significantly improve all participants' abilities to recognize and understand these signals and enable all participant groups to avoid escalating risk situations; our intervention works especially well for these high-risk situations. This is especially useful because, if such escalation occurs, it should be stopped to avoid the risk of dog bite incidents and continued stress to the dog. Crucially, as our intervention furthers understanding of conflict-defusing and conflict-avoiding signals, it may also help to avoid risk escalation.

We have revealed the extent of children's and adults' misinterpretation errors for the first time, and we have shown the areas in which children and adults make most errors. We have also shown that we can successfully teach adults and children to learn, recognize and interpret the signals correctly.

With this new knowledge we enhance the currently scarce scientific database on children's and adults' interpretation abilities of dog signaling. We can now also address not only the most dangerous misinterpretations, but also commit ourselves to creating awareness of the less well understood and most frequently misunderstood signaling behaviors of dogs in order to avoid escalation of risk. The materials used can be further developed into an awareness raising intervention that is more widely usable for children and adults. For future effective prevention the above mentioned steps of implementation need to follow and, in turn, also be assessed as to their effectiveness.

In sum, we now have a solid knowledge base about how children and adults look at and perceive dogs and (mis)interpret their behavior.

Our study was able to close these particular knowledge gaps, establish the necessary knowledge for the first time and therefore significantly advance the scientific knowledge in this area. Our study was also able to show that we can teach dog signaling successfully, and it outlines the current limitations.

Veterinarians will profit from these results insofar as they can help to raise awareness of the existing knowledge gaps in both adults and children.

Our study can also serve as an example of good practice in that we have evaluated the learning effects of the intervention cross-sectionally and longitudinally, as well as using additional measures.

In the future, integrated research projects including child psychology, veterinary, medical, educational and other social sciences can be developed as a result of these efforts and produce research with impact on One Health-related injury prevention challenges.

### REFERENCES


### AUTHOR CONTRIBUTIONS

KM conceived and designed the research project. TD fed back on the project proposals, contributed to their improvement and contributed the majority of dog videos. As behavior specialist, she also carefully assessed and commented on the video pool and helped select appropriate videos. KM and VB contributed to all aspects of the research itself from planning to testing to data analysis and writing up results. TD also contributed to initial design and planning and to writing the manuscript.

### FUNDING

This research was co-funded by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) and the WALTHAM® Center for Pet Nutrition, a division of Mars, Incorporated. The project described was supported by Grant Number 1R03HD071161-01 from the Eunice Kennedy Shriver National Institute of Child Health & Human Development and Mars-WALTHAM®.

### ACKNOWLEDGMENTS

We would like to thank our funders and all nurseries and schools, parents and children who took part in this research–it would not have been possible without your help.

We would also like to thank dog behavior specialists Prof. Daniel Mills and Dr. Hannah Wright, both University of Lincoln, UK, for taking the time and great care to assess and comment on our video selection and to help us with the selection process for final testing. We would furthermore like to acknowledge and thank Geert De Bolster (dog rehabilitation trainer, Belgium) for providing additional videos.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Meints, Brelsford and De Keuster. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Associations Between Pet Ownership and Attitudes Toward Pets With Youth Socioemotional Outcomes

Kristen C. Jacobson\* and Laura Chang

Department of Psychiatry and Behavioral Neuroscience, University of Chicago, Chicago, IL, United States

Evidence regarding the effects of pet ownership and related variables on youth socioemotional development is mixed. Inconsistencies across studies may be due to a variety of factors, including the use of different outcomes measured across studies, small potential effect sizes, and use of selected samples. In addition, studies have not systematically controlled for demographic characteristics that may bias results, nor have studies systematically examined whether effects are consistent across different subgroups. The present study examined the impact of pet ownership and attitudes toward pets on four measures of youth socioemotional outcomes: delinquency, depressed mood, empathy, and prosocial behavior. Linear mixed-effect regression analyses were conducted on 342 youth (48.0% male) aged 9–19 (M = 14.05, SD = 1.77) from a racially, ethnically, and socioeconomically diverse sample. The majority (59.1%) of youth currently lived with a dog or cat and all participants completed the Pet Attitude Scale-Modified. Pet owners reported lower delinquency and higher empathy than nonowners; however, group differences became non-significant once demographic factors were controlled for. Attitudes toward pets were significantly associated with all four outcomes. More positive attitudes were modestly associated with lower delinquency (β = −0.22, p < 0.001) and higher empathy (β = 0.31, p < 0.001), with smaller effects for depressed mood (β = −0.12, p = 0.04) and prosocial behavior (β = 0.12, p = 0.02). For delinquency, empathy, and prosocial behavior, effects were only slightly attenuated and remained statistically significant after controlling for gender, age, race/ethnicity, family socioeconomic status, and pet ownership, although the effect for depressed mood became non-significant after inclusion of these demographic factors. While there was some variability in effect sizes across different subgroups, none of the interactions between attitudes toward pets and gender, race/ethnicity, age, family SES, or pet ownership was statistically significant, indicating that the effects may transcend individual differences in demographic characteristics. Overall, the study adds to a growing body of work supporting a positive relationship between emotional bonds with pets and youth socioemotional outcomes and offers potential explanations for inconsistencies across previous studies.

#### Edited by:

Peggy D. McCardle, Consultant, New Haven, CT, United States

#### Reviewed by:

Megan Kiely Mueller, Tufts University, United States Francesca Cirulli, Istituto Superiore di Sanità (ISS), Italy

> \*Correspondence: Kristen C. Jacobson kjacobso@bsd.uchicago.edu

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 16 June 2018 Accepted: 05 November 2018 Published: 26 November 2018

#### Citation:

Jacobson KC and Chang L (2018) Associations Between Pet Ownership and Attitudes Toward Pets With Youth Socioemotional Outcomes. Front. Psychol. 9:2304. doi: 10.3389/fpsyg.2018.02304

Keywords: pets, children, depressed mood, delinquency, empathy, prosocial behavior

## INTRODUCTION


There is a high prevalence of pet ownership in the United States, with dogs and cats being the most common types of pets (American Pet Products Association [APPA], 2018). Overall, roughly 68% of United States households have pets (American Pet Products Association [APPA], 2018), and pet ownership is even more common among families with children (Melson, 2003; Westgarth et al., 2007, 2010). According to the US Census (2017), approximately 57% of households with children have two or more children. Thus, it is likely that more American children live with pets than with siblings. There is also evidence that humans form strong emotional bonds with pets. In a recent large United States national survey of adults, more than 80% of cat and dog owners indicated that "companionship, love, company, affection" was a positive benefit of owning a pet (American Pet Products Association [APPA], 2018). Studies of both children and adults reveal that a significant number of individuals consider pets to be family members (Melson, 2001; Cohen, 2002; American Pet Products Association [APPA], 2018) and rank relationships with pets as being important (Kosonen, 1996). A study of 7- to 8-year-old children reported that pets ranked higher as sources of social support than non-immediate family members such as aunts, uncles and grandparents (McNicholas and Collis, 2000), while a study of 12-year-olds found that children reported greater satisfaction with their relationships with pets than with their relationships with siblings (Cassels et al., 2017). Despite the prevalence and importance of pets in children's lives, there is surprisingly little research on the effects of pets on child development, especially in comparison to research examining human–human family relationships.

While there have been numerous reviews on the impact of therapy animals on child developmental outcomes (e.g., Nimer and Lundahl, 2007; Lentini and Knox, 2009; O'Haire, 2013; Chur-Hansen et al., 2014), only one published review has considered the effects of pets. This comprehensive evidence review reported only 22 studies of child pet ownership and related pet variables (e.g., time spent with pets, attachment toward pets) published between 1960 and 2016 (Purewal et al., 2017). According to this review, evidence for positive benefits of pets is inconsistent across studies. Of the 39 results summarized from these 22 studies, 64% (N = 25) claimed positive effects, although more than 25% of these results (N = 7) did not include associated p-values or confidence intervals. Exactly one-third (N = 13) reported no differences between owners and non-owners, and one result showed a negative impact of pet ownership.

One possible explanation for inconsistencies in prior research is that studies have used different measures of child developmental outcomes. Pet ownership and related variables are most often studied in relationship to child self-esteem and measures of social competence. While the majority of studies examining self-esteem have reported positive results, studies of other measures of social competence have been less consistent (Purewal et al., 2017). In a large study of 826 Croatian children aged 10–15, greater attachment to pet dogs was associated with higher empathy and more prosocial behavior (Vidović, 1999). Two studies of Canadian elementary children also found that dog ownership was associated with greater empathy, but empathy levels were actually lower among cat-owners (Daly and Morton, 2003, 2006). Pet ownership has been associated with lower self-reported loneliness in two unique samples: a study of 293 racially and ethnically diverse, rural, high school students living in Arizona (Black, 2012) and a study of 332 homeless youth living in Los Angeles (Rhoades et al., 2015), although there were no effects of pet ownership or attachment to pets on perceived loneliness in the large Croatian study (Vidović, 1999).

Research on pet ownership and child emotional and behavioral problems is less common, and results are considerably more mixed. Vidović (1999) found no relationship between pet ownership and anxiety in a large sample of Croatian youth. Gadomski et al. (2015) found that rural children aged 4–10 currently living with pets had lower screening anxiety scores than non-pet-owning children, but did not find a relationship between pet ownership and broader measures of parent-reported youth emotional, behavioral, and attentional problems. Rhoades et al. (2015) reported that pet-owning homeless youth reported less depression than non-pet-owning youth, but a large study of Australian adolescents did not find an association between pet ownership and a composite measure of child emotional, social, and school problems (Mathers et al., 2010). One of the only longitudinal studies of pet ownership found that levels of tearfulness in 8- to 12-year-olds were decreased at 12 months following adoption of a pet dog, in comparison to non-dog-owning children, although the sample size for this study was small (Paul and Serpell, 1996). There are virtually no published studies of pet ownership and child behavior problems, although there are multiple reports suggesting that child hyperactive, aggressive, and disruptive behaviors in school decrease after introduction of pets into classrooms (e.g., Hergovich et al., 2002; Kotrschal and Ortbauer, 2003; Tissen et al., 2007; O'Haire et al., 2013). In the longitudinal study, Paul and Serpell (1996) reported a decrease in "naughty" behavior among children at 1 month following the adoption of the family dog, but this effect did not persist at the 6- or 12-month assessments. Given the relatively limited number of studies on pet ownership in childhood, it is unclear whether inconsistencies across studies are due to the use of different outcome measures or due to differences in sample characteristics.
Research designs that include multiple measures of child socioemotional outcomes within the same sample are an efficient way to test whether positive benefits of pet ownership are limited to certain outcomes.

Prior studies also differ markedly in whether or not they control for demographic covariates. Given that pet ownership is not randomly distributed across families, it is critically important that studies consider other factors that might account for results. For example, ownership of and interest in pets tend to peak in middle childhood (i.e., 8–12 years) and to decline during adolescence (Melson, 1988; Paul and Serpell, 1992, 1996). Because rates of depression and delinquency increase during adolescence, correlations between pet ownership and outcomes could be driven by these coinciding developmental patterns, especially when samples encompass a wide age range. Gender confounds are also under-explored. Fewer than half of the studies reported in the Purewal et al. (2017) review controlled for gender, despite there being marked gender differences in behavioral, social, and emotional problems in childhood and adolescence.

Race/ethnicity and socioeconomic factors are other important factors to consider. Within North America, Caucasian families are more likely to have companion animals than African American, Hispanic, and Asian families (Siegel, 1995; Risley-Curtiss et al., 2006; Pet Food Industry, 2012; Saunders et al., 2017). While there is evidence that dogs are equally valued among Hispanic and Caucasian adolescents (Black, 2012) and adults (Johnson and Meadows, 2002; Schoenfeld-Tacher et al., 2010), Caucasian adults tend to own more pets and be more highly attached to their pets than African Americans (Brown, 2003). Racial/ethnic differences have not been routinely examined in studies of children, however, as most studies have contained more than 95% Caucasian youth. While population-based studies in Europe typically report inverse associations between family pet ownership and levels of income and education (Mullersdorf et al., 2010; Murray et al., 2010; Westgarth et al., 2010), a study of over 42,000 adults living in California reported that several positive socioeconomic factors, such as full-time employment, higher income, and home ownership, predicted both dog and cat ownership (Saunders et al., 2017). Because cultural and economic factors are important predictors of both pet ownership and child outcomes, failure to control for these effects could lead to biased results.

Finally, we have limited information as to whether the effects of pets on child development vary for children in different subgroups. One reason for this gap is that prior research on the impact of pets on children has relied heavily on small samples that lack diversity. Of the 22 studies reported in the Purewal et al. (2017) review, 40.9% (N = 9) were based on samples of fewer than 100 individuals, with 5 of these using fewer than 25 participants. In particular, the vast majority of prior work has been based on Caucasian samples. Thus, whether the potential protective effects of pets generalize to minority youth is largely unknown. There is also evidence that emotional bonds with pets may vary by gender (Kidd and Kidd, 1989; Johnson et al., 1992; Woodward and Bauer, 2007), age (Melson, 1988; Paul and Serpell, 1992, 1996), and family composition (Melson et al., 1991; Siegel, 1995; Bodsworth and Coleman, 2001; Westgarth et al., 2013). Thus, it is important to test whether the benefits of pet ownership vary across demographic characteristics.

The present study was designed to address these limitations. Specifically, detailed measures of pet ownership and attitudes toward pets were added to a larger study of risk and protective factors for youth socioemotional and behavioral outcomes in a racially, ethnically, and socioeconomically diverse sample of urban and suburban youth aged 8–19. We obtained multiple measures of child socioemotional outcomes and caregivers of youth provided detailed information on demographic characteristics. This enabled us to address the following research questions: (1) Is there a stronger relationship with youth socioemotional outcomes for attitudes toward pets compared to pet ownership? (2) Do the effects of pet ownership and attitudes toward pets generalize across different socioemotional outcomes? (3) Are effects attenuated when demographic confounds are considered? (4) Are the effects different for youth in various ecological niches?

## MATERIALS AND METHODS

### Participants and Procedure

Participants in this study took part in an in-lab family study at the University of Chicago. The sample was recruited from a larger community-based study of 3,582 urban and suburban youth in the greater Chicago area who had participated in a prior in-school survey of socioemotional behavior among middle school students (Chen and Jacobson, 2013). The in-lab study consisted of 378 youth aged 8–19 from 241 families, including 137 sibling pairs. More than 85% of families contacted for recruitment agreed to participate in the in-lab assessment, which occurred between March 2010 and August 2012. Exclusion criteria included the presence of severe physical, psychological, or neurological problems in children which would have interfered with study participation (<2% of families contacted) and/or a primary caregiver who could not read or write English (∼6% of families contacted). The study protocol was approved by the University of Chicago Institutional Review Board. In accordance with the Declaration of Helsinki, a parent/legal guardian (79.4% biological mothers) provided written informed consent for themselves and their children, and youth provided written informed assent. Participants were compensated for their time. Youth and a single caregiver were studied simultaneously in an on-campus research laboratory during a single 3–4-h visit. Assessments included face-to-face interviews with caregivers and self-report instruments administered to both youth and caregivers.

### Measures

#### Predictors

#### **Pet ownership**

Pet ownership was assessed through a detailed, semi-structured interview with the youth's caregiver that was designed for the current study. In brief, caregivers were asked to report on the presence of any pets currently living in the home, as well as any other pets they had had during the past 10 years. Questions were asked about dogs, cats, and small pets, including mammals, reptiles, birds, and fish. Preliminary analyses (available from first author) indicated that youth who lived only with small pets were more similar demographically to youth who did not live with any pets than they were to youth living with a dog or a cat. Likewise, youth living with a cat were similar to youth living with a dog. Thus, analyses used current dog and/or cat ownership as the primary predictor. Pet ownership information was available for 371 out of 378 youth.

#### **Attitudes toward pets**

Youth completed the Pet Attitude Scale-Modified (PAS-M; Templer et al., 1981). This measure includes 18 questions and assesses participants' general attitudes about pets. Responses ranged from 1 = strongly disagree to 7 = strongly agree. Questions were phrased both positively (e.g., "You should treat your house pets with as much respect as you would a human member of your family") and negatively (e.g., "The world would be a better place if people would stop spending so much time caring for their pets and started caring more for other human beings instead"). Negatively phrased questions were reverse-coded, and all items were averaged to create a single composite score (Cronbach's α = 0.90), with higher scores indicating more positive attitudes toward pets. Youth were given the PAS-M scale regardless of whether or not they were current or past pet owners. Due to a procedural error, the PAS-M was not administered to N = 28 out of 378 youth (7.4%). Youth with more than 20% missing data on individual items were given a missing value for the composite score.
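
As an illustration, the scoring rule described above (reverse-code the negatively phrased items, average the answered items, and assign a missing composite when more than 20% of items are unanswered) can be sketched in Python. This is a minimal sketch and not the authors' SAS code: the function name `score_composite`, the use of `None` for an unanswered item, and the positions of the reverse-coded items in the example are all hypothetical, since the source does not list which PAS-M items are negatively phrased.

```python
from typing import Optional, Sequence


def score_composite(
    responses: Sequence[Optional[float]],
    reverse_items: Sequence[int],
    scale_min: int = 1,
    scale_max: int = 7,
    max_missing: float = 0.20,
) -> Optional[float]:
    """Mean composite with reverse-coding; None if too many items are missing."""
    n_missing = sum(r is None for r in responses)
    # More than `max_missing` of the items unanswered -> composite is missing.
    if n_missing / len(responses) > max_missing:
        return None
    reverse = set(reverse_items)
    # Reverse-coding maps scale_min <-> scale_max (e.g., 1 <-> 7 on a 7-point scale).
    scored = [
        (scale_min + scale_max - r) if i in reverse else r
        for i, r in enumerate(responses)
        if r is not None
    ]
    return sum(scored) / len(scored)


# Hypothetical respondent: 18 answers; positions 3 and 7 are ASSUMED to be
# negatively phrased items for the purpose of this example.
answers = [7] * 18
answers[3] = 1
answers[7] = 1
print(score_composite(answers, reverse_items=[3, 7]))  # 7.0
```

The same function would cover the other averaged composites in this section by changing `scale_min`, `scale_max`, and `max_missing` (for example, a 25% threshold for the four-item prosocial behavior scale).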

#### **Demographic factors**

Youth age was calculated using caregiver reports of youth date of birth subtracted from the date of the study day. Gender and race/ethnicity were obtained via both youth and caregiver report. For race/ethnicity, youth and their caregivers were asked whether they were Hispanic or Non-Hispanic, and they used a checklist to indicate their racial background. Responses included White, Black or African American, Asian, Native Hawaiian or Other Pacific Islander, American Indian or Alaskan Native, and more than one race. Family socioeconomic status (SES) was assessed with the two-factor Hollingshead weighted SES index based on parental education and occupation, with a possible range of 8–66. The SES measure correlated positively with caregiver report of family income (r = 0.62, N = 373, p < 0.001) and was used because it was less negatively skewed (skewness = −0.56) than income levels (skewness = −2.76).

#### Youth Outcomes

#### **Prosocial behavior**

Youth prosocial behavior was assessed using the Child Social Behavior Scale (CSBS; Crick, 1996) based on caregiver report on child. The CSBS uses four items to assess child prosocial behavior toward peers (e.g., "This child tries to cheer up peers when they are sad about something"). Responses ranged from 1 = never true to 5 = always true. Items were averaged to create a mean composite score (Cronbach's α = 0.91) with higher scores indicating more prosocial behavior. Youth with more than 25% missing data on individual items were given a missing value for the composite score.

#### **Empathy**

Youth empathy was assessed through self-report using the 15-item Social Attitudes Scale (SAS; Eisenberg et al., 1996). Responses ranged from 1 = really like me to 3 = not at all like me. Questions were phrased both positively (e.g., "I feel sorry for other kids who don't have toys and clothes") and negatively (e.g., "I think it is funny that some people cry during a sad movie or while reading a sad book"). Positively phrased questions were reverse-coded so that higher scores indicated greater empathy, and all items were averaged to create a composite score (Cronbach's α = 0.91). Youth with more than 20% missing data on individual items were given a missing value for the composite score.

#### **Depressed mood**

Youth self-reported depressed mood was assessed using the 20-item Center for Epidemiological Studies Depression Scale (CES-D; Radloff, 1977). Questions asked how often each statement was true during the past week and included items such as "you felt depressed," "you were bothered by things that usually didn't bother you," and "you enjoyed life" (reverse-coded). Responses ranged from 1 = never or rarely to 4 = most of the time or all of the time. Items were averaged to create a mean score (Cronbach's α = 0.87) with higher scores indicating greater depressed mood. Youth with more than 20% missing data on individual items were given a missing value for the composite score.

#### **Delinquency**

Youth delinquency was measured with 16 items assessing frequency of a broad range of illegal (e.g., stealing something worth more than \$50), norm-violating (e.g., skipping school without permission), and aggressive (e.g., getting into a serious physical fight) behaviors within the past 12 months. Responses were given on a 4-point scale, ranging from 0 = never to 3 = five or more times; each behavior was recoded into 0 = never and 1 = one or more times. A composite score of the number of delinquent behaviors endorsed was computed by summing the recoded responses to the 16 items (Cronbach's α = 0.81). The initial composite delinquency score was positively skewed (skewness = +1.68); thus the composite score (+1) was log-transformed to approximate normality (skewness = −0.13). Youth with more than 20% missing data on individual items were given a missing value for the summary score.
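
The recoding and transformation steps for the delinquency score can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' procedure: the function name `delinquency_score` is hypothetical, `None` stands in for an unanswered item, and the natural logarithm is assumed because the source does not state the base of the log transformation.

```python
import math
from typing import Optional, Sequence


def delinquency_score(
    freqs: Sequence[Optional[int]], max_missing: float = 0.20
) -> Optional[float]:
    """Count of endorsed behaviors, log-transformed as log(count + 1)."""
    n_missing = sum(f is None for f in freqs)
    # More than 20% of items unanswered -> summary score is missing.
    if n_missing / len(freqs) > max_missing:
        return None
    # Dichotomize each item: 0 = never, 1 = one or more times.
    count = sum(1 for f in freqs if f is not None and f > 0)
    # Adding 1 keeps log() defined for youth who endorse no behaviors.
    # ASSUMPTION: natural log; the source does not specify the base.
    return math.log(count + 1)


# Hypothetical youth who endorsed 2 of the 16 behaviors.
print(delinquency_score([0] * 14 + [2, 3]) == math.log(3))  # True
```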

### Statistical Analyses

The data analysis for this paper was generated using SAS software, version 9.3 for Windows. Copyright © 2002–2010, SAS Institute Inc. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc., Cary, NC, United States. Descriptive statistics and preliminary analysis of demographic differences were calculated using standard chi-square tests, t-tests, and Pearson correlations. Primary analyses used hierarchical multiple regression to test the effects of pet ownership and attitudes toward pets on youth outcomes. Because the sample consists of a subsample of sibling pairs, regression analyses were conducted using linear mixed models in SAS PROC MIXED. Mixed level models take into account the clustering of siblings within families by including family ID as a random effect, while all predictors are modeled as fixed effects. All regression models described in the results adjusted for the non-independence of the sample. Separate analyses were conducted for pet ownership versus attitudes toward pets, and separate analyses were conducted for each of the four youth outcomes. Both unstandardized (b) and standardized (β) regression coefficients are reported, as the latter further serves as a measure of effect size, roughly equivalent to Cohen's d.
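
One way to write the random-intercept model described above, for youth i nested in family j, is given below. The notation is illustrative and does not appear in the original analysis: x is the predictor of interest (pet ownership or attitudes toward pets), the z terms are the demographic covariates that enter only in the second step, and the normality of the family-level and residual terms is the usual linear mixed model assumption rather than something stated in the source.

$$
y_{ij} = \beta_0 + \beta_1 x_{ij} + \sum_{k} \gamma_k z_{kij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
$$

Here the family-level term u_j is shared by siblings from the same family, which is what allows the standard errors of the fixed effects to reflect the non-independence of sibling pairs.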

## RESULTS

### Descriptive Information

#### Missing Data


Of the N = 378 youth who participated in the study, 31 youth had missing data on the PAS-M and 5 youth had missing data on pet ownership, resulting in a sample N = 342 for statistical analysis. Between 2 and 6 youth were missing data on each outcome, resulting in small differences in sample size across analyses.

#### Demographic Characteristics of Youth

The sample was approximately evenly divided across gender (48.0% male), with a mean age of 14.05 years (SD = 1.77; range 9–19). Over half of the sample identified as Hispanic or non-Caucasian, including N = 64 Hispanic (18.7%), 121 Black (35.4%), 6 Asian (1.8%), one each American Indian/Alaskan Native and Native Hawaiian/Other Pacific Islander, and 23 youth who reported more than one race (6.7%). Racial/ethnic categories were combined for comparison of minority (63.2%) versus non-Hispanic White (36.8%) youth. The majority (89.8%) of youth lived with their biological mother, and 27.2% lived in single-parent homes. There was a wide range of socioeconomic backgrounds (Mean SES = 45.17, SD = 13.79, range 9–66).

#### Pet Ownership

Of the 342 youth, 226 (66.1%) currently lived with one or more pets. Dog ownership was most prevalent (N = 159, 46.5% of the total sample), followed by cat ownership (N = 73, 21.3%) and small pet ownership (N = 58, 17.0%). These estimates are largely consistent with figures based on population-based samples (American Pet Products Association [APPA], 2018). For analytic purposes, the sample was divided into current dog and/or cat "owners" (N = 202, 59.1%) versus "non-owners," i.e., youth living with no pets or only small pets (N = 140; 40.9%).

Of the 202 owners, 114 lived with dog(s) only, 34 lived with cat(s) only, 20 lived with both dog(s) and cat(s), 15 lived with dog(s) and other small pets, 9 lived with cat(s) and other small pets, and 10 lived with dog(s), cat(s), and other small pets. Of the 140 non-owners, a minority (N = 24, 17.1%) currently lived with small pets while the remainder were not currently living with any type of pet. Moreover, almost half of the non-owner group (N = 60, 42.9%) had not lived with any type of pet for the past 10 years. Of the N = 24 youth living with small pets, the majority (N = 19, 79.2%) reported living with fish. Indeed, exactly half (N = 12) were living with fish and no other small pets.

### Preliminary Analyses

Chi-square tests and t-tests indicated that the N = 36 youth excluded due to missing data did not differ significantly from the N = 342 included youth on gender, minority racial/ethnic background, age, family SES, pet ownership, or on any of the four youth outcomes (all p > 0.10, results available from first author).

Girls (61.2%) were slightly more likely than boys (56.7%) to live with a dog or cat, but the gender difference was not statistically significant (χ<sup>2</sup> = 0.72, df = 1, p = 0.39). Owners and non-owners did not differ in age (M = 13.96, SD = 1.75 for owners; M = 14.18, SD = 1.80 for non-owners, t(340) = 1.14, p = 0.25). There were significant differences between owners and non-owners in youth racial/ethnic background (χ<sup>2</sup> = 39.53, df = 1, p < 0.001) and family SES (t(340) = 5.09, p < 0.001). Minority youth were less likely to own pets than White youth (46.3% versus 81.0%, respectively) and non-owners had lower family SES (M = 40.77, SD = 14.05) than owners (M = 48.22, SD = 12.78).
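
The test statistics in this paragraph can be approximately reproduced from the reported summary figures. The sketch below is a plausibility check, not the authors' SAS analysis, and it rests on two assumptions that are not in the source: the 2 × 2 cell counts are reconstructed from the rounded percentages (for example, 164 boys and 178 girls from 48.0% male of N = 342), and the chi-square is computed without a continuity correction. The function names are hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def pooled_t(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> float:
    """Independent-samples t statistic with pooled variance (df = n1 + n2 - 2)."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


# ASSUMED cell counts (owners, non-owners), reconstructed from rounded percentages:
# boys 93 / 71 (56.7% of 164), girls 109 / 69 (61.2% of 178).
gender_chi2 = chi_square_2x2(93, 71, 109, 69)
# minority 100 / 116 (46.3% of 216), non-Hispanic White 102 / 24 (81.0% of 126).
race_chi2 = chi_square_2x2(100, 116, 102, 24)
# Family SES: owners (N = 202) versus non-owners (N = 140), as reported.
ses_t = pooled_t(48.22, 12.78, 202, 40.77, 14.05, 140)

print(round(gender_chi2, 2), round(race_chi2, 2), round(ses_t, 2))  # 0.72 39.53 5.09
```

Under these assumptions the reconstructed values agree with the reported χ<sup>2</sup> and t statistics to two decimal places.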

There was no gender difference in self-reported attitudes toward pets (M = 5.48, SD = 0.91, for females; M = 5.45, SD = 0.93, for males, t(340) = 0.35, p = 0.73). Minority youth reported significantly less positive attitudes toward pets than Caucasian youth (M = 5.31, SD = 0.95, for minority youth; M = 5.73, SD = 0.80, for Caucasians, t(340) = 4.12, p < 0.001). Age and family SES had modest, albeit significant, associations with attitudes, with positive attitudes toward pets decreasing with age (r = −0.12, p = 0.02) and increasing with higher family SES (r = 0.12, p = 0.02). Finally, owners reported significantly more positive attitudes toward pets than non-owners (M = 5.69, SD = 0.83 for owners; M = 5.14, SD = 0.94 for non-owners, t(340) = 5.66, p < 0.001).

### Correlations Among Study Outcomes and Predictors

**Table 1** presents simple Pearson correlations between main study predictors and outcomes. Note that p-values for these correlations are not adjusted for clustered observations of siblings within families. Pet ownership and attitudes toward pets were moderately correlated (r = 0.29, p < 0.001). There were some significant correlations among the four study outcomes, although most were modest in size, ranging in magnitude from −0.10 to +0.38. Pet ownership was significantly correlated with higher empathy (r = 0.14, p = 0.008) and lower delinquency (r = −0.14, p = 0.01). Attitudes toward pets was positively correlated with empathy (r = 0.31, p < 0.001) and inversely correlated with delinquency (r = −0.22, p < 0.001) and depression (r = −0.13, p = 0.02). The correlation between attitudes toward pets and prosocial behavior was smaller (r = 0.10) and was significant only at trend level (p = 0.08). Overall, correlations of youth outcomes with attitudes toward pets were stronger in magnitude than the respective correlations with pet ownership.

### Pet Ownership and Youth Outcomes

A hierarchical series of mixed level regression models was used to test whether pet ownership was associated with youth socioemotional outcomes. In the first set of models, pet ownership was entered as the sole fixed-level predictor in a simple regression, with family ID entered as a random effect to adjust standard errors and significance tests for the correlated observations. Separate models were run for each outcome. Next, models were re-run including youth gender, age, race/ethnicity, and family SES as covariates.

In the first set of regression models, pet ownership was significantly associated with lower delinquency (b = −0.20, SE = 0.08, β = −0.29, t(121) = −2.40, p = 0.012) and higher empathy (b = 0.12, SE = 0.05, β = 0.28, t(122) = 2.46, p = 0.015). Once demographic covariates were included in the second set of models, effects of pet ownership were not significant for any of the four outcomes.

#### TABLE 1 | Pearson correlations among study predictors and outcomes.

Pet ownership is a binary variable with 1 = current ownership. Delinquency is shown with log-transformed scores. ∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05, #p < 0.10.

To determine whether results were influenced by the definition of pet ownership, we ran additional post hoc models comparing current dog owners (N = 159) with the N = 60 youth who had not owned pets in the past 10 years. With demographic factors included in the models, effects using this more extreme definition of pet ownership/non-ownership were not significant for any of the four outcomes.

### Attitudes Toward Pets and Youth Outcomes

**Table 2** shows the results from the mixed level regression models used to test whether attitudes toward pets was associated with youth socioemotional outcomes. As above, analyses were run in two steps, without and with demographic covariates. In addition to gender, age, race/ethnicity, and family SES, pet ownership was also included in the second step. Both unstandardized and standardized regression coefficients are shown to enable comparison of estimates within and across models. In models without covariates, attitudes toward pets was significantly associated with all four outcomes. The strongest effects were seen for empathy (β = 0.31, p < 0.001) and delinquency (β = −0.22, p < 0.001), with more modest effects on prosocial behavior (β = 0.12, p = 0.021) and depressed mood (β = −0.12, p = 0.035). More positive attitudes toward pets were associated with greater empathy and prosocial behavior and with less delinquency and depressed mood. After including demographic covariates, the associations of attitudes toward pets with empathy (β = 0.27), delinquency (β = −0.18), and prosocial behavior (β = 0.11) remained significant at p < 0.05, while the association with depressed mood (β = −0.08) became non-significant (p = 0.14). Pet ownership was not significantly associated with any of the four outcomes (all p > 0.20).

### Effects of Attitudes Toward Pets Across Different Ecological Niches

We examined whether the relationship between attitudes toward pets and child outcomes differed by gender, racial/ethnic background, or for owners versus non-owners, and whether age or family SES moderated the associations. Moderating effects were tested by including an interaction term between attitudes toward pets and each of the five demographic characteristics. All models were run as mixed level models and controlled for correlations among family members. Each model contained all of the demographic factors (including pet ownership) as main effects, attitudes toward pets as a main effect, and a single interaction term. Models tested each interaction separately, and separate models were run for each outcome. For continuous measures of age and family SES, both attitudes toward pets and the continuous demographic factors were centered prior to creating interaction terms.
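
The centering step for the continuous moderators can be illustrated with a short sketch. This is a generic illustration of mean-centering before forming a product term, not the authors' SAS code; the function names and the four data values are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def center(values: Sequence[float]) -> List[float]:
    """Subtract the sample mean so the centered variable has mean zero."""
    m = mean(values)
    return [v - m for v in values]


def interaction_term(x: Sequence[float], z: Sequence[float]) -> List[float]:
    """Product of two mean-centered variables, one value per participant."""
    return [xc * zc for xc, zc in zip(center(x), center(z))]


# Hypothetical values for four youth (not data from the study).
attitudes = [4.0, 5.0, 6.0, 7.0]
age = [10.0, 12.0, 14.0, 16.0]
print(interaction_term(attitudes, age))  # [4.5, 0.5, 0.5, 4.5]
```

Centering leaves the interaction coefficient unchanged but makes each main effect interpretable as the association at the sample mean of the other variable, and it reduces the correlation between the product term and its components.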

Out of the 20 different models (5 interactions × 4 outcomes, results not shown), none of the interaction terms was statistically significant (all p ≥ 0.17). To ensure that lack of power to detect statistical interactions did not obscure any meaningful patterns across subgroups, we also ran separate models for each outcome in each subgroup so that we could compare the magnitude of the association between attitudes toward pets and youth outcomes across subgroups. Each regression model included attitudes toward pets and all four of the remaining demographic factors. For example, in addition to attitudes toward pets, the regressions run separately for boys and girls included minority racial/ethnic background, pet ownership, age, and family SES as covariates. For family SES, we used a median split to define subgroups of low versus high SES. For age, separate regressions were run for 9- to 12-year-olds, 13- to 14-year-olds, and 15- to 19-year-olds.

The effect sizes and 95% confidence intervals for the association between attitudes toward pets and child outcomes within each subgroup are presented in **Figure 1**. While there was some variation in effect sizes across subgroups, differences were modest and there was considerable overlap in confidence intervals.

## DISCUSSION

The current study provided a comprehensive examination of associations between pet ownership and attitudes toward pets with youth socioemotional outcomes. Strengths of the study include: the use of a moderately large, community-based sample of urban and suburban youth with substantial racial, ethnic, and socioeconomic diversity; comparison of results for pet ownership versus youth attitudes toward pets; rigorous measurement of current and past pet ownership obtained through detailed interviews with caregivers; consideration of multiple socioemotional and behavioral outcomes; the use of sophisticated statistical controls for potential demographic confounds; and a systematic comparison of effects across different subgroups of youth. Results support three main conclusions: (1) attitudes toward pets is a stronger predictor of youth outcomes than pet ownership; (2) effects are strongest for youth reports of empathy and delinquency compared with prosocial behavior and depressed mood; and (3) significant effects were found among youth across a wide range of demographic characteristics.

TABLE 2 | Results from regression models of attitudes toward pets predicting youth outcomes.

Bolded italic values reflect effects significant at p < 0.05.

### Main Effects

Results from this study add to a small, albeit growing body of work examining the impact of pets on child socioemotional development, and further shed some initial light on potential reasons for inconsistencies across prior studies. First, while there was an initial main effect of pet ownership on child empathy and delinquency, these effects became non-significant once controls for gender, age, minority race/ethnicity, and family socioeconomic status were considered. This underscores the importance of considering demographic confounds in research on pets, given that pet ownership is not randomly distributed across the population. At the same time, controlling for the effects of demographic confounds only slightly attenuated the associations between attitudes toward pets and youth outcomes, and the associations remained statistically significant for three of the four outcomes considered. This result is consistent with both prior theoretical and empirical work suggesting that the positive benefits of pet ownership are largely mediated through the emotional bonds that humans form with animals (Garrity et al., 1989; Friedmann et al., 1993; Sable, 1995; Collis and McNicholas, 1998; Carlisle-Frank and Frank, 2006; Barker et al., 2010; Melson, 2010; Julius et al., 2012; Freund et al., 2016; Purewal et al., 2017). However, it should be noted that the effect sizes for attitudes toward pets were small, ranging in absolute magnitude from β = 0.11 to β = 0.27, after controlling for demographic confounds.

We found the strongest association between attitudes toward pets and child-reported empathy, consistent with a recent empirical review of existing studies on pet ownership and child outcomes (Purewal et al., 2017). In addition, this association was relatively robust across different subgroups of youth in the study. Interestingly, we also found a relatively strong association between attitudes toward pets and youth delinquency. To our knowledge, this may be the first reported significant association between pet-related measures and adolescent externalizing behaviors in a non-clinical sample, although we note that studies examining the impact of introducing pets in classrooms have reported decreases in disruptive behaviors (e.g., Hergovich et al., 2002; Kotrschal and Ortbauer, 2003; Tissen et al., 2007; O'Haire et al., 2013).

We found the smallest effects of attitudes toward pets on prosocial behavior and depressed mood, and the association with depressed mood was not statistically significant once demographic factors were considered. For depressed mood, this result may indicate that relationships with pets only affect certain kinds of emotional problems. For example, a recent study reported that pet ownership was associated with lower screening anxiety scores in a sample of rural children, but not with a broader measure of youth socioemotional difficulties (Gadomski et al., 2015). Likewise, Vidović (1999) reported that attachment to pets was associated with child empathy and prosocial behavior in a large sample of Croatian adolescents, but was not significantly associated with anxiety or loneliness. Given the small number of studies that have focused specifically on measures of pediatric anxiety and depression, more work is needed before drawing any firm conclusions about inconsistency of results across different child emotional outcomes. Moreover, we note that the measure of depressed mood used in this study assessed mood experienced during the past week, while the measures of empathy, prosocial behavior, and delinquency encompassed a broader time frame. Rates of depression were also relatively low in our community-based sample. These factors could have reduced potential associations. Studies that examine pet-related variables in relationship to emotional problems among clinical samples may shed further light on these issues.

For prosocial behavior, the smaller association in comparison to empathy and delinquency was unexpected, given that previous theoretical and empirical work has posited a direct link between attachment to pets, social support, and prosocial behaviors (Collis and McNicholas, 1998; Vidović, 1999; Melson, 2010; Julius et al., 2012; Freund et al., 2016; Purewal et al., 2017). We note that the measure of prosocial behavior we used was based on parent report, and was specific to parent observations of youth prosocial behaviors toward peers, which is a fairly narrow definition of prosocial behavior. Given that this was predominantly a middle- and high-school aged sample, it is possible that child report of a wider range of prosocial behaviors would have been a more appropriate outcome.

### Moderating Effects

This study is one of a handful to systematically explore whether associations between pets and child outcomes were consistent across different demographic subgroups. None of the 20 different interaction terms reached statistical significance (all p > 0.15), and there was considerable overlap in the 95% confidence intervals for effect sizes across the different subgroups. This suggests that the positive benefits of pets may transcend individual differences in demographic characteristics. While a larger sample may have revealed statistically significant differences between subgroups, our study demonstrated that differences in effect sizes across subgroups were relatively small, and therefore unlikely to be of meaningful importance.

### Causal Inferences

Although the data from this study are cross-sectional, results may speak to issues of causality. Specifically, there was a significant association between youth self-reports of attitudes toward pets and empathy among youth who did not currently live with a cat or a dog. Indeed, associations between attitudes toward pets and empathy were largely significant across all subgroups examined, even after controlling for demographic factors. Additional post hoc analyses (available from first author) indicated that the Pearson correlation between attitudes toward pets and empathy among the 60 youth who had not lived with any type of pet in the past 10 years was r = 0.27 (p = 0.03). Even after age, gender, minority racial/ethnic background, and family SES were included in the model and standard errors adjusted for correlated observations, the effect for attitudes toward pets in this subgroup was significant at a trend level [F(1,19) = 3.29, p = 0.09]. These results suggest that youth with higher levels of empathy might be more likely to desire pets and to form stronger emotional bonds with pets than youth with lower empathy. This may also be true of other measures of child social and emotional competency, such as self-esteem. On the other hand, we cannot rule out the hypothesis that youth who do not live with pets but who really like pets seek out other opportunities to interact with animals outside the family, which could have a causal effect on empathy.
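As a rough check on the post hoc correlation reported above, the standard t-test for a Pearson correlation can be sketched as follows. This is an illustration only, not necessarily the procedure the authors used; the function name `pearson_t` is hypothetical, and the inputs are the rounded values reported in the text (r = 0.27, n = 60).

```python
import math

def pearson_t(r, n):
    """Return the t statistic and degrees of freedom for testing
    whether a Pearson correlation r (from n pairs) differs from zero."""
    df = n - 2
    t_stat = r * math.sqrt(df) / math.sqrt(1.0 - r**2)
    return t_stat, df

# Rounded values as reported in the text (assumption: r is exact to two decimals).
t_stat, df = pearson_t(r=0.27, n=60)
print(f"t({df}) = {t_stat:.2f}")  # t(58) = 2.14
```

The resulting statistic exceeds the two-tailed 5% critical value of about 2.00 for 58 degrees of freedom, which is in line with the significance level reported for this subgroup, allowing for rounding of r.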

### Future Research

The fact that the majority of research on the impact of pets on socioemotional and behavioral outcomes in both child and adult samples is based on cross-sectional studies is a significant limitation of the field. While studies that include demographic covariates associated with pet ownership can help control for selection factors, many of the associations of children's attitudes toward and emotional bonds with pets with children's social, emotional, and behavioral outcomes are likely to be bidirectional. For example, children who have higher empathy and show more prosocial behavior may be more naturally inclined to form close bonds with pets. Conversely, aggressive children may find it more difficult to form successful relationships with pets, especially if the pet is fearful of the child. Longitudinal studies may help to disentangle the causal nature of these associations, especially if children can be assessed before and after the acquisition of a new pet.

Studies that use within-family designs are also under-utilized in the field of human–animal interactions. Samples that include more than one child per family could shed light on both similarities and differences among children within the same family, and could identify the specific child characteristics that impact the development of emotional bonds with pets. Behavioral genetic designs can further determine the extent to which associations between emotional bonds with pets and outcomes are driven by shared genetic factors. At present, there is only one published study that used a genetically informative design to investigate genetic influence on a pet-related measure. This study found that self-reports of frequency of playing with pets among a middle-aged, male twin sample had a heritability of h<sup>2</sup> = 0.29–0.37, indicating that genetic factors, which are likely mediated through individual differences in personality and related traits, play a role in establishing bonds with pets (Jacobson et al., 2012). Surprisingly, the effects of shared environmental influences, which would include childhood exposure to pets, accounted for less than 10% of the variance in pet play during adulthood. This finding may call into question the causal implications of prior research showing that childhood pet ownership predicts both pet ownership patterns (Serpell, 1981; Westgarth et al., 2010) and strength of emotional bonds with pets in adulthood (Kidd and Kidd, 1989; Ellingsen et al., 2010).

### LIMITATIONS AND CONCLUSION

Results from the current study should be considered in the context of several limitations. First, attitudes toward pets and outcome measures are based predominantly on youth self-report. Thus, we cannot rule out the hypothesis that factors such as social desirability could account for some of the associations. However, social desirability would not account for the differential patterns of effects seen across outcomes. Second, the study focused on general attitudes toward pets, rather than specific emotional bonds with pets. This is because we wanted to directly compare the distal effects of pet ownership with the more proximal, emotional impact of pets, and we needed measures that could be administered to both pet-owning and non-pet-owning youth. The study did obtain measures of emotional bonds with pet dogs from dog-owning youth, and there was substantial overlap between attitudes and emotional bonds among the 154 dog-owning youth who had non-missing data on both measures (r = 0.64, p < 0.001). Nevertheless, we might have found stronger associations with youth outcomes if we had focused on emotional bonds with specific family pets. Third, results may have been confounded by definitions of pet ownership. In particular, sample sizes of youth who lived only with cats and/or small pets were too small to be considered individually. The fact that our findings largely replicated when we used a stricter comparison of current dog owners to youth who had not owned any pets in the past 10 years suggests that our results are not biased by our definition; however, larger sample sizes with greater diversity in pet ownership patterns are needed to explore this question more thoroughly. Fourth, while our sample contained relatively large numbers of Hispanic and Black youth, sample sizes were too small to determine whether the associations of ownership and attitudes with youth outcomes differed between these two racial/ethnic groups, so youth were combined into a binary variable of minority versus non-minority youth. We have examined racial and ethnic differences in pet ownership and attitudes toward pets in more detail in a separate manuscript, and results suggest that Hispanic and Black youth in this sample show similar patterns, and that both groups show significant differences in comparison to non-Hispanic Caucasian youth (Jacobson and Daly, unpublished). However, we do not know if results would generalize to American youth from other racial and ethnic groups, such as Asian-American or Native American youth. Finally, our results may not generalize to other populations. Specifically, pet ownership patterns and demographic correlates of pet ownership vary somewhat between the United States and other countries, so inconsistencies between our results and prior, large-scale studies conducted in other countries could be due to cultural factors. Hispanic youth living in America may also differ from Hispanic youth living in other countries.
Cross-cultural studies are needed to disentangle the effects of culture from racial/ethnic background. In addition, our sample is drawn from a predominantly urban and suburban population. Thus, results may not generalize to rural youth. Finally, results are based on a community-based sample. While we did not exclude youth with emotional and behavioral problems from this study, it is possible that the positive benefits of pet ownership, attitude toward pets, and emotional bonds with pets would be greater among patient populations, or among other populations of vulnerable youth.

Despite these limitations, this is one of the only studies to obtain measures of both pet ownership and attitudes toward pets as well as a wide range of socioemotional outcomes in a diverse sample of youth. Our results indicate that pet ownership, per se, is unrelated to child outcomes once demographic factors associated with ownership are accounted for. At the same time, controlling for demographic factors had limited impact on the magnitude of associations between attitudes toward pets and child outcomes, and these associations were largely consistent across different subgroups of children. Thus, our study contributes to a growing body of research suggesting that pets may have a positive, albeit modest, impact on children.

### DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservations, to any qualified researcher.

### AUTHOR CONTRIBUTIONS

KJ is the principal investigator of the overall project, conceived the study aims and hypotheses, collected and analyzed the data, and wrote the manuscript. LC assisted with literature review.

### FUNDING

This work was supported by two research grants to KJ. This work was funded by the National Institutes of Health through the NIH Director's New Innovator Award Program, Grant number DP2-OD003021. In addition, the project described was supported by Grant Number R03-HD066598 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development and Mars-Waltham. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Eunice Kennedy Shriver National Institute of Child Health and Human Development, the National Institutes of Health, or Mars-Waltham.

### ACKNOWLEDGMENTS

We would like to acknowledge current and former staff at the University of Chicago Clinical Neuroscience and Psychopharmacology Research Unit (CNPRU), especially Ms. Crystal Johnson, Dr. Kristen Jezior, and Ms. Bing Chen, for their assistance with data collection for this project. We further thank the youth and the families in the "Neighborhoods to Neurons and Beyond" cohort for participating in this research.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Jacobson and Chang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# To a Future Where Everyone Can Walk a Dog Even if They Don't Own One

Eunice Y. Chen\*

*Department of Psychology, Temple University, Philadelphia, PA, United States*

Keywords: animals, dogs, walking, overweight, students, humans, animal-assisted therapy

Like many children, I desperately wanted a dog, but because of restricted financial circumstances I did not have the good fortune of owning one as a child. However, research may help ensure that lack of dog ownership need not be a barrier to spending time with and walking dogs.

The US Department of Health and Human Services recommends walking 5 days a week for at least 45 minutes at a time, with 30 minutes at a moderate to brisk pace of 3–4 miles per hour and 15 minutes at a very brisk walking pace of 5–6 miles per hour (1). This level of physical activity is associated with a decreased risk of mortality and cardiorespiratory disease, an increased likelihood of weight loss, and improvements in musculoskeletal health (2). Only about half of Americans engage in this recommended amount of physical activity (3). But if every American had a healthy dog and walked it regularly, they would be more likely to achieve these recommended levels of activity (4).

#### Edited by:

*Peggy D. McCardle, Consultant, New Haven, CT, United States*

#### Reviewed by:

*Carri Westgarth, University of Liverpool, United Kingdom*

#### \*Correspondence:

*Eunice Y. Chen eunice.chen@temple.edu*

#### Specialty section:

*This article was submitted to Children and Health, a section of the journal Frontiers in Public Health*

Received: *15 April 2018* Accepted: *09 November 2018* Published: *30 November 2018*

#### Citation:

*Chen EY (2018) To a Future Where Everyone Can Walk a Dog Even if They Don't Own One. Front. Public Health 6:349. doi: 10.3389/fpubh.2018.00349*

Multiple studies show that dog ownership improves human health. Becoming a dog owner increases physical activity (4–7) and walking (8–10), reduces weight (11), and decreases the odds of diabetes, hypertension, hypercholesterolemia, and depression (12–15). Dog ownership reduces predictors of negative cardiovascular outcomes like blood pressure (15–19), triglycerides (20, 21), and stress (22–25). Indeed, owning a dog increases the likelihood of surviving a heart attack (26–28), such that even the American Heart Association advocates dog ownership as a way to reduce the risk of cardiovascular disease (29). Australian, German, and Chinese studies show that pet ownership decreases doctor visits, and reduces the likelihood of cardiac problems and sleeping difficulties (30–33). Interventions with dogs improve outcomes for children, adolescents, and adults with a range of medical and psychological problems including post-traumatic stress disorder, developmental disabilities, schizophrenia, autism-spectrum disorders, and cancer (34–38). In a short period of time, human-animal studies have progressed from small experimental studies to studies assessing the public health impact of dogs on human lives, including increasing human physical activity; for reviews, see (6, 34–43).

Despite the double challenge of conducting clinical trials with humans and animals (44), randomized controlled trials are needed in this field. For instance, in considering the needs of both dogs and humans, human-animal trials require approval from both human and animal institutional review boards. However, the randomized controlled trial is the gold standard for the assessment of intervention efficacy because it most effectively and efficiently evaluates an intervention's effect, eliminating systematic and random bias (45).

A search using the terms "randomized controlled trial" AND "dogs" AND "walking" resulted in 40 hits in PubMed up to 4/15/2018. When studies with groups with psychological or medical problems were excluded, this yielded five randomized controlled trials (46–51) that examine the effects of dog-walking on human walking; see **Table 1**. An effect size could not be calculated for one of the five randomized controlled trials. Of the remaining four, two had moderate to large effects (Hedges g) for the dog-walking intervention arm, and two had small effects.


Calculation for Hedges g: $g = (m_1 - m_2)/s^*$, where $m_1$ = baseline mean, $m_2$ = mean at the second timepoint, and $s^* = \sqrt{[(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2]/(n_1 + n_2 - 2)}$, where $n_1$ = baseline sample size and $n_2$ = sample size at the second timepoint, calculated on the intervention arm of interest. In **Table 1**, n = number of participants in the intervention arm of interest, N = number of participants in the whole study. NA = not available.
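The Hedges g calculation given in the note above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, s1 and s2 are taken to be the standard deviations at baseline and at the second timepoint, the input values are invented for the example and do not come from any of the trials in Table 1, and no small-sample correction factor is applied because the note does not mention one.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation s* across the two timepoints."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def hedges_g(m1, m2, s1, s2, n1, n2):
    """Effect size (m1 - m2) / s*, following the formula in the table note."""
    return (m1 - m2) / pooled_sd(n1, s1, n2, s2)

# Hypothetical values for illustration only.
g = hedges_g(m1=10.0, m2=8.0, s1=2.0, s2=2.0, n1=20, n2=20)
print(round(g, 2))  # 1.0
```

Because the formula subtracts the second-timepoint mean from the baseline mean, an outcome that increases over time (for example, minutes walked) yields a negative value, so the sign should be read with the direction of the outcome in mind.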

These randomized controlled trials focus on dog owners walking their own dogs. However, this focus may limit the extent to which dog-walking interventions can be "scaled-up" or implemented on a more widespread basis (52).

Certain correlates may distinguish dog owners from non-dog owners. For instance, a large US study showed that dog ownership is associated with being white, with home ownership, and with living in a house (53). A review of the literature also suggests that living close to places where dogs can be walked increases the likelihood of owning a dog (41). The implication of this is that racial diversity, renting rather than owning a home, living in an apartment, and possibly socioeconomic disadvantage may decrease the likelihood of dog ownership. So although 44% of households in the US (2015–2016) are estimated to own a dog, the majority of households do not (54).

But does lack of dog ownership have to be a barrier to dog-walking interventions? One of the foreseeable challenges for dog-walking interventions targeting human physical activity is working out how to scale this intervention to individuals who do not own a dog. There are numerous online media reports of shelter dog-walking programs, even phone applications for walking shelter dogs. However, trials published in peer-reviewed journals are scant (55, 56). One small open trial with public housing residents showed that overweight individuals who borrowed and walked dogs from a dog shelter had small (Hedges g = 0.17) but significant weight loss (57). This suggests that pairing individuals who do not own a dog with dogs in rescues or shelters may be a feasible weight loss solution.

Designing behavioral interventions that can "scale-up" is increasingly becoming an important criterion for the success of an intervention (58, 59). One of the ways that this critical barrier can be addressed is by considering mutually beneficial partnerships between dog shelters/rescues and other institutions, some of which may seem improbable at first.

An important start is to take into account both the socio-ecological structure and function of institutions. Drawing on Bronfenbrenner's work (60), Westgarth et al. (41) describe the structure of a socio-ecological model of dog-walking that highlights the individual sphere of influence and its dog-related factors, as well as more distal social-environmental and physical-environmental factors that are associated with dog-walking (6). This can add to a consideration of the functions of different institutions and how these can promote healthy behaviors in both individuals and institutions.

Human-dog relationships have been described as a form of social capital, which is defined as an "investment in social relations with expected returns" (61–63). At the system level, mutually reinforcing, sustainable partnerships can be formed between institutions to improve the health of both humans and animals (62, 63). Social capital requires the utilization of resources embedded in a social structure, accessibility to these resources, and their mobilization for purposive action, e.g., improved health for both humans and animals (62, 63). Increasing physical activity is a serious public health challenge, as is ensuring that homeless dogs are cared for. A model of human-dog interventions that considers the function as well as the structure of institutions and individuals is empowering and self-sustaining, and has the potential to build scalable, sustainable interventions.

Everyone who does not own a dog could probably benefit from walking a dog daily. However, there may be certain institutions that could provide the other half of a mutually beneficial partnership. For instance, with appropriate supervision, various institutions could offer dog-walking opportunities: health care institutions like rehabilitation centers, half-way houses, group homes or elder care facilities; educational institutions, such as colleges or schools; even insurance companies.

Is it possible to do something like this? Recently, undergraduates at Temple University undertook this endeavor. Temple University is a large urban college in Center City Philadelphia consisting of a socioeconomically and racially diverse group of more than 30,000 undergraduates. After a review of the literature (64) over the summer of 2017, one of my lab undergraduates sent an informational Facebook message on July 18th, 2017 to the incoming freshman class, and in one day 22 incoming freshmen posted their interest (and photos of their dogs) in joining a volunteer dog-walking association, with another 56 freshmen signing up. In 10 days, 172 incoming freshmen posted their interest in joining the volunteer dog-walking association. On December 6th, 2017, the "Diamond Dogs" undergraduate dog-walking association was formally ratified by the university, with support from the Dean's office, and since then there have been three town hall meetings of ∼70 students at a time. "Diamond Dogs" is partnered with two inner-city dog rescues approximately a mile from Temple University's campus. My last communication with the rescues was that students are participating in their orientations and training and are walking their dogs. From the perspective of dog rescues, this is a "win-win." Inner-city rescues in Philadelphia find it difficult to recruit regular dog-walkers, but a schedule of regular volunteer walkers has eased their need to play with and walk their healthy dogs daily.

Humans form strong attachments to pets, particularly dogs (65–68), and pets increase social capital by creating more social connections and networks (61, 69, 70). But will people who don't own a dog develop an attachment to a dog at a shelter or rescue that may leave in a couple of weeks because it has been adopted? Are the positive cardiovascular effects in response to stressful situations also found in non-dog owners who interact with shelter or rescue dogs (71)? Can shelter/rescue dog-walking interventions increase the likelihood that walkers will eventually adopt a dog? Can cortisol or immunological measures be used to check that shelter/rescue dogs are experiencing less stress with regular walkers (72)? While these are important research questions, there are also important ethical and practical issues to consider in research of this nature.

Rescue and shelter dogs often have greater psychological and medical needs than dogs not in a shelter or rescue. Volunteers at shelters are likely to be pet owners (73), and shelters and rescues are likely to prefer experienced dog owners rather than non-dog owners to walk their dogs. Guidelines from the Association of Shelter Veterinarians (74), the American Veterinary Medical Association (75), and the Humane Society (76) highlight the importance of training and supervising shelter and rescue volunteers, including basic training in animal handling and bite prevention. However, research addressing the training of non-dog owner volunteers is in its infancy (77, 78). Some questions that will need to be answered include: how much training do non-dog owner volunteer walkers need to achieve the skill level of experienced dog owners? How are these skills best assessed? And what are the effects of volunteers with varied experience on shelter dogs?

Assessing the efficacy of shelter dog-walking interventions for non-owners deserves the best designs and methods. This means that non-peer-reviewed reports by the media are not sufficient as evidence for the efficacy of interventions like this, and the use of phone applications to encourage the walking of shelter dogs warrants thoughtful and rigorous testing prior to widespread use or claims about their effect. Clinical trials with shelter dogs and humans need to meet both Human Institutional Review Board and Institutional Animal Care and Use Committee requirements. The design and oversight of these trials require the partnership of human clinical trial experts, human-animal intervention research experts, veterinarians, and shelters/rescues. There are not only ethical issues that must be considered but also legal issues in clinical trials of this nature. For instance, animal rescues in the United States are governed by state, county, and city ordinances that may differ between rescues. Careful attention to these ordinances is needed in rescue dog-walking intervention trials.

It is important to conduct clinical trials to test whether it is possible to engage educational and social institutions in mutually reinforcing partnerships to improve the health of people and animals. The proposed shelter dog-walking intervention can potentially make dog-walking a scalable intervention and has broad applicability to a wide variety of institutions and partnerships.

Despite the challenges, I'm looking forward to a future where everyone can walk a dog even if they don't own one.

### AUTHOR CONTRIBUTIONS

EC conceived of this idea and opinion and wrote this paper.



**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Adolescents' Affective and Physiological Regulation Shape Negative Behavior During Challenging Equine Assisted Learning Activities

#### Patricia Pendry\*, Alexa M. Carr and Jaymie L. Vandagriff

Department of Human Development, Washington State University, Pullman, WA, United States

#### Edited by:

Peggy D. McCardle, Haskins Laboratories and Peggy McCardle Consulting, LLC, United States

#### Reviewed by:

Federica Pirrone, Università degli Studi di Milano, Italy Nina Ekholm Fry, University of Denver, United States

#### \*Correspondence:

Patricia Pendry ppendry@wsu.edu

#### Specialty section:

This article was submitted to Veterinary Humanities and Social Sciences, a section of the journal Frontiers in Veterinary Science

Received: 16 April 2018 Accepted: 09 November 2018 Published: 04 December 2018

#### Citation:

Pendry P, Carr AM and Vandagriff JL (2018) Adolescents' Affective and Physiological Regulation Shape Negative Behavior During Challenging Equine Assisted Learning Activities. Front. Vet. Sci. 5:300. doi: 10.3389/fvets.2018.00300

This study examined associations between adolescents' (N = 59; M age = 11.63) diurnal and momentary activity of the Hypothalamic Pituitary Adrenal (HPA) axis as marked by salivary cortisol, and affective and behavioral responses to their first, mounted equine assisted learning (EAL) activity. The introduction to riding occurred during the fifth week of an 11-week EAL program for at-risk and typically developing adolescents. Before the 11-week program began, participants collected 6 salivary cortisol samples at prescribed times (wakeup, 4 p.m., bedtime) over 2 days, from which indices of diurnal cortisol activity were derived. Six weeks later, on the day of their first mounted activity in week five, participants provided three salivary cortisol samples, reflecting their basal cortisol level at the end of their regular school day, and their cortisol levels linked to the beginning and end of their first ride. Participants reported on positive and negative emotion immediately before mounting the horse, and immediately after dismounting, using an 11-item survey. Using a 43-item checklist, three independent observers rated participants' behavior throughout the 90-min session. Regression analyses showed that adolescents with higher cortisol levels immediately before mounting reported higher levels of negative emotion (B = 0.350, p = 0.041) and lower levels of positive emotion (B = −0.697, p = 0.013), while basal levels and potential dysregulation of cortisol diurnal patterns were controlled. Greater cortisol reactivity in response to 10 min of riding was linked to higher negative (B = 2.95, p = 0.001), and lower positive emotion (B = −3.73, p = 0.007) after dismounting.
Higher levels of pre-ride negative emotion (B = 5.50, p = 0.046), and lower levels of post-ride positive emotion (B = −5.17, p = 0.027), and an increase in cortisol reactivity in response to riding (B = 0.242, p = 0.049), predicted higher levels of negative behavior during the 90-min session that day. These findings show that participants' HPA axis activity informs their program experience and behavior. Results suggest that EAL facilitators need to employ strategies to down regulate adolescents' physiological and affective arousal during mounted sessions to prevent and redirect negative behavior.

Keywords: equine facilitated learning (EFL), equine assisted learning (EAL), momentary emotion, HPA axis, cortisol, observed behavior, adolescents

## INTRODUCTION

Equine assisted learning (EAL), which combines experiential learning, interaction with equines, and life skills education to increase participants' affective, physiological, and behavioral regulation, has seen a significant increase in use and popularity; in 2016, the Professional Association of Therapeutic Horsemanship International (PATH Intl.) provided Equine Assisted Learning programs at 357 of its 881 member service centers, up from 185 in 2009 (1). Programs that incorporate EAL are appealing because they are well-suited as community or school-based prevention programs, require less training and expertise in comparison to psychotherapy (equine-assisted psychotherapy), and enjoy positive public perception. Also, a small but promising number of studies featuring causal designs suggest that participation in EAL has positive effects on adolescents' self-perceived social support (2), adolescent social competence and behavior (3, 4), and adolescents' basal activity of the Hypothalamic Pituitary Adrenal (HPA) axis, as measured by salivary cortisol levels (5).

At first glance, one might assume that increased use of EAL is informed by the prevailing model guiding preventive intervention research, the Preventive Intervention Research Cycle. According to this model, interventions are developed with a comprehensive theoretical and empirical understanding of the target issue, tested for efficacy under tightly controlled research conditions, then examined in real-world settings for effectiveness in broader populations, and finally disseminated for wide-spread implementation. It is a significant concern that this sequencing has not occurred for EAL, which is widely promoted and implemented, despite the fact that the number of causal studies is limited. In fact, little is known about whether, how, under which conditions, and for whom, EAL programs facilitate safe, efficient and effective improvement of affective, physiological, and behavioral regulation.

Based on a survey of EAL and Equine Assisted Therapy (EAT) methodologies, Nelson et al. (6) suggested that in order to move EAT and EAL into mainstream professional practice, the theoretical and empirical underpinnings of activities must be defined more clearly. In particular, given that there are no standardized, evidence-based curricula, widely-accepted implementation protocols, or even broad principles guiding EAL implementation, we do not know which EAL activities are essential to achieve desired effects on targeted outcomes in a given population. Furthermore, the most effective ways to enhance learning services through interacting with horses and the equine environment are not yet known. In fact, we do not know much about how riding, one of the most common and popular EAL activities, is best implemented to facilitate an effective experience for populations who vary in age, risk-status, regulatory ability, and prior horse exposure. Specifically, given that EAL programs often target at-risk populations, it would be helpful to understand the role of participants' affective and physiological regulatory abilities, which are commonly associated with risk status, in shaping their moment-to-moment experiences and responses to EAL activities, especially activities that challenge those abilities. Understanding the dynamic interplay between participants' regulatory characteristics and responses in the context of a mounted activity can help equine facilitators anticipate, recognize, respond to, and redirect signs of participants' arousal, cognitions, emotion, and behavior to enhance participants' subjective perceptions about EAL experiences, which may enhance treatment effects.

Among numerous indicators of physiological regulation, an examination of diurnal, basal, and momentary response patterns of the Hypothalamic Pituitary Adrenal (HPA) axis holds particular relevance for the field of EAL. Marked by cortisol, which can be measured conveniently and unobtrusively in naturalistic settings using salivary sampling, the HPA axis is one of the body's most relevant stress-sensitive systems on the basis of its connection to social, emotional, and psychological events (7). In the EAL context, events that may activate HPA axis activity include exposure to environmental stressors (e.g., riding a 1,200-pound animal, performing a new task with an audience) and social support systems (e.g., encouragement from staff and peers, horse responding to cues).

Typically, healthy individuals show a pronounced diurnal pattern of cortisol, in which levels are highest in the morning soon after waking, drop rapidly in the first few hours after waking, then continue to drop more slowly, reaching a low point around midnight (8, 9). In fact, time of day has been shown to account for ∼70% of the variation in cortisol levels (10). There is a substantial body of research on associations between diurnal cortisol activity and individuals' trait-like affective and behavioral characteristics. For example, in adolescents flatter cortisol slopes from wakeup to bedtime have been observed for males with high levels of neuroticism (11) and in youth who have high trait loneliness (12). Moreover, in a prospective longitudinal study, adolescents with a higher baseline Cortisol Awakening Response (CAR) are significantly more likely to experience an episode of major depression over the following year (13); yet, those who experience current or past major depressive episodes exhibit flatter cortisol curves. Depression itself is thought to further alter the functioning of the HPA axis (14), further capturing the dynamic interaction between affective experiences and HPA functioning over time.

Beyond diurnal patterns of cortisol production attributable to the time of day, the remaining variation in cortisol levels is conceptualized as cortisol reactivity, which reflects responses to momentary influences and events, including social or psychological stressors and supports. Examining momentary reactivity in the context of EAL is reasonable, as there is evidence to suggest that momentary changes in adolescent cortisol levels are associated with momentary, within-person changes in emotion states and social environments. For example, Adam (13, 14) found that momentary cortisol levels were higher than expected for that time of day at moments when individuals were experiencing negative emotion (e.g., anger, worry, stress), and lower when they were experiencing positive social emotions. This has direct relevance for the EAL context, as it is likely that EAL activities evoke negative and positive emotions in participants depending on the activity, the participant's characteristics, and the context in which the activity occurs.

Furthermore, there is evidence to suggest that the presence of social support from a trusted adult may buffer elevations in momentary cortisol during times of fear or emotional distress (15). Although adolescence marks a time when caregivers' capacity to buffer their children's HPA response to stressors appears to decline (16), due to increases in cognitive complexity adolescents are increasingly able to represent others, such as non-family adults and peers, as sources of support (17), providing opportunity for those involved in EAL, including facilitators, volunteers, and fellow participants, to play a role in the development of coping abilities in this context.

Whether adolescents' momentary mood states alter their levels of cortisol, or whether their hormonal states influence their momentary mood states, remains undetermined. For example, there is a large body of experimental evidence showing that exposure to stressful situations increases cortisol levels (18, 19), as well as evidence demonstrating that experimentally manipulated changes in cortisol are associated with changes in affective and cognitive state (20). It is thus likely that adolescent mood states and cortisol levels transact dynamically and continuously during EAL activities, particularly when adolescents experience a novel, potentially stressful event, such as engaging in a first mounted activity.

While individual differences in the size and duration of cortisol reactivity are thought to be relevant for affective and behavioral regulation, it is incorrect to assume that downregulation of cortisol activity and reactivity should always be an inherent goal of EAL, as this is dependent on the population under study and the targeted outcome. Researchers studying associations between HPA axis activity and emotional pathology in adolescence have theorized that adolescents with elevated basal cortisol levels and/or a tendency toward greater HPA reactivity to social and emotional challenges may be at greater risk for the development of internalizing problems, including depression and anxiety disorders (21, 22). As such, attempts to downregulate basal cortisol and lower cortisol reactivity may be warranted for some adolescents reporting symptoms of depression and anxiety to prevent the development of clinical levels of either disorder. On the other hand, there is evidence to suggest that physiological regulation may be different for individuals with developmental disorders or psychopathology. For example, individuals with ADHD tend to have low basal levels, as well as low momentary arousal in response to activities, and may as such benefit from program activities that aim to increase physiological arousal (23), a feature that would be counterproductive when working with adolescents with autism, who tend to have over arousal of cortisol levels (24). In sum, given the limited knowledge about the efficacy of EAL to affect momentary down- or upregulation of HPA axis activity, it is somewhat premature to advise practitioners about in- or exclusion of up- or downregulation activities purely based on the etiology of the population served. 
In fact, examining associations between participants' moment-to-moment emotion, diurnal, basal, and moment-to-moment reactivity of HPA-axis activity in the context of a common, yet challenging EAL activity, and the contributions of these responses to participants' behavioral regulation during the activity is an important first step toward increasing our knowledge. Although there is theoretical rationale to expect individual differences in physiological and affective responding to the stressors and support inherent in EAL settings, there is currently no prior study that has examined these processes on the momentary, experiential level.

The first aim of this study was to examine moment-to-moment emotion states of 10–14-year-old adolescents before (i.e., anticipatory emotion) and after (i.e., post-ride emotion) their first ever mounted EAL activity. We selected the first mounted activity for closer examination as it was expected to evoke different reactions with regard to participants' perceptions of stress and support, which we hypothesized would result in varying levels of HPA activation and differences in moment-to-moment emotion and observed behavior. This first mounted activity took place during the fifth week of an 11-week EAL program, which reduces the likelihood that affective and physiological arousal observed in our study occurred merely in response to the novelty of being exposed to equines. The second aim was to model these emotion states by examining contributions of adolescent characteristics (e.g., gender, age, referral status) and indices of physiological regulation (e.g., basal cortisol levels, dysregulation of diurnal patterns, and momentary cortisol reactivity in response to riding). The third aim focused on the extent to which participants' emotion and arousal in anticipation of and in response to riding informed the quality of observed adolescent behavior during their first mounted activity.

### MATERIAL AND METHODS

All study procedures involving human participants were conducted in accordance with the ethical standards of the institutional research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. All procedures performed involving animals were in accordance with the ethical standards of the University's Institutional Animal Care and Use Committee.

### The EAL Program and Focus Session

The EAL session under study was conducted during the fifth week of an 11-week EAL program aimed at enhancing adolescents' social competence and reducing stress. The overall program consisted of eleven weekly, 90-min sessions of individual, team, and group-focused equine assisted activities, which were conducted at a PATH Intl. Premier Accredited Center in a university setting. The curriculum utilized in the study was designed by several individuals including a PATH Intl. certified instructor, a licensed counseling psychologist, and a developmental psychologist. Each of these individuals had participated in a range of equine assisted learning and therapy workshops and trainings. While activities varied from week to week and were based on principles of equitation science (25) and natural horsemanship, the curriculum was not situated in one particular field or perspective. Various underlying principles from the aforementioned models were combined with those based on the developmental stress system literature, learning theories, and counseling perspectives and subsequently incorporated into the activities over the 11-week period. The resulting curriculum featured a combination of mounted and un-mounted activities and horse-human interactions, including observation of equine behavior, engagement in equine management (e.g., grooming), in-hand horsemanship, some riding, and personal and group reflection. The program was implemented by a team of PATH Intl. certified instructors and certified equine specialists, undergraduate students in child development, education, and animal science, professional counseling psychologists, and graduate-level counseling students. Each weekly session featured eight participants, who were each paired with a peer and worked in a team that included one equine, an equine specialist, and a facilitator and that remained the same throughout the 11 sessions. Sessions were conducted on weekday afternoons as an after-school program and included transportation of participants from school, immediately following their regular school day, to the program site and back using program vans. A full description of the weekly program objectives and activities is provided in **Table 1**. A full description of the activities conducted during the focus session, as well as the timing of data collection, is provided below and summarized in **Table 2**.

#### TABLE 1 | Outline of lesson objectives by week.

<sup>a</sup>Pressure refers to signaling toward the horse (implied/indirect), pushing with fingers/hand or leg until the desired behavior of the equine is fulfilled (i.e., moving away from the pressure).

<sup>b</sup>Driving refers to signaling the horse to move forward by pointing toward the desired space, and applying pressure behind the equine's shoulder to "drive" them forward and ahead of the equestrian.

<sup>c</sup>Desensitizing refers to exercises to condition the equine to ignore a stimulus or object, thus avoiding "spooking" at certain gestures or objects.

<sup>d</sup>Horse Massage refers to an exercise taught to participants in which they stroke the horse with slow, rhythmic strokes down the equine's body to "relax" him/her before riding.

### Recruitment

This study employs data collected from participants recruited in the first year of a 2-year randomized controlled trial on the effects of the 11-week program, in which this study is embedded. Although the EAL program studied was suitable to be administered universally or selectively, researchers and program facilitators chose to recruit selectively, aiming for approximately equal numbers of boys and girls per grade and giving priority to adolescents with lower social competence. Program participants were recruited through distribution of flyers and advertisements in two school districts serving ten schools in two small university communities in the states of Washington and Idaho. Recruitment of participants was also accomplished through advertisements in local newspapers and through soliciting referrals by school counselors and local mental health providers. Adolescents referred by professional counselors were either receiving school counseling services for academic and/or behavioral adjustment issues or had parents who had sought consultation with counseling staff about concerns over their adolescent's exposure to school and/or home-based stress. Criteria for program participation were that (1) parents and adolescents were proficient English speakers, (2) the adolescent did not have physical or developmental disabilities, and (3) the adolescent attended the 5th through 8th grade.

### Screening and Selection

For purposes of screening and selection, parents first completed a standardized measure of adolescent social competence and reported on exposure to school and/or family-based stress for which they were paid five dollars. Social competence was measured with the DESSA (26), a 72-item measure, originally designed for use in schools, asking parents to indicate how often various adolescent behaviors occurred based on a 5-point Likert scale ranging from 0 (never), 1 (rarely), 2 (occasionally), 3 (frequently), to 4 (very frequently) over the last month. The DESSA has excellent internal reliability (α<sup>1</sup> = 0.98) and shows significant, moderate-to-high correlations with two widely-used measures with good psychometric properties, the BERS−2nd Edition (27) and the BASC−2nd Edition (28). The DESSA is a strength-based assessment featuring a social competence composite score by summing scores across 8 subscales including Optimistic Thinking (α = 0.87), Self-Management (α = 0.86), Goal-Directed Behavior (α = 0.89), Self-Awareness (α = 0.82), Social-Awareness (α = 0.86), Personal Responsibility (α = 0.87), Decision Making (α = 0.91), and Relationship Skills (α = 0.93). Screened participants were rank-ordered based on these scores to give priority to adolescents with lower social competence.

Next, selected study participants were randomly assigned to a treatment group starting program participation a week later, or to a waitlisted control group, who started participation 16 weeks later. In year 1 of the trial, 64 participants (N<sub>boys</sub> = 30; N<sub>girls</sub> = 34; M<sub>age</sub> = 10.93 years) were selected for study participation and randomly assigned to an experimental group (N = 33) or waitlisted control condition (N = 31). Of those, 59 participants attended the session of focus in this study (N<sub>boys</sub> = 27; N<sub>girls</sub> = 32; N<sub>referred</sub> = 10; M<sub>age</sub> = 11.63 years), predominantly White (81.6%) and of non-Latino or Hispanic ethnicity (88.8%), with the remaining adolescents reporting across racial categories that included more than one race (8%), Asian (3.2%), and American Indian or Alaska Native (1.6%), or unknown race (5.6%). Although this study is embedded in a larger, 2-year causal trial, the analyses featured in this manuscript are not designed to make causal inferences.

TABLE 2 | Approximate timing of activities and associated variables during mounted activity session.

<sup>a</sup>Since each participant received one-on-one assistance from the lead instructor, outlined start and end times are approximate. While variation in clocked times of riding, cortisol sampling, and survey completion occurred, timing between activities/events and cortisol sampling (i.e., 25 min), time between cortisol samples 2 and 3 (i.e., 10 min), and total ride time (i.e., 20 min) were identical for each participant.

<sup>b</sup>Participants had a lead walker at all times throughout their riding experience.

<sup>c</sup>Obstacles included weaving through cones, zigzagging through poles, halting between poles, etc.

### Diurnal Activity of HPA Axis: In-home Salivary Sampling

Parents and adolescents were instructed and consented/assented in person by the PI. Two weeks before the beginning of the EAL program, they received written and verbal instructions and a hands-on demonstration on how to collect and store salivary samples. In the week before the EAL program started, parents assisted adolescents with the in-home sampling of six salivary cortisol samples using the passive drool method by spitting through a straw into a sterile 1.8 ml cryovial. Saliva was collected three times a day, on each of two consecutive weekdays, at prescribed events (immediately upon waking, immediately before bedtime) and at a prescribed time (4:00 p.m.), in their own home and as they went about their normal daily lives. Participants and parents recorded the exact time each sample was taken, and completed an activity and event report on each sampling day on their use of steroid-based medication, timing of intake of food and beverages, sleep and wake times, as well as any unusual circumstances that may have influenced reliable sampling or interpretation (e.g., sickness, exposure to an unusual or stressful event) during days of salivary sampling. Substantial efforts (i.e., written, verbal, and in-person instruction, hands-on demonstration, and telephone and email reminders) were made to impress upon participants the importance of compliance with the study's sampling procedures, particularly with regard to the timing of saliva sampling and reporting of sampling times, especially for samples collected upon waking. Compliance was high, as 92% of adolescents provided all six diurnal samples as requested. Completed sample sets were retrieved from participants at the beginning of the EAL program session, stored on ice in a cooler, and, after visual inspection for blood contamination and sample cataloging, transferred that day to our laboratory-based freezer for storage at −80 degrees Celsius.

### Momentary Cortisol Sampling

In addition to the diurnal samples collected a week before the start of the program, participants were asked to provide three momentary samples of salivary cortisol during the fifth session, which constitutes the focus of this study. Adolescents provided their first saliva sample under supervision from research assistants, immediately upon being picked up from school and before being transported to the barn. Since it takes ∼25 min after an event or stressor for cortisol levels to reach their peak in saliva (29), participants' cortisol levels collected at the time of pickup reflected their HPA axis activity ∼25 min prior to the end of their regular school day, and are therefore referred to as basal cortisol.

Although the focus session was implemented according to the activity and sampling outline presented in **Table 2**, each participant's specific mounting, riding, dismounting, sampling, and survey activity was individually managed, systematically timed with a stopwatch, and documented to ensure that the observed associations between riding-related events, levels of cortisol present in saliva, and participants' momentary emotions reflecting riding-related events are empirically justified. Since participants mounted and dismounted one by one, under direct supervision of the certified lead PATH instructor, each individual's precise mounting time was used as the anchoring time for determining the salivary sampling times (25 and 35 min later) of the individual's subsequent two cortisol samples. We conceptualized the cortisol parameter collected 25 min after mounting as anticipatory cortisol to best capture the dynamic, continuous, and transactional nature of HPA axis activity likely to operate in the context of the mounting process, which constitutes a novel, potentially arousing stressor usually lasting several minutes. The third saliva sample was collected exactly 35 min after mounting (i.e., ride cortisol) and was used to calculate the participant's cortisol reactivity as the difference between the third and second samples.
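The anchoring of samples to each participant's mounting time can be sketched in a few lines. This is a minimal illustration, not the study's actual procedure: the function names and the mounting clock time are hypothetical, the cortisol values are illustrative (they equal the group means reported in the Results), and the difference is taken on untransformed values, which is an assumption.

```python
from datetime import datetime, timedelta

# Offsets taken from the text: samples 2 and 3 are anchored to each
# participant's own mounting time.
ANTICIPATORY_OFFSET = timedelta(minutes=25)
RIDE_OFFSET = timedelta(minutes=35)


def sampling_schedule(mount_time):
    """Return the clock times for saliva samples 2 and 3 (hypothetical helper)."""
    return mount_time + ANTICIPATORY_OFFSET, mount_time + RIDE_OFFSET


def cortisol_reactivity(anticipatory, ride):
    """Reactivity as described in the text: third sample minus second sample."""
    return ride - anticipatory


mount = datetime(2020, 1, 1, 16, 5)  # hypothetical mounting time
second, third = sampling_schedule(mount)
print(second.strftime("%H:%M"), third.strftime("%H:%M"))  # 16:30 16:40

# Illustrative values in µg/dl; a positive result means cortisol rose.
print(round(cortisol_reactivity(0.063, 0.067), 3))  # 0.004
```

Because the offsets are fixed, the 10-min interval between samples 2 and 3 is identical for every participant even though clock times differ.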

While we recognize that the precise time elapsed between events and peak level of cortisol obtained in saliva is an approximation, it is less important to focus on the absolute values of these variables than it is to consider their value in capturing change in participants' cortisol levels in the context of ∼10 min of riding, which is informed by participants' perception of the availability of coping resources and support (21). Any factor that influences the individual's perception of themselves and their environment, including their past experience (e.g., risk-status), their emotional and physical traits and states (e.g., moment-to-moment emotion, basal cortisol, diurnal dysregulation), and the nature of their support system (e.g., ability of equine-assisted facilitators to facilitate downregulation of arousal, presence of supportive peers, a responsive, gentle horse) may influence this perception and the associated reactivity. Each participant rode for a total of 20 min and dismounted in the order they mounted. All saliva samples were collected under the supervision of trained research assistants who immediately marked each sample with the exact time and date of collection. Completed samples were stored on ice in a cooler and, after visual inspection for blood contamination and sample cataloging, transferred that day to our laboratory-based freezer for storage at −80 degrees Celsius.

### Momentary Emotion Sampling

Measurement of participants' moment-to-moment emotion was based on procedures and measures of the experience sampling method (ESM) (30), which is a method that provides detailed information about participants' subjective interpretations of their experiences in naturalistic settings. Because ESM is designed to capture individuals' subjective evaluations of events at a particular moment, it is particularly valuable for studying emotions such as stress: it is possible to determine a person's stress level at a given moment, as well as to identify specific instances when stress increases or decreases in response to specific events (31). Research examining the quality of ESM data has concluded that these data are reliable and valid when compared with data obtained from other instruments (32, 33). Findings also indicate that respondents are generally truthful in reporting their immediate subjective experiences (34), further supporting the validity of ESM. Using a protocol from a prior study linking emotional functioning to cortisol levels (35, 36), ESM reports were taken twice during the session under study, once immediately before mounting the horse, and once immediately after dismounting, 20 min later. The 1-min, 11-item survey asked participants to endorse on a four-point scale, ranging from 0 (not at all), 1 (a little), 2 (somewhat), to 3 (very much), the extent to which they "felt embarrassed, nervous, overwhelmed, frustrated, stressed, confident, relaxed, excited, happy, proud, and relieved." To reduce the potential of participants providing socially desirable answers, each participant was provided privacy while completing the survey.

To reduce the possibility of type 1 error, principal component analyses (with a varimax rotation) were performed, revealing two main emotion factors: negative emotion (i.e., embarrassed, nervous, overwhelmed, frustrated, stressed, and confident and relaxed, which were both reverse coded) and positive emotion (i.e., excited, happy, proud, and relieved). Coefficient alphas for negative and positive emotion were 0.73 and 0.68, respectively. These scores were used to derive variables capturing positive and negative emotion before riding (i.e., pre-ride positive emotion, pre-ride negative emotion) and after riding (i.e., post-ride positive emotion, post-ride negative emotion).
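As a rough sketch of how reverse coding and coefficient alpha fit together for an emotion factor, consider the following. It is not the authors' analysis code: the function names and the response data are hypothetical, only three of the seven negative-emotion items are shown, and scoring the factor as the mean of its items is an assumption (the text does not say whether items were averaged or summed).

```python
import numpy as np


def reverse_code(scores, max_score=3):
    """Flip an item on the 0-3 response scale (3 becomes 0, 0 becomes 3)."""
    return max_score - np.asarray(scores, dtype=float)


def cronbach_alpha(items):
    """Coefficient alpha for a (participants x items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_variances = items.var(axis=0, ddof=1).sum()
    total_score_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - sum_item_variances / total_score_variance)


# Hypothetical responses (one entry per participant) on the 0-3 scale.
nervous = [3, 2, 1, 0, 2]
stressed = [2, 2, 1, 0, 3]
confident = [0, 1, 2, 3, 1]  # reverse coded before joining the negative factor

negative_items = np.column_stack([nervous, stressed, reverse_code(confident)])
negative_score = negative_items.mean(axis=1)  # assumption: factor score = item mean
print(round(cronbach_alpha(negative_items), 2))
```

Reverse coding matters for alpha: leaving "confident" in its original direction would make it correlate negatively with the other items and pull the coefficient down.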

### Participant Behavior

For study purposes related to the 2-year RCT in which this study was embedded, participants' positive and negative behaviors were rated weekly after each session using the Animal Assisted Therapy—Psychosocial Session Form (AAT-PSF) (37). Since these ratings had been completed each week, we used ratings of behavior obtained at the beginning of the program (Week 1) as a control variable in models predicting behavior observed during the focus week under study. Each adolescent's behavior was independently rated by two program facilitators who worked with the adolescent during the focus session, and by a third rater, a research assistant not engaged in the facilitation of human-horse interaction. Raters indicated the extent to which adolescents engaged in 25 positive behaviors (e.g., following direction, accepting feedback, sharing, making eye contact, appropriately assertive) and 18 negative behaviors (e.g., argumentative, fidgeting, withdrawn, hyperactive, resistant) on a six-point Likert scale containing 0 (none), 1 (very low), 2 (low), 3 (medium), 4 (high), and 5 (very high). Summed scores for each participant's positive and negative behaviors were averaged across observers, whose ratings were positively associated as evidenced by a significant intra-class correlation, r = 0.829, p < 0.001, resulting in a score of positive and negative behavior for each participant.

### Data Reduction of Cortisol Values and Parameters Calculation

All samples were analyzed by a professional laboratory specializing in salivary cortisol assaying by enzyme immunoassay. The test used for these assays had a range of sensitivity from 0.007 to 1.8 µg/dl, and average intra- and inter-assay coefficients of variation <3 and 7%, respectively. Before calculating diurnal parameters, we replaced missing cortisol values with the value of the participant's own cortisol value taken at the same time on the other sampling day, rather than replacing the value with the sample mean. To limit the influence of extremely high or low individual cortisol values, diurnal cortisol values for each time point were winsorized to three standard deviations above and below the mean. The slope value of the adolescent's diurnal cortisol curve was calculated by regressing all six of the participant's cortisol values on his or her sampling time over the 2-day sampling period and the unstandardized coefficients derived from these regression analyses were used as a dependent variable for each participant, effectively controlling for the time of day samples were taken, and for the total time the adolescent was awake. Slope values were utilized in the classification of whether adolescents displayed cortisol dysregulation, as indicated by a positive slope, for which an indicator was assigned. As is common when conducting these types of analyses, a natural logarithmic transformation for each cortisol parameter at each time point (e.g., basal cortisol, anticipatory cortisol, and cortisol reactivity) was used to reduce positive skewness typical of this biomarker, followed by standardizing for regression analyses.
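The data-reduction steps described above (winsorizing, estimating a per-participant diurnal slope, and log-transforming then standardizing) might look roughly like the sketch below. It is illustrative only: the function names are hypothetical, the wake and bedtime clock hours are invented, the cortisol values are the group means from the Results used as a stand-in for one participant, and treating sampling time as clock hour is an assumption.

```python
import numpy as np


def winsorize_3sd(values):
    """Clip values to within three standard deviations of the mean.

    In the study this is applied across participants at each time point.
    """
    values = np.asarray(values, dtype=float)
    mean, sd = values.mean(), values.std(ddof=1)
    return np.clip(values, mean - 3 * sd, mean + 3 * sd)


def diurnal_slope(hours, cortisol):
    """Unstandardized OLS slope of cortisol regressed on sampling time."""
    slope, _intercept = np.polyfit(hours, cortisol, 1)
    return slope


def log_then_standardize(values):
    """Natural-log transform to reduce positive skew, then z-score."""
    logged = np.log(np.asarray(values, dtype=float))
    return (logged - logged.mean()) / logged.std(ddof=1)


# One participant, six samples over two days (hours are hypothetical).
hours = [7.0, 16.0, 21.5, 7.0, 16.0, 21.5]
cortisol = [0.330, 0.096, 0.107, 0.332, 0.133, 0.061]  # µg/dl, illustrative

slope = diurnal_slope(hours, cortisol)
dysregulated = slope > 0  # a positive slope is flagged as dysregulation
print(round(slope, 3), bool(dysregulated))
```

A typical diurnal curve declines across the day, so the fitted slope is negative and the dysregulation indicator stays off.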

### RESULTS

### Comparing Momentary Emotion Before and After Riding

Descriptive characteristics of participants' emotion factors and items are reported in **Table 3**. Using paired-samples t-tests, we compared reports of momentary emotion for each item and factor reported immediately before riding during mounting (i.e., anticipatory emotion) to emotion reported immediately following riding for 20 min (i.e., post-ride emotion). Given the within-subject nature of these analyses, we calculated a Cohen's d effect size for each item and factor while controlling for the correlation between measurements of each item (38).

The overall factor for negative emotion revealed that adolescents' levels of negative emotion were higher immediately before riding than after having ridden for 20 min, although this difference fell just short of the conventional 0.05 significance threshold, t(58) = 1.98, p = 0.052, d<sub>z</sub> = 0.26. This was mostly driven by adolescents feeling less nervous, t(57) = −3.58, p = 0.001, d<sub>z</sub> = −0.47, and more confident, t(57) = 2.09, p = 0.041, d<sub>z</sub> = 0.27, after having completed their ride. Results also showed that participants reported significantly higher positive emotion after riding, t(57) = 2.19, p = 0.033. This was mostly driven by adolescents' reports of feeling proud, t(56) = 4.16, p < 0.001. We also examined individual differences in positive and negative emotion factors by gender and referral status before and after riding. Results suggest that there were no significant differences by gender for positive emotion before, F(1, 57) = 0.001, p = 0.972, or after riding, F(1, 57) = 0.770, p = 0.384. Similarly, there were no significant differences by gender for negative emotion before, F(1, 57) = 0.421, p = 0.519, or after riding, F(1, 57) = 0.002, p = 0.967. With regard to referral status, the most pronounced difference was in the levels of negative emotion after riding, F(1, 57) = 11.78, p = 0.001, showing that adolescents who came to the program referred by school counselors reported significantly higher levels of negative emotion after riding, M = 0.691, SD = 0.87, than those who were non-referred, M = 0.341, SD = 0.42. As such, in models examining the role of participants' risk-related characteristics on momentary emotion before and after riding, we included referral status.

TABLE 3 | Descriptives of momentary emotion before and after 20 min of riding (N = 59).

\*p < 0.05. \*\*p < 0.01. \*\*\*p < 0.001.

<sup>a</sup>Confident and relaxed are represented as actual scores, yet were reverse coded to enable factor calculations.
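The within-subject effect sizes reported above are numerically consistent with the common paired-samples conversion d_z = t / √n, where n = df + 1. The short check below illustrates that arithmetic; whether the authors computed their effect sizes this way is an assumption, and the function name is hypothetical.

```python
import math


def cohens_dz(t_value, df):
    """d_z for a paired-samples t-test: t divided by sqrt(n), with n = df + 1."""
    return t_value / math.sqrt(df + 1)


# (label, t statistic, degrees of freedom) as reported in the text.
reported = [
    ("negative emotion factor", 1.98, 58),
    ("nervous", -3.58, 57),
    ("confident", 2.09, 57),
]

for label, t_value, df in reported:
    print(label, round(cohens_dz(t_value, df), 2))
# negative emotion factor 0.26
# nervous -0.47
# confident 0.27
```

All three values match the reported effect sizes to two decimals.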

### Diurnal Cortisol Collected at pre-test, Before Start of 11 Week Program

TABLE 4 | Descriptives of untransformed adolescent cortisol levels (µg/dl).

Descriptive statistics for observed cortisol levels and sampling times are presented in **Table 4**. Diurnal cortisol levels collected at pre-test (1 week before the start of the program, 6 weeks before the session under study took place) were in the normal range across both sampling days for wakeup cortisol levels, M Day1 = 0.330, SD = 0.20, M Day2 = 0.332, SD = 0.22, afternoon cortisol levels, M Day1 = 0.096, SD = 0.16, M Day2 = 0.133, SD = 0.31, and bedtime cortisol levels, M Day1 = 0.107, SD = 0.21, M Day2 = 0.061, SD = 0.15. On average, adolescents were awake for a total of 13.57 h per day. The slope value of the adolescent's diurnal cortisol curve, calculated by regressing each individual's cortisol level on his or her sampling time over the 2-day sampling period, was also in the normal range, M = −0.019, SD = 0.01. Indicators of slope dysregulation were assigned to participants with positive slopes. We did not find any significant differences in the slope of adolescents' diurnal cortisol by gender, F(1, 52) = 1.34, p = 0.25, or referral status, F(1, 51) = 1.21, p = 0.28.
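The slope calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: it assumes an ordinary least-squares fit of cortisol on hours since waking, and the sample values, the function names `ols_slope` and `is_dysregulated`, and the choice of time axis are all hypothetical.

```python
def ols_slope(times, values):
    """Ordinary least-squares slope of values regressed on times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def is_dysregulated(slope):
    """Flag a positive diurnal slope, following the rule stated in the text."""
    return slope > 0


# Hypothetical participant: wakeup, afternoon, and bedtime samples on 2 days.
hours_since_waking = [0.0, 8.0, 13.5, 0.0, 8.5, 13.5]
cortisol = [0.33, 0.10, 0.06, 0.35, 0.12, 0.05]  # made-up values

slope = ols_slope(hours_since_waking, cortisol)
print(f"slope = {slope:.3f}, dysregulated = {is_dysregulated(slope)}")
```

A declining profile such as the made-up one here yields a negative slope and is not flagged, whereas a profile that rises across the day would be.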

### Momentary Cortisol Levels Collected During the Riding Session

Basal cortisol levels reflecting HPA axis activity 25 min prior to the end of participants' regular school day (M = 0.083, SD = 0.061), during mounting (M anticipatory cortisol = 0.063, SD = 0.052), and after 10 min of riding (M ride cortisol = 0.067, SD = 0.099) are also reported in **Table 4**. The reported means and standard deviations reflect untransformed cortisol values; a natural logarithmic transformation was applied to normalize the distribution before further analyses were conducted. There were no statistically significant differences in cortisol levels by gender [F(1, 57) = 1.93, p = 0.17; F(1, 51) = 3.55, p = 0.07], referral status [F(1, 57) = 1.77, p = 0.19; F(1, 51) = 0.238, p = 0.58], or horse experience prior to starting the program [F(1, 57) = 3.02, p = 0.09; F(1, 51) = 0.313, p = 0.63], respectively.

### Physiological Contributions to Positive and Negative Momentary Emotion Before Riding

Using hierarchical linear regression, we first examined associations between participants' basal cortisol levels, anticipatory cortisol, cortisol dysregulation, referral status, and positive and negative emotion reported immediately before riding during mounting (**Table 5**). In predicting negative momentary emotion before riding (pre-ride negative emotion), results show that higher levels of anticipatory cortisol were significantly associated with higher levels of negative emotion before riding, β = 0.382, p = 0.041. As cortisol variables were logarithmically transformed, these variables will be discussed in terms of percentage change, rather than log-unit change for clarity in interpretation. These results suggest that a 1 SD increase in momentary anticipatory cortisol predicts a statistically significant 46.5% increase in feelings of negative emotion during mounting after basal cortisol levels, cortisol dysregulation, and referral status were accounted for.

Modeling of positive momentary emotion before riding demonstrated that higher levels of anticipatory cortisol significantly predicted lower feelings of positive emotion before riding, β = −0.508, p = 0.013. These results suggest that a 1 SD increase in anticipatory cortisol predicts a statistically significant 39.8% decrease in feelings of positive emotion. Interestingly, higher basal cortisol at pickup was significantly associated with higher levels of positive emotion before riding, β = 0.431, p = 0.034 (d = 53.9%). This suggests that higher basal cortisol levels that afternoon may have helped participants mobilize biological resources (e.g., metabolic functioning, blood glucose levels, blood pressure) in preparation for the event. At the same time, the size of the acute response during mounting may have contributed to feelings of overwhelm, which may have reduced positive emotions of enjoyment and confidence. Neither cortisol dysregulation nor referral status significantly predicted pre-ride levels of positive emotion.
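The percentage changes reported above for the log-transformed cortisol predictors are consistent with exponentiating each coefficient and subtracting one, that is, (exp(β) − 1) × 100. The following minimal Python sketch shows that back-transformation, assuming this is how the effect sizes were derived; the function name `percent_change` is illustrative and does not come from the study.

```python
import math


def percent_change(beta: float) -> float:
    """Convert a coefficient on a natural-log scale to a percent change.

    Assumption: the effect size is (exp(beta) - 1) * 100, which matches
    the percentages reported in the text.
    """
    return (math.exp(beta) - 1.0) * 100.0


# Coefficients reported for the pre-ride models.
pre_ride_betas = {
    "anticipatory cortisol -> negative emotion": 0.382,
    "anticipatory cortisol -> positive emotion": -0.508,
    "basal cortisol -> positive emotion": 0.431,
}

for label, beta in pre_ride_betas.items():
    print(f"{label}: {percent_change(beta):+.1f}%")
```

Run as written, this prints +46.5%, -39.8%, and +53.9%, in line with the values given in the text for the pre-ride models.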

### Effects of Cortisol Reactivity on Positive and Negative Momentary Emotion After Riding

Next, we examined the extent to which participants' cortisol reactivity in response to 10 min of riding predicted their positive and negative perceptions about their first ride, which were surveyed 10 min later, after dismounting. Consistent with literature on contributions of diurnal, basal, and momentary cortisol levels to affective states, basal cortisol levels and cortisol dysregulation were included in the model, as was referral status. Greater cortisol reactivity in response to riding significantly predicted higher levels of negative emotion after dismounting the horse, β = 0.395, p = 0.001, showing that a 1 SD increase in cortisol reactivity predicted a 48.4% increase in feelings of negative emotion after the first riding activity had ended. Additionally, referral to the program significantly predicted higher levels of negative emotion after riding, β = 0.338, p = 0.004. Modeling post-ride positive emotion revealed that greater cortisol reactivity in response to the first 10 min of riding predicted a statistically significant decrease in feelings of positive emotion after riding, β = −0.375, p = 0.007, suggesting a 31.3% decrease in positive emotions in response to a 1 SD increase in reactivity. Neither basal cortisol, cortisol dysregulation, nor referral status significantly predicted feelings of positive emotion after riding. These findings suggest that greater acute, momentary activation of the HPA axis in response to riding, a challenging activity that is nonetheless perceived as joyful, heightens negative emotions and dampens positive perceptions about the experience. In addition, while children who were referred and perceived as "at-risk" did not have greater dysregulation of patterns of diurnal cortisol, they were clearly less able to regulate negative affective arousal in the context of these experiences.

TABLE 5 | Regression analyses predicting momentary emotion during riding session<sup>a</sup>.

\*p < 0.05. \*\*p < 0.01. \*\*\*p < 0.001.

<sup>a</sup>Final presented models control for participant age, gender, and whether the participant was a horse novice.

<sup>b</sup>Due to the logarithmically transformed independent variable (i.e., natural log of cortisol values), the inverse function of that transformation (i.e., exponential function) was applied prior to calculating the effect size (i.e., percent change) to return each coefficient to its value.

### Predicting Negative Behavior During Riding Session

In addition to better understanding the contributions of physiological arousal to affective responses to a common EAL activity, we were interested in examining the extent to which participants' subjective experiences informed their negative behavior during the session. We controlled for gender in the model because descriptive analyses showed that boys were reported to have significantly lower levels of positive behavior (M boys = 75.85, SD = 16.13) than girls (M girls = 86.44, SD = 16.25). Additionally, boys displayed significantly higher levels of negative behavior (M boys = 14.91, SD = 11.88) than girls (M girls = 6.92, SD = 5.62).

In our first model (not shown, but discussed in text), we modeled the effects of positive and negative emotion before and after riding on negative behavior while controlling for gender and age. Increased negative emotion measured before mounting significantly predicted higher levels of negative behavior during the riding session, β = 0.361, t(57) = 2.35, p = 0.023. Additionally, increased feelings of positive emotion after 20 min of riding significantly predicted lower levels of observed negative behavior, β = −0.546, t(57) = −2.88, p = 0.006. Feelings of positive emotion in anticipation of riding and feelings of negative emotion in response to riding did not significantly predict observed negative behaviors for this session and were thus excluded from the final model (model 2), which is discussed below and presented in **Table 6**.

The final model (**Table 6**) integrates additional participant characteristics used in previous analyses while also incorporating relevant indices of HPA axis activity. Increased feelings of negative emotion in anticipation of riding, measured immediately before mounting, significantly predicted negative behavior during the riding session, β = 0.293, t(57) = 2.05, p = 0.046, and increased feelings of positive emotion after riding significantly predicted lower levels of observed negative behavior, β = −0.546, t(57) = −2.88, p = 0.006. We also see evidence of HPA-axis reactivity contributing to observed negative behavior: a 1 SD increase in cortisol reactivity in response to 10 min of riding predicted a 27% increase in observed negative behavior during this equine assisted learning program session. Also, higher levels of observed negative behavior during the first program session were associated with a statistically significant increase in levels of observed negative behavior during the first mounted program session. These findings suggest that negative perceptions in anticipation of riding for the first time, which are influenced by basal levels of cortisol on that day, along with increased physiological and affective arousal in response to riding, increased the negative behavior of EAL participants.

\*p < 0.05. \*\*p < 0.01. \*\*\*p < 0.001.

<sup>a</sup>Due to the logarithmically transformed independent variable (i.e., natural log of cortisol values), the inverse function of that transformation (i.e., exponential function) was applied prior to calculating the effect size (i.e., percent change) to return each coefficient to its value.

### DISCUSSION

The main objective of this study was to examine the associations between adolescents' physiological characteristics, affective experiences, and behaviors during a novel equine activity approximately halfway through the 11-week program. In our examination of the emotional states of adolescents before and after their first mounted equine activity of the program, we found that negative emotion significantly decreased and positive emotion significantly increased. These results echo those of Frederick et al. (39), who found that at the end of a 5-week EAL program, academically-at-risk adolescents randomly assigned to additional EAL activities reported higher levels of hope and lower feelings of depression. Together, they support the overall notion that getting to ride a horse can be enjoyable, while participants' experiences throughout an EAL session vary depending on the task, as well as the subjective experiences of that adolescent.

Next, we examined the extent to which adolescent characteristics and diurnal and momentary indices of physiological regulation predicted adolescent positive and negative emotion in anticipation and response to riding. Before riding, we found that higher levels of negative emotion were significantly associated with increased momentary cortisol levels, whereas higher levels of positive emotion were significantly associated with lower levels of momentary cortisol before riding and higher levels of afternoon basal cortisol. After the riding activity was completed, higher levels of negative emotion were significantly associated with higher cortisol reactivity in response to 10 min of riding, and being referred to the program, whereas higher levels of positive emotion were significantly associated with lower levels of cortisol reactivity in response to riding.

The presence of individual differences in cortisol reactivity to negative emotion in naturalistic settings is not surprising given the known role of differences in developmental histories, perceptions of stressors, and coping resources for individual differences in cortisol reactivity. Prior research found that the size of the HPA axis response to challenges intended to stimulate corticotropin-releasing hormone had a significant genetic component, whereas individual differences in cortisol reactivity to laboratory-based psychosocial stressors did not (40). Regardless of its origins, variability in cortisol responsivity to negative emotion and events experienced in daily life is potentially of considerable clinical interest. It is possible that "high cortisol responders" to this EAL challenge may experience more negative emotion and less positive behavior. Although participation in EAL activities may be generally enjoyable for adolescents, their experience as indicated by their emotional, physiological, and behavioral response at any point in the activity may vary based on HPA-axis activity, individual characteristics and referral status.

Finally, we examined the extent to which participants' positive and negative emotion in anticipation and response to riding informed the quality of observed adolescent behavior during their first mounted activity. We found that higher levels of observed negative behavior were significantly associated with higher levels of negative emotion before riding, lower levels of positive emotion after riding, and greater cortisol reactivity in response to riding for 10 min. Additionally, higher levels of negative behavior on the first day of the program were significantly associated with higher levels of negative behavior during the first mounted EAL session.

Based on these findings, it is important that program instructors and facilitators take into consideration some of the "invisible" individual characteristics that may be informing a participant's EAL experience, particularly during mounted sessions. Their attention should go beyond simply redirecting participant behavior with the intention of promoting safety, peer relations, activity goals, etc., to also recognize the different regulation abilities of the participants during activities that may be arousing to the adolescent. One avenue for accomplishing this may come from the nature of the programs themselves. EAL may provide a unique platform to assist adolescents in better understanding and cueing into their own physiological and emotional needs. Carlsson et al. (41) reported that clients and staff members engaged in equine assisted social work claimed that the horse's ability to cue into and respond to their emotions was influential in allowing them to become aware of those emotions. Notschaele (42) suggests that it is the horse's ability to provide moment-to-moment feedback on humans' non-verbal communication that provides the framework for a feedback system that humans do not typically encounter. The underlying idea for this feedback loop is further supported by evidence that horses do in fact respond both physically and behaviorally to humans' psychological and physical stress (43, 44). With their knowledge of horse behavior, facilitators can thus provide valuable guidance to participants to assist them with downregulation following an arousing experience, by effectively utilizing the horse's behavior as an external feedback indicator. On the other hand, this approach presumes that all horses cope actively or visibly with stressors that can occur in this situation. As such, it may place undue responsibility on the horse as an instrument, when in reality it may be unethical to wait for the horse to communicate discomfort related to the participant.
In sum, while the horse may offer up recognizable indicators to facilitators that their client is experiencing stress, it is equally important that each program facilitator be able to independently recognize behaviors of stress in their human clients. Fostering increased awareness of the role of HPA axis activity and arousal during program participation may enable program facilitators to better recognize and respond to arousal in their clients. With the dual knowledge of indicators of stress in both humans and horses, facilitators will be able to better respond to human needs, while simultaneously protecting and promoting the wellbeing of the horses they are working with.

By paying attention to the behaviors and cues of both the adolescent and horse, program staff have the opportunity to take a more active role in facilitating the downregulation of aroused adolescents during mounted activities. At times arousing circumstances may require the highlighting of positive actions (e.g., "Nice job rewarding your horse for standing still while mounting"), naming and discussing "negatives" (e.g., "I wonder whether the horse can sense that you are nervous; what do you think?"), or directly instructing the adolescent to use the horse as a measure of their downregulation (e.g., "It seems like you may be gripping your horse with your legs which may be telling your horse that you want to go faster, even though you are also asking your horse to slow down. Let's see if your horse will relax with you; take a deep breath and relax your legs when you exhale"). It is important that facilitators and instructors of EAL programs are aware that participants may not be able to regulate their behavior (e.g., listen to direction, sit still, not pull on the reins) because they are preoccupied with, and possibly impaired by, their own affective and physiological arousal in response to what they may perceive as a psychological or physical threat. As such, rather than focusing on changing the negative behavior per se, facilitators may need to focus on helping participants change their arousal and perception as a tool to facilitate desired positive behavior. Doing so will also provide opportunities for the facilitators to point out the horse's desired response, which can be used to reinforce downregulation and positive perception for the participant, functioning as a feedback loop in the physiological, affective, and behavioral domains. Overall, increasing the participants' awareness of emotion, behavior, and cognitions enhances EAL's original goal.

### Strengths and Limitations

Strengths of this study lie in several methodological aspects of the measurement of dependent and independent variables. First, the collection of the diurnal and momentary cortisol samples and participant moment-to-moment emotion was conducted under a precise, carefully-timed protocol. Although maintaining such a protocol in a naturalistic setting is logistically and procedurally difficult, it adds a dynamic set of measures to the expanding literature on EAL programming. In addition, the modeling approach is one of the few in the literature that simultaneously examines diurnal, basal, and momentary activity of the HPA axis. Comprehensively considering the aspects that inform cortisol activity in humans supports realistic expectations about participants' ability to control their arousal and reduces the likelihood that participants, especially those with dysregulated patterns, are expected to regulate their behavior more than they are psychologically able. Not recognizing the role of arousal may create the incorrect impression that participants are not willing to follow directions, which can lead to unproductive tension between the participant, their peers, the facilitator, and the horse, particularly when attempts are made to redirect using behavioral approaches (e.g., rewards such as praise; punishment such as sitting out), which tend to be less effective when affective and physiological arousal underlie undesired actions. As such, we believe that these findings provide a much-needed contribution that can inform the practice of EAL for various populations. Another strength of this study is that behavior was assessed by three independent raters, including those unconnected to the EAL implementation and thus unvested in treatment success, the horse, or the participant. This reduces the likelihood that the ratings were biased by staff exaggerating positive treatment effects.

With regard to study limitations, while the data analyzed in this study were collected under naturalistic settings as a part of a larger randomized controlled trial, the nature of the data does not allow for causal interpretation of our findings. However, these findings may inform future causal experimentation examining "best practices" for program facilitation. Additionally, although we gathered self-reports on emotional states before and after riding, we did not measure participants' appraisal of their experience. Future work examining how participants appraise the situation at hand may better inform the relationship between their physiological reactivity, emotionality, and behavior.

### CONCLUSION

In summary, this study provides evidence that studying the associations between participants' physiological and affective experiences in response to common EAL activities informs our understanding of participants' individual differences in behavior. In the long term, gaining a better understanding of how the dynamics of participants' emotions and experiences relate to the dynamics of their cortisol levels and their behavior may help to illuminate the pathways by which EAL can get "under the skin" to influence program success. EAL programming staff have an opportunity to assist adolescents, particularly those at-risk and those experiencing high levels of stress and related symptoms, in downregulating their arousal during mounted program activities to encourage appropriate behaviors, as well as to create positive, enjoyable, relaxing experiences. It is therefore important that program staff receive the necessary training and support related to participants' developmental characteristics, in addition to their experience with horses, in order not only to maintain program safety but also to promote positive learning opportunities for program participants by highlighting the unique ways in which horses can facilitate these processes.

### AUTHOR CONTRIBUTIONS

All authors contributed to the manuscript and have read and approved the final version. PP obtained funding, conceived the study concept and design, and drove the analysis and interpretation of data. AC participated in data collection, assisted in the analysis and interpretation of data, and contributed to the manuscript draft. JV critically revised the manuscript.

### FUNDING

This work was supported through grant number 1R03HD066590-01 to the first author from the Eunice Kennedy Shriver National Institute of Child Health & Human Development and Mars-Waltham. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Eunice Kennedy Shriver National Institute of Child Health & Human Development, the National Institutes of Health, or Mars-Waltham.

### ACKNOWLEDGMENTS

We would like to acknowledge Sue Jacobson, program coordinator of People-Pet Partnership at the College of Veterinary Medicine at Washington State University and Dr. Phyllis Erdman, associate Dean of the College of Education also at Washington State University, who implemented the PATH to Success program. We also thank all PATH to Success Study volunteers, research assistants, participants, their parents, and teachers.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Pendry, Carr and Vandagriff. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Dog Training Intervention Shows Social-Cognitive Change in the Journals of Incarcerated Youth

Tiffany Syzmanski<sup>1</sup>, Rita J. Casey<sup>1</sup>\*, Amy Johnson<sup>2</sup>, Annmarie Cano<sup>1</sup>, Dana Albright<sup>1</sup> and Nicholas P. Seivert<sup>1</sup>

*<sup>1</sup> Psychology, Wayne State University, Detroit, MI, United States, <sup>2</sup> School of Nursing, Oakland University, Rochester, MI, United States*

There is limited research assessing the effectiveness of Animal-Assisted Therapy in at-risk adolescent populations. In a recent study, 138 adjudicated adolescents participated in a randomized controlled trial of an animal-assisted intervention, in which participants either trained shelter dogs (Teacher's Pet group) or walked the dogs (control group), with both groups participating in classroom work related to dogs (1). Journal writing was a part of class activities for all youth in the study. Conventional assessments of youth behavior made by staff or youth themselves did not demonstrate the expected differences between the groups favoring the dog training group, as youth in both groups showed a significant increase in staff- and youth-rated internalizing behavior problems and empathy from the beginning to the end of the project (1). However, subsequent analysis of the journal content from 73 of the adjudicated youth, reported here, did reveal significant differences between treatment and control groups, favoring the Teacher's Pet group. Youth participating in the dog training intervention showed through their journal writing greater social-cognitive growth, more attachment, and more positive attitudes toward the animal-assisted intervention compared to youth in the control group. The 73 youth whose journals were available were very similar to youth in the larger group. Their results illustrate that journaling can be a useful method of assessing effects of similar animal-assisted interventions for at-risk youth. Writing done by youth receiving therapy appeared to promote self-reflection, desirable cognitive change, and prosocial attitudes that may signify improving quality of life for such youth. The expressive writing of participants could reveal important effects of treatment beyond the behavioral changes that are often the targeted outcomes of animal-assisted interventions.

#### Keywords: dogs, dog training, incarcerated youth, journaling, animal-assisted treatment

### INTRODUCTION

This study analyzed the content of journals kept by incarcerated youth who participated in a randomized controlled trial of an animal-assisted therapy, known as Teacher's Pet (1). Incarcerated youth in the United States are highly likely to have a psychiatric disorder (2), a predictor of recidivism (3). Effective treatment of at-risk youth is crucial not only to reduce the risk of recidivism in adulthood, but also to improve their quality of life (4–6). This project determined whether journal entries made during an animal-assisted treatment of detained youth could be analyzed meaningfully, and if so, whether their content provided insight into potential positive effects of the treatment.

#### Edited by:

*Peggy D. McCardle, Consultant, New Haven, CT, United States*

#### Reviewed by:

*Hsin-Yi Weng, Purdue University, United States Mitsuaki Ohta, Tokyo University of Agriculture, Japan*

> \*Correspondence: *Rita J. Casey r.casey@wayne.edu*

#### Specialty section:

*This article was submitted to Veterinary Humanities and Social Sciences, a section of the journal Frontiers in Veterinary Science*

Received: *04 June 2018* Accepted: *12 November 2018* Published: *11 December 2018*

#### Citation:

*Syzmanski T, Casey RJ, Johnson A, Cano A, Albright D and Seivert NP (2018) Dog Training Intervention Shows Social-Cognitive Change in the Journals of Incarcerated Youth. Front. Vet. Sci. 5:302. doi: 10.3389/fvets.2018.00302*

Youth are placed in detention for many reasons. About a quarter of youth engage in severe crimes, such as homicide, robbery, or aggravated and sexual assault (7). However, the vast majority of imprisoned youth engage in less serious offenses such as theft, burglary, substance abuse, simple assault, weapon possession, truancy, or running away, particularly from foster homes (8).

The majority of incarcerated youth in the United States have been given diagnoses or are likely to meet diagnostic criteria for some form of mental illness. Teplin et al. (4) found that the most common disorders diagnosed were (in terms of male and female prevalence, respectively) substance use disorders (50.7 and 46.8%) and conduct and oppositional defiant disorder (41.4 and 45.6%), along with high rates of other disorders characterized by internalizing symptoms such as anxiety (21.3 and 30.8%) and other mood problems (18.7 and 27.6%). Other studies with detained youth have found similar rates of mental disorders (5, 6).

Given the high prevalence of psychopathology in this population, it is important that treatments are available for these youth, to address such problems. Treatment regimens for incarcerated youth most commonly focus on substance abuse. However, interventions with those targets are not sufficient to meet the needs of youth with other mental disorders and multiple diagnoses (4, 6, 9, 10). Nevertheless, very few facilities implement appropriate treatments for incarcerated youth. Wilson and Lipsey (11) examined the effects of different treatments of adjudicated youth aimed at reducing recidivism, finding that overall, they reduced recidivism rates by 12%. The most effective programs focused on building social and communication skills via reinforcement in learning adaptive interactions (12). Mixed results were found from other treatment methods, such as cognitive behavioral approaches and vocational training.

More recently, interaction with animals has become a way to provide treatment to youth with serious psychological difficulties, including incarcerated youth. Such treatments hold promise of being relatively easy to implement, and attractive as a novel, well-liked intervention that many youth will enjoy (13). Such interventions have ranged widely in terms of the animals used as the central aspect of the treatment, as well as the kinds of target problems and participants that are included (14).

Inexpensive methods for examining effects of animal assisted treatment, such as expressive writing, could be integrated into treatment programming, in order to show outcomes of animal-assisted interventions for incarcerated, high-risk youth. Journaling is a form of expressive writing that is typically done at regular, frequent intervals. Individuals doing journaling record their thoughts, feelings, and experiences about their life. It is an activity that has shown therapeutic as well as learning benefits (15–18), and can be used as either a treatment in itself, or as an adjunct or way to examine the outcomes of treatments. It is a form of self-reflection that can facilitate self-awareness, personal growth, insight into emotions and behaviors, and aid in restructuring persons' thoughts about their experiences.

Although expressive writing is considered to be helpful in adult populations, very few studies have focused on the benefits of expressive writing for youth. Journaling could potentially be a very useful activity for adolescents in that it is engaging and creative, especially in the age of social media in which youth are accustomed to sharing their thoughts and actions with their peers via the internet, often through writing (19, 20). Expressive writing could be a useful method for examining effects of treatment given to incarcerated youth, in that writing is inexpensive to implement, and can also help fill the passage of time for youth whose incarceration places many ordinary youth activities out of reach. Writing down thoughts and emotions about life events, particularly those that are stressful, can aid youth in expressing and understanding their deepest emotions and forming a deeper understanding of their life experiences. Writing can ultimately help to reduce stress by providing a form of written disclosure (17, 21). Keeping a journal can also help youth develop a sense of meaning in life, show their understanding of life events, and provide a window into critical thinking skills, changed self-perceptions, and better skills for coping with difficult emotions (15, 22).

Although more research is needed on expressive writing with adolescents, it is a promising activity for teens, particularly if it can help provide insight into those who are considered at-risk. Adolescents, particularly ones with behavioral problems, can use journaling to record their reactions to treatment and reflect on them (23). Journaling can also potentially aid in revealing youths' self-awareness of life experiences, which can, in turn, increase empathy toward others and ultimately, improve interpersonal relationships (23, 24). Thus, journaling among adolescents can facilitate examination of personal growth, self-awareness, and insight, especially since adolescence is a big transitional period in emotional development from childhood to adult life (20).

Expressive writing, with its emphasis on communication of emotions, thus presents a potentially welcome and cost-effective approach for assessing treatment outcomes for incarcerated youth. Through writing about their own emotions and thoughts associated with life events, they have an opportunity to develop their perspective taking, as well as improving insight into their own behavior and generating a sense of self-efficacy in their actions (25). Furthermore, when expressive writing through journaling is done as a way to look at the effects of a time-bound intervention, such as the animal-assisted treatment reported in this project, such writing has the potential to be a valuable asset to that intervention, allowing a look at changes that may be difficult to capture through additional outcome measures focused primarily on behavior ratings. A recent meta-analysis of effects of treatments given to youth with conduct disorder (26) found that youth self-ratings were typically non-significant, and similar ratings from parents and teachers showed small to moderate effects, mostly from parents. Thus journaling might offer a way to examine treatment outcomes of youth that will not be observed in self-ratings of youth behavior.

The original project from which the current study was drawn (1) was an animal-assisted therapy intervention with incarcerated youth, in which interaction with shelter dogs was a crucial part of the intervention. Participants in a dog-training intervention (known as Teacher's Pet), designed to teach youth how to train undersocialized dogs, were expected to improve in human social, emotional, and behavioral functioning. A control group, whose principal activity consisted of walking dogs for the same amount of time as the treatment group spent with dogs, was expected to have less positive outcomes than the dog-training youth. However, behavior ratings from staff members at the facilities as well as self-ratings of the youth did not show positive changes as a function of group assignment. Rather, youth in both the Teacher's Pet intervention and the dog walking control group demonstrated increases in internalizing behaviors. In addition, all youth increased in empathy, with no differences by group assignment.

This project analyzed journal content of incarcerated teens who participated in that animal-assisted therapy (AAT) (1). We wanted to see whether journal writing would show outcomes that demonstrated differences between treatment and control conditions, consisting of youths' thoughts concerning social relationships and attitudes during the AAT. Thus, this project analyzed the journals produced by incarcerated youth who were participating in an animal-assisted therapy compared to journals of youth in a control condition. We looked for meaningful patterns of content in the youths' writing that would vary according to participants' group assignment to either the experimental treatment or the control condition.

Specifically, we hypothesized that the pattern of writing content would covary with participant group membership in ways that showed more positive outcomes for the experimental condition. If this expectation was correct, journal writing of the experimental group's participants would show more positive signs of social cognition and attitudes than were observed in the journal writing of youth in the control group.

### MATERIALS AND METHODS

### Participants

Participants in this project were drawn from the larger Teacher's Pet Research Study (1). That project consisted of 138 adolescents in two county juvenile detention centers in Southeast Michigan. By the end of the last cohort in June 2014, 73 participant journals were collected and identified as to particular youth. The missing journals were primarily unavailable because initially, the journal writing was not thought of as reflecting any meaningful outcome of the intervention, thus no plans were initially made to collect youth journals. However, a quick review of journals from some of the earliest participants in the project showed some interesting content, so journals were given code numbers to associate them with other data collected from individual participants. This took some time to implement, thus some youth took their writing materials with them when they left the facility they were in, before their journals could be copied. Other journals were copied, but without the needed identification to associate them with other individual data.

The youth whose journals were available to this project were very similar in overall characteristics to the larger group from which they were drawn. Participants were mostly male (72.5%); 58.9% of participants were in the Teacher's Pet group (N = 43), and 41.1% of participants were in the Dog walking control group (N = 30) (see **Table 1** for overall characteristics of the youth whose journals were available for this study). The background of the youth with journals available for this study was very similar to that of the larger group. About one-quarter of the participants had a history of parental abuse or neglect, or a history in the foster care system. Two-thirds had a psychiatric diagnosis and/or had been in treatment for one. Ethnicity and gender were not specifically controlled for this study; rather, these characteristics reflected the general makeup of adolescents in the juvenile detention centers that were involved in the program. The percentages by ethnicity were very similar in the 73 youth in this report compared to the larger group. Analyses of the distribution of these characteristics in the larger and smaller groups showed no significant differences in the distribution of ethnicity (χ<sup>2</sup> = 0.385, p = 0.55) or gender (χ<sup>2</sup> = 0.212, p = 0.66) between the larger set of 138 youth and the 73 youth whose journals were analyzed for this study.

TABLE 1 | Characteristics of participants.

*TP, Teacher's Pet (Intervention); DW, Dog walking (Control).*

### Procedure

Participants in each cohort were randomly assigned to either an animal-assisted therapy group (the experimental group, known as Teacher's Pet) or a dog walking control group. The Teacher's Pet Program is an animal-assisted therapy (AAT) program that had already been carried out at both facilities prior to becoming the center of this research project. Those activities were familiar to center staff but had not previously been carried out in comparison with any other intervention or comparison activity. In the experimental group's activity, youth were instructed to train undersocialized dogs in order to make the dogs more suitable for adoption out of the shelters from which they came. Participants in the control group walked assigned dogs but did not teach them. Both youth who were dog walkers and youth who participated in Teacher's Pet training worked with the same dogs. Somewhat more youth were assigned to the Experimental (Teacher's Pet) group than the Control (dog walking) group, because of facility staff requests, scheduling, and the desire to populate the Experimental group because of the small sample. This limitation is discussed further in the Discussion section. We note, however, that the sizes of treatment and control groups were more similar across the 73 youth whose data are reported here, compared to the larger group (χ<sup>2</sup> = 4.495, p = 0.034).
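As an arithmetic check on the group-size comparison above, a chi-square statistic with one degree of freedom can be converted to a p-value using only the complementary error function. The sketch below assumes df = 1 (two conditions by two samples), which the text does not state explicitly, and the function name is illustrative rather than part of the study's analysis.

```python
import math

def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For df = 1 the chi-square variable is the square of a standard normal,
    so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    if chi2 < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(chi2 / 2.0))

# Reported comparison of treatment/control group sizes (df = 1 is an assumption)
p = chi2_p_value_df1(4.495)
print(f"chi2 = 4.495, p = {p:.3f}")
```

Under the df = 1 assumption this yields a p-value of about 0.034, in line with the value reported for the group-size comparison.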

### Consent

All project procedures were reviewed and approved by the Wayne State University Institutional Review Board, specifically by those committees that dealt with research involving vulnerable participants, in this case minors and prisoners (the latter also required U.S. Department of Health and Human Services approval, which was obtained prior to study commencement). Permission for every youth to participate in the Teacher's Pet Program was obtained from the parent or legal guardian of each youth, or from an advocate such as a facility staff member if parental or legal guardian consent could not be obtained, due to absence of a legal guardian. Assent was also obtained from each adolescent who participated. Youth were free to participate or not, and were allowed to stop participating in the program before their portion of the study was completed, although none chose to do so. Compensation for full completion of the Teacher's Pet Program consisted of a \$50 Target gift card that was given to each participant after completing the program, when their period of incarceration ended.

### Study Conditions

The Teacher's Pet Program was completed in cohorts, with each cohort participating in training and dog walking that lasted 10 weeks. There were about 10 participants per cohort, with both experimental and control conditions being carried out for each cohort. Youth in each cohort were randomly assigned to either the animal-assisted therapy group (Teacher's Pet, learning to train dogs) or the control group (walking the dogs only, for 10 weeks and for the same amount of time with dogs as the experimental participants had). Participants in both conditions interacted with dogs for an equal period of time each week, about 2 h total. They also had classroom-based didactic sessions each week that focused on information about dog care, dog behavior, and humane treatment, with youth assigned to both conditions together in the same class periods. Journals were available and journal writing took place during those classroom sessions.<sup>1</sup>

Sessions in which the participants interacted with the dogs occurred for 2 h each week in either an indoor gymnasium or, weather permitting, an enclosed outdoor courtyard within the facility. Training of dogs occurred in 1 h increments, twice per week, whereas dog walkers either walked assigned dogs for 1 h twice per week or for half an hour 4 times per week. Participants in the experimental group were assigned a dog to train; each trained that dog for one half of the program (5 weeks) and was then given another dog to train for the remaining 5 weeks of the program. Dog walkers were assigned different dogs for each session of dog walking, although some youth traded dogs for walking if they chose to do so. These participants were instructed not to train the dogs they walked. Experienced dog trainers or shelter staff were present with the youth and animals through every session of dog-youth interaction, to ensure both fidelity to the assigned activities and safety for the youth and the dogs. Coders were also present to observe participant behavior during training sessions.

### Dogs in the Study

The dogs who participated in this project were brought in daily from nearby Southeast Michigan animal shelters. Facilitators or volunteers for the Teacher's Pet Study worked at the animal shelters and transported the dogs to and from the shelters for each session. No dogs were hurt in the course of their participation in this study. The procedures for the dogs selected for this project were approved by the Institutional Animal Care and Use Committee of Wayne State University. All shelter dogs had a health examination as well as a temperament assessment. Potential dogs for this project also underwent another screening by Teacher's Pet staff to test for major behavioral issues, to ensure the safety of dogs and participants. This screening was essentially similar to that given to dogs made available for adoption. Any dog that displayed aggression toward humans or other dogs was immediately ineligible for this study. The dogs included in this study were of a variety of breeds, commonly pit bull mixes, and were at least 1 year of age. Dogs that displayed minor behavior problems, such as jumping, pulling, and having socialization difficulties, made good candidates for the program's enrichment and training.

The history of the program was such that dogs trained by the youth had a record of high success in being adopted once they had been trained by Teacher's Pet participants. Although dog adoption was not a specific outcome planned for this project, estimates are that adoption rates for the participating dogs were close to 90%. National estimates of adoption rates for shelter dogs are about 60% (27).

### Didactics

All participants also engaged in a 1 hour classroom (didactic) session twice weekly, held after training or walking activities. Youth in these classroom activities were not separated by treatment condition, but were grouped together for all class sessions. In the classroom portion of the program, participants learned about dog training techniques, animal shelters, puppy mills, dog behavior, facts about certain dog breeds, etc. Teacher's Pet staff and the staff from the juvenile detention center facilitated the classroom sessions. During the last part of each classroom meeting, the youth were instructed to write in personal journals that were given to them specifically for this program. Participants had either specific writing assignments about varying topics (many relating to the participants' assigned dogs or material in the classroom), or the youth could write freely for this part of the classroom sessions. The youth were also allowed to use the journals to take notes on information that was presented during more didactic classroom portions, if they wished. Although assignments were made or suggested in some didactic sessions, journal content was not graded or evaluated for conformity to any specific instructions given during classroom sessions. The journal content became the primary focus of this study.

<sup>1</sup>Additional details concerning the content of the Teacher's Pet intervention can be obtained from the authors upon request, or by going to the web site associated with the intervention (www.teacherspetmi.org). See Author Notes for information about contacting the authors.

### Instruments

Instruments used for the original Teacher's Pet Study consisted mainly of ratings of youth by the facility staff, and self-report measures given to participants pre- and post-intervention. Staff of the detention facilities also reviewed each participant's information and medical chart. Some information for this project was gathered from that review, including participant age, gender, ethnicity, psychiatric and medical history, and history in foster care.

### Analysis of Participant Journal Content

Although all youth in the program (regardless of whether they were in the Teacher's Pet group or the dog walking control group) had a journal to use during the didactic portion of the study, not all journals were made available for the project reported here, as noted above.

Four raters were assigned to evaluate the content of the journals. Although each had some knowledge of the overall purpose of the project, they made their ratings blind to the condition to which journal writers were assigned. Raters did not serve as coders of the live interactions of youth with the dogs. Other than in three journals that explicitly mentioned which group the writer was in, condition was not obvious from reading the journals. Each rater coded two-thirds of the journals, with each journal being rated by two additional and different raters among the group, through a random process.

### Development of the Coding System

Through a series of analytic steps applied to the journals collected, we devised a coding system for the content of the journals. Originally, it was hoped that writing rated in the journals would represent six categories: perspective-taking/empathy with others, humane attitudes, attachment to a dog or dogs, self-efficacy/self-perceived competence, emotion regulation of self, and overall reactions to the program. However, when the coders began an initial reading of the journals to orient themselves prior to coding, only some of those categories appeared easily observable, leaving much writing uncategorized.

Therefore, a coding system was developed inductively, beginning with the content of the youths' writing. First, journals were reviewed and topics and content that seemed important or which were mentioned by several youth were noted. These included topics presented freely in participant writing, things clearly appearing to respond to classroom presentations where journaling followed those presentations, and some interesting extraneous material such as drawings. These topics and characteristics were discussed extensively by the group of three coders, until we agreed initially on 50 different types of content or writing characteristics that were observed in the journals. A manual was developed in order to guide the researchers through the process of analyzing and coding individual journals for the 50 types of data that were observed across the entire set of available journals. This manual contained direct and indirect examples of each of the individual coding categories taken from excerpts of the journals themselves in order to help researchers with coding properly. Next, every journal was coded for each of these 50 types of content, on a numerical scale pertaining to how many times a certain category was mentioned or described in writing or observed within each individual journal.

### Reliability of Initial Coding

For reliability purposes, every journal was rated by three raters. The particular combination of raters for each journal was determined by a random order designed to produce the same number of raters for each journal, with each rater rating the same number of journals as every other rater. The data obtained from multiple ratings of the same set of journals were used to calculate pair-wise reliability as well as reliability across three raters. As a result of these ratings, several categories were removed from the rating system, because they were unreliable as coded, or had rarely been observed across the set of journals.

### Obtained Reliability of the Remaining Categories

The obtained average internal consistency of ratings of the remaining categories, calculated as intraclass correlation coefficients (ICC) as recommended by Shrout and Fleiss (28), is found in **Table 2**. These categories were chosen based on the high range of agreement across three raters as well as the relevance of these categories to the study. Categories with an ICC below 0.625 were dropped from further consideration. Individual code categories were then placed into groupings that appeared to be related, based on the content of each code. Thus, six larger categories were identified and labeled to reflect the items they contained. Future Orientation, Cognitive Growth, and Self-Awareness, taken together, appeared to be primarily cognitive codes. Attachment, Attitude Toward the Program, and Positivity of Emotion together seemed to reflect primarily emotion-related content.
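The reliability screen described above can be sketched as follows. The text specifies only that the coefficient uses the average of random raters; the sketch assumes the one-way random-effects form, ICC(1,k), which is one reading of a design where each journal is rated by a randomly chosen subset of raters. The rating matrix is illustrative and is not study data; only the 0.625 cutoff comes from the text.

```python
from statistics import mean

def icc_oneway_average(ratings: list[list[float]]) -> float:
    """ICC(1,k): one-way random-effects model, reliability of the mean of k ratings.

    `ratings` has one row per target (e.g., journal) and k ratings per row.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 targets, each with the same number (>= 2) of ratings")
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Between-target and within-target mean squares from a one-way ANOVA
    bms = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    wms = sum((x - m) ** 2 for row, m in zip(ratings, row_means) for x in row) / (n * (k - 1))
    return (bms - wms) / bms

# Illustrative counts for one code category (NOT study data): 6 journals x 4 ratings
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
ICC_CUTOFF = 0.625  # threshold reported in the text
icc = icc_oneway_average(example)
print(f"ICC(1,k) = {icc:.2f}; keep category: {icc >= ICC_CUTOFF}")
```

The same function applies unchanged to three ratings per journal, as in the study. A two-way model (ICC(2,k)) would be the alternative if the same raters had rated every journal.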

For the emotion-related codes, a group of codes called Attachment consisted of participants' writing about their interactions with their assigned dog, such as physical contact with their dog (e.g., hugging), empathy toward their assigned dogs or dogs in general (such as a negative reaction toward a movie on animal shelters), wanting to help dogs, feeling sorry for their assigned dogs or other shelter dogs, writing about patience, talking about having affection for their dog, and writing about the feelings or thoughts of the assigned dog. The codes categorized as Attitude Toward the Program included writing about the participant's general outlook on the program and what they got out of the program, writing about liking the program, training challenges or goals for their assigned dog, and writing about what they had learned in the program. Positivity of Emotion, the label given to another set of codes, included participant writing about their dog liking them, any mention of their own positive feelings (e.g., feeling good or happy) whether related to the program or not, mention of negative feelings whether related to the program or not, a critique of their assigned dog, and writing about being happy about working and/or being with their assigned dog.

TABLE 2 | Sets of coding for participant journal contents.

\**ICC, Intraclass correlation coefficients, using the average of random raters (28).*

For the cognitive-related codes, a group of codes labeled Future Orientation included writing about being hopeful of the future for themselves (e.g., writing about careers, what they plan on doing after the program, writing about the future in general, etc.) and being hopeful about the future of their assigned dog(s), e.g., hoping that the dog or dogs get adopted, etc. A set of codes called Self-Awareness included participants' writing about their relationship with the staff at the facility, showing insight into their own behavior (e.g., self-reflection), comparing themselves to their assigned dog and vice-versa, writing about what they had learned in the program (also included in the Attitude Toward the Program group); and mentioning positive and negative feelings whether the emotions were related to the program or not. Finally, the set of codes called Cognitive Growth included such things as a letter to the adopter of a dog, a flier or a story about an assigned dog (a writing prompt given during the classroom portion in the study), the participant writing about how a dog changed the participant's attitude, writing about the observed behaviors of an assigned dog, and presence of notes left by the staff of the program (staff read participants' journals and left notes and/or follow-up questions for some but not all of the youth).

In summary, the codes were organized into larger, labeled sets of codes as described above, with the overall total of the individual ratings for each set serving as data for further analysis. **Table 2** contains interrater reliability coefficients for the codes making up each set.

### Educational Level of Writing

It was reasonable to expect that the sophistication or general education level of the journal writing could also have an effect on what youth wrote (29, 30), given that incarcerated youth often show academic deficits. No analysis of educational level was made in the larger project from which this study's data had been obtained, nor did youth records indicate their educational level. Thus, it was decided to estimate youths' writing/education level, to make it available for analyses of experimental vs. control group differences in journal content. Participant journals were therefore rated based on overall written sophistication observed in each journal. Two expert raters with extensive background knowledge and teaching experience working with young writers and students made holistic ratings of each journal, judging educational sophistication from journal writing content and detail. These raters had neither met nor observed any of the participants during the intervention and were unaware of the group to which the youth writers had been assigned. Both raters rated every journal. Ratings were based on the general written sophistication of content in the journal entries, using a scale of 1–3, with "1" being lowest, "2" medium, and "3" highest in written sophistication. The inter-rater agreement for the two raters was 0.75 (Cohen's Kappa) across the full set of journals. The ratings of the two raters for each participant were summed, producing scores ranging from 2 to 6, with lower scores representing less writing skill or educational sophistication than higher scores. These scores were used in the data analysis to take into account possible effects, if any, of writing/educational sophistication on the journals' content ratings.
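A minimal sketch of the two computations described above, inter-rater agreement and the summed sophistication score, is given below. It assumes unweighted Cohen's kappa (the text does not say whether a weighted variant was used), and the ratings are hypothetical values on the 1–3 scale, not study data.

```python
from collections import Counter

def cohens_kappa(rater_a: list[int], rater_b: list[int]) -> float:
    """Unweighted Cohen's kappa for two raters rating the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category proportions
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings on the 1-3 sophistication scale (NOT study data)
rater_1 = [1, 2, 3, 2, 1, 3, 2, 2, 1, 3]
rater_2 = [1, 2, 3, 2, 2, 3, 2, 1, 1, 3]

kappa = cohens_kappa(rater_1, rater_2)
# Summed score per journal, range 2-6, as described in the text
summed = [a + b for a, b in zip(rater_1, rater_2)]
print(f"kappa = {kappa:.2f}", summed)
```

Because the scale is ordinal, a weighted kappa would give partial credit for near-misses (a 1 vs. a 2); the unweighted form shown here treats every disagreement equally.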

After writing level scores were obtained, relations to writing content sets were analyzed. Education sophistication level had significant positive correlations with every set of codes, with higher writing levels associated with higher scores in the content categories. These relationships of general writing/educational level to content categories included Cognitive Growth, r(71) = 0.46, p < 0.01; Future Orientation, r(71) = 0.50, p < 0.01; Self-Awareness, r(71) = 0.49, p < 0.01; Attitude Toward the Program, r(71) = 0.38, p < 0.01; Positivity of Emotion, r(71) = 0.29, p < 0.05; and Attachment, r(71) = 0.40, p < 0.01. Thus, it was decided that educational sophistication would be used as a covariate in subsequent analyses of differences between experimental and control groups.<sup>2</sup>
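The significance levels attached to these correlations can be checked from r and the degrees of freedom alone, since t = r·sqrt(df / (1 − r²)). The sketch below uses the reported r values; the two critical t values for df = 71 are rounded table values supplied here as an assumption, not figures from the study.

```python
import math

def t_from_r(r: float, df: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(df / (1.0 - r * r))

# Approximate two-tailed critical t values for df = 71 (rounded table values; an assumption)
T_CRIT_05 = 1.994
T_CRIT_01 = 2.647

# Correlations with writing level as reported in the text
reported = {
    "Cognitive Growth": 0.46,
    "Future Orientation": 0.50,
    "Self-Awareness": 0.49,
    "Attitude Toward the Program": 0.38,
    "Positivity of Emotion": 0.29,
    "Attachment": 0.40,
}

for name, r in reported.items():
    t = t_from_r(r, df=71)
    level = "p < 0.01" if abs(t) > T_CRIT_01 else "p < 0.05" if abs(t) > T_CRIT_05 else "n.s."
    print(f"{name}: r = {r:.2f}, t(71) = {t:.2f}, {level}")
```

With these inputs, Positivity of Emotion falls between the two critical values and the other five categories exceed the stricter one, which agrees with the significance levels given above.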

### Hypotheses

It was hypothesized that there would be significant between-group differences for all six areas of journal content, whether Emotional (Attachment, Positivity of Emotion, Attitude Toward the Program) or Cognitive (Future Orientation, Cognitive Growth, and Self-Awareness), in that participants in the Teacher's Pet group would have significantly higher average scores compared to participants in the dog walking group.

A MANCOVA (Multivariate Analysis of Covariance) was conducted in order to determine if there were significant differences between the journal content of treatment vs. control groups in terms of Cognitive Growth, Future Orientation, Self-Awareness, Attachment, Positivity of Emotion and Attitude Toward the Program seen in their journal entries, controlling for education sophistication level.

### RESULTS

### Group Differences

**Table 3** reports the comparisons and effect sizes for each final code. As expected, there was a statistically significant difference between treatment vs. control groups in means for Cognitive Growth scores [F(1,71) = 11.32, p = 0.05, Wilks' Λ = 0.84, partial η<sup>2</sup> = 0.16], in that the participants in the Teacher's Pet group had a significantly higher average Cognitive Growth score (Mean = 8.05, SD = 3.93) compared to the Dog walking group (Mean = 4.70, SD = 2.52), with an effect size for the intervention of g = 0.98, a large effect. Although group differences in Future Orientation scores were not significant, the effect size was moderate, g = 0.60 [F(1,71) = 3.65, p = 0.06]. Self-Awareness scores, however, did not show group differences at or close to the conventional p-level, and did not produce a meaningful effect size [F(1,71) = 0.44, p = 0.51, g = 0.06]. Thus, in terms of cognitive writing as a function of group membership, differences in Cognitive Growth and Future Orientation scores favored the Teacher's Pet group, with moderate to large effects.
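To make the effect-size arithmetic concrete, the sketch below computes a standardized mean difference for Cognitive Growth from the means and SDs reported above, assuming group sizes of 43 and 30 and unadjusted values. The published effect sizes may have been derived differently (for example, from covariate-adjusted estimates), so this is an illustration of the formula rather than a reproduction of the authors' computation.

```python
import math

def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> float:
    """Standardized mean difference using the pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)

def hedges_g(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> float:
    """Cohen's d multiplied by the usual small-sample correction factor."""
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2 - 2) - 1.0)
    return cohens_d(m1, sd1, n1, m2, sd2, n2) * correction

# Cognitive Growth: Teacher's Pet (n = 43) vs. Dog walking (n = 30), values from the text
d = cohens_d(8.05, 3.93, 43, 4.70, 2.52, 30)
g = hedges_g(8.05, 3.93, 43, 4.70, 2.52, 30)
print(f"d = {d:.2f}, g = {g:.2f}")
```

With these inputs the uncorrected difference is about 0.98 and the small-sample-corrected value about 0.97, both close to the reported effect size for Cognitive Growth.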

In terms of emotional aspects of their journal writing as seen in the emotion category scores, there were meaningful effect sizes favoring the Teacher's Pet treatment in all three sets of coded writing (see **Table 3**). Scores for Attitude Toward the Program [F(1,71) = 12.67, p = 0.05] were significantly higher for participants in the Teacher's Pet group (Mean = 3.12, SD = 2.63) compared to participants in the dog walking group (Mean = 1.08, SD = 0.95), with an effect size of g = 0.97, a large effect. There were also significant differences on scores of Attachment for the two groups [F(1,71) = 7.28, p = 0.05, Wilks' Λ = 0.82, partial η<sup>2</sup> = 0.18], in that participants in the Teacher's Pet group showed significantly higher scores for Attachment (Mean = 6.79, SD = 5.29), compared to participants in the dog walking group (Mean = 3.13, SD = 3.10), with an effect size that was moderately large, g = 0.81. The group differences for Positivity of Emotion [F(1,71) = 3.43, p = 0.07], though not conventionally significant, were just outside that level. In addition, the effect size for Positivity of Emotion was g = 0.58, a moderate effect. Thus, youths' journal writing in the experimental group showed more positive attitudes about the animal-assisted intervention they were doing, stronger attachment to dogs and other living things, and more positive emotions in general than were observed in writing of the control group. Effects for the set of emotion scores ranged from moderate to large in size.

TABLE 3 | Comparisons of experimental vs. control group for journal content codes.

*Higher Means signify larger outcomes as seen in the coded measure of journaling content. Effect sizes were calculated using Hedges' g; its computation takes into account both differences and variation between treatment groups. It is noteworthy that strict use of p ≤ 0.05 would eliminate consideration of a medium effect size that is likely important.*

Overall, the Teacher's Pet intervention demonstrated effects seen through youths' journal writing that ranged from moderate to large in both emotional and cognitive ratings of their writing, with somewhat more differences visible in emotional compared to cognitive ratings of their journal entries.

### DISCUSSION

The results indicate that there were significant differences between the Teacher's Pet (AAT) group and the Dog walking control group in the rated emotional as well as cognitive content of what participants wrote in their journal entries, with youth in the experimental group showing more positive outcomes. This result was somewhat surprising, given that the behavioral outcomes of the youth in the project did not show differences between the experimental and the control group (1). Both groups demonstrated increases in empathy and internalizing problems in analyses of the larger group of youth from which the journals reported here came.

A key question is why the journal writing revealed positive outcomes for treatment that were not seen in behavior ratings made by center personnel nor the ratings that youth gave themselves. Scores based on different journal content pertaining to all categories of codes other than Self-Awareness were noteworthy in showing medium to large effect sizes. Thus, the writing of the youths showed treatment-related positive outcomes in Cognitive Growth, Attitude Toward the Program, Attachment, Future Orientation, and Positivity of Emotion. Participating in the Teacher's Pet intervention produced more positive effects, seen in youth journal writing, compared to the dog walking group.

<sup>2</sup>Data from the study are available on request from the authors.

One explanation for their higher average scores is that participants in the Teacher's Pet group were more likely to write adoption flyers, letters to potential adopters of their assigned dog as well as stories about their assigned dog. This writing prompt was given to all the participants during a classroom portion, before the dogs' "graduation day," at the end of the treatment period. However, because participants in the experimental group worked very closely with just two dogs, this kind of writing may have been more likely for them to do and do better than youth in the dog walking group, who walked several dogs.

The Teacher's Pet staff also left notes in some of the participants' journals pertaining to any follow-up questions that a participant had, a youth's journal content, or his or her progress in the program. This was a potential confounding factor for Cognitive Growth, as the staff knew the groups to which the participants were assigned. They may have been more likely to leave notes for participants in the experimental group regarding their progress in the program, which could have also accounted for the higher score for this category. However, no differences in the number of staff comments were noted between the two groups, though the content of those comments was not evaluated separately.

Participants in the Teacher's Pet group had closer and very different kinds of interactions with their dogs than the dog walking group had, due to training their assigned dogs for the period of the study. Thus, experimental participants were more likely to be aware of and write about the behavior of their particular assigned dogs, compared to dog walkers who may have observed more general dog behavior seen in several animals. In addition, the training itself may have caused the experimental group to track their dog's problems as well as positive behaviors closely, more than would have been the case for the dog walkers. Learning about dog behavior during the educational portion of the study and applying that to their observations while working with their assigned dog(s), and writing about it, could have enhanced their understanding more than would have been the case for the dog walking group. This is consistent with the findings that journaling has helped individuals in therapy gain better understanding of their own behaviors and behaviors of others, and can aid in self-reflection and better overall treatment outcomes (23, 24).

Participants in the Teacher's Pet group also showed significantly higher scores for both Attachment and Attitude Toward Program. The Attachment code included components such as having empathy for the dog, physical contact, affection for their assigned dog, writing about dogs' feelings or thoughts, writing about patience, and wanting to help their own dog or other dogs in general. The higher scores for attachment seen in writing by the Teacher's Pet group could be because those participants spent more time training a specific dog and learning and being more aware of their dog's characteristics and behavior problems. This in turn could have facilitated a strong bond with the dog, possibly aiding their understanding of their dog's feelings or thoughts and promoting feelings of empathy as well as more attachment to their dog. Through this, these participants may also have been more understanding toward these dogs and their situation as shelter animals without homes, and may have written about patience as they went through the process of training a dog and changing its behaviors. Youth who walked dogs did not have such long-lasting relationships with particular dogs; they typically handled more dogs without being able to stick with just one animal for several sessions, and thus attachment to and knowledge of particular animals were less likely to develop. However, that potential explanation is countered by the finding in the larger project that both groups of youth increased in a formal rating of their empathy, which ordinarily might be expected to be related to content of their writing (1).

In addition, participants in the Teacher's Pet group also showed significantly higher ratings for their Attitude Toward the Program, seen in positive statements such as writing about liking the program, what they learned, and training challenges and goals for their assigned dogs, compared to youth in the dog walking group. Perhaps participants in the Teacher's Pet group were more likely to write and reflect on what they learned in the program through training their assigned dogs. These participants, compared to the control group, were more likely to set goals regarding training or problem behaviors of the dogs. Thus, these participants likely wrote more about what they got out of the program through this process. This is consistent with research findings (31, 32) that writing about what occurs in therapy and reflecting on it can help individuals gain insight and process what is happening in treatment and what benefits them.

There were no meaningful effect sizes for Self-Awareness, which typically consisted of writing about having a relationship with the staff at the facility, showing insight into their own behavior, comparing themselves to their assigned dogs and vice-versa, writing about what they learned in the program, or mentioning emotions regardless of whether the emotions were related to the program. All participants were asked to write about how their day went, and scoring of positive and negative emotions in the journals was done whether participants wrote about their emotions related to the program or some other topic. This writing prompt may have provided an opportunity for self-reflection for participants in both groups. Moreover, both groups attended the classroom portion of the study and could have reflected on class activities, which did not vary by group.

There is another puzzling aspect to the journal writing. The journal entries were made throughout training, not just at the end of the intervention. Thus, differences in writing associated with the intervention emerged during the intervention, not necessarily as its overall final outcome. If the journaling content reveals inner effects of the animal-assisted intervention, those effects begin earlier and may build before behavioral changes are observable.

It may also be worth considering that the original categories that the coders thought would be found in youths' writing were not usable. The coders were not highly familiar with the intervention, and may have also lacked knowledge of adolescent characteristics, or made inaccurate assumptions about the youth in the study. Coder characteristics are not often studied and might have influenced their original ideas about what journal content would include. Future studies should look more closely at rater and coder characteristics to see how they match with what youth actually write about, as reactions to an intervention. Facility staff, who had a lot of contact with youth and thus presumably knew them well, did not see changes, but unfamiliar coders did see signs of change in the writing of the youth, that accorded with their group assignment. Facility staff ratings focused on behavior, whereas the writing more commonly reflected youths' internal thoughts. Changes in thinking and emotion may precede behavioral changes, or be unavailable to persons who have no access to youths' thoughts that could be found in their writing. Perhaps familiarity with youth being rated could alter rater or coder responses; this might particularly be the case among facility staff, where a great deal of disapproval and possible stigma might be attached to expectations of youth. Such negative expectations might not be present with a more neutral assessment, where raters and coders know only a little of the court-ordered placement of the youth.

Despite there being some areas of journal content that were similar for both groups, the results showing treatment effects offer interesting insights into the intervention. The journal content of participants in dog training therapy was rated more positively, on average, for several aspects of that writing. This underscores the idea that youth engaging in an animal-assisted therapy intervention may show, through what they write about, how the intervention has increased attachment, empathy, patience, and awareness of what they have learned. Our findings highlight how journaling can demonstrate progress through treatment, and show attitudes, thought patterns, and emotions that many who work with troubled youth think are valuable outcomes, but which are very hard to observe in youth behavior. If so, this suggests that knowing the content of the expressive writing of youth as they work with animals may reveal the depths of how human-animal interaction influences the adolescents in a positive direction, before behavior change is evident (18).

### Limitations

There were several limitations to this study. The number of participants with journals rated was smaller than the number of youth in the larger project. Although youth whose journals were available for rating appeared to be similar to the larger group, the missing journals cannot be ruled out as a cause of the differences in outcomes for this study compared to the larger project. It is possible, though not likely, that the writing of the youth whose journals were not available for analysis was sufficiently different from the journals analyzed for this report that the group differences would not be present had all journals been rated.

It is not clear how these results are related to the outcomes of the larger study. It would be ideal had all the journals been analyzed for content, allowing direct comparison of outcomes for the full number of participants who received the intervention or the control. In addition, there was no follow-up of these participants. It is not known whether the positive outcomes seen in the writing of the experimental group do in fact predict better longer term outcomes, such as lower rates of recidivism. Although the youth in this study were generally representative of the facilities from which they were drawn, it is not known to what degree they represent at-risk incarcerated youth nationwide. The AAT in this project, Teacher's Pet, is a long-running, well-developed intervention that has highly trained staff. Other animal-assisted interventions may not have the same degree of training and experience, which could be keys to success of treatments in general for mental health and behavior problems.

Finally, the uneven randomization due to facility constraints may have impacted the results, even though the group sizes for the 73 youth participants in this examination of journal writing were more balanced than was the case for the 138 youth in the original project. This limitation is a function of conducting research with community partners, for which programmatic, staffing, and scheduling resources are natural priorities. Similar imbalances favoring more youth assigned to treatment are common in studies of conduct disordered youth (26). Researchers must continue to balance internal and external validity needs with the reality of conducting real-world investigations that can be affected by many competing priorities.

### Implications and Recommendations

Given the simplicity of including journaling as a way to assess changes due to treatment, and the relatively low cost of animal-assisted interventions, programs such as Teacher's Pet show promise for expanding the range of interventions for incarcerated youth. Such programs are also worth assessing for use with persons, including youth, who for other reasons cannot receive the typical model of therapy that rests on one person with one therapist at a time, over an indefinite period. The efficiency of group interventions, along with youth attraction to dogs, makes such interventions an attractive option for incarcerated youth, as well as for community-based treatment programs.

Recommendations for future research include studying the journal content of additional at-risk adolescents engaged in a similar intervention and analyzing common themes in the journals along with self-reported scores for behavioral problems pre- and post-assessment, in order to evaluate behavioral progress. It could be beneficial to interview participants pre- and post-assessment in order to assess how the content of their writing relates to other youths' perceptions of their responses to intervention.

As it was primarily undesirable behavior that caused the youth in this project to be incarcerated, more study is needed to see whether animal-assisted interventions are effective in changing such youths' behavior after they are released. Using journaling as an adjunct to assessing other outcomes of treatment could show changes in motivations and attitudes that lead to successful post-incarceration adjustment. Long-term follow-up is therefore an essential need for future studies. Only by seeing whether the positive effects such as those found here in journal writing predict a lack of recidivism will it be clear that such interventions, assessed through journal writing, are accurately predictive of long term positive outcomes.

The findings of this study highlight the potential usefulness of journaling as a method for the assessment of treatment. This could be particularly valuable as a way to evaluate an active, very participatory intervention such as the animal-assisted therapy featured in this study, in which the participants trained dogs or worked closely with them. By analyzing the content of the journals of these participants, it appears that this animal-assisted therapy facilitated empathy, attachment, behavioral insight, and patience in the youth, outcomes that were shown only by examining the writing of the youth in treatment. This demonstrates that animal-assisted therapy can facilitate some changes in cognitive as well as emotional attitudes in youth that are not readily observable in their overt behavior. Those changes in turn can extend into improved interpersonal relationships and empathy toward others among the youth, after the intervention is over (33–35).

Journaling can be a cost-effective and useful way to assess the effects of treatments given to vulnerable adolescents, facilitating knowledge of treatment outcomes in a population that is still developing cognitively, who may not show therapeutic changes visibly, although changes are resulting from the treatment. Similar interventions including animals as a key feature of the treatment along with journaling as an assessment of responses to therapy, are promising for the possibility that they can ultimately show that treatments reduce recidivism and improve the quality of life for at-risk youth.


### AUTHOR CONTRIBUTIONS

RC was one of the primary leaders of the original study and co-principal investigator of the NIH grant that funded the project. AC was the principal investigator of the NIH grant for the original study. AJ was one of the co-principal investigators of the original, NIH-funded project. She is the leader of the Teacher's Pet AAT program, which was central to the treatment condition in this project. TS was a research assistant to this project. She was central to the development and execution of the journal coding system reported in this project. NS was a graduate research assistant to this project; he analyzed data and reported on the larger project from which this study was developed. DA was a graduate research assistant to this project. She was involved in the planning of the project, and was a leader in the collection of data and interaction with shelter staff and coders for the project.

### FUNDING

This research was supported by Grant #R03HD070621 from The Eunice Kennedy Shriver National Institute of Child Health & Human Development and Mars-WALTHAM<sup>®</sup>.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Syzmanski, Casey, Johnson, Cano, Albright and Seivert. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Replication Pilot Trial of Therapeutic Horseback Riding and Cortisol Collection With Children on the Autism Spectrum

Zhaoxing Pan<sup>1,2</sup>, Douglas A. Granger<sup>3,4</sup>, Noémie A. Guérin<sup>5</sup>, Amy Shoffner<sup>1</sup> and Robin L. Gabriels<sup>1,2</sup>\*

<sup>1</sup> Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, CO, United States, <sup>2</sup> Children's Hospital Colorado, Aurora, CO, United States, <sup>3</sup> Institute for Interdisciplinary Salivary Bioscience Research, University of California, Irvine, Irvine, CA, United States, <sup>4</sup> Bloomberg School of Public Health and School of Nursing, Johns Hopkins University School of Medicine, Baltimore, MD, United States, <sup>5</sup> Center for the Human-Animal Bond of Purdue, College of Veterinary Medicine, Purdue University, West Lafayette, IN, United States

#### Edited by:

Peggy D. McCardle, Consultant, New Haven, CT, United States

#### Reviewed by:

Andrea Beetz, University of Rostock, Germany; Aviva Vincent, Case Western Reserve University, United States; Caiti Peters, Colorado State University, United States

\*Correspondence:

Robin L. Gabriels robin.gabriels@childrenscolorado.org

#### Specialty section:

This article was submitted to Veterinary Humanities and Social Sciences, a section of the journal Frontiers in Veterinary Science

Received: 17 April 2018 Accepted: 26 November 2018 Published: 14 January 2019

#### Citation:

Pan Z, Granger DA, Guérin NA, Shoffner A and Gabriels RL (2019) Replication Pilot Trial of Therapeutic Horseback Riding and Cortisol Collection With Children on the Autism Spectrum. Front. Vet. Sci. 5:312. doi: 10.3389/fvets.2018.00312

We aimed to determine whether results of our prior randomized control trial [RCT; NCT02301195, (1)] of Therapeutic Horseback Riding (THR) for children and adolescents with autism spectrum disorder (ASD) could be replicated at a different riding center and whether treatment effects also included differences in the expression of associations between problem behavior and the activity of the hypothalamic-pituitary-adrenal (HPA) axis. Participants with ASD (N = 16) ages 6–16 years were randomized, with stratification by nonverbal intelligence quotient, to either a 10-week THR group (n = 8) or a no-horse-interaction barn activity (BA) control group (n = 8). Outcome measures were a standard speech-language sample and caregiver report of aberrant and social behaviors. Participants' saliva was sampled weekly at a consistent afternoon time immediately pre-condition and 20 min post-condition (later assayed for cortisol). Intent-to-treat analysis revealed that, compared to controls, THR participants had significant improvements in hyperactivity and social awareness, and significant improvements at the 0.1 significance level in irritability and social communication behaviors. There were no significant improvements in number of words or new words spoken during the standard language sample. Linear mixed effects model analysis indicated that greater weekly pre-lesson irritability levels were associated with smaller post-lesson reduction in salivary cortisol levels, and greater weekly pre-lesson hyperactivity levels were associated with smaller cortisol reduction in the THR group, but not in the BA control group. The findings represent a partial replication of prior results (1), extend prior observations to include THR effects on biobehavioral relationships, and suggest that cortisol could be a target mediator for THR effects on irritability and hyperactivity behaviors in youth with ASD.

Clinical Trial Registration: Trial of Therapeutic Horseback Riding in Children and Adolescents with Autism Spectrum Disorder; http://clinicaltrials.gov, identifier: NCT02301195

Keywords: autism spectrum disorder, equine-assisted activities and therapies, human-animal interaction, therapeutic horseback riding, salivary cortisol

## INTRODUCTION

In addition to core impairments in social and communication skills, restricted interests, and repetitive behaviors (2), individuals with autism spectrum disorder (ASD) have high rates of co-existing psychiatric symptoms that include anxiety, depression, irritability, and attention-deficit and hyperactivity disorder (3–11). Such co-existing conditions can impair functioning, which puts this population at risk of engaging in dangerous aberrant behaviors (12) (e.g., aggression and self-injury) and of seeking costly crisis psychiatric care services (e.g., emergency department and inpatient hospitalization) (13, 14). To proactively address the core impairments and aberrant behaviors unique to individuals with ASD, one increasingly popular intervention is animal-assisted intervention (AAI) (15, 16).

Systematic reviews of the literature reflect a recent increase in the quantity and quality of research on AAI with the pediatric population of individuals with ASD (15, 16). Most studies of AAI programs for ASD comprise 8–12 weekly sessions, and the most commonly reported outcome is improved social interactions. Horses are the most common species included in AAI research through the practice of therapeutic horseback riding (THR) (16). In 2015, Gabriels et al. conducted the first large-scale randomized clinical trial of THR for children with ASD, with 127 participants ages 6–16 (1). Compared to participants in a barn activity (BA) control group, participants in a 10-week THR intervention made significant improvements in symptoms of irritability and hyperactivity as measured by the Aberrant Behavior Checklist-Community (ABC-C) (17), improvements in core symptoms of autism (e.g., social cognition and social communication) measured by the Social Responsiveness Scale (SRS) (18), and improvements in word fluency (e.g., total number of words and new words spoken) measured by a standardized language sample (1). A more recent study of THR replicated use of the ABC-C (17) to measure outcomes in a sample of 26 children with ASD (19). This study found that children participating in five to seven 45-minute weekly riding lessons, compared to a control group receiving treatment as usual, improved on the ABC-C (17) Hyperactivity scale, but not on the Irritability scale (19). It is promising that Harris and Williams (19) attempted to replicate the irritability and hyperactivity outcomes previously observed by Gabriels et al. (1); however, the advancement of the AAI field requires more methodological standardization and replication of methods to confirm the efficacy of THR on outcomes in children with ASD (15, 16, 20, 21). Additionally, improved methodological rigor can lead to an increased understanding of the mechanisms, such as physiological arousal levels, that might help explain observed benefits of, for example, THR on children with ASD.

The field of AAI has historically claimed that interacting with animals can reduce an individual's arousal level to dampen stressed/anxious states. There are a number of AAI studies that have observed favorable autonomic response patterns using physiological measures (e.g., cortisol, cardiovascular, electrodermal) in individuals when they are engaged with animals, providing support for the assertion that AAI can produce a regulated state of arousal (22).

In the ASD population, poorly regulated emotional/arousal states tend to manifest as symptoms of stress/anxiety, depression, irritability, and hyperactivity, which are particularly prevalent (11, 23). Specifically, irritability behaviors in the ASD population have been characterized as heightened emotional (e.g., anger) and behavioral (e.g., aggression, severe tantrums, self-injury) reactivity (24), behaviors that often require high levels of intensive interventions. Given this information, it is reasonable to hypothesize that elements inherent in THR may activate a physiological state of regulation that leads to beneficial outcomes such as reductions in irritability behaviors.

Our understanding of the effects of AAI on physiological arousal, including the reactivity and regulation of environmentally sensitive biological systems such as the hypothalamic-pituitary-adrenal (HPA) axis, and of the association of these AAI-related changes in physiology with behavior in the context of ASD, is in its infancy.

The HPA axis is one of the two main components of the psychobiology of the stress response, and its primary product, cortisol, can be measured accurately in saliva using minimally invasive collection methods. An extensive literature reveals changes in cortisol in response to novelty, defeat, and social evaluative threat, and these changes are most pronounced when individuals do not have prior experience or sufficient coping skills or resources to adapt to those events by changing their actions or thoughts [see for review (25)]. A 2014 review of cortisol investigations in the ASD population reported that individuals with high rates of irritability behaviors show a more sluggish response of the HPA axis to stressors (26). A similar finding was reported in a study of high functioning (HF) boys with ASD: those who endorsed having high levels of irritability had cortisol levels that were lower/less responsive to a psychosocial stress test than those of HF boys with ASD who endorsed having lower levels of irritability (27). These recent study findings raise questions about the role of irritability in influencing the physiological response patterns (e.g., HPA axis) in the ASD population.

The handful of studies of HPA axis reactivity and regulation in ASD suggest that, compared to typically-developing children, children with ASD experience higher HPA axis reactivity to daily stressors (28, 29). Understanding whether the effects of AAI are revealed at both the behavioral surface and the level of fast-acting, environmentally sensitive biological systems, like the HPA axis, may be key to advancing our understanding of individual differences in, or the degree of short- versus longer-term, benefits of AAI in the context of ASD. An RCT on typically-developing adolescents found that, compared to a control group, adolescents participating in an 11-week equine-facilitated learning (EFL) program had lower basal salivary cortisol levels (30). One study examining the effect of service dogs on salivary cortisol levels of 42 children with ASD found that having a service dog led to significantly lower cortisol awakening responses (CAR), but did not influence average diurnal cortisol levels (31). In eight male children with ASD, hippotherapy led to reduced cortisol after riding compared to before riding, apart from the first riding session, which may represent the stressful effect of getting used to a new environment and riding for the first time (32). Overall, it seems that AAI may have a direct, at least short term, effect on reactivity and regulation of the HPA axis.

In the present study, the first aim was to implement a previously reported THR intervention model from a large scale RCT in a different THR riding center to examine its feasibility and effectiveness (1). The second aim was to extend the findings of Gabriels et al. (1) not only by replicating effects of THR on ASD-related aberrant behavior, but also by examining treatment effects on levels of cortisol before and 20 min after THR, and on the expression of the association between cortisol and ASD-related aberrant behavior.

## MATERIALS AND METHODS

### Participants

For this IRB-approved study, participants were recruited via inpatient hospital and out-patient therapy services, schools, and ASD-parent groups. Participant inclusion criteria replicated those reported by Gabriels et al. (1): Ages 6–16 years; a diagnosis of ASD confirmed [i.e., meeting the cut-off of ≥15 on the Social Communication Questionnaire (SCQ) (33) and meeting the empirically-derived cutoffs for ASD or Autism on the Autism Diagnostic Observation Schedule-2nd Edition (ADOS-2) (34)]; a combined total score of >11 on the Irritability and Stereotypy subscales of the Aberrant Behavior Checklist-Community (ABC-C) (17); and a nonverbal IQ (NVIQ) score of ≥40 standard score measured by the Leiter-3 (35). Exclusion criteria also included a screening for contraindications based on guidelines from the Professional Association of Therapeutic Horsemanship International (PATH Intl.) Standards for Certification and Accreditation (36). Contraindications included medical or behavioral concerns that might make it dangerous to participate in the horseback riding activity such as uncontrolled seizures, or a history of animal abuse. Participants were also excluded if they had participated in a THR intervention within 6 months prior to entering the study, weighed 200 pounds or more, exceeding the riding center's policies to ride a horse, or if they were taking steroid medications, as steroids might confound cortisol results. See **Figure 1** for screening and enrollment information.
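The inclusion and exclusion rules above can be read as a single screening check. The following is a minimal sketch only, not the study's actual screening procedure: the `Candidate` class, its field names, and the `is_eligible` function are hypothetical, and the treatment of boundary cases (for example, exactly 6 months since prior THR) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Screening data for one child. All field names are hypothetical."""
    age_years: int
    scq_total: int               # Social Communication Questionnaire score
    ados2_meets_cutoff: bool     # ADOS-2 classification of ASD or Autism
    abc_irritability: int        # ABC-C Irritability subscale score
    abc_stereotypy: int          # ABC-C Stereotypy subscale score
    nviq: int                    # Leiter-3 nonverbal IQ standard score
    weight_lb: float
    months_since_thr: Optional[float]  # None if no prior THR participation
    on_steroids: bool
    path_contraindication: bool  # e.g., uncontrolled seizures, animal abuse


def is_eligible(c: Candidate) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    included = (
        6 <= c.age_years <= 16
        and c.scq_total >= 15
        and c.ados2_meets_cutoff
        and (c.abc_irritability + c.abc_stereotypy) > 11
        and c.nviq >= 40
    )
    excluded = (
        c.path_contraindication
        # Assumption: "within 6 months" is read as strictly fewer than 6.
        or (c.months_since_thr is not None and c.months_since_thr < 6)
        or c.weight_lb >= 200
        or c.on_steroids
    )
    return included and not excluded
```

Writing the thresholds out this way makes it easy to see that the ABC-C criterion applies to the sum of two subscales rather than to either subscale alone.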

## Study Design

### Screening Visit I

Interested caregivers and participants were engaged in an IRB-approved informed consent/assent and screening process at the first authors' institution setting before traveling to the riding center for a second level screening. During this first screening, caregivers completed demographic, diagnostic and behavior rating forms regarding their child that included the SCQ (33), ABC-C (17) and the Spence Children's Anxiety Scale-Parent Version (SCAS-P) (37). Participants completed the Leiter-3 (35) and ADOS-2 (34). Additionally, participants and their caregivers were instructed (via demonstration and hands-on practice) how to collect saliva samples, provided with visual food cues to help stimulate saliva production and informed that the child participant needed to avoid eating, drinking or brushing teeth for at least 30 min before all sample collections occurred at the riding center.

**Figure 1** notes: <sup>a</sup>ABC-C was not returned for one participant. <sup>b</sup>One participant completed 5 sessions only. <sup>c</sup>One participant completed 2 sessions only. <sup>d</sup>Of the 7 participants, one had no post-treatment SALT evaluation but had ABC-C and SRS data. <sup>e</sup>Of the 7 participants, one had no post-treatment ABC-C data but had SALT and SRS data.

### Randomization

Participants meeting inclusion criteria were then randomized into either the intervention group (THR) or the Barn Activity (BA) control group with no horse contact, stratified by NVIQ (≤85 or >85).
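As an illustration of what stratifying by NVIQ means in practice, the sketch below assigns participants to arms separately within each stratum. The text does not describe the allocation procedure actually used, so the alternating scheme, the function names, and the seed parameter are all assumptions made for the example.

```python
import random
from collections import defaultdict


def nviq_stratum(nviq: int) -> str:
    """Return the stratum label for the two NVIQ bands named in the text."""
    return "NVIQ<=85" if nviq <= 85 else "NVIQ>85"


def stratified_assign(nviq_by_id: dict, seed: int = 0) -> dict:
    """Assign each participant ID to 'THR' or 'BA' within NVIQ strata.

    Hypothetical scheme: shuffle the members of each stratum, then
    alternate arms so every stratum is split as evenly as possible.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for pid, nviq in nviq_by_id.items():
        strata[nviq_stratum(nviq)].append(pid)

    assignment = {}
    for members in strata.values():
        rng.shuffle(members)
        arms = ["THR", "BA"]
        rng.shuffle(arms)  # random choice of which arm is assigned first
        for i, pid in enumerate(members):
            assignment[pid] = arms[i % 2]
    return assignment
```

Stratifying in this way keeps the two arms comparable on nonverbal IQ even in a small sample, which simple randomization cannot guarantee.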

### Screening Riding Center for THR Research Site

This replication trial took place at a therapeutic riding center located in a rural setting in the foothills of northern Colorado, approximately 1 h driving time from Wyoming. This riding center has been operating since 1997 and has maintained Premier Accreditation through PATH (Professional Association of Therapeutic Horsemanship) International since 2002. This Premier Accreditation status is the highest level of accreditation in the field of equine assisted activities and therapies (EAAT) and requires the facility to follow rigorous and comprehensive standards across all aspects of programming, including safety and animal welfare. This facility has 23 acres, two indoor arenas, a large outdoor arena and a large sensory trail. The riding center was evaluated for appropriateness to conduct research based on a standardized site review. The research site review screening addressed the need for consistent, high quality programming for the duration of the 10-week intervention. During an on-site observation with research staff, the riding center confirmed it was able to provide an appropriate indoor/outdoor facility, horses sound in mind and body, trained volunteers, and staff qualified to work with riders with ASD.

### Screening Visit II: Riding Center

After participants' medical clearance forms were completed by and received from their physicians and caregivers, participants met with their assigned group leader at the riding center for an adaptive functioning screen. This screening visit involved an interview with the participant and caregiver about the participant's strengths and needs as well as a standardized 10 min direct observational assessment of the participant's adaptive skills. For the THR group this involved a 10-min horseback riding activity and for the BA control group, a drawing activity about horses.

### Intervention Fidelity

Before initiating interventions, riding center instructors and volunteers participated in a 2-h presentation reviewing methods for working with children with ASD in the riding center environment. This presentation was delivered by the on-site coordinator (second author, who was a certified Advanced PATH International therapeutic riding instructor). Prior to the intervention phase of this study, this coordinator also trained the two riding center THR group instructors on the manual-based (38) methods for conducting the 10-week THR intervention and provided on-site observation of instructor implementation of 20% of the THR lessons to measure intervention fidelity. BA control group instructor implementation of 20% of lessons was also observed and measured using this same fidelity tool by the senior author, who was 80% reliable with the on-site coordinator on three consecutive THR lessons (38).
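The text does not state how the 80% reliability figure was calculated. One common approach is item-by-item percent agreement between two observers scoring the same fidelity checklist; the sketch below shows that calculation under this assumption, with a hypothetical function name and hypothetical checklist scores.

```python
def percent_agreement(rater_a: list, rater_b: list) -> float:
    """Percentage of checklist items on which two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty item list")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical example: two observers score 10 fidelity items as met/not met.
coordinator = [True] * 10
senior_author = [True] * 8 + [False] * 2
agreement = percent_agreement(coordinator, senior_author)
```

Percent agreement does not correct for agreement expected by chance, so chance-corrected statistics are often reported alongside it.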

### Intervention and Control Groups

Both the 10-week THR and the BA control group interventions were 45 min in length and involved two to four participants per group, with at least one volunteer assigned to assist each participant. The content of the THR and BA control groups was consistent for each of the 10 weekly lessons and included information about horses and horse care as described in the manual (38). However, the control group did not have interactions with horses; rather, participants were only exposed to a pony-sized stuffed horse, which they used to practice activities such as grooming and tacking. Both groups were led by a THR instructor and employed teaching methods consistent with best practices for children with ASD that included use of consistent routines, visual schedules, demonstration, and other concrete visual cues to enhance comprehension of information and expectations. Both the THR and control groups were 45 min in length and involved the following general schedule of routines:


Of note, the control group leader and co-leader were the same as those who led the control group in the previous RCT (1). The THR and BA control groups occurred simultaneously (same day and afternoon times) at the riding center.

### Outcome Measures

## Baseline and post-intervention Measures

#### **Systematic Analysis of Language Transcripts (SALT)**

Within one month pre- and post-THR and control group interventions, a study speech therapist blind to participants' condition group assignment conducted a five-minute language sample with each participant using the Systematic Analysis of Language Transcripts (SALT) (39). The SALT (39) provides standard guidelines to elicit, transcribe, and analyze language samples from individuals, including those diagnosed with ASD. Language samples were transcribed from recordings and then entered into the SALT language analysis program to compute vocabulary diversity. The SALT (39) was an outcome measure used and described in the previous RCT (1).

### **Social Responsiveness Scale (SRS)**

Additionally, within 1 month pre- and post-intervention, a consistent caregiver for each participant completed the Social Responsiveness Scale (SRS) (18) about their child's social behaviors. The SRS measures social impairments of ASD and includes five subscales (Social Awareness, Social Cognition, Social Motivation, Social Communication, and Autistic Mannerisms) (18). The SRS was an outcome measure also described in the previous RCT (1, 18).

### Intervention Phase Measures

### **Aberrant Behavior Checklist–Community (ABC-C)**

During the 10-week intervention phase of this study, the identified consistent caregiver for each participant completed the ABC-C (17) form to report on the participant's behavior observed during the week preceding each group lesson (THR or control). The subscales of the ABC-C include Irritability, Lethargy/Social Withdrawal, Stereotypy, Hyperactivity, and Inappropriate Speech, and items are rated on a 0-3 Likert-type severity scale. This 58-item symptom checklist was the primary outcome measure in the previous RCT (1), in which it demonstrated significant changes in participants of the THR group.

#### **Saliva collection and determination of cortisol**

Immediately before each THR session and 20 min following each session, study personnel collected saliva samples from participants (THR and control) using an absorbent swab specifically designed for use with children (SalivaBio, Carlsbad, CA). These collection times occurred at a consistent afternoon time (between 1:00 and 5:00 PM) when diurnal cortisol levels typically decline (40). The first sample was collected immediately before the groups when participants were seated with their volunteers either on a bench in the arena (THR group) or at a group table (BA control group). Participants were instructed to mouth the foam rod for 1 min. A mini 1-min sand timer was given to each participant to provide a visual reference and enable them to track the collection time. The second saliva sample was collected 20 min after the conclusion of the standard 45-min THR or BA control group lessons (i.e., after dismounting the horse for the THR group and completing a review of things learned for the BA control group). Our method of collecting cortisol 20 min post-intervention is supported by previous findings that there is a 5- to 20-min lag in the detection of salivary cortisol (41). For the second collection, participants followed the same procedure as described above while seated at a table with their respective small groups and engaged in coloring or painting pictures. Each group (THR and control) sat in a separate room and did not have contact with the other. All samples were immediately frozen and shipped frozen to the Institute for Interdisciplinary Salivary Bioscience Research (IISBR) laboratory for analyses. Following methods described by Granger et al. (25), all saliva samples were assayed for cortisol using a commercially available immunoassay specifically designed for use with saliva, without modification to the manufacturer's recommended protocol (https://www.salimetrics.com/assay-kits/#tab1; Salimetrics, Carlsbad, CA; Cat #1-3002). On the day of assay, samples were thawed, centrifuged to remove mucins, and assayed in duplicate. The sample test volume was 25 µl, the range of calibrators was 0.01 to 3.0 µg/dL, and the lower limit of sensitivity was 0.007 µg/dL. On average, inter- and intra-assay coefficients of variation were less than 10 and 5%, respectively. The average of the duplicate assays for each sample was used in the statistical analyses. Units for cortisol are expressed in micrograms per deciliter (µg/dL).
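To make the duplicate-assay step concrete, the following is a minimal Python sketch of averaging two wells per sample and computing a coefficient of variation (CV) for the pair. The CV definition used here (sample standard deviation divided by the mean of the duplicates) is a common convention and an assumption on our part; the text does not state how the laboratory computed it. The function name and the readings are hypothetical and are not study data.

```python
from statistics import mean, stdev


def summarize_duplicates(well_a, well_b):
    """Return (mean, CV in percent) for one sample assayed in duplicate."""
    values = [well_a, well_b]
    avg = mean(values)
    # Assumed convention: CV = sample SD of the duplicates / their mean.
    cv_percent = 100.0 * stdev(values) / avg
    return avg, cv_percent


# Hypothetical duplicate readings in ug/dL (illustrative only, not study data).
duplicates = {"pre": (0.110, 0.114), "post": (0.070, 0.074)}
summary = {
    label: summarize_duplicates(a, b) for label, (a, b) in duplicates.items()
}

for label, (avg, cv) in summary.items():
    print(f"{label}: mean = {avg:.3f} ug/dL, CV = {cv:.1f}%")
```

In this sketch the averaged value is what would enter the statistical analysis, while the CV serves only as a quality check on the pair of wells.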

### Data Analysis

All analyses were conducted using SAS 9.4 software (SAS Institute Inc.<sup>1</sup>). Demographic, diagnosis, and baseline data were compared using Student t-tests and Fisher's exact tests for continuous and categorical variables, respectively. The primary intent-to-treat (ITT) analyses included data collected within 1 month pre- and post-THR and control group interventions (or the pre-session level of salivary cortisol at the first and last week of intervention) and used a linear mixed effects model (LMM) without any data imputation. The LMM consists of the baseline and post-evaluation values as outcome measures; evaluation time (baseline or post-evaluation), group (THR or control), and their interaction term as fixed effects; and an unstructured covariance. The test of the time by group interaction term was used to assess the statistical significance of THR effectiveness. Effect size was calculated as (2 × t value)/√DF, from the contrast of the time by group interaction. Sensitivity analyses were conducted to see how robust the conclusions were, including: (a) repeating the ITT primary analyses among participants who completed at least 80% of THR or BA lessons, (b) testing the effectiveness using the LMM while adjusting for age, NVIQ, and baseline anxiety score, and (c) fitting a linear mixed model to all the weekly ABC-C (17) data and testing the time by group interaction. The weekly immediate change in salivary cortisol level after an intervention lesson was compared between the two groups using an LMM. The association of this immediate cortisol change with irritability and hyperactivity was examined using an LMM. The fidelity of the THR treatment implementation was computed as a percentage of the eight intervention component ratings. The Irritability subscale of the ABC-C (17) was deemed the primary outcome. No adjustment for multiple secondary outcome variables was applied.
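The effect size formula stated above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' SAS code; the function name and the example t value and degrees of freedom are hypothetical and are not taken from the study.

```python
from math import sqrt


def effect_size_from_t(t_value, df):
    """Effect size computed as (2 x t) / sqrt(DF), as stated in the text."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return 2.0 * t_value / sqrt(df)


# Hypothetical contrast: t = 2.6 on 14 degrees of freedom (not study values).
es = effect_size_from_t(2.6, 14)
print(f"effect size = {es:.2f}")
```

Because the denominator is the square root of the degrees of freedom, the same t value yields a larger effect size in a smaller sample, which is worth keeping in mind when reading effect sizes from a pilot of this size.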

### Power of the Study

This study was a pilot study to replicate the RCT (1) in a new riding center and was not powered to detect a specific effect size. A sample size of 16 (8 per arm) allows detection of an effect size of 1.5 common standard deviations with 80% power at the 5% significance level.
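The power statement can be checked roughly with a normal approximation, sketched below in Python. The text does not say which method the authors used, so the approximation, the function name, and the two-sided test are assumptions. A normal approximation tends to overstate power somewhat relative to a t-based calculation in samples this small, so a result a little above 80% is in line with the stated figure.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(effect_size, n_per_arm, alpha=0.05):
    """Normal-approximation power of a two-sided, two-sample comparison."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    # Standardized mean difference scaled by the per-arm sample size.
    noncentrality = effect_size * sqrt(n_per_arm / 2.0)
    # The lower rejection tail is negligible here and is ignored.
    return z.cdf(noncentrality - z_crit)


power = approx_power_two_sample(effect_size=1.5, n_per_arm=8)
print(f"approximate power = {power:.2f}")
```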

### RESULTS

### Preliminary Analyses

Of the 17 potential participants screened, 16 (94%) met study inclusion criteria and were enrolled in this trial and randomized (see **Figure 1**). Of note, 75% of this sample had community-based psychiatric diagnoses. Every participant in the THR group and

TABLE 1 | Characteristics of Participants.


<sup>a</sup>Two-tailed p-value from two-sample t-test or Fisher's exact test, as appropriate.

(µg/dL). Data points of the same symbol are from the same participant.

<sup>1</sup> SAS Institute Inc., "SAS." (Cary, NC).

TABLE 2A | Analysis of efficacy of Therapeutic Horseback Riding (THR) (n = 8) compared to the Barn Activity (BA) control (n = 8)<sup>a</sup> .


<sup>a</sup>Analyses included all participants who were randomized and had a baseline and/or End of Treatment (EoT) assessment. Cortisol assessed at intervention week one and the last week (week 9 or 10) for the THR (n = 7) and BA control (n = 7) groups was used to approximate baseline and EoT cortisol levels. Sample means and standard deviations are reported for baseline and EoT. Means and standard errors of change and the time by group interaction are from mixed effects model analysis of baseline and EoT data for all the outcome variables. The mixed effects model consists of time (baseline/EoT), group (THR/BA control), and their interaction as fixed effects and an unstructured covariance. The test of the time by group interaction (i.e., THR minus BA control in change from baseline) is used to assess the efficacy of THR.

<sup>b</sup>Irritability subscale is deemed the primary efficacy outcome in this study.

<sup>c</sup>Effect size is calculated as (2 × t value)/√DF from the contrast of the time by group interaction.

<sup>d</sup>p-values < 0.05 are in boldface.

four participants in the control group had one or more psychiatric diagnoses. On the ABC-C (17) measure, participants in the THR group had more stereotypy behaviors and were more irritable and hyperactive at baseline. On the SCAS-P, participants in the THR group had higher scores on the Panic/Agoraphobia subscale. The groups did not differ otherwise at baseline (see **Tables 1**, **2**). Five THR participants completed all 10 THR lessons, two completed nine lessons, and one completed five lessons. Two BA control group participants completed all 10 intervention lessons, four completed nine lessons, one completed eight lessons, and one completed one lesson.

## Intervention Fidelity

### THR Group

The average overall fidelity rating for the THR group was 92.22%, with average ratings in the four domains as follows: Teaching Techniques & Class Structure 88.32%; Volunteers 100%; Environment 100%.

#### Control Group

The average overall fidelity rating for the control group was 93.47%, with average ratings in the four domains as follows: Teaching Techniques & Class Structure 95.37%; Volunteers 80.55%; Environment 95.83%.

### Clinical Outcomes

**Tables 2A**,**B** show the effectiveness of the THR intervention compared to the BA control group for the primary (ABC-C) and secondary (SRS, SALT, salivary cortisol) outcome variables. **Figure 2** shows the mean response patterns of the six outcome variables on which THR demonstrated favorable effect from the original RCT (1).

#### Primary Outcome Variable (ABC-C)

Participants in the THR group had lower average post-treatment Irritability and Hyperactivity subscale scores, while participants in the BA control group had higher average post-treatment scores for both subscales, as compared to the baseline values (see **Figures 2A,B**). The between-treatment difference in post-treatment change was significant on the Hyperactivity subscale (es = 1.49, p = 0.02) and significant at the 0.1 significance level on the Irritability subscale (es = 1.08, p = 0.08), indicating that THR participants made more improvements from baseline to post-treatment on both outcomes compared to BA control group participants. Moreover, a consistent result was found from the LMM analysis with baseline panic agoraphobia score as a covariate for irritability (p = 0.09) and hyperactivity (p = 0.02). Although not statistically significant, a larger panic agoraphobia score was associated with a larger irritability score but a smaller hyperactivity score. If age and non-verbal IQ were



<sup>a</sup>Analyses included all participants who were randomized and had a baseline and/or End of Treatment (EoT) assessment among those who completed 80% of intervention lessons. Sample means and standard deviations are reported for baseline and EoT. Means and standard errors of change and the time by group interaction are from mixed effects model analysis of baseline and EoT data for all the outcome variables. The mixed effects model consists of time (baseline/EoT), group (THR/BA control), and their interaction as fixed effects and an unstructured covariance. The test of the time by group interaction (i.e., THR minus BA control in change from baseline) is used to assess the efficacy of THR.

<sup>b</sup>Irritability subscale is deemed the primary efficacy outcome in this study.

<sup>c</sup>Effect size is calculated as (2 × t value)/√DF from the contrast of the time by group interaction.

<sup>d</sup>p-values < 0.05 are in boldface.

adjusted in the LMM, significant effectiveness of THR was found for both irritability (p = 0.037) and hyperactivity (p = 0.013). The time courses of the weekly Irritability and Hyperactivity scales (see **Figures 2C,D**) were also analyzed using an LMM. Statistical tests of the time by treatment interactions were significant (p = 0.016 for the Irritability and p = 0.0005 for the Hyperactivity subscales). For the Irritability subscale, baseline and post-treatment means (SEM) estimated by the LMM were respectively 21.75 (3.88) and 16.19 (3.96) for THR participants and 10.47 (3.95) and 17.28 (4.05) for BA control group participants, resulting in a between-treatment difference in change from baseline of 12.36 (3.97), which was statistically significant (p = 0.0023). For the Hyperactivity subscale, baseline and post-treatment means (SEM) were respectively 22.5 (3.03) and 17.67 (3.11) for THR participants and 16.22 (3.10) and 24.82 (3.19) for BA control group participants; the corresponding between-treatment difference in change from baseline was 13.42 (3.47), which was statistically significant (p = 0.0002). There was no significant difference between the two groups on any of the other ABC-C (17) subscales.
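The between-treatment difference in change from baseline reported above is a difference of differences, and the arithmetic can be reproduced from the LMM-estimated means in the text. The short Python sketch below does this; the function name and the sign convention (BA change minus THR change, giving positive values as in the text) are assumptions. Recomputing from the rounded means gives values within about 0.01 of the reported 12.36 and 13.42, a gap consistent with rounding of the displayed means.

```python
def difference_in_change(thr_pre, thr_post, ba_pre, ba_post):
    """Between-group difference in change from baseline.

    Assumed sign convention: BA control change minus THR change, so a
    positive value means the THR group improved more (scores fell more).
    """
    thr_change = thr_post - thr_pre
    ba_change = ba_post - ba_pre
    return ba_change - thr_change


# LMM-estimated means as reported in the text above.
irritability = difference_in_change(21.75, 16.19, 10.47, 17.28)
hyperactivity = difference_in_change(22.5, 17.67, 16.22, 24.82)

print(f"Irritability: {irritability:.2f}, Hyperactivity: {hyperactivity:.2f}")
```

The standard errors and p-values of these contrasts come from the fitted model and cannot be recovered from the means alone.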

To examine the robustness of these primary analyses, the same analysis was repeated among the THR (n = 7) and BA (n = 7) participants who each completed 80% or more of the intended sessions. These analyses produced the same results for effects on the Irritability and Hyperactivity subscales (**Table 2B**).

### Secondary Outcome Variables SRS, SALT, and Salivary Cortisol

**SRS**

For the SRS (18), the THR group had greater improvements on the Social Communication (p = 0.08) and Social Awareness (p = 0.02) subscales compared to the BA control group. In the analysis of participants who completed at least 8 weeks of the THR and BA control group interventions, the difference on the SRS Social Communication subscale became significant (p = 0.03), a finding similar to the previously published RCT (1). There was no significant difference between groups on any of the other SRS subscales.

#### **SALT**

On the SALT (39), there was no statistically significant difference between the two groups in improvement in the number of words or different words spoken after treatment, even though the response pattern was in favor of the THR group, similar to the previous RCT (1).

#### **Salivary Cortisol**

We compared week one cortisol levels and the cortisol levels collected in the last week of intervention to assess efficacy. Separate analyses were conducted for pre-lesson and post-lesson cortisol levels. The median (range) pre-lesson salivary sample collection times were 13:45 (12:53–13:45) for THR and 13:45 (12:15–14:30) for BA at the first week and 14:23 (12:50–14:30) for THR and 12:30 (12:27–14:09) for BA at the last lesson. There was no difference

between the two groups in the change of pre-lesson (p = 0.49) or post-lesson (p = 0.63) cortisol levels between the first and last week of the intervention (**Table 2A**). This non-significance remained after adjusting for salivary sampling time and baseline panic agoraphobia scores (p = 0.61 for pre-lesson and p = 0.62 for post-lesson cortisol). Of a total of 60 completed THR lessons, pre- and post-lesson salivary samples were successfully collected for 90% of riding lessons, while salivary samples were collected in 74% of a total of 65 BA control lessons.

Looking at all the weekly data together with an LMM analysis, a significant decrease in cortisol after the THR lessons was observed in THR participants (mean (SEM): from 0.11 (0.012) to 0.07 (0.009), p = 0.004). The decrease in cortisol after the BA control lessons was significant at the 0.1 level in the BA control participants (mean (SEM): from 0.13 (0.014) to 0.10 (0.010), p = 0.07). However, the THR group did not show a significantly greater post-lesson decline in cortisol compared to the BA control group (p = 0.38).

In fact, the post-lesson cortisol change could be either an increase or a decrease, varying from participant to participant and from week to week. The association of weekly Irritability or Hyperactivity subscale scores with weekly post-lesson cortisol change was then examined using an LMM. Greater ABC-C weekly Irritability and Hyperactivity scores were each associated with a smaller cortisol reduction after the THR lesson (**Figure 3**, slope = 0.002, p = 0.053 for Irritability and slope = 0.003, p = 0.028 for Hyperactivity). This relationship was not statistically significant in the BA control group. However, there was no statistically significant difference in slope between the two groups (p = 0.68 for Irritability and p = 0.93 for Hyperactivity).

These LMM analyses were also conducted while adjusting for the time of pre-lesson salivary cortisol sampling and the minutes between pre- and post-lesson salivary cortisol collection in order to remove the potential confounding effect of the diurnal decrease of cortisol. These analyses produced the same significant (p < 0.05) correlation results as the unadjusted analyses.

### DISCUSSION

This article reports results from a replication pilot of an RCT that evaluated the effects of THR for children with ASD (1). Both studies compared a 10-week manual-based (38) THR intervention to a BA control group. The present replication study took place at a different riding center and enrolled 16 participants ages 6–16 years with a study-confirmed diagnosis of ASD. The goals of the current study were to replicate the RCT and to explore the effect of THR on salivary cortisol for children with ASD. Part of the results of the RCT were replicated, in that compared to the BA control group, THR participants significantly improved on the ABC-C (17) Hyperactivity subscale (p = 0.02). Additionally, the THR group had improvements that were significant at the 0.1 level on the ABC-C (17) Irritability subscale (p = 0.08) and the SRS (18) Social Communication subscale (p = 0.08). The replication of the finding for the Hyperactivity but not the Irritability subscale of the ABC-C matches another small-scale study of the effect of THR for children with ASD (19), indicating that THR may have a stronger effect on hyperactivity than on irritability behaviors. There were no significant improvements in the number of words or new words spoken on the SALT (39) standard language sample. There was no significant decrease in salivary cortisol over the 10-week intervention for either the THR or the BA control group. When examining the immediate pre- and post-lesson cortisol level changes, children with lower pre-session measures of Hyperactivity and Irritability behaviors on the ABC-C showed greater post-lesson decreases in salivary cortisol. This may suggest that pre-lesson cortisol can be considered as a target mediator outcome for future THR research.

This study has several limitations. The small sample size limited statistical power and the balance achieved by randomization. Although randomly assigned, the groups differed significantly from one another in pre-test irritability and hyperactivity and in co-occurring conditions. This imbalance may lead to a biased estimate of THR efficacy due to regression to the mean. The THR intervention was replicated in the same state where the original trial was conducted, which limits the generalization of the results to other populations.

This is the first known study to report partial replication of results from a previous RCT of THR, thereby extending previous THR efficacy findings by examining the effects of a standardized THR intervention at a different riding center. A future larger-scale replication study can provide conclusive validation. This study also provides preliminary data to objectively evaluate whether the act of riding a horse in the context of a standard 10-week THR group can have an immediate biological effect on reducing stress levels, as measured by salivary cortisol, compared to the BA control. Although significant between-group differences in cortisol reduction were not found in this pilot study, it appears that the extent of cortisol reduction after THR was associated with the participants' level of irritability and hyperactivity prior to riding. This very preliminary finding suggests that cortisol may play some role in the THR effect on irritability and hyperactivity. A larger-scale study is required to investigate the potential mediation effect of cortisol activity on THR.

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Colorado Multiple Institutional Review Board, which approved the study protocol. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

### AUTHOR CONTRIBUTIONS

ZP served as the statistical expert for this study and wrote the results section of this manuscript. DG assisted with study design and methods, supervised assay of project samples, and provided editorial comments on the manuscript. NG assisted in writing the introduction of this manuscript. AS was contracted as a consultant for this study and provided editorial edits to this manuscript. RG was the principal investigator of this study and contributed to writing the majority of this manuscript.

### ACKNOWLEDGMENTS

Portions of the research described in this study have been previously presented at the International Meeting for Autism Research in May 2016 (Baltimore, USA). The authors gratefully acknowledge the children and families who participated in this study along with those who assisted with this project: Briar Dechant, Laurie Burnside, MSM, CCRC, Shaina Holderness, MA, Tina Farrell, CCC-SLP, Jessie Lucas, BA, Kat Blasco, MSW, and staff and volunteers of the Hearts and Horses Therapeutic Riding Center, particularly program director Jan Pollema, M.Ed., Alex Whittey, Michele Kane, MA, LPCC, and Tamara Merritt. We also thank Luitpold Pharmaceuticals and Allen Mann for donating Adequan® for the horses in this study in coordination with the riding center's veterinarian, and thank Stephanie Luallin for her technical assistance formatting this manuscript.


**Conflict of Interest Statement:** RG is a co-author of the book, Growing Up with Autism: Working with School-aged Children and Adolescents (Guilford Press) and the book, Autism from Research to Individualized Practice (Jessica Kingsley Publishers), from which she receives royalties. Current grant funding for RG is provided by the Simons and Lurie Foundations, MARS/WALTHAM, and The Human-Animal Bond Research Institute (HABRI) Foundation. We note that DG is the founder and Chief Scientific and Strategy advisor at Salimetrics and SalivaBio, and the nature of these relationships is managed by the policies of the committees on conflict of interest at Johns Hopkins University School of Medicine and the University of California at Irvine.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Pan, Granger, Guérin, Shoffner and Gabriels. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Measuring Human-Animal Attachment in a Large U.S. Survey: Two Brief Measures for Children and Their Primary Caregivers

Regina M. Bures<sup>1</sup>\*, Megan Kiely Mueller<sup>2,3</sup> and Nancy R. Gee<sup>4,5</sup>

*<sup>1</sup> Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), Bethesda, MD, United States, <sup>2</sup> Cummings School of Veterinary Medicine, Tufts University, North Grafton, MA, United States, <sup>3</sup> Tisch College of Civic Life, Tufts University, Medford, MA, United States, <sup>4</sup> Department of Psychology, SUNY at Fredonia, Fredonia, NY, United States, <sup>5</sup> WALTHAM Centre for Pet Nutrition, Melton Mowbray, United Kingdom*

Researchers in the human-animal interaction (HAI) field face a challenge in generalizing the impact of pet ownership and companion animal interaction from small samples to larger populations. While researchers in Europe and Australia have included measures of pet ownership and attachment in surveys for some time (e.g., the Avon Longitudinal Study of Parents and Children), survey researchers in the United States have been slow to incorporate questions related to HAI in population-representative studies. One reason for this may be that many of the current HAI-related measures involve long, complex scales. From the survey administration perspective, using complex scales is costly in terms of both time and money. The development and validation of brief measures of HAI will facilitate the inclusion of these measures in larger surveys. This paper describes the psychometric properties of two brief attachment measures used in the first population-representative study of child development in the United States that includes HAI items, the 2014 Panel Study of Income Dynamics (PSID) Child Development Supplement (CDS). We use two measures derived from the 29-item CENSHARE Pet Attachment Survey, one for children aged 8–17 (6 items) and one for the primary caregiver (3 items). The results suggest that such brief measures of attachment to pets are psychometrically valid and are a practical method of measuring HAI attachment in larger surveys using only a few survey items. We encourage HAI researchers to work with other ongoing surveys to incorporate these and comparable HAI measures.

Keywords: child development, human-animal interaction (HAI), population representative sample, measurement, Panel Study of Income Dynamics (PSID)

## INTRODUCTION

Research on human-animal interaction (HAI) has focused on how companion animals affect the health and well-being for people of all ages [see (1–3)]. Pets are often given the status of family members (4, 5), and can play important roles in children's lives (6). The "pet effect" is that living with an animal can improve human health and well-being (7), and there is a growing body of evidence—both consistent, and inconsistent—exploring this concept. Some research has shown positive effects of HAI, including that pets can serve as a source of emotional support, ease anxiety, and encourage exercise (8). However, there are also a number of studies

#### Edited by:

*Peggy D. McCardle, Consultant, New Haven, CT, United States*

#### Reviewed by:

*Mary Renck Jalongo, Haskins Laboratories, United States Chris Fradkin, Pontifical Catholic University of Rio de Janeiro, Brazil*

> \*Correspondence: *Regina M. Bures regina.bures@nih.gov*

#### Specialty section:

*This article was submitted to Children and Health, a section of the journal Frontiers in Public Health*

Received: *28 September 2018* Accepted: *11 April 2019* Published: *14 May 2019*

#### Citation:

*Bures RM, Mueller MK and Gee NR (2019) Measuring Human-Animal Attachment in a Large U.S. Survey: Two Brief Measures for Children and Their Primary Caregivers. Front. Public Health 7:107. doi: 10.3389/fpubh.2019.00107*

that have demonstrated mixed or null findings regarding the health benefits of pet ownership (2, 9). Most of the studies to date that have examined the association between companion animals and children's health, development and well-being have been based on small, non-representative sample sizes, which limits the generalizability of the findings. There remain significant gaps in our knowledge of the social and health consequences of human-animal interaction, particularly for children and child development.

Researchers investigating the impact of animals on human health and well-being recognize the importance of understanding the nature of the bond that humans and animals share [e.g., (10)]. Many researchers argue that deriving benefits from human-animal interaction is likely related to the type and depth of emotional connection between the human and the animal (8), and measures of pet attachment have been developed to assess this connection.

### MEASURING ATTACHMENT TO PETS

Wilson et al. (11) concluded that few measures of pet attachment existed that were reliable and valid, but, since then, several measures have been developed. Anderson (12) provided the first compendium of measures of pet attachment and other aspects of the human-animal bond. Gee and Schulenberg (13) integrated information from this compendium and others (14, 15) into recommendations focused on examining the impact of animals in educational settings. The frequent use of attachment measures within the HAI field demonstrates the importance of understanding the quality of human-animal relationships as a component of the theoretical framework for understanding HAI in families. While progress has been made in measuring HAI, the field remains focused on small studies and there remains a need for brief measures that can be incorporated into population representative surveys.

### INTEGRATION OF HAI MEASURES INTO NATIONAL SURVEYS

Several large European and Australian surveys have incorporated HAI measures but, to date, despite the fact that 68% of American households report owning at least one pet (16), few U.S. population-representative data collections do so. When HAI questions are included in U.S. surveys they often relate to a single topic such as dog ownership or dog walking (17, 18). One strategy for developing generalizable findings about HAI and child health and development is to add HAI measures into existing large scale, national surveys, such as longitudinal panel studies. This approach allows researchers to leverage robust and diverse samples and to use the longitudinal data to analyze how pet ownership affects human health and development. HAI measures can also be included cross-sectionally to allow for retrospective analyses using other measures of mental and physical health embedded within the study.

Few large-scale, longitudinal U.S. studies exist because of the extensive resources needed to develop and maintain such projects over multiple periods of measurement. The trade-offs between survey length and costs limit the addition of new measures and, as a result, pet-related questions are usually limited to dog or general pet ownership. Based on small studies, we know that HAI appears to be a critical factor in promoting human health and healthy development, especially among children. We argue that, to assess the effects of pet ownership on health and development at the population level, measures of HAI included in large studies need to go beyond simple pet ownership to include measures of the quality of the human-animal relationship. Inclusion of HAI measures in large, population-representative studies requires the construction of short-form measures of HAI attachment. This paper describes two brief measures of HAI attachment (assessing child and parent attachment to pets) that can be incorporated into larger studies and, in doing so, addresses the need for validated, short-form measures of attachment that can feasibly be included in large, population-representative surveys.

## PRESENT STUDY

To address the need for validated, short-form measures of HAI in youth, we assessed the psychometric properties of a shortened version of an existing attachment measure, the CENSHARE Pet Attachment Survey [PAS; (19)], within a longitudinal, nationally representative study. We also assessed a similar attachment measure for the primary caregiver, typically a parent.

## DATA AND METHODS

The Panel Study of Income Dynamics (PSID) is a longitudinal, nationally representative household survey that began in 1968. The original sample comprised over 18,000 individuals living in 5,000 families in the United States. The PSID Child Development Supplement (CDS) is a supplemental study to the main study. The first CDS study collected data on a sample of children from PSID families who were 0 to 12 years old in 1997, and followed those children over three waves, ending in 2007-08. The CDS-2014 includes all eligible children in PSID households born since 1997. This paper uses publicly available, de-identified data from the PSID CDS dataset. For additional information on the PSID CDS see: https://psidonline.isr.umich.edu/Studies.aspx.

The CDS-2014 collected data on children from the household primary caregiver (PCG) and, for older children, the children themselves. Primary caregivers are parents/guardians, typically mothers, who co-reside with CDS children and answer questions about each CDS child and about themselves and the household environment. Pet-related questions were added to the instruments for both the PCG and the older children. The questions on pet ownership and attachment were added to the PSID-CDS with funding support provided by MARS/WALTHAM™ through the NICHD-MARS/WALTHAM™ public-private partnership. The inclusion of these questions in the CDS will provide baseline measures of levels of pet interaction and levels of child development that may potentially be revisited in future waves of data collection.

#### TABLE 1 | Pet Attachment Questions Included in the PSID CDS.


The older children, ages 8–17, were asked questions about the characteristics of their pets and interactions with family pets, including whether the child has a pet as well as a favorite pet, type of pet, and six questions about pet attachment. For the PCGs, questions included the number and types of pets in families and the PCG's interaction with and attitudes about their pets. The pet-related items for PCGs included number and type of current pets, whether the family had a pet 5 years ago, reasons for not owning a pet, and three questions related to pet attachment.

The attachment items for both child and parental pet attachment are based on a subset of items from the CENSHARE PAS, which comprises 29 items [see (12, 19)]. The items included in the CDS were chosen to address several specific aspects of pet relationships that have been hypothesized as theoretically important aspects of the human-animal bond: physical activity engagement, emotional and social support, and proximity (6, 10, 20). The response options for the CDS attachment questions (e.g., ". . . How often do you spend time each day playing with or exercising your pet?") were "Almost always, often, sometimes, or never." For the analyses, the coding of the responses was reversed so that "almost always" was coded as 4 and "never" as 1. An attachment score was calculated by averaging the 6 items for the children and the 3 items for the PCGs.
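The scoring rule just described can be sketched in a few lines. The authors ran their analyses in SAS 9.4; the Python below is only an illustration, and the label-to-code mapping, the function name `attachment_score`, and the example responses are assumptions made for this sketch (the raw numeric codes in the PSID CDS files are not given in the text). Returning `None` when an item is missing mirrors the complete-case analytic sample used in this paper.

```python
# Assumed mapping from response label to the reversed code used in the analyses
# ("almost always" = 4 ... "never" = 1). The raw survey codes are not shown in the paper.
REVERSED_CODES = {
    "almost always": 4,
    "often": 3,
    "sometimes": 2,
    "never": 1,
}


def attachment_score(responses):
    """Return the mean of the recoded items, or None if any item is missing."""
    if not responses or any(r is None for r in responses):
        return None  # complete-case rule: every attachment item must be answered
    codes = [REVERSED_CODES[r.strip().lower()] for r in responses]
    return sum(codes) / len(codes)


# Hypothetical respondents: six items for a child, three for a PCG.
child = ["Almost always", "Often", "Sometimes", "Often", "Almost always", "Never"]
pcg = ["Often", "Almost always", "Sometimes"]

print(attachment_score(child))
print(attachment_score(pcg))
```

Because the score is an average rather than a sum, the 6-item child scale and the 3-item PCG scale share the same 1 to 4 range, which is what makes their means comparable.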

In addition to the pet attachment measures, we examined several demographic and family characteristics including sex, age, family size, presence of only one child under 18 in the household, only one pet in the household, dog ownership, and cat ownership. Sex is coded female = 1, male = 0. Age and family size are continuous measures. All other measures are dichotomous: the child is the only child under 18 in the household (or the PCG reports only one child under 18 in the household), child or PCG reports a single pet (1 = one pet, 0 = more than one pet). All children and PCGs in the analyses have at least one pet. Thus, we include measures of dog ownership [has dog = 1; other pet(s) = 0] and cat ownership [has cat = 1; other pet(s) = 0]. For the children, a single pet, typically a dog or cat, is reported on; the PCGs are asked about all family pets so they may potentially report both a dog and a cat.

The PSID CDS 2014 collected data from 2,525 PCGs, typically parents, and 1,508 older children. Because the focus of this paper is the pet attachment questions, we exclude cases with missing responses to these questions. Our analytic sample includes respondents who reported having one or more pets and responded to all of the pet attachment questions (1,536 PCGs and 931 children). Principal factor analyses, correlation matrices, and additional descriptive statistics are reported in the results. Statistical analyses were conducted separately for the child and PCG samples using SAS 9.4. All results are unweighted.

### RESULTS

**Table 1** summarizes the pet attachment questions included in the PSID CDS 2014. Responses to each question were spread across the response options. The only question to which a majority of both children (nearly 81%) and PCGs (70%) responded "Almost always" was ". . . how often do you consider your pet to be a member of your family?" This is also reflected in the mean for this measure (range 1 = never to 4 = almost always): for children the mean response was 3.71 (SD = 0.65) and for PCGs 3.49 (SD = 0.88).

Our analyses of the pet attachment questions and attachment scales follow the initial Holcomb et al. (19) approach. To examine the internal consistency of the two sets of pet attachment measures, we conducted two principal factor analyses to explore the relationships between the pet questions included in the child and PCG surveys and single measures of human-animal attachment. A single factor was extracted for both the child (6-item) and PCG (3-item) samples. For the child sample the eigenvalue was 1.95; the PCG sample eigenvalue was 1.28. For the pet attachment measures, we conducted correlation analyses and computed the Cronbach's coefficient alpha. **Table 1** summarizes the questions and the results of these analyses, including the overall Cronbach's alpha. The alpha coefficients for the analyses were 0.7518 (child) and 0.7329 (PCG), suggesting that the two sets of items have acceptable internal consistency.
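For readers unfamiliar with the statistic, Cronbach's alpha for a k-item scale is k/(k − 1) × (1 − Σ item variances / variance of the summed score). The sketch below computes it with the Python standard library only. It is not the authors' SAS code, and the function name `cronbach_alpha` and the toy responses are made up for illustration; they are not PSID CDS data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete-case data.

    rows: list of respondents, each a list of k item scores (same k for all).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    columns = list(zip(*rows))                       # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: five hypothetical respondents answering a 3-item scale coded 1-4.
toy_rows = [
    [4, 3, 4],
    [2, 2, 3],
    [3, 3, 3],
    [1, 2, 1],
    [4, 4, 3],
]
alpha = cronbach_alpha(toy_rows)
print(round(alpha, 4))
```

Sample variances (n − 1 denominator) are used for both the items and the total; any consistent choice of denominator gives the same alpha because the ratio cancels it.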

For the children, the mean score for the combined pet attachment measure was 2.9 (SD = 0.63); for the PCGs, 2.8 (SD = 0.78). Additional descriptive results are summarized in **Table 2**. While almost half (48.9%) of the child sample was female, nearly 80% of the PCGs were female, consistent with most PCGs being the mothers. In the CDS 2014, 63% of the older children and 61% of the PCGs reported having one or more pets. Approximately 44% of both children and PCGs reported a single pet in their household. Among the pet families, dogs (73.7% child, 78.3% PCG) were the most common pet, followed by cats (17.8% child, 34.1% PCG). Differences between these numbers may be attributable to question wording: in the CDS, the children were asked if they had a favorite pet and what it was, whereas the PCGs were asked specifically about different types of pets.

ANOVA was used to test the significance of relationships between several key measures (see **Table 2**) and attachment. One-way ANOVA was used to test for significance by gender, age, one child in house under 18, family size, one pet in household, dog in household, and cat in household. Girls (M = 2.99, F = 8.36, p < 0.004) and women (M = 2.84, F = 8.16, p < 0.004) had significantly higher levels of attachment than boys (M = 2.87) and men (M = 2.70). Age was not significantly associated with attachment for either the child or PCG samples. Being the only child under age 18 in the household was significantly associated with higher attachment for both children (M = 3.04, F = 7.52, p < 0.006) and PCGs (M = 2.87, F = 5.04, p < 0.02) compared to children (M = 2.90) and PCGs (M = 2.78) in households with more than one child. Family size was significantly associated with pet attachment for both children (F = 5.13, p < 0.0001) and PCGs (F = 2.48, p < 0.008), with higher attachment among the larger families. Children (M = 2.87, F = 6.29, p < 0.01) and PCGs (M = 2.70, F = 27.74, p < 0.0001) who reported having only one pet had slightly lower mean attachment scores than other pet owners (child M = 2.97, PCG M = 2.90). This finding may reflect the diversity of pets, with some pets such as turtles or fish being less interactive. Having a dog was significantly associated with higher attachment for both children (M = 2.99, F = 23.46, p < 0.0001) and PCGs (M = 2.92, F = 117.30, p < 0.0001), with the mean attachment score higher for dog owners than for other pet owners (child M = 2.76, PCG M = 2.42). The results for cats were mixed: for children with cats the attachment scores were not significantly different from those without cats (but other pets); for the PCGs, cat ownership was significantly related to higher attachment (M = 2.95, F = 26.37, p < 0.0001) compared to other pet owners (M = 2.74).
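The F values reported above come from one-way ANOVA, which compares the variability between group means with the variability within groups. As a minimal sketch of that computation (not the authors' SAS procedure), the standard-library Python below returns the F statistic and its degrees of freedom. The function name `one_way_anova_f` and the two small groups are hypothetical, and the sketch stops short of a p-value because that requires the F distribution, which the standard library does not provide.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: group size times squared distance
    # of the group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Within-group sum of squares: squared distance of each score
    # from its own group mean.
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical attachment scores for two groups (e.g., dog owners vs. other pet owners).
group_a = [1, 2, 3]
group_b = [2, 3, 4]
f_stat, df_b, df_w = one_way_anova_f([group_a, group_b])
print(f_stat, df_b, df_w)
```

With only two groups, as in the gender or dog-ownership comparisons, this F statistic equals the square of the pooled-variance t statistic.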

### DISCUSSION

The purpose of this paper was to describe and evaluate two shortened versions of the CENSHARE PAS that were incorporated into the PSID CDS 2014. Principal factor and correlation analyses of our 6-item older child and 3-item PCG versions of the PAS demonstrated a single factor, general attachment, and acceptable reliability. Using 29 items, Holcomb et al. (19) had identified 2 subscales within the original CENSHARE PAS: relationship maintenance (16 items) and intimacy (11 items). These two subscales had similar scores (3.16, 3.17) and were moderately correlated. Of the questions included in the CDS-2014, the exercise (child & PCG), moods (child), and greeting (child) questions were identified as parts of Holcomb et al.'s (19) relationship maintenance subscale; the comfort (child & PCG), family (child & PCG), and study, read or watch TV (child) questions were identified as parts of their intimacy subscale.

Our findings of a single factor are due in part to the limited number of items included in the two measures. The larger instrument will likely continue to be useful for smaller studies, where researchers may be more focused on describing the dimensions of the human-animal bond. The lower scale scores (2.93, 2.81) for these brief scales may also reflect the limited nature of the shortened items. These issues reflect some of the limitations of the current analyses: The current findings may be missing some of the nuance of the multiple attachment subscales. In addition, the design of the parent study, the PSID CDS, focused on child development and well-being. This limited both the number of scale questions and additional pet-related questions. Nonetheless, we argue that the benefits outweigh the limitations and encourage other researchers to explore the use of shorter measures in large, ongoing studies to incorporate a general measure of human-animal attachment in studies that may focus on broader social, behavioral, and health topics.

In comparing some of the ANOVA results, several findings are consistent with those of the earlier study, including greater levels of attachment among girls and women and lower levels of attachment in larger households. We also find higher levels of attachment among dog owners. Findings for cats are mixed, with no significant relationship in the child sample but a positive relationship for the PCGs. This may reflect the greater relative reliance of the PCG short scale on the intimacy subscale (2 of 3 items) of the PAS. These differences underscore the need to measure the presence of pets, as well as attitudes toward pets, consistently both within and across surveys.

The field of human-animal interaction continues to grow and there is an increasing need for shortened, validated measures of dimensions of human-animal relationships. This paper has demonstrated multiple benefits of the development and use of brief attachment scales. Shorter scales can be cost-effective when seeking to include measures in large, population-representative studies such as the PSID where space is at a premium. In the current paper, we have demonstrated that the shorter scales appear to be reliable and valid indicators of a general measure of pet attachment for older children and their primary caregivers. There is an ongoing need to explore the validity of brief measures in more detail and conduct more detailed analyses of differences in attachment across both human and pet characteristics and among additional populations, particularly younger children, where measurement of relational bonds via self-report measures can be challenging. Future work will take a deeper look at the current data to explore the relationship between pet attachment and multiple dimensions of family and child well-being and development.

### ETHICS STATEMENT

This article involves secondary analysis of publicly available data. It does not contain any studies with human participants or animals performed by any of the authors. Therefore, ethical approval was not required.

### AUTHOR CONTRIBUTIONS

RB, MM, and NG all contributed to the abstract, literature review, introduction, and discussion. RB conducted the data analyses and drafted the data and methods and results sections.


**Disclaimer:** The views expressed in this paper are those of the authors and do not necessarily represent those of the National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development, the U.S. Department of Health and Human Services, or any component of the federal government.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Bures, Mueller and Gee. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Human-Animal Interaction Research: Progress and Possibilities

James A. Griffin<sup>1</sup>\*, Karyl Hurley<sup>2</sup> and Sandra McCune<sup>3</sup>

*<sup>1</sup> The National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD, United States, <sup>2</sup> Mars Incorporated, Global Scientific Affairs, McLean, VA, United States, <sup>3</sup> The WALTHAM Centre for Pet Nutrition, Leicestershire, United Kingdom*

Keywords: human-animal interaction, animal-assisted interventions, anthrozoology, companion animals, service animals, child development, autism spectrum disorder, attention deficit hyperactivity disorder

Ten years ago, while reviewing the extant research literature on Human-Animal Interaction (HAI), a single question came to mind: "Why don't we know more about this topic?" (Griffin et al., 2011). A decade later the answer appears to have been, in part, a lack of infrastructure to organize and support stand-alone workshops and symposia at scientific conferences (and resulting edited volumes and journal articles) and concomitant sustained funding for rigorous research studies. As evidenced by this Frontiers Research Topic which includes 13 original data papers, the 10-year Public-Private Partnership (PPP) between the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) and the WALTHAM® Centre for Pet Nutrition (WALTHAM®), a part of Mars Incorporated, has enabled the Human-Animal Interaction (HAI) field to make remarkable progress in, by scientific standards, a very brief timespan (McCune et al., 2020). In the final Opinion paper of this Frontiers Research Topic, Human-Animal Interaction Research: A Decade of Progress, we provide a commentary on how progress in basic and translational HAI research can be sustained as well as explore how ongoing challenges and untapped possibilities will impact the next decade of HAI research.

#### Edited by:

*Livio Provenzi, Neurological Institute Foundation Casimiro Mondino (IRCCS), Italy*

#### Reviewed by:

*Andrea Saffran, Ludwig Maximilian University of Munich, Germany*

### \*Correspondence:

*James A. Griffin james.griffin@nih.gov*

#### Specialty section:

*This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology*

Received: *17 June 2019* Accepted: *28 November 2019* Published: *20 December 2019*

#### Citation:

*Griffin JA, Hurley K and McCune S (2019) Human-Animal Interaction Research: Progress and Possibilities. Front. Psychol. 10:2803. doi: 10.3389/fpsyg.2019.02803*

## EXEMPLARS OF PROGRESS FROM THIS FRONTIERS RESEARCH TOPIC

The 13 papers that comprise the core of this Research Topic are a microcosm of the advances that have been made in HAI research, including the range of human and animal participants, the rigor of the study designs employed, and the diversity (and commonality) of measures used to capture mediator/moderator and outcome variables of interest.

### Human Participants

Although the initial studies funded under the PPP were constrained to those that would address the developmental mission of NICHD, the range of ages represented across the 13 studies is nonetheless impressive, with age of human participants ranging from 4-month-old infants (Hurley and Oakes, 2018) to adolescents (Pendry et al., 2018) and young adults (Syzmanski et al., 2018). The inclusion of a broad age range is important to document variations in HAI with development, and indeed, across the lifespan (Friedmann and Gee, 2018). Funding opportunities under the PPP were later expanded to include people with disabilities and those in need of rehabilitative services, further broadening the range of human participants (although this is not represented in this Research Topic because those studies are ongoing). In addition, the study participants represented in this Research Topic include children and adolescents from a wide variety of normative (e.g., Meints et al., 2018) and clinical populations (e.g., Gabriels et al., 2018; Schuck et al., 2018). Approximately half of the studies (7) represented are based on samples of typically developing children, with another including them as one of three subgroups. A range of clinical populations was represented in the studies, including individuals with Autism Spectrum Disorder (ASD, three studies), Attention Deficit Hyperactivity Disorder (ADHD, two studies), and incarcerated youth (one study). Despite the diversity of age and sample background characteristics, only one of the 13 studies (Jacobson and Chang, 2018) specifically examined the role of socioeconomic status (SES) as it relates to HAI and socioemotional outcomes. Racial/ethnic and cultural variables were not examined, and gender differences were not a focus of the analyses. One of the studies (MacLean and Hare, 2018) did not include humans at all; rather, it examined differences between assistance and detection dogs.

### Animal Participants

It is important to note that the studies employed or examined the relationship of children to a wide variety of animals, including dogs (the most common), horses (second), cats, and guinea pigs. While these studies undergo ethical review to ensure the welfare of the animals, the animal part of HAI is often not assessed, and the majority of the studies did not report behavioral or physiological measures of stress in the animals. Likewise, most did not report complete information on the age and gender of the animal, spaying/neutering status, and, if part of an intervention, what training and certification the animal had received and when. Additional information on the handlers of assistance or service dogs, where they participated in an intervention, would also aid replication and meta-analysis of these variables across a wide range of studies.

### Study Design

Four of the studies employed a randomized controlled trial design, and another employed an integrated control group design with each child being his or her own control (before and after a learning intervention and at the follow-up testing). One used survey data from a nationally representative study. The remaining studies used samples of convenience and employed surveys, observation, and laboratory testing to examine associations across a variety of types of HAI and child outcomes. Although progress has been made, more HAI studies are needed that employ randomized designs with sufficient sample sizes to ensure adequate power and control conditions that address competing explanations for detected effects.

### Measures and Methodology

The 13 studies in the Research Topic employed a wide range of measures, including salivary cortisol to assess Hypothalamic Pituitary Adrenal (HPA) axis reactivity and stress levels (Pendry et al., 2018; Pan et al., 2019), survey questions (Bures et al., 2019), and a genetic assay to look at the relationship between a single polymorphism in the oxytocin receptor and individual differences in response to HAI activities (Kertes et al., 2018).

Two studies specifically focused on examining the psychometric properties of HAI survey items related to children's attachment to their pets (Hart et al., 2018; Bures et al., 2019) and another used observational coding of children's interactions with animals in a naturalistic setting (Gabriels et al., 2018). Such methodology papers are critical to the advancement of HAI research.

## LOOKING TO THE FUTURE

It is important to note that while the 13 research papers that make up this Frontiers Research Topic provide a fair representation of the types of research funded by the PPP, they are only a partial representation of such studies. Some of the PPP-supported investigations have already published their findings elsewhere, and others are still actively engaged in data collection and analysis. Likewise, important collections of HAI journal articles have been published on topics including Animal-Assisted Interventions (AAIs) in special populations (McCune et al., 2017), the range of experiences children have with companion animals (Beetz et al., 2018), and the growing university-based infrastructure supporting HAI research (Serpell et al., 2017; O'Haire et al., 2018). Taken together, these efforts document the growth that has taken place over the last decade in HAI research. The overarching question now is what will be necessary to sustain such growth over the next 10 years?

The progress of the past decade covers a number of fronts: increased use of a single keyword to index publications (human-animal interaction), the widespread adoption of standardized terms and definitions (IAHAIO, 2014), increased methodological rigor in terms of study designs, standardized measurements (including genetic and biomarker assays), expansion of HAI research into new settings (Gee et al., 2017), and better communication across scientific disciplinary boundaries via scientific journals and professional societies (as evidenced by this Frontiers Research Topic). In order to build upon and sustain this growth there are a number of practices that need to be adopted or more widely utilized by the field, including but not limited to those outlined below.

### Use of Video

Although significant progress has been made in measuring human and animal behavioral and physiological responses, the "I" in HAI has largely gone unmeasured. The availability of low-cost high-definition video recording and editing makes it possible for researchers conducting HAI studies in laboratories, homes and other naturalistic settings to record HAIs as they take place for later coding and analysis. An exemplar of this approach (Guérin et al., 2018) demonstrates the power of having video from multiple studies with different populations to establish the psychometric properties of a behavior coding tool. Indeed, using video to record both training sessions prior to implementing interventions and the interventions themselves could make it possible to implement the intervention at another site with fidelity, allow for the coding of behaviors by raters blind to the intent of the intervention, and provide a resource for secondary data analyses and meta-analyses. A data library, Databrary, partially funded through grants from the NICHD and the National Science Foundation (NSF), has been established to promote the archiving and sharing of video data to improve the reproducibility of studies and accelerate the pace of scientific discoveries across a range of disciplines (Gilmore et al., 2018).

### Data Sharing

Clearly data archiving and sharing should not be limited to video data—HAI researchers should be encouraged to document and archive their datasets so that they can be used for secondary and meta-analytic studies by others in the field and by researchers from other disciplines who might code and analyze their data for variables of interest outside of HAI per se (e.g., language samples from young children interacting with their pets). One such data repository is the Data and Specimen Hub (DASH) funded by the NICHD, which provides archiving infrastructure for both electronic data and biospecimens (Hazra et al., 2018).

### Documenting and Manualizing Animal-Assisted Interventions

With the proliferation of AAIs comes the challenge of reproducibility and dissemination. Few AAI studies report their interventions in sufficient detail to allow for replication by another researcher or a practitioner in the field. Such information is critical to avoid failures of replication or improper implementation with clinical populations. The potential list of such details is extensive; e.g., how free is the animal to approach/withdraw or to interact with a subject? Is the subject allowed to touch the animal, and if so, in what way? How is the animal introduced to the subject? How is the intervention executed (such as reading to an animal)? These details would likely need to be included in supplementary material for a journal article or to be available for download on a lab website (e.g., Tufts Institute for Human-Animal Interaction, 2016). Likewise, there is a need to systematically report on various training and certification programs for therapy and service animals (e.g., what criteria were used to screen and select the therapy animals and their human owners or handlers for inclusion in the study? Were one or more certifications accepted for the animals/handlers?). It is important to document such details for future replication studies and potential adoption by other service providers for AAIs as well as visitation programs, especially those serving potentially vulnerable populations (e.g., individuals in hospitals and nursing homes).

### Inclusion

Although HAI research is carried out around the globe, the majority of studies conducted utilize convenience samples that are homogeneous in socioeconomic status, race/ethnicity, and cultural and religious background of the participants. Even fewer examine cross-cultural differences employing geographically diverse samples within and across countries (McCune et al., 2014). There is a great deal of work ahead for the field to fully explore these differences as they relate to HAI. These include the complex relationships between humans and other animals, with some seen as a source of food, others as beloved pets. As pet ownership becomes more accepted in some countries it is possible to treat this change as a natural experiment to examine how attitudes toward HAI change, and meet resistance, over time (Headey et al., 2008).

### POTENTIAL PITFALLS: WHAT WILL NOT CONTRIBUTE TO GROWTH OF HAI RESEARCH?

While the remarkable growth of HAI research over the last decade has added significantly to our scientific knowledge base, it is worth highlighting a few persistent and new trends that will not contribute to the accumulation of knowledge regarding HAI.

### Pets Do Not Confer Immortality

It is now commonplace to hear a news report about the health benefits of pet ownership, only to hear a similar story highlighting the exact opposite results (Herzog, 2011). To be clear, it is very important to examine the health benefits of companion animals, service animals, and AAIs generally, including the mechanisms by which those gains are realized (Serpell et al., 2017), and to acknowledge the strengths and limitations of the corpus of research used as the basis for statements by, for example, the American Heart Association (Levine et al., 2013) and the Mayo Clinic (Creagan et al., 2015).

Not all studies indicate benefits, and the data are not consistently positive or in agreement. At the heart of the contradictory findings is the fact that many studies on both sides of the health claims debate are generated by survey data that provide very limited information, often having a single item on pet ownership and a few health questions (Batty et al., 2017). Such studies cannot take into account reasons for (or to forego) pet ownership, the length of pet ownership, the relationship to the pet (positive and negative), including attachment, etc. (Ding et al., 2017). There are very likely both positive and negative effects of pet ownership on human health, with the aggregated data resulting in statistically significant associations depending on the sample and covariates employed in the analysis (Mueller et al., 2018). Obtaining a pet to obtain health benefits would be like seeking a spouse because married people tend to live longer (Tatangelo et al., 2017); one would likely be unhappy with the outcome.

A similar cautionary flag must also be raised regarding the benefits of service animals with select populations: although the preliminary research findings are promising (O'Haire and Rodriguez, 2018), many factors must be taken into account in the decision to obtain a service animal, including the ability to care for it (Crossman and Kazdin, 2015).

### Proliferation of Systematic Reviews and Meta-Analyses

There are times when the generation of new research findings in a given area warrants a similarly rapid turn-around in the number of systematic reviews in that area of science. One such example is AAI studies with children who have Autism Spectrum Disorder (ASD), a topic which has generated a plethora of studies and two rapid cycle systematic reviews (O'Haire, 2013, 2016). That said, the number of systematic reviews and meta-analyses being conducted on this topic alone (Berry et al., 2013; Davis et al., 2015; Hill et al., 2018) indicates that disciplinary journals are being saturated with articles reviewing the same small set of papers with little or no cross-referencing of the other extant literature, including other systematic reviews. Journal editors and peer reviewers need to do at least a cursory search of the literature to see if manuscripts are sufficiently different from previous publications to warrant further consideration.

A similar issue will arise as data archiving becomes more common, allowing for secondary data analysis. Editors and reviewers must make sure any secondary papers are not reporting findings previously published by the original authors of the study. The bottom line is that to make meaningful contributions to the advancement of HAI research, investigators conducting systematic reviews and meta-analyses need to ensure that they are making a meaningful contribution to the current knowledge base and using the findings from their reviews and meta-analyses to make informed recommendations regarding future research directions, including topics, designs and measures. These reviews must not shy away from documenting null and negative effects, as well as noting adverse events that occur across similar interventions, as these are important for determining the risk level associated with an intervention and optimal ways to manage it.

### CONCLUSION: BUILD IT AND THEY WILL COME

The PPP cannot claim credit for the significant advances in infrastructure supporting HAI research (university-based research centers, new dedicated journals, the second of two Frontiers Research Topics dedicated to HAI research), but it can claim some small role in encouraging their foundation and development (Esposito et al., 2011), with a concomitant rise in the adoption of rigorous research designs and methodologies and the use of standardized measures which allow for comparison of methodologies and findings across studies. All of these factors support the accumulation of a growing empirical knowledge base and interdisciplinary research teams capable of conducting high-quality HAI studies. It is hoped that the PPP has contributed, and will continue to contribute, to the maturation of a field that provides timely evidence-based information to guide topics ranging from pet ownership and social isolation to the contributions of service animals to the social functioning of children with ASD and military service veterans with PTSD. The breadth of the HAI field is both a blessing and a curse, but those committed to advancing our understanding of the complex relationship between humans and animals would not have it any other way.

### AUTHOR'S NOTE

The views expressed in this article are those of the authors and do not necessarily represent those of the National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development, or the U.S. Department of Health and Human Services.

### AUTHOR CONTRIBUTIONS

All authors listed have made substantial, direct and intellectual contributions to the work, and approved it for publication.


**Conflict of Interest:** KH is employed by Mars Incorporated, Global Scientific Affairs, McLean, VA, United States. SM was employed by Mars Incorporated, WALTHAM Centre for Pet Nutrition, Leicestershire, United Kingdom.

The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Griffin, Hurley and McCune. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.