# MODELING DISEASE SPREAD AND CONTROL

EDITED BY: Tariq Halasa and Salome Dürr

PUBLISHED IN: Frontiers in Veterinary Science

#### *Frontiers Copyright Statement*

*© Copyright 2007-2018 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

*All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88945-384-9 DOI 10.3389/978-2-88945-384-9

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

## **MODELING DISEASE SPREAD AND CONTROL**

Topic Editors:

**Tariq Halasa,** Technical University of Denmark, Denmark

**Salome Dürr,** University of Bern, Switzerland

Image: Pashi/Pixabay.com.

Mathematical models are useful tools for understanding the epidemiology and agent-host interaction of diseases. They have been developed and applied for over a century, and with increasing computing capacity they have become increasingly prominent in evidence-based decision making. Mathematical models are frequently used to construct preparedness and contingency plans for highly contagious diseases such as foot-and-mouth disease. This makes it possible to propose effective strategies to control the spread of the disease in case of an incursion, and provides useful tools to support decision making during an outbreak. They are also used to monitor, prevent and control endemic diseases within populations or farms. In addition, mathematical models improve our understanding of the contact structure between farms, pointing out risky elements in the contact network for disease introduction or further spread within the population.

This Research Topic brings together studies that present different aspects and implementations of mathematical modeling for disease spread and control in the veterinary field. The areas covered include model construction, network analysis, tools for decision makers, and cost-effective control of endemic diseases.

**Citation:** Halasa, T., Dürr, S., eds. (2018). Modeling Disease Spread and Control. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-384-9

# Table of Contents


*Early Decision Indicators for Foot-and-Mouth Disease Outbreaks in Non-Endemic Countries*

Michael G. Garner, Iain J. East, Mark A. Stevenson, Robert L. Sanson, Thomas G. Rawdon, Richard A. Bradhurst, Sharon E. Roche, Pham Van Ha and Tom Kompas

*Semiquantitative Decision Tools for FMD Emergency Vaccination Informed by Field Observations and Simulated Outbreak Data*

Preben William Willeberg, Mohammad AlKhamis, Anette Boklund, Andres M. Perez, Claes Enøe and Tariq Halasa


Fernanda C. Dórea, Maria Nöremark, Stefan Widgren, Jenny Frössling, Anette Boklund, Tariq Halasa and Karl Ståhl

#### **Cost-Effective Control of Endemic Diseases**

*Simulating the Epidemiological and Economic Impact of Paratuberculosis Control Actions in Dairy Cattle*

Carsten Kirkeby, Kaare Græsbøll, Søren Saxmose Nielsen, Lasse E. Christiansen, Nils Toft, Erik Rattenborg and Tariq Halasa

*Epidemiological and Economic Evaluation of Alternative On-Farm Management Scenarios for Ovine Footrot in Switzerland*

Dana Zingg, Sandro Steinbach, Christian Kuhlgatz, Matthias Rediger, Gertraud Schüpbach-Regula, Matteo Aepli, Gry M. Grøneng and Salome Dürr

## Editorial: Modeling Disease Spread and Control

*Tariq Halasa<sup>1</sup>\* and Salome Dürr<sup>2</sup>*

*<sup>1</sup>National Veterinary Institute, Technical University of Denmark, Kongens Lyngby, Denmark, <sup>2</sup>Veterinary Public Health Institute, University of Bern, Bern, Switzerland*

Keywords: model, disease spread, control, network analysis, decision support

**Editorial on the Research Topic**

**Modeling Disease Spread and Control**

#### INTRODUCTION

Infectious diseases are a major health burden (1) for both humans and animals and pose a constant challenge for the global economy (2, 3). Climate change, intensive global trade, the emergence and reemergence of infectious agents and of antimicrobial resistance, combined with intensive livestock production systems, make prevention and control of livestock infectious diseases a major global challenge. This intensifies the demand for tools that improve understanding of disease spread to support cost-effective contingency planning and disease prevention and control.

#### *Edited by:*

*Andres M. Perez, University of Minnesota, United States*

#### *Reviewed by:*

*Amy Delgado, Animal and Plant Health Inspection Service (USDA), United States*

> *\*Correspondence: Tariq Halasa tahbh@vet.dtu.dk*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

*Received: 11 October 2017 Accepted: 07 November 2017 Published: 21 November 2017*

#### *Citation:*

*Halasa T and Dürr S (2017) Editorial: Modeling Disease Spread and Control. Front. Vet. Sci. 4:199. doi: 10.3389/fvets.2017.00199*

Mathematical and simulation models have improved our understanding of the population dynamics of infectious diseases. In addition, they have provided decision makers with tools to aid in disease prevention and control based on scientific evidence (4–6). This research topic includes 10 scientific studies presenting different aspects and implementations of mathematical modeling for disease spread and control. The studies can be divided into: (1) model construction (two studies); (2) network analysis (two studies); (3) tools for decision makers (four studies); and (4) cost-effective control of endemic diseases (two studies).

#### MODEL CONSTRUCTION

Constructing and describing a model of a livestock production system is challenging, because such systems are complex and vary widely. It requires choosing the most appropriate model structure and the elements to include. An integrated conceptual analysis is presented that provides guidance for the construction of infectious disease process models and a comparison between the different modeling approaches (Mancy et al.). The authors discuss the different motivations for using models in epidemiological research, identify key steps in model design and use, and present a conceptual framework for guiding model construction and comparison according to the epidemiological system being modeled.

The impact of indirect transmission of foot-and-mouth disease (FMD) on the overall spread of the virus was examined by explicitly modeling virus persistence outside the host (in the environment) in a stochastic individual-based model, using wild boar populations as an example (Lange et al.). The authors compared a situation with transmission *via* both direct and indirect contacts and a situation where transmission occurs only through direct contact. The results showed that the simplified, direct-transmission model underestimates the necessary sample size in surveillance plans by up to one order of magnitude, but overestimates the area placed under control measures. Consequently, incorporating indirect transmission mechanisms in epidemiological modeling is necessary.

#### NETWORK ANALYSIS

Livestock industries are increasingly connected in ways that make control strategies based on local geographic boundaries or proximity unsuccessful. Long distance and complex patterns of movements complicate our understanding of how diseases spread, and how they should be controlled. The impact of changing the activity level of the German pig trade network on the probability of disease outbreaks, size, and duration of epidemics was studied (Lebl et al.). The results showed that small changes of the activity level of the network would have dramatic effects on the outcomes. These results are important because they indicate that the activity level of a trade network should be considered when simulating disease spread between pig herds, as it may influence the results significantly.

Exponential random graph modeling was used to reproduce, understand and predict pig trade networks in different European production systems (Relun et al.). The results showed that production system and farm characteristics (such as geographical location, production type, membership of a pig company, and housing system) were key drivers of pig trade. Statistics on local network configurations were necessary to capture the clustering observed in pig trade networks. This work provides approaches to simulate realistic pig trade networks that may be included in epidemic models.

#### TOOLS FOR DECISION MAKERS

Resources are limited and hence control strategies must be effective. Modeling offers a unique opportunity to evaluate control strategies and decision-making in the absence of an actual outbreak, as well as to estimate the resources needed for an appropriate response. A modeling study was carried out to identify characteristics measurable during the early phase of an FMD outbreak that might be useful predictors of epidemic outcomes, such as the total number of infected premises (IPs), outbreak duration, and the total area under control (AUC) at the end of the outbreak (Garner et al.). The results showed that these outcomes were associated with the number of IPs, the number of pending culls, the AUC, and the rate of disease spread at days 7, 14, and 21 following first detection, as well as cattle density around the index herd. These findings show that information available early in the outbreak can indicate its likely magnitude.

Simple semi-quantitative model-based decision tools are presented aiming to estimate the likelihood and the consequences of the ultimate size of an ongoing FMD epidemic, using simulated and actual outbreak data (Willeberg et al.). The results showed that the number of outbreaks at day 14 after FMD incursion is a useful predictor of the final epidemic size. In addition, the authors recommended that EU member states adopt simulation models as tools to aid decision-making, while ensuring that the output of such models is clearly understood by decision makers.

An iterative tool was developed with the aim of estimating the resources needed during an outbreak of FMD and identifying areas with limited resources that can delay the control of the disease (Boklund et al.). Outcomes of a simulation model of FMD spread were used to determine the daily required resources. The results showed that the number of needed personnel was predicted to peak within the first week. In addition, the time needed for surveillance visits was predicted to be the most influential factor for the required personnel.

The spread of a hypothetical outbreak of FMD in Sweden was studied and different control measures were simulated and evaluated (Dórea et al.). The results showed that the density of farms in the area where the epidemic started would have little impact on the time to control the outbreak. However, spread in high-density areas would require more surveillance resources, compared to areas of lower farm density. Based on these results, FMD outbreaks could be kept limited in Sweden using the EU standard control strategy and a national standstill of 3 days.

#### COST-EFFECTIVE CONTROL OF ENDEMIC DISEASES

Endemic diseases cause large economic damage to livestock production, which requires constant evaluation of strategies to cost-effectively monitor, control, and prevent these diseases. A stochastic individual-based model simulating the spread and control of *Mycobacterium avium* subsp. *paratuberculosis* (MAP) within a dairy cattle herd was presented (Kirkeby et al.). The results showed that it was possible to eradicate MAP from a dairy cattle herd. Nevertheless, from an economic standpoint, this was not attractive because the costs of the control actions outweighed the benefits.

Two nationwide control strategies for footrot and a no-intervention scenario were compared with the current situation to quantify their net economic effects (Zingg et al.). This was done by sequential application of a maximum entropy model, epidemiological simulation, and calculation of net economic effects using the net present value method. The results showed that a systematic Swiss-wide management program that uses the recent PCR diagnostic test is the most advisable strategy for cost-effective control of footrot in Switzerland.

#### CONCLUSION

The use of mathematical modeling to support decision-making is noticeably increasing, as its importance is progressively recognized by decision makers. The current research topic provides approaches, methods, and models that can support this evolution. It also provides useful tools to support decision-making for contingency planning and for the prevention and control of animal diseases on both the herd and the national level.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

#### ACKNOWLEDGMENTS

We would like to acknowledge all authors who contributed to this research topic with their valuable scientific work. In addition, we acknowledge Prof. Eyal Klement, Prof. Andres Perez, Francisco Ruiz-Fons, and Alejandra V. Capozzo for assistance in editing the submitted papers.

#### REFERENCES


6. Dürr S, Fasel-Clemenz C, Thür B, Schwermer H, Doherr MG, Dohna HZ, et al. Evaluation of the benefit of emergency vaccination in a foot-and-mouth disease free country with low livestock density. *Prev Vet Med* (2014) 113:34–46. doi:10.1016/j.prevetmed.2013.10.015

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Halasa and Dürr. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## An Integrated Framework for Process-Driven Model Construction in Disease Ecology and Animal Health

*Rebecca Mancy<sup>1,2</sup>\*, Patrick M. Brock<sup>1,2</sup>\* and Rowland R. Kao<sup>1,2</sup>*

*<sup>1</sup>College of Veterinary, Medical and Life Sciences, Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, United Kingdom, <sup>2</sup>Boyd Orr Centre for Population and Ecosystem Health, University of Glasgow, Glasgow, United Kingdom*

#### *Edited by:*

*Salome Dürr, University of Bern, Switzerland*

#### *Reviewed by:*

*Thomas Selhorst, Bundesinstitut für Risikobewertung, Germany Kaare Græsbøll, Technical University of Denmark, Denmark*

#### *\*Correspondence:*

*Rebecca Mancy rebecca.mancy@glasgow.ac.uk; Patrick M. Brock paddy.brock@glasgow.ac.uk*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

*Received: 16 January 2017 Accepted: 06 September 2017 Published: 27 September 2017*

#### *Citation:*

*Mancy R, Brock PM and Kao RR (2017) An Integrated Framework for Process-Driven Model Construction in Disease Ecology and Animal Health. Front. Vet. Sci. 4:155. doi: 10.3389/fvets.2017.00155*

Process models that focus on explicitly representing biological mechanisms are increasingly important in disease ecology and animal health research. However, the large number of process modelling approaches makes it difficult to decide which is most appropriate for a given disease system and research question. Here, we discuss different motivations for using process models and present an integrated conceptual analysis that can be used to guide the construction of infectious disease process models and comparisons between them. Our presentation complements existing work by clarifying the major differences between modelling approaches and their relationship with the biological characteristics of the epidemiological system. We first discuss distinct motivations for using process models in epidemiological research, identifying the key steps in model design and use associated with each. We then present a conceptual framework for guiding model construction and comparison, organised according to key aspects of epidemiological systems. Specifically, we discuss the number and type of disease states, whether to focus on individual hosts (e.g., cows) or groups of hosts (e.g., herds or farms), how space or host connectivity affect disease transmission, whether demographic and epidemiological processes are periodic or can occur at any time, and the extent to which stochasticity is important. We use foot-and-mouth disease and bovine tuberculosis in cattle to illustrate our discussion and support explanations of cases in which different models are used to address similar problems. The framework should help those constructing models to structure their approach to modelling decisions and facilitate comparisons between models in the literature.

Keywords: process models, modelling, model construction, epidemiology, infectious disease, disease ecology, foot-and-mouth disease, bovine tuberculosis

#### BACKGROUND

The use of models is becoming increasingly popular for understanding the biological processes that drive the host-to-host spread and within-host progression of infectious diseases, for both theoretical and applied problems. Key epidemiological processes include transmission arising from contact between infectious and susceptible hosts, disease progression within hosts (e.g., onset of symptoms, recovery), and interventions such as vaccination or treatment. These processes are dynamic, and this time dimension can be captured through changes in the epidemiological state of individuals over time. Models in which these processes are represented explicitly, often referred to as *mechanistic* or *process models*,<sup>1</sup> are increasingly common in disease ecology, including in veterinary epidemiology.

The explicit incorporation of biological mechanism makes process models ideal for studying systems in which population-level effects (such as disease outbreaks) arise from individual-level processes in ways that are difficult to anticipate (such as details of how infectious individuals today feed into generating new infections tomorrow) (1). By changing model inputs to simulate interventions, we can also use them to analyse the effects of policies that cannot be tested in the real world because doing so would be infeasible due to time or resource constraints or because such experiments would be ethically undesirable. Such effects would be difficult or impossible to investigate using standard statistical frameworks.

A wide range of options exists for constructing process models. Early epidemiological process models typically took the form of differential equation models representing susceptible-infectious-susceptible (SIS) or susceptible-infectious-removed (SIR) disease dynamics. However, increasingly, researchers are using models that incorporate higher levels of biological detail. One approach is to use agent-based models (ABMs; also referred to as individual-based models) in which each host is modelled explicitly and its state (i.e., its disease state plus all relevant epidemiological characteristics) progresses according to a set of rules. These rules can be simple, but can and often do incorporate greater complexity. Many more types of process model exist, including partial differential equation models and cellular automata. This diversity makes it challenging to make defensible decisions when developing new models, and complicates the task of elucidating any inconsistencies between studies, assessing the weight of the evidence for particular claims, and identifying research gaps. However, most overviews focus on describing technical aspects of each type and include only limited discussion of their relationship with biology, while original research articles typically compare only a small number of model types.
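For readers new to the differential equation models mentioned above, the following is a minimal sketch of SIR dynamics in Python, integrated with a simple forward Euler scheme. The function name `simulate_sir`, the frequency-dependent transmission term, and all parameter values are assumptions made for illustration; they are not taken from any of the studies discussed here.

```python
def simulate_sir(beta, gamma, s0, i0, removed0, days, dt=0.01):
    """Integrate a basic SIR model with forward Euler steps.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I

    beta  : transmission rate per day (hypothetical value supplied by caller)
    gamma : removal rate per day (1 / mean infectious period)
    """
    s, i, r = float(s0), float(i0), float(removed0)
    n = s + i + r  # closed population: no births, deaths or movements
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        trajectory.append((step * dt, s, i, r))
    return trajectory


if __name__ == "__main__":
    # Illustrative values only: 1,000 hosts, 10 initially infectious.
    result = simulate_sir(beta=0.5, gamma=0.1, s0=990, i0=10, removed0=0, days=100)
    t, s, i, r = result[-1]
    print(f"day {t:.0f}: S={s:.1f}, I={i:.1f}, R={r:.1f}")
```

Because every host in a compartment is treated as identical, a model of this kind has only two parameters; the more detailed approaches discussed next trade that simplicity for biological realism.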

Here, we clarify the relationship between process modelling approaches both in relation to high-level motivations for their use and to lower-level decisions about the characteristics of a particular model. We describe five distinct motivations, identifying the key steps in model design and uses associated with each and present a framework that forms the conceptual basis for making modelling decisions, considering the constraints imposed by the epidemiological system, the available data, relevant knowledge and expertise, and the questions of interest. Our approach is explanatory rather than prescriptive, as the implications of modelling decisions are highly dependent on context. We anticipate that this analysis will be most valuable for researchers who are relatively new to process modelling, but believe it has broader value by providing an organisational structure for comparing and contrasting modelling approaches.

#### FROM SYSTEM TO MODEL

A major preoccupation when choosing a modelling approach is that it should, in some sense, be "correct." Although Box's (2) claim that "all models are wrong but some are useful" has become something of a mantra, its practical implications remain a justifiable concern. Nearly a century earlier, Claude Bernard (3, 4) had noted that, like models, scientific theories are always "wrong" insofar as they are "only partial and provisional truths," but had emphasised their necessary role in science as "steps on which we rest, so as to go on with investigation." In evaluating models, Odenbaugh (5) argues for a shift of focus away from model "truth" towards the appropriateness of particular modelling "idealisations" (simplifications or abstractions), which should be considered in the context of the biological system to which we apply the model and the questions it helps us answer. Below, we discuss the application of these principles to epidemiological modelling.

To provide meaningful context, we discuss model construction decisions with reference to two illustrative<sup>2</sup> disease systems in veterinary epidemiology involving cattle: foot-and-mouth disease (FMD; caused by FMD virus) and bovine tuberculosis (bTB; caused by *Mycobacterium bovis*). To keep the focus on the underlying concepts and avoid introducing multiple epidemiological systems, we limit practical examples discussed to those provided by modelling work on these two diseases. These two high profile pathogens have important animal health implications and are associated with a considerable body of research using process models, much of which has focused on recent UK epidemics: a major outbreak of FMD occurred in 2001 and was controlled by large-scale intervention (6), while bTB remains endemic in the UK (7). In thinking about potential influences on the epidemiology of bTB and FMD in cattle, we observe that: different numbers of cows are kept on farms of variable size; cows have fixed *attributes* (e.g., breed) and changing *states* (e.g., age); and there can be variation in the environment at, around, and between farms. We note that cattle come into contact with one another through activities such as grazing on the same or neighbouring pastureland, and are moved between farms as part of trade, slaughter, and breeding activities. These factors can affect epidemiological outcomes and it is important to consider whether and how to model them. For example, in an ABM, we would represent cattle hosts as distinct agents. The attributes, states, and movement in continuous space of these individuals could be simulated and tracked through time. The model would be initiated with a population of cattle and seeded with infection. When the model was run, the initial population would be subject to processes such as aging, movement, infection, and recovery. Cattle movement within farms could be modelled as a random process, while between-farm movement could be informed by trade volume data, and transmission between hosts determined by contact between susceptible and infectious individuals. As time progressed, the state of every host would be updated following the specified process rules.

<sup>1</sup>The relationship between model and biological mechanism differs between process models and statistical approaches based on linear models, traditionally the mainstay of research and education in biology. Although these statistical approaches are often guided by mechanistic theory and employed with the aim of understanding biological mechanisms, the mechanisms themselves are not modelled directly and are instead inferred from associations with explanatory and response variables. In contrast, process models incorporate processes explicitly based on biological understanding, potentially in the absence of detailed data on every aspect of the modelled system. They can be thought of as simplified "model worlds" in which key epidemiological processes unfold over time, analogously to their progression in the real world.

<sup>2</sup>Despite this restriction, much of the conceptual ground covered in this article applies to any biological discipline in which process models are used as research tools (e.g., ecology). We return to this point in the Section "Discussion."
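The ABM described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a reconstruction of any published bTB or FMD model: the class `Cow`, the functions `step` and `run`, the daily time step, and every probability are hypothetical. Within-farm movement in continuous space is reduced to random mixing among animals on the same farm, and `trade_weights` is a placeholder for real trade volume data.

```python
import random
from dataclasses import dataclass


@dataclass
class Cow:
    farm: int            # index of the farm the animal is currently on
    breed: str = "dairy" # fixed attribute (not used by the rules below)
    state: str = "S"     # disease state: "S", "I" or "R"
    age_days: int = 0    # changing state


def step(herd, trade_weights, rng, p_transmit, p_recover, p_move):
    """Advance every cow by one day using simple, hypothetical process rules."""
    n_farms = len(trade_weights)
    # Count infectious animals per farm at the start of the day.
    infectious = [0] * n_farms
    for cow in herd:
        if cow.state == "I":
            infectious[cow.farm] += 1

    for cow in herd:
        cow.age_days += 1  # ageing
        if cow.state == "S":
            # Random mixing: each infectious farm-mate transmits
            # independently with probability p_transmit.
            p_infect = 1.0 - (1.0 - p_transmit) ** infectious[cow.farm]
            if rng.random() < p_infect:
                cow.state = "I"
        elif cow.state == "I" and rng.random() < p_recover:
            cow.state = "R"
        # Between-farm movement, with destinations weighted by trade volume.
        if rng.random() < p_move:
            cow.farm = rng.choices(range(n_farms), weights=trade_weights)[0]


def run(n_farms=5, cows_per_farm=20, days=60, seed=1,
        p_transmit=0.05, p_recover=0.1, p_move=0.01):
    """Initialise a population, seed infection, and record daily state counts."""
    rng = random.Random(seed)
    herd = [Cow(farm=f) for f in range(n_farms) for _ in range(cows_per_farm)]
    herd[0].state = "I"              # seed infection in one animal on farm 0
    trade_weights = [1.0] * n_farms  # placeholder: equal trade to every farm
    history = []
    for _ in range(days):
        step(herd, trade_weights, rng, p_transmit, p_recover, p_move)
        history.append({k: sum(c.state == k for c in herd) for k in "SIR"})
    return history


if __name__ == "__main__":
    for day, counts in enumerate(run(), start=1):
        if day % 10 == 0:
            print(day, counts)
```

Even this toy version requires decisions about time step, mixing, movement, and recovery, which gives a sense of why detailed ABMs take time to construct.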

Because the structure of detailed ABMs maps closely to our understanding of the real world, they often appear intuitive. However, they are time-consuming to construct because they involve making many decisions about the processes to model. Many alternatives exist, yet selection can be challenging. The remainder of this paper consists of two main sections: in the first, we distinguish between five motivations for using process models and the particular aspects of model construction to focus on for each; in the second, we provide an organisational structure for navigating modelling decisions. The first of these sections is more philosophical and the second more practically oriented; they can be read together or as stand-alone sections.

#### THE QUESTIONS WE WISH TO ANSWER

The appropriateness of a particular model depends on both our precise research question and our motivations for using a process model to help answer it. Clarifying our motivations for using a model and the steps required to use it in this way helps guide modelling decisions towards the aspects of the model that are most critical for the way it will be used and helps us determine the appropriate level of complexity or output accuracy, something we return to in Section "Applying the Model Construction Approach." Irrespective of whether the focus is on highly specific or abstract systems, one or several motivations can apply in each piece of work. These are not always made explicit in the text of an article, but distinguishing between them can help us to understand the role of modelling within a piece of work and, therefore, evaluate its appropriateness. In this section, we present five motivations for using process models in epidemiology.<sup>3</sup> In **Table 1**, we describe start and end points for each motivation, provide illustrative questions or observations associated with each (with references, where available for bTB/FMD and otherwise for more abstract disease systems), and identify the primary focus during model construction. In the main text, we provide an overview of the associated steps involved.

<sup>3</sup>Although motivations for modelling have been classified in different ways, see, e.g., Ref. (8), we draw on the five categories described by Odenbaugh (5). We build on these by clarifying distinctions between them, adapting them to the context of epidemiological process modelling, and clearly identifying the steps in model construction and use associated with each. We rename certain categories for conceptual clarity: Odenbaugh refers to Mapping and formalising theory as "providing conceptual frameworks," to Building theory as "generating explanations," and to a category similar to Testing theory as "investigating more complex systems."

Frontiers in Veterinary Science | www.frontiersin.org September 2017 | Volume 4 | Article 155

#### Mapping and Formalising Theory

When our initial thinking about an epidemiological system is relatively imprecise, the process of formalising our ideas helps improve conceptual clarity and can even allow us to develop new epidemiological concepts or refine existing ones. By "mapping and formalising theory," we refer to the process by which we go from our (typically informal) understanding of an epidemiological system derived from verbal theory or experience, to a formal model that can be written down (e.g., using mathematical symbols or computer code). Formal and symbolic models provide conceptual frameworks that allow us to reason about more complex systems than would be possible using purely verbal arguments. Odenbaugh (5) points out that "model building is first and foremost a *strategy* for coping with an extraordinarily complex world," making this an important motivation for model construction. Heesterbeek (9) describes how formalising theory led to the development of the basic reproductive number (sometimes called "rate" or "ratio"), *R*<sub>0</sub>, that has become one of the most important concepts in contemporary infectious disease epidemiology. *R*<sub>0</sub> is now widely used independently of the original models (or indeed of any model at all), as a communication tool, to raise and formulate new questions about epidemiological systems and how to model them.

Although the benefits of increased conceptual precision that arise from formalisation are often considered a side effect of model development (and rarely reported in research articles), formalising theory can be a deliberate strategy to help us understand complex systems. In this case, during model construction, we focus on questions about how best to represent the key processes in our system. These questions can relate to direct analysis of our biological system; alternatively, we may seek inspiration in similar concepts from other domains, attempting to identify the link between the two. If a mapping can be found, then concepts, insights, and results can be transferred—for example, identifying parallels with network models developed in the statistical physics literature has been particularly fruitful in epidemiology, e.g., Ref. (17)—but even failed attempts would ideally be reported because they help us identify inconsistencies, potentially leading to the development of new concepts.

#### Exploring Theory

Once a theory has been incorporated into a formal model, its implications can be explored. An epidemiological model constrains the range of outcomes that can arise, so it can be used to deduce or simulate the range of possible system behaviours or the probability that those behaviours arise. This might be achieved in different ways, including mathematical deduction that allows us to develop and prove theorems and experimental approaches based on simulation. For example, process models could be used to explore the range of potential long-run behaviours of epidemiological systems (e.g., whether endemicity can arise), or to establish whether chaotic behaviour is possible (5).

When exploring theory, we begin with a model that represents that theory and use logical reasoning, mathematical deduction or computational approaches to understand the range of possible behaviours that can be generated under the assumptions of the theory, and potentially their associated probabilities. When used to explore theory, neither the model construction nor the exploration stage focuses directly on the correspondence between model and data, but rather on the correspondence between theory and model, to ensure that the implications for theory of model findings are clear. The theory under investigation may be general and abstract, and correspond to a hypothetical disease, or a specific epidemiological system. Exploration does not rely on the existence of data; even in the absence of data, it can guide epidemiological science by suggesting behaviours to look for in empirical work and their expected frequencies, and by helping to determine sample sizes (18).
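As a minimal illustration of exploring theory without data, the sketch below assumes a deterministic SIS model in a well-mixed population, di/dt = βi(1 − i) − γi, where i is the infected proportion. The function names, the parameter values, and the forward Euler integration are assumptions made for this example rather than anything prescribed in the article. Under these assumptions, the long-run infected proportion is 0 when β ≤ γ and 1 − γ/β otherwise, so the code can be used to ask whether endemicity can arise across a range of parameter values.

```python
def sis_prevalence(beta, gamma, i0=0.01, dt=0.01, t_max=500.0):
    """Integrate di/dt = beta*i*(1-i) - gamma*i with forward Euler steps
    and return the infected proportion at time t_max."""
    i = i0
    for _ in range(int(t_max / dt)):
        i += dt * (beta * i * (1.0 - i) - gamma * i)
    return i


def sis_equilibrium(beta, gamma):
    """Analytic long-run infected proportion of the same SIS model."""
    return max(0.0, 1.0 - gamma / beta)


if __name__ == "__main__":
    gamma = 0.1  # hypothetical recovery rate
    for beta in (0.05, 0.2, 0.4):  # hypothetical transmission rates
        simulated = sis_prevalence(beta, gamma)
        expected = sis_equilibrium(beta, gamma)
        print(f"beta={beta}: simulated={simulated:.4f}, analytic={expected:.4f}")
```

Comparing the simulated and analytic values illustrates the two routes to exploration mentioned above: mathematical deduction and simulation.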

#### Building Theory

Theory building consists of generating possible explanations for empirical observations. When we use models to help us with this task, our motivation is to help us formulate hypotheses by suggesting potential causal mechanisms that explain an observed phenomenon. The phenomenon to explain could take the form of a general trend or pattern observed in a range of epidemiological systems, or an observation arising from a specific dataset. For example, following the 10-year randomised badger culling trial, bTB incidence in cattle decreased in the badger culling area, but increased in adjoining areas. This observation was initially counter-intuitive and theory building was required to explain how it arose (12).

When constructing models to assist us in theory building, our starting point is an unexplained observation. Our primary focus in their construction is on the way in which structures and parameters, usually from an existing model, might suggest reformulations or extensions that constitute hypotheses. These hypotheses can be based on mechanisms drawn from general theory (e.g., about different kinds of host contact structure) or system-specific mechanisms, such as those based on experiential knowledge of the system (e.g., differences between England and Scotland in cattle-trading behaviour). These mechanisms need to be incorporated into the model before proceeding with theory testing (see Testing Theory) when the model will be used to characterise their effect and make comparisons with empirical observations.

#### Testing Theory

Theory testing refers to attempts to establish whether a theory provides a good explanation for empirical observations or data. Our motivation for using models for this purpose is their ability to generate falsifiable predictions that we can compare with existing data or observations or employ to guide data collection protocols or experiments. Although verbal theory is sometimes sufficient to make falsifiable predictions, a model can be valuable if we want to generate quantitative predictions or if the system is too complex to reason about otherwise. When using a model to test theory, no initial claim is made about the truth status of the model, and we often acknowledge that it is idealised or incomplete. Model predictions used to test theory are not forecasts; rather, the intention is that they are empirically falsifiable. Indeed, when models fail to make accurate predictions, they often fail for reasons that are very informative about the systems under study (5).

When testing theory, it is important to incorporate hypothesised mechanisms into the model in a way that allows us to establish whether it produces an observed real-world phenomenon. During model construction, we have a dual focus on accurately incorporating the mechanism and on devising ways to characterise its effect in a form that can be compared with data or observations. Comparisons with observations can be more or less formal depending on whether the model is highly idealised or complex, whether our predictions are qualitative or quantitative, and the form and extent of real-world observations or data. For example, formal model fitting and parameter estimation rely on the availability of data and a model at an appropriate level of complexity for the methods employed.<sup>4</sup> Simple or complex models can be used to pinpoint errors in scientific hypotheses by using them to generate several predictions and establishing which are more (or less) compatible with observations. For example, in the work by Donnelly et al. (12) described in the section above, once explanations had been suggested through theory building, model predictions were used to narrow the space of possibilities. Ultimately, these authors concluded the unexpected pattern of disease incidence following culling was due to the perturbation of badger social structure and consequent movement. Additional uses of models to test theory include the use of a "null model" to serve as a baseline (or "straw man") and sensitivity analysis to gauge the importance of model assumptions.

#### Applying Theory

By applying theory, we refer to the use of models to forecast potential future events, or to predict events that might have occurred or could occur under circumstances or interventions that differ from those observed (sometimes referred to as "counterfactuals"). For example, Porphyre et al. (16) used a model to investigate the possible effects of a vaccination intervention in the event of a hypothetical introduction of FMD into Scotland, and Ferguson et al. (15) used a model to compare 2001 FMD outcomes under the implemented culling strategy with those that might have occurred without the intervention.

When motivated by the application of theory, model construction focuses on ensuring that key mechanisms are modelled as accurately as possible. Predictions are not intended to be falsified: they should be as accurate as possible so that we can compare actual and counterfactual scenarios. Steps in model use include careful construction, verification, and validation, subsequent use to conduct "experiments" under different conditions, and the examination of any effect on epidemiological outcomes (20). This use capitalises on the power of process models: to the extent that the model embodies real-world mechanisms, changing the conditions in which those mechanisms play out allows us to observe, characterise, and quantify the effects of these changes. Using models in this way requires a solid understanding of the processes acting in the epidemiological system for at least two reasons. First, empirical investigation of counterfactual scenarios is usually infeasible or impossible: this is the case for historical counterfactuals, and also for experimental testing of a range of policy options, which in turn makes falsification of model results impossible. Second, if models are used to inform policy decisions, incorrect predictions can be harmful, giving model use an important ethical dimension.
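The idea of conducting "experiments" under different conditions can be sketched as follows. This is a deliberately simple illustration, not the approach used in the studies cited above: it assumes a deterministic, proportion-based SIR model integrated with forward Euler steps, and the function name, the parameter values, and the representation of vaccination as removing a fraction of susceptible hosts at time zero are all hypothetical.

```python
def sir_final_size(beta, gamma, vaccinated=0.0, i0=1e-3, dt=0.01, t_max=1000.0):
    """Return the cumulative proportion of hosts infected by time t_max.

    A fraction `vaccinated` of the initially susceptible hosts is assumed
    to be fully protected from the start (a hypothetical intervention).
    """
    s = (1.0 - i0) * (1.0 - vaccinated)  # susceptible proportion
    i = i0                               # infectious proportion
    cumulative = i0
    for _ in range(int(t_max / dt)):
        new_infections = dt * beta * s * i
        s -= new_infections
        i += new_infections - dt * gamma * i
        cumulative += new_infections
    return cumulative


if __name__ == "__main__":
    # Same mechanisms, two scenarios: no intervention versus vaccination.
    baseline = sir_final_size(beta=0.3, gamma=0.1)
    counterfactual = sir_final_size(beta=0.3, gamma=0.1, vaccinated=0.8)
    print(f"no vaccination: {baseline:.3f}, 80% vaccinated: {counterfactual:.4f}")
```

Only the scenario input changes between the two runs; the modelled mechanisms are identical, which is what allows the difference in outcomes to be attributed to the intervention within the model.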

#### MODEL CONSTRUCTION DECISIONS

Establishing our motivations for using process models informs lower-level modelling decisions by helping us focus on how to select and represent the parts of the system to best answer the research question. Our key decisions centre on which aspects of reality to simplify and in which ways, taking our knowledge of the epidemiological system and research questions into account. As Grassly and Fraser (18) point out, "unnecessary complexity can obscure fundamental results and is almost as undesirable as over-simplification," thus "model choice—the process of deciding which model complexities are necessary—is a central part of mathematical modelling of infectious diseases." In this section, we identify and discuss five important modelling decisions and associated options.

We begin with three key decisions that apply to all infectious disease models in epidemiology: whether we want to track the infection status of individuals or groups; how to model the connectivity of hosts that determines transmission between them; and which disease states to model. These decisions relate to how we choose to represent the fundamental elements of infectious disease epidemiology: the disease states of the host population and the connections supporting transmission between them. The first two decisions determine the epidemiological states that we wish to track in the model, shown by the colours and rows of **Figure 1**, while different forms of connectivity are shown in the columns. **Figure 1** demonstrates how their combinations can result in different model structures, while **Table 2** provides descriptions of model types that include key search terms to aid with literature searching. We then describe two further decisions that relate to how we choose to implement changes of epidemiological state over time: whether these are modelled as taking place continuously or in discrete time; and whether and how to incorporate randomness into the processes that we model.

#### How to Model Hosts

Our first decision, captured by the rows in **Figure 1**, concerns whether we want to track the infection status of individuals (e.g., cows) or of groups (e.g., herds). In our earlier ABM of disease spread in UK cattle (**Figure 1, 1a**), each cow is modelled explicitly and tracked over time, taking up its own space in computer memory. This allows us to track both the total number of infected animals and the fate of individual cows; however, as the number of cows increases, so too does the computational cost of simulation and potentially the complexity of outputs, making them difficult to interpret. As a result, detailed ABMs are rarely used to model

<sup>4</sup>A full discussion of formal model fitting and parameter estimation is beyond the scope of this paper, but this is an area of particularly active development, see, e.g., discussion in Ref. (19).

disease spread across a system as large as the UK and are more frequently used on local scales. For example, Biek et al. (24) used an ABM (**Figure 1, 1a**) to explore the correspondence between likely transmission pathways identified through epidemiological modelling and data on phylogenetic relationships in the bacterial population.

When detailed ABMs are too computationally intensive or complex to analyse, we might choose to group individuals. For example, it is sometimes preferable to focus on groups, such as herds or farms, or groups defined by age, sex, or spatial distribution; in **Figure 1**, the difference between row one and the remaining rows represents this grouping distinction. Whether grouping is an appropriate decision depends on both the system and the research question. For example, it makes sense when individuals fall into relatively clear groups that have important implications for their epidemiology, or if we have better information about group-level than individual-level epidemiological processes.

In rows 2, 3, and 4, we track groups rather than individuals, and the distinctions between these rows relate to the kind of information we choose to track about group infection status. There are several main alternatives: we could track the number of individuals displaying each infection status (row 2), the presence (or absence) of hosts in each state (row 3), or the density or proportion of hosts in each state (row 4). Models that track the number or proportion of hosts in one of three disease states (susceptible, infectious, and removed) may be familiar as simple SIR-type transmission models based on ordinary differential equations (ODEs). A further simplification consists of tracking only the presence (and absence) of infection in groups of hosts.
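For readers less familiar with the SIR-type ODE models mentioned above, one common textbook form is given below. The symbols are conventional rather than defined in this article: *S*, *I*, and *R* are the proportions of hosts in the susceptible, infectious, and removed states, β is a transmission rate, and γ is a removal rate; a closed, well-mixed population is assumed.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta S I, \\
\frac{dI}{dt} &= \beta S I - \gamma I, \\
\frac{dR}{dt} &= \gamma I,
\end{aligned}
\qquad S + I + R = 1 .
```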

In addition to increased analytic and computational tractability, summarising the infection status of a group of hosts sometimes better represents the scale at which data are available, or at which transmission processes are understood. However, one drawback of modelling groups rather than individuals is that we only know that at least one animal is infected, losing information about which individuals are infected. Grouping also forces us to aggregate, so we lose the capacity to study within-group heterogeneity. Further, if a model tracked the disease state of farms (**Figure 1, 3a**) rather than cows (**Figure 1, 1a**), the influence of individual attributes could no longer be investigated. Presence-absence and density approaches (rows 3 and 4 in **Figure 1**) are usually more appropriate for systems in which randomness and individual-level variation are small. However, they are more difficult to justify when we are interested in modelling processes in which individuality matters. For example, they can cause problems when we are interested in studying extinction because continuous population densities mean that the pathogen population can become arbitrarily small (e.g., less than one pathogen present) without going extinct.

TABLE 2 | Typical names used to describe the models shown in Figure 1, to assist in literature searches (particularly terms highlighted in italics); descriptions of almost all modelling approaches discussed here are provided in Ref. (21–23), as well as in the references cited throughout this article.

*Not all model types are used in the literature on bTB/FMD, and some, therefore, do not have a reference within this literature. Note that although much of the literature refers to only differential equation models as "compartmental models," all models referred to in this article are compartmental models in the sense that states are discrete (an individual can only be in one state, e.g., susceptible, exposed, infectious, etc.). Depending on the number of states, all could, therefore, be described by reference to the states included, so could be referred to as, e.g., susceptible-infectious-susceptible (SIS) or susceptible-infectious-removed (SIR) models (those in Figure 1 have only 2 states, so are SI or SIS models).*

The grouping of cattle into herds, each of which is associated with a farm, underpins most models of FMD and bTB transmission in the UK. For example, in the InterSpread modelling framework used during the 2001 FMD outbreak in the UK, researchers initialised all UK farms with counts of the number of different kinds of livestock recorded during the most recent farm census (29, 31), and simulated whether each farm was susceptible or infected (the framework is able to capture different distance measures, so corresponds to **Figure 1**, **3a** and **3c**). Similar approaches to modelling the spread of bTB within and between farms have been implemented, tracking the numbers of animals moving between farms, and the number and disease state of animals on each farm, with some work using a combination of approaches from row 2, e.g., Ref. (27), that involves both a tiling and a network, i.e., **Figure 1**, **2b** and **2c**. In Ref. (28), only the presence or absence of infection in groups is tracked (**Figure 1**, **3a**).

#### How to Model Connectivity

Our second modelling decision, represented by columns in **Figure 1**, relates to how infection passes between individuals or groups of hosts. In our original ABM, hosts move in continuous space,<sup>5</sup> and space itself determines host connectivity; for example, pathogen transmission might be modelled as occurring when agents are sufficiently close to one another (**Figure 1, 1a**),

<sup>5</sup>When modelling in continuous space, each host has an (*x,y*) location, in which *x* and *y* are not restricted to integer values (i.e., can have arbitrarily many decimal places).

or as a continuous function of distance. Models can also represent space as continuous without explicitly representing individual hosts, e.g., by modelling the location of farms in continuous space, with distances between farms affecting the spread of infection among them (**Figure 1, 2a**). This is an appropriate approach when our primary interest is between-farm transmission and spatial distance is considered a good proxy for strength (or probability) of potentially infectious contact. A continuous space approach that models the proportion of infected hosts is that of reaction–diffusion models that are based on PDEs (**Figure 1, 4a**).
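A minimal sketch of between-farm transmission in continuous space is given below, tracking only the presence or absence of infection on each farm. It rests on several assumptions that are not taken from the article: an exponential distance kernel, hypothetical values for its scale and maximum probability, and a simple discrete-time update in which each infected farm independently challenges each susceptible farm.

```python
import math
import random


def transmission_probability(distance_km, scale_km=2.0, p_max=0.5):
    """Hypothetical kernel: per-step probability that an infected farm
    infects a susceptible farm at the given distance."""
    return p_max * math.exp(-distance_km / scale_km)


def step(farms, infected, rng):
    """Advance one time step.

    farms: list of (x, y) coordinates in km; infected: set of farm indices.
    Returns the new set of infected farm indices.
    """
    newly_infected = set()
    for j, (xj, yj) in enumerate(farms):
        if j in infected:
            continue
        # Probability that farm j escapes infection from every infected farm.
        escape = 1.0
        for i in infected:
            xi, yi = farms[i]
            distance = math.hypot(xi - xj, yi - yj)
            escape *= 1.0 - transmission_probability(distance)
        if rng.random() < 1.0 - escape:
            newly_infected.add(j)
    return infected | newly_infected


if __name__ == "__main__":
    rng = random.Random(1)
    farms = [(0.0, 0.0), (0.1, 0.0), (100.0, 100.0)]
    infected = {0}
    for _ in range(50):
        infected = step(farms, infected, rng)
    print(sorted(infected))
```

With these illustrative coordinates, the nearby farm is very likely to become infected, whereas the distant farm is effectively unreachable, which is the sense in which distance acts as a proxy for the strength of contact.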

One alternative to modelling hosts in continuous space is to divide the modelled landscape into discrete areas in the form of a tiling (often referred to as a grid or lattice), as in the example scenarios shown in **Figure 1** column b. A tiling can be composed of regular shapes such as hexagons or squares (as in **Figure 1**), as is often the case for satellite data; alternatively, it can be irregular, as might be the case for administrative jurisdictions. Discretisation of space is often used when model outputs will be compared with data that are only available in spatially discrete form, or when epidemiological interventions are necessarily applied over predetermined areas because of administrative jurisdiction. Each patch in a tiling can have different characteristics (e.g., host density), making it possible to examine the influence of this heterogeneity.

A tiling covers the full space and can, therefore, only be used when connectivity can be collapsed to two dimensions (32); however, often there are also parts of the landscape that we do not need to model explicitly. This can occur in the case where hosts aggregate in housing or pastureland, or when non-spatial mechanisms such as human-mediated transport or watercourses determine transmission. A spatial tiling is a special case of a network that takes a lattice form, and a more general network approach can be valuable for this kind of problem, modelling farms as nodes in a connectivity network, with the strength of the links between farm nodes determining the probability of transmission from one to another. Link strength could, for example, be determined by the shortest distance between farms when travelling by road, or by previous trading history between them. This broad category of network models is represented by the diagrams in **Figure 1**, **2c**, **3c**, and **4c**. The approach is similar to that of metapopulation models used in ecology, in which the network connections or distance influence otherwise independently modelled populations that exist in patches.<sup>6</sup> This type of model, usually represented in the form of a distance- or contact-matrix, was the basis of several FMD models in which transmission was modelled using known distance between farms, e.g., Ref. (28), making it a model of type 3a in **Figure 1**. Using distances between farms (**Figure 1, 3a**) is also a special case of **Figure 1**, **3c**, where these distances determine the modelled connection between nodes.
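The point that a distance-based model is a special case of a network model can be made concrete with a contact matrix. In the sketch below, the same update function is driven either by a matrix derived from distances or by one derived from trading history. All function names, the exponential kernel, and the per-movement transmission probability are assumptions introduced for illustration.

```python
import math
import random


def matrix_from_distances(coords, scale=2.0, p_max=0.5):
    """Contact matrix in which link strength decays with Euclidean distance."""
    n = len(coords)
    return [
        [
            0.0 if i == j
            else p_max * math.exp(-math.dist(coords[i], coords[j]) / scale)
            for j in range(n)
        ]
        for i in range(n)
    ]


def matrix_from_trades(n, trades, p_per_movement=0.1):
    """Directed contact matrix built from hypothetical trade records.

    trades maps (seller, buyer) to the number of recorded movements.
    """
    weights = [[0.0] * n for _ in range(n)]
    for (seller, buyer), count in trades.items():
        weights[seller][buyer] = 1.0 - (1.0 - p_per_movement) ** count
    return weights


def step(weights, infected, rng):
    """One discrete time step of a presence-absence model on the network.

    weights[i][j] is the per-step probability that node i infects node j.
    """
    updated = set(infected)
    for j in range(len(weights)):
        if j in infected:
            continue
        if any(rng.random() < weights[i][j] for i in infected):
            updated.add(j)
    return updated
```

Because `step` only sees the matrix, swapping distance for trading history (or road distance) changes the connectivity assumption without changing the rest of the model.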

The final broad category of model approaches to connectivity is represented by column d in **Figure 1**, in which the effects of space are not modelled. The most familiar form of disease model in which space is implicit is the simplest SI model, represented by **Figure 1, 4d**, in which all modelled individuals have the same probability of encountering one another per unit time, as if they occupied a theoretical homogeneous space and mixed randomly within it. This "complete mixing" assumption may be appropriate for certain systems (perhaps for waterborne diseases of fish), but can also be used as a simplification when the complexity of other aspects of a model (e.g., number of disease states) makes a spatially explicit model difficult to analyse. Non-spatial models can also be used to track only the presence or absence of disease in a system (**Figure 1, 3d**), modelling numbers of individuals (**Figure 1, 2d**), or keeping track of distinct individuals (**Figure 1, 1d**).

The decision about how to model connectivity is usually based on a combination of factors including data availability and our understanding of disease processes. For example, for the analysis of local culling policies for the 2001 FMD epidemic, one difficulty was that individual farms often contained multiple parcels of land whereas the data only represented each farm as a single spatial point. As a result, many more farms were actually contiguous (and thus needed to be subject to culling) than was apparent from the available data. However, by grouping farms into discrete tiles with neighbouring tiles used to establish contiguity and counting the number of infected farms per tile (as in **Figure 1, 2b**), it was possible to mimic the extent of culling recorded during the epidemic and, therefore, explore counterfactual culling policies (25). In this case, a discretisation of space allowed the simulation of more realistic interventions. Modelling in discrete space can also be used as a tool for detecting the spatial scales of key processes driving transmission, e.g., Ref. (33) for a general epidemiological example. **Figure 1, 1c** also highlights that it is possible to model the connections between individuals as a network, as in social network analysis models or models of sexually transmitted disease spread, or between farms *via* the movement of livestock (26).

#### How to Model States

Once we have decided on a level of grouping of hosts and how infection passes between them, we need to decide what states, captured by colours and spatial locations and other attributes in **Figure 1**, each individual animal or group can take. In the simplest model, we might decide that each host (individual or group) can have only two states, susceptible or infectious. Classic SI models based on differential equations are models of this type. However, we might decide to include additional states such as spatial location or age, or additional disease states that capture incubating or immune status, or changes in the level of infectivity during different stages of infection.<sup>7</sup> In **Figure 1**, colours are used to illustrate disease state, with blue and red used to represent susceptible and infected cows or patches; in 1a, differences in

<sup>6</sup>The emphasis of heterogeneity in metapopulation models is typically on the attributes of patches, and patches are usually thought of as being equally connected or connected by distance. In network models, it is more common to assume nodes are homogeneous and emphasise different levels or kinds of connectivity between them. Nonetheless, this is a question of research tradition and emphasis, and a model from one of the two frameworks can be thought of from either perspective.

<sup>7</sup> States could be either discrete (as in compartmental models) or continuous (e.g., antibody titer).

shading of the markings on the head of each cow emphasise that individuals have additional distinct attributes and states, while in columns a and b, the position of crosses, points, and square cells denote spatial location.

As we increase the number of attributes and states that we model, the complexity of the model grows quickly. For example, if we model space using patches/nodes or a grid, the full system state at any given time consists of the combination of the states of all the patches. Even for the simple presence–absence model in **Figure 1, 3a**, although each patch has only two states, the whole system has 2<sup>3</sup> = 8 states. By using "R" to represent red and "B" for blue, these states are: (R,R,R), (R,R,B), (R,B,R), (B,R,R), (R,B,B), (B,R,B), (B,B,R), and (B,B,B). Adding a single patch leads to 2<sup>4</sup> = 16 system states. As the numbers of patches or animals and states grow, the number of possible system states grows very quickly. Although we do not expect all possible system states to occur, even if only a small proportion of these arise during the dynamic process captured by the model, this number can be very large. As a result, it is often helpful to track instead a one-dimensional variable, such as the number of infectious hosts or groups in the system.
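The counting argument above can be reproduced in a few lines. The sketch below enumerates every system state for a given number of two-state patches and then collapses each one to the one-dimensional summary suggested in the text, the number of infected patches. The function names are illustrative, and "R" is taken to mark an infected patch, following the colour convention of Figure 1.

```python
from itertools import product


def system_states(n_patches, patch_states=("R", "B")):
    """All combinations of patch states for the whole system."""
    return list(product(patch_states, repeat=n_patches))


def infected_count(state, infected_label="R"):
    """One-dimensional summary: the number of infected patches."""
    return sum(1 for patch in state if patch == infected_label)


if __name__ == "__main__":
    for n in (3, 4, 10):
        states = system_states(n)
        summaries = {infected_count(state) for state in states}
        print(f"{n} patches: {len(states)} system states, "
              f"{len(summaries)} values of the summary variable")
```

The number of system states doubles with each added patch, whereas the summary variable only gains one additional value, which is why tracking it is so much more manageable.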

#### How to Model Time

The order of events—the sequence in which farms become infected—and whether two events occur close together in time or far apart, can have important effects on disease dynamics, so determining how to model time forms an important consideration in the construction of process models.

In determining how to model the progress of time, it is helpful to distinguish between epidemiological processes that can occur at any time—i.e., in continuous time—and those in discrete time. Modelling approaches have been developed that respect this distinction between continuous and discrete time and are named accordingly. Biological processes take place in continuous time in the sense that the interval between events can be arbitrarily small, so in some senses, discrete time representations are always an approximation, with a key difference being that when time steps are sufficiently long in discrete time models, more than one event can occur at the same time. Nonetheless, if events are highly clustered in time, for events within the same time window, it may not matter—and it may not even be possible to decide—which happened first. For example, within the period of a day, it may not be important which cow became infectious first, especially if cows only come into contact during milking. In a more extreme scenario, the host population might be periodically eliminated (e.g., harvested crops), or there could be periods of the year during which new infections cannot arise (e.g., if vectors overwinter in diapause). In these cases, we have a biologically driven reason to model in discrete time. Discrete time representations can also be used as an approximation to continuous time processes, perhaps because researchers are more comfortable with the modelling techniques or in cases where discrete time approximations are less computationally expensive.

When discrete time approaches are used, it is important to choose an appropriate time step length. When discrete time modelling has a biological justification, the length of the interval should be chosen so that it can reasonably be assumed that the order of events within the same time step is unimportant. This means that we need to focus on the fastest process in the system, typically transmission dynamics rather than host demographic processes. When discrete time is used as an approximation, time needs to advance in very short steps; however, it is usually very difficult to decide how short the interval needs to be to avoid influencing model predictions. For example, Mancy et al. (34) showed that the outcome of spatial competition between two species differed between a continuous and a discrete time model, even for very short time steps. It is, thus, important to report whether a discrete or continuous time model is employed along with the rationale for the decisions made.<sup>8</sup>
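The sensitivity of predictions to the time step can be checked directly in simple cases. The sketch below assumes a deterministic SI model in a well-mixed population, di/dt = βi(1 − i), which has a known closed-form (logistic) solution, and compares it with a discrete-time approximation for several step lengths. The model, parameter values, and function names are chosen for illustration only; this is not the system studied by Mancy et al. (34).

```python
import math


def si_exact(t, beta, i0):
    """Closed-form solution of di/dt = beta * i * (1 - i)."""
    growth = i0 * math.exp(beta * t)
    return growth / (1.0 - i0 + growth)


def si_discrete(t, beta, i0, dt):
    """Discrete-time approximation using steps of length dt."""
    i = i0
    for _ in range(round(t / dt)):
        i += dt * beta * i * (1.0 - i)
    return i


if __name__ == "__main__":
    beta, i0, t = 1.0, 0.01, 5.0  # hypothetical values
    exact = si_exact(t, beta, i0)
    for dt in (1.0, 0.1, 0.01):
        approx = si_discrete(t, beta, i0, dt)
        print(f"dt={dt}: discrete={approx:.4f}, exact={exact:.4f}, "
              f"error={abs(approx - exact):.4f}")
```

In this example the error shrinks as the step shortens, but a step that looks short on the scale of the whole epidemic can still give a noticeably different prevalence, and in models without a closed-form solution no such reference value is available.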

In general, the decision about whether to model in continuous or discrete time is driven primarily by the biology of the system under study and associated research questions, rather than our motivations for modelling. Although when testing theory, time is often modelled discretely to support comparisons with data, if real-world processes are continuous, it is often preferable to model in continuous time and then aggregate model output to correspond to data intervals. Similarly, when applying theory, the timing of interventions should be modelled according to the feasibility of implementing these interventions in the real world. Examples of continuous time modelling paradigms are differential equations (both ordinary and PDEs) and simulation approaches such as the Gillespie algorithm (modelling discrete events in continuous time, **Figure 2**, column a),<sup>9</sup> whereas discrete time approaches include difference equations and cellular automata. The symbolic form of these is shown in **Figure 2** (column b).
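To make the Gillespie approach concrete, the following is a minimal sketch for a stochastic SIS model in a well-mixed population of `n` hosts. The choice of model, the rate expressions (infection at rate βI(n − I)/n, recovery at rate γI), and the function name are assumptions made for this example. Waiting times between events are drawn from an exponential distribution, so event times are real numbers rather than multiples of a fixed time step.

```python
import random


def gillespie_sis(n, i0, beta, gamma, t_max, rng):
    """Simulate one realisation; return a list of (time, infected count)."""
    t, i = 0.0, i0
    history = [(t, i)]
    while i > 0:
        rate_infection = beta * i * (n - i) / n
        rate_recovery = gamma * i
        total_rate = rate_infection + rate_recovery
        if total_rate == 0.0:
            break  # no further events are possible
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_max:
            break
        # Choose which event occurs, in proportion to its rate.
        if rng.random() * total_rate < rate_infection:
            i += 1
        else:
            i -= 1
        history.append((t, i))
    return history


if __name__ == "__main__":
    run = gillespie_sis(n=100, i0=5, beta=0.3, gamma=0.1, t_max=50.0,
                        rng=random.Random(42))
    print(f"{len(run) - 1} events; final state {run[-1]}")
```

To compare such output with data reported at fixed intervals, the event history can be aggregated afterwards, in line with the suggestion above to model in continuous time and then aggregate.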

#### Whether and How to Model Stochasticity

The final decision we discuss is whether to model epidemiological processes as deterministic or stochastic. Although familiar, this distinction bears repeating as it relates to differences in both the relationship between models and the real world, and to the steps required for their use. Deterministic models are those in which outcomes are entirely predictable based on the parameter values; in contrast, the output of stochastic models is not fully determined by the parameter values so cannot be predicted precisely. This means that deterministic models only need to be solved once for each set of parameter values, whereas stochastic models need to be run multiple times to gain good insight into the "average" or "typical" outcome. Running a model multiple times creates an operational overhead; however, the variation generated allows us to exploit information on the distribution of outcomes in real-world data when validating these models.
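The need to run stochastic models many times can be illustrated with a small example. The sketch below assumes a Reed-Frost chain binomial epidemic in a closed population, in which each susceptible host escapes infection from each infectious host independently in every generation; the model choice, parameter values, and function names are assumptions for illustration. Repeating the simulation with identical parameters yields a distribution of final epidemic sizes rather than a single value.

```python
import random
import statistics


def final_size(n, i0, p_contact, rng):
    """Total number of hosts ever infected in one realisation."""
    susceptible, infectious = n - i0, i0
    total = i0
    while infectious > 0:
        # Probability that a susceptible is infected by at least one host.
        p_infection = 1.0 - (1.0 - p_contact) ** infectious
        new_cases = sum(1 for _ in range(susceptible)
                        if rng.random() < p_infection)
        susceptible -= new_cases
        total += new_cases
        infectious = new_cases
    return total


def replicate(n_runs, seed=0, **parameters):
    """Run the model n_runs times with identical parameter values."""
    rng = random.Random(seed)
    return [final_size(rng=rng, **parameters) for _ in range(n_runs)]


if __name__ == "__main__":
    sizes = replicate(500, n=100, i0=1, p_contact=0.02)
    print(f"mean={statistics.mean(sizes):.1f}, "
          f"min={min(sizes)}, max={max(sizes)}")
```

When some runs die out early and others become large outbreaks, the mean alone can be a poor description of the "typical" outcome, so it is usually worth inspecting the whole distribution.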

<sup>8</sup>It is not uncommon to read articles in which this information is not provided. One tip to determine whether a model represents time as continuous or discrete is to search for the keyword "time step" or "timestep" in the text and figure captions.

<sup>9</sup>Although, in the Gillespie algorithm, time between events is "skipped over" such that simulation time progresses in a step-wise manner, it is nonetheless a continuous time approach because each step can have arbitrarily many decimal places (i.e., is a real number, to the limits of computational accuracy).

FIGURE 2 | Illustrative examples of deterministic and stochastic models and their symbolic formulation for continuous and discrete time.

The first issue that arises is whether deterministic or stochastic models provide a better representation of the system we are modelling. The discussion about whether the universe is truly deterministic or stochastic is an unresolved debate in the philosophy of science literature; however, because epidemiological systems are never fully known, it is often preferable to use stochastic models. Bolker (35) partitions stochasticity into three sources of random variability: process-related stochasticity, in the form of either endogenous or environmental stochasticity, and measurement error. *Endogenous stochasticity*<sup>10</sup> is variability that is inherent to the system itself and that would occur between realisations even under identical environmental (or experimental) conditions, including variability in host demographic processes and the number of secondary cases. *Environmental stochasticity* refers to the unpredictability of exogenous processes (i.e., those outside of the system of interest and that occur independently of it), such as extreme weather events that affect host demography or disease dynamics. Depending on where we locate the limit between our system and the environment, the same source of stochasticity might be thought of as endogenous or exogenous: for example, weather and climate stochasticity are usually treated as environmental; in contrast, stochasticity in individual farmer responses to policy interventions might be viewed as exogenous or as part of the system. In contrast to process stochasticity that exists regardless of whether we study the system, *measurement error* arises in conjunction with our role as scientists and refers to variability in data due to difficulties of measurement. Within this category, Clark and Bjørnstad (36) refer to measurement inaccuracy, missing data points, lags between the biological process of interest and measurable outcomes, and "hidden" system states that are not amenable to measurement; methods for dealing with measurement error are discussed in Calder et al. (37).
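To make the three sources concrete, the sketch below marks where each could enter a simple discrete-time outbreak simulation. It is an illustration only: the log-normal environmental shock, the binomial under-reporting model, and all parameter values are assumptions introduced here, not formulations taken from Bolker (35) or Clark and Bjørnstad (36).

```python
import math
import random

def binomial(n, p, rng):
    """Number of successes in n independent trials with probability p."""
    return sum(rng.random() < p for _ in range(n))

def simulate_outbreak(steps, rng, env_sd=0.0, report_prob=1.0,
                      s=200, i=5, beta=0.6, gamma=0.25):
    n = s + i
    true_cases, observed_cases = [], []
    for _ in range(steps):
        # Environmental stochasticity: an exogenous shock rescales transmission
        beta_t = beta * math.exp(rng.gauss(0.0, env_sd))
        # Endogenous stochasticity: chance in who is infected or recovers
        new_inf = binomial(s, 1.0 - math.exp(-beta_t * i / n), rng)
        new_rec = binomial(i, 1.0 - math.exp(-gamma), rng)
        s, i = s - new_inf, i + new_inf - new_rec
        # Measurement error: only a fraction of true cases is recorded
        true_cases.append(new_inf)
        observed_cases.append(binomial(new_inf, report_prob, rng))
    return true_cases, observed_cases

true_cases, observed_cases = simulate_outbreak(
    30, random.Random(1), env_sd=0.3, report_prob=0.6)
```

Setting `env_sd=0.0` or `report_prob=1.0` switches off the corresponding source, which is one way to include only one form of stochasticity at a time.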

Starting from deterministic models in the form of, for example, ODEs or difference equations (**Figure 2, a1**, **b1**), there are several ways in which process stochasticity can be incorporated into epidemiological models. Two common approaches are the inclusion of a stochastic error term into an otherwise deterministic framework (**Figure 2, a2**) and the use of a fully stochastic process model (e.g., a Markov model) (**Figure 2, a3, b2**).

<sup>10</sup>Bolker refers to endogenous stochasticity as demographic stochasticity; however, the term is awkward in infectious disease epidemiology in which both host and pathogen demographic processes are important.

In the first, epidemiological processes are modelled as having a deterministic component, with variability around this deterministic trajectory modelled as "noise." In a fully stochastic approach, the state of the system depends on previous states and a random component, with transitions occurring according to probabilities often represented in matrix form. If we decide to ignore states further back in time, we can employ results from mathematical Markov process theory, including those that facilitate simulation approaches using the Gillespie algorithm, a continuous time, event-driven approach (38, 39). In contrast to ODE models, advantages of simulating Markov processes (e.g., using the Gillespie algorithm or discrete-time equivalents) are that negative population counts and partial individuals are not possible and it is possible to identify a precise extinction time. For the Gillespie algorithm, there is a clear relationship with deterministic ODE approaches, including *R*0 calculations, meaning that results from these simpler models can be compared with simulation outcomes. The Gillespie algorithm assumes exponentially distributed waiting times, which are often unrealistic (40), so it may be necessary to combine exponential distributions *via* the so-called "method of stages" (41), or to combine the approach with alternative simulation algorithms to achieve other distributions, e.g., for infectious periods.
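A minimal sketch of the Gillespie algorithm for an SIR process is given below, assuming frequency-dependent transmission and illustrative parameter values. The second function illustrates the idea behind the method of stages by summing exponential stages to obtain a less dispersed (Erlang-distributed) infectious period. Function and parameter names are hypothetical.

```python
import random

def gillespie_sir(s, i, r, beta, gamma, rng):
    """Simulate events in continuous time until no infectious hosts remain."""
    n = s + i + r
    t = 0.0
    path = [(t, s, i, r)]
    while i > 0:
        rate_inf = beta * s * i / n      # rate of the next infection event
        rate_rec = gamma * i             # rate of the next recovery event
        total = rate_inf + rate_rec
        t += rng.expovariate(total)      # exponential waiting time
        if rng.random() * total < rate_inf:
            s, i = s - 1, i + 1          # infection
        else:
            i, r = i - 1, r + 1          # recovery
        path.append((t, s, i, r))
    return path

def staged_infectious_period(mean, stages, rng):
    """Sum of `stages` exponential stages with the requested overall mean."""
    rate = stages / mean
    return sum(rng.expovariate(rate) for _ in range(stages))

path = gillespie_sir(s=99, i=1, r=0, beta=0.5, gamma=0.2,
                     rng=random.Random(42))
extinction_time = path[-1][0]
```

Counts remain non-negative integers throughout, and `extinction_time` is the exact time of the last recovery, two of the properties highlighted above.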

In addition to determining the type of model to use, the decision between deterministic and stochastic models also affects the steps involved in model use. "Solving" a process model can refer to obtaining either long-run outcomes, often in the form of an equilibrium solution, or to obtaining the time path of the system states. For each set of parameters and initial conditions, solving a deterministic process model leads to a single time path. For simple deterministic systems, it is sometimes possible to solve for time paths or equilibria analytically (i.e., symbolically). For more complex systems, numerical methods that allow us to obtain approximate solutions are often available (e.g., numerical methods for solving differential equations such as Runge–Kutta and variants thereof).<sup>11</sup> This means that we only require one solution of the model per parameter set (and initialisation, where appropriate). In contrast, stochastic models lead to a set of solutions and associated probabilities. For some types of stochastic models, numerical methods are available to obtain certain general results (e.g., the stationary distribution of a Markov chain can be obtained from eigenvector relations, for which numerical methods are available), and some more complex models can be solved in this way if we are interested only in summary statistics such as means. However, in many cases, it is necessary to use stochastic simulation in which system states are computed as a function of previous states and transition probabilities, and for each initialisation and parameter set, multiple solutions are obtained. We are, therefore, required to simulate multiple times for each parameter set and initialisation and compute summary statistics on model output.
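As an illustration of obtaining a general result for a stochastic model without simulation, the sketch below computes the stationary distribution of a small Markov chain by power iteration, one simple numerical method for finding the left eigenvector associated with eigenvalue 1. The three-state transition matrix is hypothetical (for instance, a farm that is disease-free, infected, or under restriction) and was chosen only so that the result can be checked by hand.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Iterate pi <- pi P until the distribution stops changing."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")

# Hypothetical per-step transition probabilities (each row sums to 1)
P = [[0.9, 0.1, 0.0],
     [0.0, 0.7, 0.3],
     [0.2, 0.0, 0.8]]
pi = stationary_distribution(P)
```

For this matrix, solving pi = pi P by hand gives (6/11, 2/11, 3/11), so the numerical answer can be verified directly. Power iteration converges here because the chain is irreducible and aperiodic; other chains may need a different method.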

When deciding whether and how to incorporate different sources of stochasticity into process models, it is helpful to consider the system, the questions we want to answer and our motivations for using a model. In relation to the system, a commonly recognised point is that stochasticity causing random population size fluctuations has stronger effects in smaller systems. In disease ecology, stochasticity is more important when the host population is small, but also at the very beginning and end of an outbreak when there are the fewest infectious agents (18). Stochasticity is also important for certain questions: for example, process stochasticity is important when studying pathogen elimination (39), not just because pathogen populations are small close to elimination but also because populations can go extinct through random processes even when their deterministic equivalent persists. Further, incorporating existing knowledge about stochasticity in epidemiological processes can be intrinsic to certain questions: for example, O'Hare et al. (14) used knowledge of variability in the number of secondary cases to guide model choice decisions when investigating the role of "superspreaders." The type of answer required can also drive decisions. For example, mathematical tools for deterministic models are generally more developed, meaning that it is possible to obtain analytic expressions or precise numerical estimates of quantities of interest, and it is easier to examine threshold behaviours. In such cases, choosing between deterministic and stochastic models is, therefore, driven primarily by our questions, rather than our motivations. Nonetheless, especially when exploring theory, we may decide to ignore stochasticity altogether or include only one form at a time, because this allows us to isolate the effect of each in conjunction with parameter changes. Incorporating measurement error is important when the motivation for using models involves using or explaining data or observations affected by measurement error; when using models for exploring theory, it can often be ignored.
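The point that populations can go extinct by chance even when their deterministic equivalent persists can be illustrated with a standard branching-process approximation for the early phase of an outbreak. This is a textbook simplification and is not drawn from the cited studies. In the sketch, each event is either a new infection or a recovery, with the susceptible pool assumed to be effectively unlimited; the take-off threshold of 50 cases and the function names are arbitrary assumptions.

```python
import random

def goes_extinct(r0, rng, takeoff=50):
    """Follow the number infected until it hits 0 or a take-off threshold."""
    # With rates beta*I (infection) and gamma*I (recovery), the next event
    # is an infection with probability beta/(beta+gamma) = R0/(R0+1).
    p_infection = r0 / (r0 + 1.0)
    infected = 1
    while 0 < infected < takeoff:
        infected += 1 if rng.random() < p_infection else -1
    return infected == 0

def extinction_fraction(r0, runs, seed=0):
    rng = random.Random(seed)
    return sum(goes_extinct(r0, rng) for _ in range(runs)) / runs

estimate = extinction_fraction(r0=2.0, runs=4000)
```

Under these assumptions, theory gives an extinction probability of 1/*R*0 for a single initial case, i.e., about one half for *R*0 = 2, whereas a deterministic model with *R*0 > 1 always predicts an epidemic.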

#### APPLYING THE MODEL CONSTRUCTION APPROACH

To apply the approach outlined here, we would begin by identifying our motivations for model use to guide us towards the aspects of model construction that require the most attention. We would then identify the biological entities that need to be distinguished, and then consider whether and how each of these should be modelled, according to the dimensions discussed above, referring to the references provided and standard texts on process modelling. This approach can also be applied when analysing the relationship between studies reported in the literature, to compare and contrast model-based findings. This should make it easier to pinpoint complementarities between approaches used to address very similar questions about related (or even the same) epidemiological systems. Indeed, although the examples provided in this article all relate to bTB or FMD in cattle, different modelling decisions are made in different pieces of work, as was also the case for work focusing on the 2001 FMD epidemic (42). For example, Ferguson et al. (15) investigated the

<sup>11</sup>Although numerical solutions are often referred to as simulations, it is useful to distinguish between the two. A *simulation* is defined as the imitation of a process over time. In a stochastic simulation such as an ABM/IBM, the computer simulation is an imitation of real-world processes, but is the actual playing out of the model. In contrast, a numerical algorithm used to solve a differential equation model imitates the differential equation; strictly speaking it is, therefore, not a simulation of the real world processes (although it is a simulation of the differential equation model).

potential for exploiting local clustering of transmission to target culling, and chose an approach (deterministic moment closure) that formulated disease spread in the context of an ODE model (as in **Figure 1, 4d**) but in which individual states represent not just the status of individual farms, but the combined statuses of triplets of farms (e.g., not just states S, I, and R in an SIR model, but with S-S-I an explicit state representing the proportion of triplets with two susceptible farms and one infected farm). Geographical space was represented abstractly by "counting," on the map of farms in Great Britain, the proportion of times two neighbouring farms shared a neighbour (this proportion is commonly called the "clustering coefficient"). This decision may have been motivated both by their previous analytical approaches and the need to provide rapid, responsive advice. However, similar situations can lead to different decisions: Keeling et al. (28) had previously used deterministic moment closure models to describe epidemiological invasion scenarios, but in 2001 developed a stochastic "transmission kernel" simulation approach (as in **Figure 1**, **3a** with farms as the individuals and transmission probability declining with distance) in order to capture the explicit heterogeneity in transmission potential of FMD across Great Britain (28). Our examples are drawn from research on bTB and FMD, but for other ecological or epidemiological systems, different sets of model types might be more appropriate. For example, for diseases of wildlife, natural host groupings corresponding to herds may not exist, making this simplification inappropriate, or host movements or demographic processes might be highly seasonal, leading to different decisions about how to model time. In addition, because bTB and FMD are reportable diseases in the UK, good datasets exist for their tracking; however, for other diseases or in areas of the world where this is not the case, more limited data can influence modelling decisions.
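The idea of a transmission probability that declines with distance can be sketched as follows. The functional form of the kernel and every parameter value below are assumptions made for illustration; they are not the kernel estimated by Keeling et al. (28).

```python
import math

def kernel(distance_km, k0=0.05, d0=2.0, alpha=2.5):
    """Hypothetical per-step transmission hazard between two farms."""
    return k0 / (1.0 + (distance_km / d0) ** alpha)

def infection_probability(target, infected_farms):
    """Probability that a susceptible farm is infected in one time step."""
    hazard = sum(kernel(math.dist(target, src)) for src in infected_farms)
    return 1.0 - math.exp(-hazard)

# Farm coordinates in km (made up for the example)
infected = [(0.0, 0.0), (1.0, 0.5)]
near_farm, far_farm = (0.5, 0.5), (30.0, 40.0)
p_near = infection_probability(near_farm, infected)
p_far = infection_probability(far_farm, infected)
```

In a stochastic simulation, each susceptible farm would then become infected in a given step if a uniform random draw falls below its probability, so spatial heterogeneity in farm locations translates directly into heterogeneity in risk.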

Once possible modelling options are identified, their appropriateness can then be (re)considered with reference to the epidemiological system, our motivation for using modelling, and the precise questions asked. It is therefore valuable to identify explicit criteria for assessing whether modelling decisions are satisfactory in terms of their accuracy and level of detail. Drawing on military terminology, Holling (43) contrasted *strategic* models, usually designed to be as simple as possible to reveal potential explanatory generalities, and *tactical* models, deliberately higher in complexity because they are designed to predict the dynamics of specific systems. Historically, models used to explore theory have typically been simpler than those used to apply it, in part because they have been more amenable to analysis using mathematical techniques. However, contemporary techniques including principled use of computer simulation (44) and mathematical tools for the analysis of stochastic systems (45) have made it easier to conduct theory exploration and obtain general results, even for relatively complex models. Although still a common heuristic, the distinction between simple and complex models and their relative roles in relation to motivations for using models is beginning to break down (46).

In determining the appropriate level of complexity or output accuracy, we focus on three factors: the respects, the degree, and the specificity of the system to which results should apply (5). When we determine the *respects* in which a model is required to be accurate or sufficiently detailed, we are asking the qualitative question "what epidemiologically relevant phenomena do we want our model to reproduce?" For example, when using models to explore theory, we may decide that it is sufficient that our model provides information on whether a disease persists or goes extinct; however, when we use models to apply theory, we may also want the model to provide information on time until extinction or the spatial locations at which a pathogen is likely to persist the longest. When we determine the *extent* to which our model is required to be accurate, we ask the quantitative question of "how close do we need those phenomena to be to those seen in the real world?" In terms of model outputs, this question is primarily relevant when we use models to help us generate, test and apply theory. However, in relation to model inputs, such as parameter values and initial conditions, it can apply to all motivations for model use. When we determine the *system* for which our model is required to be sufficiently accurate or detailed, we are asking "for what systems do we want this to apply?" For example, much of the more theoretical work using process models, often in the form of theory exploration, is deliberately general and a specific disease system may not be mentioned (e.g., it may apply to abstract SIR processes); when applying theory, it is usually critical for the model to relate to a specific disease system, but it may be sufficient for it to be accurate for a country or region, or host breed. Decisions about the level of detail or accuracy required of a model will often be driven by practical considerations such as funding, publication targets, or data availability.

#### DISCUSSION

Historically, the majority of veterinary disease modelling has followed a statistical approach, in which the focus has been on characterising statistical associations between a response variable and explanatory variables. For example, we might use this approach to identify farm factors—e.g., sanitation practices that affect risk of an outbreak. This form of modelling becomes increasingly involved if the explanatory variables interact with one another, or if the response variable depends on its previous values. Although techniques have been developed to cope with interactions between the variables of interest, these become increasingly unwieldy as the number of interactions increases. Further, even the sophisticated techniques developed to account for these interactions usually only identify them as statistical relationships and do not explicitly represent the direction of causality in the links between them. Furthermore, models of this type are particularly reliant on all variables being available within a single dataset, using information on measurable states to infer knowledge of processes from interactions between variables that define those states. Process models, on the other hand, focus explicitly on biological processes. They can, therefore, generate considerably greater insight for investigations of the impact of changes to those processes, for example, to understand the population level impact of imperfect vaccination. Recent developments include the introduction of Bayesian modelling approaches, such as hidden Markov models that are underpinned by process models, and that bridge statistical and process approaches.

There are good reasons for increasing interest in process modelling among disease ecologists and veterinary epidemiologists. Yet, making decisions about how to construct process models, and knowing how to compare different approaches in the literature, is complicated by the existence of multiple process model types and the difficulty of establishing the relationship between them. In applied work, the model is often treated as a tool, so only the chosen modelling approach is described without comparison with alternatives, while in more theoretical work focused on developing new modelling approaches, space often limits comparative discussion to a small set of related modelling approaches. Further, most introductory texts on process modelling in disease ecology proceed by describing prototypical examples of a small range of modelling paradigms. This tends to obscure the relationships among modelling approaches and fails to make explicit the link between the appropriateness of different approaches for different systems, while reinforcing the popularity of particular paradigmatic approaches somewhat arbitrarily.

Ideally, model construction decisions should be guided primarily by the system, what we know about it, and our scientific questions. Nonetheless, our decisions are often constrained to some extent by practical considerations, including technical limitations (e.g., computational resources) and modelling knowledge. Although the increasing development of specialist software simplifies the mechanics of modelling, understanding the modelling assumptions embedded within any software used is important for accurate interpretation of outputs. As models incorporate more and more components—such as in the case of ABMs or complex models represented in matrix form—we can quickly reach a situation in which more information needs to be available to simulate or solve the model than can be held in computer memory. Time can also be a constraint if models take a long time to run or solve, especially if they need to be run or solved multiple times. Depending on which aspects of the outputs we choose to store, these can put a strain on storage capacity, and although hard disk storage capacity is increasingly cheap, managing these large volumes of data, both in terms of transferring data between devices and maintaining a file structure that is easy to navigate, can be challenging. In relation to technical knowledge, there are limitations to the number of techniques we can acquire, and it is often preferable to sacrifice some accuracy or detail in modelling decisions to allow us to use an approach that we understand well, in terms of its strengths, weaknesses, and underpinning assumptions. A strong understanding allows us to safeguard against known pitfalls, and critically, to better account for any assumptions in interpreting model outcomes. In interdisciplinary work or work at the research–policy interface where team members often have different skillsets, it can also be advantageous to use a form of modelling that all team members can understand.

In this article, we have described one way to approach model construction, based around a set of modelling decisions and their relationship with the system under study, the research questions, and our motivations in using modelling. In most cases, we have presented modelling decisions as though they were either/or decisions. In reality, within the same project, several motivations might underpin model use. Similarly, researchers often use several models, and do so with a range of motivations. For example, we might begin by using a modelling exercise to formalise our ideas about an epidemiological system, constructing a model that we use to explore theory, ultimately using it to test theory once we have acquired appropriate data (47). We might also try out multiple modelling frameworks in a single piece of work. For example, it is often very valuable to begin with a relatively simple model that we understand well or that has been analysed previously and incrementally add or change the epidemiological processes involved. This allows us to understand the effects of these changes, as well as to locate any errors in our logic or solution processes. For example, even if we are interested in the effects of host population heterogeneity, we often begin with a simple model of a well-mixed host population for comparison. Indeed, using multiple model types to address the same problem is often very useful, both within and between research teams, as redundancy, overlap, and replication serve to reduce the risk of unidentified errors (48).

To conclude, we believe that the structured approach presented here, based on the identification and classification of model construction decisions, should help those new to epidemiological modelling to reach a level of model construction expertise more quickly, while providing an analytic structure and terminology for more experienced readers. This conceptual analysis helps clarify the relationship between the biological system and the assumptions about it embedded in the model and highlights the similarities and differences between modelling approaches.

#### AUTHOR CONTRIBUTIONS

RM and PB developed the conceptual framework for model type analysis, planned, and drafted the manuscript. RM produced the figures. RK contributed scientific content and to manuscript revision. All authors have critically reviewed and revised the manuscript and approved the final product.

#### ACKNOWLEDGMENTS

The authors wish to thank the following for useful discussions and insightful feedback on earlier drafts: Laurie Baker, Katie Hampson, Caroline Millins, Patrick Prosser, Simon Rogers, Eva Smeti, Sofie Spatharis, and Hannah Trewby.

#### FUNDING

PB is supported by MRC ESEI grant G1100796/1. During the development of this paper, RM has received support from Wellcome Trust grant 095787/Z/11/Z and EPSRC EP/P505534/1.

#### REFERENCES


*bovis* in sympatric cattle and badger populations. *PLoS Pathog* (2012) 8(11):e1003008. doi:10.1371/journal.ppat.1003008


48. Thiele JC, Grimm V. Replicating and breaking models: good for you and good for ecology. *Oikos* (2015). p. 691–6.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Mancy, Brock and Kao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **Relevance of Indirect Transmission for Wildlife Disease Surveillance**

*Martin Lange<sup>1</sup> \*, Stephanie Kramer-Schadt <sup>2</sup> and Hans-Hermann Thulke<sup>1</sup>*

*<sup>1</sup> Department of Ecological Modelling, Helmholtz Centre for Environmental Research Leipzig – UFZ, Leipzig, Germany, <sup>2</sup> Leibniz Institute for Zoo and Wildlife Research, Berlin, Germany*

Epidemiological models of infectious diseases are essential tools in support of risk assessment, surveillance design, and contingency planning in public and animal health. Direct pathogen transmission from host to host is an essential process of each host–pathogen system and respective epidemiological modeling concepts. It is widely accepted that numerous diseases involve indirect transmission (IT) through pathogens shed by infectious hosts to their environment. However, epidemiological models largely do not represent pathogen persistence outside the host explicitly. We hypothesize that this simplification might bias management-related model predictions for disease agents that can persist outside their host for a certain time span. We adapted an individual-based, spatially explicit epidemiological model that can mimic both transmission processes. One version explicitly simulated indirect pathogen transmission through a contaminated environment. The second version simulated direct host-to-host transmission only. We aligned the model variants by the transmission potential per infectious host (i.e., basic reproductive number *R*0) and the spatial transmission kernel of the infection to allow unbiased comparison of predictions. The quantitative model results are provided for the example of surveillance plans for early detection of foot-and-mouth disease in wild boar, a social host. We applied systematic sampling strategies on the serological status of randomly selected host individuals in both models. We compared between the model variants the time to detection and the area affected prior to detection, measures that strongly influence mitigation costs. Moreover, the ideal sampling strategy to detect the infection in a given time frame was compared between both models. We found the simplified, direct transmission model to underestimate necessary sample size by up to one order of magnitude but to overestimate the area put under control measures. 
Thus, the DT model predictions underestimated surveillance efforts but overestimated mitigation costs. We discuss parameterization of IT models and related knowledge gaps. We conclude that the explicit incorporation of IT mechanisms in epidemiological modeling may pay off by enabling better-adapted surveillance and mitigation efforts.

**Keywords: indirect transmission, wildlife surveillance, wild boar, FMD, simulation model, contingency planning, environmental transmission, individual-based** *R***<sup>0</sup>**

#### **INTRODUCTION**

Host–pathogen models play an essential role in epidemiology (1). Epidemiological models are widely used to support risk assessment, surveillance design, and contingency planning (2–5). The driving force of any infectious disease is the transmission of the pathogen to susceptible hosts (6, 7), and its adequate representation in epidemiological models is therefore of crucial importance (8, 9).

#### *Edited by:*

*Tariq Halasa, Technical University of Denmark, Denmark*

#### *Reviewed by:*

*Matthew Denwood, University of Copenhagen, Denmark; Rodney Beard, University of Glasgow, UK*

> *\*Correspondence: Martin Lange martin.lange@ufz.de*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

> *Received: 31 August 2016 Accepted: 17 November 2016 Published: 30 November 2016*

#### *Citation:*

*Lange M, Kramer-Schadt S and Thulke H-H (2016) Relevance of Indirect Transmission for Wildlife Disease Surveillance. Front. Vet. Sci. 3:110. doi: 10.3389/fvets.2016.00110*

The relevance of indirect transmission (IT) without a vector or reservoir, but through a contaminated environment, has been demonstrated for pathogenic viruses, bacteria, prions, and macroparasites. Examples include highly contagious diseases of wildlife and livestock like foot-and-mouth disease [FMD (10), reviewed in Ref. (11, 12)], classical swine fever [CSF; (13, 14)], bovine tuberculosis [bTB; (15, 16)], brucellosis (17), avian influenza [AIV; (18)], porcine reproductive and respiratory syndrome [PRRS; (19)], and chronic wasting disease [CWD; (20)]. Zoonoses and human diseases with an IT mode include infections with influenza viruses (21), cholera bacteria (22, 23), hantaviruses (24), and *Salmonella* bacteria (25). For several pathogens, longevity outside the host was investigated under experimental conditions [see, e.g., Ref. (26) for FMD, CSF, BVDV, and PPV; (27) review FMD; (28) review poultry diseases; (29) review CSF; (30) CSF; (31, 32) AIV; (33) Influenza A, B; (34) cholera].

The need to incorporate indirect environmental transmission in epidemiological models has been asserted by several authors (20, 35, 36). Nevertheless, only recent modeling studies have considered this transmission mode explicitly [(18) AIV; (20) CWD; (37) cholera; (38) brucellosis]. Instead, the majority of epidemiological models follow a century-old postulate by modeling transmission proportionally to both the current number of infectious and the current number of susceptible individuals (39). Using this approach, Breban (40) elaborated the theory of incorporating IT in epidemiological models. Indeed, it is not always necessary to explicitly model all possible routes of pathogen transmission. One may argue, for example, that if the infectiousness of environmental contamination is short-lived compared to the host infectious period, then nothing is lost by summarizing everything in increased estimates of direct transmission (DT) (40). However, if empirical evidence suggests a more fundamental role of pathogen transmission through an environmental pathway, then the previous model paradigm circumvents the explicit consideration of biologically independent mechanisms. Such mechanisms may respond differently to interference, e.g., to control measures or treatments. Hence, models that summarize transmission in this way do not allow inferences to be made concerning the role of pathogen stages that can persist outside of their host. Interestingly, studies assessing the impact of IT on disease dynamics or disease mitigation are rare [see, e.g., Ref. (18, 41–43)].
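The contrast between the two paradigms can be sketched with a simple compartmental model in which infectious hosts shed pathogen into an environmental pool that decays over time. This is a generic illustration integrated with a basic Euler scheme, not the individual-based model used in this study; the symbols `shed` and `decay`, and all parameter values, are assumptions. The three variants below share the same *R*0 = (`beta_d` + `beta_e` × `shed` / `decay`) / `gamma`.

```python
def simulate(beta_d, beta_e, shed, decay, gamma=0.2, i0=0.001,
             dt=0.01, t_end=400.0):
    """Euler integration of an SIR model with an environmental pool w."""
    s, i, r, w = 1.0 - i0, i0, 0.0, 0.0
    peak_i, peak_t = i, 0.0
    for k in range(1, int(round(t_end / dt)) + 1):
        force = beta_d * i + beta_e * w   # direct plus indirect infection
        ds = -force * s
        di = force * s - gamma * i
        dr = gamma * i
        dw = shed * i - decay * w         # shedding and environmental decay
        s, i, r, w = s + dt * ds, i + dt * di, r + dt * dr, w + dt * dw
        if i > peak_i:
            peak_i, peak_t = i, k * dt
    return {"final_size": r, "peak_time": peak_t}

dt_only = simulate(beta_d=0.5, beta_e=0.0, shed=0.0, decay=1.0)
it_fast = simulate(beta_d=0.0, beta_e=0.5, shed=50.0, decay=50.0)
it_slow = simulate(beta_d=0.0, beta_e=0.5, shed=0.1, decay=0.1)
```

When contamination is short-lived (`it_fast`), the indirect model behaves almost like the direct one, in line with the argument above. When the pathogen persists for a long time relative to the infectious period (`it_slow`), the epidemic develops more slowly in this sketch, so the timing of outbreak events differs even though *R*0 is the same.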

Explicit consideration of an indirect environmental transmission mode is relevant not only for understanding experimental results or the dynamics of host–pathogen systems [e.g., Ref. (40, 43, 44)]: we claim that the explicit inclusion of environmental transmission in models of wildlife diseases may also be necessary for adequate predictions in the context of management activities, e.g., surveillance, mitigation, and contingency planning. Further, IT is particularly relevant in socially organized wildlife species, where direct contact is mainly restricted to the social group, and for multi-host pathogens, where direct contact between species is rare (45, 46). We addressed this hypothesis using a parameterized, stochastic, spatially explicit, individual-based model (SEIBM) designed for studying infectious diseases in landscape-scale populations of social (47–50) and multi-species wildlife hosts (51, 52).

We used the host–pathogen system of FMD in large wild boar (*Sus scrofa*) populations as a biological example. The wild boar is a social species, widely distributed in many parts of the world. It is the most abundant large mammal species in Europe (53) with increasing geographic range and population densities (53, 54) maintaining a number of infectious diseases (55, 56). FMD is one of the economically most important livestock diseases, which can be devastating in case of an incursion, like in the outbreaks in the UK in 2001 with more than 6.5 million animals culled and economic losses estimated at 5 billion £ (57, 58). FMD affects approximately 70 species of cloven-hoofed domestic and wild animals including wild boar (59). However, epidemiology of FMD in European wildlife populations is largely unknown. The FMD virus (FMDV) can survive outside the host for hours to months, depending on the environmental conditions. In pig slurry, FMDV was detectable for 14 days at 20°C and more than 100 days at 5°C in an experiment by Bøtner and Belsham (26). In a recent outbreak of FMD in wildlife and livestock in Bulgarian Thrace in 2011, wild boars were detected as being virus- and seropositive for FMD, suggesting the potential involvement of the species in FMD epidemics (59, 60).

The objective of this study was to evaluate whether infections with IT may require different surveillance and mitigation efforts than predicted by models based on DT. To this end, we extracted from the SEIBM seroprevalence time series as obtained under surveillance conditions and compared measures important for outbreak mitigation such as time to detection and the minimum sample size needed for disease surveillance.

#### **MATERIALS AND METHODS**

#### **Model Description**

#### Overview

The FMD wildlife model was based on a spatially explicit, stochastic, individual-based demographic model for wild boars (*S. scrofa*) in a geographic area with suitable habitat. Superimposed is a transmission and disease course model for the FMDV. Epidemiological data on FMDV infections in wild boar are available from the field (59) and laboratory experiments (61, 62). The model is documented following the ODD protocol [Overview, Design, and Details; (63, 64)].

#### *Purpose*

The aim of the modeling study was to provide an experimental environment to test the hypothesis that neglect of pathogen persistence outside its host is an inappropriate simplification from the perspective of surveillance or contingency planning. The model was designed to compare the predictions of explicit IT with those of equivalently parameterized DT. For this purpose, two model variants were constructed that differ only in the exclusion (DT) or inclusion (IT) of an environmental transmission model. Hence, the following model documentation applies to all simulations performed, with the DT and IT submodels substituting for each other (see Virus Transmission in Section "Details").

#### *State Variables and Scales*

The model comprises two major components: spatial habitat units and wild boar individuals. All processes take place on a raster map of spatial habitat units. Each cell represents a functional classification of the landscape denoting habitat quality and a scalar value denoting environmental pathogen load. The cells of the model landscape represent 4 km<sup>2</sup> (2 km *×* 2 km), encompassing a boar group's core home range (65). State variables comprise boar habitat quality of the grid cell. At run time, habitat quality is interpreted as breeding capacity, i.e., the number of female boars that are allowed to have offspring [explicit density regulation; (66)]. Furthermore, an FMDV state of the habitat cell represents environmental virus load and accumulates infection pressure as shed by viremic animals.

State variables of host individuals are the wild boar's age in weeks [where 1 week represents the approximate FMD infectious period in wild boar; (61, 62)], resulting in age classes: piglet (*<*8 months *±* 6 weeks), sub-adult (*<*2 years *±* 6 weeks), and adult (67). Each host individual has a location, which denotes its home range cell on the raster grid as well as its family group. The individual host animal comprises an epidemiological status (*susceptible*, *infected*, or *immune* after recovery or due to transient maternal antibodies). Sub-adult wild boar may disperse during the dispersal period (i.e., early summer).

#### *Process Overview and Scheduling*

The model proceeds in weekly time steps and processes are executed in the following order (see **Figure 1**): virus release, infection, dispersal of subadults, reproduction, death, and aging. In the first week of each year, mortality probabilities are assigned stochastically to represent annual fluctuations in wild boar living conditions, and female wild boars are assigned to breed or not, according to the carrying capacity of their home range cell.

#### Design Concepts

Wild boar population dynamics emerge from individual behavior, defined by age-dependent seasonal reproduction and mortality probabilities and age- and density-dependent dispersal behavior, all including stochasticity. The epidemic course in the DT model emerges from virus transmission within and between groups and wild boar dispersal. The epidemic course in the IT model emerges from virus excretion by infectious hosts, survival dynamics of infectious virus outside the host, contact to infectious doses, and wild boar dispersal.

We included stochasticity by representing demographic, behavioral, and pathogen parameters as probabilities or probability distributions. Annual fluctuations of living conditions are realized by annually varying mortality rates.

#### Details

#### *Initialization*

The model landscape represents 60 km *×* 60 km of connected wildlife habitat without barriers. The specified extent ensures that the epidemic wave does not reach the edge of the landscape before detection in any simulation. The 900 grid cells were randomly initialized with integer values of local breeding capacity in range 0, *. . .*, 3. Breeding capacity was scaled to result in an average wild boar density of 5 hosts/km<sup>2</sup> in January, i.e., before the reproductive season (68, 69). The average population size in January was 18,000 individuals.

One boar group was released to each habitat cell, where group size is six times breeding capacity. Initial age distributions were taken from the results of a 100 years model run [see Table S1 in Supplementary Material; (48)].

#### *Input*

The applied model setup does not include any external inputs or driving variables.

#### *Submodels*

Submodels are described where essential to understand the study. The Supplementary Material contains the complete descriptions of all submodels. A list of parameters with their values and sources is given in Table S2 in Supplementary Material.

#### *Virus Release*

The virus was released to the population by infection of five wild boars, randomly selected from the nine most central habitat cells. Release takes place in the sixth year of each simulation (see Simulation Experiments) to allow population dynamics to be established. Introduction was chosen in the season of most likely establishment of the infection according to the increasing population numbers, i.e., at the start of the reproductive season of wild boar.

#### *Disease Course*

The disease course following infection is modeled for each infected individual. The infectious period of a host *t*<sub>inf</sub> is 1 week. After the infectious period, hosts achieve lifelong immunity. We assumed minimum case lethality (61, 62).

#### *Virus Transmission*

*Direct Transmission.* Direct transmission in the model is a stochastic process. Parameters determine the probability of contracting the infection from an infectious group mate *P*<sub>inf</sub><sup>(*i*)</sup> and the probability of contracting the infection from an infectious animal in a neighboring group *P*<sub>inf</sub><sup>(*e*)</sup> (3 *×* 3 neighborhood) during 1 week. For each susceptible animal, the probability of becoming infected accumulates over all infectious animals within the group and in the neighborhood:

$$
\Pi_i = 1 - \left(1 - P_{\rm inf}^{(i)}\right)^{I_i} \left(1 - P_{\rm inf}^{(e)}\right)^{\sum_j I_j},\tag{1}
$$

where *I<sub>i</sub>* is the number of infected individuals in the home group *i* and *I<sub>j</sub>* is the number of infected individuals in wild boar groups of the eight neighboring cells *j ∈* {1, *. . .*, 8}. The model iterates over all individuals and stochastically sets each susceptible individual to infected if a uniformly distributed random number *r* drawn from *U*(0, 1) is smaller than Π*<sub>i</sub>* of its home cell.
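A minimal Python sketch of Eq. 1 and the subsequent random draw follows. The function names, the random seed, and the example group of 19 susceptibles are illustrative assumptions and are not taken from the model code; the two transmission probabilities are the DT values reported under Simulation Experiments.

```python
import random


def infection_probability(p_within, p_between, infected_home, infected_neighbors):
    """Eq. 1: weekly infection probability for a susceptible animal in one cell."""
    # Probability of escaping infection from every infectious group mate.
    escape_home = (1.0 - p_within) ** infected_home
    # Probability of escaping infection from every infectious animal
    # in the eight neighboring cells.
    escape_neighbors = (1.0 - p_between) ** sum(infected_neighbors)
    return 1.0 - escape_home * escape_neighbors


def count_new_infections(n_susceptible, prob, rng):
    """Draw r ~ U(0, 1) per susceptible animal; infect it if r < prob."""
    return sum(1 for _ in range(n_susceptible) if rng.random() < prob)


rng = random.Random(42)  # fixed seed, only for a reproducible illustration
pi = infection_probability(0.087, 0.00379, infected_home=1,
                           infected_neighbors=[0, 1, 0, 0, 0, 0, 0, 0])
new_cases = count_new_infections(19, pi, rng)
```

With one infectious group mate and one infectious animal in a neighboring cell, `pi` evaluates to roughly 0.0905.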

*Indirect Transmission.* We modeled indirect virus transmission *via* excretion of infectious material, decay of infectious material by time in the environment (i.e., outside of host individuals), and contact of hosts to infectious material in the environment. At contact, we modeled the effective infection stochastically with the event probability derived from a standard dose–response relation.

The weekly dynamics of the pathogen pool used in the model are based on parameters available from literature on a daily basis. The temporal evolution of the pathogen pool *C* of each cell combines exponential decay with a source term for the pathogen load added to the cell:

$$\frac{dC}{dt} = -\lambda C + s,\tag{2}$$

with λ being the decay constant λ = ln(2)/*T*<sub>1/2</sub>, *s* being the pathogen added to the cell per time unit, and *t* being time in weeks. For constant *s* and initial pool *C*<sub>0</sub>, the solution is

$$C\left(t\right) = \left(C_0 - \frac{s}{\lambda}\right)e^{-\lambda t} + \frac{s}{\lambda}.\tag{3}$$

Within one time step, *s* is constant. Thus, the pathogen pool can be calculated analytically as

$$C_{t+1} = \left(C_t - \frac{s}{\lambda}\right) e^{-\lambda} + \frac{s}{\lambda}. \tag{4}$$

The average available dose for uptake during the weekly time step is

$$\bar{C} = \int_{t}^{t+1} C\left(t'\right) \, dt' = \frac{C_t \left(1 - e^{-\lambda}\right) + s}{\lambda} + \frac{s\left(e^{-\lambda} - 1\right)}{\lambda^2}.\tag{5}$$
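The update rules in Eqs 4 and 5 can be sketched in Python as below. This is an illustration under stated assumptions, not the model implementation: the function names are hypothetical, the half-life is assumed to be expressed in the same unit as the time step (weeks), and `midpoint_integral` is only a numerical cross-check of Eq. 5 against Eq. 3.

```python
import math


def decay_constant(half_life):
    """lambda = ln(2) / T_1/2, with T_1/2 in units of the time step."""
    return math.log(2.0) / half_life


def update_pool(c_t, s, lam):
    """Eq. 4: pathogen pool at the end of one time step with constant source s."""
    return (c_t - s / lam) * math.exp(-lam) + s / lam


def available_dose(c_t, s, lam):
    """Eq. 5: pool integrated over one time step (closed form)."""
    return ((c_t * (1.0 - math.exp(-lam)) + s) / lam
            + s * (math.exp(-lam) - 1.0) / lam ** 2)


def midpoint_integral(c_t, s, lam, steps=10000):
    """Numerical integral of Eq. 3 over one time step (midpoint rule)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += ((c_t - s / lam) * math.exp(-lam * t) + s / lam) * h
    return total
```

Without a source term (`s = 0`) and a half-life of one time step, `update_pool` halves the pool, as expected from the definition of the half-life.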

The pathogen source *s* for a cell is determined from the number of infectious hosts in the cell and in neighboring cells. Hosts in infectious state excrete infectious material with constant daily rate (parameter *g*; i.e., 7*g* is the weekly excretion), measured in tissue culture infective dose 50% (TCID<sub>50</sub>) per day. A host animal spends a portion of daytime (parameter *p<sub>t</sub>*) in contact areas, i.e., areas subsequently reached by neighboring animal groups. Accordingly, excreted infectious material is distributed to different cells: *g*(1 *−* *p<sub>t</sub>*) doses adding to the pool of the home cell of the host, while (1/8)*g p<sub>t</sub>* doses are added to each of the eight neighboring cells. Therefore, the pathogen added to a cell on a weekly basis is:

$$s = 7g\left(\left(1 - p_t\right)I_i + \frac{1}{8}\, p_t \sum_j I_j\right). \tag{6}$$

Per host, individual contact to infectious material in the environment is determined as constant share (parameter *u* on a daily basis; i.e., 7*u* corresponds to the weekly share) of the available dose *C̄* in its home range cell. The weekly contact dose CD is

$$\text{CD} = 7u \bar{C}.\tag{7}$$

Effective infection after contact to a particular dose of infectious material is modeled stochastically as a binomial chance process so that the individual's weekly probability of becoming infected follows an exponential dose–response relation:

$$P_{\rm CD} = 1 - \left(1 - P_{\rm TCID50}\right)^{\rm CD},\tag{8}$$

with *P*<sub>TCID50</sub> being the probability of infection after contact to one TCID<sub>50</sub> dose. **Figure 2** shows the dose–response curve for *P*<sub>TCID50</sub> = 0.003 (70, 71).
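Eqs 6 to 8 chain together as sketched below in Python. The function names are hypothetical, and any values passed for *g*, *p<sub>t</sub>*, and *u* in a call are placeholders, since the fitted values are listed only in Table S2; only *P*<sub>TCID50</sub> = 0.003 is taken from the text.

```python
def weekly_source(g, p_t, infectious_home, infectious_neighbors):
    """Eq. 6: pathogen added to a cell per week (TCID50).

    g: daily excretion per infectious host; p_t: share of time spent in
    contact areas; infectious_neighbors: counts for the eight adjacent cells.
    """
    return 7.0 * g * ((1.0 - p_t) * infectious_home
                      + p_t * sum(infectious_neighbors) / 8.0)


def contact_dose(u, mean_dose):
    """Eq. 7: weekly contact dose from daily uptake share u and available dose."""
    return 7.0 * u * mean_dose


def infection_probability_from_dose(cd, p_tcid50=0.003):
    """Eq. 8: exponential dose-response relation."""
    return 1.0 - (1.0 - p_tcid50) ** cd
```

A contact dose of exactly one TCID<sub>50</sub> returns *P*<sub>TCID50</sub> itself, and a dose of zero returns a probability of zero.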

### **Parameters, Simulation Experiments, and Analysis**

#### Parameters

A complete list of all parameters with their values and sources is shown in Table S2 in Supplementary Material.

#### Parameterization of Transmission

In the DT model, the transmission is defined by scaling the two parameters *P*<sub>inf</sub><sup>(*i*)</sup> and *P*<sub>inf</sub><sup>(*e*)</sup>. In the IT model, an analog to *P*<sub>inf</sub> can be calculated from Eq. 8 and the dose available from one infectious host. To calculate the available dose, Eq. 5 is applied for 1 week after infection (i.e., parameter infectious period) including the excretion into the environment (i.e., *s >* 0) and for infinite time without further excretion. The total available dose over time is

$$
\bar{C}^{\infty} = \int_0^1 C^+(t) \, dt + \int_0^{\infty} C^-(t) \, dt,\tag{9}
$$

where *C*<sup>+</sup>(*t*) is the pathogen pool with pathogen excretion starting with *C*<sub>0</sub> = 0 (Eq. 3) and *C*<sup>−</sup>(*t*) is the pathogen pool without pathogen excretion for an initial pool equal to the value after the first week [i.e., *C*<sub>0</sub> = *C*<sup>+</sup>(1)]. Evaluating both integrals gives

$$
\bar{C}^{\infty} = \frac{s}{\lambda} \tag{10}
$$

or, without stressing mathematics, it is the product of added material *s* and average lifetime of the pathogen in the environment τ = 1/λ.

**FIGURE 2 | Dose–response curves for wild boar (*P*<sub>TCID50</sub> = 0.003)**. Inset: linear ordinate.
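For readers who wish to follow the step from Eq. 9 to Eq. 10, the intermediate expressions can be sketched as follows, assuming Eq. 3 with *C*<sub>0</sub> = 0 for the excretion phase and with *s* = 0 for the decay phase:

$$
\begin{aligned}
C^+(t) &= \frac{s}{\lambda}\left(1 - e^{-\lambda t}\right), \qquad C^-(t) = C^+(1)\, e^{-\lambda t},\\
\int_0^1 C^+(t)\, dt &= \frac{s}{\lambda} - \frac{s}{\lambda^2}\left(1 - e^{-\lambda}\right),\\
\int_0^{\infty} C^-(t)\, dt &= \frac{C^+(1)}{\lambda} = \frac{s}{\lambda^2}\left(1 - e^{-\lambda}\right),\\
\bar{C}^{\infty} &= \frac{s}{\lambda}.
\end{aligned}
$$

The two terms in 1 *−* *e*<sup>−λ</sup> cancel, which is why the total dose depends only on the amount excreted and the decay constant.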

With Eqs 7 and 8, this gives

$$P_{\rm inf}^{(i)*} = 1 - \left(1 - P_{\rm TCID50}\right)^{7 u s_i/\lambda} \tag{11}$$

$$P_{\rm inf}^{(e)*} = 1 - \left(1 - P_{\rm TCID50}\right)^{7 u s_e/\lambda},\tag{12}$$

with newly added pathogen *s<sub>i</sub>* = 7*g*(1 *−* *p<sub>t</sub>*) for within-group transmission and *s<sub>e</sub>* = 7*g*(1/8)*p<sub>t</sub>* for between-group transmission.

By choosing *P*<sub>inf</sub><sup>(*i*)</sup> = *P*<sub>inf</sub><sup>(*i*)∗</sup> and *P*<sub>inf</sub><sup>(*e*)</sup> = *P*<sub>inf</sub><sup>(*e*)∗</sup>, both models produce the same basic reproductive number *R*<sub>0</sub> (for validation, see **Figure 3**).

#### Parallel of *R*<sub>0</sub> in DT and IT Models

The DT model was parameterized to mimic the IT model in terms of the basic reproduction number *R*<sub>0</sub>. Accounting for transmission within and between groups, *R*<sub>0</sub> was calculated for both scales of spatial transmission separately. This gives the expected number of infections from one case to its group-mates *R*<sub>0</sub><sup>(*i*)</sup> and to the animals of neighboring groups *R*<sub>0</sub><sup>(*e*)</sup>, summing up to *R*<sub>0</sub> = *R*<sub>0</sub><sup>(*i*)</sup> + *R*<sub>0</sub><sup>(*e*)</sup>.

In the DT model with an infectious period of 1 week, *R*<sub>0</sub> is a linear function of *P*<sub>inf</sub>:

$$R_0^{(i)} = S_i P_{\rm inf}^{(i)} \tag{13}$$

$$R_0^{(e)} = S_e P_{\rm inf}^{(e)} \tag{14}$$

*S<sub>i</sub>* is the number of susceptible hosts in the group of the infectious individual. *S<sub>e</sub>* is the number of susceptible hosts in its neighboring groups.

**between-group component (average of 500 simulations)**. Black: without population dynamics, white: with population dynamics. Lines indicate the theoretical values, *R*<sub>0</sub><sup>(*i*)</sup> = 1.653 and *R*<sub>0</sub><sup>(*e*)</sup> = 0.606. Numbers indicate *p*-values of two-sided Mann–Whitney *U* tests of total *R*<sub>0</sub> without population dynamics against DT model (*H*<sub>0</sub>: not different from DT).

We can calculate *R*<sub>0</sub> from the parameters of the IT model using Eqs 11 and 13 for within-group transmission and Eqs 12 and 14 for between-group transmission:

$$R_0^{(i)} = S_i \left( 1 - \left( 1 - P_{\rm TCID50} \right)^{7 u s_i/\lambda} \right) \tag{15}$$

$$R_0^{(e)} = S_e \left( 1 - \left( 1 - P_{\rm TCID50} \right)^{7 u s_e/\lambda} \right) \tag{16}$$

The exponent in Eqs 15 and 16 can be transformed to 7*us*/λ = 7*usT*<sub>1/2</sub>/ln(2). Thus, *R*<sub>0</sub> in the IT model can be kept constant over arbitrary pathogen half-life *T*<sub>1/2</sub> by compensatory scaling of the uptake *u*, i.e., *u × T*<sub>1/2</sub> is constant (see **Figure 4**). With pathogen half-life approaching 0, the IT model becomes equivalent to the DT model as pathogen uptake becomes instantaneous.
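The compensatory scaling can be illustrated with a short Python sketch of Eq. 15. The weekly source `s = 1.0e6` and the 19 susceptible group mates are hypothetical placeholders, and the half-life is assumed to be given in the same time unit as the rates; the scaling *u* = 4 × 10<sup>−6</sup>/*T*<sub>1/2</sub> and *P*<sub>TCID50</sub> = 0.003 are the values stated in the text.

```python
import math


def r0_component(n_susceptible, u, s, half_life, p_tcid50=0.003):
    """Eq. 15 / Eq. 16: R0 component for a given uptake u and source s."""
    lam = math.log(2.0) / half_life
    exponent = 7.0 * u * s / lam  # equals 7 * u * s * T_1/2 / ln(2)
    return n_susceptible * (1.0 - (1.0 - p_tcid50) ** exponent)


# Half-lives 1/8, 1/4, ..., 32, 64 as in the simulation experiments.
half_lives = [2.0 ** k for k in range(-3, 7)]

# Uptake scaled inversely to the half-life, so that u * T_1/2 stays constant.
values = [r0_component(19, 4e-6 / t_half, s=1.0e6, half_life=t_half)
          for t_half in half_lives]
```

All entries of `values` coincide up to floating-point error, because the half-life cancels from the exponent.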

#### Independent Variables

The primary independent variable was the pathogen half-life *T*1/2.

#### Simulation Experiments

We performed simulations for the IT model with environmental pathogen half-life *T*<sub>1/2</sub> *∈* {1/8, 1/4, 1/2, *. . .*, 32, 64} days (**Figure 4**). To keep *R*<sub>0</sub> constant over all IT simulations, we scaled *u* according to *u* = 4 *×* 10<sup>−6</sup>/*T*<sub>1/2</sub>. All parameter combinations resulted in *R*<sub>0</sub> = 2.259. For comparison, we repeated the simulations with the DT model. To achieve the same *R*<sub>0</sub> as the IT model, transmission parameters were scaled to *P*<sub>inf</sub><sup>(*i*)</sup> = 0.087 and *P*<sub>inf</sub><sup>(*e*)</sup> = 0.00379. Each parameter set was repeated 500 times.
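As a plausibility check, the reported DT parameters can be inserted into Eqs 13 and 14. The Python sketch below assumes that every cell holds the average group of 20 animals (5 hosts/km<sup>2</sup> on 4 km<sup>2</sup>), so that one case meets 19 susceptible group mates and 160 susceptible animals in the eight neighboring groups. These counts are our assumption for illustration and are not stated in the text.

```python
p_within, p_between = 0.087, 0.00379  # DT transmission parameters from the text

group_size = 5 * 4              # assumed: average density times cell area
s_within = group_size - 1       # susceptible group mates of the index case
s_between = 8 * group_size      # susceptible animals in eight neighboring groups

r0_within = s_within * p_within      # Eq. 13
r0_between = s_between * p_between   # Eq. 14
r0_total = r0_within + r0_between
```

Under this assumption, the components come out at about 1.653 and 0.606 and the total at about 2.259, which agrees with the reported values to rounding.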

We performed supplementary simulations to measure an individual-based equivalent of *R*<sub>0</sub> (20) in order to verify accordance of the transmission model with the theoretical calculations for the basic reproduction number. This was achieved by allowing only the first disease case per model run to be infectious and counting the number of secondary infections in the initially infected cell and in its neighboring cells. The theoretical calculations neglect population turnover; therefore, in the third set of simulations, reproduction and mortality were deactivated from the week of pathogen introduction onward. The model ran for 100 times the pathogen half-life after the initial infection to make sure that the environmental reservoir had completely decayed and no secondary infections were missed in the analysis.

#### Dependent Variables

We recorded seroprevalence time series for each run on a weekly basis as the first-order dependent variable. These prevalence time series were then used to determine second-order dependent variables: (1) time to detection for fixed weekly sample sizes, (2) size of the outbreak at the time of detection, and (3) sample sizes needed to detect the disease within an *a priori* specified time frame. For second-order dependent variables, see Section "Analysis."

#### Analysis

We mimicked systematic surveillance on the seroprevalence outcome *p* of the DT and the IT model, deriving the following second-order dependent variables from prevalence time series.

#### *Time to Detection*

Given a weekly sample size *n* and seroprevalence *p*, the probability to not find any seropositives in a particular week *t* is

$$
\hat{P}_0\left(t\right) = \left(1 - p\left(t\right)\right)^n.\tag{17}
$$

The probability of not finding any seropositives until the given week can be determined as

$$P_0\left(t\right) = \prod_{i=0}^{t} \hat{P}_0\left(i\right). \tag{18}$$

Hence, the probability to detect the disease until the given week is

$$P_D\left(t\right) = 1 - \prod_{i=0}^{t} \hat{P}_0\left(i\right). \tag{19}$$

For each model run, the first week of *P<sub>D</sub>*(*t*) *≥* 0.95 determines the time of detection. Subtracting the week of virus incursion, this gives the time to detection *t<sub>D</sub>* of the individual run. The geometric mean of the distribution over the runs gives the time to detection *t<sub>D</sub>* with 95% confidence.
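Eqs 17 to 19 amount to accumulating the weekly probability of finding no seropositive animal. A minimal Python sketch follows; the function name and the example prevalence series are hypothetical, and the series is assumed to start at the week of virus incursion so that the returned index equals *t<sub>D</sub>*.

```python
def detection_week(prevalence, n, confidence=0.95):
    """Return the first week t with P_D(t) >= confidence, or None.

    prevalence: weekly seroprevalence p(0), p(1), ... as fractions;
    n: weekly sample size.
    """
    p_none = 1.0  # running product of Eq. 18
    for week, p in enumerate(prevalence):
        p_none *= (1.0 - p) ** n        # Eq. 17 for this week
        if 1.0 - p_none >= confidence:  # Eq. 19
            return week
    return None


# Made-up seroprevalence series, sampled with 14 animals per week.
example = [0.0, 0.01, 0.02, 0.05, 0.1, 0.2]
week = detection_week(example, n=14)
```

For this made-up series, the cumulative detection probability first exceeds 0.95 in week 5.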

Sample sizes for the underlying surveillance scheme were determined on a monthly basis according to the following equation (72):

$$n_{\text{month}} = \left(1 - (1 - \text{CL})^{\frac{1}{N \times p}}\right) \left(N - \frac{N \times p - 1}{2}\right),\tag{20}$$

with true population size *N*. Parameters of interest were CL = 95%, *p* = 5% and 1%. The required sample size was 58.3 per month (14 per week) for *p* = 5% and 295.6 per month (69 per week) for *p* = 1%.
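The monthly sample sizes can be reproduced from Eq. 20 with the Python sketch below, reading the exponent as 1/(*N × p*), i.e., the reciprocal of the expected number of seropositive animals. Taking *N* = 18,000, the average January population size given under Initialization, is our assumption; the function name is hypothetical.

```python
def monthly_sample_size(population, design_prevalence, confidence=0.95):
    """Eq. 20: sample size to detect the design prevalence with given confidence."""
    seropositives = population * design_prevalence  # N * p
    return ((1.0 - (1.0 - confidence) ** (1.0 / seropositives))
            * (population - (seropositives - 1.0) / 2.0))


n_5_percent = monthly_sample_size(18000, 0.05)
n_1_percent = monthly_sample_size(18000, 0.01)
```

Under this assumption, the function returns about 58.3 for *p* = 5% and about 295.6 for *p* = 1%, matching the monthly values reported in the text.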

#### *Outbreak Size*

The area affected by the disease (area of cells infected) before detection *A*<sub>aff</sub> was determined as a measure of the spatial extent of the outbreak.

#### *Required Sample Size*

The probability to detect the disease before the given week is calculated according to Eq. 19. Setting this probability equal to the confidence level CL and solving for the sample size gives the weekly sample size needed to detect the disease in a given time frame *t* for a given seroprevalence time series:

$$n_D = \frac{\ln(1 - \text{CL})}{\ln\prod_{i=0}^{t} (1 - p(i))}.\tag{21}$$

We calculated the required weekly sample sizes for each model run.
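A Python sketch of Eq. 21 is given below. The function name and the example series are hypothetical; the logarithm of the product is computed as a sum of logarithms, and all prevalence values are assumed to be below 1.

```python
import math


def required_weekly_sample(prevalence, confidence=0.95):
    """Eq. 21: weekly sample size reaching the confidence within the series."""
    # ln of the product over weeks equals the sum of ln(1 - p(i)).
    log_escape = sum(math.log(1.0 - p) for p in prevalence)
    if log_escape == 0.0:
        return math.inf  # no seropositive animals, detection is impossible
    return math.log(1.0 - confidence) / log_escape


series = [0.0, 0.01, 0.02, 0.05]  # made-up weekly seroprevalence
n_d = required_weekly_sample(series)

# Inserting n_d back into Eq. 19 recovers the confidence level.
p_detect = 1.0 - math.prod((1.0 - p) ** n_d for p in series)
```

The result is generally not an integer; in practice it would be rounded up to the next whole animal.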

#### *Statistical Analysis*

For each simulated value of *T*<sub>1/2</sub> in the IT model, we compared distributions of time to detection *t<sub>D</sub>* and weekly sample size needed *n<sub>D</sub>* to the outcome of the DT model using the Mann–Whitney *U* test (*H*<sub>0</sub>: distribution with IT not greater than distribution with DT). Similarly, distributions of *A*<sub>aff</sub> were compared to the outcome of the DT model using the Mann–Whitney *U* test (*H*<sub>0</sub>: distribution with IT not less than distribution with DT). Significance was defined as *p*-value *<* 0.01.

#### **RESULTS**

#### **Basic Reproduction Number**

The individual-based equivalent to *R*<sup>0</sup> did not differ systematically from the theoretical calculations (compare points to lines in **Figure 3**). Differences between IT and DT models were not significant (Mann–Whitney *U*, without population dynamics: *p ≥* 0.3, black fill and numbers in **Figure 3**; with population dynamics: *p ≥* 0.35, white fill in **Figure 3**).

#### **Seroprevalence**

Seroprevalence increased most rapidly in the DT model (**Figure 5**). The first maximum was reached after less than 40 weeks. In the IT model with equal *R*0, the increase of seroprevalence slowed down with increasing pathogen half-life (**Figure 5**, numbers).

#### **Time to Detection**

In the first experiment, i.e., detection of 5% seroprevalence with 95% confidence within 1 month of sampling, the surveillance design required 14 samples per week. Applying this sample size to the time series of the DT model, the disease was detected 13.3 weeks after incursion with 95% confidence (geometric mean, **Figure 6A**, left-most box). With the IT model, time to detection depended on the half-life of pathogen *T*<sub>1/2</sub> (**Figure 6A**). Already at *T*<sub>1/2</sub> *>* 1 day, detection times were significantly longer than in the DT model (Mann–Whitney *U* test, *p <* 0.01). For a half-life of 16 days, time to detection increased to 23.9 weeks. When half-life was 64 days (maximum simulated), time to detection more than doubled compared to the DT model and reached 36.6 weeks.

In the second experiment (detection of 1% seroprevalence with 95% confidence within 1 month of sampling, 69 samples per week), the DT model resulted in detection within 8.6 weeks (**Figure 6B**, left-most box). Increase of time to detection was significant for *T*1/2 *>* 1 day (**Figure 6B**). *T*1/2 = 16 days resulted in 15.1 weeks and *T*1/2 = 64 days in 22.2 weeks to outbreak detection.

#### **Outbreak Size**

In both experiments (design prevalence of 5 and 1%), the spatial extent of the outbreaks *Aaff* in the IT model decreased significantly compared to the DT model for *T*1/2 *>* 1/2 and *T*1/2 *>* 1 day, respectively (**Figures 7A,B**).

#### **Required Sample Size**

We calculated the weekly sample size for detection within 9 weeks with 95% confidence. In the DT model, an average of 69 samples per week was necessary for detection with 95% confidence (**Figure 8**, left-most box). With the IT model and for pathogen half-life *T*1/2 *>* 1/2 day, the required sample size increased exponentially (**Figure 8**). With *T*1/2 = 16 days, the required sample size was 406 per week. For the maximum half-life of 64 days, 828 samples per week were required for detection within 9 weeks.

### **DISCUSSION**

For a wildlife host–pathogen system with a social host species, we investigated the consequences of an *a priori* assumption of direct host-to-host transmission in models for surveillance design.

**half-life**. Outlier symbols (+) show 5 and 95% quantiles. Text shows geometric means. Asterisks show significance of Mann–Whitney *U* test against DT model (*H*0: not greater than DT; \**p <* 0.05, \*\**p <* 0.01, \*\*\**p <* 0.001).


We show that the simplified, DT model underestimated necessary sampling efforts by up to one order of magnitude, but overestimated the outbreak area that would receive control or mitigation measures. Thus, simplifying transmission risk as being proportional to the abundance of infectious and susceptible individuals hindered estimation of the most appropriate surveillance and contingency parameters.

The outcomes of a DT model were compared to results from equivalently parameterized IT models with different environmental pathogen persistence. In abstract models, the DT model is a special version of IT assuming persistence time of infectious pathogen in the environment being 0 (40). Here, we are talking about explicit process models tailored to surveillance design in the field. In the field, direct and IT modes correspond with different biological mechanisms that need adequate representation in a model to allow targeted manipulations (see model documentation). The inclusion of environmental transmission is no longer a matter of model re-parameterization but corresponds to a structural change in the model. In this sense both models, the direct and the IT model become fundamentally different. Our results pinpoint the relevance of a decision on whether environmental transmission needs to be represented in a model or not already prior to making predictions. In the logic of our analysis, however, it was necessary to allow seamless transition between models in spite of two alternative transmission mechanisms involved. We have achieved the virtual equivalence of the models while keeping the transmission potential per infected host unchanged.

Environmental transmission in a disease model might be represented assuming prolonged infectiousness of infected hosts along with prolonged half-life of the pathogen in the environment. Logically then, prolonged pathogen persistence in the environment leads to increased transmission potential of the average infected host in turn changing disease dynamics [see, e.g., Ref. (40)]. Here, we were not interested in theoretical variation of the infectious potential of infected hosts across alternative pathogens. Rather, we were addressing alternative models of the same infection, e.g., a pathogen with *R*<sup>0</sup> established in experiments. This approach was fundamental to the presented comparative assessment of model predictions on a particular disease, i.e., when the DT and the IT version of the model are aligned by the *R*<sup>0</sup> value.

We focused the comparative assessment of the different transmission models on three measurements for two surveillance schemes: (1) time to detection of an outbreak *t<sub>D</sub>*, (2) spatial extent of the outbreak *A*<sub>aff</sub> at the time of detection, and (3) the sample size required for outbreak detection within a prescribed time frame.

Indirect transmission slowed down the increase in *seroprevalence* compared to DT with equal *R*<sub>0</sub>. An IT route through the environment results in prolonged infectiousness beyond the infectious period of the host. This causes delayed infections compared to the DT mode, where the infectious period of the hosts limits the time span for new infections. Outbreaks governed by IT may therefore progress much more slowly and hence less obviously.

*Time to detection t<sup>D</sup>* is a central measure to be minimized by a surveillance scheme (73). The underestimated time to detection in the DT model will impede the realized probability of detection of a given surveillance design. Therefore, a surveillance scheme based on the estimates from the DT model [e.g., Ref. (74)] would not meet its aim of detecting an outbreak within the time horizon it was designed for. The pathogen would circulate undetected in the wildlife population longer than expected, therewith increasing the risk of infection of other hosts, e.g., livestock, and the risk of far range spread by transportation or airborne aerosols [e.g., reviewed for FMD in Ref. (75)].

The *spatial extent of the outbreak* *A*<sub>aff</sub> reflects the area under intervention measures to be implemented after outbreak detection. *A*<sub>aff</sub> was overestimated by the DT model. With IT but equal *R*<sub>0</sub>, the disease spreads more slowly than with DT but also has more time to spread due to later detection. Due to the continuous surveillance scheme with accumulation of chance of detection over time, the longer period of undetected pathogen circulation could not completely compensate for the slower spread; thus, outbreak size at detection was smaller. Control and restriction zones would be oversized if designed on estimates of undetected spread from a DT model. Thereby, the applied measures would be overly expensive and an unnecessary burden for the livestock sector (76).

The DT model underestimated the *required sample size* per time unit for disease detection within a given time frame. This measure quantifies the effort that is actually necessary to achieve the original aim of the surveillance program, namely, outbreak detection within a prescribed time horizon with given confidence. The extreme increase of the sample size for long pathogen persistence suggests that other methods than testing host individuals for seropositivity may be necessary for the surveillance of certain diseases (77, 78).

Remarkably, time to detection and required sample size differed from the predictions of the DT model for pathogen half-life as short as 1 day. This time span is almost one order of magnitude shorter than the infectious period of 1 week. This fact emphasizes the relevance of IT, even in the absence of extreme pathogen longevity.

The model used in this study has been previously applied for risk assessment (47), for assessment of disease control measures (79), and to contribute to the understanding of wildlife host–pathogen systems (48, 49, 51). In this study, we extend our previous work by integrating IT and compare surveillance-related predictions of different model versions.

We restricted the model versions to either DT or IT, but did not combine both. Although DT is likely to play a role in most host–pathogen systems with IT mode, we were interested in the differences between the two modes. As the IT model with short pathogen half-life resembles the DT model, we nonetheless examined a continuous transition between the aggregation to DT and the explicit IT model.

Numerous empirical and modeling studies dealt with the quantification of indirect, particularly airborne transmission of FMD and other diseases between domestic livestock holdings [e.g., Ref. (71, 80–84), reviewed for FMD in Ref. (85)], but IT of FMDV in wildlife animals has, to our knowledge, not yet been quantified. We developed a modeling approach that breaks down IT into components that are accessible to experimental measurements, namely pathogen shedding, survival/decay in the environment, contact with infectious material, and infection according to a dose–response relationship. Although some experiments quantified pathogen excretion and secretion of FMDV [reviewed in Ref. (12)] and other pathogens [see, e.g., Ref. (86) for CSF] by domestic animals, knowledge for wildlife is rare (87). The large differences between domestic animal species regarding the shedding rates of FMDV (12, 75) call for further attention to this issue. The same applies for the susceptibility of different species, i.e., the dose–response relation (12, 75). Some quantification for domestic animals can be found in the literature [e.g., Ref. (70, 88) for FMD], but the qualitative relation between dose in the environment and probability of infection is often unclear (89, 90). Survival outside the host has been investigated for several pathogens in animal products and excrements under laboratory conditions (for references, see INTRODUCTION), but further research is necessary for environmental factors that influence pathogen survival. The contact of animals with viral contamination in the environment remains the most uncertain parameter. Here, an inverse parameter fitting approach could aid the quantification. Given assumptions for the other parameters, contact to viral contamination could be estimated from the probability of infection.

Experimental investigations of virus survival outside the host depict striking dependence on temperature and humidity [see, e.g., Ref. (26) for FMD, CSF, BVDV, and PPV; (91) for PRRS virus; (92) for influenza A]. This fact gives rise to seasonal fluctuations of the magnitude of IT. Indeed, for several viral diseases, fluctuations of their transmission were associated with climatic seasonality, partly related to virus survival outside the host [see, e.g., Ref. (92, 93) for influenza viruses; (94) for hepatitis A]. Therefore, climatic factors are expected to play a role in regional variations of the epidemiology of infectious diseases with an IT mode.

With this work, we contribute to the research on IT, which is still in an early stage but attracting increasing attention. Previous work focused on the impact of IT on key figures of host–pathogen systems such as the basic reproductive number (20), disease persistence (41), and formal conditions of relevance for modeling (40).

Our results resemble findings by Wearing et al. (1) and Almberg et al. (20), which show that a neglect of prolonged infectiousness, e.g., through environmental pathogen stages or inappropriate assumptions about the infectious period, may result in an underestimate of *R*<sub>0</sub>, if derived from the prevalence growth rate. Reciprocally, in our study, prevalence growth rates decreased under IT despite equal reproductive potential (*R*<sub>0</sub>). Thus, we transferred the findings regarding the relevance of IT from a theoretical underestimation of infection dynamics, i.e., *R*<sub>0</sub>, to the application-oriented context of designing surveillance of any particular wildlife disease, i.e., *R*<sub>0</sub> being fixed.

We conclude that a simplified aggregation of transmission processes, particularly a neglect of environmental pathogen stages, may considerably bias model predictions of the performance of disease surveillance and mitigation strategies. We state that this applies even for pathogens with an average environmental survival time that is short compared to the infectious period of the host.

#### **AUTHOR CONTRIBUTIONS**

ML and H-HT conceived and designed the experiments; ML performed the experiments and analyzed the data; and ML, SK-S, and H-HT developed the model and wrote the manuscript.

#### **FUNDING**

ML was partially funded by the European Food Safety Agency (EFSA).

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://journal.frontiersin.org/article/10.3389/fvets.2016.00110/full#supplementary-material.

#### **REFERENCES**

Islands: implications for transmission to wildlife. *Auk* (2008) 125(2):445–55. doi:10.1525/auk.2008.06235


swine fever – legend or actual epidemiological process? *Prev Vet Med* (2012) 106:185–95. doi:10.1016/j.prevetmed.2012.01.024


porcine reproductive and respiratory syndrome virus in aerosols. *Vet Res* (2007) 38:81–93. doi:10.1051/vetres:2006044


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Lange, Kramer-Schadt and Thulke. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Impact of Network Activity on the Spread of Infectious Diseases through the German Pig Trade Network

#### *Karin Lebl<sup>1</sup>\*, Hartmut H. K. Lentz<sup>1</sup>, Beate Pinior<sup>2</sup> and Thomas Selhorst<sup>3</sup>*

*<sup>1</sup> Institute of Epidemiology, Friedrich-Loeffler-Institute, Greifswald, Insel Riems, Germany, <sup>2</sup> Institute for Veterinary Public Health, University of Veterinary Medicine Vienna, Vienna, Austria, <sup>3</sup> Unit Epidemiology, Statistics and Mathematical Modelling, Federal Institute for Risk Assessment, Berlin, Germany*

#### *Edited by:*

*Tariq Halasa, Technical University of Denmark, Denmark*

#### *Reviewed by:*

*Lina Mur, Kansas State University, USA; Beatriz Martínez-López, UC Davis, USA*

> *\*Correspondence: Karin Lebl k.lebl@gmx.at*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

*Received: 17 March 2016; Accepted: 07 June 2016; Published: 21 June 2016*

#### *Citation:*

*Lebl K, Lentz HHK, Pinior B and Selhorst T (2016) Impact of Network Activity on the Spread of Infectious Diseases through the German Pig Trade Network. Front. Vet. Sci. 3:48. doi: 10.3389/fvets.2016.00048*

The trade of livestock is an important and growing economic sector, but it is also a major factor in the spread of diseases. The spreading of diseases in a trade network is likely to be influenced by how often existing trade connections are active. The activity α is defined as the mean frequency of occurrences of existing trade links, thus 0 < α ≤ 1. The observed German pig trade network had an activity of α = 0.11, thus each existing trade connection between two farms was, on average, active about 10% of the time during the observation period 2008–2009. The aim of this study is to analyze how changes in the *activity* level of the German pig trade network influence the probability of disease outbreaks and the size and duration of epidemics for different disease transmission probabilities. Thus, we investigate whether it makes a difference for a hypothetical spread of an animal disease to transport many animals at the same time or few animals at many times. A SIR model was used to simulate the spread of a disease within the German pig trade network. Our results show that for transmission probabilities <1, the outbreak probability increases in the case of a decreased frequency of animal transports, peaking in the range of α from 0.05 to 0.1. However, for the final outbreak size, we find that a threshold exists such that finite outbreaks occur only above a critical value of α, which is ~0.1, and therefore in proximity of the observed activity level. Thus, although the outbreak probability increased when decreasing α, these outbreaks affect only a small number of farms. The duration of the epidemic peaks at an activity level in the range of α = 0.2–0.3. Additionally, the results of our simulations show that even small changes in the activity level of the German pig trade network would have dramatic effects on outbreak probability, outbreak size, and epidemic duration. Thus, we conclude that network activity is an important aspect that should be taken into account when modeling the spread of diseases within trade networks.

Keywords: network analysis, disease spread, trade activities, temporal network, animal movements, epidemiology

#### INTRODUCTION

Live animal trade represents an important economic sector but is permanently subject to fluctuations. For instance, consignments of pigs increased to 48% within EU-27 member states between 2005 and 2009 (1). However, the financial crisis in the subsequent years might have lessened this effect. The importance of live animal trade for the economy is also demonstrated during animal disease outbreaks. Trade restrictions with movement bans cause enormous financial losses for the affected livestock holdings and countries. For example, the outbreak of classical swine fever (CSF) in the 1990s in Germany led to an economic loss of approximately €1 billion (2). As demonstrated during the CSF outbreak in Germany, livestock trade between farms is one of the major routes for the spread of animal diseases, although other infection routes, like proximity to infected herds or contact with contaminated persons and vehicles, exist as well (2).

Scientific research has primarily focused on the influence of the trade structure of farms on disease dynamics (3, 4). Farms differ with respect to their trade activity, i.e., with respect to the number of trading partners, trade connections, trade volume, and time intervals (5). Within the trade network, farms with greater trade activities are the most important contributors to disease spread (6). Veterinary epidemiology assessments utilized social network analysis (SNA) tools, such as centrality measures, developed within the field of social sciences, to calculate the importance of farms for the spread of animal infections. Numerous centrality measures, such as in- and out-degree, betweenness, and closeness (7), were correlated with standard epidemiological parameters, such as size of an epidemic, duration of the epidemic, time to peak of the epidemic, and the basic reproduction number *R*0 (4, 8–10).

Previous studies applying SNA to pig trade networks have already provided important insight for disease prevention and control. One aspect of this research was the identification of the structure of trade communities (11, 12). Another essential finding was that there is a large degree of heterogeneity associated with movements of pigs, both at the movement level and at the premises-specific network level (13). As a result, pig trade has a right-skewed distribution of all centrality parameters, i.e., few holdings have high centrality, while most have a low centrality. Thus, strategic removal of the most central nodes would result in a decomposition of the network into fragments, which would interrupt infection chains and prevent further disease spread (14–16). It was also shown that the holding types differ in their centrality measures, which allows for a targeted removal of specific holding types in the case of a disease outbreak (16–18). Further, SNA has been utilized to simulate the spread of specific diseases to estimate the effects of an outbreak, e.g., the spread of Methicillin-resistant *Staphylococcus aureus* (MRSA) through the Danish pig trade network (19).

Although SNA provides useful insights into epidemic dynamics on trade systems, the methods used in SNA do not take into account the temporal ordering of trade links. Whenever a network is traversed using trade links, each traversal has to follow a causal sequence of connections. This constraint can have a significant impact on the spreading paths for pathogens in networks (20). For this reason, recent work has focused on *temporal network analysis*, where each connection has a time stamp marking its occurrence time. The probability of contagion between two individuals is not constant in time and depends, besides the transmission rate and infectious period, also on the frequency and duration of the contact (21–24). Studies that considered the heterogeneity and duration of contacts and their importance for the epidemic showed the importance of elucidating the time dependency of activities in order to investigate disease dynamics (22, 24–26). Previously, it has been shown that the aggregation of trade links into static networks leads to an overestimation of the epidemic size (27–30), the outbreak probability (31), and the epidemic duration. Thus, scientific research in the veterinary field has increasingly focused on time-dependent networks. Methods have been adapted and extended from static analyses to time-dependent analyses (20, 27, 31–37).

A temporal network view on livestock trade networks includes the frequency of trade links. For the whole system, this frequency can be considered as the pace of trading. This raises the question, whether it makes a difference for a potential spread of an animal disease to transport many animals at the same time (low frequency) or few animals at many times (high frequency). From the economic point of view, it is appropriate to choose a low trade frequency and transport many animals at the same time.

In this work, we analyze the impact of the overall trade frequency on the spread of infectious disease. Hereby, we keep the total trade volume of the network constant and systematically investigate the impact of a changing frequency of traded animals. We define the *activity* of a network by averaging the frequency of all existing trade connections between node pairs and analyze how changes in the activity influence the probability of a disease outbreak, the final outbreak size, and the duration of an epidemic. A discrete stochastic SIR model is used to simulate the spread of a hypothetical disease through the trade network of the German pig production chain.

#### MATERIALS AND METHODS

In order to analyze the influence of network activity on the course of an epidemic, an outbreak model predicting the course of a hypothetical animal disease on a contact network between holdings belonging to the German pig production chain was set up. Besides the outbreak model, we propose a method for systematically adjusting the activity of the network.

#### Data and Network Setup

According to the EU directive EC/2000/15 (38), EU member states are obliged to collect and record livestock movement data in a national database. Pursuant to the German Animal Movement Directive (Viehverkehrsverordnung), each holding in the pig production chain (including piglet production, breeding, raising, fattening, slaughtering, and trading) is obliged to notify the movement of pigs within 7 days. All data are stored in a database, "Herkunftssicherungs- und Informationssystem für Tiere" (HI-Tier). In Germany, movement data for pigs are collected on a daily basis. In general, movement data of livestock comprise information about the source and target farms (unique identifiers), the date of movement, and the number of animals moved (batch size).

For this study, pig movement data from the federal states of Bavaria and Baden-Württemberg between the years 2008 and 2009 were used. It has previously been shown that a period of 2 years is suitable to cover all characteristic properties of the German pig trade network (31). In our data set in most cases (90%), only one movement per week took place between a supplier and buyer. Consequently, we decided to use a weekly timescale for our analysis. In the case of two movements per week, those were merged into one occasion.

To describe the pattern of trade activity over time, a temporal network was constructed. By implementing a temporal network, it is possible to take into account causality for network traversal. In other words, consecutive trade connections have to be temporally ordered in order to make up a valid indirect connection between farms (**Figure 1**). The network comprised nodes and edges, where each edge connected a node pair. Farms were represented by nodes, and movements of animals between farms at a certain point in time were represented by directed edges. A temporal network is defined as $\mathcal{G}(V, \mathcal{E}, T)$, where *V* is the set of nodes within the network, $\mathcal{E}$ is the set of directed edges, and *T* represents the length of the observation period; as we considered weekly time steps, *T* = 104 weeks. An edge (*u*, *v*, *t*, *w*) ∈ $\mathcal{E}$ describes the movement of *w* pigs from farm *u* to farm *v* at time *t* ≤ *T*. This network comprises |*V*| = 45,065 nodes and |$\mathcal{E}$| = 1,237,753 edges (i.e., overall number of transports during the observation period). Further, the static representation of the network was constructed by summing all observations in the temporal network over the study period, such that the static network is the time-aggregated network of $\mathcal{G}$. In the static representation of the network *G*(*V*, *E*), *V* represents the set of nodes and *E* the set of directed edges (|*E*| = 112,826). A directed edge between two nodes exists in the static network if the corresponding animal movement has taken place at least once during the observation period.
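The causality requirement can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used R and *igraph*); the function name `reachable_in_time_order` is hypothetical, and the sketch assumes that an onward movement must take place in a strictly later week than the arrival of the infection.

```python
def reachable_in_time_order(edges, source, start_week=1):
    """Return the farms reachable from `source` along temporally ordered edges.

    `edges` is an iterable of (u, v, t, w) tuples as defined in the text.
    Assumption: a farm reached in week t can only pass the infection on
    through movements in weeks later than t.
    """
    # The source counts as reached just before the start week, so it can
    # already send animals in `start_week` itself.
    arrival = {source: start_week - 1}
    for u, v, t, w in sorted(edges, key=lambda e: e[2]):
        if u in arrival and arrival[u] < t and v not in arrival:
            arrival[v] = t
    return set(arrival)


# A -> B happens in week 2, but B -> C already happened in week 1:
# the static network contains the path A -> B -> C, the temporal one does not.
out_of_order = [("A", "B", 2, 10), ("B", "C", 1, 10)]
in_order = [("A", "B", 1, 10), ("B", "C", 2, 10)]
print(reachable_in_time_order(out_of_order, "A"))
print(reachable_in_time_order(in_order, "A"))
```

Sorting by week is sufficient here because the first time a farm is reached is also the earliest possible arrival time.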

The aim of this analysis was to investigate the influence of the network activity on the outbreak size of an epidemic. However, this outbreak size would be strongly influenced by differences in the reachability of the nodes, i.e., nodes form distinct reachability classes where a significant number of nodes may only cause trivial outbreaks (11, 12). To reduce this bias, the data were first tailored to include only nodes that are, in the static representation of the network, reachable from each other. We used the static network to identify the *largest strongly connected component* (LSCC; in a strongly connected component, each node is reachable from any other node in the component). The further analysis was limited to this LSCC, which we denote as *G*\*. Thus, the static representation of the network enables the disease to reach all nodes in finite time, no matter which node is the source of infection. All nodes and edges that were not elements of the LSCC in the static network were removed, as well as the corresponding elements from the temporal network. We hereby implicitly assumed that the concept of connectivity (35, 36) is preserved for the temporal network. In the resulting network, pigs moved between |*V*| = 7,455 farms (number of nodes in the LSCC) along |*E*| = 27,149 transport routes (number of edges in the LSCC) during the observational period, corresponding to |$\mathcal{E}$| = 315,481 transports in the temporal network.
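The tailoring step can be sketched as follows. This is an illustrative pure-Python version under stated assumptions, not the *igraph* routine used in the study: the helper names (`largest_scc`, `restrict_to`) are hypothetical, and the simple forward/backward search shown here is only practical for small graphs.

```python
from collections import defaultdict


def _reach(adjacency, start):
    """Nodes reachable from `start` following `adjacency` (depth-first search)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def largest_scc(static_edges):
    """Largest strongly connected component of a directed static network."""
    forward, backward = defaultdict(set), defaultdict(set)
    nodes = set()
    for u, v in static_edges:
        forward[u].add(v)
        backward[v].add(u)
        nodes.update((u, v))
    best, unassigned = set(), set(nodes)
    while unassigned:
        node = next(iter(unassigned))
        # The component of `node` is everything it can reach that can also reach it.
        component = _reach(forward, node) & _reach(backward, node)
        unassigned -= component
        if len(component) > len(best):
            best = component
    return best


def restrict_to(temporal_edges, keep):
    """Keep only temporal edges whose two endpoints both lie in `keep`."""
    return [(u, v, t, w) for u, v, t, w in temporal_edges if u in keep and v in keep]


static = {("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")}
lscc = largest_scc(static)
print(lscc)
print(restrict_to([("A", "B", 1, 3), ("C", "D", 2, 4)], lscc))
```

Farm D receives animals but never sends any back into the cycle, so it is dropped together with the transport C → D.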

#### Setting the Network Activity **α**

Starting from the network generated as described above, we changed the activity systematically. The *activity of a single edge* in a temporal network can be described by its frequency, i.e., how often a certain edge was active during the study period divided by the length of the study period. The *network activity* α was defined as the mean edge frequency of a network, with 0 < α ≤ 1. The network activity α of a temporal network $\mathcal{G}(V, \mathcal{E}, T)$ and its corresponding static representation *G*(*V*, *E*) can be calculated as follows:

$$\alpha = \frac{|\mathcal{E}|}{|E| \times T},\tag{1}$$

where |$\mathcal{E}$| is the number of edges in the temporal network, |*E*| is the number of edges in the aggregated network, and *T* is the observation period.
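As a worked check of Eq. 1, the short Python sketch below recomputes α from the edge counts reported in the text. The function names are illustrative only, and `network_activity` assumes that each route is active at most once per week, as in the weekly-merged data.

```python
def activity_from_counts(n_temporal_edges, n_static_edges, T):
    """Eq. 1: alpha = |temporal edges| / (|static edges| * T)."""
    return n_temporal_edges / (n_static_edges * T)


def network_activity(temporal_edges, T):
    """Eq. 1 applied to a list of (u, v, t, w) tuples."""
    static_edges = {(u, v) for u, v, t, w in temporal_edges}
    return len(temporal_edges) / (len(static_edges) * T)


# Counts reported for the LSCC and for the untailored network (T = 104 weeks).
print(round(activity_from_counts(315_481, 27_149, 104), 3))
print(round(activity_from_counts(1_237_753, 112_826, 104), 3))
```

Both values round to the activities quoted in the article (0.112 for the LSCC and 0.105 for the untailored network).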

In order to investigate the influence of α on disease dynamics, we propose a method to systematically change the network activity. Since the results for a network with shifted α should be comparable to the original network, the following constraints had to be considered: (i) the aggregated network *G* remained the same for all α, (ii) the total trade volume remained constant for all α, (iii) the temporal sequence of existing trade routes had to be preserved (see details below), and (iv) the observation period *T* was preserved.

In order to highlight the activity of a temporal network, we computed the activity according to Eq. 1 and denoted a temporal network with a certain network activity as $\mathcal{G}_\alpha$. For our observed network, we found α = 0.11, and we denote the observed network as $\mathcal{G}_{\alpha=0.11} \equiv \mathcal{G}^*$.

In order to create networks with a reduced α, randomly chosen edges from $\mathcal{G}^*(V, \mathcal{E}, T)$ were removed. According to constraint (i), edges were removed in such a way that each edge of the aggregated network appeared at least once in the newly generated temporal network.
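One way to realize this random thinning is sketched below in Python. The procedure is an assumption about how constraint (i) could be enforced (one randomly chosen occurrence per route is protected before the remaining occurrences are sampled); the function name and the `target_count` parameter are hypothetical. For a desired activity α, `target_count` would be approximately α × |*E*| × *T*.

```python
import random
from collections import defaultdict


def reduce_activity(temporal_edges, target_count, rng=None):
    """Randomly thin a temporal network down to `target_count` edges.

    Constraint (i): every route (u, v) of the aggregated network keeps at
    least one occurrence, so the static network is unchanged.
    """
    rng = rng or random.Random()
    by_route = defaultdict(list)
    for edge in temporal_edges:
        by_route[(edge[0], edge[1])].append(edge)

    kept, removable = [], []
    for occurrences in by_route.values():
        idx = rng.randrange(len(occurrences))  # protected occurrence of this route
        kept.append(occurrences[idx])
        removable.extend(occurrences[:idx] + occurrences[idx + 1:])

    n_extra = target_count - len(kept)
    if n_extra < 0:
        raise ValueError("target_count is below the number of static edges")
    kept.extend(rng.sample(removable, min(n_extra, len(removable))))
    return sorted(kept, key=lambda e: e[2])


edges = [("A", "B", t, 5) for t in range(1, 11)] + [("B", "C", 3, 2)]
thinned = reduce_activity(edges, target_count=4, rng=random.Random(0))
print(thinned)
```

The edge weights are left untouched in this step; rescaling them to the original trade volume is a separate operation.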

In order to increase α, we first considered our temporal network as a sequence of static network snapshots. In other words, a temporal network consists of an ordered sequence $\mathcal{G}_\alpha(V, \mathcal{E}, T) = G_1, G_2, \ldots, G_T$, where each $G_t \in \mathcal{G}_\alpha$ is a static snapshot of the temporal network at time *t*. In order to increase α, each snapshot was first duplicated (once or multiple times) and time-shifted by a certain value chosen at random. Second, these snapshots were merged into a new temporal network. In the case of overlapping edges occurring between the same node pairs (i.e., multiple occurrences of directed edges active at the same time, regardless of their edge weights), the edge weights *w* (i.e., number of transported pigs) were averaged. Using this approach, the existing trade routes remain preserved as required by constraint (iii).

FIGURE 1 | […] *t* = 1, thus before the disease has reached node *x* at *t* = 2.

In order to satisfy constraint (iv), we used periodic boundary conditions, i.e., for each edge (*u*, *v*, *t*, *w*) = (*u*, *v*, *t* + *T*, *w*). In other words, if the new times exceeded the observation period *T*, the times were shifted by subtracting *T*.

The procedures described above would already be sufficient to change the activity α of the observed network $\mathcal{G}^*$. Nevertheless, both procedures would violate constraint (ii), as the overall sum of edge weights changes as well. Therefore, the new edge weights had to be adjusted. During the observation period, a total of *W* = 24,995,162 transported pigs were recorded. The new edge weights for $\mathcal{G}_\alpha$ were normalized, so that the sum of the new edge weights equaled the total of the observed edge weights *W*. Finally, edge weights for the generated network were rounded to a whole number, with the minimum number of pigs per transport set to one [constraint (i)].
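The weight adjustment can be sketched in a few lines of Python. The function name `normalize_weights` is hypothetical, and the use of Python's built-in `round` (which rounds exact halves to the nearest even integer) is an assumption; the article does not state the rounding rule.

```python
def normalize_weights(temporal_edges, total_volume):
    """Rescale edge weights so that they sum (approximately) to `total_volume`.

    Weights are rounded to whole pigs and never drop below one, so every
    edge stays present after the adjustment.
    """
    current_total = sum(w for u, v, t, w in temporal_edges)
    scale = total_volume / current_total
    return [(u, v, t, max(1, round(w * scale))) for u, v, t, w in temporal_edges]


# After duplicating every transport, the volume has doubled from 20 to 40 pigs.
doubled = [("A", "B", 1, 10), ("A", "B", 3, 10), ("B", "C", 2, 10), ("B", "C", 4, 10)]
print(normalize_weights(doubled, total_volume=20))
```

Because of the rounding and the minimum of one pig per transport, the rescaled total can deviate slightly from *W*, as the caption of Figure 2 also notes.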

For example, in a first step, we duplicated the graph $\mathcal{G}^*$ once and conducted a 52-week shift (i.e., a shift of 1 year in the duplicate). Thus, an edge active in the original graph at weeks 2, 40, 63, and 92 would be active in a 52-week-shifted graph at weeks 54, 92, 11, and 40. Merging the original with the time-shifted graph would thus result in a graph where this certain edge is active at weeks 2, 11, 40, 54, 63, and 92 (see **Figure 2** for a more detailed example).
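The 52-week example can be reproduced with the following Python sketch. It simplifies the procedure by shifting the whole graph by a single fixed value rather than shifting each snapshot by a random value; the function names are hypothetical.

```python
from collections import defaultdict


def shift_week(t, shift, T):
    """Shift week t (1..T) by `shift` weeks with periodic boundary conditions."""
    return (t - 1 + shift) % T + 1


def duplicate_and_merge(temporal_edges, shift, T):
    """Merge a temporal network with one time-shifted copy of itself.

    Edges that coincide in (u, v, t) are merged and their weights averaged.
    """
    weights = defaultdict(list)
    for u, v, t, w in temporal_edges:
        weights[(u, v, t)].append(w)                        # original occurrence
        weights[(u, v, shift_week(t, shift, T))].append(w)  # shifted duplicate
    merged = [(u, v, t, sum(ws) / len(ws)) for (u, v, t), ws in weights.items()]
    return sorted(merged, key=lambda e: (e[2], e[0], e[1]))


edge_weeks = [("A", "B", t, 10) for t in (2, 40, 63, 92)]
merged = duplicate_and_merge(edge_weeks, shift=52, T=104)
print([t for u, v, t, w in merged])  # [2, 11, 40, 54, 63, 92]
```

Weeks 40 and 92 occur in both the original and the shifted copy, which is why the merged edge is active in six rather than eight weeks.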

Overall, 22 different networks were generated with different activity values, including the original network $\mathcal{G}^*$ (α = 0.11). The considered values for α were approximately evenly distributed in the interval (0, 1].

#### Disease Dynamics SIR Model

In order to analyze the influence of network activity α on the course of an epidemic, we simulated the spread of a disease on different temporal networks $\mathcal{G}_\alpha$. Disease dynamics were modeled by applying a stochastic discrete-time SIR model (29). Farms were treated as epidemiological units that are assigned to one of the three epidemiological states: susceptible (*S*), infected (*I*), and recovered (*R*). The infection spread along an edge (*u*, *v*, *t*, *w*) if, at time *t*, the supplying node *u* was in state *I* and the receiving node *v* in state *S*. Thus, a receiving farm could only become infected if a transport took place from the supplying farm to the receiving farm during the time period in which the supplying farm was in the *I* state. Infectious nodes stayed in the *I* state for μ time steps; thereafter, they passed to the *R* state. Nodes in the *R* state remained in this state until the end of a simulation run. Infectious farms infect susceptible farms with probability *p*<sub>e</sub>.

FIGURE 2 | Example for creating a graph with an increased **α** using a time shift of two time steps. In this example, the original graph has |*V*| = 4 nodes (A, B, C, D) and |$\mathcal{E}$| = 6 directed edges (with *u* as the starting node and *v* as the receiving node), corresponding to |*E*| = 5 in the time-aggregated network. The edges are active at times *t* ∈ {1, 2, … , 5} (numbers next to the drawn edges), thus *T* = 5. The line widths of the edges correspond to the edge weights *w*. Overlapping edges (i.e., edges with the same *u*, *v*, *t*) are marked in red. The newly generated graph has the same number of nodes, but an increased number of edges (|$\mathcal{E}$| = 11). Note that, due to rounding errors, the sum of edge weights in the original and the new graph are only approximately equal.

Because certain information was not available, the following model assumptions were made. (I) Farms representing the nodes within the network were all treated identically (8, 29, 31, 39). Thus, in this model, the number of animals on the farm, breed, farm type, or farm practices did not have an effect on the transmission dynamics. (II) The epidemiological status does not alter the trade contact structure. The latter is a strong assumption, but it allowed an examination of the influence of network topology on unmanaged disease dynamics (29).

#### Model Parameters

In order to compute the transmission probability *p*<sub>e</sub> for each edge, we first considered the risk of infection for each transported animal. For every transport from an infected to a susceptible farm, each transported animal has a probability *p* to infect the receiving node. In this work, the probabilities *p* ∈ {0.25, 0.5, 0.75, 1} were considered. The receiving node became infected when at least one transported animal spread the disease. The probability *p*<sub>e</sub> can therefore be described with a binomial distribution *B*(*w*, *p*), which depends on the edge weight *w* and the per-animal transmission probability *p*. With *X* denoting the number of transported animals that transmit the infection,

$$p_e = P(X > 0) = 1 - (1 - p)^{w}, \quad X \sim B(w, p).\tag{2}$$

A transmission probability of *p* = 1 corresponds to a highly infectious disease: the supplied farm always became infected, independent of the batch size *w*. This corresponds to a worst-case scenario and is therefore often used in studies investigating the spread of diseases within the trade networks (6, 8, 31).
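To give a feeling for Eq. 2, the following Python sketch evaluates the edge transmission probability for a few batch sizes. The function name is illustrative; the formula is the one stated above.

```python
def edge_transmission_probability(p, w):
    """Eq. 2: probability that at least one of w transported animals transmits."""
    return 1.0 - (1.0 - p) ** w


for w in (1, 5, 10):
    print(w, round(edge_transmission_probability(0.25, w), 3))
```

Even at the lowest considered per-animal probability, *p* = 0.25, a batch of 10 pigs infects the receiving farm with a probability of about 0.94, so batch size strongly modulates the edge-level risk whenever *p* < 1.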

Nodes remain in the *I* state for the *infectious period* μ and then pass to the *R* state. Nodes in the *R* state remained in this state. In this paper, we considered a constant infectious period of μ = 4 weeks [as estimated for cases, such as CSF, African swine fever, foot-and-mouth disease; Ref. (40)].

#### Initial Conditions

In the analysis presented here, the model predicted the disease dynamics for discrete intervals of 1 week. Initially, all farms were in the susceptible state (*S*). At a randomly chosen time, the state of one randomly selected farm was set to infected (*I*).

The disease dynamics were simulated on a temporal network $\mathcal{G}_\alpha$. All possible start times and initially infected nodes (index nodes) had the same selection probability. The start times were selected from the interval [1; *T* − 40] so that the durations of the epidemics would not exceed the observation period of 104 weeks. We chose 40 weeks arbitrarily, as the first test runs showed that the duration of the epidemic only rarely exceeded this time period. However, in some cases, the duration of the epidemic still exceeded the study period; those cases were excluded from the further analysis. The simulation stopped when the number of infectious nodes reached 0.
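A single simulation run can be sketched in Python as follows. This is a minimal illustration, not the authors' R code. Two timing details are assumptions that the text does not fix: a farm infected in week *t* starts transmitting in week *t* + 1, and the index farm is infectious from the start week onward. The function name and return format are hypothetical.

```python
import random
from collections import defaultdict


def simulate_sir(temporal_edges, index_node, start_week, p, mu=4, T=104, rng=None):
    """One stochastic discrete-time SIR run on a temporal network.

    Returns (outbreak size, duration in weeks, finished), where `finished`
    is False if infectious farms were still present at the end of week T.
    """
    rng = rng or random.Random()
    by_week = defaultdict(list)
    for u, v, t, w in temporal_edges:
        by_week[t].append((u, v, w))

    # infectious farm -> first week in which it is no longer infectious
    infectious_until = {index_node: start_week + mu}
    ever_infected = {index_node}
    week = start_week
    while infectious_until and week <= T:
        newly_infected = set()
        for u, v, w in by_week.get(week, []):
            if u in infectious_until and v not in ever_infected:
                p_edge = 1.0 - (1.0 - p) ** w  # Eq. 2
                if rng.random() < p_edge:
                    newly_infected.add(v)
        week += 1
        for v in newly_infected:  # infectious from the following week onward
            ever_infected.add(v)
            infectious_until[v] = week + mu
        # farms whose infectious period has ended move to the R state
        infectious_until = {n: end for n, end in infectious_until.items() if end > week}
    return len(ever_infected), week - start_week, not infectious_until


edges = [("A", "B", 1, 5), ("B", "C", 2, 5), ("C", "D", 1, 5)]
print(simulate_sir(edges, index_node="A", start_week=1, p=1.0))  # (3, 6, True)
```

In the toy network, farm D stays susceptible because the transport C → D took place before C became infected. In the study design, the index farm and a start week from [1; *T* − 40] would be drawn at random for each of the 2,000 repetitions, and runs with `finished == False` would be discarded.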

#### Summary of Parameters

For each of the 22 investigated networks $\mathcal{G}_\alpha$ and each of the 4 transmission probabilities described above, the simulation was repeated 2,000 times, as test runs showed that this number of iterations provided robust results. Thus, 176,000 simulation experiments were run in total (**Table 1**). In 175,877 of those simulations, the duration of the epidemic did not exceed the observation period; these runs were used for further analysis.

#### Analysis

We wanted to determine the probability that a disease outbreak occurs for a certain level of α. The *outbreak probability* was estimated as the proportion of the 2,000 simulation runs, in which the disease spread beyond the starting node. In those cases where the disease spread beyond the starting node, the *outbreak size* was calculated as the total number of infected nodes. In addition, the *outbreak duration* was defined as the number of weeks in which infected nodes occurred. The distribution of the latter two measures was skewed to the right, and thus we give the median and the first and third quartiles (Q1, Q3).
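The three summary measures can be computed from a list of simulation results as in the Python sketch below. The input format, a list of (size, duration) pairs, and the function name are assumptions; the `"inclusive"` quantile method is chosen because it corresponds to the default quantile definition in R, which may or may not be what the authors used.

```python
import statistics


def summarize_runs(runs):
    """Summarize simulation runs given as (outbreak size, duration) pairs.

    A run counts as an outbreak if the disease spread beyond the starting
    node, i.e., if more than one farm was infected.
    """
    outbreaks = [(size, duration) for size, duration in runs if size > 1]
    summary = {"outbreak_probability": len(outbreaks) / len(runs)}
    if len(outbreaks) >= 2:  # quartiles need at least two observations
        for name, values in (("size", [s for s, d in outbreaks]),
                             ("duration", [d for s, d in outbreaks])):
            q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
            summary[name] = {"Q1": q1, "median": median, "Q3": q3}
    return summary


runs = [(1, 4), (1, 4), (3, 6), (5, 7), (2, 5), (10, 9)]
print(summarize_runs(runs))
```

Runs in which only the index farm was infected enter the outbreak probability but not the size and duration statistics, matching the definitions above.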

All analyses were conducted using the open-source software *R* version 3.2.1 (41). The package *igraph* (42) was used to generate and analyze the network.

#### RESULTS

#### Descriptors of *G*\*

For this static representation *G*\*, we found an average shortest path length of 6.33; the path length between the two most distant nodes (diameter) was 17. The median in-degree, measuring the number of trade partners delivering animals to a certain node, was only one, while the median for the number of trade partners a certain holding delivers to (out-degree) was two (**Table 2**). The values for the median ingoing and outgoing closeness centrality were rather similar (**Table 2**), indicating that the number of steps required to reach a certain node is similar to the number of steps required to reach any other node from a certain node. The number of shortest paths going through a certain node (betweenness centrality, **Table 2**) showed a high variation, ranging from 0 to 15,166,160.

#### Outbreak Probability

We observed that the outbreak probability is finite (i.e., non-zero), independent of the particular values of transmission probability *p* and network activity α (**Figure 3**). Even for the smallest considered activity value (α = 0.01), the outbreak probabilities were in the region of 5% for all considered transmission probabilities.

TABLE 2 | Minimum, 25% quartile, median, 75% quartile and maximum of the calculated centrality parameters for *G*\*, the static representation of the observed network.


We now focus on the outbreak probability for a transmission probability of *p* = 1, i.e., the worst-case scenario, in which transports of any size spread the infection. In this scenario, a monotonic increase of the outbreak probability with increasing activity was observed. The outbreak probability saturated for larger values of α. More precisely, the outbreak probability was greater than 99% for all α > 0.80. For small and intermediate values of α, even relatively small changes in α had a strong effect on the outbreak probability. Our observed network (α = 0.11) lies in this region. Consequently, small changes in the real system would result in large changes in the outbreak probability.

We now focus on transmission probabilities of *p* < 1. For all considered *p* < 1, a qualitatively similar behavior could be observed. Contrary to the worst-case scenario (*p* = 1), the outbreak probabilities for *p* < 1 did not increase monotonically, but rather showed a maximum. The location of these maxima was shifted to the right for increasing values of *p*. It should be noted that the location of these maxima was relatively close to the activity of the observed network $\mathcal{G}^*$.

#### Final Outbreak Size

We now consider the cases where the infection spread beyond the starting node and the corresponding outbreak sizes for different values for α and *p* (**Figure 4**). For the worst-case scenario *p* = 1, the outbreak size increased monotonically with increasing α. The possibility that all nodes in the network became infected was found only in this scenario (*p* = 1), and only for very high network activities. For the observed network (α = 0.11), ~15% of the nodes would become infected in the worst-case scenario.

[…] orange line represents α for the observed pig trade network $\mathcal{G}^*$.

For smaller transmission probabilities (*p* < 1), we observed that outbreak sizes are significantly smaller than in the worst-case scenario. In contrast to the worst-case scenario, the outbreak sizes showed a maximum at approximately α = 0.3.

We stress that the outbreak size showed a critical threshold regarding the network activity. This means that there was a *critical activity* α<sub>crit</sub>, such that finite outbreaks occurred only if α > α<sub>crit</sub>. To estimate α<sub>crit</sub>, we calculated the central point between the last value of α below and the first value above the threshold. For transmission probability *p* = 1, we found α<sub>crit</sub> = 0.1, and for transmission probabilities of 0.75, 0.5, and 0.25, we found α<sub>crit</sub> = 0.15. Interestingly, the activity of the observed network was close to the critical region. For *p* = 1, the activity of the observed network $\mathcal{G}^*$ was only slightly above the critical threshold, whereas for transmission probabilities *p* < 1, the observed network was subcritical. As is typical for such critical regimes, small changes in the activity result in large changes in the outbreak size (**Figure 4**).

#### Outbreak Duration

Although the shapes of the outbreak duration curves were similar for different transmission probabilities, we found that the outbreak duration increased with higher transmission probabilities (**Figure 5**). However, for all transmission probabilities, a maximum in the outbreak duration at approximately α = 0.2 could be found, with the exception of *p* = 0.25, where the maximum was at approximately α = 0.3. The reason for these maxima is the existence of two competing effects. (i) For small α, the outbreak duration correlates with the outbreak size. Outbreaks were typically small here, and increasing α increased the possible number of paths to other nodes. Topological and temporal shortcuts played a minor role here. (ii) For large values of α, the network was likely to form a number of shortcuts, accelerating the spread of a disease.

#### DISCUSSION

In this study, we investigated how the spreading of hypothetical infectious diseases through a trade network is influenced by the network's activity level. For the observed German pig trade network, α = 0.11; thus, each existing trade connection between two farms was on average active about 10% of the time during the observation period (using weekly time steps). At this observed low network activity, the chances for a disease to spread beyond the starting node were relatively low, especially for low transmission probabilities (e.g., 10% at *p* = 0.25). Even in the case that an infection spread, the total number of infected farms was for all but the worst-case scenario only about 0.2% of the nodes within the network. Previously, the size of the largest connected component has often been used as an estimate for the potential final size of an epidemic spreading through a network (43). However, even at the applied worst-case scenario (*p* = 1), the size of the epidemic in our simulation was only a fraction (around 16%) of the total number of nodes for the observed trade data. Using the LSCC would therefore have considerably overestimated the final size of the epidemic. Thus, our results indicate that, at the observed level of the network activity, the threat of large epidemics spreading through the German pig trade network is relatively low, especially for diseases with low transmission rates. However, as we focused in our study on the spread of diseases through the trade network, the actual number of infected farms could be higher due to additional spreading *via* other infection routes (2).

For our analysis, we limited the trade network to the LSCC in order to avoid bias in the results caused by differences in the reachability of the nodes. The observed network activity of the untailored network (α = 0.105) was very similar to the activity in the LSCC (α = 0.112), indicating that changes from the observed network activity would have the same effect on the outbreak probability, -size, and -duration. However, due to the differences in the reachability of the nodes, a much higher variation in the results is to be expected (20).

Interestingly, the German pig trade network seems to be in a rather unstable state, as even small changes in the network's activity level would have a large impact on the spreading of diseases. The main factor that could change the network activity of the German pig trade network is likely to be the farm size. In recent years, pig production in Germany and other EU countries has increased, resulting in larger farm sizes and an increased number of traded pigs (1, 44). This would also result in increasing animal transports, which could be achieved either by increasing the number of animals per transport (i.e., edge weights) or by a higher frequency of transports (i.e., a higher activity level), whereby the latter would likely have a higher impact on disease dynamics. If an increase in the network activity is to be expected in the long term, the outbreak probability and the outbreak size are likely to increase, as shown in this study. Considering all three investigated measurements (outbreak probability, final outbreak size, and duration of the epidemic), it becomes apparent that an increase in the network activity should be avoided. Further, in order to confine disease spread, a decrease in the activity of the German pig trade network would be conducive, even if this reduction were only minor. In our model, a decrease of the activity is realized by random deletion of edges. We assume that a targeted deletion of edges might have an even larger effect (45). From a practical point of view, a reduction in the network activity would mean that animal transports from one farm to another would have to be concentrated on fewer occasions. This also implies that a matching pig production schedule would be necessary, favoring "all-in-all-out" production systems.

The final outbreak size for different network activities shows, as depicted in **Figure 4**, strong similarities to the threshold behavior known from epidemic SIR-type models (46). This epidemic threshold describes a condition above which an epidemic becomes global, while below this threshold only a limited number of nodes become infected (46, 47). Estimating the epidemic threshold in a given network is thus important, as it allows predicting the possibility that an infection spreads on a large scale. Hence, it is essential for planning control and intervention strategies. Different methods exist to identify epidemic thresholds, with the performance of those methods depending on the topology of the network (48, 49). The results of our study show not only the existence of a threshold but also that its position varies with the transmission probability. Read and colleagues (22) demonstrated for a small-scale human contact network that the encounter rate had a strong effect on the outbreak size at high transmission rates but could find no significant effect at low transmission rates. This concurs with our results, where the effects of the network activity on the outbreak size were most pronounced at high transmission rates. Again, it seems that the actual activity of the investigated system is close to this threshold value, as even a small increase in the activity level has a large impact on the outbreak size of an epidemic.

The outbreak probability peaked in a region below this threshold for a global epidemic. As the total number of transported animals was kept constant for all network activities, the batch sizes per transport increased, while the frequency of transports decreased at low network activities. Thus, as the edge infection probability *pe* depends on the batch size, the chance to transmit a disease beyond the starting node is rather high at low network activity levels, given the case that a transport occurs. As the number of transports is low at low levels of α, the epidemics are restricted to only a few livestock holdings. On the other hand, a decrease in the observed outbreak probabilities for large values of α can be observed, which can be explained by the fact that the batch sizes are small in this regime. Thus, disease spread was dominated by strong fluctuations in edge infection probability *pe*. These described effects only apply to transmission probabilities <1, as in the case of *p* = 1, the spreading of diseases is independent of the batch size.

As the necessary information was not available to us, we had to make several simplifications for our analysis. In particular, the farm type has already been shown to be an important factor in the spreading of disease in animal trade networks (15). The farm type defines how long animals remain at a certain node. It is likely that, if the network activity were to change due to an overall increase or decrease in German pig production, the change in the activity of the individual trade connections would be irregular and vary according to the type of the source and the receiving node. This would be an important point to consider in further studies, as heterogeneous waiting times have been shown to influence the spread of diseases in networks (50, 51). For our simulation, we neglected within-herd transmission dynamics as well. Within-herd transmission depends not only on the specifics of a disease but is also influenced by several external factors that were not available to us (e.g., farm size or biosecurity measures at the farm level). The numbers of infected animals within a farm vary over time (52), and it is unlikely that all animals are simultaneously infected over a certain time period, as assumed in our simulation. Consequently, the presented results could overestimate the probability of a disease outbreak and the size of the epidemic. For our model, we assumed that the epidemiological status of the farms does not alter the trade contact structure. This applies to rather harmless diseases, like porcine reproductive and respiratory syndrome (PRRS), porcine circovirus type 2 (PCV2), or MRSA. However, depending on the severity of a disease, trade connections with an infected farm could cease. The withdrawal of trade connections would not be instantaneous but would depend on various factors like the incubation period or the occurrence of clinical symptoms, resulting in a high variation between the time of infection of a farm and the potential termination of trade connections.
Thus, the more likely a disease results in trade restrictions and the faster those restrictions are applied, the more our model is prone to overestimate the size of an epidemic. In case of an outbreak of a severe disease, trade connections could change due to the targeted implementation of trade restrictions by veterinary authorities. However, the extent of trade restrictions often differs between countries. For instance, during the bluetongue virus outbreak in Europe starting in 2006, trade restrictions in France were directed to specific areas (53), while in Germany, as well as in Austria and Switzerland, the whole country was declared a single restriction zone at an early stage of the epidemic (54–56). Thus, if the whole country is declared a single restriction zone, the within-country trade network would likely show only marginal changes. The effect of lowering the contact rate on outbreak probability, -size, and -duration is shown in this analysis, but the implementation of trade restrictions directed to specific areas could lead to different dynamics.

In our study, we presented the *network activity* as a new indicator value for networks. With this parameter, it is possible to investigate how changes in the mean frequency of activation of existing trade connections can affect the spread of diseases. By holding the total trade volume constant, as we did in this study, it was possible to differentiate between effects of trade frequency and trade volume. There are two specific characteristics of α. First, it is designed to be a characteristic of a temporal network. It has been shown that several network parameters drawn from a static network correlate with standard epidemiological parameters. Especially in networks with a right-skewed degree distribution, as we found for the pig trade network, nodes with a high degree can play an important role in the spreading of diseases (8, 14, 16). However, the frequency of trade links cannot be represented by a static network; static networks generated from different levels of α would be identical, and thus network measurements (like centrality measurements) would be identical as well. As static networks do not take the temporal causality of the paths into account, results drawn from such static representations can be problematic. For example, it has been shown that, compared to a temporal network, its static representation overestimates the size of a disease outbreak (20). Thus, in recent years, measurements for temporal networks have been developed (20, 57); their relation to disease spread, however, remains to be investigated. In comparison with most of those measurements, the calculation of the *network activity* is simple, as it is obtained from the total number of edges in the static and the temporal network. Second, the *network activity* is a measurement for the state of the whole network and not for single nodes. It can be used as a measurement of how well a temporal network is described by its static representation.
An α = 1 would correspond to a network in which each existing trade link is active at all time steps; thus, the static representation would be true at any time. The closer the network activity is to one, the more accurate is its static representation. Still, for now, we would like to suggest caution in applying the results to other networks. While the general pattern is likely to stay the same, the exact location of the maxima/threshold of the investigated parameters could vary. Further, when comparing the *network activity* of different networks, care must be taken to use the same time period and time steps, as α changes with those two values.
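As an illustration of how simple this calculation is, the following Python sketch computes α from a list of timed trade contacts. It assumes that α is the number of temporal edges divided by the number of static edges times the number of time steps, which is consistent with the interpretation given above (α = 1 when every existing link is active at every time step); the function name and the input format are hypothetical.

```python
def network_activity(temporal_edges, n_time_steps):
    """Share of time steps in which an existing trade link is active, averaged over links.

    temporal_edges: iterable of (source, target, time_step) tuples (assumed format).
    n_time_steps: number of time steps in the observation period.
    """
    active = set(temporal_edges)                 # each link counted once per time step
    static = {(u, v) for u, v, _ in active}      # static representation of the network
    if not static or n_time_steps <= 0:
        raise ValueError("need at least one edge and one time step")
    return len(active) / (len(static) * n_time_steps)


# Two links observed over four weekly time steps; three link activations in total.
contacts = [("A", "B", 0), ("A", "B", 1), ("B", "C", 0)]
alpha = network_activity(contacts, n_time_steps=4)   # 3 / (2 * 4)
```

Because the denominator contains the number of time steps, the same contact list yields a different α when aggregated by day instead of by week, which is why comparisons across networks require the same time period and time steps.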

In this study, we could demonstrate that the network activity α is an important factor in evaluating the effects of a disease spread in the German pig trade network. We would like to propose applying this indicator number to other networks used to demonstrate the spread of disease or other malicious agents as well, as the networks' activity is likely to have a strong impact on the spreading.

#### REFERENCES


#### AUTHOR CONTRIBUTIONS

This work was designed by TS, KL, and HL. BP, HL, and KL processed the raw data, and KL performed the data analysis. The results of the analysis were interpreted by KL, HL, and TS. KL, BP, and HL drafted and wrote the manuscript. All authors revised the manuscript and approved the final version.

#### ACKNOWLEDGMENTS

We thank A. Fröhlich and C. Firth for their comments, which substantially helped to improve the manuscript.

#### FUNDING

This work was supported by the Federal Ministry of Education and Research (BMBF, Bonn, Germany) research grant 13N11208 (KL, HL) and by the project VET-Austria (BP), a cooperation between the Austrian Federal Ministry of Health, the Austrian Agency for Health and Food Safety, and the University of Veterinary Medicine Vienna.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Lebl, Lentz, Pinior and Selhorst. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Prediction of Pig Trade Movements in Different European Production Systems Using Exponential Random Graph Models

*Anne Relun<sup>1,2</sup>, Vladimir Grosbois<sup>2</sup>, Tsviatko Alexandrov<sup>3</sup>, Jose M. Sánchez-Vizcaíno<sup>4</sup>, Agnes Waret-Szkuta<sup>5</sup>, Sophie Molia<sup>2</sup>, Eric Marcel Charles Etter<sup>2</sup> and Beatriz Martínez-López<sup>1</sup>\**

*<sup>1</sup>Center for Animal Disease Modeling and Surveillance (CADMS), VM: Medicine and Epidemiology, University of California Davis, Davis, CA, USA, <sup>2</sup>CIRAD, UPR AGIRs, Montpellier, France, <sup>3</sup>Animal Health and Welfare Directorate, Bulgarian Food Safety Agency, Sofia, Bulgaria, <sup>4</sup>Animal Health Center (VISAVET), Animal Health Department, Veterinary School, Complutense University of Madrid, Madrid, Spain, <sup>5</sup>INRA, INP, ENVT, UMR 1225, IHAP, Université de Toulouse, Toulouse, France*

#### *Edited by:*

*Tariq Halasa, Technical University of Denmark, Denmark*

#### *Reviewed by:*

*Anette Boklund, Technical University of Denmark, Denmark Gaëlle Nicolas, Université libre de Bruxelles, Belgium*

#### *\*Correspondence:*

*Beatriz Martínez-López beamartinezlopez@ucdavis.edu*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

*Received: 06 October 2016 Accepted: 15 February 2017 Published: 03 March 2017*

#### *Citation:*

*Relun A, Grosbois V, Alexandrov T, Sánchez-Vizcaíno JM, Waret-Szkuta A, Molia S, Etter EM and Martínez-López B (2017) Prediction of Pig Trade Movements in Different European Production Systems Using Exponential Random Graph Models. Front. Vet. Sci. 4:27. doi: 10.3389/fvets.2017.00027*

In most European countries, data regarding movements of live animals are routinely collected and can greatly aid predictive epidemic modeling. However, the use of complete movement datasets to conduct policy-relevant predictions has so far been limited by the massive amount of data that have to be processed (e.g., in intensive commercial systems) or the restricted availability of timely and updated records on animal movements (e.g., in areas where small-scale or extensive production is predominant). The aim of this study was to use exponential random graph models (ERGMs) to reproduce, understand, and predict pig trade networks in different European production systems. Three trade networks were built by aggregating movements of pig batches among premises (farms and trade operators) over 2011 in Bulgaria, Extremadura (Spain), and Côtes-d'Armor (France), where small-scale, extensive, and intensive pig production are predominant, respectively. Three ERGMs were fitted to each network with various demographic and geographic attributes of the nodes as well as six internal network configurations. Several statistical and graphical diagnostic methods were applied to assess the goodness of fit of the models. For all systems, both exogenous (attribute-based) and endogenous (network-based) processes appeared to govern the structure of the pig trade network, and neither alone was capable of capturing all aspects of the network structure. Geographic mixing patterns strongly structured pig trade organization in the small-scale production system, whereas belonging to the same company or keeping pigs in the same housing system appeared to be key drivers of pig trade in intensive and extensive production systems, respectively. Heterogeneous mixing between types of production also explained a part of the network structure, whichever production system was considered. Limited information is thus needed to capture most of the global structure of pig trade networks.
Such findings will be useful to simplify trade networks analysis and better inform European policy makers on risk-based and more cost-effective prevention and control against swine diseases such as African swine fever, classical swine fever, or porcine reproductive and respiratory syndrome.

Keywords: ERGM, network modeling, livestock contact networks, risk-based surveillance, infectious diseases

#### INTRODUCTION

Movements of animals play a key role in the spread of several major infectious diseases, like foot-and-mouth disease, classical swine fever, or African swine fever (1–3). Therefore, detailed data on livestock movements may help to better simulate transmission dynamics and identify areas, periods, and farms that are more likely to spread the diseases and could be targeted to improve surveillance and control strategies (4, 5). However, one of the challenges of using livestock movement data to support decision-making in preventive veterinary medicine is the limited availability of timely and updated records on animal movements and, if available, the massive amount of data that have to be processed. This is particularly challenging when considering diverse and, sometimes, epidemiologically complex, production systems, such as backyard or extensive systems, where the information may not be frequently collected and accessible. Models of livestock movement networks based on holding characteristics and past-temporal observed networks could be useful to simplify real-world networks and to predict disease spread even in backyard or extensive environments.

Pig trade movements can be represented as a network, consisting of a set of nodes (here the pig premises) connected by links (also called edges) representing movements of pigs between them. These networks are not strictly identical from one year to the next, but their structural properties, which impact disease dynamics, are likely to be stable over time (6, 7). These properties emerge from pig trading behaviors. For example, some premises may be more likely to trade with each other due to geographical proximity or because they belong to the same pig company [selective mixing or homophily, see Morris (8)]. Some particular types of premises may also be more likely to trade with a high number of premises (attributes that influence degree). Finally, if a trading partner B of premises A trades with a third premises C, this might encourage A to trade with C (structural balance effect).

The first statistical models developed to evaluate which processes lead to observed network structures were quite simple. They only addressed relational reciprocation [i.e., mutuality; see Holland and Leinhardt (9)] or assortative mixing (8). The recent developments of exponential random graph models (ERGMs), also known as *p\** models (10), offer possibilities to better capture the complexity of real-life networks (11). This family of models assumes that the observed network is only one realization among many potential networks with similar characteristics and that the probability that a link exists is a logit-linear function of predictors that reflect node characteristics, link characteristics, and network structural properties (10, 12, 13). Although they were developed to handle the inherent non-independence of network data, the results of ERGMs are interpreted in similar ways to logistic regression, making this a very useful method for examining contact networks in the context of epidemiology.

The aims of this paper were to use ERGMs to (1) develop models that reproduce observed pig trade networks; (2) understand the mechanisms that underlie the organization of pig trade networks; and (3) predict trade network structures in three different European pig production systems (i.e., industrial, extensive, and backyard). Results of this study are intended to inform the design of prevention and control programs for swine diseases such as African swine fever, classical swine fever, or porcine reproductive and respiratory syndrome under diverse epidemiological scenarios and pig production systems in Europe.

#### MATERIALS AND METHODS

#### Data Collection and Network Construction

Three areas were selected to represent different European pig production systems: Bulgaria, where most premises raise pigs for own consumption; the autonomous community of "Extremadura," which is the cradle of extensive Iberian pig production in Spain; and the department of "Côtes-d'Armor," which is the French department with the highest concentration of industrial pig premises.

Data on pig movements and premises characteristics were obtained from national databases, through the Bulgarian Food Safety Agency in Bulgaria, the professional database of swine (La Base de Données Professionnelle Porcine, BDPORC) in France, and the Ministry of Agriculture, Food and Environment (MAGRAMA) in Spain. The year 2011, which was common to all databases, was retained for analysis. The premises characteristics available were the classification or type of farm (described below), the size of premises (i.e., number of sows, weaners, and finishers on farm), the type of housing system (i.e., indoor or outdoor), the geographical coordinates, and the pig company number (only for France). In Bulgaria, pig farms were classified as small producers (<10 pigs kept for own consumption), type B (medium-size: 10–500 pigs; with low biosecurity level: access to other pigs or feral pigs, use of swill feeding, no fences around the holdings, and/or no disinfection at the entrance and exit of buildings), type A (medium-size, high biosecurity level), or industrial farms (large size: >500 pigs; high biosecurity level) (14, 15). Traditional, outdoor-raised East Balkan pig herds are also found in the South East of Bulgaria. For Spain and France, pig farms were classified as multipliers (premises that produce breeding stocks and semen), farrowing farms, farrow-to-finish farms, finishing farms, or small producers. Small producers were defined for Spain as those that produce pigs for own consumption and for France as those with ≤4 pigs. Traders, collection centers, markets, fairs, and stopping points (also called staging points, i.e., locations used to feed, water, rest, accommodate, care for, and dispatch animals in transit before arriving at their final destinations) were considered as trade operators. Because of the dead-end characteristics of slaughterhouses, these premises were excluded from analysis.

For each area, a yearly network (i.e., for the year 2011) was built, the nodes being all pig premises of the study area, even those that were not trading pigs during the study period. Movement data were aggregated over the study period, and a directed link was drawn whenever a shipment of pigs occurred between the corresponding premises. Movements imported from or exported to outside areas were excluded from the analyses.
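The construction rules described above can be sketched in Python as follows. The record layout (pairs of source and destination identifiers) and the function name are assumptions made for illustration; the national databases use their own formats.

```python
def build_trade_network(premises, movements):
    """Aggregate shipments into a directed network over all premises of a study area.

    premises: identifiers of every pig premises in the area (isolates included).
    movements: iterable of (source, destination) pairs, one per shipment.
    """
    nodes = set(premises)
    edges = set()
    for source, destination in movements:
        # Shipments with an endpoint outside the study area are dropped.
        if source in nodes and destination in nodes:
            edges.add((source, destination))   # repeated shipments collapse into one link
    return nodes, edges


# Hypothetical records: F3 never trades, X9 lies outside the study area.
nodes, edges = build_trade_network(
    ["F1", "F2", "F3"],
    [("F1", "F2"), ("F1", "F2"), ("F2", "X9")],
)
```

Using a set of ordered pairs keeps the direction of each shipment while discarding how often it occurred, which matches the yearly aggregation.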

#### The ERGMs

Exponential random graph models specify the probability of any random network **Y** given a set of *n* nodes and their attributes as in Eq. 1.

$$P_{\theta}\left(\mathbf{Y} = \mathbf{y} \mid n \text{ nodes}\right) = \frac{1}{c} \exp\left(\sum_{k=1}^{K} \theta_{k} z_{k}(\mathbf{y})\right) \tag{1}$$

The *zk*(*y*) terms represent model covariates, any set of *K* network statistics calculated on the observed network *y* and hypothesized to affect the probability of this network forming. The model covariates can include network parameters that account for the frequency of occurrence of certain network configurations (e.g., two-paths, triangles), as well as node or edgewise covariates like the pig company to which a premise belongs or the distance between two premises, respectively. The θ coefficients estimate the strength of the effect of each covariate. The denominator *c* represents the normalizing constant, which corresponds to the sum of $\exp\left(\sum_{k=1}^{K} \theta_{k} z_{k}(\mathbf{y})\right)$ over all possible networks with *n* nodes.
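For very small networks, Eq. 1 can be evaluated exactly by enumerating every possible directed network, which makes the role of the normalizing constant *c* concrete. The Python sketch below does this for three nodes; the two statistics (number of edges and number of mutual dyads) and the θ values are chosen for illustration only and are not the terms or estimates of the models fitted in this study.

```python
from itertools import product
from math import exp


def edges_and_mutual(y):
    """Two illustrative statistics: edge count and number of reciprocated dyads."""
    mutual = sum(1 for (i, j) in y if i < j and (j, i) in y)
    return (len(y), mutual)


def ergm_probabilities(n, theta, stats):
    """Exact probability of every directed network on n nodes under Eq. 1."""
    dyads = [(i, j) for i in range(n) for j in range(n) if i != j]
    weights = {}
    for bits in product((0, 1), repeat=len(dyads)):
        y = frozenset(d for d, b in zip(dyads, bits) if b)
        weights[y] = exp(sum(t * z for t, z in zip(theta, stats(y))))
    c = sum(weights.values())          # normalizing constant: sum over all networks
    return {y: w / c for y, w in weights.items()}


# Three nodes give 2**6 = 64 possible directed networks.
probs = ergm_probabilities(3, theta=(-1.0, 2.0), stats=edges_and_mutual)
p_empty = probs[frozenset()]
```

The number of networks grows as 2 to the power of *n*(*n* − 1), so this enumeration is infeasible beyond a handful of nodes; this is why simulation-based estimation is needed for real trade networks.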

Because the calculation time of ERGMs increases dramatically with network size, it was decided to exclude isolates, i.e., pig premises that did not trade with other premises, from the small-scale production system (Bulgaria, initially 28,729 premises, of which 95.3% were isolated premises).

#### Model Specification

First, an exploration of network data was undertaken, with the computation of several local topological measures (number of isolates, triangles, degree distribution, etc.) and of mixing matrices for premises' attributes (16). Specifically, we computed the number of nodes, the network density, the percentage of isolates, the clustering coefficient, and the mean and range of in-degree and out-degree centrality measures [e.g., Ref. (5)]. Network graphs were plotted, with the nodes colored according to nodes' attributes, to better visualize the selective mixing patterns.
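Some of the descriptive measures listed above can be computed directly from a node set and a set of directed edges, as in the following Python sketch. The function and key names are hypothetical, the density is taken as the number of edges divided by *n*(*n* − 1), and the clustering coefficient is left out for brevity.

```python
def describe_network(nodes, edges):
    """Basic descriptive statistics of a directed network without self-loops."""
    n = len(nodes)
    in_deg = {v: 0 for v in nodes}
    out_deg = {v: 0 for v in nodes}
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
    n_isolates = sum(1 for v in nodes if in_deg[v] == 0 and out_deg[v] == 0)
    return {
        "n_nodes": n,
        "density": len(edges) / (n * (n - 1)) if n > 1 else 0.0,
        "pct_isolates": 100 * n_isolates / n,
        "mean_in_degree": sum(in_deg.values()) / n,
        "mean_out_degree": sum(out_deg.values()) / n,
        "in_degree_range": (min(in_deg.values()), max(in_deg.values())),
        "out_degree_range": (min(out_deg.values()), max(out_deg.values())),
    }


# Hypothetical example: A ships to B and C, D never trades.
summary = describe_network({"A", "B", "C", "D"}, {("A", "B"), ("A", "C")})
```

In a directed network the mean in-degree and the mean out-degree are always equal; the ranges are what distinguish premises that mainly send pigs from those that mainly receive them.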

Based on this exploration, several network statistics were chosen to represent hypothesized rules for trade movements (**Table 1**). *L*(*y*) captures the density of the observed network *y*. *Mi,v,a*(*y*), *Mo,v,a*(*y*), *Ha,v*(*y*), *Ua,v*(*y*), *Sa,v*(*y*), and *E*(*y*) are attribute-specific terms that capture the way in which premise attributes structure trading patterns, where *a* represents the attribute (e.g., housing system) and *v* the level (e.g., indoor, outdoor). The main effects, *Mi,v,a*(*y*) and *Mo,v,a*(*y*), allow variation in the propensity of a premise to form incoming and outgoing edges according to the level of an attribute characterizing this premise. *Ha,v*(*y*) models a tendency of edges to occur between premises belonging to the same attribute level that varies among attribute levels (hereafter referred to as *differential homophily*), while *Ua,v*(*y*) models a uniform tendency of edges to occur between premises belonging to the same attribute level (hereafter referred to as *uniform homophily*). *Sa,v*(*y*) accounts for variation in the occurrence of edges according to the levels of an attribute characterizing each of two premises (hereafter referred to as *selective mixing*). *E*(*y*) captures variation in the propensity of premises to form links according to the Euclidean distance in km to other premises.

Table 1 | Network statistics used to fit the exponential random graph models of pig trade networks.


*a Some statistics use attribute-specific terms where a and v represent the attribute and level, respectively. The observed network is represented by y and the scale parameter by* α*.*

*A*(*y*) and *I*(*y*) model the tendency of premises to form unidirectional links or no links, respectively. The terms *gwdsp*(*y,* α), *gwesp*(*y,* α), *gwid*(*y,* α), and *gwod*(*y,* α) are related to local structures and represent the parametric forms of the alternating two-paths, clustering (alternating *k*-triangles), and in- and out-degree distributions, respectively. A fixed value of 0.5 was adopted for the scale parameter α in these terms (11).
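To give an idea of what a geometrically weighted term measures, the Python sketch below evaluates a geometrically weighted in-degree statistic for a small network. It assumes one commonly used form of the statistic, exp(α) multiplied by the sum over degrees *k* of [1 − (1 − exp(−α))^*k*] times the number of nodes with in-degree *k*; whether this matches the exact parametrization used by the software in this study is not confirmed here, and the function names are hypothetical.

```python
from collections import Counter
from math import exp


def in_degrees(nodes, edges):
    """In-degree of every node, including nodes that receive nothing."""
    degree = {v: 0 for v in nodes}
    for _, target in edges:
        degree[target] += 1
    return list(degree.values())


def gw_degree(degrees, alpha=0.5):
    """Geometrically weighted degree statistic (assumed form, see lead-in)."""
    counts = Counter(d for d in degrees if d > 0)
    weighted = sum((1 - (1 - exp(-alpha)) ** k) * n_k for k, n_k in counts.items())
    return exp(alpha) * weighted


# Hypothetical network: C receives pigs from both A and B.
value = gw_degree(in_degrees(["A", "B", "C"], [("A", "C"), ("B", "C")]), alpha=0.5)
```

Under this form, a node with in-degree 1 contributes exactly 1 and each additional incoming link contributes less than the previous one, so α controls how quickly the marginal contribution of high-degree nodes declines.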

A Markov Chain Monte Carlo (MCMC) algorithm was used to obtain maximum likelihood estimates of the θ coefficients included in the models (12). The MCMC chain is intended to step around the sample space of possible networks, selecting a network at regular intervals to evaluate the statistics in the model. For each MCMC step, *n* (*n* = 1 in the simple case) toggles are proposed to change the dyad(s) to the opposite value. A chain burn-in of 10<sup>5</sup> toggles, an MCMC sample size of 10<sup>4</sup>, and an interval of 10<sup>3</sup> between successive samples were fixed for these models.
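The roles of the burn-in, the sample size, and the sampling interval can be illustrated with a minimal Metropolis sampler that toggles one dyad per step. The Python sketch below is deliberately restricted to a model with an edge term only and uses much smaller settings than those reported above; it is an assumption-laden illustration of the sampling idea, not the algorithm implemented in the software used for this study.

```python
import random
from math import exp


def sample_edge_counts(n, theta, burnin, sample_size, interval, seed=0):
    """Sample edge counts of directed networks from an edges-only ERGM by dyad toggling."""
    rng = random.Random(seed)
    dyads = [(i, j) for i in range(n) for j in range(n) if i != j]
    y = set()                                   # start from the empty network
    samples = []
    for step in range(1, burnin + sample_size * interval + 1):
        d = rng.choice(dyads)                   # propose toggling one dyad
        delta = -1 if d in y else 1             # change in the edge count if toggled
        if rng.random() < min(1.0, exp(theta * delta)):
            y.symmetric_difference_update({d})  # accept: flip the dyad
        if step > burnin and (step - burnin) % interval == 0:
            samples.append(len(y))              # record the statistic after burn-in
    return samples


counts = sample_edge_counts(n=5, theta=-1.0, burnin=2000, sample_size=1000, interval=20)
mean_edges = sum(counts) / len(counts)
```

For this edges-only model each dyad is independently present with probability exp(θ)/(1 + exp(θ)), so the sampled mean edge count can be checked against 20 × 0.269 ≈ 5.4 for five nodes and θ = −1. Models with structural terms have no such closed form, which is where sampling becomes necessary.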

#### Model Selection and Goodness of Fit

For each area, four models were built: (1) a simple Bernoulli model that only includes the number of edges; (2) a model with edges and statistics based on nodal attributes (hereafter called "edge + attribute" model); (3) a model with edges and structure-related statistics ("edge + network statistics" model); and (4) a model with edges, nodal attributes, and structure-related statistics ("edge + attributes + network statistics" model).

For the "edge + attribute" and "edge + network statistics" models, univariable analyses were performed first. The terms (i.e., attributes and network statistics) were then added one by one, until the best model fit was obtained. The fourth model was based on the final "edge + attribute" model, and network statistics terms were added one by one manually until the best model fit was obtained.

Three approaches were used to examine goodness of fit of the models: (1) check for model convergence and degeneracy; (2) comparison of Akaike information criteria; and (3) comparison of goodness of fit plotting for higher order statistics (11). For this purpose, four sets of statistics were used: the in- and out-degree distributions, the geodesic distance distribution, and the edgewise shared partner distribution, which reflects the clustering of the network (17). These statistics were chosen because of their impact on disease spread dynamics (18). Finally, plots of simulated networks were visually compared to the plot of the observed networks.

All analyses were conducted in R (19) using the "statnet" suite of packages (20, 21).

#### RESULTS

A total of 7,811 out of the 45,224 premises keeping pigs (i.e., 17.3%) were actively moving pigs during 2011. Description of the pig industry demographics (i.e., number of premises for each type of farm), pig trade (to/from different types of farm), and topological characteristics in Côtes-d'Armor (France), Bulgaria, and Extremadura (Spain) in 2011 are presented in **Tables 2**–**4**.

The inclusion of both nodal attributes and network configurations statistics provided the best fit to the data (**Tables 5**–**7**; **Figures 1** and **2**). Selective mixing between premises according to their type of production appeared to be an important mechanism of pig organization, whichever system considered (**Tables 5**–**7**). In addition to this mechanism, the other mechanisms related to premises characteristics that impacted the most on trade organization were belonging to the same pig company, the tendency of outdoors premise to trade with outdoor premises, and the geographical location of pig premises, in the intensive, extensive, and small-scale production systems, respectively (**Tables 5**–**7**). Network statistics on dyadwise and edgewise shared partner distributions, as well as on in- or out-degree distributions were



*a For Bulgaria only: industrial farm (large size:* >*500 pigs, high biosecurity level farm); type A farm (medium-size: 10–500 pigs, high biosecurity level); type B farm (medium-size: 10–500 pigs, low biosecurity level); east Balkan pigs (traditional outdoor pig herds). Small producer for Bulgaria:* <*10 pigs kept for own consumption. bSmall producers were defined for Spain as those farms (any size) that produce pigs for own consumption; for France were those with* ≤*4 pigs and for Bulgaria were those with* <*10 pigs kept for own consumption. NA, not applicable/not available.*



*IQR, interquartile range.*

*a Premises that sent or received pigs in 2011.*

#### Table 4 | Topological statistics of the pig trade networks in 2011.


*kin, in-degree; kout, out-degree.*

*a Isolates were excluded for Bulgaria: initially there were 28,729 premises, of which 95.3% were isolated premises.*


#### Table 5 | Parameter coefficients and fit for the four exponential random graph models (ERGMs) of pig trade in a small-scale production system (Bulgaria).

*a \*\*\** <*0.001; \*\** <*0.01; \** <*0.05.*

*bE, east; NW, north-west; S, south; SW, south-west.*

*c SP, small producer; IN, industrial; TA, type A; TB, type B.*

needed to fit the models (**Tables 5**–**7**). These statistics better reflected the clustering of the observed networks and allowed the observed global network properties to be reproduced well (**Figures 1** and **2**). This can be clearly observed in the goodness-of-fit diagnostics plot (**Figure 2**), where the values of all the observed network statistics (solid black line) are well captured only by the distribution of values of the simulated networks (underlying boxplots) generated with the final ERGM (i.e., model with edges + attributes + network statistics).

#### DISCUSSION

Exponential random graph models were used to represent, understand, and predict pig trade network structures from different European production systems, with predominantly small-scale, extensive, or intensive pig producers. Such information improved our understanding of the processes that govern the organization of pig trade and can further be used to better inform European policy makers on prevention and control


#### Table 6 | Parameter coefficients and fit for the four exponential random graph models (ERGMs) of pig trade in an extensive production system (Extremadura, Spain).

*a \*\*\** <*0.001; \*\** <*0.01; \** <*0.05.*

*bIn, indoor; Out, outdoor.*

*c U, multipliers; FA, farrow farms; FF, farrow-to-finish farms; FI, finishers; SP, small producers.*

measures against swine diseases such as African swine fever, classical swine fever, or porcine reproductive and respiratory syndrome.

Rapidity of targeted action during the initial phase of an outbreak is fundamental to effectively curtail transmission and minimize the disease burden. At this time, movements of animals have not yet been banned, and it is thus relevant to use "peace-time" movement networks to compare different control strategies. Until recently, variability in contact patterns was mostly approached in epidemic models by combining probabilities of contact between premises according to their type of production and to the distance between premises (22, 23). These efforts may fail to capture the structural properties of livestock trade networks that will impact disease dynamics as well as their spatial spread (24–26). The use of ERGMs to model pig trade networks allows both network topology and complex behaviors that depend on various premises characteristics to be captured. They may thus help to generate more realistic networks that may be used to study disease spread, identify premises that could be targeted for risk-based surveillance, early detection, and rapid control of diseases, and compare different control strategies (27–29). Indeed, by simulating disease spread on several simulated networks, we could identify farms that are frequently and early infected and that should thus be targeted to provide timely and accurate indications of epidemic activity (30, 31). These networks could also be simulated to


#### Table 7 | Parameter coefficients and fit for the four exponential random graph models (ERGMs) of pig trade in an intensive production system (Côtes-d'Armor, France).

*a \*\*\** <*0.001; \*\** <*0.01; \** <*0.05.*

*bFor readability, not all values for selective for the pig companies are shown; NC, no company.*

*c MU, multipliers; FA, farrow farms; FF, farrow-to-finish farms; FI, finishers; SP, small producers; TR, trade operators.*

assess the effectiveness of compartmentalization or zoning, strategies that might be efficient in preventing disease spread without disrupting pig trade (32, 33).

Exponential random graph models developed in this study also improved our understanding of the drivers of pig trade in different production systems. Geographic mixing patterns strongly structured pig trade organization in the small-scale production system, whereas belonging to the same company or keeping pigs in the same housing system appeared to be key drivers of pig trade in intensive and extensive production systems, respectively. As expected, the specialization and organization of pig production also explained a part of trading behaviors, illustrated by the heterogeneous mixing between types of production. This mechanism was, however, less important than the geographical location of premises and as important as belonging to the same pig company in the small-scale and intensive production systems, respectively. Geographical proximity did not appear to play a role in the intensive system, whereas it was significant in the small-scale producer system. Unfortunately, model degeneracy occurred when trying to include the distance between premises as a covariate in the extensive production system (Extremadura, Spain), preventing any conclusion from being drawn on the impact of this covariate in this production system.

Finally, this study revealed that the inclusion of nodal attributes was necessary to represent the mixing patterns, but it was not sufficient to reproduce the great clustering observed in the pig trade networks, which could only be represented when adding additional statistics on local network configurations. These statistics reveal some features of these networks, such as the propensity of trade to have a short path length (negative coefficient for the degree distribution terms). Some social behavior, economic factors, or unobserved covariates, which may differ between countries, may also have driven the choices of farmers for trading partners (e.g., pig prices, road distribution, traditions, or cultural practices, etc.). This may explain the increased clustering as represented by the positive coefficient for the *gwesp* term.

Until recently, problems of degeneracy and computational intractability for large network sizes limited the use of ERGMs in epidemiological modeling (11, 34). Indeed, ERGMs have been mainly used on small networks to understand the factors driving human social behaviors (35, 36) and have sometimes been applied to disease transmission modeling (37). Ortiz-Pelaez et al. (38) were the first to introduce ERGMs in preventive veterinary medicine, using this method to understand the factors driving livestock trade in a small network of villages in Ethiopia. In the present study, the use of new parameters that limit degeneracy problems (12) allowed us to obtain statistical models with a good fit to the large observed networks. The number of isolates always depends on the spatial and temporal "frontiers" that are chosen and on the exchange with farms outside these "frontiers." For Extremadura, most movements were inside the region (i.e., of the 9,544 isolates, 174 sent pigs to farms outside Extremadura and 73 received pigs from farms outside Extremadura); therefore, most of the isolates (97.4%) can be considered true isolates during the study period (i.e., no movements within the region and no trade with other regions). For Côtes-d'Armor, there was a lot of exchange with other departments, and only 47% of the 489 isolates can be considered true isolates (i.e., 57 sent pigs to farms outside Côtes-d'Armor and 202 received pigs from farms outside the region). Therefore, the scale at which networks are simulated for disease spread models should correspond to areas within which almost all movements take place. For Bulgaria, the entire country was evaluated, and all isolates were considered to be true isolates; however, it was not feasible to include these isolates in the ERGM due to memory limits, and therefore the model produced here might not fully represent the true pig movement network in Bulgaria. Further studies should be conducted to validate this network once the computational difficulties of fitting ERGMs to large networks are solved.
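The proportions of true isolates quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes (as the reported percentages imply) that the premises sending pigs outside the study area and those receiving pigs from outside it do not overlap.

```python
def true_isolate_share(isolates, sent_outside, received_outside):
    """Share of isolates with no recorded trade outside the study area.

    Assumption: premises that sent pigs outside and premises that received
    pigs from outside are disjoint sets, so their counts can simply be added.
    """
    traded_externally = sent_outside + received_outside
    return (isolates - traded_externally) / isolates


# Counts taken from the text above.
extremadura = true_isolate_share(9544, 174, 73)
cotes_darmor = true_isolate_share(489, 57, 202)

print(f"Extremadura: {extremadura:.1%} true isolates")
print(f"Côtes-d'Armor: {cotes_darmor:.1%} true isolates")
```

Under this assumption the figures in the text are reproduced: 9,297 of 9,544 isolates for Extremadura and 230 of 489 for Côtes-d'Armor.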

Since implementation of Regulation (EC) No 1760/2000 of the European Parliament, recording of livestock movements between premises has been mandatory, making data on pig trade movements available, at least in the main producing countries in the EU. However, there are no standards on the definition of different types of premises (e.g., backyard) or on other premises attributes. The scales of the networks considered in this study were also different, being the national level for Bulgaria and the regional level for France and Spain, to better study very specific production systems. Therefore, though the analyses and results in the different settings were intended to illustrate the applicability and usefulness of the approach in the predominant swine production systems in the EU, the mechanisms and rules that govern trade organization in the different study populations are not fully comparable.

Several studies showed that, in addition to the topology of a contact network, heterogeneity in the weight of edges and temporal network dynamics had a strong influence on disease spread (39, 40). Tools to model such networks are still under development, and their application is currently limited by the size of the networks modeled (41–43). In the next few years, these methods could be promising tools to improve our representation of real-world livestock trade networks.

#### CONCLUSION

This study is one of the very first to illustrate the usefulness of ERGMs to understand and simulate livestock trade networks under different European production systems, specifically small-scale, extensive, and intensive swine production systems. Depending on the production system, some premises characteristics, such as their geographical location, type of production, belonging to a pig company or housing system, were key drivers of pig trade, but adding statistics on local network configurations was necessary to accurately capture the great clustering observed in all pig trade networks. These models offer a framework to simulate realistic pig trade networks that may be included in epidemic models to compare different control strategies against major swine diseases such as African swine fever, classical swine fever, or porcine reproductive and respiratory syndrome.

#### AUTHOR CONTRIBUTIONS

AR, BM-L, and VG designed the study and developed the R codes. AR and BM-L gathered, cleaned, and verified the data. TA, AW-S, SM, EE, and JS-V contributed to the interpretation and critical discussion of the nature, characteristics, and structure of the data for the different study regions. AR carried out the analyses and wrote the draft of the manuscript. All authors participated in the interpretation and discussion of the results, and read, edited, and approved the final manuscript.

#### ACKNOWLEDGMENTS

The research leading to these results received funding from the European Union, Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 311931 (ASFORCE) and Boehringer Ingelheim Vetmedica, Inc. (gift support). The authors are very grateful to BFSA, BDPORC, and MAGRAMA for providing the data.

#### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer AB and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

*Copyright © 2017 Relun, Grosbois, Alexandrov, Sánchez-Vizcaíno, Waret-Szkuta, Molia, Etter and Martínez-López. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Early Decision Indicators for Foot-and-Mouth Disease Outbreaks in Non-Endemic Countries

*Michael G. Garner<sup>1</sup>, Iain J. East<sup>1</sup>\*, Mark A. Stevenson<sup>2</sup>, Robert L. Sanson<sup>3</sup>, Thomas G. Rawdon<sup>4</sup>, Richard A. Bradhurst<sup>5</sup>, Sharon E. Roche<sup>1</sup>, Pham Van Ha<sup>6</sup> and Tom Kompas<sup>5</sup>*

*<sup>1</sup>Animal Health Policy Branch, Department of Agriculture and Water Resources, Canberra, ACT, Australia, <sup>2</sup>Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Parkville, VIC, Australia, <sup>3</sup>AsureQuality Limited, Palmerston, New Zealand, <sup>4</sup>Investigation and Diagnostic Centre and Response Directorate, Ministry for Primary Industries, Wellington, New Zealand, <sup>5</sup>Centre of Excellence for Biosecurity Risk Analysis, University of Melbourne, Parkville, VIC, Australia, <sup>6</sup>Crawford School of Public Policy, Australian National University, Acton, ACT, Australia*

#### *Edited by:*

*Tariq Halasa, Technical University of Denmark, Denmark*

#### *Reviewed by:*

*Kimberly VanderWaal, University of Minnesota, USA Preben William Willeberg, Technical University of Denmark, Denmark*

> *\*Correspondence: Iain J. East*

*ian.j.east@agriculture.gov.au*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

*Received: 26 August 2016 Accepted: 17 November 2016 Published: 30 November 2016*

#### *Citation:*

*Garner MG, East IJ, Stevenson MA, Sanson RL, Rawdon TG, Bradhurst RA, Roche SE, Van Ha P and Kompas T (2016) Early Decision Indicators for Foot-and-Mouth Disease Outbreaks in Non-Endemic Countries. Front. Vet. Sci. 3:109. doi: 10.3389/fvets.2016.00109*

Disease managers face many challenges when deciding on the most effective control strategy to manage an outbreak of foot-and-mouth disease (FMD). Decisions have to be made under conditions of uncertainty and where the situation is continually evolving. In addition, resources for control are often limited. A modeling study was carried out to identify characteristics measurable during the early phase of an FMD outbreak that might be useful as predictors of the total number of infected places, outbreak duration, and the total area under control (AUC). The study involved two modeling platforms in two countries (Australia and New Zealand) and encompassed a large number of incursion scenarios. Linear regression, classification and regression tree, and boosted regression tree analyses were used to quantify the predictive value of a set of parameters on three outcome variables of interest: the total number of infected places, outbreak duration, and the total AUC. The number of infected premises (IPs), number of pending culls, AUC, estimated dissemination ratio, and cattle density around the index herd at days 7, 14, and 21 following first detection were associated with each of the outcome variables. Regression models for the size of the AUC had the highest predictive value (*R*<sup>2</sup> = 0.51–0.9) followed by the number of IPs (*R*<sup>2</sup> = 0.3–0.75) and outbreak duration (*R*<sup>2</sup> = 0.28–0.57). Predictability improved at later time points in the outbreak. Predictive regression models using various cut-points at day 14 to define small and large outbreaks had positive predictive values of 0.85–0.98 and negative predictive values of 0.52–0.91, with 79–97% of outbreaks correctly classified. On the strict assumption that each of the simulation models used in this study provides a realistic indication of the spread of FMD in animal populations, our conclusion is that relatively simple metrics available early in a control program can be used to indicate the likely magnitude of an FMD outbreak under Australian and New Zealand conditions.

Keywords: FMD, early decision indicators, vaccination, simulation models, decision-support, regression analysis

#### INTRODUCTION

Disease managers are faced with a number of challenges when deciding on the most effective disease control strategy to implement in an exotic animal disease outbreak. Foot-and-mouth disease (FMD) is particularly challenging given its wide range of host species, potential for rapid spread, and serious socioeconomic consequences. For countries such as Australia and New Zealand, FMD represents the most serious threat to their livestock industries. A recent study estimated the 2013 value of total direct economic losses over 10 years for a large multi-state outbreak of FMD in Australia at USD 47 billion (1). Animal products constitute a significant proportion of New Zealand exports, and the provisional results of recent modeling of the economic impacts of a large FMD outbreak in New Zealand have estimated net 2014 GDP losses over an 8-year period to be between USD 13 and 17 billion (2). Consequently, Australia and New Zealand invest considerable resources in preparedness and planning for emergency animal disease outbreaks, including maintaining vaccine banks for FMD. Despite recent changes to contingency plans to recognize that vaccination could be an important component of an FMD control program, it is unclear how or when vaccination should be used, and if it is used, how vaccinated animals should be managed once an outbreak has been resolved.

Modeling studies carried out in Australia (3–5) and overseas (6–8) have shown that vaccination is effective in reducing the duration and/or size of FMD outbreaks in situations where disease is widespread, where there is a high rate of spread, or where the resources for stamping out are limited. Reports suggest that early vaccination may have been beneficial in eradicating the disease earlier than was the case with recent FMD outbreaks in Korea (9) and Japan (10). Thus, vaccination is increasingly being recognized as a potentially useful tool to assist in containing and eradicating FMD outbreaks in countries where the disease is not endemic. However, while vaccination may contribute to earlier eradication of the disease, it will be associated with additional costs: keeping vaccinated animals in the population will delay the period until FMD-free status is regained under current World Organization for Animal Health guidelines (11) and add complexity to post-outbreak surveillance programs. These issues are of particular concern for countries with significant exports of livestock and livestock products because, under current conditions, the use of vaccination and the presence of FMD vaccinated animals in the population could be expected to cause significant market access difficulties.

From a planning and management perspective, it would be useful to have access to decision support tools that take into account the information that would be available to disease managers early in an outbreak to provide an indication of the potential severity of the outbreak that could ensue. This would enable decisions on specific measures like vaccination to be made at a time when they are likely to be most effective.

McLaws and Ribble (12) documented the relationship between the interval (in days) from incursion to detection and epidemic size [expressed as the total number of infected premises (IPs)] for 24 FMD outbreaks in non-endemic countries that occurred between 1992 and 2003. They did not find a direct relationship between time to detection and total number of IPs or total animals culled for disease control, concluding that the movement of animals through markets was the most critical factor contributing to large outbreaks. Sarandopoulos (13) conducted a review of 125 FMD epidemics in non-endemic temperate countries reported to the OIE between January 1, 2005 and December 31, 2013 to identify associations between epidemic size/duration and early outbreak explanatory variables. The explanatory variables assessed in this study included susceptible animal densities, weather conditions at the time of detection, the number of IPs detected in the first 7 days, and the size of the area under control (AUC) at 7 days (based on a convex hull calculation). In total, ten candidate explanatory variables were tested for their association with epidemic size and duration using a zero-inflated negative binomial regression model. Cattle density, pig density, and the number of IPs at day 7 post-detection were all positively associated with epidemic size while increased average temperature in the month of detection was associated with "smaller" outbreaks.

Using data from the outbreak of FMD that occurred in the UK in 2001, first fortnight incidence (FFI), i.e., the cumulative number of new FMD-IPs found in the first 2 weeks of the response, was found to be a useful predictor of the size and duration of outbreaks at the regional and national scale (14, 15). The larger the number of detected herds within the first 2 weeks, the higher the risk of a large outbreak. Halasa et al. (16) extended the approach of Hutber et al. to incorporate the first fortnight spatial spread (FFS) as well as FFI (which they renamed first fortnight outbreaks, FFO, since a true incidence rate is not actually calculated) in a simple decision tool using simulated FMD outbreaks. In terms of outcome, in addition to the number of IPs and outbreak duration, they also considered the size of the AUC and costs. Halasa and colleagues found good correlations between FFO and FFS and all of the outcome variables, indicating that both FFO and FFS have potential as predictors of epidemic outcomes. They also found that the type of index herd was a significant predictor of epidemic outcome.
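To make the FFO metric concrete, the following minimal sketch counts the IPs detected during the first fortnight of a response. It is an illustration rather than the implementation used in the cited studies: the function name and example data are hypothetical, and it assumes that the day of first detection is day 0 and that the fortnight covers days 0 to 13 inclusive.

```python
def first_fortnight_outbreaks(detection_days, window=14):
    """Count infected premises (IPs) detected within `window` days of the
    first detection.

    `detection_days` holds one detection day per IP on any common day scale.
    Days are re-based so that the first detection falls on day 0 (an
    assumption of this sketch, not a definition taken from the source).
    """
    if not detection_days:
        return 0
    first = min(detection_days)
    return sum(1 for day in detection_days if day - first < window)


# Hypothetical detection days for nine IPs in one simulated outbreak.
example_days = [0, 2, 3, 3, 9, 13, 14, 20, 31]
print(first_fortnight_outbreaks(example_days))
```

A different day-numbering convention (for example, counting day 14 itself) would shift the count slightly, so the convention should be fixed before outbreaks are compared.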

The combined work of Hutber et al. (15), Halasa et al. (16), and Sarandopoulos (13) indicates that information available early in an outbreak can be used to make inferences about the potential severity of an FMD outbreak and could perhaps be incorporated into decision support tools. However, one of the concerns is that FFO and FFS are quite simple parameters that are likely to be sensitive to outbreak management response, in particular the effectiveness of the surveillance/reporting system. For example, while a low FFO may be indicative of a limited spread and small number of infected places, it could also be indicative of the adequacy of resources to undertake surveillance and tracing. In addition, based on the work of McLaws and Ribble (12) and Sarandopoulos (13), other factors such as animal densities at the location of the index premises and involvement of animal markets may also be important.

With this background, this study was undertaken to identify characteristics measurable during the early phase of an FMD outbreak that might be useful as predictors of the severity of an FMD epidemic (expressed as the total number of infected places, outbreak duration, and the total AUC). The study also aimed to assess how robust findings were across different incursion scenarios and between different production and management systems. A key point is that in this study simulation models of FMD were used to generate a series of outbreaks listing incident infected places over time and geographical space. Regression approaches were then used to identify characteristics measurable during the early phase of a simulated outbreak that might be useful as predictors of the total number of infected places, outbreak duration, and the total AUC predicted by each simulation model. The inferences drawn from this study are dependent on the strict assumption that each of the simulation models used in this study provides a realistic indication of the spread of FMD in animal populations.

#### MATERIALS AND METHODS

A modeling study was undertaken to test a range of explanatory variables as predictors of the total number of infected places, outbreak duration, and the total AUC in an FMD outbreak. The study involved two countries (Australia and New Zealand) and two modeling platforms. Linear regression, classification and regression tree (CART), and boosted regression tree (BRT) analyses were used to assess the association between putative explanatory variables and the three outcome variables.

#### Disease Models

The Australian Animal Disease Spread Model [AADIS (17, 18)] is a hybrid model that simulates the spread and control of FMD in livestock populations at a national scale. AADIS uses the herd as the epidemiological unit of interest and models the spread of disease both within and between herds. Spread of disease within a herd is modeled through a deterministic equation-based model, and between-herd spread is modeled with a spatially explicit stochastic agent-based model. There are five discrete spread pathways in the between-herd model: direct animal movements, local spread (infection of farms within close geographical proximity by unspecified means), indirect contact (*via* contaminated equipment, people, or animal products), animal movements *via* saleyards, and windborne spread.

The model incorporates the attributes and spatial locations of individual farms, saleyards, weather stations, local government areas, and various other features of the environment. For FMD control, AADIS is configured to support the range of mitigation strategies described in Australia's contingency plans for FMD (19) with the effectiveness of these measures dependent on available resources (4).

InterSpread Plus [ISP (20)] is a spatial and stochastic simulation model of infectious disease in domestic animal populations. ISP is a state-transition model meaning that the epidemiological units of interest (farm locations) exist in either the susceptible, infected, or not-at-risk state at any given time. Similar to AADIS, ISP uses a series of user-defined parameters to define the spread of an infectious agent from one farm location to another through local spread, windborne spread, and direct and indirect contacts. Updated movement parameters are informed by findings from recent livestock movement studies in New Zealand (21, 22). Control measures, such as depopulation, vaccination, and movement restrictions, in addition to varying disease surveillance intensity can be simulated, with the ability to carry out each of these activities subject to user-defined resource constraints, similar to the AADIS model.

#### Study Design

Epidemics of FMD in Australia and New Zealand were simulated using the AADIS and ISP models, respectively. A total of 10,000 FMD outbreak simulations were carried out using each model. For each simulation, FMD was introduced into a single livestock farm selected at random within assessed high risk areas for FMD. For Australia, the study area was the whole country, with initial seeding of infection confined to south eastern Australia (**Figure 1A**). South eastern Australia comprises the states of Victoria and Tasmania and parts of New South Wales and South Australia. This area contains a mix of farming enterprises. It is the center of Australia's dairy production and is considered a higher risk area for introduction, establishment, and spread of FMD (23).

The study area for New Zealand comprised the whole of mainland New Zealand, incorporating the North and South Islands. Initial seeding of infection was confined to the Auckland mega-region (Auckland and its three neighboring regional council areas, **Figure 1B**) as it is assumed that the most likely introduction scenario for FMD into New Zealand would involve people or contaminated products seeding infection into livestock in this area. The Auckland mega-region has the largest international air and sea ports. Furthermore, yachts visiting the country are more likely to make landfall in the north.

The following assumptions were used for the Australian and New Zealand FMD models. The time from incursion to first detection was probabilistically determined based on farming systems and expected disease reporting rates in the two countries. For Australia, data on the daily probability of detection and the delay from incursion to first detection were sourced from Martin et al. (24). For New Zealand, data on the daily probability of detection were sourced from Murray and Sanson (25). Outbreak control was based on application of animal movement controls, enhanced surveillance, tracing, and stamping out (i.e., destruction, disposal, and decontamination) on detected IPs. These were applied according to each country's FMD response plan (19, 26). Resources for disease control were based on each country's estimates of expected resources (5). Each model run ended when disease was eradicated or after 1 year, whichever occurred first.

#### Explanatory Variables

Three time points (days 7, 14, and 21 after first detection) were selected, and candidate explanatory variables based on data that would be available to disease managers at these time points were collated: (1) outbreak location: farm and animal densities around the site of first detection; (2) the involvement of markets/saleyards; (3) measures of the geographic distribution of IPs, as measured by the AUC, and the number of discrete disease clusters; (4) measures of temporal spread, as measured by the number of IPs reported, and the number of traced premises identified; (5) the rate of disease spread, as measured using the estimated dissemination ratio (EDR), calculated using the methods described by Miller (27) and Morris et al. (28); and (6) adequacy of resources available for control. A description of each of the candidate explanatory variables is provided in **Table 1**.
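The EDR is generally described as the number of newly infected premises in one period divided by the number in the preceding period of equal length, with values above 1 suggesting an expanding outbreak. The sketch below assumes that general definition; the window length is left as a parameter because the value used in this study is not stated in this section, and the function name and daily counts are hypothetical.

```python
def estimated_dissemination_ratio(daily_new_ips, day, window):
    """Ratio of new IPs in the `window` days ending on `day` to the new IPs
    in the `window` days immediately before that.

    `daily_new_ips[i]` is the number of IPs newly identified on day i.
    Returns None when the ratio is undefined (not enough history, `day`
    beyond the data, or no IPs in the earlier window).
    """
    recent_start = day - window + 1
    previous_start = day - 2 * window + 1
    if previous_start < 0 or day >= len(daily_new_ips):
        return None
    recent = sum(daily_new_ips[recent_start:day + 1])
    previous = sum(daily_new_ips[previous_start:recent_start])
    if previous == 0:
        return None
    return recent / previous


# Hypothetical daily counts of new IPs for days 0 to 7 of an outbreak.
counts = [1, 0, 2, 1, 3, 2, 4, 4]
print(estimated_dissemination_ratio(counts, day=7, window=4))
```

Returning `None` rather than infinity when the earlier window contains no IPs keeps undefined ratios out of any later regression.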

For each simulated outbreak, we defined three outcome variables: the total number of IPs, outbreak duration (defined as the number of days from first detection until the date on which the last IP was culled), and the total AUC (in km<sup>2</sup>). Modeling results for each country were analyzed separately.

#### Statistical Methods

#### Linear Regression

The Stata/IC statistical package (29) was used for all linear regression analyses. Datasets were imported into Stata and the three outcome variables and each of the explanatory variables checked for normality and log transformed, where necessary, to minimize problems due to non-normality and heteroscedasticity of model residuals (30). Scatterplots of each of the log transformed outcome variables and each of the log transformed candidate explanatory variables were made and the association between variable pairs assessed by superimposing a Lowess-smoothed curve on each plot. After log transformation, all relationships were linear or near linear. Subsequent analyses used two modeling techniques: (1) linear regression modeling using robust estimates to account for non-normally distributed dependent variables (31) and (2) negative binomial regression. It was considered appropriate to use linear regression techniques because the methodology is robust to violations of the requirement for normally distributed dependent variables if the number of observations is large (32, 33). The outputs of the two regression models were similar, so only the results of the linear regressions are presented. The linear regression was preferred because a small proportion of values had excessively large residuals in the negative binomial models.
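As an illustration of the log-transform-then-regress step, the sketch below fits an ordinary least-squares line to a log-transformed outcome and a single log-transformed explanatory variable and reports *R*<sup>2</sup>. It is a simplified stand-in, not the Stata analysis described above: it handles one explanatory variable, does not compute robust standard errors, requires strictly positive values, and uses hypothetical data and names.

```python
import math


def fit_log_log(x, y):
    """Ordinary least squares of log(y) on log(x).

    Returns (slope, intercept, r_squared). All values in `x` and `y` must be
    strictly positive; how zeros were handled in the study is not stated here.
    """
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2
                 for a, b in zip(log_x, log_y))
    ss_tot = sum((b - mean_y) ** 2 for b in log_y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical data: IPs at day 14 and total IPs for six simulated outbreaks.
ips_day14 = [3, 5, 8, 12, 20, 35]
total_ips = [6, 14, 19, 40, 55, 130]
slope, intercept, r_squared = fit_log_log(ips_day14, total_ips)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R2={r_squared:.2f}")
```

On the log scale the slope reads as an elasticity: a 1% increase in the explanatory variable is associated with approximately a slope-percent change in the outcome.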

#### TABLE 1 | Explanatory variables tested.


Candidate explanatory variables were initially tested for unconditional associations with each of the three outcome variables. Explanatory variables that were associated with the outcome variables with *P* < 0.20 were selected for inclusion in the initial multiple regression models. The initial multiple regression model was then reduced step-wise by removing the explanatory variable with the highest *P*<sub>Wald</sub> value. This process was repeated until all remaining explanatory variables had *P*<sub>Wald</sub> < 0.05. After the most parsimonious model was developed, all excluded explanatory variables were reassessed by adding them individually back into the model. All biologically plausible, first-order interaction terms were tested one at a time and retained in the model if *P*<sub>Wald</sub> < 0.05 (no interaction terms were retained). The extent of confounding was assessed using the variance inflation factor. No significant confounding was observed in the final models presented.
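The step-wise reduction described above can be sketched as a simple loop. This is an outline of the control flow only, under stated assumptions: `p_values` is a hypothetical callback that refits the model on the supplied variables and returns their Wald *P* values, and the variable names and fixed *P* values in the example are invented for illustration. The later steps (re-adding excluded variables and testing interactions) are not shown.

```python
def backward_eliminate(candidates, p_values, threshold=0.05):
    """Drop the variable with the highest P value until all are below
    `threshold`.

    `p_values(variables)` is a hypothetical callback that must fit a model
    using exactly `variables` and return a dict {variable: P value}.
    """
    kept = list(candidates)
    while kept:
        current = p_values(kept)
        worst = max(kept, key=lambda name: current[name])
        if current[worst] < threshold:
            break  # every remaining variable meets the retention criterion
        kept.remove(worst)
    return kept


# Stand-in for a real model fit: fixed, invented P values that do not change
# between refits (in a real analysis they would).
FIXED_P = {"ips_day14": 0.001, "auc_day14": 0.01,
           "pig_density": 0.40, "clusters": 0.08}


def fake_fit(variables):
    return {name: FIXED_P[name] for name in variables}


print(backward_eliminate(list(FIXED_P), fake_fit))
```

In practice the callback would wrap whatever regression routine is in use, so that each pass reflects the *P* values of the refitted model rather than those of the initial fit.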

We acknowledge that the outcome variables measured on a given EDI day (i.e., days 7, 14, and 21 post-detection), which were used as explanatory variables in each model were correlated with their corresponding outcome variable. To investigate this issue further, alternative models were developed where the outcome variable was expressed as (for example) the total number of IPs − IPs identified up to day 14. Using this approach, we identified no substantial differences in the final set of explanatory variables included in the model and the direction and magnitude of the adjusted measure of association between each explanatory variable and the outcome were essentially the same. For this reason, and also to allow our findings to be compared with previous studies (16), we elected to use the total number of infected places, total outbreak duration, and the total AUC as outcome variables in each of the models presented.

While the explanatory variables that remained in each of the Australian and New Zealand regression models differed, those with the most explanatory power (that is, those with the highest beta weight values) were present in both of the country models. For parsimony, a simpler regression model was built using only the explanatory variables that were common to both the Australian and New Zealand models with little or no loss of explanatory power (see Results).

For the linear regression models, the *R*<sup>2</sup> value is reported as a measure of the goodness of fit of the model. Based on the regression coefficients estimated for the explanatory variables included in each of the three regression models for each country, predictions of the total number of infected places, outbreak duration, and the total AUC were computed. Several cutpoints (e.g., more or less than 20 IPs) were then arbitrarily selected to divide the model iterations into large and small outbreaks. Two by two contingency tables were constructed to compare the regression model estimates with the actual values of the (classified) outcome variables. These data were then used to calculate negative and positive predictive values for the day 14 model estimates using standard techniques (34).
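The contingency-table step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the example values are hypothetical, "small" is assumed to mean at or below the cutpoint, and "small" is treated as the positive class, which matches how the predictive values are interpreted in the Discussion.

```python
def predictive_values(predicted, actual, cutpoint):
    """Cross-tabulate predicted against actual outbreak size at a cutpoint.

    Assumption: an outbreak is "small" when its value is <= cutpoint, and
    "small" is the positive class.
    """
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        pred_small = p <= cutpoint
        true_small = a <= cutpoint
        if pred_small and true_small:
            tp += 1          # predicted small, actually small
        elif pred_small:
            fp += 1          # predicted small, actually large
        elif true_small:
            fn += 1          # predicted large, actually small
        else:
            tn += 1          # predicted large, actually large
    n = tp + fp + tn + fn
    return {
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
        "correct": (tp + tn) / n if n else float("nan"),
    }


# Hypothetical model predictions and simulated totals (number of IPs).
predicted_ips = [5, 12, 30, 8, 45, 18, 25, 3]
actual_ips = [4, 25, 40, 6, 60, 15, 10, 2]
result = predictive_values(predicted_ips, actual_ips, cutpoint=20)
```

Repeating the call with several cutpoints shows how the predictive values change with the arbitrary definition of a "large" outbreak.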

#### Regression Trees

Acknowledging the possibility of non-linear relationships between the explanatory variables and the three outcome variables used in this study, we used CART and BRT analyses as an alternative approach for identifying associations in these data. CART analysis involves recursively partitioning an outcome variable into two parts based on the value of a given predictor variable that best splits the data. A complete CART returns a "tree" with multiple splits, depicted as branches. Predictor variables and their split points are chosen to optimize a given goodness-of-fit criterion, such as minimizing the residual sum of squares (for continuous data). CART analysis is mathematically identical to some multivariable regression techniques, but presents the results in a way that is easily understood by nontechnical audiences.
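To make the partitioning step concrete, the sketch below searches for the single split of one predictor that minimizes the residual sum of squares of the outcome. It is a simplified illustration under stated assumptions, not the rpart algorithm: it handles one predictor, makes one split, and does no pruning. The function names and data values are hypothetical.

```python
def rss(values):
    """Residual sum of squares around the mean of the values."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total RSS) for the binary split of x that minimizes the RSS of y."""
    pairs = sorted(zip(x, y))
    best_threshold, best_rss = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical predictor values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [yv for _, yv in pairs[:i]]
        right = [yv for _, yv in pairs[i:]]
        total = rss(left) + rss(right)
        if total < best_rss:
            best_threshold, best_rss = threshold, total
    return best_threshold, best_rss


# Hypothetical data: IPs identified by day 14 and total IPs per simulated outbreak.
ips_day14 = [2, 3, 4, 5, 12, 15, 20, 30]
total_ips = [3, 4, 6, 7, 90, 100, 110, 120]
threshold, split_rss = best_split(ips_day14, total_ips)
```

A full tree would apply the same search recursively within each resulting branch and across all candidate predictors.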

In contrast to CART, a BRT analysis generates a large number of regression trees based on random samples of the data (35). A BRT model returns a list of predictor variables used to create the splits in each of the trees computed using the randomly sampled data. A relative weight is then calculated for each predictor variable by computing the average number of times the variable was chosen for splitting weighted by the squared improvement to the model from each split and scaled to sum to 100. Larger weights indicate a stronger influence between an explanatory variable and the outcome. The BRT analysis requires the analyst to specify the learning rate and tree complexity. Learning rate controls how much each tree contributes to the model as it develops. In general, smaller learning rates result in better predictions than larger learning rates. Tree complexity sets the number of interactions fitted in the model: a tree complexity of two allows for two-way interactions, three allows for three-way interactions, and so on.
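The roles of the learning rate, random sampling, and relative weights can be illustrated with a small boosting sketch. It is not the dismo implementation: it assumes squared-error loss, uses single-split trees (tree complexity of one, so no interactions), and accumulates each variable's reduction in squared error as its influence before scaling the totals to sum to 100. The function names, parameter defaults, and data are hypothetical.

```python
import random


def fit_stump(rows, residuals):
    """Return the single split that most reduces the RSS of the residuals, or None."""
    mean_all = sum(residuals) / len(residuals)
    base_rss = sum((r - mean_all) ** 2 for r in residuals)
    best = None  # (improvement, variable index, threshold, left mean, right mean)
    for var in range(len(rows[0])):
        values = sorted(set(row[var] for row in rows))
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [r for row, r in zip(rows, residuals) if row[var] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[var] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            split_rss = (sum((r - left_mean) ** 2 for r in left)
                         + sum((r - right_mean) ** 2 for r in right))
            improvement = base_rss - split_rss
            if best is None or improvement > best[0]:
                best = (improvement, var, threshold, left_mean, right_mean)
    return best


def boost(rows, y, n_trees=200, learning_rate=0.05, bag_fraction=0.75, seed=1):
    """Fit boosted stumps; return fitted values and relative weights (sum to 100)."""
    rng = random.Random(seed)
    predictions = [sum(y) / len(y)] * len(y)   # start from the overall mean
    influence = [0.0] * len(rows[0])
    for _ in range(n_trees):
        # Each tree is fitted to the current residuals of a random subsample.
        sample = rng.sample(range(len(rows)), max(2, int(bag_fraction * len(rows))))
        stump = fit_stump([rows[i] for i in sample],
                          [y[i] - predictions[i] for i in sample])
        if stump is None:
            continue
        improvement, var, threshold, left_mean, right_mean = stump
        influence[var] += improvement
        # The learning rate shrinks the contribution of each tree.
        for i, row in enumerate(rows):
            step = left_mean if row[var] <= threshold else right_mean
            predictions[i] += learning_rate * step
    total = sum(influence)
    weights = [100 * v / total for v in influence] if total else influence
    return predictions, weights


# Hypothetical predictors: (IPs at day 14, cattle density); outcome: total IPs.
rows = [(2, 30), (3, 10), (4, 25), (5, 15), (12, 20), (15, 35), (20, 12), (30, 28)]
total_ips = [3, 4, 6, 7, 90, 100, 110, 120]
fitted, weights = boost(rows, total_ips)
```

With a smaller learning rate, more trees are needed to reach the same fit, which is the trade-off the analyst controls when specifying these settings.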

Classification and regression tree analyses were carried out for each of the three outcome variables for the Australian and New Zealand data using the rpart package (36) implemented in R version 3.3.1 (37). The BRT analyses were carried out using the dismo package (38) in R.

#### RESULTS

Of the 10,000 outbreaks that were simulated, FMD did not establish (there was no spread from the seed herd) in 3210 simulations in Australia and 1180 simulations in New Zealand. These simulations were excluded from subsequent analyses. Descriptive statistics of the simulated outbreaks and explanatory variables for Australia and New Zealand are shown in **Tables 2** and **3**, respectively. Descriptive statistics of the outcome variables for the AADIS (Australia) and ISP (New Zealand) models (that is, the total number of infected places, outbreak duration and the total AUC) are shown in **Table 4**.

For the New Zealand (ISP) simulations, the area used for seeding FMD outbreaks had a substantially higher density of cattle (median of 152 head/km<sup>2</sup>) than the areas where FMD was seeded for the Australian (AADIS) simulations (median of 28 head/km<sup>2</sup>).

Compared with the FMD outbreaks simulated by AADIS, ISP simulated relatively high numbers of IPs during the early phase of each epidemic. The median number of IPs on days 7, 14, and 21 for ISP was 6, 9, and 11 (respectively) compared with 3, 5, and 5 for AADIS (**Tables 2** and **3**). Similarly, the number of traces generated by ISP in the early phase of each epidemic was higher than those generated by AADIS. The median number of traces generated by days 7, 14, and 21 for ISP was 8, 11, and 12 (respectively) compared with 2, 3, and 4 for AADIS. There are three possible explanations for these findings: (1) differences in characteristics of the countries and/or study regions and incursion scenarios used for each model; (2) differences in model parameterization, resulting in different probabilities of farm-to-farm transmission of virus; and (3) differences in model design (in ISP the probabilities of transmission vary according to farm type but not farm size whereas in AADIS both farm size and farm type influence probabilities of transmission).

*AUC, area under control (km<sup>2</sup>); EDR, estimated dissemination ratio.*

*a Cattle density: number of cattle/km<sup>2</sup>.*

*b Sheep density: number of sheep/km<sup>2</sup>.*

*c Pig density: number of pigs/km<sup>2</sup>.*

*d Human density: number of humans/km<sup>2</sup>.*

Outbreak durations for the two models were similar: a median of 43 (minimum 16, maximum 365) days for AADIS compared with a median of 43 (minimum 21, maximum 263) days for ISP. The size of the AUC was substantially lower for the AADIS simulations. The median AUC for the AADIS simulations was 680 km<sup>2</sup> (minimum 300, maximum 29,953) compared with 1176 km<sup>2</sup> (minimum 316, maximum 12,815) for ISP.

#### Linear Regression

Regression coefficients and their standard errors for the linear regression models of the total number of infected places, outbreak duration, and the total AUC for the AADIS and ISP models of FMD are provided in **Table 5**. **Table 6** provides details of the goodness of fit (*R*<sup>2</sup> ) for each of the linear regression models developed for Australia and New Zealand. A consistent pattern was observed with the goodness of fit of the models improving from days 7 to 14 to 21 for all outcome variables for both the Australian and New Zealand data sets.

Positive and negative predictive values for "large" or "small" outbreaks (for the total number of IPs and total AUC) or "short" or "long" outbreaks (for outbreak duration) for AADIS and ISP are shown in **Table 7**. The proportions of correctly classified outbreaks ranged from 0.88 to 0.97 for AADIS and 0.79 to 0.92 for ISP.

*EDR, estimated dissemination ratio.*

TABLE 6 | Goodness-of-fit statistics (*R*<sup>2</sup>) for each of the linear regression models for the total number of infected places, outbreak duration, and area under control using days 7, 14, and 21 explanatory variables for the AADIS and InterSpread Plus models of FMD.

#### Regression Trees

Classification and regression tree analyses were carried out to identify factors associated with the total number of IPs, outbreak duration, and total AUC. Similar to the approach used for the linear regression analyses, three sets of explanatory variables were used: those at day 7 post-detection, day 14 post-detection, and day 21 post-detection. Using these three sets of explanatory variables with each of the three outcome variables and both the Australian and New Zealand data sets resulted in 18 CART analyses in total. BRT models using the same explanatory variables and the same outcome variables were developed using the Australian and New Zealand data.

The CARTs for the predicted total number of IPs using day 14 explanatory variables for the Australian and New Zealand data are shown in **Figures 2** and **3**, respectively. For both the AADIS and ISP models, the number of IPs at day 14 had the greatest influence on the total number of IPs. For the AADIS model, in addition to the number of IPs identified at day 14, the total AUC at day 14 and cattle density influenced the total number of IPs.

TABLE 7 | Positive and negative predictive values and the proportion of outbreaks correctly classified as large or small (or short or long) using the day 14 linear regression model for the AADIS and InterSpread Plus simulated FMD outbreaks.

*a Used to classify outbreak as small or large.*

The five most influential explanatory variables (and their weights) from the BRT models for the total number of IPs, outbreak duration, and the total AUC for the Australian and New Zealand data are listed in **Table 8**. Consistent with the CART analyses, for both AADIS and ISP, the number of IPs identified at day 14 was associated with each of the three outcome variables. While for ISP the number of IPs identified at day 14 had the highest weight for each of the three outcomes, the total number of outbreak clusters identified at day 14 had the greatest weight as a predictor of the total AUC for AADIS. For AADIS, the density of cattle was associated with each of the three outcome variables, albeit with a relatively low regression weight in each model (10.2, 15.4, and 0.1 for the total number of IPs, outbreak duration, and the total AUC, respectively).

The predictive ability of each of the day 14 boosted regression models was assessed by calculating the positive and negative predictive values for each model (**Table 9**), similar to the approach taken for the linear regression models. Overall, the BRT models were able to correctly classify simulated outbreaks as either large or small (for the total number of IPs and total AUC) or short or long (for outbreak duration) with the proportion of correctly classified outbreaks ranging from 0.82 to 0.96 for AADIS and 0.77 to 0.93 for ISP. In general, negative predictive values for the BRT models were greater than the positive predictive values.

TABLE 8 | Identified explanatory variables (*n* = 5) and their weights (in brackets) for the boosted regression tree model of first 14 day predictors of area under control, the total number of infected places, and outbreak duration for the AADIS and InterSpread Plus models of FMD.

#### DISCUSSION

During a disease outbreak, decisions on control are often made under significant uncertainty and in conditions that are continually evolving. Resources are often limited and will influence the effectiveness of disease control efforts. Experience overseas suggests that resource and logistical issues are critical considerations when evaluating disease control strategies (39–41). Vaccination is increasingly being recognized as an important tool to assist in containing and eradicating FMD outbreaks (6–8, 42, 43).

TABLE 9 | Positive and negative predictive values and the proportion of outbreaks correctly classified as large or small (or short or long) using the day 14 boosted regression tree model for the AADIS and InterSpread Plus simulated FMD outbreaks.


*a Used to classify outbreak as small or large.*

Vaccination has been shown to be most effective in situations where disease is spreading rapidly or resources are inadequate to maintain effective stamping out (4). A number of studies have shown that vaccination is more beneficial when used early in an outbreak (8, 44, 45).

Although vaccination can be an important tool to control FMD, it will make achieving recognition of FMD-free status more difficult: keeping vaccinated animals in the population will lengthen the period until FMD-free status is regained under the World Organization for Animal Health (OIE) guidelines and will complicate the post-outbreak surveillance program (46). Shifting attitudes to vaccination among the international veterinary community mean that it is no longer viewed as a measure of last resort. In Australia and New Zealand, vaccination will be given consideration as a potential additional measure (alongside stamping out) from day one of any FMD eradication response. However, given the complications and costs associated with implementing a vaccination strategy, it would only be used if authorities consider that it would be beneficial in managing the outbreak (19, 26, 47). A decision to vaccinate early in the outbreak may result in situations where it was not actually required and have consequent implications for post-outbreak surveillance, management of vaccinated animals, and regaining FMD-free status and access to markets. Conversely, not using vaccination in some situations may lead to larger and longer outbreaks, increased control costs, and greater on-going impacts on industry and local communities.

Although Australia and New Zealand have developed frameworks to support decision-making on FMD control (19, 48), these are qualitative and subjective. We reason that it would be useful if disease managers could identify early in an outbreak those situations that are likely to progress to "large" outbreaks and for which additional measures like vaccination are likely to be beneficial. In this context, measurable parameters such as the number of IPs, numbers of traced premises and/or farms under surveillance, and estimated rates of spread might be useful indicators of the potential severity of an outbreak.

The overarching aim of this project was to identify factors that could be used to predict the total number of IPs, outbreak duration, and the total AUC. Here, "factors" refers to characteristics of the physical environment in which an FMD incursion first occurs (e.g., farm density, animal density, human population density) or characteristics of the outbreak itself (e.g., the number of IPs reported at a given point in time post first detection). We were particularly interested in how robust the findings were to outbreaks in different settings. For this study, we used a wide range of FMD incursions in terms of location, production systems and seed farm type, and time to first detection (determined probabilistically). These outbreaks were simulated in two countries using two independent modeling platforms.

It is reassuring for animal health authorities that, in both countries, the simulated FMD outbreaks tended to be small and readily able to be contained and eradicated with available resources. For both countries, median outbreak durations were around 6 weeks. This finding assumes that FMD is reported relatively quickly and resources are adequate to implement effective control programs. For Australia, the median time from first introduction to reporting was 17 days (range 9–89), and for New Zealand, the median time to detection was 13 days. A previous Australian study found considerable regional variability in the probability that an individual infected farm would report suspect FMD (24, 49). Recent experience of outbreaks of FMD in non-endemic countries indicates that it can take up to 3 weeks after introduction of the virus to the primary farm before the disease is recognized (40, 50–52). However, early detection does not necessarily mean that an outbreak will be small. A total of 3.4% of the 10,000 outbreaks of FMD in Australia that were simulated in this study had more than 100 IPs and 7.2% of the 10,000 outbreaks lasted longer than 90 days. For New Zealand, there was a 7.2% probability of an outbreak involving more than 100 IPs and an 8.6% probability of an outbreak lasting more than 90 days.

The key objective of this study was to test whether information known or available to disease managers early in an FMD outbreak could be used to predict the severity of the epidemic outcome. Epidemic outcome was defined in terms of the total number of IPs, outbreak duration, and the total geographic AUC. While FFO and FFS have been shown to correlate with epidemic size (16), it was recognized that it would be more useful to consider a broader range of times than just 14 days. Accordingly, three time points were considered: 7, 14, and 21 days into the control program. A range of potential explanatory variables were tested using different analytical approaches, including linear regression, CART, and BRT analyses.

Although there was some variability between the different analyses and between countries, the cumulative number of IPs at specified time points early in the outbreak was consistently found to be strongly associated with the final number of IPs and the duration of an outbreak. It was possible to build relatively simple linear regression models for predicting the magnitude and duration of simulated FMD outbreaks that fitted both the Australian and New Zealand data (see **Table 5**). *R*<sup>2</sup> values as a measure of goodness of fit ranged from 0.3 to 0.9 depending on time point, outbreak variable, and country (**Table 6**). A consistent pattern was observed, with the fit of the models improving from days 7 to 14 to 21 for all dependent variables and for both data sets (Australia and New Zealand). The total AUC had the highest predictability and duration of an outbreak the lowest. In this study, we found that the number of IPs occurring up to a given time point provided the most predictive power for both size (total IPs) and outbreak duration. This confirms previous findings by Hutber et al. (15) and Halasa et al. (16). The AUC at a given time point was most predictive of the total AUC.

These findings were confirmed in the CART and BRT analyses. Consistency between the different approaches helps build confidence that the criteria identified are relevant to response decision-making. CART techniques are a useful alternative as they provide a visual decision tree output that is intuitive and likely to be well received by those not familiar with statistical analysis (see **Figures 2** and **3**). The tree diagrams produced in a CART analysis are consistent with clinical reasoning used by animal health professionals and can help to structure explanations of prediction. Compared with regression-based approaches, an advantage of a CART analysis is that it can accommodate non-linear relationships between an outcome variable and a set of explanatory variables as well as missing data.

Boosted regression trees have the advantages of being able to handle a range of explanatory variable types, not requiring any data transformations, and being able to account for complex, non-linear relationships (35). BRTs are better able to describe linear relationships and are more robust in terms of predictive accuracy, although interpretability suffers as a result. CARTs and BRTs are complementary. CARTs are relatively simple and provide readily interpretable output; BRTs are more complex and robust, but with reduced interpretability. The BRTs for both countries had good predictive ability when the total number of IPs was less than 100. When the total number of IPs was greater than 100, the BRT analyses tended to underpredict total IP numbers.

Although it is informative to build statistical models to summarize factors influencing outputs from complex simulation models of FMD, for disease managers, the key issue is how this information can be used to support decision-making. From a disease manager's perspective, it is useful to consider how good the models are at predicting small and large outbreaks. To do this, it is necessary to make some judgment calls about what constitutes a "large outbreak." It is difficult in advance to reach agreement on what are acceptable benchmarks in terms of eradicating FMD, as this will be influenced by the time and location of an outbreak, availability of resources, etc. Accordingly, we looked at a series of arbitrary cutpoints for classifying outcomes into small and large (or long and short) outbreaks. Model sensitivity, specificity, and positive and negative predictive values were calculated using these cutpoints. In general, the linear regression models were very good at predicting when an outbreak would be small or short; the positive predictive values varied from 0.85 to 0.98, meaning that a small outbreak was correctly predicted between 85 and 98% of the time. It should also be noted that having predicted a small outbreak at day 14 (which would probably mean that a decision to vaccinate would not be made), this decision could be revisited at a later time in the outbreak when more information was available. Incorrectly predicting a large outbreak and using vaccine when it is not actually required will have trade implications and increase outbreak costs. The models were less accurate at predicting a large or long outbreak, with the negative predictive values for outbreak duration exceeding 90 days being as low as 0.52 for the models of FMD in New Zealand. The negative predictive values for the total number of IPs and the total AUC were better, ranging from 0.77 to 0.91 for both AADIS and ISP.

The BRT models were able to correctly classify simulated outbreaks as either large or small with the proportion of correctly classified outbreaks ranging from 0.77 to 0.96. Negative predictive values tended to be higher than the positive predictive values for the BRT models.

In conclusion, this study shows that, based on simulated FMD outbreak data, relatively simple metrics available at 1–3 weeks into the control program can be used to predict the size of an FMD outbreak under Australian and New Zealand conditions and provide a basis for making decisions on the use of vaccination as a control measure. It should be noted that the simulation modeling analyses carried out for this study focused on introduction of FMD into the areas considered to be at higher risk of disease entry and dissemination in Australia and New Zealand (23). The results need further validation with modeling data generated from other areas of these countries. Finally, it should be recognized that in the absence of FMD outbreaks in Australia and New Zealand, this study has fitted statistical models to simulated, not real outbreak data. Although the modeling teams have been careful to parameterize the respective models as realistically as possible, it is inevitable that assumptions and extrapolations from overseas experience have had to be made. These considerations need to be taken into account when using the findings from this study.

#### AUTHOR CONTRIBUTIONS

All the authors made substantial contributions to the work, through concept and design (MG, TR, RS, SR, TK, and PH); data generation and modeling (MG, TR, RS, and RB); statistical analyses (IE, MS, and RS); or interpretation (MG, IE, MS, TR, and TK). MG, IE, and MS had prime responsibility for drafting and revision. MG, TR, and TK have approved the final version for publication and are in agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

#### FUNDING

This work was funded by the Centre of Excellence for Biosecurity Risk Analysis (CEBRA) Project 1404D and supported by the Australian Department of Agriculture and Water Resources and the New Zealand Ministry for Primary Industries.

#### REFERENCES


disease in animal populations. *Prev Vet Med* (2013) 109:10–24. doi:10.1016/j.prevetmed.2012.08.015


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer PW and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

*Copyright © 2016 Garner, East, Stevenson, Sanson, Rawdon, Bradhurst, Roche, Van Ha and Kompas. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Semiquantitative Decision Tools for FMD Emergency Vaccination Informed by Field Observations and Simulated Outbreak Data

*Preben William Willeberg<sup>1</sup>\*, Mohammad AlKhamis<sup>2,3</sup>, Anette Boklund<sup>1</sup>, Andres M. Perez<sup>3</sup>, Claes Enøe<sup>1</sup> and Tariq Halasa<sup>1</sup>*

*<sup>1</sup>Department of Diagnostic and Scientific Advice, National Veterinary Institute, Technical University of Denmark, Copenhagen, Denmark, <sup>2</sup>Environment and Life Sciences Research Center, Kuwait Institute for Scientific Research, Kuwait City, Kuwait, <sup>3</sup>Department of Veterinary Population Medicine, College of Veterinary Medicine, University of Minnesota, St. Paul, USA*

#### *Edited by:*

*Alejandra Victoria Capozzo, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina*

#### *Reviewed by:*

*Fedor Korennoy, Federal Center for Animal Health (FGBI ARRIAH), Russia; Francois Frederick Maree, Agricultural Research Council, South Africa*

*\*Correspondence:*

*Preben William Willeberg prwil@vet.dtu.dk*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

*Received: 30 November 2016 Accepted: 09 March 2017 Published: 27 March 2017*

#### *Citation:*

*Willeberg PW, AlKhamis M, Boklund A, Perez AM, Enøe C and Halasa T (2017) Semiquantitative Decision Tools for FMD Emergency Vaccination Informed by Field Observations and Simulated Outbreak Data. Front. Vet. Sci. 4:43. doi: 10.3389/fvets.2017.00043*

We present two simple, semiquantitative model-based decision tools, based on the principle of first 14 days incidence (FFI). The aim is to estimate the likelihood and the consequences, respectively, of the ultimate size of an ongoing FMD epidemic. The tools allow risk assessors to communicate in a timely, objective, and efficient manner with risk managers and less technically inclined stakeholders about the potential of introducing FMD suppressive emergency vaccination. To explore the FFI principle with complementary field data, we analyzed the FMD outbreaks in Argentina in 2001, with the 17 affected provinces as the units of observation. Two different vaccination strategies were applied during this extended epidemic. In a series of 5,000 Danish simulated FMD epidemics, the numbers of outbreak herds at day 14 and at the end of the epidemics were estimated under different control strategies. To simplify and optimize the presentation of the resulting data for urgent decisions to be made by the risk managers, we estimated the sensitivity, specificity, as well as the negative and positive predictive values, using a chosen day-14 outbreak number as predictor of the magnitude of the number of remaining post-day-14 outbreaks under a continued basic control strategy. Furthermore, during an ongoing outbreak, the actual cumulative number of detected infected herds at day 14 will be known exactly. Among the epidemics lasting >14 days out of the 5,000 simulations under the basic control scenario, we selected those with an assumed accumulated number of detected outbreaks at day 14. The distribution of the estimated number of detected outbreaks at the end of the simulated epidemics minus the number at day 14 was estimated for the epidemics lasting more than 14 days. For comparison, the same was done for identical epidemics (i.e., seeded with the same primary outbreak herds) under a suppressive vaccination scenario.
The results indicate that, during the course of an FMD epidemic, simulated likelihood predictions of the remaining epidemic size and of potential benefits of alternative control strategies can be presented to risk managers and other stakeholders in objective and easily communicable ways.

Keywords: epidemics, modeling, disease control, risk communication, Foot-and-Mouth Disease

#### INTRODUCTION

A series of 10 criteria supporting a decision of whether or not to make use of protective emergency FMD-vaccination is listed in Annex X of Council Directive 2003/85/EC (1). These criteria include: "a rapidly rising incidence slope of outbreaks." A simple, quantitative tool was proposed and documented by Hutber et al. (2), using the "first 14 days incidence" (FFI) of outbreaks in forecasting the duration and the cumulative number of outbreaks at the end using data from 12 regional foci of the UK 2001 FMD epidemic. Thus, according to the abovementioned directive, the FFI might be considered a useful parameter in deciding about the launching of emergency vaccinations in an attempt to lower the total number of outbreaks, as well as to shorten the duration and to lower the losses and costs of an ongoing epidemic.

Modeling the effects of available risk management options during an FMD outbreak in Denmark was undertaken in a recent research project (3). Simulations of the basic control strategy were compared with different versions of depopulation and vaccination strategies in terms of their influence on epidemic duration, size, losses, and costs. Simulation results evaluating the FFI principle were presented by Halasa et al. (4), who introduced the alternative term "first 14 days outbreaks" (FFO).

Decision tools should not only provide scientifically valid results but also have to be transparent and communicable to non-scientists, such as politicians, the media, and the general public to appear trustworthy. Therefore, the technically complex simulation-based results (4) were reformulated as presented here to better allow for communication of the results to the non-scientifically inclined stakeholders. Preliminary results of this work have been presented elsewhere (5–7).

The objectives of this study are as follows:


#### MATERIALS AND METHODS

#### Argentina

Data from the 2001 FMD outbreaks in the 17 affected provinces were obtained from SENASA, as described previously (8). The number of outbreaks with complete data required for the analyses (2,244 outbreaks or approx. 95% of all recorded outbreaks) is shown in **Table 1**, where the outbreaks are grouped by time of detection relative to day 14 of the epidemic, as proposed (2). **Figure 1** shows a plot, for the 14 provincial observations, of the accumulated number of detected outbreaks after day 14 against the accumulated number of outbreaks at day 14. A regression analysis was used to predict the number of outbreaks at the end of the epidemic ($C_i$) using the accumulated number of outbreaks at day 14 ($\theta_i$) as a direct predictor (see **Table 1**, model I). WinBUGS version 1.4.3 (10) was used to quantify this relationship through a Bayesian mixed log-linear model, where $C_i$ was assumed to follow a Poisson($\lambda_i$) process, in which $\lambda_i$ parameterizes the distribution of the total number of cases in each affected province in 2001. Therefore, the model is formally expressed as:

$$C_i \sim \mathrm{Poisson}(\lambda_i), \qquad \log(\lambda_i) = \beta_0 + \beta_1 \theta_i + U_i$$

where $\beta_0$ denotes the model intercept, $\beta_1$ denotes the regression coefficient for $\theta_i$, and $U_i$ denotes a non-structured random effect. $U_i$ is included in the formula to account for lack of independence in the observations due to variables other than $\theta_i$. Non-informative prior distributions of the form $N(0, 0.001)$ and $N(0, \delta)$ with $\delta \sim \mathrm{gamma}(0.05, 0.005)$ were used to model prior knowledge on the value of the regression coefficients. The model was run for 20,000 iterations after discarding the first 1,000 iterations as burn-in.
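The structure of the log-linear model can be illustrated by evaluating its expected count and Poisson log-likelihood for fixed coefficients. This is only a sketch of the likelihood, not the WinBUGS model: it involves no priors and no MCMC sampling, and the function names and coefficient values are hypothetical.

```python
import math


def expected_count(theta, beta0, beta1, u=0.0):
    """Expected number of outbreaks: lambda_i = exp(beta0 + beta1 * theta_i + U_i)."""
    return math.exp(beta0 + beta1 * theta + u)


def poisson_log_likelihood(counts, thetas, beta0, beta1, random_effects=None):
    """Sum of Poisson log-probabilities of the observed counts given the coefficients."""
    if random_effects is None:
        random_effects = [0.0] * len(counts)  # assumption: no province-level effect
    total = 0.0
    for c, t, u in zip(counts, thetas, random_effects):
        lam = expected_count(t, beta0, beta1, u)
        total += c * math.log(lam) - lam - math.lgamma(c + 1)
    return total


# Hypothetical coefficients: a province with 10 outbreaks detected by day 14.
lam_example = expected_count(theta=10, beta0=2.0, beta1=0.05)
ll_example = poisson_log_likelihood([1], [0], beta0=0.0, beta1=0.0)
```

Because the linear predictor is on the log scale, each additional outbreak at day 14 multiplies the expected count by $\exp(\beta_1)$ rather than adding a fixed number of outbreaks.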

Furthermore, provincial herd densities (ω*i*) were included in a second model with θ*i* (**Table 1**, model II), since herd density has been shown to be an important determinant for within-province clustering of FMD herds (8). Confounding by ω*i* was evaluated using the method of change in the estimates of β*i*, and the best-fitting model was identified by the smallest value of the deviance information criterion (DIC). Finally, a Kruskal–Wallis test was used to assess the significance of the differences between the observed and predicted number of outbreaks at the end of the epidemic.

TABLE 1 | Detected number of outbreaks at day 14 (**θ***i*), provincial herd densities (**ω***i*), and detected and predicted number of outbreaks after day 14 for each affected province.

*a Not included in subsequent analyses due to 0 outbreaks after day 14.*

#### Denmark

Danish data were obtained from a series of FMD simulations based on models with actual Danish population data on swine, cattle, and sheep herds at the national level, using FFO to designate the cumulative number of outbreaks detected in the first 14 days as a predictor of the size, duration, and costs of the epidemics. The simulation model, the farm data (simulated population), and the simulation study are explained in the next sections.

#### The Simulation Model

The DTU-DADS model (version 0.15) (11) was further updated (to version 0.16) and used to obtain the simulation data for the analysis in this study. The updates comprised a revision of the modeling of local spread and the correction of a coding error. In the earlier version of the model, the probability of infection through local spread varied depending on distance from the infectious herd within a zone of at most 3 km (3, 11). In the current version, the disease could spread locally depending on distance from the infectious herd in the same way as modeled earlier, but also depending on the time since the start of infectiousness of the infectious herd, using similar probabilities (12).

#### *The Farm Data*

The data consisted of all cattle, swine, sheep, and goat herds registered in the Central Herd Register (CHR) of Denmark in the period from 1 October 2006 until 30 September 2007. During this period, there were 23,550, 11,473, and 15,830 cattle, swine, and sheep/goat herds, respectively. The data included the identification number, the UTM coordinates, the number of animals, and the rate of animal movements per day for each herd. Cattle herds were divided into milking and non-milking herds, sheep and goats were grouped into one category, and swine herds were split into 19 types (13). When a farm included several animal species, each species was given a different ID and treated as a separate herd at the same location and with the same CHR number. Further details about the study population and model input parameters can be found elsewhere (3, 11).

#### The Simulation Study

#### *Modeling Virus Spread*

Spread of infection between herds was simulated through seven spread mechanisms: (1) direct animal movement between herds; (2) abattoir trucks; (3) milk tankers; (4) veterinarians, artificial inseminators, and/or milk controllers (medium risk contact); (5) visitors, feedstuff, and/or rendering trucks (low risk contact); (6) markets; and (7) local spread (11, 14).

Virus spread *via* animal movements and abattoir contacts was simulated based on the rate of movements/contacts per day calculated from actual movement data. For abattoir contacts, an additional parameter was included, based on the type of the infectious herd, representing the number of herds contacted by the abattoir truck on its way to the abattoir after its contact with the infectious herd. Virus spread *via* medium and low risk contacts was simulated using the daily frequency of contact between herds *via* these routes. Virus spread *via* milk tankers was possible only from one milking herd to another, using the daily frequency of milk pickup from the dairy herds. Virus spread *via* markets was initially possible between cattle herds only, as markets in Denmark are restricted to cattle. From markets, the virus could spread to susceptible herds due to movement of animals, people, and/or vehicles (11).

#### *Modeling Disease Detection*

An infected herd could be detected through one of three mechanisms: first detection, basic detection, and detection following surveillance or tracing. First detection reflected the first detection of the disease in the country (the index case/outbreak). This occurred a specific number of days after the introduction of infection. A PERT distribution was used to determine the day of first detection following virus introduction. The minimum, mode, and maximum values were 18, 21, and 23 days, respectively. Basic detection reflected the farmers' awareness of a problem within their herds, leading them to call the veterinarian, while detection through surveillance or tracing occurred following a visit by the veterinary authorities. The probabilities of detection using the last two detection mechanisms were dependent on the type of the herd (11).
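As an illustration of how the day of first detection could be drawn, the sketch below samples from a PERT distribution with the stated minimum, mode, and maximum by rescaling a beta distribution. It assumes the standard PERT shape parameter of 4, which is not stated in the text; the function name `sample_pert` is hypothetical and this is not the DTU-DADS code.

```python
import random


def sample_pert(minimum, mode, maximum, rng):
    """Draw one value from a PERT(minimum, mode, maximum) distribution.

    Assumption: the standard PERT shape parameter of 4 is used.
    """
    span = maximum - minimum
    alpha = 1 + 4 * (mode - minimum) / span
    beta = 1 + 4 * (maximum - mode) / span
    # A Beta(alpha, beta) draw on [0, 1] is rescaled to [minimum, maximum].
    return minimum + span * rng.betavariate(alpha, beta)


rng = random.Random(1)  # fixed seed so that the sketch is reproducible
days = [sample_pert(18, 21, 23, rng) for _ in range(10_000)]
print(min(days), sum(days) / len(days), max(days))
```

Under this assumption the mean of the distribution is (18 + 4 × 21 + 23)/6, i.e., just under 21 days, and all draws fall between 18 and 23 days.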

#### *Modeling Disease Control*

Once the first infected herd is detected, a set of control actions is enforced as explained earlier (11). These actions include (1) depopulation, cleaning, and disinfection of all detected herds, (2) the implementation of a 3-km protection zone and a 15-km surveillance zone around each detected herd, (3) surveillance of all susceptible herds within the zones and restriction of animal movements and contacts between herds within the zones, (4) forward and backward tracing of animal movements and contacts to and from detected herds, and (5) the implementation of a 3-day national standstill on animal movements. In addition, herds that received animals from detected herds were depopulated (11).
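The zoning rule in action (2) can be sketched as a simple distance check. The example below assumes that herd locations are planar UTM coordinates in meters within a single UTM zone and that zone boundaries are inclusive; both are assumptions made for illustration, and the names `zone_for_herd`, `PROTECTION_RADIUS_M`, and `SURVEILLANCE_RADIUS_M` are hypothetical.

```python
import math

PROTECTION_RADIUS_M = 3_000     # 3-km protection zone
SURVEILLANCE_RADIUS_M = 15_000  # 15-km surveillance zone


def zone_for_herd(herd_xy, detected_xy_list):
    """Classify a herd by its distance to the nearest detected herd.

    Returns 'protection', 'surveillance', or 'free'.
    """
    if not detected_xy_list:
        return "free"
    nearest = min(math.dist(herd_xy, d) for d in detected_xy_list)
    if nearest <= PROTECTION_RADIUS_M:
        return "protection"
    if nearest <= SURVEILLANCE_RADIUS_M:
        return "surveillance"
    return "free"


# Hypothetical coordinates (meters), not actual herd locations.
detected = [(500_000.0, 6_200_000.0)]
print(zone_for_herd((501_000.0, 6_201_000.0), detected))  # about 1.4 km away
print(zone_for_herd((510_000.0, 6_200_000.0), detected))  # 10 km away
```

A herd lying within the radius of several detected herds is assigned the strictest applicable zone, because only the distance to the nearest detected herd is used.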

Extra control strategies, including preemptive depopulation or suppressive vaccination, were adopted in separate scenarios. When implemented, herds with susceptible animals within a 1-km radius around newly detected herds were subjected to the extra control. The extra control strategy was initiated 14 days after first detection.

#### *Initiation of Simulation and Model Run*

The simulation started with the model loading the input data and, thereafter, selecting the primary outbreak herd, which is the first infected herd in the epidemic. About 5,000 cattle herds were selected randomly as potential primary outbreak herds to initiate disease spread. Earlier results have shown that epidemics initiated in cattle herds would provide larger spread than epidemics initiated in other species (14), reflecting a worst-case scenario. Each primary outbreak herd initiated an epidemic once, resulting in outbreak data from 5,000 epidemics.

The outcomes of the model included epidemic duration (number of days between first detection and the culling of the last detected herd), number of infected herds, number of depopulated or vaccinated herds, and the total costs of the epidemics. The total costs were calculated as direct cost and export losses (11).

#### The Decision Tools

To simplify the presentation of pros and cons of vaccination to all stakeholders and to enable the urgent decisions to be made by the risk managers during the course of an ongoing FMD epidemic, a two-step methodology based on the FFO principle was applied in presenting the Danish simulation outcomes.

#### *Quantifying Uncertainty (Predictive Values) of Estimating the Likelihood of a "Catastrophic" Epidemic*

During an ongoing national FMD epidemic, the actual cumulative number of outbreaks at day 14 will have a given observed value, e.g., 15 herds (i.e., FFO = 15). Data from the simulated epidemics lasting more than 14 days were distributed among the cells of a two-by-two table based on selected cutoff values for both the independent variable (i.e., FFO = 15) and the dependent variable (a chosen "catastrophic" number of post-day-14 outbreaks, e.g., 50 or 100). This enables estimation of sensitivity, specificity, and negative and positive predictive values describing the association between the observed FFO value and a "catastrophic" epidemic in terms of the cumulative number of outbreaks occurring after day 14.
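The two-by-two classification described above can be sketched as follows. Each simulated epidemic is represented by a pair (outbreaks by day 14, outbreaks after day 14); an epidemic is "test positive" when its day-14 count reaches the FFO cutoff and "catastrophic" when its post-day-14 count reaches the chosen cutoff. The function name and the example data are hypothetical and serve only to show the calculation.

```python
def predictive_values(epidemics, ffo_cutoff, catastrophic_cutoff):
    """Compute sensitivity, specificity, PPV, and NPV from simulated epidemics.

    epidemics: iterable of (outbreaks_by_day_14, outbreaks_after_day_14).
    """
    tp = fp = fn = tn = 0
    for ffo, later in epidemics:
        test_positive = ffo >= ffo_cutoff
        catastrophic = later >= catastrophic_cutoff
        if test_positive and catastrophic:
            tp += 1
        elif test_positive:
            fp += 1
        elif catastrophic:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # An empty row or column of the table leaves the measure undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical simulated epidemics (NOT the Danish simulation output).
example = [(20, 60), (20, 10), (20, 5), (5, 70), (5, 3), (5, 2), (5, 1), (5, 0)]
print(predictive_values(example, ffo_cutoff=15, catastrophic_cutoff=50))
```

The complement of the NPV is the probability that an epidemic below the FFO cutoff nevertheless becomes "catastrophic", which is the quantity a risk manager would weigh when deciding to continue the basic strategy.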

The estimation procedure described above was performed using simulated data for the basic control strategy throughout 5,000 simulated epidemics in Denmark.

If applied as part of an exercise to update FMD contingency plans, a series of alternative FFO and "catastrophic" simulated outbreak numbers post-day-14 might be explored for the comparative control strategies.

#### *Quantifying the Consequences (Expected Benefits) in Terms of the Number of Prevented "Catastrophic" Outbreaks when Changing to a Vaccination Strategy at Day 14 during the Epidemic*

The frequency distribution of the total cumulative number of outbreaks for the series of simulated epidemics with the observed fixed FFO value was compared between the basic control strategy and the vaccination strategy, with both strategies applied to the same set of simulated epidemics in terms of the seeded primary outbreak herds. The benefit of changing from basic control to emergency vaccination can be estimated by comparing the number and proportion of "catastrophic" epidemics expected in the basic simulations with those in the vaccination simulations.
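A minimal sketch of this paired comparison is given below. It assumes that the two lists hold post-day-14 outbreak counts for the same primary outbreak herds in the same order, and it counts an epidemic as "spared" when it is "catastrophic" under basic control but not under vaccination; this reading, the function name, and the example numbers are assumptions for illustration.

```python
def catastrophic_epidemics_spared(basic, vaccination, cutoff):
    """Count 'catastrophic' epidemics under basic control and those spared.

    basic, vaccination: post-day-14 outbreak counts, paired by primary
    outbreak herd. Returns (n_catastrophic, n_spared, proportion_spared).
    """
    if len(basic) != len(vaccination):
        raise ValueError("the two scenarios must contain the same epidemics")
    catastrophic = [(b, v) for b, v in zip(basic, vaccination) if b >= cutoff]
    spared = sum(1 for _, v in catastrophic if v < cutoff)
    proportion = spared / len(catastrophic) if catastrophic else None
    return len(catastrophic), spared, proportion


# Hypothetical paired outcomes (NOT the Danish simulation output).
basic = [120, 40, 150, 10]
vaccination = [60, 30, 110, 12]
print(catastrophic_epidemics_spared(basic, vaccination, cutoff=100))
```

Pairing by primary outbreak herd ensures that the difference between the scenarios reflects the control strategy rather than differences in where the epidemics were seeded.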

#### RESULTS

#### The Argentinian Epidemic

**Table 2** summarizes the posterior estimates for the two regression models. While θ*i* is a significant predictor for *Ci* by itself, ω*i* is an important confounder based on the method of the change in the posterior estimate (i.e., the constant's coefficient changed by 27%); it is also a significant predictor (97.5% CI), and its inclusion substantially improved the fit of the model (smallest DIC value).

The observed and predicted numbers of outbreaks for models I and II are summarized in **Table 1**. No significant differences were identified between the number of observed and the number of predicted outbreaks at the end of the epidemic for models I and II (Kruskal–Wallis *p*-value >0.54).

#### Descriptive Results of the Danish Simulations

**Table 3** shows the overall descriptive results from the simulation study, using the basic control, preemptive depopulation, and suppressive vaccination scenarios. For instance, under the basic scenario, the median epidemic duration is predicted to be 36 days (5th and 95th percentiles 2–128 days), with a median of 22 infected herds (5th and 95th percentiles 2–145 herds) and a median total loss of €869 million (5th and 95th percentiles €703–€1,434 million).

Among the group of 5,000 simulations specifically considered here, 4,092 epidemics lasted >14 days. **Figure 2** shows a plot of these simulated epidemics with their accumulated number of outbreaks occurring after day 14 until the end of the epidemics, against the accumulated number of outbreaks at day 14.

*a 97.5% credible interval.*

*a Change from basic control after day 14.*

#### Quantifying Predictive Values Based on the FFO Principle

For a set of chosen cutoff values [i.e., FFO < 15 vs. FFO ≥ 15 outbreaks and <50 vs. ≥50 outbreaks recorded after day 14 (**Table 4**)], the negative predictive value (NPV) is 93% and the positive predictive value (PPV) is 33%. This would mean a 7% probability of an epidemic with <15 outbreaks at day 14 becoming a "catastrophic epidemic" of ≥50 subsequent outbreaks, if the basic strategy was continued, while there would be a 33% probability that an epidemic with ≥15 outbreaks at day 14 would turn out to be a "catastrophic epidemic" of ≥50 subsequent outbreaks with no change in strategy.

Changing the cutoff value for "catastrophic" epidemics to ≥100 outbreaks changes the NPV to 97% (**Table 4**). This means that if <15 infected herds have been detected up until day 14, there would be an estimated probability of just 3% that the ongoing epidemic under a continued basic control strategy would result in a cumulative number of outbreaks of ≥100. The PPV, however, is estimated at only 11%, which is explainable by the relatively low probability of "catastrophic" epidemics of ≥100 outbreaks among the simulated outcomes (*p* = 7%).

#### Quantifying the Expected Benefit of Changing to a Vaccination Strategy during an Epidemic

TABLE 4 | Danish-simulated FMD epidemics lasting more than 14 days: specificity, sensitivity, negative predictive value (NPV), and positive predictive value (PPV) for two alternative combinations (A and B) of the presumed observed cumulative outbreak size on day 14, and the subsequent cumulative outbreak size until the end of the epidemic.

A: more or less than 50 outbreaks expected subsequently

B: more or less than 100 outbreaks expected subsequently

Among the simulated epidemics lasting 14 days or more, all the simulations with a cumulative number of outbreaks equal to 15 were chosen (i.e., assuming that in an ongoing field epidemic, FFO = 15), resulting in 182 epidemics, which were further analyzed. The distribution of the number of outbreaks at the end of the epidemics minus the FFO value of 15 was determined (**Figure 3**). The distribution of the number of outbreaks under the basic control scenario is compared to that under a suppressive vaccination scenario for the same 182 epidemics, i.e., using the same primary outbreak herds as in the basic control scenario (**Figure 4**). Of the eight "catastrophic epidemics" with ≥100 outbreaks after day 14 in the basic scenario, 5 (63%) were predicted to be spared by applying emergency vaccination. Using 50 outbreaks as the cutoff, 13 of the 29 "catastrophic epidemics" (45%) were predicted to be spared by vaccination (see **Figures 3** and **4**).

#### DISCUSSION

For the Argentina 2001 epidemic, the median herd disease reproduction ratio decreased significantly from 2.4 (before the epidemic was officially recognized) to 1.2 during the mass-vaccination campaign and <1 following the mass-vaccination campaign (9). This is consistent with our finding of agreement with the FFO principle for this epidemic, although once the index outbreak was detected, control activities were applied including both emergency vaccination of in-contact herds and, subsequently, mass vaccination, which started late and lasted for a long period due to the extended area and large numbers of herds to be covered (8, 9).

One would generally expect a marked decrease in the number of outbreaks following vaccination, but in this case, the initial intervention probably was of insufficient magnitude to effectively control the spread, resulting in a substantial epidemic tail-end in terms of number of outbreaks, duration, and geographical coverage. Had it been possible to monitor the early part of the epidemic and to apply the FFI/FFO tools described here, i.e., to act on day 14 of the epidemic, it might have led to a smaller epidemic and a faster recovery. It is worth noting that the magnitude of the association between the accumulated number of outbreaks at day 14 and the subsequent number of cases in each affected province, as indicated by the value of the regression coefficient (**Table 2**), is influenced by the presence of the large outbreak contributions from the Buenos Aires province (see **Figure 1**). However, the nature of the association, as indicated by the positive sign of the regression coefficient and its significance (see 97.5% CI, **Table 2**), remains when this province is excluded (data not shown). For that reason, we conclude that the model was robust to the inclusion or exclusion of Buenos Aires in the analysis. Inclusion is justified, however, by the biological and economic significance of this province, which accounted for 67% of the total number of outbreaks.

A high degree of variation is seen in the Danish simulation results (**Figure 2**). This is likely due to the Danish data describing 5,000 different simulated epidemics, while both the British (2) and the Argentinean data considered here were concerned with regional variations within individual extended field epidemics. Our main finding as shown in **Figures 3** and **4**, favoring vaccination over continued basic control when aiming to avoid catastrophic epidemics, is consistent with the overall results of the Danish simulation study (4). As can be seen from **Table 3**, on average, the alternative control strategies do not differ much; however, the extreme upper range values tend to be lower for the vaccination and cull strategies than for the basic control strategy. The relatively low positive predictive values (**Table 4**) of course influence the average benefit/cost ratio of implementing a vaccination strategy based on this procedure, as many such vaccination campaigns may be wasted, since the great majority of epidemics would entail <50 outbreaks with just the basic control strategy. This might indicate that the basic control strategy could be continued with a reasonably high degree of confidence. Here, the PPV also indicates that a limited effect is to be expected from a change of strategy toward vaccination. Apparently, a cutoff value of 100 predicted outbreaks to be used for vaccination considerations may be too high to yield useful decision criteria, because only a small percentage (here 7%) of simulated outbreaks reach that level under the basic control strategy. Such information would be valuable to note when using simulations as part of FMD contingency planning and exercising. The benefits of possibly reducing the actual number of outbreaks within "non-catastrophic" epidemics due to vaccination are, however, not taken into account in these estimates.

The added economic costs introduced by applying FMD vaccination should be considered when setting the cutoff for what would be a "catastrophic epidemic" in terms of the number of outbreaks. Implementing vaccination in a control strategy might by itself be very costly, e.g., due to a lengthier trade ban for Danish animals and products on the export markets (3). Thus, risk managers might tolerate up to a moderate likelihood of a high number of outbreaks in order to avoid these economic consequences of vaccination. However, if the decision tool predicts vaccination to spare a relatively large number of outbreaks, the added costs may appear acceptable, also considering the welfare benefits of limited culling after implementation of a suppressive vaccination strategy.

Along with the aspects of risk assessment and risk management discussed above, risk communication is an equally important part of an FMD risk analysis in the face of an ongoing epidemic (15). The interactions of these three components are nowhere more critical than in the initial phases of a national FMD epidemic, when alternative control strategies must be considered. Fast and reliable assessment of the likelihood and consequences of spread and the continuous evaluation and selection of optimal management and control measures should be supported by timely, robust, and transparent communication among risk assessors, risk managers, and other stakeholders. Only then may urgent and critically important decisions be properly understood and accepted. Emergency vaccination should be considered, if the anticipated cumulative size of the epidemic under a continued basic control strategy appears alarming and if a sufficient reduction can be expected in the magnitude and duration of the epidemic to justify the estimated additional direct and indirect control costs incurred by vaccination (3).

Several FMD epidemics affected Europe and South America in 2001, and the control strategies applied have been discussed extensively (8, 9). Major obstacles to effective prevention, detection, and control of FMD have been identified, and the role of disease models in animal health emergency preparedness has been highlighted (16–18). In particular, the following statements characterize the situation facing authorities when it comes to decisions on the potential use of emergency vaccination during an FMD epidemic:


When future epidemics occur, scientific and political debate will arise again regarding the merits of vaccination, with many technical, logistical, economic, political, cultural, and historical facts affecting the decision. Generally, vaccination decisions have to be made quickly and will be influenced greatly by previous experiences, but because large FMD epidemics are extremely rare events, the opportunities to directly assess the effects of control strategies are very limited (9). Therefore, effective computational models should be made available for a range of outbreak scenarios to assist objective decision-making and minimize bureaucratic delays in vaccine application, and continued efforts are required to develop robust models for use during outbreaks in FMD-free countries (19, 21). Comparison of the pros and cons of alternative control strategies has been the aim of numerous simulation modeling studies, as recently reviewed and discussed (21, 22). The special importance of communicating output results from modeling tools to decision makers has been highlighted by the European Commission for the Control of Foot-and-Mouth Disease (EuFMD) (23):


The results presented here indicate that, in the context of a decision-making aid, choice of control strategy and predictions of epidemic consequences based on the cumulative number of outbreaks detected by day 14 would be useful. Furthermore, results from simulation models comparing alternative control strategies can be documented and communicated to risk managers and stakeholders in simple ways, which seem appropriate in urgently informing decisions about whether or not to implement changes, such as deployment of emergency vaccination.

#### AUTHOR CONTRIBUTIONS

PW developed the extension to the FFO concept of the two decision support tools and completed the initial analyses on the Danish model simulation data to document their applicability. CE contributed to the initial conceptual discussions of the design of the study and participated in acquisition of the Danish data for the simulations. AB and TH provided these data and results from their model simulation studies. AP provided access to the Argentina FMD epidemic data, and MA provided the statistical analyses of these data. PW drafted the initial manuscript, and all the co-authors reviewed the work critically, suggested revisions, and finally approved the version to be published.

#### FUNDING

The study was financially supported by the Directorate for Food, Agriculture, and Fisheries, Denmark (grant no. 3304-FVFP-07-782-01) and was conducted in collaboration with the EU Seventh Framework Program project FMD-DISCONVAC (grant agreement no. 226556). Additional support for PW's work was obtained through Cooperative Agreements 07-9208-0203-CA and 08-9208-0203-CA between University of California, Davis and USDA-APHIS.

*EuFMD Standing Technical Committee*. Jerez, Spain (2012). p. 29–31. Available from: http://www.fao.org/fileadmin/user\_upload/eufmd/docs/ Open\_Session2012/Appendices/12\_Simple\_decisions\_tools\_informed\_by\_ model\_predictionswhen\_considering-FMD\_emergency\_\_P-Willeberg\_P. Willeberg.pdf


analysis: 1 – overview of global status and gap analysis. *Transbound Emerg Dis* (2016) 63(Suppl 1):3–13. doi:10.1111/tbed.12522


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Willeberg, AlKhamis, Boklund, Perez, Enøe and Halasa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Resource Estimations in Contingency Planning for Foot-and-Mouth Disease

#### *Anette Boklund<sup>1</sup>\*, Sten Mortensen<sup>2</sup>, Maren H. Johansen<sup>3</sup> and Tariq Halasa<sup>1</sup>*

*<sup>1</sup>Veterinary Institute, Technical University of Denmark, Copenhagen, Denmark, <sup>2</sup>The Danish Veterinary and Food Administration, Head Office, Glostrup, Denmark, <sup>3</sup>Veterinary Control Office North, Herning, Denmark*

Preparedness planning for a veterinary crisis is important for fast and effective eradication of disease. For countries with a large export of animals and animal products, each extra day in an epidemic will cost millions of Euros due to the closure of export markets. This is important for the Danish husbandry industry, especially the swine industry, which had exports of €4.4 billion in 2012. The purposes of this project were to (1) develop an iterative tool for estimating the resources needed during an outbreak of foot-and-mouth disease (FMD) in Denmark and (2) identify areas that can delay the control of the disease. The tool developed should be easy to update when knowledge is gained from other veterinary crises or during an outbreak of FMD. The stochastic simulation model DTU-DADS was used to simulate spread of FMD in Denmark. For each task occurring during an epidemic of FMD, the time and personnel needed per herd were estimated by a working group with expertise in contingency and crisis management. By combining this information, an iterative model was created to calculate the needed personnel on a daily basis during the epidemic. The needed personnel was predicted to peak within the first week with a requirement of approximately 123 (65–175) veterinarians, 33 (23–64) technicians, and 36 (26–49) administrative staff on day 2, while the personnel needed in the Danish Emergency Management Agency (responsible for the hygiene barrier and initial cleaning and disinfection of the farm) was predicted to be 174 (58–464), mostly recruits. The time needed for surveillance visits was predicted to be the most influential factor in the calculations. Based on results from a stochastic simulation model, it was possible to create an iterative model to estimate the requirements for personnel during an FMD outbreak in Denmark. The model can easily be adjusted when new information on resources appears from management of other crises or from new model runs.

Keywords: stochastic modeling, veterinary crisis, epidemics, simulation models, preparedness

#### *Edited by:*

*Francisco Ruiz-Fons, Spanish Research Council, Spain*

#### *Reviewed by:*

*Chrisovalantis Malesios, Democritus University of Thrace, Greece*

*Aurélie Courcoul, Agence Nationale de Sécurité Sanitaire de l'Alimentation, de l'Environnement et du Travail (ANSES), France*

*\*Correspondence:*

*Anette Boklund anebo@vet.dtu.dk*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

*Received: 30 November 2016 Accepted: 19 April 2017 Published: 11 May 2017*

#### *Citation:*

*Boklund A, Mortensen S, Johansen MH and Halasa T (2017) Resource Estimations in Contingency Planning for Foot-and-Mouth Disease. Front. Vet. Sci. 4:64. doi: 10.3389/fvets.2017.00064*

### INTRODUCTION

Foot-and-mouth disease (FMD) is a highly contagious disease, which is known to spread easily within and between herds and cause severe economic losses in each herd as well as in the country (1). The control and eradication of FMD within the EU is governed by EU legislation (Council Directive 2003/85/EC of 29 September 2003; http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32003L0085&from=EN). Following the directive, EU member states are obliged to use a stamping-out policy, involving quarantine, movement restrictions, zoning, and slaughter and disposal of all affected herds, followed by cleaning and disinfection (CD) of the farm. Additional control measures can be used, for example, preemptive culling or vaccination, after approval of the plan by the European Commission regulatory committee *Standing Committee on the Food Chain and Animal Health*.

Preparedness planning for a veterinary crisis, such as an FMD outbreak, is important in order to be fast and effective in the eradication of the disease. For countries with a large export of animals and animal products, each extra day in an epidemic will cost millions of Euros due to the closure of export markets. This is of utmost importance for the Danish swine industry, which had a yearly export of €4.4 billion in 2012 (2).

Modeling results have previously been used to inform decision-making related to disease control options (3–14). Models have traditionally included limitations on resources for culling and vaccination, while only a few models have included resources for other tasks such as surveillance visits [e.g., Ref. (15)]. With this work, we propose to also use the outputs of simulation modeling for planning and operational purposes in a Veterinary Administration of a country before and during a veterinary crisis.

Geering and Lubroth (16) describe how the first step in preparing a resource plan is to make a resource inventory, "listing all the resources needed to respond to a moderate-sized FMD outbreak or other high-priority emergency disease. The plan includes personnel, equipment, and other physical resources." Garner et al. (17) estimated the needed resources during a hypothetical FMD epidemic in Australia, focusing on the 90th percentile out of 100 iterations simulated to set off from the area previously predicted to give the worst epidemics in terms of size and duration (18). For the Danish Veterinary and Food Administration, it is important to know how many persons must be available within hours and days in order to efficiently handle and eradicate the disease, and to consider whether these people are already available in the organization and how extra personnel can be recruited. Furthermore, it is important to consider whether these personnel have the required level of education or whether extra training and education are needed. Similarly, the need for materials and services during a veterinary crisis must be identified and quantified. These materials and services include, for example, cars, sampling materials and testing capacity at the laboratory, equipment for culling of animals, protective clothing, disinfection agents, valuators, trucks, and rendering capacity.

The purposes of this project were (1) to develop an iterative tool with the aim of estimating the resources needed during an outbreak of FMD in Denmark and (2) to identify areas that can be bottlenecks in the veterinary administration and thereby delay the control of the disease, the time to regain disease-free status, and the access to export markets. The tool developed should be easy to update when new knowledge is gained from other veterinary crises or during an outbreak of FMD.

#### MATERIALS AND METHODS

#### The Simulation Model

The DTU-DADS model (version 0.16) was used to simulate spread of FMD in Denmark (19, 20).

#### Farm Data

From the official Danish central herd register, all Danish herds registered with cattle, swine, sheep, or goats in the period from October 1st 2006 to September 30th 2007 were extracted and used in the model. This period was used to avoid influence of the bluetongue outbreak in Denmark, which started in October 2007. For each herd, a unique identification number, the herd type, the numbers of animals of different types, and the UTM geo-coordinates were extracted. In total, 23,550 cattle herds, 11,473 swine herds, and 15,830 sheep or goat herds were included. Sheep and goat herds were grouped in one category, called sheep, as there are a limited number of goat herds in Denmark, and they are considered to be handled similarly to sheep herds during an epidemic. Furthermore, herds were described as different types: cattle herds as milking or beef cattle, sheep herds as commercial or hobby herds, and swine herds as 19 different herd types, based on their SPF status and their production type (5). For each herd, the rate of daily movements was calculated as the total number of movements off the herd in the 1-year period mentioned above divided by 365, for batches of animals moved to other herds or to abattoirs, respectively. For swine herds, animals moved to other herds were divided into sows or weaners. Farms including several species were separated into several herds, with different herd IDs but with the same coordinates.

#### Modeling Spread of Disease

Spread of disease was modeled to occur through seven different spread mechanisms: (1) direct contact, i.e., animal movements, (2) indirect medium risk contacts, i.e., veterinarians, artificial inseminators, or milk controllers, (3) indirect low risk contacts, i.e., visitors, feed stuff/rendering trucks, (4) abattoir trucks, (5) milk tankers, (6) markets, and (7) local spread.

Based on movement data from the period October 2006 to September 2007, the rate of movement per day for the individual herd was used as λ in a Poisson distribution simulating spread of disease. Similarly, the rate of abattoir deliveries was calculated for the individual herd and used as λ in a Poisson distribution simulating the risk of spread on the abattoir route. In contrast, the pick-up of milk from dairy herds was simulated as a Poisson distribution with λ = 0.6 for all dairy herds. Indirect medium and low risk contacts were simulated with different λ for different herd types (5, 19). Markets were simulated for cattle only, as markets in Denmark are restricted to cattle and horses, with an average of 3.5 extra contacts generated from a market. Local spread was simulated as a probability of spread within 3 km of infected herds, representing unexplained short-distance spread caused by, for example, limited airborne spread, rodents, birds, flies, and unregistered animal movements or person contacts (19, 20).
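As an illustration of how a herd-specific rate can drive daily contacts, the following minimal Python sketch derives λ from a yearly movement count and draws daily Poisson counts. It is not the DTU-DADS code (which is written in R); the function names, the example herd with 73 movements, and the choice of Knuth's sampling method are assumptions made for this sketch. Only the divisor of 365 days and λ = 0.6 for milk pick-up come from the text.

```python
import math
import random


def daily_rate(movements_in_year: int, days: int = 365) -> float:
    """Average number of movements per day for one herd."""
    return movements_in_year / days


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(1)

# Hypothetical herd: 73 batches moved off the herd during the reference year.
lam_moves = daily_rate(73)
moves_today = sample_poisson(lam_moves, rng)

# Milk pick-up uses the same lambda for all dairy herds (0.6, from the text).
milk_pickups_today = sample_poisson(0.6, rng)
```

Each simulated day would draw a new count per herd and per contact type; how a contact is then turned into a transmission event is not covered by this sketch.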

#### Modeling Detection of Disease

Detection of the first infected herd was simulated to occur between day 18 and day 23 after the disease was introduced, with a mode of 21 days. Thereafter, the disease could be detected either by the farmer or veterinarian (basic surveillance) or through surveillance by official veterinarians, as a result of tracing contacts from infected herds or of surveillance in the protection or surveillance zones. The probabilities of detection by each type of surveillance were assumed to depend on the herd type, as clinical signs are more or less pronounced in different species (19).
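The text specifies only the bounds (18 and 23 days) and the mode (21 days) for the day of first detection, not the distribution family. A minimal sketch, assuming a triangular distribution (the model itself may use a different shape, for example a PERT distribution):

```python
import random

rng = random.Random(42)

# random.triangular(low, high, mode); bounds and mode are taken from the text,
# the triangular shape itself is an assumption of this sketch.
first_detection_day = rng.triangular(18, 23, 21)
```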

#### Modeling Control of Disease

After detection of disease, a set of control measures must be applied based on the EU council directive 2003/85/EC. These include depopulation of all detected herds, CD of infected properties, tracing (back and forward) of contacts, and establishment of protection (3 km) and surveillance (10 km) zones around detected herds. In both zones, movements of animals are prohibited and all herds must be surveyed at least twice in the protection zone and once in the surveillance zone, before the zones can be lifted.

In the model, herds were assumed to be depopulated as soon as capacity was available. A daily capacity for depopulation of 2,400 ruminants and 4,800 swine was estimated by the industry and the Veterinary and Food administration based on experiences with other diseases (19). Animals moved from a detected herd within 14 days before detection were assumed to be traced, and the receiving herd to be culled. For other traced contacts, the herds receiving contact were assumed to be put under surveillance. Traced contacts and herds in the protection zone would be visited as soon as possible, depending on available resources for surveillance. A daily capacity for surveillance was estimated to 450 herds by the Veterinary and Food administration, based on experiences with other diseases (19). Herds in the protection and the surveillance zone would be set for surveillance visit immediately after the initiation of the zones, and herds in the protection zone would be set for another visit after 21 days and before lifting the zone. Details on how surveillance is modeled can be obtained from Halasa and Boklund (15). All sheep within the zones were simulated to be tested as described in the Danish contingency plan due to non-specific clinical signs in sheep (21). The probability of detecting disease from clinical surveillance and testing increased with time (5).

Furthermore, a 3-day national standstill for all animal movements was modeled, based on the Danish contingency plan for FMD.

#### Initiation of Disease Spread and Model Run

One thousand cattle herds were randomly chosen and used to initiate the spread of disease (index herds). Epidemics starting in cattle herds have previously been shown to create some of the largest outbreaks under the simulated circumstances (5, 19).

For each index herd, one iteration was run, resulting in 1,000 epidemic simulations. The outcome of the model included, for every iteration and for every day in the epidemic, which herds were detected, which herds were depopulated, and which herds were surveyed. Of the 1,000 simulated epidemics, in 19 cases, the disease did not spread from the first infected herd and was not detected, resulting in 981 simulated epidemics in total.

#### Estimations of Resources during an Outbreak

A working group of 12 persons<sup>1</sup> was constituted with staff from the Danish Veterinary and Food Administration (10 persons), with experience in contingency planning and handling of veterinary crises, and experts from the Danish Emergency Management Agency (DEMA) (1 person) and the National Veterinary Institute (1 person). A series of meetings was held so that these experts could identify best practices for all work tasks during an epidemic and estimate the manpower and other resources needed. In some cases, information came from external sources, while other information was based exclusively on the knowledge and experience within the group.

Estimates for resources during an outbreak were divided into resources for detected herds, suspected herds, surveyed herds, and local crisis centers (**Tables 1** and **2**). No assumptions were made regarding either the skills required for different tasks or the manpower available, except for the resource assumptions in the model, described in Section "Modeling Control of Disease." Nor did we decide whether veterinarians (VET) should be official veterinarians, veterinarians from private practice, or veterinarians from other sources. "Technicians" (TECH) are defined as non-veterinarians working as animal technicians or as legal advisors, HR, or IT personnel. Administrative personnel (ADM) were only related to work in the local crisis center (LCC). The DEMA is engaged in the culling and cleaning phase in detected herds. Its personnel set up an organizational board at the farm, clean and disinfect people and trucks entering and leaving the farm, carry out the preliminary CD of the farm, and possibly transport culled animals. Personnel from DEMA were categorized as leading officers, officers, and recruits. For all groups of personnel, a detailed description of tasks and necessary skills was provided in the Danish report from the project (22).

Based on the daily outputs from the simulation model, the personnel needed for each day (i) of the epidemic was calculated as the total number of each type of personnel (p), i.e., the numbers of VETs, TECHs, ADMs, and staff from DEMA, for a given task (t), for example valuation, and a given species (a) as:

$$\text{Total}_{p,a,i,t} = \sum \text{animals (or herds)}_{g,a,i} \cdot \text{team}_{p,t,a} \cdot \frac{1}{K_{t,a}} \tag{1}$$

where a is the animal species, g is the action these animals are undergoing (i.e., detection, depopulation, or surveillance), $\text{team}_{p,t,a}$ is the estimated team for a given type of personnel, task, and species, and $K_{t,a}$ is the number of animals (or herds) of a given species that a team can handle per day for the given task.

For valuation, the number of veterinarians needed would then be calculated based on Eq. 1 as:

$$\text{Total}_{\text{VET},a,i,\text{valuation}} = \sum \text{herds}_{\text{detection},a,i} \cdot \text{VET}_{\text{valuation}} \cdot \frac{1}{K_{\text{valuation},a}} \tag{2}$$

where $\text{VET}_{\text{valuation}}$ is the number of VETs in the valuation team and $K_{\text{valuation}}$ is the number of herds a valuation team can handle in one day (**Table 1**).
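The general form of Eq. 1 and its valuation instance (Eq. 2) can be sketched as a single function. The snippet below is a minimal Python illustration, not the authors' R script; the function name and the input values (6 detected herds, 1 veterinarian per team, 2 herds per team per day) are hypothetical and are not the Table 1 values.

```python
def personnel_needed(units: float, team_size: float, units_per_team_day: float) -> float:
    """Eq. 1 for one personnel type, task, species, and day.

    units: animals or herds undergoing the action on that day
    team_size: members of this personnel type in one team (team_{p,t,a})
    units_per_team_day: animals or herds one team handles per day (K_{t,a})
    """
    return units * team_size / units_per_team_day


# Hypothetical valuation example (Eq. 2): 6 cattle herds detected on day i,
# 1 veterinarian per valuation team, each team valuing 2 herds per day.
vets_for_valuation = personnel_needed(units=6, team_size=1, units_per_team_day=2)
print(vets_for_valuation)  # 3.0
```

Summing the result over species gives the daily need for that personnel type and task.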

<sup>1</sup>Of the 12 persons, 3 are included as authors, and the 9 other persons are listed in the acknowledgements.


Collection of blood samples #Surveyed herds 4 sheep herds

#### TABLE 1 | Inputs used for estimation of the total personnel resources needed during a foot-and-mouth disease epidemic in Denmark (brackets refer to Eq. 1).

*Vet, veterinarian.*

*a Divided over 21 days.*

The total number of veterinarians needed for depopulation was calculated as:

$$\begin{aligned} \text{Total}_{\text{VET},a,i,\text{depop}} &= \sum \text{herds}_{\text{depop},\text{at},i} \cdot \text{coordinatingVet} \\ &\quad + \sum \text{animals}_{\text{depop},\text{at},i} \cdot \text{VET}_{\text{depop}} \cdot \frac{1}{\text{workingHours} \cdot K_{\text{depop},\text{at}}} \end{aligned} \tag{3}$$

where at denotes the animal species and type of animal, i.e., cattle, sheep/goat, sows, finishers, or weaners, $\text{VET}_{\text{depop}}$ is the number of VETs in the depopulation team, and $K_{\text{depop}}$ is the number of animals a depopulation team can handle per working hour. The number of coordinating veterinarians at the herd is a constant, with 1 as the default value. The working group estimated the working hours at 8 h of effective work per day, excluding transport time and breaks.
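Eq. 3 combines a per-herd term (the coordinating veterinarian) with a per-animal term scaled by hourly team capacity. A minimal Python sketch follows; the function name, the animal counts, and the hourly capacities are hypothetical, while the defaults of 1 coordinating veterinarian and 8 working hours come from the text.

```python
def vets_for_depopulation(herds, animals, animals_per_team_hour,
                          vets_per_team=1, coordinating_vets=1, working_hours=8):
    """Eq. 3: veterinarians needed for depopulation on one day.

    herds: number of herds depopulated that day
    animals: dict mapping animal type -> number of animals depopulated
    animals_per_team_hour: dict mapping animal type -> K_depop (animals per team per hour)
    """
    coordinating = herds * coordinating_vets
    culling = sum(
        n * vets_per_team / (working_hours * animals_per_team_hour[animal_type])
        for animal_type, n in animals.items()
    )
    return coordinating + culling


# Hypothetical day: one cattle herd and one sow herd are depopulated.
need = vets_for_depopulation(
    herds=2,
    animals={"cattle": 400, "sows": 160},
    animals_per_team_hour={"cattle": 25, "sows": 40},
)
print(need)  # 4.5
```

Here the two coordinating veterinarians contribute 2, the cattle 400 / (8 × 25) = 2, and the sows 160 / (8 × 40) = 0.5.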

The number of veterinarians needed for the cleaning and disinfection point (CDP) of the herd was calculated as:

$$\text{Total}_{\text{VET},a,i,\text{CDP}} = \sum \text{herds}_{\text{depop},a,i} \cdot \text{CDP} \cdot \text{VET}_{\text{CDP}} \tag{4}$$

where CDP is the number of CD points in a herd (default = 1), and $\text{VET}_{\text{CDP}}$ is the number of days that a veterinarian will be needed at the CDP (default = 0.5). The CDPs were assumed to be used in cattle and swine herds only, based on the limited herd sizes of Danish sheep and goat herds.

The number of veterinarians used for clinical inspection (CI) in detected cattle and swine herds was calculated as:

$$\text{Total}_{\text{VET},a,i,\text{CI}} = \sum \text{herds}_{\text{depop},a,i} \cdot \text{VET}_{\text{CI}} \cdot \frac{1}{K_{\text{CI},a}} \tag{5}$$

where $\text{VET}_{\text{CI}}$ is the number of veterinarians in the team used for CI in the herd and $K_{\text{CI}}$ is the number of herds a CI team can handle in one day. Because of the limited size of Danish sheep herds, CI and blood sampling were assumed to be included in the culling of the animals in sheep herds.

#### TABLE 2 | Inputs used for estimation of the total personnel resources in local crisis centers (LCCs) during a foot-and-mouth disease epidemic in Denmark.

*Vet, veterinarian; Adm, administrative personnel.*

*a Names in brackets refer to the abbreviations used in the R-script (Supplementary Material).*

The number of veterinarians used for preliminary CD of detected herds was calculated as:

$$\text{Total}_{\text{VET},a,i,\text{CD}} = \sum \text{herds}_{\text{depop},a,i} \cdot \text{VET}_{\text{CD},a} \cdot \frac{1}{K_{\text{CD},a}} \tag{6}$$

where $\text{VET}_{\text{CD}}$ is the number of veterinarians in the team used for initial CD of the herd and $K_{\text{CD}}$ is the number of herds a CD team can handle in 1 day.

The subsequent final cleaning and disinfection (FCD) of herds was assumed to be carried out by private commercial cleaning companies, under the guidance of and subject to approval by the official veterinarians. Therefore, for each species, a certain amount of veterinarian time was needed, spread over a 3-week period, as only a few hours are needed per day. This was calculated as:

$$\text{Total}_{\text{VET},a,\text{FCD}} = \frac{\sum \text{herds}_{\text{depop},a,i} \cdot \text{VET}_{\text{FCD},a} \cdot \frac{1}{K_{\text{FCD},a}}}{\text{duration}_{\text{FCD}}} \tag{7}$$

where $\text{duration}_{\text{FCD}}$ is the time period over which the FCD takes place and $K_{\text{FCD}}$ is the number of herds an FCD team can handle in 1 day. This value is then included for every day of $\text{duration}_{\text{FCD}}$.
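The spreading of FCD work over several days can be sketched as adding a constant daily share to a running calendar of veterinarian demand. The Python snippet below is illustrative only: the function name and input values are hypothetical, and starting the FCD period on the depopulation day itself is an assumption of this sketch, as the text does not state the start day. The 21-day duration corresponds to the 3-week period mentioned in the text.

```python
def add_fcd_demand(daily_vets, day, herds_depopulated, vets_per_team,
                   herds_per_team_day, duration=21):
    """Eq. 7: spread the FCD veterinarian time for herds depopulated on `day`
    evenly over `duration` days, accumulating into the dict `daily_vets`."""
    per_day = herds_depopulated * vets_per_team / herds_per_team_day / duration
    for d in range(day, day + duration):  # assumption: FCD starts on the depopulation day
        daily_vets[d] = daily_vets.get(d, 0.0) + per_day
    return daily_vets


# Hypothetical example: 3 herds depopulated on day 5, 1 veterinarian per FCD team,
# each team handling 1 herd per day of work.
demand = add_fcd_demand({}, day=5, herds_depopulated=3, vets_per_team=1,
                        herds_per_team_day=1)
```

Calling the function once per depopulation day accumulates overlapping FCD periods, so the daily value reflects all herds whose FCD is still in progress.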

When an epidemic is running, there will be suspicions of disease (SI), also in herds that are not infected. Suspicions that are subsequently confirmed as FMD on investigation are counted as detected herds in the model. However, suspicions of FMD in non-infected herds must be accounted for separately.

As the simulation model only simulates spread of infection, it provides no information on non-infected SI. Therefore, a conservative estimate based on data from the UK 2001 epidemic was used, i.e., five SI per detected herd were assumed for the number of inspections based on passive surveillance (23).

The SI were randomly distributed over a period of 10 days, starting the day after a herd was detected in the model. As it is not known in which herd type a suspicion will occur, we could not take herd type into account for SI. The number of veterinarians needed to inspect SI of FMD was calculated as:

$$\text{Total}_{\text{VET},i,\text{SI}} = \sum \text{herds}_{\text{suspicion},i} \cdot \text{VET}_{\text{SI}} \cdot \frac{1}{K_{\text{SI},a}} \tag{8}$$

where $\text{VET}_{\text{SI}}$ is the number of veterinarians in the team investigating a suspicion and $K_{\text{SI}}$ is the number of herds a suspicion inspection team can handle in 1 day.
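The allocation of non-infected suspicions to days can be sketched as follows. The snippet is a minimal Python illustration in which each of the five SI per detected herd is placed on one of the 10 days following detection; the uniform choice of day and the function name are assumptions of this sketch, and the detection days are hypothetical.

```python
import random
from collections import Counter


def schedule_suspicions(detection_days, rng, per_herd=5, window=10):
    """Return a Counter mapping day -> number of non-infected suspicions to inspect.

    Each detected herd generates `per_herd` suspicions, each placed on a day drawn
    uniformly (an assumption) from the `window` days starting the day after detection.
    """
    schedule = Counter()
    for day in detection_days:
        for _ in range(per_herd):
            schedule[day + rng.randint(1, window)] += 1
    return schedule


rng = random.Random(7)
detection_days = [21, 21, 23, 30]  # hypothetical detection days of four herds
si_per_day = schedule_suspicions(detection_days, rng)
```

The daily count in `si_per_day` would then enter Eq. 8 as the number of herds under suspicion on day i.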

The number of veterinarians needed for surveillance in traced herds and in herds in the protection or surveillance zones (zoneSurv) was calculated as:

$$\text{Total}_{\text{VET},i,\text{zoneSurv}} = \sum \text{herds}_{\text{surveillance},i} \cdot \text{VET}_{\text{zoneSurv}} \tag{9}$$

where $\text{VET}_{\text{zoneSurv}}$ is the number of veterinarians needed for a surveillance visit. No difference was assumed between herd types for surveillance visits. The day of the surveillance visit was extracted from the output of the simulation model; therefore, any waiting time for a surveillance visit was already accounted for.

During an epidemic, an LCC will be created according to the Danish veterinary contingency plan (24). The number of LCCs in Denmark could vary from 1 to 3, corresponding to the regions for official veterinarians. It was assumed that all LCCs were active from the beginning to the end of the epidemic. The number of veterinarians needed was calculated as a total for all LCCs (**Table 2**). From 14 days after first detection, it was assumed that the experience gained in the crisis centers would make them more effective and, therefore, that the time needed for different work tasks would be reduced (**Table 2**). The number of veterinarians needed in the LCCs was calculated as:

$$\begin{aligned} \text{Total}_{\text{VET},i,\text{LCC}} &= \text{LCC} \cdot \Big( \text{LCC}_{\text{VET,management}} + \sum \text{herds}_{\text{detect},a,i} \cdot \text{suspicion} \cdot \text{LCC}_{\text{VET,suspicion}} \\ &\quad + \sum \text{herds}_{\text{detect},a,i} \cdot \text{LCC}_{\text{VET,EPI}} + \sum \text{herds}_{\text{depopulated},a,i} \cdot \text{LCC}_{\text{VET,depop}} \\ &\quad + \text{LCC}_{\text{VET,move}} + \text{LCC}_{\text{VET,comp}} \Big) \end{aligned} \tag{10}$$

where LCC is the number of local crisis centers, $\text{LCC}_{\text{VET,management}}$ are veterinarians working in the management of the group, $\text{LCC}_{\text{VET,suspicion}}$ are veterinarians working with suspicions, $\text{LCC}_{\text{VET,EPI}}$ are veterinarians working with the epidemiology of the epidemic, $\text{LCC}_{\text{VET,depop}}$ are veterinarians working with depopulation of detected herds, $\text{LCC}_{\text{VET,move}}$ are veterinarians working with movement restrictions in the zones, and $\text{LCC}_{\text{VET,comp}}$ are veterinarians working with educating new staff during the epidemic.

Similarly, the numbers of technicians, administrative staff, and personnel from DEMA needed were calculated for each day and each task in the epidemic and summed over all tasks, resulting in the daily needs for personnel. Furthermore, the need for rendering capacity was calculated for ruminants and non-ruminants, and the equipment needed for culling and testing was calculated; these results are not included in this paper. Details of these calculations can be obtained from the authors.

#### Materials

The simulation model and the resource calculations were run using the freeware R (25). All estimated resources are presented in **Tables 1** and **2**, and the calculations are presented above. The full model is available in Datasheet S1 in the Supplementary Material. From the stochastic simulation model, the following outputs per day of the epidemics were used as inputs in the resource calculations: numbers of detected herds, numbers of depopulated herds, numbers of animals (for each type of animal) in depopulated herds, and numbers of surveyed herds (**Table 1**), resulting in a stochastic model of the resources needed during an outbreak. Resource estimates were calculated for every single epidemic (981) and presented as median values and 5th–95th percentiles.
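The summary step, a median with 5th and 95th percentiles across simulated epidemics, can be sketched with the Python standard library. The function name and the example values are hypothetical; the `inclusive` interpolation method is chosen here because it corresponds to the default quantile definition in R, which is an assumption about how the original results were computed.

```python
import statistics


def summarize(values):
    """Return (median, 5th percentile, 95th percentile) across simulated epidemics."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    return statistics.median(values), cuts[0], cuts[-1]


# Hypothetical veterinarian demand on one day across seven simulated epidemics.
vets_on_day = [80, 95, 110, 116, 120, 150, 300]
median, p05, p95 = summarize(vets_on_day)
```

Applying `summarize` to each day of the epidemic gives the curves for the median and the percentile bands.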

#### Sensitivity Analyses

The influence of estimates on the required number of staff during an outbreak was investigated by decreasing or increasing the number of vets, technicians, and administrative staff as described in **Table 4**. We investigated the effect of change on valuation, culling, CD, surveillance visits in herds under suspicion of disease and in herds located in protection and surveillance zones, on staff at the LCCs being more or less efficient, the influence of only 1 LCC, of DEMA present only 1 day in each herd compared to 2 days (default), and of the numbers of DEMA personnel needed. Sensitivity analyses were run in 100 iterations.

#### RESULTS

The simulated epidemics had a median size of 22 (5–95%: 2–155) infected and detected herds (**Figure 1**) and a median duration of 34 days (5–95%: 2–142), counted from first detection until the last herd was culled, not taking into account the time until zones were lifted. The median number of SI was 110 (5–95%: 10–775).

Based on the results from the simulation models, we estimated that the need for personnel in the regions would peak in the first couple of days, with a median of 116 veterinarians and 22 technicians needed, while the need for administrative personnel would peak a little later, with a median of 45 administrative personnel needed 21 days into the epidemic (**Figures 2**–**4**; **Table 3**). Furthermore, the number of veterinarians needed would increase again at day 21, caused by the second surveillance visit of herds in the protection zone (**Figure 2**). Additionally, 174 persons would be needed from DEMA at day 2, mostly recruits (**Figure 5**; **Table 3**).

From the sensitivity analyses (**Table 4**), it was clear that the time needed to perform clinical surveillance in farms (either suspected farms or farms located in protection and surveillance zones) influences the estimated numbers of veterinarians and technicians needed during an outbreak. Increased efficiency in the LCCs, leading to decreased time needed for each task, decreased the need for veterinarians, technicians, and administrative personnel, and using DEMA personnel for only 1 day instead of 2 in detected herds had a large influence on the total number of DEMA staff needed. Furthermore, the involvement of only one LCC decreased the total manpower needed during the epidemics.

FIGURE 3 | The number of technicians needed during a foot-and-mouth disease (FMD) epidemic in Denmark. Based on results from a stochastic simulation model simulating 981 FMD epidemics in Denmark, all starting in a cattle herd. The central administration is not included. The solid black line indicates the median value, the dotted black line indicates the 75th percentile, and the lower and upper dotted gray lines indicate the 5th and 95th percentiles, respectively.

FIGURE 4 | The number of administrative personnel needed during a foot-and-mouth disease (FMD) epidemic in Denmark. Based on results from a stochastic simulation model simulating 981 FMD epidemics in Denmark, all starting in a cattle herd. The central administration is not included. The solid black line indicates the median value, the dotted black line indicates the 75th percentile, and the lower and upper dotted gray lines indicate the 5th and 95th percentiles, respectively.

TABLE 3 | Estimated personnel needed at day 2, 7, 14, and 21 in 981 simulated foot-and-mouth disease epidemics in Denmark, starting in cattle herds, given as median and 5th–95th percentiles.

(FMD) epidemic in Denmark. Based on results from a stochastic simulation model simulating 981 FMD epidemics in Denmark, all starting in a cattle herd. The solid black line indicates the median value, the dotted black line indicates the 75th percentile, and the lower and upper dotted gray lines indicate the 5th and 95th percentiles, respectively.

#### DISCUSSION

Based on results from a stochastic simulation model, it was possible to create a model in R to estimate the requirements for personnel during an FMD outbreak in Denmark. The model can easily be adjusted when new information on resources appears from the management of other crises, or when new simulation results are available from new model runs in peacetime. Furthermore, it is possible to adjust the model during a crisis, when results from daily runs of the stochastic simulation model give more precise estimates for the specific epidemic, or when adjustments in management procedures become available.

It was not surprising to find that the number of staff needed for surveillance visits in particular influenced our results, as the numbers of herds in the zones are so large. This is in line with the findings of Garner et al. (17). It means that if veterinarians doing surveillance visits can be more efficient, the number of veterinarians needed will decrease substantially. On the other hand, if veterinarians are less thorough, the probability of detection by surveillance visits will decrease, resulting in larger and longer lasting epidemics.

A peak for veterinarians was predicted very early in the epidemic (**Figure 2**). However, the assumption in the model is to be able to survey 450 farms a day in the protection and surveillance zones. If resources for this surveillance are reduced, as described by Halasa et al. (15), surveillance visits will be delayed, leading to delayed detections, prolonged epidemic duration, and an expected right shift in the peak for resources needed.

The resources estimated here were based on simulated epidemics and were shown to follow the simulated epidemic peaks closely (**Figures 1**, **2** and **5**), although with some delay for technicians and administrative personnel (**Figures 3** and **4**) and with a renewed increase in needed resources around day 21. Varying model inputs in the simulation model has previously been shown to change the outputs (5, 19), and corresponding changes in resources can be expected. In particular, the low risk contacts and the probabilities of local spread and disease detection were highly influential (19). Furthermore, a decrease in the length of the high-risk period (HRP) would decrease the size, duration, and costs of an outbreak (5). However, the conservative HRP used here, with a mode of 21 days (18–23), corresponds to the 2001 FMD epidemic in Europe, where the HRP was estimated at 21 days in the UK (26, 27) as well as in the Netherlands (28).

*aSensitivity analyses are run in 100 iterations, so for comparisons, the same 100 iterations (1–100) were extracted.*

*bDivided over 21 days.*

Our estimates were based on daily outputs from 981 simulated epidemics under a basic control strategy, i.e., the strategy expected to be used if an outbreak occurred tomorrow. However, in very large epidemics, decision makers would probably not keep to the basic control strategy, but would most likely add extra control measures such as preemptive culling or emergency vaccination, and therefore, the resources needed in the extreme epidemics would change.

Surprisingly, there was a very high need for recruits from the DEMA for the CD of detected herds (**Figure 5**), which might turn out to be a bottleneck, whereas we had expected that the Danish Veterinary and Food Administration would run out of veterinarians.

Before the UK 2001 outbreak of FMD, the UK State Veterinary Service used two scenarios in their contingency planning: one moderate scenario with 10 simultaneous outbreaks and one severe scenario, also with 10 simultaneous outbreaks, but with a large herd density. They found a need for 235 veterinary officers, which they extrapolated to around 300 in more severe outbreaks. During the UK 2001 outbreak, 57 premises were infected before the first herd was diagnosed, leading to an almost immediate need for all state veterinary officers. Before the end of the outbreak, another 2,500 temporary veterinary inspectors were appointed, nearly 70 from abroad, and another 700 foreign government vets and secondees assisted in periods (29).

Based on the experience in the UK, we could fear that we are underestimating the needs during an FMD crisis in Denmark. Although we were interested in estimating the manpower and materials needed, we were also aware that we could end up with even larger epidemics. Therefore, it was important for us to create a model that can easily be adjusted during a crisis in an iterative process. Each time new information becomes available regarding the epidemic or the resources needed, it can be fed into the model, and new outputs can be calculated. For example, if the compositions of the veterinary teams for different tasks are changed, we will change the inputs in the model and rerun it. Or, if we rerun the stochastic model with historic data from the first 10 days of the epidemic, we will have more precise estimates of the further development of the epidemic, which can be put into the resource model.

The results of our estimations seem somewhat lower than what was estimated in Australia (17). Direct comparisons are difficult because of geographical differences, resulting in several state disease control centers and local disease control centers in Australia; differences in the estimated size of the epidemic, as Garner et al. chose a 90th percentile epidemic; and differences in how results are presented. Garner et al. estimated that nearly 20% of the staff needed were veterinarians, while we estimated 33%.

The calculation of resources needed is an iterative process. The simulation model includes assumptions regarding resources, to simulate realistic epidemics, as scarce resources will prolong the epidemics. After assuming the available resources, we then calculate the daily needs. Naturally, this seems like a circular argument. However, in the simulation model, resources are roughly set as numbers of animals or herds that can be processed daily for either depopulation or surveillance. In the resource calculation presented here, we go into details regarding the teams for each task, the time needed, and look at number of herds and numbers of animals to process. The influence of the assumptions regarding resources has previously been described for depopulation (5) and surveillance (19). In both situations, the influence of reducing the resources was limited, reflecting that plenty of resources were assumed for most simulated outbreaks. This means that the calculations presented in this paper closely reflect the daily needs, when resources are not a limiting factor.

One of the assumptions was that all three veterinary regions would be involved from the beginning of the epidemic. While this was not truly realistic, the influence of this assumption was assumed to be limited, as many parameters even in the LCC depended on the numbers of herds and animals involved in the epidemic rather than the numbers of LCC. However, overall, we did estimate a clear decrease in the manpower needed for veterinarians as well as technicians and administrative staff. Therefore, an adjustment of the model taking region into account will be considered in future versions of the model.

In the current estimations, the very basic needs during an epidemic were estimated. Traveling time between herds was taken into account in the estimates (**Table 1**), while logistic challenges were not, such as veterinarians stuck in a herd after a surveillance visit that resulted in the detection of an infected herd. In such situations, the veterinarian will stay in the detected herd and will not be able to visit other herds for the two following days. However, we assumed that the veterinarian would then be able to carry out other tasks, for example in the LCC. The competences needed for personnel involved in each task are described in detail in the contingency plan for FMD (24) and in the project report (22). Furthermore, geographical challenges were not taken into consideration in these calculations. Denmark is a rather small country, where most destinations can be reached in a reasonable driving time (3–4 h). However, longer driving time will of course reduce the number of herds a veterinarian can visit on a given day. Nevertheless, estimating the amount of personnel needed gives us no answers in itself. To be able to use these estimates, it is necessary for the Danish Veterinary and Food Administration to compare them with the staff presently available and to consider how and where more personnel can be recruited to meet the needs during a crisis, and which type of training is required in peacetime to be ready for an outbreak. The working group has continued working on this matter to update the Danish FMD contingency plan according to the results of the resource estimations and has given detailed descriptions of the competences required for different types of staff for different tasks and of how people can be trained to meet the challenges during a crisis. All of these results are described in a report from the expert group, in Danish (22).

#### AUTHOR CONTRIBUTIONS

AB, SM, and MJ contributed to the study design; AB and TH developed the model; and all the authors contributed to the results and discussion. AB wrote the manuscript, and all the authors commented on and approved the final version.

#### ACKNOWLEDGMENTS

The authors wish to thank the Danish Veterinary and Food Administration for funding this project. Furthermore, we would like to thank the expert group for their great effort in this work. From the Danish Veterinary and Food Administration: Majbritt Birkmose (deputy head of veterinary control office, North), Jesper Valbak (official veterinarian), Annelise Pallesen (official veterinarian), Peter Lybecker Larsen (official veterinarian), Tina Mørk (veterinary officer, head office), Stig Mellergaard (deputy head of division for animal health), Kim Vandrup Sigsgaard (head of Danish alert unit for food), and Erik Jepsen (special advisor). From the Danish Emergency Management Agency: Hans Kaj Henrik Bruhn (Major, CP).

#### FUNDING

This project was financially supported as part of an agreement of commissioned work between the Ministry of Environment and Food of Denmark and the Technical University of Denmark. The work was commissioned by the Danish Veterinary and Food Administration.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at http://journal.frontiersin.org/article/10.3389/fvets.2017.00064/full#supplementary-material.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Boklund, Mortensen, Johansen and Halasa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Evaluation of Strategies to Control a Potential Outbreak of Foot-and-Mouth Disease in Sweden

*Fernanda C. Dórea1 \*, Maria Nöremark1 , Stefan Widgren1 , Jenny Frössling1 , Anette Boklund2 , Tariq Halasa2 and Karl Ståhl1*

*1Department of Disease Control and Epidemiology, National Veterinary Institute (SVA), Uppsala, Sweden, 2Department of Diagnostics and Scientific Advice, The National Veterinary Institute, Copenhagen, Denmark*

To minimize the potential consequences of an introduction of foot-and-mouth disease (FMD) in Europe, European Union (EU) member states are required to present a contingency plan. This study used a simulation model to study potential outbreak scenarios in Sweden and evaluate the best control strategies. The model was informed by the Swedish livestock structure using herd information from cattle, pig, and small ruminant holdings in the country. The contact structure was based on animal movement data and studies investigating the movements between farms of veterinarians, service trucks, and other farm visitors. All scenarios of outbreak control included depopulation of detected herds, 3 km protection and 10 km surveillance zones, movement tracing, and a 3-day national standstill. The effect of availability of surveillance resources, i.e., number of field veterinarians per day, and timeliness of enforcement of interventions, was assessed. With the estimated currently available resources, an FMD outbreak in Sweden is expected to be controlled (i.e., last infected herd detected) within 3 weeks of detection in any evaluated scenario. The density of farms in the area where the epidemic started would have little impact on the time to control the outbreak, but spread in high-density areas would require more surveillance resources, compared to areas of lower farm density. The use of vaccination did not result in a reduction in the expected number of infected herds. Preemptive depopulation was able to reduce the number of infected herds in extreme scenarios designed to test a combination of worst-case conditions of virus introduction and spread, but at the cost of doubling the number of herds culled. This likely resulted from a combination of the small outbreaks predicted by the spread model, and the high efficacy of the basic control measures evaluated, under the conditions of the Swedish livestock industry, and considering the assumed control resources available. The results indicate that the duration and extent of FMD outbreaks could be kept limited in Sweden using the EU standard control strategy and a 3-day national standstill.

Keywords: foot-and-mouth disease, spread model, simulation, vaccination, stamping out, outbreak control

#### *Edited by:*

*Eyal Klement, The Hebrew University, Israel*

#### *Reviewed by:*

*Michael Tildesley, University of Warwick, United Kingdom; Krishna Thakur, UPEI, Canada*

*\*Correspondence:*

*Fernanda C. Dórea, fernanda.dorea@sva.se*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

*Received: 30 November 2016; Accepted: 07 July 2017; Published: 24 July 2017*

#### *Citation:*

*Dórea FC, Nöremark M, Widgren S, Frössling J, Boklund A, Halasa T and Ståhl K (2017) Evaluation of Strategies to Control a Potential Outbreak of Foot-and-Mouth Disease in Sweden. Front. Vet. Sci. 4:118. doi: 10.3389/fvets.2017.00118*

#### INTRODUCTION

Foot-and-mouth disease (FMD) is described by the World Organisation for Animal Health (OIE) as "the most contagious disease of mammals" (1). The FMD virus (FMDV, family *Picornaviridae*, genus *Aphthovirus*) causes an acute vesicular disease in cloven-hoofed animals. Seven FMDV serotypes have been described (O, A, C, Asia1, SAT1, SAT2, and SAT3), with no cross-protection observed among them (2). Due to its exceptional economic impact, the disease is a high priority in disease surveillance, contingency planning, and trading agreements around the globe. Despite not being a zoonosis, the disease can have a severe psychosocial impact on the farming community. The extent of the negative economic, social, and animal welfare effects of an outbreak in previously free countries was demonstrated by the European 2001 outbreak that started in the UK (3).

In 2000 and 2001, outbreaks in the Republic of Korea, Japan, Russia, Mongolia, South Africa, the United Kingdom, Republic of Ireland, France, and the Netherlands were caused by FMDV of serotype O (of a particular genetic lineage named the PanAsia strain) (4). The European outbreak ignited an intense debate regarding the best control strategies during an outbreak, as well as their effect on the reestablishment of trade afterwards. The discussions resulted in a revision of the European Union (EU) legislation for the control of FMD, now established in the Council Directive 2003/85/EC. One of the main new elements of the current legislation, compared to previous ones, is the emphasis on preparation of contingency plans (5). Countries are urged to include the preparation for a "worst-case" scenario in the plan, and contingency plans should be regularly updated in light of current information.

Mathematical modeling was used extensively during the 2001 FMD outbreak, especially in the UK, which was most severely affected (6–10). Since then, it has been a tool for evaluating control strategies in hypothetical scenarios, and supporting decisions when elaborating contingency plans (11–17).

Davis animal disease spread (DADS) is a stochastic simulation model developed at the University of California, Davis (11) and programmed in R (18). The model was later adapted by the Technical University of Denmark (DTU) to simulate the spread of FMD using different control measures (12, 13, 19). The resulting DTU-DADS model has two main components. *Between-herd* spread is simulated using an agent-based model that simulates FMD spread through direct and indirect contact. *Within-herd* spread is modeled as a compartmental model based on the work of Carpenter et al. (20), and parameterized following Ref. (21), as detailed in Ref. (12). Several options for outbreak control have been set up in the DTU-DADS model, which can be enforced in specific herds, buffer zones, or following contact tracing. The model explicitly takes into account the resources available, and herds are queued if resources are exceeded.

We used a simulation model to study potential outbreak scenarios in Sweden in case of an introduction of FMD, assess their expected magnitude, and evaluate control strategy options. The model developed is a result of the partnership between epidemiologists from the Swedish National Veterinary Institute (SVA) and the Danish team that developed the spread model DTU-DADS at the Technical University of Denmark (DTU). SVA and the Swedish Board of Agriculture (SJV) worked together to define the main questions to be addressed, and the needed support to the decision-making process of drafting a contingency plan. Emphasis was given to the comparative effect of different control measures.

#### MATERIALS AND METHODS

The DTU-DADS spread model [version 0.15 (19)] was adapted by feeding the model with specific Swedish data, and by adjusting the R code when needed. All model details are discussed below, and a full description of the model and parameters, including original descriptions from the DTU-DADS model when needed [transcribed from Ref. (12), including updates], is available in Presentation S1 in Supplementary Material. Model parameterization focused on FMDV serotype O, the serotype that caused the European outbreaks of 2001 and the most widely distributed and prevalent FMDV serotype (4).

An overview of the stochastic events simulated in the model is given in **Figure 1**. Events are simulated in discrete time steps of 1 day. Simulations run from the day of the virus introduction until all infected herds are detected, or up to 365 days if the outbreak is not controlled.
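The stopping rule described above can be sketched as a simple daily loop. This is a minimal illustration in Python, not DTU-DADS code (which is written in R): the function `run_iteration` and the `step` callback are hypothetical names, and only the 1-day time step and the 365-day cap come from the text.

```python
from typing import Callable, Set

MAX_DAYS = 365  # cap stated in the text for outbreaks that are not controlled


def run_iteration(
    infected: Set[int],
    step: Callable[[int, Set[int], Set[int]], None],
) -> int:
    """Advance one iteration in 1-day steps and return the last simulated day.

    `step` is a hypothetical callback that updates `infected` (spread) and
    `detected` (surveillance) in place for the given day.
    """
    detected: Set[int] = set()
    day = 0
    while day < MAX_DAYS:
        day += 1
        step(day, infected, detected)
        # Stop as soon as every infected herd has been detected.
        if infected <= detected:
            break
    return day
```

In this sketch an iteration that never detects all infected herds simply runs to day 365, mirroring the uncontrolled case described in the text.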


**Table 1** lists all model parameters and their sources. Further details for each parameter are given in Presentation S1 in Supplementary Material.

All disease transmission parameters that were thought to be readily applicable from the Danish to the Swedish livestock population, or to be independent from the host population (intrinsic pathogen properties) were kept as set up in the DTU-DADS model, as explained individually for the parameters in **Table 1** and Presentation S1 in Supplementary Material.

To adapt the model to the Swedish livestock population, specific data were collected for all FMD susceptible herds in Sweden, including animal movement data, as shown in **Table 1** and detailed in Presentation S1 in Supplementary Material. The direct and indirect contact networks among these herds were also characterized. Animal and people movements were characterized and modeled according to herd type, but independently for two main geographical regions in Sweden: North and South. This was to account for the lower farm density in the north of Sweden.

The model considers each group of animals from the same species, within the same farm, as one herd, and models herds individually; if a farm contains cattle and pigs, for example, cattle and pig herds are modeled individually. A farm ID is used to keep track of herds in the same farm, and enforce control measures in all herds within a farm equally. If for instance one of the herds is detected as infected, all herds belonging to the same farm are culled. A high probability of local area spread within 100 m is used to account for horizontal transmission between herds in the same farm.
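A minimal sketch of this herd and farm bookkeeping is given below. The class and function names are illustrative assumptions and are not taken from the DTU-DADS code; the sketch only shows how a shared farm ID lets a control measure applied to one detected herd reach every herd on the same farm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Herd:
    herd_id: int
    farm_id: int   # shared by all herds kept on the same farm
    species: str   # e.g. "cattle", "pig", "small ruminant"
    culled: bool = False


def cull_farm_of(detected: Herd, herds: List[Herd]) -> List[int]:
    """Cull every herd that shares the detected herd's farm ID."""
    culled_ids = []
    for herd in herds:
        if herd.farm_id == detected.farm_id and not herd.culled:
            herd.culled = True
            culled_ids.append(herd.herd_id)
    return culled_ids
```

For example, a farm with one cattle herd and one pig herd would have both herds culled when either is detected, while herds on other farms are left untouched.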

Outbreaks were modeled under different scenarios of disease introduction, to assess the effect of different population parameters in the development of the outbreak. In each of 21 base scenarios, outbreaks were set to start in a herd of a particular type, and in each iteration, the first infected herd was randomly selected among all herds of that type. Seven scenarios had infection seeded in the south of Sweden, in one specific herd type (dairy cattle, cattle herds without milking activity, sow herds, fattening pig herds, weaners, multiplying pig herds, or small ruminant herds); another seven scenarios were related to the same herd types, but seeded in the north of Sweden; and finally herds were chosen based on the frequency of direct animal contacts in a year (low, medium or high contact network cattle herds; low, medium or high contact network pig herds; or high contact network small ruminant herds). In addition, spread was also evaluated when 2, 3, or 4 initial seeds were set (number of infected herds to start the epidemic), all in cattle herds. The evaluated scenarios are listed in **Table 2**.
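The scenario counts in this paragraph can be checked with a short enumeration. The sketch below is only an illustration: the dictionary keys and the label "non-milking cattle" are assumptions made here, while the herd types, regions, contact levels, and seed numbers follow the text. It yields 21 single-seed scenarios plus 3 multiple-seed scenarios.

```python
from itertools import product

HERD_TYPES = [
    "dairy cattle", "non-milking cattle", "sow", "fattening pig",
    "weaner", "multiplying pig", "small ruminant",
]
REGIONS = ["south", "north"]
# Scenarios seeded by yearly direct-contact frequency, as listed in the text.
CONTACT_NETWORK = (
    [("cattle", level) for level in ("low", "medium", "high")]
    + [("pig", level) for level in ("low", "medium", "high")]
    + [("small ruminant", "high")]
)
MULTI_SEED = [2, 3, 4]  # number of initially infected cattle herds


def base_scenarios():
    """Enumerate the base scenarios described in the text (labels are illustrative)."""
    scenarios = [
        {"kind": "herd type", "herd_type": h, "region": r, "n_seeds": 1}
        for r, h in product(REGIONS, HERD_TYPES)
    ]
    scenarios += [
        {"kind": "contact network", "species": s, "contacts": c, "n_seeds": 1}
        for s, c in CONTACT_NETWORK
    ]
    scenarios += [
        {"kind": "multiple seeds", "species": "cattle", "n_seeds": n}
        for n in MULTI_SEED
    ]
    return scenarios
```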

Base scenarios were simulated using a fixed control strategy (here we use "control strategy" to denote a specified collection of "control measures"). In these *base control scenarios*, the mandatory measures determined in the EU Council Directive 2003/85/EC were implemented (depopulation of detected herds, 3 km protection zones and 10 km surveillance zones, and movement tracing), together with a 3-day national standstill.



*TABLE 1 notes: Further details for each parameter are given in the Supplementary Material. a, probability distribution used (minimum, most likely, and maximum).*


A reduction in the number of indirect contacts among farms after detection of the outbreak was also enforced, as per parameters listed in **Table 1**. The standard detection day used in the DTU-DADS model (21 days) was set, and the estimated surveillance capacity in Sweden is listed in **Table 1**.

After the effects of different scenarios of disease introduction were evaluated with this base control strategy, one of the *base control scenarios* was chosen to evaluate the effect of applying alternative control strategies. The choice was based on the evaluation of the *base control scenarios* and is described in the results. *Alternative disease control measures* were evaluated using a range of parameters listed in **Table 1** (see **Table 2** for a list of the evaluated scenarios).

TABLE 2 | List of all evaluated scenarios of disease spread and control measures.


Based on the results of previous scenarios, three *worst-case scenarios* were chosen. *Sensitivity analysis* was carried out in these worst-case scenarios to ensure that the effect of different parameters could be more easily identified. Model sensitivity was evaluated against a range of values for the detection day and the effectiveness of the alternative control measures, and variation in the amount of surveillance resources available (daily survey and culling capacity, see **Tables 1** and **2**).

Finally, to confirm the conclusions drawn from the previous steps, all disease spread conditions determined to have a high impact on the outbreak size were manipulated to exaggerate the worst-case scenarios and create a *chaos scenario*. "Chaos" was assumed to be a consequence of a very high infection pressure to start with (four infected herds to start the epidemic, all in the south of Sweden and close to the Danish border), and a cumulative number of failures in the effectiveness of all control measures applied. These conditions were intended to mimic an epidemic that starts and develops with a much greater magnitude than expected, compared to the typical outbreak scenarios modeled previously, or an epidemic that gets out of control. The effectiveness of specific control measures was challenged against this chaotic scenario (see **Table 2**).

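**Table 2** lists every scenario evaluated. For each scenario, the outputs reported in the results include the epidemic duration, the number of infected herds, the number of culled herds and animals, and the number of herds visited by surveillance teams.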


Ten thousand iterations of scenario 1 showed that output medians and interquartile ranges were stable after 500 iterations, but the maximum varied because longer epidemics were observed in individual iterations when more repetitions were run. As a compromise between capturing more of this variability and keeping computational time manageable, 1,000 iterations were simulated for each scenario.

*TABLE 2 note: One-thousand iterations were modeled for each scenario.*
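The kind of stability check described here can be sketched as follows. The outbreak sizes are a toy, right-skewed stand-in generated for illustration only (they are not model output), and the function name is hypothetical. The point is that medians and interquartile ranges of a skewed output tend to settle quickly, whereas the maximum can only stay the same or grow as more iterations are added.

```python
import random
import statistics
from typing import Dict, Sequence


def summary_at_checkpoints(
    values: Sequence[float], checkpoints: Sequence[int]
) -> Dict[int, Dict[str, float]]:
    """Summarise the first n iterations for each n in `checkpoints`."""
    summaries = {}
    for n in checkpoints:
        sample = values[:n]
        q1, median, q3 = statistics.quantiles(sample, n=4)
        summaries[n] = {"median": median, "iqr": q3 - q1, "max": max(sample)}
    return summaries


# Toy stand-in for simulated outbreak sizes (hypothetical values).
rng = random.Random(1)
toy_sizes = [1 + int(rng.expovariate(0.3)) for _ in range(10_000)]
summaries = summary_at_checkpoints(toy_sizes, [500, 1_000, 10_000])
```

Comparing `summaries[500]`, `summaries[1000]`, and `summaries[10000]` shows which statistics have stabilised at each iteration count.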

The progression of scenarios described above focused on testing the model sensitivity to the control measures. To also evaluate the structure of the model, and the impact of the parameters that were imported from the Danish model, the transmission parameters listed in **Table 1** were also subjected to sensitivity analysis. The probabilities of transmission associated with direct contact, slaughter trucks, low risk contact, and medium risk contact were increased and reduced by up to 20%. The effect of local spread was also subjected to sensitivity analysis, either by removing any local spread that was not between herds in the same farm, or by increasing the probabilities for radii from 1 to 3 km by up to five times.
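As a minimal sketch of this one-at-a-time scaling, the function below builds parameter sets in which every transmission probability is multiplied by a common factor. The function name and the example probabilities used to call it are placeholders (the actual values are in Table 1 and the Supplementary Material), and capping at 1 is an assumption made here so that scaled values remain valid probabilities.

```python
from typing import Dict, List, Sequence


def scaled_probabilities(
    base: Dict[str, float], factors: Sequence[float] = (0.8, 1.0, 1.2)
) -> List[Dict[str, float]]:
    """Return one parameter set per factor, capping each probability at 1."""
    return [
        {route: min(1.0, p * factor) for route, p in base.items()}
        for factor in factors
    ]
```

Each returned parameter set would then be run through the same scenario so that the outputs can be compared against the unscaled baseline.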

#### RESULTS

In general, the results showed that an FMD outbreak in Sweden would most likely be small and of short duration, and that base control measures as specified in the EU legislation, complemented with a 3-day national standstill of all susceptible animal movements, would be enough to bring the outbreak under control. Considering the 24 base scenarios evaluated, the median epidemic duration (time from detection of the first infected herd to the day on which the last herd was detected) was 3–15 days, and the median number of infected herds was 2–19 (with a median number of culled animals of 46–4,136). The 95% percentiles were for an epidemic of 20 days, involving 15 infected herds and culling nearly 5,000 animals.

Summary statistics for all the scenarios evaluated are presented in Table S2-1 in Presentation S2 in Supplementary Material and **Table 1**, and relevant results and conclusions are presented and discussed by group of scenarios below. Please note that epidemic duration is counted from the detection day. Simulations in which the epidemic was considered to die off before detection resulted in negative epidemic duration.

The results of the base scenarios (**Figure 2**) showed that the region where the outbreak started (North versus South) had little effect on the expected size and duration of the epidemic. By looking in detail into individual iterations, and mapping every modeled transmission event, it was possible to conclude that this was because epidemics starting in the North eventually spread to the South through long distance movements. The main difference between epidemics starting in the North and South is the resources needed to control the outbreak, as farm density is lower in the North, and therefore a smaller number of farms ends up in the surveillance zones. The median number of farms that needed to be visited by surveillance teams (direct contacts of infected farms, or farms within the surveillance zones) was 106–381 for the base control scenarios in which the epidemic started in the South (95% percentile = 448–915) and 49–195 in the North (95% = 188–734).
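The link between farm density and surveillance workload can be illustrated with a distance-based zone assignment. The sketch below is an assumption-laden simplification: coordinates are taken to be planar and in kilometres, the surveillance zone is treated as the ring between 3 and 10 km, and the function name is hypothetical. Only the 3 km and 10 km radii come from the text.

```python
import math
from typing import Dict, List, Tuple

PROTECTION_KM = 3.0
SURVEILLANCE_KM = 10.0


def classify_zones(
    infected_xy: Tuple[float, float],
    herds_xy: Dict[int, Tuple[float, float]],
) -> Dict[str, List[int]]:
    """Assign herds to the protection or surveillance zone of one infected herd."""
    zones: Dict[str, List[int]] = {"protection": [], "surveillance": []}
    for herd_id, (x, y) in herds_xy.items():
        dist = math.hypot(x - infected_xy[0], y - infected_xy[1])
        if dist <= PROTECTION_KM:
            zones["protection"].append(herd_id)
        elif dist <= SURVEILLANCE_KM:
            zones["surveillance"].append(herd_id)
    return zones
```

With the same infected herd, a denser set of neighbouring herds produces longer zone lists and therefore more surveillance visits, which is the effect reported for the South.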

The effect of species and herd types seemed to be a direct effect of the contact network structure for each herd type. Epidemics starting in sheep/goat herds were generally smaller, due to a lower probability of direct (animal movement) and indirect (people movement) contacts. Cattle herds had very different results depending on whether the herd was a dairy herd or not, reflecting the larger number of indirect contacts expected daily in herds with milking animals. Outbreaks starting in pig herds in general resulted in an average epidemic size between those of milking and nonmilking cattle herds. The main impact of starting epidemics in pig herds was the higher number of animals that were culled, a reflection of their much larger herd size (see Presentation S1 in Supplementary Material for herd statistics). As epidemics were generally small, with only a few herds being culled, the size of the seeding herd had a high impact on the total number of culled animals. Only the number of animals culled is shown in **Figure 2**, since the number of herds culled was almost always the same as the number of infected herds (Table S2-1 in Supplementary Material).

The number of pig herds in Sweden is very small, and as a result, epidemics that started in pig herds were ultimately driven by spread among cattle herds, as we could conclude from extensive analysis of the base control scenarios. Since cattle herds seemed to be driving spread, and the contact network (direct and indirect contacts) was the main driver of the epidemic size, a "typical outbreak scenario" was chosen as one starting in a cattle herd in the south of Sweden, with an average number of yearly direct contacts. This scenario was chosen to test the effect of alternative control measures, as shown in **Table 2**.

The main result for all scenarios designed to evaluate the effect of alternative control measures (listed in **Table 2**) was a remarkable lack of variation between these scenarios, as demonstrated for a few selected scenarios in **Figure 3**, and for all scenarios in Presentation S2 in Supplementary Material (Figure S2-1 in Supplementary Material). Late detection (modeled as a PERT distribution from 21 to 25 days, with a most likely value of 23 days) increased the epidemic duration and the maximum observed number of infected herds, but not the median number of infected herds. In all the different scenarios simulated the median number of infected herds was 3, and the 95% percentile ranged from 12 to 15 for all scenarios but late detection, in which the 95% percentile for the number of infected herds was 19. Ring culling had only a marginal effect in reducing the epidemic duration, and did not reduce the number of herds infected. Ring vaccination reduced neither the epidemic duration nor the number of herds infected.

FIGURE 3 | Results of selected scenarios comparing alternative control measures and amount of resources available. Scenario labels are as presented in Table 2. Individual box plots represent the summary of 1,000 iterations for each scenario. Red lines mark the median for all the iterations in the "typical outbreak scenario" against which all measures are compared (first box plot), and the dashed lines represent the 25 and 75% percentiles for that scenario.
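For readers unfamiliar with the PERT distribution used for late detection, the sketch below draws detection days using the common beta-distribution representation of a PERT with shape parameter 4. The shape parameter, the function name, and the use of continuous (unrounded) days are assumptions made for illustration; the text only specifies the minimum (21), most likely (23), and maximum (25) values.

```python
import random


def sample_pert(minimum: float, mode: float, maximum: float,
                rng: random.Random, lam: float = 4.0) -> float:
    """Draw one value from a PERT distribution via its beta representation."""
    span = maximum - minimum
    alpha = 1.0 + lam * (mode - minimum) / span
    beta = 1.0 + lam * (maximum - mode) / span
    return minimum + span * rng.betavariate(alpha, beta)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
detection_days = [sample_pert(21, 23, 25, rng) for _ in range(10_000)]
```

Under these assumptions the distribution is symmetric around day 23, so the sampled detection days average close to 23 and never fall outside the 21 to 25 day range.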

The "typical outbreak scenario" was examined on a daily basis, focusing on the number of herds put in the surveillance list daily, in comparison to the number of available teams. The results (Figure S2-3 in Supplementary Material, Presentation 2) showed that the number of herds to be visited per day only exceeded the capacity of surveillance immediately after detection in the median outbreak, with herds waiting at most a day to be visited in the cases when the outbreak was controlled within a week. Epidemics in iterations placed above the 90% percentile could take up to 19 days to be controlled. In those cases the number of days a herd would have to wait to be visited could be as high as 14, with daily medians ranging from 0 to 8.

Sensitivity analyses were performed using three "worst-case scenarios" to magnify the observed effectiveness of control measures, which could have been hard to observe in the small outbreak sizes associated with the "typical outbreak scenario." The results were robust across the range of parameters tested (see sensitivity analysis section in **Table 2**), except for one: the day of detection (Figure S2-2 in Supplementary Material, Presentation 2). **Table 2** lists the 19 scenarios evaluated based on worst-case scenario A (the cattle scenario with the highest expected epidemic size and duration, starting in a milking herd). If we exclude the two scenarios in which late detection was tested, the median number of infected herds for all other 17 scenarios ranged from 7 to 8, and the median epidemic duration was always 11 days (from detection of the first until detection of the last infected herd). Each week of delayed detection doubled the median number of infected herds, resulting in medians of 16 and 32 herds for the scenarios of detection on days 28 and 35, respectively. The median epidemic durations for the late detection scenarios were 14 and 18 days.

For worst-case scenario B (starting in a pig herd with a great number of direct contacts), the median number of infected herds in the 17 scenarios tested with detection on day 21 (but varying the efficacy of various control measures) ranged from 10 to 11, and the median epidemic duration was always 13 days. Detection on days 28 and 35 increased the median number of infected herds to 30 and 61, respectively, and resulted in median epidemic durations of 17 and 21 days.

When the epidemic was seeded in four cattle herds at the same time (worst-case scenario C), but detection was not delayed, the median number of infected herds in all sensitivity analysis scenarios evaluated varied between 19 and 20 herds, with median epidemic duration varying between 15 and 16 days. Detection on days 28 and 35 increased the median number of infected herds to 40 and 87, respectively, and resulted in a median epidemic duration of 19 and 24 days.
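The effect of each extra week of delay can be read directly from the medians reported above. The sketch below computes week-on-week ratios; where the text gives a range for detection on day 21 (7–8, 10–11, 19–20), the upper end is used here as a simplifying assumption, and the function name is illustrative.

```python
# Median number of infected herds by detection day, as reported in the text.
MEDIAN_INFECTED = {
    "A": {21: 8, 28: 16, 35: 32},
    "B": {21: 11, 28: 30, 35: 61},
    "C": {21: 20, 28: 40, 35: 87},
}


def weekly_ratios(by_day):
    """Ratio of each median to the median one week of detection delay earlier."""
    days = sorted(by_day)
    return [by_day[later] / by_day[earlier]
            for earlier, later in zip(days, days[1:])]


ratios = {name: weekly_ratios(values) for name, values in MEDIAN_INFECTED.items()}
```

With these inputs every ratio is at least 2, which is consistent with the statement that each week of delayed detection roughly doubles the median number of infected herds.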

Sensitivity analysis for the transmission parameters showed that the results were very robust to changes in individual transmission parameters. As for the previous analysis, this was particularly true in scenarios with a low expected number of infected herds. In the "typical outbreak scenario," for instance, changes of up to 20% in the probability of transmission following direct contact did not change the median number of infected herds. Evaluation of the percentage of all transmission events, over all iterations in that scenario, showed that about 45% were a result of direct contact, and 5% of movement to slaughter. This small share explains the robustness of the model to changes in the probabilities of transmission associated with slaughter movements. About 28% of the transmission events were due to indirect contact (low and medium risk contacts), and 22% due to local spread. The probability of local spread within 100 m was kept high in all scenarios to ensure transmission between herds within the same farm. As expected, increases in the probability of transmission for other distances resulted in a higher number of infected herds, but a fivefold increase in the probability of transmission within 1 km, for instance, only increased the median number of infected herds in the typical scenario by about two herds.

Based on the results of scenarios presented above, a cutoff of 10 detected infected herds was set as a decision point for when authorities should start considering that the outbreak was not being brought under control. In all base scenarios the expected number of infected herds was under 10, and only higher in scenarios with multiple starting seeds or failures in the effectiveness of control measures. The effect of deciding to implement ring culling or vaccination after this threshold was reached was evaluated in the chaos scenarios, and results are presented in **Figure 4**.
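The decision point can be expressed as a simple check on the cumulative number of detected herds. In the sketch below the function name and the daily detection counts used to call it are hypothetical, and treating the threshold as reached when the cumulative count is greater than or equal to 10 is an assumption, since the text does not state whether the cutoff is inclusive.

```python
from itertools import accumulate
from typing import Optional, Sequence


def trigger_day(daily_detections: Sequence[int], threshold: int = 10) -> Optional[int]:
    """Return the first day (1-based, counted from first detection) on which the
    cumulative number of detected infected herds reaches the threshold."""
    for day, total in enumerate(accumulate(daily_detections), start=1):
        if total >= threshold:
            return day
    return None  # threshold never reached in this iteration
```

An iteration that never reaches the threshold returns `None`, which corresponds to an outbreak handled with the base control measures alone.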

The scenario with infection seeded in four cattle herds in the south of Sweden at the same time, and detection after 4 weeks (base chaos scenario), resulted in a median number of 42 infected herds (95% percentile of 83 herds), and an epidemic duration of 20 days between detection of the first and last herd (95% percentile at 33 days). This assumes that all base control measures would be applied, and that the surveillance capacity would be at a regular level, but that all applied control measures would be 15% less effective than in the base scenarios (for instance effectiveness of enforcement of the standstill, and effectiveness of tracing). Increasing the period of standstill was not effective in reducing either the number of infected herds or the epidemic duration. Ring vaccination was not effective in reducing the median number of herds infected, although the median epidemic duration was reduced by 1 day (median 19 and 95% percentile of 28 days).

The implementation of preemptive depopulation of all susceptible animals, in a radius of 1 km around every infected farm, would reduce the median epidemic duration in the chaos scenarios by 4 days (median 16 days; 95% percentile at 26 days). The median number of infected herds was reduced to 38 (95% percentile at 72 herds). As a consequence of the reduction in the number of infected herds (fewer surveillance zones to be established), the median number of visited herds was reduced from 1,075 in the base chaos scenario to 875 when culling was applied (95% percentiles at 1,439 and 1,327, respectively). The median number of culled herds, however, was increased from 42 to 86 (95% percentiles at 82 and 179, respectively), and the median number of animals culled from 6,682 to 12,789 (95% percentiles at 23,002 and 29,897, respectively, for the scenarios without and with preemptive depopulation).
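The trade-off reported in this paragraph can be made explicit with a short calculation on the medians. This is only a back-of-the-envelope sketch: differences between medians are not the same as medians of per-iteration differences, and the dictionary keys and function name are illustrative.

```python
# Medians reported in the text for the chaos scenarios.
BASE = {"infected": 42, "culled_herds": 42, "culled_animals": 6682, "visited": 1075}
PREEMPTIVE = {"infected": 38, "culled_herds": 86, "culled_animals": 12789, "visited": 875}


def tradeoff(base, alt):
    """Compare an alternative strategy against the base strategy using medians."""
    averted = base["infected"] - alt["infected"]
    extra_herds = alt["culled_herds"] - base["culled_herds"]
    return {
        "infected_herds_averted": averted,
        "extra_herds_culled": extra_herds,
        "extra_herds_culled_per_averted": extra_herds / averted,
        "extra_animals_culled": alt["culled_animals"] - base["culled_animals"],
        "visits_saved": base["visited"] - alt["visited"],
    }


result = tradeoff(BASE, PREEMPTIVE)
```

On these medians, preemptive depopulation averts 4 infected herds and saves 200 surveillance visits at the price of 44 additional culled herds, that is, 11 extra herds culled per infected herd averted.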

FIGURE 4 | Results of spread under a scenario of "chaos," with only base control measures in place, and with implementation of additional controls. All scenarios are further detailed in Table 2. Individual box plots represent the summary of 1,000 iterations for each scenario. Red lines mark the median for all the iterations in the scenario with base control measures (first box plot), and the dashed lines represent the 25 and 75% percentiles for that scenario.

#### DISCUSSION

A disease spread model was adapted to the Swedish livestock structure to evaluate the effect of different control strategies and inform FMD preparedness in Sweden. Results showed that an FMD introduction in Sweden will most likely spread slowly and be readily contained with adoption of a control strategy combining the control measures required in the current EU legislation for the control of FMD (Council Directive 2003/85/EC), and a national standstill. The detailed control strategy is: a 3-day prohibition of all movements of susceptible animals after first detection (standstill), 3 km protection zones and 10 km surveillance zones around every detected farm, and culling of all animals in detected farms and their high risk contacts.

The results of the model are not meant to be interpreted as a strictly quantitative representation of reality. The application of models to decision-making, in general, should serve primarily as a means for comparing the effectiveness of different control measures, and assessing the comparative magnitude of various scenarios to understand the main outbreak drivers and the most important control targets (16). While we do not expect the model to tell us the exact number of herds that would be affected by an FMD outbreak in Sweden, for instance, the range of results evaluated gave us the expected dimension of the problem, in particular when compared among scenarios within this work, and also when compared to results from other countries.

Our results are in direct contrast to those observed when the same model was applied in Denmark, where the adoption of additional measures such as protective vaccination and ring depopulation was concluded to be cost-effective in most scenarios of spread (12). The contrasting conclusions, however, increase confidence that the results observed are not an artifact of the model, and highlight the impact that the specific characteristics of the Swedish livestock structure had on the model results. In comparison to Denmark, Sweden is characterized by a low density of farms, with much smaller herd sizes on average, and most particularly, a small pig industry (23–25). Many farms also have very limited trade of live animals (26). In Finland, where cattle and pig farms are also typically family owned and small in size compared to the rest of Europe, and where the livestock industry has also been decreasing in recent years, results of a risk assessment published in 2011 were similar to the ones presented here (27). The authors concluded that a possible FMD outbreak in Finland would be controlled within 5 weeks of introduction, affecting on average four farms, and even the larger expected outbreaks would involve few farms and be promptly controlled.

Another difference to the original Danish model is that a reduction in the number of indirect contacts between farms after outbreak detection was assumed. The assumption that people would reduce all unnecessary traffic from and to their farms, once an outbreak is known to be occurring in the country, is based on feedback from farmers (28). It is also informed by the experience of our group through several outbreaks (of diseases other than FMD) and in particular a change in behavior noted in the country during the FMD outbreak in the UK in 2001.

The model was scrutinized by individual evaluation of multiple iterations per scenario, and mapping of every modeled transmission event, including the mode of transmission (direct or indirect contact). This confirmed that epidemic size was mainly driven by infected cattle farms. It also confirmed the expected effect of long distance movements in keeping North and South of Sweden highly inter-connected (29).

The choice of a predefined detection date (21 days after seeding the infection) was based on an extensive review of information from previous outbreaks performed by the Danish team that developed the DTU-DADS model (12, 13). Complementary work (not presented in this paper) trying to estimate the detection date based on the probability of animals showing clinical signs, and the documented efficacy of passive surveillance in Sweden, suggested that 21 days is a conservative assumption. Relaxation of this assumption, i.e., assuming a later detection, was the single parameter with the most impact on the epidemic size. Sensitivity analysis showed that each week of later detection generally doubled the expected total number of infected herds by the time the epidemic is controlled. The epidemic duration (i.e., from day of detection to day of detection of last infected herd), however, showed remarkable robustness when compared to the number of infected herds: the median epidemic duration was increased by only 3–4 days when detection was delayed by 1 week, and by another 4–5 days for an extra week of delay. This highlights that surveillance resources were rarely exceeded, and the base control measures modeled were sufficient to cope with outbreaks of dimensions much larger than what was considered the "typical outbreak scenario" for Sweden.

The DTU-DADS model (in the version used to carry out this work, 0.15) did not allow adjustment of the surveillance capacity along the outbreak, that is, surveillance resources are fixed for the whole period of the epidemic. The surveillance capacity used in this model was based on what the Swedish Board of Agriculture considered feasible to gather in the first 1–3 days after detection of the first suspicion (and therefore arguably at the same time or shortly after confirmation). In reality, the number of surveillance resources could be increased after a few days of outbreak control. In the "typical outbreak scenario," only in exceptional epidemics the number of herds that needed to be visited for clinical surveillance was greater than the number of teams available for field visits (see Figure S2-3 in Supplementary Material), and a herd queue was generated. The culling capacity, however, was never exceeded. Most of the farms in the model had herds smaller than the daily culling capacity per team declared by the Swedish Board of Agriculture and used in the model. Moreover, the number of herds that needed to be culled in the same day was very small.

The relatively small expected outbreak size and low demand for surveillance resources in all scenarios resulted in a high observed efficacy of the base control measures. In all evaluated scenarios, even the most chaotic ones, an FMD epidemic is expected to be controlled within 3 weeks from the detection of the first case. The number of herds infected is small, and most of the surveillance effort needed will be to visit farms that fall into the surveillance zones around each infected farm, to rule out infection. Surveillance capacity was not often exceeded. In epidemics that took longer than 2 weeks to control, herds could eventually wait longer than 2 days to be visited by a surveillance team. However, herds in the queue were those that needed to be visited because they fell within the surveillance zone. Suspected farms and high risk contacts are given priority in the surveillance visiting list, and therefore can be visited on the day of detection/tracing, as long as the number of infected herds and their direct contacts is below the number of surveillance teams, as was the case in all scenarios evaluated.

The base control measures were not only predicted to be effective, they were also robust. Reductions of up to 40% in the efficacy of a single measure can be compensated if everything else is assumed to be working properly. The number of infected herds was more sensitive to failures in control than the expected epidemic duration, due to the reasons discussed above.

Direct contact and local spread were the main modes of disease transmission. The central role of direct contact transmission is expected (30). In this model, the high percentage of local spread transmission is a consequence of the way the model was set up. Individual herds are modeled independently, and transmission between herds in the same farm is enforced by setting a high probability of local transmission within 100 m. The model was also sensitive to the probability of transmission set for the other distance radii. In this model transmission events are modeled individually, and the addition of a local spread component was meant only to reflect any residual transmission not accounted for after modeling direct and indirect contacts explicitly.

The worst-case scenarios observed were those related to multiple introductions at the same time, and delayed detection of introduction. Even in those cases, a reduction in the expected number of infected herds as a result of the application of vaccination could not be demonstrated. Preemptive depopulation had an effect in reducing the median number of infected herds when very large epidemics were modeled (multiple introductions and late detection). Considering, however, that this measure would double the median number of herds and total animals to be culled, cost–benefit analysis will be needed to determine whether the benefits of applying this measure would justify the costs both in resources and animal welfare. As the current results indicate that the effect of preemptive depopulation can only become relevant for very large epidemics, this measure should only be considered after a large number of infected herds have been detected. Models exploring scenarios of FMD spread in the UK and Denmark have shown that ring culling can have a positive effect in specific circumstances (17, 31). We have, as those authors, concluded that the effect is not very pronounced, and more extensive analysis will be needed to determine the exact conditions under which an outbreak may have become large enough to justify preemptive depopulation.

While the base control strategy recommended based on this work is expected to be effective, it should be highlighted that the overall costs to society and governmental agencies, as well as the workload, should not be overlooked. Effective control is associated with prompt implementation of a contingency strategy that would require deployment of 40 field surveillance teams per day and capacity to destroy thousands of animals per day. For the control to be efficient, additional teams at central and regional level are also needed to work on contact tracing, data analysis, dissemination of information, logistics, etc., although these functions have not been included as a limiting factor in the model.

In summary, a potential FMD outbreak in Sweden is expected to be small and controlled fast through a 3-day national standstill, application of surveillance zones around infected farms, and culling of all animals in detected farms and their high risk contacts. This result is based on the assumption that detection would not be delayed by more than 4–5 weeks after introduction, that these measures would be enforced quickly after detection, and that the effectiveness of these control measures can be expected to fall within the range of values evaluated in this work.

#### AUTHOR CONTRIBUTIONS

FD adapted the infectious disease model to Sweden, parameterized and ran the model, summarized results, and wrote the manuscript. MN, JF, and KS helped parameterize the model, set relevant scenarios to be evaluated, and analyze and interpret results. SW helped adapt the model to Sweden, prepare Swedish data, and evaluate model behavior. AB and TH wrote the initial infectious disease model, trained the group into using it, and helped inspect model behavior and interpret results once the model was adapted to Sweden.

#### ACKNOWLEDGMENTS

The authors thank the Swedish Board of Agriculture (Diana Viske, Vida Jordén, Thomas Svensson, Håkan Henriksson, and Bengt Larsson), Svensk Lantbrukstjänst (Mikael Lidholm), and Jord på trynet (Mats Schörling) for their invaluable support in informing the parameters for the model, and help "asking the model the right questions." Financial support for this work was given by the Swedish Civil Contingencies Agency under grant program 2:4 on Emergency Preparedness/Civil Contingency.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at http://journal.frontiersin.org/article/10.3389/fvets.2017.00118/full#supplementary-material.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Dórea, Nöremark, Widgren, Frössling, Boklund, Halasa and Ståhl. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **Simulating the Epidemiological and Economic Impact of Paratuberculosis Control Actions in Dairy Cattle**

*Carsten Kirkeby<sup>1</sup>\*, Kaare Græsbøll<sup>1,2</sup>, Søren Saxmose Nielsen<sup>3</sup>, Lasse E. Christiansen<sup>2</sup>, Nils Toft<sup>1</sup>, Erik Rattenborg<sup>4</sup> and Tariq Halasa<sup>1</sup>*

*<sup>1</sup>DTU VET, Section for Epidemiology, Technical University of Denmark, Frederiksberg, Denmark, <sup>2</sup> DTU Compute, Section for Dynamical Systems, Department of Applied Mathematics and Computer Science, Technical University of Denmark, Frederiksberg, Denmark, <sup>3</sup> Section for Animal Welfare and Disease Control, Department of Large Animal Sciences, University of Copenhagen, Frederiksberg, Denmark, <sup>4</sup> SEGES, Agro Food Park, Aarhus, Denmark*

We describe a new mechanistic bioeconomic model for simulating the spread of *Mycobacterium avium* subsp. *paratuberculosis* (MAP) within a dairy cattle herd. The model includes age-dependent susceptibility for infection; age-dependent sensitivity for detection; environmental MAP build up in five separate areas of the farm; *in utero* infection; infection *via* colostrum and waste milk, and it allows for realistic culling (i.e., due to other diseases) by including a ranking system. We calibrated the model using a unique dataset from Denmark, including 102 random farms with no control actions against spread of MAP. Likewise, four control actions recommended in the Danish MAP control program were implemented in the model based on reported management strategies in Danish dairy herds in a MAP control scheme. We tested the model parameterization in a sensitivity analysis. We show that a test-and-cull strategy is on average the most cost-effective solution to decrease the prevalence and increase the total net revenue on a farm with low hygiene, but not more profitable than no control strategy on a farm with average hygiene. Although it is possible to eradicate MAP from the farm by implementing all four control actions from the Danish MAP control program, it was not economically attractive since the expenses for the control actions outweigh the benefits. Furthermore, the three most popular control actions against the spread of MAP on the farm were found to be costly and inefficient in lowering the prevalence when used independently.

#### **Keywords: bioeconomic model, dairy cow, MAP, paratuberculosis, simulation model**

#### **INTRODUCTION**

Paratuberculosis is a chronic infection in ruminants caused by *Mycobacterium avium* subsp. *paratuberculosis* (MAP), resulting in financial losses to the dairy industry worldwide (1), where the prevalence of infected farms is believed to be substantial (2). Infected cattle can be subclinically infected for years until the animals develop acute diarrhea and eventually die. Infected animals also exhibit a decline in milk production. The annual economic loss due to MAP infection has been estimated to be as high as \$200 million in the US alone (3). In Denmark, a national voluntary MAP control program was initiated in 2006, and in 2013, the median true between- and within-herd prevalences among 925 herds participating in the control program were estimated to be 77 and 7%, respectively (4).

*Edited by:*

*Eyal Klement, The Hebrew University, Israel*

#### *Reviewed by:*

*Dannele E. Peck, University of Wyoming, USA Karin Orsel, University of Calgary, Canada*

> *\*Correspondence: Carsten Kirkeby ckir@vet.dtu.dk*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

> *Received: 12 August 2016 Accepted: 26 September 2016 Published: 10 October 2016*

#### *Citation:*

*Kirkeby C, Græsbøll K, Nielsen SS, Christiansen LE, Toft N, Rattenborg E and Halasa T (2016) Simulating the Epidemiological and Economic Impact of Paratuberculosis Control Actions in Dairy Cattle. Front. Vet. Sci. 3:90. doi: 10.3389/fvets.2016.00090*

Simulation models have been used in evaluating the impact of different actions on the prevalence and spread of MAP in dairy herds [e.g., Ref. (5–8)]. These models predict that the within-herd true prevalence increases from 50 to 90% if no control actions are implemented (6–11). However, the within-herd true prevalence on farms in Denmark is much lower, around 7% (4), indicating an endemic situation with a stable prevalence. The previous models (mentioned above) are frequency-dependent models, in which the probability of infection depends directly on the number of infectious animals. Such models are suited for simulating epidemic situations [as described by Ryder et al. (12)]. Nevertheless, paratuberculosis is a slowly progressing disease of endemic nature (7) and, hence, the chosen simulation model should reflect this nature. Therefore, we chose a density-dependent model, in which the probability of infection is dependent on the density of MAP in the herd. This model is suitable for modeling disease spread in endemic situations, especially when pathogens are spread through the environment (12). The objective of our study was to build a bioeconomic model framework, PTB-iCull, calibrated to field data. Here, we describe the model and show how we used it to estimate the economic and epidemiologic impact of recommended MAP control actions from Denmark's paratuberculosis control program.

### **MATERIALS AND METHODS**

The PTB-iCull model is a stochastic, mechanistic, and dynamic discrete event simulation model that deals with the spread of MAP within a dairy herd in Denmark. It is written in R (13), and the current version of the model simulates a closed herd (without purchase of livestock) with a constricted herd size. The model consists of two main components: a herd dynamics component (LifeStep component) and a disease dynamics component. The time periods determining the life stage for each animal and the durations of the disease states of MAP infection are stochastically drawn from relevant distributions (Tables S1 and S2 in Supplementary Material). We used a dataset from the Danish Cattle Database hosted by SEGES (www.seges.dk) including milk records from 293,929 individual cows on 610 farms, recorded between year 2000 and 2013. In total, almost 5 million records were used to parameterize the cows in the model with regard to milk production and somatic cell count (SCC). We also used another dataset comprising 102 randomly chosen herds that were not enrolled in the Danish MAP control program. These herds were tested in August 2011 due to a sampling error where all milk-recorded herds were tested instead of just those in the control program. The error was detected after 5 days, so this cohort was considered as a random selection of non-program herds (program herds were excluded). All lactating animals in the herds were tested using the ID Screen (IDvet, Graebels, France) ELISA for detection of MAP (see also Section "ELISA" for test characteristics).

The simulation process is as follows: first, an initial herd is generated. The proportions of heifers, milking cows, and all other life steps are chosen to create a stable model, with regard to the number of animals in each life stage (Table S3 in Supplementary Material). In this study, the model represents a closed system (no purchase of animals), but it is possible to simulate an open herd. For each time step (1 day), the model tracks and updates the age of all animals in the herd, days in milk (DIM, the number of days a cow has been milking in the current lactation), the number of days that remain in the present life stage of each animal, and the number of days each animal has spent in the present disease state. For each day, we calculated the animal units in the herd and the number of slaughtered animals in each disease state.

Here, we use the model to simulate different scenarios. In each scenario, the farmer uses a different strategy to control MAP on the farm, from no control to implementation of three control actions, and a test-and-cull strategy.

#### **Herd Dynamics**

The model simulated a herd where the animals are kept indoors throughout the year. The cattle pass successively through the life stages in the model: calf; heifer; inseminated heifer; pregnant heifer; early lactation stage (after calving); inseminated cow; pregnant cow; and dry cow, and then again to the early lactation stage of next parity and so on. An animal can be culled at any stage of its life, which is modeled based on distributions in the Danish cattle population (Table S3 in Supplementary Material). We used the initial number of lactating cows as the maximum number of lactating cows during the simulated period. A typical Danish farm is divided into five sections based on the life stage of the animal. In the simulation model, animals are placed in one of the five farm sections: calves (0–1 year old), heifers (1 year old until first calving), lactating cows, dry cows, and calving pens. This reflects a common structure of farms in Denmark and allows us to simulate the spread of MAP within each section.

#### Insemination

When a heifer or a cow is inseminated, the insemination success (and hence the probability of continuing into pregnancy) is given by the probability of detecting the heat and the probability of conception following insemination. Of the unsuccessful inseminations, 90% (default) will wait 41 days before another insemination is attempted. The remaining 10% (default) will only wait one estrous cycle (21 days) before a new attempt, corresponding to the proportion of cows failing to conceive from an insemination. The default maximum number of insemination attempts before a cow is culled is seven (expert opinion).

#### Pregnancy and Calving

When an animal conceives, the number of days for pregnancy is drawn from a normal distribution (Table S1 in Supplementary Material). During the last stage of pregnancy, a cow is given a number of days in the dry period (Table S1 in Supplementary Material). Half of the calves are bull calves and are sold from the farm at a given price [€161 per calf (14)], and 4% of the calves are stillborn (Table S4 in Supplementary Material). Female calves proceed in the herd and are raised to heifers. After calving, the dam enters the early lactation stage where it produces milk. The number of days spent in the early lactation stage is drawn from a normal distribution (Table S1 in Supplementary Material). After the early lactation stage, the cows are inseminated.

#### Culling

All heifers are inseminated, calve, and are put in the milking section. If there are more than 200 cows in the milking section, the excess number will be culled once per week. Culling is divided into two parts: voluntary and involuntary culling. Involuntary culling includes animals that are injured or subjected to other diseases and therefore sent to slaughter. These are randomly selected, but the probability of culling is dependent on the parity in order to balance the demographic structure of the herd. Data on the reasons for culling show that about 33% of cases are voluntary (Kasper Krogh, SEGES, personal communication, 2014). Voluntary culling in the model is carried out by prioritizing which cows should be culled, based on the information about simulated milk production, reproduction status, SCC, and repeated MAP ELISA values. We simulate quarterly observations of the milk production and SCC level for each cow. In practice, this is done by updating two cow-specific indicator measures, one for milk yield and one for SCC, every time there is a new observation. The observed level of milk or SCC for each cow is a weighted measure with 35% weight on the latest measurement and 65% on the previous value of the indicator. This results in an exponential smoothing mechanism where the four latest measurements account for 73% of the information, to mimic a farmer's decision. Furthermore, the farmer can use flags to mark and prioritize cows for culling as in the national Danish Cattle Database (SEGES, Aarhus, Denmark). A cow receives one flag for each of four categories in which it exceeds a specified value: (1) milk yield is in the lowest 20% of cows on the farm; (2) number of insemination attempts in the current lactation is seven (default) or more; (3) observed SCC is above 200,000/ml (default); and (4) if the test-and-cull strategy is used, a minimum of two of the last four MAP ELISA values are positive (according to the Danish MAP control program). Cull rates for each parity estimated from the dataset from SEGES (parity 1: 26%, parity 2: 40%, parity 3: 51%, parity 4: 59%, parity 5: 65%, and parity 6: 70%) are then added numerically to the number of flags per cow to balance the cullings. We kept the income from a culled cow fixed in the model at €483 [600 kg × €0.805, as listed in Kudahl et al. (14)]. Besides culling, cows can die due to background mortality at a cost of €79 per carcass (from the rendering plant "DAKA SecAnim" 2014, see Table S4 in Supplementary Material).
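The indicator update and the flag count described above can be illustrated with a minimal Python sketch. It is not the PTB-iCull code (which is written in R); the function names, the input layout, and the treatment of parities above six are assumptions made for illustration only. Under this simple recursion, the three most recent observations carry about 73% of the total weight and the four most recent about 82%, so the exact weighting used in the model may differ from this sketch.

```python
# Hypothetical helper names; only the numeric defaults come from the text.
PARITY_CULL_RATE = {1: 0.26, 2: 0.40, 3: 0.51, 4: 0.59, 5: 0.65, 6: 0.70}


def update_indicator(previous, latest, weight_latest=0.35):
    """Smoothed indicator: 35% weight on the latest value, 65% on the old one."""
    return weight_latest * latest + (1.0 - weight_latest) * previous


def combined_weight(k, weight_latest=0.35):
    """Total weight carried by the k most recent observations."""
    return 1.0 - (1.0 - weight_latest) ** k


def count_flags(low_yield, inseminations, scc, elisa_last4,
                test_and_cull=False, max_inseminations=7, scc_limit=200_000):
    """Count culling flags for one cow across the four categories."""
    flags = 0
    flags += bool(low_yield)                       # (1) lowest 20% milk yield
    flags += inseminations >= max_inseminations    # (2) insemination attempts
    flags += scc > scc_limit                       # (3) observed SCC per ml
    if test_and_cull:
        flags += sum(elisa_last4[-4:]) >= 2        # (4) >= 2 of last 4 positive
    return flags


def culling_score(flags, parity):
    """Flags plus the parity cull rate; parities above 6 reuse the parity-6
    rate, which is an assumption of this sketch."""
    return flags + PARITY_CULL_RATE[min(parity, 6)]
```

Cows with the highest score would be the first candidates for voluntary culling in such a ranking.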

#### Milk Yield

The model simulates a non-quota system without any assumptions about financial support or delivery contracts. Milk yield is recorded in kilograms of ECM (energy-corrected milk yield). We assigned an individual milk production level to each cow, relative to the other cows on the same farm. From this individual milk production level, we modeled the daily milk yield with two cow-specific parameters using the Wood lactation curve (15). We used a daily variation (SD) in the milk yield of 0.1.

Heifers inherit the milk production level from their dam. Animals get a new shape parameter for each parity. The shape parameter, *S*, is drawn from an exponential distribution:

$$S \sim \exp\left(\lambda\right) \tag{1}$$

where λ is 6.735207. The individual milk level, α<sup>M</sup>, for a cow will be inherited by its offspring, with a regression tendency toward the mean:

$$\alpha^{\rm M} = N \left( 1 + \left( \alpha_{\rm dam}^{\rm M} - 1 \right) \cdot 0.13, \ 0.19 \right) \tag{2}$$

where α<sup>M</sup><sub>dam</sub> is the milk level of the dam, and 0.19 is the SD. We calculated DIM for each cow (and heifer) from when they have calved to when they are dried off. The farmer discards the milk in the first 2 days of each lactation period.
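As a rough illustration of Eqs 1 and 2, the following Python sketch draws a shape parameter and an inherited milk level using only the standard library. The function names are hypothetical, and reading λ as the rate of the exponential distribution (as R's `rexp` does) is an assumption; the text does not state the parameterization.

```python
import random

LAMBDA_SHAPE = 6.735207   # lambda in Eq. 1, read here as a rate (assumption)
REGRESSION = 0.13         # regression toward the herd mean of 1 (Eq. 2)
SD_MILK_LEVEL = 0.19      # SD in Eq. 2


def expected_offspring_level(dam_level):
    """Mean of the offspring milk level given the dam's level (Eq. 2)."""
    return 1.0 + (dam_level - 1.0) * REGRESSION


def draw_offspring_level(dam_level, rng):
    """Draw an offspring milk level from the normal distribution in Eq. 2."""
    return rng.gauss(expected_offspring_level(dam_level), SD_MILK_LEVEL)


def draw_shape(rng):
    """Draw a lactation-curve shape parameter (Eq. 1)."""
    return rng.expovariate(LAMBDA_SHAPE)


rng = random.Random(2016)          # fixed seed so the sketch is reproducible
level = draw_offspring_level(1.2, rng)
shape = draw_shape(rng)
```

A dam that produces 20% above the herd average (level 1.2) therefore has offspring with an expected level of only 1.026, which shows how strongly the inherited level regresses toward the mean.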

#### Somatic Cell Count

We modeled the SCC per cow during each lactation period. The values are generally inversely proportional to the milk yield over a lactation and have been parameterized by fitting a Wilmink-style curve (16) to SCC data from the large dataset. The SCC level, α<sup>C</sup>, for each animal is drawn from a normal distribution as estimated from the dataset of Danish dairy herds:

$$\alpha^{\rm C} = N(1, \ 0.051) \tag{3}$$

where 0.051 is the cow-specific variation estimated from the data. The SCC for the bulk tank milk is calculated for each day using a daily variation (SD) of 0.043, which is normally distributed before the log–log transformation.

We simulated a milk price according to the rules of Arla Foods (17). If the farmer has a bulk tank milk SCC count lower than 200,000 ml<sup>−1</sup>, the milk price increases by 2%. If the bulk tank milk SCC count is between 201,000 and 300,000 ml<sup>−1</sup>, the price increases by 1%. If the bulk tank milk SCC count is between 401,000 and 500,000 ml<sup>−1</sup>, the price decreases by 4%, and if the bulk tank milk SCC count is higher than 501,000 ml<sup>−1</sup>, the price decreases by 10%.
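The price bands above can be written as a small lookup, sketched here in Python. The function name is hypothetical. The text leaves small gaps between the bands (for example, between 200,000 and 201,000) and does not mention the range from 301,000 to 400,000; this sketch assigns each gap to the adjacent higher band and assumes no price change in the unmentioned range.

```python
def scc_price_factor(bulk_tank_scc):
    """Multiplicative milk price adjustment for a bulk tank SCC (cells per ml).

    Band edges follow the text; the handling of values that fall between
    the stated bands is an assumption of this sketch.
    """
    if bulk_tank_scc < 200_000:
        return 1.02   # price increases by 2%
    if bulk_tank_scc <= 300_000:
        return 1.01   # price increases by 1%
    if bulk_tank_scc <= 400_000:
        return 1.00   # assumed neutral band (not stated in the text)
    if bulk_tank_scc <= 500_000:
        return 0.96   # price decreases by 4%
    return 0.90       # price decreases by 10%
```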

#### Feeding

We calculated the cost of feed for every simulated day. Farmers each have a specific feeding strategy, and therefore in order to include a standardized procedure in this model, we simulated a basic scenario for the cost of feed (Table S1 in Supplementary Material). For calves, the feeding costs are a linear function from €0 at day 1 to the daily heifer costs when they are 1 year old, resulting in feeding cost for a calf of €170 for their first year of life. For heifers and dry cows, we set the feed to cost €0.931 per day, adding up to €340 per year. The feed costs for raising a 2-year-old heifer are thus €510 (€170 + €340). For milking cows, we set the feed to cost €0.195 per kilogram of milk produced per day (18).
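The feed cost arithmetic can be checked with a short Python sketch. The function names are hypothetical, and the exact form of the linear ramp (day 1 to day 365) is an assumption; with that assumption the yearly totals reproduce the rounded figures of €170 and €340 given above.

```python
HEIFER_DAILY_COST = 0.931   # EUR per day for heifers and dry cows
MILK_FEED_COST = 0.195      # EUR per kg of milk produced per day


def calf_daily_cost(age_days):
    """Linear ramp from 0 EUR at day 1 to the heifer cost at day 365."""
    return HEIFER_DAILY_COST * (age_days - 1) / 364


def calf_first_year_cost():
    """Total feed cost over the first 365 days of life."""
    return sum(calf_daily_cost(day) for day in range(1, 366))


def heifer_yearly_cost():
    """Feed cost for one year as a heifer or dry cow."""
    return HEIFER_DAILY_COST * 365


def lactating_daily_cost(milk_kg):
    """Daily feed cost for a milking cow producing milk_kg of milk."""
    return MILK_FEED_COST * milk_kg
```

A cow producing 30 kg of milk per day would thus cost about €5.85 per day to feed under these defaults.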

#### **Disease Dynamics**

Infected animals go through the following states of the disease in succession: susceptible, low-shedding, high-shedding, and affected. The low-shedding state corresponds to a stage where the animal is infected, asymptomatic, and without detectable levels of MAP-specific IgG1, whereas the high-shedding state corresponds to the infection stage where the cow becomes less able to control the infection, with increasing amounts of MAP-specific IgG1 and an increase in excretion of MAP (19). To scale the amount of MAP shed in the low-shedding, high-shedding, and affected states, we set the shedding amount to between 0 and 100% of the possible shedding amount. Thus, low-shedders shed 5%, high-shedders shed 20%, and affected cows shed 100% of the possible amount (see Table S2 in Supplementary Material). Once a cow begins to show clinical signs such as reduced milk yield or diarrhea, it transfers to the affected state. The number of days spent in each disease stage is drawn from a specified distribution and assigned to each animal (Table S2 in Supplementary Material). Susceptible animals can be infected with MAP from the environment. MAP is shed in the manure of infected animals, and viable MAP bacteria, once introduced to a farm section, are capable of persisting in the environment there. In the model, we keep track of every cow, the amount of MAP it sheds, and in which farm section. We therefore calculated the bacterial load in each farm section per day, and the daily survival rate for MAP was modeled by:

$$\text{Surv}\,(day) = \exp\left(day \cdot \left(\frac{\log\left(0.01\right)}{385}\right)\right) \tag{4}$$

where *day* is the number of days that have passed since the bacteria were shed. This suggests that 99% of the bacteria will be dead after 385 days, in concordance with Whittington et al. (20).
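Eq. 4 can be evaluated directly, as in the Python sketch below. The function `environmental_load`, which sums the surviving fraction of each day's shedding, is a hypothetical illustration of how a section's load could be accumulated and is not taken from the model code.

```python
import math


def map_survival(days):
    """Fraction of shed MAP still viable after `days` days (Eq. 4)."""
    return math.exp(days * math.log(0.01) / 385)


def environmental_load(daily_shedding):
    """Surviving MAP in one section, given the amounts shed on each day.

    daily_shedding[0] is the oldest day and daily_shedding[-1] is today.
    How PTB-iCull accumulates the load is an assumption of this sketch.
    """
    n_days = len(daily_shedding)
    return sum(amount * map_survival(n_days - 1 - i)
               for i, amount in enumerate(daily_shedding))
```

With this decay rate, half of the shed bacteria are no longer viable after about 58 days, and 1% remain after 385 days.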

#### Contamination between Farm Sections

We also simulate the spread of MAP between farm sections on the dirty boots of personnel and wheels on machines (including contaminated tools). For each time step, the amount of cross-contamination of MAP from each section is calculated as:

$$\text{Spillover}_{j} = \left( \exp\left(\frac{-1}{N_{\rm P}}\right) \cdot S_{\rm P} + \exp\left(\frac{-1}{N_{\rm M}}\right) \cdot S_{\rm M} \right) \cdot MAP_{j} \tag{5}$$

where *N*<sub>P</sub> and *N*<sub>M</sub> are the numbers of personnel and machines, respectively, that can potentially transmit MAP between farm sections, *S*<sub>P</sub> and *S*<sub>M</sub> are the levels of cross-contamination from boots and machines, respectively, and *MAP*<sub>j</sub> is the amount of MAP shed in farm section j. Cross-contamination of MAP from all four sections is summed daily and divided equally between the sections to simulate an even spread of MAP on the farm. A machine on the farm by default takes 3% of the bacteria shed from all farm sections (maximum 8% of shed MAP can stick to machines) and divides it evenly into all sections again. The default for one farm worker is 0.3% of daily shed bacteria in each section (maximum 1% of shed MAP can stick to personnel).
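A minimal Python sketch of Eq. 5 follows. It assumes that the cross-contamination levels equal the stated maxima (1% for personnel and 8% for machines); with one worker and one machine, this gives roughly the default fractions quoted above (about 0.4% and 3%). The function names and the dictionary layout are hypothetical, and counts of zero are treated as contributing nothing.

```python
import math


def spillover(map_shed, n_personnel=1, n_machines=1,
              s_personnel=0.01, s_machines=0.08):
    """MAP carried out of one section per time step (Eq. 5).

    s_personnel and s_machines are set to the stated maxima, which is an
    assumption of this sketch.
    """
    fraction = 0.0
    if n_personnel > 0:
        fraction += math.exp(-1 / n_personnel) * s_personnel
    if n_machines > 0:
        fraction += math.exp(-1 / n_machines) * s_machines
    return fraction * map_shed


def redistribute(shed_by_section, **kwargs):
    """Sum the spillover from all sections and split it evenly among them."""
    total = sum(spillover(amount, **kwargs)
                for amount in shed_by_section.values())
    share = total / len(shed_by_section)
    return {section: share for section in shed_by_section}
```

Because of the term exp(−1/N), each additional worker or machine raises the transported fraction, but with diminishing increments toward the stated maximum.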

#### Risk of Infection from the Environment

The daily risk of obtaining an infection for the individual animals is modeled through the environment:

$$R_{j} = 1 - \left(1/\exp\left(\left(1/H\right) \cdot F \cdot MAP_{j}\right)\right) \tag{6}$$

where *R*<sub>j</sub> is the probability of each animal acquiring an infection with MAP shed in farm section j, *H* is the hygiene level on the farm, *F* is the force of infection parameter, which was calibrated in the model (see below), and *MAP*<sub>j</sub> is the amount of MAP shed in section j. MAP can thus accumulate in each farm section, but the survival of the bacteria decreases with time. It is possible to adjust the force of infection in the model, thus increasing or decreasing the risk of infection from the environment.

The hygiene level represents the likelihood that MAP will find a surface to stick to in the farm section, i.e., in principle a proxy of how clean the stable is.
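Eq. 6 is algebraically the same as 1 − exp(−F · MAP<sub>j</sub> / H), which the Python sketch below uses because it is numerically safer for large loads. The function name is hypothetical, and the values of the force of infection and the MAP load in any call are placeholders, since the calibrated values are not given here.

```python
import math


def infection_risk(map_load, hygiene, force_of_infection):
    """Daily probability of infection from the environment (Eq. 6).

    Uses 1 - exp(-x), which equals 1 - 1/exp(x), with
    x = (1 / hygiene) * force_of_infection * map_load.
    """
    x = (1.0 / hygiene) * force_of_infection * map_load
    return -math.expm1(-x)
```

The risk is zero when no MAP is present, approaches one as the load grows, and falls when the hygiene level *H* is raised, so doubling *H* has the same effect as halving the force of infection.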

#### Risk from Other Transmission Routes

Within the model, there are different transmission routes: *in utero* infection, infection from MAP in the environment of each farm section, infection from colostrum, and infection from waste milk. For *in utero* infection, we used the estimates from Whittington and Windsor (21), but for the latter three, we did not have a direct estimate from the literature. However, a previous Danish study (22) estimated that the annual reduction in the odds ratio of infection when calves were not fed waste milk from repeatedly test-positive cows was −0.05. Over a 5-year period, this effect corresponds to an odds ratio of exp(−0.05 · 5) = 0.78 (CI: 0.65–0.95). Similarly, the effect of not using colostrum from repeatedly test-positive cows was exp(−0.04 · 5) = 0.82 (CI: 0.67–1.00). In this model, we set the risk of infection from waste milk to 0 when the calves are not fed with waste milk from repeatedly test-positive cows. Likewise, we set the risk of infection from colostrum to 0 if the farmer did not feed calves with colostrum from repeatedly test-positive cows. To adjust waste milk risk, colostrum risk, and force of infection, we ran a series of simulations [with 500 iterations, 3-year burn-in period (initial simulation time required to stabilize the system) + 5 simulation years and 5.6% initial prevalence], varying these levels (data not shown). This gave a 3D parameter space for calibrating the model. We then chose the set of parameters that came closest (based on visual inspection) to (1) keeping the true prevalence stable at about 5.6% over the five simulated years, (2) yielding a 22% lower apparent prevalence when not feeding calves with waste milk from repeatedly test-positive cows, and (3) yielding an 18% lower apparent prevalence when not using colostrum from repeatedly test-positive cows. This approximation was deemed appropriate for calibrating the three levels of infection in the model to maintain a stable endemic status of MAP in the farm. If the farmer removed the calves from the dam at birth, the risk of infection from the dam was reduced by 95%, causing the apparent prevalence to drop to about 0.80% compared with no calves being removed. This was within the confidence limits for the estimated effect of this control action, which was previously estimated to reduce the risk of infection to 0.70% (0.53–0.95% CI) of the previous level (22). We did not want to reduce this risk to 0 when the control action was implemented, since the calf still faces some risk of infection from the dam *via in utero* transmission.
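The odds ratios quoted above, and the calibration targets of 22 and 18% derived from them, follow from simple arithmetic that can be reproduced in Python. The function name is hypothetical.

```python
import math


def cumulative_odds_ratio(annual_effect, years):
    """Odds ratio after `years` years for a constant annual log-scale effect."""
    return math.exp(annual_effect * years)


waste_milk_or = cumulative_odds_ratio(-0.05, 5)   # about 0.78
colostrum_or = cumulative_odds_ratio(-0.04, 5)    # about 0.82

# Calibration targets used in the text: relative reduction in percent.
waste_milk_target = round((1 - waste_milk_or) * 100)   # 22
colostrum_target = round((1 - colostrum_or) * 100)     # 18
```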

#### Model Calibration

In the model, we record both true and apparent prevalence within the herd. The true prevalence is observed from the number of infected (adult) cows, in all states of disease. The apparent prevalence is calculated from the number of (adult) cows that are test-positive. We used the maximum of the estimated prevalence (45% prevalence) to determine a low-hygiene scenario in a herd with high prevalence, hereafter referred to as the low-hygiene herd (**Figure 1**).

We calibrated the force of infection in the model, so the prevalence was stable over five simulated years. This time span was chosen because the estimated effects (odds ratios) of the control actions were based on a maximum of 4 years (22). Paratuberculosis has been present in Denmark for many years. Therefore, we assume that the within-herd prevalences have stabilized within each farm. Furthermore, we found no evidence in our data (**Figure 1**) of within-herd prevalences >45%. Even if some farms have such a high prevalence, they would be rare, and our aim was to represent average Danish farms.

We based our "no control" (baseline) scenario on the dataset of 102 farms described above. Age-specific sensitivity estimates along with the specificity (23) were used to estimate the true prevalence within these herds, using the approach described in Sergeant et al. (24). The resulting prevalences are shown in **Figure 1**.

#### Calves

At calving, calves have a probability, *P*j, of becoming infected:

$$P_{j} = 1 - \left(1 - D\right)\left(1 - S\right)\left(1 - \left(R_{j}\left(1 - C\right)\right)\right) \tag{7}$$

where j denotes the (calving) section, *D* is the probability of infection from the dam to the calf: 9% for calves born from subclinical cows (states: low-shedding and high-shedding), and 39% from clinical cows (state: affected or clinical) [(20); Table S1 in Supplementary Material]. *S* is the reduction in the risk of infection from the dam to the calf when they are separated, corresponding to 0 if calves are not removed from the dam within 2 h from birth. *R* is the risk of obtaining an infection from the environment. *C* is the fractional reduction in the MAP shed within the calving area if the farmer cleans between calvings, default set to 1. See below, for the calibration of these parameters.
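
As a rough illustration, Eq. 7 can be transcribed directly into a small Python function. The sketch follows the equation as printed; the function name, the argument names, and the environmental risk of 0.02 in the example are assumptions made for illustration only, while the 9% dam-to-calf probability is taken from the text.

```python
def calving_infection_probability(d: float, s: float, r_j: float, c: float) -> float:
    """Eq. 7 as printed: P_j = 1 - (1 - D)(1 - S)(1 - R_j(1 - C))."""
    for name, value in (("d", d), ("s", s), ("r_j", r_j), ("c", c)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return 1.0 - (1.0 - d) * (1.0 - s) * (1.0 - r_j * (1.0 - c))


# Calf of a subclinical dam (D = 9%), not separated (S = 0),
# hypothetical environmental risk of 0.02, calving area cleaned (C = 1).
p_cleaned = calving_infection_probability(d=0.09, s=0.0, r_j=0.02, c=1.0)

# Same calf if the calving area is not cleaned at all (C = 0).
p_dirty = calving_infection_probability(d=0.09, s=0.0, r_j=0.02, c=0.0)
print(p_cleaned, p_dirty)
```

With C = 1 the environmental term vanishes, so the probability reduces to D alone; with C = 0 the environmental risk adds to it.
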

If the calf is removed from the dam within 2 h after calving, *D* is decreased by 7% [default; taken as the difference in risk between not removing any calves, and removing calves from cows identified as infected (22)]. We used 4% (default) risk of this if the individual calving pens were not cleaned between calvings [taken as the difference in risk between not cleaning calving pens and cleaning calving pens with repeatedly test-positive cows (22)]. The susceptibility of newborn calves to infection is equal to 1. A risk calf (i.e., a calf born to a dam with antibodies) is back-traced if their dam delivers a positive ELISA within 200 days after calving, enabling a strategy where risk calves can be culled (25).

After calving, the calves have a daily probability of becoming infected:

$$P_{j} = \left(1 - \left(1 - R_{j}\right)\left(1 - R_{M} \cdot F\right)\right) \cdot \text{Susc}(\text{age}) \tag{8}$$

where *P*<sup>j</sup> is the cumulated probability of infection from MAP in section j, here the calf section, *R*<sup>j</sup> is the risk of infection from MAP shed in farm section j, *R*<sup>M</sup> is the risk of infection from colostrum when the calf is 1–3 days old or from waste milk when the calf is between 3 days and 8 weeks old. *F* is the fraction of lactating cows that are infected at that time (and therefore able to infect *via* colostrum or waste milk), and *Susc*(*age*) is the age-dependent susceptibility, equal to 1 for newborn calves.
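
A minimal Python sketch of the daily calf infection probability in Eq. 8 is given below. The function and argument names are hypothetical, and the numeric risks in the example are illustrative values, not calibrated parameters from the model; only the 5.6% infected fraction echoes a figure used elsewhere in the text.

```python
def daily_calf_infection_probability(
    r_j: float, r_m: float, f: float, susceptibility: float
) -> float:
    """Eq. 8: combine environmental risk R_j with milk-borne risk R_M * F,
    then scale by the age-dependent susceptibility."""
    return (1.0 - (1.0 - r_j) * (1.0 - r_m * f)) * susceptibility


# Illustrative inputs: environmental risk 0.001, milk risk 0.01,
# 5.6% of lactating cows infected, newborn calf (susceptibility 1).
p_with_milk = daily_calf_infection_probability(0.001, 0.01, 0.056, 1.0)

# Setting the milk risk to 0 (no colostrum or waste milk from
# repeatedly test-positive cows) leaves only the environmental route.
p_without_milk = daily_calf_infection_probability(0.001, 0.0, 0.056, 1.0)
print(p_with_milk, p_without_milk)
```
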

#### Heifers and Cows

Heifers, lactating cows, and dry cows have a daily probability, *P*j, of becoming infected:

$$P_{j} = R_{j} \cdot \text{Susc}(\text{age}) \tag{9}$$

where *R*<sup>j</sup> is the risk of infection from MAP shed in section j, where j can be any one of the four sections: heifers, lactating cows, dry cows, or the calving pen. *Susc*(*age*) is the susceptibility depending on the age of each animal.

Traditionally, calves have been perceived as most susceptible to MAP, but recent research has shown that animals older than 1 year are also susceptible to MAP infection (26, 27). In this model, we constructed the susceptibility of each animal to MAP given by this function:

$$\text{Susc}(\text{age}) = \exp(-0.01 \cdot \text{age}) \tag{10}$$

where 0.01 is a scaling coefficient, and *age* is measured in days. In this way, the susceptibility to infection drops exponentially to 2.6% at the age of 1 year (exp(−3.65)) and 0.07% at the age of 2 years (exp(−7.3)). Thus, there is a small risk of infection for older animals.
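
The quoted susceptibilities can be reproduced numerically. The sketch below evaluates Eq. 10 and shows that the values of 2.6% and 0.07% are obtained when age is entered in days (365 and 730); this unit is an inference from the quoted percentages, and the function names are hypothetical.

```python
import math


def susceptibility(age_days: float, coefficient: float = 0.01) -> float:
    """Eq. 10: exponentially decaying susceptibility, with age in days."""
    return math.exp(-coefficient * age_days)


def daily_infection_probability(r_j: float, age_days: float) -> float:
    """Eq. 9: section risk R_j scaled by the age-dependent susceptibility."""
    return r_j * susceptibility(age_days)


for years in (0, 1, 2):
    print(years, round(susceptibility(365 * years) * 100, 2))
```
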

#### **ELISA**

We incorporated cow-specific results based on the ELISA in the model. Milk ELISA is done quarterly in all herds participating in the Danish MAP control program. The sensitivity of the ELISA is based on Nielsen et al. (23) and is a logarithmic function dependent on the age of the tested animal, resulting in an ELISA value indicating if the animal is infected.

To simulate the test strategy currently used in Denmark, cattle in the different states of MAP infection are tested every 3 months. Animals that are susceptible to MAP are assigned a test value for the ELISA reading taken from a uniform distribution between 0 and 0.30405 (to simulate a specificity of 98.67%). The cut-off for identifying a cow with antibodies is *≥*0.30, and test values above 0.30 for susceptible animals are considered technical variation. The test value for an animal in state 1 (infected and low-shedding) and state 2 (infected and high-shedding) is given by:

$$TV(\text{age}) = U\left[\text{Cut} - \left(\left(1 - \text{Sens}(\text{age})\right) \cdot 4.5\right);\ \text{Cut} + \left(\text{Sens}(\text{age}) \cdot 4.5\right)\right] \tag{11}$$

where *TV* (*age*) is the test value dependent on the age of an animal, *U* is a uniform distribution defined by [min; max], *Cut* is the cut-off used to determine positive tests, and *Sens*(*age*) is the sensitivity for the ELISA dependent on age of the tested animal, based on Nielsen et al. (23). The min and max of the uniform distribution are calculated, so that the proportion of the interval above the cut-off value is the same as the age-sensitivity for a given animal. In this way, the ELISA value is adjusted to the age of each tested animal and to the specific cut-off value. For example, animals at the age of 1, 2, 3, 4, and 5 years have the probabilities of 3, 27, 54, 68, and 74%, respectively, of a positive result. If the uniform distribution yields a negative test value, it is converted into 0 in concordance with real ELISA values. The test value 4.5 is introduced to create a maximum value of 4.5 test value units above the cut-off, reflecting real ELISA values.

Animals in the affected stage of infection get a test value taken from a uniform distribution between 0.5 and 5 and will therefore always be positive. Every 3 months, the animals are separated into antibody groups based on the repeated recordings of the last four test results, as described by Nielsen (25).
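
The sampling rules for ELISA test values described in this section can be sketched as follows. This is an illustrative reading of the text and Eq. 11, not the authors' implementation: the state labels, constant names, and function names are assumptions, and the sensitivity of 0.54 is the value quoted above for a 3-year-old animal.

```python
import random

CUT_OFF = 0.30          # ELISA cut-off for a positive result
MAX_NEGATIVE = 0.30405  # upper bound of the range for uninfected animals
SPAN = 4.5              # width of the uniform interval for infected animals


def specificity() -> float:
    """Share of the uninfected uniform range that lies below the cut-off."""
    return CUT_OFF / MAX_NEGATIVE


def draw_test_value(state: str, sens: float, rng: random.Random) -> float:
    """Draw one ELISA test value for an animal in the given (hypothetical) state."""
    if state == "susceptible":
        return rng.uniform(0.0, MAX_NEGATIVE)
    if state in ("low_shedding", "high_shedding"):
        low = CUT_OFF - (1.0 - sens) * SPAN
        high = CUT_OFF + sens * SPAN
        # Negative draws are converted into 0, as described in the text.
        return max(0.0, rng.uniform(low, high))
    if state == "affected":
        return rng.uniform(0.5, 5.0)
    raise ValueError(f"unknown state: {state}")


rng = random.Random(1)
draws = [draw_test_value("low_shedding", 0.54, rng) for _ in range(20000)]
share_positive = sum(v >= CUT_OFF for v in draws) / len(draws)
print(round(specificity() * 100, 2), round(share_positive, 2))
```

Because the interval has a fixed width of 4.5 and a share Sens(age) of it lies above the cut-off, the fraction of positive draws approximates the age-specific sensitivity.
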

#### **Economics**

Infected animals are subjected to a reduction in slaughter value due to weight loss if they have tested positive in at least one of the last three tests (28). Therefore, cows with fluctuating responses lose 12.9% of their slaughter value, those with only the latest test positive lose 7.9% of the slaughter value, and those with repeated positive ELISAs lose 16.6% of their slaughter value. The relative ECM yield level in infected cows is reduced according to the latest ELISA value (29), where we describe the daily milk reduction, *MR*, in ECM by:

$$MR = \begin{cases} \tfrac{2}{3}TV^2 - \tfrac{2}{5}TV + 1.02, & TV < 0.3\\ 0.96, & 0.3 \le TV < 0.9\\ 1 - 0.044\,TV, & 0.9 \le TV \end{cases} \tag{12}$$
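
The piecewise function in Eq. 12 can be written as a short Python function. Treating *MR* as a multiplier on the relative ECM yield is an assumption based on the magnitude of the values, and the function name is hypothetical.

```python
def relative_milk_yield(tv: float) -> float:
    """Eq. 12: relative ECM yield level as a function of the latest ELISA test value."""
    if tv < 0.0:
        raise ValueError("test values are non-negative")
    if tv < 0.3:
        return (2.0 / 3.0) * tv**2 - (2.0 / 5.0) * tv + 1.02
    if tv < 0.9:
        return 0.96
    return 1.0 - 0.044 * tv


for tv in (0.0, 0.3, 0.9, 4.8):
    print(tv, round(relative_milk_yield(tv), 4))
```

The three branches meet at the breakpoints: the quadratic reaches 0.96 at a test value of 0.3, and the linear branch starts at about 0.96 at a test value of 0.9.
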

Milk production and the income from slaughtered cows are summarized both as measures corrected for ELISA values and uncorrected measures for comparison with a MAP-free scenario. For each simulation, the number of ELISAs performed, bull calves sold, carcasses destroyed, and inseminations conducted are summarized, as are the man-hours spent cleaning calving boxes (1 h per calving), handling colostrum (2 h per test-positive cow calving), and handling calves if removed immediately following birth (1 h per test-positive cow calving). The model also summarizes the daily amount of money spent on feed (see Feeding). The prices of milk and labor costs per hour are listed in Table S1 in Supplementary Material.

For each simulated scenario, we calculated the change in net revenue annually by subtracting the expenses (feed, labor, inseminations, and destructions) from the income (milk production, sold bull calves, and slaughtered cows). To obtain the yearly change in net revenue per cow year, we also divided the net revenue by the annual number of cow years for a comparison of the simulated scenarios. We chose to use cow years for comparability with other studies even though the number of cows in the simulated herd varies only slightly over the years. For all scenarios, we also report total net revenue (the sum of net revenue over 10 years). We did not consider the development of value over time; that is, we assume a 0% discount rate.
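
The bookkeeping described above can be sketched in a few lines. All monetary figures in the example are hypothetical placeholders chosen for illustration, as are the function names; only the income and expense categories and the herd size of about 205 cow years come from the text.

```python
def net_revenue(income: dict, expenses: dict) -> float:
    """Annual net revenue: total income minus total expenses."""
    return sum(income.values()) - sum(expenses.values())


def per_cow_year(value: float, cow_years: float) -> float:
    """Scale an annual amount by the number of cow years in that year."""
    if cow_years <= 0:
        raise ValueError("cow_years must be positive")
    return value / cow_years


# Hypothetical annual figures for one simulated year.
income = {"milk": 780_000.0, "bull_calves": 10_000.0, "slaughter": 20_000.0}
expenses = {
    "feed": 420_000.0,
    "labor": 60_000.0,
    "inseminations": 8_000.0,
    "destructions": 2_000.0,
}

annual = net_revenue(income, expenses)
print(annual, round(per_cow_year(annual, 205), 2))

# With a 0% discount rate, the total over several years is a plain sum.
total_10_years = sum([annual] * 10)
```
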

#### **Test Herd Generation**

We generated a test herd to examine the model performance and evaluate the test scenarios. The test herd represents a medium-size Danish dairy herd with 118 calves (age 0–1 year), 127 heifers (age 1–2.5 years), and 200 cows (age 2–7 years) (Table S3 in Supplementary Material). The milk level for each cow was randomly assigned from a distribution estimated based on the dataset. A number of animals in the test herd were initially infected from the beginning of the simulations according to the specified prevalence. The number of initially infected animals and their progression through disease states were randomly chosen for every simulation. The number of days spent in the assigned disease state was drawn from a normal distribution with mean equal to the corresponding value taken from expert opinion (Table S2 in Supplementary Material).

For comparison of the results, we set the seed to a new value at the beginning of each iteration. We used the same string of seeds on all test scenarios to allow comparisons.
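
Reusing the same sequence of seeds across scenarios means that each iteration faces the same random draws in every scenario, so differences between scenarios reflect the control actions rather than random noise. The toy sketch below illustrates the idea; `run_iteration` is a hypothetical stand-in for one model run and does not reproduce the herd model.

```python
import random


def run_iteration(scenario_effect: float, seed: int) -> float:
    """Toy stand-in for one simulation run: a scenario effect plus seeded noise."""
    rng = random.Random(seed)  # new seed at the start of each iteration
    noise = rng.gauss(0.0, 1.0)
    return scenario_effect + noise


seeds = list(range(1000, 1005))  # the same string of seeds for every scenario

baseline = [run_iteration(0.0, s) for s in seeds]
control = [run_iteration(-0.5, s) for s in seeds]

# Paired differences isolate the scenario effect because the noise cancels.
differences = [c - b for c, b in zip(control, baseline)]
print(differences)
```
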

#### **Impact of MAP Control Actions**

We used the model to examine the epidemiological and economic impact of four of the seven recommended actions to control and prevent infection with MAP in dairy cattle herds (22, 25). The actions are built upon a classification system where cows are divided into "red," "amber," and "green" groups. "Red" cows have tested positive a minimum of two times within the last four tests (repeatedly test-positive cows), "amber" cows have tested positive at least once in the last four tests, and "green" cows have only tested negative in the four most recent tests (25).
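
The classification rule can be expressed compactly. The sketch below assumes that test results are stored as booleans (True for a positive ELISA) with the most recent result last; the function name and data layout are illustrative choices, not taken from the model code.

```python
from typing import Sequence


def classify_cow(test_results: Sequence[bool]) -> str:
    """Assign a colour group from the four most recent ELISA results."""
    positives = sum(bool(result) for result in test_results[-4:])
    if positives >= 2:
        return "red"    # repeatedly test-positive
    if positives == 1:
        return "amber"  # positive once in the last four tests
    return "green"      # only negative results


print(classify_cow([False, True, False, True]))    # red
print(classify_cow([False, False, False, True]))   # amber
print(classify_cow([False, False, False, False]))  # green
```
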

The four evaluated actions are described below, including implementations, costs, and impact:


(4) If "red" cows are not allowed to calve, they get a flag on the culling list and are therefore prioritized for voluntary culling. This action has no direct economic cost and the impact is a direct output of the model.

#### **Herd Hygiene: Average vs. Low**

For each control scenario, we used a generalized test farm that resembled the average Danish farm with 200 cows. In all simulations, we used a burn-in period of 3 years to stabilize the herd (especially with regard to build-up of MAP in the environment) before any actions are implemented. We initiated all scenarios with 5.6% prevalence and repeated the simulations 500 times, which was found to be adequate in the convergence test (Figure S1 in Supplementary Material).

For the average-hygiene herd, we set the hygiene level to 1, which stabilized the true prevalence at a median of 6% within the simulated herd in the baseline scenario. For all other MAP-related parameters, default values were used (Table S2 in Supplementary Material). In this study, we simulated the following scenarios:


We also simulated a herd with a prevalence of 45% (the highest herd prevalence found in the 102 herds with no control) for comparison with the average-hygiene herd. The initial prevalence was set to 45%, and the hygiene level was adjusted to 0.806. In a sensitivity analysis, this hygiene level was able to sustain a median prevalence of 45% over 10 years (data not shown). This low-hygiene herd reflects a scenario where the hygiene level, MAP build-up and general properties of the farm and management cause the prevalence to persist at 45%. As in the average-hygiene herd, we allowed a 3-year burn-in period for MAP to build up in the farm sections. We also used the same number of animals and the same age demographics as assumed for the average-hygiene herd.

#### **Model Validation**

#### Internal Validation

We internally validated the model using the rationalism method (checking the consistency of results and comparing results with different inputs), the tracing method (following single animals and their properties over time), unit testing (where cow attributes were observed and controlled during model iteration), and the face validity method (where the code was revised for functionality and all input parameters scrutinized) (30).

#### External Validation

We compared the true prevalence predicted by the model to the true prevalence from the dataset of 102 farms without control measures. In this way, we validated the baseline scenario using field data.

#### Convergence Test

In order to determine the required number of iterations, we conducted a convergence test on the median net revenue estimate from the model. We deemed that 500 iterations were sufficient to reach a stable variance of the estimates as determined by visual inspection (see Figure S1 in Supplementary Material).
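
One simple way to carry out such a check is to track the running median as iterations accumulate and inspect where it levels off. The sketch below uses synthetic values in place of model output, so the numbers carry no meaning beyond illustration; the function name and step size are assumptions.

```python
import random
import statistics


def running_medians(values: list, step: int = 50) -> list:
    """Median of the first n values, for n = step, 2 * step, ..., len(values)."""
    return [statistics.median(values[:n]) for n in range(step, len(values) + 1, step)]


# Synthetic stand-in for 500 simulated net revenue estimates.
rng = random.Random(42)
simulated = [rng.gauss(3.0, 0.1) for _ in range(500)]

medians = running_medians(simulated)
for n, m in zip(range(50, 501, 50), medians):
    print(n, round(m, 3))
```
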

#### Sensitivity Analysis

We tested 38 parameters in sensitivity analyses to assess the robustness of the model with regard to the prevalence, milk yield, and economic output. The parameter names are described in detail in Tables S1, S2, and S4 in Supplementary Material.

#### **RESULTS**

The results of the simulations for an average-hygiene herd and a low-hygiene herd are summarized in **Tables 1** and **2**. Extensive results about the epidemiological, production, and economic results of the seven scenarios, and the sensitivity analysis are shown in Supplementary Materials. In this section, we cite median results unless otherwise stated. In the figures, we show the 50% simulation envelope for the results, corresponding to the outcome between the 25th and 75th percentiles.
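
For a single time point, the median and the 50% simulation envelope can be obtained from the iteration outcomes with the Python standard library. The data in the example are placeholders, and the choice of the "inclusive" quantile method is an assumption; the text does not state which percentile definition was used.

```python
import statistics


def simulation_envelope(outcomes: list) -> tuple:
    """Return (25th percentile, median, 75th percentile) of the outcomes."""
    q1, q2, q3 = statistics.quantiles(outcomes, n=4, method="inclusive")
    return q1, q2, q3


# Placeholder outcomes from five iterations at one simulated time point.
lower, median, upper = simulation_envelope([5, 1, 4, 2, 3])
print(lower, median, upper)
```
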

#### **Average-Hygiene Herd**

We show the results of the average-hygiene herd simulations in **Table 1**. Milk yield, income, and expenses are cumulated over the simulated 10-year period. The true prevalence and apparent prevalence shown are the end prevalences after 10 simulated years. The true prevalence is shown over time for the average-hygiene herd in **Figure 2**.

When all four examined actions against MAP were implemented, the model showed that it was possible to eradicate MAP from the farm. When only the test-and-cull strategy was implemented, it was also possible to eradicate MAP (i.e., to reduce true prevalence to 0). When the three most popular actions were implemented, true prevalence was reduced to a median level of 2.4%. There was only a marginal reduction in prevalence when the actions of removing calves, handling colostrum, and handling waste milk were implemented independently.

The best scenario judged using the mean milk production was the one where all actions were implemented, yielding a total of 20.17 million kilograms of ECM over the 10 simulated years (**Table 1**). The lowest milk production was observed when no control actions were implemented, yielding 20.12 million kilograms of ECM.

The scenario where no control actions were implemented generated the highest total net revenue (summed over 10 years), followed by the test-and-cull strategy. The lowest total net revenue was found when the three most popular actions were implemented. This is a result of the higher expenses for implementing these actions, which are not offset by sufficiently higher revenues (results not shown). The scenario with the highest (and intermittently positive) net revenue per cow year was when only a test-and-cull strategy was implemented (**Figure 3**). All other scenarios consistently yielded negative change in net revenue per cow year during the 10 simulated years.

**TABLE 1 | Results of the scenarios on an average-hygiene herd with a baseline true within-herd prevalence of 5.6%**.


*ECM, kilograms of ECM milk yield from all cows on the farm; TP, true within-herd prevalence; AP, apparent within-herd prevalence; EXP, expenses in million €; INC, income in million €; TNR, total net revenue in million € over 10 years.*

*The numbers are median results with 5 and 95% confidence limits, calculated over the 10 simulated years. Milk yield and economic values are shown in millions. Prevalences shown in % are the resulting prevalences at the end of the simulations.*

**TABLE 2 | Results of the scenarios on a low-hygiene herd with a baseline true within-herd prevalence of 45%**.


*ECM, kilograms of ECM milk yield from all cows on the farm; TP, true within-herd prevalence; AP, apparent within-herd prevalence; EXP, expenses in €; INC, income in €; TNR, total net revenue in million € over 10 years.*

5.08 (5.01; 5.15) 8.10 (7.97; 8.23) 3.02 (2.93; 3.11)

*The numbers are median results with 5 and 95% confidence limits, calculated over the 10 simulated years. Milk yield and economic values are shown in millions. Prevalences shown in % are the resulting prevalences at the end of the simulations.*

**FIGURE 2 | True prevalence: 50% simulation envelope over 10 simulated years for the tested scenarios in the average-hygiene herd**. **(A)** "Three control actions" means the three control actions in **(B)**. "Four control actions" means the three actions in **(B)** plus test-and-cull. **(B)** "Handle waste milk" and "handle colostrum" means that the farmer only uses milk or colostrum from test-negative cows for feeding calves.

The apparent prevalence was slightly lower than the true prevalence in most of the scenarios for the average-hygiene herd (**Table 1**). However, when the prevalence was very low, for instance, when three actions were implemented, the apparent prevalence was higher (2.93%) than the true prevalence (2.43%). This is caused by the imperfect specificity of the test, which results in false-positive results.
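
The crossover between apparent and true prevalence can be illustrated with the standard relationship AP = Se · TP + (1 − Sp) · (1 − TP). The sketch below applies a single herd-level sensitivity, which is a simplification: the model uses an age-dependent sensitivity per animal, so the sensitivity of 0.60 here is a hypothetical value and the outputs are not expected to match **Table 1**. The specificity of 98.67% is taken from the ELISA section.

```python
def apparent_prevalence(true_prev: float, sensitivity: float, specificity: float) -> float:
    """Expected share of test-positive animals given imperfect test accuracy."""
    true_positives = sensitivity * true_prev
    false_positives = (1.0 - specificity) * (1.0 - true_prev)
    return true_positives + false_positives


SPECIFICITY = 0.9867   # from the ELISA section
SENSITIVITY = 0.60     # hypothetical herd-level value for illustration

for tp in (0.0243, 0.056, 0.45):
    ap = apparent_prevalence(tp, SENSITIVITY, SPECIFICITY)
    print(f"TP={tp:.4f}  AP={ap:.4f}")
```

At low true prevalence the false positives among the many uninfected animals outweigh the infected animals missed by the test, so the apparent prevalence exceeds the true prevalence; at high prevalence the reverse holds.
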

#### **Low-Hygiene Herd**

The results of the simulated low-hygiene herd are shown in **Table 2**. Milk yield, income, and expenses are cumulated over the simulated 10-year period. The true prevalence and apparent prevalence shown are the end prevalences after 10 simulated years, while the development is illustrated in **Figure 4**. The model was calibrated over 5 years to fit a median prevalence level of 45% in a scenario where no control actions were implemented. The prevalence then decreased to 39% over 10 years. This effect was seen because we calibrated the model to be stable over 5 years but used it to predict for 10 years and therefore the prevalence will change over longer periods (**Table 2**).

In the low-hygiene scenario, as was true for the average-hygiene herd, it was only possible to eradicate MAP from the herd by using all four actions or by using a test-and-cull strategy alone. The three most popular actions did not have considerable impact when implemented independently, but reduced the median true prevalence to 20% when combined.

Again, when considering milk production, the best scenario was the one where all actions were implemented, followed by the test-and-cull strategy (**Table 2**). The lowest milk production was reached when no control actions were implemented.

The number of cow years was kept stable throughout all simulations, with a mean of 205 cow years (min: 203, max: 208). Here, we report the revenue per cow year to ease comparison with other management actions and herd sizes.

The highest total net revenue (summed over 10 years), at 3.02 million €, was attained in the scenario where test-positive cows were culled. This was largely due to an increased income from a higher milk yield and the higher slaughter value of healthy cows. The highest income came from the scenario where all actions were implemented (8.11 million €), but this was counterbalanced by an increase in the expenses for the actions (5.16 million €). The lowest expenses were in the scenario where no control or handling of waste milk was implemented, but these were counterbalanced by lower incomes. The scenario generating the second-highest net revenue on average was when no control actions were implemented. Scenarios generating the lowest net revenues were when three actions were implemented, or when calves were removed from potentially infectious dams. It was, therefore, more profitable on average to avoid implementing control actions, rather than implement the three most popular actions on their own. The test-and-cull scenario consistently yielded a positive change in net revenue per cow year over the 10 simulated years (**Figure 5**). The scenario with all four actions implemented had positive change in net revenue per cow year in years 2 and 3 but was otherwise negative. The scenario with three actions showed steadily increasing net revenue per cow year after 4 years, yet it was still negative after 10 simulated years. All other scenarios had negative net revenue per cow year.

**FIGURE 4 | True prevalence: 50% simulation envelope over 10 simulated years for the tested scenarios in the low-hygiene herd**. **(A)** "Three control actions" means the three control actions in **(B)**. "Four control actions" means the three actions in **(B)** plus test-and-cull. **(B)** "Handle waste milk" and "handle colostrum" means that the farmer only uses milk or colostrum from test-negative cows for feeding calves.

### **Sensitivity Analyses**

The extensive results of the sensitivity analyses are shown in Tables S5–S10 in Supplementary Material. The parameters that had a negative correlation with the true prevalence were heat detection success for heifers; insemination success for heifers and cows; cross-contamination from boots and machines; duration of the low-shedding and high-shedding infection stages; and hygiene level. The negative impact of higher cross-contamination is likely due to a higher proportion of the shed MAP being spread out on the farm, thus lowering the local probability of infection in each farm section. Parameters with a positive correlation to the true prevalence were the maximum number of heat cycles before culling, the percentage of voluntary culling, the amount of bacteria shed in all infection stages, the proportion of stillbirths, and the duration of the affected stage. The positive impact of more voluntary culling on the true prevalence is likely caused by a change in the demography of the herd, leading to higher transmission (no test-and-cull in this scenario). The test specificity did not seem to impact the true prevalence (no test-and-cull in this scenario), but was negatively correlated with the apparent prevalence. Lowering the level of the test sensitivity to 50% so that the sensitivity for 5-year-old cows was 37% resulted in a slightly increased true prevalence after 10 simulated years (from 7.02 to 7.28%, median result). Increasing the test sensitivity level to 120% so that it was 88% for 5-year-old cows decreased the true prevalence from 7.02 to 6.83% (median result).

#### **DISCUSSION**

Our results showed that on average the most economically profitable strategy for a low-hygiene herd was to cull "red" and "amber" cows (**Figure 5**), resulting in eradication of the disease within 7–10 years (50% simulation envelope, **Figure 4**). However, it was not possible to eradicate MAP in all of the simulations within 10 years by using the test-and-cull strategy (**Figure 4**), where the true prevalence was still 1.47% at the 95th percentile (**Table 2**). For this scenario, 69% of the simulations resulted in zero prevalence after 10 years (data not shown). Therefore, we conclude that, although test-and-cull is the most profitable control action for low-hygiene herds, it is not guaranteed to eradicate MAP. Future studies should investigate whether eradication could be guaranteed by combining test-and-cull with some preventive measure (other than those already analyzed here), and whether this approach would be more profitable than test-and-cull alone.

In herds with average hygiene, test-and-cull is sufficient for eradication, but not more profitable than no control. Therefore, we suggest that in these herds, test-and-cull is used if the aim is to eradicate paratuberculosis or lower the prevalence. Because the effect of the simulated control measures in herds with average hygiene is limited, and because the costs are considerable, we suggest that these herds focus on test-and-cull alone.

We found that implementing only one of the three most popular control actions did not have much impact on the prevalence (**Figures 2B** and **4B**). However, there was a synergistic effect of implementing all three actions at the same time (**Figures 2A** and **4A**). Therefore, it is not economically attractive to implement just one of these actions due to the associated cost, which is not counterbalanced by enough benefit. And although a combination of these three control actions reduces prevalence more effectively, such a combination is among the least profitable strategies (**Figures 3** and **5**). This is concordant with the results from the SimHerd model (6).

When considering only the prevalence of MAP, we found that the optimal scenario was to implement all control actions, allowing the farmer to eradicate MAP from the farm completely within 3 years in the average-hygiene herd and within 5 years in the low-hygiene herd (50% simulation envelope, **Figures 2** and **4**, respectively). This assumes that MAP is not reintroduced into the farm at any point. This scenario was also the most expensive (**Table 2**).

The results of this study contradict results of previous models that have presented within-herd true prevalences between 50 and 90% if no control actions are implemented on the farm (6, 8–11). We did not find such high within-herd prevalences in Denmark, but found a median within-herd true prevalence at 5.6% and a maximum within-herd true prevalence of 45% (**Figure 1**). This is close to the result of Verdugo et al. (4) who estimated the median within-herd true prevalence in Danish farms to be 7%. The main reason for the difference between our results and those of previous models is that we calibrated the model to keep a stable prevalence. This constrains the transmission process in the model, so the prevalence is not able to increase exponentially.

In both the low- and average-hygiene herd, the best action to reduce the prevalence was to cull test-positive cows, supporting the findings of Nielsen and Toft (22). This contradicts the findings from JohneSSim and SimHerd simulations, where test-and-cull strategies could not lower the prevalence and were not economically attractive (6, 31). However, in the SimHerd model, an ELISA-positive cow must be confirmed by a fecal culture. This is a more time-consuming and expensive test than the ELISA used in the PTB-iCull model, where positive cows can be put on top of the culling list as soon as they are detected. The JohneSSim model simulated a low ELISA sensitivity based on the disease state of the animal, in contrast to the disease state- and age-dependent sensitivity used in the current model. This could add to the differences between the results of our model and those of previous models.

Previous models of MAP spread showed that it is impossible to eradicate MAP even with the use of a rigorous test-and-cull strategy [e.g., Ref. (32)], which contradicts our results. The way we model MAP spread is different than in earlier work [see review by Marcé et al. (33)]. In those models, the probability of infection through the environment is a function of, among other factors, the number of infectious animals in the herd (frequency models) in a Reed-Frost model (33). In our work, we use a density-dependent transmission model to estimate the probability of infection through the environment, depending on density of the bacterial load in the environment. Density-dependent models tend to represent endemic situations better than frequency-dependent models that tend to seek pathogen/host extinction (12). In frequency-dependent models, when no control actions are taken, the prevalence often increases sharply to reach unusually high levels, as predicted by previous models. Furthermore, in frequency-dependent models, infectious cows that are culled immediately cease to be infective, whereas in our model, MAP is shed in the environment and can still cause new infections after the shedding cow has been culled. As discussed above, we observed a median within-herd true prevalence of 5.6% in herds that have no control actions against MAP and have been practicing for several years (data not shown), indicating a stable endemic state of MAP in these herds. We, therefore, consider a density-dependent model more representative of the actual field situation than a frequency-dependent model that would lead to massive spread of MAP.

In addition, our model differs from previous models in that the sensitivity of the ELISA is higher than those used in previous models, as we use more recent estimations based on Nielsen et al. (23).

An important difference between our and previous models is also that we model a closed herd with no risk of disease introduction through animal purchase. The reason we model a closed herd is that about 50% of the herds in Denmark are closed (data not shown), and that it is recommended to keep the herd closed to avoid introduction of MAP and other diseases. Simulating an open herd would prevent eradication because of the risk of continuous introduction of infected animals.

The relationship between susceptibility and age has not yet been fully established, so we chose this function to incorporate a small probability of infection even for old animals, as described earlier (26). Susceptibility to MAP is also influenced by genetic variation (34), but incorporating this into the simulation model would require simulating genetic profiles of each cow, which is out of the scope of the current study.

The findings of this study may be representative for countries other than Denmark. However, care must be taken when translating the results to other countries, because certain key parameters, such as prevalence, interest rate, and control options, might differ between countries.

Further research should focus on investigating the relationship between bacterial load and force of infection, as this relationship might not be linear, as suggested by Slater et al. (35).

#### **CONCLUSION**

We used current knowledge of MAP infection and detection mechanisms to build a new framework for simulating MAP infection within a herd. We simulated the epidemiological and economic effects of different control strategies in an average-hygiene and a low-hygiene herd. The most profitable scenario over 10 years in the average-hygiene herd was to avoid implementing a control strategy. In the low-hygiene herd, a test-and-cull strategy was the best solution economically. We did not find it profitable to implement any of the three most popular actions for preventing the spread of MAP within herds in Denmark, either for the low-hygiene or the average-hygiene herds. The results will help farmers improve control of MAP in their herds.

#### **AUTHOR CONTRIBUTIONS**

CK, KG, TH, SN, LC, and NT developed the model. ER provided knowledge and data and participated in the model formulation. CK wrote the first draft of the manuscript.

#### **ACKNOWLEDGMENTS**

We would like to thank Kaspar Krogh (CEVA), and Jørgen Nielsen (SEGES) for their great help and support with parameterization of the model. We also thank Henk Hogeveen for useful comments on the model and Sarah Layhe for language corrections.

#### **FUNDING**

This project was funded by the Green Development and Demonstration Program (GUDP) under the Danish Directorate for Food, Fisheries and Agriculture, grant no. 34009-13-0596.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://journal.frontiersin.org/article/10.3389/fvets.2016.00090


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Kirkeby, Græsbøll, Nielsen, Christiansen, Toft, Rattenborg and Halasa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **Epidemiological and Economic Evaluation of Alternative On-Farm Management Scenarios for Ovine Footrot in Switzerland**

*Dana Zingg<sup>1</sup>, Sandro Steinbach<sup>2</sup>, Christian Kuhlgatz<sup>3</sup>, Matthias Rediger<sup>3</sup>, Gertraud Schüpbach-Regula<sup>1</sup>, Matteo Aepli<sup>3</sup>, Gry M. Grøneng<sup>4,5</sup> and Salome Dürr<sup>1</sup>\**

*<sup>1</sup> Veterinary Public Health Institute, University of Bern, Bern, Switzerland, <sup>2</sup> Center of Economic Research, Swiss Federal Institute of Technology in Zurich, Zurich, Switzerland, <sup>3</sup> Agricultural Economics, Swiss Federal Institute of Technology in Zurich, Institute for Environmental Decisions (IED), Zurich, Switzerland, <sup>4</sup> Norwegian Veterinary Institute, Oslo, Norway, <sup>5</sup> The Norwegian Institute of Public Health, Oslo, Norway*

#### *Edited by:*

*Andres M. Perez, University of Minnesota, USA*

#### *Reviewed by:*

*Bouda Vosough Ahmadi, Scotland's Rural College, United Kingdom Dannele E. Peck, Agricultural Research Service (USDA), USA*

*\*Correspondence:*

*Salome Dürr salome.duerr@vetsuisse.unibe.ch*

#### *Specialty section:*

*This article was submitted to Veterinary Epidemiology and Economics, a section of the journal Frontiers in Veterinary Science*

*Received: 07 November 2016 Accepted: 24 April 2017 Published: 16 May 2017*

#### *Citation:*

*Zingg D, Steinbach S, Kuhlgatz C, Rediger M, Schüpbach-Regula G, Aepli M, Grøneng GM and Dürr S (2017) Epidemiological and Economic Evaluation of Alternative On-Farm Management Scenarios for Ovine Footrot in Switzerland. Front. Vet. Sci. 4:70. doi: 10.3389/fvets.2017.00070*

Footrot is a multifactorial infectious disease mostly affecting sheep, caused by the bacterium *Dichelobacter nodosus*. It causes painful foot lesions resulting in animal welfare issues, weight loss, and reduced wool production, which leads to a considerable economic burden in animal production. In Switzerland, the disease is endemic and mandatory coordinated control programs exist only in some parts of the country. This study aimed to compare two nationwide control strategies and a no-intervention scenario with the current situation, and to quantify their net economic effect. This was done by sequential application of a maximum entropy model (MEM), epidemiological simulation, and calculation of net economic effect using the net present value method. Building upon data from a questionnaire, the MEM revealed a nationwide footrot prevalence of 40.2%. Regional prevalence values were used as inputs for the epidemiological model. Under the application of the nationwide coordinated control program without (scenario B) and with (scenario C) improved diagnostics [polymerase chain reaction (PCR) test], the Swiss-wide prevalence decreased within 10 years to 14 and 5%, respectively. In contrast, an increase to 48% prevalence was observed when terminating the current control strategies (scenario D). Management costs included labor and material costs. Management benefits included reduction of fattening time and improved animal welfare, which is valued by Swiss consumers and therefore reduces societal costs. The net economic effect of the alternative scenarios B and C was positive, whereas that of scenario D was negative; over a period of 17 years, they were quantified at CHF 422.3, 538.3, and −172.3 million (1 CHF = 1.040 US\$), respectively. This implies that a systematic Swiss-wide management program under the application of the PCR diagnostic test is the most recommendable strategy for a cost-effective control of footrot in Switzerland.

**Keywords: decision-making,** *Dichelobacter nodosus***, epidemiological modeling, economic effect, prevalence, ruminant, welfare, Switzerland**

#### **INTRODUCTION**

Footrot is an old disease in European countries, mentioned in France as early as the end of the eighteenth century (1). Early reports in Switzerland date to 1929 and 1965, indicating that the disease has been known for at least 100 years in this country (2, 3). Since then, the disease has spread to all regions of Switzerland, and is currently endemic (4, 5).

Footrot is an infectious disease, which mainly causes severe hoof lesions in sheep, but is also found in other ruminant species all over the world (4, 6–8). It is a multifactorial disease favored by humid environments with a temperate climate. The main causative agent is *Dichelobacter nodosus*, although *Fusobacterium necrophorum*, aerobic diphtheroids, and coliforms are also reported to contribute to the development of clinical signs (9). The development and severity of disease depend on the climate, the virulence of the isolate, and the immune system of an individual animal (10, 11). Because the disease causes painful hoof lesions, it is not only of relevance for animal health but also for animal welfare. These painful lesions result in direct costs for the producers through weight loss and reduced wool production. In addition, consumers generally value animal welfare, so that there is a societal economic loss when animals are affected by footrot. In combination with the costs for treatment, the disease imposes a considerable economic burden in animal production (12). Management of footrot consists of regular hoof trimming, foot bathing, separation or elimination of affected sheep, and usage of antibiotics. These control measures are usually applied in combination and are costly to the farmers. For example, a study in Great Britain estimated direct costs of £1.32 per ewe and £0.15 per lamb, summing up to costs of £24.4 million for British producers annually (13). As control measures of single farmers cannot wipe out footrot, some countries implemented systematic programs to eradicate the disease. An economic study on a footrot eradication program in Western Australia found that the benefits of the program outweigh its costs at a ratio of 5.3:1 (14).

Footrot is not listed as a notifiable disease in the Swiss legislation. Nevertheless, all sheep farmers are obliged to comply with animal welfare legislation, which implies that clinically affected sheep have to be treated or slaughtered. In the cantons of Grisons (GR) and Glarus (GL), a coordinated management program was implemented in 1990 and 2013, respectively. The program consists of regular control of sheep herds, hoof trimming and foot bathing with formalin, zinc, or copper sulfate, and biosecurity measures. In case of footrot problems, these measures are executed more frequently, and infected animals are separated. The management program has been successful in reducing footrot prevalence within these cantons. Currently, policy is moving toward a nationwide coordinated control strategy against footrot in Switzerland, presuming that the disease will be listed as notifiable and controlled by law.

Epidemiological models are helpful and necessary tools to predict prevalence trends under different control strategies (15–17). Outputs of such models can be used for the economic evaluation of management strategies (18). Cost–benefit analyses of control strategies are important, and ideally conducted in an early phase of planning for potential control programs. Examples include highly infectious animal diseases such as foot-and-mouth disease (19) or classical swine fever (20, 21). Cost-effectiveness of control strategies for zoonoses such as rabies or brucellosis has also been studied, taking into account the costs for human deaths (22–24).

The objective of our study is to evaluate epidemiologic and economic aspects of different management strategies to reduce footrot prevalence in Switzerland. For this purpose, the direct costs of producers and the intangible costs to society, mostly caused by impaired animal welfare, are considered. No distinction between the virulent and benign strains of *D. nodosus* was made. A cost–benefit analysis of four control strategies was conducted to inform policy makers who are considering an evidence-based nationwide coordinated control strategy of footrot in Switzerland.

#### **MATERIALS AND METHODS**

The present study summarizes the results of a large project that evaluated the costs and benefits of centrally organized control programs for footrot in the Swiss sheep population. The entire project consisted of several successive subprojects (**Figure 1**). The animal experiment was approved by the Cantonal Veterinary Office of the Canton of Zug (approval number ZG 67/15) in accordance with the Swiss animal welfare legislation.

#### **Model Input Data**

A questionnaire was sent to all sheep farmers of Switzerland, aimed at revealing the current perceived prevalence of footrot in Swiss sheep premises (25). Questions on herd management, trade of animals, health issues involving claws, and management measures against footrot were also included. Of the 15,036 questionnaires sent out, 9,386 were returned, and 7,836 (52%) were usable for further analysis. Large premises contributed most to the questionnaire study, so that 79.6% of the total sheep population in Switzerland was covered by the completed questionnaires. Overall, 37% of the respondents stated that they experienced problems with footrot during the year 2014.

Estimates of the impact of footrot on sheep health were based on an experimental controlled trial comparing a healthy with a footrot-infected sheep flock (25). Briefly, 85 lambs in the diseased group and 99 lambs in the control group were followed from birth to slaughter, which occurred at an individual weight of 42–46 kg. Reduction of the fattening period for healthy lambs was converted to economic benefit (see "Management Benefit"). The trial was also used to estimate labor costs, i.e., the time required for implementing control measures on the farm.

### **Definition of the Regions**

A total of 19,484 herds were integrated into the model. For conceptual reasons of the epidemiological model, Switzerland had to be divided into regions. These regions also served as the basis for the regionalization of the maximum entropy model (MEM) and the cost–benefit analysis, considering varying costs and benefits between the different regions.

Switzerland was divided into 27 regions for the footrot model (**Figure 2**). Two criteria were used for the allocation of the regions: density of sheep premises (first criterion) and the climate (second criterion). Data to inform the sheep premises density were sourced from the AGIS database (agrarian policy information system of Switzerland) and data were calculated as the number of premises per agricultural area per political district. The AGIS database only records data on professional premises and therefore non-professional premises were not considered for the classification of densities. District densities were divided into three categories using tertiles as limits. The transmission of footrot is also influenced by the climate in which mainly temperature and precipitation are seen as relevant factors (26, 27). Switzerland is divided into 12 climatic regions. Following these climatic regions, the density-classified regions were further subdivided or merged. In a final step, large regions with the same density and climate were subdivided following cantonal borders to avoid large differences in size between regions. For each region, the population size (number of sheep premises according to the AGIS database) and a climatic factor were calculated (Appendix in Supplementary Material). Currently, a footrot control program is mandatory for all sheep premises and implemented in the regions 23–27 (situated in the cantons of GR and GL).

#### **Estimation of Current Prevalence of Footrot Using MEM**

To account for the non-respondents of the questionnaire study and to extrapolate the prevalence estimates per region to the whole of Switzerland, an MEM was used (25). The MEM is a Bayesian method that integrates *a priori* information to estimate the probability of the occurrence of an unknown variable (28). Here, the maximum likelihood estimator was used to estimate the probability of footrot prevalence in the defined regions. To ensure stability of the MEM, regions with <200 herds had to be compiled (merged), leading to 22 regions out of the 27 regions (regions 1 and 2 were compiled, as well as regions 3 and 4, 13 and 14, 18 and 19, and 23 and 24).

*A priori* information included geographic location (region), farm size (number of animals, growth rate, and agricultural area), structural features of farms (whether or not the farm holds rams or keeps animals on pasture, age of the farmer), and contact information (exhibitions and pasturing), sourced from the questionnaire and the AGIS database. This *a priori* information was combined with the prevalence of farms per region that stated to have experienced problems with footrot in 2014. The model was tested by predicting the footrot status of the farms within the sample where the status was known. The econometric model had a fit above 70% (measured as pseudo-*R*<sup>2</sup>), implying that the model mimics the data-generating process well. Neither selection nor information bias was expected. The model was then applied to the entire Swiss sheep farm population to estimate the footrot prevalence within each region (Table S1 in Supplementary Material). This prevalence was further used as a starting point for the epidemiological model. Within each of the compiled regions, the same prevalence was used (Table S1 in Supplementary Material).

### **Epidemiological Model**

#### Model Structure

The footrot transmission model has been developed based on a stochastic susceptible-infected-recovered compartmental model designed to simulate a footrot outbreak in Norway (29). The model was implemented in *R*<sup>1</sup>. The model allows the simulation of the spread of the disease within and between defined geographical regions, using the sheep premises as the smallest unit (**Figure 3**). The time step of the simulation is 1 year.

<sup>1</sup> https://cran.r-project.org

#### Simulation of the Spread within a Region

Premises were grouped into three compartments within a region: susceptible (*S*), infected (*I*), and recovered (*R*) premises (**Figure 3**). Susceptible premises get infected with an infection rate β and recover afterward with a recovery rate σ. Subsequently, they either become re-infected (with the reversion rate γ) or again susceptible with a rate of 1 *−* γ. The spread between the compartments within a region *i* at the time *t* is:

$$S_{i,t+1} = S_{i,t} + (1 - \gamma_i) \ast R_{i,t} - \beta_i \ast S_{i,t} \ast I_{i,t} \tag{1}$$

$$I_{i,t+1} = I_{i,t} - \sigma_i \ast I_{i,t} + \beta_i \ast S_{i,t} \ast I_{i,t} + \gamma_i \ast R_{i,t} \tag{2}$$

$$R_{i,t+1} = R_{i,t} + \sigma_i \ast I_{i,t} - \gamma_i \ast R_{i,t} - (1 - \gamma_i) \ast R_{i,t} \tag{3}$$
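As an illustration only, the yearly update of Eqs. 1–3 for a single region can be sketched in a few lines of Python. The function name `sir_step` and all numerical values are assumptions chosen for this example; they are not the parameter values of the study, and the sketch is not the authors' *R* implementation.

```python
def sir_step(s, i, r, beta, sigma, gamma):
    """Advance one region by one yearly time step following Eqs. 1-3."""
    new_infections = beta * s * i           # beta_i * S * I
    recoveries = sigma * i                  # sigma_i * I
    reversions = gamma * r                  # gamma_i * R (recovered -> infected)
    back_to_susceptible = (1 - gamma) * r   # (1 - gamma_i) * R
    s_next = s + back_to_susceptible - new_infections
    i_next = i - recoveries + new_infections + reversions
    r_next = r + recoveries - reversions - back_to_susceptible
    return s_next, i_next, r_next


# Hypothetical region with 1,000 premises; rates are illustrative only.
first_step = sir_step(600.0, 300.0, 100.0, beta=0.0005, sigma=0.5, gamma=0.4)

s, i, r = 600.0, 300.0, 100.0
for year in range(10):
    s, i, r = sir_step(s, i, r, beta=0.0005, sigma=0.5, gamma=0.4)
```

Because every recovered premises leaves the *R* compartment within one time step, Eq. 3 reduces to *R* at *t* + 1 being equal to σ times *I* at *t*, and the total number of premises per region stays constant.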

The population size *N* = *S* + *I* + *R* per region *i* was sourced from the AGIS database. The regional prevalence at the start of the simulation was informed by the output of the MEM (Table S1 in Supplementary Material). The infection rate β is a stochastic parameter (PERT distribution) calculated separately for each region and incorporates the regional sheep premises density and the climate (Appendix in Supplementary Material).
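For readers unfamiliar with the PERT distribution, the following Python sketch shows one common way to draw from it, namely as a beta distribution rescaled to the interval between a minimum and a maximum. The shape weight of 4 is the conventional PERT choice and is assumed here; the minimum, mode, and maximum are hypothetical and do not correspond to the values in the Appendix.

```python
import random


def sample_pert(minimum, mode, maximum, rng):
    """Draw one value from a PERT distribution on [minimum, maximum]."""
    span = maximum - minimum
    # Conventional PERT shape parameters (weight of 4 on the mode).
    alpha = 1 + 4 * (mode - minimum) / span
    beta_shape = 1 + 4 * (maximum - mode) / span
    return minimum + span * rng.betavariate(alpha, beta_shape)


rng = random.Random(1)
# Hypothetical infection-rate bounds for one region.
draws = [sample_pert(0.0002, 0.0005, 0.0010, rng) for _ in range(10_000)]
mean_draw = sum(draws) / len(draws)
```

Under this parameterization the expected value is (minimum + 4 × mode + maximum) / 6, so the sample mean should lie close to that value.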

The recovery and the reversion rates were incorporated as stochastic parameters (uniform distributions, Appendix in Supplementary Material) with separate values for regions with and without mandatory footrot control programs. Which regions have a mandatory control program depends on the scenario simulated.

#### Simulation of the Spread between the Regions

Spread of footrot between regions is implemented in three ways: sheep transport (trade), common pasture, and sheep exhibitions. Sheep transports are possible across the whole of Switzerland. The number of newly infected premises per year via this transmission pathway (θ<sub>*j,i*</sub>) was calculated from the annual number of sheep transports at herd level from region *j* to region *i*, the proportion of infected premises in the sending region *j*, and the proportion of susceptible premises in the receiving region *i* (Appendix in Supplementary Material). Sheep movement data were sourced from the questionnaire study, in which each farmer was asked to state the two cantons (apart from the home canton) to which the majority of sheep had been sent and from which the majority had been received in the last 12 months.

The transmission from region *j* to *i* via common pasture (parameter τ<sub>*j,i*</sub>) and interregional sheep exhibitions (parameter δ<sub>*j,i*</sub>) follows the same principle. Animals of different regions come together, get infected with the transmission rate β<sub>pasture</sub> and β<sub>Expo</sub>, respectively, and go back to their premises at the end of the summer or exhibition, where they may infect other animals and premises. The number of newly infected premises per year via common pasture (τ<sub>*j,i*</sub>) was computed using information on the number of sheep herds from both regions *i* and *j* that spend the summer on common pasture, herd density and climate on the pastures, the proportion of infected herds in region *j*, and the proportion of susceptible herds in region *i* (Appendix in Supplementary Material). The number of sheep sent to common pasture for each region was sourced from the questionnaire. The size of the common pasture area, which was required to calculate the herd densities (herds per square kilometer) on pastures, was sourced from the AGIS database. Similarly, the number of newly infected premises per year via exhibitions (δ<sub>*j,i*</sub>) was calculated from the number of sheep herds exhibited per year from regions *i* and *j*, the herd density and climate at the exhibition site, the proportion of infected herds in region *j*, and the proportion of susceptible herds in region *i* (Appendix in Supplementary Material). Information on the number of sheep herds exhibited by each region for the large interregional exhibitions was provided by the Swiss Sheep Breeding Association<sup>2</sup>.

As a result of the integration of the spread between the regions, Eqs. 1–3 were expanded so that the number of premises in each compartment of region *i* and year *t* is calculated as follows:

$$S_{i,t+1} = S_{i,t} + (1 - \gamma_i) \ast R_{i,t} - \min\left[\left(\beta_i \ast S_{i,t} \ast I_{i,t} + \sum_{j \neq i} \theta_{j,i,t} + \sum_{j \neq i} \tau_{j,i,t} + \sum_{j \neq i} \delta_{j,i,t}\right), S_{i,t}\right] \tag{4}$$

$$I_{i,t+1} = I_{i,t} - \sigma_i \ast I_{i,t} + \min\left[\left(\beta_i \ast S_{i,t} \ast I_{i,t} + \sum_{j \neq i} \theta_{j,i,t} + \sum_{j \neq i} \tau_{j,i,t} + \sum_{j \neq i} \delta_{j,i,t}\right), S_{i,t}\right] + \gamma_i \ast R_{i,t} \tag{5}$$

$$R_{i,t+1} = R_{i,t} + \sigma_i \ast I_{i,t} - \gamma_i \ast R_{i,t} - (1 - \gamma_i) \ast R_{i,t} \tag{6}$$

where *i* and *j* denote the regions receiving and transmitting footrot, respectively.
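The role of the minimum operator in Eqs. 4 and 5 can be illustrated with a short Python sketch: the number of newly infected premises, from within-region spread plus imports via trade, pasture, and exhibitions, is capped at the number of susceptible premises. The function name `capped_step`, the single aggregated `imported` argument, and all numbers are assumptions made for this example.

```python
def capped_step(s, i, r, beta, sigma, gamma, imported):
    """One yearly step for a region following Eqs. 4-6.

    `imported` is the summed number of premises newly infected from all
    other regions (trade + common pasture + exhibitions).
    """
    # New infections can never exceed the number of susceptible premises.
    new_infections = min(beta * s * i + imported, s)
    s_next = s + (1 - gamma) * r - new_infections
    i_next = i - sigma * i + new_infections + gamma * r
    r_next = r + sigma * i - gamma * r - (1 - gamma) * r
    return s_next, i_next, r_next


# Hypothetical region in which the uncapped term (22.5 + 40 = 62.5)
# would exceed the 50 susceptible premises.
capped = capped_step(50.0, 900.0, 50.0, beta=0.0005, sigma=0.5, gamma=0.4,
                     imported=40.0)
```

Without the cap, the susceptible compartment in this example would become negative; with it, all 50 susceptible premises become infected and the regional total remains 1,000.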

#### Global Sensitivity Analysis (GSA)

A GSA was applied, which differs from the classical "one-parameter-at-a-time" sensitivity analysis by considering interactions between the parameters (30). In total, 13 parameters were included within the GSA. These include the number of susceptible (*S*<sub>*i*,*t*=1</sub>) and infected (*I*<sub>*i*,*t*=1</sub>) herds per region *i* at the start of the simulation (*t* = 1), the three intraregional parameters (infection rate β<sub>*i*</sub>, recovery rate σ<sub>*i*</sub>, and reversion rate γ<sub>*i*</sub>) per region, the number of sheep herd transports between regions *i* and *j* (MSh<sub>*j,i*</sub>), the number of sheep herds sent to common pastures (*n*<sub>pasture,*i*</sub>) and exhibitions (*n*<sub>Expo,*i*</sub>), and the herd density and climate on common pastures and exhibitions (*d*<sub>pasture</sub>, Cl<sub>pasture</sub>, *d*<sub>Expo</sub>, Cl<sub>Expo</sub>). In addition, the mean of all infection rates β<sub>*i*</sub> was incorporated, which was used to calculate the infection rates on pastures and exhibitions. For the GSA, all parameters were allowed to vary by ±10% around their original value. The function "soboljansen" from the *R* package "sensitivity" was used (31, 32). One hundred and fifty thousand iterations were needed to obtain sufficiently narrow confidence intervals for the Sobol indices, which measure the parameters' influence on the footrot prevalence.
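To make the idea of Sobol indices more concrete, the sketch below estimates first-order and total indices with the Jansen estimators for a toy model in plain Python. It is not the "soboljansen" routine itself and does not reproduce the footrot model: the function `jansen_indices`, the toy model, and the nominal values are assumptions for illustration, with each input varied uniformly by ±10% around its nominal value as in the analysis described above.

```python
import random


def jansen_indices(model, n_params, n_samples, rng):
    """Estimate first-order and total Sobol indices (Jansen estimators)."""
    a = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]
    all_y = y_a + y_b
    mean = sum(all_y) / len(all_y)
    variance = sum((y - mean) ** 2 for y in all_y) / (len(all_y) - 1)
    first, total = [], []
    for k in range(n_params):
        # Sample matrix A with column k replaced by column k of B.
        y_ab = [model(ra[:k] + [rb[k]] + ra[k + 1:]) for ra, rb in zip(a, b)]
        sq_b = sum((yb - y) ** 2 for yb, y in zip(y_b, y_ab))
        sq_a = sum((ya - y) ** 2 for ya, y in zip(y_a, y_ab))
        first.append(1 - sq_b / (2 * n_samples * variance))
        total.append(sq_a / (2 * n_samples * variance))
    return first, total


NOMINAL = [1.0, 1.0]  # hypothetical nominal parameter values


def toy_model(u):
    """Additive toy model; each unit-interval draw maps to nominal +/- 10%."""
    p = [nom * (0.9 + 0.2 * ui) for nom, ui in zip(NOMINAL, u)]
    return p[0] + 2.0 * p[1]


first_order, total_order = jansen_indices(toy_model, 2, 20_000, random.Random(0))
```

For this additive toy model the indices are known analytically (0.2 for the first input and 0.8 for the second), so first-order and total indices coincide; a gap between the two would indicate interactions between parameters.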

#### **Fitting of the Model to the Swiss Situation**

To fit the model to the Swiss situation, it was assumed that footrot is currently in a stable endemic stage in Switzerland, so that the prevalence per region is constant over time. This assumption was based on evidence of the existence of the disease in the surrounding countries (Germany and France) since at least the end of the eighteenth century (1, 3) and on a study providing evidence that footrot exists in all regions of Switzerland (5). The parameter values of β, σ, θ, τ, and δ were calculated and incorporated in the model as described above and in the Appendix in Supplementary Material. The value of the reversion rate γ was fitted to the countrywide prevalence in Switzerland so that the model output came as close as possible to the target prevalence of 40.2%, estimated by the MEM. Reversion rate values of 40–55% were tested in steps of 1%. The value of γ for regions 23–27 (cantons of GR and GL) was defined to be smaller than the one for the other regions, based on the ratio of the reversion rates calculated from the questionnaire dataset (43.6% for premises that had undergone a footrot control program on herd level, 74.5% for those that had not undergone such a program).

<sup>2</sup> http://szv.caprovis.ch/

Because the Swiss-wide prevalence was used as the measure to fit the model and prevalence in the different regions deviated from the start value estimated by the MEM over the course of the simulated years (running time of the model = 100 years), a correction algorithm had to be applied. For each region *i*, a correction factor *k<sub>i</sub>* was calculated based on the target prevalence (target\_prev<sub>*i*</sub>, MEM outcome) and the prevalence estimated by the simulation model at year 45 (prev<sub>*t*=45,*i*</sub>, year with prevalence closest to the target value, see "Fitting to the Swiss Situation and Calculation of Reversion Rate γ"), so that

$$k_i = \frac{\text{target\_prev}_i}{\text{prev}_{t=45,i}}. \tag{7}$$

#### **Description of the Scenarios**

Four scenarios were defined (**Table 1**). For each scenario, 1,000 simulations were conducted and the mean, median, and 2.5th and 97.5th percentiles of the footrot prevalence were extracted for the presentation of results and further analysis. Each simulation ran over 100 years and started with the parameter values described above.
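A minimal Python sketch of how such summary statistics can be extracted from repeated stochastic runs is given below. The simulated prevalences are random placeholders drawn from a normal distribution, not output of the footrot model, and the function name `summarize` is an assumption.

```python
import random
import statistics


def summarize(prevalences):
    """Return mean, median, and 2.5th/97.5th percentiles of simulated values."""
    # quantiles(n=40) returns 39 cut points spaced 2.5 percentage points apart.
    cuts = statistics.quantiles(prevalences, n=40, method="inclusive")
    return {
        "mean": statistics.fmean(prevalences),
        "median": statistics.median(prevalences),
        "p2.5": cuts[0],
        "p97.5": cuts[-1],
    }


rng = random.Random(42)
# Placeholder for the prevalence in one year across 1,000 simulation runs.
runs = [rng.gauss(0.40, 0.02) for _ in range(1000)]
summary = summarize(runs)
```

The interval between the two percentiles contains 95% of the simulated values and expresses the stochastic variation between runs.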

Scenario A (*laisser-faire*) was defined as the current status of footrot control in Switzerland and used as the baseline when different scenarios were compared. Recovery and reversion rates differed between regions 23–27 (located in the cantons of GR and GL, mandatory control program ongoing) and regions 1–22 (rest of Switzerland, no mandatory control program implemented). It was assumed that in the future, the newly developed polymerase chain reaction (PCR) diagnostic test (33) will be used in the regions with a mandatory control program. This test also detects animals without clinical signs, which results in a higher sensitivity of footrot detection and consequently in a lower reversion rate (Appendix in Supplementary Material).

Scenarios B and C were defined as an extension of the mandatory control programs as currently implemented in the cantons of GR and GL to a nationwide level including all regions. This program consists of separation of the infected herd, hoof trimming, and regular foot bathing<sup>3</sup>. In scenario B, no PCR diagnostic test was considered and the definition of premises being footrot free was based on clinical signs only, where every single sheep was examined. In scenario C, PCR was considered for the detection of footrot, addressing a given proportion of sheep (ranging from 100% for small herds to 10–40% for large herds). Examination by a veterinarian (scenario B) or a PCR test (scenario C) and by a hoof inspector are conducted in the first year of the sanitation.

<sup>3</sup> http://bgk.caprovis.ch/cms05/showlinx.asp?lang=1&id=1

**TABLE 1 | Definition of the scenarios with their recovery and reversion rate values for regions 1–22 (no mandatory footrot program implemented) and regions 23–27 (mandatory footrot program implemented)**.

For scenario D, it was assumed that all mandatory control measures were discontinued in Switzerland. This comparison is relevant because the current benefit of existing management strategies should be assessed. The recovery rate was estimated based on the questionnaire database for premises that did not undergo a footrot control program. The reversion rate γ<sub>*D*</sub> was calculated from the fitted reversion rate γ (49.0%, see "Fitting to the Swiss Situation and Calculation of Reversion Rate γ") and the ratio between the reversion rate of premises with no herd-level control measures applied (γ<sub>non-controlled premises</sub>, 74.5%) and the reversion rate calculated from the entire questionnaire dataset (γ<sub>entire\_dataset</sub>, 58.1%):

$$\gamma_D = \frac{\gamma}{\gamma_{\text{entire\_dataset}}} \ast \gamma_{\text{non-controlled premises}}. \tag{8}$$
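Inserting the three percentages quoted above into Eq. 8 gives a sense of the magnitude of the scenario D reversion rate. The short Python sketch below performs only this arithmetic; the function name is an assumption, and the result is our own calculation from the quoted inputs rather than a value reported in the text.

```python
def scenario_d_reversion_rate(gamma_fitted, gamma_entire_dataset,
                              gamma_non_controlled):
    """Scale the fitted reversion rate by the questionnaire-based ratio (Eq. 8)."""
    return gamma_fitted / gamma_entire_dataset * gamma_non_controlled


# Inputs as quoted in the text: 49.0%, 58.1%, and 74.5%.
gamma_d = scenario_d_reversion_rate(0.490, 0.581, 0.745)
print(f"{gamma_d:.1%}")  # about 62.8%
```

The resulting rate is higher than the fitted countrywide value of 49.0%, which is consistent with removing all herd-level control programs.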

After the model simulation, the model output of all scenarios per region *i* and year *t* was corrected by the correction factor *k<sub>i</sub>*. For each scenario, the final regional prevalence in the year *t* was calculated as:

$$\text{prev\_final}_{t,i} = \text{prev\_modelOutput}_{t,i} \ast k_i. \tag{9}$$
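The correction in Eqs. 7 and 9 amounts to a region-specific rescaling, which the following Python sketch illustrates for two regions. The region labels and prevalence values are hypothetical, and the function names are assumptions made for this example.

```python
def correction_factors(target_prev, prev_year45):
    """Eq. 7: ratio of MEM target prevalence to simulated prevalence at year 45."""
    return {region: target_prev[region] / prev_year45[region]
            for region in target_prev}


def apply_correction(model_output, k):
    """Eq. 9: rescale each region's simulated prevalence by its factor k_i."""
    return {region: prev * k[region] for region, prev in model_output.items()}


# Hypothetical prevalences for two regions (not values from the study).
target = {"region_1": 0.45, "region_2": 0.30}
simulated_year45 = {"region_1": 0.50, "region_2": 0.25}

k = correction_factors(target, simulated_year45)
corrected = apply_correction(simulated_year45, k)
```

By construction, applying the factors to the year 45 output returns the MEM target prevalence of each region; the same factors are then applied to the output of every scenario and year.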

#### **Cost–Benefit Analysis**

The costs and benefits were calculated for each scenario according to how many herds were infected, susceptible, and recovered in each year. The cost–benefit analysis is a systematic approach for evaluating the economic implications of management scenarios. The aim of this analysis is to identify the management strategy maximizing the net welfare effect, which we call net economic effect to avoid confusion with animal welfare. This method is frequently used to evaluate policies that aim at an improvement of animal health. To quantify the economic implications of footrot management, the net economic effect was measured with the net present value method as follows:

$$\text{NPV}(d, T) = \sum_{t=1}^{T} \frac{\left(\sum_{j=1}^{J} b_{j,t} - \sum_{i=1}^{I} c_{i,t}\right)}{(1+d)^{t}} - \sum_{i=1}^{I} c_{i,0} \tag{10}$$

where the year was denoted with *t*, the discount rate with *d*, the benefits of management with *b*, and the costs of management with *c*. The costs and benefits consist of a number of components, which are indexed by *i* (costs) and *j* (benefits). The net economic effect was calculated at the farm level and then aggregated at the national level.<sup>4</sup> The benefits of improved animal welfare were also considered in our analysis. However, as these benefits are not direct farm benefits, they were only considered at the national level. The cost–benefit analysis is concerned with the period 2014–2030. The analysis was limited to this period because uncertainty increases over time. For the evaluation of the effect of disease control, the time period after the implementation is of largest interest. The discount rate was assumed to be 1 during the calculation period because the inflation rate in Switzerland remained close to 0 in the last years, although there is considerable uncertainty with respect to future economic development. Similarly, it was assumed that prices and salaries will remain at their respective level in 2014. The implementation of footrot measures affects the supply of Swiss sheep products and, therefore, their market prices. Price changes affect rents on the consumer and producer side [Ebel et al. (34)]. Such indirect economic effects of footrot are not taken into account in the conducted cost–benefit analysis, but are discussed below. Because the Swiss sheep industry has undergone major changes in recent years, it was necessary to predict the future sheep population and farm structure before assessing costs and benefits of the management of footrot.

<sup>4</sup>The index for each farm was dropped in the NPV formula to simplify the notation.
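The net present value calculation of Eq. 10 can be sketched in Python as follows. The function name `net_present_value`, the component lists, and all monetary figures are hypothetical; the example only demonstrates how yearly net benefits are discounted and how the initial costs in year 0 enter undiscounted.

```python
def net_present_value(benefits, costs, initial_costs, discount_rate):
    """Eq. 10: discounted yearly net benefits minus undiscounted initial costs.

    benefits[t - 1] and costs[t - 1] hold the component lists for year t.
    """
    npv = -sum(initial_costs)
    for t, (b_t, c_t) in enumerate(zip(benefits, costs), start=1):
        npv += (sum(b_t) - sum(c_t)) / (1 + discount_rate) ** t
    return npv


# Hypothetical three-year example in CHF; not figures from the study.
benefits = [[100.0, 20.0], [110.0, 20.0], [120.0, 20.0]]
costs = [[60.0], [50.0], [40.0]]

npv_undiscounted = net_present_value(benefits, costs, [30.0], discount_rate=0.0)
npv_discounted = net_present_value(benefits, costs, [30.0], discount_rate=0.05)
```

With a discount rate of zero, the result is simply the sum of yearly net benefits minus the initial costs (here CHF 210); any positive discount rate gives less weight to benefits that accrue later.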

#### Predicting the Future Sheep Population

The size of the sheep population for 2014–2030 was estimated with historical data on sheep farming in Switzerland from the farm accounting database (AGIS database). This database contains information on the entire sheep population in Switzerland for 1999–2014 (Table S2 in Supplementary Material). The size of the sheep population in each region was calculated for every year. The data show that the number of sheep has been increasing over this period. However, the development is not homogeneous, with some regions observing a substantial decrease in the sheep population (regions 26 and 27) and others a substantial increase (regions 1 + 2, 9, and 17). Considering the substantial variation in the development of the sheep population, it is necessary to apply an identification strategy for the future sheep population that accounts for this heterogeneity. A number of regression specifications were compared to obtain a correct identification of the relationship using the farming data for 1999–2014.<sup>5</sup> It was found that the seemingly unrelated regression model with region-specific fixed effects and linear time trends replicates the data-generating process most appropriately. The regression model was developed by Zellner (35) and allows correlation in the error terms. The equation system is outlined below:

$$S_{i,t} = \alpha_i + \beta_i T_{i,t} + \varepsilon_{i,t}, \quad E\left[\varepsilon_{i,t} \ast \varepsilon_{k,t} \mid T_t\right] = \sigma_{i,k} \tag{11}$$

where *i* represents the equation number (region) and *t* the year. The region fixed effects were denoted with α<sub>*i*</sub> and the region-specific linear time trend with *T<sub>i,t</sub>*. The error term was denoted by ε<sub>*i,t*</sub>, which was allowed to be correlated across regions but not over time. The system of equations was solved simultaneously using the feasible generalized least squares method. The estimation results are summarized in Table S3 in Supplementary Material and illustrated in Figure S1 in Supplementary Material. Most regions showed a highly significant and positive trend in the sheep population, and the largest effects are found in regions 7, 9, 10, and 20. The regression specifications fitted the underlying data well, which is indicated by the generally high predictive power (*R*<sup>2</sup> values).
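As a simplified illustration of the trend extrapolation, the Python sketch below fits a linear time trend for a single region by ordinary least squares and projects it to 2030. This is a stand-in, not the seemingly unrelated regression itself: it assumes that every regional equation contains the same regressors (an intercept and the calendar year), in which case the point estimates of the equation system coincide with equation-by-equation least squares. The sheep counts are invented for the example.

```python
def fit_linear_trend(years, sheep_counts):
    """Least squares fit of sheep_counts = alpha + beta * year for one region."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(sheep_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, sheep_counts))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


years = list(range(1999, 2015))
# Hypothetical region growing by 50 sheep per year from 10,000 in 1999.
counts = [10_000 + 50 * (year - 1999) for year in years]

alpha, beta = fit_linear_trend(years, counts)
forecast_2030 = alpha + beta * 2030
```

In this noise-free example the fitted slope recovers the 50 sheep per year exactly, so the projection for 2030 is 11,550 sheep; with real data the projection inherits the uncertainty of the estimated trend.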

#### Predicting the Future Farm Structure

The management costs were expected to vary between farm types. Larger farms were expected to benefit from scale effects because they can use their equipment more efficiently. Hence, the average fixed and variable cost of treatment per unit was expected to be substantially lower for larger farms. To account for scale effects (reduction in average cost per unit of output by increasing the production), sheep farmers were classified in each region according to the scale of their operation as small (1–30), medium (31–70), and large operations (>70). Substantial differences could be observed in the farm size between regions (Table S4 in Supplementary Material). Although most farms in Switzerland were classified as small operations, this share has been decreasing substantially since 1998. Sheep farming activities in Switzerland are therefore becoming more professional, with mostly small farms ceasing and large farms expanding their activities. To model future scale effects, the same regression model as used for the prediction of the sheep population was applied. The regression results are presented in Table S5 in Supplementary Material. It was found that the proportion of small and medium operations will decrease further in the future. Particularly for the southern and alpine part of Switzerland (regions 13–15 and 25–27), an increase in the size of farms is expected.

#### Management Cost

The management cost by farm type was defined according to the four management scenarios. It consists of labor costs on the farm, third-party labor costs, and material costs. In regions with mandatory control, the cost items consisted of hoof trimming and weekly hoof bathing over a period of 10 weeks for infected farms, four control visits in the first year, and one visit in each of the two subsequent years. In scenario B, control visits on farms include clinical inspection of all animals. In the other scenarios, samples for the PCR diagnostic test were taken during the control visits to identify infected animals. The PCR test was assumed to be conducted by trained personnel, and only a proportion of animals was tested per herd (ranging from 100% for small herds to 10–40% for large herds), prioritizing high-risk animals (lame animals, newly purchased animals, rams, and heavy ewes). This implies substantially lower management cost, which is accounted for as third-party labor costs. On the other hand, the additional laboratory cost of the PCR test increased the material costs (CHF 6.50 per test). For regions without a control program, management activities were reduced to the minimal level defined by the animal welfare legislation. Costs related to this included hoof trimming and hoof spray. A detailed summary of the management approach and the management cost for the different scenarios is provided by Aepli et al. (25).

#### Management Benefit

The management benefit is composed of farm benefits and the reduction of intangible damage. Farm benefits arise mainly from a reduction in fattening time. The experimental animal trial showed that the fattening time was significantly longer for infected lambs than for non-infected lambs (31.9 days longer, *p <* 0.01, linear mixed model) (25). An additional day of fattening was valued at CHF 2.70 (1 CHF = 0.918 € or 1.040 US\$), which is composed of feed cost (CHF 0.15), operation cost (CHF 0.25), and labor opportunity cost (CHF 2.30).

<sup>5</sup> Focusing on the regression specification that replicates the data generating process most accurately; a detailed analysis of the different specifications can be found in the project report [Aepli et al. (25)].
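The valuation of an additional fattening day given under Management Benefit can be checked with simple arithmetic. The sketch below sums the three stated components and, as an illustration only, multiplies the daily value by the 31.9 additional fattening days; the resulting per-lamb figure is an implied value and is not reported in the study. Variable names are hypothetical.

```python
# Cost components of one additional fattening day (CHF), as stated in the text.
FEED_COST = 0.15
OPERATION_COST = 0.25
LABOR_OPPORTUNITY_COST = 2.30

cost_per_day = FEED_COST + OPERATION_COST + LABOR_OPPORTUNITY_COST

# Additional fattening time of infected lambs from the animal trial (days).
EXTRA_FATTENING_DAYS = 31.9

# Implied extra fattening cost per infected lamb (illustrative, not a reported figure).
extra_cost_per_lamb = cost_per_day * EXTRA_FATTENING_DAYS

print(round(cost_per_day, 2))         # 2.7
print(round(extra_cost_per_lamb, 2))  # 86.13
```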

Intangible costs are costs that are related to an identifiable source but are not directly quantifiable. Therefore, they can be seen as external costs, which are not taken into account in the cost calculation of the producers. These costs were measured with the help of a structured expert elicitation. Two workshops were conducted in which stakeholders such as farmers, consumers, veterinarians, scientists, and government employees discussed the intangible costs of footrot. It was found that intangible costs are primarily related to the negative utility of society due to reduced animal health and limitation of natural behavior. As an average of the two workshops, the experts concluded that these two animal welfare issues contribute 84% of intangible costs. The monetary value of pain caused by footrot was then estimated using a method similar to that proposed by Fitzpatrick et al. (36). Based on the discussed intangible cost components and the evaluated societal valuation of animal pain, the experts estimated the national costs of footrot. While there was a wide variation in the individual expert opinions on how society values animal welfare, the workshop participants generally agreed with the mean monetary value derived in the workshop. The experts concluded that the annual nationwide intangible cost caused by footrot at a national prevalence of 70% equals CHF 53.03 million. The cost at prevalence rates of 0, 20, and 50% was evaluated as well. Piecewise cubic Hermite interpolation was then used to calculate the intangible cost for each prevalence level, in 0.1% steps. A more detailed description of the elicitation approach and results is provided by Aepli et al. (25).
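The interpolation step can be illustrated with a small, dependency-free sketch of a monotone piecewise cubic Hermite interpolant. Several assumptions apply: only the cost at 70% prevalence (CHF 53.03 million) is given in the text, so the values at 0, 20, and 50% below are hypothetical placeholders; the interior slopes follow the weighted harmonic mean rule commonly used for this type of interpolation, while the end slopes are simplified to one-sided secants, so results may differ slightly from library implementations.

```python
from bisect import bisect_right


def monotone_slopes(x, y):
    """Knot slopes that keep the cubic Hermite interpolant monotone."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    delta = [(y[i + 1] - y[i]) / h[i] for i in range(n - 1)]
    d = [0.0] * n
    # Simplification: end slopes are the adjacent secant slopes.
    d[0], d[-1] = delta[0], delta[-1]
    for i in range(1, n - 1):
        # Slope stays zero where the data change direction or are flat.
        if delta[i - 1] * delta[i] > 0:
            w1 = 2 * h[i] + h[i - 1]
            w2 = h[i] + 2 * h[i - 1]
            d[i] = (w1 + w2) / (w1 / delta[i - 1] + w2 / delta[i])
    return d


def hermite_eval(x, y, d, xq):
    """Evaluate the piecewise cubic Hermite interpolant at xq."""
    i = min(max(bisect_right(x, xq) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    t = (xq - x[i]) / h
    h00 = (1 + 2 * t) * (1 - t) ** 2
    h10 = t * (1 - t) ** 2
    h01 = t * t * (3 - 2 * t)
    h11 = t * t * (t - 1)
    return h00 * y[i] + h10 * h * d[i] + h01 * y[i + 1] + h11 * h * d[i + 1]


# Prevalence levels (%) at which the experts evaluated the cost.
prevalence_knots = [0.0, 20.0, 50.0, 70.0]
# Intangible cost in million CHF; only the last value is from the text,
# the first three are HYPOTHETICAL placeholders.
cost_knots = [0.0, 12.0, 35.0, 53.03]

slopes = monotone_slopes(prevalence_knots, cost_knots)
grid = [k / 10 for k in range(701)]  # 0.0, 0.1, ..., 70.0 (%)
costs = [hermite_eval(prevalence_knots, cost_knots, slopes, p) for p in grid]
```

A shape-preserving interpolant of this kind avoids the overshoot that an unconstrained cubic spline can produce between sparsely elicited points, which may be one reason for choosing it here.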

#### **RESULTS**

#### **Fitting to the Swiss Situation and Calculation of Reversion Rate** γ

For the fitting procedure, the model started with a prevalence of 40.5%. This value was closest to the prevalence of 40.2% (target prevalence) while avoiding partial herds. At year 45, the model reached the target prevalence and stayed in an endemic steady state afterward (variation of 40.38–40.48%; **Table 2**; Figure S2 in Supplementary Material). Year 45 was therefore defined as the year of data collection (year 2014) and the year when the alternative strategies were implemented (Figure S3 in Supplementary Material).

The value of the reversion rate γ, which resulted in a model prevalence closest to the target prevalence, was determined at 49.0% for the regions 1–22 and 36.8% for regions 23–27.



*Statistics of 1,000 simulations.*

#### **Footrot Prevalence under Scenarios A–D**

Scenario A was defined as the current state of footrot control, i.e., a mandatory control program in regions 23–27 only, but with the introduction of a new PCR diagnostic test in these regions. The nationwide prevalence and the prevalence in regions without a mandatory control program decreased only slightly (*<*1%) over time (**Table 3**; **Figures 4** and **5**). For the regions with a mandatory control program, a decrease in the prevalence was observed because of improved disease detection and consequently lower reinfection of controlled premises (**Figure 6**). On average, these regions had a median prevalence of 25.5% at the beginning of the simulations. After 18 years of simulation, a plateau was reached at a median prevalence of 18% for the regions 23–27.

Scenario B was defined as the introduction of Swiss-wide mandatory control measures as currently implemented in the cantons of GR and GL, without using the PCR diagnostic test (only clinical diagnosis considered). A clear decrease in the nationwide prevalence was observed during the first year of simulation (**Table 3**; **Figure 4**). In the first 2 years of simulation, the median of the Swiss prevalence decreased from 40.4 to 28.0% (mean 28.0%, 95% CI 24.0–32.3%). The 10% mark was reached at year 14 with a median prevalence of 10.0% (mean 10.0%, CI 6.8–11.5%). In the following years, the prevalence further decreased continuously to a value of 1.8% (mean 2.0%, CI 0.4–4.7%) at the end of the simulation (year 57). Elimination of footrot (median prevalence of 0%) was only reached in regions 4 and 14 after 42 and 28 years of simulation, respectively. On average, the prevalence in the regions 1–22 fell more rapidly than that of the regions 23–27 (**Figures 5** and **6**). The prevalence of 10% was reached after a mean of 13.5 years (6–28 years for the different regions) and after a mean of 21 years (9–30 years) for the regions 1–22 and 23–27, respectively.
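The summary statistics quoted for each scenario (median, mean, and 95% interval over 1,000 simulations) can be reproduced from raw simulation outputs along the following lines. This is a sketch under assumptions: the text does not state how the 95% CI was computed, so a central percentile interval is assumed here, and the input values are toy data rather than model output.

```python
import statistics


def summarize(prevalences):
    """Median, mean, and central 95% percentile interval of simulated prevalences."""
    # n=40 yields cut points every 2.5%; the first and last bound the central 95%.
    cuts = statistics.quantiles(prevalences, n=40, method="inclusive")
    return {
        "median": statistics.median(prevalences),
        "mean": statistics.fmean(prevalences),
        "lower_95": cuts[0],
        "upper_95": cuts[-1],
    }


# Toy data only: 1,000 evenly spaced values standing in for simulated prevalences (%).
toy_runs = [20.0 + 0.01 * k for k in range(1000)]
summary = summarize(toy_runs)
print(summary)
```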



*Thousand simulations were conducted per scenario.*

Scenario C was defined as the introduction of Swiss-wide mandatory control measures as currently implemented in the cantons of GR and GL, but including the use of the PCR diagnostic test. The effect of the PCR diagnostics can therefore be observed by comparing scenario C with B. Shortly after the implementation of the control measures, the prevalence decreased even more rapidly than in scenario B. Starting at a nationwide median of 40.4%, it fell to 23.1% (mean 23.1%, CI 19.4–27.1%) after 2 years (**Table 3**; **Figure 4**). In the following years, the prevalence rapidly decreased further so that after 6 years of simulation the median prevalence fell below 10% and after 20 years to 1.0% (mean 1.0%, CI 0.3–2.0%). After 50 years of simulation, footrot is predicted to be eliminated on average (median nationwide prevalence of 0%). Only slight differences were observed between the regions 1–22 and 23–27 (**Figures 5** and **6**). The 10% prevalence was reached earlier for the median of the regions 1–22 (after 6 compared to after 7 years) and the footrot elimination (0% median prevalence) was achieved earlier for the regions 23–27 (after 24 compared to after 33 years).

Scenario D was defined as the cessation of all mandatory control measures in Switzerland. The median of the Swiss prevalence increased slightly in the first 2 years up to 42.7% (mean 42.5%, CI 36.7–47.9%) (**Table 3**; **Figure 4**). An increase of 10 percentage points to a median of 50.4% (mean 50.3%, CI 44.0–56.4%) was observed after 15 years of simulation. This increasing trend continued, and toward the end of the simulation (year 57) the median prevalence reached a plateau that was 13 percentage points higher than at the beginning of the simulation (median 53.3%, mean 53.2%, CI 46.0–59.9%). The increase in median prevalence was faster in the regions 23–27, where the cessation of the mandatory control program had a direct effect (**Figure 6**), than in the regions without previously implemented control programs (**Figure 5**).

line = median out of 1,000 simulations.

#### **TABLE 4 | Cost and benefit of footrot management for 2014–2030 (in 1,000 CHF)**.

*The discount factor is 1. All cost and benefit are expressed in constant 2014 prices. Direct cost and treatment cost are summed over time of the management period (2014–2030). The 95% confidence interval is reported in parenthesis. The total net economic effect is presented in Table 5.*

*Cost differences between small, medium, and large farms are taken into account. Composition of the cost categories depends on the scenario. On-farm labor costs are calculated as 28 CHF/h times the farm personnel's time estimated for foot bathing, hoof trimming, and presence at clinical inspections or collection of samples for diagnostic tests. Third-party costs include clinical inspections by hoof controllers and veterinarians and diagnostic tests. Material costs include water and zinc sulfate. The saved costs associated with the reduction of fattening time were calculated by assuming that the costs per animal are 2.70 CHF/day, and animals are not prematurely slaughtered. Intangible costs are calculated based on national prevalence rates, given the results of the expert elicitations.*

#### **GSA Analysis**

Two parameters had the strongest influence on footrot prevalence (the outcome of the model). These are the recovery rate σ and the reversion rate γ, with total effect Sobol indices of 0.69 and 0.61, respectively. To a lesser extent, the infection rate β (total effect Sobol index = 0.49) and the number of susceptible herds at the beginning of the simulation (total effect Sobol index = 0.48) also resulted in Sobol indices slightly higher than those of the other parameters, which ranged from 0.43 to 0.45 (Figure S4 in Supplementary Material). The total effect Sobol index also accounts for the interactions between the respective parameter and all other parameters tested in the GSA.
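As background to the total effect Sobol indices reported for the GSA, the standard variance-based definition is sketched here. The notation is an assumption introduced for illustration and is not taken from the study: $Y$ denotes the model output (footrot prevalence), $X_i$ the parameter of interest, and $X_{\sim i}$ all other parameters.

$$
S_{T_i} \;=\; \frac{\mathbb{E}_{X_{\sim i}}\!\left[\operatorname{Var}_{X_i}\!\left(Y \mid X_{\sim i}\right)\right]}{\operatorname{Var}(Y)} \;=\; 1 - \frac{\operatorname{Var}_{X_{\sim i}}\!\left(\mathbb{E}_{X_i}\!\left[Y \mid X_{\sim i}\right]\right)}{\operatorname{Var}(Y)}
$$

Because every interaction term is counted once for each parameter involved in it, total effect indices can sum to more than 1 when interactions are present. The two largest reported values alone (0.69 and 0.61) add up to 1.30, which suggests that interactions between parameters contribute noticeably to the variance of the predicted prevalence.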

#### **Cost and Benefit Evaluation**

**Table 4** summarizes the cost and benefit of footrot management under the four management scenarios for 2014–2030. Among the components of management cost, labor cost accounted for the largest share in total cost. The smallest management costs were found under scenario C, and the highest costs were expected with scenario D. In comparison to scenario C, labor costs under scenario B were substantially larger. This is because PCR tests are less labor demanding than footrot inspections, which applies to both on-farm labor and third-party labor. Most of the total management costs of scenario C occur in the initial years after the management strategy is implemented, and costs decrease substantially in the following years as the prevalence rate drops. For the benefits, it was found that under scenario D, the fattening time would increase substantially and animal welfare would decrease. By increasing the management intensity (scenarios B and C), a substantial decrease in fattening time and improvement of animal welfare could be achieved. The effect was larger for scenario C, where the benefit from reduced fattening time increased to CHF 52.8 million. While the intangible cost in scenario C is still nearly twice as high as the gain through reduced fattening time, its value of CHF 99.9 million is substantially lower than in any other scenario.

**TABLE 5 | Net economic effect of scenarios B–D compared to scenario A (***laissez-faire***) in 1,000 CHF**.

*The total of cost and benefit are reported for the period 2014–2030. All cost and benefit are expressed in constant 2014 prices.*

The net economic effect of footrot management was calculated by comparing the alternative management scenarios B–D with the baseline scenario A (**Table 5**). It was found that under scenario D, the management cost will increase by CHF 11.9 million. Since the management benefit will also be reduced by CHF 160.4 million, the net economic effect of scenario D is negative (CHF *−*172.3 million), indicating that it is less preferable than the laissez-faire scenario A and clearly the least preferable option among the compared scenarios. In contrast, scenarios B and C have a positive net economic effect of CHF 422.3 and 538.3 million, respectively. In both scenarios, reductions of intangible costs are the largest fraction of economic gains. Given that the sanitation measures have to be paid by the farmers and intangible cost reductions are social gains, it is worthwhile to note that there is also a positive benefit due to shortened fattening time, which is received directly by the farmer. It was found that the management costs were substantially lower for scenario C than for scenario B. Moreover, due to the higher accuracy of the PCR method in recognizing footrot, management benefits were estimated to be larger for scenario C than for scenario B.
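The net economic effect for scenario D follows directly from the two changes relative to scenario A stated above. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the sign convention (benefit change minus cost change) is inferred from the reported figures.

```python
def net_economic_effect(change_in_cost, change_in_benefit):
    """Net effect relative to the baseline: change in benefit minus change in cost."""
    return change_in_benefit - change_in_cost


# Scenario D versus scenario A, in million CHF, as stated in the text:
# management cost rises by 11.9 and management benefit falls by 160.4.
scenario_d = net_economic_effect(change_in_cost=11.9, change_in_benefit=-160.4)
print(round(scenario_d, 1))  # -172.3
```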

#### **DISCUSSION**

The aim of this study was to evaluate the current footrot situation in the Swiss sheep population and the costs and benefits of Swiss-wide control programs. A joint analysis of the economic and epidemiological aspects of footrot, which allows the costs, benefits, and net economic effects under different control programs to be predicted, had not been implemented to date. By applying an epidemiological model, spatio-temporal prevalence information could be generated that served as the basis for the economic analysis of the control strategies. Particularly in veterinary medicine, cost–benefit analysis is highly relevant for the decision whether or not to implement a disease control program.

The simulation model revealed that scenario C is most efficient in reducing nationwide footrot prevalence as fast as possible. This is due to the combination of the nationwide mandatory control program with the use of PCR diagnostics, which substantially increases the detection rate of infected animals. Nevertheless, this scenario was predicted to still require 6 and 10 years to reduce the Swiss-wide prevalence below 10 and 5%, respectively.

Global sensitivity analysis revealed that the recovery rate σ and the reversion rate γ are most influential on the prediction of footrot prevalence over time. These parameters simulate the disease spread within the regions. Parameters defining the spread between regions, i.e., those related to common pasture, exhibitions, and sheep transport between regions, are less influential. This implies that in the current endemic footrot situation in Switzerland, the main effort should be directed toward reducing the prevalence within the regions. This finding is in line with the experience of the cantons GR and GL, where the prevalence could be reduced significantly within a few years after the implementation of the mandatory control program (personal communication, cantonal veterinary office GR). However, reinfections have also been observed frequently through contact with infected animals on pastures or after purchase of infected sheep. Therefore, it can be hypothesized that the between-region transmission pathways will become more relevant in an advanced control phase, after the prevalence within the regions has been successfully reduced. In this project, modification of the between-region pathways by restriction of sheep movements, pasturing, or participation at exhibitions was not investigated. Yet, this is certainly worthwhile in a situation of advanced footrot management, because control measures restricted to regional activities are not sufficiently sustainable.

From an economic point of view, it can be concluded that under current management costs and benefits, it is advisable to implement a systematic program that aims at a reduction of the footrot prevalence level. Over the long run, the management costs of tackling footrot individually are far higher than those of a systematic, Swiss-wide approach, which is able to quickly reduce the prevalence of footrot in Switzerland. The analysis has shown that the net economic benefit increases with higher treatment intensity. Therefore, a systematic sanitation program using the PCR method has been demonstrated to be the best choice.

A further aspect is the potential economic effect of nationwide programs on the market prices of sheep products. It has been demonstrated earlier that consumers are willing to pay higher prices for products from animal production with high welfare levels (37). For the Swiss sheep meat market, this effect is hard to predict and likely small due to several factors. On the one hand, a successful nationwide footrot program increases the number of healthy animals in Switzerland and with it the supply of sheep products. On the other hand, the high costs of implementing the mandatory measures might induce farmers to exit, which has adverse effects on the supply. The net effect of these diverging forces on the lamb meat market price is further dampened by Swiss import regulations (potential adjustment of the import quota for lamb meat). The import quota is set quarterly by the Swiss meat association. A changing supply could, therefore, be compensated by higher or lower imports. However, it has to be noted that the minimum import amount set by the Uruguay Round has to be 4,500 tons per year (38). In recent years, this threshold has always been exceeded, resulting in an import share of *>*50% of the Swiss sheep meat market (38).

Like all models, the simulation model is based on a series of assumptions. First, it was assumed that only one herd exists per premises. It might, therefore, be possible that the number of herds in Switzerland was underestimated. However, the influence on the output of the simulation model is expected to be negligible because the disease very likely spreads easily via pasturing or contaminated objects (e.g., foot-paring instruments) within the same premises even when more than one herd is kept. Second, disease transmission by migratory sheep flocks, cattle, goats, and wild ruminants was not considered for the spread between regions. Migratory sheep flocks integrate sheep collected from different premises at the end of the pastoral season and travel to the lowlands of Switzerland until they reach the weight to be slaughtered. Information on migration routes is not available in Switzerland. Therefore, uncertainties would have been too high to allow inclusion into the model. Also, because only six migratory flocks are currently registered in Switzerland, their influence on the propagation of the disease is likely to be limited. The role of cattle in the transmission of virulent strains of *D. nodosus* leading to footrot in sheep is still under debate, although cross-infection between the two species in co-grazing settings was demonstrated (39, 40). In Switzerland, cattle and sheep are rarely kept in the same stable and are not transported together, which hampers potential transmission. It was demonstrated that the transmission of footrot is possible between goats and sheep when kept in close contact (41). However, in Switzerland sheep and goats are mainly kept together in smaller premises and hobby farms, and the main part of sheep movements is caused by professional farmers. Therefore, it can be assumed that the role of goats in the spread of footrot is negligible in Switzerland. Nevertheless, the influence of goats, and also of other species such as wild ruminants, on the spread of footrot to sheep should be further investigated.

In the presented work, epidemiological and economic models were combined to assess footrot management programs in the Swiss sheep population. It was found that a nationwide coordinated program using the improved diagnostic test was the most cost-efficient strategy to control the disease. Implementation of such a program is therefore recommended from a scientific point of view.

#### **ETHICS STATEMENT**

The animal experiment was approved by the Cantonal Veterinary Office of the Canton of Zug (approval number ZG 67/15) in accordance with the Swiss animal welfare legislation.

#### **AUTHOR CONTRIBUTIONS**

All authors were involved in the design of the study described here, with responsibilities in different parts. MA was another overall leader of this project. SD, DZ, GS, and GG were involved in the epidemiological part of this work. SS, MR, and CK were involved in the economic analysis. DZ, SD, SS, and CK were the main authors of this study. All co-authors read and commented on the manuscript and approved the final version to be submitted.

#### **ACKNOWLEDGMENTS**

This study was part of a more extensive collaborative work published in a report by Aepli et al. (25). We acknowledge all contributors of that work: Sarah Bähler, Seraina Grieder, Christina Härdi, Rita Lüchinger, Regula Mengelt, Fabian Arnold, Nicolas Hofer, Daniel Langmeier, Claude Müller, and Camille Rubeaud. We are grateful for data provided by the agrarian policy information system of Switzerland (AGIS), TVD, and the Swiss Sheep Breeding Association.

### **FUNDING**

We acknowledge the financial support from the Swiss Food Safety and Veterinary Office and the Swiss Federal Office of Agriculture.

### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://journal.frontiersin.org/article/10.3389/fvets.2017.00070/full#supplementary-material.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Zingg, Steinbach, Kuhlgatz, Rediger, Schüpbach-Regula, Aepli, Grøneng and Dürr. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*