Assessing the Risk of Spatial Spreading of Diseases in Hospitals

In recent years, the transmission of healthcare-associated infections (HAIs) has led to substantial economic loss, extensive damage, and many preventable deaths. With the increasing availability of data, mathematical models of pathogen spreading in healthcare settings are becoming more detailed and realistic. Here, we make use of spatial and temporal information that has been obtained from healthcare workers (HCWs) in three hospitals in Canada and generate data-driven networks that allow us to realistically simulate the spreading of an airborne respiratory pathogen in such settings. By exploring in depth the dynamics of HAIs on the generated networks, we quantify the infection risk associated with both the spatial units of the hospitals and HCWs categorized by their occupations. Our findings show that the “inpatient care” and “public area” are the riskiest categories of units and “nurse” is the occupation at a greater risk of getting infected. Our results provide valuable insights that can prove important for measuring risks associated with HAIs and for strengthening prevention and control measures with the potential to reduce transmission of infections in hospital settings.


INTRODUCTION
Healthcare-associated infections, or nosocomial infections, are infections transmitted within healthcare settings. For every one hundred patients admitted to hospitals, between seven and ten will acquire at least one type of HAI [1]. Nosocomial infections also play an important role in the spreading of pandemics, as the recent SARS-CoV-2 pandemic has shown [2,3]. As such, they have become an important public health concern [4][5][6][7] because the prevalence of HAIs not only yields losses of financial resources but also causes substantial morbidity and mortality [8][9][10]. The control or mitigation of HAIs is challenging since, in modern healthcare systems, there exists a variety of potential risk factors contributing to the spread of HAIs [11,12]. For instance, in terms of environmental aspects, the relatively restricted spaces in hospitals provide the conditions for the repeated and prolonged exposure to HAIs. Moreover, the various categories of spatial units play different roles in the transmission of HAIs as a consequence of their function.
Previous studies have shown that close contact is a major mode of transmission of healthcare-associated infections. Thus, the daily activities of healthcare workers, which form temporal networks of interactions, can potentially increase the risk of infection to patients and to the workers themselves [13][14][15]. In addition, the occupational nature of some healthcare workers also contributes to a certain extent to the transmission of HAIs between hospital units, making it necessary to adopt a system-wide view of all possible transmission paths. In summary, the risks associated with HAIs are heterogeneous and could depend on the characteristics of the categories to which the units belong and on the various occupations of healthcare workers.
Over the last decade, measures aimed at preventing and containing HAIs have been developed, also accounting for the previously mentioned variability of transmission routes [16][17][18]. For instance, increasing hand hygiene and the regular use of personal protective equipment (PPE) are the most basic infection prevention and control strategies [12]. Although the implementation of disinfection measures has a significant effect on restraining the spread of HAIs, understanding the dynamics of HAIs that are transmitted by respiratory mechanisms can open the path for optimizing current strategies.
In terms of mathematical modeling of disease spreading, it was shown early on that social networks in healthcare facilities differ from other networks since individuals have very particular roles, such as nurses and medical doctors [19]. Most models focus mainly on a single ward (usually ICUs) or on simplified hospital settings [20,21]. Others focus on certain pathogens, or on the transmission from patient to patient and across hospitals [22,23]. Yet, not much attention has been devoted to the different roles that healthcare workers' profiles, besides nurses and medical doctors, can play.
In this work, we use data collected from anonymous surveys carried out in three hospitals on the behavior of workers to study the propagation of airborne respiratory pathogens. We construct the interaction network of workers within the hospital using a data-driven approach and simulate the transmission of respiratory disease [24,25]. Our aim is to evaluate the infection risks of certain occupations, and how they are distributed through the different units of the hospital. To this end, we assess the infection risk of each spatial unit by calculating the disease hitting time and the number of infections produced in each location. We also evaluate the risk of different HCWs by computing the probability of getting infected and their potential infection capacities.
Our findings show that the transmission dynamics of a potential disease are very similar in the three hospitals analyzed, even though they are very different in terms of size and specialties. In particular, transmission is the highest in inpatient care units, and nurses are the HCWs at the highest risk, in line with previous studies based in different countries [26]. It is also noteworthy that these results could provide a valuable reference for monitoring the transmission of airborne infectious diseases within hospital settings and improve preventive measures to ultimately reduce the incidence of HAIs.

MATERIALS AND METHODS
Site-specific surveys were created for three Canadian hospitals as part of the CONNECT I study (henceforth, hospitals A, B, and C, in order of size, with A being the largest). In the survey, employees were asked to provide information on the amount of time they spent at each location of the hospital during a normal week. The survey identified 19 different HCW occupational categories and over 100 different locations in each hospital. These were further grouped into seven HCW occupational categories (nurses, physicians, researchers, technologists, administration personnel, clerks, and other) and six unit categories (auxiliary rooms, inpatient areas, medical-staff rooms, other-staff rooms, outpatient areas, and public areas) using domain knowledge from the hospital administrations. The total number of employees in the three hospitals was over 8,000 (~4,100, ~2,400, and ~1,600 in hospitals A, B, and C, respectively). In total, 38% of them responded to the surveys, with administration personnel and nurses being the two categories with the largest response rates. For further details on the CONNECT I study, we refer the reader to [15], in which a more complete description of the dataset is given.
In Figure 1, we report some of the results of the survey from hospital B (see also Supplementary Figure S1). We observe that there was substantial heterogeneity in the weekly routines of different HCW groups. For instance, nurses and physicians did not visit auxiliary rooms, while administrative assistants and those who reported their category as other could easily be found there. Similarly, physicians were more likely than nurses to be found in medical-staff rooms and public areas. Regarding their daily number of contacts with other co-workers, nurses were the single category reporting the largest number, followed by administrative assistants and physicians. The category labeled as other was composed of very different HCWs and, thus, their contact patterns were also quite heterogeneous.
With this information, we applied a data-driven approach to construct the network of HCW interactions. Due to the lack of precise information, we did not include any patient interactions. Therefore, rather than analyzing the impact of a potential outbreak in the hospital, we focused on understanding how an airborne disease would spread throughout the hospital. To do so, we first assumed that two individuals can interact only if they have reported visiting the same unit. The connection relationship is denoted by

$$a^{u}_{ij} = \delta_{i,u}\,\delta_{j,u},$$

where $\delta_{i,u} = 1$ if healthcare worker $i$ has visited unit $u$ and $\delta_{i,u} = 0$ otherwise. To leverage the detailed information contained in the survey, we further took into account the amount of time that each individual spent in that location in order to weight the interaction. Therefore, in the generated network, each individual represents one node, and the weighted link between two nodes $i$ and $j$ in unit $u$ is given by

$$w^{u}_{ij} = a^{u}_{ij}\,\frac{T_{i,u}}{T_i}\,\frac{T_{j,u}}{T_j},$$

where $T_{i,u}$ represents the amount of time that individual $i$ spent in spatial unit $u$ and $T_i$ represents the total amount of time in the hospital reported by individual $i$. In Figure 2A, we schematically represent the connection relationship among the healthcare workers in our network. Figure 2B shows an example of the networks obtained with this technique. We can see that technologists tend to cluster together in a few locations, while nurses, who also tend to cluster together, can be found all over the hospital.
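The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: it assumes that each shared unit contributes the product of the two workers' time fractions ($T_{i,u}/T_i$ and $T_{j,u}/T_j$) and that contributions are summed over shared units. The function name `build_network` and the toy survey are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_network(hours):
    """hours: {worker: {unit: hours reported for that unit in a normal week}}.
    Returns {(i, j): weight} with i < j.
    Assumption: the weight is the product of time fractions, summed over shared units."""
    # Fraction of each worker's reported time spent in each unit: T_{i,u} / T_i
    frac = {}
    for worker, per_unit in hours.items():
        total = sum(per_unit.values())
        if total <= 0:
            continue  # worker reported no time at all, so no links
        frac[worker] = {u: t / total for u, t in per_unit.items() if t > 0}

    # Group workers by unit so that only co-visitors are paired
    visitors = defaultdict(list)
    for worker, per_unit in frac.items():
        for unit in per_unit:
            visitors[unit].append(worker)

    weights = defaultdict(float)
    for unit, members in visitors.items():
        for i, j in combinations(sorted(members), 2):
            weights[(i, j)] += frac[i][unit] * frac[j][unit]
    return dict(weights)

# Hypothetical toy survey (hours per week), not taken from the study data
toy_hours = {
    "nurse_1": {"ward": 30, "cafeteria": 10},
    "nurse_2": {"ward": 40},
    "clerk_1": {"office": 30, "cafeteria": 10},
}
network = build_network(toy_hours)
```

In this toy example, the two nurses are strongly linked through the ward, while the clerk is only weakly linked to one nurse through the cafeteria.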
To simulate the spread of an airborne infectious disease, we implemented an SIR model on top of the network of HCW interactions. In the model, a worker might be in one of three states: susceptible (S), infected (I), or recovered (R). An infected individual $i$ transmits the disease to a susceptible individual $j$ with probability $P(S \to I) = 1 - \exp(-\beta w_{ij} \Delta t)$, where $w_{ij}$ is the weight of the link between $i$ and $j$ and $\beta$ is the per-contact transmission probability. This process is run synchronously for all infected individuals at each time step $t$. Then, those individuals that were already infected at time $t$ might recover with probability $P(I \to R) = 1 - \exp(-\mu \Delta t)$. We set $\beta = 0.01$ and $\mu = 0.10$. The average strength (the sum of all the weights of each node) in each network was 31, 23, and 18 for hospitals A, B, and C, respectively, yielding a value of $R_0$ in the homogeneous approximation of 3.1, 2.3, and 1.8 [27]. Another possibility would be to fix $R_0$ and obtain the corresponding value of the per-contact transmission probability that would yield that reproduction number in each hospital. The results that we report in this study do not change with this choice, and thus we have not included that alternative in this work.
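A minimal sketch of these dynamics is given below, assuming $\Delta t = 1$ and a symmetric weighted adjacency stored as nested dictionaries. The helper `homogeneous_r0` uses $R_0 \approx \beta \langle s \rangle / \mu$, which is our reading of the homogeneous approximation and is consistent with the reported values (for example, 0.01 × 31 / 0.10 = 3.1). All function names and the toy network are hypothetical.

```python
import math
import random

def simulate_sir(neighbors, seed_node, beta=0.01, mu=0.10, dt=1.0, rng=None):
    """neighbors: {node: {neighbor: weight}}, assumed symmetric.
    Returns (final_state, infector), where infector maps each infected node to its source."""
    rng = rng or random.Random()
    state = {node: "S" for node in neighbors}
    state[seed_node] = "I"
    infector = {}
    p_recover = 1.0 - math.exp(-mu * dt)
    while any(s == "I" for s in state.values()):
        infected = [node for node, s in state.items() if s == "I"]
        newly = {}
        # Synchronous update: infections are decided from the states at the start of the step
        for i in infected:
            for j, w in neighbors[i].items():
                if state[j] == "S" and j not in newly:
                    if rng.random() < 1.0 - math.exp(-beta * w * dt):
                        newly[j] = i
        # Only nodes that were already infected at the start of the step may recover
        for i in infected:
            if rng.random() < p_recover:
                state[i] = "R"
        for j, i in newly.items():
            state[j] = "I"
            infector[j] = i
    return state, infector

def homogeneous_r0(mean_strength, beta=0.01, mu=0.10):
    """Assumed homogeneous approximation: R0 = beta * <s> / mu."""
    return beta * mean_strength / mu

# Hypothetical three-node chain with large weights so that transmission is likely
toy = {"a": {"b": 50.0}, "b": {"a": 50.0, "c": 50.0}, "c": {"b": 50.0}}
final_state, infector = simulate_sir(toy, "a", rng=random.Random(0))
```

The `infector` map is not needed for the dynamics themselves, but recording it makes it possible to count secondary infections per individual afterwards.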

Risk of Units
In this section, we assess the risk of each location in terms of its potential to spread an outbreak. Note that this does not capture where the outbreak originates. Instead, we assumed that one HCW gets infected, either by a patient in the hospital or outside it, and that the outbreak then spreads through the other HCWs. This process was simulated by randomly choosing one node in the system and moving it into the infected compartment. Due to the stochasticity of the model, the results were averaged over 10,000 independent runs. In the Supplementary Material, we also explored the impact of some of these assumptions. In particular, Supplementary Figures S2 and S3 show the effect of adding a constant flux of infections from the outside, and Supplementary Tables S1 and S2 show the impact of the seed selection.

Hitting Time
The hitting time is defined as the average amount of time that it takes for the disease to reach a specific location [28]. In the case of hospital units, we defined it as the time until one HCW located in unit $l$ gets infected, $HT_l$. Therefore, the smaller the hitting time, the more at risk a location is. We then gauged the risk of each location in comparison to the rest by dividing each $HT_l$ by the average hitting time in the hospital, $\langle HT \rangle$. In Figure 3A, we show the risk computed using this procedure for each type of unit. We observe that in the three hospitals, the disease would arrive sooner in units under the category of inpatient area. Note also that, even though most locations categorized as public areas are at an average risk, the riskiest individual locations also belong to the public area category. These are mainly cafeterias, which are visited by many employees and are thus easily reached by the disease.
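The normalization above can be illustrated with a short sketch. It assumes that each simulation run provides the infection time of every worker who got infected (with the seed at time 0), that a unit's hitting time in a run is the earliest infection time among its members, and that runs in which the unit is never reached are skipped. These conventions and the names used are assumptions for illustration only.

```python
from statistics import mean

def relative_hitting_times(runs, unit_members):
    """runs: list of {worker: infection time}, one dict per simulation run.
    unit_members: {unit: set of workers who visit that unit}.
    Returns {unit: HT_l / <HT>}."""
    hitting_time = {}
    for unit, members in unit_members.items():
        arrivals = []
        for times in runs:
            reached = [times[w] for w in members if w in times]
            if reached:  # assumption: skip runs where the unit is never reached
                arrivals.append(min(reached))
        if arrivals:
            hitting_time[unit] = mean(arrivals)
    average = mean(list(hitting_time.values()))
    return {unit: value / average for unit, value in hitting_time.items()}

# Hypothetical infection times from two runs
toy_runs = [
    {"c1": 0, "n1": 2, "n2": 3},
    {"n2": 0, "n1": 4, "c1": 8},
]
toy_units = {"ward": {"n1", "n2"}, "office": {"c1"}}
relative_ht = relative_hitting_times(toy_runs, toy_units)
```

In this toy example the ward is reached after 1 time step on average and the office after 4, so their ratios to the average hitting time (2.5) are 0.4 and 1.6, respectively.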

Number of Infections
We further explored the risk posed by each unit by computing the average number of infections produced in each unit, $NI_l$. As in the previous case, we normalized this number by the average number of infections produced in any unit, $\langle NI \rangle$, in order to compare different hospitals. The results, shown in Figure 3B, agree with the previous observation that the inpatient area was the highest-risk category. However, we observed an important contribution of some medical-staff rooms. A closer inspection revealed that these locations were laboratories and research locations, in which the number of different HCWs was not that large, but those who visited them were likely to spend a considerable amount of time there. Thus, even though it took longer for the disease to arrive, once it did, it could easily spread among the workers located in those rooms.
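As an illustration of this normalization, the sketch below assumes that each infection event recorded in a simulation has already been tagged with the unit in which it took place. How that attribution is made is not specified here, so the tagging, the function name, and the toy data are all hypothetical.

```python
from collections import Counter
from statistics import mean

def relative_infections_per_unit(runs, units):
    """runs: list of lists of unit labels, one label per infection event in that run.
    units: all units of the hospital, including those with no infections.
    Returns {unit: NI_l / <NI>}."""
    n_runs = len(runs)
    totals = Counter(unit for events in runs for unit in events)
    # Average number of infections per run produced in each unit
    ni = {unit: totals[unit] / n_runs for unit in units}
    average = mean(list(ni.values()))
    return {unit: value / average for unit, value in ni.items()}

# Hypothetical events from two runs
toy_events = [["ward", "ward", "lab"], ["ward"]]
relative_ni = relative_infections_per_unit(toy_events, ["ward", "lab", "office"])
```

Units that never host an infection are kept with a value of zero, so that the average is taken over all units rather than only over those that were reached.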
In summary, both metrics indicate that spatial units belonging to the inpatient care and public area categories were at a relatively higher risk than the others. The units at the lowest risk were other-staff rooms (accounting and administration) and auxiliary rooms (laundry and housekeeping).

Risk of Individuals
As we saw in Figure 1, there was an important degree of heterogeneity across HCW occupations and their roles in different units. In what follows, we explore the risk associated with each occupation category in order to better understand the dynamics of the spreading. In [15], the 19 self-reported occupations identified in the surveys were grouped in four categories, but given the results from the previous section, we extracted two groups of HCW from the "other HCW" used in the study: technologist and researcher (note that in hospital C no one reported anything related to research as their occupation). We also split the "Admin/Support" category into administration and clerk due to the relatively large number of individuals in each category.

Probability of Getting Infected
A basic observable of the risk carried by an individual is the probability of getting infected. To obtain it, we ran 10,000 stochastic simulations and computed the probability that each individual got infected, $PI_i$. Then, we divided it by the average probability of getting infected for all individuals, $\langle PI \rangle$, and grouped individuals according to their occupation (Figure 4A). In order to clearly show the differences in risk between groups of HCWs, we summarize the box plot in Table 1, which lists the median and interquartile range of the risk for each occupational group of HCWs in each hospital. In this case, we observe that nurses and technologists were the HCWs at the highest risk. In contrast, people working in administration were at a lower risk of getting infected. Note that interactions with patients were not included, and it is expected that nurses will have more contact with patients than other occupations. Yet, their probability of getting infected was already greater based purely on their contacts with other HCWs. These observations are largely consistent with the reality in hospital settings, highlighting the role that occupational heterogeneity plays in the spreading risk of HAIs.
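The computation of the relative risk and its summary by occupation can be sketched as follows. The sketch assumes that each run is reduced to the set of workers who were ever infected and that quartiles are computed with the inclusive method of the Python standard library. Both choices, as well as the names and toy data, are assumptions rather than the procedure used in the study.

```python
from collections import defaultdict
from statistics import mean, median, quantiles

def relative_infection_probability(runs, workers):
    """runs: list of sets, each holding the workers ever infected in one run.
    Returns {worker: PI_i / <PI>}."""
    n_runs = len(runs)
    prob = {w: sum(w in infected for infected in runs) / n_runs for w in workers}
    average = mean(list(prob.values()))
    return {w: p / average for w, p in prob.items()}

def summarize_by_occupation(risk, occupation):
    """Median and quartiles of the relative risk per occupation.
    Each group needs at least two workers for the quartiles to be defined."""
    groups = defaultdict(list)
    for worker, value in risk.items():
        groups[occupation[worker]].append(value)
    summary = {}
    for name, values in groups.items():
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        summary[name] = {"median": median(values), "q1": q1, "q3": q3}
    return summary

# Hypothetical outcomes of four runs
toy_runs = [{"n1", "n2", "a1"}, {"n1", "n2"}, {"n1", "n2"}, {"n1"}]
toy_occupation = {"n1": "nurse", "n2": "nurse", "a1": "admin", "a2": "admin"}
risk = relative_infection_probability(toy_runs, toy_occupation.keys())
summary = summarize_by_occupation(risk, toy_occupation)
```

Whether the initially infected worker counts as infected in a run is a convention. Here the seed is assumed to be included in each set.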

Effective Reproduction Number
Last, we computed the effective reproduction number of each individual, that is, the average number of secondary infections they produce during an outbreak, $R_i$. We divided this quantity by the average effective reproduction number, $\langle R \rangle$, to gauge the risk posed by each individual. Then, we grouped individuals again according to their occupation (Figure 4B). In Table 2, we show the median and interquartile range of the risk associated with each group of HCWs for each hospital. In line with the previous observation, nurses were the category with the highest risk in terms of the effective reproduction number. Thus, nurses were not only more likely to get infected but also more likely to spread the disease. As such, they should be a priority when implementing new protection measures against HAIs. At the other extreme, working under the administration category was relatively less risky.
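A possible way to compute this quantity from simulation output is sketched below. It assumes that each run records its seed and, for every other infected worker, who infected them, and that $R_i$ is averaged only over the runs in which individual $i$ was infected. That averaging convention, the function name, and the toy data are assumptions.

```python
from collections import Counter, defaultdict
from statistics import mean

def relative_reproduction_numbers(runs):
    """runs: list of (seed, infector) pairs, where infector maps each
    secondary case to the worker who infected them.
    Returns {worker: R_i / <R>} for workers infected in at least one run."""
    samples = defaultdict(list)
    for seed, infector in runs:
        secondary = Counter(infector.values())
        # Every infected worker contributes a sample, including those who infect nobody
        for worker in {seed, *infector}:
            samples[worker].append(secondary[worker])
    r = {worker: mean(counts) for worker, counts in samples.items()}
    average = mean(list(r.values()))
    if average == 0:
        return {worker: 0.0 for worker in r}  # no secondary infections in any run
    return {worker: value / average for worker, value in r.items()}

# Hypothetical transmission trees from two runs
toy_runs = [
    ("n1", {"n2": "n1", "a1": "n1", "t1": "n2"}),
    ("n2", {"n1": "n2"}),
]
relative_r = relative_reproduction_numbers(toy_runs)
```

In this toy example both nurses produce one secondary infection on average, while the other two workers produce none, so the nurses' relative values are 2.0.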

DISCUSSION
In this work, we have considered the spatial and temporal information self-reported by healthcare workers in three hospitals in Canada to generate data-driven networks of interactions between HCWs. In hospital settings, the functional characteristics of some categories of units result in a particularly high risk of spreading healthcare-associated infections. On the other hand, the heterogeneity of the daily activities of HCWs might also constitute an important risk factor for the transmission of HAIs. Therefore, the main purpose of this work was to quantify the differences across categories of spatial units and workers. Accordingly, we delved into the dynamics of HAIs by implementing an SIR model on the generated networks to simulate the spread of an infectious disease within the hospital settings. This simplified model allowed us to understand the dynamics of the system without having to focus on the specific characteristics of a certain type of pathogen.
For the units, we proposed two risk indicators given by the disease hitting time and the number of infections generated per unit. The first metric quantifies how fast a given disease could reach different units in the hospital, while the second provides a measure of the spreading potential of a unit. We found that the risk levels of units are heterogeneous and spatially distributed. In particular, the units labeled as inpatient care or public area showed higher risks of spreading a disease than others. In contrast, the other-staff rooms and auxiliary rooms categories were comparatively at a lower spreading risk. Similarly, we quantitatively assessed the risk of groups of HCWs using two metrics. We focused on the impact of the diversity of occupations on the risks of HCWs by calculating the probability of getting infected and the effective reproduction number. The results indicate that HCWs belonging to the nurse category were the most susceptible to being infected and, at the same time, that nurse was the occupation with the largest spreading potential in terms of the capacity to generate new infected individuals. In addition, most of the HCWs labeled as administration were comparatively less likely to get infected and to transmit the disease to others. Therefore, overall, our results indicate which units and occupations should be preferentially targeted by prevention plans aimed at reducing the probability of getting infected, the likelihood of generating infections, and the spreading of HAIs in hospital settings.
Last, a few remarks are in order. First, we observe that hospital size alone already produces denser and more connected networks, which increases the value of $R_0$ and thus facilitates the propagation of this type of infectious disease. Second, the networks were constructed using only the self-reported information on the interaction patterns among co-workers, disregarding the risk that some specific units might pose to the people working there, or the possibility of patients contributing to the spreading. However, we believe this limitation highlights something important. Based solely on the amount of time spent at each location, we have identified which areas and occupations are at the highest risk in terms of HAIs. These, in turn, are precisely areas with many patients and in which riskier activities take place, which can only increase their role in spreading HAIs. Thus, special attention should be paid to interactions among HCWs, rather than focusing only on patient-worker interactions. To conclude, it is important to note that this type of agent-based model requires the estimation of many parameters in order to be calibrated to real-world disease data, something beyond the scope of this study. However, there are promising inference techniques that have demonstrated their potential to make these mathematical models useful in real-world settings [29][30][31][32][33].
In summary, the data-driven model presented here provides insights into the dynamics of healthcare-associated infections in hospital settings. The findings reveal the key role of the diversity of categories of units and various occupations of HCWs on the risk of transmitting infectious diseases. In terms of controlling the transmission of HAIs, we believe that appropriate interventions could be developed through quantitatively assessing the risks of both units and individuals using data of the nature employed here. Finally, our work constitutes the first step towards more sophisticated analyses and models whose purpose is the optimization of the spatial organization of units and occupations of real settings to reduce the inherent risks of HAIs.

DATA AVAILABILITY STATEMENT
The data analyzed in this study are subject to the following licenses/restrictions: The data collected in the surveys are not publicly available due to the inclusion of hospital administrative and human resources data. The networks constructed using the data extracted from the surveys can be downloaded from https://github.com/aaleta/hospital_networks or https://doi.org/10.5281/zenodo.6561171.

AUTHOR CONTRIBUTIONS
DL, AA, and YM designed the study. DL and AA carried out the research. DL, AA, and YM analyzed and interpreted the results. DL wrote the first draft of the manuscript. DL, AA, and YM contributed to the writing and have approved this manuscript.

FUNDING
The original work leading to the construction of the networks of interactions was supported by grants from the Canadian Institutes of Health Research (CIHR). DL acknowledges support from the China Scholarship Council through a PhD fellowship. AA and YM acknowledge partial support from the Government of Aragon and FEDER funds, Spain through grant E36-20R (FENOL), by MCIN/AEI and FEDER funds (grant PID2020-115800GB-I00), from Banco Santander (Santander-UZ 2020/0274) and by Soremartec S.A. and Soremartec Italia, Ferrero Group. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.