<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Big Data</journal-id>
<journal-title>Frontiers in Big Data</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Big Data</abbrev-journal-title>
<issn pub-type="epub">2624-909X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fdata.2022.893760</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Big Data</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Network-Informed Constrained Divisive Pooled Testing Assignments</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Sewell</surname> <given-names>Daniel K.</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1370072/overview"/>
</contrib>
</contrib-group>
<aff><institution>Department of Biostatistics, University of Iowa</institution>, <addr-line>Iowa City, IA</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Yang Yang, Northwestern University, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Josh Introne, Syracuse University, United States; Yiqi Li, Syracuse University, United States</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Daniel K. Sewell <email>daniel-sewell&#x00040;uiowa.edu</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Big Data Networks, a section of the journal Frontiers in Big Data</p></fn></author-notes>
<pub-date pub-type="epub">
<day>08</day>
<month>07</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>5</volume>
<elocation-id>893760</elocation-id>
<history>
<date date-type="received">
<day>10</day>
<month>03</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>06</day>
<month>06</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 Sewell.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Sewell</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license></permissions>
<abstract>
<p>Frequent universal testing in a finite population is an effective approach to preventing large infectious disease outbreaks. Yet when the target group has many constituents, this strategy can be cost prohibitive. One approach to alleviate the resource burden is to group multiple individual tests into one unit in order to determine if further tests at the individual level are necessary. This approach, referred to as group testing or pooled testing, has received much attention, particularly with respect to finding the minimum cost pooling strategy. Existing approaches, however, assume either independence or very simple dependence structures between individuals. This assumption ignores the fact that in the context of infectious diseases there is an underlying transmission network that connects individuals. We develop a constrained divisive hierarchical clustering algorithm that assigns individuals to pools based on the contact patterns between individuals. In a simulation study based on real networks, we show the benefits of using our proposed approach compared to random assignments even when the network is imperfectly measured and there is a high degree of missingness in the data.</p></abstract>
<kwd-group>
<kwd>group testing</kwd>
<kwd>infectious disease</kwd>
<kwd>network analysis</kwd>
<kwd>divisive clustering</kwd>
<kwd>epidemiology</kwd>
</kwd-group>
<contract-num rid="cn001">5 U01 CK000531-02</contract-num>
<contract-sponsor id="cn001">Centers for Disease Control and Prevention<named-content content-type="fundref-id">10.13039/100000030</named-content></contract-sponsor>
<counts>
<fig-count count="2"/>
<table-count count="1"/>
<equation-count count="10"/>
<ref-count count="46"/>
<page-count count="7"/>
<word-count count="4949"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>The silent spreading of an infectious disease occurs when individuals who are asymptomatic or presymptomatic transmit the disease to those who are not infected. This has been one of the defining features of the current COVID-19 pandemic, differentiating SARS-CoV-2 from, say, the 2003 SARS-CoV epidemic (Huff, <xref ref-type="bibr" rid="B15">2020</xref>). Many studies have shown COVID-19 asymptomatic rates of 50% or higher (Oran and Topol, <xref ref-type="bibr" rid="B35">2020</xref>; Sutton et al., <xref ref-type="bibr" rid="B43">2020</xref>; Almadhi et al., <xref ref-type="bibr" rid="B3">2021</xref>), and even when symptoms do appear, peak viral shedding occurs prior to the presentation of symptoms (He et al., <xref ref-type="bibr" rid="B13">2020</xref>). Researchers have noted that even isolating 100% of symptomatic cases at the time of symptom onset is insufficient for infection control (Moghadas et al., <xref ref-type="bibr" rid="B32">2020</xref>), noting that &#x0201C;current strategies that rely solely on &#x02018;symptom onset&#x02019; for infection identification need urgent reassessment&#x0201D; (Huff and Singh, <xref ref-type="bibr" rid="B16">2020</xref>).</p>
<p>There are two traditional methods of dampening the impact of silent spread. The first is contact tracing, whereby known cases are asked to enumerate their recent contacts, and these contacts are subsequently asked to adhere to quarantining procedures. However, there exist many opportunities for this strategy to fail. Sociological studies have long shown that individuals (the known case, in our context) may forget several contacts, even some of the most important ones (Killworth and Bernard, <xref ref-type="bibr" rid="B20">1976</xref>, <xref ref-type="bibr" rid="B21">1977</xref>, <xref ref-type="bibr" rid="B22">1979</xref>; Bernard et al., <xref ref-type="bibr" rid="B5">1979</xref>, <xref ref-type="bibr" rid="B6">1982</xref>; Freeman et al., <xref ref-type="bibr" rid="B12">1987</xref>). In addition, it may be hard to reach these contacts, and even when they are reached, they may choose to ignore some or all quarantining protocols. Indeed, studies have shown that the success rate of quarantining the contacts of known cases is less than 20% (Reynolds et al., <xref ref-type="bibr" rid="B37">2008</xref>; Bharti et al., <xref ref-type="bibr" rid="B7">2020</xref>).</p>
<p>The second strategy for controlling silent spread is to implement regular universal screening, whereby everyone within some finite population of interest is tested on a regular basis in order to detect cases prior to symptom onset. This can be a highly efficacious strategy, but the frequency of testing often must be high (Larremore et al., <xref ref-type="bibr" rid="B23">2021</xref>). This places a very large resource burden on those tasked with providing so many tests, as still seen in the COVID-19 pandemic (Huff, <xref ref-type="bibr" rid="B15">2020</xref>).</p>
<p>Pooled testing is a method that in certain circumstances can be used to greatly alleviate this resource burden (Abdalhamid et al., <xref ref-type="bibr" rid="B1">2020</xref>; Pilcher et al., <xref ref-type="bibr" rid="B36">2020</xref>; Wacharapluesadee et al., <xref ref-type="bibr" rid="B46">2020</xref>). In the COVID-19 pandemic, several countries have implemented pooled testing, such as China, Germany, Israel, and Thailand (Mandavilli, <xref ref-type="bibr" rid="B30">2020</xref>). Within the United States, several organizations have also implemented pooled testing, including the Nebraska Public Health Laboratory (Stone, <xref ref-type="bibr" rid="B42">2020</xref>), Duke University (Denny et al., <xref ref-type="bibr" rid="B9">2020</xref>), Stony Brook University (The State University of New York at Stony Brook, <xref ref-type="bibr" rid="B44">2020</xref>), and UC San Diego Health (Elkalla, <xref ref-type="bibr" rid="B11">2020</xref>).</p>
<p>Broadly speaking, pooled testing is the act of combining multiple individual tests in order to determine whether individual-level testing is necessary. The analysis of pooled tests was first formalized by Dorfman (<xref ref-type="bibr" rid="B10">1943</xref>), whose approach has since been referred to as the two-stage Dorfman procedure. This is a simple approach where a certain number of samples are pooled and tested; should the resulting diagnostic test be negative, no more tests are conducted, whereas if positive, all individuals comprising the pool are subsequently tested. Other pooled testing strategies include the Sterrett Procedure (Sterrett, <xref ref-type="bibr" rid="B41">1957</xref>) as well as hierarchical approaches (Black et al., <xref ref-type="bibr" rid="B8">2015</xref>; Malinovsky et al., <xref ref-type="bibr" rid="B29">2020</xref>). Work has also been done to generalize these procedures to the context where there are known heterogeneous probabilities of being infected (e.g., Hwang, <xref ref-type="bibr" rid="B18">1975</xref>), including some of the previously mentioned studies. Because of the simplicity and widespread use of the two-stage Dorfman procedure (Hughes-Oliver, <xref ref-type="bibr" rid="B17">2006</xref>), we will focus on this pooled testing strategy.</p>
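<p>As an illustration only, the two-stage Dorfman procedure for a single pool can be sketched as follows. The function name <monospace>dorfman_two_stage</monospace> and its arguments are hypothetical, and the sketch assumes that the pooled test has the same sensitivity and specificity as an individual test (i.e., no dilution effect).</p>
<preformat><![CDATA[
```python
import random


def dorfman_two_stage(pool_status, sensitivity=1.0, specificity=1.0, rng=None):
    """Run the two-stage Dorfman procedure on one pool.

    pool_status: list of true infection statuses (1 = infected, 0 = not).
    Returns (number_of_tests, classifications), where classifications[i]
    is the status assigned to the i-th member of the pool.
    """
    rng = rng or random.Random()

    def run_test(truly_positive):
        # A truly positive sample tests positive with probability `sensitivity`;
        # a truly negative sample tests positive with probability 1 - specificity.
        prob_positive = sensitivity if truly_positive else 1.0 - specificity
        return rng.random() < prob_positive

    # Stage 1: a single test on the pooled sample.
    n_tests = 1
    if not run_test(any(pool_status)):
        # Negative pool: everyone is classified as uninfected, no further tests.
        return n_tests, [0] * len(pool_status)

    # Stage 2: the pool tested positive, so every member is tested individually.
    classifications = [int(run_test(bool(status))) for status in pool_status]
    return n_tests + len(pool_status), classifications
```
]]></preformat>
<p>With a perfect test, a pool of four uninfected individuals costs one test, whereas a pool of four containing one infected individual costs five tests; this asymmetry is what makes the composition of the pools matter.</p>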
<p>The above approaches all depend on the assumption of independent samples. This may be reasonable in some contexts, but in the context of infectious disease, it can only be justified if those being tested are sufficiently isolated from one another. If, e.g., a school, workplace, or public health department is testing a set of individuals who interact with one another, this assumption is grossly violated. This independence assumption is relaxed in a study by Lendle et al. (<xref ref-type="bibr" rid="B25">2012</xref>), yet even here it is assumed that the individuals being tested are exchangeable within certain clusters, and that individuals in different clusters are independent. This may be applicable in some settings (such as the example of Lendle et al. (<xref ref-type="bibr" rid="B25">2012</xref>), where multiple T-cell responses are measured within each individual, and hence a compound symmetry correlation structure is reasonable), but is clearly not the case with any realistic transmission network. In a recent study, Sewell (<xref ref-type="bibr" rid="B39">In Press</xref>) developed a method for utilizing network information in order to improve pooled testing efficiency. However, the proposed simulated annealing algorithm is computationally burdensome and is not feasible for medium to large networks. The goal of this study is to develop an algorithm that can improve the efficiency of the two-stage Dorfman procedure by leveraging information on the underlying transmission network.</p>
<p>The remainder of the paper is organized as follows. Sections 2.1 and 2.2 describe the objective function and our proposed algorithm, respectively. Section 2.3 describes the data we analyzed and the simulation study conducted. Section 3 reports the results from this study, and Section 4 provides a discussion.</p>
</sec>
<sec sec-type="methods" id="s2">
<title>2. Methods</title>
<sec>
<title>2.1. Objective</title>
<p>It has long been recognized that in the presence of diagnostic testing error (i.e., the sensitivity and specificity do not both equal 1), it should not be the goal to only minimize the expected number of tests. Rather, the expected number of correct classifications ought to be accounted for as well. Malinovsky et al. (<xref ref-type="bibr" rid="B28">2016</xref>) proposed using the ratio of the expected number of correctly classified individuals to the expected number of tests and then derived this quantity for the case of independent individuals. For the more general setting, our objective function is given below, but first, we need to introduce some notation.</p>
<p>Let <italic>y</italic><sub><italic>i</italic></sub> equal one if the <italic>i</italic><sup><italic>th</italic></sup> individual is infected and zero otherwise for <italic>i</italic> &#x0003D; 1, 2, &#x02026;, <italic>N</italic>, where <italic>N</italic> is the number of individuals participating in the pooled testing. Let <italic>Z</italic><sub><italic>i</italic></sub> &#x02208; {1, 2, &#x02026;, <italic>P</italic>} denote which of the <italic>P</italic> pools individual <italic>i</italic> belongs to, and let <inline-formula><mml:math id="M1"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">I</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02282;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:mi>N</mml:mi></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula> be the set of individuals belonging to the <italic>p</italic><sup><italic>th</italic></sup> pool, where every pool is of size <italic>K</italic> (&#x0003D; <italic>N</italic>/<italic>P</italic>). Let <italic>T</italic> denote the total number of tests conducted and <italic>C</italic> the total number of correct classifications. Finally, let <italic>p</italic>, when not used as a pool index, denote the population prevalence of the disease, and let <italic>S</italic><sub><italic>p</italic></sub> and <italic>S</italic><sub><italic>e</italic></sub> denote the specificity and sensitivity of the test, respectively.</p>
<p>With regard to the network, let <italic>A</italic> denote the <italic>N</italic> &#x000D7; <italic>N</italic> adjacency matrix such that <italic>A</italic><sub><italic>ij</italic></sub> equals one if there is an edge between individuals <italic>i</italic> and <italic>j</italic> and zero otherwise. Let <inline-formula><mml:math id="M2"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> denote the neighbors of <italic>i</italic>, i.e., {<italic>j</italic>:<italic>A</italic><sub><italic>ij</italic></sub> &#x0003D; 1}.</p>
<p>The expected number of tests for the <italic>N</italic> individuals for a given pooling assignment vector <italic>Z</italic> can be shown to equal</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M3"><mml:mrow><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>T</mml:mi><mml:mo stretchy='false'>&#x0007C;</mml:mo><mml:mi>Z</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>N</mml:mi><mml:msub><mml:mi>S</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:mi>K</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>P</mml:mi></mml:munderover><mml:mrow><mml:mi>&#x02119;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>y</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msup><mml:msub><mml:mrow></mml:mrow><mml:mrow><mml:mi>I</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mn>&#x1D7D9;</mml:mn><mml:mi>K</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mstyle></mml:mrow></mml:math></disp-formula>
<p>where &#x1D7D9;<sub><italic>m</italic></sub> is the <italic>m</italic> &#x000D7; 1 vector of ones. The expected number of correct classifications given <italic>Z</italic> can be shown to equal</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M4"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>C</mml:mi><mml:mo stretchy='false'>&#x0007C;</mml:mo><mml:mi>Z</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>N</mml:mi><mml:msubsup><mml:mi>S</mml:mi><mml:mi>e</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo>+</mml:mo><mml:mi>N</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='true'>(</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:msub><mml:mi>S</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msubsup><mml:mi>S</mml:mi><mml:mi>e</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo stretchy='true'>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>+</mml:mo><mml:mi>K</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>P</mml:mi></mml:munderover><mml:mrow><mml:mi>&#x02119;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:msup><mml:mi>y</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:mstyle><mml:mrow><mml:mi>I</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mn>&#x1D7D9;</mml:mn><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:mstyle></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The objective function is then defined to be</p>
<disp-formula id="E3"><label>(3)</label><mml:math id="M5"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>Q</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>Z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>C</mml:mi><mml:mo>|</mml:mo><mml:mi>Z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi><mml:mo>|</mml:mo><mml:mi>Z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>In very few cases will the quantities <inline-formula><mml:math id="M6"><mml:mi>&#x02119;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>y</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msup><mml:msub><mml:mrow></mml:mrow><mml:mrow><mml:mi>I</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mn>&#x1D7D9;</mml:mn><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:math></inline-formula>, and hence <italic>Q</italic>(<italic>Z</italic>), be known in closed form. However, given any simulator <italic>F</italic> of a data set <bold>y</bold> (e.g., a network-based compartmental or agent-based model), we can use Monte Carlo approximations to estimate these probabilities to any desired precision by increasing the number of simulated draws.</p>
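<p>A minimal sketch of such a Monte Carlo approximation is given below. It assumes that the simulated infection vectors are already available as the rows of a 0/1 array; the function name <monospace>estimate_objective</monospace>, its arguments, and the zero-based pool labels are illustrative choices rather than part of the method, and the prevalence is estimated from the same draws.</p>
<preformat><![CDATA[
```python
import numpy as np


def estimate_objective(y_draws, Z, sensitivity, specificity):
    """Monte Carlo estimates of E(T|Z), E(C|Z) and Q(Z) (Equations 1-3).

    y_draws: (M, N) array of 0/1 values; each row is one simulated vector y.
    Z: length-N array of pool labels; all pools must have the same size K.
    """
    y_draws = np.asarray(y_draws)
    Z = np.asarray(Z)
    n_individuals = y_draws.shape[1]
    pools = np.unique(Z)
    n_pools = len(pools)
    pool_size = n_individuals // n_pools
    if any(np.sum(Z == p) != pool_size for p in pools):
        raise ValueError("all pools must have the same size K = N / P")

    # Assumption: the prevalence is estimated by the overall mean of the draws.
    prevalence = y_draws.mean()

    # Sum over pools of the estimated probability that the pool has no infections.
    clear_sum = sum(np.mean(y_draws[:, Z == p].sum(axis=1) == 0) for p in pools)

    youden = sensitivity + specificity - 1.0
    expected_tests = (
        n_pools + n_individuals * sensitivity - pool_size * youden * clear_sum
    )
    expected_correct = (
        n_individuals * sensitivity**2
        + n_individuals * (1.0 - prevalence)
        * (sensitivity * specificity + 1.0 - sensitivity - sensitivity**2)
        + pool_size * (1.0 - specificity) * youden * clear_sum
    )
    return expected_tests, expected_correct, expected_correct / expected_tests


if __name__ == "__main__":
    # Placeholder simulator: independent infections, used only to exercise the code.
    rng = np.random.default_rng(1)
    draws = rng.binomial(1, 0.05, size=(5000, 12))
    print(estimate_objective(draws, np.repeat(np.arange(3), 4), 0.9, 0.95))
```
]]></preformat>
<p>In practice the independent draws in the demonstration would be replaced by output from a network-based simulator, and the same set of draws can be reused to compare candidate assignments <italic>Z</italic>.</p>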
</sec>
<sec>
<title>2.2. Constrained Divisive Pool Assignments</title>
<p>The specific assignment of individuals to pools affects the objective function only through the probability of having pools with no infected individuals. That is, provided the test is informative (<italic>S</italic><sub><italic>p</italic></sub> &#x0002B; <italic>S</italic><sub><italic>e</italic></sub> &#x0003E; 1), the numerator of <italic>Q</italic>(<italic>Z</italic>) is maximized and the denominator is minimized by maximizing <inline-formula><mml:math id="M7"><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>P</mml:mi></mml:msubsup><mml:mrow><mml:mi>&#x02119;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>y</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msup><mml:msub><mml:mrow></mml:mrow><mml:mrow><mml:mi>I</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mn>&#x1D7D9;</mml:mn><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mstyle></mml:math></inline-formula>. Factorizing each term of this sum into a product of conditional probabilities is very simple, yet instructive for our purposes:</p>
<disp-formula id="E4"><label>(4)</label><mml:math id="M8"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>P</mml:mi></mml:munderover><mml:mi>&#x02119;</mml:mi></mml:mstyle><mml:mo stretchy='false'>(</mml:mo><mml:msup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>y</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msup><mml:msub><mml:mrow></mml:mrow><mml:mrow><mml:mi>I</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mtext>&#x1D7D9;</mml:mtext><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>=</mml:mo><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>P</mml:mi></mml:munderover><mml:mrow><mml:mrow><mml:mo>[</mml:mo> <mml:mrow><mml:mi>&#x02119;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x0220F;</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:mrow><mml:mi>K</mml:mi></mml:munderover><mml:mi>&#x02119;</mml:mi></mml:mstyle><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo 
stretchy='false'>&#x0007C;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x022EF;</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>k</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow> <mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:mstyle><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where the subsequence <inline-formula><mml:math id="M9"><mml:msubsup><mml:mrow><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>K</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> consists of the <italic>K</italic> members of <inline-formula><mml:math id="M10"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">I</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>.</p>
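<p>To make Equation (4) concrete, the following sketch uses a deliberately artificial joint distribution (an assumption made purely for illustration): four individuals form two households, and each household is either entirely infected or entirely uninfected, independently with probability 0.1. All function names are hypothetical. The sketch checks the product-of-conditionals form against direct summation and compares two pool assignments.</p>
<preformat><![CDATA[
```python
from itertools import product


def household_pmf(prob_household_infected=0.1):
    """Joint pmf over (y0, y1, y2, y3): individuals 0 and 1 share a household,
    as do 2 and 3; a household is entirely infected or entirely uninfected."""
    pmf = {}
    for h1, h2 in product([0, 1], repeat=2):
        prob = 1.0
        for h in (h1, h2):
            prob *= prob_household_infected if h == 1 else 1.0 - prob_household_infected
        pmf[(h1, h1, h2, h2)] = prob
    return pmf


def prob_all_zero(pmf, members):
    """P(y_i = 0 for every i in members), by direct summation over the pmf."""
    return sum(prob for y, prob in pmf.items() if all(y[i] == 0 for i in members))


def chain_rule_all_zero(pmf, members):
    """The same probability, built as the product of conditionals in Equation (4)."""
    result = 1.0
    for k in range(len(members)):
        numerator = prob_all_zero(pmf, members[: k + 1])
        denominator = prob_all_zero(pmf, members[:k])  # empty prefix sums to 1
        result *= numerator / denominator
    return result


def sum_pool_clear_probs(pmf, pools):
    """Sum over pools of the probability that the pool contains no infections."""
    return sum(prob_all_zero(pmf, pool) for pool in pools)


pmf = household_pmf(0.1)
within_households = [(0, 1), (2, 3)]   # contacts pooled together
across_households = [(0, 2), (1, 3)]   # contacts split across pools
print(sum_pool_clear_probs(pmf, within_households))
print(sum_pool_clear_probs(pmf, across_households))
```
]]></preformat>
<p>Pooling within households gives a sum of about 1.8 (each pool is clear with probability 0.9, since the second conditional in the product equals one), whereas pooling across households gives about 1.62 (0.9 &#x000D7; 0.9 per pool). In this toy setting, placing contacts in the same pool raises the quantity being maximized.</p>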
<p>In the context of infectious disease, we feel it is eminently reasonable to assume the following:</p>
<disp-formula id="E5"><label>(5)</label><mml:math id="M11"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mtext>For&#x000A0;</mml:mtext><mml:msub><mml:mi mathvariant="-tex-caligraphic">S</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="-tex-caligraphic">S</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>&#x02282;</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:mi>N</mml:mi><mml:mo>&#x0007D;</mml:mo><mml:mo>&#x02216;</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x0007D;</mml:mo><mml:mtext>&#x000A0;such&#x000A0;that&#x000A0;</mml:mtext><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mi mathvariant="-tex-caligraphic">S</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x0007C;</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mi mathvariant="-tex-caligraphic">S</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>&#x0007C;</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>if&#x000A0;</mml:mtext><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mi mathvariant="-tex-caligraphic">S</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02229;</mml:mo><mml:msub><mml:mi mathvariant="-tex-caligraphic">N</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x0007C;</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x0003E;</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mi mathvariant="-tex-caligraphic">S</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>&#x02229;</mml:mo><mml:msub><mml:mi mathvariant="-tex-caligraphic">N</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x0007C;</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>then&#x000A0;</mml:mtext><mml:mi>&#x02119;</mml:mi><mml:mo stretchy='true'>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>&#x0007C;</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mi mathvariant="-tex-caligraphic">S</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x0007D;</mml:mo><mml:mo stretchy='true'>)</mml:mo><mml:mo>&#x0003E;</mml:mo><mml:mi>&#x02119;</mml:mi><mml:mo stretchy='true'>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>&#x0007C;</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mi mathvariant="-tex-caligraphic">S</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>&#x0007D;</mml:mo><mml:mo stretchy='true'>)</mml:mo><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>In other words, we are more confident that an individual is not infected if we know their neighbors are also not infected than if we know that the same number of non-neighbors are not infected. As an example of this, consider the following autologistic actor attribute model (ALAAM) (Robins et al., <xref ref-type="bibr" rid="B38">2001</xref>), given by:</p>
<disp-formula id="E6"><mml:math id="M12"><mml:mrow><mml:mi>&#x02119;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>y</mml:mi></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>&#x003D5;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>&#x003B8;</mml:mi></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo class="qopname">exp</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>y</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup><mml:msub><mml:mrow><mml:mn>&#x1D7D9;</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>y</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup><mml:mi>A</mml:mi><mml:mstyle mathvariant="bold"><mml:mi>y</mml:mi></mml:mstyle></mml:mrow><mml:mo>}</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
<p>which controls the overall prevalence of the disease through the parameter &#x003B8;<sub>1</sub> and the transmissibility between neighbors through &#x003B8;<sub>2</sub>, and where &#x003D5;(<bold>&#x003B8;</bold>) is a normalizing constant involving <bold>&#x003B8;</bold> :&#x0003D; (&#x003B8;<sub>1</sub>, &#x003B8;<sub>2</sub>). Without loss of generality, consider <inline-formula><mml:math id="M13"><mml:mrow><mml:mi>&#x02119;</mml:mi><mml:mo stretchy='true'>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>&#x0007C;</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi mathvariant="-tex-caligraphic">S</mml:mi><mml:mo>&#x0007D;</mml:mo><mml:mo stretchy='true'>)</mml:mo></mml:mrow></mml:math></inline-formula> for some set <inline-formula><mml:math id="M14"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">S</mml:mi></mml:mrow><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mn>3</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:mi>S</mml:mi></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>. This quantity can be shown to equal</p>
<disp-formula id="E7"><mml:math id="M15"><mml:mi>&#x02119;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>&#x0007C;</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi mathvariant="-tex-caligraphic">S</mml:mi><mml:mo>&#x0007D;</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mo>[</mml:mo> <mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x0003E;</mml:mo><mml:mi>S</mml:mi><mml:mo>&#x0007D;</mml:mo></mml:mrow></mml:munder><mml:mrow><mml:mi>exp</mml:mi></mml:mrow></mml:mstyle><mml:mrow><mml:mo>{</mml:mo> <mml:mrow><mml:msub><mml:mi>&#x003B8;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>&#x0003E;</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003B8;</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>&#x0003E;</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>A</mml:mi><mml:mrow><mml:mn>1</mml:mn><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mstyle 
displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>S</mml:mi><mml:mo>&#x0003C;</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x0003C;</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>y</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:msub><mml:mi>A</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow> <mml:mo>}</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x0003E;</mml:mo><mml:mi>S</mml:mi><mml:mo>&#x0007D;</mml:mo></mml:mrow></mml:munder><mml:mrow><mml:mi>exp</mml:mi></mml:mrow></mml:mstyle><mml:mrow><mml:mo>{</mml:mo> <mml:mrow><mml:msub><mml:mi>&#x003B8;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>&#x0003E;</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003B8;</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mstyle 
displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>S</mml:mi><mml:mo>&#x0003C;</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x0003C;</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>y</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:msub><mml:mi>A</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:msub><mml:mi>&#x003B8;</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:msup></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>.</mml:mo></mml:math></disp-formula>
<p>From this, it can be seen that the higher the proportion of actor 1&#x00027;s edges that connect to members of set <inline-formula><mml:math id="M16"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">S</mml:mi></mml:mrow></mml:math></inline-formula>, and hence the smaller the quantity <inline-formula><mml:math id="M17"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>&#x0003E;</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, the larger the conditional probability that <italic>y</italic><sub>1</sub> &#x0003D; 0.</p>
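<p>The claim can be checked numerically on a small network. The following Python sketch enumerates every infection vector under the model in Equation (6) and compares the conditional probability that an actor is negative when the known negatives are its neighbors with the same probability when the known negatives are unconnected to it. The function name <monospace>conditional_prob_negative</monospace>, the six-node network, and the parameter values are illustrative assumptions, not quantities taken from the analysis.</p>
<preformat preformat-type="code"><![CDATA[
```python
import itertools
import math


def conditional_prob_negative(adj, theta1, theta2, target, known_negative):
    """P(y_target = 0 | y_j = 0 for j in known_negative), by full enumeration.

    adj is a symmetric 0/1 adjacency matrix (list of lists) with zero diagonal.
    Only feasible for very small networks; intended as a numerical check.
    """
    n = len(adj)
    known = set(known_negative)
    free = [i for i in range(n) if i != target and i not in known]
    weight = {0: 0.0, 1: 0.0}  # unnormalized mass for y_target = 0 and 1
    for y_target in (0, 1):
        for values in itertools.product((0, 1), repeat=len(free)):
            y = [0] * n
            y[target] = y_target
            for i, v in zip(free, values):
                y[i] = v
            n_pos = sum(y)
            # Counting each edge once equals (1/2) * y' A y.
            n_pos_edges = sum(
                y[i] * y[j] * adj[i][j] for i in range(n) for j in range(i + 1, n)
            )
            weight[y_target] += math.exp(theta1 * n_pos + theta2 * n_pos_edges)
    return weight[0] / (weight[0] + weight[1])


# Hypothetical network: actor 0 is tied to actors 1 and 2 only.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (4, 5)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

theta1, theta2 = -3.0, 1.5  # illustrative values only
p_neighbors = conditional_prob_negative(A, theta1, theta2, 0, [1, 2])
p_strangers = conditional_prob_negative(A, theta1, theta2, 0, [3, 4])
print(p_neighbors, p_strangers)
```
]]></preformat>
<p>When all of the actor&#x00027;s neighbors are known to be negative, the ratio of sums in Equation (7) equals one and the conditional probability reduces to 1/(1 &#x0002B; exp(&#x003B8;<sub>1</sub>)); conditioning on unconnected actors instead gives a strictly smaller value whenever &#x003B8;<sub>2</sub> is positive.</p>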
<p>Under the mild assumption in Equation (5), it can be seen through Equation (4) that <italic>Q</italic>(<italic>Z</italic>) is maximized when the edges connect individuals in the same pool. That is, we wish to minimize the boundary sets of edges bridging individuals in different pools. To this end, we begin with spectral clustering, a natural candidate for this type of problem (refer to, e.g., Von Luxburg, <xref ref-type="bibr" rid="B45">2007</xref>). However, we cannot simply apply <italic>k</italic>-means or some other simple clustering algorithm to the eigenvectors of the Laplacian matrix because our pool sizes are each fixed a priori at <italic>K</italic>. Therefore, we propose using a constrained divisive clustering method based on DIANA (MacNaughton-Smith et al., <xref ref-type="bibr" rid="B27">1964</xref>; Kaufman and Rousseeuw, <xref ref-type="bibr" rid="B19">1990</xref>).</p>
<p>Our proposed approach begins by computing the Laplacian matrix, <italic>L</italic>: &#x0003D; <italic>D</italic>&#x02212;<italic>A</italic>, where <italic>D</italic> is the diagonal matrix with the actors&#x00027; degrees along the diagonal (i.e., <inline-formula><mml:math id="M18"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>), and finding the eigenvectors corresponding to the <italic>P</italic> smallest eigenvalues. We then compute the distances between all <italic>N</italic> individuals and assign to the first pool the individual <italic>i</italic><sub>11</sub> who has the largest mean distance to all others. For <italic>k</italic> &#x0003D; 2, &#x02026;, <italic>K</italic>, we find the individual <italic>i</italic><sub>1<italic>k</italic></sub> who has the largest difference between the mean distance to those not belonging to the pool and the mean distance to those <italic>k</italic>&#x02212;1 individuals currently assigned to the pool. We remove these individuals (<italic>i</italic><sub>11</sub>, &#x02026;, <italic>i</italic><sub>1<italic>K</italic></sub>), and then iterate this for pools 2 through <italic>P</italic>&#x02212;1, where this last iteration splits the final 2<italic>K</italic> individuals into the last two pools. Details of the algorithm are given below in <xref ref-type="table" rid="T2">Algorithm 1</xref>.</p>
<table-wrap position="float" id="T2">
<label>Algorithm 1</label>
<caption><p>Divisive Pool Assignment Procedure.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-05-893760-i0001.tif"/>
</table-wrap> 
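<p>The divisive procedure described above can be sketched in Python as follows. This is a minimal illustration written from the prose description rather than a transcription of <xref ref-type="table" rid="T2">Algorithm 1</xref>, so details such as tie-breaking may differ. The function name <monospace>divisive_pools</monospace>, the requirement that <italic>N</italic> be a multiple of <italic>K</italic>, and the toy distance matrix are assumptions. Any symmetric distance matrix can be supplied, for example Euclidean distances between the rows of the matrix of Laplacian eigenvectors.</p>
<preformat preformat-type="code"><![CDATA[
```python
def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def divisive_pools(dist, pool_size):
    """Greedily peel off pools of fixed size from a symmetric distance matrix."""
    n = len(dist)
    if n % pool_size != 0:
        raise ValueError("this sketch assumes N is a multiple of the pool size K")
    remaining = list(range(n))
    pools = []
    while len(remaining) > pool_size:
        # Seed the pool with the individual farthest, on average, from the rest.
        seed = max(
            remaining,
            key=lambda i: _mean(dist[i][j] for j in remaining if j != i),
        )
        pool = [seed]
        candidates = [i for i in remaining if i != seed]
        while len(pool) < pool_size:
            # Mean distance to those outside the pool minus mean distance
            # to those already in the pool; the largest difference joins.
            best = max(
                candidates,
                key=lambda i: _mean(dist[i][j] for j in candidates if j != i)
                - _mean(dist[i][j] for j in pool),
            )
            pool.append(best)
            candidates.remove(best)
        pools.append(pool)
        remaining = candidates
    pools.append(remaining)  # the last K individuals form the final pool
    return pools


# Toy example: six individuals placed on a line in two well-separated groups.
positions = [0, 1, 2, 10, 11, 13]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
toy_pools = divisive_pools(toy_dist, pool_size=3)
print(toy_pools)
```
]]></preformat>
<p>In the toy example the two well-separated groups are placed in separate pools. The sketch recomputes every mean from scratch for clarity rather than speed.</p>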
<p>In nearly all cases, however, the pool size <italic>K</italic> will be relatively small (e.g., <italic>K</italic>&#x02208;[1, 100]), and certainly will not grow with <italic>N</italic>, i.e., <inline-formula><mml:math id="M19"><mml:mi>P</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">O</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. This induces a computational cost <inline-formula><mml:math id="M20"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">O</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> that is too high for large networks. In such cases, we therefore suggest replacing the distances obtained from the <italic>P</italic> eigenvectors in <xref ref-type="table" rid="T2">Algorithm 1</xref> with the geodesic distances, which only cost <inline-formula><mml:math id="M21"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">O</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> to compute (Newman, <xref ref-type="bibr" rid="B34">2010</xref>). We will refer to this modification as <xref ref-type="table" rid="T2">Algorithm 2</xref>.</p>
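<p>Geodesic distances on an unweighted network can be obtained by running a breadth-first search from every node, as in the minimal Python sketch below. The function name <monospace>geodesic_distances</monospace> and the choice to assign unreachable pairs a distance of <italic>N</italic> (longer than any real path) are assumptions made for illustration; the text does not state how disconnected pairs are handled.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import deque


def geodesic_distances(neighbors, unreachable=None):
    """All-pairs shortest path lengths via breadth-first search from each node.

    neighbors[i] lists the nodes adjacent to node i (undirected, unweighted).
    Each search visits every node and edge at most once.
    """
    n = len(neighbors)
    if unreachable is None:
        unreachable = n  # assumption: placeholder distance for disconnected pairs
    dist = []
    for source in range(n):
        row = [None] * n  # None marks a node not yet reached from source
        row[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if row[v] is None:
                    row[v] = row[u] + 1
                    queue.append(v)
        dist.append([unreachable if d is None else d for d in row])
    return dist


# Toy example: a path 0 - 1 - 2 plus an isolated node 3.
toy_neighbors = [[1], [0, 2], [1], []]
D = geodesic_distances(toy_neighbors)
print(D)
```
]]></preformat>
<p>The resulting matrix can be passed to the pool assignment step in place of the eigenvector-based distances.</p>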
</sec>
<sec>
<title>2.3. Add Health Data Analysis</title>
<sec>
<title>2.3.1. Network Data</title>
<p>The National Longitudinal Study of Adolescent Health (Add Health) collected information from a nationally representative sample of adolescents in grades 7 through 12 spanning 144 schools (Moody, <xref ref-type="bibr" rid="B33">1999</xref>). Out of this study came friendship networks among students, which we take as a proxy for which students are most likely to transmit to one another. Data for 84 schools are available through the <monospace>R</monospace> package <monospace>networkdata</monospace> (Almquist, <xref ref-type="bibr" rid="B4">2014</xref>), with networks ranging in size from 25 to 2,587 students. For our analyses, we focused on two networks, one having 495 actors and 2,675 edges, and the other having 2,587 actors and 12,969 edges.</p>
<p>Network survey data have often been used in infectious disease modeling (Hoang et al., <xref ref-type="bibr" rid="B14">2019</xref>). Contact diaries have shown reasonably good agreement with long contacts measured by sensor devices (e.g., Smieszek et al., <xref ref-type="bibr" rid="B40">2014</xref>; Leecaster et al., <xref ref-type="bibr" rid="B24">2016</xref>). Similarly, in a study of high school data in France, all long-duration contacts were represented in a friendship network survey, and &#x0201C;the overall structure of the contact network [&#x02026;] is correctly captured by [&#x02026;] [self-reported] friendships&#x0201D; (Mastrandrea et al., <xref ref-type="bibr" rid="B31">2015</xref>). While self-reported friendship data may not be sufficiently accurate in all contexts, for school students there is at least reasonable evidence that the long contacts most likely to transmit close-contact diseases are well approximated by self-reported friendships.</p>
<p>To evaluate our method on larger networks, we created a synthetic network having realistic topology in the following way. We fit an exponential random graph model (ERGM) based on the social-circuit dependence assumption on each of the 84 school networks described above. More specifically, each ERGM was fit using the following terms: &#x00023; edges, &#x00023; 2-stars, &#x00023; triangles, geometrically weighted edgewise shared partners, and geometrically weighted dyadwise shared partners. The first three terms correspond to Markov dependencies, and the latter two to the social-circuit dependencies (Lusher et al., <xref ref-type="bibr" rid="B26">2012</xref>). We then performed a fixed effects meta-analysis, where each coefficient was modeled as a function of the log of the network size. Using these coefficients, we then generated a network of size 10,000 actors, having 13,800 edges. Along with the two networks of sizes 495 and 2,587, this gave us a third network to analyze, and we will refer to these networks as AH495, AH2587, and ERGM10000, respectively.</p>
</sec>
<sec>
<title>2.3.2. Simulation Framework</title>
<p>To evaluate <italic>Q</italic>(<italic>Z</italic>), we used a network-based susceptible-infectious-susceptible (SIS) model as our simulator <italic>F</italic> (refer to, e.g., Allen et al., <xref ref-type="bibr" rid="B2">2008</xref>). In most realistic infectious disease contexts where pooled testing may be implemented, there is more knowledge of the prevalence of the disease than other facets of disease spread. Therefore, we constrained the SIS model such that the prevalence is within a small range; in the simulation results given below, we chose 0.025&#x000B1;0.0075. Thus, in order to get samples from <italic>F</italic> with which to estimate <italic>Q</italic>(<italic>Z</italic>) we repeatedly performed the following steps until the desired number of simulated datasets were obtained:</p>
<list list-type="order">
<list-item><p>Draw the SIS transmission parameter from a uniform distribution.</p></list-item>
<list-item><p>Draw new <italic>y</italic><sub><italic>i</italic></sub>, <italic>i</italic> &#x0003D; 1, &#x02026;, <italic>N</italic> from the SIS model.</p></list-item>
<list-item><p>If <inline-formula><mml:math id="M22"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:mfrac><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>025</mml:mn><mml:mo>-</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>0075</mml:mn><mml:mo>,</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>025</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>0075</mml:mn></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> accept <bold>y</bold>, else reject.</p></list-item>
</list>
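<p>The three steps above amount to a rejection sampler, which can be sketched in Python as follows. The SIS simulator shown here is a toy discrete-time stand-in, not the simulator <italic>F</italic> used in the analysis; the function names, the recovery probability, the number of time steps, the range of the uniform distribution, and the ring-lattice network are all assumptions chosen only to make the example self-contained.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random


def simulate_sis(neighbors, beta, recovery, n_steps, rng):
    """Toy discrete-time SIS process; returns a 0/1 infection vector."""
    n = len(neighbors)
    infected = [0] * n
    infected[rng.randrange(n)] = 1  # a single initial case
    for _ in range(n_steps):
        nxt = list(infected)
        for i in range(n):
            if infected[i]:
                if rng.random() < recovery:
                    nxt[i] = 0  # infectious -> susceptible
            else:
                n_inf = sum(infected[j] for j in neighbors[i])
                if rng.random() < 1.0 - (1.0 - beta) ** n_inf:
                    nxt[i] = 1  # susceptible -> infectious
        infected = nxt
    return infected


def constrained_draws(neighbors, n_draws, target=0.025, tol=0.0075,
                      beta_range=(0.0, 0.5), recovery=0.2, n_steps=10,
                      seed=0, max_tries=200):
    """Keep simulated infection vectors whose prevalence is within target +/- tol."""
    rng = random.Random(seed)
    accepted = []
    tries = 0
    while len(accepted) < n_draws and tries < max_tries:
        tries += 1
        beta = rng.uniform(*beta_range)                            # step 1
        y = simulate_sis(neighbors, beta, recovery, n_steps, rng)  # step 2
        prevalence = sum(y) / len(y)
        if target - tol <= prevalence <= target + tol:             # step 3
            accepted.append(y)
    return accepted


# Hypothetical network: ring lattice of 200 nodes, each tied to 4 nearby nodes.
n = 200
ring = [[(i + d) % n for d in (-2, -1, 1, 2)] for i in range(n)]
draws = constrained_draws(ring, n_draws=5)
print(len(draws), [sum(y) for y in draws])
```
]]></preformat>
<p>The <monospace>max_tries</monospace> guard is a convenience of this sketch: it stops the loop if the acceptance window is rarely hit, so fewer than the requested number of draws may be returned.</p>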
<p>With <italic>Q</italic>(<italic>Z</italic>) estimated <italic>via</italic> Monte Carlo from these draws from <italic>F</italic>, we can choose the optimal pool size <italic>K</italic>.</p>
<p>We then expanded our study to determine the effect of having imperfect knowledge of the underlying network, as well as the effect of varying non-response rates. We replicated two common network survey tools in simulating data. First, we simulated open ended responses with imperfect recall rates. This <italic>partial recall</italic> strategy assumed each individual would &#x0201C;forget&#x0201D; a given edge with a probability of 0.25. Second, we simulated a <italic>nominate-n</italic> design, where each individual gets to nominate up to <italic>n</italic> of their edges. In our simulations, we set <italic>n</italic> &#x0003D; 5. To address non-response, we simulated &#x0201C;observed&#x0201D; networks <italic>via</italic> the partial recall and nominate-5 strategies with 5, 10, or 20% of the network members failing to provide responses. For each configuration, we simulated 250 networks and estimated <italic>Q</italic>(<italic>Z</italic>) for each.</p>
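<p>The two survey designs and the non-response mechanism can be mimicked with the short Python sketch below. It assumes that a tie is recorded in the observed network whenever at least one of its two endpoints reports it, that non-respondents are chosen uniformly at random, and that nominations are drawn uniformly from a respondent&#x00027;s true ties; none of these details is specified in the text, and the function names and ring-lattice network are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random


def _nonrespondents(n, rate, rng):
    """Uniformly chosen set of individuals who provide no responses."""
    return set(rng.sample(range(n), round(rate * n)))


def partial_recall(neighbors, forget_prob, nonresponse_rate, rng):
    """Each respondent independently forgets each tie with probability forget_prob."""
    n = len(neighbors)
    silent = _nonrespondents(n, nonresponse_rate, rng)
    observed = set()
    for i in range(n):
        if i in silent:
            continue
        for j in neighbors[i]:
            if rng.random() >= forget_prob:
                observed.add((min(i, j), max(i, j)))
    return observed


def nominate_n(neighbors, n_max, nonresponse_rate, rng):
    """Each respondent nominates up to n_max of their ties."""
    n = len(neighbors)
    silent = _nonrespondents(n, nonresponse_rate, rng)
    observed = set()
    for i in range(n):
        if i in silent:
            continue
        ties = list(neighbors[i])
        for j in rng.sample(ties, min(n_max, len(ties))):
            observed.add((min(i, j), max(i, j)))
    return observed


# Hypothetical network: 20 individuals on a ring, each with 6 ties.
n = 20
ring = [sorted({(i + d) % n for d in (-3, -2, -1, 1, 2, 3)}) for i in range(n)]
true_edges = {(min(i, j), max(i, j)) for i in range(n) for j in ring[i]}
rng = random.Random(1)
recall_edges = partial_recall(ring, forget_prob=0.25, nonresponse_rate=0.10, rng=rng)
nominate_edges = nominate_n(ring, n_max=5, nonresponse_rate=0.10, rng=rng)
print(len(true_edges), len(recall_edges), len(nominate_edges))
```
]]></preformat>
<p>Either observed edge set can then be converted to distances and passed to the pool assignment step in place of the true network.</p>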
</sec>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>3. Results</title>
<p>The values of <italic>Q</italic>(<italic>Z</italic>) for <italic>K</italic> ranging from 2 to 20 are displayed in <xref ref-type="fig" rid="F1">Figure 1</xref>. The optimal pool sizes for AH495, AH2587, and ERGM10000 were 10, 9, and 10, respectively. The dashed-dotted red line represents the average value of <italic>Q</italic> over 50 randomly assigned pools for each <italic>K</italic>. Results from <xref ref-type="table" rid="T2">Algorithm 1</xref> based on the Laplacian are given in the solid blue line, and from <xref ref-type="table" rid="T2">Algorithm 2</xref> based on geodesic distances in dashed green; for ERGM10000 it was not feasible to use <xref ref-type="table" rid="T2">Algorithm 1</xref>. The difference in performance between <xref ref-type="table" rid="T2">Algorithm 1</xref> and the more computationally efficient <xref ref-type="table" rid="T2">Algorithm 2</xref> is negligible. Utilizing the network to inform the specific pool assignments dominated random pool assignments for all pool sizes <italic>K</italic>, and for all but very small pool sizes greatly increased the expected number of correct classifications per test.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Values of the objective function <italic>Q</italic>(<italic>Z</italic>) (vertical axis) vs. the pool size <italic>K</italic> (horizontal axis) for <bold>(A)</bold> AH495, <bold>(B)</bold> AH2587, and <bold>(C)</bold> ERGM10000. Values of <italic>Q</italic> are given using <xref ref-type="table" rid="T2">Algorithm 1</xref> based on the Laplacian eigenvectors, <xref ref-type="table" rid="T2">Algorithm 2</xref> based on geodesic distances, and using random pool assignments.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-05-893760-g0001.tif"/>
</fig>
<p><xref ref-type="fig" rid="F2">Figure 2</xref> provides the results from perturbing the network by introducing missingness due to survey design and non-response rates. For reference, the oracle results using either <xref ref-type="table" rid="T2">Algorithm 1</xref> or <xref ref-type="table" rid="T2">Algorithm 2</xref> are presented as a vertical line, as are the results from random pool assignments. All results correspond to the optimal <italic>K</italic> given above. There is no clear pattern of superiority when comparing the two survey designs, nominate-5 and partial recall. While the results deteriorate somewhat as the non-response rate increases, these decreases are marginal relative to the improvement over random pool assignments, which do not leverage the network information.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Results from introducing missingness into the <bold>(A)</bold> AH495, <bold>(B)</bold> AH2587, or <bold>(C)</bold> ERGM10000 network by simulating two common network survey tools and varying the level of non-response. The horizontal axis corresponds to <italic>Q</italic>(<italic>Z</italic>), and vertical lines show either the average of 50 random pool assignments or results based on the true underlying network.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-05-893760-g0002.tif"/>
</fig>
<p>When our algorithms were run on a personal computer with an Intel(R) Core(TM) i7-9850H CPU 2.60GHz processor, we obtained the computation times provided in <xref ref-type="table" rid="T1">Table 1</xref>. These results indicate that our approach can feasibly be applied to even large organizations.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Computational time in seconds to run <xref ref-type="table" rid="T2">Algorithms 1, 2</xref>.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Network</bold></th>
<th valign="top" align="center"><bold>Laplacian</bold></th>
<th valign="top" align="center"><bold>Geodesic</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">AH495</td>
<td valign="top" align="center">2.95</td>
<td valign="top" align="center">0.04</td>
</tr>
<tr>
<td valign="top" align="left">AH2587</td>
<td valign="top" align="center">1080.22</td>
<td valign="top" align="center">1.07</td>
</tr>
<tr>
<td valign="top" align="left">ERGM10000</td>
<td valign="top" align="center">NA</td>
<td valign="top" align="center">16.42</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec sec-type="discussion" id="s4">
<title>4. Discussion</title>
<p>Regular universal screening can play an important role in infection control. The cost of implementing this strategy, however, can be out of reach for many organizations. Pooling tests and only testing individuals should their pool test positive leads to fewer overall tests being conducted, thereby lowering the resource burden to a more manageable level.</p>
<p>While the extant literature on pooled testing is vast, algorithms that aim at finding the optimal pool size ignore the fact that, in the context of infectious disease, an underlying transmission network induces dependence among the individuals to be pooled. We have shown that by utilizing the underlying network, the cost savings provided by pooled testing can be further increased.</p>
<p>In real applications, the true underlying contact network that leads to transmission events is of course unknown. We have shown, however, that using easily implemented survey tools to collect contact information can provide enough information about the network to yield results nearly equivalent to when the true network is known. Furthermore, our methods are robust to non-response rates as high as 20%.</p>
</sec>
<sec sec-type="data-availability" id="s5">
<title>Data Availability Statement</title>
<p>Publicly available datasets were analyzed in this study. These data can be found here: <ext-link ext-link-type="uri" xlink:href="https://github.com/Z-co/networkdata">https://github.com/Z-co/networkdata</ext-link>.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>The author confirms being the sole contributor of this work and has approved it for publication.</p>
</sec>
<sec sec-type="funding-information" id="s7">
<title>Funding</title>
<p>This study was supported by the US Centers for Disease Control and Prevention (5 U01 CK000531-02) as part of the MInD-Healthcare Program.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s8">
<title>Publisher&#x00027;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec> 
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Abdalhamid</surname> <given-names>B.</given-names></name> <name><surname>Bilder</surname> <given-names>C. R.</given-names></name> <name><surname>McCutchen</surname> <given-names>E. L.</given-names></name> <name><surname>Hinrichs</surname> <given-names>S. H.</given-names></name> <name><surname>Koepsell</surname> <given-names>S. A.</given-names></name> <name><surname>Iwen</surname> <given-names>P. C.</given-names></name></person-group> (<year>2020</year>). <article-title>Assessment of specimen pooling to conserve sars cov-2 testing resources</article-title>. <source>Am. J. Clin. Pathol</source>. <volume>153</volume>, <fpage>715</fpage>&#x02013;<lpage>718</lpage>. <pub-id pub-id-type="doi">10.1093/ajcp/aqaa064</pub-id><pub-id pub-id-type="pmid">32511649</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Allen</surname> <given-names>L.</given-names></name> <name><surname>Bauch</surname> <given-names>C.</given-names></name> <name><surname>Castillo-Chavez</surname> <given-names>C.</given-names></name> <name><surname>Earn</surname> <given-names>D.</given-names></name> <name><surname>Feng</surname> <given-names>Z.</given-names></name> <name><surname>Lewis</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2008</year>). <source>Mathematical Epidemiology</source>. <publisher-loc>Heidelberg; Berlin</publisher-loc>: <publisher-name>Springer</publisher-name>.</citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Almadhi</surname> <given-names>M. A.</given-names></name> <name><surname>Abdulrahman</surname> <given-names>A.</given-names></name> <name><surname>Sharaf</surname> <given-names>S. A.</given-names></name> <name><surname>AlSaad</surname> <given-names>D.</given-names></name> <name><surname>Stevenson</surname> <given-names>N. J.</given-names></name> <name><surname>Atkin</surname> <given-names>S. L.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>The high prevalence of asymptomatic SARS-CoV-2 infection reveals the silent spread of covid-19</article-title>. <source>Int. J. Infectious Dis</source>. <volume>105</volume>, <fpage>656</fpage>&#x02013;<lpage>661</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijid.2021.02.100</pub-id><pub-id pub-id-type="pmid">33647516</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Almquist</surname> <given-names>Z. W.</given-names></name></person-group> (<year>2014</year>). <source>networkdata: Lin Freeman&#x00027;s Network Data Collection</source>. R package version 0.01.</citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bernard</surname> <given-names>H. R.</given-names></name> <name><surname>Killworth</surname> <given-names>P. D.</given-names></name> <name><surname>Sailer</surname> <given-names>L.</given-names></name></person-group> (<year>1979</year>). <article-title>Informant accuracy in social network data iv: a comparison of clique-level structure in behavioral and cognitive network data</article-title>. <source>Soc. Networks</source> <volume>2</volume>, <fpage>191</fpage>&#x02013;<lpage>218</lpage>. <pub-id pub-id-type="doi">10.1016/0378-8733(79)90014-5</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bernard</surname> <given-names>H. R.</given-names></name> <name><surname>Killworth</surname> <given-names>P. D.</given-names></name> <name><surname>Sailer</surname> <given-names>L.</given-names></name></person-group> (<year>1982</year>). <article-title>Informant accuracy in social-network data v: an experimental attempt to predict actual communication from recall data</article-title>. <source>Soc. Sci. Res</source>. <volume>11</volume>, <fpage>30</fpage>&#x02013;<lpage>66</lpage>. <pub-id pub-id-type="doi">10.1016/0049-089X(82)90006-0</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bharti</surname> <given-names>N.</given-names></name> <name><surname>Exten</surname> <given-names>C.</given-names></name> <name><surname>Oliver-Veronesi</surname> <given-names>R. E.</given-names></name></person-group> (<year>2020</year>). <article-title>Lessons from campus outbreak management using test, trace, and isolate efforts</article-title>. <source>Am. J. Infect. Control</source> <volume>49</volume>, <fpage>849</fpage>&#x02013;<lpage>851</lpage>. <pub-id pub-id-type="doi">10.1016/j.ajic.2020.11.008</pub-id><pub-id pub-id-type="pmid">33186679</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Black</surname> <given-names>M. S.</given-names></name> <name><surname>Bilder</surname> <given-names>C. R.</given-names></name> <name><surname>Tebbs</surname> <given-names>J. M.</given-names></name></person-group> (<year>2015</year>). <article-title>Optimal retesting configurations for hierarchical group testing</article-title>. <source>J. R. Stat. Soc</source>. <volume>64</volume>, <fpage>693</fpage>&#x02013;<lpage>710</lpage>. <pub-id pub-id-type="doi">10.1111/rssc.12097</pub-id><pub-id pub-id-type="pmid">26166904</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Denny</surname> <given-names>T. N.</given-names></name> <name><surname>Andrews</surname> <given-names>L.</given-names></name> <name><surname>Bonsignori</surname> <given-names>M.</given-names></name> <name><surname>Cavanaugh</surname> <given-names>K.</given-names></name> <name><surname>Datto</surname> <given-names>M. B.</given-names></name> <name><surname>Deckard</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Implementation of a pooled surveillance testing program for asymptomatic SARS-CoV-2 infections on a college campus &#x02013; duke university, durham, north carolina, august 2-october 11, 2020</article-title>. <source>Morbid. Mortal Wkly. Rep</source>. 69, 1743. <pub-id pub-id-type="doi">10.15585/mmwr.mm6946e1</pub-id><pub-id pub-id-type="pmid">33211678</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dorfman</surname> <given-names>R.</given-names></name></person-group> (<year>1943</year>). <article-title>The detection of defective members of large populations</article-title>. <source>Ann. Math. Stat</source>. <volume>14</volume>, <fpage>436</fpage>&#x02013;<lpage>440</lpage>. <pub-id pub-id-type="doi">10.1214/aoms/1177731363</pub-id><pub-id pub-id-type="pmid">12553167</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Elkalla</surname> <given-names>M.</given-names></name></person-group> (<year>2020</year>). <source>Ucsd Health Begins Covid-19 Pool Testing</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.10news.com/news/coronavirus/ucsd-health-begins-covid-19-pool-testing">https://www.10news.com/news/coronavirus/ucsd-health-begins-covid-19-pool-testing</ext-link> (accessed February 22, 2022).</citation>
</ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Freeman</surname> <given-names>L. C.</given-names></name> <name><surname>Romney</surname> <given-names>A. K.</given-names></name> <name><surname>Freeman</surname> <given-names>S. C.</given-names></name></person-group> (<year>1987</year>). <article-title>Cognitive structure and informant accuracy</article-title>. <source>Am. Anthropol</source>. <volume>89</volume>, <fpage>310</fpage>&#x02013;<lpage>325</lpage>. <pub-id pub-id-type="doi">10.1525/aa.1987.89.2.02a00020</pub-id></citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>He</surname> <given-names>X.</given-names></name> <name><surname>Lau</surname> <given-names>E. H.</given-names></name> <name><surname>Wu</surname> <given-names>P.</given-names></name> <name><surname>Deng</surname> <given-names>X.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Hao</surname> <given-names>X.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Author correction: temporal dynamics in viral shedding and transmissibility of covid-19</article-title>. <source>Nat. Med</source>. <volume>26</volume>, <fpage>1491</fpage>&#x02013;<lpage>1493</lpage>. <pub-id pub-id-type="doi">10.1038/s41591-020-1016-z</pub-id><pub-id pub-id-type="pmid">32770170</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoang</surname> <given-names>T.</given-names></name> <name><surname>Coletti</surname> <given-names>P.</given-names></name> <name><surname>Melegaro</surname> <given-names>A.</given-names></name> <name><surname>Wallinga</surname> <given-names>J.</given-names></name> <name><surname>Grijalva</surname> <given-names>C. G.</given-names></name> <name><surname>Edmunds</surname> <given-names>J. W.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>A systematic review of social contact surveys to inform transmission models of close-contact infections</article-title>. <source>Epidemiology</source> <volume>30</volume>, <fpage>723</fpage>&#x02013;<lpage>736</lpage>. <pub-id pub-id-type="doi">10.1097/EDE.0000000000001047</pub-id><pub-id pub-id-type="pmid">31274572</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huff</surname> <given-names>H. V.</given-names></name></person-group> (<year>2020</year>). <article-title>Controlling the COVID-19 pandemic blindly: silent spread in absence of rapid viral screening</article-title>. <source>Clin. Infect. Dis</source>. <volume>73</volume>, <fpage>e3053</fpage>&#x02013;<lpage>e3054</lpage>. <pub-id pub-id-type="doi">10.1093/cid/ciaa1251</pub-id><pub-id pub-id-type="pmid">33017460</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huff</surname> <given-names>H. V.</given-names></name> <name><surname>Singh</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>Asymptomatic transmission during the coronavirus disease 2019 pandemic and implications for public health strategies</article-title>. <source>Clin. Infect. Dis</source>. <volume>71</volume>, <fpage>2752</fpage>&#x02013;<lpage>2756</lpage>. <pub-id pub-id-type="doi">10.1093/cid/ciaa654</pub-id><pub-id pub-id-type="pmid">32463076</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hughes-Oliver</surname> <given-names>J. M.</given-names></name></person-group> (<year>2006</year>). <source>Pooling Experiments for Blood Screening and Drug Discovery</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer New York</publisher-name>.</citation>
</ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hwang</surname> <given-names>F. K.</given-names></name></person-group> (<year>1975</year>). <article-title>A generalized binomial group testing problem</article-title>. <source>J. Am. Stat. Assoc</source>. <volume>70</volume>, <fpage>923</fpage>&#x02013;<lpage>926</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.1975.10480324</pub-id></citation>
</ref>
<ref id="B19">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kaufman</surname> <given-names>L.</given-names></name> <name><surname>Rousseeuw</surname> <given-names>P. J.</given-names></name></person-group> (<year>1990</year>). <article-title>&#x0201C;Finding groups in data: an introduction to cluster analysis,&#x0201D;</article-title> in <source>Wiley Series in Probability and Mathematical Statistics. Applied Probability and Statistics</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Wiley</publisher-name>).</citation>
</ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Killworth</surname> <given-names>P. D.</given-names></name> <name><surname>Bernard</surname> <given-names>H. R.</given-names></name></person-group> (<year>1976</year>). <article-title>Informant accuracy in social network data</article-title>. <source>Hum. Organ</source>. <volume>35</volume>, <fpage>269</fpage>&#x02013;<lpage>286</lpage>. <pub-id pub-id-type="doi">10.17730/humo.35.3.10215j2m359266n2</pub-id><pub-id pub-id-type="pmid">34468710</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Killworth</surname> <given-names>P. D.</given-names></name> <name><surname>Bernard</surname> <given-names>H. R.</given-names></name></person-group> (<year>1977</year>). <article-title>Informant accuracy in social network data II</article-title>. <source>Hum. Commun. Res</source>. <volume>4</volume>, <fpage>3</fpage>&#x02013;<lpage>18</lpage>. <pub-id pub-id-type="doi">10.1111/j.1468-2958.1977.tb00591.x</pub-id></citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Killworth</surname> <given-names>P. D.</given-names></name> <name><surname>Bernard</surname> <given-names>H. R.</given-names></name></person-group> (<year>1979</year>). <article-title>Informant accuracy in social network data III: a comparison of triadic structure in behavioral and cognitive data</article-title>. <source>Soc. Networks</source> <volume>2</volume>, <fpage>19</fpage>&#x02013;<lpage>46</lpage>. <pub-id pub-id-type="doi">10.1016/0378-8733(79)90009-1</pub-id></citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Larremore</surname> <given-names>D. B.</given-names></name> <name><surname>Wilder</surname> <given-names>B.</given-names></name> <name><surname>Lester</surname> <given-names>E.</given-names></name> <name><surname>Shehata</surname> <given-names>S.</given-names></name> <name><surname>Burke</surname> <given-names>J. M.</given-names></name> <name><surname>Hay</surname> <given-names>J. A.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Test sensitivity is secondary to frequency and turnaround time for COVID-19 screening</article-title>. <source>Sci. Adv</source>. <volume>7</volume>, <fpage>abd5393</fpage>. <pub-id pub-id-type="doi">10.1126/sciadv.abd5393</pub-id><pub-id pub-id-type="pmid">33219112</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leecaster</surname> <given-names>M.</given-names></name> <name><surname>Toth</surname> <given-names>D. J. A.</given-names></name> <name><surname>Pettey</surname> <given-names>W. B. P.</given-names></name> <name><surname>Rainey</surname> <given-names>J. J.</given-names></name> <name><surname>Gao</surname> <given-names>H.</given-names></name> <name><surname>Uzicanin</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>Estimates of social contact in a middle school based on self-report and wireless sensor data</article-title>. <source>PLoS ONE</source> <volume>11</volume>, <fpage>e0153690</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0153690</pub-id><pub-id pub-id-type="pmid">27100090</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lendle</surname> <given-names>S. D.</given-names></name> <name><surname>Hudgens</surname> <given-names>M. G.</given-names></name> <name><surname>Qaqish</surname> <given-names>B. F.</given-names></name></person-group> (<year>2012</year>). <article-title>Group testing for case identification with correlated responses</article-title>. <source>Biometrics</source> <volume>68</volume>, <fpage>532</fpage>&#x02013;<lpage>540</lpage>. <pub-id pub-id-type="doi">10.1111/j.1541-0420.2011.01674.x</pub-id><pub-id pub-id-type="pmid">21950447</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lusher</surname> <given-names>D.</given-names></name> <name><surname>Koskinen</surname> <given-names>J.</given-names></name> <name><surname>Robins</surname> <given-names>G.</given-names></name></person-group> (<year>2012</year>). <source>Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications. Structural Analysis in the Social Sciences</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>MacNaughton-Smith</surname> <given-names>P.</given-names></name> <name><surname>Williams</surname> <given-names>W. T.</given-names></name> <name><surname>Dale</surname> <given-names>M. B.</given-names></name> <name><surname>Mockett</surname> <given-names>L. G.</given-names></name></person-group> (<year>1964</year>). <article-title>Dissimilarity analysis: a new technique of hierarchical sub-division</article-title>. <source>Nature</source> <volume>202</volume>, <fpage>1034</fpage>&#x02013;<lpage>1035</lpage>. <pub-id pub-id-type="doi">10.1038/2021034a0</pub-id><pub-id pub-id-type="pmid">14198907</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Malinovsky</surname> <given-names>Y.</given-names></name> <name><surname>Albert</surname> <given-names>P. S.</given-names></name> <name><surname>Roy</surname> <given-names>A.</given-names></name></person-group> (<year>2016</year>). <article-title>Reader reaction: a note on the evaluation of group testing algorithms in the presence of misclassification</article-title>. <source>Biometrics</source> <volume>72</volume>, <fpage>299</fpage>&#x02013;<lpage>302</lpage>. <pub-id pub-id-type="doi">10.1111/biom.12385</pub-id><pub-id pub-id-type="pmid">26393800</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Malinovsky</surname> <given-names>Y.</given-names></name> <name><surname>Haber</surname> <given-names>G.</given-names></name> <name><surname>Albert</surname> <given-names>P. S.</given-names></name></person-group> (<year>2020</year>). <article-title>An optimal design for hierarchical generalized group testing</article-title>. <source>J. R. Stat. Soc. C</source> <volume>69</volume>, <fpage>607</fpage>&#x02013;<lpage>621</lpage>. <pub-id pub-id-type="doi">10.1111/rssc.12409</pub-id></citation>
</ref>
<ref id="B30">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Mandavilli</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <source>Federal Officials Turn to a New Testing Strategy as Infections Surge</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.nytimes.com/2020/07/01/health/coronavirus-pooled-testing.html">https://www.nytimes.com/2020/07/01/health/coronavirus-pooled-testing.html</ext-link></citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mastrandrea</surname> <given-names>R.</given-names></name> <name><surname>Fournet</surname> <given-names>J.</given-names></name> <name><surname>Barrat</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys</article-title>. <source>PLoS ONE</source> <volume>10</volume>, <fpage>e0136497</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0136497</pub-id><pub-id pub-id-type="pmid">26325289</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moghadas</surname> <given-names>S. M.</given-names></name> <name><surname>Shoukat</surname> <given-names>A.</given-names></name> <name><surname>Fitzpatrick</surname> <given-names>M. C.</given-names></name> <name><surname>Wells</surname> <given-names>C. R.</given-names></name> <name><surname>Sah</surname> <given-names>P.</given-names></name> <name><surname>Pandey</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Projecting hospital utilization during the COVID-19 outbreaks in the United States</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>117</volume>, <fpage>9122</fpage>&#x02013;<lpage>9126</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.2004064117</pub-id><pub-id pub-id-type="pmid">32245814</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Moody</surname> <given-names>J. W.</given-names></name></person-group> (<year>1999</year>). <source>The structure of adolescent social relations: Modeling friendship in dynamic social settings</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.proquest.com/openview/3b0fe11b37f19311a088cfa2b4322c75/1?pq-origsite=gscholar&#x00026;cbl=18750&#x00026;diss=y">https://www.proquest.com/openview/3b0fe11b37f19311a088cfa2b4322c75/1?pq-origsite=gscholar&#x00026;cbl=18750&#x00026;diss=y</ext-link></citation>
</ref>
<ref id="B34">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Newman</surname> <given-names>M.</given-names></name></person-group> (<year>2010</year>). <source>Networks: An Introduction</source>. <publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>.</citation>
</ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oran</surname> <given-names>D. P.</given-names></name> <name><surname>Topol</surname> <given-names>E. J.</given-names></name></person-group> (<year>2020</year>). <article-title>Prevalence of asymptomatic SARS-CoV-2 infection</article-title>. <source>Ann. Internal Med</source>. <volume>173</volume>, <fpage>362</fpage>&#x02013;<lpage>367</lpage>. <pub-id pub-id-type="doi">10.7326/M20-3012</pub-id><pub-id pub-id-type="pmid">32491919</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pilcher</surname> <given-names>C. D.</given-names></name> <name><surname>Westreich</surname> <given-names>D.</given-names></name> <name><surname>Hudgens</surname> <given-names>M. G.</given-names></name></person-group> (<year>2020</year>). <article-title>Group testing for severe acute respiratory syndrome-coronavirus 2 to enable rapid scale-up of testing and real-time surveillance of incidence</article-title>. <source>J. Infect. Dis</source>. <volume>222</volume>, <fpage>903</fpage>&#x02013;<lpage>909</lpage>. <pub-id pub-id-type="doi">10.1093/infdis/jiaa378</pub-id><pub-id pub-id-type="pmid">32592581</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Reynolds</surname> <given-names>D.</given-names></name> <name><surname>Garay</surname> <given-names>J.</given-names></name> <name><surname>Deamond</surname> <given-names>S.</given-names></name> <name><surname>Moran</surname> <given-names>M.</given-names></name> <name><surname>Gold</surname> <given-names>W.</given-names></name> <name><surname>Styra</surname> <given-names>R.</given-names></name></person-group> (<year>2008</year>). <article-title>Understanding, compliance and psychological impact of the SARS quarantine experience</article-title>. <source>Epidemiol. Infect</source>. <volume>136</volume>, <fpage>997</fpage>&#x02013;<lpage>1007</lpage>. <pub-id pub-id-type="doi">10.1017/S0950268807009156</pub-id><pub-id pub-id-type="pmid">17662167</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Robins</surname> <given-names>G.</given-names></name> <name><surname>Pattison</surname> <given-names>P.</given-names></name> <name><surname>Elliott</surname> <given-names>P.</given-names></name></person-group> (<year>2001</year>). <article-title>Network models for social influence processes</article-title>. <source>Psychometrika</source> <volume>66</volume>, <fpage>161</fpage>&#x02013;<lpage>189</lpage>. <pub-id pub-id-type="doi">10.1007/BF02294834</pub-id></citation>
</ref>
<ref id="B39">
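<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sewell</surname> <given-names>D. K.</given-names></name></person-group> (In Press). <article-title>Leveraging network structure to improve pooled testing efficiency</article-title>.</citation>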
</ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Smieszek</surname> <given-names>T.</given-names></name> <name><surname>Barclay</surname> <given-names>V. C.</given-names></name> <name><surname>Seeni</surname> <given-names>I.</given-names></name> <name><surname>Rainey</surname> <given-names>J. J.</given-names></name> <name><surname>Gao</surname> <given-names>H.</given-names></name> <name><surname>Uzicanin</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>How should social mixing be measured: comparing web-based survey and sensor-based methods</article-title>. <source>BMC Infect. Dis</source>. <volume>14</volume>, <fpage>136</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2334-14-136</pub-id><pub-id pub-id-type="pmid">24612900</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sterrett</surname> <given-names>A.</given-names></name></person-group> (<year>1957</year>). <article-title>On the detection of defective members of large populations</article-title>. <source>Ann. Math. Stat</source>. <volume>28</volume>, <fpage>1033</fpage>&#x02013;<lpage>1036</lpage>. <pub-id pub-id-type="doi">10.1214/aoms/1177706807</pub-id><pub-id pub-id-type="pmid">12553167</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Stone</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <source>Nebraska Public Health Lab Begins Pool Testing COVID-19 Samples</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.ketv.com/article/nebraska-public-health-lab-begins-pool-testing-covid-19-samples/31934880">https://www.ketv.com/article/nebraska-public-health-lab-begins-pool-testing-covid-19-samples/31934880</ext-link></citation>
</ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sutton</surname> <given-names>D.</given-names></name> <name><surname>Fuchs</surname> <given-names>K.</given-names></name> <name><surname>D&#x02019;Alton</surname> <given-names>M.</given-names></name> <name><surname>Goffman</surname> <given-names>D.</given-names></name></person-group> (<year>2020</year>). <article-title>Universal screening for SARS-CoV-2 in women admitted for delivery</article-title>. <source>N. Engl. J. Med</source>. <volume>382</volume>, <fpage>2163</fpage>&#x02013;<lpage>2164</lpage>. <pub-id pub-id-type="doi">10.1056/NEJMc2009316</pub-id><pub-id pub-id-type="pmid">32283004</pub-id></citation></ref>
<ref id="B44">
<citation citation-type="web"><person-group person-group-type="author"><collab>The State University of New York at Stony Brook.</collab></person-group> (<year>2020</year>). <source>Chancellor Malatras and Stony Brook University President McInnis Announce Partnership With SUNY Upstate Medical University to Launch Pooled Surveillance Testing for COVID-19</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.suny.edu/suny-news/press-releases/09-2020/9-24-20/stony-brook-pooled-testing.html">https://www.suny.edu/suny-news/press-releases/09-2020/9-24-20/stony-brook-pooled-testing.html</ext-link> (accessed February 22, 2022).</citation>
</ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Von Luxburg</surname> <given-names>U.</given-names></name></person-group> (<year>2007</year>). <article-title>A tutorial on spectral clustering</article-title>. <source>Stat. Comput</source>. <volume>17</volume>, <fpage>395</fpage>&#x02013;<lpage>416</lpage>. <pub-id pub-id-type="doi">10.1007/s11222-007-9033-z</pub-id><pub-id pub-id-type="pmid">32650053</pub-id></citation></ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wacharapluesadee</surname> <given-names>S.</given-names></name> <name><surname>Kaewpom</surname> <given-names>T.</given-names></name> <name><surname>Ampoot</surname> <given-names>W.</given-names></name> <name><surname>Ghai</surname> <given-names>S.</given-names></name> <name><surname>Khamhang</surname> <given-names>W.</given-names></name> <name><surname>Worachotsueptrakun</surname> <given-names>K.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Evaluating the efficiency of specimen pooling for PCR-based detection of COVID-19</article-title>. <source>J. Med. Virol</source>. <volume>92</volume>, <fpage>2193</fpage>&#x02013;<lpage>2199</lpage>. <pub-id pub-id-type="doi">10.1002/jmv.26005</pub-id><pub-id pub-id-type="pmid">32401343</pub-id></citation></ref>
</ref-list> 
</back>
</article> 