<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Big Data</journal-id>
<journal-title>Frontiers in Big Data</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Big Data</abbrev-journal-title>
<issn pub-type="epub">2624-909X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fdata.2023.1128649</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Big Data</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Network experiment designs for inferring causal effects under interference</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Fatemi</surname> <given-names>Zahra</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2146975/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Zheleva</surname> <given-names>Elena</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/1486491/overview"/>
</contrib>
</contrib-group>
<aff><institution>Department of Computer Science, University of Illinois Chicago</institution>, <addr-line>Chicago, IL</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Xintao Wu, University of Arkansas, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Liuyi Yao, Alibaba Co., Ltd., China; Yongkai Wu, Clemson University, United States</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Zahra Fatemi <email>zfatem2&#x00040;uic.edu</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Data Mining and Management, a section of the journal Frontiers in Big Data</p></fn></author-notes>
<pub-date pub-type="epub">
<day>17</day>
<month>04</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>6</volume>
<elocation-id>1128649</elocation-id>
<history>
<date date-type="received">
<day>21</day>
<month>12</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>23</day>
<month>03</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2023 Fatemi and Zheleva.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Fatemi and Zheleva</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract>
<p>Current approaches to A/B testing in networks focus on limiting interference, the concern that treatment effects can &#x0201C;spill over&#x0201D; from treatment nodes to control nodes and lead to biased causal effect estimation. In the presence of interference, two main types of causal effects are direct treatment effects and total treatment effects. In this paper, we propose two network experiment designs that increase the accuracy of direct and total effect estimations in network experiments through minimizing interference between treatment and control units. For direct treatment effect estimation, we present a framework that takes advantage of <italic>independent sets</italic> and assigns treatment and control only to a set of non-adjacent nodes in a graph, in order to disentangle peer effects from direct treatment effect estimation. For total treatment effect estimation, our framework combines weighted graph clustering and cluster matching approaches to jointly minimize interference and selection bias. Through a series of simulated experiments on synthetic and real-world network datasets, we show that our designs significantly increase the accuracy of direct and total treatment effect estimation in network experiments.</p></abstract>
<kwd-group>
<kwd>causal inference</kwd>
<kwd>direct treatment effects</kwd>
<kwd>total treatment effects</kwd>
<kwd>interference</kwd>
<kwd>spillover</kwd>
<kwd>selection bias</kwd>
</kwd-group>
<counts>
<fig-count count="14"/>
<table-count count="2"/>
<equation-count count="18"/>
<ref-count count="61"/>
<page-count count="20"/>
<word-count count="13468"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Causal inference plays a central role in many disciplines, from economics (Varian, <xref ref-type="bibr" rid="B56">2016</xref>; Holtz et al., <xref ref-type="bibr" rid="B20">2020</xref>) to health sciences (Antman et al., <xref ref-type="bibr" rid="B2">1992</xref>; Loucks and Thuma, <xref ref-type="bibr" rid="B29">2003</xref>) and social sciences (Sobel, <xref ref-type="bibr" rid="B49">2000</xref>; Gangl, <xref ref-type="bibr" rid="B13">2010</xref>). The goal of causal inference is to estimate the effect of an intervention on individuals&#x00027; outcomes. The gold standard for inferring causality is the use of <italic>controlled experiments</italic>, also known as A/B tests and Randomized Controlled Trials (RCTs), in which experimenters can assign treatment (e.g., a new news feed ranking algorithm) to a random subset of a population and compare their outcomes with the outcomes of a control group, randomly selected from the same population (e.g., a group of users who used the old news feed ranking algorithm). Through randomization, the experimenter can control for confounding variables that can impact the treatment and outcome assignment but are not present in the data and assess whether the treatment can cause the target variable to change.</p>
<p>While it is straightforward to randomly assign treatment and control to units that are i.i.d., it is much harder to do that for units that interact with each other. The goal of designing <italic>network experiments</italic> is to ensure reliable causal effect estimation in controlled experiments for potentially interacting units. One of the challenges in network experiment design is dealing with interference (or spillover), the problem of treatment &#x0201C;spilling over&#x0201D; from a treated node to a control node. The presence of interference breaks the Stable Unit Treatment Value Assumption (SUTVA), the assumption that one unit&#x00027;s outcome is unaffected by another unit&#x00027;s treatment assignment, and challenges the validity of causal inference (Imbens and Rubin, <xref ref-type="bibr" rid="B22">2015</xref>). Different types of causal estimands are possible in the presence of interference: 1) the difference between the average outcomes of treated and untreated individuals due to the treatment alone (<italic>Direct Treatment Effects</italic>), 2) the influence of peers&#x00027; behavior on the unit&#x00027;s response to the treatment (<italic>Peer Effects</italic>), and 3) the combination of direct treatment effects and peer effects (<italic>Total Treatment Effects</italic>). Different estimands lead to different inference procedures&#x02014;both from a design and an analysis point of view. As a motivating example, consider the problem of quantifying the effect of changing the news feed ranking algorithm of an online social network website on the time that users spend interacting with the site. Direct treatment effects capture the effect of changing the news feed ranking algorithm on the time that a user spends on the website, regardless of the behavior of other users in the study. Peer effects quantify the effect of friends&#x00027; time spent on the website on the time that a user spends on the website. Total treatment effects show the total effect of changing the news feed ranking algorithm on the time all users spend on the website, which is equal to the sum of peer effects and direct treatment effects.</p>
<p>The focus of this paper is measuring direct and total treatment effects in network data. The total treatment effect of applying a treatment to all units compared with applying a different (control) treatment to all units is a common causal estimand in network experiments. Prominent methods for total treatment effect estimation rely on two-stage or cluster-based randomization, in which clusters are identified using graph clustering and cluster randomization dictates the node assignment to treatment and control (Ugander et al., <xref ref-type="bibr" rid="B54">2013</xref>; Eckles et al., <xref ref-type="bibr" rid="B9">2016</xref>; Saveski et al., <xref ref-type="bibr" rid="B45">2017</xref>; Pouget-Abadie et al., <xref ref-type="bibr" rid="B40">2018</xref>; Fatemi and Zheleva, <xref ref-type="bibr" rid="B12">2020</xref>). Graph clustering aims to find densely connected clusters of nodes, such that few edges exist across clusters (Schaeffer, <xref ref-type="bibr" rid="B47">2007</xref>). The basic idea of applying it to causal inference is that little interference can occur between nodes in different clusters.</p>
<p>Clustering a connected graph component is guaranteed to leave edges between clusters; therefore, removing interference completely is impossible. At the same time, some node pairs are more likely to interact than others, and assigning such pairs to different treatment groups is more likely to lead to undesired spillover (and biased causal effect estimation) than separating pairs with a low probability of interaction. We make the key observation that there is an inherent tradeoff between interference and selection bias in cluster-based randomization, depending on the chosen number of clusters (as demonstrated in <xref ref-type="fig" rid="F1">Figure 1</xref>). Due to the heterogeneity of real-world graphs, discovered clusters can be very different from each other, and the nodes in these clusters may not represent the same underlying population (Fatemi and Zheleva, <xref ref-type="bibr" rid="B12">2020</xref>). Therefore, cluster randomization can lead to selection bias in the data, with causal effects that are confounded by the differences in node features across clusters.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>The tradeoff between selection bias (distance) and undesired spillover (RMSE) in cluster-based randomization; each data point is annotated with the number of clusters.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0001.tif"/>
</fig>
<p>Here, we propose two methods for network experiment design in the presence of interference. First, we focus on quantifying direct treatment effects by designing a network experiment that disentangles peer effects from direct treatment effects and provides an unbiased estimation of direct treatment effects. We develop <italic>CauseIS</italic>, a framework that leverages independent set algorithms to divide the network nodes into two sets: 1) <italic>independent set nodes</italic>, and 2) graph nodes that are not in the independent set, referred to as <italic>bystander nodes</italic>. By assigning the independent set nodes to treatment and control groups, we ensure that there are no peer effects between nodes participating in the experiment, regardless of whether they are in different treatment groups or the same treatment group. Key to the proposed experiment design is the idea that, in expectation, the peer effects of bystander nodes on the treatment group are the same as the peer effects of bystander nodes on the control group, thus canceling each other out in the total treatment effect estimation.</p>
<p>The second method focuses on total treatment effect estimation. We develop <italic>CMatch</italic>, a framework for network experiment design that minimizes both interference and selection bias through a novel objective function for matching clusters, combining node matching with weighted graph clustering to provide a more accurate estimation of total treatment effects (Fatemi and Zheleva, <xref ref-type="bibr" rid="B12">2020</xref>). We introduce the concept of &#x0201C;edge spillover probability,&#x0201D; the probability of interaction between two connected nodes, and account for it in the design. Incorporating node matching and edge spillover probabilities into graph clustering is a novel contribution of this work.</p>
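<p>As a concrete illustration, the following Python sketch shows one way the ingredients above (spillover-weighted clustering, cluster matching on covariates, and pair-level randomization) could be combined. It assumes node features are stored in a node attribute X and spillover probabilities in an edge attribute p; the function name and the choice of Louvain clustering are illustrative, and the sketch is only a schematic reading of this description, not the <italic>CMatch</italic> algorithm itself.</p>
<preformat>
import random
import numpy as np
from networkx.algorithms.community import louvain_communities

def schematic_cluster_match_design(G, seed=0):
    # Schematic two-stage design: cluster a spillover-weighted graph, match
    # clusters on covariate means, then randomize within matched pairs.
    random.seed(seed)

    # Stage 1: weighted clustering, using edge spillover probabilities as
    # weights so that high-spillover edges tend to stay inside clusters.
    clusters = [list(c) for c in louvain_communities(G, weight="p", seed=seed)]

    # Covariate profile of each cluster (mean of node features X).
    profiles = [np.mean([G.nodes[v]["X"] for v in c], axis=0) for c in clusters]

    # Greedily pair clusters with similar covariate profiles; a leftover
    # cluster (odd count) is simply left out of the experiment.
    unmatched = list(range(len(clusters)))
    pairs = []
    while len(unmatched) &gt; 1:
        i = unmatched.pop(0)
        j = min(unmatched, key=lambda k: np.linalg.norm(profiles[i] - profiles[k]))
        unmatched.remove(j)
        pairs.append((i, j))

    # Stage 2: within each matched pair, randomize which cluster is treated.
    for i, j in pairs:
        treated, control = (i, j) if random.random() &lt; 0.5 else (j, i)
        for v in clusters[treated]:
            G.nodes[v]["T"] = 1
        for v in clusters[control]:
            G.nodes[v]["T"] = 0
    return pairs
</preformat>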
<sec id="s2">
<title>2. Related works</title>
<p>As network experiments have attracted increasing attention, different causal estimands for direct, peer, and total treatment effects have been proposed, depending on the assumptions made in each study (Halloran and Struchiner, <xref ref-type="bibr" rid="B17">1995</xref>; Hudgens and Halloran, <xref ref-type="bibr" rid="B21">2008</xref>; Green et al., <xref ref-type="bibr" rid="B15">2016</xref>; Taylor and Eckles, <xref ref-type="bibr" rid="B52">2018</xref>; Pouget-Abadie et al., <xref ref-type="bibr" rid="B41">2019</xref>; Ugander and Yin, <xref ref-type="bibr" rid="B55">2020</xref>; Aronow et al., <xref ref-type="bibr" rid="B4">2021</xref>; S&#x000E4;vje et al., <xref ref-type="bibr" rid="B46">2021</xref>). In this section, we give an overview of work relevant to quantifying direct and total treatment effects in RCTs.</p>
<sec>
<title>2.1. Direct treatment effect estimation</title>
<p>Estimating the effect of treatment alone has been studied in the context of network experiment design. Jagadeesan et al. (<xref ref-type="bibr" rid="B23">2020</xref>) propose an approach to reduce the bias of the Neymanian estimator of direct treatment effects under interference and homophily. In this approach, treatment assignment is viewed as a quasi-coloring of the graph, and each treated node is matched with a control node that has an identical number of treated and control neighbors, creating balanced interference in the experiment. In networks where perfect quasi-coloring is not possible, nodes are ordered by degree and then nodes with a similar degree are paired and assigned to treatment or control. The accuracy of causal effect estimation in this method depends on the network structure, the degree distribution of the nodes, and how closely the pairing approaches a perfect quasi-coloring. More recently, Li and Wager (<xref ref-type="bibr" rid="B27">2022</xref>) explore the problem of direct treatment effect estimation under random graph asymptotics, where the interference graph is a random draw from an (unknown) graphon. Sussman and Airoldi (<xref ref-type="bibr" rid="B51">2017</xref>) propose an approach to estimate direct treatment effects considering a fixed design for potential outcomes. Similar to these approaches, we focus on estimating direct treatment effects in the presence of peer effects, but our approach can be applied in networks with different structural properties.</p>
<sec>
<title>2.2. Total treatment effect estimation</title>
<p>Recent work that addresses interference in graphs relies on separating data samples through graph clustering (Backstrom and Kleinberg, <xref ref-type="bibr" rid="B5">2011</xref>; Ugander et al., <xref ref-type="bibr" rid="B54">2013</xref>; Gui et al., <xref ref-type="bibr" rid="B16">2015</xref>; Eckles et al., <xref ref-type="bibr" rid="B9">2016</xref>; Saveski et al., <xref ref-type="bibr" rid="B45">2017</xref>; Pouget-Abadie et al., <xref ref-type="bibr" rid="B40">2018</xref>), relational d-separation (Maier et al., <xref ref-type="bibr" rid="B32">2010</xref>, <xref ref-type="bibr" rid="B31">2013</xref>; Rattigan et al., <xref ref-type="bibr" rid="B42">2011</xref>; Marazopoulou et al., <xref ref-type="bibr" rid="B33">2015</xref>; Lee and Honavar, <xref ref-type="bibr" rid="B24">2016</xref>), or sequential randomization design (Toulis and Kao, <xref ref-type="bibr" rid="B53">2013</xref>). Among these approaches, cluster-based randomization methods have attracted significant attention recently. Graph clustering aims to find subgraph clusters with high intra-cluster and low inter-cluster edge density (Zhou et al., <xref ref-type="bibr" rid="B61">2009</xref>; Yang and Leskovec, <xref ref-type="bibr" rid="B58">2015</xref>). A number of algorithms exist for weighted graph clustering (Schaeffer, <xref ref-type="bibr" rid="B47">2007</xref>). Node representation learning approaches range from graph motifs (Milo et al., <xref ref-type="bibr" rid="B35">2002</xref>) to embedding representations (Hamilton et al., <xref ref-type="bibr" rid="B18">2017</xref>) and statistical relational learning (SRL) (Rossi et al., <xref ref-type="bibr" rid="B44">2012</xref>). Eckles et al. (<xref ref-type="bibr" rid="B9">2016</xref>) evaluate different methods for designing and analyzing randomized experiments and find substantial bias reduction in cluster-based randomization approaches, especially in networks with more clusters and stronger peer effects. Saveski et al. (<xref ref-type="bibr" rid="B45">2017</xref>) propose a procedure to detect interference bias in network experiments and a cluster-based randomization approach to mitigate it. By comparing completely randomized and cluster-based randomized experiments on LinkedIn&#x00027;s experimental platform, they demonstrate the presence of network effects and bias in standard RCTs in a real-world setting. However, cluster-based randomized approaches have high variance, making it more difficult to estimate the treatment effect accurately. Ugander et al. (<xref ref-type="bibr" rid="B54">2013</xref>) define a restricted-growth condition on the growth rate of nodes&#x00027; connections and show that the variance of the estimator is bounded by a linear function of the degrees.</p>
<p>In controlled experiments, the treatment assignment is randomized by the experimenter, whereas in estimating causal effects from observational data, the process by which the treatment is assigned is not decided by the experimenter and is often unknown. Matching is a prominent method for mimicking randomization in observational data by pairing treated units with similar untreated units. Then, the causal effect of interest is estimated based on the matched pairs, rather than the full set of units present in the data, thus reducing the selection bias in observational data (Stuart, <xref ref-type="bibr" rid="B50">2010</xref>). There are two main approaches to matching: fully blocked matching and propensity score matching (PSM) (Stuart, <xref ref-type="bibr" rid="B50">2010</xref>). Fully blocked matching selects pairs of units whose distance in covariate space is under a pre-determined distance threshold. PSM models the treatment variable based on the observed covariates and matches units that have the same likelihood of treatment. The few research articles that look at the problem of matching for relational domains (Oktay et al., <xref ref-type="bibr" rid="B38">2010</xref>; Arbour et al., <xref ref-type="bibr" rid="B3">2014</xref>) consider SRL data representations. None of them consider cluster matching for a two-stage design, which is one of our contributions.</p>
<sec id="s3">
<title>3. Preliminaries</title>
<p>In this section, we formally define the data model, the potential outcomes framework, and different types of causal estimands.</p>
<sec>
<title>3.1. Data model</title>
<p>A graph <italic>G</italic> &#x0003D; (<bold>V</bold>, <bold>E</bold>) consists of a set of <italic>n</italic> nodes <bold>V</bold> and a set of edges <bold>E</bold> &#x0003D; {<italic>e</italic><sub><italic>ij</italic></sub>}, where <italic>e</italic><sub><italic>ij</italic></sub> denotes that there is an edge between node <italic>v</italic><sub><italic>i</italic></sub> &#x02208; <bold>V</bold> and node <italic>v</italic><sub><italic>j</italic></sub> &#x02208; <bold>V</bold>. Let <bold>N</bold><sub><italic>i</italic></sub> denote the set of neighbors for node <italic>v</italic><sub><italic>i</italic></sub>, i.e., the set of nodes that share an edge with <italic>v</italic><sub><italic>i</italic></sub>. Let <italic>v</italic><sub><italic>i</italic></sub>.<bold>X</bold> denote the pre-treatment node feature variables (e.g., Twitter user features) for unit <italic>v</italic><sub><italic>i</italic></sub>. Let <italic>v</italic><sub><italic>i</italic></sub>.<italic>Y</italic> denote the outcome variable of interest for each node <italic>v</italic><sub><italic>i</italic></sub> (e.g., voting), and <italic>v</italic><sub><italic>i</italic></sub>.<italic>T</italic> &#x02208; {0, 1} denote whether node <italic>v</italic><sub><italic>i</italic></sub> (e.g., social media user) has been treated (e.g., shown a post about the benefits of voting), <italic>v</italic><sub><italic>i</italic></sub>.<italic>T</italic> &#x0003D; 1, or not, <italic>v</italic><sub><italic>i</italic></sub>.<italic>T</italic> &#x0003D; 0. Let <bold>Z</bold> &#x02208; {0, 1}<sup><italic>N</italic></sup> be the treatment assignment vector of all nodes. <italic>V</italic><sub>1</sub> and <italic>V</italic><sub>0</sub> indicate the sets of units in treatment and control groups, respectively. For simplicity, we assume that both <italic>v</italic><sub><italic>i</italic></sub>.<italic>T</italic> and <italic>v</italic><sub><italic>i</italic></sub>.<italic>Y</italic> are binary variables. The edge spillover probability <italic>e</italic><sub><italic>ij</italic></sub>.<italic>p</italic> refers to the probability of interference occurring between nodes <italic>v</italic><sub><italic>i</italic></sub> and <italic>v</italic><sub><italic>j</italic></sub>.</p>
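<p>For illustration, this data model can be represented with the networkx library as follows (a minimal sketch with synthetic values; the attribute names X, T, Y, and p are conventions of this example, not part of the formal model):</p>
<preformat>
import networkx as nx
import numpy as np

rng = np.random.default_rng(0)

# A small synthetic graph G = (V, E) with pre-treatment features v_i.X,
# a binary treatment v_i.T, a binary outcome v_i.Y, and an edge spillover
# probability e_ij.p on every edge (all values here are synthetic).
G = nx.karate_club_graph()
for v in G.nodes:
    G.nodes[v]["X"] = rng.normal(size=3)   # pre-treatment node features
    G.nodes[v]["T"] = 0                    # treatment indicator, set by the design
    G.nodes[v]["Y"] = 0                    # outcome, observed after the experiment
for i, j in G.edges:
    G.edges[i, j]["p"] = rng.uniform()     # probability of interference on edge (i, j)

neighbors_of_0 = list(G.neighbors(0))      # the neighbor set N_0 of node v_0
</preformat>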
<sec>
<title>3.2. Potential outcomes framework</title>
<p>The fundamental problem of causal inference is that we can observe the outcome of a target variable for an individual <italic>v</italic><sub><italic>i</italic></sub> in either the treatment or control group but not in both. Let <italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic>(1) and <italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic>(0) denote the <italic>potential outcomes</italic> of <italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic> if unit <italic>v</italic><sub><italic>i</italic></sub> were assigned to the treatment or control group, respectively. The treatment effect (or causal effect) is the difference <italic>g</italic>(<italic>i</italic>) &#x0003D; <italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic>(1) &#x02212; <italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic>(0). Since we can never observe the outcome of a unit under both treatment and control simultaneously, the effect <inline-formula><mml:math id="M1"><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula> of a treatment on an outcome is typically calculated through averaging outcomes over treatment and control groups <italic>via</italic> difference-in-means: <inline-formula><mml:math id="M2"><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:msub><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover><mml:mo>-</mml:mo><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:msub><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover></mml:math></inline-formula> (Stuart, <xref ref-type="bibr" rid="B50">2010</xref>). For the treatment effect to be estimable, the following <italic>identifiability</italic> assumptions have to hold:</p>
<list list-type="bullet">
<list-item><p><italic>Stable unit treatment value assumption</italic> (SUTVA) refers to the assumption that the outcomes <italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic>(1) and <italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic>(0) are independent of the treatment assignment of other units: {<italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic>(1), <italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic>(0)}&#x022A5;<italic>v</italic><sub><italic>j</italic></sub>.<italic>T</italic>, &#x02200;<italic>v</italic><sub><italic>j</italic></sub> &#x02260; <italic>v</italic><sub><italic>i</italic></sub> &#x02208; <italic>V</italic>.</p></list-item>
<list-item><p><italic>Ignorability</italic> (Imbens and Rubin, <xref ref-type="bibr" rid="B22">2015</xref>)&#x02014;also known as <italic>conditional independence</italic> (Pearl, <xref ref-type="bibr" rid="B39">2009</xref>) and <italic>absence of unmeasured confounding</italic>&#x02014;is the assumption that all variables <italic>v</italic><sub><italic>i</italic></sub>.<italic>X</italic> that can influence both the treatment and outcome <italic>v</italic><sub><italic>i</italic></sub>.<italic>Y</italic> are observed in the data and there are no unmeasured confounding variables that can cause changes in both the treatment and the outcome: {<italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic>(1), <italic>v</italic><sub><italic>i</italic></sub>.<italic>y</italic>(0)}&#x022A5;<italic>v</italic><sub><italic>i</italic></sub>.<italic>T</italic>&#x02223;<italic>v</italic><sub><italic>i</italic></sub>.<italic>X</italic>.</p></list-item>
<list-item><p><italic>Overlap</italic> is the assumption that each unit assigned to the treatment or control group could have been assigned to the other group. This is also known as the <italic>positivity</italic> assumption: <italic>P</italic>(<italic>v</italic><sub><italic>i</italic></sub>.<italic>T</italic>|<italic>v</italic><sub><italic>i</italic></sub>.<italic>X</italic>) &#x0003E; 0 for all units and all possible <italic>T</italic> and <italic>X</italic>.</p></list-item>
</list></sec>
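<p>As a concrete illustration of the difference-in-means estimator above, a minimal sketch (continuing the illustrative networkx representation from Section 3.1) is:</p>
<preformat>
import numpy as np

def difference_in_means(G):
    # Estimate the average treatment effect as mean(Y | T = 1) - mean(Y | T = 0),
    # ignoring interference (i.e., assuming SUTVA holds).
    y1 = [G.nodes[v]["Y"] for v in G.nodes if G.nodes[v]["T"] == 1]
    y0 = [G.nodes[v]["Y"] for v in G.nodes if G.nodes[v]["T"] == 0]
    return np.mean(y1) - np.mean(y0)
</preformat>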
<sec>
<title>3.3. Types of causal effects in networks</title>
<p>We follow Hudgens and Halloran (<xref ref-type="bibr" rid="B21">2008</xref>) to define causal estimands for different types of effects possible in the presence of interference. However, our setting differs in that all nodes in the same group receive the same treatment.</p>
<p><italic>Total Treatment Effects (TTE)</italic> is defined as the outcome difference between two alternative universes, one in which all nodes are assigned to treatment (<inline-formula><mml:math id="M3"><mml:mrow><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>Z</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mo>&#x0007B;</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x0007D;</mml:mo></mml:mrow><mml:mi>N</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>) and one in which all nodes are assigned to control (<inline-formula><mml:math id="M4"><mml:mrow><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>Z</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mn>0</mml:mn></mml:mstyle></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mo>&#x0007B;</mml:mo><mml:mn>0</mml:mn><mml:mo>&#x0007D;</mml:mo></mml:mrow><mml:mi>N</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>) (Ugander et al., <xref ref-type="bibr" rid="B54">2013</xref>; Saveski et al., <xref ref-type="bibr" rid="B45">2017</xref>):</p>
<disp-formula id="E1"><mml:math id="M5"><mml:mrow><mml:mi>T</mml:mi><mml:mi>T</mml:mi><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>Z</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>Z</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mn>0</mml:mn></mml:mstyle></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
<p><italic>TTE</italic> is estimated as a difference of averages over the treatment and control groups, and it accounts for two types of effects, <italic>Direct Treatment Effects (DTE)</italic> and <italic>Peer Effects (PE)</italic>:</p>
<disp-formula id="E2"><label>(1)</label><mml:math id="M6"><mml:mrow><mml:mi>T</mml:mi><mml:mover accent='true'><mml:mi>T</mml:mi><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:mover accent='true'><mml:mrow><mml:msub><mml:mi>V</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi></mml:mrow><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover><mml:mo>&#x02212;</mml:mo><mml:mover accent='true'><mml:mrow><mml:msub><mml:mi>V</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi></mml:mrow><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mi>D</mml:mi><mml:mi>T</mml:mi><mml:mi>E</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>V</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>V</mml:mi></mml:mstyle><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>V</mml:mi></mml:mstyle><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
<p>Direct treatment effects (<italic>DTE</italic>) reflect the difference between the outcomes of treated and untreated subjects that can be attributed to the treatment alone. They are estimated as:</p>
<disp-formula id="E3"><label>(2)</label><mml:math id="M7"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>D</mml:mi><mml:mi>T</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x01D53C;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>T</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x01D53C;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>T</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Peer effects (<italic>PE</italic>), also known as indirect effects in prior studies (Halloran and Struchiner, <xref ref-type="bibr" rid="B17">1995</xref>; Hudgens and Halloran, <xref ref-type="bibr" rid="B21">2008</xref>; Jagadeesan et al., <xref ref-type="bibr" rid="B23">2020</xref>), reflect the difference in outcomes that can be attributed to the influence of other subjects in the experiment. Let <italic>N</italic><sub><italic>i</italic></sub>.<italic><bold>&#x003C0;</bold></italic> denote the vector of treatment assignments to node <italic>v</italic><sub><italic>i</italic></sub>&#x00027;s neighbors <italic>N</italic><sub><italic>i</italic></sub>. The average <italic>PE</italic> is estimated as the expected difference in outcomes between having neighbors with treatment vector <italic>N</italic><sub><italic>i</italic></sub>.<italic><bold>&#x003C0;</bold></italic> and having no neighbors:</p>
<disp-formula id="E4"><label>(3)</label><mml:math id="M8"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x01D53C;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>T</mml:mi><mml:mo>=</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:mi>&#x003C0;</mml:mi></mml:mstyle></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x01D53C;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>T</mml:mi><mml:mo>=</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>&#x02205;</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Here, we distinguish between two types of peer effects, <italic>allowable peer effects</italic> (<italic>APE</italic>) and <italic>unallowable peer effects</italic> (<italic>UPE</italic>). Allowable peer effects are peer effects that occur within the same treatment group, and they are a natural consequence of network interactions. For example, if a social media company wants to introduce a new feature (e.g., nudging users to vote), it would introduce that feature to all users and the total effect of the feature would include both individual and peer effects. Unallowable peer effects are peer effects that occur across treatment groups and contribute to undesired spillover and incorrect causal effect estimation.</p>
<p>For each node <italic>v</italic><sub><italic>i</italic></sub> in treatment group <italic>t</italic>, we have two types of neighbors: 1) neighbors <inline-formula><mml:math id="M9"><mml:msubsup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> in the same treatment class as node <italic>v</italic><sub><italic>i</italic></sub> with treatment assignment set <inline-formula><mml:math id="M10"><mml:msubsup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>.</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:mi>&#x003C0;</mml:mi></mml:mstyle></mml:math></inline-formula>; 2) set of neighbors in a different treatment class <inline-formula><mml:math id="M11"><mml:msubsup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover></mml:mrow></mml:msubsup></mml:math></inline-formula> (<inline-formula><mml:math id="M12"><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover><mml:mo>&#x02260;</mml:mo><mml:mi>t</mml:mi></mml:math></inline-formula>) with treatment assignment denoted by <inline-formula><mml:math id="M13"><mml:msubsup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover></mml:mrow></mml:msubsup><mml:mo>.</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:mi>&#x003C0;</mml:mi></mml:mstyle></mml:math></inline-formula>. The <italic>APE</italic> is defined as:</p>
<disp-formula id="E5"><label>(4)</label><mml:math id="M14"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>A</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x01D53C;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>T</mml:mi><mml:mo>=</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>.</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:mi>&#x003C0;</mml:mi></mml:mstyle></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x01D53C;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>T</mml:mi><mml:mo>=</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>&#x02205;</mml:mi></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>and the <italic>UPE</italic> is defined as:</p>
<disp-formula id="E6"><label>(5)</label><mml:math id="M15"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>U</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x01D53C;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>T</mml:mi><mml:mo>=</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover></mml:mrow></mml:msubsup><mml:mo>.</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:mi>&#x003C0;</mml:mi></mml:mstyle></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x01D53C;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>T</mml:mi><mml:mo>=</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>&#x02205;</mml:mi></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></sec></sec>
<sec id="s4">
<title>4. Problem statement</title>
<p>The goal of designing network experiments is to ensure reliable causal effect estimation in controlled experiments by minimizing unallowable peer effects in node assignment to treatment and control. In this work, we are interested in designing two network experiments, one for quantifying direct treatment effects and one for quantifying total treatment effects.</p>
<sec>
<title>4.1. Direct treatment effect estimation</title>
<p>The question we are interested in answering is: What is the causal effect of the treatment alone? This question has many practical applications for estimating the effectiveness of different policy interventions. Some examples include: What is the individual protection from a disease due to vaccination alone (and not herd immunity)? What is the effect of advertisements on motivating a person to buy a new phone? In network experiments, it is challenging to disentangle <italic>DTE</italic> from <italic>PE</italic>, and doing so is one of the main goals of this paper. More formally:</p>
<p>Problem 1 (Network experiment design for direct treatment effect estimation). Given an undirected graph <italic>G</italic> &#x0003D; (<bold>V</bold>, <bold>E</bold>) and a set of attributes <bold>V.X</bold> associated with each node, find a treatment assignment vector <bold>Z</bold> of a population with three different subsets of nodes, the treatment nodes <bold>V</bold><sub>1</sub> &#x02286; <bold>V</bold>, the control nodes <bold>V</bold><sub>0</sub> &#x02286; <bold>V</bold>, and nodes excluded from the experiment <bold>V</bold><sub>2</sub> &#x02286; <bold>V</bold>, such that:</p>
<list list-type="simple">
<list-item><p>a. <bold>V</bold><sub>0</sub> &#x02229; <bold>V</bold><sub>1</sub> &#x0003D; <bold>V</bold><sub>0</sub> &#x02229; <bold>V</bold><sub>2</sub> &#x0003D; <bold>V</bold><sub>1</sub> &#x02229; <bold>V</bold><sub>2</sub> &#x0003D; &#x02205;.</p></list-item>
<list-item><p>b. |<bold>V</bold><sub>0</sub>| &#x0002B; |<bold>V</bold><sub>1</sub>| is maximized.</p></list-item>
<list-item><p>c. <italic>PE</italic>(<italic>V</italic><sub>1</sub>) &#x02212; <italic>PE</italic>(<italic>V</italic><sub>0</sub>) &#x02248; 0.</p></list-item>
</list>
<p>The first component requires that the treatment nodes, control nodes, and bystander nodes excluded from the experiment do not overlap. The second component ensures that as many nodes as possible from <bold>V</bold> are assigned to the treatment and control groups. The third component removes peer effects from causal effect estimation.</p></sec>
<sec>
<title>4.2. Total treatment effect estimation</title>
<p>TTE is one of the most popular causal estimands in network experiments, especially in cluster-based randomization approaches (Eckles et al., <xref ref-type="bibr" rid="B9">2016</xref>; Pouget-Abadie et al., <xref ref-type="bibr" rid="B40">2018</xref>). There are two main challenges with causal effect estimation in graphs.</p>
<sec>
<title>4.2.1. Challenge no. 1: it is hard to separate a graph into treatment and control nodes without leaving edges across</title>
<p>The presence of interference breaks the SUTVA assumption and leads to biased causal effect estimation in relational data. The two-stage experimental design addresses this problem by finding groups of units that are unlikely to interact with each other (stage 1) and then randomly assigning each group to treatment and control (stage 2). Clustering has been proposed as a way to discover such groups that are strongly connected within but loosely connected across, thus finding treatment and control subgraphs that have a low probability of spillover from one to the other. However, due to the density of real-world graphs, graph clustering techniques can leave as many as 65% to 79% of edges as inter-cluster edges (Table 2 in Saveski et al., <xref ref-type="bibr" rid="B45">2017</xref>). Leaving these edges across treatment and control nodes would lead to a large amount of spillover. Incorporating information about the edge probability of spillover into the clustering helps alleviate this problem and is one of the main contributions of our work.</p></sec>
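<p>A minimal sketch of this two-stage design (using a modularity-based clustering from networkx purely for illustration; it is not the clustering method used in our experiments) is:</p>
<preformat>
import random
from networkx.algorithms.community import greedy_modularity_communities

def cluster_randomize(G, seed=0):
    # Stage 1: find groups of units that are unlikely to interact across groups.
    clusters = [list(c) for c in greedy_modularity_communities(G)]

    # Stage 2: randomly assign each whole cluster to treatment or control.
    random.seed(seed)
    for cluster in clusters:
        t = random.randint(0, 1)
        for v in cluster:
            G.nodes[v]["T"] = t
    return clusters
</preformat>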
<sec>
<title>4.2.2. Challenge no. 2: there is a tradeoff between interference and selection bias in cluster-based network experiments</title>
<p>While randomization of i.i.d. units in controlled experiments can guarantee ignorability and overlap, the two-stage design does not. One of the key observations in our work is that, depending on the number of clusters, there is a tradeoff between interference and selection bias, in the sense that the treatment and control groups may not represent the same underlying distribution. <xref ref-type="fig" rid="F1">Figure 1</xref> illustrates this tradeoff for Cora, one of the datasets in our experiments, using <italic>reLDG</italic> as the clustering method. When a network is separated into very few clusters, the Euclidean distance between nodes in treatment and control clusters is larger than the Euclidean distance when many clusters are produced over the same network (e.g., 0.4 vs. 0.18 for 2 and 1,000 clusters). This is intuitive because as the clusters get smaller and smaller, their randomization gets closer to mimicking full node randomization (shown as a star). At the same time, a larger number of clusters translates to a higher likelihood of edges between treatment and control nodes, which leads to higher undesired spillover and causal effect estimation error (e.g., 0.015 vs. 0.059 for 2 and 1,000 clusters).</p>
<p>Ideally, we would like to measure <italic>TTE</italic> &#x0003D; <italic>DTE</italic>(<italic>V</italic>) &#x0002B; <italic>APE</italic>(<italic>V</italic><sub>1</sub>) &#x02212; <italic>APE</italic>(<italic>V</italic><sub>0</sub>). Due to undesired spillover in a controlled experiment, what we are able to measure instead is the overall effect that comprises both allowable and unallowable peer effects <italic>TTE</italic> &#x0003D; <italic>DTE</italic>(<italic>V</italic>) &#x0002B; <italic>APE</italic>(<italic>V</italic><sub>1</sub>) &#x02212; <italic>APE</italic>(<italic>V</italic><sub>0</sub>) &#x0002B; <italic>UPE</italic>(<italic>V</italic><sub>1</sub>) &#x02212; <italic>UPE</italic>(<italic>V</italic><sub>0</sub>). Therefore, when we design an experiment for minimum interference, we are interested in setting it up in a way that makes <italic>UPE</italic>(<italic>V</italic><sub>1</sub>) &#x0003D; 0 and <italic>UPE</italic>(<italic>V</italic><sub>0</sub>) &#x0003D; 0. More formally:</p>
<p>Problem 2 (Network experiment design for total treatment effect estimation). Given a graph <italic>G</italic> &#x0003D; (<bold>V</bold>, <bold>E</bold>), a set of attributes <bold>V.X</bold> associated with each node, and a set of spillover probabilities <bold>E.P</bold> associated with the graph edges, we want to construct two sets of nodes, the control nodes <bold>V</bold><sub>0</sub> &#x02286; <bold>V</bold> and the treatment nodes <bold>V</bold><sub>1</sub> &#x02286; <bold>V</bold>, such that:</p>
<list list-type="simple">
<list-item><p>a. <bold>V</bold><sub>0</sub> &#x02229; <bold>V</bold><sub>1</sub> &#x0003D; &#x02205;.</p></list-item>
<list-item><p>b. |<bold>V</bold><sub>0</sub>| &#x0002B; |<bold>V</bold><sub>1</sub>| is maximized.</p></list-item>
<list-item><p>c. &#x003B8; &#x0003D; <italic>UPE</italic>(<bold>V</bold><sub>1</sub>) &#x02212; <italic>UPE</italic>(<bold>V</bold><sub>0</sub>) is minimized.</p></list-item>
<list-item><p>d. <bold>V</bold><sub>0</sub><bold>.X</bold> and <bold>V</bold><sub>1</sub><bold>.X</bold> are identically distributed.</p></list-item>
</list>
<p>This problem definition describes the desired qualities of the experiment design at a high level. The first component ensures that the treatment and control nodes do not overlap. The second component aims to keep as many nodes as possible from <bold>V</bold> in the final design. The third component minimizes unallowable spillover. The fourth component requires that there is no selection bias between the treatment and control groups. The second and third components are at odds with one another and require a tradeoff because the lower &#x003B8;, the lower the number of selected nodes for the experiment |<bold>V</bold><sub>0</sub>| &#x0002B; |<bold>V</bold><sub>1</sub>|. As we showed in <xref ref-type="fig" rid="F1">Figure 1</xref>, there is also a tradeoff between the third and fourth components.</p></sec></sec></sec>
<sec id="s5">
<title>5. <italic>CauseIS</italic>: a network experiment design framework for direct treatment effect estimation</title>
<p>In this section, we define an objective function corresponding to <italic>Problem 1</italic> and describe our proposed framework, which we refer to as <italic>CauseIS</italic>, for estimating direct treatment effects in network experiments.</p>
<p>Typically, total treatment effect estimation includes both APE and UPE. In a randomized approach, TTE is estimated as:</p>
<disp-formula id="E7"><label>(6)</label><mml:math id="M16"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mrow><mml:mi>T</mml:mi><mml:mover accent='true'><mml:mi>T</mml:mi><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>D</mml:mi><mml:mi>T</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>A</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>U</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>U</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>In this work, we propose an approach that makes APE(<bold>V</bold><sub>1</sub>)=0 and APE(<bold>V</bold><sub>0</sub>)=0 and in expectation makes UPE(<bold>V</bold><sub>1</sub>)-UPE(<bold>V</bold><sub>0</sub>)=0, thus making the estimated TTE correspond to DTE. We first define an objective function that addresses the goals specified in <italic>Problem 1</italic>.</p>
<sec>
<title>5.1. Objective function</title>
<p>The goal of the objective function is to find a subset of <bold>V</bold> with maximum cardinality (<italic>Problem 1.b</italic>) such that by randomizing treatment assignment over the selected subset, the allowable peer effects from the experiment are removed (<italic>Problem 1.c</italic>). We define an indicator <italic>s</italic><sub><italic>i</italic></sub> &#x02208; {0, 1} for each node <italic>v</italic><sub><italic>i</italic></sub> &#x02208; <bold>V</bold> such that <italic>s</italic><sub><italic>i</italic></sub> &#x0003D; 1 if node <italic>v</italic><sub><italic>i</italic></sub> is in the set of selected nodes, and <italic>s</italic><sub><italic>i</italic></sub> &#x0003D; 0 otherwise.</p>
<disp-formula id="E8"><mml:math id="M17"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mtext>maximize&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>&#x0007C;</mml:mo><mml:mi>V</mml:mi><mml:mo>&#x0007C;</mml:mo></mml:mrow></mml:munderover><mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>subject&#x000A0;to&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>s</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>&#x02264;</mml:mo><mml:mn>1</mml:mn><mml:mtext>&#x02003;</mml:mtext><mml:mo>&#x02200;</mml:mo><mml:msub><mml:mi>e</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>E</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x0007D;</mml:mo><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x02200;</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The two constraints together guarantee that no two adjacent nodes are both included in our network experiment design. This optimization can be solved by reducing our problem to the maximum independent set problem in graph theory (Eisenbrand et al., <xref ref-type="bibr" rid="B10">2003</xref>) such that nodes in the independent set correspond to the nodes selected for the network experiment.</p>
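<p>For illustration, the selection of non-adjacent experiment nodes can be carried out greedily, yielding a maximal (though not necessarily maximum) independent set; these notions are defined formally below. The following minimal sketch assumes an undirected networkx graph and is only illustrative, not the exact algorithm used in our experiments.</p>
<preformat>
# Sketch: greedy maximal independent set as a stand-in for the optimization above.
# Assumes an undirected networkx graph; illustrative only.
import networkx as nx
import random

def greedy_independent_set(G, seed=None):
    """Return a maximal independent set of G via a simple greedy sweep."""
    rng = random.Random(seed)
    nodes = list(G.nodes())
    rng.shuffle(nodes)                 # a random order yields different maximal sets
    selected = set()                   # nodes with s_i = 1
    blocked = set()                    # nodes adjacent to an already selected node
    for v in nodes:
        if v not in blocked and v not in selected:
            selected.add(v)
            blocked.update(G.neighbors(v))
    return selected

G = nx.barabasi_albert_graph(100, 3, seed=0)
sel = greedy_independent_set(G, seed=0)
# equivalently: sel = set(nx.maximal_independent_set(G, seed=0))
assert all(not G.has_edge(u, v) for u in sel for v in sel)
</preformat>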
<p>Given a graph <italic>G</italic> &#x0003D; (<bold>V</bold>, <bold>E</bold>), <bold>IS</bold> &#x02286; <bold>V</bold> is a subset of nodes such that no pair of nodes <italic>v</italic><sub><italic>i</italic></sub> &#x02208; <bold>IS</bold> and <italic>v</italic><sub><italic>j</italic></sub> &#x02208; <bold>IS</bold> shares an edge (<italic>e</italic><sub><italic>i,j</italic></sub> &#x02209; <bold>E</bold>). A <italic>maximal independent set</italic> is an independent set that is not a subset of any other independent set of the graph. Using a greedy sequential approach, a maximal independent set of a graph can be found in <italic>O</italic>(|<italic>E</italic>|) time (Blelloch et al., <xref ref-type="bibr" rid="B6">2012</xref>), and parallel algorithms can solve this problem in <italic>O</italic>(<italic>log</italic>(<italic>N</italic>)) rounds (Luby, <xref ref-type="bibr" rid="B30">1985</xref>; Yves et al., <xref ref-type="bibr" rid="B59">2009</xref>). A maximal independent set with the largest possible size for a given graph is known as a <italic>maximum independent set</italic>. Finding a maximum independent set is NP-hard. There are exact algorithms that find maximum independent sets in <italic>O</italic>(1.1996<sup><italic>n</italic></sup><italic>n</italic><sup><italic>O</italic>(1)</sup>) time (Xiao and Nagamochi, <xref ref-type="bibr" rid="B57">2017</xref>) and approximation algorithms that achieve an approximation ratio of <italic>O</italic>(<italic>n</italic>/(<italic>logn</italic>)<sup>2</sup>) (Boppana and Halld&#x000F3;rsson, <xref ref-type="bibr" rid="B7">1990</xref>).</p></sec>
<sec>
<title>5.2. <italic>CauseIS</italic> Framework</title>
<p>We propose <italic>CauseIS</italic>, a network experiment design for robust estimation of direct treatment effects, which disentangles peer effects from DTE estimation. <italic>CauseIS</italic> has two main steps:</p>
<list list-type="order">
<list-item><p>Finding a maximum independent set of the graph (<italic>Independent set graph</italic> in <xref ref-type="fig" rid="F2">Figure 2</xref>).</p></list-item>
<list-item><p>Assigning nodes of the maximum independent set to treatment and control in a randomized fashion (<italic>CauseIS</italic> <italic>output graph</italic> in <xref ref-type="fig" rid="F2">Figure 2</xref>).</p></list-item>
</list>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Illustration of the <italic>CauseIS</italic> framework in network experiments. <bold>Input graph</bold>: a graph of nodes and the connections between them. <bold>Independent set graph</bold>: a graph of bystander and independent set nodes selected by the independent set algorithm. <italic><bold>CauseIS</bold></italic> <bold>output graph</bold>: the output graph that represents the randomized treatment assignment of independent set nodes and the peer effects that exist in the experiment.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0002.tif"/>
</fig>
<p>In this framework, we find the treatment assignment vector <bold>Z</bold> by dividing the population into treatment, control, and bystander nodes. Following the proposed objective function, we first use an algorithm to find the maximum independent set of the given graph, which partitions the nodes into two sets: 1) nodes in the maximum independent set, denoted by <italic>MIS</italic> (<italic>MIS</italic> &#x02286; <bold>V</bold>), over which we randomize treatment assignment to obtain treatment (<bold>V<sub>1</sub></bold>) and control (<bold>V<sub>0</sub></bold>) groups, and 2) bystander nodes (<bold>V<sub>2</sub></bold>) that are not in <italic>MIS</italic>, where <bold>V<sub>2</sub></bold> &#x02286; <bold>V</bold>, <bold>V<sub>2</sub></bold> &#x02229; <italic>MIS</italic> &#x0003D; &#x02205;, and <bold>V<sub>2</sub></bold> &#x0222A; <italic>MIS</italic> &#x0003D; <bold>V</bold>. The main idea is to assign nodes of <italic>MIS</italic> to treatment and control at random, which ensures that there is no peer effect across treatment and control nodes.</p>
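<p>A minimal sketch of this assignment step, assuming the independent set has already been computed; the variable and function names below are illustrative rather than part of the framework specification.</p>
<preformat>
# Sketch: CauseIS-style treatment assignment (hypothetical names).
# mis: nodes in the (maximum) independent set; remaining nodes become bystanders.
import random

def causeis_assignment(G, mis, seed=0):
    """Split the independent set into treatment/control; the rest are bystanders."""
    rng = random.Random(seed)
    mis = list(mis)
    rng.shuffle(mis)
    half = len(mis) // 2
    V1 = set(mis[:half])                  # treatment group
    V0 = set(mis[half:])                  # control group
    V2 = set(G.nodes()) - V1 - V0         # bystander nodes, kept out of randomization
    Z = {v: (1 if v in V1 else 0 if v in V0 else None) for v in G.nodes()}
    return V1, V0, V2, Z
</preformat>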
<p><xref ref-type="fig" rid="F2">Figure 2</xref> illustrates the pipeline of the <italic>CauseIS</italic> framework. The <italic>Input graph</italic> shows the network on which the experiment is conducted. After running an independent set algorithm on the input graph, independent set and bystander nodes are identified, as shown in the <italic>Independent set graph</italic>. Finally, treatment and control nodes are obtained by randomizing treatment assignment over the independent set nodes. The <italic>CauseIS</italic> <italic>output graph</italic> shows the assignment of input graph nodes to the three treatment groups, where APE is removed from the experiment.</p>
<p>We remove bystander nodes from the randomized treatment assignment because interactions involving these nodes would introduce APE into treatment effect estimation. However, it is still possible for information to flow from peers in <bold>V<sub>2</sub></bold> to <bold>V</bold><sub>0</sub> and <bold>V</bold><sub>1</sub>, leading to undesired peer effects (nodes 1, 5, 7, 9, 10 in <xref ref-type="fig" rid="F2">Figure 2</xref>). In the running example, an infected person in <bold>V<sub>2</sub></bold> may infect their peers in <bold>V</bold><sub>0</sub> and <bold>V</bold><sub>1</sub>.</p>
<p>By removing APE from Equation (6), we have <inline-formula><mml:math id="M18"><mml:mrow><mml:mi>T</mml:mi><mml:mover accent='true'><mml:mi>T</mml:mi><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>D</mml:mi><mml:mi>T</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>U</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>U</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. By randomizing the treatment assignment over <italic>MIS</italic> nodes, we aim for treatment and control nodes to have, in expectation, the same number of activated peers among the bystander nodes <bold>V<sub>2</sub></bold>. Let <italic>&#x003B1;</italic><sub>1</sub> be the set of bystander nodes that are activated neighbors of treatment nodes at time <italic>t</italic>&#x02212;1, and <italic>&#x003B1;</italic><sub>0</sub> be the set of bystander nodes that are activated neighbors of control nodes at time <italic>t</italic>&#x02212;1. Let <italic>V</italic><sub>1,<italic>&#x003B1;</italic></sub> and <italic>V</italic><sub>0,<italic>&#x003B1;</italic></sub> represent the sets of treatment and control nodes activated by <italic>&#x003B1;</italic><sub>1</sub> and <italic>&#x003B1;</italic><sub>0</sub> at time <italic>t</italic>, and let <italic>V</italic><sub>1,&#x02212;<italic>&#x003B1;</italic></sub> and <italic>V</italic><sub>0,&#x02212;<italic>&#x003B1;</italic></sub> denote the sets of treatment and control nodes not activated by bystander nodes, respectively. Through randomization over the set <italic>MIS</italic>, we obtain |<italic>&#x003B1;</italic><sub>1</sub>| &#x02248; |<italic>&#x003B1;</italic><sub>0</sub>| in expectation. In this setup, TTE can be estimated as:</p>
<disp-formula id="E9"><label>(7)</label><mml:math id="M19"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mrow><mml:mi>T</mml:mi><mml:mover accent='true'><mml:mi>T</mml:mi><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover><mml:mi>E</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mtext>,&#x02212;</mml:mtext><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mtext>,&#x02212;</mml:mtext><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mstyle 
mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>If the probability of activating a treatment and a control node by a bystander node is equal, then, in expectation, an equal number of nodes in treatment and control nodes would get activated by bystander nodes (|<italic>V</italic><sub>1,<italic>&#x003B1;</italic></sub>| &#x02248; |<italic>V</italic><sub>0,<italic>&#x003B1;</italic></sub>|) and UPE(V1) is equal to UPE(V0), i.e., <inline-formula><mml:math id="M21"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>&#x02248;</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi></mml:math></inline-formula>. As a result, we have:</p>
<disp-formula id="E10"><label>(8)</label><mml:math id="M22"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mrow><mml:mi>T</mml:mi><mml:mover accent='true'><mml:mi>T</mml:mi><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover><mml:mi>E</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mtext>,&#x02212;</mml:mtext><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mo>,&#x02212;</mml:mo><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
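<p>In code, the estimator in Equation (8) amounts to a difference in means restricted to nodes not activated by bystanders. A minimal sketch, assuming Y maps each node to its observed outcome and bystander_activated is the set of treated and control nodes activated by bystander peers (hypothetical input names):</p>
<preformat>
# Sketch of the estimator in Equation (8); inputs are hypothetical names.
def estimate_dte(V1, V0, Y, bystander_activated):
    """Difference in means over nodes not activated by bystander peers."""
    t_sum = sum(Y[v] for v in V1 if v not in bystander_activated)
    c_sum = sum(Y[v] for v in V0 if v not in bystander_activated)
    return t_sum / len(V1) - c_sum / len(V0)
</preformat>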
<p>Since by design there are no peer effects between the treatment and control groups, Equation (8) estimates the DTE (<italic>TTE</italic> &#x02248; <italic>DTE</italic>).</p></sec></sec>
<sec id="s6">
<title>6. <italic>CMatch</italic>: a network experiment design framework for total treatment effect estimation</title>
<p>In this section, we describe our proposed <italic>CMatch</italic> framework that increases the accuracy of TTE estimation by combining clustering and matching techniques.</p>
<sec>
<title>6.1. <italic>CMatch</italic> framework</title>
<p>Our network experiment design framework <italic>CMatch</italic>, illustrated in <xref ref-type="fig" rid="F3">Figure 3</xref>, has two main goals: 1) <italic>spillover minimization</italic> which it achieves through weighted graph clustering, and 2) <italic>selection bias minimization</italic> which it achieves through cluster matching. Clusters in each matched pair are assigned to different treatments, thus achieving covariate balance between treatment and control (Fatemi and Zheleva, <xref ref-type="bibr" rid="B12">2020</xref>). The first goal addresses part <italic>c</italic> of <italic>Problem 1</italic> and the second goal addresses part <italic>d</italic>. While the first goal can be achieved with existing graph mining algorithms, solving for the second one requires developing novel approaches. To achieve the second goal, we propose an objective function, which can be solved with maximum weighted matching, and present the nuances of operationalizing each step.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Illustration of the <italic>CMatch</italic> framework for minimizing interference and selection bias in controlled experiments. <bold>Input</bold>: a graph of nodes and the connections between them. <bold>CMatch</bold>: node and cluster matching; the dashed circles indicate the clusters. Matched nodes are represented with the same circle border. <bold>Output</bold>: the matched cluster pairs are assigned to treatment and control at random; circles with the same color represent matched clusters.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0003.tif"/>
</fig>
<sec>
<title>6.1.1. Step 1: interference minimization through weighted graph clustering</title>
<p>Existing cluster-based techniques for network experiment design assume unweighted graphs (Backstrom and Kleinberg, <xref ref-type="bibr" rid="B5">2011</xref>; Ugander et al., <xref ref-type="bibr" rid="B54">2013</xref>; Gui et al., <xref ref-type="bibr" rid="B16">2015</xref>; Eckles et al., <xref ref-type="bibr" rid="B9">2016</xref>; Saveski et al., <xref ref-type="bibr" rid="B45">2017</xref>) and do not consider that different edges can have different likelihood of spillover. Incorporating information about the edge probability of spillover into the clustering helps alleviate this problem and is one of the main contributions of our work. In order to minimize undesired spillover, we operationalize minimizing &#x003B8; as minimizing the edges, and more specifically the edge spillover probabilities, between treatment and control nodes: <inline-formula><mml:math id="M23"><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mo>&#x02200;</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x02200;</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>p</mml:mi></mml:math></inline-formula>. To achieve this, <italic>CMatch</italic> creates graph clusters for two-stage design by employing two functions, edge spillover probability estimation and weighted graph clustering.</p>
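<p>As an illustration, the estimated undesired spillover for a candidate assignment can be computed by summing edge spillover probabilities across treatment and control. The sketch below assumes each edge of a networkx graph stores its spillover probability in an attribute named p; this naming is an assumption of the sketch.</p>
<preformat>
# Sketch: estimated undesired spillover between treatment and control groups.
# Assumes each edge carries a spillover probability in G[u][v]['p'].
def spillover_between_groups(G, V1, V0):
    theta_hat = 0.0
    for u, v, data in G.edges(data=True):
        if (u in V1 and v in V0) or (u in V0 and v in V1):
            theta_hat += data.get('p', 0.0)
    return theta_hat
</preformat>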
<sec>
<title>6.1.1.1. Edge spillover probability estimation</title>
<p>We consider edge strength, how strong the relationship between two nodes is, as a proxy for edge spillover probability. This reflects the notion that the probability of a person influencing a close friend to do something is higher than the probability of influencing an acquaintance. We can use common graph mining techniques to calculate edge strength, including ones based on topological proximity (Liben-Nowell and Kleinberg, <xref ref-type="bibr" rid="B28">2007</xref>), supervised classification (Gilbert and Karahalios, <xref ref-type="bibr" rid="B14">2009</xref>), or latent variable models (Li et al., <xref ref-type="bibr" rid="B26">2010</xref>).</p></sec>
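<p>As one concrete example among these options, edge strength can be approximated by the Jaccard coefficient of the endpoints&#x00027; neighborhoods and stored as the edge attribute p used in the next step. This is a sketch of one possible choice, not the only estimator compatible with the framework.</p>
<preformat>
# Sketch: topological edge strength via the Jaccard coefficient of endpoint
# neighborhoods, stored as the edge attribute 'p' (a spillover-probability proxy).
import networkx as nx

def annotate_edge_strength(G):
    for u, v, p in nx.jaccard_coefficient(G, ebunch=G.edges()):
        G[u][v]['p'] = p
    return G
</preformat>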
<sec>
<title>6.1.1.2. Weighted graph clustering</title>
<p>In order to incorporate edge strength into clustering, we can use any existing weighted graph clustering algorithm (Enright et al., <xref ref-type="bibr" rid="B11">2002</xref>; Schaeffer, <xref ref-type="bibr" rid="B47">2007</xref>; Yang and Leskovec, <xref ref-type="bibr" rid="B58">2015</xref>). In our experiments, we use a prominent non-parametric algorithm, the <italic>Markov Clustering Algorithm (MCL)</italic> (Enright et al., <xref ref-type="bibr" rid="B11">2002</xref>), which applies the idea of random walks to graph clustering and produces non-overlapping clusters. We also compare this algorithm with <italic>reLDG</italic>, which was the basis of previous work (Saveski et al., <xref ref-type="bibr" rid="B45">2017</xref>). One advantage of <italic>MCL</italic> is that it determines the number of clusters automatically, rather than requiring it as input. The main idea behind <italic>MCL</italic> is that nodes in the same cluster are connected by higher-weighted shortest paths than nodes in different clusters.</p></sec></sec>
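<p>A minimal clustering sketch on the spillover-weighted graph. Our experiments use <italic>MCL</italic>; in the sketch below a modularity-based weighted clustering routine from networkx stands in, purely for brevity, and the edge-weight attribute p is an assumption carried over from the previous sketch.</p>
<preformat>
# Sketch: weighted graph clustering on spillover-weighted edges.
# A modularity-based routine stands in for MCL here, for brevity only.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def weighted_clusters(G, weight='p'):
    """Cluster the graph using the edge spillover probabilities as weights."""
    return [set(c) for c in greedy_modularity_communities(G, weight=weight)]
</preformat>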
<sec>
<title>6.1.2. Step 2: selection bias minimization through cluster matching</title>
<p>Randomizing treatment assignment over clusters in a two-stage design does not guarantee that nodes within those clusters would represent random samples of the population. We propose to address this selection bias problem by <italic>cluster matching</italic> and balancing covariates across treatment and control clusters. While methods for matching nodes exist (Oktay et al., <xref ref-type="bibr" rid="B38">2010</xref>; Stuart, <xref ref-type="bibr" rid="B50">2010</xref>; Arbour et al., <xref ref-type="bibr" rid="B3">2014</xref>), this work is the first to propose methods for matching clusters.</p>
<sec>
<title>6.1.2.1. Objective function</title>
<p>The goal of cluster matching is to find pairs of clusters with similar node covariate distributions and assign them to different treatment groups. We propose to capture this through a maximum weighted matching objective over a cluster graph in which each discovered cluster from step 1 is a node and edges between clusters represent their similarity. Suppose that graph <italic>G</italic> is partitioned into <italic>C</italic> &#x0003D; {<italic>c</italic><sub>1</sub>, <italic>c</italic><sub>2</sub>, ..., <italic>c</italic><sub><italic>g</italic></sub>} clusters. We define a binary matrix <italic>A</italic> with entries <italic>a</italic><sub><italic>ij</italic></sub> &#x02208; {0, 1}, such that <italic>a</italic><sub><italic>ij</italic></sub> &#x0003D; 1 if two clusters <italic>c</italic><sub><italic>i</italic></sub> and <italic>c</italic><sub><italic>j</italic></sub> are matched, and <italic>a</italic><sub><italic>ij</italic></sub> &#x0003D; 0 otherwise. <italic>w</italic><sub><italic>i, j</italic></sub> &#x02208; &#x0211D; represents the similarity between two clusters <italic>c</italic><sub><italic>i</italic></sub> and <italic>c</italic><sub><italic>j</italic></sub>. Then the objective function of <italic>CMatch</italic> is as follows:</p>
<disp-formula id="E11"><label>(9)</label><mml:math id="M24"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mstyle displaystyle="true"><mml:munder><mml:mrow><mml:mi>a</mml:mi><mml:mi>r</mml:mi><mml:mi>g</mml:mi><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000B7;</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext class="textrm" mathvariant="normal">subject to&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x02200;</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:munderover></mml:mstyle><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02264;</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>}</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>This objective function maps to a maximum weighted matching problem for which there is a linear-time approximation algorithm (Duan and Pettie, <xref ref-type="bibr" rid="B8">2014</xref>) and a polynomial-time exact algorithm with <italic>O</italic>(<italic>N</italic><sup>2.376</sup>) (Mucha and Sankowski, <xref ref-type="bibr" rid="B36">2004</xref>; Harvey, <xref ref-type="bibr" rid="B19">2009</xref>).</p></sec>
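<p>In practice, Equation (9) can be handed directly to an off-the-shelf maximum weighted matching routine on the cluster graph. A minimal sketch, assuming the pairwise cluster weights have already been computed and are stored in a dictionary w keyed by cluster-index pairs (an assumed data layout):</p>
<preformat>
# Sketch: solving Equation (9) with networkx's maximum weighted matching.
# w maps a cluster-index pair (i, j) to the similarity weight w_ij.
import networkx as nx

def match_clusters(num_clusters, w):
    H = nx.Graph()
    H.add_nodes_from(range(num_clusters))
    for (i, j), weight in w.items():
        H.add_edge(i, j, weight=weight)
    # returns a set of matched cluster pairs maximizing the total weight
    return nx.max_weight_matching(H, weight='weight')
</preformat>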
<sec>
<title>6.1.2.2. Solution</title>
<p>In order to operationalize the solution to this objective, the main question that needs to be addressed is: what does it mean for two clusters to be similar? We propose to capture this cluster similarity through matched nodes. The more nodes can be matched based on their covariates across two clusters, the more similar the two clusters are. Thus, the operationalization comes down to the following three questions which we address next:</p>
<list list-type="order">
<list-item><p>What constitutes a <italic><bold>node match</bold></italic>?</p></list-item>
<list-item><p>How are node matches taken into consideration in computing the pairwise <italic><bold>cluster</bold></italic> <italic><bold>weights</bold> </italic>(cluster similarity)?</p></list-item>
<list-item><p>Given a cluster weight, what constitutes a potential cluster match, and thus an edge in the <italic><bold>cluster graph</bold></italic>?</p></list-item>
</list>
<p>Once these three questions are addressed, the cluster graph can be built and an existing maximum weighted matching algorithm can be applied to it to find the final cluster matches.</p>
<sec>
<title>6.1.2.2.1. Node Matching</title>
<p>The goal of node matching is to reduce the imbalance between treatment and control groups due to their different feature distributions. Given a node representation, fully blocked matching would look for the most similar nodes based on that representation (Stuart, <xref ref-type="bibr" rid="B50">2010</xref>). It is important to note that propensity score matching does not apply here because it models the probability of treatment in observational data and treatment is unknown at the time of designing a controlled experiment. In its simplest form, a node can be represented as a vector of attributes, including node-specific attributes, such as demographic characteristics, and structural attributes, such as node degree. For any two nodes, an appropriate similarity measure <italic>sim</italic>(<italic>v</italic><sub><italic>i</italic></sub>, <italic>v</italic><sub><italic>j</italic></sub>), such as cosine similarity, Jaccard similarity, or (inverse) Euclidean distance, can be applied in order to decide whether they match.</p>
<p>We consider two different options to match a pair of nodes in different clusters (and ignore matches within the same cluster); a short sketch of both options follows the list:</p>
<list list-type="bullet">
<list-item><p><bold>Threshold-based node matching (TNM)</bold>: Node <italic>v</italic><sub><italic>k</italic></sub> in cluster <italic>c</italic><sub><italic>i</italic></sub> is matched with node <italic>v</italic><sub><italic>l</italic></sub> from a different cluster <italic>c</italic><sub><italic>j</italic></sub> if the pairwise similarity of nodes <italic>sim</italic>(<italic>v</italic><sub><italic>k</italic></sub>, <italic>v</italic><sub><italic>l</italic></sub>) &#x0003E; <italic>&#x003B1;</italic>. The threshold <italic>&#x003B1;</italic> can vary from 0, which liberally matches all pairs of nodes, to the maximum possible similarity which matches nodes only if they are exactly the same. In our experiments, we set <italic>&#x003B1;</italic> based on the covariate distribution of each dataset and consider different quartiles of pairwise similarity as thresholds. This allows for each node to have multiple possible matches across clusters.</p></list-item>
<list-item><p><bold>Best node matching (BNM)</bold>: Node <italic>v</italic><sub><italic>k</italic></sub> in cluster <italic>c</italic><sub><italic>i</italic></sub> is matched with only one node <italic>v</italic><sub><italic>l</italic></sub> which is most similar to <italic>v</italic><sub><italic>k</italic></sub> in the whole graph; <italic>v</italic><sub><italic>l</italic></sub> should be in a different cluster. This is a very conservative matching approach in which each node is uniquely matched but allows the matching to be asymmetric.</p></list-item>
</list></sec>
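<p>The sketch below illustrates the two node matching options, assuming X maps each node to a numpy attribute vector, cluster_of maps each node to its cluster, and cosine similarity is used; all of these names and choices are assumptions of the sketch, not fixed choices of the framework.</p>
<preformat>
# Sketch of threshold-based (TNM) and best (BNM) node matching.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def threshold_node_matches(X, cluster_of, alpha):
    """TNM: every cross-cluster pair with similarity above alpha is a match."""
    nodes = list(X)
    matches = set()
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if cluster_of[u] != cluster_of[v] and cosine(X[u], X[v]) &gt; alpha:
                matches.add((u, v))
    return matches

def best_node_matches(X, cluster_of):
    """BNM: each node is matched to its single most similar cross-cluster node."""
    best = {}
    for u in X:
        candidates = [v for v in X if cluster_of[v] != cluster_of[u]]
        if candidates:
            best[u] = max(candidates, key=lambda v: cosine(X[u], X[v]))
    return best
</preformat>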
<sec>
<title>6.1.2.2.2. Cluster Weights</title>
<p>After the selection of a node matching mechanism, we are ready to define the pairwise similarity of clusters which is the basis of cluster matching. We consider three simple approaches and three more expensive approaches which require maximum weighted matching between nodes:</p>
<list list-type="bullet">
<list-item><p><bold>Euclidean distance (E)</bold>: This approach is the simplest of all because it does not consider node matches and it simply calculates the Euclidean distance between the node attribute vector means of two clusters.</p></list-item>
<list-item><p><bold>Matched node count (C)</bold>: This approach counts the number of matched nodes in each pair of clusters <italic>c</italic><sub><italic>i</italic></sub> and <italic>c</italic><sub><italic>j</italic></sub> and considers the count as the clusters&#x00027; pairwise similarity: <inline-formula><mml:math id="M25"><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:munderover><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:munderover><mml:msubsup><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>. A node in cluster <italic>c</italic><sub><italic>i</italic></sub> can have multiple matched nodes in <italic>c</italic><sub><italic>j</italic></sub>.</p></list-item>
<list-item><p><bold>Matched node average similarity (S)</bold>: Instead of the count, this approach considers the average similarity between matched nodes across two clusters <italic>c</italic><sub><italic>i</italic></sub> and <italic>c</italic><sub><italic>j</italic></sub>:</p>
<p><inline-formula><mml:math id="M26"><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:munderover><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:munderover><mml:msubsup><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x000B7;</mml:mo><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:munderover><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:munderover><mml:msubsup><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula>.</p>
</list-item>
</list>
<p>The count-based (<bold>C</bold>) and similarity-based (<bold>S</bold>) approaches allow a single node to be matched with multiple nodes in another cluster, and each of those matches counts toward the cluster pair weight. In order to distinguish this from the more desirable case in which distinct nodes in one cluster are matched to distinct nodes in another cluster, we propose approaches that allow each node to be considered only once in the matches that count toward the weight. For each pair of clusters, we build a node graph in which an edge is formed between nodes <italic>v</italic><sub><italic>i</italic></sub> and <italic>v</italic><sub><italic>j</italic></sub> in the two clusters, and the weight of this edge is <italic>sim</italic>(<italic>v</italic><sub><italic>i</italic></sub>, <italic>v</italic><sub><italic>j</italic></sub>). Maximum weighted matching then finds the best possible node matches between the two clusters. We consider three different variants for calculating the cluster pair weight based on the maximum weighted matching of nodes (a sketch follows the list):</p>
<list list-type="bullet">
<list-item><p><bold>Maximum matched node count (MC)</bold>: This method calculates the cluster weight the same way as <bold>C</bold> except that the matches (whether <inline-formula><mml:math id="M27"><mml:msubsup><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is 0 or 1) are based on the maximum weighted matching result.</p></list-item>
<list-item><p><bold>Maximum matched node average similarity (MS)</bold>: This method calculates the cluster weight the same way as <bold>S</bold> except that the node matches are based on the maximum weighted matching result.</p></list-item>
<list-item><p><bold>Maximum matched node similarity sum (MSS)</bold>: This method calculates the cluster weight similarly to <bold>MS</bold> except that it does not average the node similarity: <inline-formula><mml:math id="M28"><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:munderover><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:munderover><mml:msubsup><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x000B7;</mml:mo><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>.</p></list-item>
</list></sec>
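<p>A sketch of the matching-based cluster weights (MC, MS, MSS), assuming sim is any node similarity function and the two clusters are given as node collections; this is an illustrative implementation rather than the exact one used in our experiments.</p>
<preformat>
# Sketch: cluster pair weights from a maximum weighted matching between the
# nodes of two clusters (the MC, MS, and MSS variants).
import networkx as nx

def max_matched_pairs(ci, cj, sim):
    """Maximum weighted matching between the nodes of clusters ci and cj."""
    B = nx.Graph()
    for u in ci:
        for v in cj:
            B.add_edge(('i', u), ('j', v), weight=sim(u, v))
    pairs = []
    for a, b in nx.max_weight_matching(B, weight='weight'):
        u, v = (a[1], b[1]) if a[0] == 'i' else (b[1], a[1])
        pairs.append((u, v))
    return pairs

def cluster_weight(ci, cj, sim, variant='MC'):
    pairs = max_matched_pairs(ci, cj, sim)
    sims = [sim(u, v) for u, v in pairs]
    if variant == 'MC':                       # maximum matched node count
        return len(pairs)
    if variant == 'MS':                       # average similarity of matched nodes
        return sum(sims) / len(sims) if sims else 0.0
    return sum(sims)                          # MSS: similarity sum
</preformat>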
<sec>
<title>6.1.2.2.3. Cluster graph</title>
<p>Once the cluster similarities have been determined, we need to decide what similarity constitutes a potential cluster match. Such potential matches are added as edges in the cluster graph, which is then used for maximum weighted matching. We consider two different options:</p>
<list list-type="bullet">
<list-item><p><bold>Threshold-based cluster matching (TCM)</bold>: Cluster <italic>c</italic><sub><italic>i</italic></sub> is considered as a potential match of cluster <italic>c</italic><sub><italic>j</italic></sub> if their weight <italic>w</italic><sub><italic>i, j</italic></sub> &#x0003E; <italic>&#x003B2;</italic>. The threshold <italic>&#x003B2;</italic> can vary from 0, which allows all pairs of clusters to be potential matches, to the maximum possible similarity which allows matching between clusters only if they are exactly the same. In our experiments, we set <italic>&#x003B2;</italic> based on the distribution of pairwise similarities and their quartiles as thresholds.</p></list-item>
<list-item><p><bold>Greedy cluster matching (GCM)</bold>: For each cluster <italic>c</italic><sub><italic>i</italic></sub>, a sorted list of the similarities between <italic>c</italic><sub><italic>i</italic></sub> and all other clusters is defined. Cluster <italic>c</italic><sub><italic>i</italic></sub> is considered a potential match only to the cluster with the highest similarity value in the list.</p></list-item>
</list>
<p>The last step in <italic>CMatch</italic> runs maximum weighted matching on the cluster graph. For every matched cluster pair, it assigns one cluster to treatment and the other one to control at random. This completes the network experiment design.</p></sec></sec></sec>
<sec>
<title>6.1.3. Analysis of the estimation bias</title>
<p>We follow Eckles et al. (<xref ref-type="bibr" rid="B9">2016</xref>) to analyze the estimation bias of the proposed cluster-based approach. One of the common approaches to measuring the causal effect <italic>&#x003BC;</italic> of a treatment on an outcome is averaging outcomes over treatment and control groups <italic>via</italic> difference-in-means: <inline-formula><mml:math id="M29"><mml:mrow><mml:msup><mml:mi>&#x003BC;</mml:mi><mml:mi>d</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mi>&#x003BC;</mml:mi><mml:mi>d</mml:mi></mml:msup><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>V</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msup><mml:mi>&#x003BC;</mml:mi><mml:mi>d</mml:mi></mml:msup><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>V</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mn>0</mml:mn></mml:mstyle></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:math></inline-formula> where <inline-formula><mml:math id="M30"><mml:msup><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M31"><mml:msup><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>V</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> are the mean outcomes of treatment and control nodes under experiment design d, respectively. In the presence of interference, &#x003BC;<sup><italic>d</italic></sup> does not yield the true total treatment effects (<italic>&#x003BC;</italic><sup><italic>d</italic></sup> &#x02212; <italic>&#x003BC;</italic> &#x02260; 0). The impact of each node on the estimation bias is equal to the difference between the expected outcome of a node due to the treatment alone and the observed outcome under global treatment assignment where all nodes in the network have a treatment assignment. The experimental design can control the size of this bias by controlling the global treatment assignment. Eckles et al. prove that this bias in the cluster-based randomization approach is less than or equal to the absolute bias under randomized assignment. Following this study, if we assume that we have a linear outcome model for each node <italic>v</italic><sub><italic>i</italic></sub> &#x02208; <bold>V</bold> as Eckles et al. (<xref ref-type="bibr" rid="B9">2016</xref>):</p>
<disp-formula id="E12"><label>(10)</label><mml:math id="M32"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>E</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>Y</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>Z</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mi>T</mml:mi><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <bold>B</bold> is the coefficient matrix. The true TTE <italic>&#x003BC;</italic> is then calculated as in Eckles et al. (<xref ref-type="bibr" rid="B9">2016</xref>):</p>
<disp-formula id="E13"><label>(11)</label><mml:math id="M33"><mml:mrow><mml:mi>&#x003BC;</mml:mi><mml:mo>=</mml:mo><mml:mi>&#x003BC;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>Z</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mi>&#x003BC;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>Z</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mn>0</mml:mn></mml:mstyle></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:mfrac><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
<p>Under cluster-based randomized assignment, we have (Eckles et al., <xref ref-type="bibr" rid="B9">2016</xref>):</p>
<disp-formula id="E14"><label>(12)</label><mml:math id="M34"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mi>b</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mstyle mathvariant="bold"><mml:mn>1</mml:mn></mml:mstyle><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>C</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>C</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>C</italic>(<italic>v</italic><sub><italic>i</italic></sub>) denotes the cluster assignment of <italic>v</italic><sub><italic>i</italic></sub>. Under randomized assignment, we have (Eckles et al., <xref ref-type="bibr" rid="B9">2016</xref>):</p>
<disp-formula id="E15"><label>(13)</label><mml:math id="M35"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Equations (11)&#x02013;(13) imply that the absolute bias satisfies |<italic>&#x003BC;</italic> &#x02212; <italic>&#x003BC;</italic><sup><italic>cbr</italic></sup>| &#x02264; |<italic>&#x003BC;</italic> &#x02212; <italic>&#x003BC;</italic><sup><italic>rand</italic></sup>|. The effectiveness of cluster-based randomization in reducing bias depends on the strength of interactions within clusters. The ability of the clustering algorithm to capture the coefficient matrix <bold>B</bold> in a consistent manner also affects the degree of bias reduction. By incorporating the strength of connections between units into the clustering process, the method can better capture the structure of dependence between units, resulting in a smaller bias (<italic>&#x003BC;</italic> &#x02212; <italic>&#x003BC;</italic><sup><italic>cbr</italic></sup>). Considering Equations (11)&#x02013;(12), the relative bias is measured as in Eckles et al. (<xref ref-type="bibr" rid="B9">2016</xref>):</p>
<disp-formula id="E16"><label>(14)</label><mml:math id="M36"><mml:mrow><mml:mfrac><mml:mi>&#x003BC;</mml:mi><mml:mrow><mml:msup><mml:mi>&#x003BC;</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>b</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mo stretchy='false'>[</mml:mo><mml:mi>C</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mi>C</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>]</mml:mo></mml:mrow></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mstyle></mml:mrow></mml:mfrac><mml:mo>&#x02212;</mml:mo><mml:mn>1.</mml:mn></mml:mrow></mml:math></disp-formula>
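<p>For intuition, the ratio appearing in Equation (14) is the within-cluster share of the interference coefficients from Equations (11)&#x02013;(12). A minimal sketch, assuming B is an n-by-n numpy array and cluster_of gives the cluster label of each node index (hypothetical inputs):</p>
<preformat>
# Sketch: within-cluster share of the interference coefficients B_ij,
# i.e., the ratio that appears in Equation (14). Inputs are hypothetical.
import numpy as np

def within_cluster_share(B, cluster_of):
    n = B.shape[0]
    same = np.array([[cluster_of[i] == cluster_of[j] for j in range(n)]
                     for i in range(n)])
    return float((B * same).sum() / B.sum())
</preformat>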
<p>If the clustering fails to capture the structural dependencies, the numerator in Equation (14) will be much smaller than the denominator. As a result, the method will underestimate the true total treatment effects.</p></sec></sec></sec>
<sec id="s7">
<title>7. Experiments</title>
<p>In this section, we evaluate the performance of <italic>CauseIS</italic> and <italic>CMatch</italic> in treatment effect estimation compared to the baselines. We first describe datasets used in our experiments and then discuss the experimental setup and results.</p>
<sec>
<title>7.1. Data generation</title>
<p>Since existing network datasets do not have ground truth for treatment and its causal effect on the outcome, we use synthetic and real-world data structures and simulate the outcome and causal effect in the experiments.</p>
<sec>
<title>7.1.1. Synthetic data</title>
<p>For generating synthetic networks, we use two network generation models:</p>
<list list-type="bullet">
<list-item><p><italic>Barab</italic>&#x000E1;<italic>si-Albert (BA)</italic> model: This model generates random scale-free networks using the preferential attachment model. In the beginning, the network is constructed from <italic>m</italic><sub>0</sub> connected nodes. Then, new nodes are connected to <italic>m</italic> existing nodes with a probability that is proportional to the number of edges that the existing nodes already have (Albert and Barab&#x000E1;si, <xref ref-type="bibr" rid="B1">2002</xref>). We set <italic>m</italic> &#x0003D; 3 in all experiments.</p></list-item>
<list-item><p><italic>Forest Fire (FF)</italic> model: In this model, a new node <italic>v</italic><sub><italic>i</italic></sub> attaches to an existing node <italic>v</italic><sub><italic>j</italic></sub> and then links to nodes connected to <italic>v</italic><sub><italic>j</italic></sub> with forward and backward burning probabilities denoted by <italic>p</italic><sub><italic>f</italic></sub> and <italic>p</italic><sub><italic>b</italic></sub>, respectively. Leskovec et al. (<xref ref-type="bibr" rid="B25">2007</xref>) show that the synthetic network generated by this model can mimic most real-world structure characteristics. In the experiments, we generate all the graphs with forward burning probability <italic>p</italic><sub><italic>f</italic></sub> &#x0003D; 0.3 and backward burning probability <italic>p</italic><sub><italic>b</italic></sub> &#x0003D; 0.3.</p></list-item>
</list>
<p>After generating the network structure, we generate 10 attributes for each node, drawn uniformly at random from [ &#x02212; 1, 1].</p></sec>
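<p>A minimal sketch of the synthetic data generation for the BA model (the Forest Fire graphs are generated separately with the burning probabilities given above); the node attribute key x used below is an illustrative choice.</p>
<preformat>
# Sketch: Barabasi-Albert network with 10 node attributes drawn uniformly
# from [-1, 1]; the attribute key 'x' is a hypothetical naming choice.
import networkx as nx
import numpy as np

def make_ba_network(n, m=3, num_attrs=10, seed=0):
    rng = np.random.default_rng(seed)
    G = nx.barabasi_albert_graph(n, m, seed=seed)
    for v in G.nodes():
        G.nodes[v]['x'] = rng.uniform(-1.0, 1.0, size=num_attrs)
    return G
</preformat>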
<sec>
<title>7.1.2. Real-world data</title>
<p>We use five real-world datasets in our experiments. The <italic>50 Women</italic> dataset (Michell and Amos, <xref ref-type="bibr" rid="B34">1997</xref>) includes the sport, smoking, drug, and alcohol habits of 50 students with 74 friendship connections. The <italic>Cora</italic> and <italic>Citeseer</italic> datasets (Sen et al., <xref ref-type="bibr" rid="B48">2008</xref>) contain the citation networks of 2,708 and 3,312 articles with binary bag-of-words attributes for each article and 4,675 and 5,278 edges, respectively. The <italic>Hamsterster</italic> dataset (Zheleva et al., <xref ref-type="bibr" rid="B60">2008</xref>) includes the online friendship network of 2,059 hamsters with 10,943 edges. The <italic>Hateful users</italic> dataset (Ribeiro et al., <xref ref-type="bibr" rid="B43">2018</xref>) is a sample of Twitter&#x00027;s retweet graph containing 100,386 users with 1,024 attributes and more than two million retweet edges. In the <italic>Hateful users</italic> dataset, we remove singletons and degree-1 nodes from the graph.</p></sec>
<sec>
<title>7.1.3. Synthetic causal effect</title>
<p>We assume that the underlying probability of activating a node (i.e., changing its outcome) due to treatment and allowable peer effects is 0.4 in the treatment group and 0.2 in the control group, which makes the true causal effect <italic>TTE</italic> &#x0003D; 0.2. Based on these probabilities, we randomly assign each node as activated or not. For each inactive node, we simulate two types of interference, considering both fixed values (0.1 and 0.5) and values based on the edge weights for <italic>e</italic>.<italic>p</italic> (see the sketch after this list):</p>
<list list-type="order">
<list-item><p>Direct interference: each treated neighbor of a control node activates the node with an unallowable spillover probability of <italic>e</italic>.<italic>p</italic>.</p></list-item>
<list-item><p>Contagion: inactive treated and untreated nodes get activated with the unallowable spillover probability of <italic>e</italic>.<italic>p</italic> if they are connected to at least one activated node in a different treatment class.</p></list-item>
</list></sec></sec>
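<p>The sketch below illustrates this outcome simulation for the direct interference case, assuming the attributed graph and a treatment assignment produced earlier; contagion can be simulated analogously by propagating activations across edges until no node changes. All names are ours, and the encoding of the assignment (1 for treatment, 0 for control, bystanders omitted) is an assumption.</p>
<preformat>
import numpy as np

def simulate_outcomes(G, assignment, p_treated=0.4, p_control=0.2, ep=0.1, seed=0):
    """Activate nodes from treatment and allowable peer effects, then apply
    direct interference with unallowable spillover probability ep."""
    rng = np.random.default_rng(seed)
    active = {}
    # Baseline activation: 0.4 for treated nodes, 0.2 for control nodes.
    for v, z in assignment.items():
        p = p_treated if z == 1 else p_control
        active[v] = bool(p > rng.random())
    # Direct interference: each treated neighbor of an inactive control node
    # activates it with probability ep.
    for v, z in assignment.items():
        if z == 0 and not active[v]:
            for u in G.neighbors(v):
                if assignment.get(u) == 1 and ep > rng.random():
                    active[v] = True
                    break
    return active
</preformat>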
<sec>
<title>7.2. Main algorithms and baselines</title>
<p>Our baselines differ depending on the causal effect of interest. In the following, we describe the main baselines for direct and total treatment effect estimation.</p>
<sec>
<title>7.2.1. Baselines for direct treatment effect estimation</title>
<p>We compare the performance of four different approaches in our experiments.</p>
<list list-type="bullet">
<list-item><p><bold>Randomized</bold>: This algorithm assigns nodes to treatment and control randomly, ignoring the network.</p></list-item>
<list-item><p><bold>Match</bold>: This algorithm matches nodes using the maximum weighted matching algorithm and then assigns the nodes in each matched pair to treatment and control at random, without considering clustering.</p></list-item>
<list-item><p><bold>CauseIS</bold>: In our proposed framework, we use an algorithm to find a maximal independent set <italic>MIS</italic> and then assign the nodes of the set to treatment or control at random.</p></list-item>
<list-item><p><bold>CauseIS_match</bold>: This method uses the <italic>CauseIS</italic> framework, but it matches nodes of <italic>MIS</italic> and then assigns nodes of matched pairs to treatment or control at random.</p></list-item>
</list>
<p>The goal of comparing our method with <italic>Match</italic> and <italic>CauseIS</italic>_<italic>Match</italic> is to assess whether our method introduces selection bias. Using matching in an RCT is unusual, but in small datasets, altering the randomization process by imposing structural constraints on the graph may lead to worse randomization, and matching can mitigate this problem.</p></sec>
<sec>
<title>7.2.2. Baselines for total treatment effect estimation</title>
<p>For TTE, all our baseline and main algorithm variants take an attributed graph as input and produce a set of clusters, each assigned to treatment, control, or none. For graph clustering, we considered two main algorithms, <italic>Restreaming Linear Deterministic Greedy (reLDG)</italic> (Nishimura and Ugander, <xref ref-type="bibr" rid="B37">2013</xref>) and the <italic>Markov Clustering Algorithm (MCL)</italic> (Enright et al., <xref ref-type="bibr" rid="B11">2002</xref>). <italic>reLDG</italic> takes as input an unweighted graph and the desired number of clusters and produces a graph clustering. <italic>reLDG</italic> was reported to perform very well in state-of-the-art methods for network experiment design (Saveski et al., <xref ref-type="bibr" rid="B45">2017</xref>). <italic>MCL</italic> is a non-parametric algorithm that takes as input a weighted graph and produces a graph clustering. The edge weights, which correspond to the probabilities of spillover, are estimated based on node pair similarity using one minus the normalized L2 norm: 1 &#x02212; <italic>L</italic><sub>2</sub>(<italic>v</italic><sub><italic>i</italic></sub>.<italic>x, v</italic><sub><italic>j</italic></sub>.<italic>x</italic>).</p>
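<p>As an illustration, a hedged sketch of this edge weighting follows; the paper does not fix the normalization constant, so normalizing by the largest distance attainable for attributes in [&#x02212;1, 1] is our assumption.</p>
<preformat>
import numpy as np

def spillover_weight(x_i, x_j):
    """Edge weight = 1 - normalized L2 distance between attribute vectors."""
    x_i, x_j = np.asarray(x_i, dtype=float), np.asarray(x_j, dtype=float)
    d_max = 2.0 * np.sqrt(len(x_i))  # assumed normalizer for attributes in [-1, 1]
    return 1.0 - np.linalg.norm(x_i - x_j) / d_max

def weight_graph(G):
    """Attach spillover-probability weights to every edge of an attributed graph."""
    for u, v in G.edges():
        G[u][v]["weight"] = spillover_weight(G.nodes[u]["x"], G.nodes[v]["x"])
    return G
</preformat>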
<p>The main algorithms and baselines are:</p>
<list list-type="bullet">
<list-item><p><bold>CR</bold> (Saveski et al., <xref ref-type="bibr" rid="B45">2017</xref>): The <italic>Completely Randomized (CR)</italic> algorithm was used as a baseline in Saveski et al. (<xref ref-type="bibr" rid="B45">2017</xref>). The algorithm clusters the unweighted graph using <italic>reLDG</italic> algorithm, assigns similar clusters to the same strata, and assigns nodes in strata to treatment and control in a randomized fashion.</p></list-item>
<list-item><p><bold>CBR<sub>reLDG</sub></bold> (Saveski et al., <xref ref-type="bibr" rid="B45">2017</xref>): <italic>Cluster-based Randomized assignment (CBR)</italic> is the main algorithm proposed by Saveski et al. (<xref ref-type="bibr" rid="B45">2017</xref>). The algorithm clusters the unweighted graph using <italic>reLDG</italic>, assigns similar clusters to the same strata, and randomly picks clusters within the same strata as treatment or control.</p></list-item>
<list-item><p><bold>CBR<sub>MCL</sub></bold>: A variant of <bold>CBR</bold> that we introduce for the sake of fairness which uses <italic>MCL</italic> for weighted-graph clustering.</p></list-item>
<list-item><p><bold>CMatch<sub>reLDG</sub></bold>: This method uses our <italic>CMatch</italic> framework but works on an unweighted graph. It uses <italic>reLDG</italic> for graph clustering.</p></list-item>
<list-item><p><bold>CMatch<sub>MCL</sub></bold>: This is our proposed technique which uses <italic>MCL</italic> for weighted graph clustering.</p></list-item>
</list>
<p>We consider the <italic>Randomized</italic> and <italic>Match</italic> techniques described in Section 7.2.1 as two more baselines for total treatment effect estimation. <italic>CMatch</italic> uses the <italic>max</italic>_<italic>weight</italic>_<italic>matching</italic> function from the <italic>NetworkX</italic> Python library.</p></sec></sec>
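<p>To make the matching-based randomization concrete, the following is a minimal sketch that builds a similarity-weighted graph over node pairs, applies <italic>NetworkX</italic>&#x00027;s <italic>max_weight_matching</italic>, and flips a coin within each matched pair. Building the full pairwise graph is quadratic and is meant only as an illustration; the helper names are ours.</p>
<preformat>
import itertools
import random
import networkx as nx

def match_and_assign(G, similarity, seed=0):
    """Match nodes by maximum weighted matching, then randomize within pairs."""
    rng = random.Random(seed)
    sim_graph = nx.Graph()
    sim_graph.add_nodes_from(G.nodes())
    # Weight every node pair by attribute similarity (illustrative only).
    for u, v in itertools.combinations(G.nodes(), 2):
        sim_graph.add_edge(u, v, weight=similarity(G.nodes[u]["x"], G.nodes[v]["x"]))
    assignment = {}
    for u, v in nx.max_weight_matching(sim_graph):
        t, c = (u, v) if rng.random() > 0.5 else (v, u)
        assignment[t], assignment[c] = 1, 0
    return assignment
</preformat>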
<sec>
<title>7.3. Experimental setup</title>
<p>We run a number of experiments varying the underlying spillover assumptions, clustering algorithms, number of clusters, and node matching algorithms. Our experimental setup measures the desired properties for network experiment design, as described in Problem 2 and follows the experimental setups in existing work (Stuart, <xref ref-type="bibr" rid="B50">2010</xref>; Maier et al., <xref ref-type="bibr" rid="B31">2013</xref>; Arbour et al., <xref ref-type="bibr" rid="B3">2014</xref>; Eckles et al., <xref ref-type="bibr" rid="B9">2016</xref>; Saveski et al., <xref ref-type="bibr" rid="B45">2017</xref>).</p>
<p>To measure the strength of interference bias in different estimators, we report on two metrics:</p>
<list list-type="order">
<list-item><p><italic>Root Mean Squared Error</italic> (RMSE) of the treatment effect calculated as:
<disp-formula id="E17"><mml:math id="M37"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>R</mml:mi><mml:mi>M</mml:mi><mml:mi>S</mml:mi><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msqrt></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
<p>where <italic>S</italic> is the number of runs and &#x003C4;<sub><italic>s</italic></sub> and <inline-formula><mml:math id="M38"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are the true and estimated causal effect in run <italic>s</italic>, respectively. We set <italic>S</italic> &#x0003D; 10 in all experiments. The error can be attributed to undesired spillover only.</p></list-item>
<list-item><p>The number of edges and the sum of edge weights between treatment and control nodes as assigned by each algorithm.</p></list-item>
</list>
<p>To measure selection bias, we assess how different the treatment and control nodes are. We compute the Euclidean distance between the attribute vector means of treated and untreated nodes and report the average and standard deviation over 10 runs.</p>
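<p>A short sketch of these two evaluation quantities (the RMSE over runs and the selection-bias distance), with our own naming:</p>
<preformat>
import numpy as np

def rmse(estimated_effects, true_effects):
    """Root mean squared error of the estimated causal effect over S runs."""
    est, true = np.asarray(estimated_effects), np.asarray(true_effects)
    return float(np.sqrt(np.mean((est - true) ** 2)))

def selection_bias_distance(G, treated, control):
    """Euclidean distance between the attribute-vector means of the two groups."""
    mean_t = np.mean([G.nodes[v]["x"] for v in treated], axis=0)
    mean_c = np.mean([G.nodes[v]["x"] for v in control], axis=0)
    return float(np.linalg.norm(mean_t - mean_c))
</preformat>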
<p>To show the strength of UPE imposed by bystander nodes in the <italic>CauseIS</italic> framework, we calculate the difference between the percentage of edges from bystander nodes to treatment and control nodes as:</p>
<disp-formula id="E18"><label>(15)</label><mml:math id="M39"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>E</mml:mtext></mml:mstyle><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>E</mml:mtext></mml:mstyle></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>T</mml:mtext></mml:mstyle></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>B</mml:mtext></mml:mstyle></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>E</mml:mtext></mml:mstyle></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>C</mml:mtext></mml:mstyle></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>B</mml:mtext></mml:mstyle></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x000D7;</mml:mo><mml:mn>100</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>d</italic><sub><italic>i,j</italic></sub> &#x0003D; 1 if there is an edge between nodes <italic>v</italic><sub><italic>i</italic></sub> and <italic>v</italic><sub><italic>j</italic></sub>, and 0 otherwise. <italic>T</italic>, <italic>C</italic>, and <italic>B</italic> denote the sets of treatment, control, and bystander nodes, respectively.</p>
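<p>A minimal sketch of Equation (15), with our own naming:</p>
<preformat>
def bystander_edge_imbalance(G, treated, control, bystanders):
    """Difference between the percentage of edges from bystander nodes to
    treatment and to control, as in Equation (15)."""
    treated, control, bystanders = set(treated), set(control), set(bystanders)
    to_treatment = sum(1 for u, v in G.edges()
                       if (u in bystanders and v in treated)
                       or (v in bystanders and u in treated))
    to_control = sum(1 for u, v in G.edges()
                     if (u in bystanders and v in control)
                     or (v in bystanders and u in control))
    return (to_treatment - to_control) / G.number_of_edges() * 100.0
</preformat>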
<p>In our experiments, we use the <italic>maximal</italic>_<italic>independent</italic>_<italic>set</italic> function from the <italic>NetworkX</italic> Python library, which implements the approach of Blelloch et al. (<xref ref-type="bibr" rid="B6">2012</xref>), to find a maximal independent set of each graph.</p>
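<p>A minimal sketch of this assignment step of <italic>CauseIS</italic>, assuming <italic>NetworkX</italic>; nodes outside the maximal independent set become bystanders and receive no assignment. The function name is ours.</p>
<preformat>
import random
import networkx as nx

def causeis_assignment(G, seed=0):
    """Find a maximal independent set and split it at random into
    treatment (1) and control (0)."""
    rng = random.Random(seed)
    mis = list(nx.maximal_independent_set(G, seed=seed))
    rng.shuffle(mis)
    half = len(mis) // 2
    assignment = {v: 1 for v in mis[:half]}
    assignment.update({v: 0 for v in mis[half:]})
    return assignment
</preformat>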
<p>We run all 115 possible combinations of <italic>CMatch</italic> options for node matching, cluster weights, and cluster graphs for each dataset. We consider four different values for the threshold <italic>&#x003B1;</italic> in <bold>TNM</bold>: 0 (<bold>TNM0</bold>), and the first (<bold>TNM1</bold>), second (<bold>TNM2</bold>), and third (<bold>TNM3</bold>) quantiles of the pairwise node similarity distribution, where <italic>sim</italic>(<italic>v</italic><sub><italic>i</italic></sub>, <italic>v</italic><sub><italic>j</italic></sub>) &#x0003D; 1 &#x02212; the normalized <italic>L</italic><sub>2</sub> norm. For <bold>TCM</bold>, we consider four different <italic>&#x003B2;</italic> values: 0 (<bold>TCM0</bold>), and the first (<bold>TCM1</bold>), second (<bold>TCM2</bold>), and third (<bold>TCM3</bold>) quantiles of the pairwise cluster similarity distribution for each dataset. We use <bold>TNM2 &#x0002B; C &#x0002B; TCM2</bold> in all the experiments of <italic>CMatch</italic><sub><italic>reLDG</italic></sub>.</p>
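<p>The thresholds can be read off the empirical similarity distribution; in the sketch below, interpreting the &#x0201C;first, second, and third quantiles&#x0201D; as the 0.25, 0.5, and 0.75 quantiles is our assumption, and enumerating all node pairs is done only for illustration.</p>
<preformat>
import itertools
import numpy as np

def node_similarity_quantiles(G, similarity, qs=(0.25, 0.5, 0.75)):
    """Quantiles of the pairwise node-similarity distribution, used as the
    TNM1/TNM2/TNM3 thresholds (TNM0 corresponds to a threshold of 0)."""
    sims = [similarity(G.nodes[u]["x"], G.nodes[v]["x"])
            for u, v in itertools.combinations(G.nodes(), 2)]
    return np.quantile(sims, qs)
</preformat>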
<p>Unless otherwise specified, the number of clusters is the same for all CBR and CMatch versions, based on the number determined by MCL as optimal for each respective dataset. The number of clusters determined by <italic>MCL</italic> is 2,497 for <italic>Citeseer</italic>, 1,885 for <italic>Cora</italic>, 1,056 for <italic>Hamsterster</italic>, and 20 for the <italic>50 Women</italic> dataset.</p></sec>
<sec>
<title>7.4. Results</title>
<p>Here, we present the experimental results for the proposed framework. We first describe the performance of the CauseIS approach in estimating direct treatment effects. Then, we show the effectiveness of the CMatch framework in mitigating interference and selection bias.</p>
<sec>
<title>7.4.1. Performance of <italic>CauseIS</italic> framework</title>
<sec>
<title>7.4.1.1. Evaluation of direct treatment effect estimation</title>
<p>To assess the accuracy of <italic>CauseIS</italic> in estimating DTE compared to the baselines, we measure the causal effect estimation error for different unallowable peer effect probabilities. <xref ref-type="fig" rid="F4">Figure 4</xref> shows the RMSE of DTE in the real-world datasets. In all five datasets, <italic>CauseIS</italic> and <italic>CauseIS</italic>_<italic>Match</italic> achieve lower estimation error than <italic>Randomized</italic> and <italic>Match</italic>, especially in <italic>Hamsterster</italic>, with 72.1% and 76.6% error reduction for <italic>e</italic>.<italic>p</italic> &#x0003D; <italic>edge</italic>_<italic>weight</italic> and <italic>e</italic>.<italic>p</italic> &#x0003D; 0.5, respectively, and in <italic>Hateful Users</italic>, with 69.4% error reduction for <italic>e</italic>.<italic>p</italic> &#x0003D; 0.1. Increasing the spillover probability from 0.1 to 0.5 yields higher estimation errors because the probability of changing treatment and control outcomes through peer effects increases.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>RMSE of direct treatment effects in real-world datasets considering different unallowable peer effect probabilities.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0004.tif"/>
</fig>
<p>Synthetic data experiments depict a similar picture. <xref ref-type="fig" rid="F5">Figure 5</xref> shows the stronger performance of <italic>CauseIS</italic> and <italic>CauseIS</italic>_<italic>Match</italic> over the <italic>Randomized</italic> and <italic>Match</italic> methods in reducing causal effect estimation error. For example, <italic>CauseIS</italic>&#x00027;s error is less than half of the error of the <italic>Randomized</italic> approach (0.04 vs. 0.12 for graphs with 10,000 nodes, and 0.035 vs. 0.13 for graphs with 20,000 nodes in the Forest Fire model). In graphs with 50,000 nodes, <italic>CauseIS</italic> obtains 63.4% and 69.9% estimation error reduction in the Forest Fire and Barab&#x000E1;si-Albert models, respectively, compared to the other graphs.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>RMSE of direct treatment effect in synthetic data with a different number of nodes and edges. Numbers in the first row of the x-axis show the number of nodes in graphs, and the second row represents the size of MIS. <bold>(A)</bold> Forest Fire model. <bold>(B)</bold> Barab&#x000E1;si-Albert model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0005.tif"/>
</fig>
<p>In both synthetic and real-world datasets, <italic>Randomized</italic> and <italic>Match</italic> on the one hand, and <italic>CauseIS</italic> and <italic>CauseIS</italic>_<italic>Match</italic> on the other, show similar performance. This is intuitive because they use similar randomization techniques. Although the MIS contains only approximately half of the population in all datasets, and <italic>CauseIS</italic> therefore works with a smaller sample, its estimation error remains significantly lower than that of the <italic>Randomized</italic> method even as the MIS size increases.</p>
<sec>
<title>7.4.1.2. Sensitivity to the density of networks</title>
<p>To assess the impact of network density on the estimation error of various models, we computed the average estimation error across 10 randomly generated graphs containing 10,000 nodes for each density value. We adjusted the density of graphs in the Barab&#x000E1;si-Albert model by altering the value of <italic>m</italic> within the range of 1&#x02013;9, while for the Forest Fire model, we set <italic>p</italic><sub><italic>f</italic></sub> &#x0003D; <italic>p</italic><sub><italic>b</italic></sub> and varied <italic>p</italic><sub><italic>f</italic></sub> between 0.01 and 0.35. <xref ref-type="fig" rid="F6">Figure 6</xref> illustrates that as the density of the graphs increases, the estimation error for all methods also increases. This observation is expected since an increase in the number of edges between treatment and control raises the possibility of unallowable peer effects in the experiment. However, the <italic>CauseIS</italic> and <italic>CauseIS</italic>_<italic>Match</italic> methods consistently outperform the other two baseline methods in all graphs. Moreover, an increase in the density of the graph leads to a decrease in the size of the MIS. A higher MIS rate (meaning fewer bystander nodes) implies fewer spillover effects from bystander nodes to treatment and control, resulting in smaller estimation errors.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>RMSE of direct treatment effects in synthetic data with 10,000 nodes and different densities. Numbers in the first row of the x-axis show the number of edges in graphs, and the second row represents the size of MIS. <bold>(A)</bold> Forest Fire model. <bold>(B)</bold> Barab&#x000E1;si-Albert model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0006.tif"/>
</fig></sec>
<sec>
<title>7.4.1.3. Selection bias evaluation</title>
<p>In this experiment, we evaluate the selection bias of the different methods by comparing the Euclidean distance between treatment and control nodes&#x00027; attributes in real-world and synthetic datasets with different population sizes. <xref ref-type="fig" rid="F7">Figure 7</xref> shows this comparison for real-world and synthetic data. It is not surprising that the <italic>Match</italic> method achieves the lowest selection bias in all datasets because it matches the most similar treatment and control nodes based on attribute similarity. <italic>CauseIS</italic>_<italic>Match</italic> has a higher selection bias than <italic>Match</italic> because fewer nodes are matched in this approach than in the <italic>Match</italic> method. Although <italic>CauseIS</italic> has a high selection bias, <italic>CauseIS</italic>_<italic>Match</italic> reduces it to some extent.</p>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Euclidean distance between the attribute vector means of treatment and control nodes in real-world and synthetic datasets. In synthetic dataset plots, numbers in the first row of the x-axis show the number of nodes in graphs, and the second row shows the size of MIS. <bold>(A)</bold> Real-world data, <bold>(B)</bold> Forest Fire model, and <bold>(C)</bold> Barab&#x000E1;si-Albert model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0007.tif"/>
</fig>
<p>Next, we look at how sample size impacts selection bias. We expect that, asymptotically, there would be no selection bias with randomization for any design. <xref ref-type="fig" rid="F7">Figure 7</xref> shows that, independent of the network generation model, increasing the population size reduces the difference between treatment and control nodes&#x00027; attributes, so the added value of matching decreases and eventually disappears. For example, in graphs with 500 nodes generated by the Forest Fire model, the Euclidean distance between treatment and control nodes in <italic>CauseIS</italic> is 0.24, while in graphs with 50,000 nodes this distance decreases to 0.024. These results confirm the advantage of the matching technique in small datasets.</p>
<sec>
<title>7.4.1.4. Peer effect evaluation</title>
<p>To measure the extent to which UPE(<bold>V</bold><sub>0</sub>) and UPE(<bold>V</bold><sub>1</sub>) can cancel each other out, we consider the percentage of edges from bystander nodes to treatment and control nodes. <xref ref-type="fig" rid="F8">Figure 8</xref> shows this quantity in real-world and synthetic datasets using the <italic>CauseIS</italic> and <italic>CauseIS</italic>_<italic>Match</italic> methods. As expected, the results show that for graphs with fewer nodes, the difference between the percentages of edges to treatment and control nodes is higher than for larger graphs (2.5 vs. 0.04 in the <italic>50 Women</italic> vs. <italic>Hateful Users</italic> datasets). In synthetic data with larger population sizes (40,000 and 50,000 nodes), the difference between the percentages of edges to treatment and control is close to zero.</p>
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Difference between the percentage of edges to treatment and control nodes in real-world and synthetic datasets with a different number of nodes and edges. In synthetic dataset plots, numbers in the first row of the x-axis show the number of nodes in graphs, and the second row shows the size of MIS. <bold>(A)</bold> Real-world data, <bold>(B)</bold> Forest Fire model, and <bold>(C)</bold> Barab&#x000E1;si-Albert model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0008.tif"/>
</fig>
<p>In both synthetic and real-world datasets, we observe that the causal effect estimation error decreases as the sample size increases: with more nodes and edges, the percentages of edges from bystander nodes to treatment and to control nodes become more similar, and UPE(<bold>V<sub>1</sub></bold>) - UPE(<bold>V<sub>0</sub></bold>) approaches zero.</p>
<sec>
<title>7.4.1.5. Degree distribution evaluation</title>
<p>To assess the extent to which the maximal independent set chosen by <italic>CauseIS</italic> biases the degree distribution of the selected treatment and control nodes, we compare the degree distributions of treatment and control nodes selected by <italic>CauseIS</italic> and <italic>Randomized</italic>. <xref ref-type="fig" rid="F9">Figure 9</xref> shows that <italic>CauseIS</italic> selects treatment and control groups with roughly similar degree distributions in all datasets, except in the <italic>50 Women</italic> dataset, where the assignment looks more biased, likely due to its small size. <italic>CauseIS</italic> removes high-degree nodes from the experiment, which results in treatment and control groups with a more balanced degree distribution.</p>
<fig id="F9" position="float">
<label>Figure 9</label>
<caption><p>Degree distribution of treatment and control nodes selected by <italic>CauseIS</italic> <bold>(first row)</bold> and <italic>Randomized</italic> <bold>(second row</bold>).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0009.tif"/>
</fig></sec></sec>
<sec>
<title>7.4.2. Performance of <italic>CMatch</italic> framework</title>
<sec>
<title>7.4.2.1. Tradeoff between interference and selection bias in <italic>CMatch</italic> variants and baselines</title>
<p>Given the large number of <italic>CMatch</italic> option combinations (115), we first find which of these combinations offer a good tradeoff between RMSE and Euclidean distance (between treatment and control) with <italic>e.p = edge-weight</italic>. The performance of the <italic>CMatch</italic> options varies depending on the node matching and cluster matching thresholds, which are specified by the user. Based on these experiments, we notice that 1) methods with stricter cluster thresholds (<bold>TCM2</bold> and <bold>TCM3</bold>) tend to have lower error, 2) stricter node match thresholds (<bold>TNM2</bold> and <bold>TNM3</bold>) have lower error than the others for <bold>S</bold> and <bold>MSS</bold>, and 3) <bold>MS</bold> has high error across thresholds. We show the detailed results for Cora in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>The tradeoff between selection bias (distance) and undesirable spillover (RMSE) in <italic>CMatch</italic> variants in the Cora dataset.</p></caption> 
<table frame="box" rules="all">
<thead><tr style="background-color:#919497;color:#ffffff">
<th/> 
<th/>
<th valign="top" align="center" colspan="2"><bold>TCM0</bold></th>
<th valign="top" align="center" colspan="2"><bold>TCM1</bold></th>
<th valign="top" align="center" colspan="2"><bold>TCM2</bold></th>
<th valign="top" align="center" colspan="2"><bold>TCM3</bold></th>
<th valign="top" align="center" colspan="2"><bold>GCM</bold></th>
</tr>
<tr style="background-color:#919497;color:#ffffff">
<td/>
<td/>
<td valign="top" align="center"><bold>RMSE</bold></td>
<td valign="top" align="center"><bold>ED</bold></td>
<td valign="top" align="center"><bold>RMSE</bold></td>
<td valign="top" align="center"><bold>ED</bold></td>
<td valign="top" align="center"><bold>RMSE</bold></td>
<td valign="top" align="center"><bold>ED</bold></td>
<td valign="top" align="center"><bold>RMSE</bold></td>
<td valign="top" align="center"><bold>ED</bold></td>
<td valign="top" align="center"><bold>RMSE</bold></td>
<td valign="top" align="center"><bold>ED</bold></td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">C</td>
<td valign="top" align="center">TNM0</td>
<td valign="top" align="center">0.052</td>
<td valign="top" align="center">0.184</td>
<td valign="top" align="center">0.007</td>
<td valign="top" align="center">0.267</td>
<td valign="top" align="center">0.017</td>
<td valign="top" align="center">0.263</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.26</td>
<td valign="top" align="center">0.048</td>
<td valign="top" align="center">0.789</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM1</td>
<td valign="top" align="center">0.055</td>
<td valign="top" align="center">0.176</td>
<td valign="top" align="center">0.051</td>
<td valign="top" align="center">0.177</td>
<td valign="top" align="center">0.008</td>
<td valign="top" align="center">0.258</td>
<td valign="top" align="center">0.012</td>
<td valign="top" align="center">0.26</td>
<td valign="top" align="center">0.031</td>
<td valign="top" align="center">0.6</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM2</td>
<td valign="top" align="center">0.054</td>
<td valign="top" align="center">0.171</td>
<td valign="top" align="center"><bold>0.042</bold></td>
<td valign="top" align="center"><bold>0.171</bold></td>
<td valign="top" align="center"><bold>0.01</bold></td>
<td valign="top" align="center"><bold>0.253</bold></td>
<td valign="top" align="center">0.017</td>
<td valign="top" align="center">0.251</td>
<td valign="top" align="center">0.036</td>
<td valign="top" align="center">0.591</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM3</td>
<td valign="top" align="center">0.043</td>
<td valign="top" align="center">0.175</td>
<td valign="top" align="center">0.043</td>
<td valign="top" align="center">0.175</td>
<td valign="top" align="center">0.0173</td>
<td valign="top" align="center">0.046</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.231</td>
<td valign="top" align="center">0.034</td>
<td valign="top" align="center">0.592</td>
</tr> <tr>
<td/>
<td valign="top" align="left">BNM</td>
<td valign="top" align="center">0.012</td>
<td valign="top" align="center">0.262</td>
<td valign="top" align="center">0.037</td>
<td valign="top" align="center">0.481</td>
<td valign="top" align="center">0.049</td>
<td valign="top" align="center">0.485</td>
<td valign="top" align="center">0.059</td>
<td valign="top" align="center">0.479</td>
<td valign="top" align="center">0.025</td>
<td valign="top" align="center">0.274</td>
</tr> <tr>
<td valign="top" align="left">S</td>
<td valign="top" align="center">TNM0</td>
<td valign="top" align="center">0.056</td>
<td valign="top" align="center">0.16</td>
<td valign="top" align="center">0.058</td>
<td valign="top" align="center">0.159</td>
<td valign="top" align="center">0.048</td>
<td valign="top" align="center">0.16</td>
<td valign="top" align="center">0.056</td>
<td valign="top" align="center">0.162</td>
<td valign="top" align="center">0.035</td>
<td valign="top" align="center">0.34</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM1</td>
<td valign="top" align="center">0.055</td>
<td valign="top" align="center">0.16</td>
<td valign="top" align="center">0.053</td>
<td valign="top" align="center">0.162</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.165</td>
<td valign="top" align="center">0.054</td>
<td valign="top" align="center">0.166</td>
<td valign="top" align="center">0.026</td>
<td valign="top" align="center">0.31</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM2</td>
<td valign="top" align="center">0.056</td>
<td valign="top" align="center">0.162</td>
<td valign="top" align="center">0.054</td>
<td valign="top" align="center">0.168</td>
<td valign="top" align="center">0.048</td>
<td valign="top" align="center">0.165</td>
<td valign="top" align="center">0.033</td>
<td valign="top" align="center">0.183</td>
<td valign="top" align="center">0.039</td>
<td valign="top" align="center">0.292</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM3</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.169</td>
<td valign="top" align="center">0.041</td>
<td valign="top" align="center">0.174</td>
<td valign="top" align="center">0.024</td>
<td valign="top" align="center">0.198</td>
<td valign="top" align="center"><bold>0.015</bold></td>
<td valign="top" align="center"><bold>0.211</bold></td>
<td valign="top" align="center">0.021</td>
<td valign="top" align="center">0.275</td>
</tr> <tr>
<td/>
<td valign="top" align="left">BNM</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.253</td>
<td valign="top" align="center">0.017</td>
<td valign="top" align="center">0.264</td>
<td valign="top" align="center">0.02</td>
<td valign="top" align="center">0.27</td>
<td valign="top" align="center">0.027</td>
<td valign="top" align="center">0.303</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.277</td>
</tr> <tr>
<td valign="top" align="left">MC</td>
<td valign="top" align="center">TNM0</td>
<td valign="top" align="center">0.049</td>
<td valign="top" align="center">0.177</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.261</td>
<td valign="top" align="center">0.01</td>
<td valign="top" align="center">0.262</td>
<td valign="top" align="center">0.008</td>
<td valign="top" align="center">0.263</td>
<td valign="top" align="center">0.042</td>
<td valign="top" align="center">0.189</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM1</td>
<td valign="top" align="center">0.055</td>
<td valign="top" align="center">0.173</td>
<td valign="top" align="center">0.052</td>
<td valign="top" align="center">0.174</td>
<td valign="top" align="center">0.01</td>
<td valign="top" align="center">0.257</td>
<td valign="top" align="center">0.012</td>
<td valign="top" align="center">0.253</td>
<td valign="top" align="center">0.040</td>
<td valign="top" align="center">0.191</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM2</td>
<td valign="top" align="center">0.047</td>
<td valign="top" align="center">0.171</td>
<td valign="top" align="center">0.051</td>
<td valign="top" align="center">0.177</td>
<td valign="top" align="center">0.013</td>
<td valign="top" align="center">0.261</td>
<td valign="top" align="center">0.007</td>
<td valign="top" align="center">0.263</td>
<td valign="top" align="center">0.024</td>
<td valign="top" align="center">0.211</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM3</td>
<td valign="top" align="center">0.047</td>
<td valign="top" align="center">0.173</td>
<td valign="top" align="center">0.049</td>
<td valign="top" align="center">0.178</td>
<td valign="top" align="center">0.051</td>
<td valign="top" align="center">0.176</td>
<td valign="top" align="center">0.011</td>
<td valign="top" align="center">0.249</td>
<td valign="top" align="center">0.012</td>
<td valign="top" align="center">0.244</td>
</tr> <tr>
<td/>
<td valign="top" align="left">BNM</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
</tr> <tr>
<td valign="top" align="left">MS</td>
<td valign="top" align="center">TNM0</td>
<td valign="top" align="center"><bold>0.048</bold></td>
<td valign="top" align="center"><bold>0.155</bold></td>
<td valign="top" align="center">0.051</td>
<td valign="top" align="center">0.156</td>
<td valign="top" align="center">0.052</td>
<td valign="top" align="center">0.156</td>
<td valign="top" align="center">0.058</td>
<td valign="top" align="center">0.157</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.271</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM1</td>
<td valign="top" align="center">0.051</td>
<td valign="top" align="center">0.156</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.157</td>
<td valign="top" align="center">0.048</td>
<td valign="top" align="center">0.156</td>
<td valign="top" align="center">0.052</td>
<td valign="top" align="center">0.16</td>
<td valign="top" align="center">0.022</td>
<td valign="top" align="center">0.264</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM2</td>
<td valign="top" align="center">0.059</td>
<td valign="top" align="center">0.156</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.157</td>
<td valign="top" align="center">0.054</td>
<td valign="top" align="center">0.158</td>
<td valign="top" align="center">0.056</td>
<td valign="top" align="center">0.157</td>
<td valign="top" align="center">0.021</td>
<td valign="top" align="center">0.258</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM3</td>
<td valign="top" align="center">0.053</td>
<td valign="top" align="center">0.157</td>
<td valign="top" align="center">0.05</td>
<td valign="top" align="center">0.159</td>
<td valign="top" align="center">0.056</td>
<td valign="top" align="center">0.155</td>
<td valign="top" align="center">0.051</td>
<td valign="top" align="center">0.156</td>
<td valign="top" align="center">0.028</td>
<td valign="top" align="center">0.27</td>
</tr> <tr>
<td/>
<td valign="top" align="left">BNM</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
</tr> <tr>
<td valign="top" align="left">MSS</td>
<td valign="top" align="center">TNM0</td>
<td valign="top" align="center">0.059</td>
<td valign="top" align="center">0.162</td>
<td valign="top" align="center">0.048</td>
<td valign="top" align="center">0.162</td>
<td valign="top" align="center">0.061</td>
<td valign="top" align="center">0.159</td>
<td valign="top" align="center">0.036</td>
<td valign="top" align="center">0.184</td>
<td valign="top" align="center">0.026</td>
<td valign="top" align="center">0.271</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM1</td>
<td valign="top" align="center">0.056</td>
<td valign="top" align="center">0.16</td>
<td valign="top" align="center">0.054</td>
<td valign="top" align="center">0.161</td>
<td valign="top" align="center">0.047</td>
<td valign="top" align="center">0.161</td>
<td valign="top" align="center">0.03</td>
<td valign="top" align="center">0.194</td>
<td valign="top" align="center">0.029</td>
<td valign="top" align="center">0.275</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM2</td>
<td valign="top" align="center">0.052</td>
<td valign="top" align="center">0.161</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.161</td>
<td valign="top" align="center">0.045</td>
<td valign="top" align="center">0.172</td>
<td valign="top" align="center"><bold>0.028</bold></td>
<td valign="top" align="center"><bold>0.195</bold></td>
<td valign="top" align="center">0.021</td>
<td valign="top" align="center">0.281</td>
</tr> <tr>
<td/>
<td valign="top" align="left">TNM3</td>
<td valign="top" align="center">0.049</td>
<td valign="top" align="center">0.168</td>
<td valign="top" align="center">0.035</td>
<td valign="top" align="center">0.186</td>
<td valign="top" align="center">0.023</td>
<td valign="top" align="center">0.199</td>
<td valign="top" align="center">0.022</td>
<td valign="top" align="center">0.212</td>
<td valign="top" align="center">0.033</td>
<td valign="top" align="center">0.278</td>
</tr> <tr>
<td/>
<td valign="top" align="left">BNM</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">N/A</td>
</tr>
<tr>
<td valign="top" align="left">E</td>
<td valign="top" align="center">N/A</td>
<td valign="top" align="center">0.051</td>
<td valign="top" align="center">0.178</td>
<td valign="top" align="center">0.05</td>
<td valign="top" align="center">0.18</td>
<td valign="top" align="center">0.031</td>
<td valign="top" align="center">0.203</td>
<td valign="top" align="center">0.012</td>
<td valign="top" align="center">0.242</td>
<td valign="top" align="center">0.042</td>
<td valign="top" align="center">0.718</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>CMatch</italic><sub><italic>MCL</italic></sub> variants used in <xref ref-type="fig" rid="F10">Figure 10</xref> are in bold.</p>
</table-wrap-foot>
</table-wrap>
<p><xref ref-type="fig" rid="F10">Figure 10</xref> shows the results for the <italic>CMatch</italic> variants with the best tradeoffs and their better performance when compared to the baselines for Cora. Full <italic>CMatch</italic> results can be found in <xref ref-type="table" rid="T1">Table 1</xref>. The figure clearly shows that the selection bias decreases at the expense of interference bias. For example, while the Euclidean distance for <bold>TNM0 &#x0002B; MS &#x0002B; TCM0</bold> is low (0.155) when compared to <bold>TNM2 &#x0002B; C &#x0002B; TCM2</bold> (0.253), its RMSE is higher, 0.048 vs. 0.01. The comparison between <italic>CBR</italic><sub><italic>reLDG</italic></sub> with different possible number of clusters is consistent with the tradeoff shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. <italic>CBR</italic><sub><italic>reLDG</italic></sub> with the highest error (annotated with 1885) and <italic>CMatch</italic><sub><italic>MCL</italic></sub> have the same number of clusters. It is intuitive that the <italic>Match</italic> method has the least selection bias because all nodes have their best matches. However, similar to the <italic>Randomized</italic> method, it suffers from high interference bias (RMSE) because of the high density of edges between treatment and control nodes.</p>
<fig id="F10" position="float">
<label>Figure 10</label>
<caption><p>The tradeoff between selection bias (distance) and undesirable spillover (RMSE) in <italic>CMatch</italic><sub><italic>MCL</italic></sub> variants (labeled with methods applied in) and baselines in the Cora dataset for <italic>e.p = edge-weight</italic>; <italic>CBR</italic><sub><italic>reLDG</italic></sub> is annotated with the number of clusters.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0010.tif"/>
</fig></sec>
<sec>
<title>7.4.2.2. Interference evaluation for contagion</title>
<p>We choose two <italic>CMatch</italic> variants with low estimation errors, <bold>TNM2 &#x0002B; MSS &#x0002B; TCM3</bold> and <bold>TNM2 &#x0002B; C &#x0002B; TCM2</bold>, denoted by <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>MSS</italic></sub></sub> and <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub>, respectively, and compare their causal effect estimation error with the baselines. The first method uses a simpler cluster weight assignment, while the second one uses the expensive maximum weighted matching of nodes. <xref ref-type="fig" rid="F11">Figure 11</xref> shows that both variants of <italic>CMatch</italic><sub><italic>MCL</italic></sub> achieve significantly lower error than the other methods, especially in Citeseer and Cora, with 75.5% and 81.8% error reduction compared to <italic>CBR</italic><sub><italic>reLDG</italic></sub> for <italic>e.p = edge-weight</italic>. <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>MSS</italic></sub></sub> has higher error than <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub> in most of the experiments, which is expected, as shown in <xref ref-type="fig" rid="F10">Figure 10</xref>. The <italic>Randomized</italic> and <italic>Match</italic> approaches have similar performance in all datasets because of their similarity in the node randomization approach. We also notice that <italic>CBR</italic><sub><italic>reLDG</italic></sub> has the highest estimation error in the Hamsterster data, which confirms that clustering has a significant effect on the unallowable spillover. Meanwhile, <italic>CMatch</italic><sub><italic>reLDG</italic></sub> outperforms the other baselines in some datasets (Citeseer) but not in others (Hamsterster and 50 Women). In Citeseer, the <italic>CR</italic> method gets the largest estimation error.</p>
<fig id="F11" position="float">
<label>Figure 11</label>
<caption><p>RMSE of total effect in the presence of contagion considering different unallowable spillover probabilities in all datasets; <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub> achieves the lowest error in all datasets.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0011.tif"/>
</fig>
<p><xref ref-type="fig" rid="F11">Figure 11</xref> also shows that the higher the unallowable spillover probability, the larger the estimation error but also the better our method becomes relative to the baselines. For example, by increasing the unallowable spillover probability from 0.1 to 0.5 in Citeseer, the estimation error increases from 0.005 to 0.02 for <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub> and from 0.023 to 0.086 for <italic>CBR</italic><sub><italic>reLDG</italic></sub>.</p></sec>
<sec>
<title>7.4.2.3. Interference evaluation for direct interference</title>
<p><xref ref-type="fig" rid="F12">Figure 12</xref> shows the difference between the RMSE of different estimators over the presence of direct interference for <italic>e.p = edge-weight</italic>. In four datasets, both variants of <italic>CMatch</italic><sub><italic>MCL</italic></sub> get the lowest estimation error in comparison to baseline methods. For example, <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub>&#x00027;s error is approximately half of the error of <italic>CBR</italic><sub><italic>reLDG</italic></sub> (0.06 vs. 0.13 for Citeseer, 0.1 vs. 0.22 for Cora, 0.31 vs. 0.54 for Hamsterster, 0.15 vs. 0.36 for 50 Women). Similar to contagion, <italic>Match</italic>, and <italic>Randomized</italic> methods have similar estimation errors.</p>
<fig id="F12" position="float">
<label>Figure 12</label>
<caption><p>RMSE of total effect in the presence of direct interference (<italic>e.p = edge-weight</italic>). <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub> and <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>MSS</italic></sub></sub> obtain the lowest RMSE for all datasets.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0012.tif"/>
</fig></sec>
<sec>
<title>7.4.2.4. Potential spillover evaluation</title>
<p><xref ref-type="table" rid="T2">Table 2</xref> shows the potential spillover between treatment and control nodes assigned by different methods. This applies to both contagion and direct interference. <italic>CMatch</italic> has the lowest sum of edges and edge weights between treatment and control nodes across all datasets. The difference between <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub> and the baselines in Cora and Citeseer is substantial: <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub> has between 13.5 and 34.8% lower number of edges between treatment and control across datasets.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Percentage of edges (and edge weights) between treatment and control nodes.</p></caption> 
<table frame="box" rules="all">
<thead><tr style="background-color:#919497;color:#ffffff">
<th valign="top" align="left"><bold>Dataset</bold></th>
<th valign="top" align="center"><bold>Randomized</bold></th>
<th valign="top" align="center"><bold>CR</bold></th>
<th valign="top" align="center"><bold><italic>CBR</italic><sub><italic>reLDG</italic></sub></bold></th>
<th valign="top" align="center"><bold><italic>CBR</italic><sub><italic>MCL</italic></sub></bold></th>
<th valign="top" align="center"><bold>Match</bold></th>
<th valign="top" align="center"><bold><italic>CMatch</italic><sub><italic>reLDG</italic></sub></bold></th>
<th valign="top" align="center"><bold><italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub></bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Citeseer</td>
<td valign="top" align="center">49.9% (50%)</td>
<td valign="top" align="center">35.9% (36.3%)</td>
<td valign="top" align="center">39.8% (38.4%)</td>
<td valign="top" align="center">38.9% (38.4%)</td>
<td valign="top" align="center">53.9% (56.6%)</td>
<td valign="top" align="center">35.8% (34.4%)</td>
<td valign="top" align="center"><bold>7.5%</bold> (7.2%)</td>
</tr> <tr>
<td valign="top" align="left">Cora</td>
<td valign="top" align="center">49.7% (49.7%)</td>
<td valign="top" align="center">37.6% (37.6%)</td>
<td valign="top" align="center">43.4% (42.8%)</td>
<td valign="top" align="center">38.9% (33.6%)</td>
<td valign="top" align="center">51.8% (53.3%)</td>
<td valign="top" align="center">38.7% (38.2%)</td>
<td valign="top" align="center"><bold>8.6</bold>% (9.1%)</td>
</tr> <tr>
<td valign="top" align="left">Hamsterster</td>
<td valign="top" align="center">50.2% (50.1%)</td>
<td valign="top" align="center">31.7% (30.4%)</td>
<td valign="top" align="center">48.3% (48.3%)</td>
<td valign="top" align="center">35.1% (34.7%)</td>
<td valign="top" align="center">50% (50.1%)</td>
<td valign="top" align="center">43.3% (44.4%)</td>
<td valign="top" align="center"><bold>34.8%</bold> (34.4%)</td>
</tr>
<tr>
<td valign="top" align="left">50 Women</td>
<td valign="top" align="center">48.5% (48.1%)</td>
<td valign="top" align="center">31.8% (30.5%)</td>
<td valign="top" align="center">36.6% (34.3%)</td>
<td valign="top" align="center">18.3% (11.4%)</td>
<td valign="top" align="center">52.5% (52.7%)</td>
<td valign="top" align="center">16% (18.6%)</td>
<td valign="top" align="center"><bold>12.8%</bold> (9.7%)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The lower the number, the lower the probability of undesired spillover. The smallest percentages are shown in bold.</p>
</table-wrap-foot>
</table-wrap></sec>
<sec>
<title>7.4.2.5. Selection bias evaluation for contagion</title>
<p>In this experiment, we look at the relationship between the number of clusters and the difference between treatment and control nodes, with and without cluster matching. <xref ref-type="fig" rid="F13">Figure 13</xref> shows the Euclidean distance between the averages of treatment and control nodes&#x00027; attributes in <italic>CMatch</italic><sub><italic>reLDG</italic></sub>, <italic>CBR</italic><sub><italic>reLDG</italic></sub>, and <italic>reLDG</italic> for three different numbers of clusters and unallowable spillover probability <italic>e.p = edge-weight</italic>. Since <italic>CMatch</italic><sub><italic>reLDG</italic></sub> optimizes for selection bias directly, it is not surprising that it results in treatment and control nodes with more similar feature distributions than the other two methods. In Citeseer, the differences are more subtle than in the other datasets. Error bars show the variance of the averages over 10 runs, which confirms the low variance of the estimates in all datasets except 50 Women, which is a small dataset.</p>
<fig id="F13" position="float">
<label>Figure 13</label>
<caption><p>Euclidean distance between the attribute vector means of treatment and control nodes for a different number of clusters. The higher the number of clusters, the lower the selection bias.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0013.tif"/>
</fig></sec>
<sec>
<title>7.4.2.6. Sensitivity to spillover probability metrics</title>
<p>Our last experiment compares metrics for calculating the spillover probability: cosine similarity, Jaccard similarity, and the L2-based similarity used in all other experiments. We report the RMSE of the total effect using the <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub> and <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>MSS</italic></sub></sub> methods under contagion. <xref ref-type="fig" rid="F14">Figure 14</xref> shows that <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>C</italic></sub></sub> with L2-based similarity obtains the lowest error in all datasets except Citeseer, where cosine similarity has a slightly lower error. For <italic>CMatch</italic><sub><italic>MCL</italic><sub><italic>MSS</italic></sub></sub>, cosine similarity has the lowest RMSE in the Citeseer and 50 Women datasets, while the L2-based similarity has the lowest error in the other datasets. Jaccard similarity has the highest estimation error in almost all cases.</p>
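<p>For reference, the three similarity metrics compared here could be computed as follows (a hedged sketch; the normalizer for the L2-based similarity is our assumption, as before):</p>
<preformat>
import numpy as np

def cosine_similarity(x_i, x_j):
    x_i, x_j = np.asarray(x_i, dtype=float), np.asarray(x_j, dtype=float)
    return float(x_i @ x_j / (np.linalg.norm(x_i) * np.linalg.norm(x_j)))

def jaccard_similarity(x_i, x_j):
    """Jaccard similarity for binary attribute vectors."""
    x_i, x_j = np.asarray(x_i, dtype=bool), np.asarray(x_j, dtype=bool)
    union = np.logical_or(x_i, x_j).sum()
    return float(np.logical_and(x_i, x_j).sum() / union) if union else 0.0

def l2_similarity(x_i, x_j, d_max):
    """1 - normalized L2 distance; d_max is the normalizing constant."""
    x_i, x_j = np.asarray(x_i, dtype=float), np.asarray(x_j, dtype=float)
    return 1.0 - np.linalg.norm(x_i - x_j) / d_max
</preformat>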
<fig id="F14" position="float">
<label>Figure 14</label>
<caption><p>RMSE of total effect in the presence of contagion using three different similarity methods to calculate spillover probability: Cosine (co), Jaccard (ja) and L2 similarity.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1128649-g0014.tif"/>
</fig></sec></sec></sec></sec>
<sec sec-type="conclusions" id="s8">
<title>8. Conclusion</title>
<p>In this paper, we proposed two different frameworks for network experiment designs that provide a more accurate estimation of two common causal estimands under interference: direct treatment effects and total treatment effects. For direct treatment effect estimation, we presented <italic>CauseIS</italic>, a framework that explicitly uses an independent set to disentangle peer effects from direct treatment effects and thereby increase the accuracy of their estimation. For total treatment effect estimation, we introduced <italic>CMatch</italic>, the first optimization framework that minimizes both interference and selection bias in cluster-based network experiment design. Our experiments on synthetic and real-world datasets confirm that these approaches significantly decrease direct and total treatment effect estimation error. Possible extensions of our frameworks include understanding the impact of network structural properties on estimation, jointly optimizing for interference and selection bias, and developing frameworks that can mitigate multiple-hop diffusions.</p></sec>
<sec sec-type="data-availability" id="s9">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.</p></sec>
<sec sec-type="ethics-statement" id="s10">
<title>Ethics statement</title>
<p>Ethical approval was not required for the study involving human data in accordance with the local legislation and institutional requirements. Written informed consent from the participants or their legal guardian/next of kin was not required in accordance with the national legislation and the institutional requirements.</p></sec>
<sec sec-type="author-contributions" id="s11">
<title>Author contributions</title>
<p>ZF and EZ contributed to the brainstorming, conception, and design of the study. ZF implemented the ideas and performed the statistical analysis under EZ&#x00027;s supervision. ZF wrote the first draft of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.</p></sec>
</body>
<back>
<sec sec-type="funding-information" id="s12">
<title>Funding</title>
<p>This research was funded in part by NSF under grant no. 2047899 and DARPA under contract number HR001121C0168.</p>
</sec>
<ack>
<p>Some of the material included in this submission originally appeared at ICWSM 2020 (Fatemi and Zheleva, <xref ref-type="bibr" rid="B12">2020</xref>), and we obtained permission from AAAI to publish the material. Some of the material was presented at the MLG workshop 2020, which does not have an archival paper associated with it.</p>
</ack>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s13">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Albert</surname> <given-names>R.</given-names></name> <name><surname>Barab&#x000E1;si</surname> <given-names>A.-L.</given-names></name></person-group> (<year>2002</year>). <article-title>Statistical mechanics of complex networks</article-title>. <source>Rev. Mod. Phys</source>. <volume>74</volume>, <fpage>47</fpage>&#x02013;<lpage>97</lpage>. <pub-id pub-id-type="doi">10.1103/RevModPhys.74.47</pub-id></citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Antman</surname> <given-names>E. M.</given-names></name> <name><surname>Lau</surname> <given-names>J.</given-names></name> <name><surname>Kupelnick</surname> <given-names>B.</given-names></name> <name><surname>Mosteller</surname> <given-names>F.</given-names></name> <name><surname>Chalmers</surname> <given-names>T. C.</given-names></name></person-group> (<year>1992</year>). <article-title>A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts: treatments for myocardial infarction</article-title>. <source>JAMA</source> <volume>268</volume>, <fpage>240</fpage>&#x02013;<lpage>248</lpage>. <pub-id pub-id-type="doi">10.1001/jama.1992.03490020088036</pub-id><pub-id pub-id-type="pmid">1535110</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Arbour</surname> <given-names>D. T.</given-names></name> <name><surname>Marazopoulou</surname> <given-names>K.</given-names></name> <name><surname>Garant</surname> <given-names>D.</given-names></name> <name><surname>Jensen</surname> <given-names>D. D.</given-names></name></person-group> (<year>2014</year>). <article-title>&#x0201C;Propensity score matching for causal inference with relational data,&#x0201D;</article-title> in <source>Proceedings of the UAI 2014 Conference on Causal Inference: Learning and Prediction - Volume 1274</source> (<publisher-loc>Quebec City, QC</publisher-loc>), <fpage>25</fpage>&#x02013;<lpage>34</lpage>. <pub-id pub-id-type="doi">10.5555/3020325.3020329</pub-id></citation>
</ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Aronow</surname> <given-names>P. M.</given-names></name> <name><surname>Eckles</surname> <given-names>D.</given-names></name> <name><surname>Samii</surname> <given-names>C.</given-names></name> <name><surname>Zonszein</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <article-title>Spillover effects in experimental data</article-title>. <source>Adv. Exp. Polit. Sci</source>. <volume>289</volume>, <fpage>319</fpage>. <pub-id pub-id-type="doi">10.1017/9781108777919.021</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Backstrom</surname> <given-names>L.</given-names></name> <name><surname>Kleinberg</surname> <given-names>J.</given-names></name></person-group> (<year>2011</year>). <article-title>&#x0201C;Network bucket testing,&#x0201D;</article-title> in <source>Proceedings of the 20th International Conference on World Wide Web</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>615</fpage>&#x02013;<lpage>624</lpage>. <pub-id pub-id-type="doi">10.1145/1963405.1963492</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Blelloch</surname> <given-names>G. E.</given-names></name> <name><surname>Fineman</surname> <given-names>J. T.</given-names></name> <name><surname>Shun</surname> <given-names>J.</given-names></name></person-group> (<year>2012</year>). <article-title>&#x0201C;Greedy sequential maximal independent set and matching are parallel on average,&#x0201D;</article-title> in <source>Proceedings of the Twenty-Fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA&#x00027;12</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>308</fpage>&#x02013;<lpage>317</lpage>.</citation>
</ref>
<ref id="B7">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Boppana</surname> <given-names>R.</given-names></name> <name><surname>Halld&#x000F3;rsson</surname> <given-names>M. M.</given-names></name></person-group> (<year>1990</year>). <article-title>&#x0201C;Approximating maximum independent sets by excluding subgraphs,&#x0201D;</article-title> in <source>Proceedings of the Second Scandinavian Workshop on Algorithm Theory, SWAT 90</source> (<publisher-loc>Berlin; Heidelberg</publisher-loc>: <publisher-name>Springer-Verlag</publisher-name>), <fpage>13</fpage>&#x02013;<lpage>25</lpage>.</citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Duan</surname> <given-names>R.</given-names></name> <name><surname>Pettie</surname> <given-names>S.</given-names></name></person-group> (<year>2014</year>). <article-title>Linear-time approximation for maximum weight matching</article-title>. <source>J. ACM</source>. <volume>61</volume>, <fpage>23</fpage>. <pub-id pub-id-type="doi">10.1145/2529989</pub-id></citation>
</ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eckles</surname> <given-names>D.</given-names></name> <name><surname>Karrer</surname> <given-names>B.</given-names></name> <name><surname>Ugander</surname> <given-names>J.</given-names></name></person-group> (<year>2016</year>). <article-title>Design and analysis of experiments in networks: reducing bias from interference</article-title>. <source>J. Causal Inference</source> <volume>5</volume>, <fpage>20150021</fpage>. <pub-id pub-id-type="doi">10.1515/jci-2015-0021</pub-id></citation>
</ref>
<ref id="B10">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Eisenbrand</surname> <given-names>F.</given-names></name> <name><surname>Funke</surname> <given-names>S.</given-names></name> <name><surname>Garg</surname> <given-names>N.</given-names></name> <name><surname>K&#x000F6;nemann</surname> <given-names>J.</given-names></name></person-group> (<year>2003</year>). <article-title>&#x0201C;A combinatorial algorithm for computing a maximum independent set in a t-perfect graph,&#x0201D;</article-title> in <source>ACM-SIAM Symposium on Discrete Algorithms</source> (<publisher-loc>Baltimore, MD</publisher-loc>: <publisher-name>Society for Industrial and Applied Mathematics</publisher-name>), <fpage>517</fpage>&#x02013;<lpage>522</lpage>. <pub-id pub-id-type="doi">10.5555/644108.644194</pub-id></citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Enright</surname> <given-names>A. J.</given-names></name> <name><surname>van Dongen</surname> <given-names>S.</given-names></name> <name><surname>Ouzounis</surname> <given-names>C. A.</given-names></name></person-group> (<year>2002</year>). <article-title>An efficient algorithm for large-scale detection of protein families</article-title>. <source>Nucleic Acids Res</source>. <volume>30</volume>, <fpage>1575</fpage>&#x02013;<lpage>1584</lpage>. <pub-id pub-id-type="doi">10.1093/nar/30.7.1575</pub-id><pub-id pub-id-type="pmid">11917018</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fatemi</surname> <given-names>Z.</given-names></name> <name><surname>Zheleva</surname> <given-names>E.</given-names></name></person-group> (<year>2020</year>). <article-title>Minimizing interference and selection bias in network experiment design</article-title>. <source>Proc. Int. AAAI Conf. Web Soc. Media</source> <volume>14</volume>, <fpage>176</fpage>&#x02013;<lpage>186</lpage>. <pub-id pub-id-type="doi">10.1609/icwsm.v14i1.7289</pub-id><pub-id pub-id-type="pmid">24771480</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gangl</surname> <given-names>M.</given-names></name></person-group> (<year>2010</year>). <article-title>Causal inference in sociological research</article-title>. <source>Annu. Rev. Sociol</source>. <volume>36</volume>, <fpage>21</fpage>&#x02013;<lpage>47</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.soc.012809.102702</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gilbert</surname> <given-names>E.</given-names></name> <name><surname>Karahalios</surname> <given-names>K.</given-names></name></person-group> (<year>2009</year>). <article-title>&#x0201C;Predicting tie strength with social media,&#x0201D;</article-title> in <source>CHI</source> (<publisher-loc>Association for Computing Machinery</publisher-loc>: <publisher-name>Boston, MA: ACM</publisher-name>), <fpage>211</fpage>&#x02013;<lpage>220</lpage>. <pub-id pub-id-type="doi">10.1145/1518701.1518736</pub-id></citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Green</surname> <given-names>D. P.</given-names></name> <name><surname>Krasno</surname> <given-names>J. S.</given-names></name> <name><surname>Coppock</surname> <given-names>A.</given-names></name> <name><surname>Farrer</surname> <given-names>B. D.</given-names></name> <name><surname>Lenoir</surname> <given-names>B.</given-names></name> <name><surname>Zingher</surname> <given-names>J. N.</given-names></name></person-group> (<year>2016</year>). <article-title>The effects of lawn signs on vote outcomes: results from four randomized field experiments</article-title>. <source>Elect. Stud</source>. <volume>41</volume>, <fpage>143</fpage>&#x02013;<lpage>150</lpage>. <pub-id pub-id-type="doi">10.1016/j.electstud.2015.12.002</pub-id></citation>
</ref>
<ref id="B16">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gui</surname> <given-names>H.</given-names></name> <name><surname>Xu</surname> <given-names>Y.</given-names></name> <name><surname>Bhasin</surname> <given-names>A.</given-names></name> <name><surname>Han</surname> <given-names>J.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Network a/b testing: from sampling to estimation,&#x0201D;</article-title> in <source>Proceedings of the 24th International Conference on World Wide Web</source> (<publisher-loc>Republic and Canton of Geneva</publisher-loc>: <publisher-name>International World Wide Web Conferences Steering Committee</publisher-name>), <fpage>399</fpage>&#x02013;<lpage>409</lpage>. <pub-id pub-id-type="doi">10.1145/2736277.2741081</pub-id></citation>
</ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Halloran</surname> <given-names>M. E.</given-names></name> <name><surname>Struchiner</surname> <given-names>C. J.</given-names></name></person-group> (<year>1995</year>). <article-title>Causal inference in infectious diseases</article-title>. <source>Epidemiology</source> <volume>6</volume>, <fpage>142</fpage>&#x02013;<lpage>151</lpage>. <pub-id pub-id-type="doi">10.1097/00001648-199503000-00010</pub-id><pub-id pub-id-type="pmid">7742400</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Hamilton</surname> <given-names>W. L.</given-names></name> <name><surname>Ying</surname> <given-names>R.</given-names></name> <name><surname>Leskovec</surname> <given-names>J.</given-names></name></person-group> (<year>2017</year>). <article-title>Representation learning on graphs: Methods and applications</article-title>. <source>IEEE Data Eng. Bull.</source> <volume>40</volume>, <fpage>52</fpage>&#x02013;<lpage>74</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://sites.computer.org/debull/A17sept/p52.pdf">http://sites.computer.org/debull/A17sept/p52.pdf</ext-link></citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Harvey</surname> <given-names>N. J. A.</given-names></name></person-group> (<year>2009</year>). <article-title>Algebraic algorithms for matching and matroid problems</article-title>. <source>SIAM J. Comput</source>. <volume>39</volume>, <fpage>679</fpage>&#x02013;<lpage>702</lpage>. <pub-id pub-id-type="doi">10.1137/070684008</pub-id></citation>
</ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holtz</surname> <given-names>D.</given-names></name> <name><surname>Lobel</surname> <given-names>R.</given-names></name> <name><surname>Liskovich</surname> <given-names>I.</given-names></name> <name><surname>Aral</surname> <given-names>S.</given-names></name></person-group> (<year>2020</year>). <article-title>Reducing interference bias in online marketplace pricing experiments</article-title>. <source>Available at SSRN</source> 3583836. <pub-id pub-id-type="doi">10.2139/ssrn.3583836</pub-id></citation>
</ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hudgens</surname> <given-names>M. G.</given-names></name> <name><surname>Halloran</surname> <given-names>M. E.</given-names></name></person-group> (<year>2008</year>). <article-title>Toward causal inference with interference</article-title>. <source>J. Am. Stat. Assoc</source>. <volume>103</volume>, <fpage>832</fpage>&#x02013;<lpage>842</lpage>. <pub-id pub-id-type="doi">10.1198/016214508000000292</pub-id><pub-id pub-id-type="pmid">19081744</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Imbens</surname> <given-names>G.</given-names></name> <name><surname>Rubin</surname> <given-names>D.</given-names></name></person-group> (<year>2015</year>). <source>Causal Inference in Statistics, Social and Biomedical Sciences: An Introduction</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jagadeesan</surname> <given-names>R.</given-names></name> <name><surname>Pillai</surname> <given-names>N. S.</given-names></name> <name><surname>Volfovsky</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>Designs for estimating the treatment effect in networks with interference</article-title>. <source>Ann. Stat</source>. <volume>48</volume>, <fpage>679</fpage>&#x02013;<lpage>712</lpage>. <pub-id pub-id-type="doi">10.1214/18-AOS1807</pub-id></citation>
</ref>
<ref id="B24">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>S.</given-names></name> <name><surname>Honavar</surname> <given-names>V.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;On learning causal models from relational data,&#x0201D;</article-title> in <source>Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence</source> (<publisher-loc>Phoenix. AZ</publisher-loc>: <publisher-name>AAAI Press</publisher-name>), <fpage>3263</fpage>&#x02013;<lpage>3270</lpage>. <pub-id pub-id-type="doi">10.5555/3016100.3016360</pub-id></citation>
</ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leskovec</surname> <given-names>J.</given-names></name> <name><surname>Kleinberg</surname> <given-names>J.</given-names></name> <name><surname>Faloutsos</surname> <given-names>C.</given-names></name></person-group> (<year>2007</year>). <article-title>Graph evolution: densification and shrinking diameters</article-title>. <source>ACM Trans. Knowl. Discov. Data</source> <volume>1</volume>, <fpage>1217301</fpage>. <pub-id pub-id-type="doi">10.1145/1217299.1217301</pub-id></citation>
</ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>L.</given-names></name> <name><surname>Chu</surname> <given-names>W.</given-names></name> <name><surname>Langford</surname> <given-names>J.</given-names></name> <name><surname>Schapire</surname> <given-names>R. E.</given-names></name></person-group> (<year>2010</year>). <article-title>&#x0201C;A contextual-bandit approach to personalized news article recommendation,&#x0201D;</article-title> in <source>Proceedings of the 19th International Conference on World Wide Web</source> (<publisher-loc>Raleigh, NC</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>661</fpage>&#x02013;<lpage>670</lpage>. <pub-id pub-id-type="doi">10.1145/1772690.1772758</pub-id></citation>
</ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>S.</given-names></name> <name><surname>Wager</surname> <given-names>S.</given-names></name></person-group> (<year>2022</year>). <article-title>Random graph asymptotics for treatment effect estimation under network interference</article-title>. <source>Ann. Stat</source>. <volume>50</volume>, <fpage>2334</fpage>&#x02013;<lpage>2358</lpage>. <pub-id pub-id-type="doi">10.1214/22-AOS2191</pub-id></citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liben-Nowell</surname> <given-names>D.</given-names></name> <name><surname>Kleinberg</surname> <given-names>J.</given-names></name></person-group> (<year>2007</year>). <article-title>The link-prediction problem for social networks</article-title>. <source>JASIST</source> <volume>58</volume>, <fpage>1019</fpage>&#x02013;<lpage>1031</lpage>. <pub-id pub-id-type="doi">10.1002/asi.20591</pub-id></citation>
</ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Loucks</surname> <given-names>A. B.</given-names></name> <name><surname>Thuma</surname> <given-names>J. R.</given-names></name></person-group> (<year>2003</year>). <article-title>Luteinizing hormone pulsatility is disrupted at a threshold of energy availability in regularly menstruating women</article-title>. <source>J. Clin. Endocrinol. Metabol</source>. <volume>88</volume>, <fpage>297</fpage>&#x02013;<lpage>311</lpage>. <pub-id pub-id-type="doi">10.1210/jc.2002-020369</pub-id><pub-id pub-id-type="pmid">12519869</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Luby</surname> <given-names>M.</given-names></name></person-group> (<year>1985</year>). <article-title>&#x0201C;A simple parallel algorithm for the maximal independent set problem,&#x0201D;</article-title> in <source>Proceedings of the Seventeenth Annual ACM Symposium on Theory of Computing, STOC 85</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>1</fpage>&#x02013;<lpage>10</lpage>.</citation>
</ref>
<ref id="B31">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Maier</surname> <given-names>M.</given-names></name> <name><surname>Marazopoulou</surname> <given-names>K.</given-names></name> <name><surname>Arbour</surname> <given-names>D.</given-names></name> <name><surname>Jensen</surname> <given-names>D.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;A sound and complete algorithm for learning causal models from relational data,&#x0201D;</article-title> in <source>Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence</source> (<publisher-loc>Arlington, VA</publisher-loc>: <publisher-name>AUAI Press</publisher-name>), <fpage>371</fpage>&#x02013;<lpage>380</lpage>. <pub-id pub-id-type="doi">10.5555/3023638.3023676</pub-id></citation>
</ref>
<ref id="B32">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Maier</surname> <given-names>M.</given-names></name> <name><surname>Taylor</surname> <given-names>B.</given-names></name> <name><surname>Oktay</surname> <given-names>H.</given-names></name> <name><surname>Jensen</surname> <given-names>D.</given-names></name></person-group> (<year>2010</year>). <article-title>&#x0201C;Learning causal models of relational domains,&#x0201D;</article-title> in <source>Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence</source> (<publisher-loc>Atlanta, GA</publisher-loc>: <publisher-name>AAAI Press</publisher-name>), <fpage>531</fpage>&#x02013;<lpage>538</lpage>. <pub-id pub-id-type="doi">10.5555/2898607.2898693</pub-id></citation>
</ref>
<ref id="B33">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Marazopoulou</surname> <given-names>K.</given-names></name> <name><surname>Maier</surname> <given-names>M.</given-names></name> <name><surname>Jensen</surname> <given-names>D.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Learning the structure of causal models with rel. and temporal dependence,&#x0201D;</article-title> in <source>Proceedings of the UAI 2015 Conference on Advances in Causal Inference - Volume 1504</source> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>CEUR-WS.org</publisher-name>), <fpage>66</fpage>&#x02013;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.5555/3020267.3020274</pub-id></citation>
</ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Michell</surname> <given-names>L.</given-names></name> <name><surname>Amos</surname> <given-names>A.</given-names></name></person-group> (<year>1997</year>). <article-title>Girls, pecking order and smoking</article-title>. <source>Soc. Sci. Med</source>. <volume>44</volume>, <fpage>1861</fpage>&#x02013;<lpage>1869</lpage>. <pub-id pub-id-type="doi">10.1016/S0277-9536(96)00295-X</pub-id><pub-id pub-id-type="pmid">9194247</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Milo</surname> <given-names>R.</given-names></name> <name><surname>Shen-Orr</surname> <given-names>S.</given-names></name> <name><surname>Itzkovitz</surname> <given-names>S.</given-names></name> <name><surname>Kashtan</surname> <given-names>N.</given-names></name> <name><surname>Chklovskii</surname> <given-names>D.</given-names></name> <name><surname>Alon</surname> <given-names>U.</given-names></name></person-group> (<year>2002</year>). <article-title>Network motifs: simple building blocks of complex networks</article-title>. <source>Science</source> <volume>298</volume>, <fpage>824</fpage>&#x02013;<lpage>827</lpage>. <pub-id pub-id-type="doi">10.1126/science.298.5594.824</pub-id><pub-id pub-id-type="pmid">15326338</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Mucha</surname> <given-names>M.</given-names></name> <name><surname>Sankowski</surname> <given-names>P.</given-names></name></person-group> (<year>2004</year>). <article-title>&#x0201C;Maximum matchings via gaussian elimination,&#x0201D;</article-title> in <source>Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>), <fpage>248</fpage>&#x02013;<lpage>255</lpage>. <pub-id pub-id-type="doi">10.1109/FOCS.2004.40</pub-id></citation>
</ref>
<ref id="B37">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Nishimura</surname> <given-names>J.</given-names></name> <name><surname>Ugander</surname> <given-names>J.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Restreaming graph partitioning: simple versatile algorithms for advanced balancing,&#x0201D;</article-title> in <source>Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source> (<publisher-loc>Chicago, IL</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>1106</fpage>&#x02013;<lpage>1114</lpage>. <pub-id pub-id-type="doi">10.1145/2487575.2487696</pub-id></citation>
</ref>
<ref id="B38">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Oktay</surname> <given-names>H.</given-names></name> <name><surname>Taylor</surname> <given-names>B.</given-names></name> <name><surname>Jensen</surname> <given-names>D.</given-names></name></person-group> (<year>2010</year>). <article-title>&#x0201C;Causal discovery in social media using quasi-experimental designs,&#x0201D;</article-title> in <source>Proceedings of the First Workshop on Social Media Analytics</source> (<publisher-loc>Washington, DC</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>1</fpage>&#x02013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1145/1964858.1964859</pub-id></citation>
</ref>
<ref id="B39">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Pearl</surname> <given-names>J.</given-names></name></person-group> (<year>2009</year>). <source>Causality</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B40">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Pouget-Abadie</surname> <given-names>J.</given-names></name> <name><surname>Mirrokni</surname> <given-names>V.</given-names></name> <name><surname>Parkes</surname> <given-names>D. C.</given-names></name> <name><surname>Airoldi</surname> <given-names>E. M.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Optimizing cluster-based randomized experiments under monotonicity,&#x0201D;</article-title> in <source>Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 18</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>2090</fpage>&#x02013;<lpage>2099</lpage>.</citation>
</ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pouget-Abadie</surname> <given-names>J.</given-names></name> <name><surname>Saint-Jacques</surname> <given-names>G.</given-names></name> <name><surname>Saveski</surname> <given-names>M.</given-names></name> <name><surname>Duan</surname> <given-names>W.</given-names></name> <name><surname>Ghosh</surname> <given-names>S.</given-names></name> <name><surname>Xu</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Testing for arbitrary interference on experimentation platforms</article-title>. <source>Biometrika</source> <volume>106</volume>, <fpage>929</fpage>&#x02013;<lpage>940</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/asz047</pub-id></citation>
</ref>
<ref id="B42">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Rattigan</surname> <given-names>M. J.</given-names></name> <name><surname>Maier</surname> <given-names>M. E.</given-names></name> <name><surname>Jensen</surname> <given-names>D. D.</given-names></name></person-group> (<year>2011</year>). <article-title>&#x0201C;Relational blocking for causal discovery,&#x0201D;</article-title> in <source>Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence</source> (<publisher-loc>San Francisco, CA</publisher-loc>: <publisher-name>AAAI Press</publisher-name>) <fpage>145</fpage>&#x02013;<lpage>151</lpage>. <pub-id pub-id-type="doi">10.5555/2900423.2900446</pub-id></citation>
</ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ribeiro</surname> <given-names>M. H.</given-names></name> <name><surname>Calais</surname> <given-names>P. H.</given-names></name> <name><surname>Santos</surname> <given-names>Y. A.</given-names></name> <name><surname>Almeida</surname> <given-names>V. A.</given-names></name> <name><surname>Meira Jr</surname> <given-names>W.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C; Characterizing and detecting hateful users on twitter,&#x0201D;</article-title> in <source>ICWSM</source> (<publisher-loc>Stanford, CA</publisher-loc>). <pub-id pub-id-type="doi">10.1609/icwsm.v12i1.15057</pub-id></citation>
</ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rossi</surname> <given-names>R.</given-names></name> <name><surname>McDowell</surname> <given-names>L.</given-names></name> <name><surname>Aha</surname> <given-names>D.</given-names></name> <name><surname>Neville</surname> <given-names>J.</given-names></name></person-group> (<year>2012</year>). <article-title>Transforming graph data for statistical relational learning</article-title>. <source>JAIR</source> <volume>45</volume>, <fpage>363</fpage>&#x02013;<lpage>441</lpage>. <pub-id pub-id-type="doi">10.1613/jair.3659</pub-id></citation>
</ref>
<ref id="B45">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Saveski</surname> <given-names>M.</given-names></name> <name><surname>Pouget-Abadie</surname> <given-names>J.</given-names></name> <name><surname>Saint-Jacques</surname> <given-names>G.</given-names></name> <name><surname>Duan</surname> <given-names>W.</given-names></name> <name><surname>Ghosh</surname> <given-names>S.</given-names></name> <name><surname>Xu</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>&#x0201C;Detecting network effects: Randomizing over randomized experiments,&#x0201D;</article-title> in <source>KDD</source> (<publisher-loc>Halifax, NS</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>1027</fpage>&#x02013;<lpage>1035</lpage>. <pub-id pub-id-type="doi">10.1145/3097983.3098192</pub-id></citation>
</ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>S&#x000E4;vje</surname> <given-names>F.</given-names></name> <name><surname>Aronow</surname> <given-names>P.</given-names></name> <name><surname>Hudgens</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Average treatment effects in the presence of unknown interference</article-title>. <source>Ann. Stat</source>. <volume>49</volume>, <fpage>673</fpage>. <pub-id pub-id-type="doi">10.1214/20-AOS1973</pub-id><pub-id pub-id-type="pmid">34421150</pub-id></citation></ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schaeffer</surname> <given-names>S. E.</given-names></name></person-group> (<year>2007</year>). <article-title>Graph clustering</article-title>. <source>Comput. Sci. Rev</source>. <volume>1</volume>, <fpage>27</fpage>&#x02013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.1016/j.cosrev.2007.05.001</pub-id></citation>
</ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sen</surname> <given-names>P.</given-names></name> <name><surname>Namata</surname> <given-names>G. M.</given-names></name> <name><surname>Bilgic</surname> <given-names>M.</given-names></name> <name><surname>Getoor</surname> <given-names>L.</given-names></name> <name><surname>Gallagher</surname> <given-names>B.</given-names></name> <name><surname>Eliassi-Rad</surname> <given-names>T.</given-names></name></person-group> (<year>2008</year>). <article-title>Collective classification in network data</article-title>. <source>AI Mag</source>. <volume>29</volume>, <fpage>93</fpage>&#x02013;<lpage>106</lpage>. <pub-id pub-id-type="doi">10.1609/aimag.v29i3.2157</pub-id></citation>
</ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sobel</surname> <given-names>M. E.</given-names></name></person-group> (<year>2000</year>). <article-title>Causal inference in the social sciences</article-title>. <source>J. Am. Stat. Assoc</source>. <volume>95</volume>, <fpage>647</fpage>&#x02013;<lpage>651</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.2000.10474243</pub-id></citation>
</ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stuart</surname> <given-names>E. A.</given-names></name></person-group> (<year>2010</year>). <article-title>Matching methods for causal inference: A review and a look forward</article-title>. <source>Stat. Sci</source>. <volume>25</volume>, <fpage>1</fpage>. <pub-id pub-id-type="doi">10.1214/09-STS313</pub-id><pub-id pub-id-type="pmid">20871802</pub-id></citation></ref>
<ref id="B51">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sussman</surname> <given-names>D. L.</given-names></name> <name><surname>Airoldi</surname> <given-names>E. M.</given-names></name></person-group> (<year>2017</year>). <source>Elements of Estimation Theory for Causal Effects in the Presence of Network Interference</source>.</citation>
</ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Taylor</surname> <given-names>S. J.</given-names></name> <name><surname>Eckles</surname> <given-names>D.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Randomized experiments to detect and estimate social influence in networks,&#x0201D;</article-title> in <source>Complex Spreading Phenomena in Social Systems</source>, <fpage>289</fpage>&#x02013;<lpage>322</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-77332-2_16</pub-id><pub-id pub-id-type="pmid">27534393</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Toulis</surname> <given-names>P.</given-names></name> <name><surname>Kao</surname> <given-names>E.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Estimation of causal peer influence effects,&#x0201D;</article-title> in <source>Proceedings of the 30th International Conference on International Conference on Machine Learning - Volume 28</source> (<publisher-loc>Atlanta, GA</publisher-loc>: <publisher-name>JMLR.org</publisher-name>), <fpage>1487</fpage>&#x02013;<lpage>1497</lpage>. <pub-id pub-id-type="doi">10.5555/3042817.3043103</pub-id></citation>
</ref>
<ref id="B54">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ugander</surname> <given-names>J.</given-names></name> <name><surname>Karrer</surname> <given-names>B.</given-names></name> <name><surname>Backstrom</surname> <given-names>L.</given-names></name> <name><surname>Kleinberg</surname> <given-names>J.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Graph cluster randomization: network exposure to multiple universes,&#x0201D;</article-title> in <source>Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source> (<publisher-loc>Chicago, IL</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>329</fpage>&#x02013;<lpage>337</lpage>. <pub-id pub-id-type="doi">10.1145/2487575.2487695</pub-id></citation>
</ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ugander</surname> <given-names>J.</given-names></name> <name><surname>Yin</surname> <given-names>H.</given-names></name></person-group> (<year>2020</year>). <article-title>Randomized graph cluster randomization</article-title>. <source>arXiv preprint. arXiv, 2009.02297</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2009.02297</pub-id></citation>
</ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Varian</surname> <given-names>H. R.</given-names></name></person-group> (<year>2016</year>). <article-title>Causal inference in economics and marketing</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>113</volume>, <fpage>7310</fpage>&#x02013;<lpage>7315</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1510479113</pub-id><pub-id pub-id-type="pmid">27382144</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xiao</surname> <given-names>M.</given-names></name> <name><surname>Nagamochi</surname> <given-names>H.</given-names></name></person-group> (<year>2017</year>). <article-title>Exact algorithms for maximum independent set</article-title>. <source>Inf. Comput</source>. <volume>255</volume>, <fpage>126</fpage>&#x02013;<lpage>146</lpage>. <pub-id pub-id-type="doi">10.1016/j.ic.2017.06.001</pub-id></citation>
</ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>J.</given-names></name> <name><surname>Leskovec</surname> <given-names>J.</given-names></name></person-group> (<year>2015</year>). <article-title>Defining and evaluating network communities based on ground-truth</article-title>. <source>KAIS</source> <volume>42</volume>, <fpage>181</fpage>&#x02013;<lpage>213</lpage>. <pub-id pub-id-type="doi">10.1007/s10115-013-0693-z</pub-id></citation>
</ref>
<ref id="B59">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yves</surname> <given-names>M.</given-names></name> <name><surname>Robson</surname> <given-names>J. M.</given-names></name> <name><surname>Nasser</surname> <given-names>S.-D.</given-names></name> <name><surname>Zemmari</surname> <given-names>A.</given-names></name></person-group> (<year>2009</year>). <article-title>&#x0201C;An optimal bit complexity randomized distributed mis algorithm (extended abstract),&#x0201D;</article-title> in <source>Proceedings of the 16th International Conference on Structural Information and Communication Complexity, SIROCCO 09</source> (<publisher-loc>Berlin; Heidelberg</publisher-loc>: <publisher-name>Springer-Verlag</publisher-name>), <fpage>323</fpage>&#x02013;<lpage>337</lpage>.</citation>
</ref>
<ref id="B60">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zheleva</surname> <given-names>E. J.</given-names></name> <name><surname>Kuter</surname> <given-names>U.</given-names></name> <name><surname>Getoor</surname> <given-names>L.</given-names></name></person-group> (<year>2008</year>). <article-title>&#x0201C;Using friendship ties and family circles for link prediction,&#x0201D;</article-title> in <source>Proceedings of the Second International Conference on Advances in Social Network Mining and Analysis</source> (<publisher-loc>Las Vegas, NV</publisher-loc>: <publisher-name>Springer Verlag</publisher-name>), <fpage>97</fpage>&#x02013;<lpage>113</lpage>. <pub-id pub-id-type="doi">10.5555/1883692.1883698</pub-id></citation>
</ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>Y.</given-names></name> <name><surname>Cheng</surname> <given-names>H.</given-names></name> <name><surname>Yu</surname> <given-names>J. X.</given-names></name></person-group> (<year>2009</year>). <article-title>Graph clustering based on structural/attribute similarities</article-title>. <source>Proc. VLDB Endow</source>. <volume>2</volume>, <fpage>718</fpage>&#x02013;<lpage>729</lpage>. <pub-id pub-id-type="doi">10.14778/1687627.1687709</pub-id></citation>
</ref>
</ref-list> 
</back>
</article> 