<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Big Data</journal-id>
<journal-title>Frontiers in Big Data</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Big Data</abbrev-journal-title>
<issn pub-type="epub">2624-909X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fdata.2022.888592</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Big Data</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Causal Inference in the Presence of Interference in Sponsored Search Advertising</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Nabi</surname> <given-names>Razieh</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1679503/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Pfeiffer</surname> <given-names>Joel</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Charles</surname> <given-names>Denis</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>K&#x00131;c&#x00131;man</surname> <given-names>Emre</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/694162/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Biostatistics and Bioinformatics, Emory University</institution>, <addr-line>Atlanta, GA</addr-line>, <country>United States</country></aff>
<aff id="aff2"><sup>2</sup><institution>Microsoft Research</institution>, <addr-line>Redmond, WA</addr-line>, <country>United States</country></aff>
<aff id="aff3"><sup>3</sup><institution>Microsoft Corporation</institution>, <addr-line>Redmond, WA</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Elena Zheleva, University of Illinois at Chicago, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Kun Kuang, Zhejiang University, China; Yongkai Wu, Clemson University, United States</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Razieh Nabi <email>razieh.nabi&#x00040;emory.edu</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Data Mining and Management, a section of the journal Frontiers in Big Data</p></fn></author-notes>
<pub-date pub-type="epub">
<day>21</day>
<month>06</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>5</volume>
<elocation-id>888592</elocation-id>
<history>
<date date-type="received">
<day>03</day>
<month>03</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>16</day>
<month>05</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 Nabi, Pfeiffer, Charles and K&#x00131;c&#x00131;man.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Nabi, Pfeiffer, Charles and K&#x00131;c&#x00131;man</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license></permissions>
<abstract>
<p>In classical causal inference, inferring cause-effect relations from data relies on the assumption that units are independent and identically distributed. This assumption is violated in settings where units are related through a network of dependencies. An example of such a setting is ad placement in sponsored search advertising, where the likelihood of a user clicking on a particular ad is potentially influenced by where it is placed and where other ads are placed on the search result page. In such scenarios, confounding arises due to not only the individual ad-level covariates but also the placements and covariates of other ads in the system. In this paper, we leverage the language of causal inference in the presence of interference to model interactions among the ads. Quantification of such interactions allows us to better understand the click behavior of users, which in turn impacts the revenue of the host search engine and enhances user satisfaction. We illustrate the utility of our formalization through experiments carried out on the ad placement system of the Bing search engine.</p></abstract>
<kwd-group>
<kwd>causal inference</kwd>
<kwd>allocational interference</kwd>
<kwd>spillover effect</kwd>
<kwd>dependent data</kwd>
<kwd>counterfactual layout</kwd>
<kwd>online advertising</kwd>
</kwd-group>
<counts>
<fig-count count="4"/>
<table-count count="5"/>
<equation-count count="11"/>
<ref-count count="50"/>
<page-count count="12"/>
<word-count count="9123"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>In recent years, advertisers have increasingly shifted their ad expenditures online. One of the most effective platforms for online advertising is search engine result pages. Given a user query, the search engine allocates a few ad slots (e.g., above or below its organic search results) and runs an auction among advertisers who are bidding and competing for these slots. Quantifying the effectiveness of ad placement is vital not only to the experience of the user, but also to the revenue of the advertiser and the search engine. Click yield is a common metric used in this regard. Often, statistical and flexible machine learning models are used to predict the click behavior of users by estimating the likelihood of receiving a click in a given slot using logged data. A rich literature is devoted to click prediction in sponsored search advertising (Shaparenko et al., <xref ref-type="bibr" rid="B34">2009</xref>; Cheng and Cant&#x000FA;-Paz, <xref ref-type="bibr" rid="B8">2010</xref>; Cheng et al., <xref ref-type="bibr" rid="B9">2012</xref>; Xiong et al., <xref ref-type="bibr" rid="B45">2012</xref>; Zhang et al., <xref ref-type="bibr" rid="B50">2014</xref>; Nabi-Abdolyousefi, <xref ref-type="bibr" rid="B21">2015</xref>; Effendi and Ali, <xref ref-type="bibr" rid="B12">2017</xref>; Bisht and Susan, <xref ref-type="bibr" rid="B6">2021</xref>). For a survey on click prediction in online advertising, please refer to Wang (<xref ref-type="bibr" rid="B44">2020</xref>). However, a comprehensive understanding of click behavior requires causal, rather than associative, reasoning (Bottou et al., <xref ref-type="bibr" rid="B7">2013</xref>; Yin et al., <xref ref-type="bibr" rid="B46">2014</xref>; Hill et al., <xref ref-type="bibr" rid="B14">2015</xref>; Zeng et al., <xref ref-type="bibr" rid="B47">2021</xref>).</p>
<p>Causal inference is central to making data-driven decisions. Inferring valid cause-effect relations, even with granular data and large sample sizes, is complicated by confounding induced by common causes of observed exposures and outcomes. In classical causal inference, it is assumed that samples are <italic>independent and identically distributed</italic> (iid). However, a causal view of ad placement under the iid assumption is implausible as ads interfere with one another from the beginning of the auction until the end when clicks on impressed ads are recorded. In non-iid settings, confounding arises due to not only the individual ad-level covariates but also the exposures and covariates of other ads in the system. This is commonly referred to as <italic>interference</italic> (Hudgens and Halloran, <xref ref-type="bibr" rid="B16">2008</xref>). Incorporating knowledge of interference into the statistical models used to compute rank scores for each ad can help optimize the final layout of each search page. Moreover, a proper understanding of the interference issue in relation to causal inference directly impacts engineering of more purposeful interventions and design of more effective A/B testing for ad placement. Alternatively, randomized experiments via bipartite graphs offer a useful formalism to study two-sided market experiments under violation of iid assumption (Pouget-Abadie et al., <xref ref-type="bibr" rid="B25">2018</xref>, <xref ref-type="bibr" rid="B24">2019</xref>; Bajari et al., <xref ref-type="bibr" rid="B1">2021</xref>; Harshaw et al., <xref ref-type="bibr" rid="B13">2021</xref>; Johari et al., <xref ref-type="bibr" rid="B17">2022</xref>). This stands in contrast with interference that occurs on networks where all units are of the same type (e.g., ads in a block)&#x02014;in bipartite experiments, there is a distinction between units that can be subject to an intervention and units whose responses are of interest to the experimenter. 
Hence, what we aim to model in the context of sponsored search advertising is closer to the causal framework for modeling interference in social networks.</p>
<p>In this paper, we formalize the problem of interference among ads using the language of causal inference. To the best of our knowledge, this is the first analysis of ads under the plausible and realistic setting of interference. We hope our proposed framework serves as a benchmark for future work in search advertising that goes beyond the classical iid assumption. Throughout the paper, we discuss mechanisms that give rise to interference in ad placement. Using graphical models, we assume a causal structure that encodes the various sources of interference. We formulate our causal questions and discuss the identification and estimation of relevant effects. Our experiments find statistically significant interference effects among ads. We further adapt the <italic>constraint-based</italic> structure learning algorithm <italic>Fast Causal Inference</italic> (Spirtes et al., <xref ref-type="bibr" rid="B40">2000</xref>) to verify the correctness of our presumed causal structure and learn the underlying mechanisms that give rise to interference. Finally, we incorporate the knowledge of interference to improve the performance of the statistical models used during the course of the auction. We demonstrate this improvement in performance by running experiments that closely resemble the framework in the Genie model, an offline counterfactual policy estimation framework for optimizing Sponsored Search Marketplace in Bing ads (Bayir et al., <xref ref-type="bibr" rid="B2">2019</xref>).</p>
</sec>
<sec id="s2">
<title>2. Preliminaries and Setup</title>
<p>In causal inference, we are interested in quantifying the cause-effect relationships between a treatment variable <italic>A</italic> and an outcome <italic>Y</italic> using experimental or observational data. A common setting assumes that the treatment received by one unit does not affect the outcomes of other units; this is known as the <italic>stable unit treatment value assumption</italic> or SUTVA (Rubin, <xref ref-type="bibr" rid="B31">1980</xref>) and is informally referred to as the &#x0201C;no-interference&#x0201D; assumption. In this setting, the <italic>average causal effect (ACE)</italic> of a binary treatment <italic>A</italic> on <italic>Y</italic> is defined as <italic>ACE</italic>: &#x0003D; &#x1D53C;[<italic>Y</italic>(1)] &#x02212; &#x1D53C;[<italic>Y</italic>(0)], where <italic>Y</italic>(<italic>a</italic>) denotes the counterfactual/potential outcome <italic>Y</italic> had treatment <italic>A</italic> been assigned to <italic>a</italic>, possibly contrary to fact.</p>
<p>Causal inference uses assumptions in causal models to link the observed data distribution to the distribution over counterfactual random variables. A simple example of a causal model is the <italic>conditionally ignorable model</italic> which encodes three main assumptions: (i) <italic>Consistency</italic> assumes the mechanism that determines the value of the outcome does not distinguish the method by which the treatment was assigned, as long as the treatment value assigned was invariant, (ii) <italic>Conditional ignorability</italic> assumes <italic>Y</italic>(<italic>a</italic>) &#x022A5; <italic>A</italic> &#x02223; <italic>X</italic>, where <italic>X</italic> acts as a set of observed confounders, such that adjusting for their influence suffices to remove all non-causal dependence between <italic>A</italic> and <italic>Y</italic>, and (iii) <italic>Positivity</italic> of <italic>p</italic>(<italic>A</italic> &#x0003D; <italic>a</italic> &#x02223; <italic>X</italic> &#x0003D; <italic>x</italic>), &#x02200; <italic>a, x</italic>. 
Under these assumptions, <italic>p</italic>[<italic>Y</italic>(<italic>a</italic>)] is identified as the following function of the observed data: <inline-formula><mml:math id="M1"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>X</mml:mi></mml:mrow></mml:munder><mml:mi>p</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>Y</mml:mi><mml:mo>&#x02223;</mml:mo><mml:mi>A</mml:mi><mml:mo>=</mml:mo><mml:mi>a</mml:mi><mml:mo>,</mml:mo><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x000D7;</mml:mo><mml:mi>p</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:math></inline-formula> known as <italic>backdoor adjustment</italic> or <italic>g-formula</italic> (Robins, <xref ref-type="bibr" rid="B29">1986</xref>; Pearl, <xref ref-type="bibr" rid="B23">2009</xref>). For a general identification theory of causal effects in the presence of unmeasured confounders see (Huang and Valtorta, <xref ref-type="bibr" rid="B15">2006</xref>; Shpitser and Pearl, <xref ref-type="bibr" rid="B38">2006</xref>; Bhattacharya et al., <xref ref-type="bibr" rid="B3">2020a</xref>). Alternative causal quantities of interest include conditional causal effects (effects within subpopulations defined by covariates) (Shpitser and Pearl, <xref ref-type="bibr" rid="B37">2012</xref>), mediation quantities (which decompose effects into components along different mechanisms) (Shpitser, <xref ref-type="bibr" rid="B36">2013</xref>), and the effects of decision rules in sequential settings (such as dynamic treatment regimes in personalized medicine) (Nabi et al., <xref ref-type="bibr" rid="B19">2018</xref>, <xref ref-type="bibr" rid="B20">2019</xref>).</p>
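<p>As an illustration only, the backdoor adjustment above can be sketched as a plug-in estimator for a discrete confounder. The function names <monospace>backdoor_mean</monospace> and <monospace>ace</monospace> and the toy records are assumptions introduced here, not part of our analysis; the sketch targets the mean of <italic>Y</italic>(<italic>a</italic>), which follows from the identified distribution.</p>
<preformat>
```python
from collections import Counter

def backdoor_mean(rows, a):
    """Plug-in estimate of E[Y(a)] = sum_x E[Y | A=a, X=x] * p(X=x).

    rows: list of (x, a, y) tuples with discrete x and a, numeric y.
    """
    n = len(rows)
    x_counts = Counter(x for x, _, _ in rows)
    total = 0.0
    for x, n_x in x_counts.items():
        # Outcomes observed in the stratum X=x among units with A=a.
        ys = [y for (xi, ai, y) in rows if xi == x and ai == a]
        if not ys:
            # Empirical positivity fails: the stratum has no unit with A=a.
            raise ValueError(f"no rows with A={a} and X={x}")
        total += (sum(ys) / len(ys)) * (n_x / n)
    return total

def ace(rows):
    """Average causal effect E[Y(1)] - E[Y(0)] under conditional ignorability."""
    return backdoor_mean(rows, 1) - backdoor_mean(rows, 0)

# Hypothetical toy records (x, a, y), for illustration only.
rows = [
    (0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 1),
    (1, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1),
]
print(ace(rows))
```
</preformat>
<p>The estimator raises an error when a stratum of <italic>X</italic> contains no unit with the requested treatment value, which is the empirical counterpart of a positivity violation.</p>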
<p>In this paper, we relax the implausible assumption of no-interference in ad placement. Interference among ads across different pageviews creates the most extreme scenario of <italic>full interference</italic>, as this allows for user interaction with the system over multiple time frames. Following the convention in Sobel (<xref ref-type="bibr" rid="B39">2006</xref>), Hudgens and Halloran (<xref ref-type="bibr" rid="B16">2008</xref>), Tchetgen and VanderWeele (<xref ref-type="bibr" rid="B41">2012</xref>), and Ogburn and VanderWeele (<xref ref-type="bibr" rid="B22">2014</xref>), we model only interference within pageviews and restrict any cross-pageview interference among ads. In other words, we restrict the interference to spatial constraints and exclude temporal dependence across pageviews. This is known as <italic>partial interference</italic> and could be justified by the fact that pageviews are query specific and are separated by time and space. In the presence of interference, the counterfactual <italic>Y</italic>(<italic>a</italic>) is no longer well-defined, as we need to distinguish ads by a proper indexing scheme and consider the treatment assignments of other ads simultaneously.</p>
<p>Suppose we have <italic>N</italic> pageviews, indexed by <italic>n</italic> &#x0003D; 1, &#x02026;, <italic>N</italic>, with each containing <italic>m</italic> impressed ads. We index the ads on each pageview by <italic>i</italic> &#x0003D; 1, &#x02026;, <italic>m</italic> based on the order in which they appear on the page. The <italic>i</italic>-th ad on the <italic>n</italic>-th pageview is represented by the tuple (<italic>X</italic><sub><italic>ni</italic></sub>, <italic>A</italic><sub><italic>ni</italic></sub>, <italic>Y</italic><sub><italic>ni</italic></sub>), where <italic>X</italic><sub><italic>ni</italic></sub> denotes the vector that collects all the ad-specific features such as geometric features (e.g., line width, pixel height), decorative features (e.g., rating information, Twitter followers), and other textual features extracted from the ad. <italic>A</italic><sub><italic>ni</italic></sub> denotes the treatment and is predefined by the analyst. An example of a treatment is the <italic>block</italic> membership of the ad: an indicator that specifies whether the ad is placed at the top of the page (Top) or at the bottom of the page (Bottom). Ads can also appear elsewhere, such as in the sidebars. In this paper, without loss of generality, we assume we only have two distinct blocks of ads on each pageview: Top and Bottom. <italic>Y</italic><sub><italic>ni</italic></sub> denotes a binary indicator of whether the ad receives a click from the user. We denote the state space of a random variable <italic>V</italic> by &#x1D51B;<sub><italic>V</italic></sub>.</p>
<p>Let <bold>X</bold><sub><italic>n</italic></sub>: &#x0003D; (<italic>X</italic><sub><italic>n</italic>1</sub>, &#x02026;, <italic>X</italic><sub><italic>nm</italic></sub>), <bold>A</bold><sub><italic>n</italic></sub>: &#x0003D; (<italic>A</italic><sub><italic>n</italic>1</sub>, &#x02026;, <italic>A</italic><sub><italic>nm</italic></sub>), and <bold>Y</bold><sub><italic>n</italic></sub>: &#x0003D; (<italic>Y</italic><sub><italic>n</italic>1</sub>, &#x02026;, <italic>Y</italic><sub><italic>nm</italic></sub>) collect the features, treatment assignments, and outcomes of all the ads on the <italic>n</italic>-th pageview, respectively. We define the counterfactual <italic>Y</italic><sub><italic>ni</italic></sub>(<bold>a</bold><sub><italic>n</italic></sub>) to be the click response of the <italic>i</italic>-th ad on the <italic>n</italic>-th pageview where every ad on the same pageview is relocated according to the treatment assignment rule <bold>a</bold><sub><italic>n</italic></sub>, a vector of size <italic>m</italic> whose <italic>i</italic>-th element <italic>a</italic><sub><italic>ni</italic></sub> denotes the treatment value of the <italic>i</italic>-th ad. This notation makes the interference among ads on the same pageview more explicit, as the potential outcome of a single ad now depends on the entire treatment assignment <bold>a</bold><sub><italic>n</italic></sub>, rather than just <italic>a</italic><sub><italic>ni</italic></sub>. The causal effect of interventions in the presence of interference can be quantified by comparing such counterfactuals under different interventions; for instance <italic>Y</italic><sub><italic>ni</italic></sub>(<bold>a</bold><sub><italic>n</italic></sub>) vs. 
<inline-formula><mml:math id="M2"><mml:msub><mml:mi>Y</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:msub><mml:mo>&#x02032;</mml:mo><mml:mi>n</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:math></inline-formula> where <bold>a</bold><sub><italic>n</italic></sub> and <inline-formula><mml:math id="M3"><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:msub><mml:mo>&#x02032;</mml:mo><mml:mi>n</mml:mi></mml:msub></mml:math></inline-formula> denote two plausible interventions.</p>
<p>In the next section, we discuss various sources that give rise to interference among ads and propose a causal graphical model that captures such interactions in a reasonable way. We then discuss various ways of quantifying the interference effect among ads and provide sufficient conditions for identification of such effects along with estimation strategies. In general, we observe fewer pageviews with <italic>m</italic> &#x0003E; 5 impressed ads. This may affect the finite sample performance of our effect estimates for such pageviews, discussed in Section 4.3. The number of impressed ads does not affect any of our identification claims in Section 4.2. We do consider pageviews with up to <italic>m</italic> &#x0003D; 8 impressed ads in our experiments in Section 5.</p>
</sec>
<sec id="s3">
<title>3. Ad Placement in the Presence of Interference</title>
<p>We describe ad placement in the presence of interference by a system of nonparametric structural equation models with independent errors (Pearl, <xref ref-type="bibr" rid="B23">2009</xref>). The key characteristic of structural models is that they represent each variable as a deterministic function of its direct causes together with an unobserved exogenous noise term, which itself represents all causes outside of the model. Let <italic>U</italic> denote a variable capturing user intention, which is unknown and hidden to the analyst. Given such intent, the user types a query, denoted by <italic>C</italic>, which is expressed as an unrestricted function <italic>f</italic><sub><italic>c</italic></sub>(.) of the intent <italic>U</italic> and a noise term &#x003F5;<sub><italic>c</italic></sub>. Upon observing the query, a set of ads is selected from the inventory, and an online auction is then run to determine the winning ads to be displayed on the page. The <italic>i</italic>-th displayed ad is denoted by <italic>X</italic><sub><italic>i</italic></sub>. The relation between <italic>X</italic><sub><italic>i</italic></sub> and <italic>C</italic> is captured by an unrestricted function <italic>f</italic><sub><italic>x</italic><sub><italic>i</italic></sub></sub>(.) and the perturbation term &#x003F5;<sub><italic>x</italic><sub><italic>i</italic></sub></sub>. The block allocation of the <italic>i</italic>-th ad is denoted by <italic>A</italic><sub><italic>i</italic></sub>. The set of all impressed ads and the allocations are denoted by <bold>X</bold> and <bold>A</bold>, respectively (we suppress the indexing of pageviews for clarity). The information on <italic>U</italic>, <bold>X</bold>, <bold>A</bold>, along with the noise term &#x003F5;<sub><italic>y</italic><sub><italic>i</italic></sub></sub>, determines whether the <italic>i</italic>-th ad is clicked or not, which is captured by <italic>Y</italic><sub><italic>i</italic></sub>. The structural equation models are summarized as follows.</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M4"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mi>u</mml:mi><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>U</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x02190;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:msub><mml:mi>f</mml:mi><mml:mi>u</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mi>u</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mtext>&#x000A0;</mml:mtext></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>u</mml:mi><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>q</mml:mi><mml:mi>u</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>y</mml:mi><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>C</mml:mi><mml:mtext>&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x02190;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:msub><mml:mi>f</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>U</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo 
stretchy='false'>)</mml:mo><mml:mtext>&#x000A0;</mml:mtext></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>i</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mtext>&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x02190;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>i</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>a</mml:mi><mml:msup><mml:mi>d</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup><mml:mi>s</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mtext>&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x02190;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:msub><mml:mi>a</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo 
stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mrow><mml:msub><mml:mi>a</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>i</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>a</mml:mi><mml:msup><mml:mi>d</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup><mml:mi>s</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>k</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mtext>&#x000A0;&#x000A0;</mml:mtext><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mtext>&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x02190;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>U</mml:mi><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Note that in the above display, when allocating the <italic>i</italic>-th ad to Top or Bottom, we consider not only the corresponding features of the ad itself, but also the features of other ads on the page; hence the entire array <bold>X</bold> acts as a cause of <italic>A</italic><sub><italic>i</italic></sub>. Similarly, we allow for the entire vector <bold>A</bold> and array <bold>X</bold> to influence <italic>Y</italic><sub><italic>i</italic></sub>. These equations capture the interference mechanism in ad placement. In the absence of interference, the above equations simplify by replacing the allocation structural equation with <italic>A</italic><sub><italic>i</italic></sub> &#x02190; <italic>f</italic><sub><italic>a</italic><sub><italic>i</italic></sub></sub>(<italic>X</italic><sub><italic>i</italic></sub>, &#x003F5;<sub><italic>a</italic><sub><italic>i</italic></sub></sub>) and the click indication structural equation with <italic>Y</italic><sub><italic>i</italic></sub> &#x02190; <italic>f</italic><sub><italic>y</italic><sub><italic>i</italic></sub></sub>(<italic>U, X</italic><sub><italic>i</italic></sub>, <italic>A</italic><sub><italic>i</italic></sub>, &#x003F5;<sub><italic>y</italic><sub><italic>i</italic></sub></sub>).</p>
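<p>To make the role of the full allocation vector concrete, the following is a minimal simulation sketch of one pageview drawn from a toy instance of the structural equations. Every functional form, constant, and name in it (<monospace>simulate_pageview</monospace>, <monospace>do_a</monospace>, the click probabilities) is a hypothetical assumption chosen for illustration; the equations themselves are left unrestricted.</p>
<preformat>
```python
import random

def simulate_pageview(m=3, seed=0, do_a=None):
    """Draw one pageview from a toy instance of the structural equations.

    All functional forms below are illustrative assumptions.
    Passing do_a replaces the natural allocation by an intervention.
    """
    rng = random.Random(seed)
    u = rng.random()                                  # U = f_u(eps_u)
    c = u + rng.gauss(0.0, 0.1)                       # C = f_c(U, eps_c)
    x = [c + rng.gauss(0.0, 1.0) for _ in range(m)]   # X_i = f_xi(C, eps_xi)
    # A_i = f_ai(X, eps_ai): the ad with the highest noisy score goes to Top,
    # so each allocation depends on the features of every ad on the page.
    scores = [xi + rng.gauss(0.0, 0.1) for xi in x]
    top = max(range(m), key=lambda i: scores[i])
    a = [1 if i == top else 0 for i in range(m)]
    if do_a is not None:
        a = list(do_a)
    # Y_i = f_yi(U, X, A, eps_yi): the click probability shrinks with the
    # number of other ads placed in Top, which is the interference channel.
    # In this toy version X affects Y only through the allocation A.
    probs = []
    for i in range(m):
        rivals_on_top = sum(a) - a[i]
        base = 0.6 if a[i] == 1 else 0.2
        probs.append(u * base / (1 + rivals_on_top))
    y = [1 if p > rng.random() else 0 for p in probs]
    return {"U": u, "C": c, "X": x, "A": a, "Y": y, "click_prob": probs}

# Same user and ads (same seed), two different interventions on the layout.
alone = simulate_pageview(seed=1, do_a=[1, 0, 0])
shared = simulate_pageview(seed=1, do_a=[1, 1, 0])
print(alone["click_prob"][0], shared["click_prob"][0])
```
</preformat>
<p>In this sketch the first ad stays in Top under both interventions, yet its click probability changes when a second ad joins it in Top. An intervention on the allocation of one ad therefore changes the outcome of another ad, which is the sense in which the potential outcome depends on the entire allocation vector.</p>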
<p>Causal relationships are often represented by graphical causal models (Spirtes et al., <xref ref-type="bibr" rid="B40">2000</xref>; Pearl, <xref ref-type="bibr" rid="B23">2009</xref>). Such models generalize independence models on directed acyclic graphs (DAGs) to also encode conditional independencies on counterfactual variables (Richardson and Robins, <xref ref-type="bibr" rid="B27">2013</xref>). A DAG <italic>G</italic>(<italic>V</italic>) consists of a set of nodes <italic>V</italic> connected through directed edges such that there are no directed cycles. We will abbreviate <italic>G</italic>(<italic>V</italic>) as simply <italic>G</italic> when the vertex set is clear from the given context. Statistical models of a DAG <italic>G</italic> are sets of distributions that factorize as <italic>p</italic>(<italic>V</italic>) &#x0003D; &#x0220F;<sub><italic>V</italic><sub><italic>i</italic></sub> &#x02208; <italic>V</italic></sub> <italic>p</italic>[<italic>V</italic><sub><italic>i</italic></sub> &#x02223; pa<sub><italic>G</italic></sub>(<italic>V</italic><sub><italic>i</italic></sub>)], where pa<sub><italic>G</italic></sub>(<italic>V</italic><sub><italic>i</italic></sub>) are the parents of <italic>V</italic><sub><italic>i</italic></sub> in <italic>G</italic>. The absence of edges between variables in <italic>G</italic>, relative to a complete DAG, entails conditional independence facts in <italic>p</italic>(<italic>V</italic>). These can be directly read off from the DAG <italic>G</italic> by the well-known d-separation criterion (Pearl, <xref ref-type="bibr" rid="B23">2009</xref>). That is, for disjoint sets <italic>X, Y, Z</italic>, the following <italic>global Markov property</italic> holds: (<italic>X</italic> &#x022A5;&#x022A5;<sub>d-sep</sub> <italic>Y</italic> &#x02223; <italic>Z</italic>)<sub><italic>G</italic></sub> &#x021D2; (<italic>X</italic> &#x022A5;&#x022A5; <italic>Y</italic> &#x02223; <italic>Z</italic>)<sub><italic>p</italic>(<italic>V</italic>)</sub>. 
When the context is clear, we will simply use <italic>X</italic> &#x022A5;&#x022A5; <italic>Y</italic> &#x02223; <italic>Z</italic> to denote the conditional independence between <italic>X</italic> and <italic>Y</italic> given <italic>Z</italic>. The DAG representation of the structural (Equation 1) for a pageview with three impressed ads is shown in <xref ref-type="fig" rid="F1">Figure 1A</xref>. For simplicity and to avoid cluttering the graph, we only depict the outcome of the <italic>i</italic>-th ad on the DAG and marginalize out all the other outcomes (since all the outcomes share the same set of parents). The statistical model of the DAG in <xref ref-type="fig" rid="F1">Figure 1A</xref>, assuming all outcomes are included on the DAG, can be written as,</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M5"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>U</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>Y</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>U</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x000D7;</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>C</mml:mi><mml:mo>&#x02223;</mml:mo><mml:mi>U</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x000D7;</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x0220F;</mml:mo> <mml:mrow><mml:msubsup><mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mn>3</mml:mn></mml:msubsup></mml:mrow></mml:mstyle><mml:mtext>&#x02009;</mml:mtext><mml:mo>&#x0007B;</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02223;</mml:mo><mml:mi>C</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x000D7;</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02223;</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mo 
stretchy='false'>)</mml:mo><mml:mo>&#x000D7;</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02223;</mml:mo><mml:mi>U</mml:mi><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007D;</mml:mo><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>As mentioned earlier, the user intent is unmeasured. We further restrict our attention to ad-specific features and leave the query-specific features aside. In other words, <italic>U</italic> and <italic>C</italic> are both treated as latent. We highlight this in <xref ref-type="fig" rid="F1">Figure 1A</xref> by coloring both vertices and the relevant edges in gray. In this case, the joint distribution over observed variables <bold>X, A, Y</bold> and latent variables <italic>U, C</italic> is said to be Markov relative to a hidden variable DAG. There may be infinitely many hidden variable DAGs that imply the same set of conditional independencies on the observed margin, i.e., <italic>p</italic>(<bold>X, A, Y</bold>). It is typical to use a single acyclic directed mixed graph that entails the same set of equality constraints as this infinite class; see Verma and Pearl (<xref ref-type="bibr" rid="B43">1990</xref>) and Richardson et al. (<xref ref-type="bibr" rid="B26">2017</xref>) for more details.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p><bold>(A)</bold> DAG representation of the SEM in Equation (1) for a pageview with three impressed ads (the independent error terms are omitted from the graph for simplicity). <bold>(B)</bold> The corresponding SWIG where we intervene on <italic>A</italic> and set the block allocations (<italic>A</italic><sub>1</sub>, <italic>A</italic><sub>2</sub>, <italic>A</italic><sub>3</sub>) to (<italic>a</italic><sub>1</sub>, <italic>a</italic><sub>2</sub>, <italic>a</italic><sub>3</sub>).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-05-888592-g0001.tif"/>
</fig>
<sec>
<title>3.1. Sources of Interference in Ad Placement</title>
<p>To better understand the interference behavior among ads, we need to identify the causal mechanisms that give rise to it. In our causal model in <xref ref-type="fig" rid="F1">Figure 1A</xref>, we allow for two distinct kinds of pathways through which other ads influence <italic>Y</italic><sub><italic>i</italic></sub>. The first consists of direct pathways such as <italic>X</italic><sub><italic>j</italic></sub> &#x02192; <italic>Y</italic><sub><italic>i</italic></sub> and <italic>A</italic><sub><italic>j</italic></sub> &#x02192; <italic>Y</italic><sub><italic>i</italic></sub>. This type of interference is called <italic>direct interference</italic>. As an example, suppose a low quality ad (determined by various scores) is placed in the Top. The poor quality of this ad may shape the user&#x00027;s opinion about the sorted search results in negative ways, preventing them from clicking on further ads. Similarly, placing a high quality ad in the Top may convince the user to return and explore more ads. The second consists of pathways that go through common unmeasured confounders and account for marginal dependencies between <italic>Y</italic><sub><italic>i</italic></sub> and <italic>Y</italic><sub><italic>j</italic></sub>. An example of this marginal dependency is the path through user intent <italic>U</italic>, <italic>Y</italic><sub><italic>j</italic></sub> &#x02190; <italic>U</italic> &#x02192; <italic>Y</italic><sub><italic>i</italic></sub>. This type of interference is called <italic>interference by homophily</italic> (Shalizi and Thomas, <xref ref-type="bibr" rid="B33">2011</xref>). Accounting for homophily makes our framework more practical, as it allows unmeasured confounders to influence multiple outcomes simultaneously. For a discussion of graphical representations of different sources of interference, see Ogburn and VanderWeele (<xref ref-type="bibr" rid="B22">2014</xref>).</p>
<p>The third type of interference that we account for is called <italic>allocational interference</italic>. In allocational interference, the interactions among units are modeled according to their corresponding group assignments. Through interactions within a group, units&#x00027; characteristics may affect one another. This type of interference is well-suited for our purposes since each pageview is divided into non-overlapping blocks (Top and Bottom), and we can simply treat each block as a single group of ads. In our setting, treatment allocates each ad to a single block (randomly or given covariates <italic>X</italic>), and the outcome of the ad is affected by which other ads are allocated to the same block. We call this behavior <italic>block-level interference</italic>. We can also imagine a scenario where the outcome of an ad is affected by the ads that are <bold>not</bold> allocated to the same block. In other words, ads could potentially interact across blocks. We call this <italic>cross-block interference</italic>. As an example, moving a high quality ad to the Bottom may improve the perception of other ads in the Bottom and yield higher clicks on these ads. On the other hand, it may also affect the click yields of ads in the Top by drawing attention away from these ads, resulting in cross-block interactions. In order to formalize the block-level interference and cross-block interference, we split <bold>X</bold> into two disjoint sets: one that contains block-level information, denoted by <bold>X</bold><sup><italic>b</italic></sup>, and one that contains information outside the block, denoted by <bold>X</bold><sup><italic>c</italic></sup>. For the <italic>i</italic>-th positioned ad, we define two disjoint sets:</p>
<disp-formula id="E3"><mml:math id="M6"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msubsup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mi>i</mml:mi><mml:mi>b</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mtext>&#x02009;&#x000A0;s.t.&#x000A0;&#x02009;</mml:mtext><mml:msub><mml:mi>A</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x0007D;</mml:mo><mml:mo>=</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:mi mathvariant='double-struck'>I</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mo>&#x02200;</mml:mo><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:mi>m</mml:mi><mml:mo>&#x0007D;</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mi>i</mml:mi><mml:mi>c</mml:mi></mml:msubsup><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mtext>&#x02009;&#x000A0;s.t.&#x000A0;&#x02009;</mml:mtext><mml:msub><mml:mi>A</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:menclose 
notation='updiagonalstrike'><mml:mo>=</mml:mo></mml:menclose><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x0007D;</mml:mo><mml:mo>=</mml:mo><mml:mo>&#x0007B;</mml:mo><mml:mi mathvariant='double-struck'>I</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:menclose notation='updiagonalstrike'><mml:mo>=</mml:mo></mml:menclose><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mo>&#x02200;</mml:mo><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:mi>m</mml:mi><mml:mo>&#x0007D;</mml:mo><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>We modify the structural equations for <italic>Y</italic><sub><italic>i</italic></sub> in Equation (1) to directly account for allocational interference in our framework by replacing <italic>f</italic><sub><italic>y</italic><sub><italic>i</italic></sub></sub>(<italic>U</italic>, <bold>X, A</bold>, &#x003F5;<sub><italic>y</italic><sub><italic>i</italic></sub></sub>) with <inline-formula><mml:math id="M7"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>U</mml:mi><mml:mo>,</mml:mo><mml:msubsup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mi>i</mml:mi><mml:mi>b</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mi>i</mml:mi><mml:mi>c</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></inline-formula> Note that both <inline-formula><mml:math id="M8"><mml:msubsup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>b</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M9"><mml:msubsup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> depend on the treatment rule <italic>A</italic> by construction. The function <italic>f</italic><sub><italic>y</italic><sub><italic>i</italic></sub></sub> can take a nonlinear or a linear form. For illustration, assume <italic>f</italic><sub><italic>y</italic><sub><italic>i</italic></sub></sub> is linear in its parameters. We then have:</p>
<disp-formula id="E4"><mml:math id="M10"><mml:mrow><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mtext>&#x02009;</mml:mtext><mml:mo>&#x02190;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:munderover><mml:mrow><mml:msub><mml:mi>&#x003B3;</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mo>&#x000D7;</mml:mo><mml:mi mathvariant='double-struck'>I</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003B7;</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:mi mathvariant='double-struck'>I</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:menclose notation='updiagonalstrike'><mml:mo>=</mml:mo></mml:menclose><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:math></disp-formula>
<p>In the above equation, &#x003B3;<sub><italic>j</italic></sub> controls the block-level influence of <italic>X</italic><sub><italic>j</italic></sub> on the <italic>i</italic>-th ad if <italic>X</italic><sub><italic>j</italic></sub> is in the same block as <italic>X</italic><sub><italic>i</italic></sub>, otherwise the influence is controlled by the parameter &#x003B7;<sub><italic>j</italic></sub>. If &#x003B7;<sub><italic>j</italic></sub> &#x0003D; 0, &#x02200;<italic>j</italic>, then this implies that there is no cross-block interference and blocks are independent. If &#x003B7;<sub><italic>j</italic></sub> &#x0003D; &#x003B3;<sub><italic>j</italic></sub>, &#x02200;<italic>j</italic>, then this implies that there is no allocational interference. In other words, interactions within blocks and across blocks are modeled exactly the same and therefore the notion of &#x0201C;groups&#x0201D; is ruled out.</p>
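<p>As a minimal sketch of the linear form above, the following Python snippet evaluates the noise-free part of the structural equation for <italic>Y</italic><sub><italic>i</italic></sub> on a single pageview. It assumes scalar features <italic>X</italic><sub><italic>j</italic></sub> and a binary block label (1 for Top, 0 for Bottom); the function name <monospace>linear_outcome_mean</monospace> and all numeric values are hypothetical and chosen only for illustration.</p>
<preformat>
```python
import numpy as np

def linear_outcome_mean(i, A, X, gamma, eta):
    """Noise-free part of the linear structural equation for Y_i.

    A     : length-m block labels (assumed coding: 1 = Top, 0 = Bottom)
    X     : length-m scalar ad features (a simplifying assumption)
    gamma : length-m block-level coefficients
    eta   : length-m cross-block coefficients
    """
    A = np.asarray(A)
    X = np.asarray(X, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    eta = np.asarray(eta, dtype=float)
    same_block = (A == A[i]).astype(float)  # I(A_j = A_i), includes j = i
    other_block = 1.0 - same_block          # I(A_j != A_i)
    return float(np.sum(gamma * same_block * X + eta * other_block * X))

# Hypothetical pageview with m = 3 ads; every number here is illustrative.
A = np.array([1, 1, 0])
X = np.array([0.5, 0.2, 0.9])
gamma = np.array([1.0, 0.4, 0.4])
eta = np.array([0.0, -0.3, -0.3])

# Ad 0 shares the Top block with ad 1; ad 2 sits in the other block.
y0 = linear_outcome_mean(0, A, X, gamma, eta)

# With eta equal to gamma, the value no longer depends on the allocation.
y0_alloc1 = linear_outcome_mean(0, [1, 1, 0], X, gamma, gamma)
y0_alloc2 = linear_outcome_mean(0, [1, 0, 1], X, gamma, gamma)
```
</preformat>
<p>In this sketch, setting the cross-block coefficients equal to the block-level coefficients makes the returned value invariant to the allocation vector, which mirrors the case of no allocational interference described above.</p>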
</sec>
</sec>
<sec id="s4">
<title>4. Interference Effects Among Ads</title>
<p>Structural equation models, such as the one in display (1), enable us to determine the response of variables to interventions by incorporating knowledge of the functional dependencies between variables. For instance, intervening on the block allocation of the <italic>i</italic>-th ad would fix the value of <italic>A</italic><sub><italic>i</italic></sub> to <italic>a</italic><sub><italic>i</italic></sub>, and would transform descendants of <italic>A</italic><sub><italic>i</italic></sub> to counterfactual variables of the form <italic>V</italic>(<italic>a</italic><sub><italic>i</italic></sub>). Under an intervention that sets <bold>A</bold> to <bold>a</bold>, the structural equations in (1) are modified as follows:</p>
<disp-formula id="E5"><label>(3)</label><mml:math id="M11"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mtext>&#x02009;</mml:mtext><mml:mo>&#x02190;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:msub><mml:mi>a</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mo>&#x02200;</mml:mo><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mtext>&#x000A0;and&#x000A0;</mml:mtext></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mo>&#x02190;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>U</mml:mi><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mo>&#x02200;</mml:mo><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:mi>m</mml:mi><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Interventions can be applied directly to the causal graph through a node-splitting operation in which each random variable in <bold>A</bold> is split into two parts: a random part that takes all the incoming edges and a fixed part that takes all the outgoing edges. The resulting graph is called a single-world intervention graph (SWIG), which encodes counterfactual independencies associated with the intervention (Richardson and Robins, <xref ref-type="bibr" rid="B27">2013</xref>). Given the causal model in <xref ref-type="fig" rid="F1">Figure 1A</xref>, we obtain the corresponding SWIG in <xref ref-type="fig" rid="F1">Figure 1B</xref> after performing the intervention described in display (3).</p>
<sec>
<title>4.1. Causal Effects of Interest</title>
<p>We set block allocation as our treatment of interest, and based on the prior literature, consider several causal effects that are of particular interest in ad placement systems.</p>
<list list-type="order">
<list-item><p><italic>Unit-level effect</italic>: defined as the effect of modifying an ad&#x00027;s block allocation on its clickability while holding the block allocations of other ads fixed. Assume we have a fixed allocation rule <bold>a</bold>, and we are interested in moving the <italic>i</italic>-th ad from block <italic>a</italic>&#x02032; to <italic>a</italic>&#x02033;, i.e., altering the <italic>i</italic>-th element of <bold>a</bold> and allowing the other ads to follow the rule <bold>a</bold><sub>&#x02212;<italic>i</italic></sub>. Then the unit-level effect is quantified via</p></list-item>
</list>
<disp-formula id="E6"><mml:math id="M12"><mml:mrow><mml:msub><mml:mrow><mml:mtext>UE</mml:mtext></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02033;</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='true'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>i</mml:mi></mml:mstyle></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='true'>]</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='true'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02033;</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>i</mml:mi></mml:mstyle></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='true'>]</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
<list list-type="simple">
<list-item><p><italic>2. Spillover effect</italic>: defined as the effect of holding an ad&#x00027;s block allocation fixed while modifying the block allocations of other ads on the pageview. Assume we are interested in comparing two allocation rules <bold>a&#x02032;</bold> and <bold>a&#x02033;</bold> where the <italic>i</italic>-th element in each rule is fixed to <italic>a</italic>. Then the spillover effect is quantified via</p></list-item>
</list>
<disp-formula id="E7"><mml:math id="M13"><mml:mrow><mml:msub><mml:mrow><mml:mtext>SE</mml:mtext></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>a</mml:mi><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02033;</mml:mo></mml:msup></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='true'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>a</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>i</mml:mi></mml:mstyle></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='true'>]</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='true'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>a</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02033;</mml:mo></mml:msup></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='true'>]</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
<list list-type="simple">
<list-item><p><italic>3. Overall effect</italic>: defined as the effect of allocation rule <bold>a</bold> vs. <bold>a&#x02032;</bold> on the outcome of the <italic>i</italic>-th ad, which can be quantified via</p></list-item>
</list>
<disp-formula id="E8"><mml:math id="M14"><mml:mrow><mml:msub><mml:mrow><mml:mtext>OE</mml:mtext></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='true'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='true'>]</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='true'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='true'>]</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
<list list-type="simple">
<list-item><p><italic>4. Average overall effect</italic>: defined as a pageview-level comparison of two different allocation rules. This requires averaging the overall effects over all ads on a single pageview, i.e.,</p></list-item>
</list>
<disp-formula id="E9"><mml:math id="M15"><mml:mrow><mml:mtext>AOE</mml:mtext><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>m</mml:mi></mml:mfrac><mml:mtext>&#x02009;</mml:mtext><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:munderover><mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi></mml:mrow></mml:mstyle><mml:mo stretchy='false'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>]</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='false'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>]</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
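<p>The four contrasts above can be sketched in Python as functions of a counterfactual mean <monospace>mu(i, a)</monospace> standing in for &#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>)]. This is only an illustration under stated assumptions: <monospace>mu</monospace> is taken as given (in practice it would have to be identified and estimated), blocks are coded 1 for Top and 0 for Bottom, <monospace>toy_mu</monospace> is a hypothetical stand-in, and the average overall effect is implemented as the average of the per-ad overall effects, following its verbal definition.</p>
<preformat>
```python
def with_unit(a, i, value):
    """Return a copy of allocation a with its i-th entry set to value."""
    return tuple(value if j == i else a_j for j, a_j in enumerate(a))

def unit_effect(mu, i, a_prime, a_dprime, a):
    """UE_i: change ad i's own block, keep the other ads at a_{-i}."""
    return mu(i, with_unit(a, i, a_prime)) - mu(i, with_unit(a, i, a_dprime))

def spillover_effect(mu, i, a_i, rule_prime, rule_dprime):
    """SE_i: keep ad i at a_i, change the allocations of the other ads."""
    return (mu(i, with_unit(rule_prime, i, a_i))
            - mu(i, with_unit(rule_dprime, i, a_i)))

def overall_effect(mu, i, a, a_prime):
    """OE_i: compare two full allocation rules for ad i."""
    return mu(i, a) - mu(i, a_prime)

def average_overall_effect(mu, a, a_prime):
    """AOE: average of OE_i over the m ads on the pageview."""
    m = len(a)
    return sum(overall_effect(mu, i, a, a_prime) for i in range(m)) / m

def toy_mu(i, a):
    """Hypothetical counterfactual click mean, for illustration only:
    Top placement helps ad i, each other ad in the Top hurts it."""
    others_in_top = sum(a) - a[i]
    return 0.1 + 0.2 * a[i] - 0.05 * others_in_top

ue = unit_effect(toy_mu, 0, 1, 0, (0, 1, 0))
se = spillover_effect(toy_mu, 0, 1, (1, 1, 1), (1, 0, 0))
oe = overall_effect(toy_mu, 0, (1, 1, 1), (0, 0, 0))
aoe = average_overall_effect(toy_mu, (1, 1, 1), (0, 0, 0))
```
</preformat>
<p>Because every effect is a difference of two evaluations of the same function, any estimator of the counterfactual mean can be passed in place of <monospace>toy_mu</monospace> without changing the contrast definitions.</p>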
</sec>
<sec>
<title>4.2. Identification Assumptions</title>
<p>Counterfactuals cannot, in general, be identified from data alone; identification requires assumptions. It is straightforward to see that all the effects described above are contrasts of counterfactual means of the form &#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>)]. Thus, if we can identify this counterfactual mean, all the effects described are identifiable. In order to identify the counterfactual mean &#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>)], we make the following three assumptions: (i) <italic>Allocational consistency:</italic> <italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>) &#x0003D; <italic>Y</italic><sub><italic>i</italic></sub> if <bold>A</bold> &#x0003D; <bold>a</bold>, which means the potential outcome agrees with the observed outcome when the allocational intervention agrees with the observed allocations, (ii) <italic>Positivity:</italic> <italic>p</italic>(<bold>A</bold> &#x0003D; <bold>a</bold> &#x02223; <bold>X</bold> &#x0003D; <bold>x</bold>) &#x0003E; 0, &#x02200; <bold>a</bold> &#x02208;&#x1D51B; <sub><bold>A</bold></sub> and &#x02200; <bold>x</bold> &#x02208;&#x1D51B; <sub><bold>X</bold></sub>, and (iii) <italic>Network conditional ignorability:</italic> <italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>)&#x022A5;&#x022A5;<bold>A</bold>&#x02223;<bold>X</bold>, which means all the common confounders between each <italic>A</italic><sub><italic>j</italic></sub>&#x02208;<bold>A</bold> and <italic>Y</italic><sub><italic>i</italic></sub> are measured.</p>
<p>Consistency and positivity assumptions are standard in causal inference (with or without interference). Although the no-unmeasured-confounding assumption is also common in the literature (see Hudgens and Halloran (<xref ref-type="bibr" rid="B16">2008</xref>), Tchetgen and VanderWeele (<xref ref-type="bibr" rid="B41">2012</xref>), and Ogburn and VanderWeele (<xref ref-type="bibr" rid="B22">2014</xref>) for examples in the context of interference), it is often untestable. In practice, we may either rely on domain knowledge to argue for the conditional ignorability assumption, or conduct a sensitivity analysis to assess whether, and to what extent, the conclusions are robust to potential unmeasured confounding (Robins et al., <xref ref-type="bibr" rid="B30">2000</xref>; Scharfstein et al., <xref ref-type="bibr" rid="B32">2021</xref>). Fortunately, given the ad placement setup, described via the structural equations in display (1) and illustrated via the DAG in <xref ref-type="fig" rid="F1">Figure 1A</xref>, we know the observed set <bold>X</bold> is fully responsible for deciding the allocations. Thus, the network conditional ignorability assumption still holds even in the presence of unmeasured confounders <italic>U</italic>, e.g., the user intent. Further, as mentioned previously, we can exclude the observed queries, collected in <italic>C</italic>, from the conditioning set, as such factors play a direct role in neither choosing the allocations nor determining the final observed clicks. Using d-separation rules (Pearl, <xref ref-type="bibr" rid="B23">2009</xref>), we can read off the independence between the allocations <bold>A</bold> and the counterfactual variable <italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>) (conditioned on <bold>X</bold>) from the corresponding SWIG shown in <xref ref-type="fig" rid="F1">Figure 1B</xref>.</p>
<p>Given the structural equation model described in Equation (1), the causal model represented in <xref ref-type="fig" rid="F1">Figure 1A</xref>, and the corresponding SWIG in <xref ref-type="fig" rid="F1">Figure 1B</xref>, we can easily verify that network conditional ignorability holds in our model. By the rules of d-separation, all the paths from <italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>) to each <italic>A</italic><sub><italic>j</italic></sub> are blocked by conditioning on <bold>X</bold>. Under the aforementioned assumptions, the identifying functional for &#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>)] follows by first applying iterated expectations over <bold>X</bold>, then using ignorability together with positivity to condition on <bold>A</bold> &#x0003D; <bold>a</bold>, and finally using consistency to replace <italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>) with <italic>Y</italic><sub><italic>i</italic></sub>:</p>
<disp-formula id="E10"><label>(4)</label><mml:math id="M16"><mml:mrow><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='true'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='true'>]</mml:mo><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='true'>[</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='false'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02223;</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mo>=</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mo stretchy='false'>]</mml:mo><mml:mo stretchy='true'>]</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
<p>where the outer expectation is taken with respect to the marginal distribution over <italic>X</italic>, i.e., <italic>p</italic>(<bold>X</bold>). For a general theory describing when causal inference with interference is possible, interested readers can refer to Sherman and Shpitser (<xref ref-type="bibr" rid="B35">2018</xref>).</p>
</sec>
<sec>
<title>4.3. Estimation of Causal Effects</title>
<p>We set our target of inference to be &#x003C8; &#x0003D; &#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>)], which is identified via (4). There are several ways of estimating this identified functional (e.g., G-computation and inverse probability weighting estimators). In our experiments, we use the <italic>augmented inverse probability weighting</italic> (AIPW) estimator, given as</p>
<disp-formula id="E11"><label>(5)</label><mml:math id="M17"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msub><mml:mover accent='true'><mml:mi>&#x003C8;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mtext>aipw</mml:mtext></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mi mathvariant='double-struck'>I</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mi>n</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x000D7;</mml:mo><mml:mo stretchy='true'>(</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='true'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02223;</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mo>=</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mi>n</mml:mi></mml:msub><mml:mo>;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>&#x003B1;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>y</mml:mi></mml:msub><mml:mo stretchy='true'>]</mml:mo><mml:mo stretchy='true'>)</mml:mo></mml:mrow><mml:mrow><mml:msubsup><mml:mstyle mathsize='80%' 
displaystyle='true'><mml:mo>&#x0220F;</mml:mo></mml:mstyle><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mtext>&#x000A0;</mml:mtext><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02223;</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mi>n</mml:mi></mml:msub><mml:mo>;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>&#x003B1;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>a</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mrow></mml:mstyle></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:mrow><mml:mrow><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>+</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='false'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02223;</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mo>=</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mi>n</mml:mi></mml:msub><mml:mo>;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>&#x003B1;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>y</mml:mi></mml:msub><mml:mo stretchy='false'>]</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M18"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="M19"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>a</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are MLE estimates of the parameters in the outcome regression model &#x1D53C;[<italic>Y</italic> &#x02223; <bold>A</bold>, <bold>X</bold>] and the propensity models <italic>p</italic>(<italic>A</italic><sub><italic>i</italic></sub> &#x02223; <bold>X</bold>), respectively. The above estimator is consistent if either the propensity score models or the outcome regression model (or both) are correctly specified. This property is known as <italic>double robustness</italic>. For a more general discussion of semiparametric doubly robust estimators of average causal effects in the presence of unmeasured confounders, see Bhattacharya et al. (<xref ref-type="bibr" rid="B3">2020a</xref>). An alternative approach is to use targeted maximum likelihood estimators (TMLE) (Van der Laan et al., <xref ref-type="bibr" rid="B42">2007</xref>), which use an ensemble of machine learning models. We leave the exploration of TMLE to future work.</p>
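<p>As a minimal sketch of how Equation (5) can be evaluated once the nuisance models have been fitted, the following Python function works on plain arrays of fitted values. The function name <monospace>aipw_mean</monospace> and its argument names are hypothetical and introduced only for illustration; the fitted outcome regression values and propensity scores are assumed to come from any suitable models.</p>
<preformat><![CDATA[
```python
import numpy as np


def aipw_mean(A, y, mu_a, p_top, a):
    """AIPW estimate of E[Y_i(a)] in the form of Equation (5).

    All names here are illustrative assumptions:
      A     : (N, m) observed binary allocations, one row per pageview
      y     : (N,)   observed click outcome of ad i
      mu_a  : (N,)   fitted outcome regression E[Y_i | A = a, X_n]
      p_top : (N, m) fitted propensities p(A_jn = 1 | X_n)
      a     : (m,)   target allocation
    """
    A = np.asarray(A)
    a = np.asarray(a)
    y = np.asarray(y, dtype=float)
    mu_a = np.asarray(mu_a, dtype=float)
    p_top = np.asarray(p_top, dtype=float)

    # Indicator that the observed allocation equals the target allocation.
    match = np.all(A == a, axis=1).astype(float)

    # Probability of each component of a, then the product over the m ads.
    p_each = np.where(a == 1, p_top, 1.0 - p_top)
    p_joint = np.prod(p_each, axis=1)

    # Weighted residual correction plus the outcome regression term.
    return float(np.mean(match * (y - mu_a) / p_joint + mu_a))
```
]]></preformat>
<p>In this sketch the joint propensity is the product of the per-ad propensities, as in the denominator of Equation (5); positivity is assumed, so the product is never zero.</p>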
</sec>
<sec>
<title>4.4. Verifying and Learning Causal Structure</title>
<p>Throughout the paper, we have assumed a known causal structure for the ad placement system. To verify the correctness of our presumed causal structure, we adapt structure learning algorithms to learn the underlying mechanisms that give rise to interference. There is a rich literature on model selection from observational data in the context of causal inference with no interference (Spirtes et al., <xref ref-type="bibr" rid="B40">2000</xref>). This includes constraint-based algorithms such as PC (Spirtes et al., <xref ref-type="bibr" rid="B40">2000</xref>; Colombo and Maathuis, <xref ref-type="bibr" rid="B11">2014</xref>), score-based algorithms such as GES (Chickering, <xref ref-type="bibr" rid="B10">2002</xref>), and continuous-optimization-based algorithms such as the one in Bhattacharya et al. (<xref ref-type="bibr" rid="B4">2020b</xref>). Bhattacharya et al. (<xref ref-type="bibr" rid="B5">2019</xref>) provided a novel algorithm for model selection when units are related through a network of dependencies that can be modeled using a chain graph (Lauritzen, <xref ref-type="bibr" rid="B18">1996</xref>). However, in our context, dependencies are best modeled using DAGs with hidden variables. There exist (conditional independence) constraint-based algorithms, such as <italic>fast causal inference</italic> (FCI) and its variations GFCI and RFCI, that tackle the model selection problem in the presence of unmeasured confounders.</p>
<p>Click yields are the primary target of interest. Hence, we adapt the FCI algorithm in order to learn the &#x0201C;causal parents&#x0201D; of each <italic>Y</italic><sub><italic>i</italic></sub>. We do this by performing a pre-processing step on the data, where each row corresponds to the information we collect on a single pageview, in order to account for block-level and cross-block interference. As an example, consider pageviews with three impressed ads where we are interested in finding the causal parents of the outcome of the first-positioned ad, i.e., <italic>Y</italic><sub>1</sub>. We pre-process the data as follows. For each row, we set the variables in <italic>X</italic><sub><italic>j</italic></sub> to zero if <italic>A</italic><sub><italic>j</italic></sub> &#x0003D; <italic>A</italic><sub>1</sub>, for <italic>j</italic> &#x0003D; 1, 2, 3. We call this pre-processed data <italic>D</italic><sub>1</sub>. We then set the variables in <italic>X</italic><sub><italic>j</italic></sub> to zero if <italic>A</italic><sub><italic>j</italic></sub> &#x02260; <italic>A</italic><sub>1</sub>, for <italic>j</italic> &#x0003D; 2, 3. We call this pre-processed data <italic>D</italic><sub>2</sub>. We then append <italic>D</italic><sub>2</sub> to <italic>D</italic><sub>1</sub> column-wise and pass the resulting data to the FCI algorithm. Additional knowledge, such as causal ordering, can be incorporated in the procedure. The FCI algorithm then returns a partial ancestral graph (Zhang, <xref ref-type="bibr" rid="B48">2008</xref>) as the Markov equivalence class. The partial ancestral graph corresponds to a set of ancestral acyclic directed mixed graphs (Richardson and Spirtes, <xref ref-type="bibr" rid="B28">2002</xref>) that agree on the conditional independence constraints in the observed data distribution. 
Under the standard assumptions that the true model can be represented by an ancestral graph and that faithfulness holds, FCI, and hence our modification of it, asymptotically returns a Markov equivalence class that contains the true underlying model.</p>
<p>Here, we are working under a partial interference framework, where we model only interference within pageviews and exclude temporal dependence across pageviews. This means the search result pages are iid, but the ads inside each pageview do interact. Using the above description, we adapt the original FCI algorithm that assumes iid data to our framework for learning causal structures.</p>
</sec>
</sec>
<sec id="s5">
<title>5. Experiments</title>
<p>In this section, we illustrate the utility of our formalization of the ad interference problem through four separate experiments using Bing PC traffic: (i) estimating the counterfactual mean under interference as described in Section 4, (ii) identifying causally relevant features through structure learning, (iii) comparing click prediction models with and without accounting for interference, and (iv) evaluating the performance of models with interference on layouts that do not appear in the training data. For training and validation purposes, we used data from the first 2 weeks of June 2020. The test data comes from the first 2 weeks of July in the same year. We use random forest classifiers for fitting the propensity score and the outcome regression models.</p>
<p>We focused on two types of pageviews: <italic>positive pageviews</italic>, i.e., pageviews with at least one observed click (corresponding to users with an &#x0201C;ad frame of mind&#x0201D; who are more likely to click on an ad), and <italic>balanced pageviews</italic>, i.e., a mix of pageviews with at least one click and pageviews with no clicks. The balanced scenario gives a more realistic view of the traffic. We used AIPW to estimate the counterfactual mean &#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>)] and ran our experiments on pageviews with 3, 4, and 5 impressed ads.</p>
<sec>
<title>5.1. Calculation of Interference Effects</title>
<p>Recall that each allocation rule can be represented via a binary vector <bold>a</bold> &#x0003D; (<italic>a</italic><sub>1</sub>, &#x02026;, <italic>a</italic><sub><italic>m</italic></sub>); e.g., when <italic>m</italic> &#x0003D; 3, the allocation (1, 1, 1) corresponds to a scenario where all three ads are shown in the Top block. As mentioned in the preliminaries, ads are indexed according to the order in which they appear on the page. This indexing scheme restricts the state space of all possible allocation rules. For instance, an allocation like (0, 1, 1) where the first positioned ad is placed at the Bottom and the rest are on Top is ill-defined and therefore excluded from the set of possible allocation rules.</p>
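<p>The restricted state space can be enumerated directly. The sketch below assumes, as the example above and the layouts reported in our tables suggest, that a valid allocation never places a Bottom ad before a Top ad, so that exactly <italic>m</italic> + 1 allocations remain; the helper names <monospace>valid_allocations</monospace> and <monospace>is_valid</monospace> are hypothetical.</p>
<preformat><![CDATA[
```python
def valid_allocations(m):
    """List the allocations of m ads in which all Top ads (1) precede all Bottom ads (0).

    Assumption: ads are indexed by page position, so a valid allocation
    is a run of ones followed by a run of zeros.
    """
    return [tuple([1] * k + [0] * (m - k)) for k in range(m, -1, -1)]


def is_valid(a):
    """Check that no Bottom ad (0) appears before a Top ad (1)."""
    return all(a[j] >= a[j + 1] for j in range(len(a) - 1))


if __name__ == "__main__":
    for a in valid_allocations(3):
        print(a)
```
]]></preformat>
<p>Under this assumption, an allocation such as (0, 1, 1) is rejected by <monospace>is_valid</monospace>, in line with the exclusion described above.</p>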
<p>We use the AIPW estimator to compute the counterfactual mean &#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(<bold>a</bold>)] under all possible allocation rules <bold>a</bold>. The results are shown in <xref ref-type="fig" rid="F2">Figure 2</xref>. The layout that yields the highest click yield for each position on the pageview corresponds to the tallest bar in each plot. For instance, for <italic>m</italic> &#x0003D; 3, the first-positioned ad benefits the most from being the sole ad in the Top block, i.e., &#x1D53C;[<italic>Y</italic><sub>1</sub>(1, 0, 0)]&#x0003E;&#x1D53C;[<italic>Y</italic><sub>1</sub>(<bold>a</bold>)], &#x02200;<bold>a</bold>&#x02260;(1, 0, 0). However, the optimal layout for the first-positioned ad is not coherent with the optimal layouts of the other ads. For instance, the second-positioned ad benefits the most when it joins the first ad in the Top block. On the other hand, the last-positioned ad benefits slightly more when all ads are placed at the Bottom. 
In order to find a coherent optimal layout yielding the highest number of overall clicks, we need to compare the average click response over all positions on the pageview, i.e., the average overall effect <inline-formula><mml:math id="M20"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:mfrac><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:munderover><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>a</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:math></inline-formula> for all possible <italic>a</italic>.</p>
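<p>The comparison described above amounts to averaging the per-position estimates for each layout and selecting the layout with the largest average. A minimal sketch follows; the function names and the dictionary layout of the inputs are assumptions made for illustration, and the sketch ignores the uncertainty in the estimates.</p>
<preformat><![CDATA[
```python
def average_overall_effect(means_by_position):
    """Average the per-position counterfactual means E[Y_i(a)] for one layout a."""
    return sum(means_by_position) / len(means_by_position)


def best_layout(estimates):
    """Return the allocation with the largest average over positions.

    `estimates` (hypothetical structure) maps each allocation tuple a
    to the list [E[Y_1(a)], ..., E[Y_m(a)]] of point estimates.
    """
    return max(estimates, key=lambda a: average_overall_effect(estimates[a]))
```
]]></preformat>
<p>Because this selection uses point estimates only, layouts whose averages lie within each other's confidence intervals should not be ranked on this basis alone.</p>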
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Estimates of &#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(a)] for all possible allocations using AIPW on pageviews with 3, 4, 5 impressed ads.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-05-888592-g0002.tif"/>
</fig>
<p>Estimated values for all the counterfactual means (<italic>m</italic> &#x0003D; 3 with positive pageviews) are reported in <xref ref-type="table" rid="T1">Table 1</xref> along with the corresponding 95% confidence intervals. Results for <italic>m</italic> &#x0003D; 4, 5 with the two types of pageviews are provided in <xref ref-type="table" rid="T2">Table 2</xref>. Additional information on the frequencies of allocations is reported in <xref ref-type="table" rid="T3">Table 3</xref>. We can use these tables to compute the various effects that were discussed in the previous section. For instance, with <italic>m</italic> &#x0003D; 3, the following contrast gives us the unit-level effect for <italic>Y</italic><sub>2</sub> under allocation rule <bold>a</bold> &#x0003D; (1, 0, 0): UE<sub>2</sub>(1, 0, <bold>a</bold>) &#x0003D; &#x1D53C;[<italic>Y</italic><sub>2</sub>(1, 1, 0)&#x02212;<italic>Y</italic><sub>2</sub>(1, 0, 0)] &#x0003D; 0.32&#x02212;0.11 &#x0003D; 0.21 (&#x000B1;0.004). This number quantifies the effect on the clickability of the 2nd ad if we (hypothetically) moved it from Bottom to Top, while the 1st ad is kept on Top and the 3rd one is kept at the Bottom. 
The spillover effect under allocation rules <bold>a</bold> &#x0003D; (1, 0, 0) and <bold>a</bold>&#x02032; &#x0003D; (1, 1, 1) is given by <inline-formula><mml:math id="M21"><mml:mrow><mml:msub><mml:mrow><mml:mtext>SE</mml:mtext></mml:mrow><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='false'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>]</mml:mo><mml:mo>=</mml:mo><mml:mn>0.28</mml:mn><mml:mo>&#x02212;</mml:mo><mml:mn>0.32</mml:mn><mml:mo>=</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mn>0.04</mml:mn><mml:mo>&#x000A0;</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x000B1;</mml:mo><mml:mn>0.006</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></inline-formula> This number quantifies the effect on the clickability of the 2nd ad if we changed the placement of the remaining ads from that in <bold>a</bold> to that in <bold>a</bold>&#x02032; (here, moving the 3rd ad from Bottom to Top), while keeping the 2nd ad fixed on Top. The overall effect of <bold>a</bold>&#x02032; vs. 
<bold>a</bold>, i.e., <inline-formula><mml:math id="M22"><mml:mrow><mml:msub><mml:mrow><mml:mtext>OE</mml:mtext></mml:mrow><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>a</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mi>&#x1D53C;</mml:mi><mml:mo stretchy='false'>[</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>]</mml:mo></mml:mrow></mml:math></inline-formula> is equal to the sum of UE and SE which is 0.17 (&#x000B1;0.007). Using <xref ref-type="table" rid="T2">Table 2</xref>, we can also compare the performance of each layout in terms of overall click yields. The results are provided in <xref ref-type="table" rid="T4">Table 4</xref>.</p>
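<p>The decomposition above can be reproduced from the point estimates in <xref ref-type="table" rid="T1">Table 1</xref>. The short script below is only an arithmetic check under the stated contrasts; the variable names are illustrative, and the confidence half-widths are not recomputed because they depend on quantities not listed in the table.</p>
<preformat><![CDATA[
```python
# Point estimates of E[Y_2(a)] for the 2nd ad, copied from Table 1
# (positive pageviews, m = 3).
y2 = {(1, 1, 1): 0.28, (1, 1, 0): 0.32, (1, 0, 0): 0.11, (0, 0, 0): 0.28}

a = (1, 0, 0)
a_prime = (1, 1, 1)

# Intermediate layout: the 2nd ad takes its position from a_prime,
# while the other ads keep their positions from a.
a_mixed = (a[0], a_prime[1], a[2])

unit_effect = y2[a_mixed] - y2[a]              # E[Y_2(1,1,0) - Y_2(1,0,0)]
spillover_effect = y2[a_prime] - y2[a_mixed]   # E[Y_2(1,1,1) - Y_2(1,1,0)]
overall_effect = y2[a_prime] - y2[a]           # E[Y_2(1,1,1) - Y_2(1,0,0)]

print(round(unit_effect, 2), round(spillover_effect, 2), round(overall_effect, 2))
```
]]></preformat>
<p>By construction, the unit-level and spillover contrasts telescope, so their sum equals the overall effect up to floating-point rounding.</p>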
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Estimated values for the counterfactual mean &#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(a)] for all possible <italic>a</italic>, along with the 95% confidence intervals.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center"><bold>&#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(1, 1, 1)]</bold></th>
<th valign="top" align="center"><bold><italic>&#x1D53C;</italic>[<italic>Y</italic><sub><italic>i</italic></sub>(1, 1, 0)]</bold></th>
<th valign="top" align="center"><bold>&#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(1, 0, 0)]</bold></th>
<th valign="top" align="center"><bold>&#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(0, 0, 0)]</bold></th>
<th valign="top" align="center"><bold><italic>&#x1D53C;</italic>[<italic>Y</italic><sub><italic>i</italic></sub>]</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1st ad</td>
<td valign="top" align="center">0.57 &#x000B1; 0.006</td>
<td valign="top" align="center">0.64 &#x000B1; 0.004</td>
<td valign="top" align="center">0.83 &#x000B1; 0.004</td>
<td valign="top" align="center">0.56 &#x000B1; 0.005</td>
<td valign="top" align="center">0.65</td>
</tr>
<tr>
<td valign="top" align="left">2nd ad</td>
<td valign="top" align="center">0.28 &#x000B1; 0.007</td>
<td valign="top" align="center">0.32 &#x000B1; 0.005</td>
<td valign="top" align="center">0.11 &#x000B1; 0.003</td>
<td valign="top" align="center">0.28 &#x000B1; 0.005</td>
<td valign="top" align="center">0.25</td>
</tr>
<tr>
<td valign="top" align="left">3rd ad</td>
<td valign="top" align="center">0.20 &#x000B1; 0.006</td>
<td valign="top" align="center">0.07 &#x000B1; 0.002</td>
<td valign="top" align="center">0.09 &#x000B1; 0.003</td>
<td valign="top" align="center">0.21 &#x000B1; 0.005</td>
<td valign="top" align="center">0.13</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The observed &#x1D53C;[Y<sub>i</sub>] is reported in the last column (positive pageviews with m &#x0003D; 3)</italic>.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Estimation of counterfactual &#x1D53C;[<italic>Y</italic><sub><italic>i</italic></sub>(a)] along with 95% confidence interval.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center" colspan="2"><bold>Scenarios</bold></th>
<th valign="top" align="center"><bold>Observed mean</bold></th>
<th valign="top" align="center"><bold>0 at Bottom</bold></th>
<th valign="top" align="center"><bold>1 at Bottom</bold></th>
<th valign="top" align="center"><bold>2 at Bottom</bold></th>
<th valign="top" align="center"><bold>3 at Bottom</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 3</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">0.65</td>
<td valign="top" align="center">0.57 &#x000B1; 0.006</td>
<td valign="top" align="center">0.64 &#x000B1; 0.004</td>
<td valign="top" align="center">0.83 &#x000B1; 0.004</td>
<td valign="top" align="center">0.56 &#x000B1; 0.005</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">0.25</td>
<td valign="top" align="center">0.28 &#x000B1; 0.007</td>
<td valign="top" align="center">0.32 &#x000B1; 0.005</td>
<td valign="top" align="center">0.11 &#x000B1; 0.003</td>
<td valign="top" align="center">0.28 &#x000B1; 0.005</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">0.13</td>
<td valign="top" align="center">0.20 &#x000B1; 0.006</td>
<td valign="top" align="center">0.07 &#x000B1; 0.002</td>
<td valign="top" align="center">0.09 &#x000B1; 0.003</td>
<td valign="top" align="center">0.21 &#x000B1; 0.005</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">0.48</td>
<td valign="top" align="center">0.41 &#x000B1; 0.006</td>
<td valign="top" align="center">0.50 &#x000B1; 0.004</td>
<td valign="top" align="center">0.60 &#x000B1; 0.004</td>
<td valign="top" align="center">0.36 &#x000B1; 0.008</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">0.18</td>
<td valign="top" align="center">0.22 &#x000B1; 0.008</td>
<td valign="top" align="center">0.25 &#x000B1; 0.004</td>
<td valign="top" align="center">0.10 &#x000B1; 0.003</td>
<td valign="top" align="center">0.15 &#x000B1; 0.011</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">0.10</td>
<td valign="top" align="center">0.16 &#x000B1; 0.007</td>
<td valign="top" align="center">0.06 &#x000B1; 0.002</td>
<td valign="top" align="center">0.07 &#x000B1; 0.003</td>
<td valign="top" align="center">0.11 &#x000B1; 0.011</td>
</tr>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 4</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">0.59</td>
<td valign="top" align="center">0.52 &#x000B1; 0.010</td>
<td valign="top" align="center">0.55 &#x000B1; 0.006</td>
<td valign="top" align="center">0.63 &#x000B1; 0.006</td>
<td valign="top" align="center">0.77 &#x000B1; 0.007</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">0.24</td>
<td valign="top" align="center">0.25 &#x000B1; 0.007</td>
<td valign="top" align="center">0.27 &#x000B1; 0.004</td>
<td valign="top" align="center">0.31 &#x000B1; 0.005</td>
<td valign="top" align="center">0.13 &#x000B1; 0.004</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">0.13</td>
<td valign="top" align="center">0.18 &#x000B1; 0.006</td>
<td valign="top" align="center">0.20 &#x000B1; 0.004</td>
<td valign="top" align="center">0.09 &#x000B1; 0.003</td>
<td valign="top" align="center">0.12 &#x000B1; 0.004</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>4</sub></td>
<td valign="top" align="center">0.08</td>
<td valign="top" align="center">0.14 &#x000B1; 0.005</td>
<td valign="top" align="center">0.06 &#x000B1; 0.002</td>
<td valign="top" align="center">0.09 &#x000B1; 0.002</td>
<td valign="top" align="center">0.11 &#x000B1; 0.003</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">0.49</td>
<td valign="top" align="center">0.45 &#x000B1; 0.007</td>
<td valign="top" align="center">0.46 &#x000B1; 0.005</td>
<td valign="top" align="center">0.51 &#x000B1; 0.005</td>
<td valign="top" align="center">0.59 &#x000B1; 0.009</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">0.20</td>
<td valign="top" align="center">0.22 &#x000B1; 0.004</td>
<td valign="top" align="center">0.24 &#x000B1; 0.003</td>
<td valign="top" align="center">0.26 &#x000B1; 0.004</td>
<td valign="top" align="center">0.10 &#x000B1; 0.003</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">0.11</td>
<td valign="top" align="center">0.16 &#x000B1; 0.004</td>
<td valign="top" align="center">0.17 &#x000B1; 0.003</td>
<td valign="top" align="center">0.08 &#x000B1; 0.002</td>
<td valign="top" align="center">0.09 &#x000B1; 0.003</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>4</sub></td>
<td valign="top" align="center">0.06</td>
<td valign="top" align="center">0.13 &#x000B1; 0.003</td>
<td valign="top" align="center">0.05 &#x000B1; 0.001</td>
<td valign="top" align="center">0.07 &#x000B1; 0.002</td>
<td valign="top" align="center">0.08 &#x000B1; 0.003</td>
</tr>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 5</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">0.51</td>
<td valign="top" align="center">0.46 &#x000B1; 0.009</td>
<td valign="top" align="center">0.49 &#x000B1; 0.006</td>
<td valign="top" align="center">0.54 &#x000B1; 0.006</td>
<td valign="top" align="center">0.59 &#x000B1; 0.018</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">0.24</td>
<td valign="top" align="center">0.23 &#x000B1; 0.005</td>
<td valign="top" align="center">0.25 &#x000B1; 0.004</td>
<td valign="top" align="center">0.27 &#x000B1; 0.004</td>
<td valign="top" align="center">0.30 &#x000B1; 0.010</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">0.15</td>
<td valign="top" align="center">0.17 &#x000B1; 0.005</td>
<td valign="top" align="center">0.18 &#x000B1; 0.004</td>
<td valign="top" align="center">0.18 &#x000B1; 0.004</td>
<td valign="top" align="center">0.11 &#x000B1; 0.006</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>4</sub></td>
<td valign="top" align="center">0.09</td>
<td valign="top" align="center">0.13 &#x000B1; 0.004</td>
<td valign="top" align="center">0.13 &#x000B1; 0.003</td>
<td valign="top" align="center">0.06 &#x000B1; 0.002</td>
<td valign="top" align="center">0.07 &#x000B1; 0.005</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>5</sub></td>
<td valign="top" align="center">0.05</td>
<td valign="top" align="center">0.11 &#x000B1; 0.004</td>
<td valign="top" align="center">0.04 &#x000B1; 0.002</td>
<td valign="top" align="center">0.05 &#x000B1; 0.002</td>
<td valign="top" align="center">0.06 &#x000B1; 0.005</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">0.44</td>
<td valign="top" align="center">0.41 &#x000B1; 0.011</td>
<td valign="top" align="center">0.43 &#x000B1; 0.006</td>
<td valign="top" align="center">0.47 &#x000B1; 0.006</td>
<td valign="top" align="center">0.49 &#x000B1; 0.019</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">0.21</td>
<td valign="top" align="center">0.21 &#x000B1; 0.006</td>
<td valign="top" align="center">0.22 &#x000B1; 0.004</td>
<td valign="top" align="center">0.24 &#x000B1; 0.004</td>
<td valign="top" align="center">0.24 &#x000B1; 0.009</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">0.13</td>
<td valign="top" align="center">0.15 &#x000B1; 0.005</td>
<td valign="top" align="center">0.16 &#x000B1; 0.003</td>
<td valign="top" align="center">0.15 &#x000B1; 0.003</td>
<td valign="top" align="center">0.08 &#x000B1; 0.005</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>4</sub></td>
<td valign="top" align="center">0.08</td>
<td valign="top" align="center">0.12 &#x000B1; 0.005</td>
<td valign="top" align="center">0.12 &#x000B1; 0.003</td>
<td valign="top" align="center">0.05 &#x000B1; 0.002</td>
<td valign="top" align="center">0.06 &#x000B1; 0.005</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>5</sub></td>
<td valign="top" align="center">0.04</td>
<td valign="top" align="center">0.08 &#x000B1; 0.007</td>
<td valign="top" align="center">0.03 &#x000B1; 0.002</td>
<td valign="top" align="center">0.04 &#x000B1; 0.003</td>
<td valign="top" align="center">0.05 &#x000B1; 0.009</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Each allocation a is represented by the number of ads that it places at the Bottom. For instance, &#x0201C;1 at Bottom&#x0201D; corresponds to the allocation (1, 1, 0) when m &#x0003D; 3, (1, 1, 1, 0) when m &#x0003D; 4, and (1, 1, 1, 1, 0) when m &#x0003D; 5. Allocations with &#x0201C;4 at Bottom&#x0201D; and &#x0201C;5 at Bottom&#x0201D; do not appear in the time span we considered. The observed &#x1D53C;[Y<sub>i</sub>] is reported in the &#x0201C;Observed mean&#x0201D; column</italic>.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Observed frequencies of allocations.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left" colspan="2"><bold>Impressed ads and scenarios</bold></th>
<th valign="top" align="center"><bold>0 at Bottom (%)</bold></th>
<th valign="top" align="center"><bold>1 at Bottom (%)</bold></th>
<th valign="top" align="center"><bold>2 at Bottom (%)</bold></th>
<th valign="top" align="center"><bold>3 at Bottom (%)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 3</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center">36.9</td>
<td valign="top" align="center">31.7</td>
<td valign="top" align="center">22.6</td>
<td valign="top" align="center">8.8</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center">32.5</td>
<td valign="top" align="center">29.4</td>
<td valign="top" align="center">22.9</td>
<td valign="top" align="center">15.1</td>
</tr>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 4</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center">32.0</td>
<td valign="top" align="center">26.5</td>
<td valign="top" align="center">24.9</td>
<td valign="top" align="center">16.6</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center">30.3</td>
<td valign="top" align="center">25.9</td>
<td valign="top" align="center">25.4</td>
<td valign="top" align="center">18.4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 5</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center">29.3</td>
<td valign="top" align="center">27.2</td>
<td valign="top" align="center">23.5</td>
<td valign="top" align="center">20.0</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center">28.5</td>
<td valign="top" align="center">26.7</td>
<td valign="top" align="center">23.8</td>
<td valign="top" align="center">21.1</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Allocations with &#x0201C;4 at Bottom&#x0201D; and &#x0201C;5 at Bottom&#x0201D; do not appear in the time span we considered</italic>.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Layout comparisons by reporting average overall counterfactual mean, i.e., <inline-formula><mml:math id="M23"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:mfrac><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:munderover><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:math></inline-formula> for all possible allocations.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left" colspan="2"><bold>Impressed ads and scenarios</bold></th>
<th valign="top" align="center"><bold>Observed</bold></th>
<th valign="top" align="center"><bold>0 at Bottom</bold></th>
<th valign="top" align="center"><bold>1 at Bottom</bold></th>
<th valign="top" align="center"><bold>2 at Bottom</bold></th>
<th valign="top" align="center"><bold>3 at Bottom</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 3</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center">0.3437</td>
<td valign="top" align="center">0.3495 &#x000B1; 1.2e-7</td>
<td valign="top" align="center">0.3460 &#x000B1; 1.4e-7</td>
<td valign="top" align="center">0.3442 &#x000B1; 1.58e-7</td>
<td valign="top" align="center">0.3499 &#x000B1; 1.91e-7</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center">0.2526</td>
<td valign="top" align="center">0.2649 &#x000B1; 2.0e-7</td>
<td valign="top" align="center">0.2713 &#x000B1; 2.3e-7</td>
<td valign="top" align="center">0.2577 &#x000B1; 2.7e-7</td>
<td valign="top" align="center">0.2079 &#x000B1; 5.0e-7</td>
</tr>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 4</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center">0.2591</td>
<td valign="top" align="center">0.2727 &#x000B1; 1.5e-7</td>
<td valign="top" align="center">0.2709 &#x000B1; 1.5e-7</td>
<td valign="top" align="center">0.2786 &#x000B1; 1.9e-7</td>
<td valign="top" align="center">0.2816 &#x000B1; 2.2e-7</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center">0.2140</td>
<td valign="top" align="center">0.2374 &#x000B1; 1.3e-7</td>
<td valign="top" align="center">0.2294 &#x000B1; 1.8e-7</td>
<td valign="top" align="center">0.2275 &#x000B1; 2.2e-7</td>
<td valign="top" align="center">0.2171 &#x000B1; 3.1e-7</td>
</tr>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 5</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center">0.2087</td>
<td valign="top" align="center">0.2211 &#x000B1; 1.3e-7</td>
<td valign="top" align="center">0.2186 &#x000B1; 1.6e-7</td>
<td valign="top" align="center">0.2227 &#x000B1; 1.8e-7</td>
<td valign="top" align="center">0.2260 &#x000B1; 3.2e-7</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center">0.1807</td>
<td valign="top" align="center">0.1945 &#x000B1; 2.3e-7</td>
<td valign="top" align="center">0.1926 &#x000B1; 2.6e-7</td>
<td valign="top" align="center">0.1897 &#x000B1; 3.0e-7</td>
<td valign="top" align="center">0.1836 &#x000B1; 5.1e-7</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>5.2. Learning the Causal Structure Using FCI</title>
<p>In this part of the experiment, we use data to learn the parents of each outcome for all ads on the pageview, while allowing for both block-level and cross-block interference. We preprocess the data as described in Section 4.4 and use the implementation of the FCI algorithm in the Tetrad software<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref>. Independence tests are performed using kernel conditional independence tests (Zhang et al., <xref ref-type="bibr" rid="B49">2012</xref>) with a significance level of 0.01. On each pageview, we collect <italic>m</italic> &#x000D7; 22 different features. Neither plotting the learned graph nor listing all parent sets is relevant to the point we wish to make here. Our primary objective is to show that, for an ad at a particular position, features from other ads on the pageview (not necessarily even from the same block) are directly relevant to the clickability of the ad. To report the results concisely, we divide the ad-specific features into four distinct categories: (a) Calculated scores, such as <italic>PClick, PDefect, Relevance score, etc</italic>., (b) Decorative features, such as <italic>Twitter information, links, and ratings</italic>, (c) Geometric features, such as <italic>line counts, pixel heights, pixel heights from top of the block</italic>, and (d) match type information. We found that the parent set of each <italic>Y</italic><sub><italic>i</italic></sub> contains at least one variable in each category of features from a different ad, providing further evidence for the presence of interference among ads. In our extended set of experiments, we learned that Decorative features are more influential on pageviews with a higher number of impressed ads. Please refer to the appendix for more experiments.</p>
<p>We further designate a fifth category (e) for the collection of exogenous features that are layout-specific, such as <italic>ad counts</italic>. For each scenario, we report which categories the causally relevant features belong to in <xref ref-type="table" rid="T5">Table 5</xref>. For each positioned ad, the influence of other ads on the pageview is spread over multiple categories of features. Calculated scores and geometric features are influential in clickability across all scenarios and across pageviews with different numbers of impressed ads.</p>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Categories to which the causally relevant features of each outcome belong, as learned by applying the FCI procedure to our model.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center" colspan="2"><bold>Scenarios</bold></th>
<th valign="top" align="center"><bold>Calculated scores</bold></th>
<th valign="top" align="center"><bold>Decorative features</bold></th>
<th valign="top" align="center"><bold>Geometric features</bold></th>
<th valign="top" align="center"><bold>Match type</bold></th>
<th valign="top" align="center"><bold>Exogenous features</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 3</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
</tr>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 4</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>4</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>4</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td valign="top" align="left"><italic>m</italic> &#x0003D; 5</td>
<td valign="top" align="center">Positives</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>4</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>5</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="center">Balanced</td>
<td valign="top" align="center"><italic>Y</italic><sub>1</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>2</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>3</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>4</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td/>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>Y</italic><sub>5</sub></td>
<td valign="top" align="center">&#x02713;</td>
<td/>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td/>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>5.3. Improvements in Click Prediction</title>
<p>The experiments described above provide further evidence that interference <italic>does</italic> exist among ads. This was shown both by estimating effects that differ from zero and by learning causally relevant features that originate from other ads on the pageview. We now leverage this knowledge to better estimate the click yields. We considered fitting five different sets of models:</p>
<list list-type="simple">
<list-item><p><italic>(1) (Baseline)</italic> model where samples are assumed to be independent, i.e., fitting <italic>p</italic>(<italic>Y</italic><sub><italic>i</italic></sub> &#x0003D; 1&#x02223;<italic>L, X</italic><sub><italic>i</italic></sub>),</p></list-item>
<list-item><p><italic>(2) (Block-level interference)</italic> model where we allow for block-level interactions like <italic>I</italic>(<italic>A</italic><sub><italic>j</italic></sub> &#x0003D; <italic>A</italic><sub><italic>i</italic></sub>) &#x000D7; <italic>X</italic><sub><italic>j</italic></sub>, i.e., <italic>p</italic>[<italic>Y</italic><sub><italic>i</italic></sub> &#x0003D; 1&#x02223;<italic>L, X</italic><sub><italic>i</italic></sub>, <italic>I</italic>(<italic>A</italic><sub><italic>j</italic></sub> &#x0003D; <italic>A</italic><sub><italic>i</italic></sub>) &#x000D7; <italic>X</italic><sub><italic>j</italic></sub>],</p></list-item>
<list-item><p><italic>(3) (Block-level and cross-block interference)</italic> model where in addition to block-level interactions we allow for cross-block interactions, i.e., <italic>p</italic>[<italic>Y</italic><sub><italic>i</italic></sub> &#x0003D; 1&#x02223;<italic>L, X</italic><sub><italic>i</italic></sub>, <italic>I</italic>(<italic>A</italic><sub><italic>j</italic></sub> &#x0003D; <italic>A</italic><sub><italic>i</italic></sub>) &#x000D7; <italic>X</italic><sub><italic>j</italic></sub>, <italic>I</italic>(<italic>A</italic><sub><italic>k</italic></sub>&#x02260;<italic>A</italic><sub><italic>i</italic></sub>) &#x000D7; <italic>X</italic><sub><italic>k</italic></sub>],</p></list-item>
<list-item><p><italic>(4) (Full graph)</italic> model with no block decomposition, i.e., <italic>p</italic>(<italic>Y</italic><sub><italic>i</italic></sub> &#x0003D; 1&#x02223;<italic>L, X</italic>), and</p></list-item>
<list-item><p><italic>(5) (FCI parents)</italic> model where we use the parents of <italic>Y</italic><sub><italic>i</italic></sub> in the graph that FCI outputs, i.e., fitting <italic>p</italic>[<italic>Y</italic><sub><italic>i</italic></sub> &#x0003D; 1&#x02223;pa(<italic>Y</italic><sub><italic>i</italic></sub>)].</p></list-item>
</list>
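<p>As a rough illustration of how the conditioning sets of models (1) through (4) differ, the following sketch builds the feature vector for a single outcome on one pageview. The function <italic>design_row</italic>, its argument layout, and the choice of one masked slot per neighboring ad are assumptions made for this example; the text above does not specify how the interaction terms are encoded. Model (5) is omitted because its features are whatever parents FCI returns.</p>
<preformat>
```python
def design_row(i, L, X, A, model):
    """Feature vector for outcome Y_i on one pageview (illustrative only).

    L: list of layout-level features shared by the pageview.
    X: list of per-ad feature lists, ordered by ad position.
    A: list of block indicators, one per ad.
    model: 1 (baseline), 2 (block-level), 3 (block-level and cross-block),
           or 4 (full graph).
    """
    if model not in (1, 2, 3, 4):
        raise ValueError("model must be 1, 2, 3, or 4")
    row = list(L) + list(X[i])              # (1): p(Y_i = 1 | L, X_i)
    if model == 1:
        return row
    others = [j for j in range(len(X)) if j != i]
    if model == 4:                          # (4): all ads enter unmasked
        for j in others:
            row += list(X[j])
        return row
    for j in others:                        # (2): I(A_j = A_i) * X_j
        same = 1 if A[j] == A[i] else 0
        row += [same * x for x in X[j]]
    if model == 3:                          # (3): also I(A_k != A_i) * X_k
        for k in others:
            diff = 1 if A[k] != A[i] else 0
            row += [diff * x for x in X[k]]
    return row
```
</preformat>
<p>With this encoding, features of ads in a different block are zeroed out in the block-level slots of model (2) and appear only in the additional cross-block slots of model (3), so any classifier fitted on these rows can weight the two kinds of neighbors separately.</p>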
<p>We report relative improvements in area under the curve (AUC) over the baseline in <xref ref-type="fig" rid="F3">Figure 3</xref>. All methods that account for interference improve over the baseline, demonstrating the utility of our formalization. The performance gains are also greater for higher-positioned ads than for lower-positioned ones.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Relative difference (in percentage) in AUCs with respect to the baseline model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-05-888592-g0003.tif"/>
</fig>
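<p>The quantity plotted in Figure 3 can be sketched as follows. This is a minimal illustration rather than the evaluation code used for the experiments: it assumes that AUC refers to the area under the ROC curve, computed here with the pairwise rank formulation, and that the relative difference is taken as a percentage of the baseline AUC. The function names <italic>auc</italic> and <italic>relative_auc_gain</italic> are hypothetical.</p>
<preformat>
```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparisons; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one click and one non-click")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # clicked ad ranked above non-clicked ad
            elif p == n:
                wins += 0.5      # tie counts as half a win
    return wins / (len(pos) * len(neg))


def relative_auc_gain(labels, baseline_scores, model_scores):
    """Relative difference in AUC with respect to the baseline, in percent."""
    base = auc(labels, baseline_scores)
    return 100.0 * (auc(labels, model_scores) - base) / base
```
</preformat>
<p>The pairwise loop is quadratic in the number of samples and is meant only to make the definition explicit; a sort-based implementation would be preferable on data of realistic size.</p>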
</sec>
<sec>
<title>5.4. Performance in Unseen Layouts</title>
<p>We evaluate the performance of our models that account for interference on layouts that do not appear in the training data. We limit our training data to pageviews with 5 impressed ads and test the models on pageviews that have more than 5 impressed ads. <xref ref-type="fig" rid="F4">Figure 4</xref> shows the improvement of the proposed models over the baseline on pageviews with 6, 7, and 8 impressed ads.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Relative difference (in percentage) in AUCs with respect to the baseline model in unseen layouts.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-05-888592-g0004.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="conclusions" id="s6">
<title>6. Conclusion</title>
<p>Despite the intuition that ads should not be scrutinized independently of one another, to the best of our knowledge, there has not been a formal analysis of interference in advertisement placement and sponsored search marketing. In this paper, we formalized the interference problem among ads using the language of causal inference and counterfactual reasoning. We proposed a framework to quantify the interference effects by posing a graphical causal model that accounts for potential underlying interference mechanisms. We described several causal effects that might be of interest in ad placement systems and discussed identification assumptions and estimation strategies for computing these effects. We further adapted the FCI procedure to learn the underlying mechanisms that give rise to interference and verify the correctness of our presumed causal structures.</p>
<p>In the partial interference framework, it is often assumed that the iid units are of the same size. The equivalent assumption we made is that pageviews have a fixed number of impressed ads. If sample size is not a concern, we can analyze each pageview of size <italic>m</italic> in isolation. However, in scenarios where data is scarce, we need alternatives that relax this restrictive assumption. One approach is through feature engineering, where we first assume that only the <italic>k</italic> nearest neighbors interact with the ad itself, a <italic>Markov order of</italic> <italic>k</italic> assumption. We further need to assume that the neighboring ads influence one another in exactly the same way, a <italic>parameter sharing</italic> assumption. Investigating such alternatives and exploring other approaches opens up an interesting direction for future work.</p>
<p>In this paper, we focused on the impressed ads on the search result page, and marginalized out the ads involved in the search engine auction. Incorporating knowledge of how exactly the auction optimizer works on the entire set of candidate ads is important in determining the optimal layouts in the presence of interference. We further restricted our attention to auctions that yield only two blocks on the final pageview. This restriction can be relaxed simply by allowing the allocation treatment to have a discrete state space with more than two values. We can further group the ads that were not impressed and treat them as a separate block, and investigate their impact on the click yields of the other ads on the page.</p>
</sec>
<sec sec-type="data-availability" id="s7">
<title>Data Availability Statement</title>
<p>The aggregated data supporting the conclusions of this article will be made available upon request. Further requests will be assessed on a case-by-case basis to ensure compliance with privacy agreements and other requirements. Requests to access the datasets should be directed to the corresponding author.</p>
</sec>
<sec id="s8">
<title>Author Contributions</title>
<p>RN, DC, and EK contributed to conception and design of the framework. JP organized the database. RN performed the statistical analysis and wrote the first draft of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>JP, DC, and EK were employed by Microsoft Corporation. The research was conducted while RN was an intern at Microsoft Research.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec> 
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bajari</surname> <given-names>P.</given-names></name> <name><surname>Burdick</surname> <given-names>B.</given-names></name> <name><surname>Imbens</surname> <given-names>G. W.</given-names></name> <name><surname>Masoero</surname> <given-names>L.</given-names></name> <name><surname>McQueen</surname> <given-names>J.</given-names></name> <name><surname>Richardson</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Multiple randomization designs</article-title>. <source>arXiv preprint arXiv:2112.13495</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2112.13495</pub-id></citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bayir</surname> <given-names>M. A.</given-names></name> <name><surname>Xu</surname> <given-names>M.</given-names></name> <name><surname>Zhu</surname> <given-names>Y.</given-names></name> <name><surname>Shi</surname> <given-names>Y.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Genie: an open box counterfactual policy estimator for optimizing sponsored search marketplace,&#x0201D;</article-title> in <source>Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining</source>, <fpage>465</fpage>&#x02013;<lpage>473</lpage>.</citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bhattacharya</surname> <given-names>R.</given-names></name> <name><surname>Nabi</surname> <given-names>R.</given-names></name> <name><surname>Shpitser</surname> <given-names>I.</given-names></name></person-group> (<year>2020a</year>). <article-title>Semiparametric inference for causal effects in graphical models with hidden variables</article-title>. <source>arXiv preprint arXiv:2003.12659</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2003.12659</pub-id></citation>
</ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bhattacharya</surname> <given-names>R.</given-names></name> <name><surname>Nagarajan</surname> <given-names>T.</given-names></name> <name><surname>Malinsky</surname> <given-names>D.</given-names></name> <name><surname>Shpitser</surname> <given-names>I.</given-names></name></person-group> (<year>2020b</year>). <article-title>Differentiable causal discovery under unmeasured confounding</article-title>. <source>arXiv preprint arXiv:2010.06978</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2010.06978</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Bhattacharya</surname> <given-names>R.</given-names></name> <name><surname>Malinsky</surname> <given-names>D.</given-names></name> <name><surname>Shpitser</surname> <given-names>I.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Causal inference under interference and network uncertainty,&#x0201D;</article-title> in <source>Uncertainty in Artificial Intelligence: Proceedings of the... Conference. Conference on Uncertainty in Artificial Intelligence, volume 2019</source> (<publisher-loc>NIH Public Access</publisher-loc>).<pub-id pub-id-type="pmid">31885520</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bisht</surname> <given-names>K.</given-names></name> <name><surname>Susan</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Weighted ensemble of neural and probabilistic graphical models for click prediction,&#x0201D;</article-title> in <source>2021 the 5th International Conference on Information System and Data Mining</source>, <fpage>145</fpage>&#x02013;<lpage>150</lpage>.</citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bottou</surname> <given-names>L.</given-names></name> <name><surname>Peters</surname> <given-names>J.</given-names></name> <name><surname>Qui&#x000F1;onero-Candela</surname> <given-names>J.</given-names></name> <name><surname>Charles</surname> <given-names>D. X.</given-names></name> <name><surname>Chickering</surname> <given-names>D. M.</given-names></name> <name><surname>Portugaly</surname> <given-names>E.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Counterfactual reasoning and learning systems: the example of computational advertising</article-title>. <source>J. Mach. Learn. Res</source>. <volume>14</volume>, <fpage>3207</fpage>&#x02013;<lpage>3260</lpage>.</citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cheng</surname> <given-names>H.</given-names></name> <name><surname>Cant&#x000FA;-Paz</surname> <given-names>E.</given-names></name></person-group> (<year>2010</year>). <article-title>&#x0201C;Personalized click prediction in sponsored search,&#x0201D;</article-title> in <source>Proceedings of the Third ACM International Conference on Web Search and data Mining</source>, <fpage>351</fpage>&#x02013;<lpage>360</lpage>.</citation>
</ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cheng</surname> <given-names>H.</given-names></name> <name><surname>Zwol</surname> <given-names>R. V.</given-names></name> <name><surname>Azimi</surname> <given-names>J.</given-names></name> <name><surname>Manavoglu</surname> <given-names>E.</given-names></name> <name><surname>Zhang</surname> <given-names>R.</given-names></name> <name><surname>Zhou</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>&#x0201C;Multimedia features for click prediction of new ads in display advertising,&#x0201D;</article-title> in <source>Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>, <fpage>777</fpage>&#x02013;<lpage>785</lpage>.</citation>
</ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chickering</surname> <given-names>D. M.</given-names></name></person-group> (<year>2002</year>). <article-title>Optimal structure identification with greedy search</article-title>. <source>J. Mach. Learn. Res</source>. <volume>3</volume>, <fpage>507</fpage>&#x02013;<lpage>554</lpage>.</citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Colombo</surname> <given-names>D.</given-names></name> <name><surname>Maathuis</surname> <given-names>M. H.</given-names></name></person-group> (<year>2014</year>). <article-title>Order-independent constraint-based causal structure learning</article-title>. <source>J. Mach. Learn. Res</source>. <volume>15</volume>, <fpage>3741</fpage>&#x02013;<lpage>3782</lpage>.<pub-id pub-id-type="pmid">35327862</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Effendi</surname> <given-names>M. J.</given-names></name> <name><surname>Ali</surname> <given-names>S. A.</given-names></name></person-group> (<year>2017</year>). <article-title>Click through rate prediction for contextual advertisment using linear regression</article-title>. <source>arXiv preprint arXiv:1701.08744</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1701.08744</pub-id></citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Harshaw</surname> <given-names>C.</given-names></name> <name><surname>S&#x000E4;vje</surname> <given-names>F.</given-names></name> <name><surname>Eisenstat</surname> <given-names>D.</given-names></name> <name><surname>Mirrokni</surname> <given-names>V.</given-names></name> <name><surname>Pouget-Abadie</surname> <given-names>J.</given-names></name></person-group> (<year>2021</year>). <article-title>Design and analysis of bipartite experiments under a linear exposure-response model</article-title>. <source>arXiv preprint arXiv:2103.06392</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2103.06392</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hill</surname> <given-names>D. N.</given-names></name> <name><surname>Moakler</surname> <given-names>R.</given-names></name> <name><surname>Hubbard</surname> <given-names>A. E.</given-names></name> <name><surname>Tsemekhman</surname> <given-names>V.</given-names></name> <name><surname>Provost</surname> <given-names>F.</given-names></name> <name><surname>Tsemekhman</surname> <given-names>K.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Measuring causal impact of online actions via natural experiments: application to display advertising,&#x0201D;</article-title> in <source>Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>, <fpage>1839</fpage>&#x02013;<lpage>1847</lpage>.</citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>Y.</given-names></name> <name><surname>Valtorta</surname> <given-names>M.</given-names></name></person-group> (<year>2006</year>). <article-title>&#x0201C;Pearl&#x00027;s calculus of intervention is complete,&#x0201D;</article-title> in <source>Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence</source>, <fpage>13</fpage>&#x02013;<lpage>16</lpage>.</citation>
</ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hudgens</surname> <given-names>M. G.</given-names></name> <name><surname>Halloran</surname> <given-names>M. E.</given-names></name></person-group> (<year>2008</year>). <article-title>Toward causal inference with interference</article-title>. <source>J. Am. Stat. Assoc</source>. <volume>103</volume>, <fpage>832</fpage>&#x02013;<lpage>842</lpage>. <pub-id pub-id-type="doi">10.1198/016214508000000292</pub-id><pub-id pub-id-type="pmid">19081744</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johari</surname> <given-names>R.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Liskovich</surname> <given-names>I.</given-names></name> <name><surname>Weintraub</surname> <given-names>G. Y.</given-names></name></person-group> (<year>2022</year>). <article-title>Experimental design in two-sided platforms: an analysis of bias</article-title>. <source>Manag. Sci</source>. <pub-id pub-id-type="doi">10.1287/mnsc.2021.4247</pub-id></citation>
</ref>
<ref id="B18">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lauritzen</surname> <given-names>S. L.</given-names></name></person-group> (<year>1996</year>). <source>Graphical Models</source>. <publisher-loc>Oxford, UK</publisher-loc>: <publisher-name>Clarendon</publisher-name>.</citation>
</ref>
<ref id="B19">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Nabi</surname> <given-names>R.</given-names></name> <name><surname>Kanki</surname> <given-names>P.</given-names></name> <name><surname>Shpitser</surname> <given-names>I.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Estimation of personalized effects associated with causal pathways,&#x0201D;</article-title> in <source>Uncertainty in Artificial Intelligence: Proceedings of the... Conference. Conference on Uncertainty in Artificial Intelligence, Vol. 2018</source> (<publisher-loc>NIH Public Access</publisher-loc>).<pub-id pub-id-type="pmid">30643490</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Nabi</surname> <given-names>R.</given-names></name> <name><surname>Malinsky</surname> <given-names>D.</given-names></name> <name><surname>Shpitser</surname> <given-names>I.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Learning optimal fair policies,&#x0201D;</article-title> in <source>International Conference on Machine Learning</source> (<publisher-loc>PMLR</publisher-loc>), <fpage>4674</fpage>&#x02013;<lpage>4682</lpage>.<pub-id pub-id-type="pmid">31886463</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nabi-Abdolyousefi</surname> <given-names>R.</given-names></name></person-group> (<year>2015</year>). <source>Conversion rate prediction in search engine marketing</source> (Ph.D. thesis).</citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ogburn</surname> <given-names>E. L.</given-names></name> <name><surname>VanderWeele</surname> <given-names>T. J.</given-names></name></person-group> (<year>2014</year>). <article-title>Causal diagrams for interference</article-title>. <source>Stat. Sci</source>. <volume>29</volume>, <fpage>559</fpage>&#x02013;<lpage>578</lpage>. <pub-id pub-id-type="doi">10.1214/14-STS501</pub-id></citation>
</ref>
<ref id="B23">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Pearl</surname> <given-names>J.</given-names></name></person-group> (<year>2009</year>). <source>Causality</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation>
</ref>
<ref id="B24">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Pouget-Abadie</surname> <given-names>J.</given-names></name> <name><surname>Aydin</surname> <given-names>K.</given-names></name> <name><surname>Schudy</surname> <given-names>W.</given-names></name> <name><surname>Brodersen</surname> <given-names>K.</given-names></name> <name><surname>Mirrokni</surname> <given-names>V.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Variance reduction in bipartite experiments through correlation clustering,&#x0201D;</article-title> in <source>33rd Conference on Neural Information Processing Systems (NeurIPS 2019)</source> (<publisher-loc>Vancouver, BC</publisher-loc>).</citation>
</ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pouget-Abadie</surname> <given-names>J.</given-names></name> <name><surname>Parkes</surname> <given-names>D. C.</given-names></name> <name><surname>Mirrokni</surname> <given-names>V.</given-names></name> <name><surname>Airoldi</surname> <given-names>E. M.</given-names></name></person-group> (<year>2018</year>). <article-title>Optimizing cluster-based randomized experiments under a monotonicity assumption</article-title>. <source>arXiv preprint arXiv:1803.02876</source>. <pub-id pub-id-type="doi">10.1145/3219819.3220067</pub-id></citation>
</ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Richardson</surname> <given-names>T. S.</given-names></name> <name><surname>Evans</surname> <given-names>R. J.</given-names></name> <name><surname>Robins</surname> <given-names>J. M.</given-names></name> <name><surname>Shpitser</surname> <given-names>I.</given-names></name></person-group> (<year>2017</year>). <article-title>Nested Markov properties for acyclic directed mixed graphs</article-title>. <source>arXiv preprint arXiv:1701.06686</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1701.06686</pub-id><pub-id pub-id-type="pmid">30983907</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Richardson</surname> <given-names>T. S.</given-names></name> <name><surname>Robins</surname> <given-names>J. M.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Single world intervention graphs (SWIGs): a unification of the counterfactual and graphical approaches to causality,&#x0201D;</article-title> in <source>Center for the Statistics and the Social Sciences, University of Washington Series. Working Paper</source> (<publisher-loc>Washington, DC</publisher-loc>).</citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Richardson</surname> <given-names>T. S.</given-names></name> <name><surname>Spirtes</surname> <given-names>P. L.</given-names></name></person-group> (<year>2002</year>). <article-title>Ancestral graph Markov models</article-title>. <source>Ann. Stat</source>. <volume>30</volume>, <fpage>962</fpage>&#x02013;<lpage>1030</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1031689015</pub-id></citation>
</ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Robins</surname> <given-names>J. M.</given-names></name></person-group> (<year>1986</year>). <article-title>A new approach to causal inference in mortality studies with a sustained exposure period-application to control of the healthy worker survivor effect</article-title>. <source>Math. Model</source>. <volume>7</volume>, <fpage>1393</fpage>&#x02013;<lpage>1512</lpage>. <pub-id pub-id-type="doi">10.1016/0270-0255(86)90088-6</pub-id></citation>
</ref>
<ref id="B30">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Robins</surname> <given-names>J. M.</given-names></name> <name><surname>Rotnitzky</surname> <given-names>A.</given-names></name> <name><surname>Scharfstein</surname> <given-names>D. O.</given-names></name></person-group> (<year>2000</year>). <article-title>&#x0201C;Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models,&#x0201D;</article-title> in <source>Statistical Models in Epidemiology, the Environment, and Clinical Trials</source> (<publisher-loc>Springer</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>94</lpage>.</citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rubin</surname> <given-names>D. B.</given-names></name></person-group> (<year>1980</year>). <article-title>Randomization analysis of experimental data: the Fisher randomization test comment</article-title>. <source>J. Am. Stat. Assoc</source>. <volume>75</volume>, <fpage>591</fpage>&#x02013;<lpage>593</lpage>. <pub-id pub-id-type="doi">10.2307/2287653</pub-id></citation>
</ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scharfstein</surname> <given-names>D. O.</given-names></name> <name><surname>Nabi</surname> <given-names>R.</given-names></name> <name><surname>Kennedy</surname> <given-names>E. H.</given-names></name> <name><surname>Huang</surname> <given-names>M.-Y.</given-names></name> <name><surname>Bonvini</surname> <given-names>M.</given-names></name> <name><surname>Smid</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Semiparametric sensitivity analysis: unmeasured confounding in observational studies</article-title>. <source>arXiv preprint arXiv:2104.08300</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2104.08300</pub-id></citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shalizi</surname> <given-names>C. R.</given-names></name> <name><surname>Thomas</surname> <given-names>A. C.</given-names></name></person-group> (<year>2011</year>). <article-title>Homophily and contagion are generically confounded in observational social network studies</article-title>. <source>Sociol. Methods Res</source>. <volume>40</volume>, <fpage>211</fpage>&#x02013;<lpage>239</lpage>. <pub-id pub-id-type="doi">10.1177/0049124111404820</pub-id><pub-id pub-id-type="pmid">22523436</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shaparenko</surname> <given-names>B.</given-names></name> <name><surname>&#x000C7;etin</surname> <given-names>&#x000D6;.</given-names></name> <name><surname>Iyer</surname> <given-names>R.</given-names></name></person-group> (<year>2009</year>). <article-title>&#x0201C;Data-driven text features for sponsored search click prediction,&#x0201D;</article-title> in <source>Proceedings of the Third International Workshop on Data Mining and Audience Intelligence for Advertising</source>, <fpage>46</fpage>&#x02013;<lpage>54</lpage>.</citation>
</ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sherman</surname> <given-names>E.</given-names></name> <name><surname>Shpitser</surname> <given-names>I.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Identification and estimation of causal effects from dependent data,&#x0201D;</article-title> in <source>Advances in Neural Information Processing Systems</source>, <fpage>9424</fpage>&#x02013;<lpage>9435</lpage>.<pub-id pub-id-type="pmid">30643365</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shpitser</surname> <given-names>I.</given-names></name></person-group> (<year>2013</year>). <article-title>Counterfactual graphical models for longitudinal mediation analysis with unobserved confounding</article-title>. <source>Cogn. Sci</source>. <volume>37</volume>, <fpage>1011</fpage>&#x02013;<lpage>1035</lpage>. <pub-id pub-id-type="doi">10.1111/cogs.12058</pub-id><pub-id pub-id-type="pmid">23899340</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shpitser</surname> <given-names>I.</given-names></name> <name><surname>Pearl</surname> <given-names>J.</given-names></name></person-group> (<year>2012</year>). <article-title>Identification of conditional interventional distributions</article-title>. <source>arXiv preprint arXiv:1206.6876</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1206.6876</pub-id></citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shpitser</surname> <given-names>I.</given-names></name> <name><surname>Pearl</surname> <given-names>J.</given-names></name></person-group> (<year>2006</year>). <article-title>&#x0201C;Identification of joint interventional distributions in recursive semi-Markovian causal models,&#x0201D;</article-title> in <source>Proceedings of the 21st National Conference on Artificial Intelligence</source>.</citation>
</ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sobel</surname> <given-names>M. E.</given-names></name></person-group> (<year>2006</year>). <article-title>What do randomized studies of housing mobility demonstrate? Causal inference in the face of interference</article-title>. <source>J. Am. Stat. Assoc</source>. <volume>101</volume>, <fpage>1398</fpage>&#x02013;<lpage>1407</lpage>. <pub-id pub-id-type="doi">10.1198/016214506000000636</pub-id></citation>
</ref>
<ref id="B40">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Spirtes</surname> <given-names>P. L.</given-names></name> <name><surname>Glymour</surname> <given-names>C. N.</given-names></name> <name><surname>Scheines</surname> <given-names>R.</given-names></name> <name><surname>Heckerman</surname> <given-names>D.</given-names></name> <name><surname>Meek</surname> <given-names>C.</given-names></name> <name><surname>Cooper</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2000</year>). <source>Causation, Prediction, and Search</source>. <publisher-name>MIT Press</publisher-name>.</citation>
</ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tchetgen</surname> <given-names>E. J. T.</given-names></name> <name><surname>VanderWeele</surname> <given-names>T. J.</given-names></name></person-group> (<year>2012</year>). <article-title>On causal inference in the presence of interference</article-title>. <source>Stat. Methods Med. Res</source>. <volume>21</volume>, <fpage>55</fpage>&#x02013;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.1177/0962280210386779</pub-id><pub-id pub-id-type="pmid">21068053</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Van der Laan</surname> <given-names>M. J.</given-names></name> <name><surname>Polley</surname> <given-names>E. C.</given-names></name> <name><surname>Hubbard</surname> <given-names>A. E.</given-names></name></person-group> (<year>2007</year>). <article-title>Super learner</article-title>. <source>Stat. Appl. Genet. Mol. Biol</source>. <volume>6</volume>, <fpage>25</fpage>. <pub-id pub-id-type="doi">10.2202/1544-6115.1309</pub-id><pub-id pub-id-type="pmid">17910531</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Verma</surname> <given-names>T.</given-names></name> <name><surname>Pearl</surname> <given-names>J.</given-names></name></person-group> (<year>1990</year>). <article-title>&#x0201C;Equivalence and synthesis of causal models,&#x0201D;</article-title> in <source>Proceedings of the 6th Conference on Uncertainty in Artificial Intelligence</source>.</citation>
</ref>
<ref id="B44">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>X.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;A survey of online advertising click-through rate prediction models,&#x0201D;</article-title> in <source>2020 IEEE International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA), Vol. 1</source> (<publisher-loc>Chongqing</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>516</fpage>&#x02013;<lpage>521</lpage>.</citation>
</ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xiong</surname> <given-names>C.</given-names></name> <name><surname>Wang</surname> <given-names>T.</given-names></name> <name><surname>Ding</surname> <given-names>W.</given-names></name> <name><surname>Shen</surname> <given-names>Y.</given-names></name> <name><surname>Liu</surname> <given-names>T.-Y.</given-names></name></person-group> (<year>2012</year>). <article-title>&#x0201C;Relational click prediction for sponsored search,&#x0201D;</article-title> in <source>Proceedings of the Fifth ACM International Conference on Web Search and Data Mining</source>, <fpage>493</fpage>&#x02013;<lpage>502</lpage>.</citation>
</ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yin</surname> <given-names>D.</given-names></name> <name><surname>Cao</surname> <given-names>B.</given-names></name> <name><surname>Sun</surname> <given-names>J.-T.</given-names></name> <name><surname>Davison</surname> <given-names>B. D.</given-names></name></person-group> (<year>2014</year>). <article-title>&#x0201C;Estimating ad group performance in sponsored search,&#x0201D;</article-title> in <source>Proceedings of the 7th ACM International Conference on Web Search and Data Mining</source>, <fpage>143</fpage>&#x02013;<lpage>152</lpage>.</citation>
</ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zeng</surname> <given-names>S.</given-names></name> <name><surname>Bayir</surname> <given-names>M. A.</given-names></name> <name><surname>Pfeiffer</surname> <given-names>J. J.</given-names> <suffix>III</suffix></name> <name><surname>Charles</surname> <given-names>D.</given-names></name> <name><surname>Kiciman</surname> <given-names>E.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Causal transfer random forest: combining logged data and randomized experiments for robust prediction,&#x0201D;</article-title> in <source>Proceedings of the 14th ACM International Conference on Web Search and Data Mining</source>, <fpage>211</fpage>&#x02013;<lpage>219</lpage>.</citation>
</ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>J.</given-names></name></person-group> (<year>2008</year>). <article-title>On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias</article-title>. <source>Artif. Intell</source>. <volume>172</volume>, <fpage>1873</fpage>&#x02013;<lpage>1896</lpage>. <pub-id pub-id-type="doi">10.1016/j.artint.2008.08.001</pub-id></citation>
</ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>K.</given-names></name> <name><surname>Peters</surname> <given-names>J.</given-names></name> <name><surname>Janzing</surname> <given-names>D.</given-names></name> <name><surname>Sch&#x000F6;lkopf</surname> <given-names>B.</given-names></name></person-group> (<year>2012</year>). <article-title>Kernel-based conditional independence test and application in causal discovery</article-title>. <source>arXiv preprint arXiv:1202.3775</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1202.3775</pub-id></citation>
</ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Dai</surname> <given-names>H.</given-names></name> <name><surname>Xu</surname> <given-names>C.</given-names></name> <name><surname>Feng</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>T.</given-names></name> <name><surname>Bian</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Sequential click prediction for sponsored search with recurrent neural networks</article-title>. <source>arXiv preprint arXiv:1404.5772</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1404.5772</pub-id></citation>
</ref>
</ref-list>
<fn-group>
<fn id="fn0001"><p><sup>1</sup><ext-link ext-link-type="uri" xlink:href="https://www.phil.cmu.edu/tetrad/publications.html">https://www.phil.cmu.edu/tetrad/publications.html</ext-link></p></fn>
</fn-group>
</back>
</article> 